ExtensionArray construction with given dtype (sort of shallow_copy?) #20747

Closed
jorisvandenbossche opened this issue Apr 19, 2018 · 2 comments · Fixed by #21160 or #23613
Labels
ExtensionArray Extending pandas with custom dtypes or arrays.
Milestone

Comments

@jorisvandenbossche
Member

Many of the extension array tests are skipped for Categorical because the reconstruction of the expected result does not preserve the categories (which are, in effect, the "metadata" of the dtype).

For example:

@pytest.mark.skip(reason="Unobserved categories preseved in concat.")
def test_align(self, data, na_value):
    pass

because in the actual test, the expected result is constructed from a list:

r1, r2 = pd.Series(a).align(pd.Series(b, index=[1, 2, 3]))
# Assumes that the ctor can take a list of scalars of the type
e1 = pd.Series(type(data)(list(a) + [na_value]))

(The test currently still uses type(data)(..); #20746 replaces that with data._constructor_from_sequence(..).)

This is kind of a recurrent pattern, so that might indicate we should find a solution for this?

So do we want a canonical way in the extension array interface to construct an ExtensionArray that has a certain dtype?

A possible solution is to add a dtype keyword to _constructor_from_sequence.
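To make the problem concrete, here is a minimal sketch of what goes wrong when the expected result is rebuilt from a list, and how passing the dtype along avoids it. It uses the public pd.Categorical constructor (which already accepts dtype) rather than _constructor_from_sequence, so it only illustrates the idea and is not the proposed interface itself. The variable names are made up for the example.

```python
import pandas as pd

# A categorical with an unobserved category "c".
data = pd.Categorical(["a", "b", "a"], categories=["a", "b", "c"])

# Rebuilding from a plain list infers the categories from the observed
# values only, so the unobserved category is dropped.
from_list = type(data)(list(data))

# Passing the original dtype along keeps the full set of categories.
with_dtype = type(data)(list(data), dtype=data.dtype)

print(list(from_list.categories))
print(list(with_dtype.categories))
print(with_dtype.dtype == data.dtype)
```

A dtype keyword on the sequence constructor would let the base tests build expected results the second way for any extension array, not just Categorical.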

@jorisvandenbossche jorisvandenbossche added the ExtensionArray Extending pandas with custom dtypes or arrays. label Apr 19, 2018
@jorisvandenbossche jorisvandenbossche added this to the 0.23.0 milestone Apr 19, 2018
@TomAugspurger
Contributor

This is kind of a recurrent pattern, so that might indicate we should find a solution for this?

Agreed. This is somewhat similar to _from_factorized, right? There we pass original rather than the dtype.

No thoughts on a solution yet, but a good test may be to parametrize DecimalDtype with a decimal.Context context. Then we could ensure that the correct Context is passed through on these kinds of operations, via the dtype.
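A rough sketch of what such a test could check, written in plain Python so it stands alone. ToyDecimalDtype, ToyDecimalArray and take_with_fill are hypothetical names for this example only; they are not the DecimalDtype / DecimalArray from the pandas test suite, and the _from_sequence signature here is an assumption about how a dtype keyword might look.

```python
import decimal


class ToyDecimalDtype:
    """Hypothetical dtype that carries a decimal.Context as metadata."""

    def __init__(self, context=None):
        self.context = decimal.getcontext() if context is None else context


class ToyDecimalArray:
    """Hypothetical array type; stores Decimals created via the dtype's context."""

    def __init__(self, values, dtype=None):
        self.dtype = ToyDecimalDtype() if dtype is None else dtype
        # create_decimal applies the context's precision and rounding.
        self._data = [self.dtype.context.create_decimal(v) for v in values]

    @classmethod
    def _from_sequence(cls, scalars, dtype=None):
        # The dtype keyword is what lets callers keep the metadata.
        return cls(scalars, dtype=dtype)

    def take_with_fill(self, indices, fill_value):
        # Reconstruct through _from_sequence, forwarding self.dtype.
        picked = [fill_value if i == -1 else self._data[i] for i in indices]
        return type(self)._from_sequence(picked, dtype=self.dtype)


ctx = decimal.Context(prec=3)
arr = ToyDecimalArray(["1.23456", "2.5"], dtype=ToyDecimalDtype(ctx))
result = arr.take_with_fill([1, 0, -1], decimal.Decimal("NaN"))

# The same Context object should come out the other side.
print(result.dtype.context is ctx)
```

If take_with_fill dropped the dtype argument, the result would silently fall back to the default context, which is the kind of metadata loss the Categorical tests run into.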

@jreback jreback modified the milestones: 0.23.0, 0.24.0 Apr 24, 2018
jreback added a commit to jreback/pandas that referenced this issue Jul 4, 2018
jreback added a commit that referenced this issue Jul 20, 2018
* ENH: add integer-na support via an ExtensionArray

closes #20700
closes #20747
@jorisvandenbossche
Member Author

The tests that I mentioned in the original post are still skipped, so we need to check if this can now be resolved with the added dtype argument.

TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Nov 10, 2018