Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEPR: Deprecate generic timestamp dtypes #15987

Merged
merged 2 commits into from Apr 14, 2017

Conversation

Projects
None yet
3 participants
@gfyoung
Copy link
Member

commented Apr 12, 2017

We only use the nanosecond frequency, and numpy doesn't even handle generic timestamp dtype well internally with its ndarray object.

xref #15524 (comment).

@gfyoung gfyoung force-pushed the forking-repos:timestamp64-no-freq branch from 910bcbc to 3ec69e8 Apr 12, 2017

dtype = np.timedelta64
s = Series([], dtype=dtype)

self.assertTrue(s.empty)

This comment has been minimized.

Copy link
@jreback

jreback Apr 12, 2017

Contributor

use new pytest when you can, e.g.

assert s.empty
assert s.dtype == 'm8[ns]'

This comment has been minimized.

Copy link
@gfyoung

gfyoung Apr 12, 2017

Author Member

Sure thing. Done.

This comment has been minimized.

Copy link
@gfyoung

gfyoung Apr 12, 2017

Author Member

As a side note, not really a fan of having to rewrite every single test comparison just to be more "idiomatic" with the existing test standard.

Why not write our own in pandas.util.testing that we can implement according to idiom and call from everywhere else?

e.g.

def assert_true(val):
    assert val

def assert_equal(a, b):
    assert a == b

The idiom is thus abstracted away from our tests, and we don't have to worry about it when writing test cases.

This comment has been minimized.

Copy link
@gfyoung

gfyoung Apr 12, 2017

Author Member

Actually, as I write this, I see that we do have one of these functions already:

assert_equal(a, b, msg="")

I think from now on, we should advocate that people use our static util.testing functions (and avoid using self.assert... or assert keyword).

def test_astype_empty_constructor_equality(self):
# see gh-15524

for dtype in np.typecodes['All']:

This comment has been minimized.

Copy link
@jreback

jreback Apr 12, 2017

Contributor

paramerize when you can on new tests (you may need to create a new class that inherits from object and/or write as bare functions)

This comment has been minimized.

Copy link
@gfyoung

gfyoung Apr 12, 2017

Author Member

unittest classes and pytest decorators do not mesh well together based on experience. I can't find anything so far in documentation on the pytest to do so. Unless you know of a straightforward way to do this, I'm leaving as is.

This comment has been minimized.

Copy link
@gfyoung

gfyoung Apr 12, 2017

Author Member

Barring some native pytest way to implement, a test decorator is the only way to do this (similar to what was done with capturing stdout and stderr). This is probably preferable since it would be independent of testing platform.

However, I would prefer to leave that as a follow-up so I can experiment separately.

This comment has been minimized.

This comment has been minimized.

Copy link
@gfyoung

gfyoung Apr 12, 2017

Author Member

Those docs don't answer my question. Does not explain how to parameterize a test method within tm.TestCase. As I already mentioned, unittest and pytest don't mesh well.

This comment has been minimized.

Copy link
@gfyoung

gfyoung Apr 12, 2017

Author Member

I'm not saying that I don't disagree about parameterizing. However, I think it is preferable and easier to implement our own parameterizaton decorator (just as we did with capture), that is all.

This comment has been minimized.

Copy link
@jreback

jreback Apr 12, 2017

Contributor

i am trying to shrink the codebase
implementing parameteize is not trivial and would need a whole test suite - it diesnt add any value

This comment has been minimized.

Copy link
@gfyoung

gfyoung Apr 13, 2017

Author Member

Yes but neither is incorporating pytest's builtin paramterize without adding new code (i.e. an entirely new class). The only way to avoid that is to write this is as a bare function (not a test method of tm.TestCase), but I prefer the test class organization if possible.

This comment has been minimized.

Copy link
@jreback

jreback Apr 13, 2017

Contributor

you can just make another class inherit from object AS i show in the example

This comment has been minimized.

Copy link
@gfyoung

gfyoung Apr 13, 2017

Author Member

There actually is no example where you create a class that inherits from object. However, that being said, having been able to sit down and re-examine the class suggestion, I do see what you mean now.

Several things can come from this discussion:

  1. Document how you can instantiate test classes that inherit from object (and not tm.TestCase). I still prefer writing test classes instead of raw functions for organizational reasons, though it will still follow pytest idiom.

  2. Make a new PR that modifies the changes in #15952 to use pytest fixtures for redirecting stdout and stdin (that will take substantial work though).

  3. Modify the existing test class in this file to inherit only from object (new commit).

# numpy arrays don't handle generic timestamp dtypes well. Since
# we only use the nanosecond frequency, interpret generic timestamp
# dtypes as nanosecond frequency.
if dtype.name == "datetime64":

This comment has been minimized.

Copy link
@jreback

jreback Apr 12, 2017

Contributor

this is way too hacky to put this here, this should be trapped in maybe_cast_to_datetime, or if not, then let's create a generic way (via function there to do this).

This comment has been minimized.

Copy link
@gfyoung

gfyoung Apr 12, 2017

Author Member

I would agree, except that when we call .astype, maybe_cast_to_datetime does not get called. But certainly, there are other places that casting can go.

@gfyoung gfyoung force-pushed the forking-repos:timestamp64-no-freq branch from 3ec69e8 to d581e3e Apr 12, 2017

@codecov

This comment has been minimized.

Copy link

commented Apr 12, 2017

Codecov Report

Merging #15987 into master will decrease coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #15987      +/-   ##
==========================================
- Coverage   91.03%   91.01%   -0.02%     
==========================================
  Files         145      145              
  Lines       49587    49585       -2     
==========================================
- Hits        45141    45132       -9     
- Misses       4446     4453       +7
Flag Coverage Δ
#multiple 88.78% <100%> (-0.02%) ⬇️
#single 40.57% <33.33%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/internals.py 93.55% <100%> (ø) ⬆️
pandas/types/cast.py 86.51% <100%> (+0.77%) ⬆️
pandas/tools/plotting.py 71.75% <0%> (-0.72%) ⬇️
pandas/core/common.py 91.03% <0%> (+0.34%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7b8a6b1...d581e3e. Read the comment docs.

@codecov

This comment has been minimized.

Copy link

commented Apr 12, 2017

Codecov Report

Merging #15987 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #15987      +/-   ##
==========================================
+ Coverage      91%   91.01%   +<.01%     
==========================================
  Files         145      145              
  Lines       49639    49649      +10     
==========================================
+ Hits        45175    45188      +13     
+ Misses       4464     4461       -3
Flag Coverage Δ
#multiple 88.78% <100%> (ø) ⬆️
#single 40.53% <25%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/types/cast.py 86.14% <100%> (+1.02%) ⬆️
pandas/core/common.py 90.68% <0%> (-0.35%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cd35d22...eb4df87. Read the comment docs.

@gfyoung gfyoung force-pushed the forking-repos:timestamp64-no-freq branch 3 times, most recently from 172f94d to 7cfa4d5 Apr 13, 2017

@gfyoung

This comment has been minimized.

Copy link
Member Author

commented Apr 13, 2017

@jreback : All comments have been addressed, and everything is green.

@jorisvandenbossche

This comment has been minimized.

Copy link
Member

commented Apr 13, 2017

In the issue, the discussion was about construction of empty series objects, while here you also added constructing/astyping actual data with non-parameterized np.datetime64. As I said in the issue (#15524 (comment), so my comment was relevant after all :-)), I don't see a strong reason to allow this.

@jorisvandenbossche

This comment has been minimized.

Copy link
Member

commented Apr 13, 2017

@gfyoung I see now you replied in the issue:

Those examples you bring up above, should we follow in numpy 's footsteps? We only use one frequency, so I don't see why those two examples wouldn't work.

There is maybe not a strong reason that they shouldn't work, but I also don't see a good reason that they should work. It just expands what users can do in an unnecessary way. Why is it needed that people can do .astype(np.datetime) while they already can do .astype('datetime64[ns]') which is just more explicit (and does not 'misuse' the numpy dtype by interpreting it differently as numpy intended).

@gfyoung

This comment has been minimized.

Copy link
Member Author

commented Apr 13, 2017

@jorisvandenbossche : Well, it has to go one way or the other. Either you can astype and initialize with the dtype (empty or non-empty), or you can't do either. The consensus with @jreback was that you should be able to do both. The point of this PR was to make behavior consistent.

I don't have any major issues going the other way and preventing anyone from casting to the generic dtype (i.e. np.datetime64 or np.timedelta64) via astype or the constructor (though personally, having this convenience is nice because I sometimes forget how to format / exactly specify the frequency). However, some agreement on how to proceed with this would be good.

@jreback

This comment has been minimized.

Copy link
Contributor

commented Apr 13, 2017

I am ok with this (still have to review the change again) as we already support this.

In [4]: Series([]).astype(np.datetime64)
Out[4]: Series([], dtype: datetime64[ns])

this should be consistent with direct construction (via dtype).

@jreback

This comment has been minimized.

Copy link
Contributor

commented Apr 13, 2017

On second though, we could deprecate accepting direct np.datetime64 as a passed dtype (this doesn't affect accept an array already in this format, just the dtype= kw and .astype())

So the principal of construction / conversion should work with the same type will be preserved.

I think I do like being more explicit here (IOW doing the deprecation). @gfyoung can you see what this breaks?

IIRC this was originally an issue to fix something in csv parsing. But this was an orthogonal change (I don't even really understand how this change was related, maybe you can show that?)

@jorisvandenbossche

This comment has been minimized.

Copy link
Member

commented Apr 13, 2017

(this doesn't affect accept an array already in this format, just the dtype= kw and .astype())

And for dtype= it already fails:

In [5]: Series([], dtype=np.datetime64)
...
TypeError: cannot convert datetimelike to dtype [datetime64]

so it is only astype that would change.

If possible, I would be in favor of not allowing bare np.datetime64 (and thus deprecating the astype case)

@jreback

This comment has been minimized.

Copy link
Contributor

commented Apr 13, 2017

@jorisvandenbossche yep its the .astype that works now, not the direct construction. So let's investigate deprecating directly specifying np.datetime64, np.timedelta64 in .astype and see what happens.

@gfyoung

This comment has been minimized.

Copy link
Member Author

commented Apr 13, 2017

@jreback , @jorisvandenbossche : That sounds like a good plan. Shouldn't be too bad I imagine.

@jorisvandenbossche BTW, the reason for that weird error message that you saw in the issue here is because at some point, we call arr.view(dtype), which when dtype=np.datetime64 causes numpy to get angry (internal representation breaks), and it can't display a proper error message.

@gfyoung gfyoung force-pushed the forking-repos:timestamp64-no-freq branch from 7cfa4d5 to 5a5c1e6 Apr 13, 2017

@gfyoung gfyoung changed the title ENH: Handle generic timestamp dtypes with Series DEPR: Deprecate generic timestamp dtypes Apr 13, 2017

@gfyoung gfyoung force-pushed the forking-repos:timestamp64-no-freq branch from 5a5c1e6 to 7453b63 Apr 13, 2017

@gfyoung

This comment has been minimized.

Copy link
Member Author

commented Apr 13, 2017

Deprecation worked out reasonably well, but I can't really figure out why AppVeyor and Circle are failing my tests. The Travis logs are clean (and passing), so I don't think that there's any hidden warning being issued that's causing AppVeyor and Circle to miss them.

Suggestions on how to fix this?

@jorisvandenbossche
Copy link
Member

left a comment

Don't directly see what could be going on with circle ci and appveyor.

"deprecated and will raise in a future version. "
"Please pass in '{dtype}[ns]' instead.")
warnings.warn(msg.format(dtype=dtype.name),
FutureWarning, stacklevel=2)

This comment has been minimized.

Copy link
@jorisvandenbossche

jorisvandenbossche Apr 13, 2017

Member

I think this stacklevel should be 8

This comment has been minimized.

Copy link
@gfyoung

gfyoung Apr 13, 2017

Author Member

I think you meant 5? (same as the one beneath). Done.

This comment has been minimized.

Copy link
@jorisvandenbossche

jorisvandenbossche Apr 14, 2017

Member

No, I think this should be 8 (at least, that's what I counted). But you didn't remove the stacklevel=False, so this does not fail the tests currently

if dtype.name in ('datetime64', 'datetime64[ns]'):
if dtype.name == 'datetime64':
warnings.warn(msg.format(dtype=dtype.name),
FutureWarning, stacklevel=2)

This comment has been minimized.

Copy link
@jorisvandenbossche

jorisvandenbossche Apr 13, 2017

Member

and this one 5

This comment has been minimized.

Copy link
@gfyoung

gfyoung Apr 13, 2017

Author Member

Done.

if dtype.name in ('timedelta64', 'timedelta64[ns]'):
if dtype.name == 'timedelta64':
warnings.warn(msg.format(dtype=dtype.name),
FutureWarning, stacklevel=2)

This comment has been minimized.

Copy link
@jorisvandenbossche

jorisvandenbossche Apr 13, 2017

Member

and this one also 5

This comment has been minimized.

Copy link
@gfyoung

gfyoung Apr 13, 2017

Author Member

Done.

data = [1]

with tm.assert_produces_warning(FutureWarning,
check_stacklevel=False):

This comment has been minimized.

Copy link
@jorisvandenbossche

jorisvandenbossche Apr 13, 2017

Member

Can you try to remove the check_stacklevel here? (btw, easiest to try out what it should be is the set warnings to raise an error, and then you can count the number of stacks in the traceback)

This comment has been minimized.

Copy link
@gfyoung

gfyoung Apr 13, 2017

Author Member

Yep, that's fair. Done.

This comment has been minimized.

Copy link
@jorisvandenbossche

jorisvandenbossche Apr 14, 2017

Member

It doesn't look like you removed this?

warning_type = FutureWarning

with tm.assert_produces_warning(warning_type,
check_stacklevel=False):

This comment has been minimized.

Copy link
@jorisvandenbossche

jorisvandenbossche Apr 13, 2017

Member

If it may help, I don't think it is necessarily needed to assert the warning here (as you already have an explicit test for that above), but you can also just catch warnings so they do not appear in the log (with warnings.catch_warnings(record=True):)

This comment has been minimized.

Copy link
@gfyoung

gfyoung Apr 13, 2017

Author Member

Makes sense. Done.

@gfyoung gfyoung force-pushed the forking-repos:timestamp64-no-freq branch 2 times, most recently from 282fd39 to 6445606 Apr 13, 2017

@jsexauer jsexauer referenced this pull request Apr 13, 2017

Open

DEPR: deprecations from prior versions #6581

0 of 94 tasks complete
@jreback

This comment has been minimized.

Copy link
Contributor

commented Apr 13, 2017

lgtm. I see you 'fixed' the passing np.datetime64 to the constructor (and then deprecated it), but that's fine.

ping on green.

@gfyoung

This comment has been minimized.

Copy link
Member Author

commented Apr 13, 2017

Sounds good...once I can figure out these warnings 😁

gfyoung added some commits Apr 12, 2017

DEPR: Deprecate generic timestamp dtypes
We only use the nanosecond frequency, and numpy
doesn't even handle generic timestamp dtypes well.

xref gh-15524 (comment).

@gfyoung gfyoung force-pushed the forking-repos:timestamp64-no-freq branch from 6445606 to eb4df87 Apr 14, 2017

@gfyoung

This comment has been minimized.

Copy link
Member Author

commented Apr 14, 2017

@jreback : Everything is green again!

@jreback jreback added this to the 0.20.0 milestone Apr 14, 2017

@jreback jreback merged commit ebc0c09 into pandas-dev:master Apr 14, 2017

5 checks passed

ci/circleci Your tests passed on CircleCI!
Details
codecov/patch 100% of diff hit (target 50%)
Details
codecov/project 91.01% (target 82%)
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@jreback

This comment has been minimized.

Copy link
Contributor

commented Apr 14, 2017

thanks!

@gfyoung gfyoung deleted the forking-repos:timestamp64-no-freq branch Apr 14, 2017

linebp added a commit to linebp/pandas that referenced this pull request Apr 17, 2017

DEPR: Deprecate generic timestamp dtypes (pandas-dev#15987)
* DEPR: Deprecate generic timestamp dtypes

We only use the nanosecond frequency, and numpy
doesn't even handle generic timestamp dtypes well.

xref pandas-devgh-15524 (comment).

* TST: Use pytest idioms in series/test_dtypes.py

gfyoung added a commit to forking-repos/pandas that referenced this pull request Oct 28, 2018

API: Disallow dtypes w/o frequency when casting
Previously deprecated for Series constructor
and the `.astype` method. Now being enforced.

xref pandas-devgh-15987.

@jreback jreback referenced this pull request Oct 28, 2018

Open

DEPR: deprecations log for removed issues #13777

135 of 135 tasks complete

gfyoung added a commit to forking-repos/pandas that referenced this pull request Oct 28, 2018

API: Disallow dtypes w/o frequency when casting
Previously deprecated for Series constructor
and the `.astype` method. Now being enforced.

xref pandas-devgh-15987.

gfyoung added a commit to forking-repos/pandas that referenced this pull request Oct 28, 2018

API: Disallow dtypes w/o frequency when casting
Previously deprecated for Series constructor
and the `.astype` method. Now being enforced.

xref pandas-devgh-15987.

jreback added a commit that referenced this pull request Oct 28, 2018

API: Disallow dtypes w/o frequency when casting (#23392)
Previously deprecated for Series constructor
and the `.astype` method. Now being enforced.

xref gh-15987.

brute4s99 added a commit to brute4s99/pandas that referenced this pull request Nov 19, 2018

API: Disallow dtypes w/o frequency when casting (pandas-dev#23392)
Previously deprecated for Series constructor
and the `.astype` method. Now being enforced.

xref pandas-devgh-15987.

Pingviinituutti added a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

API: Disallow dtypes w/o frequency when casting (pandas-dev#23392)
Previously deprecated for Series constructor
and the `.astype` method. Now being enforced.

xref pandas-devgh-15987.

Pingviinituutti added a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

API: Disallow dtypes w/o frequency when casting (pandas-dev#23392)
Previously deprecated for Series constructor
and the `.astype` method. Now being enforced.

xref pandas-devgh-15987.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.