Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEPR: Deprecate pandas.core.datetools #14105

Merged

Conversation

Projects
None yet
4 participants
@gfyoung
Copy link
Member

commented Aug 28, 2016

Title is self-explanatory. Closes #14094.

@codecov-io

This comment has been minimized.

Copy link

commented Aug 28, 2016

Current coverage is 85.26% (diff: 78.26%)

Merging #14105 into master will decrease coverage by <.01%

@@             master     #14105   diff @@
==========================================
  Files           139        140     +1   
  Lines         50562      50604    +42   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits          43116      43149    +33   
- Misses         7446       7455     +9   
  Partials          0          0          

Powered by Codecov. Last update 59524af...2bb887d

@jorisvandenbossche

This comment has been minimized.

Copy link
Member

commented Aug 28, 2016

Can you leave out the changes in the benchmarks? (Avoiding conflicts with my pr, i will incorporate changes there)

@gfyoung

This comment has been minimized.

Copy link
Member Author

commented Aug 28, 2016

@jorisvandenbossche : Ah, I didn't see #14099. Sure thing. Once I get Travis to pass, I'll remove the commit. In the meantime, I'll add a reminder to yours.

@gfyoung gfyoung force-pushed the forking-repos:datetools-refactor branch Aug 28, 2016

@jreback

View changes

pandas/api/tests/test_api.py Outdated
deprecated_classes_in_future = ['Term', 'Panel']

# these should be removed from top-level namespace
remove_classes_from_top_level_namespace = ['Expr']

# external modules exposed in pandas namespace
modules = ['np', 'datetime', 'datetools']
modules = ['np', 'datetime', 'sys', 'warnings']

This comment has been minimized.

Copy link
@jreback

jreback Aug 28, 2016

Contributor

no don't add

This comment has been minimized.

Copy link
@gfyoung

gfyoung Aug 28, 2016

Author Member

Saying "don't add" doesn't help since I needed to add them to avoid test failures.

This comment has been minimized.

Copy link
@jreback

jreback Aug 28, 2016

Contributor

I already told u this on the issue

you are polluting the namespace

This comment has been minimized.

Copy link
@gfyoung

gfyoung Aug 28, 2016

Author Member

Sure, but you provided no solution to not do it given my situation. Saying "don't do something" with no suggestion of a solution is not helpful whatsoever.

This comment has been minimized.

Copy link
@jreback

jreback Aug 28, 2016

Contributor

you are writing the pr
I am telling you that this is wrong
that is helpful
it's up to you fix it
if you see later I do tell you how though
by using a separate module

This comment has been minimized.

Copy link
@gfyoung

gfyoung Aug 28, 2016

Author Member

It's not helpful because I don't know how to proceed with it. Sure, you did provide comments below that provide further guidance, but in isolation, "fixing it" is far from clear.

@jreback

View changes

pandas/core/api.py Outdated
# legacy
import pandas.core.datetools as datetools

class _DatetoolsModule(object):

This comment has been minimized.

Copy link
@jreback

jreback Aug 28, 2016

Contributor

doesn't belong here

This comment has been minimized.

Copy link
@gfyoung

gfyoung Aug 28, 2016

Author Member

Well, where does it go then?

This comment has been minimized.

Copy link
@jreback

jreback Aug 28, 2016

Contributor

make this a bit more generic and put in pandas.util

@jreback

View changes

pandas/tseries/offsets.py Outdated

cday = CDay()
customBusinessMonthEnd = CBMonthEnd()
customBusinessDay = CustomBusinessDay()

This comment has been minimized.

Copy link
@jreback

jreback Aug 28, 2016

Contributor

no don't add these

This comment has been minimized.

Copy link
@gfyoung

gfyoung Aug 28, 2016

Author Member

What do you mean? They're littered all over the code base. The tests actually make good use of these objects.

This comment has been minimized.

Copy link
@jreback

jreback Aug 28, 2016

Contributor

they need to be removed
AND these need specific function depreciations as I said

these are non pythonic

This comment has been minimized.

Copy link
@gfyoung

gfyoung Aug 28, 2016

Author Member

Read your comments again in the issue. You did not have a similar level of conviction.

This comment has been minimized.

Copy link
@jreback

jreback Aug 28, 2016

Contributor

you wholesale added these back I never said that
I just want them deprecated
I don't say this was easy

@gfyoung gfyoung force-pushed the forking-repos:datetools-refactor branch Aug 28, 2016

@jreback jreback added the Deprecate label Aug 28, 2016

@gfyoung gfyoung force-pushed the forking-repos:datetools-refactor branch 2 times, most recently Aug 28, 2016

@gfyoung

This comment has been minimized.

Copy link
Member Author

commented Sep 1, 2016

@jreback , @jorisvandenbossche : Travis is passing. Ready to merge if there are no other concerns.

@jreback

View changes

pandas/util/depr_module.py Outdated
self.removals = removals

def __getattr__(self, name):
if name in dir(self.__class__):

This comment has been minimized.

Copy link
@jreback

jreback Sep 1, 2016

Contributor

set this on creation as this is a list so checking will be expensive (make it a set)

This comment has been minimized.

Copy link
@gfyoung

gfyoung Sep 1, 2016

Author Member

Let's upgrade to a frozenset. Done.

@jreback

View changes

pandas/util/depr_module.py Outdated
def __init__(self, deprmod, alts=None, removals=None):
self.deprmod = deprmod
self.alts = alts
self.removals = removals

This comment has been minimized.

Copy link
@jreback

jreback Sep 1, 2016

Contributor

make these sets

This comment has been minimized.

Copy link
@gfyoung

gfyoung Sep 1, 2016

Author Member

As before, let's upgrade to a frozenset. Done.

_alts = ['pandas.tseries.tools', 'pandas.tseries.offsets',
'pandas.tseries.frequencies']
_removals = ['day', 'bday', 'businessDay', 'cday', 'customBusinessDay',
'customBusinessMonthEnd', 'customBusinessMonthBegin',

This comment has been minimized.

Copy link
@jreback

jreback Sep 1, 2016

Contributor

hmm can't u just introspect the module when u r creating _DeprecatedModule? iow look at all of the symbols the dir() pulls up and use them? rather than listing them all here (just seems simpler / less error prone)

This comment has been minimized.

Copy link
@gfyoung

gfyoung Sep 1, 2016

Author Member

I tried, but I don't think sys.modules picked it up when I removed the import.


if self.alts is not None:
for modname in self.alts:
module = importlib.import_module(modname)

This comment has been minimized.

Copy link
@jreback

jreback Sep 1, 2016

Contributor

I think need to keep track of what's imported here so don't have to iterate thru these each time?

This comment has been minimized.

Copy link
@gfyoung

gfyoung Sep 1, 2016

Author Member

The imports are cached, so the iterating should have no effect I would think.

@gfyoung gfyoung force-pushed the forking-repos:datetools-refactor branch Sep 1, 2016

@gfyoung

This comment has been minimized.

Copy link
Member Author

commented Sep 1, 2016

@jreback , @jorisvandenbossche : Travis is passing. Ready to merge if there are no other concerns.

@jreback

This comment has been minimized.

Copy link
Contributor

commented Sep 2, 2016

need to add __dir__ to _DeprecatedModule to get something reasonable, IOW the list of removals should be there and the namespace imported from the alts.

In [11]: pd.datetools.alts
Out[11]: 
frozenset({'pandas.tseries.frequencies',
           'pandas.tseries.offsets',
           'pandas.tseries.tools'})

In [12]: pd.datetools.removals
Out[12]: 
frozenset({'bday',
           'bmonthBegin',
           'bmonthEnd',
           'bquarterEnd',
           'businessDay',
           'byearEnd',
           'cbmonthBegin',
           'cbmonthEnd',
           'cday',
           'customBusinessDay',
           'customBusinessMonthBegin',
           'customBusinessMonthEnd',
           'day',
           'monthEnd',
           'quarterEnd',
           'week',
           'yearBegin',
           'yearEnd'})
@jreback

This comment has been minimized.

Copy link
Contributor

commented Sep 2, 2016

further I am not sure this actually works

In [1]: dir(pandas.datetools)
Out[1]: 
['__class__',
 '__delattr__',
 '__dict__',
 '__doc__',
 '__format__',
 '__getattr__',
 '__getattribute__',
 '__hash__',
 '__init__',
 '__module__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'alts',
 'deprmod',
 'removals',
 'self_dir']

In [17]: from pandas.datetools import to_datetime
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-17-014abe5066ec> in <module>()
----> 1 from pandas.datetools import to_datetime

ImportError: No module named datetools

something is wrong

@jreback

This comment has been minimized.

Copy link
Contributor

commented Sep 2, 2016

pls add some tests to verify the actual deprecations themselves (e.g. that you can import the previous things, or a sample of them)

@gfyoung

This comment has been minimized.

Copy link
Member Author

commented Sep 2, 2016

@jreback : Have you tried running from pandas.datetools import to_datetime on master? It doesn't work either. I don't think that's a problem.

@gfyoung

This comment has been minimized.

Copy link
Member Author

commented Sep 2, 2016

@jreback : I don't want to overload __dir__ as you described because then I can't differentiate methods that apart of the class itself and methods that are meant to be in removals OR alts. That's the purpose of my first check in __getattr__.

@gfyoung

This comment has been minimized.

Copy link
Member Author

commented Sep 2, 2016

@jreback : Don't fully understand your comments about testing. Aren't I doing that already?

@jreback

This comment has been minimized.

Copy link
Contributor

commented Sep 2, 2016

In [2]: pd.__version__
Out[2]: '0.18.1+425.gd26363b'

In [3]: dir(pd.datetools)[0:10]
Out[3]: 
['ABCDataFrame',
 'ABCIndexClass',
 'ABCSeries',
 'AmbiguousTimeError',
 'BDay',
 'BMonthBegin',
 'BMonthEnd',
 'BQuarterBegin',
 'BQuarterEnd',
 'BYearBegin']

@jreback

This comment has been minimized.

Copy link
Contributor

commented Sep 2, 2016

@gfyoung you can easily override __dir__, you know whats in alt/ removals (well you know as soon as you introspect them), which you can do lazily, e.g. when its needed

@gfyoung

This comment has been minimized.

Copy link
Member Author

commented Sep 2, 2016

@jreback : Ah, that's a fair point. I can just first check if it's in dir and then introspect.

@jreback

This comment has been minimized.

Copy link
Contributor

commented Sep 2, 2016

yep. The ideal thing is to replicate as much as possible the existing behavior (and just show a depreaction warning).

@gfyoung

This comment has been minimized.

Copy link
Member Author

commented Sep 6, 2016

@jorisvandenbossche : That isn't a property frozenset, but rather one of set in particular. set is unordered by definition. However, your point about the "wrong" import being used is indicative of namespace pollution within each module.

What would you suggest I do then?

@jorisvandenbossche

This comment has been minimized.

Copy link
Member

commented Sep 6, 2016

I didn't want to imply it was a consequence of the 'frozen' aspect of the sets :-). But initially you used a list I think? And performance was the reason to change it to set? Simply using a list would keep the preferred order.
Personally, my preference in this case would go for correctness rather than performance (although what is correct is also debatable ... as you noted correctly the polluted namespaces between frequencies and offsets), alts is in this case a list/set of 3 elements, the in check for that will not be that critical here IMO

@jorisvandenbossche jorisvandenbossche added this to the 0.19.0rc milestone Sep 6, 2016

@gfyoung

This comment has been minimized.

Copy link
Member Author

commented Sep 6, 2016

@jorisvandenbossche : It was a general performance consideration that @jreback brought up. I originally used a list, but then he pointed out that set is faster.

While I do see your point about the imports and how they should come from specific places to be consistent with documentation, IMO the code is correct as is and shouldn't have to tailor to the namespace pollution that I pointed out.

@jreback

This comment has been minimized.

Copy link
Contributor

commented Sep 6, 2016

@gfyoung its a bit more work, but you can figure out where the imports are actually from, e.g. you can actually do the from pandas.tseries.tools import * (for each of the removals), then creating a mapping from the attr to the import. I know its a pain, but I think its necessary.

I actually did something like this (for some code I am working on which is old). I know where things are, and it still took some time / trial and error to figure out the correct imports :)

not to mention the monthEnd = MonthEnd() is really odd (though it IS kind of like a singelton)

@gfyoung

This comment has been minimized.

Copy link
Member Author

commented Sep 6, 2016

@jreback : from <tseries module> import * won't help since it still will import all other namespaces that are polluting it IINM.

@jorisvandenbossche

This comment has been minimized.

Copy link
Member

commented Sep 6, 2016

I know it is not generic, but just using a list for alts instead of set works for this case. I don't see the need to make it more complicated than that.

@jorisvandenbossche

This comment has been minimized.

Copy link
Member

commented Sep 6, 2016

Actually, can't we get the information from the object itself? (of course in this case where we want the full path it will give the right thing (frequencies or offsets), it will also not work generically for all imports where more top-level paths are used).

In [23]: getattr(pd.dateools, 'BDay')
Out[23]: pandas.tseries.offsets.BusinessDay

gives you that it should be 'offsets' and not frequencies

@gfyoung

This comment has been minimized.

Copy link
Member Author

commented Sep 6, 2016

@jorisvandenbossche : How do you extract the full class path from this?

>>> getattr(pd.dateools, 'BDay')
<class 'pandas.tseries.offsets.BusinessDay'>

Perhaps there's an obvious way, but I don't see one.

If there isn't, then we might need to switch back to a list or the OrderedSet cookbook here 😄

@jreback : Thoughts?

@jorisvandenbossche

This comment has been minimized.

Copy link
Member

commented Sep 7, 2016

__module__ gives me the path:

In [40]: obj = getattr(pd.datetools, 'BDay')

In [41]: obj.__module__
Out[41]: 'pandas.tseries.offsets'
@gfyoung

This comment has been minimized.

Copy link
Member Author

commented Sep 7, 2016

@jorisvandenbossche : That was indeed an obvious way. Completely escaped my mind. 😄

@jorisvandenbossche

This comment has been minimized.

Copy link
Member

commented Sep 7, 2016

I am going to merge this, we can fix the actual depr warnings with correct modules in a follow-up PR, but then at least the deprecations are included in the rc

@jorisvandenbossche jorisvandenbossche merged commit 3f3839b into pandas-dev:master Sep 7, 2016

3 checks passed

codecov/patch 78.26% of diff hit (target 50.00%)
Details
codecov/project 85.26% (target 82.00%)
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@jorisvandenbossche jorisvandenbossche modified the milestones: 0.19.0rc, 0.19.0 Sep 7, 2016

@gfyoung gfyoung deleted the forking-repos:datetools-refactor branch Sep 7, 2016

gfyoung added a commit to forking-repos/pandas that referenced this pull request Sep 9, 2016

MAINT: Use __module__ in _DeprecatedModule.
Follow-up to pandas-devgh-14105. Uses the '__module__' method
to correctly determine the location of the alternative
module to use.

trbs added a commit to trbs/pandas that referenced this pull request Sep 12, 2016

Merge github.com:pydata/pandas into issue_9084_get_schema_index_param…
…eter

* github.com:pydata/pandas: (554 commits)
  BUG: compat with Stata ver 111
  Fix: F999 dictionary key '2000q4' repeated with different values (pandas-dev#14198)
  BLD: Test for Python 3.5 with C locale
  BUG: DatetimeTZBlock can't assign values near dst boundary
  BUG: union_categorical with Series and cat idx
  BUG: fix str.contains for series containing only nan values
  BUG: Categorical constructor not idempotent with ext dtype
  TST: Make encoded sep check more locale sensitive (pandas-dev#14161)
  DOC: minor typo in 0.19.0 whatsnew file (pandas-dev#14185)
  BUG: fix tz-aware datetime convert to DatetimeIndex (GH 14088)
  BUG : bug in setting a slice of a Series with a np.timedelta64
  RLS: v0.19.0rc1
  DOC: clean-up 0.19.0 whatsnew file (pandas-dev#14176)
  DOC: cleanup build warnings (pandas-dev#14172)
  Add steps to run gbq integration testing to the contributing docs (pandas-dev#14144)
  ENH: concat and append now can handle unordered categories (pandas-dev#13767)
  DEPR: Deprecate pandas.core.datetools (pandas-dev#14105)
  API/DEPR: Remove +/- as setops for DatetimeIndex/PeriodIndex (GH9630) (pandas-dev#14164)
  Fix trivial typo in comment (pandas-dev#14174)
  API/DEPR: Remove +/- as setops for Index (GH8227) (pandas-dev#14127)
  ...

trbs added a commit to trbs/pandas that referenced this pull request Sep 12, 2016

Merge github.com:pydata/pandas into to_sql_secondary_indexes
* github.com:pydata/pandas: (554 commits)
  BUG: compat with Stata ver 111
  Fix: F999 dictionary key '2000q4' repeated with different values (pandas-dev#14198)
  BLD: Test for Python 3.5 with C locale
  BUG: DatetimeTZBlock can't assign values near dst boundary
  BUG: union_categorical with Series and cat idx
  BUG: fix str.contains for series containing only nan values
  BUG: Categorical constructor not idempotent with ext dtype
  TST: Make encoded sep check more locale sensitive (pandas-dev#14161)
  DOC: minor typo in 0.19.0 whatsnew file (pandas-dev#14185)
  BUG: fix tz-aware datetime convert to DatetimeIndex (GH 14088)
  BUG : bug in setting a slice of a Series with a np.timedelta64
  RLS: v0.19.0rc1
  DOC: clean-up 0.19.0 whatsnew file (pandas-dev#14176)
  DOC: cleanup build warnings (pandas-dev#14172)
  Add steps to run gbq integration testing to the contributing docs (pandas-dev#14144)
  ENH: concat and append now can handle unordered categories (pandas-dev#13767)
  DEPR: Deprecate pandas.core.datetools (pandas-dev#14105)
  API/DEPR: Remove +/- as setops for DatetimeIndex/PeriodIndex (GH9630) (pandas-dev#14164)
  Fix trivial typo in comment (pandas-dev#14174)
  API/DEPR: Remove +/- as setops for Index (GH8227) (pandas-dev#14127)
  ...

jorisvandenbossche added a commit that referenced this pull request Sep 14, 2016

MAINT: Use __module__ in _DeprecatedModule. (#14181)
Follow-up to gh-14105. Uses the '__module__' method
to correctly determine the location of the alternative
module to use.

timkpaine added a commit to timkpaine/mdf that referenced this pull request May 7, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.