Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BLD: consolidate remaining extensions #15537

Closed
wants to merge 1 commit into from

Conversation

Projects
None yet
6 participants
@jreback
Copy link
Contributor

commented Mar 1, 2017

moves extensions to pandas/libs, which holds the extension code and also the generated builds (as its importable). pandas/src is now almost an includes dir, holding low-frequency changing code.

This consolidates the import process making it more uniform and consistent throughout the codebase.

Finally this cleans up the remaining top-level namespace (with some deprecations in place for example pandas.lib, pandas.tslib, pandas.json, pandas.parser. I listed all of the changes in the whatsnew, but I don't think worthwhile deprecating anything else.

@jreback jreback added the Build label Mar 1, 2017

@jreback jreback added this to the 0.20.0 milestone Mar 1, 2017

@jreback jreback force-pushed the jreback:extensions3 branch from 679a3ce to 515a1f7 Mar 1, 2017

@chris-b1

This comment has been minimized.

Copy link
Contributor

commented Mar 1, 2017

I'd suggest deprecating at least tslib. I suspect there are some things like from pandas.tslib import Timestamp in the wild.

@jreback jreback force-pushed the jreback:extensions3 branch 3 times, most recently from df4f898 to 407a415 Mar 1, 2017

@jreback

This comment has been minimized.

Copy link
Contributor Author

commented Mar 1, 2017

@chris-b1 added

@jsexauer jsexauer referenced this pull request Mar 1, 2017

Open

DEPR: deprecations from prior versions #6581

0 of 94 tasks complete

@jreback jreback added the Deprecate label Mar 1, 2017

@jreback

This comment has been minimized.

Copy link
Contributor Author

commented Mar 1, 2017

cc @wesm

@codecov-io

This comment has been minimized.

Copy link

commented Mar 1, 2017

Codecov Report

Merging #15537 into master will decrease coverage by -0.07%.
The diff coverage is 95.42%.

@@            Coverage Diff             @@
##           master   #15537      +/-   ##
==========================================
- Coverage   91.06%   90.99%   -0.07%     
==========================================
  Files         137      143       +6     
  Lines       49313    49295      -18     
==========================================
- Hits        44906    44858      -48     
- Misses       4407     4437      +30
Impacted Files Coverage Δ
pandas/io/msgpack/_version.py 100% <ø> (ø)
pandas/compat/pickle_compat.py 68.29% <ø> (ø)
pandas/io/msgpack/exceptions.py 83.33% <ø> (ø)
pandas/core/base.py 95.51% <100%> (ø)
pandas/io/sas/sas7bdat.py 90.66% <100%> (ø)
pandas/tslib.py 100% <100%> (ø)
pandas/tseries/frequencies.py 96.03% <100%> (-0.01%)
pandas/formats/format.py 95.33% <100%> (ø)
pandas/core/nanops.py 98.17% <100%> (ø)
pandas/util/testing.py 81.11% <100%> (ø)
... and 74 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a347ecb...a6d6cfa. Read the comment docs.

@jreback

This comment has been minimized.

Copy link
Contributor Author

commented Mar 1, 2017

cc @shoyer

FYI (I could make this a DeprecationWarning, but those get ignored :<)

(pandas) bash-3.2$ python
Python 3.5.2 |Continuum Analytics, Inc.| (default, Jul  2 2016, 17:52:12) 
[GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas
>>> import xarray
/Users/jreback/miniconda3/envs/pandas/lib/python3.5/site-packages/xarray/conventions.py:9: FutureWarning: The pandas.tslib module is deprecated and will be removed in a future version. Please import from the pandas.libs.tslib instead
  from pandas.tslib import OutOfBoundsDatetime
@shoyer

This comment has been minimized.

Copy link
Member

commented Mar 1, 2017

We used from pandas.tslib import OutOfBoundsDatetime because we needed to catch that specific exception and couldn't find it exposed in the public API anywhere.

It would be nice to move that to public API somewhere, maybe pandas.errors if the top level namespace is getting too crowded.

@jreback jreback referenced this pull request Mar 1, 2017

Closed

API: expose pandas.errors #15541

@jreback jreback force-pushed the jreback:extensions3 branch 4 times, most recently from 81be544 to 6f264e5 Mar 1, 2017

@wesm

This comment has been minimized.

Copy link
Member

commented Mar 2, 2017

This looks fine from superficial glance. Might want to nest src at pandas/libs/src. Sort of a "hear lies all ye compiled code" (eventually headed for Davy Jones's locker once the libpandas games get going. I feel like a broken record but it really is going to happen =) )

@jreback

This comment has been minimized.

Copy link
Contributor Author

commented Mar 2, 2017

@wesm sure that sounds good. yeah that's the idea :>

@jreback jreback force-pushed the jreback:extensions3 branch 2 times, most recently from 5d1d491 to 00a7751 Mar 2, 2017

@jreback

This comment has been minimized.

Copy link
Contributor Author

commented Mar 3, 2017

going to merge on pass.

@jorisvandenbossche

This comment has been minimized.

Copy link
Member

commented Mar 3, 2017

Big +1 on consolidation of where and how the c/cython extensions are organized (and for removing them from the top-level namespace).

Some comments:

  • is there a specific reason libwindow is in core and not libs?
  • the same as I commented in the other PR, IMO we should think about whether we need to recommend users to import from pandas.libs directly (the deprecation warnings now do that).
  • Regarding the indication of 'privateness', something like pandas.tools.libhash looks more public compared to pandas._hash to me (it's not because it is not top-level anymore, that it is not public. We have public non-top-level API). So if we regard that as non-public, I find that not an improvement (that aspect).

I am personally in favour of using more underscores, making it very clear that something is private stuff.

@shoyer

This comment has been minimized.

Copy link
Member

commented Mar 3, 2017

@jreback

This comment has been minimized.

Copy link
Contributor Author

commented Mar 3, 2017

well, underscores do a lot to make this non-readable.

I could make this pandas._libs (IOW, make all extensions private). I prefer to have imports inside pandas code like this

from pandas._libs import tslib, lib
rather than
from pandas.libs import _tslib, _lib

which looks really ugly to me.

@jreback

This comment has been minimized.

Copy link
Contributor Author

commented Mar 3, 2017

to some of @jorisvandenbossche points

  • is there a specific reason libwindow is in core and not libs?

so from a code organization perspective. directly attributable libs make sense to be right next to there respective libs. general libs like lib, algos are centrally located. I did consider window to be with this. But from a dev point of view, editing window.py and then window.pyx right next to each is powerful. It moves the code together.

Similarly for json, sas, and the parser. They are self-contained. Its a tradeoff between dev flexibility and privacy really. I think the lib* naming really helps to differentiate this though and pretty much solves both problems.

  • the same as I commented in the other PR, IMO we should think about whether we need to recommend users to import from pandas.libs directly (the deprecation warnings now do that).

I could do this as pandas._libs as I indicated above. But in reality having underscores all over the place is simply distracting from a dev point of view (and ugly). So could do that, or put a README.

  • Regarding the indication of 'privateness', something like pandas.tools.libhash looks more public compared to pandas._hash to me (it's not because it is not top-level anymore, that it is not public. We have public non-top-level API). So if we regard that as non-public, I find that not an improvement (that aspect).

see comment above. I mean ALL of pandas code is 'public', but that's the point of having pandas.api in the first place anyhow. Sure any code can be used (even _ leading). But I think we need a warning about expectations generally. lib* should never be regarded as public.
I am personally in favour of using more underscores, making it very clear that something is private stuff.

@shoyer

This comment has been minimized.

Copy link
Member

commented Mar 3, 2017

I could make this pandas._libs (IOW, make all extensions private). I prefer to have imports inside pandas code like this

from pandas._libs import tslib, lib
rather than
from pandas.libs import _tslib, _lib

which looks really ugly to me.

Agreed, there's no need to use more than one level of underscores. I also prefer using _libs.* to libs._*.

@jreback

This comment has been minimized.

Copy link
Contributor Author

commented Mar 3, 2017

renamed to _libs. This drops if from the main pd namespace as well, so that's a plus.

@jorisvandenbossche w.r.t. the lib*. I can make those private (but still keep the code in those locations) if desired (prob in a followup though)

@jreback jreback force-pushed the jreback:extensions3 branch from 20b6dbd to 60c0b75 Mar 3, 2017

@jreback jreback changed the title BLD: conslidate remaining extensions BLD: consolidate remaining extensions Mar 4, 2017

@jreback jreback force-pushed the jreback:extensions3 branch 4 times, most recently from bec46a7 to 44be976 Mar 5, 2017

BLD: rename / move some extensions
xref #12588

pandas.parser -> io/libparsers.pyx
pandas.json -> pandas.io.json.libjson
pandas.io.sas.saslib -> libsas
pandas.msgpack -> pandas.io.msgpack
pandas._testing -> pandas.util.libtesting
pandas._sparse -> pandas.sparse.libsparse
pandas._hash -> pandas.tools.libhash
pandas.tslib -> pandas.libs.tslib
pandas.index -> pandas.libs.index
pandas.algos -> pandas.libs.algos
pandas.lib -> pandas.libs.lib
pandas.hashtable -> pandas.libs.hashtable
pandas._window -> pandas.core.libwindow
pandas._join -> pandas.libs.join
move algos*.in, index*.in, hashtable*.in to libs
pandas._period -> pandas.libs.period

@jreback jreback force-pushed the jreback:extensions3 branch from 44be976 to a6d6cfa Mar 7, 2017

@jreback

This comment has been minimized.

Copy link
Contributor Author

commented Mar 7, 2017

bombs away.

@jreback jreback closed this in 648ae4f Mar 7, 2017

jreback added a commit to jreback/pandas that referenced this pull request Mar 8, 2017

jreback added a commit to jreback/pandas that referenced this pull request Mar 8, 2017

jreback added a commit to jreback/pandas that referenced this pull request Mar 8, 2017

jreback added a commit that referenced this pull request Mar 8, 2017

AnkurDedania added a commit to AnkurDedania/pandas that referenced this pull request Mar 21, 2017

BLD: consolidate remaining extensions
moves extensions to pandas/_libs, which holds the extension code
and also the generated builds (as its importable).

pandas/_libs/src is now almost an includes dir, holding low-frequency changing code.
This consolidates the import process making it more uniform and
consistent throughout the codebase.

Finally this cleans up the remaining top-level namespace (with some deprecations in place for
example pandas.lib, pandas.tslib, pandas.json, pandas.parser.

I listed all of the changes in the whatsnew, but I
don't think worthwhile deprecating anything else.

Author: Jeff Reback <jeff@reback.net>

Closes pandas-dev#15537 from jreback/extensions3 and squashes the following commits:

a6d6cfa [Jeff Reback] BLD: rename / move some extensions

AnkurDedania added a commit to AnkurDedania/pandas that referenced this pull request Mar 21, 2017

@jreback jreback referenced this pull request Sep 18, 2018

Open

DEPR: deprecations log for removed issues #13777

135 of 135 tasks complete

gfyoung added a commit to forking-repos/pandas that referenced this pull request Oct 27, 2018

gfyoung added a commit to forking-repos/pandas that referenced this pull request Oct 27, 2018

CLN: Remove pd.parser, lib, and tslib
xref pandas-devgh-15537.

Need to bumpy xarray version for
Travis 2.7 because 0.8.0 becomes
incompatible with this change.

jreback added a commit that referenced this pull request Oct 28, 2018

CLN: Remove pd.parser, lib, and tslib (#23378)
xref gh-15537.

Need to bumpy xarray version for
Travis 2.7 because 0.8.0 becomes
incompatible with this change.

brute4s99 added a commit to brute4s99/pandas that referenced this pull request Nov 19, 2018

CLN: Remove pd.parser, lib, and tslib (pandas-dev#23378)
xref pandas-devgh-15537.

Need to bumpy xarray version for
Travis 2.7 because 0.8.0 becomes
incompatible with this change.

Pingviinituutti added a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

CLN: Remove pd.parser, lib, and tslib (pandas-dev#23378)
xref pandas-devgh-15537.

Need to bumpy xarray version for
Travis 2.7 because 0.8.0 becomes
incompatible with this change.

Pingviinituutti added a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

CLN: Remove pd.parser, lib, and tslib (pandas-dev#23378)
xref pandas-devgh-15537.

Need to bumpy xarray version for
Travis 2.7 because 0.8.0 becomes
incompatible with this change.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.