Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BLD: consolidate remaining extensions #15537

Closed
wants to merge 1 commit into from

Conversation

jreback
Copy link
Contributor

@jreback jreback commented Mar 1, 2017

moves extensions to pandas/libs, which holds the extension code and also the generated builds (as its importable). pandas/src is now almost an includes dir, holding low-frequency changing code.

This consolidates the import process making it more uniform and consistent throughout the codebase.

Finally this cleans up the remaining top-level namespace (with some deprecations in place for example pandas.lib, pandas.tslib, pandas.json, pandas.parser. I listed all of the changes in the whatsnew, but I don't think worthwhile deprecating anything else.

@jreback jreback added the Build Library building on various platforms label Mar 1, 2017
@jreback jreback added this to the 0.20.0 milestone Mar 1, 2017
@chris-b1
Copy link
Contributor

chris-b1 commented Mar 1, 2017

I'd suggest deprecating at least tslib. I suspect there are some things like from pandas.tslib import Timestamp in the wild.

@jreback jreback force-pushed the extensions3 branch 3 times, most recently from df4f898 to 407a415 Compare March 1, 2017 16:53
@jreback
Copy link
Contributor Author

jreback commented Mar 1, 2017

@chris-b1 added

@jreback jreback added the Deprecate Functionality to remove in pandas label Mar 1, 2017
@jreback
Copy link
Contributor Author

jreback commented Mar 1, 2017

cc @wesm

@codecov-io
Copy link

codecov-io commented Mar 1, 2017

Codecov Report

Merging #15537 into master will decrease coverage by -0.07%.
The diff coverage is 95.42%.

@@            Coverage Diff             @@
##           master   #15537      +/-   ##
==========================================
- Coverage   91.06%   90.99%   -0.07%     
==========================================
  Files         137      143       +6     
  Lines       49313    49295      -18     
==========================================
- Hits        44906    44858      -48     
- Misses       4407     4437      +30
Impacted Files Coverage Δ
pandas/io/msgpack/_version.py 100% <ø> (ø)
pandas/compat/pickle_compat.py 68.29% <ø> (ø)
pandas/io/msgpack/exceptions.py 83.33% <ø> (ø)
pandas/core/base.py 95.51% <100%> (ø)
pandas/io/sas/sas7bdat.py 90.66% <100%> (ø)
pandas/tslib.py 100% <100%> (ø)
pandas/tseries/frequencies.py 96.03% <100%> (-0.01%)
pandas/formats/format.py 95.33% <100%> (ø)
pandas/core/nanops.py 98.17% <100%> (ø)
pandas/util/testing.py 81.11% <100%> (ø)
... and 74 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a347ecb...a6d6cfa. Read the comment docs.

@jreback
Copy link
Contributor Author

jreback commented Mar 1, 2017

cc @shoyer

FYI (I could make this a DeprecationWarning, but those get ignored :<)

(pandas) bash-3.2$ python
Python 3.5.2 |Continuum Analytics, Inc.| (default, Jul  2 2016, 17:52:12) 
[GCC 4.2.1 Compatible Apple LLVM 4.2 (clang-425.0.28)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas
>>> import xarray
/Users/jreback/miniconda3/envs/pandas/lib/python3.5/site-packages/xarray/conventions.py:9: FutureWarning: The pandas.tslib module is deprecated and will be removed in a future version. Please import from the pandas.libs.tslib instead
  from pandas.tslib import OutOfBoundsDatetime

@shoyer
Copy link
Member

shoyer commented Mar 1, 2017

We used from pandas.tslib import OutOfBoundsDatetime because we needed to catch that specific exception and couldn't find it exposed in the public API anywhere.

It would be nice to move that to public API somewhere, maybe pandas.errors if the top level namespace is getting too crowded.

@jreback jreback force-pushed the extensions3 branch 4 times, most recently from 81be544 to 6f264e5 Compare March 2, 2017 19:06
@wesm
Copy link
Member

wesm commented Mar 2, 2017

This looks fine from superficial glance. Might want to nest src at pandas/libs/src. Sort of a "hear lies all ye compiled code" (eventually headed for Davy Jones's locker once the libpandas games get going. I feel like a broken record but it really is going to happen =) )

@jreback
Copy link
Contributor Author

jreback commented Mar 2, 2017

@wesm sure that sounds good. yeah that's the idea :>

@jreback jreback force-pushed the extensions3 branch 2 times, most recently from 5d1d491 to 00a7751 Compare March 3, 2017 13:17
@jreback
Copy link
Contributor Author

jreback commented Mar 3, 2017

going to merge on pass.

@jorisvandenbossche
Copy link
Member

Big +1 on consolidation of where and how the c/cython extensions are organized (and for removing them from the top-level namespace).

Some comments:

  • is there a specific reason libwindow is in core and not libs?
  • the same as I commented in the other PR, IMO we should think about whether we need to recommend users to import from pandas.libs directly (the deprecation warnings now do that).
  • Regarding the indication of 'privateness', something like pandas.tools.libhash looks more public compared to pandas._hash to me (it's not because it is not top-level anymore, that it is not public. We have public non-top-level API). So if we regard that as non-public, I find that not an improvement (that aspect).

I am personally in favour of using more underscores, making it very clear that something is private stuff.

@shoyer
Copy link
Member

shoyer commented Mar 3, 2017 via email

@jreback
Copy link
Contributor Author

jreback commented Mar 3, 2017

well, underscores do a lot to make this non-readable.

I could make this pandas._libs (IOW, make all extensions private). I prefer to have imports inside pandas code like this

from pandas._libs import tslib, lib
rather than
from pandas.libs import _tslib, _lib

which looks really ugly to me.

@jreback
Copy link
Contributor Author

jreback commented Mar 3, 2017

to some of @jorisvandenbossche points

  • is there a specific reason libwindow is in core and not libs?

so from a code organization perspective. directly attributable libs make sense to be right next to there respective libs. general libs like lib, algos are centrally located. I did consider window to be with this. But from a dev point of view, editing window.py and then window.pyx right next to each is powerful. It moves the code together.

Similarly for json, sas, and the parser. They are self-contained. Its a tradeoff between dev flexibility and privacy really. I think the lib* naming really helps to differentiate this though and pretty much solves both problems.

  • the same as I commented in the other PR, IMO we should think about whether we need to recommend users to import from pandas.libs directly (the deprecation warnings now do that).

I could do this as pandas._libs as I indicated above. But in reality having underscores all over the place is simply distracting from a dev point of view (and ugly). So could do that, or put a README.

  • Regarding the indication of 'privateness', something like pandas.tools.libhash looks more public compared to pandas._hash to me (it's not because it is not top-level anymore, that it is not public. We have public non-top-level API). So if we regard that as non-public, I find that not an improvement (that aspect).

see comment above. I mean ALL of pandas code is 'public', but that's the point of having pandas.api in the first place anyhow. Sure any code can be used (even _ leading). But I think we need a warning about expectations generally. lib* should never be regarded as public.
I am personally in favour of using more underscores, making it very clear that something is private stuff.

@shoyer
Copy link
Member

shoyer commented Mar 3, 2017

I could make this pandas._libs (IOW, make all extensions private). I prefer to have imports inside pandas code like this

from pandas._libs import tslib, lib
rather than
from pandas.libs import _tslib, _lib

which looks really ugly to me.

Agreed, there's no need to use more than one level of underscores. I also prefer using _libs.* to libs._*.

@jreback
Copy link
Contributor Author

jreback commented Mar 3, 2017

renamed to _libs. This drops if from the main pd namespace as well, so that's a plus.

@jorisvandenbossche w.r.t. the lib*. I can make those private (but still keep the code in those locations) if desired (prob in a followup though)

@jreback jreback changed the title BLD: conslidate remaining extensions BLD: consolidate remaining extensions Mar 4, 2017
@jreback jreback force-pushed the extensions3 branch 4 times, most recently from bec46a7 to 44be976 Compare March 7, 2017 15:01
xref pandas-dev#12588

pandas.parser -> io/libparsers.pyx
pandas.json -> pandas.io.json.libjson
pandas.io.sas.saslib -> libsas
pandas.msgpack -> pandas.io.msgpack
pandas._testing -> pandas.util.libtesting
pandas._sparse -> pandas.sparse.libsparse
pandas._hash -> pandas.tools.libhash
pandas.tslib -> pandas.libs.tslib
pandas.index -> pandas.libs.index
pandas.algos -> pandas.libs.algos
pandas.lib -> pandas.libs.lib
pandas.hashtable -> pandas.libs.hashtable
pandas._window -> pandas.core.libwindow
pandas._join -> pandas.libs.join
move algos*.in, index*.in, hashtable*.in to libs
pandas._period -> pandas.libs.period
@jreback
Copy link
Contributor Author

jreback commented Mar 7, 2017

bombs away.

@jreback jreback closed this in 648ae4f Mar 7, 2017
jreback added a commit to jreback/pandas that referenced this pull request Mar 8, 2017
jreback added a commit to jreback/pandas that referenced this pull request Mar 8, 2017
jreback added a commit to jreback/pandas that referenced this pull request Mar 8, 2017
AnkurDedania pushed a commit to AnkurDedania/pandas that referenced this pull request Mar 21, 2017
moves extensions to pandas/_libs, which holds the extension code
and also the generated builds (as its importable).

pandas/_libs/src is now almost an includes dir, holding low-frequency changing code.
This consolidates the import process making it more uniform and
consistent throughout the codebase.

Finally this cleans up the remaining top-level namespace (with some deprecations in place for
example pandas.lib, pandas.tslib, pandas.json, pandas.parser.

I listed all of the changes in the whatsnew, but I
don't think worthwhile deprecating anything else.

Author: Jeff Reback <jeff@reback.net>

Closes pandas-dev#15537 from jreback/extensions3 and squashes the following commits:

a6d6cfa [Jeff Reback] BLD: rename / move some extensions
AnkurDedania pushed a commit to AnkurDedania/pandas that referenced this pull request Mar 21, 2017
gfyoung added a commit to forking-repos/pandas that referenced this pull request Oct 27, 2018
gfyoung added a commit to forking-repos/pandas that referenced this pull request Oct 27, 2018
xref pandas-devgh-15537.

Need to bumpy xarray version for
Travis 2.7 because 0.8.0 becomes
incompatible with this change.
jreback pushed a commit that referenced this pull request Oct 28, 2018
xref gh-15537.

Need to bumpy xarray version for
Travis 2.7 because 0.8.0 becomes
incompatible with this change.
tm9k1 pushed a commit to tm9k1/pandas that referenced this pull request Nov 19, 2018
xref pandas-devgh-15537.

Need to bumpy xarray version for
Travis 2.7 because 0.8.0 becomes
incompatible with this change.
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
xref pandas-devgh-15537.

Need to bumpy xarray version for
Travis 2.7 because 0.8.0 becomes
incompatible with this change.
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
xref pandas-devgh-15537.

Need to bumpy xarray version for
Travis 2.7 because 0.8.0 becomes
incompatible with this change.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Build Library building on various platforms Deprecate Functionality to remove in pandas
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants