Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEPR: deprecate some top-level non-used functions #15538

Merged
merged 4 commits into from Mar 3, 2017

Conversation

Projects
None yet
3 participants
@jreback
Copy link
Contributor

commented Mar 1, 2017

closes #13790

pd.pnow()
pd.groupby()
pd.match()
pd.Term
pd.Expr

remove info.py

@jreback jreback added the Deprecate label Mar 1, 2017

@jreback jreback added this to the 0.20.0 milestone Mar 1, 2017

@jsexauer jsexauer referenced this pull request Mar 1, 2017

Open

DEPR: deprecations from prior versions #6581

0 of 94 tasks complete

@jreback jreback force-pushed the jreback:top_level branch from 8dcbd34 to 12a571e Mar 1, 2017

@jorisvandenbossche

This comment has been minimized.

Copy link
Member

commented Mar 2, 2017

@jreback Thanks for the PR!

For me, the main question is, do we actually need those functions? If not, I don't think we have to recommend an alternative import path, but just say they are "deprecated and will be removed in a future version"

  • groupby -> referring to the method is of course OK
  • pnow -> you can also use Period.now() (similar to Timestamp.now), so if we do a recommendation, I would refer to that instead of tseries.periods.pnow (that would also mean the actual tseries.periods.pnow can be deprecated, instead of only the top-level one by wrapping it)
  • Term / Expr: I never used them, so I can't really judge: do they have a usecase? If so, it is OK to refer to a more nested path for me, but then that should be documented / added to the public API.
  • match: the same. Do we want to have this as a public function? If not, I would recommend the alternative import path.
@jreback

This comment has been minimized.

Copy link
Contributor Author

commented Mar 2, 2017

  • pnow -> you can also use Period.now() (similar to Timestamp.now), so if we do a recommendation, I would refer to that instead of tseries.periods.pnow (that would also mean the actual tseries.periods.pnow can be deprecated, instead of only the top-level one by wrapping it)

didn't realize this, yes I will point that out as instead (and deprecate Period.pnow() as well)

  • Term / Expr: I never used them, so I can't really judge: do they have a usecase? If so, it is OK to refer to a more nested path for me, but then that should be documented / added to the public API.
    match: the same. Do we want to have this as a public function? If not, I would recommend the alternative import path.

Expr, you never need to directly use this (its used in .eval under the hood, should not have been exposed at all). Term was needed quite a while back, but string expressions replaced this (years ago).

  • match: the same. Do we want to have this as a public function? If not, I would recommend the alternative import path.

This is barely tested, not documented, and AFAICT never used anywhere. So it might have utility, but certainly not as a top-level function.

@jreback jreback force-pushed the jreback:top_level branch 3 times, most recently from da53aac to 7c52dd3 Mar 2, 2017

@codecov-io

This comment has been minimized.

Copy link

commented Mar 2, 2017

Codecov Report

Merging #15538 into master will decrease coverage by -0.02%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #15538      +/-   ##
==========================================
- Coverage   91.04%   91.02%   -0.02%     
==========================================
  Files         136      136              
  Lines       49088    49107      +19     
==========================================
+ Hits        44694    44702       +8     
- Misses       4394     4405      +11
Impacted Files Coverage Δ
pandas/computation/api.py 100% <100%> (ø)
pandas/init.py 91.66% <100%> (ø)
pandas/tseries/period.py 92.8% <100%> (+0.02%)
pandas/core/api.py 100% <100%> (ø)
pandas/io/api.py 100% <100%> (ø)
pandas/io/gbq.py 25% <0%> (-58.34%)
pandas/tools/merge.py 91.78% <0%> (-0.35%)
pandas/core/frame.py 97.82% <0%> (-0.1%)
pandas/tseries/index.py 95.32% <0%> (-0.1%)
pandas/core/common.py 91.36% <0%> (+0.33%)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cd67704...59d2b61. Read the comment docs.

@jorisvandenbossche
Copy link
Member

left a comment

Thanks for the update!

I personally wouldn't even mention the alternative path for match, if you say that it is "barely tested, not documented, and AFAICT never used anywhere", we shouldn't encourage it.
If someone uses it, sees the removal message, they can always complain that it was useful and we can think about how to publicly expose it (I don't think core.algorithms should be that place).

@@ -534,6 +534,12 @@ Deprecations
- ``Series.sortlevel`` and ``DataFrame.sortlevel`` have been deprecated in favor of ``Series.sort_index`` and ``DataFrame.sort_index`` (:issue:`15099`)
- importing ``concat`` from ``pandas.tools.merge`` has been deprecated in favor of imports from the ``pandas`` namespace. This should only affect explict imports (:issue:`15358`)
- ``Series/DataFrame/Panel.consolidate()`` been deprecated as a public method. (:issue:`15483`)
- The following top-level pandas functions have been deprecated and will be removed in a future version (:issue:`13790`)
* ``pd.now()``, replaced by a direct import from ``pandas.tseries.period``

This comment has been minimized.

Copy link
@jorisvandenbossche

jorisvandenbossche Mar 2, 2017

Member

I don't think we have to refer to tseries.period, but just remove it altogether? (or at least not recommend it)

This comment has been minimized.

Copy link
@jorisvandenbossche

jorisvandenbossche Mar 2, 2017

Member

actually you also have pd.Period.now(freq='H')


warnings.warn("pd.Term is deprecated as it is not "
"applicable to user code. Instead use in-line "
"string expressions in for searching HDFStore",

This comment has been minimized.

Copy link
@jorisvandenbossche

jorisvandenbossche Mar 2, 2017

Member

Can you add this sentence also to the whatsnew overview?

pandas - a powerful data analysis and manipulation library for Python
=====================================================================
See http://pandas.pydata.org/ for full documentation. Otherwise, see the

This comment has been minimized.

Copy link
@jorisvandenbossche

jorisvandenbossche Mar 2, 2017

Member

Good idea to remove this file. To have the same effect (people doing pd? see some info), you can put eg what is above this comment in __init__.py as module docstring. It's not that useful, but it does also no harm to have a minimal explanation.

This comment has been minimized.

Copy link
@jreback

jreback Mar 2, 2017

Author Contributor

I could theorectically replace this with something (maybe a minimal README type of thing, so its on the actual doc-string). any thoughts?

This comment has been minimized.

Copy link
@jreback

jreback Mar 2, 2017

Author Contributor

yeah should replace with something, maybe a pointer to the docs (plus some minmal stuff). I can simply define this directly I think (in init.py). Any thoughts on what this should say?

This comment has been minimized.

Copy link
@jorisvandenbossche

jorisvandenbossche Mar 2, 2017

Member

Yes, you can just put the exact same string on the first lines of __init__.py, and it will be interpreted as the module docstring.

I would put just the same as here (title, and link to full docs). In principle we could do some clever parsing of a README file and adding that, but I wouldn't do that effort personally.

This comment has been minimized.

Copy link
@jreback

jreback Mar 2, 2017

Author Contributor

I just did this

In [1]: pd?
Type:        module
String form: <module 'pandas' from '/Users/jreback/pandas/pandas/__init__.py'>
File:        ~/pandas/pandas/__init__.py
Docstring:  
pandas - a powerful data analysis and manipulation library for Python
=====================================================================

**pandas** is a Python package providing fast, flexible, and expressive data
structures designed to make working with "relational" or "labeled" data both
easy and intuitive. It aims to be the fundamental high-level building block for
doing practical, **real world** data analysis in Python. Additionally, it has
the broader goal of becoming **the most powerful and flexible open source data
analysis / manipulation tool available in any language**. It is already well on
its way toward this goal.

Main Features
-------------
Here are just a few of the things that pandas does well:

  - Easy handling of missing data in floating point as well as
    non-floating point data
  - Size mutability: columns can be inserted and deleted from
    DataFrame and higher dimensional objects
  - Automatic and explicit data alignment: objects can  be
    explicitly aligned to a set of labels, or the user can simply
    ignore the labels and let `Series`, `DataFrame`, etc. automatically
    align the data for you in computations
  - Powerful, flexible group by functionality to perform
    split-apply-combine operations on data sets, for both aggregating
    and transforming data
  - Make it easy to convert ragged, differently-indexed data in
    other Python and NumPy data structures into DataFrame objects
  - Intelligent label-based slicing, fancy indexing, and subsetting of
    large data sets
  - Intuitive merging and joining data sets
  - Flexible reshaping and pivoting of data sets
  - Hierarchical labeling of axes (possible to have multiple
    labels per tick)
  - Robust IO tools for loading data from flat files
    (CSV and delimited), Excel files, databases,
    and saving/loading data from the ultrafast HDF5 format
  - Time series-specific functionality: date range
    generation and frequency conversion, moving window statistics,
    moving window linear regressions, date shifting and lagging, etc.

as the README.md has all kinds of html and such (and is too long)

This comment has been minimized.

Copy link
@jorisvandenbossche

jreback added some commits Mar 1, 2017

DEPR: deprecate some top-level non-used functions
closes #13790

pd.pnow
pd.groupby
pd.match
pd.Term
pd.Expr

remove info.py

@jreback jreback force-pushed the jreback:top_level branch from 7c52dd3 to 6df38ed Mar 2, 2017

@jreback

This comment has been minimized.

Copy link
Contributor Author

commented Mar 2, 2017

so we have this mini example in

http://pandas-docs.github.io/pandas-docs-travis/comparison_with_r.html

In [1]: s = pd.Series(np.arange(5),dtype=np.float32)

In [2]: s
Out[2]: 
0    0.0
1    1.0
2    2.0
3    3.0
4    4.0
dtype: float32

In [3]: pd.Series(pd.match(s,[2,4],np.nan))
Out[3]: 
0    NaN
1    NaN
2    0.0
3    NaN
4    1.0
dtype: float64

which is actually confusing / misleading.


In [7]: s.where(s.isin([2,4]))
Out[7]: 
0    NaN
1    NaN
2    2.0
3    NaN
4    4.0
dtype: float32

is much more useful (I don't know why you would want that kind of transformation)

though I don't use R much, so not sure what it is trying to show.

@jorisvandenbossche

This comment has been minimized.

Copy link
Member

commented Mar 2, 2017

Hmm, I would maybe just remove that example (isin is already showed just above the match example).
match has its use (if you want exactly what is in the example), question is whether we need to support that.

@jreback

This comment has been minimized.

Copy link
Contributor Author

commented Mar 2, 2017

@jorisvandenbossche yep sounds good. Yeah the function still exists, its just not top-level (like .unique, .factorize, .value_counts); though even the latter 2 don't really need to be exposed directly to the user (.value_counts is a method, .factorize is just an implementation detail for category).

These made much more sense when directly allowing numpy arrays was useful (but that is less and less useful). But this is another story.

ok will remove that part of the doc.

@jorisvandenbossche jorisvandenbossche merged commit 524a9a0 into pandas-dev:master Mar 3, 2017

3 checks passed

ci/circleci Your tests passed on CircleCI!
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

AnkurDedania added a commit to AnkurDedania/pandas that referenced this pull request Mar 21, 2017

DEPR: deprecate some top-level non-used functions (pandas-dev#15538)
closes pandas-dev#13790

pd.pnow
pd.groupby
pd.match
pd.Term
pd.Expr

remove info.py

gfyoung added a commit to forking-repos/pandas that referenced this pull request Oct 27, 2018

CLN: Cleanup top-level pandas namespace
Drops the following:

* pd.pnow
* pd.match
* pd.groupby
* pd.get_store

* pd.Expr
* pd.Term

xref pandas-devgh-15538.
xref pandas-devgh-15940.

@jreback jreback referenced this pull request Oct 27, 2018

Open

DEPR: deprecations log for removed issues #13777

135 of 135 tasks complete

jreback added a commit that referenced this pull request Oct 28, 2018

CLN: Cleanup top-level pandas namespace (#23380)
Drops the following:

* pd.pnow
* pd.match
* pd.groupby
* pd.get_store

* pd.Expr
* pd.Term

xref gh-15538.
xref gh-15940.

brute4s99 added a commit to brute4s99/pandas that referenced this pull request Nov 19, 2018

CLN: Cleanup top-level pandas namespace (pandas-dev#23380)
Drops the following:

* pd.pnow
* pd.match
* pd.groupby
* pd.get_store

* pd.Expr
* pd.Term

xref pandas-devgh-15538.
xref pandas-devgh-15940.

Pingviinituutti added a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

CLN: Cleanup top-level pandas namespace (pandas-dev#23380)
Drops the following:

* pd.pnow
* pd.match
* pd.groupby
* pd.get_store

* pd.Expr
* pd.Term

xref pandas-devgh-15538.
xref pandas-devgh-15940.

Pingviinituutti added a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

CLN: Cleanup top-level pandas namespace (pandas-dev#23380)
Drops the following:

* pd.pnow
* pd.match
* pd.groupby
* pd.get_store

* pd.Expr
* pd.Term

xref pandas-devgh-15538.
xref pandas-devgh-15940.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.