Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API: deprecate from_csv in favor of read_csv #4191

Closed
jreback opened this issue Jul 10, 2013 · 16 comments
Closed

API: deprecate from_csv in favor of read_csv #4191

jreback opened this issue Jul 10, 2013 · 16 comments
Labels
API Design IO CSV read_csv, to_csv IO Data IO issues that don't fit into a more specific label Needs Discussion Requires discussion from core team before further action
Milestone

Comments

@jreback
Copy link
Contributor

jreback commented Jul 10, 2013

just too confusion and not consistent

related: #3418, #3171

@ghost
Copy link

ghost commented Aug 18, 2013

Diffrent defaults for parse_dates is about all of it from looking at those two issues.
depreaction warning would be fine. removing completely is too harsh IMO
, due to breaking old code.

@jreback
Copy link
Contributor Author

jreback commented Aug 18, 2013

fair enough

@jreback
Copy link
Contributor Author

jreback commented Sep 22, 2013

closing in favor of #4916

@jorisvandenbossche
Copy link
Member

Reopening this because #4916 is also more general about integrating functionality of the from_dict/records functions inside the DataFrame constructor.

Regarding deprecating from_csv, is this something we still want to do? Or has the ship sailed? (and we keep it as an alias with slightly different defaults indefinitely)

@jorisvandenbossche jorisvandenbossche modified the milestones: 0.21.0, No action Aug 18, 2017
@jorisvandenbossche jorisvandenbossche added the Needs Discussion Requires discussion from core team before further action label Aug 18, 2017
@gfyoung
Copy link
Member

gfyoung commented Aug 18, 2017

I think we should just redefine as an alias to read_csv, with the parameter squeeze=True passed in for Series if being called from there.

@jorisvandenbossche
Copy link
Member

I think we should just redefine as an alias to read_csv

What is the advantage of redefining? (I suppose you mean making it use the same defaults and pass everything just through to read_csv ?)
Because that will break people's code. Deprecating of course also takes action of the user, but then they at least have the time to adapt.
We can of course also deprecate the different defaults, but I wonder whether that is then worth the complexity.

@gfyoung
Copy link
Member

gfyoung commented Aug 18, 2017

No, absolutely we would need a deprecation cycle. However, I think that would be consistent with functions that we expose both in the pandas namespace (e.g. .merge) and in the DataFrame namespace for example.

That being said, it's largely a convenience thing and not entirely necessary for us to do (we could just remove it if it's simpler).

@jorisvandenbossche
Copy link
Member

However, I think that would be consistent with functions that we expose both in the pandas namespace (e.g. .merge) and in the DataFrame namespace for example.

I think that is a bit a different case, as that actually does something with the frame, while from_csv is a class method. And all other read_ functions have no equivalent class method.

But of course, this will mainly annoy people that use it, so not sure it is worth the trouble. But is from time to time comes up (why it exists, why it has different defaults)

@gfyoung
Copy link
Member

gfyoung commented Aug 18, 2017

But of course, this will mainly annoy people that use

The question is: who uses it though?

@jreback
Copy link
Contributor Author

jreback commented Sep 23, 2017

We should deprecate (and eventually remove) anything that is not consistent with the APIs that we have. There should be 1 way to do things. @gfyoung can you do this for 0.21.0?

@jreback jreback modified the milestones: 0.21.0, 1.0 Oct 2, 2017
@gfyoung
Copy link
Member

gfyoung commented Oct 4, 2017

@jreback : I would like to squeeze this one in, but I would like to get some feedback on how to proceed with tests. Given that .from_csv differs in some defaults compared to read_csv, the output of replacing .from_csv with .read_csv will change on some tests.

Should we continue testing for the output that we currently get from .from_csv but use a modified version of read_csv in the test to get the same output? Or should we modify the expected output for our tests?

@gfyoung
Copy link
Member

gfyoung commented Oct 4, 2017

These are not unit tests per se but largely roundtrip tests, as you can see from the search:

https://github.com/pandas-dev/pandas/search?l=Python&q=from_csv&type=&utf8=%E2%9C%93

@jreback
Copy link
Contributor Author

jreback commented Oct 4, 2017

@gfyoung you can just catch the warning around existing tests is fine for now (agree not very many). The warning message should be very clear on how to migrate though.

@gfyoung
Copy link
Member

gfyoung commented Oct 4, 2017

The warning message should be very clear on how to migrate though.

When describing how to migrate, to what level of detail should I provide besides pointing them to the docs that describe the differences with the defaults in from_csv and read_csv ?

@jreback
Copy link
Contributor Author

jreback commented Oct 4, 2017

i would give them a constructed function that could be copy pasted with the args filled in
we do this for the resample warnings (it’s not hard just a little bit of formatting)
ok with having a fairly instructive warning message

@jorisvandenbossche
Copy link
Member

The current docstring already gives the alternative spelling with read_csv (although for the default settings): https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.from_csv.html, so that would be good to include.

gfyoung added a commit to forking-repos/pandas that referenced this issue Oct 7, 2017
gfyoung added a commit to forking-repos/pandas that referenced this issue Oct 7, 2017
@jreback jreback modified the milestones: 1.0, 0.21.0 Oct 8, 2017
gfyoung added a commit to forking-repos/pandas that referenced this issue Oct 8, 2017
gfyoung added a commit to forking-repos/pandas that referenced this issue Oct 10, 2017
gfyoung added a commit to forking-repos/pandas that referenced this issue Oct 10, 2017
ghost pushed a commit to reef-technologies/pandas that referenced this issue Oct 16, 2017
alanbato pushed a commit to alanbato/pandas that referenced this issue Nov 10, 2017
No-Stream pushed a commit to No-Stream/pandas that referenced this issue Nov 28, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Design IO CSV read_csv, to_csv IO Data IO issues that don't fit into a more specific label Needs Discussion Requires discussion from core team before further action
Projects
None yet
Development

No branches or pull requests

3 participants