DEPR: pd.read_table #21954

dahlbaek · 2018-07-17T21:20:05Z

pd.read_table is deprecated and replaced by pd.read_csv.

closes DEPR: read_table #21948
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

WillAyd · 2018-07-17T21:23:30Z

Initial comments:

Needs test(s)
Since we are replacing read_table with read_csv and a sep of \tyou should reroute the call to read_table to reflect this
Needs whatsnew note

pep8speaks · 2018-07-17T21:49:27Z

Hello @dahlbaek! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on August 02, 2018 at 10:48 Hours UTC

dahlbaek · 2018-07-17T21:52:33Z

Added a whatsnew note.

I'm not sure what to do about tests. pd.read_csv has been around for as long as I've been using pandas, so I would suspect that tests are already plentiful.

I changed the warning message so that it will reflect whether or not the user is supplying a value to sep. The result is now similar to what I proposed in #21948 but without the decorator, as suggested by @gfyoung. Result:

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({"a": [1], "b": [2]})
   ...: df.to_csv("test.csv", sep="\t")

In [3]: pd.read_table("test.csv", index_col=0)
/home/dahlbaek/virtualenvs/pandas-dev/bin/ipython:1: FutureWarning: read_table is deprecated, use read_csv with sep='\t' instead.
  #!/home/dahlbaek/virtualenvs/pandas-dev/bin/python
Out[3]: 
   a  b
0  1  2

In [4]: pd.read_table("test.csv", "\t", index_col=0)
/home/dahlbaek/virtualenvs/pandas-dev/bin/ipython:1: FutureWarning: read_table is deprecated, use read_csv instead.
  #!/home/dahlbaek/virtualenvs/pandas-dev/bin/python
Out[4]: 
   a  b
0  1  2

In [5]: pd.read_table("test.csv", sep="\t", index_col=0)
/home/dahlbaek/virtualenvs/pandas-dev/bin/ipython:1: FutureWarning: read_table is deprecated, use read_csv instead.
  #!/home/dahlbaek/virtualenvs/pandas-dev/bin/python
Out[5]: 
   a  b
0  1  2

In [6]: pd.read_table("test.csv", delimiter="\t", index_col=0)
/home/dahlbaek/virtualenvs/pandas-dev/bin/ipython:1: FutureWarning: read_table is deprecated, use read_csv instead.
  #!/home/dahlbaek/virtualenvs/pandas-dev/bin/python
Out[6]: 
   a  b
0  1  2

In [7]: pd.read_csv("test.csv", "\t", index_col=0)
Out[7]: 
   a  b
0  1  2

In [8]: pd.read_csv("test.csv", sep="\t", index_col=0)
Out[8]: 
   a  b
0  1  2

In [9]: pd.read_csv("test.csv", delimiter="\t", index_col=0)
Out[9]: 
   a  b
0  1  2

gfyoung · 2018-07-17T22:15:28Z

We definitely have tests for read_table in the codebase. Modify them to assert that a warning is produced.

dahlbaek · 2018-07-18T13:26:15Z

@gfyoung I modified the tests, but am unsure how to proceed with the stacklevel parameter. It seems to me stacklevel=2 gives the correct behaviour. Since the warning message is hardcoded into the function body, this will give the user the file/line of the call to the function producing the warning message, so the user can make suitable adjustments. Example:

$ cat test_warn.py 
import pandas as pd

pd.read_table("test.csv", index_col=0)
$ python test_warn.py 
test_warn.py:3: FutureWarning: read_table is deprecated, use read_csv with sep='\t' instead.
  pd.read_table("test.csv", index_col=0)

This is exactly the behaviour I would want as a user. Nevertheless, pandas.util.testing.assert_produces_warning complains that the stacklevel is not set correctly, which causes failed tests.

Please Advise.

gfyoung · 2018-07-18T15:52:07Z

@dahlbaek : Pass in check_stacklevel=False in your tests.

dahlbaek · 2018-07-18T15:54:26Z

@dahlbaek : Pass in check_stacklevel=False in your tests.

Perfect, that's what I figured.

gfyoung · 2018-07-24T00:45:15Z

pandas/io/parsers.py

+                              "instead.",
+                              FutureWarning, stacklevel=2)
+            if sep is False:
+                sep = default_sep


Why do we need this if condition here?

(yes, I see the other changes that you made for passing in default_sep. This question is more asking why the overall change was needed for this deprecation).

If the user is setting sep or delimiter, I want to suggest using read_csv, but if neither is set, I also want to suggest passing sep='\t'. This was the simplest solution I could come up with which handles the case correctly where the user is passing sep=',' to read_table.

More generally, the if clause is needed because the definition of read_table is entangled with the definition of read_csv. The cleaner solution would be to just have two separate definitions, but that solution was deemed not worth the effort.

Hmm...I see. Okay, fair enough.

dahlbaek · 2018-07-24T19:49:26Z

some comments. can you run the test suite and make sure you are catching all deprecation warnings for read_table.

I'm not seeing any FutureWarnings from read_table, but I am getting the following failed test:

(pandas-dev) $ pwd
/home/dahlbaek/git/pandas/pandas/tests
(pandas-dev) $ pytest test_errors.py
=============================== test session starts ===============================
platform linux -- Python 3.6.6, pytest-3.6.3, py-1.5.4, pluggy-0.6.0
rootdir: /home/dahlbaek/git/pandas, inifile: setup.cfg
plugins: xdist-1.22.2, forked-0.2, cov-2.5.1
collected 12 items                                                                

test_errors.py ..........F.                                                 [100%]

==================================== FAILURES =====================================
________________________________ test_error_rename ________________________________

    def test_error_rename():
        # see gh-12665
        from pandas.errors import ParserError
        from pandas.io.common import CParserError
    
        try:
            raise CParserError()
        except ParserError:
            pass
    
        try:
            raise ParserError()
        except CParserError:
            pass
    
        with catch_warnings(record=True):
            try:
>               raise ParserError()
E               pandas.errors.ParserError

test_errors.py:52: ParserError
======================= 1 failed, 11 passed in 0.07 seconds =======================
(pandas-dev) $

Not sure exactly what is happening here…

gfyoung · 2018-07-24T20:07:07Z

@dahlbaek : That's an odd test failure. Seems to be fine on Travis though. Also, the test logs don't seem to have any deprecation warnings related to read_table AFAICT.

jreback · 2018-07-24T22:07:59Z

@dahlbaek we don't usually run from the test dir itself, rather from the base dir e.t.
pytest path/to/test. running from the test dir might not find the fixtures.

jreback · 2018-07-24T22:08:30Z

@jorisvandenbossche @TomAugspurger any objections?

dahlbaek · 2018-07-25T00:28:34Z

@jreback After recreating my virtual environment from scratch, I still get that failed test, even on master, when running from the base dir. Not sure what the cause is, but as pointed out by @gfyoung it seems to be a local problem on my machine.

jreback · 2018-07-25T10:02:15Z

@dahlbaek see my comments above. The tests are not designed to be run from subdirs, nor are they on our CI. pytest can't find things.

dahlbaek · 2018-07-25T19:27:21Z

@jreback I think I am now running the test the way proposed in the docs, right? I still get the failure (even on master). In any case, this seems unrelated to the current PR. I suppose it is a local phenomenon that I will have to figure out myself…

(pandas-dev) $ pytest pandas/tests/test_errors.py -k test_error_rename
=============================== test session starts ===============================
platform linux -- Python 3.6.6, pytest-3.6.3, py-1.5.4, pluggy-0.6.0
rootdir: /home/dahlbaek/git/pandas, inifile: setup.cfg
plugins: xdist-1.22.3, forked-0.2, cov-2.5.1
collected 12 items / 11 deselected                                                

pandas/tests/test_errors.py F                                               [100%]

==================================== FAILURES =====================================
________________________________ test_error_rename ________________________________

    def test_error_rename():
        # see gh-12665
        from pandas.errors import ParserError
        from pandas.io.common import CParserError
    
        try:
            raise CParserError()
        except ParserError:
            pass
    
        try:
            raise ParserError()
        except CParserError:
            pass
    
        with catch_warnings(record=True):
            try:
>               raise ParserError()
E               pandas.errors.ParserError

pandas/tests/test_errors.py:52: ParserError
===================== 1 failed, 11 deselected in 0.06 seconds =====================
(pandas-dev) $

jreback · 2018-07-26T12:53:07Z

@dahlbaek hmm that looks like a real error

dahlbaek · 2018-07-26T18:04:32Z

@jreback I'll try to look into it when I have time and make a separate issue.

dahlbaek · 2018-08-01T11:54:50Z

@jreback I re-cloned the repo and installed a fresh python3.7 environment. I don't know what was causing the problem, but now it is gone.

jreback · 2018-08-01T22:04:34Z

@TomAugspurger if you have any issues here

gfyoung · 2018-08-01T23:14:10Z

@dahlbaek : This looks good to me, though if you could fix the merge conflicts, that would be good.

- pd.read_table is deprecated and replaced by pd.read_csv. - add whatsnew note - change tests to test for warning messages - change DataFrame.from_csv to use pandas.read_csv instead of pandas.read_table - Change pandas.read_clipboard to use pandas.read_csv instead of pandas.read_table

dahlbaek · 2018-08-02T09:12:47Z

@gfyoung Done.

TomAugspurger · 2018-08-02T10:48:50Z

Pushed a doc formatting fix.

Thanks @dahlbaek!

* DEPR: pd.read_table - pd.read_table is deprecated and replaced by pd.read_csv. - add whatsnew note - change tests to test for warning messages - change DataFrame.from_csv to use pandas.read_csv instead of pandas.read_table - Change pandas.read_clipboard to use pandas.read_csv instead of pandas.read_table * Add sep note to whatsnew

* master: (47 commits) Run tests in conda build [ci skip] (pandas-dev#22190) TST: Check DatetimeIndex.drop on DST boundary (pandas-dev#22165) CI: Fix Travis failures due to lint.sh on pandas/core/strings.py (pandas-dev#22184) Documentation: typo fixes in MultiIndex / Advanced Indexing (pandas-dev#22179) DOC: added .join to 'see also' in Series.str.cat (pandas-dev#22175) DOC: updated Series.str.contains see also section (pandas-dev#22176) 0.23.4 whatsnew (pandas-dev#22177) fix: scalar timestamp assignment (pandas-dev#19843) (pandas-dev#19973) BUG: Fix get dummies unicode error (pandas-dev#22131) Fixed py36-only syntax [ci skip] (pandas-dev#22167) DEPR: pd.read_table (pandas-dev#21954) DEPR: Removing previously deprecated datetools module (pandas-dev#6581) (pandas-dev#19119) BUG: Matplotlib scatter datetime (pandas-dev#22039) CLN: Use public method to capture UTC offsets (pandas-dev#22164) implement tslibs/src to make tslibs self-contained (pandas-dev#22152) Fix categorical from codes nan 21767 (pandas-dev#21775) BUG: Better handling of invalid na_option argument for groupby.rank(pandas-dev#22124) (pandas-dev#22125) use memoryviews instead of ndarrays (pandas-dev#22147) Remove depr. warning in SeriesGroupBy.count (pandas-dev#22155) API: Default to_* methods to compression='infer' (pandas-dev#22011) ...

* DEPR: pd.read_table - pd.read_table is deprecated and replaced by pd.read_csv. - add whatsnew note - change tests to test for warning messages - change DataFrame.from_csv to use pandas.read_csv instead of pandas.read_table - Change pandas.read_clipboard to use pandas.read_csv instead of pandas.read_table * Add sep note to whatsnew

WillAyd added this to the 0.24.0 milestone Jul 17, 2018

WillAyd added IO Data IO issues that don't fit into a more specific label Deprecate Functionality to remove in pandas labels Jul 17, 2018

dahlbaek force-pushed the depr/read_table branch 2 times, most recently from 529c70f to 4f41cd0 Compare July 17, 2018 21:49

dahlbaek force-pushed the depr/read_table branch 2 times, most recently from 3e229e0 to 01b3b0f Compare July 17, 2018 22:03

dahlbaek force-pushed the depr/read_table branch 14 times, most recently from 2a659b6 to 15dfae8 Compare July 18, 2018 13:18

dahlbaek force-pushed the depr/read_table branch from 15dfae8 to 9eb7631 Compare July 18, 2018 15:33

dahlbaek force-pushed the depr/read_table branch from 9eb7631 to 23d53a1 Compare July 18, 2018 16:32

gfyoung reviewed Jul 24, 2018

View reviewed changes

dahlbaek force-pushed the depr/read_table branch from c75f12a to 2149356 Compare July 24, 2018 18:05

jreback approved these changes Jul 24, 2018

View reviewed changes

dahlbaek force-pushed the depr/read_table branch 2 times, most recently from ea4a798 to 5431629 Compare July 25, 2018 00:25

dahlbaek force-pushed the depr/read_table branch from 5431629 to aceaa23 Compare July 25, 2018 00:30

dahlbaek force-pushed the depr/read_table branch from aceaa23 to c24446a Compare July 26, 2018 16:10

dahlbaek force-pushed the depr/read_table branch from c24446a to f417c92 Compare August 2, 2018 07:56

Add sep note to whatsnew

9002460

TomAugspurger merged commit 83f6be5 into pandas-dev:master Aug 2, 2018

dhimmel mentioned this pull request Jan 29, 2019

Request for DataFrame.to_tsv() for reading tab delimited text #10327

Closed

st-bender mentioned this pull request Feb 8, 2019

Keep (un-deprecate) read_table() for API stability #25220

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DEPR: pd.read_table #21954

DEPR: pd.read_table #21954

dahlbaek commented Jul 17, 2018 •

edited

WillAyd commented Jul 17, 2018

pep8speaks commented Jul 17, 2018 •

edited

dahlbaek commented Jul 17, 2018

gfyoung commented Jul 17, 2018

dahlbaek commented Jul 18, 2018 •

edited

gfyoung commented Jul 18, 2018

dahlbaek commented Jul 18, 2018

gfyoung Jul 24, 2018 •

edited

dahlbaek Jul 24, 2018 •

edited

gfyoung Jul 24, 2018

dahlbaek commented Jul 24, 2018 •

edited

gfyoung commented Jul 24, 2018

jreback commented Jul 24, 2018

jreback commented Jul 24, 2018

dahlbaek commented Jul 25, 2018

jreback commented Jul 25, 2018

dahlbaek commented Jul 25, 2018 •

edited

jreback commented Jul 26, 2018

dahlbaek commented Jul 26, 2018

dahlbaek commented Aug 1, 2018 •

edited

jreback commented Aug 1, 2018

gfyoung commented Aug 1, 2018

dahlbaek commented Aug 2, 2018

TomAugspurger commented Aug 2, 2018

DEPR: pd.read_table #21954

DEPR: pd.read_table #21954

Conversation

dahlbaek commented Jul 17, 2018 • edited

WillAyd commented Jul 17, 2018

pep8speaks commented Jul 17, 2018 • edited

Comment last updated on August 02, 2018 at 10:48 Hours UTC

dahlbaek commented Jul 17, 2018

gfyoung commented Jul 17, 2018

dahlbaek commented Jul 18, 2018 • edited

gfyoung commented Jul 18, 2018

dahlbaek commented Jul 18, 2018

gfyoung Jul 24, 2018 • edited

Choose a reason for hiding this comment

dahlbaek Jul 24, 2018 • edited

Choose a reason for hiding this comment

gfyoung Jul 24, 2018

Choose a reason for hiding this comment

dahlbaek commented Jul 24, 2018 • edited

gfyoung commented Jul 24, 2018

jreback commented Jul 24, 2018

jreback commented Jul 24, 2018

dahlbaek commented Jul 25, 2018

jreback commented Jul 25, 2018

dahlbaek commented Jul 25, 2018 • edited

jreback commented Jul 26, 2018

dahlbaek commented Jul 26, 2018

dahlbaek commented Aug 1, 2018 • edited

jreback commented Aug 1, 2018

gfyoung commented Aug 1, 2018

dahlbaek commented Aug 2, 2018

TomAugspurger commented Aug 2, 2018

dahlbaek commented Jul 17, 2018 •

edited

pep8speaks commented Jul 17, 2018 •

edited

dahlbaek commented Jul 18, 2018 •

edited

gfyoung Jul 24, 2018 •

edited

dahlbaek Jul 24, 2018 •

edited

dahlbaek commented Jul 24, 2018 •

edited

dahlbaek commented Jul 25, 2018 •

edited

dahlbaek commented Aug 1, 2018 •

edited