DOC : setting dialect on pandas.read_table overrides a few other kwds #14898

Closed
rahulporuri opened this issue Dec 16, 2016 · 5 comments
Labels
API Design IO CSV read_csv, to_csv

@rahulporuri

Problem description

Correct me if I'm wrong, but TextFileReader overwrites the delimiter, quoting, etc. keywords if the dialect kwarg is passed. This isn't mentioned in the pandas.read_table documentation.

I don't know if this was implicit. Sorry for the noise if this is a known issue or if it isn't actually one.

@jreback
Contributor

jreback commented Dec 16, 2016

can you give an example

@rahulporuri
Author

The data contained in the file test_data_quoting.txt is:

1,False,1.0,"one",2015-01-01,"quote space"
2,True,2.0,"two",2015-01-02,noquote space
3,False,3.0,three,2015-01-03,"quote space"
4,True,4.0,four,2015-01-04,"escaped""quote"

And here is an example of the kwds being overridden:

In [24]: pandas.read_table('test_data_quoting.txt', header=None, delimiter=',', quotechar=None, quoting=csv.QUOTE_NONE)
Out[24]: 
   0      1    2      3           4                 5
0  1  False  1.0  "one"  2015-01-01     "quote space"
1  2   True  2.0  "two"  2015-01-02     noquote space
2  3  False  3.0  three  2015-01-03     "quote space"
3  4   True  4.0   four  2015-01-04  "escaped""quote"

In [25]: pandas.read_table('test_data_quoting.txt', header=None, delimiter=',', quotechar=None, quoting=csv.QUOTE_NONE, dialect='excel')
Out[25]: 
   0      1    2      3           4              5
0  1  False  1.0    one  2015-01-01    quote space
1  2   True  2.0    two  2015-01-02  noquote space
2  3  False  3.0  three  2015-01-03    quote space
3  4   True  4.0   four  2015-01-04  escaped"quote

Again, I'm not advocating changing the behavior or anything of that sort. I just wanted to point out that I couldn't find this behavior documented.
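For anyone who wants to see where the two outputs above come from without touching pandas, here is a minimal sketch using only the standard-library csv module. It is not the pandas code path; it only shows which settings the 'excel' dialect carries and how the same four rows parse under QUOTE_NONE versus under that dialect. The variable names are made up for illustration.

```python
import csv
import io

# Same four rows as test_data_quoting.txt in the report above.
DATA = '''1,False,1.0,"one",2015-01-01,"quote space"
2,True,2.0,"two",2015-01-02,noquote space
3,False,3.0,three,2015-01-03,"quote space"
4,True,4.0,four,2015-01-04,"escaped""quote"
'''

# The settings that come bundled with the 'excel' dialect.
excel = csv.get_dialect("excel")
excel_settings = {
    "delimiter": excel.delimiter,
    "quotechar": excel.quotechar,
    "quoting": excel.quoting,
    "doublequote": excel.doublequote,
}

# Parse with quoting disabled: quote characters stay in the fields.
rows_no_quoting = list(
    csv.reader(io.StringIO(DATA), delimiter=",", quoting=csv.QUOTE_NONE)
)

# Parse with the 'excel' dialect: quotes are stripped and "" becomes ".
rows_excel = list(csv.reader(io.StringIO(DATA), dialect="excel"))

for plain, dialected in zip(rows_no_quoting, rows_excel):
    print(plain[3], plain[5], "|", dialected[3], dialected[5])
```

One difference worth keeping in mind: in the stdlib csv module, explicit keyword arguments passed alongside a dialect take precedence over the dialect, which is the opposite of the read_table behaviour reported here.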

@jreback
Contributor

jreback commented Dec 16, 2016

hmm, ok though maybe we should raise / warn if these are not default parameters. That seems a bit unexpected.

cc @gfyoung @jorisvandenbossche ?

@jreback jreback added API Design IO CSV read_csv, to_csv labels Dec 16, 2016
@gfyoung
Member

gfyoung commented Dec 17, 2016

I agree that the behaviour should not be changed at this point, but we should warn when there are conflicting options. I sense that we may want to raise in the future though?

@jorisvandenbossche
Member

jorisvandenbossche commented Dec 18, 2016

Mentioning this in the explanation of the keyword would already be a start. A warning seems OK to me as well (although I am not sure how simple this is, as you would need to detect for each keyword whether it was passed or not); I'm not sure raising is needed.
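To make the warning idea concrete, here is a rough sketch of one way a conflict check could look. It is not how pandas implements it: `PARSER_DEFAULTS` and `apply_dialect` are hypothetical names, and the defaults listed are assumptions chosen for illustration. The dialect still wins, as in the current behaviour, but a warning is emitted when it replaces a non-default value.

```python
import csv
import warnings

# Hypothetical defaults for the keywords that a dialect can carry.
PARSER_DEFAULTS = {
    "delimiter": ",",
    "quotechar": '"',
    "quoting": csv.QUOTE_MINIMAL,
    "doublequote": True,
    "skipinitialspace": False,
}


def apply_dialect(kwds, dialect_name):
    """Return a copy of kwds with the dialect's values applied.

    Warns for each keyword where the caller's non-default value
    is replaced by a different value from the dialect.
    """
    dialect = csv.get_dialect(dialect_name)
    merged = dict(kwds)
    for name, default in PARSER_DEFAULTS.items():
        dialect_value = getattr(dialect, name)
        user_value = kwds.get(name, default)
        if user_value != default and user_value != dialect_value:
            warnings.warn(
                f"Conflicting values for {name!r}: using {dialect_value!r} "
                f"from the dialect instead of {user_value!r}",
                stacklevel=2,
            )
        merged[name] = dialect_value
    return merged


# The call from the report: quotechar and quoting conflict with 'excel'.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    merged = apply_dialect(
        {"delimiter": ",", "quotechar": None, "quoting": csv.QUOTE_NONE},
        "excel",
    )
```

Comparing against the default sidesteps the detection problem raised above, at the cost of staying silent when a caller explicitly passes a default value that the dialect then changes.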

gfyoung added a commit to forking-repos/pandas that referenced this issue Dec 18, 2016
1) Update documentation about how the dialect
parameter is handled.

2) Verify that the dialect parameter passed in
is valid before accessing the dialect attributes.

Closes pandas-devgh-14898.
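Point 2 of the commit message could look roughly like the sketch below. This is an illustration only, not the code from the referenced commit: `resolve_dialect` and `REQUIRED_ATTRS` are hypothetical names, and the attribute list is an assumption about what a parser would read from a dialect.

```python
import csv

# Attributes a parser would read from a dialect (assumed list).
REQUIRED_ATTRS = (
    "delimiter",
    "quotechar",
    "quoting",
    "doublequote",
    "skipinitialspace",
)


def resolve_dialect(dialect):
    """Hypothetical helper: return a usable dialect or raise ValueError."""
    if isinstance(dialect, str):
        try:
            # Look up a registered dialect by name, e.g. 'excel'.
            dialect = csv.get_dialect(dialect)
        except csv.Error as err:
            raise ValueError(f"Invalid dialect: {dialect!r}") from err
    # Check the attributes up front instead of failing later on access.
    missing = [name for name in REQUIRED_ATTRS if not hasattr(dialect, name)]
    if missing:
        raise ValueError(f"Invalid dialect, missing attributes: {missing}")
    return dialect
```

Validating first means a bad dialect produces one clear error rather than an AttributeError partway through reading the dialect's settings.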
@jreback jreback added this to the 0.20.0 milestone Dec 18, 2016
4 participants