DOC: Weird documentation (or behavior) of pd.read_csv escapechar= arg #23717

Closed
keanpantraw opened this issue Nov 15, 2018 · 1 comment · Fixed by #24490

keanpantraw commented Nov 15, 2018

Problem description

Documentation of pandas.read_csv states:

escapechar : str (length 1), default None

    One-character string used to escape delimiter when quoting is QUOTE_NONE.

But in reality you can correctly parse the following CSV:

id, name
1, "Hello, my name is \"John\""

using this code:

import pandas as pd

df = pd.read_csv("john.csv", quotechar='"', escapechar="\\")
print(df.name) # prints Hello, my name is "John"
print(df.id) # prints 1

Also, the same doc states that the default quoting is QUOTE_MINIMAL, so the example above is not running with QUOTE_NONE.

Trying the same with quoting=csv.QUOTE_NONE gives an incorrect result:

import pandas as pd
import csv

df = pd.read_csv("john.csv", quotechar='"', escapechar="\\", quoting=csv.QUOTE_NONE)
print(df.name) # prints my name is John""
print(df.id) # prints 1, "Hello

Trying the same file without escapechar also gives an incorrect result:

import pandas as pd

df = pd.read_csv("john.csv", quotechar='"')
print(df.name) # prints Hello, my name is \John\""
print(df.id) # prints 1

It looks like the doc is misleading here: it states that escapechar is used only with QUOTE_NONE, when in fact it also works in the default quoting mode (maybe as a result of some bugfixes?).
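
For anyone who wants to reproduce the three cases without creating john.csv, here is a minimal self-contained sketch. It assumes the file has no spaces after the commas, feeds the text through io.StringIO, and uses a hypothetical helper named read that is not part of pandas. Exact output may vary between pandas versions, so only the escapechar case is annotated with the expected value.

```python
import csv
import io

import pandas as pd

# In-memory stand-in for john.csv (assumption: no spaces after the commas).
DATA = 'id,name\n1,"Hello, my name is \\"John\\""\n'


def read(**kwargs):
    """Hypothetical helper: parse DATA with quotechar='"' plus extra options."""
    return pd.read_csv(io.StringIO(DATA), quotechar='"', **kwargs)


# Default quoting (QUOTE_MINIMAL) with an escape character.
with_escape = read(escapechar="\\")
# Same escape character, but with quoting disabled.
quote_none = read(escapechar="\\", quoting=csv.QUOTE_NONE)
# Default quoting, no escape character at all.
no_escape = read()

# Only the first variant is expected to give: Hello, my name is "John"
for label, frame in [
    ("escapechar, default quoting", with_escape),
    ("escapechar, QUOTE_NONE", quote_none),
    ("no escapechar", no_escape),
]:
    print(f"{label}: {frame['name'].iloc[0]!r}")
```

The first variant is the one the current documentation does not seem to cover: escapechar is honored inside a quoted field even though quoting is left at its default.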

@gfyoung gfyoung added the Docs label Nov 15, 2018
Member

gfyoung commented Nov 15, 2018

@frenzykryger : Thanks for reporting this! Indeed, the documentation on escapechar can be much improved and should really be more generic in being just an "escape character" (and not as tied to the quoting parameter). PR is more than welcome!
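
As a rough illustration of the "generic escape character" reading, the sketch below escapes a delimiter in an unquoted field and parses it under both QUOTE_MINIMAL and QUOTE_NONE. The sample data and the helper name parse_b are made up for this example; it is only meant to suggest that the behavior does not depend on the quoting mode, not to define what the docs should say.

```python
import csv
import io

import pandas as pd

# Second field is x\,y : the backslash escapes the comma (made-up sample data).
SAMPLE = "a,b\n1,x\\,y\n"


def parse_b(quoting):
    """Hypothetical helper: return column b of the first row for a quoting mode."""
    frame = pd.read_csv(io.StringIO(SAMPLE), escapechar="\\", quoting=quoting)
    return frame["b"].iloc[0]


# Run the same parse under the default mode and under QUOTE_NONE.
results = {
    name: parse_b(getattr(csv, name))
    for name in ("QUOTE_MINIMAL", "QUOTE_NONE")
}
print(results)  # both modes should yield 'x,y'
```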
