Labels: Bug, IO Data
Description
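read_csv behaves inconsistently when na_values contains numeric (non-string) values: sometimes it replaces the given number with NaN, and sometimes it doesn't. In the examples below, the integer value -999 is replaced only in the integer column A, the float value -999.0 is replaced only in the float column B, and when both are listed only the first one takes effect. Note in particular the different behavior of the last two statements, which differ only in the order of the two values: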
Create file
from pandas import DataFrame, read_csv
df = DataFrame({'A': [-999, 2, 3], 'B': [1.2, -999, 4.5]})
df.to_csv('test2.csv', sep=' ', index=False)
print(read_csv('test2.csv', sep=' ', header=0, na_values=[-999]))
     A       B
0  NaN     1.2
1    2  -999.0
2    3     4.5
print(read_csv('test2.csv', sep=' ', header=0, na_values=[-999.0]))
      A    B
0  -999  1.2
1     2  NaN
2     3  4.5
print(read_csv('test2.csv', sep=' ', header=0, na_values=[-999.0, -999]))
      A    B
0  -999  1.2
1     2  NaN
2     3  4.5
print(read_csv('test2.csv', sep=' ', header=0, na_values=[-999, -999.0]))
     A       B
0  NaN     1.2
1    2  -999.0
2    3     4.5
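Until this is fixed, two workarounds may be worth trying. The sketch below is not part of the original report and rests on assumptions: that string values in na_values are matched against the raw text of each field (the report only says that non-string values misbehave), and that replacing the value after reading is acceptable. The helper names read_with_string_sentinels and read_then_replace are hypothetical, and the file contents are held in memory rather than read from test2.csv.

```python
import io

import numpy as np
import pandas as pd

# Same contents as test2.csv in the report, held in memory.
CSV_TEXT = "A B\n-999 1.2\n2 -999.0\n3 4.5\n"


def read_with_string_sentinels(text):
    # Hypothetical helper: pass the sentinels as strings, one entry per
    # spelling that actually appears in the file.
    return pd.read_csv(io.StringIO(text), sep=" ", header=0,
                       na_values=["-999", "-999.0"])


def read_then_replace(text):
    # Hypothetical helper: read without na_values, then replace the
    # numeric sentinel afterwards (-999 also equals -999.0).
    df = pd.read_csv(io.StringIO(text), sep=" ", header=0)
    return df.replace(-999, np.nan)


by_strings = read_with_string_sentinels(CSV_TEXT)
by_replace = read_then_replace(CSV_TEXT)
print(by_strings)
print(by_replace)
```

Both approaches should mark the sentinel as missing in column A and in column B, regardless of the order in which the values are listed. The second one avoids na_values entirely, at the cost of an extra pass over the data.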