Issue with CSV parser when using usecols #3192

Closed
tr11 opened this issue Mar 27, 2013 · 2 comments · Fixed by #4406

Comments

@tr11
Contributor

commented Mar 27, 2013

The following code:

import pandas
from io import StringIO

data = u'1,2,3\n4,5,6\n7,8,9'

# Case 1: no usecols
df = pandas.read_csv(StringIO(data),
                     dtype={'B': int, 'C': float},
                     header=None,
                     names=['A', 'B', 'C'],
                     converters={'A': str},
)
print(df.dtypes)
print()

# Case 2: usecols selects a subset of the columns
df = pandas.read_csv(StringIO(data), usecols=[0, 2],
                     dtype={'B': int, 'C': float},
                     header=None,
                     names=['A', 'B', 'C'],
                     converters={'A': str},
)
print(df.dtypes)
print()

# Case 3: usecols selects every column
df = pandas.read_csv(StringIO(data), usecols=[0, 1, 2],
                     dtype={'B': int, 'C': float},
                     header=None,
                     names=['A', 'B', 'C'],
                     converters={'A': str},
)
print(df.dtypes)

outputs

A     object
B      int32
C    float64
dtype: object

A     object
C    float64
dtype: object

A    object
B    object
C    object
dtype: object

The dtypes in the last output should match those in the first: B should be an integer type and C should be float64, not object. The issue arises when a converters dictionary is passed together with a usecols that selects all of the available columns.

The commit https://github.com/tr11/pandas/commit/1ac3700e72a5861a7d8544a72d77a4d64c71f118 in my fork seems to fix this issue.
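
For anyone checking whether this still reproduces on their pandas version, here is a minimal sketch of the expected behavior as a comparison. It is not taken from the linked commit. The helper `read` and the variable names are made up for illustration, and the options are the same ones used in the report above.

```python
from io import StringIO

import pandas as pd

data = "1,2,3\n4,5,6\n7,8,9"


def read(**extra):
    # Hypothetical helper: same options as in the report;
    # `extra` optionally adds a usecols argument.
    return pd.read_csv(
        StringIO(data),
        header=None,
        names=["A", "B", "C"],
        dtype={"B": int, "C": float},
        converters={"A": str},
        **extra
    )


baseline = read()                   # no usecols
all_cols = read(usecols=[0, 1, 2])  # usecols naming every column

# Selecting every column should not change the resulting dtypes.
same_dtypes = list(all_cols.dtypes) == list(baseline.dtypes)
print(same_dtypes)
```

On a version affected by this issue the comparison should come out False, since B and C fall back to object; on a version with the fix it should be True.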

@garaud

Contributor

commented Jun 20, 2013

Hi,

I read your issue out of curiosity and wondered whether you had sent a pull request for your fix. I didn't see one at https://github.com/pydata/pandas/pulls. Maybe check whether this issue has already been fixed.

Damien G.

@tr11

Contributor Author

commented Jun 22, 2013

Damien,

I didn't send a pull request. I'll do so after checking if the issue still
exists.

Thanks for the reminder,

TR
