BUG: read_csv: dtype={'id' : np.str}: Datatype not understood #3209

amelio-vazquez-reina opened this Issue Mar 29, 2013 · 4 comments



I have a CSV with several columns. The first is a field called id, with entries such as 0001, 0002, etc.

When loading this file, the following works:

pd.read_csv(my_path, dtype={'id' : np.int})

but the following doesn't:

pd.read_csv(my_path, dtype={'id' : np.str})

nor does this:

pd.read_csv(my_path, dtype={'id' : str})

I get: Datatype not understood

This is with pandas-0.10.1


Use the np.object_ dtype.
np.str is a very specific dtype that needs size information, so it is hard to deal with.
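
To illustrate the point about size information, here is a minimal sketch using the built-in str (which the old np.str name aliased). It only shows that NumPy's string dtype carries a fixed width while object does not; it is not a description of what the pandas parser does internally.

```python
import numpy as np

# NumPy string dtypes are fixed-width: the width is part of the dtype itself.
unsized = np.dtype(str)                                 # no width known yet
sized = np.array(["0001", "0002"])                      # width inferred from the data
as_objects = np.array(["0001", "0002"], dtype=object)   # plain Python str objects

print(unsized.kind, unsized.itemsize)
print(sized.dtype)
print(as_objects.dtype)
```

The object array stores references to ordinary Python strings, so no width has to be chosen up front, which is why it is the easier target for a CSV reader.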

In [13]: data = """1,0001
2,0002
3,0003
"""

In [20]: pd.read_csv(StringIO.StringIO(data),header=None,
                                 names=['int','object'],dtype={1 : np.object_ })
   int object
0    1   0001
1    2   0002
2    3   0003

In [21]: pd.read_csv(StringIO.StringIO(data),header=0,
                                 names=['int','object'],dtype={1 : np.object_ }).dtypes
int        int64
object    object
dtype: object
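
For anyone landing here later, a self-contained sketch of the same idea on Python 3, selecting the column by name instead of by position. The sample data and the column names id and value are made up for illustration; only the dtype argument comes from the example above.

```python
import io

import pandas as pd

# Hypothetical sample data mirroring the ids described in the report.
csv_text = "id,value\n0001,10\n0002,20\n0003,30\n"

# Default inference parses the ids as integers, dropping the leading zeros.
inferred = pd.read_csv(io.StringIO(csv_text))

# Requesting object dtype for the column keeps the raw text.
as_text = pd.read_csv(io.StringIO(csv_text), dtype={"id": object})

print(inferred["id"].tolist())
print(as_text["id"].tolist())
```

The second call keeps the ids as zero-padded strings, which is what the original question was after.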

@ribonoous did this solve your issue?


Yes @jreback Sorry I didn't acknowledge this earlier. I am all set!


It works now:

D = pd.read_csv(filep, sep=sep, dtype=mm, header=None, names=feature_name,
                keep_default_na=False,
                na_values={m: '' for m, v in mm.items() if v == np.object_})
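
A small runnable sketch of what that call is doing, under the assumption that the goal is to treat only empty fields as missing in the string columns. The data, the column names, and the dtypes and na_map variables are hypothetical stand-ins for filep, mm and feature_name above.

```python
import io

import pandas as pd

# Hypothetical data: a literal "NA" label and one empty field.
csv_text = "id,label\n0001,NA\n0002,\n"
dtypes = {"id": object, "label": object}

# For object columns, treat only the empty string as missing.
na_map = {col: [""] for col, dt in dtypes.items() if dt is object}

df = pd.read_csv(
    io.StringIO(csv_text),
    dtype=dtypes,
    keep_default_na=False,  # switch off the built-in markers such as "NA"
    na_values=na_map,
)

print(df["label"].tolist())
```

With keep_default_na=False the text "NA" survives as a string, and only the empty field comes back as missing.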