This repository has been archived by the owner on Dec 11, 2023. It is now read-only.

ctable.fromdataframe() does not support named columns #314

Closed
dirkbike opened this issue Aug 19, 2016 · 5 comments

@dirkbike
Contributor

Not sure if this is a bug, but it seems that fromdataframe should be able to support the following use case:

In [1]: import pandas as pd

In [2]: from pandas_datareader import data as web

In [3]: import bcolz as bz

In [4]: pd.__version__
Out[4]: '0.18.1'

In [5]: bz.__version__
Out[5]: '1.1.0'

In [6]: pd.options.display.width = 120

In [7]: df = web.get_data_yahoo('^gspc', '1960-01-01', '2016-01-01')

In [8]: df.head()
Out[8]:
                 Open       High        Low      Close   Volume  Adj Close
Date
1960-01-04  59.910000  59.910000  59.910000  59.910000  3990000  59.910000
1960-01-05  60.389999  60.389999  60.389999  60.389999  3710000  60.389999
1960-01-06  60.130001  60.130001  60.130001  60.130001  3730000  60.130001
1960-01-07  59.689999  59.689999  59.689999  59.689999  3310000  59.689999
1960-01-08  59.500000  59.500000  59.500000  59.500000  3290000  59.500000

In [9]: dfz = bz.ctable.fromdataframe(df, rootdir='gspc.bz')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-9-a5ef4f82fc21> in <module>()
----> 1 dfz = bz.ctable.fromdataframe(df, rootdir='gspc.bz')

C:\Python34\lib\site-packages\bcolz\ctable.py in fromdataframe(df, **kwargs)
    694
    695         # Create the ctable
--> 696         ct = ctable(cols, names, **kwargs)
    697         return ct
    698

C:\Python34\lib\site-packages\bcolz\ctable.py in __init__(self, columns, names, **kwargs)
    253             self._open_ctable()
    254         elif columns is not None:
--> 255             self._create_ctable(columns, names, **kwargs)
    256             _new = True
    257         else:

C:\Python34\lib\site-packages\bcolz\ctable.py in _create_ctable(self, columns, names, **kwargs)
    289                     "`columns` and `names` must have the same length")
    290         # Check names validity. Cast to string.
--> 291         names = validate_names(names)
    292
    293         # Guess the kind of columns input

C:\Python34\lib\site-packages\bcolz\ctable.py in validate_names(columns, keyword)
     33 def validate_names(columns, keyword='names'):
     34     if not all([is_identifier(x) and not iskeyword(x) for x in columns]):
---> 35         raise ValueError("{0} are not valid idenifiers".format(keyword))
     36     return list(map(str, columns))
     37

ValueError: names are not valid idenifiers

Also, note that the error message misspells "identifiers" as "idenifiers."

@ankravch

Column names must be valid Python identifiers, so they cannot contain whitespace the way 'Adj Close' does.

Stripping the whitespace first should work:
df.columns = [''.join(icol.split()) for icol in df.columns.values.tolist()]
dfz = bz.ctable.fromdataframe(df, rootdir='gspc.bz')
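
The one-liner above only removes whitespace. A slightly more general variant might also handle punctuation, leading digits, and Python keywords. The following is a minimal sketch under that assumption; `to_identifier` is a hypothetical helper, not part of bcolz or pandas, and unlike the one-liner it replaces offending characters with an underscore rather than dropping them.

```python
import keyword
import re


def to_identifier(name):
    """Hypothetical helper (not part of bcolz): coerce a column label
    into something that passes a Python-identifier check."""
    # Collapse every run of non-word characters (spaces, dots, ...) into "_".
    cleaned = re.sub(r'\W+', '_', str(name)).strip('_')
    # Identifiers cannot be empty, start with a digit, or be a keyword.
    if not cleaned or cleaned[0].isdigit() or keyword.iskeyword(cleaned):
        cleaned = '_' + cleaned
    return cleaned


# Column labels taken from the DataFrame shown in the issue.
columns = ['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Close']
safe_columns = [to_identifier(c) for c in columns]

# With a real DataFrame this would be applied before the conversion, e.g.:
#   df.columns = [to_identifier(c) for c in df.columns]
#   dfz = bz.ctable.fromdataframe(df, rootdir='gspc.bz')
```

Two different labels can map to the same cleaned name (for example 'Adj Close' and 'Adj_Close'), so it is worth checking the result for duplicates before assigning it back.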


@dirkbike
Contributor Author

Thanks, that does work. I recommend changing the error message from "names are not valid idenifiers" to "column names must be valid Python identifiers." That will make the error a lot more obvious.
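
For illustration, the suggested wording could be applied to a check shaped like the `validate_names` function in the traceback. This is only a sketch, not the change that was eventually merged: it assumes Python 3's `str.isidentifier()` in place of bcolz's own `is_identifier` helper, and listing the offending names in the message is an extra that goes beyond the suggestion above.

```python
from keyword import iskeyword


def validate_names(columns, keyword='names'):
    """Sketch only: same signature as the function in the traceback,
    but with a more explicit error message."""
    # Collect every label that is not a plain, non-keyword identifier.
    invalid = [
        x for x in columns
        if not (isinstance(x, str) and x.isidentifier() and not iskeyword(x))
    ]
    if invalid:
        raise ValueError(
            "column {0} must be valid Python identifiers; invalid: {1!r}"
            .format(keyword, invalid))
    return list(map(str, columns))
```

Called with the columns from the issue, this raises a `ValueError` whose message names 'Adj Close' as the invalid entry, which points straight at the column to rename.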

@FrancescAlted
Member

@dirkbike Yes, I think your suggestion is better. Could you please provide a PR?

@dirkbike
Contributor Author

See #315.

@FrancescAlted
Member

Fixed in #315
