Cannot save DataFrame with unicode to CSV #705

Closed
jtbates opened this issue Jan 27, 2012 · 4 comments


@jtbates
commented Jan 27, 2012

In [1]: from pandas import DataFrame
In [2]: df = DataFrame({u'c/\u03c3':[1,2,3]})
In [3]: df.to_csv('test')
---------------------------------------------------------------------------
UnicodeEncodeError                        Traceback (most recent call last)
.../<ipython-input-3-9b2e5ea53beb> in <module>()
----> 1 df.to_csv('test')

.../lib/python2.7/site-packages/pandas-0.7.0.dev_88fcac5-py2.7-macosx-10.4-x86_64.egg/pandas/core/frame.pyc in to_csv(self, path, sep, na_rep, cols, header, index, index_label, mode, nanRep)
    891                     # given a string for a DF with Index
    892                     index_label = [index_label]
--> 893                 csvout.writerow(list(index_label) + list(cols))
    894             else:
    895                 csvout.writerow(cols)

UnicodeEncodeError: 'ascii' codec can't encode character u'\u03c3' in position 2: ordinal not in range(128)

I think this should be separate from #680. The CSV issue is also mentioned in this comment on bug #300.
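For anyone reading later: the error message above points at an implicit ASCII encode of the column label. Here is a minimal sketch that reproduces the same failure with no pandas and no csv module involved. It is written for Python 3 (an assumption, since the report is on 2.7), and the variable names are only for illustration.

```python
# Same column label as in the report: "c/" followed by U+03C3 (Greek sigma).
label = "c/\u03c3"

try:
    # Forcing the label through the ASCII codec is what fails in the traceback.
    label.encode("ascii")
except UnicodeEncodeError as err:
    failed_at = err.start          # index of the first unencodable character
    bad_char = label[err.start]    # the character itself
else:
    failed_at = None
    bad_char = None

# Encoding explicitly as UTF-8 succeeds and yields plain bytes.
utf8_bytes = label.encode("utf-8")

print(failed_at, repr(bad_char), utf8_bytes)
```

The position reported by the exception matches the "position 2" in the traceback, which suggests the header row is where the write breaks, not the data.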

@adamklein

Contributor

commented Jan 27, 2012

I presume you're using Python < 3? Unfortunately, the csv module there does not handle unicode. I'll see if there is a workaround, but as you can tell from the recurring issues, pandas isn't exactly unicode-friendly on Python <= 2.7, and neither is Python 2.7 itself ...

@adamklein adamklein closed this in c0fc368 Jan 27, 2012

@adamklein

Contributor

commented Jan 31, 2012

I had to rewrite the fix because it slowed down CSV reading/writing. If you want to write a UTF-8 encoded CSV on Python < 3, you need to pass the encoding explicitly: df.to_csv(..., encoding='utf-8').
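A minimal sketch of that call, using the frame from the original report. Assumptions: this is written for Python 3 and a current pandas rather than the 2.7 / 0.7.0.dev setup in the thread, and the temporary path and the `roundtrip` name are made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Same frame as in the report: one column whose label contains U+03C3 (sigma).
df = pd.DataFrame({"c/\u03c3": [1, 2, 3]})

# Hypothetical scratch location, used only for this example.
path = os.path.join(tempfile.mkdtemp(), "test.csv")

# Pass the encoding explicitly on write, as suggested above.
df.to_csv(path, index=False, encoding="utf-8")

# Decode with the same encoding on read so the label survives the round trip.
roundtrip = pd.read_csv(path, encoding="utf-8")

print(list(roundtrip.columns))
```

Using the same `encoding` value on both the write and the read side is the important part; mixing encodings is a common way to end up with a garbled header.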

@jtbates

Author

commented Jan 31, 2012

Yes, I'm on 2.7. Thanks @adamklein !

yarikoptic added a commit to neurodebian/pandas that referenced this issue Feb 10, 2012
Merge commit 'v0.7.0rc1-94-ge3df4e2' into debian
* commit 'v0.7.0rc1-94-ge3df4e2':
  DOC: added info on encoding parameter for csv i/o
  TST: renamed io b/c module conflict, made suite check for config
  added vbench for write csv
  BUG: made encoding optional on csv read/write, addresses pandas-dev#717
  BUG: float64 hash table for handling NAs in Series.unique, close pandas-dev#714
  TST: add bench_unique.py
  TST: added better testing for pandas-dev#709
  BUG: closes pandas-dev#709, bug in ix + multiindex use case
  DOC: release notes
  BUG: don't assume that each object contains every unique block type in concat, GH pandas-dev#708
  BUG: inconsistency in .ix with integer label and float index
  Fix test that assumed py2.
  Don't use unnecessary UnicodeReader on Python 3.
  BUG: remove poor man's breakpoint
  BUG: closes pandas-dev#705, csv is encoded utf-8 and then decoded on the read side
  updated support contact info
  DOC: note EWMA adjustment, closes pandas-dev#703
  ENH: close pandas-dev#694, pandas-dev#693, pandas-dev#692
  BUG: Bar plot fails if axis parameter supplied, closes pandas-dev#702

@imsrgadich


commented Jun 14, 2018

@adamklein It's 2018 and I still hit the same issue. Your trick helped, thanks!
