Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TypeError when writing to excel using numpy.timedelta64 #3372

Closed
bnice5000 opened this issue Apr 15, 2013 · 11 comments
Closed

TypeError when writing to excel using numpy.timedelta64 #3372

bnice5000 opened this issue Apr 15, 2013 · 11 comments
Labels
Dtype Conversions Unexpected or buggy dtype conversions Enhancement IO Excel read_excel, to_excel Output-Formatting __repr__ of pandas objects, to_string

Comments

@bnice5000
Copy link

Let me start by saying that I apologize if this is already on your radar. I looked, but could not find a similar issue. When I try to write out numpy.timedelta64 to excel I get the following error:

pandas.__version__
Out[22]: '0.11.0.dev-6cda5dc'

run tester
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/Users/bnice/.virtualenvs/ipython/lib/python2.7/site-packages/IPython/utils/py3compat.pyc in execfile(fname, *where)
    176             else:
    177                 filename = fname
--> 178             __builtin__.execfile(filename, *where)

/Users/bnice/Source/latchmatchcatch/bin/tester.py in <module>()
     27 outputter = writeout_string + '.xlsx'
     28 writer = pandas.io.parsers.ExcelWriter(outputter)
---> 29 df.to_excel(writer, 'results', index=False)
     30 
     31 

/Users/bnice/.virtualenvs/ipython/src/pandas/pandas/core/frame.pyc in to_excel(self, excel_writer, sheet_name, na_rep, float_format, cols, header, index, index_label, startrow, startcol)
   1407         formatted_cells = formatter.get_formatted_cells()
   1408         excel_writer.write_cells(formatted_cells, sheet_name,
-> 1409                                  startrow=startrow, startcol=startcol)
   1410         if need_save:
   1411             excel_writer.save()

/Users/bnice/.virtualenvs/ipython/src/pandas/pandas/io/parsers.pyc in write_cells(self, cells, sheet_name, startrow, startcol)
   2192                             'cur_sheet property')
   2193         if self.use_xlsx:
-> 2194             self._writecells_xlsx(cells, sheet_name, startrow, startcol)
   2195         else:
   2196             self._writecells_xls(cells, sheet_name, startrow, startcol)

/Users/bnice/.virtualenvs/ipython/src/pandas/pandas/io/parsers.pyc in _writecells_xlsx(self, cells, sheet_name, startrow, startcol)
   2210             colletter = get_column_letter(startcol + cell.col + 1)
   2211             xcell = wks.cell("%s%s" % (colletter, startrow + cell.row + 1))
-> 2212             xcell.value = _conv_value(cell.val)
   2213             if cell.style:
   2214                 style = CellStyleConverter.to_xlsx(cell.style)

/Users/bnice/.virtualenvs/ipython/lib/python2.7/site-packages/openpyxl/cell.pyc in _set_value(self, value)
    322     def _set_value(self, value):
    323         """Set the value and infer type and display options."""
--> 324         self.bind_value(value)
    325 
    326     value = property(_get_value, _set_value,

/Users/bnice/.virtualenvs/ipython/lib/python2.7/site-packages/openpyxl/cell.pyc in bind_value(self, value)
    311                 self.set_value_explicit(value, self.TYPE_NUMERIC)
    312                 return True
--> 313         self.set_value_explicit(value, self._data_type)
    314 
    315     def _get_value(self):

/Users/bnice/.virtualenvs/ipython/lib/python2.7/site-packages/openpyxl/cell.pyc in set_value_explicit(self, value, data_type)
    228             self.TYPE_BOOL: bool, }
    229         try:
--> 230             self._value = type_coercion_map[data_type](value)
    231         except KeyError:
    232             if data_type not in self.VALID_TYPES:

/Users/bnice/.virtualenvs/ipython/lib/python2.7/site-packages/openpyxl/cell.pyc in check_string(self, value)
    200         # convert to unicode string
    201         if not isinstance(value, unicode):
--> 202             value = unicode(value, self.encoding)
    203         value = unicode(value)
    204         # string must never be longer than 32,767 characters

TypeError: coercing to Unicode: need string or buffer, numpy.timedelta64 found

My code to generate this error is as follows:

import pandas, numpy, StringIO, datetime

txt = '''ID,DATE
002691c9cec109e64558848f1358ac16,2003-08-13 00:00:00
002691c9cec109e64558848f1358ac16,2003-08-13 00:00:00
0088f218a1f00e0fe1b94919dc68ec33,2006-05-07 00:00:00
0088f218a1f00e0fe1b94919dc68ec33,2006-06-03 00:00:00
00d34668025906d55ae2e529615f530a,2006-03-09 00:00:00
00d34668025906d55ae2e529615f530a,2006-03-09 00:00:00
0101d3286dfbd58642a7527ecbddb92e,2007-10-13 00:00:00
0101d3286dfbd58642a7527ecbddb92e,2007-10-27 00:00:00
0103bd73af66e5a44f7867c0bb2203cc,2001-02-01 00:00:00
0103bd73af66e5a44f7867c0bb2203cc,2008-01-20 00:00:00
'''
df = pandas.read_csv(StringIO.StringIO(txt))
df = df.sort('DATE')
df.DATE = pandas.to_datetime(df.DATE)
grouped = df.groupby('ID')
df['X_SEQUENCE_GAP'] = pandas.concat([g['DATE'].sub(g['DATE'].shift(), fill_value=0) for title,g in grouped])

now = datetime.datetime.now()
timestamp = now.strftime('%Y%m%d%H%M%S%f')
outfolder = 'output'
out_file_name = 'results'
writeout_string = '../{0}/{1}-{2}'.format(outfolder, out_file_name, timestamp)
outputter = writeout_string + '.xlsx'
writer = pandas.io.parsers.ExcelWriter(outputter)
df.to_excel(writer, 'results', index=False)

There is an easy workaround. I can convert the delta to a string in my code. I just wondered if this was expected behavior?

@jreback
Copy link
Contributor

jreback commented Apr 15, 2013

I am not sure that xlwt/openpyxl supports, this, though I don't know. If you find it does, pls open an issue, but prob easier to convert to a string anyhow

@jreback
Copy link
Contributor

jreback commented Apr 15, 2013

FYI..I answered your SO question

@bnice5000
Copy link
Author

Thank you much. I have already selected your answer and gave it an up vote. I appreciate the help! I will see if I can find out about xlwt.

@jreback
Copy link
Contributor

jreback commented Apr 15, 2013

np...since you are writing .xlsx you actually use openpyxl. I am not sure of how excel represents date subtraction, and openpyxl deals with it. Excel is somewhat finicky, so this might not be easy

@jreback
Copy link
Contributor

jreback commented Apr 19, 2013

any luck with xlwt/openpyxl? (what do they support for a timedetlta like value)

@jreback
Copy link
Contributor

jreback commented Apr 22, 2013

I think you need to export as fractions of a day to excel

http://www.excelforum.com/excel-formulas-and-functions/573306-how-to-convert-date-time-to-seconds.html

@bnice5000
Copy link
Author

Thank you for the help. I am looping back to this now and I will report back shortly.

@jmcnamara
Copy link
Contributor

There no longer a TypeError when running the sample code in the latest version on master.

If there is still an issue with numpy.timedelta64 handling in to_excel then that should be raised as a separate issue.

I vote to close this.

@jreback jreback closed this as completed Nov 10, 2013
@bnice5000
Copy link
Author

Openpyxl has gotten back to me. They are willing to work with up to deal with timedela64 in numpy/pandas. I would vote for h:mm:ss format. Any thoughts?

https://bitbucket.org/ericgazoni/openpyxl/issue/198/numpytimedelta64-and-openpyxl

@jreback
Copy link
Contributor

jreback commented Dec 17, 2013

you can have days.hours,minutes,seconds,partial seconds

best prob to keep it in nano seconds and just a format

@valentas-kurauskas
Copy link

I too have this problem with a certain type (q.qtypes.q_none from a kdb q python interface). I am surprised by this behaviour of openpyxl. It should be possible to output to excel any data type, some special value if the rule is not known in the worst case. It seems quite difficult to fix this problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Dtype Conversions Unexpected or buggy dtype conversions Enhancement IO Excel read_excel, to_excel Output-Formatting __repr__ of pandas objects, to_string
Projects
None yet
Development

No branches or pull requests

4 participants