Adding float32 series to dataframe and then attempting to join() #1049

Closed
bburan opened this issue Apr 13, 2012 · 2 comments

bburan commented Apr 13, 2012

Tested with Pandas 0.7.3

I have an IPython notebook that explores this issue. It seems to occur only when a float32 series is added to the DataFrame after the DataFrame has been created; no other datatype I tested triggers the bug. Please contact me if you want the notebook, since I don't seem to be able to attach files to this issue. The simplest way to replicate it is with the following lines:

import numpy as np
import pandas
a = np.random.randint(0, 5, 100)
df = pandas.DataFrame({'a': a})
s = pandas.Series(np.random.random(5), name='md')
df.join(s, on='a') # this is OK
df['b'] = np.random.randint(0, 5, 100)
df.join(s, on='a') # this is still OK
df['c'] = np.random.randint(0, 5, 100).astype('f')
df.join(s, on='a') # this fails

The traceback is:

ValueError                                Traceback (most recent call last)
C:\Users\brad\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Enthought\<ipython-input-1-454edb5f6546> in <module>()
      8 df.join(s, on='a') # this is still OK
      9 df['c'] = np.random.randint(0, 5, 100).astype('f')
---> 10 df.join(s, on='a') # this fails
     11 

C:\Python27\lib\site-packages\pandas\core\frame.pyc in join(self, other, on, how, lsuffix, rsuffix, sort)
   3285         # For SparseDataFrame's benefit
   3286         return self._join_compat(other, on=on, how=how, lsuffix=lsuffix,
-> 3287                                  rsuffix=rsuffix, sort=sort)
   3288 
   3289     def _join_compat(self, other, on=None, how='left', lsuffix='', rsuffix='',

C:\Python27\lib\site-packages\pandas\core\frame.pyc in _join_compat(self, other, on, how, lsuffix, rsuffix, sort)
   3298             return merge(self, other, left_on=on, how=how,
   3299                          left_index=on is None, right_index=True,
-> 3300                          suffixes=(lsuffix, rsuffix), sort=sort)
   3301         else:
   3302             if on is not None:

C:\Python27\lib\site-packages\pandas\tools\merge.pyc in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy)
     29                          right_index=right_index, sort=sort, suffixes=suffixes,
     30                          copy=copy)
---> 31     return op.get_result()
     32 if __debug__: merge.__doc__ = _merge_doc % '\nleft : DataFrame'
     33 

C:\Python27\lib\site-packages\pandas\tools\merge.pyc in get_result(self)
     80                                       copy=self.copy)
     81 
---> 82         result_data = join_op.get_result()
     83         result = DataFrame(result_data)
     84 

C:\Python27\lib\site-packages\pandas\tools\merge.pyc in get_result(self)
    495         for klass in kinds:
    496             klass_blocks = [mapping.get(klass) for mapping in blockmaps]
--> 497             res_blk = self._get_merged_block(klass_blocks)
    498             result_blocks.append(res_blk)
    499 

C:\Python27\lib\site-packages\pandas\tools\merge.pyc in _get_merged_block(self, blocks)
    509 
    510         if len(to_merge) > 1:
--> 511             return self._merge_blocks(to_merge)
    512         else:
    513             unit, block = to_merge[0]

C:\Python27\lib\site-packages\pandas\tools\merge.pyc in _merge_blocks(self, merge_chunks)
    545                 com.take_fast(blk.values, unit.indexer,
    546                               None, False,
--> 547                               axis=self.axis, out=out_chunk)
    548 
    549             sofar += len(blk)

C:\Python27\lib\site-packages\pandas\core\common.pyc in take_fast(arr, indexer, mask, needs_masking, axis, out, fill_value)
    264         return take_2d(arr, indexer, out=out, mask=mask,
    265                        needs_masking=needs_masking,
--> 266                        axis=axis, fill_value=fill_value)
    267 
    268     result = arr.take(indexer, axis=axis, out=out)

C:\Python27\lib\site-packages\pandas\core\common.pyc in take_2d(arr, indexer, out, mask, needs_masking, axis, fill_value)
    236             out = np.empty(out_shape, dtype=arr.dtype)
    237         take_f = _get_take2d_function(dtype_str, axis=axis)
--> 238         take_f(arr, indexer, out=out, fill_value=fill_value)
    239         return out
    240     else:

C:\Python27\lib\site-packages\pandas\_tseries.pyd in pandas._tseries.take_2d_axis1_float64 (pandas\src\tseries.c:49365)()

ValueError: Buffer dtype mismatch, expected 'float64_t' but got 'float'
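
The final frame of the traceback shows a float64 take routine receiving a float32 buffer, so one possible workaround is to upcast the new column to float64 before calling join(). The sketch below rebuilds the frame from the report and applies that cast. The cast is an assumption rather than something confirmed in this thread, and it has not been verified against 0.7.3.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the report: an integer key column plus a
# float32 column added after construction.
df = pd.DataFrame({'a': np.random.randint(0, 5, 100)})
s = pd.Series(np.random.random(5), name='md')
df['c'] = np.random.randint(0, 5, 100).astype('f')  # 'f' is float32

# Inspect the dtypes; 'c' should be reported as float32 here.
print(df.dtypes)

# Possible workaround (an assumption, not confirmed in the thread):
# upcast the float32 column to float64 before joining.
df['c'] = df['c'].astype('float64')
joined = df.join(s, on='a')
print(joined.dtypes)
```

The cost of this approach is that the column takes twice the memory, which may matter for large frames.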

bburan commented Apr 20, 2012

Update: I have tested this with the latest code (0.8.0.dev-d840eda) and the problem is still present.


wesm commented Sep 18, 2012

This has been fixed in the development version; look for the 0.9 release, which should be out shortly.
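
To check whether a given installation still shows the problem, the reproduction from the original report can be wrapped in a small check that runs over several column dtypes. This is only a sketch: `join_succeeds` is a hypothetical helper name, and the dtype list is an illustrative choice, not the set the reporter tested.

```python
import numpy as np
import pandas as pd


def join_succeeds(dtype):
    """Return True if join() works after adding a column of `dtype`.

    Hypothetical helper that mirrors the steps in the original report.
    """
    df = pd.DataFrame({'a': np.random.randint(0, 5, 100)})
    s = pd.Series(np.random.random(5), name='md')
    # Add the column after the frame has been created, as in the report.
    df['c'] = np.random.randint(0, 5, 100).astype(dtype)
    try:
        df.join(s, on='a')
    except ValueError:
        # The reported failure surfaced as a ValueError.
        return False
    return True


# Illustrative dtype list (an assumption); float32 is the reported case.
results = {dtype: join_succeeds(dtype)
           for dtype in ('int64', 'float32', 'float64')}
print(pd.__version__, results)
```

On an affected version, the float32 entry would be expected to come back False while the others stay True.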

wesm closed this as completed on Sep 18, 2012