BUG: Complex types are coerced to float when summing along level of MultiIndex #12902

Closed
elinoreroebber opened this issue Apr 14, 2016 · 2 comments

When trying to sum complex numbers along one level of a MultiIndex, I get a ComplexWarning and the resulting datatypes are float64 with imaginary components discarded. Summing without specifying the level or by unstacking the MultiIndex works as expected.
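
The unstacking route mentioned above can be sketched as follows. This is a minimal illustration built on the same data as the code sample in this report, not an official workaround; the variable names `index` and `summed` are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

# Same data as the report: 0j, 1+2j, 2+4j, 3+6j on a two-level index.
index = pd.MultiIndex.from_product(([0, 1], [0, 1]), names=["a", "b"])
df = pd.DataFrame(data=np.arange(4) * (1 + 2j), index=index)

# Move level 'b' into the columns, then sum across those columns.
# No groupby over the index is involved, so the values stay complex.
summed = df[0].unstack("b").sum(axis=1)
```

The result is a Series indexed by `a` rather than a DataFrame, which is usually close enough to what `sum(level='a')` was meant to produce.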

Based on the warning message, I assume that groupby is being called internally, and that the bug is actually there. I'm using pandas version 0.17.1.

Code Sample

In [1]: df = pd.DataFrame(data=np.arange(4)*(1+2j), index=pd.MultiIndex.from_product(([0,1], [0,1]), names=['a','b']))
In [2]: print df
          0
a b        
0 0      0j
  1  (1+2j)
1 0  (2+4j)
  1  (3+6j)

In [3]: df.sum(level='a')
/Users/elinore/Library/Python/2.7/lib/python/site-packages/pandas/core/groupby.py:1629: ComplexWarning: Casting complex values to real discards the imaginary part
  values = _algos.ensure_float64(values)
Out[3]: 
   0
a   
0  1
1  5
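
The warning and the output above are consistent with a plain cast of the complex values to float64 before aggregation. The following NumPy-only sketch assumes that `ensure_float64` behaves like such a cast (an assumption; the helper itself is not shown in this report) and reproduces both the warning and the values 1 and 5.

```python
import warnings

import numpy as np

values = np.arange(4) * (1 + 2j)  # 0j, 1+2j, 2+4j, 3+6j

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Assumed stand-in for ensure_float64: the imaginary parts are dropped here.
    as_float = values.astype(np.float64)

# Collect the warning class names without importing a version-specific path.
warning_names = [w.category.__name__ for w in caught]

# Rows 0-1 belong to a=0 and rows 2-3 to a=1, so sum the real parts pairwise.
group_sums = as_float.reshape(2, 2).sum(axis=1)
```

Here `group_sums` holds the real-only totals that appear in `Out[3]`, and `warning_names` contains `ComplexWarning`.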

Expected Output

Out[3]:
         0
a
0   (1+2j)
1  (5+10j)

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.11.final.0
python-bits: 64
OS: Darwin
OS-release: 14.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8

pandas: 0.17.1
nose: 1.3.7
pip: 7.1.2
setuptools: 19.2
Cython: 0.23.5
numpy: 1.10.4
scipy: 0.17.0
statsmodels: None
IPython: 4.1.2
sphinx: 1.4
patsy: None
dateutil: 2.5.1
pytz: 2016.3
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.5.1
matplotlib: 1.5.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
Jinja2: None

@jreback jreback added Bug Groupby Complex Complex Numbers labels Apr 16, 2016
@jreback jreback added this to the Next Major Release milestone Apr 16, 2016
jreback commented Apr 16, 2016

This is about here

Pull requests are welcome. For now we need to coerce this to object (rather than treat it as a float).

This does it; it just needs some tests and such.

diff --git a/pandas/core/groupby.py b/pandas/core/groupby.py
index 6996254..e2a4482 100644
--- a/pandas/core/groupby.py
+++ b/pandas/core/groupby.py
@@ -1747,7 +1747,7 @@ class BaseGrouper(object):
             values = _algos.ensure_float64(values)
         elif com.is_integer_dtype(values):
             values = values.astype('int64', copy=False)
-        elif is_numeric:
+        elif is_numeric and not com.is_complex_dtype(values):
             values = _algos.ensure_float64(values)
         else:
             values = values.astype(object)
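
The branch order in the patch can be sketched outside pandas with NumPy dtype checks. The function name `coerce_for_aggregation` is hypothetical, the condition of the first branch is not visible in the diff (a floating-point check is assumed here), and treating booleans as numeric is also an assumption made so that the third branch has something to match.

```python
import numpy as np


def coerce_for_aggregation(values):
    """Hypothetical stand-in for the dtype dispatch shown in the diff."""
    dtype = values.dtype
    is_complex = np.issubdtype(dtype, np.complexfloating)
    # Assumption: "numeric" covers NumPy number types plus booleans.
    is_numeric = np.issubdtype(dtype, np.number) or dtype == np.bool_

    if np.issubdtype(dtype, np.floating):  # assumed first branch
        return values.astype("float64", copy=False)
    elif np.issubdtype(dtype, np.integer):
        return values.astype("int64", copy=False)
    elif is_numeric and not is_complex:  # the patched condition
        return values.astype("float64", copy=False)
    else:
        # Complex values now land here and keep their imaginary parts.
        return values.astype(object)


coerced = coerce_for_aggregation(np.arange(4) * (1 + 2j))
```

With the extra `not is_complex` check, complex input falls through to the object branch instead of being cast to float64, at the cost of slower object-dtype arithmetic.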

a-p-man commented Apr 18, 2016

The bug also occurs with a Series when calling groupby with level=0, so it is not specific to MultiIndex; the MultiIndex case is only affected indirectly through groupby, as suggested.

In [1]: series = pd.Series(data=np.arange(4)*(1+2j),index=[0,0,1,1])
In [2]: print series
0        0j
0    (1+2j)
1    (2+4j)
1    (3+6j)
dtype: complex128

In [3]: series.groupby(level=0).sum()
Out[3]: 
0    1.0
1    5.0
dtype: float64

Expected:

Out[3]: 
0     (1+2j)
1    (5+10j)
dtype: complex128
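
Since the proposed patch still needs tests, the Series case above could serve as a starting point. This is only a sketch of such a check, not a test from the pandas suite; the function name `check_complex_groupby_sum` is hypothetical.

```python
import numpy as np
import pandas as pd


def check_complex_groupby_sum():
    """Return the grouped sum; raise AssertionError if imaginary parts are lost."""
    series = pd.Series(np.arange(4) * (1 + 2j), index=[0, 0, 1, 1])
    result = series.groupby(level=0).sum()
    expected = np.array([1 + 2j, 5 + 10j])
    # Compare values only: the patched code path may return object dtype
    # rather than complex128, and either one preserves the imaginary part.
    np.testing.assert_array_equal(result.values.astype(complex), expected)
    return result


result = check_complex_groupby_sum()
```

On an affected version such as 0.17.1, the grouped values come back as real-only floats, so the comparison is expected to fail.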
