Nan rows for filtered out data using bcolz backend #1221

Closed
tdennist opened this issue Aug 28, 2015 · 0 comments · Fixed by #1223

This quirk was reported in this thread: https://groups.google.com/a/continuum.io/forum/#!topic/blaze-dev/BqJx4LAy0xE

In [28]: blaze.__version__
'0.8.0'

In [29]: bcolz.__version__
'0.8.1'

With the bcolz backend, Blaze returns a NaN row for a group whose rows were all filtered out:

In [22]: df = blaze.Data(bcolz.ctable([[1,1,3,3], [1,2,3,4]]))

In [23]: df = df[df['f1']==3]

In [24]: blaze.by(df[['f0']], Sum=df['f1'].sum())

   f0  Sum
0   1  NaN
1   3    3


The Pandas backend behaves as I would expect:

In [25]: df = blaze.Data(pandas.DataFrame(dict(f0=[1,1,3,3], f1=[1,2,3,4])))

In [26]: df = df[df['f1']==3]

In [27]: blaze.by(df[['f0']], Sum=df['f1'].sum())

   f0  Sum
0   3    3
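
For reference, the semantics I would expect from both backends (filter first, then group by `f0` and sum `f1`) can be sketched in plain Python. This is only an illustration of the expected result, not how Blaze or bcolz compute it; the helper name `filter_then_group_sum` is hypothetical.

```python
from collections import OrderedDict


def filter_then_group_sum(f0, f1, keep):
    """Sum f1 per f0 key, considering only rows where keep(f1 value) is true.

    Hypothetical helper for illustration; not part of Blaze, bcolz, or pandas.
    """
    sums = OrderedDict()
    for key, value in zip(f0, f1):
        if not keep(value):
            # A filtered-out row must not create a group at all,
            # so no NaN placeholder ever appears for its key.
            continue
        sums[key] = sums.get(key, 0) + value
    return list(sums.items())


# Same columns as in the sessions above.
f0 = [1, 1, 3, 3]
f1 = [1, 2, 3, 4]

result = filter_then_group_sum(f0, f1, lambda v: v == 3)
print(result)  # [(3, 3)]
```

The key `1` never shows up because every row carrying it is dropped before grouping, which matches the Pandas output above.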