New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pandas discovery of float #366

Merged
merged 2 commits into from Nov 24, 2015

Conversation

Projects
None yet
3 participants
@llllllllll
Member

llllllllll commented Nov 23, 2015

float types are not optional in pandas. They are properly modelled in the domain of float32 or float64. Until we have option blocks these should not discover as option types.

Also, if a dataframe column reports that it is optional, discovering just that column should not drop the optionality

@llllllllll llllllllll force-pushed the quantopian:pandas-optionals branch from 1d64560 to 042c11a Nov 23, 2015

@llllllllll llllllllll changed the title from pandas discovery of float. to pandas discovery of float Nov 23, 2015

llllllllll added some commits Nov 23, 2015

BUG: fix the discovery of pandas types.
In pandas, float columns are not optional because NaN is an element of float.

@cpcloud cpcloud added this to the 0.4.0 milestone Nov 24, 2015

@cpcloud

This comment has been minimized.

Member

cpcloud commented Nov 24, 2015

Reviewing this now. Going to test against blaze and if passes then I'll merge

@cpcloud

This comment has been minimized.

Member

cpcloud commented Nov 24, 2015

This makes a blaze test fail but I'll fix it after merging this

cpcloud added a commit that referenced this pull request Nov 24, 2015

@cpcloud cpcloud merged commit 852d944 into blaze:master Nov 24, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@asaglimbeni

This comment has been minimized.

asaglimbeni commented May 10, 2016

The Optional is need for missing values in a columns with NaN values.
This commit broke other behavior for instance the SQL method dshape_to_alchemy now create columns with nullable at false https://github.com/blaze/odo/blob/master/odo/backends/sql.py#L405 even if the serie have missing values that in SQL means NULL values. So there are two options or rollback or you have to fix even in the other drivers.

@llllllllll

This comment has been minimized.

Member

llllllllll commented May 10, 2016

NaN is not missing for float, it is a valid element of float. Optional float is the set containing all floats (including NaN) and one new NULL value. You can store NaN in a non nullable float column:

test=# select * from floattest ;
  a  
-----



   1
   2
   3
 NaN
 NaN
 NaN
 0.0
 0.0
 0.0
(12 rows)

this column has valid floats (including NaN) and also NULL

@asaglimbeni

This comment has been minimized.

asaglimbeni commented May 11, 2016

Not all the DBs accept NaN as valid value in non nullable column.
And in any case the append_iterator_to_table https://github.com/blaze/odo/blob/master/odo/backends/sql.py#L449 don't do that because leave free sqlAlchemy to build the insert stantment to the DB.
The case is the follow if you try to use odo to write a dataframe (with nan values in float columns) to the DB like MSSQL it creates float columns with nullable at false and when it try to insert row with NaN the DBe engine rises becuase it is not allowed.
I guess we need to refactor a litle the append_iterator_to_table to be coherent with this behaivor.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment