[GEN-2381] Pandas handling of nullable cells #1272
Conversation
…ts and that the data is stored and retrieved accurately.
```diff
  pd.testing.assert_series_equal(
-     results["column_string"], pd.DataFrame(dict_data)["column_string"]
+     results["column_string"],
+     pd.DataFrame(dict_data)["column_string"].convert_dtypes(),
```
For discussion: should the expected dataframes in these tests be created with the dtypes set explicitly, so that we are actually testing that convert_dtypes is being applied?
The concern is that if there is a bug in the `convert_dtypes` function, it won't be caught, because the expected values are built with that same function.
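One way to make the test independent of `convert_dtypes` is to construct the expected Series with an explicit nullable dtype. A minimal sketch (the column name here is hypothetical, not taken from the PR):

```python
import pandas as pd

# Build the expected Series with an explicit nullable dtype instead of
# deriving it from convert_dtypes(); a bug in convert_dtypes() would then
# still surface as a test failure.
expected = pd.Series([1, None, 3], dtype="Int64", name="column_int")

# Stand-in for the query result under test.
result = pd.Series([1, None, 3], dtype="Int64", name="column_int")

# Passes only if both the values and the Int64 dtype match exactly.
pd.testing.assert_series_equal(result, expected)
```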
The only reason I added `convert_dtypes` was to make the data types match between columns. But I forgot that we can use `check_dtype=False` instead. I would prefer `check_dtype=False` because it would be the quicker fix.
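The `check_dtype=False` alternative compares the values while ignoring the dtype mismatch. A minimal sketch:

```python
import pandas as pd

# result carries the nullable Int64 dtype, expected a plain int64;
# check_dtype=False lets the comparison pass on matching values
# even though the dtypes differ.
result = pd.Series([1, 2, 3], dtype="Int64")
expected = pd.Series([1, 2, 3], dtype="int64")

pd.testing.assert_series_equal(result, expected, check_dtype=False)
```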
It's still expected that calling query will return object dtype for a column that is integer-typed and contains nulls.
Using convert_dtypes() then converts it to the correct pandas dtype (in this case Int64). What we are trying to prevent is the previous behaviour, where query would return an integer column with nulls from a table as float.
If that is the case, then I think this is good to go for the genie scenario.
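The dtype behaviour discussed above can be sketched as follows (plain pandas, independent of the query layer in the PR):

```python
import pandas as pd

# An integer column containing a null is upcast to float64 by default,
# because numpy has no nullable integer dtype.
raw = pd.Series([1, None, 3])
assert str(raw.dtype) == "float64"

# convert_dtypes() re-infers the nullable pandas dtype, here Int64,
# so the integer values are preserved instead of becoming floats.
converted = raw.convert_dtypes()
print(converted.dtype)  # Int64
```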
* Adding support for Python 3.14 and dropping support for Python 3.9
Problem:
`convert_dtypes` introduces an "int64 data not serialized by JSON" error, and attribute-mismatch failures in integration tests such as `StringDtype` vs `object`.
Solution:
Replace `convert_dtypes` with the `dtype` argument when reading in a CSV to a pandas DF.
Testing:
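A minimal sketch of the solution: pass nullable dtypes up front via the `dtype` argument of `read_csv`, instead of calling `convert_dtypes()` afterwards (the column names here are made up for illustration):

```python
import io
import pandas as pd

# CSV with a missing integer cell and a missing string cell.
csv = io.StringIO("column_int,column_string\n1,a\n,b\n3,\n")

# Declaring nullable dtypes at read time avoids a later convert_dtypes()
# call: the integer column is read as Int64 (with <NA>) instead of float64.
df = pd.read_csv(csv, dtype={"column_int": "Int64", "column_string": "string"})
print(df.dtypes)
```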