Pandas' new high-precision csv float parser is sometimes quite low precision #24

Closed
CalebBell opened this issue Jan 25, 2021 · 1 comment


@CalebBell
Owner

Describe the bug
I was wondering why the CI started failing, and it turns out Pandas 1.2.0 changed some defaults for its CSV parser. One of those changes was to use a higher-precision floating-point converter by default. Chemicals reveals at least one bug in the new parser.

Minimal Reproducible Example

chemicals.viscosity.mu_data_VDI_PPDS_8['D']

In Pandas 1.1.2 when reading "0.00000000000001953" we get:
1.953E-14

In Pandas 1.2.1 we get:
1.95E-14

Additional context
This also breaks results for people using the data sources from this library.

Workaround
The old behavior can be restored with float_precision='legacy'. The two data files affected by this bug are now read with that setting in master. Ideally, Pandas will fix their bug. I didn't find any existing issue for this in a cursory search.
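
For anyone who wants to check their own install, here is a minimal sketch of the workaround. It assumes pandas 1.2 or newer (where the 'legacy' option exists) and uses a made-up one-column CSV held in memory instead of the real data file; the helper name read_d is only for illustration. Whether the default parser reproduces the truncated value depends on the pandas version, so no output is claimed for it here.

```python
import io

import pandas as pd

# Hypothetical one-column CSV holding the value quoted in this issue;
# the real library reads this number from one of its data files.
csv_text = "D\n0.00000000000001953\n"


def read_d(float_precision):
    """Parse the sample CSV with the given float_precision; return the one value."""
    df = pd.read_csv(io.StringIO(csv_text), float_precision=float_precision)
    return float(df["D"].iloc[0])


# None selects the default converter of the installed pandas version.
default_value = read_d(None)
# 'legacy' asks for the pre-1.2 converter (the workaround described above).
legacy_value = read_d("legacy")
# 'round_trip' is another documented option, slower but exact.
round_trip_value = read_d("round_trip")

print("default:   ", repr(default_value))
print("legacy:    ", repr(legacy_value))
print("round_trip:", repr(round_trip_value))
```

If the default line prints something near 1.95e-14 while the other two print something near 1.953e-14, the install is affected and passing float_precision explicitly is the safer choice.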

@CalebBell
Owner Author

I released 1.0.0 with this fix and have now submitted a bug report to pandas: pandas-dev/pandas#39514

Almost 40,000 bugs? I don't envy working on that project!
