Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: whole-number numerics read from PG were losing trailing zeroes #867

Merged
merged 3 commits into from
Jun 13, 2023

Conversation

olsen232
Copy link
Collaborator

Description

Also strengthened our roundtrip tests to catch any other bugs like this.
Didn't find any other data-loss bugs, but found a weird property that reading float32s from PG would be rounded to the nearest decimal value that accurately describes the float32. The resulting value actually has more bits than a float32. Kart can store the extra bits no problem since it stores everything in float64, but it's better if we don't store these extra bits - they don't mean anything, and they don't roundtrip through other database types.

Related links:

Fixes #863

Checklist:

  • Have you reviewed your own change?
  • Have you included test(s)?
  • Have you updated the changelog?

- don't rstrip 0s from numerics (IMPORTANT)
- make sure float32's from PG are actually float32s
- make sure mysql can write PK 0
@olsen232 olsen232 requested review from craigds and rcoup June 12, 2023 23:02
Copy link
Member

@craigds craigds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What did we decide to do about unconstrained numerics imported from postgres and then checked out into MSSQL/MySQL?

AFAICT they will get truncated at scale=0.

Not sure how we can handle them better than that. We could add some sensible nonzero scale to them maybe?

@@ -19,6 +19,7 @@ _When adding new entries to the changelog, please include issue/PR numbers where
- Fixes a bug where git-lfs (and potentially other subprocesses) wasn't working reliably with helper mode, affects macOS and Linux. [#853](https://github.com/koordinates/kart/issues/853)
- `import` now informs the user to use `add-dataset` instead if they try to import a table that is already inside the working copy. [#249](https://github.com/koordinates/kart/issues/249)
- Fixes a bug where datasets with no CRS couldn't be checked out into a Geopackage working copy. [#855](https://github.com/koordinates/kart/issues/855)
- Fixes a bad data-loss bug where whole-number numerics read from a Postgres database were losing trailing zeroes (eg 100 was read as 1). [#863](https://github.com/koordinates/kart/issues/863)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd drop the 'bad data-loss bug' here. The bug isn't necessarily bad - depends what you're doing and for many users it won't be relevant or important.

It's worth highlighting bigger than a bottom-of-list item though, so the user can make their own decision. Maybe let's move this to its own section? how about this:

Fix: Corruption of Postgres NUMERIC values

⚠️ This release fixes a bug causing corruption of NUMERIC fields when read from Postgres databases.

If your repositories use the NUMERIC or NUMERIC(a,b) types and were imported/committed from a Postgres source or working copy, you may have lost data.

Specifically, any numeric values with a scale of zero (no digits after the decimal point) were affected. If the only numerics in your repositories were defined with a scale greater than zero, no data corruption should have occured.

For example, a value of 100 in a postgres database with the type NUMERIC(10, 0) or NUMERIC would have been imported as 1 - losing its trailing zeroes.

However, a value of 100.0 in a field declared as NUMERIC(10, 1) would not have been affected.

Other data types (integer, floating-point etc) are not affected, and repos sourced from other SQL databases (Geopackage, MSSQL, MySQL) were not affected.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll put this in when I reformat this changelog entry for the actual release 👍


def python_postread(self, value):
if value is not None and self.size == 32:
return self.float32_packer.unpack(self.float32_packer.pack(value))[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably need to add a comment on why

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

kart/sqlalchemy/adapter/postgis.py Outdated Show resolved Hide resolved
@olsen232 olsen232 requested a review from craigds June 13, 2023 01:59
@olsen232 olsen232 merged commit 4fbaba8 into master Jun 13, 2023
11 checks passed
@olsen232 olsen232 deleted the numeric-fix branch June 13, 2023 03:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Kart truncates trailing zeroes in values of type numeric
2 participants