Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

All rational numbers are parsed as Decimals.Decimal instead of Float #254

Closed
svilupp opened this issue Sep 5, 2022 · 3 comments
Closed

Comments

@svilupp
Copy link
Contributor

svilupp commented Sep 5, 2022

I have noticed unexpected behaviour. It might be intended but I haven't found any reference in the documentation.

Expected behaviour: rational numbers like 1.0 and 0.1 will be represented as Float

Actual behaviour: all non-integer numbers are represented as Decimals.Decimal

MWE:

using LibPQ
# mocking up a download with rational numbers
sql="""
select avg(1.0) as average, 1.0 as float_number, 0.0000000001 as really_small_number
from bap_aed.aed_metrics
"""
result = execute(conn,sql);

# you can see that LibPQ creates Decimals.Decimal types
result.column_types
# 3-element Vector{Type}:
#  Decimals.Decimal
#  Decimals.Decimal
#  Decimals.Decimal

Package version (but I've seen the same behaviour on 0.9)
[194296ae] LibPQ v1.14.0

versioninfo() (but it was the same on 1.7)
Julia Version 1.8.0
Commit 5544a0fab76 (2022-08-17 13:38 UTC)
Platform Info:
OS: macOS (arm64-apple-darwin21.3.0)
CPU: 8 × Apple M1 Pro
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-13.0.1 (ORCJIT, apple-m1)
Threads: 6 on 6 virtual cores

@iamed2
Copy link
Collaborator

iamed2 commented Sep 6, 2022

Only columns with type numeric/decimal will appear this way; float4/real and float8/double precision will not.

The numeric type is an arbitrary-precision number that cannot be represented fully in a Float64. However, Decimals.Decimal is not a great type, and a replacement (or improvements to that package) would be welcome.

You can either force the type in Julia:

julia> execute(conn, sql; type_map=Dict(:numeric => Float64)).column_types
3-element Vector{Type}:
 Float64
 Float64
 Float64

or in SQL:

julia> sql="""
       select avg(1.0)::float8 as average, 1.0::float8 as float_number, 0.0000000001::float8 as really_small_number;
       """
"select avg(1.0)::float8 as average, 1.0::float8 as float_number, 0.0000000001::float8 as really_small_number;\n"

julia> execute(conn, sql).column_types
3-element Vector{Type}:
 Float64
 Float64
 Float64

@iamed2
Copy link
Collaborator

iamed2 commented Sep 6, 2022

psycopg2 in Python does the same thing, using the decimal module in Python's standard library.

https://www.psycopg.org/docs/faq.html#faq-float

@svilupp
Copy link
Contributor Author

svilupp commented Sep 7, 2022

Thank you for the explanation, @iamed2!

I've been eyeing type_map in the docs to force the conversion, so the example is much appreciated.

I'll close the issue as LibPQ works as expected.


Some additional background:
I've used a standard pattern of download SQL -> DataFrame -> save to .arrow for persistence/version control.
Upon opening, my saved data was rubbish.
It turns out that Arrow.write(...;compress=:lz4) didn't know what to do with Decimals type and saved it however without any warning (an issue opened here).

As for the python side, you're right.
I've just checked that both psycopg2 and sqlalchemy (using the former as the driver) return decimals.
The challenge is that I (and perhaps others) use Pandas to download data (pd.read_sql_query), which forces the Float type, so I will have never seen the decimal type come out of a SQL / I didn't know it's the expected behaviour.

Good learning!

@svilupp svilupp closed this as completed Sep 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants