cx_Oracle >=6.0.3 doesn't allow to use NaNs #184

leonid-butenko · 2018-05-23T13:19:55Z

I found here #91 that the usage of NaN values corrupts the data. We've been using NaNs for a few years now and with a recent update of cx_Oracle have realized it's not supported any longer. What are the alternatives for NaNs? From scientific point of view, having NaNs is quite essential and important.

here is the code snippet I use to reproduce the issue:

import cx_Oracle

c = cx_Oracle.connect(
    dsn=cx_Oracle.makedsn("db", 1521, sid="TEST"), user="USER", password="PWD")

cursor = c.cursor()

try:
    cursor.execute("drop table foo")
except:
    pass

cursor.execute("""
    create table foo(
        id integer primary key,
        data BINARY_FLOAT
    )
""")

cursor.execute("INSERT INTO foo (id, data) VALUES (:id, :data)", {"id": 1, "data": float(100.0)})
cursor.execute("UPDATE foo SET data=:data", {"data": float('nan')})

c.commit()

I get the following error:

>python ./test.py 
Traceback (most recent call last):
  File "./test.py", line 21, in <module>
    cursor.execute("UPDATE foo SET data=:data", {"data": float('nan')})
cx_Oracle.DatabaseError: DPI-1055: value is not a number (NaN) and cannot be used in Oracle numbers

Answer the following questions:

What is your version of Python? Is it 32-bit or 64-bit?

python 2.7.14, 64 bits

What is your version of cx_Oracle?

6.3

What is your version of the Oracle client (e.g. Instant Client)? How was it
installed? Where is it installed?

I tried with 11.2 and 12.2

What is your version of the Oracle Database?

11g enterprise

What is your OS and version?

ubuntu 18.04 and Debian 9

What compiler version did you use? For example, with GCC, run
gcc --version.

gcc (GCC) 4.8.5
Copyright (C) 2015 Free Software Foundation, Inc.

What environment variables did you set? How exactly did you set them?

I use anaconda cloud running in docker.

What exact command caused the problem (e.g. what command did you try to
install with)? Who were you logged in as?

>docker run -it --rm continuumio/miniconda bash
>conda install -c conda-forge -c anaconda  cx_oracle=6.0.3 oracle-instantclient libaio
>python test.py

What error(s) you are seeing?

Traceback (most recent call last):
File "./test.py", line 21, in
cursor.execute("UPDATE foo SET data=:data", {"data": float('nan')})
cx_Oracle.DatabaseError: DPI-1055: value is not a number (NaN) and cannot be used in Oracle numbers

The text was updated successfully, but these errors were encountered:

zzzeek · 2018-05-23T13:38:58Z

Oracle's documentation implies NaN is supported: https://docs.oracle.com/cd/E11882_01/server.112/e41084/sql_elements001.htm#SQLRF50979

this answer https://stackoverflow.com/a/28461506/34549 seems to indicate you need to use binary_float or binary_double datatypes. should that work ?

leonid-butenko · 2018-05-23T13:49:43Z

Thanks for finding links! Pls note, I use BINARY_FLOAT in the snippet above.

anthony-tuininga · 2018-05-23T15:02:40Z

Thanks, @zzzeek. Yes, NaN is supported, but only in BINARY_FLOAT and BINARY_DOUBLE. If you use it with NUMBER (which is the default type conversion) you may run into data corruption issues, which is why the change was made. So you simply need to use cursor.setinputsizes() with the type cx_Oracle.NATIVE_FLOAT to tell it you are inserting into a BINARY_FLOAT or BINARY_DOUBLE column.

zzzeek · 2018-05-23T15:27:12Z

Well, good thing I added logic to allow more specific setinputsizes setup for specific SQLAlchemy sub-types.

What are the downsides to NATIVE_FLOAT? I need to know when I'm using this and when I'm not. Again, we go to #68 (comment), where I changed SQLAlchemy to not setinputsizes() for numeric types. Especially, if the datatype is BINARY_DOUBLE, and in an executemany() there are lots of wide-precision numeric values coming in, how do I set NATIVE_FLOAT there without introducing truncation for the non-NaN values ?

anthony-tuininga · 2018-05-23T15:53:10Z

The downsides to NATIVE_FLOAT? Well, first of all it has a different set of values that it represents than does NUMBER. It handles NaN but won't handle values greater than 2^53 and the precision capabilities are different. Second, if the target type is different (NATIVE_FLOAT going to NUMBER, for example) there is a conversion that is required, which implies a performance penalty will be in place. So unfortunately the decision to use one type or the other is going to be based on the target -- which cx_Oracle can not know about. Most of the time this is NUMBER and the default works well but if you know otherwise, then you have to use setinputsizes() or accept the truncation/performance penalty.

You can use cursor.setinputsizes() for cursor.executemany() in exactly the same way that you use it for cursor.execute(). You are telling cx_Oracle that all of the data that follows will match the type you have specified. If this is not true on the first non-None value passed in and conversion is not possible then cx_Oracle silently picks a type based on that value instead. If this is not true on subsequent non-None values and conversion is not possible then you get an error. If you know the target is BINARY_FLOAT or BINARY_DOUBLE then you should use cursor.setinputsizes() to let cx_Oracle know to use the correct type.

What exactly do you mean by wide-precision numeric values coming in for calls to executemany()? If they are float values coming in they will be sent to the database unchanged when using NATIVE_FLOAT. Can you clarify your comment?

zzzeek · 2018-05-23T16:40:22Z

ah you know I was confusing DOUBLE with NUMBER. per https://docs.oracle.com/cd/B28359_01/server.111/b28285/sqlqr06.htm#i76716, BINARY_FLOAT and BINARY_DOUBLE are both floating point types. but...what is the precision we expect with these two types? my goal is to avoid the bug in #68. I need numeric values to round-trip perfectly given the precision that the type defines.

so far it seems like I should setinputsizes() to NATIVE_FLOAT with the BINARY_FLOAT and BINARY_DOUBLE types. I don't care about a slight performance penalty. I care about accuracy being maintained and no surprise truncation.

anthony-tuininga · 2018-05-23T17:18:26Z

BINARY_FLOAT is single precision floating point and BINARY_DOUBLE is double precision floating point (the same type that Python uses internally for its "float" type). So you will get the same precision as those values allow. See here for all the nitty gritty details, but the main point for double is you get 2^53 as the maximum integer that can be used and about 15 significant digits and a maximum value of 10^308. Oracle's NUMBER format has a far smaller scale but greater precision. You can typically get 38 significant digits and all of those can be used to represent an integer but the maximum value is only 10^126.

Given those constraints the input from Python and the target in the database are both important. If you have BINARY_FLOAT in the database you have to ensure that the input data is Python floats, integers or decimal values that will fit inside the constraints of BINARY_FLOAT. If you don't do so, you will get silent truncation. The same will be true for BINARY_DOUBLE. Clearly Python floats will always fit perfectly inside BINARY_DOUBLE but integers and decimals can easily exceed its constraints. For NUMBER data Python floats will always result in some truncation for some numbers. Most of the time that isn't an issue but its always there because NUMBER is a decimal format, not a binary format (unlike BINARY_FLOAT and BINARY_DOUBLE which are binary format). For that reason Python decimals and integers fit perfectly into Oracle NUMBER but again both integers and decimals can easily exceed the constraints of NUMBER. And for the same reason Python decimals will always result in conversion errors when going into binary format.

If it sounds complicated, that's because it is! Thankfully most of the time all of these "issues" can be ignored as the precision/scale that most people are dealing with are well within the constraints of all of these types. But if you deal with numbers of larger precision/scale and values like NaN on a regular basis then you have to know all of this stuff and know how to make the software behave the way it should. cx_Oracle has the ability to do this using cursor.setinputsizes() and tries its best to use reasonable defaults. There is no way that I am aware of to do it perfectly without programmer input -- but I'm interested in anything you might suggest to improve things!

zzzeek · 2018-05-23T20:16:22Z

Hi Anthony -

OK, I have Python float("nan") round tripping. But now what about the same thing #78 did for INFINITY? Decimal("NaN") should round trip as well. I'm calling setinputsizes(NATIVE_FLOAT):

cx_Oracle.DatabaseError) DPI-1043: invalid number [SQL: 'INSERT INTO t1 (intcol, numericcol) VALUES (:intcol, :numericcol)'] [parameters: [{'intcol': 1, 'numericcol': Decimal('NaN')}, {'intcol': 2, 'numericcol': Decimal('NaN')}]]

leonid-butenko · 2018-05-24T18:36:48Z

After more experimenting today I confirm that setinputsizes with NATIVE_FLOAT fixes the original problem. But not everywhere. As i mentioned above I use anaconda cloud to set up my environment. I have realized that using most recent cx_oracle module version 6.2.1 available from official anaconda channels still crashes even with setinputsizes. Then I ran more tests and figured out that among the following versions

6.0.3
6.1
6.2.1
6.3
6.3.1

only 6.2.1 behaves badly and produces the following error:

Traceback (most recent call last):
  File "test_cx.py", line 21, in <module>
    cursor.execute("INSERT INTO foo (id, data) VALUES (:id, :data)", {"id": 1, "data": float('nan')})
cx_Oracle.DatabaseError: DPI-1055: value is not a number (NaN) and cannot be used in Oracle numbers

So the question is, are you aware of this? Is it a kind of expected behavior of that particular version?

…FLOAT as requested in issue 184 (#184).

anthony-tuininga · 2018-05-24T22:42:51Z

@zzzeek, I have just pushed a commit that will permit binding decimal values to cx_Oracle.NATIVE_FLOAT. Truncation will occur, of course, if the decimal value exceeds the constraints of a double value, but since you are binding to cx_Oracle.NATIVE_FLOAT I think that makes sense. This is not permitted for cx_Oracle.NUMBER however, to avoid the problem noted in issue #68.

@leonid-butenko, apparently that was a regression in 6.2 which was corrected in 6.3 (and a test case added to the test suite to prevent it from happening again).

anthony-tuininga · 2018-05-30T20:06:32Z

Assuming all is well due to lack of activity.

zzzeek · 2018-05-30T22:53:41Z

hi Anthony -

havent' had time to test it. ill see if I can ping this issue when I get around to it.

leonid-butenko changed the title ~~cx_Oracle >=6.0.3 doesn't allow a usage of NaNs~~ cx_Oracle >=6.0.3 doesn't allow to use NaNs May 23, 2018

anthony-tuininga added the question label May 23, 2018

anthony-tuininga added a commit that referenced this issue May 24, 2018

Added support for binding decimal.Decimal values to cx_Oracle.NATIVE_…

9781810

…FLOAT as requested in issue 184 (#184).

anthony-tuininga closed this as completed May 30, 2018

sqlalchemy-bot mentioned this issue Nov 27, 2018

Passing NaN to Float column in Oracle leads to crash sqlalchemy/sqlalchemy#4264

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cx_Oracle >=6.0.3 doesn't allow to use NaNs #184

cx_Oracle >=6.0.3 doesn't allow to use NaNs #184

leonid-butenko commented May 23, 2018 •

edited

Loading

zzzeek commented May 23, 2018

leonid-butenko commented May 23, 2018

anthony-tuininga commented May 23, 2018

zzzeek commented May 23, 2018

anthony-tuininga commented May 23, 2018

zzzeek commented May 23, 2018

anthony-tuininga commented May 23, 2018

zzzeek commented May 23, 2018 •

edited

Loading

leonid-butenko commented May 24, 2018

anthony-tuininga commented May 24, 2018

anthony-tuininga commented May 30, 2018

zzzeek commented May 30, 2018

cx_Oracle >=6.0.3 doesn't allow to use NaNs #184

cx_Oracle >=6.0.3 doesn't allow to use NaNs #184

Comments

leonid-butenko commented May 23, 2018 • edited Loading

Answer the following questions:

zzzeek commented May 23, 2018

leonid-butenko commented May 23, 2018

anthony-tuininga commented May 23, 2018

zzzeek commented May 23, 2018

anthony-tuininga commented May 23, 2018

zzzeek commented May 23, 2018

anthony-tuininga commented May 23, 2018

zzzeek commented May 23, 2018 • edited Loading

leonid-butenko commented May 24, 2018

anthony-tuininga commented May 24, 2018

anthony-tuininga commented May 30, 2018

zzzeek commented May 30, 2018

leonid-butenko commented May 23, 2018 •

edited

Loading

zzzeek commented May 23, 2018 •

edited

Loading