-
Notifications
You must be signed in to change notification settings - Fork 358
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cx_Oracle >=6.0.3 doesn't allow to use NaNs #184
Comments
Oracle's documentation implies NaN is supported: https://docs.oracle.com/cd/E11882_01/server.112/e41084/sql_elements001.htm#SQLRF50979 this answer https://stackoverflow.com/a/28461506/34549 seems to indicate you need to use binary_float or binary_double datatypes. should that work ? |
Thanks for finding links! Pls note, I use BINARY_FLOAT in the snippet above. |
Thanks, @zzzeek. Yes, NaN is supported, but only in BINARY_FLOAT and BINARY_DOUBLE. If you use it with NUMBER (which is the default type conversion) you may run into data corruption issues, which is why the change was made. So you simply need to use cursor.setinputsizes() with the type cx_Oracle.NATIVE_FLOAT to tell it you are inserting into a BINARY_FLOAT or BINARY_DOUBLE column. |
Well, good thing I added logic to allow more specific setinputsizes setup for specific SQLAlchemy sub-types. What are the downsides to NATIVE_FLOAT? I need to know when I'm using this and when I'm not. Again, we go to #68 (comment), where I changed SQLAlchemy to not setinputsizes() for numeric types. Especially, if the datatype is BINARY_DOUBLE, and in an executemany() there are lots of wide-precision numeric values coming in, how do I set NATIVE_FLOAT there without introducing truncation for the non-NaN values ? |
The downsides to NATIVE_FLOAT? Well, first of all it has a different set of values that it represents than does NUMBER. It handles NaN but won't handle values greater than 2^53 and the precision capabilities are different. Second, if the target type is different (NATIVE_FLOAT going to NUMBER, for example) there is a conversion that is required, which implies a performance penalty will be in place. So unfortunately the decision to use one type or the other is going to be based on the target -- which cx_Oracle can not know about. Most of the time this is NUMBER and the default works well but if you know otherwise, then you have to use setinputsizes() or accept the truncation/performance penalty. You can use cursor.setinputsizes() for cursor.executemany() in exactly the same way that you use it for cursor.execute(). You are telling cx_Oracle that all of the data that follows will match the type you have specified. If this is not true on the first non-None value passed in and conversion is not possible then cx_Oracle silently picks a type based on that value instead. If this is not true on subsequent non-None values and conversion is not possible then you get an error. If you know the target is BINARY_FLOAT or BINARY_DOUBLE then you should use cursor.setinputsizes() to let cx_Oracle know to use the correct type. What exactly do you mean by wide-precision numeric values coming in for calls to executemany()? If they are float values coming in they will be sent to the database unchanged when using NATIVE_FLOAT. Can you clarify your comment? |
ah you know I was confusing DOUBLE with NUMBER. per https://docs.oracle.com/cd/B28359_01/server.111/b28285/sqlqr06.htm#i76716, BINARY_FLOAT and BINARY_DOUBLE are both floating point types. but...what is the precision we expect with these two types? my goal is to avoid the bug in #68. I need numeric values to round-trip perfectly given the precision that the type defines. so far it seems like I should setinputsizes() to NATIVE_FLOAT with the BINARY_FLOAT and BINARY_DOUBLE types. I don't care about a slight performance penalty. I care about accuracy being maintained and no surprise truncation. |
BINARY_FLOAT is single precision floating point and BINARY_DOUBLE is double precision floating point (the same type that Python uses internally for its "float" type). So you will get the same precision as those values allow. See here for all the nitty gritty details, but the main point for double is you get 2^53 as the maximum integer that can be used and about 15 significant digits and a maximum value of 10^308. Oracle's NUMBER format has a far smaller scale but greater precision. You can typically get 38 significant digits and all of those can be used to represent an integer but the maximum value is only 10^126. Given those constraints the input from Python and the target in the database are both important. If you have BINARY_FLOAT in the database you have to ensure that the input data is Python floats, integers or decimal values that will fit inside the constraints of BINARY_FLOAT. If you don't do so, you will get silent truncation. The same will be true for BINARY_DOUBLE. Clearly Python floats will always fit perfectly inside BINARY_DOUBLE but integers and decimals can easily exceed its constraints. For NUMBER data Python floats will always result in some truncation for some numbers. Most of the time that isn't an issue but its always there because NUMBER is a decimal format, not a binary format (unlike BINARY_FLOAT and BINARY_DOUBLE which are binary format). For that reason Python decimals and integers fit perfectly into Oracle NUMBER but again both integers and decimals can easily exceed the constraints of NUMBER. And for the same reason Python decimals will always result in conversion errors when going into binary format. If it sounds complicated, that's because it is! Thankfully most of the time all of these "issues" can be ignored as the precision/scale that most people are dealing with are well within the constraints of all of these types. But if you deal with numbers of larger precision/scale and values like NaN on a regular basis then you have to know all of this stuff and know how to make the software behave the way it should. cx_Oracle has the ability to do this using cursor.setinputsizes() and tries its best to use reasonable defaults. There is no way that I am aware of to do it perfectly without programmer input -- but I'm interested in anything you might suggest to improve things! |
Hi Anthony - OK, I have Python float("nan") round tripping. But now what about the same thing #78 did for INFINITY? Decimal("NaN") should round trip as well. I'm calling setinputsizes(NATIVE_FLOAT):
|
After more experimenting today I confirm that setinputsizes with NATIVE_FLOAT fixes the original problem. But not everywhere. As i mentioned above I use anaconda cloud to set up my environment. I have realized that using most recent cx_oracle module version 6.2.1 available from official anaconda channels still crashes even with setinputsizes. Then I ran more tests and figured out that among the following versions
only 6.2.1 behaves badly and produces the following error:
So the question is, are you aware of this? Is it a kind of expected behavior of that particular version? |
…FLOAT as requested in issue 184 (#184).
@zzzeek, I have just pushed a commit that will permit binding decimal values to cx_Oracle.NATIVE_FLOAT. Truncation will occur, of course, if the decimal value exceeds the constraints of a double value, but since you are binding to cx_Oracle.NATIVE_FLOAT I think that makes sense. This is not permitted for cx_Oracle.NUMBER however, to avoid the problem noted in issue #68. @leonid-butenko, apparently that was a regression in 6.2 which was corrected in 6.3 (and a test case added to the test suite to prevent it from happening again). |
Assuming all is well due to lack of activity. |
hi Anthony - havent' had time to test it. ill see if I can ping this issue when I get around to it. |
I found here #91 that the usage of NaN values corrupts the data. We've been using NaNs for a few years now and with a recent update of cx_Oracle have realized it's not supported any longer. What are the alternatives for NaNs? From scientific point of view, having NaNs is quite essential and important.
here is the code snippet I use to reproduce the issue:
I get the following error:
Answer the following questions:
python 2.7.14, 64 bits
6.3
installed? Where is it installed?
I tried with 11.2 and 12.2
11g enterprise
ubuntu 18.04 and Debian 9
gcc --version
.gcc (GCC) 4.8.5
Copyright (C) 2015 Free Software Foundation, Inc.
I use anaconda cloud running in docker.
install with)? Who were you logged in as?
The text was updated successfully, but these errors were encountered: