New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

more confusion about binding NATIVE_INT #257

Closed
zzzeek opened this Issue Jan 19, 2019 · 12 comments

Comments

Projects
None yet
3 participants
@zzzeek
Copy link

zzzeek commented Jan 19, 2019

"question" label here, because as usual I can see why this is happening, it is just another surprise (note I differentiate between "surprise" and "expected" :) )

In SQLAlchemy, we have an outputtypehandler that in one particular case, is essentially this:

def handler(cursor, name, default_type, size, precision, scale):

    if precision and scale == 0:
        return cursor.var(
            cx_Oracle.NATIVE_INT,
            255,
            arraysize=cursor.arraysize,
            outconverter=None,
        )

    else:
        return None

it seems well intentioned enough until one of my users decided to stick 9999999999999999999 into it, producing the error:

cx_Oracle.DatabaseError: ORA-01455: converting column overflows integer datatype

Apparently the max int size for NATIVE_INT is 32 bits, so...maybe if you can confirm that, and what kind of datatype is coming back if I dont do that. What's the purpose of NATIVE_INT exactly, and do I have to worry about this on the input side as well (though this user hasn't tried that yet :) )

I can either not return a var at all, or I can do this:

    return cursor.var(
        int, 255, arraysize=cursor.arraysize, outconverter=None
    )

and both of those seem to work great for everything I'm testing plus these new tests. In #208 it seemed like I had decided to use NATIVE_INT but that was on the input side. On the output side can you confirm I should set "int" on the var and/or not return the var?

@zzzeek

This comment has been minimized.

Copy link
Author

zzzeek commented Jan 19, 2019

it seems like all the way back to cx_Oracle 5.2, which note doesn't even have NATIVE_INT, I can do this:

    return cursor.var(
        int, 255, arraysize=cursor.arraysize, outconverter=None
    )

can you confirm that's the best practice here? (or at least a perfectly fine practice when I see precision and scale == 0 ?)

@cjbj cjbj added the question label Jan 19, 2019

@zzzeek

This comment has been minimized.

Copy link
Author

zzzeek commented Jan 19, 2019

update, on Python 2 I would need to say "long" and not "int" for the larger ints, I get another kind of int overflow otherwise. It seems like cx_Oracle does the right thing if I omit the handler entirely for ints so going to see if that passes, it's unclear if I hardcode it to "long" that it is using more resources under py2k for non-large ints or if it just doesn't matter.

@anthony-tuininga

This comment has been minimized.

Copy link
Member

anthony-tuininga commented Jan 22, 2019

@zzzeek, a few notes that may help answer your questions:

  • the max size for NATIVE_INT is a signed 64-bit integer
  • the data type returned when using NATIVE_INT is a Python integer; in Python 2.x on Windows this can be a Python long integer if the integer exceeds the size capable of being stored in a plain integer; on Python 3.x this difference no longer exists, of course
  • if using NATIVE_INT on Windows and the value exceeds the value of a signed long value, the value is silently truncated (this can be fixed and I plan to fix it)
  • cx_Oracle examines precision and scale for numeric values and automatically returns an integer if the precision and scale warrant it; as such the default for fetch is perfectly adequate in most instances
  • in cx_Oracle master (what will become 7.1 shortly), if you use the value int it will return an integer if the value is an integer; otherwise, cx_Oracle will return a floating point if it can be represented accurately (less than or equal to 15 digits of precision) and a decimal value in all other cases

I hope this explains things well enough? If not, ask further questions as needed.

anthony-tuininga added a commit that referenced this issue Jan 22, 2019

Ensure that values that exceed sizeof(long) for Python 2 on Windows w…
…hen using

the Oracle type NATIVE_INT are not silently truncated
(#257).
@zzzeek

This comment has been minimized.

Copy link
Author

zzzeek commented Jan 22, 2019

* in cx_Oracle master (what will become 7.1 shortly), if you use the value int it will return an integer if the value is an integer; otherwise, cx_Oracle will return a floating point if it can be represented accurately (less than or equal to 15 digits of precision) and a decimal value in all other cases

For the moment I'm letting cx_Oracle handle the return type and if scale == 0 I'm returning None for the handler. The most important thing for me is if the value is an integer , a Python int is returned. Above, it seems like in 7.1 you are reversing this to return a floating point value? If scale == 0 it should be integer. I tried using an output type handler with the value of int, but then in Python 2 it crashes for larger ints (I guess I should set it to long on python 2 if the precision is high?)

here is my latest approach simplified:

def handler(cursor, name, default_type, size, precision, scale):
    # we already know we are NUMERIC here

    if precision:
        if scale == 0:
            return None
        else:
            type_ = cx_Oracle.NATIVE_FLOAT

    else:
        if scale == 0:
            return None
        else:
            type_ = cx_Oracle.NATIVE_FLOAT

    return cursor.var(
        type_,
        255,
        arraysize=cursor.arraysize,
    )

if scale == 0, it has to return integer, not floating point. will this suddenly change in 7.1 ? (what issue is being fixed?)

@anthony-tuininga

This comment has been minimized.

Copy link
Member

anthony-tuininga commented Jan 22, 2019

cx_Oracle has always checked for scale == 0 and returned integers (or long integers for Python 2 where applicable). If scale was not 0 then a floating point number was returned. That can result in truncation, however, since Oracle numbers can have up to 38 digits of precision but floating point numbers can't go beyond 15 digits of precision. So for the new version I check to see if there are more than 15 digits of precision, and if so, I automatically return a decimal value instead. Does that make more sense?

What you appear to be doing above, however, is turning off that approach and insisting that the value be returned as a floating point number -- so truncation will take place when the number of digits of precision exceeds 15. The performance will also suffer as the Oracle database will have to transform Oracle numbers into native floating point numbers during fetch. Is there a reason you're doing that? I don't see the point of the handler above. The default processing should work fine!

I tried using an output type handler with the value of int, but then in Python 2 it crashes for larger ints (I guess I should set it to long on python 2 if the precision is high?)

Can you let me know what values you tried with Python 2? And which platform and cx_Oracle version? You should be able to use "int" on both Python 2 and 3 -- at least I've always done so -- so if that isn't the case I'd like to know about it. :-)

@zzzeek

This comment has been minimized.

Copy link
Author

zzzeek commented Jan 22, 2019

OK don't worry about the NATIVE_FLOAT part, that's only a partial form of the code and in that codepath, the user has requested that they explicitly do not want a Decimal object back, they want a Python float. We're here worried about integers.

what I want is, if scale==0 , I get Python int or long back. When you say, "however, since Oracle numbers can have up to 38 digits of precision but floating point numbers can't go beyond 15 digits of precision", you're confusing me, because we're talking about ints, not floating points. ints in Python 3 are variable length so have no size limit. So I don't want Decimal or float() if scale == 0, because int can handle it. is there a case I'm missing?

as for the python 2 crash, here is that:

import cx_Oracle

conn = cx_Oracle.connect(
    user="scott",
    password="tiger",
    dsn=cx_Oracle.makedsn("oracle1120", 1521, sid=""),
)


def handler(cursor, name, default_type, size, precision, scale):
    # fails: cx_Oracle.DatabaseError: OCI-22053: overflow error
    return cursor.var(int, 255, arraysize=cursor.arraysize, outconverter=None)

    # works
    # return cursor.var(long, 255, arraysize=cursor.arraysize, outconverter=None)


conn.outputtypehandler = handler

cursor = conn.cursor()
cursor.execute("SELECT CAST(9999999999999999999 AS NUMBER(19,0)) FROM dual")
cursor.fetchone()

that's cx_Oracle 7.0.0, Python is:

Python 2.7.15 (default, Oct 15 2018, 15:24:06) 
[GCC 8.1.1 20180712 (Red Hat 8.1.1-5)] on linux2

oracle is 11xe.

@anthony-tuininga

This comment has been minimized.

Copy link
Member

anthony-tuininga commented Jan 22, 2019

what I want is, if scale==0 , I get Python int or long back. When you say, "however, since Oracle numbers can have up to 38 digits of precision but floating point numbers can't go beyond 15 digits of precision", you're confusing me, because we're talking about ints, not floating points. ints in Python 3 are variable length so have no size limit. So I don't want Decimal or float() if scale == 0, because int can handle it. is there a case I'm missing?

Sorry for the confusion. If scale == 0 an integer is returned. Full stop. :-) In Python 2 you get an integer until it exceeds the value capable of being stored in a C long variable after which it returns a "long" value. Is that clear enough?

What I was talking about is when scale != 0 and the Oracle number can refer to either an integer or a number with a decimal point. In that case, the default behaviour is to return a floating point number. What has changed in the new version (yet to be released) is that if the number of digits of precision in the actual number that is returned (not the metadata) exceeds 15, then a decimal number is returned in order to avoid truncation. So even if you have number(24, 2) but you've stored the number 1.5 (2 digits of precision) you will get back a floating point number, but if you store a value with more than 15 digits of precision you will get back a decimal number. Is that also clear?

I will see if I can replicate the problem with cx_Oracle 7.0 and get back to you.

@zzzeek

This comment has been minimized.

Copy link
Author

zzzeek commented Jan 22, 2019

taht's all great for the floating point stuff, we are usually using Decimal with our return handler for non-ints. the float thing is to support API contracts where the user explcitly requests floats or decimals back, rather than it being arbitrary / based on driver version / etc.

@anthony-tuininga

This comment has been minimized.

Copy link
Member

anthony-tuininga commented Jan 22, 2019

Ok. I just tried your example. And yes, you get error OCI-22053: overflow error if you tell it you explicitly want an integer. If you let cx_Oracle do its own thing it will return an integer (well, Python long integer to be more precise in Python 2) automatically. In the new version of cx_Oracle (yet to be released) this will also work as expected (when using an output type handler to explicitly return integers). I discovered this little discrepancy between Python 2 and 3 when I was looking into bug #241.

@zzzeek

This comment has been minimized.

Copy link
Author

zzzeek commented Jan 22, 2019

yup I'm letting cx_Oracle decide now. I think many years ago there was some case where an int was coming back as a float, when it came from a sequence or something, and that was where this logic came from, but I don't know what happened to that case now.

@anthony-tuininga

This comment has been minimized.

Copy link
Member

anthony-tuininga commented Jan 22, 2019

Right. At some point (I don't know exactly when!) I also added the case where the metadata says that it is just a "number" and doesn't specify precision or scale at all -- which is frequently the case for numeric expressions. In that case I try to return an integer and if that fails then I return a floating point number (and soon, if the number of digits of precision exceeds 15) a decimal number.

@anthony-tuininga

This comment has been minimized.

Copy link
Member

anthony-tuininga commented Jan 31, 2019

Assuming that your questions have been answered!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment