Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DB_TYPE_NVARCHAR recipe from cx_Oracle does not seem to work with oracledb #12

Closed
zzzeek opened this issue Jun 8, 2022 · 17 comments
Closed
Labels
bug Something isn't working patch available

Comments

@zzzeek
Copy link

zzzeek commented Jun 8, 2022

hey Anthony -

we here are still working with the changes you suggested in oracle/python-cx_Oracle#596. These are working for cx_Oracle but failing for oracledb.

Test script:

#import cx_Oracle
import oracledb as cx_Oracle
import random


conn = cx_Oracle.connect(
    user="scott",
    password="tiger",
    dsn=cx_Oracle.makedsn("oracle18c", 1521, service_name="xe"),
)


cursor = conn.cursor()

try:
    cursor.execute("drop table long_text")
except:
    pass

cursor.execute(
    """
CREATE TABLE long_text (
    x INTEGER,
    y NCLOB,
    z INTEGER
)
"""
)


# the third character is the failure character
word_seed = "ab🐍’«cdefg"

data = " ".join(
    "".join(random.choice(word_seed) for j in range(150))
    for i in range(100)
)

# succeeds
# cursor.setinputsizes(**{"y": cx_Oracle.NCLOB})

# fails with oracledb only:
# ORA-01461: can bind a LONG value only for insert into a LONG column
cursor.setinputsizes(**{"y": cx_Oracle.DB_TYPE_NVARCHAR})

# no setinputsizes: fails on oracledb only with:
# ORA-01483: invalid length for DATE or NUMBER bind variable

cursor.execute(
    "INSERT INTO long_text (x, y, z) VALUES (:x, :y, :z)",
    {"x": 5, "y": data, "z": 10},
)

With cx_Oracle. all three setinputsizes patterns: using NCLOB, using DB_TYPE_NVARCHAR, not calling setinputsizes, all succeed.

using oracledb: NCLOB succeeds, DB_TYPE_NVARCHAR produces "ORA-01461: can bind a LONG value only for insert into a LONG column", and not using setinputsizes produces "ORA-01483: invalid length for DATE or NUMBER bind variable".

I am trying to integrate every improvement suggested in the above mentioned issue as can be seen in this patch: https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/3903 tests all succeed for cx_Oracle but we have those failures for oracledb.

FWIW, before I made those changes, we do have oracledb passing all tests that cx_Oracle passes, very good!

@zzzeek zzzeek added the bug Something isn't working label Jun 8, 2022
@anthony-tuininga
Copy link
Member

Thanks, @zzzeek, I'll take a look and get back to you on that!

@cjbj
Copy link
Member

cjbj commented Jun 8, 2022

For the record, there is no error in Thick mode, only with Thin mode.

@cjbj
Copy link
Member

cjbj commented Jun 8, 2022

@zzzeek in cases it's relevant, and you haven't seen the new feature, the use of an output type handler for fetching LOBs as string/bytes is now obsoleted by oracledb.defaults.fetch_lobs = False, see Fetching LOBs as Strings and Bytes.

@zzzeek
Copy link
Author

zzzeek commented Jun 8, 2022

@zzzeek in cases it's relevant, and you haven't seen the new feature, the use of an output type handler for fetching LOBs as string/bytes is now obsoleted by oracledb.defaults.fetch_lobs = False, see Fetching LOBs as Strings and Bytes.

yes, this might be relevant at some point, but for now we are operating with the cx_Oracle dialect with just some small adaptions for oracledb and was hoping to start with that before we refactor them apart further.

@anthony-tuininga
Copy link
Member

anthony-tuininga commented Jun 8, 2022

Ok. I have discovered the source of the issue. It has nothing to do with the special characters! It has to do with the fact that column data that exceeds 32767 bytes must be sent after column data that does not exceed 32767 bytes. If you change the data to be less than 32767 bytes the problem goes away. If you change the order of the bind variables in the SQL statement so that the columns greater than 32767 bytes are at the end the problem also goes away! I'll have this happen automatically internally so that you don't have to do so -- but you have a workaround if that is of interest!

@zzzeek
Copy link
Author

zzzeek commented Jun 8, 2022

wow I have so many questions I have to bullet them:

  • what happens if there are two columns that are longer than 32767?
  • previously we were using BLOB for this so I assume the slower approach there didnt have any issue?
  • is it still recommended I try to bind these withDB_TYPE_NVARCHAR ?
  • how is the "thick" driver and cx_Oracle doing it?
  • does the oracledb.defaults.fetch_lobs = False parameter affect this? note this is on INSERT, is that affected?

@anthony-tuininga
Copy link
Member

wow I have so many questions I have to bullet them:

  • what happens if there are two columns that are longer than 32767?
  • previously we were using BLOB for this so I assume the slower approach there didnt have any issue?
  • is it still recommended I try to bind these withDB_TYPE_NVARCHAR ?
  • how is the "thick" driver and cx_Oracle doing it?
  • does the oracledb.defaults.fetch_lobs = False parameter affect this? note this is on INSERT, is that affected?

Here are your answers:

  • All of the columns less than 32767 bytes are sent first, then all of the columns greater than 32767 bytes, in the original order
  • correct, the BLOB locator is less than 32767 bytes so there was no issue
  • yes, you should still bind with DB_TYPE_NVARCHAR to account for when the database character set is incapable of handling that data; this is becoming increasingly uncommon as the default character set is AL32UTF8 (aka UTF-8) but there are still databases out there with a single byte character set, and using DB_TYPE_VARCHAR would result in data loss
  • the thick driver is doing it by simply rearranging the data as it is sent to the server; I will do the same for the thin driver
  • the oracledb.defaults.fetch_lobs parameter only affects fetching it does not affect binding, so this situation would not be affected if you changed that parameter

@zzzeek
Copy link
Author

zzzeek commented Jun 8, 2022

perfect answers! the upshot is "you'll fix it" :) great

not sure if you know but I assume Oracle INSERT only works with a single VALUES clause.....for most DBs we will be looking to batch INSERTs as single statements so that we can use RETURNING.

I think you mentioned Oracle executemany() can use RETURNING?

@anthony-tuininga
Copy link
Member

Yes, executemany() supports DML returning. See test cases 1601, 1606, 1610 and 1611 in the DML returning test suite for examples. I've seen the way that PostgreSQL performs batch insert and yes, Oracle is different. :-)

@zzzeek
Copy link
Author

zzzeek commented Jun 8, 2022

well the many values() thing works on MySQL (which is starting to support RETURNING), SQLite (now has RETURNING), Postgresql, so, it's got a lot of pull on this end. i will try to consdier how oracle's version might fit in to what im going to build for all of them.

@anthony-tuininga
Copy link
Member

Sounds good. If you run into any trouble, I'm sure you'll let us know!

anthony-tuininga added a commit that referenced this issue Jun 9, 2022
@anthony-tuininga
Copy link
Member

@zzzeek, see the supplied patch which addresses the bind ordering. I've added appropriate test cases as well. The setinputsizes() call is needed solely to avoid data loss when the main database character set is incapable of encoding those special characters -- it will work without it if your database character set is the default AL32UTF8 (aka UTF-8). Let me know if that addresses your issues!

@zzzeek
Copy link
Author

zzzeek commented Jun 9, 2022

looking good, this works and it looks like I can link to the git main branch in my tox file so it will work on CI. thanks!

@anthony-tuininga
Copy link
Member

This has been included in python-oracledb 1.0.1 which was just released.

@chichur
Copy link

chichur commented Feb 15, 2023

Hi @anthony-tuininga! I have similar problem when i try insert large values use bind variables. Let' see code below.

create test table:

create table x_test(name varchar2(10000 char), date d1);

Python 3.10, oracledb 1.2.1, Oracle 19

import oracledb
import datetime

dsn = f"*************************"
connection = oracledb.connect(dsn)
cursor = connection.cursor()

test_value = "x" * 9000
# if not this then raise ORA-01483: invalid length for DATE or NUMBER bind variable
cursor.setinputsizes(name=oracledb.DB_TYPE_CLOB)

# when operator in not 'begin end' block, code will be running without errors.
cursor.execute("""
            begin
                insert into x_test(name, d1) values(:name, :d);
            end;
        """,   name=test_value,
               d=datetime.datetime.now())

# this execute will be running without error 
# cursor.setinputsizes(name=oracledb.DB_TYPE_CLOB)
cursor.execute("""
            begin
                insert into x_test(name, d1) values(:name, sysdate);
            end;
        """,   name=test_value))
connection.commit()
connection.close()

@anthony-tuininga
Copy link
Member

Can you raise a new issue asking this question?

@chichur
Copy link

chichur commented Feb 15, 2023

Ok #146

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working patch available
Projects
None yet
Development

No branches or pull requests

4 participants