Batch-inserting via executemany() fails with NotSupportedError: Python value cannot be converted to a database value #236

Closed
andrenam opened this Issue Nov 9, 2018 · 9 comments

andrenam commented Nov 9, 2018

Batch-inserting many rows in chunks of 1000, 500, 100, 50, 20, or 10 via executemany() fails with
NotSupportedError: Python value cannot be converted to a database value
If I insert the rows one by one, it works.
This may be a similar problem to the one described in
https://stackoverflow.com/questions/50713180/cx-oracle-notsupportederror-python-value-cannot-be-converted-to-a-database-valu

Oracle Server

Oracle Database 12c Standard Edition Release 12.1.0.2.0 - 64bit Production
PL/SQL Release 12.1.0.2.0 - Production
CORE 12.1.0.2.0 Production
TNS for 64-bit Windows: Version 12.1.0.2.0 - Production
NLSRTL Version 12.1.0.2.0 - Production

Client Software

Client OS: Windows 7 Pro 64bit
Python 3.6.7, 64bit (via Anaconda)
cx_Oracle 7.0.0
Oracle Client Version: 11.2.0.1.0
(determined via https://stackoverflow.com/questions/1171643/whats-the-best-way-to-determine-which-version-of-oracle-client-im-running)
Windows PATH:

C:\ProgramData\Anaconda3\Library\bin;C:\oracle\product\11.2.0\client_1\bin;C:\Program Files (x86)\Intel\iCLS Client\;C:\Program Files\Intel\iCLS Client\;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program Files\Intel\Intel(R) Management Engine Components\DAL;C:\Program Files\Intel\Intel(R) Management Engine Components\IPT;C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\DAL;C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\IPT;C:\Windows\System32\Windows System Resource Manager\bin;;C:\Windows\idmu\common;C:\Program Files (x86)\Attachmate\E!E2K\;C:\Program Files\TortoiseSVN\bin;C:\Program Files\R\R-3.1.2\bin\x64;C:\Program Files\Common Files\IBM\SPSS\DataCollection\7\DMOM\;C:\Program Files\IBM\SPSS\DataCollection\7\Accessories\;C:\Program Files (x86)\d.velop\d3client

Steps to reproduce

  • Create target structure:
CREATE TABLE CX_ORACLE_ISSUE (
  col_1 varchar2(75) DEFAULT NULL,
  col_2 number(19) NOT NULL,
  col_3 number(19) DEFAULT NULL,
  col_4 number(19) DEFAULT NULL,
  col_5 number(19) DEFAULT NULL,
  col_6 number(19) DEFAULT NULL,
  col_7 varchar2(75) DEFAULT NULL,
  col_8 timestamp(6) DEFAULT NULL,
  col_9 timestamp(6) DEFAULT NULL,
  col_10 number(19) DEFAULT NULL,
  col_11 number(19) DEFAULT NULL,
  col_12 number(19) DEFAULT NULL,
  col_13 clob,
  col_14 varchar2(75) DEFAULT NULL,
  col_15 binary_double DEFAULT NULL,
  col_16 clob,
  col_17 varchar2(150) DEFAULT NULL,
  col_18 clob,
  col_19 clob,
  col_20 varchar2(75) DEFAULT NULL,
  col_21 varchar2(75) DEFAULT NULL,
  col_22 varchar2(75) DEFAULT NULL,
  col_23 timestamp(6) DEFAULT NULL,
  col_24 timestamp(6) DEFAULT NULL,
  col_25 timestamp(6) DEFAULT NULL,
  col_26 number(3) DEFAULT NULL,
  col_27 number(3) DEFAULT NULL,
  col_28 number(19) DEFAULT NULL,
  col_29 clob,
  col_30 timestamp(6) DEFAULT NULL,
  col_31 number(10) DEFAULT NULL,
  col_32 number(19) DEFAULT NULL,
  col_33 varchar2(75) DEFAULT NULL,
  col_34 timestamp(6) DEFAULT NULL,
  PRIMARY KEY (col_2)
)

  • Run the following script:

import pickle
import cx_Oracle


def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in range(0, len(l), n):
        yield l[i:i + n]



# connect to database server
pwd = "SECRET"
connection = cx_Oracle.connect("USER", pwd, "SERVER")
cursor = connection.cursor()
del pwd


# read data from pickle file
with open('01_data_anonymous.pckl', 'rb') as f:
    data = pickle.load(f)

# sql statement to insert
sql = """INSERT INTO CX_ORACLE_ISSUE 
(col_1,col_2,col_3,col_4,col_5,col_6,col_7,col_8,col_9,
col_10,col_11,col_12,col_13,col_14,col_15,col_16,col_17,col_18,
col_19,col_20,col_21,col_22,col_23,col_24,col_25,
col_26,col_27,col_28,col_29,col_30,col_31,col_32,
col_33,col_34) 
VALUES 
(:col_1,:col_2,:col_3,:col_4,:col_5,:col_6,:col_7,:col_8,:col_9,
:col_10,:col_11,:col_12,:col_13,:col_14,:col_15,:col_16,:col_17,:col_18,
:col_19,:col_20,:col_21,:col_22,:col_23,:col_24,:col_25,
:col_26,:col_27,:col_28,:col_29,:col_30,:col_31,:col_32,
:col_33,:col_34)"""

# insert data in chunks of 100 -> fail
chunk_size = 100
for i,rows_chunk in enumerate(chunks(data, chunk_size)):
    print("chunk nr. %i, rows=%i-%i" % ( i, chunk_size*i, chunk_size*(i+1)))
    cursor.executemany(sql, rows_chunk)

# commit inserts
connection.commit()

# close connection
connection.close()

Outputs:

chunk nr. 0, rows=0-100
chunk nr. 1, rows=100-200
Traceback (most recent call last):

  File "<ipython-input-9-2b555f278a3a>", line 1, in <module>
    runfile('P:/Projects/MiVo/Data Analytics/Datenbank_MiVo_TMARK/99_cxOracle_Bug/reproduce/03_reproduce_anonym.py', wdir='P:/Projects/MiVo/Data Analytics/Datenbank_MiVo_TMARK/99_cxOracle_Bug/reproduce')

  File "C:\Users\601276\conda_envs\cx_Oracle7\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 668, in runfile
    execfile(filename, namespace)

  File "C:\Users\601276\conda_envs\cx_Oracle7\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 108, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "P:/Projects/MiVo/Data Analytics/Datenbank_MiVo_TMARK/99_cxOracle_Bug/reproduce/03_reproduce_anonym.py", line 48, in <module>
    cursor.executemany(sql, rows_chunk)

NotSupportedError: Python value cannot be converted to a database value
anthony-tuininga (Member) commented Nov 9, 2018

Thanks for the detailed explanation of the issue you are experiencing. I'll take a look and let you know if I am able to reproduce the issue.

andrenam (Author) commented Nov 10, 2018

@anthony-tuininga thank you. I appreciate that very much.

I managed to reproduce the issue in a completely separate environment (separate from our corporate Windows environment). On my Linux home computer, I set up a Linux container and installed the free Oracle Express Edition server 11.2.0-1.0 on Arch Linux with kernel 4.14.78-1-lts (amd64).

As the client, I used Arch Linux on another machine with Python 3.7.1, cx_Oracle 7.0.0, and Oracle Instant Client (Basic Package) 18.3.0.0.0.

Note: using the dataset from above on Linux, I ran into some UTF-8 encoding issues that I don't think are relevant here, so I replaced the problematic contents with randomized data and uploaded a new dataset, 01_data_anonymous.pckl.zip. So, if you run into some (non-relevant) encoding issues, please try this dataset instead.

jsidlertx commented Nov 27, 2018

I am having the exact same problem with similar code. If I limit the batch size to 1, it works perfectly, albeit very slowly.

anthony-tuininga (Member) commented Nov 29, 2018

I am able to reproduce the issue using your data set. I'll get back to you on why this is happening.

anthony-tuininga (Member) commented Nov 29, 2018

I now understand why this is happening. In your first batch, two of your columns never have any value other than None. cx_Oracle defers type assignment until the end of the batch and then simply assumes None is the only possible value (a string with maximum length 1). In your second batch, at least one of the rows actually has a value in one of those columns, but that value isn't compatible with the inferred type. The error message is very misleading, however, so I'll see if I can address that. You have a few options:

(1) perform all inserts in a single batch (but depending on size this might not be usable)
(2) create a new cursor for each batch (so that cx_Oracle calculates the types for each batch)
(3) use cursor.setinputsizes() to specify the types and sizes (may be cumbersome)

Of those three options, the second is the simplest and will only add a small amount of overhead. The third is the one you'll want for the best performance. Both are sketched below.
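
For illustration, here is a minimal sketch of options (2) and (3) applied to the reproduction script above. The sql, data, chunks, chunk_size, and connection names come from that script; the setinputsizes() entries shown (col_1, col_2, col_13) are only assumptions about which bind variables would need explicit types:

# option (2): create a fresh cursor for each batch, so that bind
# variable types are inferred independently per batch
for rows_chunk in chunks(data, chunk_size):
    batch_cursor = connection.cursor()
    batch_cursor.executemany(sql, rows_chunk)
    batch_cursor.close()

# option (3): declare types and sizes up front with setinputsizes(),
# so that an all-None column in the first batch is not typed as a
# 1-character string; an integer sets a string's maximum length
cursor = connection.cursor()
for rows_chunk in chunks(data, chunk_size):
    cursor.setinputsizes(col_1=75,                # varchar2(75)
                         col_2=cx_Oracle.NUMBER,  # number(19)
                         col_13=cx_Oracle.CLOB)   # clob
    cursor.executemany(sql, rows_chunk)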

anthony-tuininga (Member) commented Nov 29, 2018

Just for clarity, here is a simple test case that demonstrates the issue:

import cx_Oracle

conn = cx_Oracle.connect("cx_Oracle/welcome")
cursor = conn.cursor()

cursor.execute("truncate table TestTempTable")

sql = "insert into TestTempTable values (:1, :2)"
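# in this first batch the second bind value is only ever None, so its
# type is inferred as a string of maximum length 1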
cursor.executemany(sql, [(1, None), (2, None)])
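
# this batch binds an actual string in the same position, which does
# not fit the type inferred from the first batch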
cursor.executemany(sql, [(3, None), (4, "Testing")])

The second call to executemany() will fail with the error you reported.
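
Applied to this test case, the second workaround above (a new cursor per batch) would look something like this:

cursor.executemany(sql, [(1, None), (2, None)])

# re-create the cursor so that the bind types for the second batch are
# inferred from that batch's own data
cursor = conn.cursor()
cursor.executemany(sql, [(3, None), (4, "Testing")])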

anthony-tuininga added a commit that referenced this issue Nov 30, 2018

Handle case when first call to cursor.executemany() has one or more columns that are always null and a subsequent call to cursor.executemany() has a value other than None in the same column (#236).
anthony-tuininga (Member) commented Nov 30, 2018

I've just added a commit that should correct this issue and added a test case as well. Let me know if that addresses your issue. Thanks!

andrenam (Author) commented Dec 5, 2018

@anthony-tuininga thank you very much. I'm not able to install cx_Oracle from source in our environment, but I will happily wait until it gets packaged for my distribution.

In the meantime, I will try your workaround (2) and create a new cursor for each batch.

Thanks a lot :-)

anthony-tuininga (Member) commented Feb 5, 2019

Corrected in cx_Oracle 7.1, which was released yesterday.
