Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Batch-inserting via executemany() fails with NotSupportedError: Python value cannot be converted to a database value #236

Closed
andrenam opened this issue Nov 9, 2018 · 15 comments
Labels

Comments

@andrenam
Copy link

andrenam commented Nov 9, 2018

Batch-inserting via executemany() fails with NotSupportedError: Python value cannot be converted to a database value

Batch-inserting many rows in chunks of 1000, 500, 100, 50, 20, 10 via executemany() fails with
NotSupportedError: Python value cannot be converted to a database value
If I insert every row line by line, it works.
maybe similar problem as in
https://stackoverflow.com/questions/50713180/cx-oracle-notsupportederror-python-value-cannot-be-converted-to-a-database-valu

Oracle Server

Oracle Database 12c Standard Edition Release 12.1.0.2.0 - 64bit Production
PL/SQL Release 12.1.0.2.0 - Production
CORE 12.1.0.2.0 Production
TNS for 64-bit Windows: Version 12.1.0.2.0 - Production
NLSRTL Version 12.1.0.2.0 - Production

Client Software

Client OS: Windows 7 Pro 64bit
Python 3.6.7, 64bit (via Anaconda)
cx_Oracle 7.0.0
Oracle Client Version: 11.2.0.1.0
(determined via https://stackoverflow.com/questions/1171643/whats-the-best-way-to-determine-which-version-of-oracle-client-im-running)
Windows PATH:

C:\ProgramData\Anaconda3\Library\bin;C:\oracle\product\11.2.0\client_1\bin;C:\Program Files (x86)\Intel\iCLS Client\;C:\Program Files\Intel\iCLS Client\;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Program Files\Intel\Intel(R) Management Engine Components\DAL;C:\Program Files\Intel\Intel(R) Management Engine Components\IPT;C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\DAL;C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\IPT;C:\Windows\System32\Windows System Resource Manager\bin;;C:\Windows\idmu\common;C:\Program Files (x86)\Attachmate\E!E2K\;C:\Program Files\TortoiseSVN\bin;C:\Program Files\R\R-3.1.2\bin\x64;C:\Program Files\Common Files\IBM\SPSS\DataCollection\7\DMOM\;C:\Program Files\IBM\SPSS\DataCollection\7\Accessories\;C:\Program Files (x86)\d.velop\d3client

Steps to reproduce

  • Create target structure:
CREATE TABLE CX_ORACLE_ISSUE (
  col_1 varchar2(75) DEFAULT NULL,
  col_2 number(19) NOT NULL,
  col_3 number(19) DEFAULT NULL,
  col_4 number(19) DEFAULT NULL,
  col_5 number(19) DEFAULT NULL,
  col_6 number(19) DEFAULT NULL,
  col_7 varchar2(75) DEFAULT NULL,
  col_8 timestamp(6) DEFAULT NULL,
  col_9 timestamp(6) DEFAULT NULL,
  col_10 number(19) DEFAULT NULL,
  col_11 number(19) DEFAULT NULL,
  col_12 number(19) DEFAULT NULL,
  col_13 clob,
  col_14 varchar2(75) DEFAULT NULL,
  col_15 binary_double DEFAULT NULL,
  col_16 clob,
  col_17 varchar2(150) DEFAULT NULL,
  col_18 clob,
  col_19 clob,
  col_20 varchar2(75) DEFAULT NULL,
  col_21 varchar2(75) DEFAULT NULL,
  col_22 varchar2(75) DEFAULT NULL,
  col_23 timestamp(6) DEFAULT NULL,
  col_24 timestamp(6) DEFAULT NULL,
  col_25 timestamp(6) DEFAULT NULL,
  col_26 number(3) DEFAULT NULL,
  col_27 number(3) DEFAULT NULL,
  col_28 number(19) DEFAULT NULL,
  col_29 clob,
  col_30 timestamp(6) DEFAULT NULL,
  col_31 number(10) DEFAULT NULL,
  col_32 number(19) DEFAULT NULL,
  col_33 varchar2(75) DEFAULT NULL,
  col_34 timestamp(6) DEFAULT NULL,
  PRIMARY KEY (col_2)
)  
import pickle
import cx_Oracle


def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in range(0, len(l), n):
        yield l[i:i + n]



# connect to database server
pwd = "SECRET"
connection = cx_Oracle.connect("USER", pwd, "SERVER")
cursor = connection.cursor()
del pwd


# read data from pickle file
with open('01_data_anonymous.pckl', 'rb') as f:
    data = pickle.load(f)

# sql statement to insert
sql = """INSERT INTO CX_ORACLE_ISSUE 
(col_1,col_2,col_3,col_4,col_5,col_6,col_7,col_8,col_9,
col_10,col_11,col_12,col_13,col_14,col_15,col_16,col_17,col_18,
col_19,col_20,col_21,col_22,col_23,col_24,col_25,
col_26,col_27,col_28,col_29,col_30,col_31,col_32,
col_33,col_34) 
VALUES 
(:col_1,:col_2,:col_3,:col_4,:col_5,:col_6,:col_7,:col_8,:col_9,
:col_10,:col_11,:col_12,:col_13,:col_14,:col_15,:col_16,:col_17,:col_18,
:col_19,:col_20,:col_21,:col_22,:col_23,:col_24,:col_25,
:col_26,:col_27,:col_28,:col_29,:col_30,:col_31,:col_32,
:col_33,:col_34)"""

# insert data in chunks of 100 -> fail
chunk_size = 100
for i,rows_chunk in enumerate(chunks(data, chunk_size)):
    print("chunk nr. %i, rows=%i-%i" % ( i, chunk_size*i, chunk_size*(i+1)))
    cursor.executemany(sql, rows_chunk)

# commit inserts
connection.commit()

# close connection
connection.close()

Outputs:

chunk nr. 0, rows=0-100
chunk nr. 1, rows=100-200
Traceback (most recent call last):

  File "<ipython-input-9-2b555f278a3a>", line 1, in <module>
    runfile('P:/Projects/MiVo/Data Analytics/Datenbank_MiVo_TMARK/99_cxOracle_Bug/reproduce/03_reproduce_anonym.py', wdir='P:/Projects/MiVo/Data Analytics/Datenbank_MiVo_TMARK/99_cxOracle_Bug/reproduce')

  File "C:\Users\601276\conda_envs\cx_Oracle7\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 668, in runfile
    execfile(filename, namespace)

  File "C:\Users\601276\conda_envs\cx_Oracle7\lib\site-packages\spyder_kernels\customize\spydercustomize.py", line 108, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "P:/Projects/MiVo/Data Analytics/Datenbank_MiVo_TMARK/99_cxOracle_Bug/reproduce/03_reproduce_anonym.py", line 48, in <module>
    cursor.executemany(sql, rows_chunk)

NotSupportedError: Python value cannot be converted to a database value
@anthony-tuininga
Copy link
Member

Thanks for the detailed explanation of the issue you are experiencing. I'll take a look and let you know if I am able to reproduce the issue.

@andrenam
Copy link
Author

@anthony-tuininga thank you. I appreciate that very much.

I managed to reproduce the issue on a complete seperate environment (seperate from our corporate Windows Environment). On my linux home computer, I setup a linux container and installed the free Oracle Express Edition Server 11.2.0-1.0 on Archlinux with Kernel 4.14.78-1-lts (amd64).

As client, I used Archlinux on another machine with python 3.7.1, cx_Oracle 7.0.0 and Oracle Instant Client (Basic Package) 18.3.0.0.0

Note: Using the dataset from above on linux, I ran into some utf-8 encoding issues, that I don't think is relevant here, so I replaced the problematic contents with randomized data and uploaded a new dataset 01_data_anonymous.pckl.zip. So, if you run into some (non-relevant) encoding issues, please try this dataset.

@jsidlertx
Copy link

I am having the exact same problem with similiar coding. If I limit the blocksize to 1 it works perfect albeit very slow

@anthony-tuininga
Copy link
Member

I am able to reproduce the issue using your data set. I'll get back to you on why this is happening.

@anthony-tuininga
Copy link
Member

I now understand why this is happening. In your first batch two of your columns never have any value other than None. cx_Oracle defers type assignment until the end of the batch and then simply assumes None is the only value possible (string with maximum length 1). In your second batch at least one of the rows actually has a value, but that value isn't compatible. The error message is very misleading, however, so I'll see if I can address that. You have a few options:

(1) perform all inserts in a single batch (but depending on size this might not be usable)
(2) create a new cursor for each batch (so that cx_Oracle calculates the types for each batch)
(3) use cursor.setinputsizes() to specify the types and sizes (may be cumbersome)

Of those three options, the second option is the simplest and will only add a small bit of overhead. The third option is the one you'll want to use for the best performance.

@anthony-tuininga
Copy link
Member

Just for clarity, here is a simple test case that demonstrates the issue:

import cx_Oracle

conn = cx_Oracle.connect("cx_Oracle/welcome")
cursor = conn.cursor()

cursor.execute("truncate table TestTempTable")

sql = "insert into TestTempTable values (:1, :2)"
cursor.executemany(sql, [(1, None), (2, None)])
cursor.executemany(sql, [(3, None), (4, "Testing")])

The second call to executemany() will fail with the error you reported.

anthony-tuininga added a commit that referenced this issue Nov 30, 2018
…olumns

that are always null and a subsequent call to cursor.executemany() has a value
other than None in the same column
(#236).
@anthony-tuininga
Copy link
Member

I've just added a commit that should correct this issue and added a test case as well. Let me know if that addresses your issue. Thanks!

@andrenam
Copy link
Author

andrenam commented Dec 5, 2018

@anthony-tuininga thank you very much. I don't have the possibility to install cx_Oracle from source in our environment, but I will happily await till it gets packaged for my distribution.

meanwhile I will try out your workaround number 2) and create a new cursor for each batch.

Thanks a lot :-)

@anthony-tuininga
Copy link
Member

Corrected in cx_Oracle 7.1 which was released yesterday.

@omrumcetin
Copy link

omrumcetin commented Oct 8, 2019

Reproduced in cx_oracle 7.1.3
Just try to executemany on 2 number rows then 1 string, throws error expecting number.

@anthony-tuininga
Copy link
Member

In the same batch of data that you pass to cx_Oracle, yes, you will get an exception. That is expected. The bug that was fixed was regarding the use of all null values in one batch followed by a numeric value in a subsequent batch.

@omrumcetin
Copy link

omrumcetin commented Oct 9, 2019

no you missunderstand me, It cannot be expected behaviour.
I'm inserting to single column, as row order of number,number and string order.
values = [1,2,'foo']
sql = "INSERT INTO TABLE (SINGLECOLUMN) VALUES (:1)"
cursor.executemany(sql,values)
So in oracle side I'm defining column as VARCHAR2, but that part of list contains number or string, I shouldnt care about it.

@cjbj
Copy link
Member

cjbj commented Oct 9, 2019

@omrumcetin To avoid misunderstandings please show (complete) runnable code that has the problem and the SQL to create the schema.

If you have one column and it is defined as a number, why are you trying to insert a string, or vice versa?

Review Batch Statement Execution and Bulk Loading. The section on defining memory areas is worth careful study.

@omrumcetin
Copy link

omrumcetin commented Oct 9, 2019

So in oracle side I'm defining column as VARCHAR2

I'm defining VARCHAR2 because inserted value sometimes can be number, sometimes can be a string. Its very common stuff.
I have read the documentary, this case should be included in.

@anthony-tuininga
Copy link
Member

Can you provide a link to the documentation that suggests this case should be supported? cx_Oracle doesn't attempt to perform any conversions from number to string or from date to string -- and I believe that is the right approach. You can use a variable converter function if you wish to perform any conversions -- but generally it is just better to ensure that the data is available in a uniform fashion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants