You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Originally posted by rh8056 December 20, 2023
Hello,
I posted this question on StackExchange yesterday as well, but figured it would make sense to ask here as well.
I'm struggling to deal with what I think is corrupt data stored in an Oracle database when reading it in a python script. I have the following:
def output_type_handler(cursor, metadata):
if metadata.type_code is oracledb.DB_TYPE_VARCHAR:
return cursor.var(metadata.type_code,
arraysize=cursor.arraysize,
encoding_errors="replace")
mydb, mycursor = connectOracle()
mycursor.outputtypehandler = output_type_handler
mycursor.execute('''select file_id, md5_value,
case when file_content is not null
then
utl_raw.cast_to_varchar2(dbms_lob.substr(file_content))
else
''
end as FILE_CONTENT from archive_file where archive_id = 123 and file_name = 'file_name.txt' ''')
row = mycursor.fetchone()
print(row)
When I run this, I'm getting the following:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc4 in position 647: invalid continuation byte
If I run this directly in SQL Developer, I get an output, with � in the file content. My SQL Developer settings are set to UTF-8, and my Oracle database NLS_NCHAR_CHARACTERSET is set to AL32UTF8. My understanding with the outputtypehandler change is that my python script would also output a � in this instance.
I added some print() commands inside of the def output_type_handler() to verify that it is indeed being called, and I saw output, so it appears that it is, but it also seems like the encoding_errors="replace" is being ignored. What am I missing here? Thanks!
The text was updated successfully, but these errors were encountered:
I confirmed that this was indeed a regression from cx_Oracle and have added a test case to confirm that it works correctly now. If you are able to build from source you can verify that it works for you, too.
Discussed in #272
Originally posted by rh8056 December 20, 2023
Hello,
I posted this question on StackExchange yesterday as well, but figured it would make sense to ask here as well.
I'm struggling to deal with what I think is corrupt data stored in an Oracle database when reading it in a python script. I have the following:
When I run this, I'm getting the following:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc4 in position 647: invalid continuation byte
If I run this directly in SQL Developer, I get an output, with � in the file content. My SQL Developer settings are set to
UTF-8
, and my Oracle databaseNLS_NCHAR_CHARACTERSET
is set toAL32UTF8
. My understanding with theoutputtypehandler
change is that my python script would also output a � in this instance.I added some
print()
commands inside of thedef output_type_handler()
to verify that it is indeed being called, and I saw output, so it appears that it is, but it also seems like theencoding_errors="replace"
is being ignored. What am I missing here? Thanks!The text was updated successfully, but these errors were encountered: