UnicodeDecodeError: 'utf-8' codec can't decode byte (outputtypehandler encoding_errors="replace" not being honored)

### Discussed in https://github.com/oracle/python-oracledb/discussions/272

<div type='discussions-op-text'>

<sup>Originally posted by **rh8056** December 20, 2023</sup>
Hello,

I posted this question on StackExchange yesterday as well, but figured it would make sense to ask here as well.

I'm struggling to deal with what I think is corrupt data stored in an Oracle database when reading it in a python script. I have the following:

```
    def output_type_handler(cursor, metadata):
      if metadata.type_code is oracledb.DB_TYPE_VARCHAR:
        return cursor.var(metadata.type_code,
                          arraysize=cursor.arraysize,
                          encoding_errors="replace")

    mydb, mycursor = connectOracle()
    mycursor.outputtypehandler = output_type_handler

    mycursor.execute('''select file_id, md5_value,
                            case when file_content is not null
                                then
                                    utl_raw.cast_to_varchar2(dbms_lob.substr(file_content))
                                else
                                    ''
                                end as FILE_CONTENT from archive_file where archive_id = 123 and file_name = 'file_name.txt' ''')
    row = mycursor.fetchone()
    print(row)
```
When I run this, I'm getting the following:

```UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc4 in position 647: invalid continuation byte```
If I run this directly in SQL Developer, I get an output, with � in the file content. My SQL Developer settings are set to ```UTF-8```, and my Oracle database ```NLS_NCHAR_CHARACTERSET``` is set to ```AL32UTF8```. My understanding with the ```outputtypehandler``` change is that my python script would also output a � in this instance.

I added some ```print()``` commands inside of the ```def output_type_handler()``` to verify that it is indeed being called, and I saw output, so it appears that it is, but it also seems like the ```encoding_errors="replace"``` is being ignored. What am I missing here? Thanks!</div>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

UnicodeDecodeError: 'utf-8' codec can't decode byte (outputtypehandler encoding_errors="replace" not being honored) #279

Discussed in #272

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

UnicodeDecodeError: 'utf-8' codec can't decode byte (outputtypehandler encoding_errors="replace" not being honored) #279

Description

Discussed in #272

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions