Further work on adjusting attribute, method and parameter names to be
consistent and to comply with PEP 8 naming guidelines; also adjust
implementation of #385 (originally done in pull request #549) to use the
parameter name `bypass_decode` instead of `bypassencoding`.
anthony-tuininga committed Apr 23, 2021
1 parent ab6e6f0 commit 96f9382
Showing 7 changed files with 155 additions and 159 deletions.
40 changes: 26 additions & 14 deletions doc/src/api_manual/cursor.rst
@@ -52,7 +52,7 @@ Cursor Object
The DB API definition does not define this attribute.


.. method:: Cursor.arrayvar(data_type, value, [size])
.. method:: Cursor.arrayvar(typ, value, [size])

Create an array variable associated with the cursor of the given type and
size and return a :ref:`variable object <varobj>`. The value is either an
@@ -587,19 +587,19 @@ Cursor Object
The DB API definition does not define this attribute.


.. method:: Cursor.var(dataType, [size, arraysize, inconverter, outconverter, \
typename, encodingErrors, bypassencoding])
.. method:: Cursor.var(typ, [size, arraysize, inconverter, outconverter, \
typename, encoding_errors, bypass_decode])

Create a variable with the specified characteristics. This method was
designed for use with PL/SQL in/out variables where the length or type
cannot be determined automatically from the Python object passed in or for
use in input and output type handlers defined on cursors or connections.

The dataType parameter specifies the type of data that should be stored in
the variable. This should be one of the
:ref:`database type constants <dbtypes>`, :ref:`DB API constants <types>`,
an object type returned from the method :meth:`Connection.gettype()` or one
of the following Python types:
The typ parameter specifies the type of data that should be stored in the
variable. This should be one of the :ref:`database type constants
<dbtypes>`, :ref:`DB API constants <types>`, an object type returned from
the method :meth:`Connection.gettype()` or one of the following Python
types:

.. list-table::
:header-rows: 1
@@ -642,17 +642,29 @@ Cursor Object
specified when using type :data:`cx_Oracle.OBJECT` unless the type object
was passed directly as the first parameter.

The encodingErrors parameter specifies what should happen when decoding
The encoding_errors parameter specifies what should happen when decoding
byte strings fetched from the database into strings. It should be one of
the values noted in the builtin
`decode <https://docs.python.org/3/library/stdtypes.html#bytes.decode>`__
function.
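
As a rough illustration of the values accepted by ``bytes.decode``, the following standalone sketch shows how ``strict``, ``replace`` and ``ignore`` behave. It uses plain Python with no database involved, and the byte string is an invented example of Latin-1 data that is not valid UTF-8:

```python
# Latin-1 encoding of "Fiancé"; the lone 0xE9 byte is not valid UTF-8.
# (Invented sample data, not taken from a real database.)
raw = b"Fianc\xe9"

# "strict" (the default) raises UnicodeDecodeError on invalid input.
try:
    raw.decode("utf-8", errors="strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# "replace" substitutes U+FFFD for the undecodable byte.
replaced = raw.decode("utf-8", errors="replace")

# "ignore" silently drops the undecodable byte.
ignored = raw.decode("utf-8", errors="ignore")

print(strict_failed, repr(replaced), repr(ignored))
```

Both ``replace`` and ``ignore`` lose information, which is why fetching the raw bytes can be preferable when the original data needs to be examined.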

The bypassencoding parameter, if specified, should be passed as
boolean. This feature allows results of database types CHAR, NCHAR,
LONG_STRING, NSTRING, STRING to be returned raw meaning cx_Oracle
won't do any decoding conversion. See
:ref:`Fetching raw data <fetching-raw-data>` for more information.
The bypass_decode parameter, if specified, should be passed as a
boolean value. Passing a `True` value causes values of database types
:data:`~cx_Oracle.DB_TYPE_VARCHAR`, :data:`~cx_Oracle.DB_TYPE_CHAR`,
:data:`~cx_Oracle.DB_TYPE_NVARCHAR`, :data:`~cx_Oracle.DB_TYPE_NCHAR` and
:data:`~cx_Oracle.DB_TYPE_LONG` to be returned as `bytes` instead of `str`,
meaning that cx_Oracle doesn't do any decoding. See :ref:`Fetching raw
data <fetching-raw-data>` for more information.

.. versionadded:: 8.2

The parameter `bypass_decode` was added.

.. versionchanged:: 8.2

For consistency and compliance with the PEP 8 naming style, the
parameter `encodingErrors` was renamed to `encoding_errors`. The old
name will continue to work as a keyword parameter for a period of time.

.. note::

2 changes: 2 additions & 0 deletions doc/src/api_manual/deprecations.rst
@@ -68,6 +68,8 @@ if applicable. The most recent deprecations are listed first.
- Replace with parameter name `keyword_parameters`
* - `keywordParameters` parameter to :meth:`Cursor.callproc()`
- Replace with parameter name `keyword_parameters`
* - `encodingErrors` parameter to :meth:`Cursor.var()`
- Replace with parameter name `encoding_errors`
* - `Cursor.fetchraw()`
- Replace with :meth:`Cursor.fetchmany()`
* - `Queue.deqMany`
6 changes: 6 additions & 0 deletions doc/src/release_notes.rst
@@ -26,6 +26,12 @@ Version 8.2 (TBD)
:meth:`cx_Oracle.SessionPool()` in order to permit specifying the size of
the statement cache during the creation of pools and standalone
connections.
#) Added parameter `bypass_decode` to :meth:`Cursor.var()` in order to allow
the `decode` step to be bypassed when converting data from Oracle Database
into Python strings
(`issue 385 <https://github.com/oracle/python-cx_Oracle/issues/385>`__).
Initial work was done in `PR 549
<https://github.com/oracle/python-cx_Oracle/pull/549>`__.
#) Threaded mode is now always enabled when creating connection pools with
:meth:`cx_Oracle.SessionPool()`. Any `threaded` parameter value is ignored.
#) Eliminated a memory leak when calling :meth:`SodaOperation.filter()` with a
105 changes: 48 additions & 57 deletions doc/src/user_guide/sql_execution.rst
@@ -288,7 +288,7 @@ or the value ``None``. The value ``None`` indicates that the default type
should be used.

Examples of output handlers are shown in :ref:`numberprecision`,
:ref:`directlobs` and :ref:`fetching-raw-data`. Also see samples such as `samples/TypeHandlers.py
:ref:`directlobs` and :ref:`fetching-raw-data`. Also see samples such as `samples/type_handlers.py
<https://github.com/oracle/python-cx_Oracle/blob/master/samples/type_handlers.py>`__

.. _numberprecision:
@@ -347,82 +347,73 @@ See `samples/return_numbers_as_decimals.py
.. _fetching-raw-data:

Fetching Raw Data
---------------------

Sometimes cx_Oracle may have problems converting data to unicode and you may
want to inspect the problem closer rather than auto-fix it using the
encodingerrors parameter. This may be useful when a database contains
records or fields that are in a wrong encoding altogether.
-----------------

It is not recommended to use mixed encodings in databases.
This functionality is aimed at troubleshooting databases
that have inconsistent encodings for external reasons.
Sometimes cx_Oracle may have problems converting data stored in the database to
Python strings. This can occur if the data stored in the database doesn't match
the character set defined by the database. The `encoding_errors` parameter to
:meth:`Cursor.var()` permits the data to be returned with some invalid data
replaced, but for additional control the parameter `bypass_decode` can be set
to `True` and cx_Oracle will bypass the decode step and return `bytes` instead
of `str` for data stored in the database as strings. The data can then be
examined and corrected as required. This approach should only be used for
troubleshooting and correcting invalid data, not for general use!

For these cases, you can pass in the in additional keyword argument
``bypassencoding = True`` into :meth:`Cursor.var()`. This needs
to be used in combination with :ref:`outputtypehandlers`
The following sample demonstrates how to use this feature:

.. code-block:: python
#defining output type handlers method
def ConvertStringToBytes(cursor, name, defaultType, size, precision, scale):
if defaultType == cx_Oracle.STRING:
return cursor.var(str, arraysize=cursor.arraysize, bypassencoding = True)

#set cursor outputtypehandler to the method above
cursor = connection.cursor()
ursor.outputtypehandler = ConvertStringToBytes

    # define output type handler
    def return_strings_as_bytes(cursor, name, default_type, size,
                                precision, scale):
        if default_type == cx_Oracle.DB_TYPE_VARCHAR:
            return cursor.var(str, arraysize=cursor.arraysize,
                              bypass_decode=True)

This will allow you to receive data as raw bytes.
    # set output type handler on cursor before fetching data
    with connection.cursor() as cursor:
        cursor.outputtypehandler = return_strings_as_bytes
        cursor.execute("select content, charset from SomeTable")
        data = cursor.fetchall()

.. code-block:: python
This will produce output as::

statement = cursor.execute("select content, charset from SomeTable")
data = statement.fetchall()
[(b'Fianc\xc3\xa9', b'UTF-8')]


This will produce output as:
Note that the final two bytes, \xc3\xa9, encode é in UTF-8. Since this is
valid UTF-8, you can then decode the data (the step that was bypassed):

.. code-block:: python
[(b'Fianc\xc3\xa9', b'UTF-8')]

value = data[0][0].decode("UTF-8")
Note that last \xc3\xa9 is é in UTF-8. Then in you can do following:
This will return the value "Fiancé".
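
When the character set name is fetched alongside the content, as in the query above, it can be used to choose the codec instead of hard-coding it. The following is a minimal sketch under that assumption. The helper name ``decode_row`` is hypothetical, and the row is a literal copy of the output shown above rather than data fetched from a database:

```python
# Row shaped like the output shown above: (content bytes, charset-name bytes).
data = [(b"Fianc\xc3\xa9", b"UTF-8")]

def decode_row(row):
    """Hypothetical helper: decode the content using the charset column."""
    content, charset = row
    # The charset name is plain ASCII, so decode it first and use it as
    # the codec name for the content.
    return content.decode(charset.decode("ascii"))

values = [decode_row(row) for row in data]
print(values)
```

This only works if the charset column names a codec that Python recognizes; an unknown name raises ``LookupError``.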

If you want to save ``b'Fianc\xc3\xa9'`` into the database directly without
using a Python string, you will need to create a variable using
:meth:`Cursor.var()` that specifies the type as
:data:`~cx_Oracle.DB_TYPE_VARCHAR` (otherwise the value will be treated as
:data:`~cx_Oracle.DB_TYPE_RAW`). The following sample demonstrates this:

.. code-block:: python
import codecs
# data = [(b'Fianc\xc3\xa9', b'UTF-8')]
unicodecontent = data[0][0].decode(data[0][1].decode()) # Assuming your charset encoding is UTF-8


This will revert it back to "Fiancé".

If you want to save ``b'Fianc\xc3\xa9'`` to database you will need to create
:meth:`Cursor.var()` that will tell cx_Oracle that the value is indeed
intended as a string:


.. code-block:: python
connection = cx_Oracle.connect("hr", userpwd, "dbhost.example.com/orclpdb1")
cursor = connection.cursor()
cursorvariable = cursor.var(cx_Oracle.STRING)
cursorvariable.setvalue(0, "Fiancé".encode("UTF-8")) # b'Fianc\xc4\x9b'
cursor.execute("update SomeTable set SomeColumn = :param where id = 1", param=cursorvariable)


At that point, the bytes will be assumed to be in the correct encoding and should insert as you expect.
    with cx_Oracle.connect(user="hr", password=userpwd,
                           dsn="dbhost.example.com/orclpdb1") as conn:
        with conn.cursor() as cursor:
            var = cursor.var(cx_Oracle.DB_TYPE_VARCHAR)
            var.setvalue(0, b"Fianc\xc3\xa9")
            cursor.execute("""
                    update SomeTable set
                        SomeColumn = :param
                    where id = 1""",
                    param=var)

.. warning::
This functionality is "as-is": when saving strings like this,
the bytes will be assumed to be in the correct encoding and will
insert like that. Proper encoding is the responsibility of the user and
no correctness of any data in the database can be assumed
to exist by itself.

The database will assume that the bytes provided are in the character set
expected by the database so only use this for troubleshooting or as
directed.
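
Because the database does not check the bytes, it may be worth validating them on the client before binding. The following is a minimal sketch of such a check. The helper name ``is_valid_in`` is hypothetical, and it assumes the database character set corresponds to Python's ``utf-8`` codec, which needs to be confirmed for the database in question:

```python
def is_valid_in(raw, encoding):
    """Hypothetical helper: report whether raw decodes cleanly in encoding."""
    try:
        raw.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True

good = b"Fianc\xc3\xa9"   # UTF-8 encoding of "Fiancé"
bad = b"Fianc\xe9"        # Latin-1 encoding, not valid UTF-8

print(is_valid_in(good, "utf-8"), is_valid_in(bad, "utf-8"))
```

A check like this only confirms that the bytes are well-formed in the chosen encoding, not that they represent the intended text.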


.. _outconverters:
75 changes: 0 additions & 75 deletions samples/QueringRawData.py

This file was deleted.

49 changes: 49 additions & 0 deletions samples/query_strings_as_bytes.py
@@ -0,0 +1,49 @@
#------------------------------------------------------------------------------
# Copyright (c) 2021, Oracle and/or its affiliates. All rights reserved.
#------------------------------------------------------------------------------

#------------------------------------------------------------------------------
# query_strings_as_bytes.py
#
# Demonstrates how to query strings as bytes (bypassing decoding of the bytes
# into a Python string). This can be useful when attempting to fetch data that
# was stored in the database in the wrong encoding.
#
# This script requires cx_Oracle 8.2 and higher.
#------------------------------------------------------------------------------

import cx_Oracle as oracledb
import sample_env

STRING_VAL = 'I bought a cafetière on the Champs-Élysées'

def return_strings_as_bytes(cursor, name, default_type, size, precision,
                            scale):
    if default_type == oracledb.DB_TYPE_VARCHAR:
        return cursor.var(str, arraysize=cursor.arraysize, bypass_decode=True)

with oracledb.connect(sample_env.get_main_connect_string()) as conn:

    # truncate table and populate with our data of choice
    with conn.cursor() as cursor:
        cursor.execute("truncate table TestTempTable")
        cursor.execute("insert into TestTempTable values (1, :val)",
                       val=STRING_VAL)
        conn.commit()

    # fetch the data normally and show that it is returned as a string
    with conn.cursor() as cursor:
        cursor.execute("select IntCol, StringCol from TestTempTable")
        print("Data fetched using normal technique:")
        for row in cursor:
            print(row)
        print()

    # fetch the data, bypassing the decode and show that it is returned as
    # bytes
    with conn.cursor() as cursor:
        cursor.outputtypehandler = return_strings_as_bytes
        cursor.execute("select IntCol, StringCol from TestTempTable")
        print("Data fetched using bypass decode technique:")
        for row in cursor:
            print(row)