@@ -110,14 +110,18 @@ Or to iterate:
110110Data Frame Type Mapping
111111-----------------------
112112
113+ Default Data Frame Type Mapping
114+ +++++++++++++++++++++++++++++++
115+
113116Internally, python-oracledb's :ref:`DataFrame <oracledataframeobj>` support
114117makes use of `Apache nanoarrow <https://arrow.apache.org/nanoarrow/>`__
115118libraries to build data frames.
116119
117- The following data type mapping occurs from Oracle Database types to the Arrow
118- types used in python-oracledb DataFrame objects. Querying any other data types
119- from Oracle Database will result in an exception. :ref:`Output type handlers
120- <outputtypehandlers>` cannot be used to map data types.
120+ When querying, the following default data type mapping occurs from Oracle
121+ Database types to the Arrow types used in python-oracledb DataFrame
122+ objects. Querying any other data types from Oracle Database will result in an
123+ exception. :ref:`Output type handlers <outputtypehandlers>` cannot be used to
124+ map data types.
121125
122126.. list-table-with-summary:: Mapping from Oracle Database to Arrow data types
123127 :header-rows: 1
@@ -258,6 +262,99 @@ When converting Oracle Database DATEs and TIMESTAMPs:
258262 * - 7 - 9
259263 - nanoseconds
260264
265+ Explicit Data Frame Type Mapping
266+ ++++++++++++++++++++++++++++++++
267+
268+ You can explicitly set the data types and names that a :ref:`DataFrame
269+ <oracledataframeobj>` will use for query results. This provides fine-grained
270+ control over the physical data representation of the resulting Arrow arrays. It
271+ allows you to specify a representation that is more efficient for your specific
272+ use case. This can reduce memory consumption and improve processing speed.
273+
274+ The ``requested_schema`` parameter to
275+ :meth:`Connection.fetch_df_all()`, :meth:`Connection.fetch_df_batches()`,
276+ :meth:`AsyncConnection.fetch_df_all()`, or
277+ :meth:`AsyncConnection.fetch_df_batches()` should be an object implementing the
278+ `Arrow PyCapsule schema interface
279+ <https://arrow.apache.org/docs/python/generated/pyarrow.Schema.html>`__.
280+
281+ For example, the ``pyarrow.schema()`` factory function can be used to create a
282+ new schema. This takes a list of field definitions as input. Each field can be
283+ a tuple of ``(name, DataType)``:
284+
285+ .. code-block:: python
286+
287+     import pyarrow
288+
289+     # Default fetch
290+
291+     odf = connection.fetch_df_all(
292+         "select 123 c1, 'Scott' c2 from dual"
293+     )
294+     tab = pyarrow.table(odf)
295+     print("Default Output:", tab)
296+
297+     # Fetching with an explicit schema
298+
299+     schema = pyarrow.schema([
300+         ("col_1", pyarrow.int16()),
301+         ("C2", pyarrow.string())
302+     ])
303+
304+     odf = connection.fetch_df_all(
305+         "select 456 c1, 'King' c2 from dual",
306+         requested_schema=schema
307+     )
308+     tab = pyarrow.table(odf)
309+     print("\nNew Output:", tab)
310+
311+ The schema should have an entry for each queried column.
312+
313+ Running the example shows that the number column with the explicit schema was
314+ fetched into the requested type INT16. Its name has also changed::
315+
316+     Default Output: pyarrow.Table
317+     C1: double
318+     C2: string
319+     ----
320+     C1: [[123]]
321+     C2: [["Scott"]]
322+
323+     New Output: pyarrow.Table
324+     col_1: int16
325+     C2: string
326+     ----
327+     col_1: [[456]]
328+     C2: [["King"]]
329+
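As a rough illustration of why the narrower type can matter, the sketch below estimates the size of the values buffer for one million numbers stored as 64-bit doubles (the default type shown for ``C1`` above) versus 16-bit integers (the type requested for ``col_1``). It uses only the Python standard library. The helper name ``values_buffer_bytes`` and the row count are hypothetical, and the estimate ignores Arrow's validity bitmap and any buffer padding, so treat it as an approximation and not as a measurement of python-oracledb itself.

```python
import struct

# Fixed item widths in bytes, taken from struct's standard sizes
# ("<" selects standard sizes that do not depend on the platform).
WIDTHS = {
    "double": struct.calcsize("<d"),  # 64-bit floating point
    "int16": struct.calcsize("<h"),   # 16-bit signed integer
}

def values_buffer_bytes(type_name: str, num_rows: int) -> int:
    """Approximate size of the values buffer for a fixed-width column."""
    return WIDTHS[type_name] * num_rows

num_rows = 1_000_000  # hypothetical row count, for illustration only
for type_name in WIDTHS:
    print(type_name, values_buffer_bytes(type_name, num_rows), "bytes")
```

Under these assumptions the 16-bit column needs a quarter of the space of the 64-bit one, which is the kind of saving an explicit schema is meant to make available when the data is known to fit.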
330+ **Supported Explicit Type Mapping**
331+
332+ The following table shows the explicit type mappings that are supported. An
333+ error will occur if the database type or the data cannot be represented in the
334+ requested schema type.
335+
336+ .. list-table-with-summary::
337+     :header-rows: 1
338+     :class: wy-table-responsive
339+     :widths: 1 1
340+     :align: left
341+     :summary: The first column is the Oracle Database data type. The second column shows supported Arrow data types.
342+
343+     * - Oracle Database Type
344+       - Arrow Data Types
345+     * - DB_TYPE_NUMBER
346+       - INT8, INT16, INT32, INT64, UINT8, UINT16, UINT32, UINT64, DECIMAL128(p, s), DOUBLE, FLOAT
347+     * - DB_TYPE_RAW, DB_TYPE_LONG_RAW
348+       - BINARY, FIXED SIZE BINARY, LARGE BINARY
349+     * - DB_TYPE_BOOLEAN
350+       - BOOLEAN
351+     * - DB_TYPE_DATE, DB_TYPE_TIMESTAMP, DB_TYPE_TIMESTAMP_LTZ, DB_TYPE_TIMESTAMP_TZ
352+       - DATE32, DATE64, TIMESTAMP
353+     * - DB_TYPE_BINARY_DOUBLE, DB_TYPE_BINARY_FLOAT
354+       - DOUBLE, FLOAT
355+     * - DB_TYPE_VARCHAR, DB_TYPE_CHAR, DB_TYPE_LONG, DB_TYPE_NVARCHAR, DB_TYPE_NCHAR, DB_TYPE_LONG_NVARCHAR
356+       - STRING, LARGE_STRING
357+
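Because an error occurs when the data cannot be represented in the requested type, it can help to check the expected value range before choosing one of the integer types listed for DB_TYPE_NUMBER. The following is a minimal sketch of such a check in plain Python. The names ``ARROW_INT_RANGES``, ``fits`` and ``narrowest_type`` are hypothetical helpers and not part of python-oracledb or PyArrow; the ranges are the standard ones for fixed-width integers, and the driver's own validation is not shown here.

```python
# Inclusive value ranges of the Arrow fixed-width integer types,
# listed from narrowest to widest.
ARROW_INT_RANGES = {
    "int8": (-2**7, 2**7 - 1),
    "uint8": (0, 2**8 - 1),
    "int16": (-2**15, 2**15 - 1),
    "uint16": (0, 2**16 - 1),
    "int32": (-2**31, 2**31 - 1),
    "uint32": (0, 2**32 - 1),
    "int64": (-2**63, 2**63 - 1),
    "uint64": (0, 2**64 - 1),
}

def fits(values, arrow_type):
    """Return True if every integer in values lies within the type's range."""
    low, high = ARROW_INT_RANGES[arrow_type]
    return all(low <= v <= high for v in values)

def narrowest_type(values):
    """Return the first listed type that can hold all values, or None."""
    for name in ARROW_INT_RANGES:
        if fits(values, name):
            return name
    return None

print(fits([456], "int16"))        # 456 is within the int16 range
print(fits([40000], "int16"))      # 40000 exceeds the int16 maximum of 32767
print(narrowest_type([0, 40000]))  # picks an unsigned 16-bit type
```

A check like this only covers integer ranges. Whether a value with a fractional part, or a particular date or string, can be represented in a requested type depends on the mappings in the table above.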
261358.. _convertingodf:
262359
263360Converting python-oracledb's DataFrame to Other Data Frames