feat(oracledb): native JSON, VECTOR ergonomics, smart LOB coercion #430
Merged
Adds sqlspec/adapters/oracledb/_json_handlers.py implementing:
- json_converter_in_clob / json_converter_in_blob: serialize Python dict/list/tuple to JSON string / UTF-8 bytes for CLOB / BLOB binding.
- json_converter_out_clob / json_converter_out_blob: parse JSON read back into native Python values.
- json_input_type_handler: column-aware routing across server versions: 21c+ → DB_TYPE_JSON (binary OSON), 19c-20c → DB_TYPE_BLOB+OSON, 12c-18c → DB_TYPE_CLOB+JSON-string. Reads major from cursor connection attribute _sqlspec_oracle_major; defaults to 21c+ when unset.
- json_output_type_handler: passthrough for native DB_TYPE_JSON; claims BLOB / CLOB columns whose type_name carries JSON for round-trip parse.
- register_json_handlers: chaining-aware install (preserves any existing inputtypehandler / outputtypehandler via wrapper classes).

The input handler explicitly does NOT claim list[float] / tuple[float, ...] so the vector handler retains ownership of embedding payloads.

Re-exports the public surface in sqlspec/adapters/oracledb/__init__.py.

Tests: 36 new unit tests in tests/unit/adapters/test_oracledb/test_json_handlers.py covering converters, input dispatch table across server majors, output passthrough/claim/ignore matrix, chaining, and round-trip integrity.

Beads: closes C1.T1 (sqlspec-ffm), C1.T5 (sqlspec-3la), C1.T6 (sqlspec-ci4). Part of sqlspec-aa9 (C1: Native JSON binding pipeline) under sqlspec-i6j.
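The version routing described for json_input_type_handler can be pictured as a small dispatch table. This is a minimal sketch rather than the adapter's code: `pick_json_bind_target` is a hypothetical name, and the string labels stand in for the oracledb DB_TYPE_* constants so the snippet runs without a driver installed.

```python
from typing import Optional

# Version thresholds taken from the commit message above.
NATIVE_JSON_MIN_MAJOR = 21
BLOB_OSON_MIN_MAJOR = 19


def pick_json_bind_target(major: Optional[int]) -> str:
    """Return a label for the bind target of a dict/list parameter (hypothetical helper)."""
    if major is None or major >= NATIVE_JSON_MIN_MAJOR:
        # An unset major falls back to the 21c+ path, as the commit describes.
        return "DB_TYPE_JSON"
    if major >= BLOB_OSON_MIN_MAJOR:
        return "DB_TYPE_BLOB"
    return "DB_TYPE_CLOB"
```

Keeping the decision a pure function of the cached major is what lets the handler avoid a metadata query on every bind.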
Activates the JSON handlers from the prior commit by installing them at session-callback time and flipping the parameter profile so dict / list parameters are no longer pre-serialised at parameter prep.

config.py:
- Adds _extract_oracle_major(connection) helper that parses the leading numeric component of connection.version into an int (None when unavailable).
- Both sync and async _init_connection now register JSON handlers unconditionally (no driver-feature flag; native DB_TYPE_JSON is the correct path on every supported Oracle), and stash the major version on the connection as _sqlspec_oracle_major so the handler avoids per-bind metadata queries.
- Imports register_json_handlers from sqlspec.adapters.oracledb._json_handlers.

core.py:
- DriverParameterProfile.json_serializer_strategy flips from "helper" to "driver". The "helper" strategy installed to_json into type_coercion_map for dict/list/tuple, which pre-serialised every JSON parameter to a string before binding. "driver" mode passes the encoder/decoder onto ParameterStyleConfig but leaves dict/list payloads intact, so they reach cursor.execute and the JSON inputtypehandler claims them.
- requires_session_callback now returns True unconditionally (the JSON handler registration must always fire) and ignores driver_features.

Bundles unrelated dep drift the worktree was already carrying:
- ruff pre-commit pin v0.15.11 → v0.15.12.
- click-extra 7.13.0 → 7.14.0 (uv.lock).

Verification: 1212 unit tests + 4 skipped, ruff + mypy clean.

Beads: closes C1.T2 (sqlspec-dzp), C1.T3 (sqlspec-b5f). Part of sqlspec-aa9 (C1: Native JSON binding pipeline) under sqlspec-i6j.
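As an illustration of the version caching described for config.py, the following sketch parses a major version once and stashes it on the connection. It assumes connection.version is a dotted string such as "23.4.0.24.5"; `extract_major` and `FakeConnection` are hypothetical stand-ins, not the adapter's own helper or a real driver object.

```python
from typing import Any, Optional


def extract_major(connection: Any) -> Optional[int]:
    """Parse the leading numeric component of connection.version (hypothetical helper)."""
    version = getattr(connection, "version", None)
    if not isinstance(version, str):
        return None
    try:
        return int(version.split(".", 1)[0])
    except ValueError:
        return None


class FakeConnection:
    """Stand-in for a driver connection; only carries the attributes used here."""

    def __init__(self, version: Optional[str]) -> None:
        self.version = version


conn = FakeConnection("23.4.0.24.5")
# Cache once at connection init so handlers need no per-bind metadata query.
conn._sqlspec_oracle_major = extract_major(conn)
```

Returning None instead of raising lets the input handler fall back to its 21c+ default when the version cannot be read.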
Verifies the C1 contract end-to-end against an Oracle 23ai container:
- dict payloads round-trip bit-identical through native JSON columns; no createlob workaround needed regardless of payload size.
- list[dict] payloads round-trip.
- Large dicts (>4000 bytes serialised, ~8000-byte string + 500-elem int list) bind via DB_TYPE_JSON, proving the helper-strategy CLOB-coercion path is no longer triggered for native JSON columns.
- bool / None / int / nested list / nested dict survive round-trip.
- executemany over multiple dicts.
- Sync driver parity.

Float values in native JSON columns come back as decimal.Decimal per python-oracledb's default OSON-numeric coercion; tracked as a separate follow-up concern (sqlspec-ohv).

Verification: 6 new integration tests pass; 32 adjacent integration tests (test_msgspec_clob, test_numpy_vectors, test_uuid_binary) still pass, with no regressions in CLOB / vector / UUID handler chains.

Beads: closes C1.T7 (sqlspec-5lo). Part of sqlspec-aa9 (C1: Native JSON binding pipeline) under sqlspec-i6j.
The handlers cover all DB_TYPE_VECTOR-bound Python sequences (ndarray,
array.array, list, tuple), not just NumPy arrays. Renaming the module
reflects the broader scope ahead of C3.T2/T3, which extend the input
handler to claim list/tuple sequences and add vector_return_format
routing. Public symbols keep their numpy_* prefix; the user-facing
rename is a separate follow-up.
DTYPE_TO_ARRAY_CODE gains int16 ('h') and int32 ('i') unconditionally
and float16 ('e') on Python 3.13+ (where array.array first accepted
the typecode). Per-typecode names are hoisted to module constants so
PLR2004 magic-value checks do not fire when downstream tasks use them
in the input handler's dispatch.
Internal imports in __init__.py, config.py, and type_converter.py move
to the new path; the package re-exports the module under the new
``vector_handlers`` alias and drops the old ``numpy_handlers`` alias.
Adds vector_return_format to OracleDriverFeatures with the documented
"numpy" (NumPy installed) / "list" (otherwise) default. Wired through
apply_driver_features so downstream session-callback code can consume
the value without re-deriving the policy. Unblocks C3.T2/T3 which
read this field at handler-registration time.
- core.py: setdefault("vector_return_format", "numpy" if NUMPY_INSTALLED else "list")
- config.py: NotRequired[str] field + docstring entry
- tests/unit/adapters/test_oracledb/test_core_driver_features.py: 5 new tests
…ispatch
Closes the C3 vector ergonomics gap: list[float] / tuple[float] / list[int] embeddings from LLM clients now bind directly to DB_TYPE_VECTOR with no manual conversion, and the read path dispatches to numpy / list / array via a per-connection vector_return_format setting.

C3.T2 (_input_type_handler): claims numpy.ndarray (existing path), array.array (passthrough), and list/tuple of int|float (auto-pack as int8 when entirely in [-128, 127] else float32). Bool sequences and list[dict] fall through so the JSON handler can claim them.

C3.T3 (_output_type_handler): reads connection._sqlspec_vector_return_format and dispatches to numpy_converter_out / list / passthrough. RuntimeError when "numpy" requested without numpy installed; ValueError on invalid format.

C3.T4 (config._init_connection, sync + async): stashes vector_return_format on the connection alongside the C1 oracle_major cache. register_numpy_handlers now runs unconditionally so pure-Python list[float] binds work without enable_numpy_vectors=True.

Tests: new tests/unit/adapters/test_oracledb/test_vector_handlers.py (18 tests covering the dispatch matrix). test_numpy_handlers.py tests that asserted "skip register when NUMPY_INSTALLED=False" updated to the always-register policy. Full unit suite: 140 passed.
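The auto-pack rule from C3.T2 can be sketched with the standard library alone. This is an illustration under stated assumptions, not the handler itself: `pack_vector` is a hypothetical name, returning None models "fall through to the next handler", and declining empty sequences is an assumption the commit does not spell out.

```python
from array import array
from typing import Optional, Sequence

INT8_MIN, INT8_MAX = -128, 127


def pack_vector(values: Sequence[object]) -> Optional[array]:
    """Pack a list/tuple of numbers for VECTOR binding, or return None to decline."""
    if not isinstance(values, (list, tuple)) or not values:
        return None  # assumption: empty sequences are left to other handlers
    # bool is a subclass of int, so it has to be excluded explicitly.
    if any(isinstance(v, bool) or not isinstance(v, (int, float)) for v in values):
        return None
    if all(isinstance(v, int) and INT8_MIN <= v <= INT8_MAX for v in values):
        return array("b", values)  # signed 8-bit integers
    return array("f", values)  # 32-bit floats
```

Declining bool sequences and list[dict] here is what leaves those payloads available for the JSON handler further down the chain.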
…nding (C2.T1)
New module `sqlspec.adapters.oracledb._param_types` adds three slot-based wrapper classes (OracleClob, OracleBlob, OracleJson) that let users override the size-based heuristics in coerce_large_parameters_*. The wrappers are pure containers (no validation in __init__); type discipline moves to the T2 routing site where errors can carry database context. Per chapter-2/spec.md §3 T1. Sets the import surface for T2's wrapper-aware coercion and T6's __init__.py re-export.
coerce_large_parameters_sync / _async now route Oracle{Clob,Blob,Json}
wrappers ahead of the size-based fallback so power users can express
explicit type intent. OracleClob(bytes) decodes utf-8; OracleBlob(str)
encodes utf-8; OracleJson unwraps so the C1 input handler claims the
value.
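A rough picture of the wrapper routing described above, using local stand-ins rather than the real classes: `ClobHint`, `BlobHint`, and `JsonHint` mimic OracleClob, OracleBlob, and OracleJson, the `value` attribute name is an assumption, and `route_wrapper` returns a label plus payload instead of creating LOBs.

```python
from typing import Any, Tuple


class ClobHint:
    __slots__ = ("value",)

    def __init__(self, value: Any) -> None:
        self.value = value  # pure container: no validation here


class BlobHint:
    __slots__ = ("value",)

    def __init__(self, value: Any) -> None:
        self.value = value


class JsonHint:
    __slots__ = ("value",)

    def __init__(self, value: Any) -> None:
        self.value = value


def route_wrapper(param: Any) -> Tuple[str, Any]:
    """Resolve explicit wrappers before any size-based fallback (hypothetical helper)."""
    if isinstance(param, ClobHint):
        payload = param.value
        if isinstance(payload, bytes):
            payload = payload.decode("utf-8")
        if not isinstance(payload, str):
            raise TypeError("CLOB wrapper expects str or bytes")
        return "clob", payload
    if isinstance(param, BlobHint):
        payload = param.value
        if isinstance(payload, str):
            payload = payload.encode("utf-8")
        if not isinstance(payload, bytes):
            raise TypeError("BLOB wrapper expects bytes or str")
        return "blob", payload
    if isinstance(param, JsonHint):
        return "json", param.value  # unwrapped so a JSON handler can claim it
    return "size-based", param
```

Doing the type checks at the routing site rather than in `__init__` matches the commit's note that errors raised there can carry database context.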
Adds OracleBlob/OracleClob/OracleJson to the package re-export surface
and 12 unit tests (6 sync + 6 async) covering the wrapper paths plus
the user-configurable threshold-override path that T3 will wire up.
The new _param_types.py (C2.T1) defines three slot-based wrapper classes that sit on the parameter-binding hot path for every Oracle execute. Pure-Python with no conditional imports or metaclass tricks — clean mypyc target. Audit of the existing C1/C3 handler modules (_json_handlers, _vector_handlers, _uuid_handlers) for similar inclusion is tracked as sqlspec-llu (needs a wheel-build verification pass).
…/T4/T5)
The 4000/2000-byte thresholds previously baked in as module constants in driver.py now live in driver_features. dispatch_execute (sync + async) reads oracle_varchar2_byte_limit / oracle_raw_byte_limit from self.driver_features.get(...) so MAX_STRING_SIZE=EXTENDED databases can opt into 32767-byte VARCHAR2. apply_driver_features fills the defaults so the dispatch fallback path is a one-time bootstrap concern, not a per-call default. The OracleDriverFeatures TypedDict advertises both fields and the docstring documents the EXTENDED scenario.

6 new unit tests cover the defaults, user-override preservation, and TypedDict surface.
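The defaults-then-override behaviour can be sketched as follows. The two key names and the 4000/2000/32767 figures come from the commit message; `with_limit_defaults` and `exceeds_limit` are hypothetical helpers, and measuring strings by their UTF-8 encoded length with a strict `>` comparison is an assumption.

```python
from typing import Any, Dict

DEFAULT_VARCHAR2_BYTE_LIMIT = 4000
DEFAULT_RAW_BYTE_LIMIT = 2000


def with_limit_defaults(driver_features: Dict[str, Any]) -> Dict[str, Any]:
    """Fill in threshold defaults once, preserving any user overrides."""
    features = dict(driver_features)
    features.setdefault("oracle_varchar2_byte_limit", DEFAULT_VARCHAR2_BYTE_LIMIT)
    features.setdefault("oracle_raw_byte_limit", DEFAULT_RAW_BYTE_LIMIT)
    return features


def exceeds_limit(value: Any, features: Dict[str, Any]) -> bool:
    """Report whether a value is large enough to need LOB coercion."""
    if isinstance(value, str):
        return len(value.encode("utf-8")) > features["oracle_varchar2_byte_limit"]
    if isinstance(value, (bytes, bytearray)):
        return len(value) > features["oracle_raw_byte_limit"]
    return False


standard = with_limit_defaults({})
extended = with_limit_defaults({"oracle_varchar2_byte_limit": 32767})
```

With these settings a 5000-byte string would be coerced under the standard limits but bound as-is on a database configured for extended strings.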
…2.T8)
Seven cases against the Oracle 23ai container exercise the wrapper-aware routing landed in C2.T2: OracleClob/OracleBlob/OracleJson round-trip (async + sync), the C1 native-JSON handler claiming OracleJson without a CLOB intermediary, the demo's bytes-payload workaround replacement, and the threshold-override path skipped on non-EXTENDED containers.

Discovered during testing that the wrappers only fire for dict-style parameters; positional binds bypass coerce_large_parameters_* entirely and reach python-oracledb raw, raising DPY-3002. Tests use named binds to match the documented contract; the positional path is filed as a follow-up.

6 pass on Oracle 23ai (STANDARD), 1 EXTENDED-only test skipped.
…205)
coerce_large_parameters_sync / _async previously short-circuited on
anything that wasn't a dict, leaving Oracle{Clob,Blob,Json} wrappers
inside positional tuples / lists to reach python-oracledb raw and
raise DPY-3002. Routing now extends to tuple and list parameters via
the new _coerce_value_sync / _coerce_value_async helpers shared with
the dict path.
Tuples are returned as new lists when iterated — the driver's existing
cast(..., 'list[Any] | tuple[Any, ...] | dict[Any, Any] | None') keeps
this contract. The pre-existing identity-based passthrough assertion
on lists is updated to value-equality (the new path always returns a
fresh list when iterating).
13 unit tests cover OracleClob/Blob/Json + plain str threshold +
empty-tuple short-circuit across sync and async; 2 integration tests
verify the path end-to-end against Oracle 23ai.
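The container handling described above (dict, tuple, and list sharing one per-value helper, with tuples coming back as fresh lists) might look roughly like this. `coerce_parameters` is a hypothetical name, the per-value coercion is passed in as a callable for illustration, and returning an empty sequence unchanged is an assumption about what the short-circuit does.

```python
from typing import Any, Callable


def coerce_parameters(parameters: Any, coerce_value: Callable[[Any], Any]) -> Any:
    """Apply a per-value coercion to named or positional parameters."""
    if isinstance(parameters, dict):
        return {key: coerce_value(value) for key, value in parameters.items()}
    if isinstance(parameters, (list, tuple)):
        if not parameters:
            return parameters  # short-circuit: nothing to coerce
        # Non-empty tuples and lists both come back as a fresh list.
        return [coerce_value(value) for value in parameters]
    return parameters


def upper_strings(value: Any) -> Any:
    """Toy coercion used only for the demonstration below."""
    return value.upper() if isinstance(value, str) else value


positional = coerce_parameters((1, "a"), upper_strings)
named = coerce_parameters({"col": "b"}, upper_strings)
```

Because a fresh list is always built when iterating, callers can only rely on value equality with the input, not identity.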
…llu)
Centralizes the runtime oracledb constants (DB_TYPE_BLOB / CLOB / JSON / RAW) in _typing.py, which is already excluded from mypyc, and rewires the four handler modules to import from there instead of doing lazy import oracledb inside every function. Removes a per-call import lookup on the input / output handler hot path and makes the handler modules pure mypyc targets.

Adds the three handler modules to the mypyc include glob alongside the already-compiled _param_types.py. Wheel build with HATCH_BUILD_HOOKS_ENABLE=1 produces a cp310-linux_x86_64 wheel cleanly; 190 unit tests pass under both interpreted and compiled imports.

Other adapters (asyncpg / psycopg / psqlpy) handle vectors via pgvector's register_vector inside type_converter.py, which is already in the sqlspec/adapters/**/type_converter.py glob, so there is no parallel gap there.
Extract rows[0] to a local variable so isinstance narrowing applies to the dict/sequence branch — pyright otherwise re-evaluates rows[0] and loses the narrowing.
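The narrowing pattern mentioned here, shown in isolation. This is a generic sketch, not the code touched by the commit: `first_value` and the `Row` alias are hypothetical.

```python
from typing import Any, Dict, Optional, Sequence, Union

Row = Union[Dict[str, Any], Sequence[Any]]


def first_value(rows: Sequence[Row]) -> Optional[Any]:
    """Return the first column of the first row, for dict or sequence rows."""
    if not rows:
        return None
    first = rows[0]  # bind once so the isinstance check narrows this name
    if isinstance(first, dict):
        # `first` is treated as a dict in this branch.
        return next(iter(first.values()), None)
    # In this branch `first` is treated as a sequence.
    return first[0] if first else None
```

The behaviour is identical to indexing rows[0] in each branch; the local variable only gives the type checker a stable name to narrow.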
PYTHONWARNINGS scoped to google.adk.features._feature_decorator so the
[EXPERIMENTAL] PLUGGABLE_AUTH notice doesn't leak into our lint output.
The authlib.jose deprecation can't be suppressed via env vars — authlib
calls warnings.simplefilter("always", AuthlibDeprecationWarning) in its
deprecate module which resets the filter list at import time. Will
disappear when authlib ships 2.0.
Upstream regression: mysql-connector-python 9.7.0 dropped cp312/cp313/cp314 wheels (9.6.0 had all five ABIs). CI on cp312 fails with "doesn't have a source distribution or wheel for the current platform".
- Add per-Python-version markers on the mysql-connector extra so 3.10/3.11 stay unconstrained while 3.12+ caps below 9.7.
- Add a [tool.uv] override-dependencies entry so the same cap applies to the transitive pull through pytest-databases[mysql].
…s for mysql-connector-python
Summary
Overhauls Oracle's type coercion path so the common cases just work and the
uncommon cases have an explicit escape hatch.
Native JSON
- On Oracle 21c+, dict/list bind directly via DB_TYPE_JSON (binary OSON); 19c-20c falls back to BLOB CHECK (... IS JSON); pre-19c uses CLOB CHECK (... IS JSON). The right path is picked from the server's major version, cached on the connection.
- json_serializer_strategy is now "driver" so the binary path isn't skipped by an upstream string serialization.
- DB_TYPE_JSON columns return Python objects as-is, and BLOB / CLOB columns whose type_name includes JSON are auto-parsed.
VECTOR ergonomics (Oracle 23ai)
- list[float], list[int], tuple[...], array.array, and np.ndarray all bind to DB_TYPE_VECTOR with no flag toggle. Integer sequences in the int8 range pack as int8; everything else falls back to float32.
- The vector_return_format driver feature ("numpy" / "list" / "array") controls how VECTOR reads materialize. Defaults to "numpy" when NumPy is installed, "list" otherwise. Errors loudly if "numpy" is requested without NumPy.
- Renames _numpy_handlers to _vector_handlers to reflect the broader payload coverage. Public API (numpy_converter_in, etc.) is unchanged.
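The read-side dispatch on vector_return_format could be sketched like this. It assumes the driver hands back an array.array for VECTOR columns; `convert_vector` and `default_return_format` are hypothetical names, and the error messages are illustrative.

```python
from array import array
from typing import Any

try:
    import numpy as np
except ImportError:  # NumPy is optional
    np = None

VALID_FORMATS = ("numpy", "list", "array")


def default_return_format() -> str:
    """Prefer NumPy when it is importable, otherwise plain lists."""
    return "numpy" if np is not None else "list"


def convert_vector(value: array, return_format: str) -> Any:
    """Materialize a fetched vector in the requested format."""
    if return_format not in VALID_FORMATS:
        raise ValueError(f"invalid vector_return_format: {return_format!r}")
    if return_format == "array":
        return value  # passthrough
    if return_format == "list":
        return value.tolist()
    if np is None:
        raise RuntimeError('vector_return_format="numpy" requires NumPy')
    return np.array(value)
```

Validating the format before checking for NumPy means a typo surfaces as ValueError regardless of which optional packages are installed.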
Smart LOB coercion
- Three wrapper classes (OracleClob, OracleBlob, OracleJson) let users bypass the size heuristics when they want explicit control. OracleClob(bytes) decodes utf-8 before binding; OracleBlob(str) encodes utf-8; OracleJson(...) defers to the JSON handler chain so the value never gets coerced into a CLOB intermediary.
- The wrappers work in both named ({"col": OracleClob(...)}) and positional ((1, OracleClob(...))) bind shapes.
- The size thresholds are exposed as two driver_features settings, oracle_varchar2_byte_limit and oracle_raw_byte_limit, so users on databases with MAX_STRING_SIZE=EXTENDED can opt into 32767-byte VARCHAR2 without auto-coercion to CLOB.