Skip to content

Address Oracle connection for encrypted, ssl instance#27374

Open
harshach wants to merge 5 commits intomainfrom
oracle_connector
Open

Address Oracle connection for encrypted, ssl instance#27374
harshach wants to merge 5 commits intomainfrom
oracle_connector

Conversation

@harshach
Copy link
Copy Markdown
Collaborator

@harshach harshach commented Apr 15, 2026

Summary

  • Upgrade oracledb from 1.x to 2.x (oracledb>=2.1,<3) — removes dead cx_Oracle dependency
  • Thin-first, thick-fallback connection strategy — tries thin mode (no deps), automatically escalates to thick mode (Oracle Instant Client) if thin fails, enabling Oracle Native Network Encryption (NNE) for servers with SQLNET.ENCRYPTION_SERVER=required
  • Better error surfacing — thick mode init failures now log a WARNING with NNE guidance instead of a silent INFO
  • New Oracle connection integration tests — 5 tests covering ServiceName, TNS, DatabaseSchema URL building, NNE thick mode requirement, and thick mode fallback

Context

A customer on Oracle 19c Enterprise with SQLNET.ENCRYPTION_SERVER=required could not connect. Investigation revealed:

  • NNE is only available in thick mode (Oracle Instant Client) — thin mode cannot negotiate it
  • The connector silently fell back to thin mode when thick init failed, then gave an opaque DPY-4011 error
  • oracledb was pinned to 1.x which lacks improvements in 2.x

Verified end-to-end against Oracle Enterprise 19c with encryption required: thick mode successfully negotiates AES256 encryption.

Test plan

  • 5 new integration tests pass (ServiceName, TNS, URL building, NNE check, thick fallback)
  • 14 existing Oracle unit tests pass with oracledb 2.x
  • End-to-end verified against Oracle Enterprise 19c with SQLNET.ENCRYPTION_SERVER=required
  • SonarCloud: 0 issues, 100% coverage on new code, 0% duplication

🤖 Generated with Claude Code

@harshach harshach requested a review from a team as a code owner April 15, 2026 00:05
Copilot AI review requested due to automatic review settings April 15, 2026 00:05
@github-actions github-actions bot added backend safe to test Add this label to run secure Github workflows on PRs labels Apr 15, 2026
Comment thread ingestion/tests/integration/connections/conftest.py Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the Oracle ingestion connector to better support/enforce python-oracledb 2.x usage and adds integration coverage around Oracle connection modes, including thick-mode initialization fallback behavior.

Changes:

  • Add Oracle Testcontainers support and an Oracle container fixture for integration tests.
  • Add new Oracle connection integration tests covering Service Name, TNS, URL building, and thick-mode fallback.
  • Update Oracle connector dependency to oracledb>=2.1,<3 and improve thick-client initialization handling/logging.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
ingestion/tests/integration/containers.py Adds OracleContainerConfigs and get_oracle_container helper using OracleDbContainer.
ingestion/tests/integration/connections/conftest.py Adds package-scoped oracle_container fixture and grants needed Oracle privileges for test connection steps.
ingestion/tests/integration/connections/test_oracle_connection.py New integration test suite for Oracle connection types and thick/thin mode behavior expectations.
ingestion/src/metadata/ingestion/source/database/oracle/connection.py Improves thick-client init error handling (ProgrammingError vs DatabaseError) and warning message for NNE scenarios.
ingestion/setup.py Updates oracle extra dependency set to use python-oracledb 2.x and removes direct cx_Oracle pin from the extra.

port: int = 1521
dbname: str = "test"
container_name: str = "test-oracle"
exposed_port: Optional[int] = None
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OracleContainerConfigs is later used as a mutable carrier for extra attributes (e.g., docker_container in the connections conftest), but that attribute is not declared on the dataclass. Either add a typed field for it (e.g., docker_container: Optional[OracleDbContainer] = None) or avoid attaching dynamic attributes so editors/type-checkers and readers can understand what the object contains.

Suggested change
exposed_port: Optional[int] = None
exposed_port: Optional[int] = None
docker_container: Optional[OracleDbContainer] = None

Copilot uses AI. Check for mistakes.
oracle_config = OracleContainerConfigs(container_name=str(uuid.uuid4()))
with get_oracle_container(oracle_config) as container:
oracle_config.with_exposed_port(container)
oracle_config.docker_container = container
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fixture sets oracle_config.docker_container = container, but docker_container is not part of OracleContainerConfigs and it is not used anywhere else in this module. Please either remove this assignment or add docker_container as an explicit, typed field on the dataclass to avoid undocumented dynamic attributes.

Suggested change
oracle_config.docker_container = container

Copilot uses AI. Check for mistakes.
Comment on lines +47 to +64
connection = oracledb.connect(
user="sys",
password=config.oracle_password,
dsn=dsn,
mode=oracledb.AUTH_MODE_SYSDBA,
)
cursor = connection.cursor()
grants = [
f"GRANT SELECT ANY DICTIONARY TO {config.username}",
f"GRANT SELECT ON gv_$sql TO {config.username}",
f"GRANT SELECT ON v_$sql TO {config.username}",
f"GRANT CREATE TABLE TO {config.username}",
]
for grant in grants:
cursor.execute(grant)
connection.commit()
cursor.close()
connection.close()
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_grant_oracle_privileges opens a DB connection and cursor but doesn’t ensure they’re closed if any cursor.execute(...) fails (which can happen if the container isn’t ready yet or grants change). Use context managers / try-finally (or with oracledb.connect(...) as conn and with conn.cursor() as cursor) so resources are always released and failures don’t leak connections in the test process.

Suggested change
connection = oracledb.connect(
user="sys",
password=config.oracle_password,
dsn=dsn,
mode=oracledb.AUTH_MODE_SYSDBA,
)
cursor = connection.cursor()
grants = [
f"GRANT SELECT ANY DICTIONARY TO {config.username}",
f"GRANT SELECT ON gv_$sql TO {config.username}",
f"GRANT SELECT ON v_$sql TO {config.username}",
f"GRANT CREATE TABLE TO {config.username}",
]
for grant in grants:
cursor.execute(grant)
connection.commit()
cursor.close()
connection.close()
grants = [
f"GRANT SELECT ANY DICTIONARY TO {config.username}",
f"GRANT SELECT ON gv_$sql TO {config.username}",
f"GRANT SELECT ON v_$sql TO {config.username}",
f"GRANT CREATE TABLE TO {config.username}",
]
with oracledb.connect(
user="sys",
password=config.oracle_password,
dsn=dsn,
mode=oracledb.AUTH_MODE_SYSDBA,
) as connection:
with connection.cursor() as cursor:
for grant in grants:
cursor.execute(grant)
connection.commit()

Copilot uses AI. Check for mistakes.
Comment thread ingestion/setup.py
"nifi": {}, # uses requests
"openlineage": {*COMMONS["kafka"]},
"oracle": {"cx_Oracle>=8.3.0,<9", "oracledb~=1.2", DATA_DIFF["oracle"]},
"oracle": {"oracledb>=2.1,<3", DATA_DIFF["oracle"]},
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description still contains placeholders (e.g. “Fixes ”, “I worked on ... because ...”) and does not describe what was changed, why, or how it was tested. Please update the PR description to include the linked issue (if any), a brief rationale, and the validation performed—this is important given the dependency bump and new integration coverage.

Copilot uses AI. Check for mistakes.
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 15, 2026

🔴 Playwright Results — 1 failure(s), 18 flaky

✅ 2984 passed · ❌ 1 failed · 🟡 18 flaky · ⏭️ 83 skipped

Shard Passed Failed Flaky Skipped
🟡 Shard 1 476 0 4 4
🟡 Shard 2 641 0 4 7
🟡 Shard 4 622 0 4 22
🔴 Shard 5 611 1 0 42
🟡 Shard 6 634 0 6 8

Genuine Failures (failed on all attempts)

Pages/Glossary.spec.ts › Add and Remove Assets (shard 5)
�[31mTest timeout of 180000ms exceeded.�[39m
🟡 18 flaky test(s) (passed on retry)
  • Features/DataAssetRulesDisabled.spec.ts › Verify the ApiEndpoint entity item action after rules disabled (shard 1, 1 retry)
  • Features/DataAssetRulesDisabled.spec.ts › Verify the Database entity item action after rules disabled (shard 1, 1 retry)
  • Pages/AuditLogs.spec.ts › should apply both User and EntityType filters simultaneously (shard 1, 1 retry)
  • Pages/UserCreationWithPersona.spec.ts › Create user with persona and verify on profile (shard 1, 1 retry)
  • Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
  • Features/ChangeSummaryBadge.spec.ts › Automated badge should appear on entity description with Automated source (shard 2, 1 retry)
  • Features/DataQuality/TestCaseImportExportE2eFlow.spec.ts › Admin: Complete export-import-validate flow (shard 2, 1 retry)
  • Features/DataQuality/TestCaseResultPermissions.spec.ts › User with only VIEW cannot PATCH results (shard 2, 1 retry)
  • Pages/Customproperties-part2.spec.ts › entityReferenceList shows item count, scrollable list, no expand toggle (shard 4, 1 retry)
  • Pages/Domains.spec.ts › Rename domain with subdomains attached verifies subdomain accessibility (shard 4, 1 retry)
  • Pages/Domains.spec.ts › Rename domain with deeply nested subdomains (3+ levels) verifies FQN propagation (shard 4, 1 retry)
  • Pages/Domains.spec.ts › Rename domain with tags and glossary terms preserves associations (shard 4, 1 retry)
  • Pages/Lineage/DataAssetLineage.spec.ts › verify create lineage for entity - Topic (shard 6, 1 retry)
  • Pages/Lineage/LineageFilters.spec.ts › Verify lineage schema filter selection (shard 6, 1 retry)
  • Pages/Lineage/LineageRightPanel.spec.ts › Verify custom properties tab IS visible for supported type: searchIndex (shard 6, 1 retry)
  • Pages/UserDetails.spec.ts › Create team with domain and verify visibility of inherited domain in user profile after team removal (shard 6, 1 retry)
  • Pages/Users.spec.ts › Permissions for table details page for Data Consumer (shard 6, 1 retry)
  • VersionPages/GlossaryVersionPage.spec.ts › GlossaryTerm (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

Comment thread ingestion/src/metadata/ingestion/source/database/oracle/connection.py Outdated
Copilot AI review requested due to automatic review settings April 15, 2026 04:12
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Comment on lines +98 to +103
logger.info(
"Thin mode connection failed. Attempting thick mode with"
" Oracle Instant Client for NNE support."
)
if self._init_thick_mode():
engine = self._create_engine()
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In _get_client_with_fallback, the log says it's attempting thick mode for NNE, but _init_thick_mode() immediately returns False when instantClientDirectory is empty. This produces misleading logs and means the advertised “thin-first, thick-fallback” path can never actually try thick mode unless a directory is configured. Consider either (a) only logging/attempting thick mode when a lib dir is configured, or (b) attempting init_oracle_client() without lib_dir when no directory is provided (so system-installed Instant Client can be used).

Copilot uses AI. Check for mistakes.
Comment thread ingestion/src/metadata/ingestion/source/database/oracle/connection.py Outdated
Comment on lines +116 to +135
@pytest.mark.order(4)
def test_nne_requires_thick_mode():
"""Verify that NNE (Native Network Encryption) requires thick mode.

Oracle NNE is only available in thick mode (with Oracle Instant Client).
Thin mode cannot negotiate NNE — when the server has
SQLNET.ENCRYPTION_SERVER=required, thin mode gets DPY-4011.

This was verified against Oracle Enterprise 19c:
- Thick mode (Instant Client 19c): AES256 encryption negotiated successfully
- Thin mode (any oracledb version): connection rejected with DPY-4011

The gvenzl/oracle-free container used in CI does not include NNE (it's
Enterprise-only), so we validate that oracledb is on 2.x (for other
improvements) and that thick mode is the documented path for NNE.
"""
assert oracledb.is_thin_mode is not None # module loaded
major_version = int(oracledb.__version__.split(".")[0])
assert major_version >= 2, f"oracledb {oracledb.__version__} should be >= 2.0"

Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test_nne_requires_thick_mode name/docstring claims to verify that NNE requires thick mode, but the assertions only check that the installed oracledb major version is >= 2. This makes the test misleading and could pass even if thick-mode/NNE handling regresses. Either rename the test to reflect what it actually validates, or add an assertion that exercises/validates the connector’s thick-mode path (e.g., via mocking or verifying the fallback behavior triggered by a thin-mode failure).

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings April 15, 2026 06:34
@gitar-bot
Copy link
Copy Markdown

gitar-bot bot commented Apr 15, 2026

Code Review ✅ Approved 3 resolved / 3 findings

Oracle connection logic updated for encrypted SSL instances while resolving resource cleanup gaps, redundant probe connections, and incorrect engine instantiation. No issues remain.

✅ 3 resolved
Quality: _grant_oracle_privileges lacks safe resource cleanup

📄 ingestion/tests/integration/connections/conftest.py:44-58
The _grant_oracle_privileges function creates an oracledb connection and cursor but closes them without a try/finally or context manager. If any cursor.execute(grant) call fails, both the cursor and connection will leak. While this is test-only code, Oracle connections are heavyweight resources and a leaked connection could cause the container to hit its session limit, making subsequent tests flaky.

Performance: Unnecessary probe connection when instantClientDirectory is set

📄 ingestion/src/metadata/ingestion/source/database/oracle/connection.py:80-84
When instantClientDirectory is explicitly configured, the user clearly intends thick mode, yet _get_client always attempts a thin-mode connection first. This wastes a TCP round-trip and connection slot against the Oracle server, and will always fail on NNE-required servers. Consider skipping the thin-mode attempt when instantClientDirectory is set.

Bug: _get_client returns known-broken engine when both modes fail

📄 ingestion/src/metadata/ingestion/source/database/oracle/connection.py:80-94 📄 ingestion/src/metadata/ingestion/source/database/oracle/connection.py:106-112 📄 ingestion/src/metadata/ingestion/source/database/oracle/connection.py:85-88
When thin mode fails and thick mode also fails (or isn't configured), _get_client silently returns the engine that already failed _can_connect. Other BaseConnection implementations raise on failure; here, the caller receives a non-functional engine with no indication of why both modes failed. The error only surfaces later in test_connection or during ingestion, far from the root cause.

Additionally, _can_connect catches all exceptions without logging, making thin-mode failures completely invisible. If the real issue is bad credentials or a network problem (not NNE), the user sees no diagnostic info about the thin-mode attempt, only a misleading message about trying thick mode for NNE.

Options

Display: compact → Showing less information.

Comment with these commands to change:

Compact
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Comment on lines +59 to +60
else:
kwargs["instantClientDirectory"] = ""
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These integration tests force instantClientDirectory to "", which bypasses the schema default of /instantclient. Given the connection logic now branches heavily on this field, consider adding coverage for the default/omitted case (e.g., ensure a default /instantclient that isn’t present does not break thin-mode connectivity, or validate the intended fallback behavior).

Suggested change
else:
kwargs["instantClientDirectory"] = ""

Copilot uses AI. Check for mistakes.

def _grant_oracle_privileges(config: OracleContainerConfigs) -> None:
"""Grant DBA-level privileges needed by test_connection steps."""
dsn = oracledb.makedsn("localhost", config.exposed_port, service_name=config.dbname)
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

config.exposed_port is typed as Optional[int] and is used directly in makedsn(...). Since a missing/failed port mapping would lead to a confusing runtime error here, consider asserting it is set (e.g., after with_exposed_port) before building the DSN to fail with a clearer message.

Suggested change
dsn = oracledb.makedsn("localhost", config.exposed_port, service_name=config.dbname)
if config.exposed_port is None:
raise ValueError(
"Oracle container exposed port is not set. "
"Ensure with_exposed_port(container) succeeds before granting privileges."
)
dsn = oracledb.makedsn(
"localhost", config.exposed_port, service_name=config.dbname
)

Copilot uses AI. Check for mistakes.
Comment on lines +82 to +85
if self.service_connection.instantClientDirectory:
return self._get_client_thick_mode()
return self._get_client_thin_mode()

Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instantClientDirectory has a schema default of /instantclient (see openmetadata-spec/.../oracleConnection.json), so this check will route most configs into thick mode by default. With the new fail-fast behavior in _get_client_thick_mode, environments that don’t have Instant Client mounted at /instantclient will now error instead of falling back to thin mode (previous behavior). Consider implementing the advertised thin-first + thick-fallback strategy (try thin, then init thick/retry on failure) or only enforcing fail-fast when the directory was explicitly set by the user (non-default), to avoid a breaking change for existing Oracle connections.

Copilot uses AI. Check for mistakes.
Comment on lines +141 to +142
oracledb.init_oracle_client(lib_dir=lib_dir)
os.environ[LD_LIB_ENV] = lib_dir
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LD_LIBRARY_PATH is being set after calling oracledb.init_oracle_client(). Since the schema/doc for instantClientDirectory says it is used to set LD_LIBRARY_PATH, and env vars typically need to be set before native library loading, it’d be safer to set/prepend the env var before initialization (or remove the env mutation if lib_dir is sufficient) to avoid cases where dependent libraries can’t be resolved at init time.

Suggested change
oracledb.init_oracle_client(lib_dir=lib_dir)
os.environ[LD_LIB_ENV] = lib_dir
existing_ld_library_path = os.environ.get(LD_LIB_ENV)
if existing_ld_library_path:
ld_library_path_entries = existing_ld_library_path.split(os.pathsep)
if lib_dir not in ld_library_path_entries:
os.environ[LD_LIB_ENV] = (
f"{lib_dir}{os.pathsep}{existing_ld_library_path}"
)
else:
os.environ[LD_LIB_ENV] = lib_dir
oracledb.init_oracle_client(lib_dir=lib_dir)

Copilot uses AI. Check for mistakes.
@sonarqubecloud
Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants