#744 Fix table existence checks in QueryExecutorJdbc to use Hive metadata and DESCRIBE TABLE #745

Merged
yruslan merged 2 commits into main from feature/744-fail-safe-hive-table-existence-check on May 16, 2026

Conversation

@yruslan
Collaborator

@yruslan yruslan commented May 15, 2026

Closes #744

Summary by CodeRabbit

  • Refactor

    • Improved table-existence checks with an optimized metadata-first strategy and safe fallback probes, enhancing reliability and performance when verifying Hive tables.
    • Reduced log noise for JDBC statement cleanup to make logs clearer.
  • Tests

    • Added test coverage for table-existence verification to ensure consistent behavior across environments.

@coderabbitai
Contributor

coderabbitai Bot commented May 15, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5f5b44fd-3fdc-4583-bd3b-77ea51b1403a

📥 Commits

Reviewing files that changed from the base of the PR and between 7e8034d and 7220d87.

📒 Files selected for processing (2)
  • pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/QueryExecutorJdbc.scala
  • pramen/core/src/test/scala/za/co/absa/pramen/core/tests/utils/hive/QueryExecutorJdbcSuite.scala
🚧 Files skipped from review as they are similar to previous changes (2)
  • pramen/core/src/test/scala/za/co/absa/pramen/core/tests/utils/hive/QueryExecutorJdbcSuite.scala
  • pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/QueryExecutorJdbc.scala

Walkthrough

QueryExecutorJdbc refactors table existence detection with optimization-aware branching: when optimized, it checks Hive metadata via DatabaseMetaData and falls back to DESCRIBE; when unoptimized, it uses a dedicated SQL-based probe. Two new helper methods implement the fallback strategies. Logging levels are adjusted during cleanup, and a test validates the metadata-based detection behavior.
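The branching described in the walkthrough can be sketched roughly as follows. This is a minimal illustration rather than the PR's actual code: the `Probe` type, the `safely` helper, and the object name are hypothetical, and the sketch assumes the DESCRIBE fallback runs whenever the metadata probe reports no table or fails.

```scala
import scala.util.{Failure, Success, Try}

object TableExistenceSketch {
  /** Hypothetical probe type: returns true if the table is visible, and may throw. */
  type Probe = () => Boolean

  /** Treats any non-fatal probe error as "table not found". */
  def safely(probe: Probe): Boolean = Try(probe()) match {
    case Success(found) => found
    case Failure(_)     => false
  }

  /**
    * Optimized mode: metadata first, then the DESCRIBE fallback.
    * Unoptimized mode: the dedicated SQL probe only.
    */
  def doesTableExist(optimized: Boolean,
                     metadataProbe: Probe,
                     describeProbe: Probe,
                     sqlProbe: Probe): Boolean = {
    if (optimized) safely(metadataProbe) || safely(describeProbe)
    else safely(sqlProbe)
  }
}
```

Because `||` short-circuits, the DESCRIBE probe is only executed when the metadata probe did not find the table.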

Changes

Table existence check optimization and fallback strategies

Layer / File(s) | Summary
Optimized table existence detection and helper methods (pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/QueryExecutorJdbc.scala) | doesTableExist branches on optimizedExistQuery to select the detection strategy: metadata-first with a DESCRIBE fallback when optimized, or a dedicated SQL probe when disabled. doesTableExistUsingHiveMetadata queries JDBC DatabaseMetaData.getTables using escaped identifiers. doesTableExistUsingDescribeTable executes DESCRIBE <fullTableName> and treats errors as non-existence. doesTableExistUsingSqlQuery runs SELECT 1 ... WHERE 0 = 1 with the same error handling. JDBC statement cleanup log messages are downgraded from info to debug.
Table existence check test validation (pramen/core/src/test/scala/za/co/absa/pramen/core/tests/utils/hive/QueryExecutorJdbcSuite.scala) | New test case "check table existence" creates a temporary table, runs the metadata-based and describe-based existence checks, and asserts the metadata-check result.
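The "escaped identifiers" part matters because JDBC's `DatabaseMetaData.getTables` treats its schema and table arguments as patterns, where `_` and `%` are wildcards, and `DatabaseMetaData.getSearchStringEscape` returns the driver's escape string. A minimal sketch of such escaping is shown below; `escapePattern` is a hypothetical stand-in and not the project's `getEscapedMetadataString`, whose implementation is not shown on this page.

```scala
object MetadataPatternSketch {
  /**
    * Escapes the JDBC pattern wildcards '_' and '%' so that an identifier
    * is matched literally. `escape` is assumed to be the value returned by
    * DatabaseMetaData.getSearchStringEscape; if the driver returns null or
    * an empty string, the identifier is passed through unchanged.
    */
  def escapePattern(identifier: String, escape: String): String = {
    if (escape == null || escape.isEmpty) {
      identifier
    } else {
      identifier.map {
        case c @ ('_' | '%') => escape + c
        case c               => c.toString
      }.mkString
    }
  }
}
```

Without escaping, a lookup for `my_table` could also match a table such as `myXtable`, so an existence check might report a false positive.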

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 A metadata hop, a DESCRIBE fall back,
When Hive won't tell what tables it's got,
The query now branches both clever and smart,
Three paths to the truth, none leaving a gap! 🌿

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title accurately summarizes the main change: implementing fail-safe table existence checks using Hive metadata and DESCRIBE TABLE fallback in QueryExecutorJdbc.
Linked Issues check | ✅ Passed | The pull request successfully implements the objectives from issue #744: it adds Hive metadata-based existence checks with DESCRIBE TABLE fallback as required.
Out of Scope Changes check | ✅ Passed | All changes are directly related to implementing fail-safe table existence checks. The log level downgrade for JDBC cleanup is a minor refactor supporting the main objective.
Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/QueryExecutorJdbc.scala (1)

132-149: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Handle metadata lookup failures and trigger fallback.

doesTableExistUsingHiveMetadata can throw (e.g., SQLException) and currently bubbles up, so the DESCRIBE fallback is skipped when metadata listing fails. That breaks the fail-safe behavior this PR is implementing.

Proposed fix
   def doesTableExistUsingHiveMetadata(databaseNameOpt: Option[String], tableName: String): Boolean = {
     import za.co.absa.pramen.core.utils.UsingUtils.Implicits._

-    val conn = getConnection(false)
-    val metadata = conn.getMetaData
-
-    val db = databaseNameOpt match {
-      case Some(s) => getEscapedMetadataString(s, metadata)
-      case None => null
-    }
-
-    val table = getEscapedMetadataString(tableName, metadata)
-
-    for (rs <- metadata.getTables(null, db, table, HIVE_TABLE_TYPES)) yield {
-      val exists = rs.next
-      exists
-    }
+    // Requires scala.util.Try and scala.util.control.NonFatal to be in scope.
+    Try {
+      val conn = getConnection(false)
+      val metadata = conn.getMetaData
+      val db = databaseNameOpt match {
+        case Some(s) => getEscapedMetadataString(s, metadata)
+        case None => null
+      }
+      val table = getEscapedMetadataString(tableName, metadata)
+      for (rs <- metadata.getTables(null, db, table, HIVE_TABLE_TYPES)) yield rs.next
+    }.recover {
+      case NonFatal(ex) =>
+        log.warn(s"Metadata table existence check failed for '$tableName'. Falling back to DESCRIBE TABLE.", ex)
+        false
+    }.get // recover yields a Try[Boolean]; unwrap it so the method still returns Boolean
   }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/QueryExecutorJdbc.scala`
around lines 132 - 149, doesTableExistUsingHiveMetadata currently lets
exceptions from metadata.getTables bubble up which prevents the DESCRIBE
fallback; wrap the metadata lookup in a try/catch that catches SQLException (or
Throwable) and returns false so the caller will run the fallback, and ensure the
JDBC Connection from getConnection(false) is always closed (use the existing
UsingUtils.Implicits or a try-finally around getConnection(false)).
Specifically, in doesTableExistUsingHiveMetadata wrap the call to
metadata.getTables(null, db, table, HIVE_TABLE_TYPES) and the ResultSet handling
in a safe block that on exception logs/debugs the error and returns false; keep
using getEscapedMetadataString for db/table and preserve resource cleanup.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/QueryExecutorJdbc.scala`:
- Around line 159-161: The log in QueryExecutorJdbc's DESCRIBE fallback
currently concatenates the exception message without spacing/parentheses; update
the Failure(ex) branch in the method where this appears to build a clearer
message (e.g. use s"The query resulted in an error, assuming the table
$fullTableName does not exist: ${ex.getMessage}" or include the exception as a
parameter to the logger like log.info(message, ex)) so the exception text is
separated and readable; modify the Failure(ex) case accordingly.

In
`@pramen/core/src/test/scala/za/co/absa/pramen/core/tests/utils/hive/QueryExecutorJdbcSuite.scala`:
- Around line 79-85: The test creates my_table2 but checks for MY_TABLE and
ignores the DESCRIBE result; update the calls to target the created table and
assert both outcomes: call qe.doesTableExistUsingHiveMetadata(None, "MY_TABLE2")
(or "my_table2" with matching case expectations) and store the result of
qe.doesTableExistUsingDescribeTable(None, "my_table2") into a variable, then
assert that the Hive-metadata check is true and that the describe-based check
returns the expected value (for HSQL likely false); adjust assertions
accordingly around the qe, doesTableExistUsingHiveMetadata and
doesTableExistUsingDescribeTable calls.
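The log message fix suggested in the first inline comment could look roughly like the sketch below. The helper and object names are hypothetical; only the message wording is taken from the comment above.

```scala
object DescribeFailureMessageSketch {
  /**
    * Builds the log line for a failed DESCRIBE probe, keeping the exception
    * text clearly separated from the fixed part of the message.
    */
  def describeFailureMessage(fullTableName: String, ex: Throwable): String = {
    s"The query resulted in an error, assuming the table $fullTableName does not exist: ${ex.getMessage}"
  }
}
```

Passing the exception to the logger as a separate argument, as the comment also suggests, is an alternative that keeps the stack trace as well.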


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5e6ef570-a649-49c3-a479-cedda9c6698f

📥 Commits

Reviewing files that changed from the base of the PR and between 105f4be and 7e8034d.

📒 Files selected for processing (2)
  • pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/QueryExecutorJdbc.scala
  • pramen/core/src/test/scala/za/co/absa/pramen/core/tests/utils/hive/QueryExecutorJdbcSuite.scala

@github-actions

github-actions Bot commented May 15, 2026

Unit Test Coverage

Overall Project | 77.27% (-0.05%) 🍏
Files changed | 82.88% 🍏

Module | Coverage
pramen:core Jacoco Report | 78.3% (-0.05%) 🍏

Files
Module | File | Coverage
pramen:core Jacoco Report | QueryExecutorJdbc.scala | 86.05% (-7.92%) 🍏

@yruslan
Collaborator Author

yruslan commented May 15, 2026

@coderabbitai review

@coderabbitai
Contributor

coderabbitai Bot commented May 15, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@yruslan yruslan merged commit a30289f into main May 16, 2026
11 of 12 checks passed
@yruslan yruslan deleted the feature/744-fail-safe-hive-table-existence-check branch May 16, 2026 14:06

Development

Successfully merging this pull request may close these issues.

Add a fail-safe table existence check for Hive tables
