
#736 Add ability to use schema replacement on Hive schema change instead of full table re-creation (#737)

Merged
yruslan merged 6 commits into main from feature/736-use-hive-alter-table
Apr 22, 2026

Conversation


@yruslan yruslan commented Apr 22, 2026

Closes #736

Summary by CodeRabbit

  • New Features
    • Add option to update Hive table schemas in-place (no full table recreate required).
    • New configurable "replace schema" Hive query template to control schema-replacement SQL.
    • Improved validation for CHAR/VARCHAR lengths to avoid invalid or out-of-range type sizes.
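
The new template can be overridden via configuration. A minimal sketch, assuming the key lives in the relevant Hive configuration section (only the key name `replace.schema.template` and the shape of the default value come from this PR; the enclosing configuration path is an assumption):

```hocon
# Hypothetical placement; only the key name and template tokens are from this PR.
replace.schema.template = "ALTER TABLE @fullTableName REPLACE COLUMNS ( @schema );"
```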


coderabbitai Bot commented Apr 22, 2026

Warning

Rate limit exceeded

@yruslan has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 26 minutes and 55 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 26 minutes and 55 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c3301a1f-6c31-447f-bfc2-c6fe472adf27

📥 Commits

Reviewing files that changed from the base of the PR and between 7faeee0 and 69e00c5.

📒 Files selected for processing (5)
  • pramen/core/src/main/scala/za/co/absa/pramen/core/metastore/MetastoreImpl.scala
  • pramen/core/src/main/scala/za/co/absa/pramen/core/metastore/model/HiveConfig.scala
  • pramen/core/src/main/scala/za/co/absa/pramen/core/runner/task/TaskRunnerBase.scala
  • pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/HiveHelperSparkCatalog.scala
  • pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/HiveQueryTemplates.scala

Walkthrough

Adds an opt-in, in-place Hive schema replacement flow: a new updateSchema: Boolean parameter is threaded through Job, TaskRunner, Metastore, and HiveHelper so that an ALTER TABLE ... REPLACE COLUMNS template is executed instead of recreating the table and running MSCK REPAIR TABLE.
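
The template rendering described above can be sketched as a dependency-free token substitution. The `@fullTableName`/`@schema` token names and the default template text follow this PR; the method name `renderReplaceSchema` and its signature are illustrative assumptions, not Pramen's actual API:

```scala
// Hypothetical sketch of rendering the "replace schema" query template.
// Token names and the default template follow this PR; everything else
// is illustrative.
object ReplaceSchemaSketch {
  val DefaultTemplate: String =
    "ALTER TABLE @fullTableName REPLACE COLUMNS ( @schema );"

  def renderReplaceSchema(template: String,
                          database: Option[String],
                          table: String,
                          columnsDdl: String): String = {
    // Qualify the table name with the database when one is configured.
    val fullTableName = database.fold(table)(db => s"$db.$table")
    template
      .replace("@fullTableName", fullTableName)
      .replace("@schema", columnsDdl)
  }
}
```

For example, `renderReplaceSchema(DefaultTemplate, Some("mydb"), "my_table", "id BIGINT, name STRING")` yields `ALTER TABLE mydb.my_table REPLACE COLUMNS ( id BIGINT, name STRING );`.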

Changes

Cohort / File(s) Summary
Interfaces / API
pramen/core/src/main/scala/za/co/absa/pramen/core/metastore/Metastore.scala, pramen/core/src/main/scala/za/co/absa/pramen/core/pipeline/Job.scala, pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/HiveHelper.scala
Added updateSchema: Boolean parameter to repairOrCreateHiveTable / createOrRefreshHiveTable. Added abstract replaceHiveTableSchema to HiveHelper.
Metastore implementation
pramen/core/src/main/scala/za/co/absa/pramen/core/metastore/MetastoreImpl.scala
Handles new updateSchema flag: attempts in-place schema replace via HiveHelper.replaceHiveTableSchema when true; falls back to create/update and centralized partition add/repair logic when needed.
Job & Runner plumbing
pramen/core/src/main/scala/za/co/absa/pramen/core/pipeline/JobBase.scala, pramen/core/src/main/scala/za/co/absa/pramen/core/runner/task/TaskRunnerBase.scala
Propagates updateSchema through job calls. TaskRunner separates updateSchema decision from forceReCreateHiveTables.
Hive helper implementations
pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/HiveHelperSparkCatalog.scala, pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/HiveHelperSql.scala
Implemented replaceHiveTableSchema using the configured template (builds DDL from schema, injects @fullTableName/@schema, executes SQL).
Query templates & config
pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/HiveQueryTemplates.scala, pramen/core/src/main/scala/za/co/absa/pramen/core/metastore/model/HiveConfig.scala
Added replaceSchemaTemplate field and config key replace.schema.template with default ALTER TABLE @fullTableName REPLACE COLUMNS ( @schema );. Updated template construction and config loading.
Spark utils
pramen/core/src/main/scala/za/co/absa/pramen/core/utils/SparkUtils.scala
Tightened parsing of char/varchar length metadata to constrain lengths within MAXIMUM_VARCHAR_LENGTH; invalid lengths fall back to StringType.
Tests & Mocks
pramen/core/src/test/scala/.../MetastoreSuite.scala, .../HiveConfigSuite.scala, .../MetaTableSuite.scala, .../JobBaseSuite.scala, .../HiveHelperSparkCatalogSuite.scala, .../HiveHelperSqlSuite.scala, .../JobSpy.scala, .../MetastoreSpy.scala
Updated test call sites and mocks to accept/record updateSchema. Added/updated tests asserting replaceSchemaTemplate usage and replaceHiveTableSchema behavior (including expected exceptions).
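
The SparkUtils change summarized above (constraining CHAR/VARCHAR lengths and falling back to StringType) can be sketched roughly as follows. The bound value and names here are assumptions for illustration only; the real code uses Spark's VarcharType/StringType, and the review discussion below notes the actual boundary constant needs checking (4096 vs 8192):

```scala
// Hedged sketch of CHAR/VARCHAR length validation with a string fallback.
// The constant value (8192) and names are assumptions, not Pramen's code.
object VarcharValidationSketch {
  val MaximumVarcharLength = 8192

  // Returns a DDL type string: VARCHAR(n) for a valid length, else STRING.
  def varcharOrString(length: Int): String =
    if (length > 0 && length <= MaximumVarcharLength) s"VARCHAR($length)"
    else "STRING"
}
```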

Sequence Diagram(s)

sequenceDiagram
    participant Task as TaskRunner/Job
    participant JobBase as JobBase
    participant Metastore as MetastoreImpl
    participant HiveHelper as HiveHelper (SQL/SparkCatalog)
    participant HiveConfig as HiveConfig

    Task->>JobBase: createOrRefreshHiveTable(..., updateSchema=true)
    JobBase->>Metastore: repairOrCreateHiveTable(..., updateSchema=true)
    alt table exists and updateSchema=true
        Metastore->>HiveHelper: replaceHiveTableSchema(schema, partitionBy, db, table)
        HiveHelper->>HiveConfig: read replaceSchemaTemplate
        HiveConfig-->>HiveHelper: template (ALTER TABLE ... REPLACE COLUMNS ...)
    HiveHelper->>HiveHelper: render SQL with @fullTableName/@schema
        HiveHelper->>HiveHelper: execute SQL (spark.sql / queryExecutor)
    else table exists and updateSchema=false
        Metastore->>Metastore: fall back to add partition or repair (MSCK REPAIR TABLE)
    end
    Metastore-->>JobBase: done
    JobBase-->>Task: return results
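
The alt branches in the diagram can be sketched as simplified control flow. The shape follows the walkthrough and diagram; the function signature and return labels are purely illustrative (a review comment below argues the real code currently still falls through to repair in some cases):

```scala
// Hypothetical, simplified decision logic for an existing Hive table,
// mirroring the sequence diagram; signature and labels are illustrative.
def onExistingTable(updateSchema: Boolean,
                    preferAddPartition: Boolean,
                    replaceSchema: () => Unit,
                    addPartition: () => Unit,
                    repairTable: () => Unit): String =
  if (updateSchema) {
    replaceSchema()        // ALTER TABLE ... REPLACE COLUMNS, no re-scan
    "replaced-schema"
  } else if (preferAddPartition) {
    addPartition()         // register only the current partition
    "added-partition"
  } else {
    repairTable()          // full MSCK REPAIR TABLE
    "repaired"
  }
```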

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes


Poem

🐰
Hop, hop — I tweak the table's song,
ALTER replaces where drops were long,
No full rebuild, no slow repair,
Quick as a carrot — schema's fair,
I nibble bugs and leave clean air.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)
  • Out of Scope Changes check ⚠️ Warning: SparkUtils.scala changes added length validation for VARCHAR types, which appears unrelated to Hive schema replacement and is outside the stated objective of using ALTER TABLE instead of table re-creation. Resolution: clarify whether VARCHAR length validation is a necessary supporting change or should be extracted to a separate PR. If necessary, document its relationship to schema replacement.

✅ Passed checks (4 passed)
  • Description Check ✅ Passed: check skipped because CodeRabbit’s high-level summary is enabled.
  • Title check ✅ Passed: the title accurately describes the main feature being implemented: adding schema replacement capability to avoid full table re-creation, which directly addresses the core objective.
  • Linked Issues check ✅ Passed: all changes implement the schema replacement feature using ALTER TABLE REPLACE COLUMNS [#736], with the new updateSchema parameter, the HiveQueryTemplates extension, and method implementations in the HiveHelper classes to support in-place schema updates without partition re-scanning.
  • Docstring Coverage ✅ Passed: no functions found in the changed files to evaluate docstring coverage; skipping the docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/736-use-hive-alter-table

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pramen/core/src/main/scala/za/co/absa/pramen/core/metastore/MetastoreImpl.scala (1)

206-218: ⚠️ Potential issue | 🟠 Major

Avoid falling through to full table repair after schema replacement.

When updateSchema = true and hivePreferAddPartition is false, this still calls repairHiveTable, reintroducing the expensive partition re-scan the PR is meant to avoid. Prefer registering only the current Parquet partition on the schema-update path, or explicitly skip repair when the table already exists.

🐛 Proposed adjustment
         if (updateSchema) {
-          log.info(s"Updating schema of the Hive table table '$fullTableName'")
+          log.info(s"Updating schema of the Hive table '$fullTableName'")
           hiveHelper.replaceHiveTableSchema(effectiveSchema, Seq(mt.infoDateColumn), mt.hiveConfig.database, hiveTable)
         }
 
-        if (mt.hivePreferAddPartition && mt.format.isInstanceOf[DataFormat.Parquet]) {
+        val shouldAddSinglePartition =
+          mt.format.isInstanceOf[DataFormat.Parquet] && (mt.hivePreferAddPartition || updateSchema)
+
+        if (shouldAddSinglePartition) {
           val location = new Path(effectivePath, s"${mt.infoDateColumn}=${infoDate}")
           log.info(s"The table '$fullTableName' exists. Adding partition '$location'...")
           hiveHelper.addPartition(mt.hiveConfig.database, hiveTable, Seq(mt.infoDateColumn), Seq(infoDate.toString), location.toString)
+        } else if (updateSchema) {
+          log.info(s"The table '$fullTableName' exists and its schema was updated. Skipping full table repair.")
         } else {
           log.info(s"The table '$fullTableName' exists. Repairing it.")
           hiveHelper.repairHiveTable(mt.hiveConfig.database, hiveTable, format)
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@pramen/core/src/main/scala/za/co/absa/pramen/core/metastore/MetastoreImpl.scala`
around lines 206 - 218, The current flow in MetastoreImpl calls
hiveHelper.repairHiveTable even when updateSchema is true, causing an expensive
full repair; modify the logic so that after
hiveHelper.replaceHiveTableSchema(effectiveSchema, ... ) you do not fall through
to repairHiveTable: if mt.format is DataFormat.Parquet then register only the
current partition by calling hiveHelper.addPartition(mt.hiveConfig.database,
hiveTable, Seq(mt.infoDateColumn), Seq(infoDate.toString), new
Path(effectivePath, s"${mt.infoDateColumn}=${infoDate}").toString), otherwise
skip the repair entirely (do not call hiveHelper.repairHiveTable) when
updateSchema was performed; keep the existing add-partition branch for the
mt.hivePreferAddPartition=true case unchanged.
🧹 Nitpick comments (3)
pramen/core/src/main/scala/za/co/absa/pramen/core/metastore/model/HiveConfig.scala (1)

123-126: Prefer named arguments for HiveQueryTemplates.

Now that another string template was inserted into the case class, positional construction is easy to mis-order and hard to review.

♻️ Proposed refactor
-      templates = HiveQueryTemplates(createTableTemplate, createOnlyTableTemplate, updateSchemaTemplate, repairTableTemplate, addPartitionTableTemplate, dropTableTemplate),
+      templates = HiveQueryTemplates(
+        createTableTemplate = createTableTemplate,
+        createOnlyTableTemplate = createOnlyTableTemplate,
+        replaceSchemaTemplate = updateSchemaTemplate,
+        repairTableTemplate = repairTableTemplate,
+        addPartitionTemplate = addPartitionTableTemplate,
+        dropTableTemplate = dropTableTemplate
+      ),
-    HiveQueryTemplates(DEFAULT_CREATE_TABLE_TEMPLATE, DEFAULT_CREATE_ONLY_TABLE_TEMPLATE, DEFAULT_UPDATE_SCHEMA_TEMPLATE, DEFAULT_REPAIR_TABLE_TEMPLATE, DEFAULT_ADD_PARTITION_TEMPLATE, DEFAULT_DROP_TABLE_TEMPLATE),
+    HiveQueryTemplates(
+      createTableTemplate = DEFAULT_CREATE_TABLE_TEMPLATE,
+      createOnlyTableTemplate = DEFAULT_CREATE_ONLY_TABLE_TEMPLATE,
+      replaceSchemaTemplate = DEFAULT_UPDATE_SCHEMA_TEMPLATE,
+      repairTableTemplate = DEFAULT_REPAIR_TABLE_TEMPLATE,
+      addPartitionTemplate = DEFAULT_ADD_PARTITION_TEMPLATE,
+      dropTableTemplate = DEFAULT_DROP_TABLE_TEMPLATE
+    ),

Also applies to: 147-150

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@pramen/core/src/main/scala/za/co/absa/pramen/core/metastore/model/HiveConfig.scala`
around lines 123 - 126, The construction of HiveQueryTemplates using positional
arguments is fragile after a new template parameter was added; update the
HiveConfig instantiations that pass templates (where HiveQueryTemplates(...) is
used) to use named arguments for each template parameter (e.g.,
createTableTemplate = ..., createOnlyTableTemplate = ..., updateSchemaTemplate =
..., repairTableTemplate = ..., addPartitionTableTemplate = ...,
dropTableTemplate = ... or the exact parameter names from HiveQueryTemplates),
and do the same for the other occurrence mentioned so the order cannot be
mistaken.
pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/HiveQueryTemplates.scala (1)

25-101: Naming inconsistency: "replace" vs "update" schema.

The new field/config key use "replace" (replaceSchemaTemplate, REPLACE_SCHEMA_TEMPLATE_KEY = "replace.schema.template"), but the default constant and the local binding in fromConfig use "update" (DEFAULT_UPDATE_SCHEMA_TEMPLATE, val updateSchemaTemplate = ...). Since the SQL it emits is ALTER TABLE ... REPLACE COLUMNS, aligning everything on REPLACE would be more consistent and easier to grep.

♻️ Suggested rename
-  val DEFAULT_UPDATE_SCHEMA_TEMPLATE: String = "ALTER TABLE `@fullTableName` REPLACE COLUMNS ( `@schema` );"
+  val DEFAULT_REPLACE_SCHEMA_TEMPLATE: String = "ALTER TABLE `@fullTableName` REPLACE COLUMNS ( `@schema` );"
@@
-    val updateSchemaTemplate = ConfigUtils.getOptionString(conf, REPLACE_SCHEMA_TEMPLATE_KEY)
-      .getOrElse(DEFAULT_UPDATE_SCHEMA_TEMPLATE)
+    val replaceSchemaTemplate = ConfigUtils.getOptionString(conf, REPLACE_SCHEMA_TEMPLATE_KEY)
+      .getOrElse(DEFAULT_REPLACE_SCHEMA_TEMPLATE)
@@
-      replaceSchemaTemplate = updateSchemaTemplate,
+      replaceSchemaTemplate = replaceSchemaTemplate,
@@
-      replaceSchemaTemplate = DEFAULT_UPDATE_SCHEMA_TEMPLATE,
+      replaceSchemaTemplate = DEFAULT_REPLACE_SCHEMA_TEMPLATE,

Don't forget to update the reference in HiveHelperSparkCatalog.scala (line 86) accordingly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/HiveQueryTemplates.scala`
around lines 25 - 101, The codebase mixes "replace" and "update" naming for the
schema template: rename DEFAULT_UPDATE_SCHEMA_TEMPLATE ->
DEFAULT_REPLACE_SCHEMA_TEMPLATE and the local val updateSchemaTemplate ->
replaceSchemaTemplate in HiveQueryTemplates.fromConfig, update any references
where DEFAULT_UPDATE_SCHEMA_TEMPLATE or updateSchemaTemplate are used (including
the HiveQueryTemplates constructor call that currently passes
updateSchemaTemplate into replaceSchemaTemplate), and adjust
HiveHelperSparkCatalog.scala (the usage at line referenced in review) to use the
new DEFAULT_REPLACE_SCHEMA_TEMPLATE / replaceSchemaTemplate identifiers so all
symbols (REPLACE_SCHEMA_TEMPLATE_KEY, replaceSchemaTemplate,
DEFAULT_REPLACE_SCHEMA_TEMPLATE) consistently use "replace".
pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/HiveHelperSparkCatalog.scala (1)

73-92: Consider verifying the table exists before issuing ALTER TABLE.

replaceHiveTableSchema assumes the table already exists — if called on a missing table, the ALTER TABLE ... REPLACE COLUMNS will fail with a Spark/Hive parse/analysis error that is less informative than the explicit IllegalStateException used elsewhere in this class (see createHiveTable / createOrUpdateHiveTable). If callers always guard with doesTableExist first, this is fine; otherwise consider adding a guard here for symmetry with the rest of the API.

Also note: unlike HiveHelperSql (which reads hiveConfig.replaceSchemaTemplate), this implementation hardcodes HiveQueryTemplates.DEFAULT_UPDATE_SCHEMA_TEMPLATE, so a user-configured replace.schema.template will have no effect when using the Spark Catalog backend. That matches the existing behavior for other operations in this class (which bypass templates entirely), but worth confirming this is intentional.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/HiveHelperSparkCatalog.scala`
around lines 73 - 92, The replaceHiveTableSchema method currently runs an
ALTER/REPLACE without checking existence and hardcodes
HiveQueryTemplates.DEFAULT_UPDATE_SCHEMA_TEMPLATE; update replaceHiveTableSchema
to first call doesTableExist(fullTableName) (or the existing doesTableExist
method in this class) and throw the same IllegalStateException used by
createHiveTable/createOrUpdateHiveTable if the table is missing, and modify the
SQL template usage to respect hiveConfig.replaceSchemaTemplate (fall back to
HiveQueryTemplates.DEFAULT_UPDATE_SCHEMA_TEMPLATE if the config value is empty)
so the Spark Catalog backend honors configured templates.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@pramen/core/src/main/scala/za/co/absa/pramen/core/runner/task/TaskRunnerBase.scala`:
- Around line 426-427: The updateSchema decision currently only checks
schemaChangesBeforeTransform.nonEmpty || schemaChangesAfterTransform.nonEmpty
and thus skips Hive metadata updates when handleSchemaChange() returned (true,
Nil) for a first-time registration; modify the logic so updateSchema is true not
only when the change lists are non-empty but also when the boolean result from
handleSchemaChange() indicates a registration happened (i.e., consider the
boolean part of handleSchemaChange() for before/after schema checks) and then
call task.job.createOrRefreshHiveTable(...) when either a change list is
non-empty OR the corresponding handleSchemaChange boolean is true to ensure
first-time schema registration triggers replaceHiveTableSchema.

In `@pramen/core/src/main/scala/za/co/absa/pramen/core/utils/SparkUtils.scala`:
- Line 53: The MAX_VARCHAR_LENGTH constant in SparkUtils (val
MAX_VARCHAR_LENGTH) does not match the boundary used by
JdbcSparkUtils.addMetadata (which writes maxLength for lengths < 8192), causing
mismatched treatment of varchar(>4096 and <8192) values; fix by centralizing the
boundary constant (e.g., create a shared VarcharLengthLimit constant) or change
val MAX_VARCHAR_LENGTH to 8192 and update all usages (including
SparkUtils.MAX_VARCHAR_LENGTH and JdbcSparkUtils.addMetadata) so both producer
and consumer use the exact same exclusive bound.

In
`@pramen/core/src/test/scala/za/co/absa/pramen/core/tests/utils/hive/HiveHelperSparkCatalogSuite.scala`:
- Around line 79-81: MetastoreImpl.repairOrCreateHiveTable currently calls
replaceHiveTableSchema unconditionally when updateSchema is true which throws
AnalysisException for partitioned Parquet tables with SparkCatalog; wrap the
call to replaceHiveTableSchema inside a try-catch that catches AnalysisException
(or Throwable), logs a warning with the exception, and falls back to recreating
the table (i.e., call the existing table creation/replacement code path used
when replacement is not possible) for SparkCatalog/partitioned tables; ensure
you check the partitioned-table condition and still respect updateSchema while
providing the fallback behavior in MetastoreImpl.repairOrCreateHiveTable.

---

Outside diff comments:
In
`@pramen/core/src/main/scala/za/co/absa/pramen/core/metastore/MetastoreImpl.scala`:
- Around line 206-218: The current flow in MetastoreImpl calls
hiveHelper.repairHiveTable even when updateSchema is true, causing an expensive
full repair; modify the logic so that after
hiveHelper.replaceHiveTableSchema(effectiveSchema, ... ) you do not fall through
to repairHiveTable: if mt.format is DataFormat.Parquet then register only the
current partition by calling hiveHelper.addPartition(mt.hiveConfig.database,
hiveTable, Seq(mt.infoDateColumn), Seq(infoDate.toString), new
Path(effectivePath, s"${mt.infoDateColumn}=${infoDate}").toString), otherwise
skip the repair entirely (do not call hiveHelper.repairHiveTable) when
updateSchema was performed; keep the existing add-partition branch for the
mt.hivePreferAddPartition=true case unchanged.

---

Nitpick comments:
In
`@pramen/core/src/main/scala/za/co/absa/pramen/core/metastore/model/HiveConfig.scala`:
- Around line 123-126: The construction of HiveQueryTemplates using positional
arguments is fragile after a new template parameter was added; update the
HiveConfig instantiations that pass templates (where HiveQueryTemplates(...) is
used) to use named arguments for each template parameter (e.g.,
createTableTemplate = ..., createOnlyTableTemplate = ..., updateSchemaTemplate =
..., repairTableTemplate = ..., addPartitionTableTemplate = ...,
dropTableTemplate = ... or the exact parameter names from HiveQueryTemplates),
and do the same for the other occurrence mentioned so the order cannot be
mistaken.

In
`@pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/HiveHelperSparkCatalog.scala`:
- Around line 73-92: The replaceHiveTableSchema method currently runs an
ALTER/REPLACE without checking existence and hardcodes
HiveQueryTemplates.DEFAULT_UPDATE_SCHEMA_TEMPLATE; update replaceHiveTableSchema
to first call doesTableExist(fullTableName) (or the existing doesTableExist
method in this class) and throw the same IllegalStateException used by
createHiveTable/createOrUpdateHiveTable if the table is missing, and modify the
SQL template usage to respect hiveConfig.replaceSchemaTemplate (fall back to
HiveQueryTemplates.DEFAULT_UPDATE_SCHEMA_TEMPLATE if the config value is empty)
so the Spark Catalog backend honors configured templates.

In
`@pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/HiveQueryTemplates.scala`:
- Around line 25-101: The codebase mixes "replace" and "update" naming for the
schema template: rename DEFAULT_UPDATE_SCHEMA_TEMPLATE ->
DEFAULT_REPLACE_SCHEMA_TEMPLATE and the local val updateSchemaTemplate ->
replaceSchemaTemplate in HiveQueryTemplates.fromConfig, update any references
where DEFAULT_UPDATE_SCHEMA_TEMPLATE or updateSchemaTemplate are used (including
the HiveQueryTemplates constructor call that currently passes
updateSchemaTemplate into replaceSchemaTemplate), and adjust
HiveHelperSparkCatalog.scala (the usage at line referenced in review) to use the
new DEFAULT_REPLACE_SCHEMA_TEMPLATE / replaceSchemaTemplate identifiers so all
symbols (REPLACE_SCHEMA_TEMPLATE_KEY, replaceSchemaTemplate,
DEFAULT_REPLACE_SCHEMA_TEMPLATE) consistently use "replace".
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f9080a27-7fd3-4b9b-bf42-138db54a4aba

📥 Commits

Reviewing files that changed from the base of the PR and between 294b275 and 1c511d1.

📒 Files selected for processing (19)
  • pramen/core/src/main/scala/za/co/absa/pramen/core/metastore/Metastore.scala
  • pramen/core/src/main/scala/za/co/absa/pramen/core/metastore/MetastoreImpl.scala
  • pramen/core/src/main/scala/za/co/absa/pramen/core/metastore/model/HiveConfig.scala
  • pramen/core/src/main/scala/za/co/absa/pramen/core/pipeline/Job.scala
  • pramen/core/src/main/scala/za/co/absa/pramen/core/pipeline/JobBase.scala
  • pramen/core/src/main/scala/za/co/absa/pramen/core/runner/task/TaskRunnerBase.scala
  • pramen/core/src/main/scala/za/co/absa/pramen/core/utils/SparkUtils.scala
  • pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/HiveHelper.scala
  • pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/HiveHelperSparkCatalog.scala
  • pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/HiveHelperSql.scala
  • pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/HiveQueryTemplates.scala
  • pramen/core/src/test/scala/za/co/absa/pramen/core/metastore/MetastoreSuite.scala
  • pramen/core/src/test/scala/za/co/absa/pramen/core/metastore/model/HiveConfigSuite.scala
  • pramen/core/src/test/scala/za/co/absa/pramen/core/metastore/model/MetaTableSuite.scala
  • pramen/core/src/test/scala/za/co/absa/pramen/core/mocks/job/JobSpy.scala
  • pramen/core/src/test/scala/za/co/absa/pramen/core/mocks/metastore/MetastoreSpy.scala
  • pramen/core/src/test/scala/za/co/absa/pramen/core/pipeline/JobBaseSuite.scala
  • pramen/core/src/test/scala/za/co/absa/pramen/core/tests/utils/hive/HiveHelperSparkCatalogSuite.scala
  • pramen/core/src/test/scala/za/co/absa/pramen/core/tests/utils/hive/HiveHelperSqlSuite.scala

Comment thread on pramen/core/src/main/scala/za/co/absa/pramen/core/utils/SparkUtils.scala (outdated)

github-actions Bot commented Apr 22, 2026

Unit Test Coverage

Overall Project 77.8% -0.09% 🍏
Files changed 80.08% 🍏

Module Coverage
  • pramen:core (Jacoco Report): 78.92% (-0.1%) 🍏

Files (module pramen:core, Jacoco Report)
  • HiveConfig.scala: 100% 🍏
  • HiveQueryTemplates.scala: 98.53% (-1.96%) 🍏
  • HiveHelperSql.scala: 95.3% 🍏
  • JobBase.scala: 92.14% 🍏
  • HiveHelper.scala: 88.68% 🍏
  • HiveHelperSparkCatalog.scala: 87.07% (-1.71%) 🍏
  • SparkUtils.scala: 86.72% (-1.63%)
  • MetastoreImpl.scala: 86.22% (-4.76%)
  • TaskRunnerBase.scala: 82.46% 🍏

@yruslan yruslan merged commit 5bd9e82 into main Apr 22, 2026
7 checks passed
@yruslan yruslan deleted the feature/736-use-hive-alter-table branch April 22, 2026 10:42


Development

Successfully merging this pull request may close these issues.

When replacing Hive schema use 'ALTER TABLE' to avoid 'MSCK REPAIR TABLE'
