## Walkthrough

Adds an opt-in in-place Hive schema replacement flow: a new …
## Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Task as TaskRunner/Job
    participant JobBase as JobBase
    participant Metastore as MetastoreImpl
    participant HiveHelper as HiveHelper (SQL/SparkCatalog)
    participant HiveConfig as HiveConfig
    Task->>JobBase: createOrRefreshHiveTable(..., updateSchema=true)
    JobBase->>Metastore: repairOrCreateHiveTable(..., updateSchema=true)
    alt table exists and updateSchema=true
        Metastore->>HiveHelper: replaceHiveTableSchema(schema, partitionBy, db, table)
        HiveHelper->>HiveConfig: read replaceSchemaTemplate
        HiveConfig-->>HiveHelper: template (ALTER TABLE ... REPLACE COLUMNS ...)
        HiveHelper->>HiveHelper: render SQL with @fullTableName/@schema
        HiveHelper->>HiveHelper: execute SQL (spark.sql / queryExecutor)
    else table exists and updateSchema=false
        Metastore->>Metastore: fall back to add partition or repair (MSCK REPAIR TABLE)
    end
    Metastore-->>JobBase: done
    JobBase-->>Task: return results
```
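The branch logic in the diagram can be sketched as a minimal, dependency-free flow. The function shape, callback parameters, and return values below are illustrative stand-ins for the real `MetastoreImpl`/`HiveHelper` calls, not the project's actual API:

```scala
// Illustrative sketch of the branch logic in the diagram above; the
// callbacks stand in for HiveHelper operations. Names are hypothetical.
def repairOrCreateHiveTable(tableExists: Boolean,
                            updateSchema: Boolean,
                            replaceSchema: () => Unit,
                            repairOrAddPartition: () => Unit,
                            createTable: () => Unit): String =
  if (tableExists && updateSchema) {
    replaceSchema()        // ALTER TABLE ... REPLACE COLUMNS path
    "schema-replaced"
  } else if (tableExists) {
    repairOrAddPartition() // add partition or MSCK REPAIR TABLE path
    "repaired"
  } else {
    createTable()
    "created"
  }
```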
**Estimated code review effort:** 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
pramen/core/src/main/scala/za/co/absa/pramen/core/metastore/MetastoreImpl.scala (1)
**206-218:** ⚠️ Potential issue | 🟠 Major

**Avoid falling through to full table repair after schema replacement.**

When `updateSchema = true` and `hivePreferAddPartition` is false, this still calls `repairHiveTable`, reintroducing the expensive partition re-scan the PR is meant to avoid. Prefer registering only the current Parquet partition on the schema-update path, or explicitly skip repair when the table already exists.

🐛 Proposed adjustment
```diff
       if (updateSchema) {
-        log.info(s"Updating schema of the Hive table table '$fullTableName'")
+        log.info(s"Updating schema of the Hive table '$fullTableName'")
         hiveHelper.replaceHiveTableSchema(effectiveSchema, Seq(mt.infoDateColumn), mt.hiveConfig.database, hiveTable)
       }

-      if (mt.hivePreferAddPartition && mt.format.isInstanceOf[DataFormat.Parquet]) {
+      val shouldAddSinglePartition =
+        mt.format.isInstanceOf[DataFormat.Parquet] && (mt.hivePreferAddPartition || updateSchema)
+
+      if (shouldAddSinglePartition) {
         val location = new Path(effectivePath, s"${mt.infoDateColumn}=${infoDate}")
         log.info(s"The table '$fullTableName' exists. Adding partition '$location'...")
         hiveHelper.addPartition(mt.hiveConfig.database, hiveTable, Seq(mt.infoDateColumn), Seq(infoDate.toString), location.toString)
+      } else if (updateSchema) {
+        log.info(s"The table '$fullTableName' exists and its schema was updated. Skipping full table repair.")
       } else {
         log.info(s"The table '$fullTableName' exists. Repairing it.")
         hiveHelper.repairHiveTable(mt.hiveConfig.database, hiveTable, format)
       }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pramen/core/src/main/scala/za/co/absa/pramen/core/metastore/MetastoreImpl.scala` around lines 206 - 218, The current flow in MetastoreImpl calls hiveHelper.repairHiveTable even when updateSchema is true, causing an expensive full repair; modify the logic so that after hiveHelper.replaceHiveTableSchema(effectiveSchema, ... ) you do not fall through to repairHiveTable: if mt.format is DataFormat.Parquet then register only the current partition by calling hiveHelper.addPartition(mt.hiveConfig.database, hiveTable, Seq(mt.infoDateColumn), Seq(infoDate.toString), new Path(effectivePath, s"${mt.infoDateColumn}=${infoDate}").toString), otherwise skip the repair entirely (do not call hiveHelper.repairHiveTable) when updateSchema was performed; keep the existing add-partition branch for the mt.hivePreferAddPartition=true case unchanged.
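To make the cost difference concrete, here is a hedged sketch of the two DDL statements involved. The builder functions are hypothetical helpers, not `HiveHelper`'s real API; the point is that `ADD PARTITION` touches a single directory, while `MSCK REPAIR TABLE` re-scans the entire table location:

```scala
// Hypothetical DDL builders illustrating the cheap vs. expensive path.
def addPartitionSql(fullTableName: String, partitionCol: String,
                    value: String, location: String): String =
  s"ALTER TABLE $fullTableName ADD IF NOT EXISTS " +
    s"PARTITION ($partitionCol='$value') LOCATION '$location'"

def repairTableSql(fullTableName: String): String =
  s"MSCK REPAIR TABLE $fullTableName" // scans every partition directory
```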
🧹 Nitpick comments (3)
pramen/core/src/main/scala/za/co/absa/pramen/core/metastore/model/HiveConfig.scala (1)
**123-126:** Prefer named arguments for `HiveQueryTemplates`.

Now that another string template was inserted into the case class, positional construction is easy to mis-order and hard to review.
♻️ Proposed refactor
```diff
-      templates = HiveQueryTemplates(createTableTemplate, createOnlyTableTemplate, updateSchemaTemplate, repairTableTemplate, addPartitionTableTemplate, dropTableTemplate),
+      templates = HiveQueryTemplates(
+        createTableTemplate = createTableTemplate,
+        createOnlyTableTemplate = createOnlyTableTemplate,
+        replaceSchemaTemplate = updateSchemaTemplate,
+        repairTableTemplate = repairTableTemplate,
+        addPartitionTemplate = addPartitionTableTemplate,
+        dropTableTemplate = dropTableTemplate
+      ),
```

```diff
-      HiveQueryTemplates(DEFAULT_CREATE_TABLE_TEMPLATE, DEFAULT_CREATE_ONLY_TABLE_TEMPLATE, DEFAULT_UPDATE_SCHEMA_TEMPLATE, DEFAULT_REPAIR_TABLE_TEMPLATE, DEFAULT_ADD_PARTITION_TEMPLATE, DEFAULT_DROP_TABLE_TEMPLATE),
+      HiveQueryTemplates(
+        createTableTemplate = DEFAULT_CREATE_TABLE_TEMPLATE,
+        createOnlyTableTemplate = DEFAULT_CREATE_ONLY_TABLE_TEMPLATE,
+        replaceSchemaTemplate = DEFAULT_UPDATE_SCHEMA_TEMPLATE,
+        repairTableTemplate = DEFAULT_REPAIR_TABLE_TEMPLATE,
+        addPartitionTemplate = DEFAULT_ADD_PARTITION_TEMPLATE,
+        dropTableTemplate = DEFAULT_DROP_TABLE_TEMPLATE
+      ),
```

Also applies to: 147-150
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pramen/core/src/main/scala/za/co/absa/pramen/core/metastore/model/HiveConfig.scala` around lines 123 - 126: the construction of HiveQueryTemplates using positional arguments is fragile after a new template parameter was added; update the HiveConfig instantiations that pass templates (where `HiveQueryTemplates(...)` is used) to use named arguments for each template parameter (e.g., `createTableTemplate = ...`, `createOnlyTableTemplate = ...`, `updateSchemaTemplate = ...`, `repairTableTemplate = ...`, `addPartitionTableTemplate = ...`, `dropTableTemplate = ...`, or the exact parameter names from HiveQueryTemplates), and do the same for the other occurrence mentioned so the order cannot be mistaken.

pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/HiveQueryTemplates.scala (1)
**25-101:** Naming inconsistency: "replace" vs "update" schema.

The new field and config key use "replace" (`replaceSchemaTemplate`, `REPLACE_SCHEMA_TEMPLATE_KEY = "replace.schema.template"`), but the default constant and the local binding in `fromConfig` use "update" (`DEFAULT_UPDATE_SCHEMA_TEMPLATE`, `val updateSchemaTemplate = ...`). Since the SQL it emits is `ALTER TABLE ... REPLACE COLUMNS`, aligning everything on `REPLACE` would be more consistent and easier to grep.

♻️ Suggested rename
```diff
-  val DEFAULT_UPDATE_SCHEMA_TEMPLATE: String = "ALTER TABLE `@fullTableName` REPLACE COLUMNS ( `@schema` );"
+  val DEFAULT_REPLACE_SCHEMA_TEMPLATE: String = "ALTER TABLE `@fullTableName` REPLACE COLUMNS ( `@schema` );"
@@
-    val updateSchemaTemplate = ConfigUtils.getOptionString(conf, REPLACE_SCHEMA_TEMPLATE_KEY)
-      .getOrElse(DEFAULT_UPDATE_SCHEMA_TEMPLATE)
+    val replaceSchemaTemplate = ConfigUtils.getOptionString(conf, REPLACE_SCHEMA_TEMPLATE_KEY)
+      .getOrElse(DEFAULT_REPLACE_SCHEMA_TEMPLATE)
@@
-      replaceSchemaTemplate = updateSchemaTemplate,
+      replaceSchemaTemplate = replaceSchemaTemplate,
@@
-      replaceSchemaTemplate = DEFAULT_UPDATE_SCHEMA_TEMPLATE,
+      replaceSchemaTemplate = DEFAULT_REPLACE_SCHEMA_TEMPLATE,
```

Don't forget to update the reference in `HiveHelperSparkCatalog.scala` (line 86) accordingly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/HiveQueryTemplates.scala` around lines 25 - 101: the codebase mixes "replace" and "update" naming for the schema template; rename `DEFAULT_UPDATE_SCHEMA_TEMPLATE` -> `DEFAULT_REPLACE_SCHEMA_TEMPLATE` and the local `val updateSchemaTemplate` -> `replaceSchemaTemplate` in `HiveQueryTemplates.fromConfig`, update any references where `DEFAULT_UPDATE_SCHEMA_TEMPLATE` or `updateSchemaTemplate` are used (including the HiveQueryTemplates constructor call that currently passes `updateSchemaTemplate` into `replaceSchemaTemplate`), and adjust `HiveHelperSparkCatalog.scala` (the usage at the line referenced in review) to use the new `DEFAULT_REPLACE_SCHEMA_TEMPLATE` / `replaceSchemaTemplate` identifiers so all symbols (`REPLACE_SCHEMA_TEMPLATE_KEY`, `replaceSchemaTemplate`, `DEFAULT_REPLACE_SCHEMA_TEMPLATE`) consistently use "replace".

pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/HiveHelperSparkCatalog.scala (1)
**73-92:** Consider verifying the table exists before issuing `ALTER TABLE`.

`replaceHiveTableSchema` assumes the table already exists; if called on a missing table, the `ALTER TABLE ... REPLACE COLUMNS` will fail with a Spark/Hive parse/analysis error that is less informative than the explicit `IllegalStateException` used elsewhere in this class (see `createHiveTable` / `createOrUpdateHiveTable`). If callers always guard with `doesTableExist` first, this is fine; otherwise consider adding a guard here for symmetry with the rest of the API.

Also note: unlike `HiveHelperSql` (which reads `hiveConfig.replaceSchemaTemplate`), this implementation hardcodes `HiveQueryTemplates.DEFAULT_UPDATE_SCHEMA_TEMPLATE`, so a user-configured `replace.schema.template` will have no effect when using the Spark Catalog backend. That matches the existing behavior for other operations in this class (which bypass templates entirely), but worth confirming this is intentional.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/HiveHelperSparkCatalog.scala` around lines 73 - 92, The replaceHiveTableSchema method currently runs an ALTER/REPLACE without checking existence and hardcodes HiveQueryTemplates.DEFAULT_UPDATE_SCHEMA_TEMPLATE; update replaceHiveTableSchema to first call doesTableExist(fullTableName) (or the existing doesTableExist method in this class) and throw the same IllegalStateException used by createHiveTable/createOrUpdateHiveTable if the table is missing, and modify the SQL template usage to respect hiveConfig.replaceSchemaTemplate (fall back to HiveQueryTemplates.DEFAULT_UPDATE_SCHEMA_TEMPLATE if the config value is empty) so the Spark Catalog backend honors configured templates.
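For illustration, the template substitution discussed above can be sketched as plain string replacement. The default template string is quoted from the review; the `render` helper itself is hypothetical, not the project's actual renderer:

```scala
// Sketch of rendering the replace-schema template. Only the template text
// and the @fullTableName/@schema variables come from the review.
val DefaultReplaceSchemaTemplate: String =
  "ALTER TABLE `@fullTableName` REPLACE COLUMNS ( `@schema` );"

def render(template: String, fullTableName: String, schemaDdl: String): String =
  template
    .replace("@fullTableName", fullTableName)
    .replace("@schema", schemaDdl)
```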
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@pramen/core/src/main/scala/za/co/absa/pramen/core/runner/task/TaskRunnerBase.scala`:
- Around line 426-427: The updateSchema decision currently only checks
schemaChangesBeforeTransform.nonEmpty || schemaChangesAfterTransform.nonEmpty
and thus skips Hive metadata updates when handleSchemaChange() returned (true,
Nil) for a first-time registration; modify the logic so updateSchema is true not
only when the change lists are non-empty but also when the boolean result from
handleSchemaChange() indicates a registration happened (i.e., consider the
boolean part of handleSchemaChange() for before/after schema checks) and then
call task.job.createOrRefreshHiveTable(...) when either a change list is
non-empty OR the corresponding handleSchemaChange boolean is true to ensure
first-time schema registration triggers replaceHiveTableSchema.
In `@pramen/core/src/main/scala/za/co/absa/pramen/core/utils/SparkUtils.scala`:
- Line 53: The MAX_VARCHAR_LENGTH constant in SparkUtils (val
MAX_VARCHAR_LENGTH) does not match the boundary used by
JdbcSparkUtils.addMetadata (which writes maxLength for lengths < 8192), causing
mismatched treatment of varchar(>4096 and <8192) values; fix by centralizing the
boundary constant (e.g., create a shared VarcharLengthLimit constant) or change
val MAX_VARCHAR_LENGTH to 8192 and update all usages (including
SparkUtils.MAX_VARCHAR_LENGTH and JdbcSparkUtils.addMetadata) so both producer
and consumer use the exact same exclusive bound.
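A minimal sketch of the suggested centralization, assuming the 8192 exclusive bound mentioned for `JdbcSparkUtils.addMetadata`; the object and method names here are illustrative, not the project's actual identifiers:

```scala
// Hypothetical shared constant so producer and consumer use one bound.
object VarcharLimits {
  val MaxVarcharLength: Int = 8192 // exclusive upper bound

  // Lengths strictly below the bound are kept as varchar(n).
  def keepAsVarchar(maxLength: Int): Boolean =
    maxLength > 0 && maxLength < MaxVarcharLength
}
```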
In
`@pramen/core/src/test/scala/za/co/absa/pramen/core/tests/utils/hive/HiveHelperSparkCatalogSuite.scala`:
- Around line 79-81: MetastoreImpl.repairOrCreateHiveTable currently calls
replaceHiveTableSchema unconditionally when updateSchema is true which throws
AnalysisException for partitioned Parquet tables with SparkCatalog; wrap the
call to replaceHiveTableSchema inside a try-catch that catches AnalysisException
(or Throwable), logs a warning with the exception, and falls back to recreating
the table (i.e., call the existing table creation/replacement code path used
when replacement is not possible) for SparkCatalog/partitioned tables; ensure
you check the partitioned-table condition and still respect updateSchema while
providing the fallback behavior in MetastoreImpl.repairOrCreateHiveTable.
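The try-catch-with-fallback suggested above can be expressed as a small generic helper. This is a sketch of the pattern only, not the project's code; in the real fix the caught type would be Spark's `AnalysisException` and the fallback would log a warning and recreate the table:

```scala
import scala.util.control.NonFatal

// Generic "try the in-place schema replacement, otherwise fall back
// to recreating the table" pattern.
def withFallback[A](primary: => A)(fallback: Throwable => A): A =
  try primary
  catch { case NonFatal(ex) => fallback(ex) }
```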
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f9080a27-7fd3-4b9b-bf42-138db54a4aba
📒 Files selected for processing (19)
- pramen/core/src/main/scala/za/co/absa/pramen/core/metastore/Metastore.scala
- pramen/core/src/main/scala/za/co/absa/pramen/core/metastore/MetastoreImpl.scala
- pramen/core/src/main/scala/za/co/absa/pramen/core/metastore/model/HiveConfig.scala
- pramen/core/src/main/scala/za/co/absa/pramen/core/pipeline/Job.scala
- pramen/core/src/main/scala/za/co/absa/pramen/core/pipeline/JobBase.scala
- pramen/core/src/main/scala/za/co/absa/pramen/core/runner/task/TaskRunnerBase.scala
- pramen/core/src/main/scala/za/co/absa/pramen/core/utils/SparkUtils.scala
- pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/HiveHelper.scala
- pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/HiveHelperSparkCatalog.scala
- pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/HiveHelperSql.scala
- pramen/core/src/main/scala/za/co/absa/pramen/core/utils/hive/HiveQueryTemplates.scala
- pramen/core/src/test/scala/za/co/absa/pramen/core/metastore/MetastoreSuite.scala
- pramen/core/src/test/scala/za/co/absa/pramen/core/metastore/model/HiveConfigSuite.scala
- pramen/core/src/test/scala/za/co/absa/pramen/core/metastore/model/MetaTableSuite.scala
- pramen/core/src/test/scala/za/co/absa/pramen/core/mocks/job/JobSpy.scala
- pramen/core/src/test/scala/za/co/absa/pramen/core/mocks/metastore/MetastoreSpy.scala
- pramen/core/src/test/scala/za/co/absa/pramen/core/pipeline/JobBaseSuite.scala
- pramen/core/src/test/scala/za/co/absa/pramen/core/tests/utils/hive/HiveHelperSparkCatalogSuite.scala
- pramen/core/src/test/scala/za/co/absa/pramen/core/tests/utils/hive/HiveHelperSqlSuite.scala
…bles if Spark Catalog fails to do it.
Closes #736