[AMORO-1503][mixed-hive] Add INFO logs for partition operations #4200

Open
lintingbin wants to merge 1 commit into apache:master from lintingbin:AMORO-1503-mixed-hive-partition-logs

@lintingbin
Contributor

Summary

Closes #1503.

Adds INFO/WARN logs on the Mixed Hive Table partition operation paths
(create / drop / alter location). These operations are infrequent, so
logging them at INFO level makes after-the-fact diagnosis of missing
or mis-located partitions much easier with negligible runtime cost.

Logs are added at two layers:

  • HivePartitionUtil — per-partition LOG.info on createPartitionIfAbsent,
    dropPartition, and updatePartitionLocation, plus matching LOG.warn on
    the failure paths. The existing alterPartition log now reports both old
    and new locations.
  • UpdateHiveFiles / ReplaceHivePartitions — per-batch LOG.info for the
    drop / create / alter sets emitted by commitPartitionedTable, with a
    count and a sampled list of partitions (capped at 5, with ... for the
    rest) to avoid log explosion on large commits, plus matching LOG.warn
    on commit failures.
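
The capped partition list described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class name `PartitionLogSampler`, the methods `summarize` and `batchMessage`, and the use of plain strings in place of Hive `Partition` objects are all assumptions made for the example. Only the cap of 5 and the `...` marker come from the description.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper; the PR may structure this differently. */
final class PartitionLogSampler {
  /** Cap taken from the PR description: at most 5 partitions are listed. */
  static final int MAX_SAMPLED = 5;

  private PartitionLogSampler() {}

  /** Renders at most MAX_SAMPLED entries and appends "..." when more were omitted. */
  static String summarize(List<String> partitions) {
    String head = partitions.stream()
        .limit(MAX_SAMPLED)
        .collect(Collectors.joining(", "));
    boolean truncated = partitions.size() > MAX_SAMPLED;
    return "[" + head + (truncated ? ", ..." : "") + "]";
  }

  /** Builds a per-batch message: full count first, then the sampled list. */
  static String batchMessage(String action, String table, long txId, List<String> partitions) {
    return String.format(
        "%s %d Hive partitions for table %s, txId %d, partitions %s",
        action, partitions.size(), table, txId, summarize(partitions));
  }
}
```

Reporting the full count alongside the sample keeps the size of the commit visible even when most partitions are elided from the line.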

Each log line carries the table identifier, partition values, location
(old → new on alter), and transaction id where available, so a single
log record is enough to identify the partition and trace it back to a
commit.

Sample logs

Per-partition (single-partition path, HivePartitionUtil):

INFO  o.a.a.h.utils.HivePartitionUtil - Creating Hive partition for table catalog.db.tbl, partition values [2024-01-01], location s3://bkt/db/tbl/dt=2024-01-01/.amoro_xact_42
INFO  o.a.a.h.utils.HivePartitionUtil - Dropping Hive partition for table catalog.db.tbl, partition values [2024-01-01], location s3://bkt/db/tbl/dt=2024-01-01/.amoro_xact_41
INFO  o.a.a.h.utils.HivePartitionUtil - Altering Hive partition location for table catalog.db.tbl, partition dt=2024-01-01, location s3://bkt/db/tbl/dt=2024-01-01/.amoro_xact_41 -> s3://bkt/db/tbl/dt=2024-01-01/.amoro_xact_42

Per-batch (transaction commit path, UpdateHiveFiles):

INFO  o.a.a.hive.op.UpdateHiveFiles - Creating 3 Hive partitions for table catalog.db.tbl, txId 42, partitions [Partition(values: [2024-01-01], location: s3://bkt/.../.amoro_xact_42), Partition(values: [2024-01-02], location: s3://bkt/.../.amoro_xact_42), Partition(values: [2024-01-03], location: s3://bkt/.../.amoro_xact_42)]
INFO  o.a.a.hive.op.UpdateHiveFiles - Dropping 1 Hive partitions for table catalog.db.tbl, txId 42, partitions [Partition(values: [2024-01-01], location: s3://bkt/.../.amoro_xact_41)]

Failure path:

WARN  o.a.a.h.utils.HivePartitionUtil - Failed to alter Hive partition location for table catalog.db.tbl, partition dt=2024-01-01, location s3://bkt/.../.amoro_xact_41 -> s3://bkt/.../.amoro_xact_42
    org.apache.thrift.TException: ...
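
The INFO-then-WARN pairing shown in these samples could look roughly like the sketch below. It is an illustration under assumptions, not the PR's implementation: `AlterLocationLogging`, `describe`, and `alterWithLogging` are hypothetical names, `java.util.logging` is used only to keep the example self-contained (the PR's `LOG.info` / `LOG.warn` calls suggest a different logging API), and rethrowing the exception after logging is an assumption, since the description does not say how failures are propagated.

```java
import java.util.concurrent.Callable;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical wrapper showing one INFO line before the call and one WARN line on failure. */
final class AlterLocationLogging {
  private AlterLocationLogging() {}

  /** Shared context so the INFO and WARN lines identify the same partition and locations. */
  static String describe(String table, String partition, String oldLocation, String newLocation) {
    return String.format(
        "table %s, partition %s, location %s -> %s",
        table, partition, oldLocation, newLocation);
  }

  /** Runs alterCall, logging the attempt at INFO and any failure at WARNING with its stack trace. */
  static <T> T alterWithLogging(
      Logger log,
      String table,
      String partition,
      String oldLocation,
      String newLocation,
      Callable<T> alterCall) throws Exception {
    String what = describe(table, partition, oldLocation, newLocation);
    log.info("Altering Hive partition location for " + what);
    try {
      return alterCall.call();
    } catch (Exception e) {
      // Assumption: the failure is logged and then rethrown unchanged.
      log.log(Level.WARNING, "Failed to alter Hive partition location for " + what, e);
      throw e;
    }
  }
}
```

Building the context string once means the failure line carries the same table, partition, and old and new locations as the attempt line, so the two can be matched in the logs.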

Tests

No new unit tests; existing tests in the module continue to pass:

./mvnw test -pl amoro-format-mixed/amoro-mixed-hive \
  -Dtest=TestRewritePartitions,TestOverwriteFiles,TestSyncHiveMeta
# Tests run: 76, Failures: 0, Errors: 0, Skipped: 18

mvn spotless:check and mvn checkstyle:check are clean for the
touched module.

Add INFO/WARN logging on the Mixed Hive Table partition operation paths
(create / drop / alter location) so that issues like missing or
mis-located partitions can be diagnosed from the logs.

Logging added at two layers:

* HivePartitionUtil: per-partition LOG.info on create, drop, and
  update-location paths, plus matching LOG.warn on failures. The
  alterPartition log now reports both old and new locations.
* UpdateHiveFiles / ReplaceHivePartitions: per-batch LOG.info for the
  drop / create / alter sets emitted by commitPartitionedTable, with a
  count and a sampled list of partitions to avoid log explosion on
  large commits, plus matching LOG.warn on commit failures.

Each log includes the table identifier, partition values, location
(old -> new for alter), and transaction id where available.
@lintingbin
Contributor Author

Hi @zhoujinsong, could you help take a look at this PR when you have time? Thanks!

Quick note on CI: the Core/hadoop3 job (Spark-3.5) hit failures in TestInternalMixedCatalogService (TestDatabaseOperation.test:168, TestTableOperation.before:186, TestTableCommit.testTableCommit:295), but these are unrelated to this PR's diff (which only touches amoro-format-mixed/amoro-mixed-hive). They're a pre-existing test-isolation flake — master's CI on 2026-04-23 hit the same suite. I've opened #4201 to fix that flake at its root, so once that lands a CI re-run here should be clean.

Labels

module:mixed-hive Hive module for Mixed Format

Development

Successfully merging this pull request may close these issues.

[Improvement]: Add some info logs for Mixed Hive Table when create/drop partitions or alter partition location