HIVE-21079: Stats replication for partitioned table #522
Closed
…ables. Ashutosh Bapat
…lication. In AlterPartitionHandler, we set withinContext.replicationSpec.setIsMetadataOnly(true). In ImportSemanticAnalyzer.createReplImportTasks(), per the code around line 1197, we then do not add new PartitionSpecs and the corresponding tasks. This means that we never apply an ALTER_PARTITION event during incremental load, which looks like a serious bug. Either we should check PartitionDescs irrespective of replicationSpec.setIsMetadataOnly(), or we shouldn't set replicationSpec.setIsMetadataOnly() to true while dumping an ALTER_PARTITION event. We set replicationSpec.setIsMetadataOnly(true) for ALTER TABLE events as well, so doing that for an ALTER PARTITION event looks fine. Ashutosh Bapat.
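To make the reported behaviour concrete, here is a minimal, self-contained sketch. It is not the Hive code: ReplSpecSketch, ImportPlannerSketch and the task strings are hypothetical stand-ins, and the only thing modelled is the rule described above (partition tasks are planned only when the spec is not metadata-only) next to the first proposed option (check the partitions regardless of the flag).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the replication spec; not the real Hive class.
final class ReplSpecSketch {
    private boolean metadataOnly;

    void setIsMetadataOnly(boolean value) { metadataOnly = value; }
    boolean isMetadataOnly() { return metadataOnly; }
}

// Hypothetical stand-in for the import-side planning step.
final class ImportPlannerSketch {
    // Models the reported behaviour: partition work is planned only when
    // the spec is NOT metadata-only, so an ALTER_PARTITION event dumped
    // with the flag set produces no partition task at all.
    static List<String> planSkippingPartitions(ReplSpecSketch spec, List<String> partitions) {
        List<String> tasks = new ArrayList<>();
        tasks.add("TABLE_METADATA");
        if (!spec.isMetadataOnly()) {
            for (String p : partitions) {
                tasks.add("PARTITION:" + p);
            }
        }
        return tasks;
    }

    // Models the first proposed option: look at the partitions regardless
    // of the metadata-only flag (the spec is deliberately not consulted).
    static List<String> planCheckingPartitions(ReplSpecSketch spec, List<String> partitions) {
        List<String> tasks = new ArrayList<>();
        tasks.add("TABLE_METADATA");
        for (String p : partitions) {
            tasks.add("PARTITION:" + p);
        }
        return tasks;
    }
}
```

With the flag set to true and one partition passed in, the first method plans only the table-level task, while the second also plans a task for the partition.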
…ly dump. During a bootstrap metadata-only dump we do not dump partitions (see TableExport.getPartitions(); for a bootstrap dump we always pass a TableSpec with TABLE_ONLY set). So don't dump partition-related events for a metadata-only dump. Ashutosh Bapat.
ashutosh-bapat changed the title from "Hive21079: Stats replication for partitioned table" to "HIVE-21079: Stats replication for partitioned table" on Jan 24, 2019
sankarh requested changes on Jan 25, 2019, with review comments on:
...s/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestStatsReplicationScenarios.java
...s/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestStatsReplicationScenarios.java
ql/src/java/org/apache/hadoop/hive/ql/exec/repl/bootstrap/events/filesystem/FSTableEvent.java (Outdated)
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/AlterPartitionHandler.java
...metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
...metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java (Outdated)
...metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
ql/src/java/org/apache/hadoop/hive/ql/parse/repl/dump/events/UpdatePartColStatHandler.java
sankarh reviewed on Jan 29, 2019: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift (Outdated)
sankarh reviewed on Jan 29, 2019: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift (Outdated)
sankarh reviewed on Jan 29, 2019: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift (Outdated)
Do not change the get_partitions_by_names metastore API. Instead, add another function, get_partitions_by_names_req, which accepts a GetPartitionsByNamesRequest argument and returns a GetPartitionsByNamesResult. Get or update partition statistics in the same transaction in which the partition was obtained or added, respectively. We do not dump partitions in a metadata-only dump, and hence we shouldn't dump DROP PARTITION events either. Some other cosmetic comments. Ashutosh Bapat.
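The API change described above follows a common compatibility pattern: leave the existing call alone and add a sibling that takes a request object and returns a result object. A rough Java sketch of the idea follows. Only the two wrapper type names come from the commit message; MetastoreSketch, the withColStats flag, all fields, and the use of plain strings for partitions are assumptions made for illustration, not the Thrift definitions from the patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Wrapper type named in the commit message; every field here is an assumption.
final class GetPartitionsByNamesRequest {
    final String dbName;
    final String tblName;
    final List<String> names;
    final boolean withColStats; // hypothetical option

    GetPartitionsByNamesRequest(String dbName, String tblName,
                                List<String> names, boolean withColStats) {
        this.dbName = dbName;
        this.tblName = tblName;
        this.names = names;
        this.withColStats = withColStats;
    }
}

// Wrapper type named in the commit message; partitions are plain strings here.
final class GetPartitionsByNamesResult {
    final List<String> partitions;

    GetPartitionsByNamesResult(List<String> partitions) {
        this.partitions = partitions;
    }
}

// Hypothetical in-memory stand-in for the metastore.
final class MetastoreSketch {
    private final Map<String, String> colStatsByPartition = new HashMap<>();

    void putStats(String partitionName, String stats) {
        colStatsByPartition.put(partitionName, stats);
    }

    // Existing call: signature and behaviour stay exactly as they were.
    List<String> getPartitionsByNames(String db, String tbl, List<String> names) {
        return new ArrayList<>(names);
    }

    // New call: options travel in the request object, so later additions
    // do not require another signature change. Stats are looked up in the
    // same call that fetches the partition.
    GetPartitionsByNamesResult getPartitionsByNamesReq(GetPartitionsByNamesRequest req) {
        List<String> out = new ArrayList<>();
        for (String name : req.names) {
            String stats = req.withColStats ? colStatsByPartition.get(name) : null;
            out.add(stats == null ? name : name + " [" + stats + "]");
        }
        return new GetPartitionsByNamesResult(out);
    }
}
```

Existing callers keep using getPartitionsByNames unchanged; only callers that need statistics switch to the request-based variant.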
ashutosh-bapat force-pushed the hive21079 branch from 15a6928 to 75a1319 on January 29, 2019 10:04
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
The first commit is for stats replication for partitioned table. The other two commits are fixing bugs in existing code, AFAIU.
@sankarh can you please review?