-
Notifications
You must be signed in to change notification settings - Fork 4.8k
Hive17967 #270
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hive17967 #270
Conversation
|
|
||
| import java.util.List; | ||
|
|
||
| public class SerDeStorageSchemaReader implements StorageSchemaReader { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is this used ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HiveMetaStore.get_fields_with_environment_context(). Previously this method used the SerDe to read the fields from the SerDe parameters rather than the storage descriptor fields. However, standalone-metastore doesn't have access to the serdes. We haven't yet settled on a solution for this. There's a debate raging on HIVE-17714 on what the right way forward is. For now I've created the StorageSchemaReader interface to pull the serde dependency out. There is a default implementation that can be used by the metastore in standalone mode that just fails if its called. SerDeStorageSchemaReader is the implementation for use with Hive that works as before.
| @@ -1,4 +1,4 @@ | |||
| /** | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should remain part of hive. repl dump command is implemented in hive and the cleanup of that also should be in a thread whose impl is in hive.
I guess this falls under same bucket as compactor thread. What is the approach being followed for that ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now the compactor is in Hive because it's tightly entwined in ql, though I plan to move it someday (though Eugene doesn't agree with me on that yet :) ). The compactor really needs to move because without it you cannot operate on ACID files.
For this, I'm fine with leaving it in Hive. I've wondered in general if the repl stuff should come along, though parts already have because of its tight integration into HiveMetaStore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replication can be considered to have 2 parts - metastore event logging (DBNotificationListner) and the repl/dump load.
In my opinion, event logging belongs to metastore. The repl load/dump commands (they are sql commands, that generate query plans) should be in Hive. So associated code, such as cleanup of the repl dump dir etc should also be in Hive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. I can move this class back into metastore.
…rCleanerTask back to metastore. Took the opportunity to genericize how HiveMetaStore handles starting background threads.
54af42a to
d0a20ec
Compare
…es, reviewed by Thejas Nair). (cherry picked from commit 8fcc7f3) Change-Id: I79bd26840eafe792595ceaaa35abe82b15178dab
…es, reviewed by Thejas Nair).
See https://issues.apache.org/jira/browse/HIVE-17967?focusedCommentId=16236387&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16236387 for a few important comments on this patch.