[WIP][HUDI-1180] Upgrade HBase to 2.4.9 #4020
try {
fileInfo = reader.loadFileInfo();
ByteBuffer serializedFilter = reader.getMetaBlock(KEY_BLOOM_FILTER_META_BLOCK, false);
fileInfo = reader.getHFileInfo();
A little worried: this method does not exist even in HBase 2.2.3.
6e7f204 to 9dc39a8
ca86e9d to 7c4b6d9
7c4b6d9 to 548c193
@danny0405 Could you please help in testing this patch for Flink?
ReaderContext context = new ReaderContextBuilder()
.withFilePath(path)
.withInputStreamWrapper(stream)
.withFileSize(getFs("hoodie", conf).getFileStatus(path).getLen())
This is a fake path, so getFileStatus() will fail. Could the file size be content.length?
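The suggestion here can be illustrated with a minimal, standard-library-only sketch. It is an analogy, not the PR's code: `InMemoryFileSize`, `fromContent`, `fromPath`, and the placeholder path are hypothetical names, and `java.nio.file.Files.size` stands in for Hadoop's `getFileStatus(path).getLen()`. It shows that a size taken from the in-memory buffer needs no filesystem lookup, whereas a metadata lookup on a path that does not exist throws.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration only; not part of Hudi or HBase.
final class InMemoryFileSize {

  // Size taken from the buffer itself, so no filesystem access is needed.
  static long fromContent(byte[] content) {
    return content.length;
  }

  // Size taken from filesystem metadata; throws if the path does not exist.
  // Stand-in (assumption) for a getFileStatus(path).getLen() style lookup.
  static long fromPath(Path path) throws IOException {
    return Files.size(path);
  }

  public static void main(String[] args) {
    byte[] content = "hfile-bytes".getBytes(StandardCharsets.UTF_8);
    // Placeholder path that is assumed not to exist on disk.
    Path placeholder = Paths.get("no-such-dir-for-demo", "inline.hfile");

    System.out.println("size from content: " + fromContent(content));
    try {
      fromPath(placeholder);
    } catch (IOException e) {
      System.out.println("metadata lookup failed: " + e.getClass().getSimpleName());
    }
  }
}
```

In the reader under review, the equivalent change would presumably be to pass the byte array's length to `withFileSize(...)`, if the content is already held in memory at that point.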
.build();
HFileInfo fileInfo = new HFileInfo(context, conf);
this.reader = HFile.createReader(context, fileInfo, new CacheConfig(conf), conf);
fileInfo.initMetaAndIndex(reader);
What is the use of this explicit call? Is this required or some optimization?
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-hadoop-compat</artifactId>
<version>${hbase.version}</version>
</dependency>
Too many new jar dependencies, and I'm worried about conflicts. We must explicitly exclude conflict-prone jars such as Google Guava if they come in as indirect dependencies.
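An explicit exclusion of this kind can be written with Maven's `<exclusions>` element. The fragment below is a sketch under one assumption: that Guava arrives transitively through `hbase-hadoop-compat`. That pairing is chosen only for illustration; `mvn dependency:tree` shows which artifact actually pulls it in.

```xml
<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-hadoop-compat</artifactId>
  <version>${hbase.version}</version>
  <exclusions>
    <!-- Assumption: Guava is a transitive dependency of this artifact. -->
    <exclusion>
      <groupId>com.google.guava</groupId>
      <artifactId>guava</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Excluding a transitive jar only helps if another dependency still supplies a compatible version at runtime, so each exclusion needs to be verified against the bundle it ends up in.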
a23ba86 to 790d2ba
@hudi-bot run azure
return readAllRecords(schema, schema);
}

public List<Pair<String, R>> readRecords(List<String> keys) throws IOException {
reader.loadFileInfo();
Schema schema = new Schema.Parser().parse(new String(reader.loadFileInfo().get(KEY_SCHEMA.getBytes())));
reader.getHFileInfo();
Why are we doing this here?
I've been trying to fix the IT tests after the HBase upgrade and kept hitting class conflicts between our HBase deps and the Hadoop 2.x deps (there are non-backward-compatible changes). I tried shading our HBase classes, but that didn't go well, so I decided to try upgrading Hadoop to the 3.3.x branch. It's WIP; you can track progress in PR #4286.
790d2ba to b8a7bfa
I'm going to take over this PR and get it ready for review.
Fix some unit tests
Resolve dependency issue
Upgrade Hadoop to 2.10.1 and fix HFile inline reader test
Separate hbase shaded version for presto bundle
Resolve hbase dep conflicts in flink, utilities and hadoop-mr bundles
Disable access time validation
b8a7bfa to c904e8f
I'm closing this in favor of #5004, which has more changes and has diverged from this PR, which now conflicts with master.
What is the purpose of the pull request
Upgrade HBase version from 1.x to 2.x.
Brief change log
Verify this pull request
(Please pick either of the following options)
This pull request is a trivial rework / code cleanup without any test coverage.
(or)
This pull request is already covered by existing tests, such as (please describe tests).
(or)
This change added tests and can be verified as follows:
(example:)
Committer checklist
Has a corresponding JIRA in PR title & commit
Commit message is descriptive of the change
CI is green
Necessary doc changes done or have another open PR
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.