[HUDI-2277] HoodieDeltaStreamer reading ORC files directly using ORCDFSSource #3413
xushiyan merged 16 commits into apache:master
@hudi-bot run travis
vinothchandar left a comment
Just some small comments. We can land once they are addressed.
}
}

public static void addAvroRecord(
Can this sit somewhere in a test utils class, given it's only used by a test?
Sure, done.
}

@Test
public void testORCDFSSourceWithSourceSchemaFileAndNoTransformer() throws Exception {
For the sake of test runtime, could we test just 1-2 combos here? Most of the testing done for the row source (ParquetDFS) should already cover this.
Okay, done. Thanks a lot for your review :)
Hi @vinothchandar, sorry to bother you. Since this patch has passed all UT/IT, could you please take a look at your convenience? Thanks a lot!
PROPS_FILENAME_TEST_ORC, ORC_SOURCE_ROOT, false);
}

private void prepareORCDFSSource(boolean useSchemaProvider, boolean hasTransformer, String sourceSchemaFile, String targetSchemaFile,
Wonder if we can simplify this test?
Sure thing, done. Thanks a lot for your review.
@hudi-bot run azure
@zhangyue19921010: do ping me here once you have addressed all comments. I can take a look.
Hi @nsivabalan, thanks a lot for your review. I finished all the changes. PTAL :)
Hi @nsivabalan @vinothchandar. Thanks a lot for your attention, review, and approval! Could we land it, or is there anything else I need to do? :)
xushiyan left a comment
LGTM. Some minor code suggestions. Please go ahead and merge once they are fixed and Azure CI has passed. Thanks.
xushiyan left a comment
@zhangyue19921010 we're almost there :) Just 1 more nitpick. Thanks for making the feature.
Hi @xushiyan, thanks a lot for your attention and review. My bad for misunderstanding :) The code is changed and I'm waiting for CI to go green.
…FSSource (apache#3413)
* add ORCDFSSource to support reading orc file into hudi format && add UTs
* remove ununsed import
* simplify tes
* code review
* code review
* code review
* code review
* code review
* code review
Co-authored-by: yuezhang <yuezhang@freewheel.tv>
https://issues.apache.org/jira/projects/HUDI/issues/HUDI-2277
What is the purpose of the pull request
Develop a new Source named ORCDFSSource, which extends RowSource.
With it, HoodieDeltaStreamer can read ORC files directly.
Also add the necessary UTs, which were tested in our local environment.
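To make the idea of a file-based source concrete, here is a minimal, dependency-free sketch of how such a source might pick up new ORC files between runs. It is not the code from this PR: `OrcSourceSketch`, `SourceFile`, `Batch`, and this `fetchNextBatch` signature are hypothetical stand-ins, and both the modification-time checkpoint and the `.orc` suffix filter are assumptions made for illustration. In a real Spark-based source, the selected paths would be handed to an ORC reader such as Spark's `DataFrameReader.orc(String...)`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch only; none of these types are Hudi classes. */
public class OrcSourceSketch {

  /** A file visible to the source: its path and last-modified time in ms. */
  public static final class SourceFile {
    public final String path;
    public final long modifiedAt;

    public SourceFile(String path, long modifiedAt) {
      this.path = path;
      this.modifiedAt = modifiedAt;
    }
  }

  /** One fetch result: comma-joined paths (empty if nothing new) and the checkpoint to resume from. */
  public static final class Batch {
    public final Optional<String> paths;
    public final String checkpoint;

    public Batch(Optional<String> paths, String checkpoint) {
      this.paths = paths;
      this.checkpoint = checkpoint;
    }
  }

  /** Selects .orc files modified after the last checkpoint and advances the checkpoint. */
  public static Batch fetchNextBatch(List<SourceFile> files, Optional<String> lastCheckpoint) {
    long last = lastCheckpoint.map(Long::parseLong).orElse(Long.MIN_VALUE);
    long max = last;
    List<String> selected = new ArrayList<>();
    for (SourceFile f : files) {
      // Assumption: ORC files are recognised by suffix; a real source may filter differently.
      if (f.path.endsWith(".orc") && f.modifiedAt > last) {
        selected.add(f.path);
        max = Math.max(max, f.modifiedAt);
      }
    }
    if (selected.isEmpty()) {
      // Nothing new: report no data and keep the previous checkpoint (empty string if none).
      return new Batch(Optional.empty(), lastCheckpoint.orElse(""));
    }
    // A Spark-based source would now read these paths, e.g. sparkSession.read().orc(paths).
    return new Batch(Optional.of(String.join(",", selected)), String.valueOf(max));
  }

  public static void main(String[] args) {
    List<SourceFile> files = List.of(
        new SourceFile("/data/a.orc", 100L),
        new SourceFile("/data/b.orc", 200L),
        new SourceFile("/data/notes.txt", 300L));
    Batch first = fetchNextBatch(files, Optional.empty());
    System.out.println(first.paths.orElse("<none>") + " @ " + first.checkpoint);
    Batch second = fetchNextBatch(files, Optional.of(first.checkpoint));
    System.out.println(second.paths.orElse("<none>") + " @ " + second.checkpoint);
  }
}
```

The second call returns no paths because nothing changed after the first checkpoint, which is the behaviour one would want from an incremental source so that already-ingested files are not read twice.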
Committer checklist
Has a corresponding JIRA in PR title & commit
Commit message is descriptive of the change
CI is green
Necessary doc changes done or have another open PR
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.