#13049 Refactored RecordTransformer & merged RecordEnricher #13086
deepthi912 wants to merge 14 commits into apache:master
Conversation
…moved RecordEnricher as it serves similar purpose as RecordTransformer.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #13086 +/- ##
============================================
+ Coverage 61.75% 62.16% +0.40%
+ Complexity 207 198 -9
============================================
Files 2436 2530 +94
Lines 133233 139028 +5795
Branches 20636 21544 +908
============================================
+ Hits 82274 86420 +4146
- Misses 44911 46136 +1225
- Partials 6048 6472 +424
Jackie-Jiang left a comment
We should be able to delete the record enricher related interfaces and general handling classes (e.g. factory, pipeline, registry) after merging it into record transformer package. Can you take a look and see if it is possible to further clean up the code?
|
Sure, I wanted to do that too. Will look and see where I can improve further |
|
@Jackie-Jiang I noticed that the registry and pipeline classes have static methods that are used by some other classes. These static methods call RecordEnricherFactory, which is in turn implemented by the function and CLP enricher classes. I looked for a place to fold them into the transformer code but could not find a good spot. I'm not sure that just removing RecordEnricher will make a big difference to the refactoring, though. |
|
Basically the idea is to re-do #12243 and integrate CLP transform as a |
…rom RecordEnricher package
|
@Jackie-Jiang PR is ready for review. |
| private Schema _schema; | ||
| private Set<String> _fieldsToRead; | ||
| private RecordEnricherPipeline _recordEnricherPipeline; | ||
| private RecordTransformer _recordEnricherPipeline; |
After changing RecordEnricher into RecordTransformer, we should be able to put the RecordEnricher in front of the existing RecordTransformers as part of the default CompositeTransformer
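A minimal sketch of the ordering suggested here, assuming enrichers become ordinary transformers and are simply placed at the head of the composite's list. `SketchTransformer`, `SketchCompositeTransformer`, and `CompositeSketchDemo` are hypothetical stand-ins written for illustration, not Pinot's actual `RecordTransformer` or `CompositeTransformer`, and rows are plain maps rather than `GenericRow`:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a record transformer; not the Pinot interface.
interface SketchTransformer {
  Map<String, Object> transform(Map<String, Object> row);
}

// Runs enrichers first, then the default transformers, in list order.
// Assumption: a null result means the row is dropped and the chain stops.
final class SketchCompositeTransformer implements SketchTransformer {
  private final List<SketchTransformer> _transformers;

  SketchCompositeTransformer(List<SketchTransformer> enrichers, List<SketchTransformer> defaults) {
    _transformers = new ArrayList<>(enrichers);
    _transformers.addAll(defaults);
  }

  @Override
  public Map<String, Object> transform(Map<String, Object> row) {
    for (SketchTransformer transformer : _transformers) {
      if (row == null) {
        return null;
      }
      row = transformer.transform(row);
    }
    return row;
  }
}

final class CompositeSketchDemo {
  static Map<String, Object> run() {
    // Enricher: derives a new column from two source columns.
    SketchTransformer enricher = row -> {
      Map<String, Object> out = new HashMap<>(row);
      out.put("fullName", row.get("firstName") + " " + row.get("lastName"));
      return out;
    };
    // Default transformer: depends on the column the enricher produced.
    SketchTransformer upperCase = row -> {
      Map<String, Object> out = new HashMap<>(row);
      out.put("fullName", ((String) row.get("fullName")).toUpperCase());
      return out;
    };
    Map<String, Object> input = new HashMap<>();
    input.put("firstName", "Ada");
    input.put("lastName", "Lovelace");
    return new SketchCompositeTransformer(
        Collections.singletonList(enricher), Collections.singletonList(upperCase)).transform(input);
  }
}
```

With a single ordered list, callers hold one transformer instead of a separate enricher pipeline plus a transformer, which is the duplication the review comments point at. The second transformer in the demo would fail if it ran before the enricher, which is why the enrichers go in front.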
| import org.slf4j.LoggerFactory; | ||
| public class RecordEnricherRegistry { |
I think we should be able to remove this registry and follow the existing way of creating RecordTransformer. Same for other places where 2 pipelines exist
| private final PartitionGroupConsumptionStatus _partitionGroupConsumptionStatus; | ||
| final String _clientId; | ||
| private final RecordEnricherPipeline _recordEnricherPipeline; | ||
| private final RecordTransformer _recordEnricherPipeline; |
Same here, we should be able to integrate it into the TransformPipeline
Got it. I will take a look at TransformPipeline. I didn't know this class existed. I will update the code to use it.
| * The record transformer which takes a {@link GenericRow} and transform it based on some custom rules. | ||
| */ | ||
| public interface RecordTransformer extends Serializable { | ||
| final boolean GROOVY_DISABLED = false; |
Suggest keeping the interface simple. We need to add getInputColumns() from RecordEnricher; the other methods seem unnecessary. Take a look at CompositeTransformer for record transformer handling
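A rough sketch of what a slimmed-down interface could look like, assuming `getInputColumns()` is added as a default method that returns an empty collection. The return type, the default body, and the names `SimpleRecordTransformer` and `InputColumnsSketch` are assumptions made for illustration; only the method name `getInputColumns()` comes from the comment above:

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified interface; not the actual Pinot RecordTransformer.
interface SimpleRecordTransformer {
  // Source columns this transformer reads. Assumption: empty by default,
  // so existing transformers would not need to change.
  default Collection<String> getInputColumns() {
    return Collections.emptyList();
  }

  Map<String, Object> transform(Map<String, Object> row);
}

final class InputColumnsSketch {
  // Union of the input columns of all transformers, in first-seen order.
  static Set<String> fieldsToRead(List<SimpleRecordTransformer> transformers) {
    Set<String> fields = new LinkedHashSet<>();
    for (SimpleRecordTransformer transformer : transformers) {
      fields.addAll(transformer.getInputColumns());
    }
    return fields;
  }

  static Set<String> demo() {
    // An enricher-style transformer that declares what it reads.
    SimpleRecordTransformer concat = new SimpleRecordTransformer() {
      @Override
      public Collection<String> getInputColumns() {
        return Arrays.asList("firstName", "lastName");
      }

      @Override
      public Map<String, Object> transform(Map<String, Object> row) {
        row.put("fullName", row.get("firstName") + " " + row.get("lastName"));
        return row;
      }
    };
    // A transformer that keeps the default (no declared input columns).
    SimpleRecordTransformer passThrough = row -> row;
    return fieldsToRead(Arrays.asList(concat, passThrough));
  }
}
```

Keeping a single abstract method means simple transformers can still be written as lambdas, while a caller that needs to know which source fields to extract can collect them from the transformer list instead of asking a separate enricher pipeline.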
| private long _totalStatsCollectorTime = 0; | ||
| private boolean _continueOnError; | ||
| public static void persistCreationMeta(File indexDir, long crc, long creationTime) |
These changes are not related to this PR. I guess you might have enabled the Rearrange code option in your IDE during auto-reformat. Can you revert these unrelated changes? There are quite a few unrelated auto-reformat changes in several files
Yeah. I will look into this formatting issue.
|
Merge #14601 instead |