Skip to content

Deduplicate ExpressionTransformer and CustomFunctionEnricher function evaluation#18321

Open
rsrkpatwari1234 wants to merge 12 commits intoapache:masterfrom
rsrkpatwari1234:rsrkpatwari1234-18216
Open

Deduplicate ExpressionTransformer and CustomFunctionEnricher function evaluation#18321
rsrkpatwari1234 wants to merge 12 commits intoapache:masterfrom
rsrkpatwari1234:rsrkpatwari1234-18216

Conversation

@rsrkpatwari1234
Copy link
Copy Markdown
Contributor

@rsrkpatwari1234 rsrkpatwari1234 commented Apr 24, 2026

Fixes #18216

Motivation

ExpressionTransformer and CustomFunctionEnricher both apply transform-style FunctionEvaluators to a GenericRow. Duplicating per-column evaluation and write behavior in two places makes drift and subtle bugs more likely.

What changed

  • Introduced IngestionFunctionEvaluation as the single place for per-row evaluator application.
    -- One loop: applyFunctionEvaluations(record, evaluators, policy) iterates the Map<String, FunctionEvaluator> once; behavior is selected by IngestionFunctionEvaluation.Policy:
    -- Policy.expressionTransformation(...) — same semantics as before for the record-transformer chain: null / overwrite / implicit MAP-derived columns, collection-or-map backward-compat path, continueOnError with throttled warning and incomplete row marking, and applyTransformedValue null handling.
    -- Policy.enricher() — JSON enricher path: always putValue(column, evaluate(record)) in map order (exceptions propagate), matching prior CustomFunctionEnricher behavior.
  • ExpressionTransformer: transform delegates to applyFunctionEvaluations with Policy.expressionTransformation; constructor, topological ordering, and implicit-MAP discovery are unchanged.
  • CustomFunctionEnricher: enrich delegates to applyFunctionEvaluations with Policy.enricher(); Javadoc links to the shared helper and ExpressionTransformer for table/schema-driven transforms.

Testing

Added IngestionFunctionEvaluationTest to cover unit tests

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 24, 2026

Codecov Report

❌ Patch coverage is 72.88136% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.48%. Comparing base (ab9ac39) to head (cf995e7).
⚠️ Report is 61 commits behind head on master.

Files with missing lines Patch % Lines
...recordtransformer/IngestionFunctionEvaluation.java 74.54% 5 Missing and 9 partials ⚠️
...rmer/enricher/function/CustomFunctionEnricher.java 0.00% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #18321      +/-   ##
============================================
- Coverage     63.61%   63.48%   -0.14%     
- Complexity     1659     1701      +42     
============================================
  Files          3246     3255       +9     
  Lines        197514   199131    +1617     
  Branches      30578    30835     +257     
============================================
+ Hits         125656   126409     +753     
- Misses        61813    62641     +828     
- Partials      10045    10081      +36     
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-11 ?
java-21 63.48% <72.88%> (-0.11%) ⬇️
temurin 63.48% <72.88%> (-0.14%) ⬇️
unittests 63.47% <72.88%> (-0.14%) ⬇️
unittests1 55.41% <47.45%> (-0.18%) ⬇️
unittests2 35.01% <72.88%> (-0.05%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@xiangfu0
Copy link
Copy Markdown
Contributor

xiangfu0 commented Apr 30, 2026

The idea is to use one function evaluator to process two sides to consolidate & simplify the code.
Just moving the functions around and wrap it is not the purpose

@rsrkpatwari1234
Copy link
Copy Markdown
Contributor Author

The idea is to use one function evaluator to process two sides to consolidate & simplify the code. Just moving the functions around and wrap it is not the purpose

Thanks for the clarification.

Addressed this by one shared entry point, IngestionFunctionEvaluation.applyFunctionEvaluations(record, evaluators, policy), with a single for over Map<String, FunctionEvaluator>.

Policy.enricher() vs Policy.expressionTransformation(...) only switches how each (column, evaluator) is applied (always putValue(evaluate) for the JSON enricher vs. null/overwrite/implicit-MAP/continueOnError/typed collection rules for ExpressionTransformer).

ExpressionTransformer and CustomFunctionEnricher both call that same method with the appropriate policy

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Refactor duplicate function-evaluation paths in ExpressionTransformer and CustomFunctionEnricher

3 participants