Skip to content

feat: Enhance jsonExtractKey function with depth control and dot notation support#16306

Merged
xiangfu0 merged 3 commits intoapache:masterfrom
xiangfu0:json-keys
Jul 21, 2025
Merged

feat: Enhance jsonExtractKey function with depth control and dot notation support#16306
xiangfu0 merged 3 commits intoapache:masterfrom
xiangfu0:json-keys

Conversation

@xiangfu0
Copy link
Contributor

@xiangfu0 xiangfu0 commented Jul 9, 2025

This PR significantly enhances the jsonExtractKey function by adding advanced parameter support for depth control and output format customization, making JSON key extraction more flexible and performant.

🚀 New Features

Enhanced jsonExtractKey Function

  • New overload: jsonExtractKey(Object jsonObj, String jsonPath, String paramString)
  • Parameter-based configuration for advanced options using semicolon-delimited key-value pairs

Advanced Parameters

  • MAXDEPTH: Controls maximum extraction depth to prevent overly deep traversals
    • Example: jsonExtractKey(json, '$..**', 'MAXDEPTH=2') - Extract keys up to 2 levels deep
    • Default: Unlimited depth (Integer.MAX_VALUE)
    • Non-positive values treated as unlimited
  • DOTNOTATION: Toggle between JsonPath and dot notation output formats
    • Example: jsonExtractKey(json, '$..**', 'DOTNOTATION=true')
    • JsonPath format: $['a']['b']['c']
    • Dot notation: a.b.c
    • Default: false (JsonPath format)

Enhanced Recursive Extraction

  • Special handling for recursive patterns ($..**, $.., empty string)
  • Intelligent depth traversal for both JSON objects and arrays
  • Format-aware output based on DOTNOTATION parameter

📝 Usage Examples

-- Basic key extraction (existing functionality)
SELECT jsonExtractKey(complexMapStr, '$.*') FROM table

-- Extract keys with depth limit
SELECT jsonExtractKey(complexMapStr, '$..**', 'MAXDEPTH=2') FROM table

-- Extract keys in dot notation format
SELECT jsonExtractKey(complexMapStr, '$..**', 'DOTNOTATION=true') FROM table

-- Combined parameters
SELECT jsonExtractKey(complexMapStr, '$..**', 'MAXDEPTH=3;DOTNOTATION=true') FROM table

🧪 Comprehensive Testing

New Test Coverage

  • Unit tests: JsonFunctionsUtilTest.java (124+ lines) - Parameter parsing, depth calculation, format conversion
  • Function tests: Enhanced JsonFunctionsTest.java (262+ lines) - All parameter combinations
  • Transform tests: JsonExtractScalarTransformFunctionTest.java (59+ lines) - Transform function integration
  • Integration tests: JsonPathTest.java (226+ lines) - End-to-end scenarios with real data

Test Scenarios

  • Parameter validation and edge cases
  • Depth filtering with various JSON structures
  • Format conversion accuracy
  • Recursive extraction performance
  • Error handling and graceful degradation

🎯 Benefits

  • Performance: Depth limits prevent excessive traversal of deep JSON structures
  • Usability: Dot notation provides cleaner, more readable key paths
  • Flexibility: Parameter system allows for future extensibility
  • Robustness: Enhanced error handling and validation
  • Backward Compatibility: Existing jsonExtractKey calls remain unchanged

@xiangfu0 xiangfu0 changed the title Add jsonKeys transform function: scalar, transform, type, and registration Add jsonKeys transform function for extracting all JSON key paths up to a given depth Jul 9, 2025
@xiangfu0 xiangfu0 requested review from Jackie-Jiang and Copilot July 9, 2025 02:51

This comment was marked as outdated.

@codecov-commenter
Copy link

codecov-commenter commented Jul 9, 2025

Codecov Report

Attention: Patch coverage is 73.20261% with 41 lines in your changes missing coverage. Please review.

Project coverage is 56.35%. Comparing base (1a476de) to head (5662a7d).
Report is 462 commits behind head on master.

Files with missing lines Patch % Lines
...he/pinot/common/function/scalar/JsonFunctions.java 75.59% 16 Missing and 15 partials ⚠️
...form/function/JsonExtractKeyTransformFunction.java 65.00% 4 Missing and 3 partials ⚠️
...java/org/apache/pinot/sql/parsers/ParserUtils.java 33.33% 1 Missing and 1 partial ⚠️
...e/pinot/common/function/TransformFunctionType.java 66.66% 0 Missing and 1 partial ⚠️

❗ There is a different number of reports uploaded between BASE (1a476de) and HEAD (5662a7d). Click for more details.

HEAD has 68 uploads less than BASE
Flag BASE (1a476de) HEAD (5662a7d)
java-21 10 1
unittests 6 1
unittests1 3 1
skip-bytebuffers-true 4 0
temurin 15 1
unittests2 3 0
skip-bytebuffers-false 8 0
java-11 5 0
integration 9 0
integration2 3 0
integration1 3 0
custom-integration1 3 0
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #16306      +/-   ##
============================================
- Coverage     62.90%   56.35%   -6.55%     
+ Complexity     1386      699     -687     
============================================
  Files          2867     2320     -547     
  Lines        163354   128414   -34940     
  Branches      24952    20473    -4479     
============================================
- Hits         102755    72373   -30382     
+ Misses        52847    50049    -2798     
+ Partials       7752     5992    -1760     
Flag Coverage Δ
custom-integration1 ?
integration ?
integration1 ?
integration2 ?
java-11 ?
java-21 56.35% <73.20%> (-6.47%) ⬇️
skip-bytebuffers-false ?
skip-bytebuffers-true ?
temurin 56.35% <73.20%> (-6.55%) ⬇️
unittests 56.35% <73.20%> (-6.55%) ⬇️
unittests1 56.35% <73.20%> (+0.53%) ⬆️
unittests2 ?

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@xiangfu0 xiangfu0 force-pushed the json-keys branch 6 times, most recently from 09481e3 to a9e9fd4 Compare July 16, 2025 03:51
@xiangfu0 xiangfu0 requested review from Jackie-Jiang and Copilot July 16, 2025 06:47

This comment was marked as outdated.

@xiangfu0 xiangfu0 changed the title Add jsonKeys transform function for extracting all JSON key paths up to a given depth Add jsonExtractKey function for extracting JSON object keys using JsonPath Jul 16, 2025
@xiangfu0 xiangfu0 force-pushed the json-keys branch 3 times, most recently from 40acef2 to c6a6b69 Compare July 16, 2025 23:21
@xiangfu0 xiangfu0 requested review from Jackie-Jiang and Copilot July 16, 2025 23:57

This comment was marked as outdated.

@xiangfu0 xiangfu0 changed the title Add jsonExtractKey function for extracting JSON object keys using JsonPath feat: Enhance jsonExtractKey function with depth control and dot notation support Jul 17, 2025
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR extends the jsonExtractKey function to support optional parameters for maximum recursion depth and output format (dot notation vs. JsonPath), and adds comprehensive tests to cover these new behaviors.

  • Added maxDepth and dotNotation parsing in JsonExtractKeyTransformFunction
  • Updated JsonFunctions to provide overloaded methods and recursive extraction logic
  • Expanded unit and integration tests for all new parameter combinations

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
pinot-integration-tests/src/test/java/org/apache/pinot/integration/tests/custom/JsonPathTest.java New integration tests for depth and dot-notation options
pinot-core/src/test/java/org/apache/pinot/core/operator/transform/function/JsonExtractScalarTransformFunctionTest.java Added transform-function tests for jsonExtractKey overloads
pinot-core/src/main/java/org/apache/pinot/core/operator/transform/function/JsonExtractKeyTransformFunction.java Parse and apply optional parameters (maxDepth, dotNotation)
pinot-common/src/test/java/org/apache/pinot/common/function/scalar/JsonFunctionsUtilTest.java New tests for utility methods including edge cases
pinot-common/src/test/java/org/apache/pinot/common/function/JsonFunctionsTest.java Extensive tests for jsonExtractKey API variations
pinot-common/src/main/java/org/apache/pinot/sql/parsers/ParserUtils.java Updated SQL-operand validation for optional parameters
pinot-common/src/main/java/org/apache/pinot/common/function/scalar/JsonFunctions.java Implemented overloaded jsonExtractKey with parameter parsing, recursive extraction, and dot-notation conversion
pinot-common/src/main/java/org/apache/pinot/common/function/TransformFunctionType.java Extended operator metadata to allow 2–3 arguments
Comments suppressed due to low confidence (2)

pinot-common/src/main/java/org/apache/pinot/common/function/scalar/JsonFunctions.java:376

  • [nitpick] The PR description says non-positive maxDepth is unlimited, but zero currently yields an empty result. Clarify this behavior in the javadoc or update the code to treat zero as unlimited.
    if (params._maxDepth == 0) {

pinot-common/src/test/java/org/apache/pinot/common/function/scalar/JsonFunctionsUtilTest.java:23

  • The test uses assertNull(...) but assertNull is not statically imported or qualified. Add import static org.testng.Assert.assertNull; or prefix calls with Assert..
import static org.testng.Assert.assertEquals;

@xiangfu0 xiangfu0 force-pushed the json-keys branch 2 times, most recently from e88b12b to 9e7156c Compare July 18, 2025 23:18
@xiangfu0 xiangfu0 merged commit 08903aa into apache:master Jul 21, 2025
34 of 36 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants