Adding jsonExtractScalar function to extract field from json object #4597
Codecov Report
@@ Coverage Diff @@
## master #4597 +/- ##
==========================================
+ Coverage 66.08% 66.50% +0.42%
==========================================
Files 1072 1077 +5
Lines 54668 54978 +310
Branches 8152 8213 +61
==========================================
+ Hits 36125 36565 +440
+ Misses 15895 15717 -178
- Partials 2648 2696 +48
Can you also add documentation on how JSON columns are to be used? I suppose JSON columns cannot be used in filters (yet), right?
I will add more documentation. This can be used in selection, filtering, and group-by; please refer to the integration tests. The current implementation requires the column to be a string column whose content is a JSON string (e.g. the Avro field needs to be of String type). I'm still investigating how to ingest a record/map type directly into the field.
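To illustrate the string-column requirement described above, here is a minimal sketch of serializing a nested record/map into a JSON string before ingestion. It is written in Python with the standard library only and is not part of this PR; the helper name `to_string_column` and the field names `id` and `attributes` are hypothetical.

```python
import json


def to_string_column(record, json_field):
    """Return a copy of `record` with `json_field` serialized to a JSON string.

    Hypothetical helper: it only shows the idea of storing a map as a string
    so that a string column can hold it.
    """
    out = dict(record)
    out[json_field] = json.dumps(record[json_field], sort_keys=True)
    return out


# Hypothetical input record with a nested map.
record = {"id": 1, "attributes": {"color": "red", "size": 3}}
row = to_string_column(record, "attributes")
print(row["attributes"])
```

The original record is left unchanged, and parsing the stored string with `json.loads` gives back the original map.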
Now that we have support for UDFs in filter predicates, this can be an amazing feature.
Can we change the function names to match Presto's?
Rename function name to |
We will know once we have added support for expressions in filter predicates.
1. Fix the compilation error introduced by the merge of apache#4597 and apache#5240.
2. Fix the bug of not loading the range index if both an inverted index and a range index exist.
TODO: The range index triggers another severe issue of accessing a closed DataBuffer, which can cause a JVM crash. This will be addressed in a separate PR.
Right now, a JSON blob can be stored in a string field.
This UDF leverages the JsonPath (https://github.com/json-path/JsonPath) DSL to read from a JSON string.
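As a rough illustration of reading a scalar out of a JSON string by path, here is a minimal Python sketch using only the standard library. It is not the PR's implementation and does not use the JsonPath library: the helper `read_scalar` is hypothetical and supports only simple dotted paths such as `$.a.b`, a small subset of what JsonPath offers. The sample JSON is also made up.

```python
import json


def read_scalar(json_string, path, default=None):
    """Resolve a simple '$.a.b' style path against a JSON string.

    Only dotted object keys are handled; array indexes, wildcards, and
    filters from real JsonPath are out of scope for this sketch.
    """
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    node = json.loads(json_string)
    for key in [part for part in path[1:].split(".") if part]:
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


# Hypothetical JSON blob stored in a string field.
blob = '{"name": {"first": "Ada"}, "age": 36}'
print(read_scalar(blob, "$.name.first"))
print(read_scalar(blob, "$.age"))
print(read_scalar(blob, "$.missing", default="n/a"))
```

Returning a default for a missing path, rather than raising, is a design choice of this sketch only.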
jsonExtractScalar
Function Syntax:
Sample queries:
jsonExtractKey
to extract the paths that match a given pattern in a JSON document. E.g., for a given JSON:
The result of
jsonExtractKey(jsonField, '$.*')
is a list of strings holding the matched JSON paths. In the above example, the result is ["$['k1']", "$['k2']", "$['k3']"]
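The `$.*` result above can be reproduced with a small Python sketch that lists the top-level paths of a JSON document in bracket notation. It uses only the standard library and is not how the UDF is implemented; the helper `extract_top_level_paths` is hypothetical, and the sample values are assumptions (only the keys `k1`, `k2`, `k3` come from the result shown above).

```python
import json


def extract_top_level_paths(json_string):
    """Return bracket-notation paths for each top-level key or array index.

    This mimics only the '$.*' pattern; other JsonPath patterns are not
    handled by this sketch.
    """
    doc = json.loads(json_string)
    if isinstance(doc, dict):
        return ["$['{}']".format(key) for key in doc]
    if isinstance(doc, list):
        return ["$[{}]".format(index) for index in range(len(doc))]
    return []  # a scalar document has no child paths


# Hypothetical document; the values are made up for illustration.
sample = '{"k1": 1, "k2": {"nested": true}, "k3": [1, 2]}'
paths = extract_top_level_paths(sample)
print(paths)
```

Only the first level is listed: the nested object under `k2` and the array under `k3` are not expanded.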
Map the jsonExtractScalar and json_extract_scalar function names to the same function.