add subset dataset facets to spec#4008
Conversation
fdf951d to
18aed40
Compare
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #4008 +/- ##
=======================================
Coverage 85.05% 85.05%
=======================================
Files 57 57
Lines 3855 3855
=======================================
Hits 3279 3279
Misses 576 576 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull Request Overview
This PR adds subset dataset facets to the OpenLineage specification, introducing a new schema definition for filtering or subsetting datasets based on various conditions like location, partition, comparison, or binary operations.
- Introduces a new
BaseSubsetDatasetFacet.jsonschema with comprehensive condition types for dataset subsetting - Adds test cases demonstrating different subset condition scenarios (compare, binary, location, partition)
- Updates Java code generation to support
constfields in JSON Schema
Reviewed Changes
Copilot reviewed 12 out of 13 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| website/static/spec/facets/1-0-0/BaseSubsetDatasetFacet.json | Main schema definition for subset dataset facets with various condition types |
| spec/facets/BaseSubsetDatasetFacet.json | Duplicate schema file in spec directory |
| spec/tests/BaseSubsetDatasetFacet/*.json | Test cases for different subset condition scenarios |
| client/python/redact_fields.yml | Python client configuration for new subset dataset classes |
| client/python/openlineage/client/facet_v2.py | Python client import for new subset dataset module |
| client/java/src/test/java/io/openlineage/client/OpenLineageTest.java | Java test demonstrating subset facet usage |
| client/java/generator/src/main/java/io/openlineage/client/*.java | Java code generator updates to support const fields |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
julienledem
left a comment
There was a problem hiding this comment.
That makes sense, I have added comments regarding the code generator portion.
| if (field.getValue().has("const")) { | ||
| // only String const are supported | ||
| fields.add(new Field(field.getKey(), new PrimitiveType("string", null), description, field.getValue().get("const").asText())); | ||
| } else { |
There was a problem hiding this comment.
shouldn't this logic be in the parse() method instead?
There was a problem hiding this comment.
I think not, bcz const shouldn't be returned as a Type (like PrimitiveType of String with constant value) and it should be already put at the Field member.
| public PrimitiveType(String name, String format, String constValue) { | ||
| super(); | ||
| this.name = name; | ||
| this.format = format; | ||
| } |
There was a problem hiding this comment.
Thx, constValue should be just a property of a Field instead.
| if (property.getConstValue().isPresent()) { | ||
| resolvedField = new ResolvedField(property, visit(property.getType()), property.getConstValue().get()); | ||
| } else { | ||
| resolvedField = new ResolvedField(property, visit(property.getType())); | ||
| } |
There was a problem hiding this comment.
This seems unnecessarily complicated. You are unwrapping an Optional to pass it to a constructor that will rewrap it in an Optional.
Maybe just add a ResolvedField constructor that takes an optional since this is how it stores it.
ResolveField(Field field, ResolvedType type, Optional<String> constantValue)
|
|
||
| @Override | ||
| public String toString() { | ||
| return "ResolvedField{name: " + field.getName() + ", type: " + type + "}"; |
There was a problem hiding this comment.
same for equals and hashCode bellow
There was a problem hiding this comment.
Changing the code so that constantValue is stored directly within field, so there's no need to change toString, equals, and hashCode for ResolvedField. Thanks.
5347ec7 to
6593b26
Compare
| fields.add(new Field(field.getKey(), fieldType, description)); | ||
| if (field.getValue().has("const")) { | ||
| // only String const are supported | ||
| fields.add(new Field(field.getKey(), new PrimitiveType("string", null), description, field.getValue().get("const").asText())); |
There was a problem hiding this comment.
to avoid surprises maybe we should still call parse() but check the type is String and throw an exception if not.
With this implementation, one can put whatever in the type and it will be replaced by string.
This could lead to some hair pulling on unexpected behavior.
There was a problem hiding this comment.
Makes sense, we should definitely avoid hair pulling. Second commit should resolve that.
6593b26 to
3eed2aa
Compare
mobuchowski
left a comment
There was a problem hiding this comment.
Approved but would appreciate more comments around the Java generator part - it's less frequently accessed and looked at part of code, so it's a good opportunity to document it more. Thanks.
| // only regular fields are used in the builder | ||
| @Override | ||
| public void onField(ResolvedField f) { | ||
| if (f.getConstantValue().isPresent()) { |
There was a problem hiding this comment.
instead of adding this if statement in every call to onField(), you could add an onConstantField() to the Visitor with the default implementation that does nothing.
There was a problem hiding this comment.
Love this—this was exactly what I was missing. Thanks!
julienledem
left a comment
There was a problem hiding this comment.
I made an additional comment that can simplify a bit the section about handling constant fields.
Otherwise, this looks good.
3eed2aa to
9266d7a
Compare
Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
9266d7a to
9623d1e
Compare
constfield in Json-Schema