Change how type is stored in an enrich policy. #45789
A policy type controls how the enrich index is created and how the query is executed against the match field. Currently there is a single policy type (`exact_match`). In the near future more policy types will be added, and different policy types may have different configuration options. For this reason the type should be a JSON object instead of a string field:

```
{
  "exact_match": {
    ...
  }
}
```

instead of:

```
{
  "type": "exact_match",
  ...
}
```

This makes streaming parsing of enrich policies easier: in the new format, the parsing code knows ahead of time which configuration fields to expect. In the old format that is not possible if the type field is not the first field.

Relates to elastic#32789
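The parsing argument in the description can be illustrated with a minimal sketch. It is not the PR's parsing code: the class `PolicyFormatSketch` and both method names are hypothetical, and an insertion-ordered map stands in for the order in which a streaming parser would encounter fields.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not the Elasticsearch parser.
class PolicyFormatSketch {

    // Old format: "type" is an ordinary field and may arrive after other fields.
    // Returns how many fields a streaming parser has already seen, and could not
    // yet validate against a policy type, by the time "type" shows up.
    static int fieldsSeenBeforeType(Map<String, Object> flatPolicy) {
        int seen = 0;
        for (String field : flatPolicy.keySet()) {
            if (field.equals("type")) {
                return seen;
            }
            seen++; // the type is still unknown at this point
        }
        throw new IllegalArgumentException("policy has no [type] field");
    }

    // New format: the single top-level key is the policy type, so it is known
    // before any configuration field is read.
    static String typeOf(Map<String, Map<String, Object>> nestedPolicy) {
        if (nestedPolicy.size() != 1) {
            throw new IllegalArgumentException("expected exactly one policy type");
        }
        return nestedPolicy.keySet().iterator().next();
    }

    public static void main(String[] args) {
        Map<String, Object> flat = new LinkedHashMap<>();
        flat.put("match_field", "email");
        flat.put("enrich_fields", List.of("first_name", "last_name"));
        flat.put("type", "exact_match"); // type arrives last
        System.out.println(fieldsSeenBeforeType(flat));

        Map<String, Map<String, Object>> nested = new LinkedHashMap<>();
        nested.put("exact_match", Map.of("match_field", "email"));
        System.out.println(typeOf(nested));
    }
}
```

In the flat layout the parser has to hold on to every field it reads before `type`; in the nested layout the first field name already selects the set of expected options.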
Pinging @elastic/es-core-features
@@ -68,7 +70,24 @@ private static void declareParserOptions(ConstructingObjectParser<?, ?> parser)
    }

    public static EnrichPolicy fromXContent(XContentParser parser) throws IOException {
        return PARSER.parse(parser, null);
        Token token = parser.currentToken();
Not super happy with this parsing code, but in order to use an object parser, another class would need to be introduced for policy types, and since there is currently only one policy type, this seems like overkill to me. I think the important thing here is that the policy format is now future-proof.
@@ -268,16 +296,39 @@ public void writeTo(StreamOutput out) throws IOException {
    @Override
    public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException {
        builder.startObject();
        builder.startObject(policy.type);
        {
            builder.field(NAME.getPreferredName(), name);
Should `name` appear outside the policy type object? I assumed not, because otherwise we would have the name and the policy type at the same level.
I feel like name should stay within the policy's definition. The object above this will only have a few fields that are considered valid: the existing policy types. On top of that, I think the previously decided direction is to keep other metadata about the policy (like the es version it was created under) contained in the definition part.
LGTM
@elasticmachine run elasticsearch-ci/default-distro
@elasticmachine run elasticsearch-ci/bwc
    "indices": "users",
    "match_field": "email",
    "enrich_fields": ["first_name", "last_name", "address", "city", "zip", "state"]
}
Do we have an example that includes the `query`? (It's fine if it is not part of this PR, but we probably want it in there somewhere.)
No, there is not. It makes sense to cover that in a dedicated section, together with the fact that reference data can be read from multiple indices.
@@ -129,6 +129,16 @@ public void testDeleteExistingPipeline() throws Exception {
        assertOK(client().performRequest(getRequest));
    }

    public static String generatePolicySource(String index) throws IOException {
nit: can you add a random boolean to include a `query` field with a match all? (I know the behavior is the same as the default, but it will help catch a regression, since we rarely (ever?) test parsing the `query` field.)
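A minimal sketch of what this suggestion could look like, under stated assumptions: the class `PolicySourceSketch` and the extra `includeQuery` parameter are hypothetical, the JSON is built by hand so the example runs without Elasticsearch on the classpath, and the field values are borrowed from the documentation snippet in this PR. The real helper would presumably use the test framework's builder and randomization utilities instead.

```java
import java.util.Random;

// Hypothetical stand-in for the test helper discussed above.
class PolicySourceSketch {

    // Builds a policy in the new format, optionally with an explicit match_all query.
    static String generatePolicySource(String index, boolean includeQuery) {
        StringBuilder json = new StringBuilder();
        json.append("{\"exact_match\":{");
        json.append("\"indices\":\"").append(index).append("\",");
        if (includeQuery) {
            // Assumed shape of the query field; same behavior as leaving it out,
            // but it exercises the parsing of the field.
            json.append("\"query\":{\"match_all\":{}},");
        }
        json.append("\"match_field\":\"email\",");
        json.append("\"enrich_fields\":[\"first_name\",\"last_name\"]");
        json.append("}}");
        return json.toString();
    }

    public static void main(String[] args) {
        // Randomly choose whether the query field is present.
        boolean includeQuery = new Random().nextBoolean();
        System.out.println(generatePolicySource("users", includeQuery));
    }
}
```

Because the query is chosen at random per run, both the default path and the explicit `query` path get parsed across test runs.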
LGTM. A couple of suggestions, but nothing to block merging here.