Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ML][Inference] adds new default_field_map field to trained models #53294

Conversation

benwtrent
Copy link
Member

@benwtrent benwtrent commented Mar 9, 2020

Adds a new default_field_map field to trained model config objects.

This allows the model creator to supply field map if it knows that there should be some map for inference to work directly against the training data.

The use case internally is having analytics jobs supply a field mapping for multi-field fields. This allows us to use the model "out of the box" on data where we trained on foo.keyword but the _source only references foo.

@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core (:ml)

@benwtrent benwtrent force-pushed the feature/ml-inference-fix-field-type-mapping-confusion branch from a8ecd59 to 6edfe4b Compare March 9, 2020 16:47
Copy link
Member

@davidkyle davidkyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -226,6 +235,9 @@ public XContentBuilder toXContent(XContentBuilder builder, Params params) throws
if (licenseLevel != null) {
builder.field(LICENSE_LEVEL.getPreferredName(), licenseLevel);
}
if (defaultFieldMappings != null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (defaultFieldMappings != null) {
if (defaultFieldMappings != null && defaultFieldMappings.isEmpty() == false) {

I prefer not to write empty collections.

randomBoolean() ? null :
Stream.generate(() -> randomAlphaOfLength(10))
.limit(randomIntBetween(1, 10))
.collect(Collectors.toMap(Function.identity(), (k) -> randomAlphaOfLength(10))));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 neat I never thought of building a random map this way

`default_field_mappings` :::
(object)
A string to string mapping that contains the default field mappings to use
when inferring against the model. Any field mapping described in the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a line here about multi fields which is the primary reason for this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sure!

Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@benwtrent
Copy link
Member Author

@elasticmachine update branch

@benwtrent benwtrent changed the title [ML][Inference] adds new default_field_mappings config field to trained models [ML][Inference] adds new default_field_map field to trained models Mar 11, 2020
@benwtrent benwtrent merged commit 4e1f029 into elastic:master Mar 11, 2020
@benwtrent benwtrent deleted the feature/ml-inference-fix-field-type-mapping-confusion branch March 11, 2020 16:24
benwtrent added a commit to benwtrent/elasticsearch that referenced this pull request Mar 11, 2020
…lastic#53294)

Adds a new `default_field_map` field to trained model config objects.

This allows the model creator to supply field map if it knows that there should be some map for inference to work directly against the training data.

The use case internally is having analytics jobs supply a field mapping for multi-field fields. This allows us to use the model "out of the box" on data where we trained on `foo.keyword` but the `_source` only references `foo`.
benwtrent added a commit that referenced this pull request Mar 11, 2020
…53294) (#53419)

Adds a new `default_field_map` field to trained model config objects.

This allows the model creator to supply field map if it knows that there should be some map for inference to work directly against the training data.

The use case internally is having analytics jobs supply a field mapping for multi-field fields. This allows us to use the model "out of the box" on data where we trained on `foo.keyword` but the `_source` only references `foo`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants