Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update .ml-config mappings before indexing job, datafeed or df analytics config #44216

Merged
merged 5 commits into from
Jul 12, 2019

Conversation

przemekwitek
Copy link
Contributor

@przemekwitek przemekwitek commented Jul 11, 2019

This PR updates .ml-config index mappings before indexing job, datafeed or df analytics config.

Without it, if the .ml-config index was created in the old version, and the first config with a new field is indexed, the default mapping (i.e. type "keyword") is applied for this new field as implemented in ElasticsearchMappings::addDefaultMapping. This default behavior is usually not desirable.

Related to #44263

@elasticmachine
Copy link
Collaborator

Pinging @elastic/ml-core

Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this so quickly.

My only code comments are minor (and 3 are the same thing).

I think the really important thing is to validate this using an index created in a 6.6 cluster. You won't be able to do this on the master branch.

One way to do it would be:

  1. Create the 7.3 backport branch now and cherry-pick the commits from your master PR branch to that branch hanging off of 7.3.
  2. Download a 6.6 snapshot build, for example from https://prelert-artifacts.s3-eu-west-1.amazonaws.com/maven/org/elasticsearch/ml/ml/6.6.3-SNAPSHOT/elasticsearch-6.6.3-SNAPSHOT.zip
  3. Extract and run the 6.6 snapshot, and create an ML job (anything will do)
  4. Shut down the ES 6.6 process
  5. With the branch from step 1 checked out, run ./gradlew :distribution:archives:darwin-tar:build to build a snapshot distribution of that branch
  6. Extract the 7.3 snapshot you just built and move the data directory from the 6.6 snapshot install into the 7.3 install
  7. Start up the 7.3 install and create an ML data frame analytics job (anything will do)
  8. Check the mappings on the .ml-config index - have they been updated with the correct mappings for data frame analytics jobs?

In particular it's critical that n_neighbors is an integer and feature_influence_threshold is a double. If the fix didn't work as expected they'll be keyword.

Once you've seen manually that the fix works it should be possible to automate it by adding a new test in x-pack/qa/full-cluster-restart/src/test/java/org/elasticsearch/xpack/restart. We already have MlMigrationFullClusterRestartIT but you could add a new one that:

  1. Creates an anomaly detector job in the old cluster
  2. Creates a data frame analytics job in the new cluster
  3. Asserts on the new cluster mappings after the data frame analytics job is successfully created

However, this test will only validate the 6.6 to 7.3 case when run on the 7.3 branch. When run on the master branch it will validate a 7.x to 8.0 upgrade, which will not currently prove that the mappings updater code is working. This is why I think it's so crucial to do a manual test with 6.6 and 7.3 before merging this PR.

Copy link
Contributor Author

@przemekwitek przemekwitek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I applied the suggested fixes and started manual testing process.

Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM assuming the 6.6->7.3 test works

@przemekwitek
Copy link
Contributor Author

run elasticsearch-ci/default-distro

@przemekwitek
Copy link
Contributor Author

run elasticsearch-ci/1
run elasticsearch-ci/2

Copy link
Member

@davidkyle davidkyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@przemekwitek
Copy link
Contributor Author

LGTM assuming the 6.6->7.3 test works

I've verified manually that the .ml-config has correct mappings after the upgrade.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants