Support for reindexing APM indices #29845

Merged
merged 12 commits into elastic:master on Feb 6, 2019

7 participants
@tylersmalley
Member

tylersmalley commented Feb 1, 2019

Starting in 7.0, APM will be aligning with the property names defined in the Elastic Common Schema. When APM users upgrade to 7.x, they will need to migrate those indices created prior to 7.0 for the data to be present in the APM app.

Blocked by the finalizing of mappings and the reindex script here: elastic/apm-integration-testing#277

Discussion:

When we reindex, we append -reindexed-v7 to the current index name. By doing this, we will almost always run into conflicts with the APM template used to create the index in the first place, because the new name still matches that template's index pattern.

By default, APM creates an index template named apm-%{[beat.version]} with an index pattern of apm-%{[beat.version]}-*. The default index names are as follows:

  • apm-%{[beat.version]}-sourcemap
  • apm-%{[beat.version]}-error-%{+yyyy.MM.dd}
  • apm-%{[beat.version]}-transaction-%{+yyyy.MM.dd}
  • apm-%{[beat.version]}-span-%{+yyyy.MM.dd}
  • apm-%{[beat.version]}-metric-%{+yyyy.MM.dd}
  • apm-%{[beat.version]}-onboarding-%{+yyyy.MM.dd}

Unfortunately, there is no way to ignore index templates on index creation, or to update an index pattern so that it does not match indices ending in -reindexed-v7.
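To make the conflict concrete, here is a minimal sketch of how a suffixed index name still falls under the default template pattern. The version `6.6.0`, the date, and the helper names are assumptions chosen for illustration, and the matcher only approximates Elasticsearch's wildcard handling (only `*` is treated as special).

```typescript
// Convert a wildcard pattern (only `*` is special) into an anchored RegExp.
function wildcardToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split('*')
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp(`^${escaped}$`);
}

// True when `indexName` would be picked up by a template using `pattern`.
function matchesPattern(indexName: string, pattern: string): boolean {
  return wildcardToRegExp(pattern).test(indexName);
}

// Hypothetical values: an APM Server 6.6.0 setup with the default template pattern.
const templatePattern = 'apm-6.6.0-*';
const originalIndex = 'apm-6.6.0-transaction-2019.02.01';
const reindexedIndex = `${originalIndex}-reindexed-v7`;

// Both names match, so the old template is applied to the new index as well.
const originalMatches = matchesPattern(originalIndex, templatePattern);
const reindexedMatches = matchesPattern(reindexedIndex, templatePattern);
```

Because the pattern ends in a trailing wildcard, anything appended to the original name keeps matching.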

A few options have been discussed that require input from the APM team. @elastic/apm-server @bleskes

  • Produce a warning with a list of possibly affected templates prior to reindexing, letting the user know they might create conflicts and that they should remove them if they are no longer used or if APM data for that version is no longer being ingested.
  • If the index creation fails, we can give the user additional information on what to do. This could include a link to documentation on identifying and removing the offending templates. It's also possible that we could produce a list of index templates by iterating over /_template?filter_path=*.index_patterns to see if any patterns match.
  • Similar to how we stop/start ML/Watcher, we could remove the index templates and replace them once we are done with the re-index. This seems way too risky considering it might be a template they are using elsewhere which sets something like replicas. Additionally, they could still be ingesting data from the old APM Beat and removing the template will cause issues.
  • The APM server can prompt for removal during the 7.x setup script.
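The template detection idea from the second option could look roughly like the sketch below. It assumes the response of `/_template?filter_path=*.index_patterns` has already been fetched and parsed into an object keyed by template name; the sample response, the type name `TemplatePatterns`, and the function names are hypothetical and not taken from the Kibana code.

```typescript
// Assumed shape of the filtered response: { "<template name>": { index_patterns: [...] } }
type TemplatePatterns = { [name: string]: { index_patterns: string[] } };

// Convert a wildcard pattern (only `*` is special) into an anchored RegExp.
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split('*')
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp(`^${escaped}$`);
}

// Return the names of all templates with at least one pattern matching `indexName`.
function findMatchingTemplates(templates: TemplatePatterns, indexName: string): string[] {
  return Object.keys(templates).filter((name) =>
    templates[name].index_patterns.some((p) => patternToRegExp(p).test(indexName))
  );
}

// Hypothetical response body, for illustration only.
const response: TemplatePatterns = {
  'apm-6.6.0': { index_patterns: ['apm-6.6.0-*'] },
  'apm-6.5.4': { index_patterns: ['apm-6.5.4-*'] },
  'metricbeat-6.6.0': { index_patterns: ['metricbeat-6.6.0-*'] },
};

// Templates that would be applied when creating the reindexed index.
const conflicting = findMatchingTemplates(response, 'apm-6.6.0-span-2019.02.01-reindexed-v7');
```

A list like `conflicting` could feed either the upfront warning or the error message shown after a failed index creation.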

Please also review the language used in the following screenshots and let us know if it should be changed.

Documentation links to: https://www.elastic.co/guide/en/apm/server/master/breaking-changes.html
More about ECS links to: https://github.com/elastic/ecs

apm-list

When clicking "Reindex" for an APM index, the user is currently presented with a single warning, which they must check to acknowledge that the data will be transformed.
apm-flyout


Support for reindexing APM indices
Signed-off-by: Tyler Smalley <tyler.smalley@elastic.co>

@@ -30,12 +32,24 @@ export default function apmOss(kibana) {
indexPattern: Joi.string().default('apm-*'),

// ES Indices
sourcemapIndices: Joi.string().default('apm-*'),


@tylersmalley

tylersmalley Feb 1, 2019

Author Member

This was missing.


tylersmalley added some commits Feb 2, 2019

Fix types
Signed-off-by: Tyler Smalley <tyler.smalley@elastic.co>

tylersmalley added some commits Feb 2, 2019

Fixes
Signed-off-by: Tyler Smalley <tyler.smalley@elastic.co>

Fix tests
Signed-off-by: Tyler Smalley <tyler.smalley@elastic.co>

@tylersmalley tylersmalley changed the title [WIP] Support for reindexing APM indices Support for reindexing APM indices Feb 3, 2019


Update types
Signed-off-by: Tyler Smalley <tyler.smalley@elastic.co>

@simitt

Contributor

simitt commented Feb 4, 2019

Regarding template conflicts during reindexing: since we know there will most likely be conflicts when reindexing with default settings, I suggest either showing the user upfront that they will need to clean up the templates, or prompting for removal when the migration script is started. If feasible, I'd prefer the option where we detect potentially problematic templates and show them directly to the user, over pointing to external documentation that asks users to query for the templates themselves.

Regarding linking to the documentation, there is also a general APM release notes section. This section should give a high level overview of breaking changes for APM in general, and then link to the more detailed server section. It should also contain explanation around switching to ECS. I'd rather link to these more general APM docs from Kibana.

APM index needs converted to ECS format

How about we put the detailed information around field changes, explaining why we are moving to ECS and linking to it, into the APM docs and only put some more general information into the migration assistant, e.g. APM indices need conversion to 7.x format?

@bmorelli25 I'd appreciate your input regarding documentation linking and wording.

@bmorelli25

Member

bmorelli25 commented Feb 4, 2019

Regarding linking to the documentation, there is also a general APM release notes section. This section should give a high level overview of breaking changes for APM in general, and then link to the more detailed server section. It should also contain explanation around switching to ECS. I'd rather link to these more general APM docs from Kibana.

Yup, let's link to this release notes section from Kibana. From there, I'll handle the linking to other sections of the APM documentation with more details.

How about we put the detailed information around field changes, explaining why we are moving to ECS and linking to it, into the APM docs and only put some more general information into the migration assistant, e.g. APM indices need conversion to 7.x format?

I mostly agree with this. We should have a sentence (for the people who are curious, but not that curious), and the bulk of the information should be in the APM documentation. I also think it makes more sense to remove the "More about ECS" link from this page. That's something I can link to from the APM documentation. I think directing users to the APM documentation initially is more beneficial.

In other words, instead of this:

This index will be converted to ECS format
Starting in version 7.0.0, APM data will be represented in the Elastic Common Schema. Historical APM data will not be visible until it's reindexed.
Documentation
More about ECS

I'm thinking something more like this:

This index will be converted to the 7.x format
Starting in version 7.0.0, APM data will be represented in the Elastic Common Schema. Historical APM data will not be visible until it's reindexed. More information on this change is available in the APM release notes.

@graphaelli

Member

graphaelli commented Feb 4, 2019

We're working through an issue with conflicting field mapping, summarized with this example:

PUT _template/ex-6.6.0
{
  "index_patterns": [
    "ex-6.6.0-*"
  ],
  "mappings": {
    "properties": {
      "docker": {
        "properties": {
          "container": {
            "properties": {
              "image": {
                "type": "keyword",
                "ignore_above": 1024
              }
            }
          }
        }
      },
      "container": {
        "properties": {
          "image": {
            "properties": {
              "name": {
                "type": "alias",
                "path": "docker.container.image"
              }
            }
          }
        }
      }
    }
  }
}
PUT ex-6.6.0-test
{
  "mappings": {
    "properties": {
      "container": {
        "properties": {
          "image": {
            "properties": {
              "name": {
                "ignore_above": 1024,
                "type": "keyword"
              }
            }
          }
        }
      }
    }
  }
}
Updates messaging
Signed-off-by: Tyler Smalley <tyler.smalley@elastic.co>

@tylersmalley

Member Author

tylersmalley commented Feb 4, 2019

Updated the messaging and changed the level to warning instead of critical, since critical denotes something that will prevent the cluster from starting.

apm-list

apm-flyout


@tylersmalley

Member Author

tylersmalley commented Feb 5, 2019

@graphaelli should we move forward with merging this PR and treat the ES error as a bug?

@joshdover
Member

joshdover left a comment

Code LGTM, tested with APM data and was able to see the data in the APM app after reindexing. I'm out of the loop on the template conflict issue, but assuming there's a fix for that, the rest of this looks good.

@jalvz

Contributor

jalvz commented Feb 5, 2019

I don't understand the conflict very well. Is the conflict between 6.6 index mappings and 7.0 mappings, or between 7.0 mappings and 7.0-reindexed mappings? (If the former: why does having a different version cause conflicts? If the latter: that doesn't make sense to me; what am I missing?)

And a silly question: does it necessarily have to be a reindexed suffix?
Can indices be named like apm-reindexed-7.0-blablabla and change index pattern to apm-*-{beat.version}-* ?

tylersmalley added some commits Feb 6, 2019

Updates APM reindex script/mapping
Signed-off-by: Tyler Smalley <tyler.smalley@elastic.co>

@tylersmalley

Member Author

tylersmalley commented Feb 6, 2019

We decided to get around the index template conflicts by prepending to the index name, which was done here: #30166
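A rough sketch of why prepending avoids the conflict, assuming the default template pattern ends in a trailing wildcard. The prefix `reindexed-v7-`, the version `6.6.0`, and the helper name are hypothetical here; the actual naming is defined in #30166.

```typescript
// Simplified matcher that only handles patterns of the form "<prefix>*",
// which is the shape of the default APM template pattern.
function matchesTrailingWildcard(indexName: string, pattern: string): boolean {
  if (pattern.charAt(pattern.length - 1) !== '*') {
    return indexName === pattern;
  }
  const prefix = pattern.slice(0, -1);
  return indexName.startsWith(prefix);
}

// Hypothetical values for illustration.
const defaultPattern = 'apm-6.6.0-*';
const sourceIndex = 'apm-6.6.0-error-2019.02.01';

const suffixedName = `${sourceIndex}-reindexed-v7`;
const prefixedName = `reindexed-v7-${sourceIndex}`;

// The suffixed name keeps the "apm-6.6.0-" prefix, so the old template still applies.
const suffixedConflicts = matchesTrailingWildcard(suffixedName, defaultPattern);
// The prefixed name no longer starts with "apm-6.6.0-", so the old template is skipped.
const prefixedConflicts = matchesTrailingWildcard(prefixedName, defaultPattern);
```

The trade-off is that any index pattern relying on the `apm-` prefix, such as the default `apm-*`, would no longer match the renamed index unless it is adjusted or aliased.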

@tylersmalley tylersmalley merged commit c16849d into elastic:master Feb 6, 2019

2 checks passed

CLA Commit author is a member of Elasticsearch
Details
kibana-ci Build finished.
Details