#2048 - Add ILM support for managing jaeger indices in elasticsearch #2739

Closed

bhiravabhatla
Contributor

Signed-off-by: santosh bsantosh@thoughtworks.com

Which problem is this PR solving?

Short description of the changes

  • Adds support for ILM policies by creating overriding index templates, which assign the ILM policy, rollover alias, and read alias to the index upon creation. For more details, please refer to Add support for Elasticsearch ILM Polices #2454. (Original PR)
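
As a rough illustration of the idea above, the following sketch renders an index template with Go's text/template and only adds the ILM-related settings when ILM is enabled. The `index.lifecycle.name` and `index.lifecycle.rollover_alias` keys are standard Elasticsearch index settings; everything else (the `ilmParams` struct, its field names, the template body, the policy and alias names) is an assumption made for this example and is not taken from the PR.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"text/template"
)

// ilmParams holds the template inputs. The struct and its field names are
// hypothetical; they only exist for this sketch.
type ilmParams struct {
	UseILM        bool
	ILMPolicyName string
	RolloverAlias string
	ReadAlias     string
}

// indexTemplate is an illustrative index template, not the PR's mapping file.
const indexTemplate = `{
  "index_patterns": ["*jaeger-span-*"],
  {{- if .UseILM }}
  "aliases": {"{{ .ReadAlias }}": {}},
  {{- end }}
  "settings": {
    "index.number_of_shards": 5
    {{- if .UseILM }},
    "index.lifecycle.name": "{{ .ILMPolicyName }}",
    "index.lifecycle.rollover_alias": "{{ .RolloverAlias }}"
    {{- end }}
  }
}`

// renderIndexTemplate executes the template and checks that the result is valid JSON.
func renderIndexTemplate(p ilmParams) (string, error) {
	tmpl, err := template.New("span").Parse(indexTemplate)
	if err != nil {
		return "", fmt.Errorf("parse template: %w", err)
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, p); err != nil {
		return "", fmt.Errorf("execute template: %w", err)
	}
	if !json.Valid(buf.Bytes()) {
		return "", fmt.Errorf("rendered template is not valid JSON")
	}
	return buf.String(), nil
}

// settingsOf decodes the "settings" object of a rendered template.
func settingsOf(rendered string) (map[string]interface{}, error) {
	var doc struct {
		Settings map[string]interface{} `json:"settings"`
	}
	if err := json.Unmarshal([]byte(rendered), &doc); err != nil {
		return nil, err
	}
	return doc.Settings, nil
}

func main() {
	rendered, err := renderIndexTemplate(ilmParams{
		UseILM:        true,
		ILMPolicyName: "jaeger-ilm-policy", // hypothetical policy name
		RolloverAlias: "jaeger-span-write", // hypothetical alias names
		ReadAlias:     "jaeger-span-read",
	})
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Println(rendered)
}
```

With `UseILM` set to false the same template renders without the aliases block and lifecycle settings, so one file can serve both modes.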

@bhiravabhatla bhiravabhatla requested a review from a team as a code owner January 25, 2021 12:18
@mergify mergify bot requested a review from jpkrohling January 25, 2021 12:19
@bhiravabhatla bhiravabhatla changed the title #2047 - Add ILM support for managing jaeger indices in elasticsearch #2048 - Add ILM support for managing jaeger indices in elasticsearch Jan 25, 2021
@codecov

codecov bot commented Jan 25, 2021

Codecov Report

Merging #2739 (172a081) into master (3efcae5) will increase coverage by 0.00%.
The diff coverage is 97.82%.

@@           Coverage Diff           @@
##           master    #2739   +/-   ##
=======================================
  Coverage   95.88%   95.88%           
=======================================
  Files         218      221    +3     
  Lines        9626     9705   +79     
=======================================
+ Hits         9230     9306   +76     
- Misses        327      329    +2     
- Partials       69       70    +1     
Impacted Files                            Coverage Δ
plugin/storage/es/factory.go              98.14% <94.28%> (-1.86%) ⬇️
cmd/templatizer/app/flags.go              100.00% <100.00%> (ø)
cmd/templatizer/app/renderer/render.go    100.00% <100.00%> (ø)
pkg/es/textTemplate.go                    100.00% <100.00%> (ø)
plugin/storage/es/options.go              100.00% <100.00%> (ø)
plugin/storage/es/spanstore/writer.go     100.00% <100.00%> (ø)
cmd/collector/app/server/zipkin.go        73.07% <0.00%> (-3.85%) ⬇️
cmd/query/app/static_handler.go           95.16% <0.00%> (-1.62%) ⬇️
... and 2 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 3efcae5...d922700.

@bhiravabhatla
Contributor Author

bhiravabhatla commented Jan 25, 2021

Finally a green build :). Thank you @albertteoh for guiding me throughout.

Kindly share feedback one last time :). If everything checks out, I can start working on the documentation for this (jaegertracing/documentation#471).

@yurishkuro yurishkuro assigned albertteoh and unassigned jpkrohling Jan 25, 2021
@bhiravabhatla
Contributor Author

bhiravabhatla commented Jan 27, 2021

@albertteoh - Not sure why the Kafka integration tests failed. This failure seems to be flaky: I saw one run fail yesterday as well, but the subsequent run passed.

@yurishkuro
Member

This PR is pretty large (35 files), and went through several iterations of back and forth on reviews. I would recommend finding a way to split it into smaller pieces and merging them as they are approved. For example, the new CLI utility seems relatively independent of the other changes; could we merge it first?

@bhiravabhatla
Contributor Author

This PR is pretty large (35 files), and went through several iterations of back and forth on reviews. I would recommend finding a way to split it into smaller pieces and merging them as they are approved. For example, the new CLI utility seems relatively independent of the other changes; could we merge it first?

Agreed. @albertteoh, any ideas on how to approach this? Should I raise multiple PRs for the independent pieces and link them in this PR? Thoughts?

@albertteoh
Contributor

@bhiravabhatla yes, I agree, please raise multiple PRs with independent pieces and reference both this PR as well as the Issue #2048.

For example, as @yurishkuro suggested, the CLI utility along with the json templates, python script and integration tests also seem fairly independent from the main body of code.

@bhiravabhatla
Contributor Author

bhiravabhatla commented Jan 29, 2021

@bhiravabhatla yes, I agree, please raise multiple PRs with independent pieces and reference both this PR as well as the Issue #2048.

For example, as @yurishkuro suggested, the CLI utility along with the json templates, python script and integration tests also seem fairly independent from the main body of code.

Broadly - Can we say these would be sub-PRs -

  • CLI for rendering index templates
  • Add useILM flag to jaeger ES storage plugin
  • Add useILM Flag to python esRollover script & integration tests for it.

I would need "useILM flag to jaeger ES storage plugin" to be merged first, as the CLI uses the FixMapping function from this change.

I will have to carefully cherry-pick my changes. Thoughts?
@albertteoh

@albertteoh
Contributor

That makes sense to me, thanks for putting that plan together, @bhiravabhatla .

@bhiravabhatla
Contributor Author

bhiravabhatla commented Jan 29, 2021

That makes sense to me, thanks for putting that plan together, @bhiravabhatla .

I see a potential deadlock. The Python script changes and the Go changes can't be done separately, as both of them use the same mappings. As part of the ES storage factory changes, we change the templates to use the text/template format.
The existing integration tests for esRollover will fail, since they expect the mappings to be present in the older format. :( @albertteoh

@albertteoh
Contributor

This is indeed not a trivial PR to break up, but not to worry, let's see if we can work something out together :)

The smallest first diff I could manage was to copy the following files; it's about 20 files, which is fewer than what we started with:

        modified:   cmd/opentelemetry/app/exporter/elasticsearchexporter/exporter.go
        modified:   cmd/opentelemetry/app/exporter/elasticsearchexporter/integration_test.go
        modified:   cmd/opentelemetry/go.sum
        modified:   pkg/es/config/config.go
        new file:   pkg/es/mocks/TemplateApplier.go
        new file:   pkg/es/mocks/TemplateBuilder.go
        new file:   pkg/es/textTemplate.go
        new file:   pkg/es/textTemplate_test.go
        modified:   plugin/storage/es/factory.go
        modified:   plugin/storage/es/factory_test.go
        modified:   plugin/storage/es/mappings/gen_assets.go
        modified:   plugin/storage/es/mappings/jaeger-dependencies-7.json
        modified:   plugin/storage/es/mappings/jaeger-dependencies.json
        modified:   plugin/storage/es/mappings/jaeger-service-7.json
        modified:   plugin/storage/es/mappings/jaeger-service.json
        modified:   plugin/storage/es/mappings/jaeger-span-7.json
        modified:   plugin/storage/es/mappings/jaeger-span.json
        modified:   plugin/storage/es/spanstore/writer.go
        modified:   plugin/storage/es/spanstore/writer_test.go
        modified:   plugin/storage/integration/elasticsearch_test.go
        modified:   plugin/storage/integration/es_index_cleaner_test.go
        new file:   plugin/storage/integration/es_index_rollover_test.go

You are right in that the existing tests relied on the old json templates and these go hand-in-hand with how we apply templates (factory.go), which meant changes to some function signatures which differ from other packages' signatures (e.g. passing in the TemplateBuilder). I feel these function signature changes are necessary (and were my suggestions, sorry!), allowing for dependency injection, and lead to a better design in the long run.

Without doing a detailed scan, the excluded files refer to the following, which can be added in follow-up diffs:
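
To make the dependency-injection point concrete, here is a minimal sketch of passing a template builder into the code that renders a mapping, so that a test can substitute a failing implementation. The interface names `TemplateBuilder` and `TemplateApplier` come from the mock files listed above; their method sets, `renderMapping`, `textTemplateBuilder`, `failingBuilder`, and `errInjected` are assumptions for illustration and may not match the PR.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"text/template"
)

// TemplateApplier and TemplateBuilder reuse names from the PR's mocks;
// the method signatures here are assumed, not copied from the PR.
type TemplateApplier interface {
	Execute(wr io.Writer, data interface{}) error
}

type TemplateBuilder interface {
	Parse(text string) (TemplateApplier, error)
}

// textTemplateBuilder is an implementation backed by text/template.
type textTemplateBuilder struct{}

func (textTemplateBuilder) Parse(text string) (TemplateApplier, error) {
	t, err := template.New("mapping").Parse(text)
	if err != nil {
		return nil, err
	}
	return t, nil
}

// renderMapping depends only on the TemplateBuilder interface, so callers
// decide which implementation is used.
func renderMapping(tb TemplateBuilder, text string, data interface{}) (string, error) {
	tmpl, err := tb.Parse(text)
	if err != nil {
		return "", fmt.Errorf("parse mapping: %w", err)
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, data); err != nil {
		return "", fmt.Errorf("execute mapping: %w", err)
	}
	return buf.String(), nil
}

// errInjected and failingBuilder form a hand-written test double that
// always fails to parse.
var errInjected = errors.New("injected parse failure")

type failingBuilder struct{ err error }

func (f failingBuilder) Parse(string) (TemplateApplier, error) { return nil, f.err }

func main() {
	out, err := renderMapping(textTemplateBuilder{}, `{"shards": {{ .Shards }}}`, map[string]int{"Shards": 5})
	fmt.Println(out, err)

	// The error path can be exercised without a malformed template.
	_, err = renderMapping(failingBuilder{err: errInjected}, "", nil)
	fmt.Println(err)
}
```

The trade-off is an extra parameter on the function signature in exchange for error paths that can be tested deterministically.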

  • The additional flag option for useILM
  • Templatizer utility
  • esRollover python script
  • Makefile targets for building the Templatizer utility and running integration tests

@bhiravabhatla what do you think?

@bhiravabhatla
Contributor Author

@albertteoh - I was thinking along the same lines. Unfortunately, we would have failing integration tests with this approach. To be specific, the esRollover tests:

    // create rollover archive indices and roll over to the new index
    err = runEsRollover("init", []string{"ARCHIVE=true", "INDEX_PREFIX=" + prefix})
    if err != nil {
        return err
    }
    err = runEsRollover("rollover", []string{"ARCHIVE=true", "INDEX_PREFIX=" + prefix, rolloverNowEnvVar})
    if err != nil {
        return err
    }
    // create rollover main indices and roll over to the new index
    err = runEsRollover("init", []string{"ARCHIVE=false", "INDEX_PREFIX=" + prefix})
    if err != nil {
        return err
    }
    err = runEsRollover("rollover", []string{"ARCHIVE=false", "INDEX_PREFIX=" + prefix, rolloverNowEnvVar})
    if err != nil {
        return err
    }

If, as part of the first PR, we only change the Go assets (ES mapping templates) and don't change the rollover script (Python), the above tests would fail, right?

@albertteoh
Contributor

If, as part of the first PR, we only change the Go assets (ES mapping templates) and don't change the rollover script (Python), the above tests would fail, right?

Yes, you're right, I hadn't checked the integration tests.

To get the tests to pass, we'd need to add the python script changes in order to parse the new template format (golang's built-in text/template format), which then requires Templatizer. In turn, in order for Templatizer to be accessible to python scripts in integration tests, we need to update the Dockerfile and Makefile. This all adds up to the entire current PR.
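
To illustrate why a separate binary helps here, the sketch below shows a small command-line renderer: it executes a text/template mapping with values taken from flags and writes plain JSON to stdout, which a Python script could then consume without understanding Go template syntax. This is not the PR's Templatizer; the flag names (`-shards`, `-replicas`), the `options` struct, the `run` and `renderToString` helpers, and the template body are all hypothetical.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
	"strings"
	"text/template"
)

// mappingTemplate stands in for a mapping file; its contents are illustrative.
const mappingTemplate = `{"settings": {"index.number_of_shards": {{ .Shards }}, "index.number_of_replicas": {{ .Replicas }}}}`

// options holds the values exposed as (hypothetical) command-line flags.
type options struct {
	Shards   int
	Replicas int
}

// run parses the arguments, renders the template, and writes the JSON to out.
func run(args []string, out io.Writer) error {
	fs := flag.NewFlagSet("render-mapping", flag.ContinueOnError)
	var opts options
	fs.IntVar(&opts.Shards, "shards", 5, "number of primary shards")
	fs.IntVar(&opts.Replicas, "replicas", 1, "number of replicas")
	if err := fs.Parse(args); err != nil {
		return err
	}
	tmpl, err := template.New("mapping").Parse(mappingTemplate)
	if err != nil {
		return err
	}
	return tmpl.Execute(out, opts)
}

// renderToString is a convenience wrapper that captures the output in memory.
func renderToString(args []string) (string, error) {
	var sb strings.Builder
	if err := run(args, &sb); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	if err := run(os.Args[1:], os.Stdout); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

A script could invoke such a binary as a subprocess and parse its stdout as JSON, which is why the binary then has to be available in the image that runs the integration tests.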

To try to understand and communicate the problem, this is an attempt at drawing up the dependency tree:

[Image: 2048-dependencies]

@yurishkuro, can you see a way of breaking this PR up into smaller commits?

@yurishkuro
Member

If it's too difficult to break the PR into smaller ones, then it's not worth it. But I thought that at least the new binary could be committed separately; that's 5 files out of 30, plus maybe the makefile changes.

@albertteoh
Contributor

But I thought that at least the new binary could be committed separately, that's 5 files out of 30, plus maybe the makefile changes.

The new binary relies on some changes to factory.go for loading and applying templates and factory.go is a dependency to/from other files as in the diagram provided above; otherwise, we would need to duplicate this logic from factory.go to decouple, which we'd probably want to avoid.

@yurishkuro
Member

then I would keep a single PR

@bhiravabhatla
Contributor Author

@albertteoh - What are our next steps? Shall I update this branch with the latest changes from master?

@bhiravabhatla
Contributor Author

To try to understand and communicate the problem, this is an attempt at drawing up the dependency tree:

This makes the PR understandable. Thank you for taking the effort to do this :)

Contributor

@albertteoh albertteoh left a comment

LGTM - great effort, especially for your first PR, @bhiravabhatla! Thanks!

I think we'll need to address jaegertracing/documentation#471 pretty soon, preferably before release v1.22.0 is cut.

@albertteoh
Contributor

@albertteoh - What are our next steps? Shall I update this branch with the latest changes from master?

Yes, please.

@bhiravabhatla
Contributor Author

LGTM - great effort, especially for your first PR, @bhiravabhatla! Thanks!

Thank you @albertteoh for guiding me throughout. :)

@bhiravabhatla
Contributor Author

bhiravabhatla commented Feb 1, 2021

I think we'll need to address jaegertracing/documentation#471 pretty soon, preferably before release v1.22.0 is cut.

Yes, I'll start working on it.

@bhiravabhatla
Contributor Author

@albertteoh - What are our next steps? Shall I update this branch with the latest changes from master?

Yes, please.

Done.

@bhiravabhatla
Contributor Author

@albertteoh - I see new GitHub Actions were added to CI, and it's taking a lot of time for a single run. This one has been stuck since yesterday.

CC - @Ashmita152 @yurishkuro

@albertteoh
Contributor

Still seems to be stuck on the codecov steps. Any ideas why?

@yurishkuro
Member

codecov has been having this issue lately. If you look at the unit test logs, you will see that the report is actually submitted, but we're not getting a ping back. Not sure if there's a support forum for codecov; I never tried using it. It seems to happen the most when there are multiple CI runs (since GHA doesn't cancel the previous run automatically when a new commit is added). I will restart the unit tests job.

@albertteoh
Contributor

albertteoh commented Feb 4, 2021

Thanks @yurishkuro, I've submitted a support ticket with codecov.io providing details for this PR.

If we don't hear back from codecov support, does it make sense to try restarting the codecov jobs?

@yurishkuro
Member

There is no separate codecov job; it's a step in unit-tests, and I already reran those. This could be a persistent issue, but let me try to merge master.

BTW, there is a conflict in go.sum, but since this PR does not change dependencies, it should have no changes in go.sum, so I am going to accept one from master.

@yurishkuro yurishkuro mentioned this pull request Feb 5, 2021
5 tasks
@yurishkuro
Member

please resolve conflict (easiest is to merge as is and then regenerate and commit on top)

@albertteoh
Contributor

Sorry @bhiravabhatla, looks like the build failures are my fault. My commit contained an accidental change to the generated assets. You may need to regenerate the assets and commit them in again.

@yurishkuro
Member

The DCO check is broken on some commits. I recommend squashing all commits into one and rebasing off fresh master.

@bhiravabhatla
Contributor Author

The DCO check is broken on some commits. I recommend squashing all commits into one and rebasing off fresh master.

I am unable to squash the commits. I am getting a multiple-parent error for one of the merge commits (8c29abe) while squashing my commits:

image

@bhiravabhatla
Contributor Author

Sorry, this happened during my trial and error to fix the DCO issues.

@bhiravabhatla
Contributor Author

Apologies for repeating the same mistake again :(. I created another PR (#2796). I am doing something wrong while rebasing/merging with master, which is causing the DCO issues. I will try to get to the root of it.

Successfully merging this pull request may close these issues.

Support for Elasticsearch ILM policies
4 participants