Extend repository_integrity health indicator for unknown and invalid repos #104614

Merged
nielsbauman merged 23 commits into elastic:main from repo-health on Feb 7, 2024

Conversation

nielsbauman
Contributor

This PR addresses the issue linked below. The main changes are:

  • LocalHealthMonitor: to allow for a more "scalable" setup, I added the abstract class HealthCheck<T>, which represents a health check that will be performed on every node, and made LocalHealthMonitor work with a list of HealthChecks. This reduces future effort to add new health checks.
    • Also extracted DiskCheck to a separate class/file.
    • And added RepositoriesCheck for this new feature.
  • RepositoryIntegrityHealthIndicatorService: refactored this class to do a more extensive analysis of the repository health in the cluster.
  • Field additions, new TransportVersion, constructor updates, etc.
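
To illustrate the `HealthCheck<T>` idea from the first bullet, here is a minimal sketch of a monitor that runs a list of per-node checks. It is not the code from this PR: every class, method, and threshold below (`SketchHealthCheck`, `SketchDiskCheck`, `SketchRepositoriesCheck`, `SketchLocalHealthMonitor`, `currentHealth`, `runAll`) is a hypothetical name chosen for illustration, and the real classes likely differ in shape and behaviour.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch only: illustrative names, not the Elasticsearch API.
abstract class SketchHealthCheck<T> {
    /** Label under which this check's result is reported. */
    abstract String name();

    /** Computes the current health info of the local node for one concern. */
    abstract T currentHealth();
}

final class SketchDiskCheck extends SketchHealthCheck<String> {
    private final long freeBytes;
    private final long lowWatermarkBytes; // made-up threshold for the sketch

    SketchDiskCheck(long freeBytes, long lowWatermarkBytes) {
        this.freeBytes = freeBytes;
        this.lowWatermarkBytes = lowWatermarkBytes;
    }

    @Override
    String name() {
        return "disk";
    }

    @Override
    String currentHealth() {
        return freeBytes < lowWatermarkBytes ? "YELLOW" : "GREEN";
    }
}

final class SketchRepositoriesCheck extends SketchHealthCheck<List<String>> {
    // Repository name -> whether this node can use it (assumed input for the sketch).
    private final Map<String, Boolean> usableByRepository;

    SketchRepositoriesCheck(Map<String, Boolean> usableByRepository) {
        // TreeMap keeps the reported names in a stable, sorted order.
        this.usableByRepository = new TreeMap<>(usableByRepository);
    }

    @Override
    String name() {
        return "repositories";
    }

    @Override
    List<String> currentHealth() {
        List<String> unusable = new ArrayList<>();
        for (Map.Entry<String, Boolean> entry : usableByRepository.entrySet()) {
            if (!entry.getValue()) {
                unusable.add(entry.getKey());
            }
        }
        return unusable;
    }
}

final class SketchLocalHealthMonitor {
    private final List<SketchHealthCheck<?>> checks;

    SketchLocalHealthMonitor(List<SketchHealthCheck<?>> checks) {
        this.checks = List.copyOf(checks);
    }

    /** Runs every registered check once and returns the results keyed by check name. */
    Map<String, Object> runAll() {
        Map<String, Object> results = new LinkedHashMap<>();
        for (SketchHealthCheck<?> check : checks) {
            results.put(check.name(), check.currentHealth());
        }
        return results;
    }
}
```

Under this shape, adding a new per-node check means writing one more subclass and appending it to the list passed to the monitor; the monitor itself does not change.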

Fixes #103784

@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine
Collaborator

Hi @nielsbauman, I've created a changelog YAML for you.

Contributor

@gmarouli gmarouli left a comment

Thank you @nielsbauman for picking this up. This is a great start. I focused my comments on two points that I found a bit more difficult to follow.

@gmarouli gmarouli self-requested a review January 31, 2024 15:23
Contributor

@gmarouli gmarouli left a comment

Thanks for working on this, @nielsbauman. I haven't finished the review, but because you have been waiting a long time, I'm sending these initial comments now and will continue tomorrow.

Contributor

@gmarouli gmarouli left a comment

Hey @nielsbauman, great progress. I think the structure is already in place and we are now moving towards the small changes.

I moved further with the review; however, I would like to ask whether you can revert the conversion of assertThat to assertEquals. Since the deprecation is fixed and the code is already there, reverting will greatly help the review process. Changing these assertions introduces extra noise in the PR and makes it harder to find the actual changes.

Thank you for the great work!

In the new tests you write you can use the assertion you prefer, no problem there.

Contributor

@gmarouli gmarouli left a comment

Minor comments, I think this is the last round :)

# Conflicts:
#	server/src/main/java/org/elasticsearch/TransportVersions.java
@gmarouli gmarouli self-requested a review February 7, 2024 12:10
Contributor

@gmarouli gmarouli left a comment

LGTM, assuming the CI agrees too :). Congrats on completing an important and complex feature, @nielsbauman!!!

@nielsbauman nielsbauman merged commit 6489101 into elastic:main Feb 7, 2024
15 checks passed
@nielsbauman nielsbauman deleted the repo-health branch February 7, 2024 14:18
nielsbauman added a commit that referenced this pull request Feb 19, 2024
nielsbauman added a commit to nielsbauman/elasticsearch that referenced this pull request Feb 19, 2024
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Apr 18, 2024
This page was split up in elastic#104614 but the `ReferenceDocs` symbol links
to the top-level page still rather than the correct subpage. This fixes
the link.
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Apr 18, 2024
Since elastic#104614 the top-level repo troubleshooting page is just a short
paragraph which talks about "this page" but in fact refers to
information spread across a number of subsequent pages. It's not obvious
to the reader that they need to use the navigation menu to get to the
information they seek. Moreover we link to this page from an exception
message today so there's a reasonable chance that users will find it
when trying to troubleshoot a genuine problem.

This commit rewords things slightly and adds links to the subsequent
pages to the body of the page to avoid this confusion.
DaveCTurner added a commit that referenced this pull request Apr 18, 2024
This page was split up in #104614 but the `ReferenceDocs` symbol links
to the top-level page still rather than the correct subpage. This fixes
the link.
DaveCTurner added a commit that referenced this pull request Apr 18, 2024
Since #104614 the top-level repo troubleshooting page is just a short
paragraph which talks about "this page" but in fact refers to
information spread across a number of subsequent pages. It's not obvious
to the reader that they need to use the navigation menu to get to the
information they seek. Moreover we link to this page from an exception
message today so there's a reasonable chance that users will find it
when trying to troubleshoot a genuine problem.

This commit rewords things slightly and adds links to the subsequent
pages to the body of the page to avoid this confusion.
elasticsearchmachine pushed a commit that referenced this pull request Apr 18, 2024
Since #104614 the top-level repo troubleshooting page is just a short
paragraph which talks about "this page" but in fact refers to
information spread across a number of subsequent pages. It's not obvious
to the reader that they need to use the navigation menu to get to the
information they seek. Moreover we link to this page from an exception
message today so there's a reasonable chance that users will find it
when trying to troubleshoot a genuine problem.

This commit rewords things slightly and adds links to the subsequent
pages to the body of the page to avoid this confusion.
elasticsearchmachine pushed a commit that referenced this pull request Apr 18, 2024
This page was split up in #104614 but the `ReferenceDocs` symbol links
to the top-level page still rather than the correct subpage. This fixes
the link.
Development

Successfully merging this pull request may close these issues.

Health indicator for broken repositories
3 participants