[CHANGE] Ruler: set default evaluation-delay-duration to 1m #4250

Merged 6 commits on Mar 3, 2023

@ying-jeanne ying-jeanne (Contributor) commented Feb 20, 2023

What this PR does

This PR changes the default value of evaluation-delay-duration from 0s to 1m. With the value set to 1m, the ruler only evaluates data whose timestamps are before now - 1m.
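
To illustrate the effect of the delay, here is a minimal sketch in Go. It is not the ruler's actual implementation: `evaluationTimestamp`, `visibleToEvaluation`, `defaultEvaluationDelay`, and `referenceNow` are hypothetical names introduced only for this example. It assumes the delay simply shifts the evaluation timestamp back from the current time.

```go
package main

import (
	"fmt"
	"time"
)

// defaultEvaluationDelay mirrors the new default described in this PR (1m).
// Hypothetical constant, named for this example only.
const defaultEvaluationDelay = time.Minute

// referenceNow is a fixed "current time" so the example is deterministic.
var referenceNow = time.Date(2023, time.March, 3, 10, 0, 0, 0, time.UTC)

// evaluationTimestamp returns the timestamp a rule would be evaluated at
// when the evaluation is shifted back by delay. Hypothetical helper.
func evaluationTimestamp(now time.Time, delay time.Duration) time.Time {
	return now.Add(-delay)
}

// visibleToEvaluation reports whether a sample with timestamp ts is no later
// than the delayed evaluation timestamp. Hypothetical helper.
func visibleToEvaluation(ts, now time.Time, delay time.Duration) bool {
	return !ts.After(evaluationTimestamp(now, delay))
}

func main() {
	recent := referenceNow.Add(-30 * time.Second) // newer than now - 1m
	older := referenceNow.Add(-2 * time.Minute)   // older than now - 1m

	fmt.Println("evaluation timestamp:", evaluationTimestamp(referenceNow, defaultEvaluationDelay))
	fmt.Println("30s-old sample considered:", visibleToEvaluation(recent, referenceNow, defaultEvaluationDelay))
	fmt.Println("2m-old sample considered:", visibleToEvaluation(older, referenceNow, defaultEvaluationDelay))
}
```

With a delay of 0s (the previous default) the same helpers would treat every sample up to the current time as eligible, which is the behavior this PR moves away from.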

Which issue(s) this PR fixes or relates to

Fixes #3764

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@CLAassistant CLAassistant commented Feb 20, 2023

CLA assistant check
All committers have signed the CLA.

@ying-jeanne ying-jeanne changed the title [CHANGE] Set default evaluation delay of ruler to 1m [CHANGE] Ruler set default evaluation delay of ruler to 1m Feb 20, 2023
@ying-jeanne ying-jeanne changed the title [CHANGE] Ruler set default evaluation delay of ruler to 1m [CHANGE] Ruler: set default evaluation-delay-duration to 1m. #3764 Feb 20, 2023
@ying-jeanne ying-jeanne changed the title [CHANGE] Ruler: set default evaluation-delay-duration to 1m. #3764 [CHANGE] Ruler: set default evaluation-delay-duration to 1m. Feb 20, 2023
@ying-jeanne ying-jeanne changed the title [CHANGE] Ruler: set default evaluation-delay-duration to 1m. [CHANGE] Ruler: set default evaluation-delay-duration to 1m Feb 20, 2023

@pracucci pracucci (Collaborator) left a comment

In integration tests we can't afford to wait 1m before running evaluations (also the reason why they fail). Please set it back to 0 for integration tests, adding the CLI flag override in RulerFlags defined at integration/configs.go.
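
A rough sketch of what such an override could look like, in Go. This is an illustration only: `rulerFlags` and `mergeFlags` are hypothetical stand-ins and may not match the shape of `RulerFlags` in integration/configs.go, and the flag name `-ruler.evaluation-delay-duration` is assumed from the option name in the PR title.

```go
package main

import "fmt"

// evaluationDelayFlag is the assumed CLI flag name for the option discussed
// in this PR. Treat it as an assumption, not a confirmed identifier.
const evaluationDelayFlag = "-ruler.evaluation-delay-duration"

// rulerFlags returns CLI flag overrides for a ruler started by tests.
// Hypothetical stand-in for RulerFlags; the real function may set other flags.
func rulerFlags() map[string]string {
	return map[string]string{
		// Evaluate rules immediately so tests do not wait for the 1m default.
		evaluationDelayFlag: "0",
	}
}

// mergeFlags copies base and then applies overrides on top of it, so an
// override wins whenever both maps define the same flag.
func mergeFlags(base, overrides map[string]string) map[string]string {
	merged := make(map[string]string, len(base)+len(overrides))
	for name, value := range base {
		merged[name] = value
	}
	for name, value := range overrides {
		merged[name] = value
	}
	return merged
}

func main() {
	// Simulate the new production default being overridden for tests.
	defaults := map[string]string{evaluationDelayFlag: "1m"}
	for name, value := range mergeFlags(defaults, rulerFlags()) {
		fmt.Printf("%s=%s\n", name, value)
	}
}
```

Keeping the override in one shared flags function means every integration test that starts a ruler picks it up without per-test changes.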

CHANGELOG.md (outdated, resolved)

@pstibrany pstibrany (Member) commented

> In integration tests we can't afford to wait 1m before running evaluations (also the reason why they fail). Please set it back to 0 for integration tests, adding the CLI flag override in RulerFlags defined at integration/configs.go.

FYI, we discussed this with @ying-jeanne a few days ago. The PR is still in draft, so no need for review yet :)

integration/configs.go (outdated, resolved)
Co-authored-by: Marco Pracucci <marco@pracucci.com>
@ying-jeanne ying-jeanne marked this pull request as ready for review March 2, 2023 21:39
@ying-jeanne ying-jeanne requested review from a team as code owners March 2, 2023 21:39

@pstibrany pstibrany (Member) left a comment

Looks good, thank you!

CHANGELOG.md (outdated, resolved)
Co-authored-by: Peter Štibraný <pstibrany@gmail.com>
@pracucci pracucci enabled auto-merge (squash) March 3, 2023 09:05
@pracucci pracucci merged commit 0100cc9 into main Mar 3, 2023
@pracucci pracucci deleted the 3764 branch March 3, 2023 10:22
krajorama added a commit that referenced this pull request Mar 6, 2023
* Helm: nginx HPA and tests kubeversion fixes (#4299)

* Helm: fix Kubernetes override for nginx HPA

The template did not take into account the override "kubeVersionOverride".
Fix by using the mimir template implemented for this reason.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Helm: fix missing Kubernetes version overrides in tests

The golden record tests need a fixed version because helm uses the version
of the default context and can produce different results between
contributor's machine and the CI environment.

Add logic to test build to inject the minimal version if not found in the
values file. Mainly because we cannot have a version override in the
small and large values files.

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored-by: Jon Kartago Lamida <lamidaj@gmail.com>

* Ruler: load more tenants in parallel during startup (#4258)

* Ruler: load more tenants in parallel during startup

* add more tests

* fix lint

* Apply suggestions from code review

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Ingester: fix OOO blocks labelling (#4297)

* Ingester: fix OOO blocks labelling

This fixes a bug where the OutOfOrderExternalLabel
was being added to all blocks instead of the ones coming
from OOO data, when the feature flag was enabled.

* Changelog

* PR number to changelog

* Update previous changelog entry instead

* Ruler: load more tenants in parallel during startup

* fix context

* improve unittest

---------

Co-authored-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Nicolás Pazos <32206519+npazosmendez@users.noreply.github.com>

* Change language to match the math. (#4356)

* Upgrade mimir-prometheus to get a fast regexp path optimization (#4357)

* Upgrade mimir-prometheus to get a fast regexp path optimization

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Added CHANGELOG entry

Signed-off-by: Marco Pracucci <marco@pracucci.com>

---------

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fix typo in the docs URL for migrating from Cortex (#4358)

* Remove forced paragraph break. (#4359)

* Bump actions/setup-go to v3 to resolve Node.js 12 deprecation warning. (#4361)

* Improve flaky `TestIngesterWithShippingDisabledDeletesBlocksOnlyAfterRetentionExpires` (#4362)

* Use more specific assertion to include more information in test failures.

See #4198.

* Reduce flakiness of test by extending retention period.

This gives the rest of the test more time to retrieve `oldBlocks`
before any of the blocks is removed.

* Add asynchronous validation scaffolding for block upload (#3411)

* Add asynchronous validation scaffolding for block upload

* addressed lint errors

* Update pkg/compactor/block_upload.go

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>

* Update pkg/compactor/block_upload.go

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>

* Update pkg/compactor/block_upload.go

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>

* Update pkg/compactor/block_upload.go

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>

* Update pkg/compactor/block_upload.go

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>

* Update pkg/compactor/block_upload.go

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>

* enable block upload for dev testing

* fixed validation errors, added debug log messages

* fixed cancelled context issue

* changed name of flag to disable complete block upload

* addressed reviewer feedback

* addressed reviewer feedback

* Address some review comments, WIP

* Small spacing cleanup

* Transition to in-memory bucket for block finish test

* Async validation test coordination, adding configuration flags

* Small comment and flag fix

* Swap config strategy, test still needs separation

* Docs + lint

* Review comments, begin separating tests

* Finish validateAndComplete test

* Update docs

* Remove docker compose arguments

* Regenerate rather than modify

* Review, add test for periodicValidationUpdater

* Make validateAndComplete test clearer, add upload meta check

* Set missing cancelContext

* Add sleep as suggested

* Configure data directory

* fixed compactor data dir in e2e test

* Add changelog entry

* Split into two entries

* Missing entry number

* Update CHANGELOG.md

---------

Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Andy Asp <andy.asp@grafana.com>
Co-authored-by: Andy Asp <90626759+andyasp@users.noreply.github.com>

* Jsonnet: honor the minimum shard size configured (#4363)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* [CHANGE] Ruler: set default `evaluation-delay-duration` to 1m (#4250)

* change the default evaluation delay of ruler to 1m

* revert the changes in ruler test

* change integration test to set ruler default value

* fix integration tests

* Update integration/configs.go

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Update CHANGELOG.md

Co-authored-by: Peter Štibraný <pstibrany@gmail.com>

---------

Co-authored-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Peter Štibraný <pstibrany@gmail.com>

* [Chore] Update jsonnet manifest create query frontend discovery only when it is necessary (#4353)

* [Chore] update jsonnet manifest, avoid setting querier.frontend-address or creating query-frontend-discovery when deployment mode is microservices or query-scheduler is enabled

* linter and changelog

* Helm: fix parity with jsonnet on query frontend headless service

Do not generate query-frontend-headless service if query scheduler
 is enabled

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>

* Apply suggestions from code review

Co-authored-by: Marco Pracucci <marco@pracucci.com>

* correct changelog

* regenerate helm golden files

* Update CHANGELOG.md

---------

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>

* Remove block validation mimirtool changelog entry (#4369)

* Spread TSDB head compaction over the configured interval (#4364)

* Spread TSDB head compaction over the configured interval

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fixed unit test

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Apply suggestion from code review

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fix typo in CHANGELOG entry

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fix typo in CHANGELOG entry

Signed-off-by: Marco Pracucci <marco@pracucci.com>

---------

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Fix port number values. (#4368)

* Ruler: change deployment max surge and max unavailable to reduce ownership spillover (#4381)

* Ruler: change deployment max surge and max unavailable to reduce ownership spillover

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Apply suggestions from code review

Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

---------

Signed-off-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

* Move "Note:" about cross-zone costs to "Costs" (#4370)

This note was in an unrelated section.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>

* Change default -blocks-storage.tsdb.retention-period from 24h to 13h (#4382)

Signed-off-by: Marco Pracucci <marco@pracucci.com>

* Support histograms in pkg/storage and update other breakages (#4354)

* Support histograms in pkg/storage and update other breakages

Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>

---------

Signed-off-by: György Krajcsovits <gyorgy.krajcsovits@grafana.com>
Signed-off-by: Marco Pracucci <marco@pracucci.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Ganesh Vernekar <ganeshvern@gmail.com>
Co-authored-by: Jon Kartago Lamida <lamidaj@gmail.com>
Co-authored-by: ying-jeanne <74549700+ying-jeanne@users.noreply.github.com>
Co-authored-by: Marco Pracucci <marco@pracucci.com>
Co-authored-by: Nicolás Pazos <32206519+npazosmendez@users.noreply.github.com>
Co-authored-by: Ursula Kallio <ursula.kallio@grafana.com>
Co-authored-by: l3ioo <122443155+l3ioo@users.noreply.github.com>
Co-authored-by: Charles Korn <charleskorn@users.noreply.github.com>
Co-authored-by: Vernon Miller <96601789+aldernero@users.noreply.github.com>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Andy Asp <andy.asp@grafana.com>
Co-authored-by: Andy Asp <90626759+andyasp@users.noreply.github.com>
Co-authored-by: Peter Štibraný <pstibrany@gmail.com>
Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>
Co-authored-by: Ganesh Vernekar <ganeshvern@gmail.com>
Successfully merging this pull request may close these issues.

Change default evaluation delay of ruler