Retention deletion does not work on S3 or GCS when a prefix is configured #3465

Closed
bpfoster opened this issue Mar 5, 2024 · 2 comments · Fixed by #3466

Comments

@bpfoster
Contributor

bpfoster commented Mar 5, 2024

Describe the bug
When object storage is configured to use AWS S3 with a prefix, retention does not appear to delete data older than the configured retention period.

To Reproduce
Steps to reproduce the behavior:

  1. Start Tempo v2.4.0 with an S3 prefix configured.
  2. Wait for blocklist_poll time to elapse.
  3. Wait for compacted_block_retention time to elapse.
  4. Wait a lot longer.
  5. Witness no files deleted from S3.

Expected behavior
I expect data files in S3 older than the configured block_retention to be deleted from S3.

Environment:

  • Infrastructure: Kubernetes
  • Deployment tool: helm

Additional Context
I see 0 blocks listed in the compactor logs:

level=debug ts=2024-03-05T17:03:31.370043415Z caller=s3.go:273 msg="listing blocks" keypath=foo/tempo/ found=1 IsTruncated=false NextMarker=
level=debug ts=2024-03-05T17:03:32.513352722Z caller=s3.go:389 msg="listing blocks complete" blockIDs=0 compactedBlockIDs=0

Notable configuration:

compactor:
  compaction:
    block_retention: 48h
    compacted_block_retention: 10m
storage:
  trace:
    backend: s3
    blocklist_poll: 1m
    s3:
      bucket: foobar-bucket
      prefix: foo/tempo

When I have no prefix configured, I see a lot of compaction/retention activity in the (debug) log. With a prefix configured, I see virtually nothing.

@joe-elliott
Member

Does this issue occur in 2.3.1, or is it new in 2.4.0? We did make some changes to the polling logic, and I wonder if that's the root cause. Is Tempo working otherwise? Are traces being returned correctly from object storage?

Retention logic is here:
https://github.com/grafana/tempo/blob/main/tempodb/retention.go

I don't believe much, if anything, changed here, and I don't see anything that stands out as impacted by an S3 prefix.

bpfoster added a commit to bpfoster/tempo that referenced this issue Mar 5, 2024
@bpfoster
Contributor Author

bpfoster commented Mar 5, 2024

Hi @joe-elliott. I don't know whether it occurred in 2.3.1. I'm setting up a new installation for the first time, so v2.4.0 is my only experience. As far as I can tell, everything else is working as expected.

I'm quite unfamiliar with the Tempo codebase. I found one potentially problematic area in tempodb/backend/s3/s3.go. Please see PR #3466.
While debugging, I noticed that all objects from S3 were filtered out in the affected method, leaving no results. I can't say for sure this is the only (or best!) fix, but running it locally for a brief period results in the expected cleanup behavior.
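
As a rough illustration of how a prefix can cause every object to be filtered out, here is a minimal sketch. It assumes keys are laid out as `<prefix>/<tenant>/<blockID>/meta.json` (or `meta.compacted.json`) and that a listing routine decides what a key is by counting path segments. Both the layout and the helpers `naiveIsBlockMeta` and `parseBlockKey` are assumptions made for this example; they are not the code in s3.go or in PR #3466.

```go
package main

import (
	"fmt"
	"strings"
)

// blockRef is a hypothetical type for this sketch; it is not a Tempo type.
type blockRef struct {
	Tenant    string
	BlockID   string
	Compacted bool
}

// naiveIsBlockMeta assumes keys always look like <tenant>/<blockID>/<file>.
// With a prefix such as "foo/tempo" the key has extra segments, so this
// check rejects every object.
func naiveIsBlockMeta(key string) bool {
	parts := strings.Split(key, "/")
	return len(parts) == 3 && parts[2] == "meta.json"
}

// parseBlockKey strips the configured prefix (with or without a trailing
// slash) before splitting the key into tenant, block ID, and file name.
func parseBlockKey(prefix, key string) (blockRef, bool) {
	prefix = strings.Trim(prefix, "/")
	key = strings.TrimPrefix(key, "/")
	if prefix != "" {
		if !strings.HasPrefix(key, prefix+"/") {
			return blockRef{}, false
		}
		key = strings.TrimPrefix(key, prefix+"/")
	}

	parts := strings.Split(key, "/")
	if len(parts) != 3 {
		return blockRef{}, false
	}

	switch parts[2] {
	case "meta.json":
		return blockRef{Tenant: parts[0], BlockID: parts[1]}, true
	case "meta.compacted.json":
		return blockRef{Tenant: parts[0], BlockID: parts[1], Compacted: true}, true
	default:
		return blockRef{}, false
	}
}

func main() {
	// Example keys; tenant and block names are made up.
	keys := []string{
		"foo/tempo/single-tenant/block-1/meta.json",
		"foo/tempo/single-tenant/block-2/meta.compacted.json",
		"foo/tempo/single-tenant/block-1/data.parquet",
	}
	for _, k := range keys {
		ref, ok := parseBlockKey("foo/tempo/", k)
		fmt.Printf("key=%s naive=%v prefixAware=%v ref=%+v\n", k, naiveIsBlockMeta(k), ok, ref)
	}
}
```

Normalizing the prefix with `strings.Trim` means `foo/tempo` and `foo/tempo/` behave the same in this sketch.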

bpfoster added a commit to bpfoster/tempo that referenced this issue Mar 5, 2024
bpfoster added a commit to bpfoster/tempo that referenced this issue Mar 6, 2024
bpfoster added a commit to bpfoster/tempo that referenced this issue Mar 8, 2024
bpfoster changed the title from "Retention deletion does not work on S3 when a prefix is configured" to "Retention deletion does not work on S3 or GCS when a prefix is configured" on Mar 14, 2024
zalegrala added a commit that referenced this issue Mar 14, 2024
* Handle prefixes when listing blocks from S3

fixes #3465

* Handle prefixes when listing blocks from GCS

* Add test for prefixes when listing blocks from Azure

* Update unit tests to check for actual block IDs instead of just length of the slices

Cleanup unit tests

* Further refine S3/GCS backend for ListBlocks

Brings logic more in line with Azure object parsing.
Also has the benefit of handling prefixes without a trailing slash.

* Update poller integration test to exercise prefixes

* Update e2e test to exercise prefixes

* Fix format check error

* Fix failing e2e tests

* Remove unnecessary prefix permutations from e2e test

* Remove unnecessary test config file copy

* Ignore lint

---------

Co-authored-by: Zach Leslie <zach.leslie@grafana.com>
joe-elliott pushed a commit that referenced this issue Mar 19, 2024
(cherry picked from commit 8e6e7fe)
joe-elliott added a commit that referenced this issue Mar 19, 2024
(cherry picked from commit 8e6e7fe)

Co-authored-by: Ben Foster <bpfoster@gmail.com>
joe-elliott pushed a commit to joe-elliott/tempo that referenced this issue Mar 19, 2024
joe-elliott added a commit that referenced this issue Mar 20, 2024
* log request

Signed-off-by: Joe Elliott <number101010@gmail.com>

* move stuff a bit

Signed-off-by: Joe Elliott <number101010@gmail.com>

* oh my. e2e tests pass

Signed-off-by: Joe Elliott <number101010@gmail.com>

* add handlers

Signed-off-by: Joe Elliott <number101010@gmail.com>

* streaming tags

Signed-off-by: Joe Elliott <number101010@gmail.com>

* add cli support

Signed-off-by: Joe Elliott <number101010@gmail.com>

* improve logging

Signed-off-by: Joe Elliott <number101010@gmail.com>

* fix

Signed-off-by: Joe Elliott <number101010@gmail.com>

* docs

Signed-off-by: Joe Elliott <number101010@gmail.com>

* pipe overrides

Signed-off-by: Joe Elliott <number101010@gmail.com>

* cleanup

Signed-off-by: Joe Elliott <number101010@gmail.com>

* cleanup

Signed-off-by: Joe Elliott <number101010@gmail.com>

* support limits

Signed-off-by: Joe Elliott <number101010@gmail.com>

* docs

Signed-off-by: Joe Elliott <number101010@gmail.com>

* e2e tests and caching

Signed-off-by: Joe Elliott <number101010@gmail.com>

* key prefixes

Signed-off-by: Joe Elliott <number101010@gmail.com>

* cache keys

Signed-off-by: Joe Elliott <number101010@gmail.com>

* Fixed distinct collection in combiners

Signed-off-by: Joe Elliott <number101010@gmail.com>

* fixed combiner bugs and revived tests

Signed-off-by: Joe Elliott <number101010@gmail.com>

* restored all tests

Signed-off-by: Joe Elliott <number101010@gmail.com>

* lint

Signed-off-by: Joe Elliott <number101010@gmail.com>

* made search handler utilities generic

Signed-off-by: Joe Elliott <number101010@gmail.com>

* Added handler tests for tags

Signed-off-by: Joe Elliott <number101010@gmail.com>

* add diff support

Signed-off-by: Joe Elliott <number101010@gmail.com>

* lint

Signed-off-by: Joe Elliott <number101010@gmail.com>

* add distinct value collector tests

Signed-off-by: Joe Elliott <number101010@gmail.com>

* fix integration tests

Signed-off-by: Joe Elliott <number101010@gmail.com>

* diff tests

Signed-off-by: Joe Elliott <number101010@gmail.com>

* swapped query for the more robust ExtractMatchers(query)

Signed-off-by: Joe Elliott <number101010@gmail.com>

* tests

Signed-off-by: Joe Elliott <number101010@gmail.com>

* moved e2e tests to a more sensible place

Signed-off-by: Joe Elliott <number101010@gmail.com>

* fix non-deterministic  test

Signed-off-by: Joe Elliott <number101010@gmail.com>

* changelog

Signed-off-by: Joe Elliott <number101010@gmail.com>

* fix tests for 429 handling

Signed-off-by: Joe Elliott <number101010@gmail.com>

* Update docs/sources/tempo/operations/tempo_cli.md

Co-authored-by: Mario <mariorvinas@gmail.com>

* review

Signed-off-by: Joe Elliott <number101010@gmail.com>

* Update docs/sources/tempo/api_docs/_index.md

Co-authored-by: Kim Nylander <104772500+knylander-grafana@users.noreply.github.com>

* Correctly cancel GRPC context beneath the HTTP server (#3443)

* cancel context

Signed-off-by: Joe Elliott <number101010@gmail.com>

* update dskit

Signed-off-by: Joe Elliott <number101010@gmail.com>

* focused timeouts

Signed-off-by: Joe Elliott <number101010@gmail.com>

* docs

Signed-off-by: Joe Elliott <number101010@gmail.com>

* lint N docs

Signed-off-by: Joe Elliott <number101010@gmail.com>

* more lint

Signed-off-by: Joe Elliott <number101010@gmail.com>

* make update-mod

Signed-off-by: Joe Elliott <number101010@gmail.com>

---------

Signed-off-by: Joe Elliott <number101010@gmail.com>

* Bump anchore/sbom-action from 0.15.8 to 0.15.9 (#3476)

Bumps [anchore/sbom-action](https://github.com/anchore/sbom-action) from 0.15.8 to 0.15.9.
- [Release notes](https://github.com/anchore/sbom-action/releases)
- [Commits](anchore/sbom-action@v0.15.8...v0.15.9)

---
updated-dependencies:
- dependency-name: anchore/sbom-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Doc update (#3482)

* doc: remove reference to previously purged script

* doc: correct label for docs updates

* [TraceQL Metrics] Use new per-tenant max_metrics_duration and fix duration check (#3484)

* Use new per-tenant max_metrics_duration, and fix duration timestamp handling

* Update docs and defaults

* Handle prefixes when listing blocks from S3 and GCS (#3466)

* Update doc-validator.yml (#3483)

Updates the doc-validator to the latest version. Note that this changes the reference format to use the full URL (https://....) instead of /docs/blah

* [DOC] Document Tempo Operator Monolithic mode (#3474)

* [DOC] Document Tempo Operator Monolithic mode

Signed-off-by: Andreas Gerstmayr <agerstmayr@redhat.com>

* clarify supported storages

Signed-off-by: Andreas Gerstmayr <agerstmayr@redhat.com>

* fix case of title

Signed-off-by: Andreas Gerstmayr <agerstmayr@redhat.com>

* Apply suggestions from code review

* Apply suggestions from code review

---------

Signed-off-by: Andreas Gerstmayr <agerstmayr@redhat.com>
Co-authored-by: Kim Nylander <104772500+knylander-grafana@users.noreply.github.com>

* [DOC] document Grafana data source setup using Grafana and Tempo operators (#3473)

* [docs] document Grafana data source setup using Grafana and Tempo operators

* move the Grafana data source setup page to the operator folder (this
  page is only relevant for the operator)
* document Grafana data source setup using Grafana and Tempo operators

Signed-off-by: Andreas Gerstmayr <agerstmayr@redhat.com>

* Apply suggestions from code review

---------

Signed-off-by: Andreas Gerstmayr <agerstmayr@redhat.com>
Co-authored-by: Kim Nylander <104772500+knylander-grafana@users.noreply.github.com>

* [DOC] fix typo in setup/operator/monolithic.md (#3496)

Signed-off-by: Andreas Gerstmayr <agerstmayr@redhat.com>

* Bump github.com/prometheus/client_golang from 1.18.0 to 1.19.0 (#3455)

* Bump github.com/prometheus/client_golang from 1.18.0 to 1.19.0

Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.18.0 to 1.19.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](prometheus/client_golang@v1.18.0...v1.19.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update serverless gomod

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: grafanabot <bot@grafana.com>

* Add support for dashes, quotes and spaces in attribute names (#3458)

* Add support for dashes, quotes and spaces in attribute names

* chlog

* [TraceQL Metrics] Step align query_range time range (#3490)

* Step align query_range time range

* Time range error: improve message and fix format for prom format.

* oops remove printlns

* lint

* changelog

* 2.4.1 changelog (#3503)

Signed-off-by: Joe Elliott <number101010@gmail.com>

* [DOC] Add 2.4.1 release notes (#3504)

* fix tests due to interface changing

Signed-off-by: Joe Elliott <number101010@gmail.com>

* Pass context

Signed-off-by: Joe Elliott <number101010@gmail.com>

---------

Signed-off-by: Joe Elliott <number101010@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Andreas Gerstmayr <agerstmayr@redhat.com>
Co-authored-by: Mario <mariorvinas@gmail.com>
Co-authored-by: Kim Nylander <104772500+knylander-grafana@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Matt Simonsen <matt.simonsen@gmail.com>
Co-authored-by: Martin Disibio <martin.disibio@grafana.com>
Co-authored-by: Ben Foster <bpfoster@gmail.com>
Co-authored-by: Zach Leslie <zach.leslie@grafana.com>
Co-authored-by: Andreas Gerstmayr <agerstmayr@redhat.com>
Co-authored-by: grafanabot <bot@grafana.com>