
Add compactor HTTP API for uploading TSDB blocks #1694

Merged: 82 commits into main from feat/backfill on Jun 17, 2022

Conversation

@aknuds1 (Contributor) commented on Apr 13, 2022:

What this PR does

Add compactor HTTP API for uploading TSDB blocks.
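To make the flow concrete, here is a minimal client-side sketch of the session-based upload discussed in this PR (start the session, upload the block's files, finish). The endpoint paths, the X-Scope-OrgID header, and the hard-coded file names are illustrative assumptions only; they are not the routes defined by this PR.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"path/filepath"
)

// uploadBlock sketches the three-step block upload session: start, per-file upload, finish.
// baseURL points at the component exposing the API; all paths/headers are assumptions.
func uploadBlock(baseURL, tenant, blockID, blockDir string) error {
	post := func(urlPath string, body io.Reader) error {
		req, err := http.NewRequest(http.MethodPost, baseURL+urlPath, body)
		if err != nil {
			return err
		}
		req.Header.Set("X-Scope-OrgID", tenant) // multi-tenancy header; an assumption here
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			return err
		}
		defer resp.Body.Close()
		if resp.StatusCode/100 != 2 {
			return fmt.Errorf("POST %s: unexpected status %s", urlPath, resp.Status)
		}
		return nil
	}

	// 1. Start the upload session by sending the block's meta.json.
	meta, err := os.Open(filepath.Join(blockDir, "meta.json"))
	if err != nil {
		return err
	}
	defer meta.Close()
	if err := post("/api/v1/upload/block/"+blockID+"/start", meta); err != nil {
		return err
	}

	// 2. Upload each block file; a real client would walk the block directory.
	for _, name := range []string{"index", "chunks/000001"} {
		f, err := os.Open(filepath.Join(blockDir, name))
		if err != nil {
			return err
		}
		err = post("/api/v1/upload/block/"+blockID+"/files?path="+url.QueryEscape(name), f)
		f.Close()
		if err != nil {
			return err
		}
	}

	// 3. Finish the session; only then is the block considered complete.
	return post("/api/v1/upload/block/"+blockID+"/finish", nil)
}
```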

TODOs

  • Add HTTP endpoints in cloud gateway
  • Add HTTP endpoints in GEM gateway
  • Incorporate @pracucci's feedback
  • Add tests
  • Add validation when creating a block upload session
  • Validate block metadata
  • Validate block files(?) (@colega suggestion)
  • Make sure that when starting a backfill, file lengths are sent with file index
  • Validate that block time range is within retention period (@aldernero working on this)
  • Validate minTime/maxTime in meta.json (@aldernero working on this)
  • Validate block ID on backfill start
  • Test output from sanitizeMeta
  • Mark user-uploaded blocks, for debugging and security(?) (this is in place through thanos.source property in meta.json)
  • Make sure that a backfill can be restarted, in case it got interrupted
  • Add/fix tests

Which issue(s) this PR fixes or relates to

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@pracucci (Collaborator) left a comment:

Thanks @aknuds1 for working on this! I did a very high level review, focusing on API design. Could you take a look at my comments, please?

Review threads (resolved): pkg/api/api.go (3), pkg/distributor/distributor.go (5), pkg/mimirpb/mimir.proto, cmd/mimirtool/main.go
@aknuds1 (Contributor, Author) commented on Apr 15, 2022:

Thanks for the review @pracucci! I will incorporate your feedback when I can start refining the PR. I'm concentrating on implementing the endpoint for finishing backfills now.

@pracucci (Collaborator) commented:

What happens when an upload is aborted? If the client interrupts the upload and will never recover it, the partial files will be stored in uploads/ forever. We may think of doing a cleanup in the compactor cleanup (not in this PR, but we need to design it - can you add it to the design doc, please?).

@aknuds1 (Contributor, Author) commented on Apr 15, 2022:

> What happens when an upload is aborted? If the client interrupts the upload and will never recover it, the partial files will be stored in uploads/ forever. We may think of doing a cleanup in the compactor cleanup (not in this PR, but we need to design it - can you add it to the design doc, please?).

@pracucci I have thought about the same issue, and was planning to return to it after implementing the basic scenario ("happy path"). Good thinking about letting the compactor cleanup handle it; I'll note it in the design doc.

Review thread (resolved): pkg/ingester/ingester.go
@aknuds1 force-pushed the feat/backfill branch 2 times, most recently from 46fa91c to 817b5a1, on April 25, 2022 at 11:37
@pstibrany (Member) commented on Apr 25, 2022:

What is the rationale behind using the distributor, gRPC, and ingester to upload blocks to the storage?

I would suggest using a component which already has access to the blocks storage. Personally I would suggest the compactor or the purger (which also handles the tenant deletion API, although "purger" doesn't exactly suggest "block uploads" to me). This component would not need to forward the messages over gRPC to yet another component, but could handle the API calls directly.

@aknuds1 (Contributor, Author) commented on Apr 25, 2022:

> What is the rationale behind using the distributor, gRPC, and ingester to upload blocks to the storage?
>
> I would suggest using a component which already has access to the blocks storage. Personally I would suggest the compactor or the purger (which also handles the tenant deletion API). This component would not need to forward the messages over gRPC to yet another component, but could handle the API calls directly.

@pstibrany It's because I figured the distributor would be the logical component for receiving the backfill requests. Would there be any (technical) drawbacks to putting the HTTP endpoints in the compactor or purger?

It seems a bit weird to me, from a logical PoV, to put backfill endpoints in the compactor or purger components; it's basically a type of ingestion, isn't it?

WDYT @pracucci?

@pstibrany (Member) commented on Apr 25, 2022:

> It's because I figured the distributor would be the logical component for receiving the backfill requests.

Distributor doesn't currently talk to blocks storage in any way.

> Would there be any drawbacks to putting the HTTP endpoints in the compactor or purger?

I don't think so, other than it seems "weird". The compactor already handles a lot of block maintenance work (cleanup, block index), but doesn't currently expose any API. "Purger" doesn't exactly mean "backfilling" in my mind. If we overlook these minor issues, I think both are a better choice than the distributor.

We may also end up with a completely separate "backfill" module, and only people interested in having the backfill API can run it. (Or a "blocks-api" module, and then we can also provide a blocks-reading API in the future.)

@aknuds1 (Contributor, Author) commented on Apr 25, 2022:

> > It's because I figured the distributor would be the logical component for receiving the backfill requests.
>
> Distributor doesn't currently talk to blocks storage in any way.

@pstibrany It forwards ingestion requests to the ingester though. My thinking was that backfill is another form of ingestion. I'm not sure what's technically best. I'm open to moving the functionality to another component. I think the compactor sounds the least weird of the two, logically.

> We may also end up with a completely separate "backfill" module, and only people interested in having the backfill API can run it. (Or a "blocks-api" module, and then we can also provide a blocks-reading API in the future.)

That could also make sense, although maybe overkill for this PR?

@pstibrany (Member) commented:

> I'm not sure what's technically best. I'm open to moving the functionality to another component. I think the compactor sounds the least weird of the two, logically.

I think the best option is one where we can leverage the HTTP protocol, so that the client can stream the data through some Mimir component directly to the storage, without any intermediate local files. That rules out the distributor (no access to blocks storage). Ingesters could handle the HTTP API calls directly (without going through distributors), but I don't think they should be wasting their CPU cycles on this, or on block verification.

> > We may also end up with a completely separate "backfill" module, and only people interested in having the backfill API can run it. (Or a "blocks-api" module, and then we can also provide a blocks-reading API in the future.)
>
> That could also make sense, although maybe overkill for this PR?

Perhaps, but I don't think the cost of having more modules is that high. Compactors have the advantage that they already have a local disk (we will need that to download the full block and validate it before successfully finishing the upload).

@aknuds1 (Contributor, Author) commented on Apr 25, 2022:

@pstibrany how about we move the endpoints to the compactor module then?

@pstibrany (Member) commented:

> @pstibrany how about we move the endpoints to the compactor module then?

Sounds good to me. It allows us to get rid of the gRPC middle step.

We can keep the backfill implementation separate from the compactor and only pass the bucket configuration to the HTTP handlers. That will make it simple to use from a different module in the future.

@aknuds1 (Contributor, Author) commented on Apr 25, 2022:

> > @pstibrany how about we move the endpoints to the compactor module then?
>
> Sounds good to me. It allows us to get rid of the gRPC middle step.
>
> We can keep the backfill implementation separate from the compactor and only pass the bucket configuration to the HTTP handlers. That will make it simple to use from a different module in the future.

@pstibrany Sounds good! I'll just verify tomorrow with @pracucci that he agrees.

@pracucci (Collaborator) commented:

We're working to simplify the Mimir architecture and operations. From this perspective, it's easier for users to understand that there are only 2 ingress components in Mimir:

  • Write path: distributor
  • Read path: query-frontend

Does the backfill API write or read? It writes. Then I would keep the API exposed by the distributor. Then, internally, we can route requests to other components. I'm fine routing to the compactor instead of the ingester (I think it makes sense), but I would keep the API exposed by the distributor. This also allows for some traffic sharding (e.g. shuffle-sharding).

For the same reason, I would avoid adding any new component for backfilling.

About the purger: the existence of this component is a legacy of the chunks storage. I believe it shouldn't even exist in Mimir anymore (and its features should be merged into other already-existing components), but that's a separate issue to discuss.

@aknuds1 (Contributor, Author) commented on Apr 26, 2022:

Thanks for chiming in @pracucci! I'll keep the API as is then, while considering moving the block storage logic to the compactor.

@pracucci (Collaborator) commented:

> Does the backfill API write or read? It writes. Then I would keep the API exposed by the distributor.

I would like to hear @09jvilla's opinion too. Exposing the API directly from the compactor is easier from an implementation perspective. My main take is about simplifying the architecture for the end user, but that doesn't necessarily simplify the implementation for us.

@pstibrany (Member) commented:

> Does the backfill API write or read? It writes. Then I would keep the API exposed by the distributor. Then, internally, we can route requests to other components. I'm fine routing to the compactor instead of the ingester (I think it makes sense), but I would keep the API exposed by the distributor.

If it is the distributor that exposes the API, and we don't want distributors doing the upload to the storage directly (which would be an option too, but I don't think we should do that), then the distributor needs to reroute the request. We use gRPC for this rerouting. gRPC introduces extra complication and forces us to reimplement streaming of the body from the HTTP request via gRPC. If we didn't introduce gRPC to the mix, we could simply pass the io.Reader from the HTTP request directly to the object client's Upload method. In other words, rerouting from the distributor to another component via gRPC seems like an unnecessary complication, which is why I disagree with using the distributor to expose the API.
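To illustrate that streaming point, here is a minimal sketch of a handler that passes the request body straight to the object storage client. It assumes the Thanos objstore Bucket interface and gorilla/mux routing; the type, route layout, and staging layout are illustrative assumptions, not the code from this PR.

```go
package blockupload

import (
	"net/http"
	"path"

	"github.com/gorilla/mux"
	"github.com/thanos-io/objstore"
)

// blockUploadHandler holds only a bucket client, following the idea of passing
// just the bucket configuration to the HTTP handlers rather than a whole component.
type blockUploadHandler struct {
	bucket objstore.Bucket
}

// uploadFile streams an uploaded block file straight from the HTTP request body
// into object storage. Because r.Body is an io.Reader, it can be handed directly
// to Bucket.Upload with no intermediate local file and no gRPC hop.
func (h *blockUploadHandler) uploadFile(w http.ResponseWriter, r *http.Request) {
	blockID := mux.Vars(r)["block"]       // block ULID from a route like /upload/block/{block}/files
	filePath := r.URL.Query().Get("path") // e.g. "index" or "chunks/000001"
	if filePath == "" {
		http.Error(w, "missing path parameter", http.StatusBadRequest)
		return
	}

	dst := path.Join(blockID, filePath) // destination object name; layout is illustrative
	if err := h.bucket.Upload(r.Context(), dst, r.Body); err != nil {
		http.Error(w, err.Error(), http.StatusBadGateway)
		return
	}
	w.WriteHeader(http.StatusOK)
}
```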

Another thing to consider: the component handling the upload will need to perform validation of the block during the finish step. This means downloading the block locally and performing various checks on the index and chunks, to make sure we don't let a wrong block in. I don't think ingesters should be doing this validation; it's IO- and CPU-intensive, and ingesters have better things to do. Similarly, distributors would be a poor choice. I think the compactor makes sense for this.

tl;dr: I'm still in favor of directly exposing the upload HTTP API from the compactor, without going through the distributor first.

@09jvilla what is your take?

@aknuds1 (Contributor, Author) commented on Apr 26, 2022:

@pstibrany @pracucci I implemented Andy's/Peter's proposal to upload blocks directly to the destination, instead of going via a staging directory ("uploads"). I ensure that the block is considered unfinished while uploading by only writing meta.json when completing the block upload session.

A great benefit of this is that I don't have to fork Thanos to implement a bucket move method, which would actually be a lot of work due to all the different bucket backends.
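A minimal sketch of that "meta.json last" ordering, assuming a Thanos objstore bucket; the function name and exact object layout are illustrative, not the code from this PR.

```go
package blockupload

import (
	"bytes"
	"context"
	"path"

	"github.com/thanos-io/objstore"
)

// finishUpload sketches the ordering described above: all other block files are
// uploaded first, and writing meta.json is the final "commit" step. Until meta.json
// exists, readers treat the block as incomplete and skip it, so no staging directory
// or bucket-level move is needed.
func finishUpload(ctx context.Context, bkt objstore.Bucket, blockID string, meta []byte) error {
	return bkt.Upload(ctx, path.Join(blockID, "meta.json"), bytes.NewReader(meta))
}
```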

@09jvilla (Contributor) commented on Apr 28, 2022:

Warning - long thread below. I've put in a TLDR at the bottom if you're in a hurry.

  1. If we expose it via the distributor, the argument is that it simplifies the interface for the user. They only need to think about the distributor as the single ingress component for the system (regardless of whether those metrics are realtime or historic). However, does it add complexity to what the user has to configure upfront? In other words, does that extra grpc connection and request forwarding that has to happen add some extra configuration burden for the user? If it does, that might mitigate the benefits of the distributor as the single interface. If it doesn't, just ignore this point.

  2. In general I would rather we take on the complexity for our users, so us doing a little more implementation work upfront to save our users some cognitive load is worth it. This would again argue for having it exposed via the distributor. However, if this is going to make issues harder for users to debug (because of the extra distributor-to-compactor hop), then it may add just as many problems as it solves.

  3. It's hard to say what the implications would be for the ease-of-use effort. In monolithic mode, this doesn't really matter that much, right? Both the compactor and the distributor are deployed in the same process (sure, it changes the API route, but the base address is the same). Depending on how you do a miniservices implementation (e.g., put both the distributor and compactor on target=write), this might also be true.

  4. Let's say we do downsampling in the compactor at some point. That probably would be configured via an HTTP API on the compactor, right? We could then frame it to the user as:

  • distributor --> ingress for write path (i.e. 'live data')
  • query-frontend --> ingress for read path
  • compactor --> handles historic-data functions (custom retention, downsampling, deletion of data, and import of old data)

So you can keep it relatively simple for the user while still having this live on the compactor.

TLDR: In most cases, this should be a one-time action the user is taking. ("As a user, I need to spin up a new cluster and want to pull in my historic data.") There are some folks like Adobe who need to periodically import historic data, but I think this is the less common case.

Because it's not a super frequent action for the user to take, I'm not that worried that it will add a lot of user confusion (plus I tried to make some arguments in points 1-4 that maybe the confusion isn't as bad as we think).

Given that, I would vote in favor of what is simpler for us to maintain in the long term, which, if I'm reading this thread correctly, would be putting it on the compactor.

@aknuds1 (Contributor, Author) commented on Apr 28, 2022:

Thanks @09jvilla.

> Warning - long thread below. I've put in a TLDR at the bottom if you're in a hurry.
>
> 1. If we expose it via the distributor, the argument is that it simplifies the interface for the user. They only need to think about the distributor as the single ingress component for the system (regardless of whether those metrics are realtime or historic). However, does it add complexity to what the user has to configure upfront? In other words, does that extra grpc connection and request forwarding that has to happen add some extra configuration burden for the user? If it does, that might mitigate the benefits of the distributor as the single interface. If it doesn't, just ignore this point.

I'm not aware of any extra configuration required, at least none has been introduced so far in the PR.

> 2. In general I would rather we take on the complexity for our users, so us doing a little more implementation work upfront to save our users some cognitive load is worth it. This would again argue for having it exposed via the distributor. However, if this is going to make issues harder for users to debug (because of the extra distributor-to-compactor hop), then it may add just as many problems as it solves.

I'd need input from the others (Marco, Peter, others?) on how much extra maintenance effort the (gRPC) hop from distributor to ingester might mean in practice.

@pstibrany (Member) commented:

From the user's point of view, there is no extra complexity due to using gRPC. The extra complexity is only in our code.

We already have APIs exposed by different components (distributor, query-frontend, alertmanager, ruler), so adding the compactor to the mix with a backfilling API (and perhaps also the tenant deletion API, which currently lives in the "purger" component) doesn't seem to make the situation much worse.

@aknuds1 changed the title from "Add compactor HTTP API for backfilling blocks" to "Add compactor HTTP API for uploading blocks" on Jun 16, 2022
@aknuds1 changed the title from "Add compactor HTTP API for uploading blocks" to "Add compactor HTTP API for uploading TSDB blocks" on Jun 16, 2022
@aknuds1 requested a review from @pstibrany on June 16, 2022 at 08:38
@@ -30,6 +30,7 @@
* [ENHANCEMENT] Chunk Mapper: reduce memory usage of async chunk mapper. #2043
* [ENHANCEMENT] Ingesters: Added new configuration option that makes it possible for mimir ingesters to perform queries on overlapping blocks in the filesystem. Enabled with `-blocks-storage.tsdb.allow-overlapping-queries`. #2091
* [ENHANCEMENT] Ingester: reduce sleep time when reading WAL. #2098
* [ENHANCEMENT] Compactor: Add HTTP API for uploading TSDB blocks. #1694
@pstibrany (Member) commented on CHANGELOG.md:

Let's also mention that the feature is experimental (here: docs/sources/operators-guide/configuring/about-versioning.md:35) for now, as the API may still change a bit (I expect that validation will modify the "finish block" call to return progress updates).

@aknuds1 (Contributor, Author) replied:

Thanks for the heads up @pstibrany, done. PTAL.

@pstibrany (Member) left a comment:

Thank you. I think we're ready to merge this PR once the last comments are fixed.

Review threads (resolved): pkg/compactor/block_upload.go (3), pkg/compactor/block_upload_test.go
@aknuds1 requested a review from @pstibrany on June 16, 2022 at 13:55
@pstibrany (Member) left a comment:

Thank you very much for your effort on this PR!

@aknuds1 (Contributor, Author) commented on Jun 17, 2022:

Thanks for the thorough and quick reviews @pstibrany!

@aknuds1 merged commit 94f00f8 into main on Jun 17, 2022
@aknuds1 deleted the feat/backfill branch on June 17, 2022 at 07:43
masonmei pushed a commit to udmire/mimir that referenced this pull request on Jul 11, 2022:
* Compactor: Add HTTP API for uploading TSDB blocks

Signed-off-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Peter Štibraný <pstibrany@gmail.com>
Co-authored-by: aldernero <vernon.w.miller@gmail.com>