sidecar: Enabled uploading all compaction levels if compaction is disabled. #207
bwplotka wants to merge 4 commits into
Conversation
Hey @Bplotka I think it's caused by the rebase here prometheus/prometheus#3671. Cheers
@V3ckt0r thanks, rebased. Make sure you wipe out the NOT uploaded blocks from
@Bplotka yup, that has done the trick, thanks. After nuking my
As you can see, it took a few minutes to upload about ~7 GB worth of data.
@fabxc mind having a quick look?
This generally seems sufficient. However, while this will work fine for backfilling already-compacted data, it is still not safe to have Prometheus keep compacting data. Namely the case where Prometheus creates a new block and immediately afterwards compacts it together with a few other blocks.

We do create hardlink copies of a block before copying it. But even during that hardlinking there may be a race where files suddenly disappear from underneath us.

Now, with the compactor running, I think the combination of the sidecars and bucket garbage collection will still converge back to a correct state. But this absolutely needs more reasoning, explicit handling in the sidecar, and testing.

For the sidecar's handling there are various options. I'm not 100% sure which one would be the simplest and safest. I'd like to avoid having the sidecar call TSDB hooks (like enable/disable compaction) in Prometheus, or even do compactions itself.
True, all sorts of races can happen.
Will take a look at how to prevent these race conditions later this week in this PR.
Hey guys, if I'm understanding the problem space correctly, we are talking about race conditions for the most recent data? For instance, given:
We're talking about the tail end, such as
Shipping off the compaction level 2s straight away should be fine. For the more recent blocks that have compaction level 1, we could introduce some handling that looks at the
The problem is not knowing what to ship (e.g. by looking at min/max times), but rather that any read or write operation we make against the storage dir is subject to races with Prometheus's write operations.
oh, right. I see. hmmm 🤔
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
As discussed offline we need:
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
PTAL
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
Should we merge this (quite a bit of code), or rather attempt to integrate it with the additions we made in Prometheus 2.2?
Let's maybe close this, or iterate on it to make sure the Prometheus version with delayed compaction is there; if it is, we can run the upload. Otherwise just block all
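The "only upload when local compaction is effectively disabled" gate mentioned here could be approximated by comparing Prometheus's TSDB block-duration flags, e.g. as reported by the `/api/v1/status/flags` endpoint available since Prometheus 2.2. A hedged sketch; the function name and exact wiring are illustrative, and fetching the flags over HTTP is omitted.

```go
package main

import "fmt"

// compactionDisabled reports whether Prometheus local compaction is
// effectively off: when --storage.tsdb.min-block-duration equals
// --storage.tsdb.max-block-duration, TSDB never merges blocks into
// larger ones, so the sidecar can upload without racing the compactor.
// The map is assumed to hold flag values as returned by Prometheus's
// /api/v1/status/flags endpoint.
func compactionDisabled(flags map[string]string) bool {
	minDur := flags["storage.tsdb.min-block-duration"]
	maxDur := flags["storage.tsdb.max-block-duration"]
	return minDur != "" && minDur == maxDur
}

func main() {
	flags := map[string]string{
		"storage.tsdb.min-block-duration": "2h",
		"storage.tsdb.max-block-duration": "2h",
	}
	fmt.Println(compactionDisabled(flags)) // true
}
```

With a check like this, the sidecar can refuse (or warn and block) uploads against a Prometheus whose compactor is still running, instead of racing it.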
Explained the new approach in #206
Fixes #206
@V3ckt0r
Signed-off-by: Bartek Plotka bwplotka@gmail.com