Allow a custom label for sharding. #6437

Open

arnecls opened this issue Mar 26, 2024 · 12 comments

Comments
@arnecls

arnecls commented Mar 26, 2024

Component(s)

Prometheus

What is missing? Please describe.

Currently, the relabeling rule that assigns endpoints to shards is hardcoded to __address__.

return generateAddressShardingRelabelingRulesWithSourceLabel(relabelings, shards, "__address__")
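For context, here is a simplified sketch of what such a helper generates: a hashmod rule on the given source label plus a keep rule matching the shard's own number (this mirrors the hashmod/keep relabelings shown further down in this thread; names and details are simplified, not the operator's verbatim code).

package sharding

import yaml "gopkg.in/yaml.v2"

// Simplified sketch, not the operator's verbatim implementation: append a
// hashmod relabeling on the given source label, followed by a keep rule that
// only matches the shard's own number.
func generateShardingRelabelings(relabelings []yaml.MapSlice, shards int32, sourceLabel string) []yaml.MapSlice {
	relabelings = append(relabelings, yaml.MapSlice{
		{Key: "source_labels", Value: []string{sourceLabel}},
		{Key: "target_label", Value: "__tmp_hash"},
		{Key: "modulus", Value: shards},
		{Key: "action", Value: "hashmod"},
	})
	return append(relabelings, yaml.MapSlice{
		{Key: "source_labels", Value: []string{"__tmp_hash"}},
		{Key: "regex", Value: "$(SHARD)"}, // substituted with the shard's own number for each shard
		{Key: "action", Value: "keep"},
	})
}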

We would like to be able to change the label and/or the method to, for example, allow zone-aware sharding.
If all endpoints expose a certain label, let's say zone, either directly or through rewrite rules, one could use this label instead of __address__. This would allow more control over how endpoints are distributed amongst the available shards.

In addition to this, it would be nice to switch from hash-based sharding to "value-based sharding", where each shard reacts to a specific value. This could be used to, for example, assign a specific shard to a specific zone, removing the need to find out which shard is scraping which zone after hashing.

Describe alternatives you've considered.

We're not aware of any alternative method to inject a "per-shard" rewrite rule into all metrics handled by a specific shard.

Environment Information.

Environment

Kubernetes Version: -
Prometheus-Operator Version: 0.72.0

arnecls added the kind/feature and needs-triage labels on Mar 26, 2024
simonpasquier added the area/sharding label and removed the needs-triage label on Mar 26, 2024
@simonpasquier
Contributor

I thought we already had an issue about this but I can't find it. Definitely 👍 for the idea of zone sharding.
Can you elaborate a bit more on how this could work? My understanding is that to shard on zones, Prometheus needs to attach the node's metadata to the targets.
Also, do you foresee other target labels which would be good candidates for sharding?

@arnecls
Author

arnecls commented Mar 26, 2024

Can you elaborate a bit more on how this could work? My understanding is that to shard on zones, Prometheus needs to attach the node's metadata to the targets.

The main issue here is that zones are cloud-provider specific and not all scrape configs support the same set of labels.

My idea (for our case) is to have re-labeling rules for the different cases which all generate a unified label. That would only work, though, if dropping happens after these rules have been evaluated.

For pod-based metrics, the idea was to use Kyverno (or some other suitable controller) to add a label to each pod containing the zone. A rule for this case could look like this:

  - source_labels: [__meta_kubernetes_pod_label_zone] # generated by us, same for all shards
    separator: ;
    regex: (.+)
    target_label: zone
    replacement: ${1}
    action: replace
  # ... more rewrite rules creating zone from other labels
  - source_labels: [zone] # generated by operator per shard
    separator: ;
    regex: (.*)
    modulus: 1 # would be the total number of shards
    target_label: __tmp_hash
    replacement: $1
    action: hashmod
  - source_labels: [__tmp_hash]
    separator: ;
    regex: "0"
    replacement: $1
    action: keep

That would still be a bit fiddly, as the shards would need to be placed in the correct zone based on the output of the hash.

If we had an "equality" match for the zone label, the placement would be simpler.
However, we would either need to assign a zone value to each shard (which is doable via helm, but a bit tricky) or convert the zone to a number when generating the label.
Assuming we did the latter, e.g. by using Kyverno, the following would work.

  - source_labels: [__meta_kubernetes_pod_label_zone]  # generated by us, same for all shards
    separator: ;
    regex: (.+)
    target_label: zone
    replacement: ${1}
    action: replace
  # ... more rewrite rules creating zone from other labels
  - source_labels: [zone] # generated by operator per shard
    separator: ;
    regex: "0"
    replacement: $1
    action: keep

From the Prometheus / PrometheusAgent CRDs, I'd expect to be able to set the following:

shardingOn: "zone"
shardingByHash: false

As for the "helm could totally generate the equality check" part, you could add something like this:

shardValues:
- europe-west4-a
- europe-west4-b
- europe-west4-c

When generating the per-shard rewrite rule, you could then set something like regex: shardValues[shardNumber % len(shardValues)].
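A minimal sketch of that per-shard lookup, assuming the hypothetical shardValues and shardNumber inputs from the example above (not existing operator fields):

package sharding

import "regexp"

// Hypothetical sketch: compute the regex a given shard would use to keep
// only "its" zone value.
func shardKeepRegex(shardValues []string, shardNumber int) string {
	// QuoteMeta escapes the value, since it ends up inside a relabel regex.
	return regexp.QuoteMeta(shardValues[shardNumber%len(shardValues)])
}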

@nicolastakashi
Contributor

My two cents!
I do think we should shard on the target by default, but users should be free to shard by any other label.
This would bring more flexibility to users, and they could experiment with which label provides a better target distribution for their needs.

@mviswanathsai
Contributor

I think we would need to create the new field, get hold of it in each of the config-generating functions (for ServiceMonitors, PodMonitors, etc.) and pass the custom ShardingOn value in place of __address__. Along with this, we would add corresponding validation for the field.
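A rough sketch of that wiring, assuming a hypothetical ShardingOn field on the spec (the field name and the fallback are assumptions, not existing operator code):

package sharding

// Hypothetical sketch: pick the sharding source label from an assumed
// ShardingOn field, falling back to today's __address__ behaviour.
func shardingSourceLabel(shardingOn string) string {
	if shardingOn == "" {
		return "__address__"
	}
	return shardingOn
}

// The existing call site would then become something like:
//   relabelings = generateAddressShardingRelabelingRulesWithSourceLabel(
//       relabelings, shards, shardingSourceLabel(spec.ShardingOn))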

What do you think? @simonpasquier

@simonpasquier
Contributor

simonpasquier commented Mar 29, 2024

Let me try to summarize the requirements:

  • Users should be able to shard targets by any custom label (instead of just the target's address).
  • When it's possible to create a label (e.g. "zone") for which the value maps directly to the shard number, the hashmod function isn't necessary.

I could see something like:

spec:
  shardingStrategy:
    # ShardingByTarget == current implementation. There's no additional option.
    type: ShardingByTarget|ShardingByLabel
    shardingByLabel: # Only used if type=ShardingByLabel
      # HashMod: hashmod applied to the label value then the target is kept if the hash value == the shard number, 
      # None: the target is kept if the label value == the shard number.
      transform: HashMod|None
      label: <string>
      relabels: [...] # Additional relabel configs that may be required to build the shard label.

In the case of @arnecls, it would turn into:

spec:
  shardingStrategy:
    type: ShardingByLabel
    shardingByLabel:
      transform: None
      label: zone
      relabels:
      - sourceLabels: [__meta_kubernetes_pod_label_zone]
        targetLabel: zone
      - ...
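For reference, a hypothetical sketch of how the shardingStrategy block above could look as Go API types (type and field names are illustrative, not existing operator API; only RelabelConfig is the operator's existing type):

package sharding

import monitoringv1 "github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring/v1"

// Hypothetical API sketch mirroring the YAML above; not existing operator types.
type ShardingStrategy struct {
	// Type is either "ShardingByTarget" (the current implementation) or "ShardingByLabel".
	Type string `json:"type"`
	// ShardingByLabel is only used when Type == "ShardingByLabel".
	ShardingByLabel *ShardingByLabel `json:"shardingByLabel,omitempty"`
}

type ShardingByLabel struct {
	// Transform is either "HashMod" or "None".
	Transform string `json:"transform"`
	// Label is the target label whose value drives the shard assignment.
	Label string `json:"label"`
	// Relabels holds additional relabel configs needed to build the shard label.
	Relabels []monitoringv1.RelabelConfig `json:"relabels,omitempty"`
}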

Open questions

  • Probes use the __param_target label. Do we need something special when sharding by a custom label?
  • Investigate sharding implementation for additionalScrapeConfigs.
  • If we focus on sharding by zone, there would be no guarantee that Prometheus pods scrape targets from the same zone IIUC. Is it a problem?

@ArthurSens
Member

Open questions

  • Investigate sharding implementation for additionalScrapeConfigs.

I believe we want to deprecate additionalScrapeConfigs now that the ScrapeConfig CRD is a thing. Couldn't we just ignore it, or document that?

@mviswanathsai
Contributor

mviswanathsai commented Apr 4, 2024

Probes use the __param_target label. Do we need something special when sharding by a custom label?

I am not sure I follow what you are trying to say. The __param_target label is used to pass the targets as parameters, so we would not want the user to write to that label. Are you referring to that? How is sharding on this label a problem? It would be a problem if we were writing to this label, but that is not the case here, IIUC.

@simonpasquier
Contributor

@mviswanathsai

Probe targets are sharded differently from "regular" targets. How would it work in the case of custom sharding?

func generateAddressShardingRelabelingRulesForProbes(relabelings []yaml.MapSlice, shards int32) []yaml.MapSlice {
	return generateAddressShardingRelabelingRulesWithSourceLabel(relabelings, shards, "__param_target")
}

relabelings = generateAddressShardingRelabelingRulesForProbes(relabelings, shards)
cfg = append(cfg, yaml.MapItem{Key: "relabel_configs", Value: relabelings})

@arnecls
Author

arnecls commented Apr 8, 2024

If we focus on sharding by zone, there would be no guarantee that Prometheus pods scrape targets from the same zone IIUC. Is it a problem?

That's actually a good point.
We cannot set a nodeSelector config per shard yet.

This could be solved by something like this (based on the previous suggestion):

spec:
  shardingStrategy:
    perShardNodeAffinity:
      label: 'topology.kubernetes.io/zone'
      values:
        - europe-west4-a
        - europe-west4-b
  # ...

The value of the given label could then be calculated via:

affinityValues := spec.shardingStrategy.perShardNodeAffinity.values
labelValue := affinityValues[shardNumber % len(affinityValues)]
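A rough sketch of how the operator could turn that into a per-shard node affinity (the perShardNodeAffinity field and the helper are assumptions based on the sketch above; only the corev1 types are real):

package sharding

import corev1 "k8s.io/api/core/v1"

// Hypothetical sketch: build the node affinity for a single shard from the
// assumed perShardNodeAffinity settings (label + values) shown above.
func nodeAffinityForShard(label string, values []string, shard int) *corev1.NodeAffinity {
	value := values[shard%len(values)]
	return &corev1.NodeAffinity{
		RequiredDuringSchedulingIgnoredDuringExecution: &corev1.NodeSelector{
			NodeSelectorTerms: []corev1.NodeSelectorTerm{{
				MatchExpressions: []corev1.NodeSelectorRequirement{{
					Key:      label, // e.g. topology.kubernetes.io/zone
					Operator: corev1.NodeSelectorOpIn,
					Values:   []string{value},
				}},
			}},
		},
	}
}

The operator would then inject this affinity into the pod template of the corresponding shard's StatefulSet.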

@simonpasquier
Contributor

We've discussed this feature request during today's office hours and the general consensus was that we'd need a design proposal to move forward (https://prometheus-operator.dev/docs/prologue/contributing/#proposal-process).

@arnecls
Author

arnecls commented Apr 8, 2024

@simonpasquier if I understand that link (and the repository structure) correctly, a proposal requires a PR with a single file in Documentation/proposals, structured as described in the link. Correct?

As I initiated this issue, shall I write this doc? How do we collaborate on this? Is there a deadline/timeframe?

@simonpasquier
Contributor

@arnecls yes you're right: we need someone to submit a PR to Documentation/proposals (here is one example: #5961).
If you're willing to contribute, it's even better :)
