Load Balancing by Attribute #33660

danielbanks · 2024-06-19T16:48:04Z

Component(s)

exporter/loadbalancing

Is your feature request related to a problem? Please describe.

In our project we want to set up load-balanced sampling based on a session ID attribute.

In our project, we care about RUM and we have the concept of a user session with a session ID set as an attribute on our traces. We sample based on the session ID rather than the trace ID. This way we preserve telemetry of the whole user session.

The sampling is working fine but it is client-side head sampling. As we look to scale our solution we want to move this to the collector and introduce load balancing.

How do we achieve session sampling?

Right now we generate session IDs with the same ID generator we use for trace IDs, and we set the session ID as an attribute. Then we have a head sampling strategy that uses the same logic as the probabilistic head sampler, but rather than applying the decision to the trace ID we apply it to the session ID. This ensures we make the same sampling decision for the whole user session.
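For illustration, a session-keyed probabilistic decision can be sketched roughly as below. This is a minimal sketch, not our actual client code: the `sampleSession` helper, the FNV-1a hash, and the 10000-bucket granularity are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// sampleSession reports whether a session should be kept at the given
// sampling percentage (0-100). The decision depends only on the session ID,
// so every span or log record in that session gets the same answer.
// Hypothetical helper: FNV-1a and the 10000-bucket granularity are
// illustrative choices, not what any SDK or collector is known to use.
func sampleSession(sessionID string, percent float64) bool {
	h := fnv.New32a()
	h.Write([]byte(sessionID)) // Write on a hash.Hash never returns an error
	bucket := h.Sum32() % 10000
	return float64(bucket) < percent*100
}

func main() {
	for _, id := range []string{"session-a", "session-b", "session-c"} {
		fmt.Printf("%s sampled=%v\n", id, sampleSession(id, 25))
	}
}
```

Because the function has no state, calling it anywhere with the same session ID and percentage yields the same decision.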

What we would like to do is move these sampling decisions off the client and into the collector so that we have more flexibility. Our client is an Android application and making these decisions client side is not a long-term solution because we have to deal with application updates etc.

Following the recommended practice, we would like to have a two-layer collector setup, with the first layer load balancing across the second. The issue is that the load balancer only supports decisions based on trace ID or service name.

Given that we want to sample based on session ID (an attribute), then making load balancing decisions on trace ID alone is not enough. We need to load balance telemetry with the same session ID to the same collector instance so that consistent sampling decisions can be made.
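To make the requirement concrete, here is a rough sketch of mapping a session ID to a second-layer collector. The `pickBackend` helper and the endpoint names are hypothetical, and plain hash-modulo is used only for brevity; the exporter's real backend selection may work differently.

```go
package main

import (
	"errors"
	"fmt"
	"hash/fnv"
)

// pickBackend maps a routing key (here, a session ID) onto one of the
// second-layer collector endpoints. Hypothetical helper: hash-modulo
// reshuffles most keys whenever the backend list changes, which a
// consistent-hash ring would largely avoid.
func pickBackend(routingKey string, backends []string) (string, error) {
	if len(backends) == 0 {
		return "", errors.New("no backends configured")
	}
	h := fnv.New32a()
	h.Write([]byte(routingKey))
	return backends[int(h.Sum32()%uint32(len(backends)))], nil
}

func main() {
	// Endpoint names are made up for the example.
	backends := []string{"collector-1:4317", "collector-2:4317", "collector-3:4317"}
	for _, id := range []string{"session-a", "session-b", "session-a"} {
		b, err := pickBackend(id, backends)
		fmt.Println(id, b, err)
	}
}
```

All telemetry carrying the same session ID lands on the same endpoint as long as the backend list is unchanged.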

It doesn't look like the load balancer currently supports balancing based on an attribute. This is a friendly request to add it!

Describe the solution you'd like

The ability to route telemetry based on attributes, in addition to service name and trace ID.

Describe alternatives you've considered

No response

Additional context

No response

github-actions · 2024-06-19T16:49:14Z

Pinging code owners:

exporter/loadbalancing: @jpkrohling

See Adding Labels via Comments if you do not have permissions to add labels yourself.

jpkrohling · 2024-06-20T09:22:55Z

I believe there are a couple of comments to this:

Balancing based on an arbitrary attribute is doable, and we are already doing that for the service name. It should be easy to extend this function to do that:

opentelemetry-collector-contrib/exporter/loadbalancingexporter/trace_exporter.go

Lines 135 to 165 in 2aa0e6b

    
    func routingIdentifiersFromTraces(td ptrace.Traces, key routingKey) (map[string]bool, error) {
    	ids := make(map[string]bool)
    	rs := td.ResourceSpans()
    	if rs.Len() == 0 {
    		return nil, errors.New("empty resource spans")
    	}
    	ils := rs.At(0).ScopeSpans()
    	if ils.Len() == 0 {
    		return nil, errors.New("empty scope spans")
    	}
    	spans := ils.At(0).Spans()
    	if spans.Len() == 0 {
    		return nil, errors.New("empty spans")
    	}
    	if key == svcRouting {
    		for i := 0; i < rs.Len(); i++ {
    			svc, ok := rs.At(i).Resource().Attributes().Get("service.name")
    			if !ok {
    				return nil, errors.New("unable to get service name")
    			}
    			ids[svc.Str()] = true
    		}
    		return ids, nil
    	}
    	tid := spans.At(0).TraceID()
    	ids[string(tid[:])] = true
    	return ids, nil
    }
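An attribute-based branch could follow the same shape as the service-name branch above. The sketch below is standalone and uses a simplified `span` struct instead of the collector's pdata types so it runs on its own; the function name, its signature, the `session.id` key, and the choice to fail on a missing attribute are all assumptions, not existing exporter code.

```go
package main

import (
	"errors"
	"fmt"
)

// span is a simplified stand-in for a pdata span: only the attribute map
// matters for this sketch. It is not the collector's real type.
type span struct {
	Attributes map[string]string
}

// routingIdentifiersFromAttribute collects the distinct values of attrKey
// across all spans, mirroring how the service-name branch collects distinct
// service names. Hypothetical function: returning an error when a span lacks
// the attribute is one possible policy; falling back to trace ID is another.
func routingIdentifiersFromAttribute(spans []span, attrKey string) (map[string]bool, error) {
	if len(spans) == 0 {
		return nil, errors.New("empty spans")
	}
	ids := make(map[string]bool)
	for _, s := range spans {
		v, ok := s.Attributes[attrKey]
		if !ok {
			return nil, fmt.Errorf("unable to get attribute %q", attrKey)
		}
		ids[v] = true
	}
	return ids, nil
}

func main() {
	// "session.id" is an assumed attribute key for the example.
	spans := []span{
		{Attributes: map[string]string{"session.id": "s1"}},
		{Attributes: map[string]string{"session.id": "s2"}},
		{Attributes: map[string]string{"session.id": "s1"}},
	}
	ids, err := routingIdentifiersFromAttribute(spans, "session.id")
	fmt.Println(len(ids), err)
}
```

Each distinct identifier returned would then be used as a routing key, the same way service names are today.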

I'm not quite sure you need two layers: if you are doing probabilistic sampling based on the session ID, it's pretty much the same idea we have for probabilistic sampling at the collector. That means the decision can be consistent across collector instances without having to route all spans for a session ID to the same instance. So, you might not need the balancer to know about session IDs at all.

danielbanks · 2024-07-02T10:54:54Z

Thanks for the reply @jpkrohling. That's useful insight.

I'd like to move our probabilistic sampling of sessions into the collector rather than keeping it client-side. However, the sampler configuration only allows a custom attribute to be specified for logs, not traces. Our target solution is load-balanced telemetry across logs and traces, sampled by complete session. We want to observe users' sessions so that we can understand the full journey.

Do you have any recommendations for how this can be achieved with the current tooling?

jpkrohling · 2024-07-05T11:57:16Z

Take a look at the code for the probabilistic sampling processor in contrib. It could be changed to use specific attributes instead of the trace ID, which would be sufficient for your use case, if I'm understanding it correctly.

danielbanks added the enhancement and needs triage labels Jun 19, 2024

github-actions bot added the exporter/loadbalancing label Jun 19, 2024

