Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[jaeger-v2] Add remotesampling extension #5389

Draft
wants to merge 39 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from 31 commits
Commits
Show all changes
39 commits
Select commit Hold shift + click to select a range
1764c4c
init
Pushkarm029 Apr 25, 2024
ac859c7
fix
Pushkarm029 Apr 25, 2024
0b53417
added README.md
Pushkarm029 Apr 25, 2024
a7ba921
Merge branch 'main' into sampling_extension
Pushkarm029 Apr 25, 2024
5f267d0
updated README.md
Pushkarm029 Apr 26, 2024
901f76a
fix
Pushkarm029 Apr 26, 2024
7a03553
fix
Pushkarm029 Apr 26, 2024
36f38d2
Merge branch 'main' into sampling_extension
Pushkarm029 Apr 26, 2024
cb0d857
[e2e tests] Ping v2 binary to be ready before running tests (#5382)
dependabot[bot] Apr 26, 2024
67f2203
[agent] Use grpc.NewClient (#5392)
yurishkuro Apr 27, 2024
9fcf052
Bump anchore/sbom-action from 0.15.10 to 0.15.11 (#5395)
dependabot[bot] Apr 29, 2024
976dfb8
[jaeger-v2] Normalize config names (#5400)
yurishkuro Apr 30, 2024
cf8fe8d
Combine jaeger UI release notes with jaeger backend (#5405)
albertteoh May 1, 2024
a78782e
[jaeger-v2] Define an internal interface of storage v2 spanstore (#5399)
james-ryans May 1, 2024
6ba2c8a
Upgrade mockery from v2.14.0 to v2.42.3 (#5404)
Pushkarm029 May 1, 2024
487013e
Bump google.golang.org/protobuf from 1.33.0 to 1.34.0 (#5401)
dependabot[bot] May 1, 2024
98b5f99
Prepare releaes 1.57.0 (#5406)
albertteoh May 1, 2024
7c7aa94
Add `Purge` method for ES/OS (#5407)
Pushkarm029 May 2, 2024
d631ae9
[es] Remove unused indexCache (#5408)
yurishkuro May 2, 2024
96c4e33
[tests] Simplify integration tests (#5409)
yurishkuro May 3, 2024
a3a3f20
Only build Docker images for Crossdock tests for linux/amd64 (#5410)
varshith257 May 3, 2024
a15962b
Use helper action to retry codecov uploads (#5411)
yurishkuro May 3, 2024
fab02d8
[jaeger-v2] add elasticsearch & opensearch e2e integration test (#5345)
Pushkarm029 May 3, 2024
f1c253a
[v2] Add diagrams to the docs (#5412)
yurishkuro May 3, 2024
3579fe3
Add Purge method for cassandra (#5414)
akagami-harsh May 4, 2024
b4ab638
Add missing mermaid markup (#5413)
yurishkuro May 4, 2024
74b07a0
Merge branch 'main' into sampling_extension
Pushkarm029 May 4, 2024
fcc5547
major changes
Pushkarm029 May 6, 2024
8d4c4b1
some fixes
Pushkarm029 May 6, 2024
076eb58
minor fix
Pushkarm029 May 7, 2024
2912e9a
some progress
Pushkarm029 May 8, 2024
8fd879d
Merge branch 'main' into sampling_extension
Pushkarm029 May 10, 2024
bcd90b7
Merge branch 'main' into sampling_extension
Pushkarm029 May 22, 2024
eb029f7
some changes
Pushkarm029 May 22, 2024
1631830
fix
Pushkarm029 May 22, 2024
28f0dc9
experimental config sharing
Pushkarm029 May 22, 2024
f39093b
fix
Pushkarm029 May 24, 2024
54a20fa
minor fix
Pushkarm029 May 25, 2024
9fb0797
Merge branch 'main' into sampling_extension
Pushkarm029 May 26, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
14 changes: 12 additions & 2 deletions cmd/jaeger/config-badger.yaml
Pushkarm029 marked this conversation as resolved.
Show resolved Hide resolved
Original file line number Diff line number Diff line change
@@ -1,12 +1,20 @@
service:
extensions: [jaeger_storage, jaeger_query]
extensions: [jaeger_storage, jaeger_query, remote_sampling]
pipelines:
traces:
receivers: [otlp]
processors: [batch]
processors: [batch, adaptivesampling]
exporters: [jaeger_storage_exporter]

extensions:
remote_sampling:
# You can either use file or adaptive sampling strategy in remote_sampling
file:
path: ./cmd/jaeger/sampling-strategies.json
# adaptive:
# initial_sampling_probability: 0.1
# strategy_store: badger_main

jaeger_query:
trace_storage: badger_main
trace_storage_archive: badger_archive
Expand Down Expand Up @@ -37,6 +45,8 @@ receivers:

processors:
batch:
adaptivesampling:
strategy_store: badger_main

exporters:
jaeger_storage_exporter:
Expand Down
15 changes: 13 additions & 2 deletions cmd/jaeger/config-cassandra.yaml
Original file line number Diff line number Diff line change
@@ -1,12 +1,20 @@
service:
extensions: [jaeger_storage, jaeger_query]
extensions: [jaeger_storage, jaeger_query, remote_sampling]
pipelines:
traces:
receivers: [otlp]
processors: [batch]
processors: [batch, adaptivesampling]
exporters: [jaeger_storage_exporter]

extensions:
remote_sampling:
# You can either use file or adaptive sampling strategy in remote_sampling
file:
path: ./cmd/jaeger/sampling-strategies.json
# adaptive:
# initial_sampling_probability: 0.1
# strategy_store: cassandra_main

jaeger_query:
trace_storage: cassandra_main
trace_storage_archive: cassandra_archive
Expand Down Expand Up @@ -35,6 +43,9 @@ receivers:

processors:
batch:
adaptivesampling:
strategy_store: cassandra_main


exporters:
jaeger_storage_exporter:
Expand Down
14 changes: 12 additions & 2 deletions cmd/jaeger/config-elasticsearch.yaml
Original file line number Diff line number Diff line change
@@ -1,12 +1,20 @@
service:
extensions: [jaeger_storage, jaeger_query]
extensions: [jaeger_storage, jaeger_query, remote_sampling]
pipelines:
traces:
receivers: [otlp]
processors: [batch]
processors: [batch, adaptivesampling]
exporters: [jaeger_storage_exporter]

extensions:
remote_sampling:
# You can either use file or adaptive sampling strategy in remote_sampling
file:
path: ./cmd/jaeger/sampling-strategies.json
# adaptive:
# initial_sampling_probability: 0.1
# strategy_store: es_main

jaeger_query:
trace_storage: es_main
trace_storage_archive: es_archive
Expand Down Expand Up @@ -35,6 +43,8 @@ receivers:

processors:
batch:
adaptivesampling:
strategy_store: es_main

exporters:
jaeger_storage_exporter:
Expand Down
14 changes: 12 additions & 2 deletions cmd/jaeger/config-opensearch.yaml
Original file line number Diff line number Diff line change
@@ -1,12 +1,20 @@
service:
extensions: [jaeger_storage, jaeger_query]
extensions: [jaeger_storage, jaeger_query, remote_sampling]
pipelines:
traces:
receivers: [otlp]
processors: [batch]
processors: [batch, adaptivesampling]
exporters: [jaeger_storage_exporter]

extensions:
remote_sampling:
# You can either use file or adaptive sampling strategy in remote_sampling
file:
path: ./cmd/jaeger/sampling-strategies.json
# adaptive:
# initial_sampling_probability: 0.1
# strategy_store: os_main

jaeger_query:
trace_storage: os_main
trace_storage_archive: os_archive
Expand Down Expand Up @@ -36,6 +44,8 @@ receivers:

processors:
batch:
adaptivesampling:
strategy_store: os_main

exporters:
jaeger_storage_exporter:
Expand Down
14 changes: 12 additions & 2 deletions cmd/jaeger/config.yaml
Original file line number Diff line number Diff line change
@@ -1,9 +1,9 @@
service:
extensions: [jaeger_storage, jaeger_query]
extensions: [jaeger_storage, jaeger_query, remote_sampling]
pipelines:
traces:
receivers: [otlp, jaeger, zipkin]
processors: [batch]
processors: [batch, adaptivesampling]
exporters: [jaeger_storage_exporter]

extensions:
Expand All @@ -13,6 +13,14 @@ extensions:
# zpages:
# endpoint: 0.0.0.0:55679

remote_sampling:
# You can either use file or adaptive sampling strategy in remote_sampling
file:
path: ./cmd/jaeger/sampling-strategies.json
# adaptive:
# initial_sampling_probability: 0.1
# strategy_store: memstore

jaeger_query:
trace_storage: memstore
trace_storage_archive: memstore_archive
Expand Down Expand Up @@ -42,6 +50,8 @@ receivers:

processors:
batch:
adaptivesampling:
strategy_store: memstore

exporters:
jaeger_storage_exporter:
Expand Down
6 changes: 4 additions & 2 deletions cmd/jaeger/internal/components.go
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,9 @@ import (
"github.com/jaegertracing/jaeger/cmd/jaeger/internal/exporters/storageexporter"
"github.com/jaegertracing/jaeger/cmd/jaeger/internal/extension/jaegerquery"
"github.com/jaegertracing/jaeger/cmd/jaeger/internal/extension/jaegerstorage"
"github.com/jaegertracing/jaeger/cmd/jaeger/internal/extension/remotesampling"
"github.com/jaegertracing/jaeger/cmd/jaeger/internal/integration/storagecleaner"
"github.com/jaegertracing/jaeger/cmd/jaeger/internal/processors/adaptivesampling"
)

type builders struct {
Expand Down Expand Up @@ -61,8 +63,8 @@ func (b builders) build() (otelcol.Factories, error) {
// add-ons
jaegerquery.NewFactory(),
jaegerstorage.NewFactory(),
remotesampling.NewFactory(),
storagecleaner.NewFactory(),
// TODO add adaptive sampling
)
if err != nil {
return otelcol.Factories{}, err
Expand Down Expand Up @@ -99,7 +101,7 @@ func (b builders) build() (otelcol.Factories, error) {
batchprocessor.NewFactory(),
memorylimiterprocessor.NewFactory(),
// add-ons
// TODO add adaptive sampling
adaptivesampling.NewFactory(),
)
if err != nil {
return otelcol.Factories{}, err
Expand Down
103 changes: 103 additions & 0 deletions cmd/jaeger/internal/extension/remotesampling/config.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,103 @@
// Copyright (c) 2024 The Jaeger Authors.
// SPDX-License-Identifier: Apache-2.0

package remotesampling

import (
"errors"
"reflect"
"time"
)

var (
errNoSource = errors.New("no sampling strategy specified, has to be either 'adaptive' or 'static'")
Pushkarm029 marked this conversation as resolved.
Show resolved Hide resolved
errMultipleSource = errors.New("only one sampling strategy can be specified, has to be either 'adaptive' or 'static'")
)

type FileConfig struct {
// File specifies a local file as the strategies source
Path string `mapstructure:"path"`
}

type AdaptiveConfig struct {
Pushkarm029 marked this conversation as resolved.
Show resolved Hide resolved
StrategyStore string `mapstructure:"strategy_store"`

// TargetSamplesPerSecond is the global target rate of samples per operation.
// TODO implement manual overrides per service/operation.
TargetSamplesPerSecond float64 `mapstructure:"target_samples_per_second"`

// DeltaTolerance is the acceptable amount of deviation between the observed and the desired (target)
// throughput for an operation, expressed as a ratio. For example, the value of 0.3 (30% deviation)
// means that if abs((actual-expected) / expected) < 0.3, then the actual sampling rate is "close enough"
// and the system does not need to send an updated sampling probability (the "control signal" u(t)
// in the PID Controller terminology) to the sampler in the application.
//
// Increase this to reduce the amount of fluctuation in the calculated probabilities.
DeltaTolerance float64 `mapstructure:"delta_tolerance"`

// CalculationInterval determines how often new probabilities are calculated. E.g. if it is 1 minute,
// new sampling probabilities are calculated once a minute and each bucket will contain 1 minute worth
// of aggregated throughput data.
CalculationInterval time.Duration `mapstructure:"calculation_interval"`

// AggregationBuckets is the total number of aggregated throughput buckets kept in memory, ie. if
// the CalculationInterval is 1 minute (each bucket contains 1 minute of thoughput data) and the
// AggregationBuckets is 3, the adaptive sampling processor will keep at most 3 buckets in memory for
// all operations.
// TODO(wjang): Expand on why this is needed when BucketsForCalculation seems to suffice.
AggregationBuckets int `mapstructure:"aggregation_buckets"`

// BucketsForCalculation determines how many previous buckets used in calculating the weighted QPS,
// ie. if BucketsForCalculation is 1, only the most recent bucket will be used in calculating the weighted QPS.
BucketsForCalculation int `mapstructure:"buckets_for_calculation"`

// Delay is the amount of time to delay probability generation by, ie. if the CalculationInterval
// is 1 minute, the number of buckets is 10, and the delay is 2 minutes, then at one time
// we'll have [now()-12m,now()-2m] range of throughput data in memory to base the calculations
// off of. This delay is necessary to counteract the rate at which the jaeger clients poll for
// the latest sampling probabilities. The default client poll rate is 1 minute, which means that
// during any 1 minute interval, the clients will be fetching new probabilities in a uniformly
// distributed manner throughout the 1 minute window. By setting the delay to 2 minutes, we can
// guarantee that all clients can use the latest calculated probabilities for at least 1 minute.
Delay time.Duration `mapstructure:"delay"`

// InitialSamplingProbability is the initial sampling probability for all new operations.
InitialSamplingProbability float64 `mapstructure:"initial_sampling_probability"`

// MinSamplingProbability is the minimum sampling probability for all operations. ie. the calculated sampling
// probability will be in the range [MinSamplingProbability, 1.0].
MinSamplingProbability float64 `mapstructure:"min_sampling_probability"`

// MinSamplesPerSecond determines the min number of traces that are sampled per second.
// For example, if the value is 0.01666666666 (one every minute), then the sampling processor will do
// its best to sample at least one trace a minute for an operation. This is useful for low QPS operations
// that may never be sampled by the probabilistic sampler.
MinSamplesPerSecond float64 `mapstructure:"min_samples_per_second"`

// LeaderLeaseRefreshInterval is the duration to sleep if this processor is elected leader before
// attempting to renew the lease on the leader lock. NB. This should be less than FollowerLeaseRefreshInterval
// to reduce lock thrashing.
LeaderLeaseRefreshInterval time.Duration `mapstructure:"leader_lease_refresh_interval"`

// FollowerLeaseRefreshInterval is the duration to sleep if this processor is a follower
// (ie. failed to gain the leader lock).
FollowerLeaseRefreshInterval time.Duration `mapstructure:"follower_lease_refresh_interval"`
}

type Config struct {
File FileConfig `mapstructure:"file"`
Adaptive AdaptiveConfig `mapstructure:"adaptive"`
Port string `mapstructure:"port"`
Pushkarm029 marked this conversation as resolved.
Show resolved Hide resolved
}

func (cfg *Config) Validate() error {
emptyCfg := createDefaultConfig().(*Config)
if reflect.DeepEqual(*cfg, *emptyCfg) {
return errNoSource
}

if cfg.File.Path != "" && cfg.Adaptive.StrategyStore != "" {
return errMultipleSource
}
return nil
}