Shard Time Series Requests in Trickster 2.0 #573

Merged 9 commits on May 26, 2021
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -18,7 +18,7 @@ If you would like to contribute to this project you can do so through GitHub by
Practices for Production
Environments](http://peter.bourgon.org/go-in-production/#formatting-and-style).

* Before your contribution can be accepted, you must sign off your commits to signify accepttance of the [DCO](https://github.com/probot/dco#how-it-works).
* Before your contribution can be accepted, you must sign off your commits to signify acceptance of the [DCO](https://github.com/probot/dco#how-it-works).

## Reporting Feature Requests, Bugs, Vulnerabilities and other Issues

18 changes: 18 additions & 0 deletions deploy/kube/configmap.yaml
@@ -338,6 +338,24 @@ data:
# # fastforward_ttl_ms defines the relative expiration of cached fast forward data. default is 15s
# fastforward_ttl_ms: 15000

# # shard_max_size_points defines the maximum size of a timeseries request in unique timestamps,
# # before sharding into multiple requests of this denomination and reconstituting the results.
# # If shard_max_size_points and shard_max_size_ms are both > 0, the configuration is invalid.
# # default is 0
# shard_max_size_points: 0

# # shard_max_size_ms defines the max size of a timeseries request in milliseconds,
# # before sharding into multiple requests of this denomination and reconstituting the results.
# # If shard_max_size_ms and shard_max_size_points are both > 0, the configuration is invalid.
# # default is 0
# shard_max_size_ms: 0

# # shard_step_ms defines the epoch-aligned cadence to use when creating shards. When set to 0,
# # shards are not aligned to the epoch at a specific step. shard_max_size_ms must be perfectly
# # divisible by shard_step_ms when both are > 0, or the configuration is invalid.
# # default is 0
# shard_step_ms: 0
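# # for example (illustrative values, not defaults): shard_step_ms: 7200000 combined with
# # shard_max_size_ms: 14400000 aligns shard boundaries to 2-hour marks while letting
# # each shard span up to 4 hours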

# #
# # Each backend provider implements their own defaults for health_check_upstream_url, health_check_verb and health_check_query,
# # which can be overridden per backend. See /docs/health.md for more information
Binary file modified docs/images/diagrams/trickster-docs-graphics.graffle
Binary file added docs/images/sharding_duration.png
Binary file added docs/images/sharding_duration_and_step.png
Binary file added docs/images/sharding_points.png
Binary file added docs/images/sharding_step.png
76 changes: 76 additions & 0 deletions docs/timeseries_sharding.md
@@ -0,0 +1,76 @@
# Timeseries Request Sharding

## Overview

A shard is "a small part of a whole," and Trickster 2.0 supports sharding upstream HTTP requests when retrieving time series data. When configured for a given time series backend, Trickster shards eligible requests by inspecting the time ranges needed from the origin and subdividing them into smaller ranges that conform to the backend's sharding configuration. Sharded requests are sent to the origin concurrently, and once they have all returned, Trickster reconstitutes their responses into a single dataset.
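
As a rough illustration of that flow, the following minimal sketch (not Trickster's internal API; the `timeRange` type, `splitByWidth` helper, and fetch stub are hypothetical) subdivides a needed range into fixed-width shards, requests them concurrently, and merges the results in order:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// timeRange is a hypothetical stand-in for a time range needed from origin.
type timeRange struct{ start, end time.Time }

// splitByWidth subdivides r into shards no wider than maxWidth.
func splitByWidth(r timeRange, maxWidth time.Duration) []timeRange {
	var shards []timeRange
	for s := r.start; s.Before(r.end); s = s.Add(maxWidth) {
		e := s.Add(maxWidth)
		if e.After(r.end) {
			e = r.end
		}
		shards = append(shards, timeRange{s, e})
	}
	return shards
}

func main() {
	day := timeRange{time.Unix(0, 0).UTC(), time.Unix(0, 0).UTC().Add(24 * time.Hour)}
	shards := splitByWidth(day, 2*time.Hour) // 12 shards of 2 hours each
	results := make([]string, len(shards))   // responses collected by shard index
	var wg sync.WaitGroup
	for i, s := range shards {
		wg.Add(1)
		go func(i int, s timeRange) { // sharded requests go to the origin concurrently
			defer wg.Done()
			// a real client would query the origin here; this stub just labels the range
			results[i] = fmt.Sprintf("data for %s-%s", s.start.Format("15:04"), s.end.Format("15:04"))
		}(i, s)
	}
	wg.Wait()
	// the per-shard responses are then reconstituted, in order, into a single dataset
	fmt.Println(len(results), "shard responses merged")
}
```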

## Mechanisms

Trickster supports three main mechanisms for sharding, each illustrated below; a Go sketch of the underlying boundary math follows the list:

----

- Maximum Timestamps Per Shard - Trickster calculates the number of expected unique timestamps in the response by dividing the requested time range size by the step cadence, and then subdivides the time ranges so that each sharded request's time range returns no more timestamps than the configured maximum.

<img src="./images/sharding_points.png" width="760">

- Maximum Time Range Width Per Shard - Trickster inspects each needed time range and subdivides it such that each sharded request's time range duration is no larger than the configured maximum.

<img src="./images/sharding_duration.png" width="760">

- Epoch-Aligned Maximum Time Range Width Per Shard - Trickster inspects each needed time range and subdivides it such that each sharded request's time range duration is no larger than the configured maximum, while also ensuring that each shard's time boundaries are aligned to the epoch based on the configured shard step size.

<img src="./images/sharding_step.png" width="760">

<img src="./images/sharding_duration_and_step.png" width="760">

## Configuring

### Maximum Unique Timestamp Count Per Shard

In the Trickster configuration, use the `shard_max_size_points` configuration to shard requests by limiting the maximum number of unique timestamps in each sharded response.

```yaml
backends:
example:
provider: prometheus
origin_url: http://prometheus:9090
shard_max_size_points: 10999
```
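
As an illustrative calculation (the 15-second step is an example value, not a default): with a query step of 15 seconds, a `shard_max_size_points` of 10999 caps each shard at 10999 × 15s, or roughly 45.8 hours, of the requested range.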

### Maximum Time Range Width Per Shard

In the Trickster configuration, use the `shard_max_size_ms` configuration to shard requests by limiting the maximum width of each sharded request's time range.

```yaml
backends:
example:
provider: 'prometheus'
origin_url: http://prometheus:9090
shard_max_size_ms: 7200000
```
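
With this setting, a request needing a full 24 hours of data from origin would be subdivided into 12 concurrent upstream requests of 2 hours (7200000 ms) each.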

### Epoch-Aligned Maximum Time Range Width Per Shard

In the Trickster configuration, use the `shard_step_ms` configuration to shard requests by limiting the maximum width of each sharded request's time range, while ensuring shards align with the epoch on the configured cadence. This is useful for aligning shard boundaries with an upstream database's partition boundaries, ensuring that sharded requests have as little partition overlap as possible.

```yaml
backends:
example:
provider: 'prometheus'
origin_url: http://prometheus:9090
shard_step_ms: 7200000
```
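
Here, shard boundaries fall on even-numbered UTC hours (00:00, 02:00, 04:00, and so on), so a request needing 01:30 to 05:30 from origin would be sharded into 01:30-02:00, 02:00-04:00, and 04:00-05:30.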

`shard_step_ms` can be used in conjunction with `shard_max_size_ms`, so long as `shard_max_size_ms` is perfectly divisible by `shard_step_ms`. This combination aligns shards to the configured shard step while sizing each shard's time range to span multiple shard steps.

```yaml
backends:
example:
provider: 'prometheus'
origin_url: http://prometheus:9090
shard_step_ms: 7200000
shard_max_size_ms: 14400000
```
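
Under this combined configuration, shard boundaries stay aligned to the 2-hour step while each shard can span up to two steps (4 hours); for example, a request needing 01:00 to 09:00 from origin would be covered by shards such as 01:00-04:00, 04:00-08:00, and 08:00-09:00 (the exact boundary selection depends on Trickster's alignment logic).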

Neither `shard_step_ms` nor `shard_max_size_ms` can be used in conjunction with `shard_max_size_points`.
18 changes: 18 additions & 0 deletions examples/conf/example.full.yaml
@@ -340,6 +340,24 @@ backends:
# # fastforward_ttl_ms defines the relative expiration of cached fast forward data. default is 15s
# fastforward_ttl_ms: 15000

# # shard_max_size_points defines the maximum size of a timeseries request in unique timestamps,
# # before sharding into multiple requests of this denomination and reconstituting the results.
# # If shard_max_size_points and shard_max_size_ms are both > 0, the configuration is invalid.
# # default is 0
# shard_max_size_points: 0

# # shard_max_size_ms defines the max size of a timeseries request in milliseconds,
# # before sharding into multiple requests of this denomination and reconstituting the results.
# # If shard_max_size_ms and shard_max_size_points are both > 0, the configuration is invalid.
# # default is 0
# shard_max_size_ms: 0

# # shard_step_ms defines the epoch-aligned cadence to use when creating shards. When set to 0,
# # shards are not aligned to the epoch at a specific step. shard_max_size_ms must be perfectly
# # divisible by shard_step_ms when both are > 0, or the configuration is invalid.
# # default is 0
# shard_step_ms: 0
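# # for example (illustrative values, not defaults): shard_step_ms: 7200000 combined with
# # shard_max_size_ms: 14400000 aligns shard boundaries to 2-hour marks while letting
# # each shard span up to 4 hours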

# #
# # Each backend provider implements their own defaults for health checking
# # which can be overridden per backend configuration. See /docs/health.md for more information
27 changes: 27 additions & 0 deletions examples/conf/exmple.sharding.yaml
@@ -0,0 +1,27 @@
#
# Trickster 2.0 Example Configuration File - Example Prometheus Accelerator with Sharding
#
# To use this, run: trickster -config /path/to/simple.sharding.yaml
#
# This file demonstrates a basic configuration to accelerate
# Prometheus queries using Trickster. More documentation is
# available at https://github.com/tricksterproxy/trickster/docs/
#
# Copyright 2018 The Trickster Authors
#

frontend:
listen_port: 9090

backends:
default:
# update FQDN and Port to work in your environment
origin_url: 'http://prometheus:9090'
provider: 'prometheus'
shard_step_ms: 7200000 # this will make shards bounded by 0:00, 2:00, 4:00, 6:00, etc. (UTC)

metrics:
listen_port: 8481 # available for scraping at http://<trickster>:<metrics.listen_port>/metrics

logging:
log_level: 'info'
2 changes: 1 addition & 1 deletion examples/conf/simple.prometheus.yaml
@@ -1,5 +1,5 @@
#
# Trickster 2.0 Example Configuration File - Simple Prometheus Reverse Proxy Cache
# Trickster 2.0 Example Configuration File - Example Prometheus Accelerator
#
# To use this, run: trickster -config /path/to/simple.prometheus.yaml
#
4 changes: 4 additions & 0 deletions pkg/backends/options/defaults.go
@@ -57,6 +57,10 @@ const (
DefaultForwardedHeaders = "standard"
// DefaullALBMechansimName defines the default ALB Mechanism Name
DefaullALBMechansimName = "rr" // round robin
// DefaultTimeseriesShardSize defines the default shard size of 0 (no sharding)
DefaultTimeseriesShardSize = 0
// DefaultTimeseriesShardStep defines the default shard step of 0 (no sharding)
DefaultTimeseriesShardStep = 0
)

// DefaultCompressibleTypes returns a list of types that Trickster should compress before caching
10 changes: 10 additions & 0 deletions pkg/backends/options/errors.go
@@ -24,6 +24,16 @@ import (
// ErrInvalidMetadata is an error for invalid metadata
var ErrInvalidMetadata = errors.New("invalid options metadata")

// ErrInvalidMaxShardSizeMS is an error for when 'shard_max_size_ms' is not
// a multiple of 'shard_step_ms'
var ErrInvalidMaxShardSizeMS = errors.New(
"'shard_max_size_ms' must be a multiple of 'shard_step_ms' when both are non-zero")

// ErrInvalidMaxShardSize is an error for when both 'shard_max_size_ms' and
// 'shard_max_size_points' are used on the same backend
var ErrInvalidMaxShardSize = errors.New(
"'shard_max_size_ms' and 'shard_max_size_points' cannot both be non-zero")

// ErrMissingProvider is an error type for missing provider
type ErrMissingProvider struct {
error
58 changes: 58 additions & 0 deletions pkg/backends/options/options.go
@@ -110,6 +110,18 @@ type Options struct {
// ReqRewriterName is the name of a configured Rewriter that will modify the request prior to
// processing by the backend client
ReqRewriterName string `yaml:"req_rewriter_name,omitempty"`
// MaxShardSizePoints defines the maximum size of a timeseries request in unique timestamps,
// before sharding into multiple requests of this denomination and reconstituting the results.
// If MaxShardSizePoints and MaxShardSizeMS are both > 0, the configuration is invalid
MaxShardSizePoints int `yaml:"shard_max_size_points,omitempty"`
// MaxShardSizeMS defines the max size of a timeseries request in milliseconds,
// before sharding into multiple requests of this denomination and reconstituting the results.
// If MaxShardSizePoints and MaxShardSizeMS are both > 0, the configuration is invalid
MaxShardSizeMS int `yaml:"shard_max_size_ms,omitempty"`
// ShardStepMS defines the epoch-aligned cadence to use when creating shards. When set to 0,
// shards are not aligned to the epoch at a specific step. MaxShardSizeMS must be perfectly
// divisible by ShardStepMS when both are > 0, or the configuration is invalid
ShardStepMS int `yaml:"shard_step_ms,omitempty"`

// ALBOptions holds the options for ALBs
ALBOptions *ao.Options `yaml:"alb,omitempty"`
@@ -183,6 +195,13 @@ type Options struct {
RuleOptions *ro.Options `yaml:"-"`
// ReqRewriter is the rewriter handler as indicated by RuleName
ReqRewriter rewriter.RewriteInstructions
// DoesShard is true when sharding will be used with this origin, based on how the
// sharding options have been configured
DoesShard bool `yaml:"-"`
// MaxShardSize is the parsed version of MaxShardSizeMS
MaxShardSize time.Duration `yaml:"-"`
// ShardStep is the parsed version of ShardStepMS
ShardStep time.Duration `yaml:"-"`

//
md yamlx.KeyLookup `yaml:"-"`
@@ -210,6 +229,11 @@ func New() *Options {
NegativeCacheName: DefaultBackendNegativeCacheName,
Paths: make(map[string]*po.Options),
RevalidationFactor: DefaultRevalidationFactor,
MaxShardSizePoints: DefaultTimeseriesShardSize,
MaxShardSizeMS: DefaultTimeseriesShardSize,
MaxShardSize: time.Duration(DefaultTimeseriesShardSize) * time.Millisecond,
ShardStepMS: DefaultTimeseriesShardStep,
ShardStep: time.Duration(DefaultTimeseriesShardStep) * time.Millisecond,
TLS: &to.Options{},
Timeout: time.Millisecond * DefaultBackendTimeoutMS,
TimeoutMS: DefaultBackendTimeoutMS,
@@ -233,6 +257,7 @@ func (o *Options) Clone() *Options {
no.BackfillTolerancePoints = o.BackfillTolerancePoints
no.CacheName = o.CacheName
no.CacheKeyPrefix = o.CacheKeyPrefix
no.DoesShard = o.DoesShard
no.FastForwardDisable = o.FastForwardDisable
no.FastForwardTTL = o.FastForwardTTL
no.FastForwardTTLMS = o.FastForwardTTLMS
@@ -253,6 +278,11 @@ func (o *Options) Clone() *Options {
no.RevalidationFactor = o.RevalidationFactor
no.RuleName = o.RuleName
no.Scheme = o.Scheme
no.MaxShardSize = o.MaxShardSize
no.MaxShardSizeMS = o.MaxShardSizeMS
no.MaxShardSizePoints = o.MaxShardSizePoints
no.ShardStep = o.ShardStep
no.ShardStepMS = o.ShardStepMS
no.Timeout = o.Timeout
no.TimeoutMS = o.TimeoutMS
no.TimeseriesRetention = o.TimeseriesRetention
@@ -343,6 +373,22 @@ func (l Lookup) Validate(ncl negative.Lookups) error {
o.TimeseriesTTL = time.Duration(o.TimeseriesTTLMS) * time.Millisecond
o.FastForwardTTL = time.Duration(o.FastForwardTTLMS) * time.Millisecond
o.MaxTTL = time.Duration(o.MaxTTLMS) * time.Millisecond
o.DoesShard = o.MaxShardSizePoints > 0 || o.MaxShardSizeMS > 0 || o.ShardStepMS > 0
o.ShardStep = time.Duration(o.ShardStepMS) * time.Millisecond
o.MaxShardSize = time.Duration(o.MaxShardSizeMS) * time.Millisecond

if o.MaxShardSizeMS > 0 && o.MaxShardSizePoints > 0 {
return ErrInvalidMaxShardSize
}

if o.ShardStepMS > 0 && o.MaxShardSizeMS == 0 {
o.MaxShardSize = o.ShardStep
}

if o.ShardStep > 0 && o.MaxShardSize%o.ShardStep != 0 {
return ErrInvalidMaxShardSizeMS
}

if o.CompressibleTypeList != nil {
o.CompressibleTypes = make(map[string]interface{})
for _, v := range o.CompressibleTypeList {
@@ -523,6 +569,18 @@ func SetDefaults(
no.KeepAliveTimeoutMS = o.KeepAliveTimeoutMS
}

if metadata.IsDefined("backends", name, "shard_max_size_points") {
no.MaxShardSizePoints = o.MaxShardSizePoints
}

if metadata.IsDefined("backends", name, "shard_max_size_ms") {
no.MaxShardSizeMS = o.MaxShardSizeMS
}

if metadata.IsDefined("backends", name, "shard_step_ms") {
no.ShardStepMS = o.ShardStepMS
}

if metadata.IsDefined("backends", name, "timeseries_retention_factor") {
no.TimeseriesRetentionFactor = o.TimeseriesRetentionFactor
}
3 changes: 3 additions & 0 deletions pkg/backends/options/options_data_test.go
@@ -68,6 +68,9 @@ backends:
forwarded_headers: x
negative_cache_name: test
rule_name: ''
shard_max_size_ms: 0
shard_max_size_points: 0
shard_step_ms: 0
healthcheck:
headers:
Authorization: Basic SomeHash