Storage backend examples #2357

Merged
merged 4 commits into from
Jul 16, 2020
9 changes: 5 additions & 4 deletions docs/README.md
@@ -46,10 +46,11 @@ simplifies the operation and significantly lowers the cost of Loki.
     1. [Authentication](operations/authentication.md)
     2. [Observability](operations/observability.md)
     3. [Scalability](operations/scalability.md)
-    4. [Storage](operations/storage/README.md)
-        1. [Table Manager](operations/storage/table-manager.md)
-        2. [Retention](operations/storage/retention.md)
-        3. [BoltDB Shipper](operations/storage/boltdb-shipper.md)
+    4. [Storage](storage.md)
+        1. [Operations](operations/storage/README.md)
+            1. [Table Manager](operations/storage/table-manager.md)
+            2. [Retention](operations/storage/retention.md)
+            3. [BoltDB Shipper](operations/storage/boltdb-shipper.md)
     5. [Multi-tenancy](operations/multi-tenancy.md)
     6. [Loki Canary](operations/loki-canary.md)
 9. [HTTP API](api.md)
2 changes: 1 addition & 1 deletion docs/configuration/README.md
@@ -824,7 +824,7 @@ gcs:
   # CLI flag: -gcs.request-timeout
   [request_timeout: <duration> | default = 0s]

-# Configures storing chunks in Cassandra
+# Configures storing chunks and/or the index in Cassandra
 cassandra:
   # Comma-separated hostnames or IPs of Cassandra instances
   # CLI flag: -cassandra.addresses
2 changes: 1 addition & 1 deletion docs/operations/storage/README.md
@@ -4,7 +4,7 @@ Loki needs to store two different types of data: **chunks** and **indexes**.

 Loki receives logs in separate streams, where each stream is uniquely identified
 by its tenant ID and its set of labels. As log entries from a stream arrive,
-they are GZipped as "chunks" and saved in the chunks store. See [chunk
+they are compressed as "chunks" and saved in the chunks store. See [chunk
 format](#chunk-format) for how chunks are stored internally.

 The **index** stores each stream's label set and links them to the individual
296 changes: 296 additions & 0 deletions docs/storage.md
@@ -0,0 +1,296 @@
# Storage

Loki uses a two-pronged storage strategy, which is responsible for both its limitations and its advantages. The main idea is that logs are large, and traditional indexing strategies are prohibitively expensive and complex to run at scale. They often bring along ancillary operational costs in the form of schema design, index management/rotation, backup/restore protocols, etc. Instead, Loki stores all of its log content unindexed in object storage. It then uses the Prometheus label paradigm along with a small but specialized index store to allow lookup, matching, and filtering based on these labels. When a set of unique key/value label pairs is combined with its logs, we call this a _log stream_, which is generally analogous to a log file on disk. It may have labels like `{app="api", env="production", filename="/var/logs/app.log"}`, which together uniquely identify it. The object storage is responsible for storing the compressed logs cheaply, while the index takes care of storing these labels in a way that enables fast, effective querying.
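
To make the idea of a log stream concrete, here is a minimal sketch in which a stream's identity is the tenant ID plus the full label set, independent of label order. The `stream_key` helper and the tenant name are hypothetical and only illustrate the concept; this is not how Loki computes stream fingerprints internally.

```python
def stream_key(tenant_id, labels):
    """Build an order-independent identity from a tenant ID and a label set.

    Hypothetical helper for illustration only.
    """
    return (tenant_id, tuple(sorted(labels.items())))


a = stream_key("tenant-1", {"app": "api", "env": "production"})
b = stream_key("tenant-1", {"env": "production", "app": "api"})
c = stream_key("tenant-1", {"app": "api", "env": "staging"})

print(a == b)  # True: same labels in a different order, same stream
print(a == c)  # False: one label value differs, so it is a different stream
```

Changing any single label value yields a new stream, which is why the index only needs to store label sets rather than log content.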

Contributor

I would add a small reference to our goal toward not requiring an index store ? WDYT ?

Member Author

I'd like to leave this point out for now.

* [Chunk Clients](#Implementations---Chunks)
  * [Cassandra](#Cassandra)
  * [GCS](#GCS)
  * [File System](#File-System)
  * [S3](#S3)
  * [Notable Mentions](#Notable-Mentions)
* [Index Clients](#Implementations---Index)
  * [Cassandra](#Cassandra-1)
  * [BigTable](#BigTable)
  * [DynamoDB](#DynamoDB)
  * [BoltDB](#BoltDB)
* [Period Configs](#Period-Configs)
* [Table Manager](#Table-Manager)
* [Upgrading Schemas](#Upgrading-Schemas)
* [Retention](#Retention)
* [Examples](#Examples)
  * [Single machine/local development (boltdb+filesystem)](#Single-machine/local-development-(boltdb+filesystem))
  * [GCP deployment (GCS+BigTable)](#GCP-deployment-(GCS+BigTable))
  * [AWS deployment (S3+DynamoDB)](#AWS-deployment-(S3+DynamoDB))
  * [On prem deployment (Cassandra+Cassandra)](#On-prem-deployment-(Cassandra+Cassandra))
  * [On prem deployment (Cassandra+MinIO)](#On-prem-deployment-(Cassandra+MinIO))

## Implementations - Chunks

### Cassandra

Cassandra is a popular database and one of Loki's possible chunk stores. It is production safe.

### GCS

GCS is a hosted object store offered by Google. It is a good candidate for a managed object store, especially when you're already running on GCP, and is production safe.

### File System

The file system is the simplest backend for chunks, although it's also susceptible to data loss as it's unreplicated. It is nonetheless common for single binary deployments, as well as for those trying out Loki or doing local development on the project. It is similar in concept to many Prometheus deployments where a single Prometheus is responsible for monitoring a fleet.

### S3

S3 is AWS's hosted object store. It is a good candidate for a managed object store, especially when you're already running on AWS, and is production safe.

### Notable Mentions

You may use any substitutable services, such as those that implement the S3 API, like [MinIO](https://min.io/).

## Implementations - Index

### Cassandra

Cassandra can also be used for the index store. Aside from the experimental [boltdb-shipper](./operations/storage/boltdb-shipper.md), it's the only non-cloud offering for the index that is horizontally scalable and has configurable replication. It's a good candidate when you already run Cassandra, are running on-prem, or do not wish to use a managed cloud offering.

### BigTable

Bigtable is a cloud database offered by Google. It is a good candidate for a managed index store if you're already using it (due to its heavy fixed costs) or wish to run in GCP.

### DynamoDB

DynamoDB is a cloud database offered by AWS. It is a good candidate for a managed index store, especially if you're already running in AWS.

#### Rate Limiting

DynamoDB is susceptible to rate limiting, particularly due to overconsuming what is called [provisioned capacity](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadWriteCapacityMode.html). This can be controlled via the [provisioning](#Provisioning) configs in the table manager.

### BoltDB

BoltDB is an embedded database on disk. It is not replicated and thus cannot be used for high availability or clustered Loki deployments, but is commonly paired with a `filesystem` chunk store for proof of concept deployments, trying out Loki, and development. There is also an experimental mode, the [boltdb-shipper](./operations/storage/boltdb-shipper.md), which aims to support clustered deployments using `boltdb` as an index.

## Period Configs

Loki aims to be backwards compatible, and over the course of its development it has had many internal changes that facilitate better and more efficient storage and querying. Loki allows incrementally upgrading to these new storage _schemas_ and can query across them transparently. This makes upgrading a breeze. For instance, this is what it looks like when migrating from the v10 to the v11 schema starting 2020-07-01:
Contributor

That's awesome !


```yaml
schema_config:
  configs:
    - from: 2019-07-01
      store: boltdb
      object_store: filesystem
      schema: v10
      index:
        prefix: index_
        period: 168h
    - from: 2020-07-01
      store: boltdb
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 168h
```

For all data ingested before 2020-07-01, Loki used the v10 schema, and then switched after that point to the more effective v11. This dramatically simplifies upgrading, ensuring it's simple to take advantage of new storage optimizations. These configs should be immutable for as long as you care about retention.
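
The way a timestamp maps onto a period config can be sketched roughly as follows. This is an illustration only, assuming a simplified in-memory copy of the two entries above; `PERIODS` and `schema_for` are hypothetical names, and Loki's real lookup operates on timestamps rather than calendar dates.

```python
from datetime import date

# Hypothetical in-memory mirror of the two schema_config entries above.
PERIODS = [
    {"from": date(2019, 7, 1), "schema": "v10"},
    {"from": date(2020, 7, 1), "schema": "v11"},
]


def schema_for(day, periods=PERIODS):
    """Return the schema of the latest period that starts on or before `day`."""
    active = None
    for period in sorted(periods, key=lambda p: p["from"]):
        if period["from"] <= day:
            active = period
    if active is None:
        raise ValueError(f"no period config covers {day}")
    return active["schema"]


print(schema_for(date(2020, 6, 30)))  # v10
print(schema_for(date(2020, 7, 1)))   # v11
```

A query spanning both dates would touch both periods, which is why older entries need to stay in place for as long as their data is retained.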

## Table Manager

One of the subcomponents in Loki is the `table-manager`. It is responsible for pre-creating and expiring index tables. This helps partition the writes and reads in Loki across a set of distinct indices in order to prevent unbounded growth.

```yaml
table_manager:
  # The retention period must be a multiple of the index / chunks
  # table "period" (see period_config).
  retention_deletes_enabled: true
  # This is 15 weeks retention, based on the 168h (1week) period durations used in the rest of the examples.
  retention_period: 2520h
```

For more information, see the table manager [doc](./operations/storage/table-manager.md).
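
The "multiple of the table period" rule in the config comment above can be checked with a little arithmetic. The sketch below is only a sanity check you might run before deploying; `retention_in_periods` is a hypothetical helper, not part of Loki.

```python
from datetime import timedelta


def retention_in_periods(retention, table_period):
    """Return how many whole table periods the retention spans.

    Raises ValueError when the retention is not an exact multiple of the period.
    """
    if table_period <= timedelta(0):
        raise ValueError("table period must be positive")
    periods, remainder = divmod(retention, table_period)
    if remainder:
        raise ValueError(
            f"retention {retention} is not a multiple of table period {table_period}"
        )
    return periods


# 2520h retention with 168h (one week) tables, as in the example above.
print(retention_in_periods(timedelta(hours=2520), timedelta(hours=168)))  # 15
```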

### Provisioning

In the case of AWS DynamoDB, you'll likely want to tune the provisioned throughput for your tables as well. This is to prevent your tables being rate limited on one hand and assuming unnecessary cost on the other. By default Loki uses a [provisioned capacity](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.ReadWriteCapacityMode.html) strategy for DynamoDB tables like so:

```
table_manager:
  index_tables_provisioning:
    # Read/write throughput requirements for the current table
    # (the table which would handle writes/reads for data timestamped at the current time)
    provisioned_write_throughput: <int> | default = 3000
    provisioned_read_throughput: <int> | default = 300

    # Read/write throughput requirements for non-current tables
    inactive_write_throughput: <int> | default = 1
    inactive_read_throughput: <int> | default = 300
```

Note, there are a few other DynamoDB provisioning options including DynamoDB autoscaling and on-demand capacity. See the [docs](./configuration/README.md#provision_config) for more information.

## Upgrading Schemas

When a new schema is released and you want to gain the advantages it provides, you can! Loki can transparently query & merge data from across schema boundaries so there is no disruption of service and upgrading is easy.

First, you'll want to create a new [period_config](./configuration/README.md#period_config) entry in your [schema_config](./configuration/README.md#schema_config). The important thing to remember here is to set this at some point in the _future_ and then roll out the config file changes to Loki. This allows the table manager to create the required table in advance of writes and will ensure that existing data isn't queried as if it adheres to the new schema.

As an example, let's say it's 2020-07-14 and we want to start using the `v11` schema on the 20th:
```yaml
schema_config:
  configs:
    - from: 2019-07-14
      store: boltdb
      object_store: filesystem
      schema: v10
      index:
        prefix: index_
        period: 168h
    - from: 2020-07-20
      store: boltdb
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 168h
```

It's that easy; we just created a new entry starting on the 20th.
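
Because the new entry must start in the future, a small pre-rollout check can catch mistakes. The following is a sketch under the assumption that you validate the dates yourself before deploying; `check_new_period` is a hypothetical helper and Loki does not ship it.

```python
from datetime import date


def check_new_period(existing_starts, new_start, today):
    """Return a list of problems with a proposed new period start date."""
    problems = []
    if new_start <= today:
        problems.append("new period must start in the future")
    if existing_starts and new_start <= max(existing_starts):
        problems.append("new period must start after every existing period")
    return problems


# The scenario above: today is 2020-07-14 and v11 should start on the 20th.
print(check_new_period([date(2019, 7, 14)], date(2020, 7, 20), today=date(2020, 7, 14)))  # []
```

An empty list means the proposed date passes both checks; a start date of today or earlier would be reported as a problem.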

## Retention

With the exception of the `filesystem` chunk store, Loki will not delete old chunks. This is generally handled instead by configuring TTLs (time to live) in the chunk store of your choice (bucket lifecycles in S3/GCS, and TTLs in Cassandra). Nor will Loki currently delete old data when your local disk fills up when using the `filesystem` chunk store; deletion is determined only by retention duration.

We're interested in adding targeted deletion in future Loki releases (think tenant or stream level granularity) and may include other strategies as well.

For more information, see the configuration [docs](./operations/storage/retention.md).
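
As a rough illustration of the TTL approach described above, the sketch below converts a retention period in hours into a day count and builds a bucket lifecycle rule in the shape that S3 lifecycle configurations use. The rule ID, the empty prefix (meaning the whole bucket), and the helper names are hypothetical. The dictionary is only constructed here, not sent anywhere, and should be checked against your object store's documentation before use.

```python
import math


def expiration_days(retention_hours):
    """Round a retention period in hours up to whole days."""
    return math.ceil(retention_hours / 24)


def s3_lifecycle_rule(retention_hours, rule_id="expire-loki-chunks"):
    """Build an S3-style lifecycle configuration that expires objects after retention.

    The rule ID and empty prefix are illustrative assumptions.
    """
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Status": "Enabled",
                "Filter": {"Prefix": ""},
                "Expiration": {"Days": expiration_days(retention_hours)},
            }
        ]
    }


# 2520h matches the retention_period used in the table manager example.
print(expiration_days(2520))  # 105
```

Rounding up keeps chunks at least as long as the index entries that point to them.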


## Examples

### Single machine/local development (boltdb+filesystem)

```yaml
storage_config:
  boltdb:
    directory: /tmp/loki/index
  filesystem:
    directory: /tmp/loki/chunks

schema_config:
  configs:
    - from: 2020-07-01
      store: boltdb
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 168h
```

### GCP deployment (GCS+BigTable)

```yaml
storage_config:
  bigtable:
    instance: <instance>
    project: <project>
  gcs:
    bucket_name: <bucket>

schema_config:
  configs:
    - from: 2020-07-01
      store: bigtable
      object_store: gcs
      schema: v11
      index:
        prefix: index_
        period: 168h
```

### AWS deployment (S3+DynamoDB)

```yaml
storage_config:
  aws:
    s3: s3://<access_key>:<uri-encoded-secret-access-key>@<region>
    bucketnames: <bucket1,bucket2>
    dynamodb:
      dynamodb_url: dynamodb://<access_key>:<uri-encoded-secret-access-key>@<region>

schema_config:
  configs:
    - from: 2020-07-01
      store: aws
      object_store: aws
      schema: v11
      index:
        prefix: index_
        period: 168h
```
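
The `<uri-encoded-secret-access-key>` placeholder above matters because secret keys can contain characters such as `/` and `+` that are not safe inside a URL. A minimal sketch of producing the encoded value with Python's standard library follows; the `s3_url` helper and the sample credentials are made up for illustration.

```python
from urllib.parse import quote


def s3_url(access_key, secret_key, region):
    """Build an s3:// URL with percent-encoded credentials (hypothetical helper)."""
    return f"s3://{quote(access_key, safe='')}:{quote(secret_key, safe='')}@{region}"


# Made-up credentials; note how "/" and "+" are percent-encoded.
print(s3_url("AKIAEXAMPLE", "abc/def+ghi", "us-east-1"))
# s3://AKIAEXAMPLE:abc%2Fdef%2Bghi@us-east-1
```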

If you don't wish to hard-code S3 credentials, you can also configure an EC2
instance role by changing the `storage_config` section:

```yaml
storage_config:
  aws:
    s3: s3://region
    bucketnames: <bucket1,bucket2>
    dynamodb:
      dynamodb_url: dynamodb://region
```

### On prem deployment (Cassandra+Cassandra)

```yaml
storage_config:
  cassandra:
    addresses: <comma-separated-IPs-or-hostnames>
    keyspace: <keyspace>
    auth: <true|false>
    username: <username> # only applicable when auth=true
    password: <password> # only applicable when auth=true

schema_config:
  configs:
    - from: 2020-07-01
      store: cassandra
      object_store: cassandra
      schema: v11
      index:
        prefix: index_
        period: 168h
      chunks:
        prefix: chunk_
        period: 168h
```

### On prem deployment (Cassandra+MinIO)

We configure MinIO by using the AWS config because MinIO implements the S3 API:

```yaml
storage_config:
  aws:
    # Note: use a fully qualified domain name, like localhost.
    # full example: http://loki:supersecret@localhost.:9000
    s3: http<s>://<username>:<secret>@<fqdn>:<port>
    s3forcepathstyle: true
  cassandra:
    addresses: <comma-separated-IPs-or-hostnames>
    keyspace: <keyspace>
    auth: <true|false>
    username: <username> # only applicable when auth=true
    password: <password> # only applicable when auth=true

schema_config:
  configs:
    - from: 2020-07-01
      store: cassandra
      object_store: aws
      schema: v11
      index:
        prefix: index_
        period: 168h
```