Dev pjac aaefold2 (#146)
* Update project descriptions metadata

* Fix aae table ending

* Add aaefold usage to 3.0.7, 3.0.8 and 3.0.9

* Updated downloads metadata
pjaclark committed Nov 29, 2022
1 parent e8e02c5 commit 17c27ce
Showing 44 changed files with 5,073 additions and 89 deletions.
65 changes: 65 additions & 0 deletions content/riak/kv/3.0.7/configuring/active-anti-entropy.md
@@ -0,0 +1,65 @@
---
title_supertext: "Configuring:"
title: "Active Anti-Entropy"
description: ""
project: "riak_kv"
project_version: 3.0.7
menu:
  riak_kv-3.0.7:
    name: "Active Anti-Entropy (AAE)"
    identifier: "configuring-active-anti-entropy"
    weight: 210
    parent: "configuring"
toc: true
commercial_offering: false
version_history:
  in: "2.9.0p5+"
  since: 2.9.0p5
aliases:
---

[config legacy]: ./legacy-aae/
[config tictac]: ./tictac-aae/
[config tictac-repl]: ../next-gen-replication/
[using aaefold]: ../../using/cluster-operations/tictac-aae-fold/
[learn aae]: ../../learn/concepts/active-anti-entropy/

Riak's [active anti-entropy][learn aae] \(AAE) subsystem is a set of background processes that repair object inconsistencies stemming from missing or divergent object values across nodes. Riak operators can turn AAE on and off and configure and monitor its functioning.

The Legacy and TicTac AAE systems can be used separately or together.

If you are using the legacy AAE system, it is recommended that you migrate to the TicTac AAE system.
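
As a rough illustration of running both systems together, the `riak.conf` fragment below switches each one on. It is a sketch rather than a recommended setup: the two setting names and their values are taken from the legacy and TicTac configuration guides linked from this page, and the comments are explanatory only.

```riakconf
## Legacy AAE: repair out-of-sync keys in the background
anti_entropy = active

## TicTac AAE: enable the newer system alongside it
tictacaae_active = active
```

To use only one system, set the other to `passive`.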

## TicTac AAE system

The version of TicTac AAE included in 2.9 releases is a working prototype with limited testing. The intention is to fully integrate the library into the KV 3.0 release.

TicTac Active Anti-Entropy makes two changes to the way anti-entropy has previously worked in Riak. The first change is to the way Merkle trees are constructed, so that they are built incrementally. The second change allows the underlying anti-entropy key store to be key-ordered while still allowing faster access to keys via their Merkle tree location or the last modified date of the object.


#### [Configuring TicTac AAE][config tictac]

A guide covering commonly adjusted parameters for the TicTac AAE system.

[Learn More >>][config tictac]

#### [Configuring TicTac AAE's Next Gen Replication][config tictac-repl]

A guide covering commonly adjusted parameters for TicTac AAE's enhanced FullSync replication system.

[Learn More >>][config tictac-repl]

#### Other documentation

- [How to use `aae_fold`][using aaefold] to efficiently find, list and manage keys.

## Legacy AAE system

The legacy AAE system is still present, and works exactly as before.

### [Configuring Legacy AAE][config legacy]

A guide covering commonly adjusted parameters for the legacy AAE system.

[Learn More >>][config legacy]

169 changes: 169 additions & 0 deletions content/riak/kv/3.0.7/configuring/active-anti-entropy/legacy-aae.md
@@ -0,0 +1,169 @@
---
title_supertext: "Configuring:"
title: "Legacy Active Anti-Entropy"
description: ""
project: "riak_kv"
project_version: 3.0.7
menu:
  riak_kv-3.0.7:
    name: "Legacy AAE"
    identifier: "configuring_legacy_aae"
    weight: 103
    parent: "configuring-active-anti-entropy"
toc: true
version_history:
  in: "2.9.0p5+"
  since: 2.9.0p5
aliases:
---

The configuration for the legacy AAE is kept in
the `riak.conf` configuration file.

## Validate Settings

Once your configuration is set, you can verify its correctness by
running the `riak` command-line tool:

```bash
riak chkconfig
```

## riak.conf Settings

Configurable parameters for Riak's legacy active anti-entropy subsystem.

<table class="riak-conf">
<thead>
<tr>
<th>Config</th>
<th>Description</th>
<th>Default</th>
</tr>
</thead>
<tbody>

<tr>
<td><code>anti_entropy</code></td>
<td>How Riak will repair out-of-sync keys. If set to
<code>active</code>, out-of-sync keys will be repaired in the
background; if set to <code>passive</code>, out-of-sync keys are only
repaired on read; and if set to <code>active-debug</code>, verbose
debugging information will be output.</td>
<td><code>active</code></td>
</tr>

<tr>
<td><code>anti_entropy.throttle</code></td>
<td>Whether the distributed throttle for Active Anti-Entropy is
enabled.</td>
<td><code>on</code></td>
</tr>

<tr>
<td><code>anti_entropy.throttle.$tier.mailbox_size</code></td>
<td>Sets the throttling tiers for Active Anti-Entropy. Each tier is a
minimum vnode mailbox size and a time-delay that the throttle should
observe at that size and above. For example,
<code>anti_entropy.throttle.tier1.mailbox_size = 0</code>,
<code>anti_entropy.throttle.tier1.delay = 0ms</code>,
<code>anti_entropy.throttle.tier2.mailbox_size = 40</code>,
<code>anti_entropy.throttle.tier2.delay = 5ms</code>, etc. If
configured, there must be a tier which includes a mailbox size of 0.
Both <code>.mailbox_size</code> and <code>.delay</code> must be set for
each tier.</td>
<td></td>
</tr>

<tr>
<td><code>anti_entropy.throttle.$tier.delay</code></td>
<td>See the description for
<code>anti_entropy.throttle.$tier.mailbox_size</code> above.</td>
<td></td>
</tr>

<tr>
<td><code>anti_entropy.bloomfilter</code></td>
<td>Bloom filters are highly effective in shortcutting data queries
that are destined to not find the requested key, though they tend to
entail a small performance cost.</td>
<td><code>on</code></td>
</tr>

<tr>
<td><code>anti_entropy.max_open_files</code></td>
<td></td>
<td><code>20</code></td>
</tr>

<tr>
<td><code>anti_entropy.write_buffer_size</code></td>
<td>The LevelDB options used by Active Anti-Entropy to generate the
LevelDB-backed on-disk hashtrees.</td>
<td><code>4MB</code></td>
</tr>

<tr>
<td><code>anti_entropy.data_dir</code></td>
<td>The directory where AAE hash trees are stored.</td>
<td><code>./data/anti_entropy</code></td>
</tr>

<tr>
<td><code>anti_entropy.trigger_interval</code></td>
<td>The tick determines how often the Active Anti-Entropy manager looks
for work to do (building/expiring trees, triggering exchanges, etc).
Lowering this value will speed up the rate at which all replicas are
synced across the cluster. Increasing the value is not recommended.
</td>
<td><code>15s</code></td>
</tr>

<tr>
<td><code>anti_entropy.concurrency_limit</code></td>
<td>Limit how many Active Anti-Entropy exchanges or builds can happen
concurrently.</td>
<td><code>2</code></td>
</tr>

<tr>
<td><code>anti_entropy.tree.expiry</code></td>
<td>Determines how often hash trees are expired after being built.
Periodically expiring a hash tree ensures that the on-disk hash tree
data stays consistent with the actual K/V backend data. It also helps
Riak identify silent disk failures and bit rot. However, expiration is
not needed for normal active anti-entropy operations and should be
infrequent for performance reasons. The time is specified in
milliseconds.</td>
<td><code>1w</code></td>
</tr>

<tr>
<td><code>anti_entropy.tree.build_limit.per_timespan</code></td>
<td></td>
<td><code>1h</code></td>
</tr>

<tr>
<td><code>anti_entropy.tree.build_limit.number</code></td>
<td>Restrict how fast AAE can build hash trees. Building the tree for a
given partition requires a full scan over that partition's data. Once
built, trees stay built until they are expired. <code>.number</code> is
the number of builds; <code>.per_timespan</code> is the amount of time
in which that number of builds occurs.</td>
<td><code>1</code></td>
</tr>

<tr>
<td><code>anti_entropy.use_background_manager</code></td>
<td>Whether AAE is to use a background process to limit AAE tree
rebuilds. If set to <code>on</code>, this will help to prevent system
response degradation under times of heavy load from multiple background
tasks that contend for the same system resources; setting this parameter
to <code>off</code> can cut down on system resource usage.
</td>
<td><code>off</code></td>
</tr>

</tbody>
</table>
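
To make the table above more concrete, here is a small `riak.conf` sketch that combines several of the listed settings. The throttle tiers are copied from the example in the table, and the remaining values are the documented defaults; nothing here is a tuning recommendation, and the comments are explanatory only.

```riakconf
## Repair out-of-sync keys in the background (the default)
anti_entropy = active

## Throttle tiers: no delay for an empty mailbox,
## a 5ms delay once the vnode mailbox holds 40 or more messages
anti_entropy.throttle.tier1.mailbox_size = 0
anti_entropy.throttle.tier1.delay = 0ms
anti_entropy.throttle.tier2.mailbox_size = 40
anti_entropy.throttle.tier2.delay = 5ms

## At most 1 hash tree build per hour (the defaults)
anti_entropy.tree.build_limit.number = 1
anti_entropy.tree.build_limit.per_timespan = 1h

## Up to 2 concurrent exchanges or builds (the default)
anti_entropy.concurrency_limit = 2

## Expire hash trees one week after they are built (the default)
anti_entropy.tree.expiry = 1w
```

After editing the file, `riak chkconfig` (see Validate Settings) can be used to check it.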
@@ -0,0 +1,51 @@
---
title_supertext: "Configuring:"
title: "TicTac Active Anti-Entropy"
description: ""
project: "riak_kv"
project_version: 3.0.7
menu:
  riak_kv-3.0.7:
    name: "TicTac AAE"
    identifier: "configuring_tictac_aae"
    weight: 101
    parent: "configuring-active-anti-entropy"
toc: true
version_history:
  in: "2.9.0p5+"
  since: 2.9.0p5
aliases:
---

[configure next-gen-repl]: ../../next-gen-replication

The configuration for TicTac AAE is kept in
the `riak.conf` configuration file.

## Validate Settings

Once your configuration is set, you can verify its correctness by
running the `riak` command-line tool:

```bash
riak chkconfig
```

## riak.conf Settings

Setting | Options | Default | Description
:-------|:--------|:--------|:-----------
`tictacaae_active` | `active`, `passive` | `passive` | Enable or disable TicTac AAE. Note that this setting is applied only at startup; setting the environment variable at runtime will have no impact.
`aae_tokenbucket` | `enabled`, `disabled` | `enabled` | To protect against unbounded queues developing and subsequent timeouts/crashes of the AAE process, back-pressure signalling is used to block the vnode should a backlog develop on the AAE process. This can be disabled.
`tictacaae_dataroot` | `` | `"$platform_data_dir/tictac_aae"` | Set the path for storing tree caches and parallel key stores. Note that at startup folders may be created for every partition, and not removed when that partition hands off (although the contents should be cleared).
`tictacaae_parallelstore` | `leveled_ko`, `leveled_so` | `leveled_so` | On startup, if TicTac AAE is enabled, the vnode will detect whether the vnode backend has the capability to be a "native" store. If not, parallel mode will be entered and a parallel AAE key store will be started. There are two potential parallel store backends: `leveled_ko` and `leveled_so`.
`tictacaae_rebuildwait` | `` | `336` | This is the number of hours between rebuilds of the Tictac AAE system for each vnode. A rebuild will invoke a rebuild of the key store (which is a null operation when in native mode), and then a rebuild of the tree cache from the rebuilt store.
`tictacaae_rebuilddelay` | `` | `345600` | Once the AAE system has expired (due to the rebuild wait), the rebuild is not triggered immediately. It is deferred by a rebuild delay, which is a random number of seconds up to the size of this setting.
`tictacaae_storeheads` | `enabled`, `disabled` | `disabled` | By default only a small amount of metadata is required for AAE purposes, and with storeheads disabled only that small amount of metadata is stored. Enabling storeheads will allow for greater functionality (notably with [`aae_fold`](../../../using/cluster-operations/tictac-aae-fold/)) at the cost of disk space and memory.
`tictacaae_exchangetick` | `` | `240000` | Exchanges are prompted every exchange tick, on each vnode. By default there is a tick every 4 minutes. Exchanges will skip when previous exchanges have not completed, in order to prevent a backlog of fetch-clock scans developing.
`tictacaae_rebuildtick` | `` | `3600000` | Rebuilds will be triggered depending on the riak_kv.tictacaae_rebuildwait, but they must also be prompted by a tick. The tick size can be modified at run-time by setting the environment variable via riak attach.
`tictacaae_maxresults` | `` | `256` | The Merkle tree used has 4096 * 1024 leaves. When a large discrepancy is discovered, only part of the discrepancy will be resolved in each exchange. Active anti-entropy is intended to be a background process for repairing long-term loss of data; hinted handoff and read-repair are the short-term and immediate answers to entropy. How much of the tree is repaired in each pass is defined by `tictacaae_maxresults`.

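For orientation, the fragment below writes out several of the settings from the table as they might appear in `riak.conf`, with the numeric defaults converted to more readable units in the comments. It is a sketch under stated assumptions: `tictacaae_active` and `tictacaae_storeheads` are changed from their defaults purely for illustration, and reading `tictacaae_rebuildtick` as milliseconds is an assumption based on the exchange tick (240000 for 4 minutes), since the table does not state its unit.

```riakconf
## Turn TicTac AAE on (documented default: passive); applied at startup only
tictacaae_active = active

## Keep the default parallel key store backend
tictacaae_parallelstore = leveled_so

## Store more metadata to support aae_fold, at the cost of disk space and memory
tictacaae_storeheads = enabled

## 336 hours = 14 days between rebuilds for each vnode
tictacaae_rebuildwait = 336

## 345600 seconds = 4 days: upper bound of the random rebuild delay
tictacaae_rebuilddelay = 345600

## 240000 milliseconds = 4 minutes between exchange prompts
tictacaae_exchangetick = 240000

## 3600000 milliseconds = 1 hour between rebuild ticks (unit assumed)
tictacaae_rebuildtick = 3600000

## How much of the tree is repaired in each exchange
tictacaae_maxresults = 256
```

For scale, the Merkle tree mentioned in the table has 4096 * 1024 = 4,194,304 leaves.
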
## See also

[Next Gen Replication][configure next-gen-repl] makes extensive use of TicTac AAE, and has some replication-specific TicTac AAE settings.
3 changes: 3 additions & 0 deletions content/riak/kv/3.0.7/configuring/reference.md
@@ -1372,6 +1372,9 @@ to <code>off</code> can cut down on system resource usage.
<td><code>345600</code></td>
</tr>

</tbody>
</table>

## Intra-Cluster Handoff

Configurable parameters for intra-cluster, i.e. inter-node, [handoff][cluster ops handoff].
@@ -1,13 +1,13 @@
 ---
-title: "Managing Active Anti-Entropy"
+title: "Legacy Active Anti-Entropy"
 description: ""
 project: "riak_kv"
 project_version: 3.0.7
 menu:
   riak_kv-3.0.7:
-    name: "Managing Active Anti-Entropy"
+    name: "Legacy AAE"
     identifier: "cluster_operations_aae"
-    weight: 111
+    weight: 110
     parent: "managing_cluster_operations"
 toc: true
 version_history: