From 92476cea487fe86a8d8b1ee630f15675ce92823a Mon Sep 17 00:00:00 2001 From: Jan Melcher Date: Thu, 30 Nov 2023 11:19:39 +0100 Subject: [PATCH 1/3] Update arangosearch-views-reference.md Fix "higher" and "lower" in explanation for commitIntervalMsec A higher value for the interval means the data is committed less frequently, so there will be more time where the index does not account for the changes. Also, memory consumption will grow because the data is not flushed to disk yet. --- .../arangosearch/arangosearch-views-reference.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/site/content/3.12/index-and-search/arangosearch/arangosearch-views-reference.md b/site/content/3.12/index-and-search/arangosearch/arangosearch-views-reference.md index 93c9d1c09e..7f6fd97c0a 100644 --- a/site/content/3.12/index-and-search/arangosearch/arangosearch-views-reference.md +++ b/site/content/3.12/index-and-search/arangosearch/arangosearch-views-reference.md @@ -392,10 +392,10 @@ of removing unused segments after release of internal resources. Wait at least this many milliseconds between committing View data store changes and making documents visible to queries. - For the case where there are a lot of inserts/updates, a lower value, until + For the case where there are a lot of inserts/updates, a higher value, until commit, causes the index not to account for them and memory usage continues to grow. - For the case where there are a few inserts/updates, a higher value impacts + For the case where there are a few inserts/updates, a lower value impacts performance and wastes disk space for each commit call without any added benefits. 
From 40a2fd0564c729ecc7d5c4f237c79a7eb64babcf Mon Sep 17 00:00:00 2001 From: Simran Spiller Date: Thu, 21 Dec 2023 15:19:35 +0100 Subject: [PATCH 2/3] Review --- .../3.10/develop/http-api/indexes/inverted.md | 11 +++---- .../http-api/views/arangosearch-views.md | 33 +++++++++---------- .../arangosearch-views-reference.md | 11 +++---- .../3.11/develop/http-api/indexes/inverted.md | 11 +++---- .../http-api/views/arangosearch-views.md | 33 +++++++++---------- .../arangosearch-views-reference.md | 11 +++---- .../3.12/develop/http-api/indexes/inverted.md | 11 +++---- .../http-api/views/arangosearch-views.md | 33 +++++++++---------- 8 files changed, 70 insertions(+), 84 deletions(-) diff --git a/site/content/3.10/develop/http-api/indexes/inverted.md b/site/content/3.10/develop/http-api/indexes/inverted.md index 12f1201bed..1634b889c5 100644 --- a/site/content/3.10/develop/http-api/indexes/inverted.md +++ b/site/content/3.10/develop/http-api/indexes/inverted.md @@ -489,12 +489,11 @@ paths: Wait at least this many milliseconds between committing inverted index data store changes and making documents visible to queries (default: 1000, to disable use: 0). - For the case where there are a lot of inserts/updates, a lower value, until - commit, will cause the index not to account for them and memory usage would - continue to grow. - For the case where there are a few inserts/updates, a higher value will impact - performance and waste disk space for each commit call without any added - benefits. + For the case where there are a lot of inserts/updates, a higher value causes the + index not to account for them and memory usage continues to grow until the commit. + For the case where there are a few inserts/updates, a lower value impacts + performance (because of synchronous locking) and wastes disk space for each + commit call. 
_Background:_ For data retrieval, ArangoSearch follows the concept of diff --git a/site/content/3.10/develop/http-api/views/arangosearch-views.md b/site/content/3.10/develop/http-api/views/arangosearch-views.md index 920bc95514..223781f73d 100644 --- a/site/content/3.10/develop/http-api/views/arangosearch-views.md +++ b/site/content/3.10/develop/http-api/views/arangosearch-views.md @@ -179,12 +179,11 @@ paths: Wait at least this many milliseconds between committing View data store changes and making documents visible to queries (default: 1000, to disable use: 0). - For the case where there are a lot of inserts/updates, a lower value, until - commit, will cause the index not to account for them and memory usage would - continue to grow. - For the case where there are a few inserts/updates, a higher value will impact - performance and waste disk space for each commit call without any added - benefits. + For the case where there are a lot of inserts/updates, a higher value causes the + index not to account for them and memory usage continues to grow until the commit. + For the case where there are a few inserts/updates, a lower value impacts + performance (because of synchronous locking) and wastes disk space for each + commit call. _Background:_ For data retrieval, ArangoSearch follows the concept of @@ -541,12 +540,11 @@ paths: Wait at least this many milliseconds between committing View data store changes and making documents visible to queries (default: 1000, to disable use: 0). - For the case where there are a lot of inserts/updates, a lower value, until - commit, will cause the index not to account for them and memory usage would - continue to grow. - For the case where there are a few inserts/updates, a higher value will impact - performance and waste disk space for each commit call without any added - benefits. 
+ For the case where there are a lot of inserts/updates, a higher value causes the + index not to account for them and memory usage continues to grow until the commit. + For the case where there are a few inserts/updates, a lower value impacts + performance (because of synchronous locking) and wastes disk space for each + commit call. _Background:_ For data retrieval, ArangoSearch follow the concept of @@ -708,12 +706,11 @@ paths: Wait at least this many milliseconds between committing View data store changes and making documents visible to queries (default: 1000, to disable use: 0). - For the case where there are a lot of inserts/updates, a lower value, until - commit, will cause the index not to account for them and memory usage would - continue to grow. - For the case where there are a few inserts/updates, a higher value will impact - performance and waste disk space for each commit call without any added - benefits. + For the case where there are a lot of inserts/updates, a higher value causes the + index not to account for them and memory usage continues to grow until the commit. + For the case where there are a few inserts/updates, a lower value impacts + performance (because of synchronous locking) and wastes disk space for each + commit call. _Background:_ For data retrieval, ArangoSearch follows the concept of diff --git a/site/content/3.10/index-and-search/arangosearch/arangosearch-views-reference.md b/site/content/3.10/index-and-search/arangosearch/arangosearch-views-reference.md index c0bf28170e..496ac93b0d 100644 --- a/site/content/3.10/index-and-search/arangosearch/arangosearch-views-reference.md +++ b/site/content/3.10/index-and-search/arangosearch/arangosearch-views-reference.md @@ -343,12 +343,11 @@ of removing unused segments after release of internal resources. Wait at least this many milliseconds between committing View data store changes and making documents visible to queries. 
- For the case where there are a lot of inserts/updates, a lower value, until - commit, causes the index not to account for them and memory usage continues - to grow. - For the case where there are a few inserts/updates, a higher value impacts - performance and wastes disk space for each commit call without any added - benefits. + For the case where there are a lot of inserts/updates, a higher value causes the + index not to account for them and memory usage continues to grow until the commit. + For the case where there are a few inserts/updates, a lower value impacts + performance (because of synchronous locking) and wastes disk space for each + commit call. > For data retrieval `arangosearch` Views follow the concept of > "eventually-consistent", i.e. eventually all the data in ArangoDB is diff --git a/site/content/3.11/develop/http-api/indexes/inverted.md b/site/content/3.11/develop/http-api/indexes/inverted.md index 8c904a052e..f221a65858 100644 --- a/site/content/3.11/develop/http-api/indexes/inverted.md +++ b/site/content/3.11/develop/http-api/indexes/inverted.md @@ -482,12 +482,11 @@ paths: Wait at least this many milliseconds between committing inverted index data store changes and making documents visible to queries (default: 1000, to disable use: 0). - For the case where there are a lot of inserts/updates, a lower value, until - commit, will cause the index not to account for them and memory usage would - continue to grow. - For the case where there are a few inserts/updates, a higher value will impact - performance and waste disk space for each commit call without any added - benefits. + For the case where there are a lot of inserts/updates, a higher value causes the + index not to account for them and memory usage continues to grow until the commit. + For the case where there are a few inserts/updates, a lower value impacts + performance (because of synchronous locking) and wastes disk space for each + commit call. 
_Background:_ For data retrieval, ArangoSearch follows the concept of diff --git a/site/content/3.11/develop/http-api/views/arangosearch-views.md b/site/content/3.11/develop/http-api/views/arangosearch-views.md index 147089f396..d73e2e9afe 100644 --- a/site/content/3.11/develop/http-api/views/arangosearch-views.md +++ b/site/content/3.11/develop/http-api/views/arangosearch-views.md @@ -178,12 +178,11 @@ paths: Wait at least this many milliseconds between committing View data store changes and making documents visible to queries (default: 1000, to disable use: 0). - For the case where there are a lot of inserts/updates, a lower value, until - commit, will cause the index not to account for them and memory usage would - continue to grow. - For the case where there are a few inserts/updates, a higher value will impact - performance and waste disk space for each commit call without any added - benefits. + For the case where there are a lot of inserts/updates, a higher value causes the + index not to account for them and memory usage continues to grow until the commit. + For the case where there are a few inserts/updates, a lower value impacts + performance (because of synchronous locking) and wastes disk space for each + commit call. _Background:_ For data retrieval, ArangoSearch follows the concept of @@ -540,12 +539,11 @@ paths: Wait at least this many milliseconds between committing View data store changes and making documents visible to queries (default: 1000, to disable use: 0). - For the case where there are a lot of inserts/updates, a lower value, until - commit, will cause the index not to account for them and memory usage would - continue to grow. - For the case where there are a few inserts/updates, a higher value will impact - performance and waste disk space for each commit call without any added - benefits. 
+ For the case where there are a lot of inserts/updates, a higher value causes the + index not to account for them and memory usage continues to grow until the commit. + For the case where there are a few inserts/updates, a lower value impacts + performance (because of synchronous locking) and wastes disk space for each + commit call. _Background:_ For data retrieval, ArangoSearch follow the concept of @@ -707,12 +705,11 @@ paths: Wait at least this many milliseconds between committing View data store changes and making documents visible to queries (default: 1000, to disable use: 0). - For the case where there are a lot of inserts/updates, a lower value, until - commit, will cause the index not to account for them and memory usage would - continue to grow. - For the case where there are a few inserts/updates, a higher value will impact - performance and waste disk space for each commit call without any added - benefits. + For the case where there are a lot of inserts/updates, a higher value causes the + index not to account for them and memory usage continues to grow until the commit. + For the case where there are a few inserts/updates, a lower value impacts + performance (because of synchronous locking) and wastes disk space for each + commit call. _Background:_ For data retrieval, ArangoSearch follows the concept of diff --git a/site/content/3.11/index-and-search/arangosearch/arangosearch-views-reference.md b/site/content/3.11/index-and-search/arangosearch/arangosearch-views-reference.md index 116274ee19..0516f4f70f 100644 --- a/site/content/3.11/index-and-search/arangosearch/arangosearch-views-reference.md +++ b/site/content/3.11/index-and-search/arangosearch/arangosearch-views-reference.md @@ -369,12 +369,11 @@ of removing unused segments after release of internal resources. Wait at least this many milliseconds between committing View data store changes and making documents visible to queries. 
- For the case where there are a lot of inserts/updates, a lower value, until - commit, causes the index not to account for them and memory usage continues - to grow. - For the case where there are a few inserts/updates, a higher value impacts - performance and wastes disk space for each commit call without any added - benefits. + For the case where there are a lot of inserts/updates, a higher value causes the + index not to account for them and memory usage continues to grow until the commit. + For the case where there are a few inserts/updates, a lower value impacts + performance (because of synchronous locking) and wastes disk space for each + commit call. > For data retrieval `arangosearch` Views follow the concept of > "eventually-consistent", i.e. eventually all the data in ArangoDB is diff --git a/site/content/3.12/develop/http-api/indexes/inverted.md b/site/content/3.12/develop/http-api/indexes/inverted.md index 2f71dc365b..3265197f5e 100644 --- a/site/content/3.12/develop/http-api/indexes/inverted.md +++ b/site/content/3.12/develop/http-api/indexes/inverted.md @@ -508,12 +508,11 @@ paths: Wait at least this many milliseconds between committing inverted index data store changes and making documents visible to queries (default: 1000, to disable use: 0). - For the case where there are a lot of inserts/updates, a lower value, until - commit, will cause the index not to account for them and memory usage would - continue to grow. - For the case where there are a few inserts/updates, a higher value will impact - performance and waste disk space for each commit call without any added - benefits. + For the case where there are a lot of inserts/updates, a higher value causes the + index not to account for them and memory usage continues to grow until the commit. + For the case where there are a few inserts/updates, a lower value impacts + performance (because of synchronous locking) and wastes disk space for each + commit call. 
_Background:_ For data retrieval, ArangoSearch follows the concept of diff --git a/site/content/3.12/develop/http-api/views/arangosearch-views.md b/site/content/3.12/develop/http-api/views/arangosearch-views.md index bc0e55bd5a..963244368b 100644 --- a/site/content/3.12/develop/http-api/views/arangosearch-views.md +++ b/site/content/3.12/develop/http-api/views/arangosearch-views.md @@ -202,12 +202,11 @@ paths: Wait at least this many milliseconds between committing View data store changes and making documents visible to queries (default: 1000, to disable use: 0). - For the case where there are a lot of inserts/updates, a lower value, until - commit, will cause the index not to account for them and memory usage would - continue to grow. - For the case where there are a few inserts/updates, a higher value will impact - performance and waste disk space for each commit call without any added - benefits. + For the case where there are a lot of inserts/updates, a higher value causes the + index not to account for them and memory usage continues to grow until the commit. + For the case where there are a few inserts/updates, a lower value impacts + performance (because of synchronous locking) and wastes disk space for each + commit call. _Background:_ For data retrieval, ArangoSearch follows the concept of @@ -564,12 +563,11 @@ paths: Wait at least this many milliseconds between committing View data store changes and making documents visible to queries (default: 1000, to disable use: 0). - For the case where there are a lot of inserts/updates, a lower value, until - commit, will cause the index not to account for them and memory usage would - continue to grow. - For the case where there are a few inserts/updates, a higher value will impact - performance and waste disk space for each commit call without any added - benefits. 
+ For the case where there are a lot of inserts/updates, a higher value causes the + index not to account for them and memory usage continues to grow until the commit. + For the case where there are a few inserts/updates, a lower value impacts + performance (because of synchronous locking) and wastes disk space for each + commit call. _Background:_ For data retrieval, ArangoSearch follow the concept of @@ -731,12 +729,11 @@ paths: Wait at least this many milliseconds between committing View data store changes and making documents visible to queries (default: 1000, to disable use: 0). - For the case where there are a lot of inserts/updates, a lower value, until - commit, will cause the index not to account for them and memory usage would - continue to grow. - For the case where there are a few inserts/updates, a higher value will impact - performance and waste disk space for each commit call without any added - benefits. + For the case where there are a lot of inserts/updates, a higher value causes the + index not to account for them and memory usage continues to grow until the commit. + For the case where there are a few inserts/updates, a lower value impacts + performance (because of synchronous locking) and wastes disk space for each + commit call. 
_Background:_ For data retrieval, ArangoSearch follows the concept of From 886e0fa612a4c254dd316118193cbc81b9955cfc Mon Sep 17 00:00:00 2001 From: Simran Spiller Date: Thu, 21 Dec 2023 15:57:28 +0100 Subject: [PATCH 3/3] Tweaks --- .../guides/working-with-files.md | 10 +-- .../3.10/develop/http-api/indexes/inverted.md | 18 ++-- .../http-api/views/arangosearch-views.md | 82 +++++++++---------- .../arangosearch-views-reference.md | 10 +-- .../guides/working-with-files.md | 10 +-- .../3.11/develop/http-api/indexes/inverted.md | 18 ++-- .../http-api/views/arangosearch-views.md | 82 +++++++++---------- .../arangosearch-views-reference.md | 10 +-- .../guides/working-with-files.md | 10 +-- .../3.12/develop/http-api/indexes/inverted.md | 18 ++-- .../http-api/views/arangosearch-views.md | 82 +++++++++---------- .../arangosearch-views-reference.md | 15 ++-- 12 files changed, 182 insertions(+), 183 deletions(-) diff --git a/site/content/3.10/develop/foxx-microservices/guides/working-with-files.md b/site/content/3.10/develop/foxx-microservices/guides/working-with-files.md index 85a816bda7..49e488bddb 100644 --- a/site/content/3.10/develop/foxx-microservices/guides/working-with-files.md +++ b/site/content/3.10/develop/foxx-microservices/guides/working-with-files.md @@ -61,11 +61,11 @@ the filesystem from within a service: may therefore cause race conditions and **result in corrupted data**. - Writing to files outside the service folder introduces external state. In - a cluster this will result in Coordinators no longer being interchangeable. + a cluster, this results in Coordinators no longer being interchangeable. - Writing to files during setup is unreliable because the setup script may - be executed several times or not at all. In a cluster the setup script - will only be executed on a single Coordinator. + be executed several times or not at all. In a cluster, the setup script + is only executed on a single Coordinator. 
Therefore it is almost always a better option to store files using a specialized, external file storage service @@ -77,13 +77,13 @@ ArangoDB documents by using a separate collection. {{< danger >}} Due to the way ArangoDB stores documents internally, you should not store file contents alongside other attributes that might be updated independently. -Additionally, large file sizes will impact performance for operations +Additionally, large file sizes impact performance for operations involving the document and may affect overall database performance. {{< /danger >}} {{< warning >}} In production, you should avoid storing any files in ArangoDB or handling file -uploads in Foxx. The following example will work for moderate amounts of small +uploads in Foxx. The following example works for moderate amounts of small files but is not recommended for large files or frequent uploads or modifications. {{< /warning >}} diff --git a/site/content/3.10/develop/http-api/indexes/inverted.md b/site/content/3.10/develop/http-api/indexes/inverted.md index 9a934fa4b1..f3960b150d 100644 --- a/site/content/3.10/develop/http-api/indexes/inverted.md +++ b/site/content/3.10/develop/http-api/indexes/inverted.md @@ -472,10 +472,10 @@ paths: Wait at least this many commits between removing unused files in the ArangoSearch data directory (default: 2, to disable use: 0). For the case where the consolidation policies merge segments often (i.e. a lot - of commit+consolidate), a lower value will cause a lot of disk space to be + of commit+consolidate), a lower value causes a lot of disk space to be wasted. For the case where the consolidation policies rarely merge segments (i.e. few - inserts/deletes), a higher value will impact performance without any added + inserts/deletes), a higher value impacts performance without any added benefits. _Background:_ @@ -493,9 +493,9 @@ paths: use: 0). 
For the case where there are a lot of inserts/updates, a higher value causes the index not to account for them and memory usage continues to grow until the commit. - For the case where there are a few inserts/updates, a lower value impacts - performance (because of synchronous locking) and wastes disk space for each - commit call. + A lower value impacts performance, including the case where there are no or only a + few inserts/updates because of synchronous locking, and it wastes disk space for + each commit call. _Background:_ For data retrieval, ArangoSearch follows the concept of @@ -517,7 +517,7 @@ paths: For the case where there are a lot of data modification operations, a higher value could potentially have the data store consume more space and file handles. For the case where there are a few data modification operations, a lower value - will impact performance due to no segment candidates available for + impacts performance due to no segment candidates being available for consolidation. _Background:_ @@ -535,8 +535,8 @@ paths: _Background:_ With each ArangoDB transaction that inserts documents, one or more ArangoSearch-internal segments get created. - Similarly, for removed documents the segments that contain such documents - will have these documents marked as 'deleted'. + Similarly, for removed documents, the segments that contain such documents + have these documents marked as 'deleted'. Over time, this approach causes a lot of small and sparse segments to be created. A "consolidation" operation selects one or more segments and copies all of @@ -593,7 +593,7 @@ paths: description: | Maximum memory byte size per writer (segment) before a writer (segment) flush is triggered. `0` value turns off this limit for any writer (buffer) and data - will be flushed periodically based on the value defined for the flush thread + is flushed periodically based on the value defined for the flush thread (ArangoDB server startup option). 
`0` value should be used carefully due to high potential memory consumption (default: 33554432, use 0 to disable) diff --git a/site/content/3.10/develop/http-api/views/arangosearch-views.md b/site/content/3.10/develop/http-api/views/arangosearch-views.md index 1a9d403fac..9526bc380a 100644 --- a/site/content/3.10/develop/http-api/views/arangosearch-views.md +++ b/site/content/3.10/develop/http-api/views/arangosearch-views.md @@ -160,10 +160,10 @@ paths: Wait at least this many commits between removing unused files in the ArangoSearch data directory (default: 2, to disable use: 0). For the case where the consolidation policies merge segments often (i.e. a lot - of commit+consolidate), a lower value will cause a lot of disk space to be + of commit+consolidate), a lower value causes a lot of disk space to be wasted. For the case where the consolidation policies rarely merge segments (i.e. few - inserts/deletes), a higher value will impact performance without any added + inserts/deletes), a higher value impacts performance without any added benefits. _Background:_ @@ -181,9 +181,9 @@ paths: use: 0). For the case where there are a lot of inserts/updates, a higher value causes the index not to account for them and memory usage continues to grow until the commit. - For the case where there are a few inserts/updates, a lower value impacts - performance (because of synchronous locking) and wastes disk space for each - commit call. + A lower value impacts performance, including the case where there are no or only a + few inserts/updates because of synchronous locking, and it wastes disk space for + each commit call. _Background:_ For data retrieval, ArangoSearch follows the concept of @@ -205,11 +205,11 @@ paths: For the case where there are a lot of data modification operations, a higher value could potentially have the data store consume more space and file handles. 
For the case where there are a few data modification operations, a lower value - will impact performance due to no segment candidates available for + impacts performance due to no segment candidates being available for consolidation. _Background:_ - For data modification, ArangoSearch follow the concept of a + For data modification, ArangoSearch follows the concept of a "versioned data store". Thus old versions of data may be removed once there are no longer any users of the old data. The frequency of the cleanup and compaction operations are governed by `consolidationIntervalMsec` and the @@ -223,8 +223,8 @@ paths: _Background:_ With each ArangoDB transaction that inserts documents, one or more ArangoSearch-internal segments get created. - Similarly, for removed documents the segments that contain such documents - will have these documents marked as 'deleted'. + Similarly, for removed documents, the segments that contain such documents + have these documents marked as 'deleted'. Over time, this approach causes a lot of small and sparse segments to be created. 
A "consolidation" operation selects one or more segments and copies all of @@ -251,10 +251,10 @@ paths: (default: 2097152) - `segmentsBytesMax` (number, _optional_): Maximum allowed size of all consolidated segments in bytes (default: 5368709120) - - `segmentsMax` (number, _optional_): The maximum number of segments that will - be evaluated as candidates for consolidation (default: 10) - - `segmentsMin` (number, _optional_): The minimum number of segments that will - be evaluated as candidates for consolidation (default: 1) + - `segmentsMax` (number, _optional_): The maximum number of segments that are + evaluated as candidates for consolidation (default: 10) + - `segmentsMin` (number, _optional_): The minimum number of segments that are + evaluated as candidates for consolidation (default: 1) - `minScore` (number, _optional_): (default: 0) type: object writebufferIdle: @@ -272,7 +272,7 @@ paths: description: | Maximum memory byte size per writer (segment) before a writer (segment) flush is triggered. `0` value turns off this limit for any writer (buffer) and data - will be flushed periodically based on the value defined for the flush thread + is flushed periodically based on the value defined for the flush thread (ArangoDB server startup option). `0` value should be used carefully due to high potential memory consumption (default: 33554432, use 0 to disable, immutable) @@ -521,10 +521,10 @@ paths: Wait at least this many commits between removing unused files in the ArangoSearch data directory (default: 2, to disable use: 0). For the case where the consolidation policies merge segments often (i.e. a lot - of commit+consolidate), a lower value will cause a lot of disk space to be + of commit+consolidate), a lower value causes a lot of disk space to be wasted. For the case where the consolidation policies rarely merge segments (i.e. 
few - inserts/deletes), a higher value will impact performance without any added + inserts/deletes), a higher value impacts performance without any added benefits. _Background:_ @@ -542,12 +542,12 @@ paths: use: 0). For the case where there are a lot of inserts/updates, a higher value causes the index not to account for them and memory usage continues to grow until the commit. - For the case where there are a few inserts/updates, a lower value impacts - performance (because of synchronous locking) and wastes disk space for each - commit call. + A lower value impacts performance, including the case where there are no or only a + few inserts/updates because of synchronous locking, and it wastes disk space for + each commit call. _Background:_ - For data retrieval, ArangoSearch follow the concept of + For data retrieval, ArangoSearch follows the concept of "eventually-consistent", i.e. eventually all the data in ArangoDB will be matched by corresponding query expressions. The concept of ArangoSearch "commit" operations is introduced to @@ -566,11 +566,11 @@ paths: For the case where there are a lot of data modification operations, a higher value could potentially have the data store consume more space and file handles. For the case where there are a few data modification operations, a lower value - will impact performance due to no segment candidates available for + impacts performance due to no segment candidates being available for consolidation. _Background:_ - For data modification, ArangoSearch follow the concept of a + For data modification, ArangoSearch follows the concept of a "versioned data store". Thus old versions of data may be removed once there are no longer any users of the old data. The frequency of the cleanup and compaction operations are governed by `consolidationIntervalMsec` and the @@ -584,8 +584,8 @@ paths: _Background:_ With each ArangoDB transaction that inserts documents, one or more ArangoSearch-internal segments get created. 
- Similarly, for removed documents the segments that contain such documents - will have these documents marked as 'deleted'. + Similarly, for removed documents, the segments that contain such documents + have these documents marked as 'deleted'. Over time, this approach causes a lot of small and sparse segments to be created. A "consolidation" operation selects one or more segments and copies all of @@ -612,10 +612,10 @@ paths: (default: 2097152) - `segmentsBytesMax` (number, _optional_): Maximum allowed size of all consolidated segments in bytes (default: 5368709120) - - `segmentsMax` (number, _optional_): The maximum number of segments that will - be evaluated as candidates for consolidation (default: 10) - - `segmentsMin` (number, _optional_): The minimum number of segments that will - be evaluated as candidates for consolidation (default: 1) + - `segmentsMax` (number, _optional_): The maximum number of segments that are + evaluated as candidates for consolidation (default: 10) + - `segmentsMin` (number, _optional_): The minimum number of segments that are + evaluated as candidates for consolidation (default: 1) - `minScore` (number, _optional_): (default: 0) type: object responses: @@ -687,10 +687,10 @@ paths: Wait at least this many commits between removing unused files in the ArangoSearch data directory (default: 2, to disable use: 0). For the case where the consolidation policies merge segments often (i.e. a lot - of commit+consolidate), a lower value will cause a lot of disk space to be + of commit+consolidate), a lower value causes a lot of disk space to be wasted. For the case where the consolidation policies rarely merge segments (i.e. few - inserts/deletes), a higher value will impact performance without any added + inserts/deletes), a higher value impacts performance without any added benefits. _Background:_ @@ -708,9 +708,9 @@ paths: use: 0). 
For the case where there are a lot of inserts/updates, a higher value causes the index not to account for them and memory usage continues to grow until the commit. - For the case where there are a few inserts/updates, a lower value impacts - performance (because of synchronous locking) and wastes disk space for each - commit call. + A lower value impacts performance, including the case where there are no or only a + few inserts/updates because of synchronous locking, and it wastes disk space for + each commit call. _Background:_ For data retrieval, ArangoSearch follows the concept of @@ -732,11 +732,11 @@ paths: For the case where there are a lot of data modification operations, a higher value could potentially have the data store consume more space and file handles. For the case where there are a few data modification operations, a lower value - will impact performance due to no segment candidates available for + impacts performance due to no segment candidates being available for consolidation. _Background:_ - For data modification, ArangoSearch follow the concept of a + For data modification, ArangoSearch follows the concept of a "versioned data store". Thus old versions of data may be removed once there are no longer any users of the old data. The frequency of the cleanup and compaction operations are governed by `consolidationIntervalMsec` and the @@ -750,8 +750,8 @@ paths: _Background:_ With each ArangoDB transaction that inserts documents, one or more ArangoSearch-internal segments get created. - Similarly, for removed documents the segments that contain such documents - will have these documents marked as 'deleted'. + Similarly, for removed documents, the segments that contain such documents + have these documents marked as 'deleted'. Over time, this approach causes a lot of small and sparse segments to be created. 
A "consolidation" operation selects one or more segments and copies all of @@ -778,10 +778,10 @@ paths: (default: 2097152) - `segmentsBytesMax` (number, _optional_): Maximum allowed size of all consolidated segments in bytes (default: 5368709120) - - `segmentsMax` (number, _optional_): The maximum number of segments that will - be evaluated as candidates for consolidation (default: 10) - - `segmentsMin` (number, _optional_): The minimum number of segments that will - be evaluated as candidates for consolidation (default: 1) + - `segmentsMax` (number, _optional_): The maximum number of segments that are + evaluated as candidates for consolidation (default: 10) + - `segmentsMin` (number, _optional_): The minimum number of segments that are + evaluated as candidates for consolidation (default: 1) - `minScore` (number, _optional_): (default: 0) type: object responses: diff --git a/site/content/3.10/index-and-search/arangosearch/arangosearch-views-reference.md b/site/content/3.10/index-and-search/arangosearch/arangosearch-views-reference.md index 9dc65f4041..0a08a1cc40 100644 --- a/site/content/3.10/index-and-search/arangosearch/arangosearch-views-reference.md +++ b/site/content/3.10/index-and-search/arangosearch/arangosearch-views-reference.md @@ -345,9 +345,9 @@ of removing unused segments after release of internal resources. For the case where there are a lot of inserts/updates, a higher value causes the index not to account for them and memory usage continues to grow until the commit. - For the case where there are a few inserts/updates, a lower value impacts - performance (because of synchronous locking) and wastes disk space for each - commit call. + A lower value impacts performance, including the case where there are no or only a + few inserts/updates because of synchronous locking, and it wastes disk space for + each commit call. > For data retrieval `arangosearch` Views follow the concept of > "eventually-consistent", i.e. 
eventually all the data in ArangoDB is @@ -398,8 +398,8 @@ is used by these writers (in terms of "writers pool") one can use to disable use: `0`; _immutable_) Maximum memory byte size per writer (segment) before a writer (segment) flush is - triggered. `0` value turns off this limit for any writer (buffer) and data will - be flushed periodically. `0` value should be used carefully due to high + triggered. `0` value turns off this limit for any writer (buffer) and data is + flushed periodically. `0` value should be used carefully due to high potential memory consumption. - **consolidationPolicy** (_optional_; type: `object`; default: `{}`) diff --git a/site/content/3.11/develop/foxx-microservices/guides/working-with-files.md b/site/content/3.11/develop/foxx-microservices/guides/working-with-files.md index 85a816bda7..49e488bddb 100644 --- a/site/content/3.11/develop/foxx-microservices/guides/working-with-files.md +++ b/site/content/3.11/develop/foxx-microservices/guides/working-with-files.md @@ -61,11 +61,11 @@ the filesystem from within a service: may therefore cause race conditions and **result in corrupted data**. - Writing to files outside the service folder introduces external state. In - a cluster this will result in Coordinators no longer being interchangeable. + a cluster, this results in Coordinators no longer being interchangeable. - Writing to files during setup is unreliable because the setup script may - be executed several times or not at all. In a cluster the setup script - will only be executed on a single Coordinator. + be executed several times or not at all. In a cluster, the setup script + is only executed on a single Coordinator. Therefore it is almost always a better option to store files using a specialized, external file storage service @@ -77,13 +77,13 @@ ArangoDB documents by using a separate collection. 
{{< danger >}} Due to the way ArangoDB stores documents internally, you should not store file contents alongside other attributes that might be updated independently. -Additionally, large file sizes will impact performance for operations +Additionally, large file sizes impact performance for operations involving the document and may affect overall database performance. {{< /danger >}} {{< warning >}} In production, you should avoid storing any files in ArangoDB or handling file -uploads in Foxx. The following example will work for moderate amounts of small +uploads in Foxx. The following example works for moderate amounts of small files but is not recommended for large files or frequent uploads or modifications. {{< /warning >}} diff --git a/site/content/3.11/develop/http-api/indexes/inverted.md b/site/content/3.11/develop/http-api/indexes/inverted.md index d6f76e5f5e..4c5a218976 100644 --- a/site/content/3.11/develop/http-api/indexes/inverted.md +++ b/site/content/3.11/develop/http-api/indexes/inverted.md @@ -465,10 +465,10 @@ paths: Wait at least this many commits between removing unused files in the ArangoSearch data directory (default: 2, to disable use: 0). For the case where the consolidation policies merge segments often (i.e. a lot - of commit+consolidate), a lower value will cause a lot of disk space to be + of commit+consolidate), a lower value causes a lot of disk space to be wasted. For the case where the consolidation policies rarely merge segments (i.e. few - inserts/deletes), a higher value will impact performance without any added + inserts/deletes), a higher value impacts performance without any added benefits. _Background:_ @@ -486,9 +486,9 @@ paths: use: 0). For the case where there are a lot of inserts/updates, a higher value causes the index not to account for them and memory usage continues to grow until the commit. 
- For the case where there are a few inserts/updates, a lower value impacts - performance (because of synchronous locking) and wastes disk space for each - commit call. + A lower value impacts performance, including the case where there are no or only a + few inserts/updates because of synchronous locking, and it wastes disk space for + each commit call. _Background:_ For data retrieval, ArangoSearch follows the concept of @@ -510,7 +510,7 @@ paths: For the case where there are a lot of data modification operations, a higher value could potentially have the data store consume more space and file handles. For the case where there are a few data modification operations, a lower value - will impact performance due to no segment candidates available for + impacts performance due to no segment candidates being available for consolidation. _Background:_ @@ -528,8 +528,8 @@ paths: _Background:_ With each ArangoDB transaction that inserts documents, one or more ArangoSearch-internal segments get created. - Similarly, for removed documents the segments that contain such documents - will have these documents marked as 'deleted'. + Similarly, for removed documents, the segments that contain such documents + have these documents marked as 'deleted'. Over time, this approach causes a lot of small and sparse segments to be created. A "consolidation" operation selects one or more segments and copies all of @@ -586,7 +586,7 @@ paths: description: | Maximum memory byte size per writer (segment) before a writer (segment) flush is triggered. `0` value turns off this limit for any writer (buffer) and data - will be flushed periodically based on the value defined for the flush thread + is flushed periodically based on the value defined for the flush thread (ArangoDB server startup option). 
`0` value should be used carefully due to high potential memory consumption (default: 33554432, use 0 to disable) diff --git a/site/content/3.11/develop/http-api/views/arangosearch-views.md b/site/content/3.11/develop/http-api/views/arangosearch-views.md index e5700f02d2..d52ac3b76a 100644 --- a/site/content/3.11/develop/http-api/views/arangosearch-views.md +++ b/site/content/3.11/develop/http-api/views/arangosearch-views.md @@ -159,10 +159,10 @@ paths: Wait at least this many commits between removing unused files in the ArangoSearch data directory (default: 2, to disable use: 0). For the case where the consolidation policies merge segments often (i.e. a lot - of commit+consolidate), a lower value will cause a lot of disk space to be + of commit+consolidate), a lower value causes a lot of disk space to be wasted. For the case where the consolidation policies rarely merge segments (i.e. few - inserts/deletes), a higher value will impact performance without any added + inserts/deletes), a higher value impacts performance without any added benefits. _Background:_ @@ -180,9 +180,9 @@ paths: use: 0). For the case where there are a lot of inserts/updates, a higher value causes the index not to account for them and memory usage continues to grow until the commit. - For the case where there are a few inserts/updates, a lower value impacts - performance (because of synchronous locking) and wastes disk space for each - commit call. + A lower value impacts performance, including the case where there are no or only a + few inserts/updates because of synchronous locking, and it wastes disk space for + each commit call. _Background:_ For data retrieval, ArangoSearch follows the concept of @@ -204,11 +204,11 @@ paths: For the case where there are a lot of data modification operations, a higher value could potentially have the data store consume more space and file handles. 
For the case where there are a few data modification operations, a lower value - will impact performance due to no segment candidates available for + impacts performance due to no segment candidates being available for consolidation. _Background:_ - For data modification, ArangoSearch follow the concept of a + For data modification, ArangoSearch follows the concept of a "versioned data store". Thus old versions of data may be removed once there are no longer any users of the old data. The frequency of the cleanup and compaction operations are governed by `consolidationIntervalMsec` and the @@ -222,8 +222,8 @@ paths: _Background:_ With each ArangoDB transaction that inserts documents, one or more ArangoSearch-internal segments get created. - Similarly, for removed documents the segments that contain such documents - will have these documents marked as 'deleted'. + Similarly, for removed documents, the segments that contain such documents + have these documents marked as 'deleted'. Over time, this approach causes a lot of small and sparse segments to be created. 
A "consolidation" operation selects one or more segments and copies all of @@ -250,10 +250,10 @@ paths: (default: 2097152) - `segmentsBytesMax` (number, _optional_): Maximum allowed size of all consolidated segments in bytes (default: 5368709120) - - `segmentsMax` (number, _optional_): The maximum number of segments that will - be evaluated as candidates for consolidation (default: 10) - - `segmentsMin` (number, _optional_): The minimum number of segments that will - be evaluated as candidates for consolidation (default: 1) + - `segmentsMax` (number, _optional_): The maximum number of segments that are + evaluated as candidates for consolidation (default: 10) + - `segmentsMin` (number, _optional_): The minimum number of segments that are + evaluated as candidates for consolidation (default: 1) - `minScore` (number, _optional_): (default: 0) type: object writebufferIdle: @@ -271,7 +271,7 @@ paths: description: | Maximum memory byte size per writer (segment) before a writer (segment) flush is triggered. `0` value turns off this limit for any writer (buffer) and data - will be flushed periodically based on the value defined for the flush thread + is flushed periodically based on the value defined for the flush thread (ArangoDB server startup option). `0` value should be used carefully due to high potential memory consumption (default: 33554432, use 0 to disable, immutable) @@ -520,10 +520,10 @@ paths: Wait at least this many commits between removing unused files in the ArangoSearch data directory (default: 2, to disable use: 0). For the case where the consolidation policies merge segments often (i.e. a lot - of commit+consolidate), a lower value will cause a lot of disk space to be + of commit+consolidate), a lower value causes a lot of disk space to be wasted. For the case where the consolidation policies rarely merge segments (i.e. 
few - inserts/deletes), a higher value will impact performance without any added + inserts/deletes), a higher value impacts performance without any added benefits. _Background:_ @@ -541,12 +541,12 @@ paths: use: 0). For the case where there are a lot of inserts/updates, a higher value causes the index not to account for them and memory usage continues to grow until the commit. - For the case where there are a few inserts/updates, a lower value impacts - performance (because of synchronous locking) and wastes disk space for each - commit call. + A lower value impacts performance, including the case where there are no or only a + few inserts/updates because of synchronous locking, and it wastes disk space for + each commit call. _Background:_ - For data retrieval, ArangoSearch follow the concept of + For data retrieval, ArangoSearch follows the concept of "eventually-consistent", i.e. eventually all the data in ArangoDB will be matched by corresponding query expressions. The concept of ArangoSearch "commit" operations is introduced to @@ -565,11 +565,11 @@ paths: For the case where there are a lot of data modification operations, a higher value could potentially have the data store consume more space and file handles. For the case where there are a few data modification operations, a lower value - will impact performance due to no segment candidates available for + impacts performance due to no segment candidates being available for consolidation. _Background:_ - For data modification, ArangoSearch follow the concept of a + For data modification, ArangoSearch follows the concept of a "versioned data store". Thus old versions of data may be removed once there are no longer any users of the old data. The frequency of the cleanup and compaction operations are governed by `consolidationIntervalMsec` and the @@ -583,8 +583,8 @@ paths: _Background:_ With each ArangoDB transaction that inserts documents, one or more ArangoSearch-internal segments get created. 
- Similarly, for removed documents the segments that contain such documents - will have these documents marked as 'deleted'. + Similarly, for removed documents, the segments that contain such documents + have these documents marked as 'deleted'. Over time, this approach causes a lot of small and sparse segments to be created. A "consolidation" operation selects one or more segments and copies all of @@ -611,10 +611,10 @@ paths: (default: 2097152) - `segmentsBytesMax` (number, _optional_): Maximum allowed size of all consolidated segments in bytes (default: 5368709120) - - `segmentsMax` (number, _optional_): The maximum number of segments that will - be evaluated as candidates for consolidation (default: 10) - - `segmentsMin` (number, _optional_): The minimum number of segments that will - be evaluated as candidates for consolidation (default: 1) + - `segmentsMax` (number, _optional_): The maximum number of segments that are + evaluated as candidates for consolidation (default: 10) + - `segmentsMin` (number, _optional_): The minimum number of segments that are + evaluated as candidates for consolidation (default: 1) - `minScore` (number, _optional_): (default: 0) type: object responses: @@ -686,10 +686,10 @@ paths: Wait at least this many commits between removing unused files in the ArangoSearch data directory (default: 2, to disable use: 0). For the case where the consolidation policies merge segments often (i.e. a lot - of commit+consolidate), a lower value will cause a lot of disk space to be + of commit+consolidate), a lower value causes a lot of disk space to be wasted. For the case where the consolidation policies rarely merge segments (i.e. few - inserts/deletes), a higher value will impact performance without any added + inserts/deletes), a higher value impacts performance without any added benefits. _Background:_ @@ -707,9 +707,9 @@ paths: use: 0). 
For the case where there are a lot of inserts/updates, a higher value causes the index not to account for them and memory usage continues to grow until the commit. - For the case where there are a few inserts/updates, a lower value impacts - performance (because of synchronous locking) and wastes disk space for each - commit call. + A lower value impacts performance, including the case where there are no or only a + few inserts/updates because of synchronous locking, and it wastes disk space for + each commit call. _Background:_ For data retrieval, ArangoSearch follows the concept of @@ -731,11 +731,11 @@ paths: For the case where there are a lot of data modification operations, a higher value could potentially have the data store consume more space and file handles. For the case where there are a few data modification operations, a lower value - will impact performance due to no segment candidates available for + impacts performance due to no segment candidates being available for consolidation. _Background:_ - For data modification, ArangoSearch follow the concept of a + For data modification, ArangoSearch follows the concept of a "versioned data store". Thus old versions of data may be removed once there are no longer any users of the old data. The frequency of the cleanup and compaction operations are governed by `consolidationIntervalMsec` and the @@ -749,8 +749,8 @@ paths: _Background:_ With each ArangoDB transaction that inserts documents, one or more ArangoSearch-internal segments get created. - Similarly, for removed documents the segments that contain such documents - will have these documents marked as 'deleted'. + Similarly, for removed documents, the segments that contain such documents + have these documents marked as 'deleted'. Over time, this approach causes a lot of small and sparse segments to be created. 
A "consolidation" operation selects one or more segments and copies all of @@ -777,10 +777,10 @@ paths: (default: 2097152) - `segmentsBytesMax` (number, _optional_): Maximum allowed size of all consolidated segments in bytes (default: 5368709120) - - `segmentsMax` (number, _optional_): The maximum number of segments that will - be evaluated as candidates for consolidation (default: 10) - - `segmentsMin` (number, _optional_): The minimum number of segments that will - be evaluated as candidates for consolidation (default: 1) + - `segmentsMax` (number, _optional_): The maximum number of segments that are + evaluated as candidates for consolidation (default: 10) + - `segmentsMin` (number, _optional_): The minimum number of segments that are + evaluated as candidates for consolidation (default: 1) - `minScore` (number, _optional_): (default: 0) type: object responses: diff --git a/site/content/3.11/index-and-search/arangosearch/arangosearch-views-reference.md b/site/content/3.11/index-and-search/arangosearch/arangosearch-views-reference.md index 0516f4f70f..43bec445b3 100644 --- a/site/content/3.11/index-and-search/arangosearch/arangosearch-views-reference.md +++ b/site/content/3.11/index-and-search/arangosearch/arangosearch-views-reference.md @@ -371,9 +371,9 @@ of removing unused segments after release of internal resources. For the case where there are a lot of inserts/updates, a higher value causes the index not to account for them and memory usage continues to grow until the commit. - For the case where there are a few inserts/updates, a lower value impacts - performance (because of synchronous locking) and wastes disk space for each - commit call. + A lower value impacts performance, including the case where there are no or only a + few inserts/updates because of synchronous locking, and it wastes disk space for + each commit call. > For data retrieval `arangosearch` Views follow the concept of > "eventually-consistent", i.e. 
eventually all the data in ArangoDB is @@ -424,8 +424,8 @@ is used by these writers (in terms of "writers pool") one can use to disable use: `0`; _immutable_) Maximum memory byte size per writer (segment) before a writer (segment) flush is - triggered. `0` value turns off this limit for any writer (buffer) and data will - be flushed periodically. `0` value should be used carefully due to high + triggered. `0` value turns off this limit for any writer (buffer) and data is + flushed periodically. `0` value should be used carefully due to high potential memory consumption. - **consolidationPolicy** (_optional_; type: `object`; default: `{}`) diff --git a/site/content/3.12/develop/foxx-microservices/guides/working-with-files.md b/site/content/3.12/develop/foxx-microservices/guides/working-with-files.md index 85a816bda7..49e488bddb 100644 --- a/site/content/3.12/develop/foxx-microservices/guides/working-with-files.md +++ b/site/content/3.12/develop/foxx-microservices/guides/working-with-files.md @@ -61,11 +61,11 @@ the filesystem from within a service: may therefore cause race conditions and **result in corrupted data**. - Writing to files outside the service folder introduces external state. In - a cluster this will result in Coordinators no longer being interchangeable. + a cluster, this results in Coordinators no longer being interchangeable. - Writing to files during setup is unreliable because the setup script may - be executed several times or not at all. In a cluster the setup script - will only be executed on a single Coordinator. + be executed several times or not at all. In a cluster, the setup script + is only executed on a single Coordinator. Therefore it is almost always a better option to store files using a specialized, external file storage service @@ -77,13 +77,13 @@ ArangoDB documents by using a separate collection. 
{{< danger >}} Due to the way ArangoDB stores documents internally, you should not store file contents alongside other attributes that might be updated independently. -Additionally, large file sizes will impact performance for operations +Additionally, large file sizes impact performance for operations involving the document and may affect overall database performance. {{< /danger >}} {{< warning >}} In production, you should avoid storing any files in ArangoDB or handling file -uploads in Foxx. The following example will work for moderate amounts of small +uploads in Foxx. The following example works for moderate amounts of small files but is not recommended for large files or frequent uploads or modifications. {{< /warning >}} diff --git a/site/content/3.12/develop/http-api/indexes/inverted.md b/site/content/3.12/develop/http-api/indexes/inverted.md index 86f3f383f3..de489c6dfe 100644 --- a/site/content/3.12/develop/http-api/indexes/inverted.md +++ b/site/content/3.12/develop/http-api/indexes/inverted.md @@ -491,10 +491,10 @@ paths: Wait at least this many commits between removing unused files in the ArangoSearch data directory (default: 2, to disable use: 0). For the case where the consolidation policies merge segments often (i.e. a lot - of commit+consolidate), a lower value will cause a lot of disk space to be + of commit+consolidate), a lower value causes a lot of disk space to be wasted. For the case where the consolidation policies rarely merge segments (i.e. few - inserts/deletes), a higher value will impact performance without any added + inserts/deletes), a higher value impacts performance without any added benefits. _Background:_ @@ -512,9 +512,9 @@ paths: use: 0). For the case where there are a lot of inserts/updates, a higher value causes the index not to account for them and memory usage continues to grow until the commit. 
- For the case where there are a few inserts/updates, a lower value impacts - performance (because of synchronous locking) and wastes disk space for each - commit call. + A lower value impacts performance, including the case where there are no or only a + few inserts/updates because of synchronous locking, and it wastes disk space for + each commit call. _Background:_ For data retrieval, ArangoSearch follows the concept of @@ -536,7 +536,7 @@ paths: For the case where there are a lot of data modification operations, a higher value could potentially have the data store consume more space and file handles. For the case where there are a few data modification operations, a lower value - will impact performance due to no segment candidates available for + impacts performance due to no segment candidates being available for consolidation. _Background:_ @@ -554,8 +554,8 @@ paths: _Background:_ With each ArangoDB transaction that inserts documents, one or more ArangoSearch-internal segments get created. - Similarly, for removed documents the segments that contain such documents - will have these documents marked as 'deleted'. + Similarly, for removed documents, the segments that contain such documents + have these documents marked as 'deleted'. Over time, this approach causes a lot of small and sparse segments to be created. A "consolidation" operation selects one or more segments and copies all of @@ -612,7 +612,7 @@ paths: description: | Maximum memory byte size per writer (segment) before a writer (segment) flush is triggered. `0` value turns off this limit for any writer (buffer) and data - will be flushed periodically based on the value defined for the flush thread + is flushed periodically based on the value defined for the flush thread (ArangoDB server startup option). 
`0` value should be used carefully due to high potential memory consumption (default: 33554432, use 0 to disable) diff --git a/site/content/3.12/develop/http-api/views/arangosearch-views.md b/site/content/3.12/develop/http-api/views/arangosearch-views.md index ebd04efd32..6fa73494af 100644 --- a/site/content/3.12/develop/http-api/views/arangosearch-views.md +++ b/site/content/3.12/develop/http-api/views/arangosearch-views.md @@ -183,10 +183,10 @@ paths: Wait at least this many commits between removing unused files in the ArangoSearch data directory (default: 2, to disable use: 0). For the case where the consolidation policies merge segments often (i.e. a lot - of commit+consolidate), a lower value will cause a lot of disk space to be + of commit+consolidate), a lower value causes a lot of disk space to be wasted. For the case where the consolidation policies rarely merge segments (i.e. few - inserts/deletes), a higher value will impact performance without any added + inserts/deletes), a higher value impacts performance without any added benefits. _Background:_ @@ -204,9 +204,9 @@ paths: use: 0). For the case where there are a lot of inserts/updates, a higher value causes the index not to account for them and memory usage continues to grow until the commit. - For the case where there are a few inserts/updates, a lower value impacts - performance (because of synchronous locking) and wastes disk space for each - commit call. + A lower value impacts performance, including the case where there are no or only a + few inserts/updates because of synchronous locking, and it wastes disk space for + each commit call. _Background:_ For data retrieval, ArangoSearch follows the concept of @@ -228,11 +228,11 @@ paths: For the case where there are a lot of data modification operations, a higher value could potentially have the data store consume more space and file handles. 
For the case where there are a few data modification operations, a lower value - will impact performance due to no segment candidates available for + impacts performance due to no segment candidates being available for consolidation. _Background:_ - For data modification, ArangoSearch follow the concept of a + For data modification, ArangoSearch follows the concept of a "versioned data store". Thus old versions of data may be removed once there are no longer any users of the old data. The frequency of the cleanup and compaction operations are governed by `consolidationIntervalMsec` and the @@ -246,8 +246,8 @@ paths: _Background:_ With each ArangoDB transaction that inserts documents, one or more ArangoSearch-internal segments get created. - Similarly, for removed documents the segments that contain such documents - will have these documents marked as 'deleted'. + Similarly, for removed documents, the segments that contain such documents + have these documents marked as 'deleted'. Over time, this approach causes a lot of small and sparse segments to be created. 
A "consolidation" operation selects one or more segments and copies all of @@ -274,10 +274,10 @@ paths: (default: 2097152) - `segmentsBytesMax` (number, _optional_): Maximum allowed size of all consolidated segments in bytes (default: 5368709120) - - `segmentsMax` (number, _optional_): The maximum number of segments that will - be evaluated as candidates for consolidation (default: 10) - - `segmentsMin` (number, _optional_): The minimum number of segments that will - be evaluated as candidates for consolidation (default: 1) + - `segmentsMax` (number, _optional_): The maximum number of segments that are + evaluated as candidates for consolidation (default: 10) + - `segmentsMin` (number, _optional_): The minimum number of segments that are + evaluated as candidates for consolidation (default: 1) - `minScore` (number, _optional_): (default: 0) type: object writebufferIdle: @@ -295,7 +295,7 @@ paths: description: | Maximum memory byte size per writer (segment) before a writer (segment) flush is triggered. `0` value turns off this limit for any writer (buffer) and data - will be flushed periodically based on the value defined for the flush thread + is flushed periodically based on the value defined for the flush thread (ArangoDB server startup option). `0` value should be used carefully due to high potential memory consumption (default: 33554432, use 0 to disable, immutable) @@ -544,10 +544,10 @@ paths: Wait at least this many commits between removing unused files in the ArangoSearch data directory (default: 2, to disable use: 0). For the case where the consolidation policies merge segments often (i.e. a lot - of commit+consolidate), a lower value will cause a lot of disk space to be + of commit+consolidate), a lower value causes a lot of disk space to be wasted. For the case where the consolidation policies rarely merge segments (i.e. 
few - inserts/deletes), a higher value will impact performance without any added + inserts/deletes), a higher value impacts performance without any added benefits. _Background:_ @@ -565,12 +565,12 @@ paths: use: 0). For the case where there are a lot of inserts/updates, a higher value causes the index not to account for them and memory usage continues to grow until the commit. - For the case where there are a few inserts/updates, a lower value impacts - performance (because of synchronous locking) and wastes disk space for each - commit call. + A lower value impacts performance, including the case where there are no or only a + few inserts/updates because of synchronous locking, and it wastes disk space for + each commit call. _Background:_ - For data retrieval, ArangoSearch follow the concept of + For data retrieval, ArangoSearch follows the concept of "eventually-consistent", i.e. eventually all the data in ArangoDB will be matched by corresponding query expressions. The concept of ArangoSearch "commit" operations is introduced to @@ -589,11 +589,11 @@ paths: For the case where there are a lot of data modification operations, a higher value could potentially have the data store consume more space and file handles. For the case where there are a few data modification operations, a lower value - will impact performance due to no segment candidates available for + impacts performance due to no segment candidates being available for consolidation. _Background:_ - For data modification, ArangoSearch follow the concept of a + For data modification, ArangoSearch follows the concept of a "versioned data store". Thus old versions of data may be removed once there are no longer any users of the old data. The frequency of the cleanup and compaction operations are governed by `consolidationIntervalMsec` and the @@ -607,8 +607,8 @@ paths: _Background:_ With each ArangoDB transaction that inserts documents, one or more ArangoSearch-internal segments get created. 
- Similarly, for removed documents the segments that contain such documents - will have these documents marked as 'deleted'. + Similarly, for removed documents, the segments that contain such documents + have these documents marked as 'deleted'. Over time, this approach causes a lot of small and sparse segments to be created. A "consolidation" operation selects one or more segments and copies all of @@ -635,10 +635,10 @@ paths: (default: 2097152) - `segmentsBytesMax` (number, _optional_): Maximum allowed size of all consolidated segments in bytes (default: 5368709120) - - `segmentsMax` (number, _optional_): The maximum number of segments that will - be evaluated as candidates for consolidation (default: 10) - - `segmentsMin` (number, _optional_): The minimum number of segments that will - be evaluated as candidates for consolidation (default: 1) + - `segmentsMax` (number, _optional_): The maximum number of segments that are + evaluated as candidates for consolidation (default: 10) + - `segmentsMin` (number, _optional_): The minimum number of segments that are + evaluated as candidates for consolidation (default: 1) - `minScore` (number, _optional_): (default: 0) type: object responses: @@ -710,10 +710,10 @@ paths: Wait at least this many commits between removing unused files in the ArangoSearch data directory (default: 2, to disable use: 0). For the case where the consolidation policies merge segments often (i.e. a lot - of commit+consolidate), a lower value will cause a lot of disk space to be + of commit+consolidate), a lower value causes a lot of disk space to be wasted. For the case where the consolidation policies rarely merge segments (i.e. few - inserts/deletes), a higher value will impact performance without any added + inserts/deletes), a higher value impacts performance without any added benefits. _Background:_ @@ -731,9 +731,9 @@ paths: use: 0). 
For the case where there are a lot of inserts/updates, a higher value causes the index not to account for them and memory usage continues to grow until the commit. - For the case where there are a few inserts/updates, a lower value impacts - performance (because of synchronous locking) and wastes disk space for each - commit call. + A lower value impacts performance, including the case where there are no or only a + few inserts/updates because of synchronous locking, and it wastes disk space for + each commit call. _Background:_ For data retrieval, ArangoSearch follows the concept of @@ -755,11 +755,11 @@ paths: For the case where there are a lot of data modification operations, a higher value could potentially have the data store consume more space and file handles. For the case where there are a few data modification operations, a lower value - will impact performance due to no segment candidates available for + impacts performance due to no segment candidates being available for consolidation. _Background:_ - For data modification, ArangoSearch follow the concept of a + For data modification, ArangoSearch follows the concept of a "versioned data store". Thus old versions of data may be removed once there are no longer any users of the old data. The frequency of the cleanup and compaction operations are governed by `consolidationIntervalMsec` and the @@ -773,8 +773,8 @@ paths: _Background:_ With each ArangoDB transaction that inserts documents, one or more ArangoSearch-internal segments get created. - Similarly, for removed documents the segments that contain such documents - will have these documents marked as 'deleted'. + Similarly, for removed documents, the segments that contain such documents + have these documents marked as 'deleted'. Over time, this approach causes a lot of small and sparse segments to be created. 
A "consolidation" operation selects one or more segments and copies all of @@ -801,10 +801,10 @@ paths: (default: 2097152) - `segmentsBytesMax` (number, _optional_): Maximum allowed size of all consolidated segments in bytes (default: 5368709120) - - `segmentsMax` (number, _optional_): The maximum number of segments that will - be evaluated as candidates for consolidation (default: 10) - - `segmentsMin` (number, _optional_): The minimum number of segments that will - be evaluated as candidates for consolidation (default: 1) + - `segmentsMax` (number, _optional_): The maximum number of segments that are + evaluated as candidates for consolidation (default: 10) + - `segmentsMin` (number, _optional_): The minimum number of segments that are + evaluated as candidates for consolidation (default: 1) - `minScore` (number, _optional_): (default: 0) type: object responses: diff --git a/site/content/3.12/index-and-search/arangosearch/arangosearch-views-reference.md b/site/content/3.12/index-and-search/arangosearch/arangosearch-views-reference.md index 7f6fd97c0a..518637d8be 100644 --- a/site/content/3.12/index-and-search/arangosearch/arangosearch-views-reference.md +++ b/site/content/3.12/index-and-search/arangosearch/arangosearch-views-reference.md @@ -392,12 +392,11 @@ of removing unused segments after release of internal resources. Wait at least this many milliseconds between committing View data store changes and making documents visible to queries. - For the case where there are a lot of inserts/updates, a higher value, until - commit, causes the index not to account for them and memory usage continues - to grow. - For the case where there are a few inserts/updates, a lower value impacts - performance and wastes disk space for each commit call without any added - benefits. + For the case where there are a lot of inserts/updates, a higher value causes the + index not to account for them and memory usage continues to grow until the commit. 
+ A lower value impacts performance, including the case where there are no or only a + few inserts/updates because of synchronous locking, and it wastes disk space for + each commit call. > For data retrieval `arangosearch` Views follow the concept of > "eventually-consistent", i.e. eventually all the data in ArangoDB is @@ -448,8 +447,8 @@ is used by these writers (in terms of "writers pool") one can use to disable use: `0`; _immutable_) Maximum memory byte size per writer (segment) before a writer (segment) flush is - triggered. `0` value turns off this limit for any writer (buffer) and data will - be flushed periodically. `0` value should be used carefully due to high + triggered. `0` value turns off this limit for any writer (buffer) and data + is flushed periodically. `0` value should be used carefully due to high potential memory consumption. - **consolidationPolicy** (_optional_; type: `object`; default: `{}`)