From 6ca6e6961ae3a657f4c7bb653cda8a30c52077e8 Mon Sep 17 00:00:00 2001 From: Simran Brucherseifer Date: Wed, 18 Mar 2020 16:00:42 +0100 Subject: [PATCH 1/7] Supported languages --- 3.5/arangosearch-analyzers.md | 36 ++++++++++++++++++++-- 3.6/arangosearch-analyzers.md | 35 ++++++++++++++++++++-- 3.7/arangosearch-analyzers.md | 56 +++++++++++++++++++++++++++++++---- 3 files changed, 115 insertions(+), 12 deletions(-) diff --git a/3.5/arangosearch-analyzers.md b/3.5/arangosearch-analyzers.md index 1339779706..f6e8502955 100644 --- a/3.5/arangosearch-analyzers.md +++ b/3.5/arangosearch-analyzers.md @@ -194,9 +194,6 @@ An Analyzer capable of breaking up strings into individual words while also optionally filtering out stop-words, extracting word stems, applying case conversion and accent removal. -Stemming support is provided by -[Snowball](https://snowballstem.org/){:target="_blank"}. - The *properties* allowed for this Analyzer are an object with the following attributes: @@ -281,3 +278,36 @@ Name | Type | Language `text_ru` | `text` | Russian `text_sv` | `text` | Swedish `text_zh` | `text` | Chinese + +Supported Languages +------------------- + +Analyzers rely on [ICU](http://site.icu-project.org/){:target="_blank"} for +language-dependent tokenization and normalization. The ICU data file +`icudtl.dat` that ArangoDB ships with contains information for a lot of +languages, which are technically all supported. + +{% hint 'warning' %} +The alphabetical order of characters is not taken into account by ArangoSearch, +i.e. range queries and sorting of Views will not follow the language rules as +per the defined Analyzer locale or the server startup option +`--default-language`! 
+{% endhint %} + +Stemming support is provided by [Snowball](https://snowballstem.org/){:target="_blank"}, +which supports the following languages: + +Code | Language +------|----------- +`de` | German +`en` | English +`es` | Spanish +`fi` | Finnish +`fr` | French +`it` | Italian +`nl` | Dutch +`no` | Norwegian +`pt` | Portuguese +`ru` | Russian +`sv` | Swedish +`zh` | Chinese diff --git a/3.6/arangosearch-analyzers.md b/3.6/arangosearch-analyzers.md index 277ce18f85..9a8a575fe4 100644 --- a/3.6/arangosearch-analyzers.md +++ b/3.6/arangosearch-analyzers.md @@ -215,9 +215,6 @@ An Analyzer capable of breaking up strings into individual words while also optionally filtering out stop-words, extracting word stems, applying case conversion and accent removal. -Stemming support is provided by -[Snowball](https://snowballstem.org/){:target="_blank"}. - The *properties* allowed for this Analyzer are an object with the following attributes: @@ -367,3 +364,35 @@ Name | Type | Language `text_ru` | `text` | Russian `text_sv` | `text` | Swedish `text_zh` | `text` | Chinese + +Supported Languages +------------------- + +Analyzers rely on [ICU](http://site.icu-project.org/){:target="_blank"} for +language-dependent tokenization and normalization. The ICU data file +`icudtl.dat` that ArangoDB ships with contains information for a lot of +languages, which are technically all supported. + +{% hint 'warning' %} +The alphabetical order of characters is not taken into account by ArangoSearch, +i.e. range queries and sorting of Views will not follow the language rules as +per the defined Analyzer locale or the server startup option +`--default-language`! 
+{% endhint %} + +Stemming support is provided by [Snowball](https://snowballstem.org/){:target="_blank"}, +which supports the following languages: + +Code | Language +------|----------- +`de` | German +`en` | English +`es` | Spanish +`fi` | Finnish +`fr` | French +`it` | Italian +`nl` | Dutch +`no` | Norwegian +`pt` | Portuguese +`ru` | Russian +`sv` | Swedish diff --git a/3.7/arangosearch-analyzers.md b/3.7/arangosearch-analyzers.md index 21afb12ceb..8149fbbd29 100644 --- a/3.7/arangosearch-analyzers.md +++ b/3.7/arangosearch-analyzers.md @@ -135,7 +135,7 @@ attributes: - `locale` (string): a locale in the format `language[_COUNTRY][.encoding][@variant]` (square brackets denote optional parts), e.g. `"de.utf-8"` or `"en_US.utf-8"`. Only UTF-8 encoding is - meaningful in ArangoDB. + meaningful in ArangoDB. Also see [Supported Languages](#supported-languages). ### Norm @@ -148,7 +148,7 @@ attributes: - `locale` (string): a locale in the format `language[_COUNTRY][.encoding][@variant]` (square brackets denote optional parts), e.g. `"de.utf-8"` or `"en_US.utf-8"`. Only UTF-8 encoding is - meaningful in ArangoDB. + meaningful in ArangoDB. Also see [Supported Languages](#supported-languages). - `accent` (boolean, _optional_): - `true` to preserve accented characters (default) - `false` to convert accented characters to their base characters @@ -215,16 +215,13 @@ An Analyzer capable of breaking up strings into individual words while also optionally filtering out stop-words, extracting word stems, applying case conversion and accent removal. -Stemming support is provided by -[Snowball](https://snowballstem.org/){:target="_blank"}. - The *properties* allowed for this Analyzer are an object with the following attributes: - `locale` (string): a locale in the format `language[_COUNTRY][.encoding][@variant]` (square brackets denote optional parts), e.g. `"de.utf-8"` or `"en_US.utf-8"`. Only UTF-8 encoding is - meaningful in ArangoDB. + meaningful in ArangoDB. 
Also see [Supported Languages](#supported-languages). - `accent` (boolean, _optional_): - `true` to preserve accented characters - `false` to convert accented characters to their base characters (default) @@ -367,3 +364,50 @@ Name | Type | Language `text_ru` | `text` | Russian `text_sv` | `text` | Swedish `text_zh` | `text` | Chinese + +Supported Languages +------------------- + +Analyzers rely on [ICU](http://site.icu-project.org/){:target="_blank"} for +language-dependent tokenization and normalization. The ICU data file +`icudtl.dat` that ArangoDB ships with contains information for a lot of +languages, which are technically all supported. + +{% hint 'warning' %} +The alphabetical order of characters is not taken into account by ArangoSearch, +i.e. range queries and sorting of Views will not follow the language rules as +per the defined Analyzer locale or the server startup option +`--default-language`! +{% endhint %} + +Stemming support is provided by [Snowball](https://snowballstem.org/){:target="_blank"}, +which supports the following languages: + +Language | Code +---------- | ---- +Arabic | `ar` +Basque | `eu` +Catalan | `ca` +Danish | `da` +Dutch | `nl` +English | `en` +Finnish | `fi` +French | `fr` +German | `de` +Greek | `el` +Hindi | `hi` +Hungarian | `hu` +Indonesian | `id` +Irish | `ga` +Italian | `it` +Lithuanian | `lt` +Nepali | `ne` +Norwegian | `no` +Portuguese | `pt` +Romanian | `ro` +Russian | `ru` +Serbian | `sr` +Spanish | `es` +Swedish | `sv` +Tamil | `ta` +Turkish | `tr` From 8510ca31439f6aa50d95f0014803f124d7b15997 Mon Sep 17 00:00:00 2001 From: Simran Brucherseifer Date: Wed, 18 Mar 2020 16:04:07 +0100 Subject: [PATCH 2/7] xrefs --- 3.5/arangosearch-analyzers.md | 6 +++--- 3.6/arangosearch-analyzers.md | 6 +++--- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/3.5/arangosearch-analyzers.md b/3.5/arangosearch-analyzers.md index f6e8502955..3af6c76744 100644 --- a/3.5/arangosearch-analyzers.md +++ 
b/3.5/arangosearch-analyzers.md @@ -134,7 +134,7 @@ attributes: - `locale` (string): a locale in the format `language[_COUNTRY][.encoding][@variant]` (square brackets denote optional parts), e.g. `"de.utf-8"` or `"en_US.utf-8"`. Only UTF-8 encoding is - meaningful in ArangoDB. + meaningful in ArangoDB. Also see [Supported Languages](#supported-languages). ### Norm @@ -147,7 +147,7 @@ attributes: - `locale` (string): a locale in the format `language[_COUNTRY][.encoding][@variant]` (square brackets denote optional parts), e.g. `"de.utf-8"` or `"en_US.utf-8"`. Only UTF-8 encoding is - meaningful in ArangoDB. + meaningful in ArangoDB. Also see [Supported Languages](#supported-languages). - `accent` (boolean, _optional_): - `true` to preserve accented characters (default) - `false` to convert accented characters to their base characters @@ -200,7 +200,7 @@ attributes: - `locale` (string): a locale in the format `language[_COUNTRY][.encoding][@variant]` (square brackets denote optional parts), e.g. `"de.utf-8"` or `"en_US.utf-8"`. Only UTF-8 encoding is - meaningful in ArangoDB. + meaningful in ArangoDB. Also see [Supported Languages](#supported-languages). - `accent` (boolean, _optional_): - `true` to preserve accented characters - `false` to convert accented characters to their base characters (default) diff --git a/3.6/arangosearch-analyzers.md b/3.6/arangosearch-analyzers.md index 9a8a575fe4..4f5f322562 100644 --- a/3.6/arangosearch-analyzers.md +++ b/3.6/arangosearch-analyzers.md @@ -135,7 +135,7 @@ attributes: - `locale` (string): a locale in the format `language[_COUNTRY][.encoding][@variant]` (square brackets denote optional parts), e.g. `"de.utf-8"` or `"en_US.utf-8"`. Only UTF-8 encoding is - meaningful in ArangoDB. + meaningful in ArangoDB. Also see [Supported Languages](#supported-languages). 
### Norm @@ -148,7 +148,7 @@ attributes: - `locale` (string): a locale in the format `language[_COUNTRY][.encoding][@variant]` (square brackets denote optional parts), e.g. `"de.utf-8"` or `"en_US.utf-8"`. Only UTF-8 encoding is - meaningful in ArangoDB. + meaningful in ArangoDB. Also see [Supported Languages](#supported-languages). - `accent` (boolean, _optional_): - `true` to preserve accented characters (default) - `false` to convert accented characters to their base characters @@ -221,7 +221,7 @@ attributes: - `locale` (string): a locale in the format `language[_COUNTRY][.encoding][@variant]` (square brackets denote optional parts), e.g. `"de.utf-8"` or `"en_US.utf-8"`. Only UTF-8 encoding is - meaningful in ArangoDB. + meaningful in ArangoDB. Also see [Supported Languages](#supported-languages). - `accent` (boolean, _optional_): - `true` to preserve accented characters - `false` to convert accented characters to their base characters (default) From a8dce603955126e0433e7fbcb4238877d913ae43 Mon Sep 17 00:00:00 2001 From: Simran Brucherseifer Date: Wed, 18 Mar 2020 16:22:25 +0100 Subject: [PATCH 3/7] Release notes --- 3.7/release-notes-new-features37.md | 41 +++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+) diff --git a/3.7/release-notes-new-features37.md b/3.7/release-notes-new-features37.md index b68213888e..44ddb44049 100644 --- a/3.7/release-notes-new-features37.md +++ b/3.7/release-notes-new-features37.md @@ -30,6 +30,47 @@ FOR doc IN viewName See [ArangoSearch functions](aql/functions-arangosearch.html#like) +### Stemming support for more languages + +The Snowball library was updated to the latest version 2, adding stemming +support for the following languages: + +- Arabic (`ar`) +- Basque (`eu`) +- Catalan (`ca`) +- Danish (`da`) +- Greek (`el`) +- Hindi (`hi`) +- Hungarian (`hu`) +- Indonesian (`id`) +- Irish (`ga`) +- Lithuanian (`lt`) +- Nepali (`ne`) +- Romanian (`ro`) +- Serbian (`sr`) +- Tamil (`ta`) +- Turkish (`tr`) + +Create a custom 
Analyzer and set the `locale` accordingly in the properties, +e.g. `"el.utf-8"` for Greek. Arangosh example: + +```js +var analyzers = require("@arangodb/analyzers"); + +analyzers.save("text_el", "text", { + locale: "el.utf-8", + stemming: true, + case: "lower", + accent: false, + stopwords: [] +}, ["frequency", "norm", "position"]); + +db._query(`RETURN TOKENS("αυτοκινητουσ πρωταγωνιστούσαν", "text_el")`) +// [ [ "αυτοκινητ", "πρωταγωνιστ" ] ] +``` + +Also see [Analyzers: Supported Languages](arangosearch-analyzers.html#supported-languages) + SatelliteGraphs --------------- From bb32e986ce7510c7c3bce03e1b0148690d9ee0cb Mon Sep 17 00:00:00 2001 From: Simran Date: Wed, 18 Mar 2020 19:34:53 +0100 Subject: [PATCH 4/7] =?UTF-8?q?Breaking=20change:=20=D1=91=20gets=20active?= =?UTF-8?q?ly=20mapped=20to=20=D0=B5?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- 3.7/release-notes-upgrading-changes37.md | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/3.7/release-notes-upgrading-changes37.md b/3.7/release-notes-upgrading-changes37.md index f392fe79c0..fb0a48b839 100644 --- a/3.7/release-notes-upgrading-changes37.md +++ b/3.7/release-notes-upgrading-changes37.md @@ -25,6 +25,16 @@ will still continue to work in ArangoDB 3.7. However, 3.7 will be the last Arang version supporting the MMFiles storage engines, so users are asked to migrate to the RocksDB storage engine soon. +ArangoSearch +------------ + +The stemming library Snowball was updated, bringing a breaking change to Analyzers: + +There is a 33rd letter in the Russian alphabet, `ё` (`e"`), but it is rarely used +and often replaced by `е` in informal writing. The original algorithm assumed it +had already been mapped to `е` (`e`), but now it actively translates this character +if the locale is set to Russian language. 
+ HTTP REST API ------------- From 013c92c7e65e3b11c576d09330744ac7063eb2c0 Mon Sep 17 00:00:00 2001 From: Simran Brucherseifer Date: Thu, 19 Mar 2020 14:51:10 +0100 Subject: [PATCH 5/7] Add known issue --- 3.5/release-notes-known-issues35.md | 1 + 3.6/release-notes-known-issues35.md | 1 + 3.6/release-notes-known-issues36.md | 1 + 3.7/release-notes-known-issues35.md | 1 + 3.7/release-notes-known-issues36.md | 2 ++ 3.7/release-notes-known-issues37.md | 1 + 6 files changed, 7 insertions(+) diff --git a/3.5/release-notes-known-issues35.md b/3.5/release-notes-known-issues35.md index fb5432c009..51ff510aff 100644 --- a/3.5/release-notes-known-issues35.md +++ b/3.5/release-notes-known-issues35.md @@ -23,6 +23,7 @@ ArangoSearch | **Date Added:** 2018-12-03
**Component:** ArangoSearch
**Deployment Mode:** All
**Description:** Using a loop variable in expressions within a corresponding SEARCH condition is not supported
**Affected Versions:** 3.4.x, 3.5.x
**Fixed in Versions:** -
**Reference:** [arangodb/backlog#318](https://github.com/arangodb/backlog/issues/318){:target="_blank"} (internal) | | **Date Added:** 2019-06-25
**Component:** ArangoSearch
**Deployment Mode:** All
**Description:** The `primarySort` attribute in ArangoSearch View definitions can not be set via the web interface. The option is immutable, but the web interface does not allow to set any View properties upfront (it creates a View with default parameters before the user has a chance to configure it).
**Affected Versions:** 3.5.x
**Fixed in Versions:** -
**Reference:** N/A | | **Date Added:** 2019-11-06
**Component:** ArangoSearch
**Deployment Mode:** Cluster
**Description:** There is a possibility to get into deadlocks during Coordinator execution if a custom Analyzer was created (and is present in the `_analyzers` system collection). It is recommended not to use custom Analyzers in production environments in affected versions.
**Affected Versions:** 3.5.x
**Fixed in Versions:** 3.5.3
**Reference:** [arangodb/backlog#651](https://github.com/arangodb/backlog/issues/651){:target="_blank"} (internal) | +| **Date Added:** 2020-03-19
**Component:** ArangoSearch
**Deployment Mode:** All
**Description:** Comparison operators and functions in `SEARCH` clauses of AQL queries, such as `>`, `>=`, `<`, `<=`, `IN_RANGE()` and `STARTS_WITH()`, take neither the server language (`--default-language`) nor the Analyzer locale into account. The alphabetical order of characters as defined by a language is thus not honored, which can lead to unexpected results in range queries.
**Affected Versions:** 3.5.x
**Fixed in Versions:** -
**Reference:** [arangodb/backlog#679](https://github.com/arangodb/backlog/issues/679){:target="_blank"} (internal) | AQL --- diff --git a/3.6/release-notes-known-issues35.md b/3.6/release-notes-known-issues35.md index fb5432c009..51ff510aff 100644 --- a/3.6/release-notes-known-issues35.md +++ b/3.6/release-notes-known-issues35.md @@ -23,6 +23,7 @@ ArangoSearch | **Date Added:** 2018-12-03
**Component:** ArangoSearch
**Deployment Mode:** All
**Description:** Using a loop variable in expressions within a corresponding SEARCH condition is not supported
**Affected Versions:** 3.4.x, 3.5.x
**Fixed in Versions:** -
**Reference:** [arangodb/backlog#318](https://github.com/arangodb/backlog/issues/318){:target="_blank"} (internal) | | **Date Added:** 2019-06-25
**Component:** ArangoSearch
**Deployment Mode:** All
**Description:** The `primarySort` attribute in ArangoSearch View definitions can not be set via the web interface. The option is immutable, but the web interface does not allow to set any View properties upfront (it creates a View with default parameters before the user has a chance to configure it).
**Affected Versions:** 3.5.x
**Fixed in Versions:** -
**Reference:** N/A | | **Date Added:** 2019-11-06
**Component:** ArangoSearch
**Deployment Mode:** Cluster
**Description:** There is a possibility to get into deadlocks during Coordinator execution if a custom Analyzer was created (and is present in the `_analyzers` system collection). It is recommended not to use custom Analyzers in production environments in affected versions.
**Affected Versions:** 3.5.x
**Fixed in Versions:** 3.5.3
**Reference:** [arangodb/backlog#651](https://github.com/arangodb/backlog/issues/651){:target="_blank"} (internal) | +| **Date Added:** 2020-03-19
**Component:** ArangoSearch
**Deployment Mode:** All
**Description:** Comparison operators and functions in `SEARCH` clauses of AQL queries, such as `>`, `>=`, `<`, `<=`, `IN_RANGE()` and `STARTS_WITH()`, take neither the server language (`--default-language`) nor the Analyzer locale into account. The alphabetical order of characters as defined by a language is thus not honored, which can lead to unexpected results in range queries.
**Affected Versions:** 3.5.x
**Fixed in Versions:** -
**Reference:** [arangodb/backlog#679](https://github.com/arangodb/backlog/issues/679){:target="_blank"} (internal) | AQL --- diff --git a/3.6/release-notes-known-issues36.md b/3.6/release-notes-known-issues36.md index 868a6b9afe..9811456944 100644 --- a/3.6/release-notes-known-issues36.md +++ b/3.6/release-notes-known-issues36.md @@ -21,6 +21,7 @@ ArangoSearch | **Date Added:** 2018-12-03
**Component:** ArangoSearch
**Deployment Mode:** Cluster
**Description:** Score values evaluated by corresponding score functions (BM25/TFIDF) may differ in single-server and cluster with a collection having more than 1 shard
**Affected Versions:** 3.4.x, 3.5.x, 3.6.x
**Fixed in Versions:** -
**Reference:** [arangodb/backlog#508](https://github.com/arangodb/backlog/issues/508){:target="_blank"} (internal) | | **Date Added:** 2018-12-03
**Component:** ArangoSearch
**Deployment Mode:** All
**Description:** Using a loop variable in expressions within a corresponding SEARCH condition is not supported
**Affected Versions:** 3.4.x, 3.5.x, 3.6.x
**Fixed in Versions:** -
**Reference:** [arangodb/backlog#318](https://github.com/arangodb/backlog/issues/318){:target="_blank"} (internal) | | **Date Added:** 2019-06-25
**Component:** ArangoSearch
**Deployment Mode:** All
**Description:** The `primarySort` attribute in ArangoSearch View definitions can not be set via the web interface. The option is immutable, but the web interface does not allow to set any View properties upfront (it creates a View with default parameters before the user has a chance to configure it).
**Affected Versions:** 3.5.x, 3.6.x
**Fixed in Versions:** -
**Reference:** N/A | +| **Date Added:** 2020-03-19
**Component:** ArangoSearch
**Deployment Mode:** All
**Description:** Comparison operators and functions in `SEARCH` clauses of AQL queries, such as `>`, `>=`, `<`, `<=`, `IN_RANGE()` and `STARTS_WITH()`, take neither the server language (`--default-language`) nor the Analyzer locale into account. The alphabetical order of characters as defined by a language is thus not honored, which can lead to unexpected results in range queries.
**Affected Versions:** 3.5.x, 3.6.x
**Fixed in Versions:** -
**Reference:** [arangodb/backlog#679](https://github.com/arangodb/backlog/issues/679){:target="_blank"} (internal) | AQL --- diff --git a/3.7/release-notes-known-issues35.md b/3.7/release-notes-known-issues35.md index fb5432c009..51ff510aff 100644 --- a/3.7/release-notes-known-issues35.md +++ b/3.7/release-notes-known-issues35.md @@ -23,6 +23,7 @@ ArangoSearch | **Date Added:** 2018-12-03
**Component:** ArangoSearch
**Deployment Mode:** All
**Description:** Using a loop variable in expressions within a corresponding SEARCH condition is not supported
**Affected Versions:** 3.4.x, 3.5.x
**Fixed in Versions:** -
**Reference:** [arangodb/backlog#318](https://github.com/arangodb/backlog/issues/318){:target="_blank"} (internal) | | **Date Added:** 2019-06-25
**Component:** ArangoSearch
**Deployment Mode:** All
**Description:** The `primarySort` attribute in ArangoSearch View definitions can not be set via the web interface. The option is immutable, but the web interface does not allow to set any View properties upfront (it creates a View with default parameters before the user has a chance to configure it).
**Affected Versions:** 3.5.x
**Fixed in Versions:** -
**Reference:** N/A | | **Date Added:** 2019-11-06
**Component:** ArangoSearch
**Deployment Mode:** Cluster
**Description:** There is a possibility to get into deadlocks during Coordinator execution if a custom Analyzer was created (and is present in the `_analyzers` system collection). It is recommended not to use custom Analyzers in production environments in affected versions.
**Affected Versions:** 3.5.x
**Fixed in Versions:** 3.5.3
**Reference:** [arangodb/backlog#651](https://github.com/arangodb/backlog/issues/651){:target="_blank"} (internal) | +| **Date Added:** 2020-03-19
**Component:** ArangoSearch
**Deployment Mode:** All
**Description:** Comparison operators and functions in `SEARCH` clauses of AQL queries, such as `>`, `>=`, `<`, `<=`, `IN_RANGE()` and `STARTS_WITH()`, take neither the server language (`--default-language`) nor the Analyzer locale into account. The alphabetical order of characters as defined by a language is thus not honored, which can lead to unexpected results in range queries.
**Affected Versions:** 3.5.x
**Fixed in Versions:** -
**Reference:** [arangodb/backlog#679](https://github.com/arangodb/backlog/issues/679){:target="_blank"} (internal) | AQL --- diff --git a/3.7/release-notes-known-issues36.md b/3.7/release-notes-known-issues36.md index 868a6b9afe..4c777201b1 100644 --- a/3.7/release-notes-known-issues36.md +++ b/3.7/release-notes-known-issues36.md @@ -21,6 +21,8 @@ ArangoSearch | **Date Added:** 2018-12-03
**Component:** ArangoSearch
**Deployment Mode:** Cluster
**Description:** Score values evaluated by corresponding score functions (BM25/TFIDF) may differ in single-server and cluster with a collection having more than 1 shard
**Affected Versions:** 3.4.x, 3.5.x, 3.6.x
**Fixed in Versions:** -
**Reference:** [arangodb/backlog#508](https://github.com/arangodb/backlog/issues/508){:target="_blank"} (internal) | | **Date Added:** 2018-12-03
**Component:** ArangoSearch
**Deployment Mode:** All
**Description:** Using a loop variable in expressions within a corresponding SEARCH condition is not supported
**Affected Versions:** 3.4.x, 3.5.x, 3.6.x
**Fixed in Versions:** -
**Reference:** [arangodb/backlog#318](https://github.com/arangodb/backlog/issues/318){:target="_blank"} (internal) | | **Date Added:** 2019-06-25
**Component:** ArangoSearch
**Deployment Mode:** All
**Description:** The `primarySort` attribute in ArangoSearch View definitions can not be set via the web interface. The option is immutable, but the web interface does not allow to set any View properties upfront (it creates a View with default parameters before the user has a chance to configure it).
**Affected Versions:** 3.5.x, 3.6.x
**Fixed in Versions:** -
**Reference:** N/A | +| **Date Added:** 2020-03-19
**Component:** ArangoSearch
**Deployment Mode:** All
**Description:** Comparison operators and functions in `SEARCH` clauses of AQL queries, such as `>`, `>=`, `<`, `<=`, `IN_RANGE()` and `STARTS_WITH()`, take neither the server language (`--default-language`) nor the Analyzer locale into account. The alphabetical order of characters as defined by a language is thus not honored, which can lead to unexpected results in range queries.
**Affected Versions:** 3.5.x, 3.6.x
**Fixed in Versions:** -
**Reference:** [arangodb/backlog#679](https://github.com/arangodb/backlog/issues/679){:target="_blank"} (internal) | + AQL --- diff --git a/3.7/release-notes-known-issues37.md b/3.7/release-notes-known-issues37.md index 34f96ed889..f19ca73865 100644 --- a/3.7/release-notes-known-issues37.md +++ b/3.7/release-notes-known-issues37.md @@ -21,6 +21,7 @@ ArangoSearch | **Date Added:** 2018-12-03
**Component:** ArangoSearch
**Deployment Mode:** Cluster
**Description:** Score values evaluated by corresponding score functions (BM25/TFIDF) may differ in single-server and cluster with a collection having more than 1 shard
**Affected Versions:** 3.4.x, 3.5.x, 3.6.x, 3.7.x
**Fixed in Versions:** -
**Reference:** [arangodb/backlog#508](https://github.com/arangodb/backlog/issues/508){:target="_blank"} (internal) | | **Date Added:** 2018-12-03
**Component:** ArangoSearch
**Deployment Mode:** All
**Description:** Using a loop variable in expressions within a corresponding SEARCH condition is not supported
**Affected Versions:** 3.4.x, 3.5.x, 3.6.x, 3.7.x
**Fixed in Versions:** -
**Reference:** [arangodb/backlog#318](https://github.com/arangodb/backlog/issues/318){:target="_blank"} (internal) | | **Date Added:** 2019-06-25
**Component:** ArangoSearch
**Deployment Mode:** All
**Description:** The `primarySort` attribute in ArangoSearch View definitions can not be set via the web interface. The option is immutable, but the web interface does not allow to set any View properties upfront (it creates a View with default parameters before the user has a chance to configure it).
**Affected Versions:** 3.5.x, 3.6.x, 3.7.x
**Fixed in Versions:** -
**Reference:** N/A | +| **Date Added:** 2020-03-19
**Component:** ArangoSearch
**Deployment Mode:** All
**Description:** Comparison operators and functions in `SEARCH` clauses of AQL queries, such as `>`, `>=`, `<`, `<=`, `IN_RANGE()` and `STARTS_WITH()`, take neither the server language (`--default-language`) nor the Analyzer locale into account. The alphabetical order of characters as defined by a language is thus not honored, which can lead to unexpected results in range queries.
**Affected Versions:** 3.5.x, 3.6.x, 3.7.x
**Fixed in Versions:** -
**Reference:** [arangodb/backlog#679](https://github.com/arangodb/backlog/issues/679){:target="_blank"} (internal) | AQL --- From 08f47c09676e035120ebc2658daa245a54e70715 Mon Sep 17 00:00:00 2001 From: Simran Brucherseifer Date: Thu, 19 Mar 2020 14:51:34 +0100 Subject: [PATCH 6/7] Add remark about known issue --- 3.5/aql/functions-arangosearch.md | 16 ++++++++++++++++ 3.5/aql/operations-search.md | 8 ++++++++ 3.5/arangosearch-analyzers.md | 7 ++++--- 3.6/aql/functions-arangosearch.md | 16 ++++++++++++++++ 3.6/aql/operations-search.md | 8 ++++++++ 3.6/arangosearch-analyzers.md | 7 ++++--- 3.7/aql/functions-arangosearch.md | 16 ++++++++++++++++ 3.7/aql/operations-search.md | 8 ++++++++ 3.7/arangosearch-analyzers.md | 7 ++++--- 9 files changed, 84 insertions(+), 9 deletions(-) diff --git a/3.5/aql/functions-arangosearch.md b/3.5/aql/functions-arangosearch.md index 391d75ffb0..c8439ed38d 100644 --- a/3.5/aql/functions-arangosearch.md +++ b/3.5/aql/functions-arangosearch.md @@ -260,6 +260,14 @@ Match documents where the attribute at **path** is greater than (or equal to) *low* and *high* can be numbers or strings (technically also `null`, `true` and `false`), but the data type must be the same for both. +{% hint 'warning' %} +The alphabetical order of characters is not taken into account by ArangoSearch, +i.e. range queries in SEARCH operations against Views will not follow the +language rules as per the defined Analyzer locale nor the server language +(startup option `--default-language`)! +Also see [Known Issues](release-notes-known-issues35.html#arangosearch). +{% endhint %} + - **path** (attribute path expression): the path of the attribute to test in the document - **low** (number\|string): minimum value of the desired range @@ -406,6 +414,14 @@ is processed by a tokenizing Analyzer (type `"text"` or `"delimiter"`) or if it is an array, then a single token/element starting with the prefix is sufficient to match the document. 
+{% hint 'warning' %} +The alphabetical order of characters is not taken into account by ArangoSearch, +i.e. range queries in SEARCH operations against Views will not follow the +language rules as per the defined Analyzer locale nor the server language +(startup option `--default-language`)! +Also see [Known Issues](release-notes-known-issues35.html#arangosearch). +{% endhint %} + - **path** (attribute path expression): the path of the attribute to compare against in the document - **prefix** (string): a string to search at the start of the text diff --git a/3.5/aql/operations-search.md b/3.5/aql/operations-search.md index 364d009aa7..53b8e9e55e 100644 --- a/3.5/aql/operations-search.md +++ b/3.5/aql/operations-search.md @@ -64,6 +64,14 @@ are supported: - `!=` - `IN` (array or range), also `NOT IN` +{% hint 'warning' %} +The alphabetical order of characters is not taken into account by ArangoSearch, +i.e. range queries in SEARCH operations against Views will not follow the +language rules as per the defined Analyzer locale nor the server language +(startup option `--default-language`)! +Also see [Known Issues](release-notes-known-issues35.html#arangosearch). +{% endhint %} + ```js FOR doc IN viewName SEARCH ANALYZER(doc.text == "quick" OR doc.text == "brown", "text_en") diff --git a/3.5/arangosearch-analyzers.md b/3.5/arangosearch-analyzers.md index 3af6c76744..d6f5c97559 100644 --- a/3.5/arangosearch-analyzers.md +++ b/3.5/arangosearch-analyzers.md @@ -289,9 +289,10 @@ languages, which are technically all supported. {% hint 'warning' %} The alphabetical order of characters is not taken into account by ArangoSearch, -i.e. range queries and sorting of Views will not follow the language rules as -per the defined Analyzer locale or the server startup option -`--default-language`! +i.e. range queries in SEARCH operations against Views will not follow the +language rules as per the defined Analyzer locale nor the server language +(startup option `--default-language`)! 
+Also see [Known Issues](release-notes-known-issues35.html#arangosearch). {% endhint %} Stemming support is provided by [Snowball](https://snowballstem.org/){:target="_blank"}, diff --git a/3.6/aql/functions-arangosearch.md b/3.6/aql/functions-arangosearch.md index 119c10847a..c88e709529 100644 --- a/3.6/aql/functions-arangosearch.md +++ b/3.6/aql/functions-arangosearch.md @@ -260,6 +260,14 @@ Match documents where the attribute at **path** is greater than (or equal to) *low* and *high* can be numbers or strings (technically also `null`, `true` and `false`), but the data type must be the same for both. +{% hint 'warning' %} +The alphabetical order of characters is not taken into account by ArangoSearch, +i.e. range queries in SEARCH operations against Views will not follow the +language rules as per the defined Analyzer locale nor the server language +(startup option `--default-language`)! +Also see [Known Issues](release-notes-known-issues36.html#arangosearch). +{% endhint %} + - **path** (attribute path expression): the path of the attribute to test in the document - **low** (number\|string): minimum value of the desired range @@ -438,6 +446,14 @@ is processed by a tokenizing Analyzer (type `"text"` or `"delimiter"`) or if it is an array, then a single token/element starting with the prefix is sufficient to match the document. +{% hint 'warning' %} +The alphabetical order of characters is not taken into account by ArangoSearch, +i.e. range queries in SEARCH operations against Views will not follow the +language rules as per the defined Analyzer locale nor the server language +(startup option `--default-language`)! +Also see [Known Issues](release-notes-known-issues36.html#arangosearch). 
+{% endhint %}
+
 - **path** (attribute path expression): the path of the attribute to compare
   against in the document
 - **prefix** (string): a string to search at the start of the text
diff --git a/3.6/aql/operations-search.md b/3.6/aql/operations-search.md
index 57f182b12b..51ee020fcd 100644
--- a/3.6/aql/operations-search.md
+++ b/3.6/aql/operations-search.md
@@ -64,6 +64,14 @@ are supported:
 - `!=`
 - `IN` (array or range), also `NOT IN`
 
+{% hint 'warning' %}
+The alphabetical order of characters is not taken into account by ArangoSearch,
+i.e. range queries in SEARCH operations against Views will not follow the
+language rules as per the defined Analyzer locale nor the server language
+(startup option `--default-language`)!
+Also see [Known Issues](release-notes-known-issues35.html#arangosearch).
+{% endhint %}
+
 ```js
 FOR doc IN viewName
   SEARCH ANALYZER(doc.text == "quick" OR doc.text == "brown", "text_en")
diff --git a/3.6/arangosearch-analyzers.md b/3.6/arangosearch-analyzers.md
index 4f5f322562..9bdbcf9a71 100644
--- a/3.6/arangosearch-analyzers.md
+++ b/3.6/arangosearch-analyzers.md
@@ -375,9 +375,10 @@ languages, which are technically all supported.
 
 {% hint 'warning' %}
 The alphabetical order of characters is not taken into account by ArangoSearch,
-i.e. range queries and sorting of Views will not follow the language rules as
-per the defined Analyzer locale or the server startup option
-`--default-language`!
+i.e. range queries in SEARCH operations against Views will not follow the
+language rules as per the defined Analyzer locale nor the server language
+(startup option `--default-language`)!
+Also see [Known Issues](release-notes-known-issues36.html#arangosearch).
 {% endhint %}
 
 Stemming support is provided by [Snowball](https://snowballstem.org/){:target="_blank"},
diff --git a/3.7/aql/functions-arangosearch.md b/3.7/aql/functions-arangosearch.md
index 805d4f2773..b2a0c72528 100644
--- a/3.7/aql/functions-arangosearch.md
+++ b/3.7/aql/functions-arangosearch.md
@@ -260,6 +260,14 @@ Match documents where the attribute at **path** is greater than (or equal to)
 *low* and *high* can be numbers or strings (technically also `null`, `true`
 and `false`), but the data type must be the same for both.
 
+{% hint 'warning' %}
+The alphabetical order of characters is not taken into account by ArangoSearch,
+i.e. range queries in SEARCH operations against Views will not follow the
+language rules as per the defined Analyzer locale nor the server language
+(startup option `--default-language`)!
+Also see [Known Issues](release-notes-known-issues35.html#arangosearch).
+{% endhint %}
+
 - **path** (attribute path expression):
   the path of the attribute to test in the document
 - **low** (number\|string): minimum value of the desired range
@@ -438,6 +446,14 @@ is processed by a tokenizing Analyzer (type `"text"` or `"delimiter"`) or if it
 is an array, then a single token/element starting with the prefix is sufficient
 to match the document.
 
+{% hint 'warning' %}
+The alphabetical order of characters is not taken into account by ArangoSearch,
+i.e. range queries in SEARCH operations against Views will not follow the
+language rules as per the defined Analyzer locale nor the server language
+(startup option `--default-language`)!
+Also see [Known Issues](release-notes-known-issues35.html#arangosearch).
+{% endhint %}
+
 - **path** (attribute path expression): the path of the attribute to compare
   against in the document
 - **prefix** (string): a string to search at the start of the text
diff --git a/3.7/aql/operations-search.md b/3.7/aql/operations-search.md
index 5263927c1c..8b01813076 100644
--- a/3.7/aql/operations-search.md
+++ b/3.7/aql/operations-search.md
@@ -65,6 +65,14 @@ are supported:
 - `IN` (array or range), also `NOT IN`
 - `LIKE` (introduced in v3.7.0), also `NOT LIKE`
 
+{% hint 'warning' %}
+The alphabetical order of characters is not taken into account by ArangoSearch,
+i.e. range queries in SEARCH operations against Views will not follow the
+language rules as per the defined Analyzer locale nor the server language
+(startup option `--default-language`)!
+Also see [Known Issues](release-notes-known-issues37.html#arangosearch).
+{% endhint %}
+
 ```js
 FOR doc IN viewName
   SEARCH ANALYZER(doc.text == "quick" OR doc.text == "brown", "text_en")
diff --git a/3.7/arangosearch-analyzers.md b/3.7/arangosearch-analyzers.md
index 8149fbbd29..09492b86ca 100644
--- a/3.7/arangosearch-analyzers.md
+++ b/3.7/arangosearch-analyzers.md
@@ -375,9 +375,10 @@ languages, which are technically all supported.
 
 {% hint 'warning' %}
 The alphabetical order of characters is not taken into account by ArangoSearch,
-i.e. range queries and sorting of Views will not follow the language rules as
-per the defined Analyzer locale or the server startup option
-`--default-language`!
+i.e. range queries in SEARCH operations against Views will not follow the
+language rules as per the defined Analyzer locale nor the server language
+(startup option `--default-language`)!
+Also see [Known Issues](release-notes-known-issues37.html#arangosearch).
 {% endhint %}
 
 Stemming support is provided by [Snowball](https://snowballstem.org/){:target="_blank"},

From 124377de4db9de70378f261728cb16fdb7424c8c Mon Sep 17 00:00:00 2001
From: Simran Brucherseifer
Date: Thu, 19 Mar 2020 15:31:15 +0100
Subject: [PATCH 7/7] Fix links

---
 3.5/aql/functions-arangosearch.md | 4 ++--
 3.5/aql/operations-search.md | 2 +-
 3.6/aql/functions-arangosearch.md | 4 ++--
 3.6/aql/operations-search.md | 2 +-
 3.7/aql/functions-arangosearch.md | 4 ++--
 3.7/aql/operations-search.md | 2 +-
 6 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/3.5/aql/functions-arangosearch.md b/3.5/aql/functions-arangosearch.md
index c8439ed38d..5e64b8639a 100644
--- a/3.5/aql/functions-arangosearch.md
+++ b/3.5/aql/functions-arangosearch.md
@@ -265,7 +265,7 @@ The alphabetical order of characters is not taken into account by ArangoSearch,
 i.e. range queries in SEARCH operations against Views will not follow the
 language rules as per the defined Analyzer locale nor the server language
 (startup option `--default-language`)!
-Also see [Known Issues](release-notes-known-issues35.html#arangosearch).
+Also see [Known Issues](../release-notes-known-issues35.html#arangosearch).
 {% endhint %}
 
 - **path** (attribute path expression):
@@ -419,7 +419,7 @@ The alphabetical order of characters is not taken into account by ArangoSearch,
 i.e. range queries in SEARCH operations against Views will not follow the
 language rules as per the defined Analyzer locale nor the server language
 (startup option `--default-language`)!
-Also see [Known Issues](release-notes-known-issues35.html#arangosearch).
+Also see [Known Issues](../release-notes-known-issues35.html#arangosearch).
 {% endhint %}
 
 - **path** (attribute path expression): the path of the attribute to compare
diff --git a/3.5/aql/operations-search.md b/3.5/aql/operations-search.md
index 53b8e9e55e..9e8c147414 100644
--- a/3.5/aql/operations-search.md
+++ b/3.5/aql/operations-search.md
@@ -69,7 +69,7 @@ The alphabetical order of characters is not taken into account by ArangoSearch,
 i.e. range queries in SEARCH operations against Views will not follow the
 language rules as per the defined Analyzer locale nor the server language
 (startup option `--default-language`)!
-Also see [Known Issues](release-notes-known-issues35.html#arangosearch).
+Also see [Known Issues](../release-notes-known-issues35.html#arangosearch).
 {% endhint %}
 
 ```js
diff --git a/3.6/aql/functions-arangosearch.md b/3.6/aql/functions-arangosearch.md
index c88e709529..f79b479713 100644
--- a/3.6/aql/functions-arangosearch.md
+++ b/3.6/aql/functions-arangosearch.md
@@ -265,7 +265,7 @@ The alphabetical order of characters is not taken into account by ArangoSearch,
 i.e. range queries in SEARCH operations against Views will not follow the
 language rules as per the defined Analyzer locale nor the server language
 (startup option `--default-language`)!
-Also see [Known Issues](release-notes-known-issues35.html#arangosearch).
+Also see [Known Issues](../release-notes-known-issues35.html#arangosearch).
 {% endhint %}
 
 - **path** (attribute path expression):
@@ -451,7 +451,7 @@ The alphabetical order of characters is not taken into account by ArangoSearch,
 i.e. range queries in SEARCH operations against Views will not follow the
 language rules as per the defined Analyzer locale nor the server language
 (startup option `--default-language`)!
-Also see [Known Issues](release-notes-known-issues35.html#arangosearch).
+Also see [Known Issues](../release-notes-known-issues35.html#arangosearch).
 {% endhint %}
 
 - **path** (attribute path expression): the path of the attribute to compare
diff --git a/3.6/aql/operations-search.md b/3.6/aql/operations-search.md
index 51ee020fcd..8c03118d02 100644
--- a/3.6/aql/operations-search.md
+++ b/3.6/aql/operations-search.md
@@ -69,7 +69,7 @@ The alphabetical order of characters is not taken into account by ArangoSearch,
 i.e. range queries in SEARCH operations against Views will not follow the
 language rules as per the defined Analyzer locale nor the server language
 (startup option `--default-language`)!
-Also see [Known Issues](release-notes-known-issues35.html#arangosearch).
+Also see [Known Issues](../release-notes-known-issues35.html#arangosearch).
 {% endhint %}
 
 ```js
diff --git a/3.7/aql/functions-arangosearch.md b/3.7/aql/functions-arangosearch.md
index b2a0c72528..8cc1182e46 100644
--- a/3.7/aql/functions-arangosearch.md
+++ b/3.7/aql/functions-arangosearch.md
@@ -265,7 +265,7 @@ The alphabetical order of characters is not taken into account by ArangoSearch,
 i.e. range queries in SEARCH operations against Views will not follow the
 language rules as per the defined Analyzer locale nor the server language
 (startup option `--default-language`)!
-Also see [Known Issues](release-notes-known-issues35.html#arangosearch).
+Also see [Known Issues](../release-notes-known-issues35.html#arangosearch).
 {% endhint %}
 
 - **path** (attribute path expression):
@@ -451,7 +451,7 @@ The alphabetical order of characters is not taken into account by ArangoSearch,
 i.e. range queries in SEARCH operations against Views will not follow the
 language rules as per the defined Analyzer locale nor the server language
 (startup option `--default-language`)!
-Also see [Known Issues](release-notes-known-issues35.html#arangosearch).
+Also see [Known Issues](../release-notes-known-issues35.html#arangosearch).
 {% endhint %}
 
 - **path** (attribute path expression): the path of the attribute to compare
diff --git a/3.7/aql/operations-search.md b/3.7/aql/operations-search.md
index 8b01813076..7a7b6dd0f9 100644
--- a/3.7/aql/operations-search.md
+++ b/3.7/aql/operations-search.md
@@ -70,7 +70,7 @@ The alphabetical order of characters is not taken into account by ArangoSearch,
 i.e. range queries in SEARCH operations against Views will not follow the
 language rules as per the defined Analyzer locale nor the server language
 (startup option `--default-language`)!
-Also see [Known Issues](release-notes-known-issues37.html#arangosearch).
+Also see [Known Issues](../release-notes-known-issues37.html#arangosearch).
 {% endhint %}
 
 ```js