[docs-only] Update search README.md

References: #7553 (enhancement: improve content extraction stop word cleaning). Clarifies the term `stop word` and the use of the environment variable.
owncloud · Nov 3, 2023 · ed17f31
1 parent 0df009e
commit ed17f31
Showing 1 changed file with 1 addition and 2 deletions.
diff --git a/services/search/README.md b/services/search/README.md
@@ -70,8 +70,7 @@ When the search service can reach Tika, it begins to read out the content on dem
 
 Content extraction and handling the extracted content can be very resource intensive. Content extraction is therefore limited to files with a certain file size. The default limit is 20MB and can be configured using the `SEARCH_CONTENT_EXTRACTION_SIZE_LIMIT` variable.
 
-When extracting the content you can specify whether filler words are ignored or not.
-To keep them, the environment variable `SEARCH_EXTRACTOR_TIKA_CLEAN_STOP_WORDS` must be set to false.
+When extracting content, you can specify whether [stop words](https://en.wikipedia.org/wiki/Stop_word) like `I`, `you`, `the` are ignored or not. Normally, these stop words are removed automatically. To keep them, the environment variable `SEARCH_EXTRACTOR_TIKA_CLEAN_STOP_WORDS` must be set to `false`.
 
 When using the Tika container and docker-compose, consider the following: