Zoekt indexer trigram and file size limits #1436
https://ampcode.com/threads/T-0390a39a-9c04-441e-8982-7e2ef7b9bf76 Co-authored-by: Amp <amp@ampcode.com>
docs/admin/search.mdx
Outdated
To view which files are skipped during indexing, visit the repository settings page and click on **Indexing**.
To force the indexer to include specific files (like `yarn.lock` or other large text files) that are otherwise skipped, add their file path or a glob pattern to the [`search.largeFiles`](/admin/config/site_config#search-largeFiles) setting in your site configuration and reindex the repository. Note that files must still be valid UTF-8 to be indexed, even if added to `search.largeFiles`.
This is a dead markdown path: https://github.com/sourcegraph/docs/blob/amp/zoekt-indexer-trigram-and-file-size-limits/admin/config/site_config#search-largeFiles
Maybe we want to point here or docs/admin/config/site_config.mdx
Fixed
Updated link in search documentation for large files setting.
Updated the documentation to include a link for the search.largeFiles setting.
s3nu left a comment
LGTM
The documentation was updated in `search.mdx` to specify that the Zoekt indexer skips files exceeding 20,000 unique trigrams or those that are not valid UTF-8. Instructions were added detailing how to override these limits by configuring the `search.largeFiles` setting and reindexing the repository.
Thread: https://ampcode.com/threads/T-0390a39a-9c04-441e-8982-7e2ef7b9bf76
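
For readers who want a feel for the two skip conditions in the summary, here is a minimal Go sketch. It is not Zoekt's actual implementation: the names `maxUniqueTrigrams`, `countUniqueTrigrams`, and `skipReason` are hypothetical, a trigram is assumed to be three consecutive Unicode code points, and the `search.largeFiles` override is not modeled. Only the 20,000 limit and the UTF-8 requirement come from this PR.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// maxUniqueTrigrams mirrors the 20,000 limit quoted in the PR summary.
// The constant name is hypothetical, not taken from Zoekt.
const maxUniqueTrigrams = 20000

// countUniqueTrigrams returns the number of distinct runs of three
// consecutive code points in content (an assumed trigram definition).
func countUniqueTrigrams(content []byte) int {
	runes := []rune(string(content))
	seen := make(map[[3]rune]struct{})
	for i := 0; i+3 <= len(runes); i++ {
		seen[[3]rune{runes[i], runes[i+1], runes[i+2]}] = struct{}{}
	}
	return len(seen)
}

// skipReason returns a non-empty reason when this illustrative check
// would skip the file, and an empty string when it would index it.
func skipReason(content []byte) string {
	if !utf8.Valid(content) {
		return "not valid UTF-8"
	}
	if countUniqueTrigrams(content) > maxUniqueTrigrams {
		return "too many unique trigrams"
	}
	return ""
}

func main() {
	fmt.Printf("%q\n", skipReason([]byte("hello world")))
	fmt.Printf("%q\n", skipReason([]byte{0xff, 0xfe}))
}
```

The UTF-8 check runs first in this sketch because, per the docs change above, invalid UTF-8 is skipped even when the path is listed in `search.largeFiles`, whereas the trigram limit is the condition that setting is described as overriding.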