-
Notifications
You must be signed in to change notification settings - Fork 113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
This is causing most of our persistent indexing errors, and fixing it here is a better solution than explicitly blocking the strange repositories. #136
Conversation
Change-Id: Ic36ab798aaa097414b1c5b3a67a19893664a314a
Change-Id: I99766314e1f551a7ae900dd5ffe578ac1d694c4d
Change-Id: I060ef3c4f55fe91b4d45979685475749027765a4
It is only used by the iterators. So we can instead use VisitMatchTree to update the stats. Change-Id: I1be5e3c57fc8ea5e3baa20de6f7747c161423365
Change-Id: I34a3601f5686bf3cb79e76fe1e06f86b4b7d0329
Change-Id: I41503e95a41512f8445eb8109069f37b7cea0314
Change-Id: I58af12bc32735c52cdd2ef080d1fada328caba48
Change-Id: I89f211e77af9b9893e3e016f5cf704ddab2115ec
Change-Id: Iba7868c1935477d88f9dd285ef010fe0ebe37617
This reverts commit dacbbfd.
This reverts commit 23a3bdb.
This reverts commit 4381e0a.
This reverts commit d8c7986.
This reverts commit bf3e7a5.
…ewMatchTree" This reverts commit bfcde41.
This reverts commit 3d4a710.
…xData" This reverts commit 2d60dcf.
This reverts commit 3011f8c.
This reverts commit b50b7f5.
Change-Id: Ibe3634e9a541e068cc0c5e90bfa37cd7e1a9e894
Change-Id: I70f6c53edc355ba2cc0b02ffda34567ecce8dbe7
This is to prevent log spam due to it failing at startup. Now it will only log after 4 attemps with exp backoff ~= 15s. Change-Id: Id93c56a24ecada87f3d30d66e0c460450a1e9068
* fetch options from server and pass them through to index command Change-Id: I49c69efe18a21ed6efe7936300f0ce91af7aae4b * non pointer Change-Id: Idcfc1d17d9c086f6fe6d31a320e520c4bfc5e4d2 * add test for get options func Change-Id: I4df5ffc9aac538989cd08443a258a8daf393ec52 * correct argument building and add it to test Change-Id: I26985085c091122da47529e02ed78942e8d1ea5f
Change-Id: Idea480bef9f74884647e2b2785aea0754522115a
Change-Id: I70e51197558e760432e820743284696a4c469132
Save the options that affect the index that is created in order to know if we should reindex based on option changes. Change-Id: I676503e2c32934edae7c594a8ca9ac1b0f72e226
Change-Id: If2b44ab7f822be5c97dd0460023cad3a2a573b0e
Change-Id: I3e15b31d6fcdc06279d95f33b03a77ebfeffa591
We did not call `tr.Finish` in `shardedSeacher.Search` which caused the associated spans to dangle in Jaeger UI.
We add a simply buffering logic to the stream server: events containing no file matches but only stats are aggregated and sent out with the first event containing file matches. This approach should reduce the number of events sent, but still give us the per-shard granularity that we want.
For `traceAwareSearcher`, both `Search` and `StreamingSearch` had the same operation name which was confusing to read in the traces.
We want to enforce this in our next release of Sourcegraph to ensure all indexes have the Sourcegraph repository ID set. In the last release we would only have the repoid set if we re-indexed. This would only occur if the repository was updated.
If we never dial a client and we try close it, it will lead to a nil panic. Change-Id: Iea0fbd5add7cda8c8ef922838f190c5dd0a6b40f
This changes how we handle context errors returned from the server: Instead of translating context errors, which were serialized as string, back to context errors, we treat them the same way as any other error. To not confuse `fmt.Errorf("context canceled")` with `context.Canceled` during debugging, we add the prefix "error received from zoekt".
This will avoid filling disks when a repo persistently fails to index midway for some reason-- OOMs, crashes, or otherwise.
This commit introduces a scheduler abstraction for searches to use. Every search now starts in the "interactive" queue. But after 5s will move to the "batch" queue. The idea here is to have a simple approach of allowing long running queries to run without impacting interactive queries. We use WeightedSemaphore as the "queue" (internally has a FIFO queue). Before this commit we just had one WeightedSemaphore with size #CPUs. In this commit that semaphore becomes the "interactive" queue. The "batch" queue is a new semaphore with size 1/4 #CPUs. This doesn't limit the amount of CPU a search can consume, but rather we use number of CPUs as a guide for the number of concurrent searches that can be running. This commit changes a few things, so view the PR for it so you can see it commit by commit.
Change-Id: I36b9c4bd6f13013806d3e51dafd6289f1527fe60
If master is not your default branch, but you are indexing it then we would not index the default branch. This happened because in the git repo with synthesize we run `update-ref HEAD foo` then `update-ref master bar`. Because `HEAD` pointed to `refs/heads/master` we would have a final value of `bar` for HEAD. This commit changes the default branch to a random string which shouldn't be indexed. Now `HEAD` will point to the random string, preventing this issue.
This adds the option `--debug-shard` to zoekt-sourcegraph-indexserver to print a csv of trigrams and the length of their postings list for a particular shard.
This will allow docker to self report its version. Change-Id: I4afd0ce6b07a5dd9985c65c546431db15a224959
We want to start collecting continuous profiles of zoekt serving user requests. This adds integration for google cloud profiler. We already use this for frontend and gitserver. This is opt-in, so we will add the GOOGLE_CLOUD_PROFILER_ENABLED environment variable to the sourcegraph.com deployment. Change-Id: Ia005213d27d47bb8ae7a520b8dc530c1c251168b
…t by them zoekt-sourcegraph-reposerver receives floating-point repo priorities as part of the sync loop running getIndexOptions, and persists them to priority.json in the index dir. zoekt-webserver initializes shardedSearcher which loads priority.json and watches for changes, and sorts by priority first and then by name.
In upstream zoekt we always call searchOneShard, even when the ctx is canceled. This ensures we have correct statistics calculated. Previously when we returned early, we would not count the remaining shard in the ShardsSkipped statistic. Co-authored-by: Stefan Hengl <stefan@sourcegraph.com> Change-Id: I7cb44065cfce5363722dba5d32b39fe67de78599
Our heap profiles are dominated by readIndexData. By moving some of the work into its own functions, the heap profiles will tell us where in particular data is allocated from. We suspect most data is from readNgrams, but want to confirm emperically. Co-authored-by: Stefan Hengl <stefan@sourcegraph.com>
We know the number of entries we will add to the ngram maps. By specifying the go runtime can avoid over allocating as it grows. Locally on a small corpus this reduced the map byte size by 15%. Co-authored-by: Stefan Hengl <stefan@sourcegraph.com> Change-Id: I060e7aa12db726b093e7971ec6bc54ce193ec405
We are logging when we update or remove a shard to disk. This is a tab separated values (TSV) file. We will be using this data as input to a simulator for shard merging. Additionally this data can be used as input to interesting graphs. We use two different files to avoid concurrent processes writing to the same file. Co-authored-by: Stefan Hengl <stefan@sourcegraph.com> Change-Id: I9c907bc687612db616fb1f896930362608b198c3
Change-Id: I6e96aecb96ed4b8e0b3cfb5f9ce38bfe8794515e
Same change we did for zoekt-webserver in f669d63. Change-Id: I91715ac16226a221f3af95b85b84b8a942806041
There's no reason to assume that ctags can complete within 5s for large repositories.
This is causing most of our persistent indexing errors, and fixing it here is a better solution than explicitly blocking the strange repositories.
Sorry, I opened this against the wrong repository. |
All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the ℹ️ Googlers: Go here for more info. |
No description provided.