This repository has been archived by the owner on Jul 11, 2022. It is now read-only.

Incremental indexing #8

Closed
cloudspeech opened this issue Oct 29, 2015 · 3 comments

Comments

@cloudspeech

Great to discover today that there's an actively maintained fork of codesearch!

I am already using codesearch, and I've noticed that reindexing is slow when there are lots of files.

It would be great if one could tell the indexer to (re-)index a few files only and merge that efficiently into the existing index. Cursory inspection of the code tells me this should be doable.

A strong plus would be to read the file names - one per line - from a named pipe or a regular file, and to index each one as soon as a new line becomes available.

Maybe an option such as --reindex-using <pipeOrFile>?

@junkblocker
Owner

Hi @cloudspeech, this in its entirety, including filesystem / FIFO notification-driven indexing, has been on my wishlist forever, but I currently don't have the free time it would take to implement. I saw your request for this upstream and was hoping somebody would find the time.

@abingham

abingham commented Jan 12, 2017

Incremental updating is the feature I miss most in codesearch, and I'd love to see some support for it. I've looked over the code and - taking into account that I know almost nothing about Go - it looks like there might be a relatively simple way to support incremental indexing.

First, I noticed that cindex will happily take a path to a file as an argument. It will index that file and merge the results into the existing index. The only downside I can see is that it also adds the file to the list of "indexed paths" stored in the index file. This isn't terrible in itself, but it's redundant if you're just trying to incrementally index something that's already covered by another indexed path.

However, it appears that the two operations - indexing a file and adding it to the list of indexed paths - are effectively independent. That is, we could index a single file without adding its path to the list of indexed paths.

If I'm right, we could add a flag to cindex which says "index the provided path, but don't add it to the list of paths". Then, for incremental indexing, you could just pass that flag and the path to your file.
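
A rough sketch of what such a flag could look like in Go follows. Everything named here is made up for illustration: the flag -no-record and the helpers parseArgs and recordedPaths do not exist in cindex, and the hard-coded path list only stands in for the paths stored in a real index.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// parseArgs parses a hypothetical -no-record flag followed by paths.
func parseArgs(args []string) (noRecord bool, paths []string, err error) {
	fs := flag.NewFlagSet("cindex-sketch", flag.ContinueOnError)
	fs.BoolVar(&noRecord, "no-record", false,
		"index the given paths without adding them to the list of indexed paths")
	if err = fs.Parse(args); err != nil {
		return false, nil, err
	}
	return noRecord, fs.Args(), nil
}

// recordedPaths returns the list of indexed paths to store after indexing
// args. With skipRecord set, the existing list is returned unchanged;
// otherwise new args are appended, skipping duplicates.
func recordedPaths(existing, args []string, skipRecord bool) []string {
	out := append([]string(nil), existing...)
	if skipRecord {
		return out
	}
	seen := make(map[string]bool, len(existing))
	for _, p := range existing {
		seen[p] = true
	}
	for _, p := range args {
		if !seen[p] {
			seen[p] = true
			out = append(out, p)
		}
	}
	return out
}

func main() {
	noRecord, paths, err := parseArgs(os.Args[1:])
	if err != nil {
		return // the FlagSet has already printed the error and usage
	}
	// Stand-in for the list of paths already stored in the index file.
	existing := []string{"/src/project"}
	// The files in paths would be indexed here regardless of the flag;
	// only the stored path list depends on it.
	fmt.Println(recordedPaths(existing, paths, noRecord))
}
```

The point of the sketch is only that the decision about the stored path list can be made separately from the indexing step, which is the independence assumed above.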

Is this workable? Is it wrong-headed?

@junkblocker
Owner

This issue was moved to junkblocker/codesearch#3
