Indexing of blob content is intermittently incorrect (it does seem to work
some of the time). When it goes wrong, the content is always decoded as UTF-8.
It turns out that there are two indexing methods: index(), which works, and
reindex(), which doesn't. If I drop the Lucene index, reindex() is used.
Fix attached.
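To illustrate the symptom, here is a minimal sketch of the difference between decoding unconditionally as UTF-8 and trying a list of candidate charsets in order. This is not the attached patch or Gitblit's actual code; `BlobDecoder` and `decode` are hypothetical names, and the ISO-8859-1 last resort is an assumption made for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Gitblit or JGit.
final class BlobDecoder {

    // Tries each candidate charset with strict decoding and returns the
    // first result that decodes without malformed or unmappable input.
    static String decode(byte[] content, Charset... candidates) {
        for (Charset cs : candidates) {
            try {
                return cs.newDecoder()
                        .onMalformedInput(CodingErrorAction.REPORT)
                        .onUnmappableCharacter(CodingErrorAction.REPORT)
                        .decode(ByteBuffer.wrap(content))
                        .toString();
            } catch (CharacterCodingException e) {
                // Bytes are not valid in this charset; try the next one.
            }
        }
        // Assumed last resort: ISO-8859-1 maps every byte to a character,
        // so this never fails (though the text may be wrong).
        return new String(content, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "café" encoded as ISO-8859-1: the final byte 0xE9 is not valid UTF-8.
        byte[] latin1 = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);

        // Unconditional UTF-8 decoding replaces the bad byte with U+FFFD.
        String alwaysUtf8 = new String(latin1, StandardCharsets.UTF_8);
        System.out.println(alwaysUtf8.equals("caf\uFFFD"));

        // Trying UTF-8 first and then ISO-8859-1 recovers the original text.
        String withFallback = decode(latin1,
                StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(withFallback.equals("caf\u00e9"));
    }
}
```

A blob that is valid UTF-8 decodes identically either way, which would explain why the indexing appears to work some of the time.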
Reported by robin.rosenberg on 2012-07-23 14:26:49
I do like reusing code, but jumping to that method in JGitUtils opens a new revwalk,
a new treewalk with a path filter, and doesn't reuse any byte buffers for what is a
memory-consuming process. The other getStringContent would be a better match, but
it still has to perform an unnecessary lookup. To my mind it is better to keep the
8 lines of code which decode a blob from the repository in the Lucene indexer.
It should be noted that the strategy differs slightly between index() and reindex().
Index is for incrementally updating branches and blobs and is executed due to pushed
commits. It delegates most git ops to JGitUtils which I think is reasonable. Reindex
is for ground-zero indexing, which is expensive, so it directly uses revwalks, treewalks,
etc. in a way that is optimal for indexing the entire repository.
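The buffer-reuse point can be sketched roughly as follows. This is only an illustration under assumptions, not the code in the Lucene indexer: `ReusedBufferReader` is a hypothetical name, and plain in-memory streams stand in for the blob streams a full reindex would read from the repository.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Gitblit or JGit.
final class ReusedBufferReader {
    // Both buffers are allocated once and reused for every blob.
    private final ByteArrayOutputStream sink = new ByteArrayOutputStream(8192);
    private final byte[] chunk = new byte[4096];

    // Reads one stream to the end and returns its bytes.
    byte[] readFully(InputStream in) {
        sink.reset(); // discards the previous blob but keeps the backing array
        try {
            int n;
            while ((n = in.read(chunk)) != -1) {
                sink.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray(); // copy sized to this blob only
    }

    public static void main(String[] args) {
        ReusedBufferReader reader = new ReusedBufferReader();
        // Stand-ins for the blobs visited during a full reindex.
        String[] blobs = { "first blob", "second" };
        for (String blob : blobs) {
            InputStream in = new ByteArrayInputStream(
                    blob.getBytes(StandardCharsets.UTF_8));
            byte[] content = reader.readFully(in);
            System.out.println(new String(content, StandardCharsets.UTF_8));
        }
    }
}
```

The trade-off is that the reader is stateful and not thread-safe, which is acceptable for a single-threaded pass over the whole repository but would be a poor fit for a general-purpose utility method.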
Originally reported on Google Code with ID 112
- _Attachment: [0001-Fix-the-LuceneExecutor.reindex-to-decode-blobs-the-s.patch](https://storage.googleapis.com/google-code-attachments/gitblit/issue-112/comment-0/0001-Fix-the-LuceneExecutor.reindex-to-decode-blobs-the-s.patch)_