Search within and across repositories #312

Closed
gitblit opened this issue Aug 12, 2015 · 10 comments
Comments

gitblit commented Aug 12, 2015

Originally reported on Google Code with ID 16

It would be bad ass if we could use something like lucene to search across repositories.
At the moment, I don't think there's a good way to search across git repositories...
I know quite a few people who set up an OpenGrok instance just to do this.

Thanks!


Reported by caniszczyk on 2011-08-05 20:51:51

gitblit commented Aug 12, 2015

This is definitely doable.  Updating the indexes is the tricky part.  If pushes are
done through the gitblit servlet (jgit servlet) then I can intercept pushes and schedule
an index update.  But if gitblit is a viewer only then I guess I need a polling mechanism
or to specify a post-commit hook?

Can you comment on the jgit support for post-commit hooks?

Also, what are you interested in indexing?  Commit messages?  Files?

Reported by James.Moger on 2011-08-06 21:16:16

gitblit commented Aug 12, 2015

jgit doesn't support post-commit hooks yet really...  however, you could just do a timed
index update

I'm interested in indexing files and commit messages, sort of like OpenGrok.

Reported by caniszczyk on 2011-08-08 18:23:41
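
The timed index update suggested above could be sketched roughly as follows. This is only an illustration in plain Java, not Gitblit or JGit code: the class `TimedIndexPoller`, the "repository:branch" key format, and the `headReader` and `reindexer` callbacks are all hypothetical names. The idea is to remember the last head commit id indexed per branch and re-index only the branches whose head has moved.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical helper: decides which branches need re-indexing on each timed pass. */
class TimedIndexPoller {
    // Last head commit id that was indexed, keyed by "repository:branch" (assumed format).
    private final Map<String, String> indexedHeads = new HashMap<>();

    /**
     * Compares the current branch heads with the heads seen on the previous pass.
     * Returns the keys that are new or whose head moved, in sorted order,
     * and records the current heads as indexed.
     */
    List<String> poll(Map<String, String> currentHeads) {
        List<String> stale = new ArrayList<>();
        for (Map.Entry<String, String> e : new TreeMap<>(currentHeads).entrySet()) {
            String previous = indexedHeads.put(e.getKey(), e.getValue());
            if (!e.getValue().equals(previous)) {
                stale.add(e.getKey());
            }
        }
        return stale;
    }

    /**
     * Runs poll() repeatedly with a fixed delay between passes.
     * headReader is assumed to read the branch heads from the repositories;
     * reindexer is assumed to update the search index for the stale branches.
     */
    static ScheduledExecutorService start(TimedIndexPoller poller,
                                          Supplier<Map<String, String>> headReader,
                                          Consumer<List<String>> reindexer,
                                          long delaySeconds) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        executor.scheduleWithFixedDelay(
                () -> reindexer.accept(poller.poll(headReader.get())),
                0, delaySeconds, TimeUnit.SECONDS);
        return executor;
    }
}
```

Polling this way needs no hook support at all, at the cost of the index lagging behind a push by up to one delay period.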

gitblit commented Aug 12, 2015

I believe that cross-repository commit message search would be really useful. 
My organization has the habit of commenting commits related to a defect fix with the
number of the defect fix in our defect management system. Sometimes a fix may span
multiple repositories, and after a few months it may be difficult to remember all
the repositories involved. This is why it would be very handy if gitblit allowed
searching commit messages across multiple repositories.

I know my need could be solved by means of a better defect management system, but at
present I am stuck with an old piece of software which is very limited in its feature
set and does not support this use case.

Reported by gm.romanato on 2011-10-21 11:11:09
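
To make the use case above concrete, here is a toy sketch of cross-repository commit message search. It is a tiny in-memory inverted index written in plain Java, not Lucene and not Gitblit's implementation; the class `CommitMessageIndex`, the "repository@commitId" entry format, and the sample repository names are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical toy index: maps each token of a commit message to the commits containing it. */
class CommitMessageIndex {
    // token -> sorted set of "repository@commitId" entries (assumed format)
    private final Map<String, Set<String>> postings = new HashMap<>();

    /** Splits the message into lowercase alphanumeric tokens and records the commit under each. */
    void add(String repository, String commitId, String message) {
        for (String token : message.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>())
                        .add(repository + "@" + commitId);
            }
        }
    }

    /** Returns every indexed commit, from any repository, whose message contains the term. */
    Set<String> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
    }
}
```

With commits from two repositories indexed, for example `add("core.git", "a1", "Fix DEF-1234 in parser")` and `add("ui.git", "b2", "DEF-1234: adjust dialog")`, a single `search("1234")` returns entries from both, which is the behaviour requested here.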

gitblit commented Aug 12, 2015

Lucene search is something I plan to tackle.  I think it would be very useful too. 
It's a medium/large project IMO and there are a few other projects ahead of it in my
priority queue.  Each of the parts of Lucene integration (index, presentation, security)
is not all that difficult, but putting them all together the right way requires some
thought.

Reported by James.Moger on 2011-10-21 11:43:21

gitblit commented Aug 12, 2015

Issue 334 has been merged into this issue.

Reported by James.Moger on 2012-01-11 13:11:10

gitblit commented Aug 12, 2015

@Chris: I've been reviewing OpenGrok some more, since you cited it as a reference. 
I will not be able to deliver the code-structure searching that OpenGrok offers unless
we can find an ASL-friendly alternative to the GPL'd Exuberant Ctags.  I don't yet
fully appreciate all of OpenGrok's capabilities, but it would appear that Lucene and
Exuberant Ctags do most of the heavy lifting.

I am currently prototyping a Lucene back-end for search.

Reported by James.Moger on 2012-01-19 13:22:03

  • Status changed: Started
  • Labels added: Milestone-0.9.0

gitblit commented Aug 12, 2015

Issue 341 has been merged into this issue.

Reported by James.Moger on 2012-01-24 13:42:21

gitblit commented Aug 12, 2015

Lucene indexing is completed.  The current design is pretty straightforward and indexes
commits and blobs on a 2-minute cycle.  Lucene indexing is an opt-in feature, meaning
that each repository must elect to be indexed by specifying which branches will be
indexed.  This is to help ensure that things you actually care about are indexed and
everything else is ignored.
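
The opt-in rule described above might look something like the following sketch. It is not the actual Gitblit code: the class `IndexedBranchSelector` and its methods are hypothetical, and the only detail taken from this comment is that a repository is indexed only for the branches it has explicitly elected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of opt-in indexing: nothing is indexed unless a repository elects branches. */
class IndexedBranchSelector {
    // repository name -> branches that repository has elected to index
    private final Map<String, Set<String>> optedIn = new HashMap<>();

    /** Records that the given branches of a repository should be indexed. */
    void optIn(String repository, String... branches) {
        optedIn.computeIfAbsent(repository, k -> new TreeSet<>())
               .addAll(Arrays.asList(branches));
    }

    /**
     * Returns, in sorted order, the existing branches that were opted in.
     * A repository that never opted in yields an empty list and is skipped entirely.
     */
    List<String> branchesToIndex(String repository, Collection<String> existingBranches) {
        Set<String> wanted = optedIn.getOrDefault(repository, Collections.emptySet());
        List<String> result = new ArrayList<>();
        for (String branch : new TreeSet<>(existingBranches)) {
            if (wanted.contains(branch)) {
                result.add(branch);
            }
        }
        return result;
    }
}
```

Each indexing cycle would then call `branchesToIndex` per repository and touch only what it returns, so branches nobody elected cost nothing.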

In some ways this is an improvement on OpenGrok, which as far as I can see cannot
directly index a repository but rather must checkout the repository first and then
index the working copy.  And it must do this per-branch should you want to index multiple
branches.

Gitblit does not currently index any AST metadata like OpenGrok does.  This is due in part
to a licensing conflict with Exuberant Ctags and also due to Exuberant Ctags being
native code, which breaks Gitblit's promise of a "pure Java Git solution".

I have been in contact with Robert Futrell of the RSyntaxTextArea project about the
possibility of refactoring some of his excellent project into a new pure Java AST-extraction
library.  I'm not sure if this will go anywhere, but potentially Gitblit could offer
at least Java AST indexing based on this code.

Oh and when 0.9.0 is released and you turn on Lucene indexing, be sure to feed your
Gitblit plenty of RAM on that first pass through your repositories.  :)

Reported by James.Moger on 2012-03-25 17:49:42

  • Status changed: Queued

gitblit commented Aug 12, 2015

Fixed in v0.9.1

Reported by James.Moger on 2012-03-28 00:02:10

gitblit commented Aug 12, 2015

Fixed in v0.9.1. Closing.

Reported by James.Moger on 2012-03-28 00:03:12

  • Status changed: Fixed

@gitblit gitblit closed this as completed Aug 12, 2015
This was referenced Aug 12, 2015
@flaix flaix modified the milestone: 0.9.0 Dec 13, 2016