Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

recursive directory scanning harmful for performance #76

Open
nojb opened this issue Sep 22, 2015 · 2 comments
Open

recursive directory scanning harmful for performance #76

nojb opened this issue Sep 22, 2015 · 2 comments

Comments

@nojb
Copy link
Contributor

nojb commented Sep 22, 2015

I am using ocp-index with a large source tree (~1000 modules in ~1500 directories). Running

ocp-index complete -I . foo

when standing at the source of the tree incurs in a delay of about 1s before completions appear. While this is not much it is too long to use it for completion in emacs.

Some profiling shows that most of the time is spent traversing the directory tree looking for *.cm{t,ti,i} files. Since the set of such directories are more or less static one should be allowed to specify these directories by hand to avoid the recursive scanning each time ocp-index is run. The following changes are proposed:

  • A flag --recursive (or just -r) indicating that the include directories should be recursively traversed. If backwards-compatibility is important, we could instead introduce a flag --no-recursive.
  • A file .ocp-index can be placed in the tree containing a list of directories. These directories would be used as include directories as if they had been passed to the -I command line option. Typically this file would be generated by doing
find . -name '*.cmt*' -or -name '*.cmi' -exec dirname {} \; | sort | uniq > .ocp-index

while standing at the project root.

Note that the file .ocp-index can also be used to indicate the project root and we could replace all the build system-specific logic easily using this one mechanism. For example, projects using ocamlbuild could simply create a .ocp-index file containing "_build":

echo _build > .ocp-index

I have a small patch implementing this proposal and it gets the delay reduced to < .2s which makes it useful for interactive emacs use again. I would like to submit a pull request, but I am opening this issue first to ask for opinions/ideas/etc.

Thanks!

@AltGr
Copy link
Member

AltGr commented Sep 29, 2015

Thanks; in my experience, the delay gets below noticable once the os caching has done its job after first run, but that may not always be the case. Ocp-index does already implement some magic to guess your project's root and build directory, though: so whenever a _build is found in a parent directory, both should be inferred from there. (the code is here).

How does ocp-index behave in your case if you remove the -I . ?

So .ocp-index may be a good idea to work around the cases where this doesn't work reliably, if there is no easy fix to the heuristics. I'd like to keep the current defaults which appear to work well for most projects, though.

@nojb
Copy link
Contributor Author

nojb commented Sep 29, 2015

Just pushed a PR in order to have something more concrete to discuss. If I remove the -I . or if I pass a directory with a small number of subdirectories then ocp-index works fast as usual. Also, the delay does not go away after the first run under Cygwin.

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants