Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Magit calls Git *many* times to produce the status buffer #1327

Closed
jwiegley opened this issue Apr 6, 2014 · 5 comments
Closed

Magit calls Git *many* times to produce the status buffer #1327

jwiegley opened this issue Apr 6, 2014 · 5 comments
Labels
discussions We now use the "Discussions" feature for this
Milestone

Comments

@jwiegley
Copy link
Contributor

jwiegley commented Apr 6, 2014

Here is a complete listing of all the Git commands Magit calls to generate its status buffer:

GIT: magit-git-string git describe --tags --dirty
GIT: magit-git-string git rev-parse --show-toplevel
GIT: magit-git-string git rev-parse --show-toplevel [3 times]
GIT: magit-git-exit-code git update-index --refresh
GIT: magit-git-string git symbolic-ref -q HEAD [2 times]
GIT: magit-git-string git config branch.master.remote
GIT: magit-git-string git config branch.master.merge
GIT: magit-git-lines git config --get-all remote.magit.fetch
GIT: magit-git-string git config --bool branch.master.rebase
GIT: magit-git-string git config branch.master.remote
GIT: magit-git-string git config remote.magit.url
GIT: magit-git-string git rev-parse --verify HEAD
GIT: magit-git-string git log -1 --pretty=format:%h %s HEAD
GIT: magit-git-string git describe --long --tags
GIT: magit-git-string git describe --contains HEAD
GIT: magit-git-string git rev-parse --git-dir [7 times]
GIT: magit-git-lines git stash list
GIT: magit-git-lines git status --porcelain
GIT: magit-git-string git rev-parse --git-dir [2 times]
GIT: magit-cmd-insert-section: git diff-files
GIT: magit-git-exit-code git log -1 HEAD
GIT: magit-git-exit-code git diff --quiet --cached
GIT: magit-git-string git symbolic-ref -q HEAD
GIT: magit-git-string git config branch.master.remote
GIT: magit-git-string git config branch.master.merge
GIT: magit-git-lines git config --get-all remote.magit.fetch
GIT: magit-cmd-insert-section: git log --format=format:%h %s HEAD..refs/remotes/magit/master
GIT: magit-git-string git config core.abbrev
GIT: magit-git-string git symbolic-ref -q HEAD
GIT: magit-git-string git config branch.master.remote
GIT: magit-git-string git config branch.master.merge
GIT: magit-git-lines git config --get-all remote.magit.fetch
GIT: magit-cmd-insert-section: git log --format=format:%h %s refs/remotes/magit/master..HEAD
GIT: magit-git-string git config core.abbrev

Note that several of these commands are repeated multiple times too.

I've created a patch to memoize the results of calls, so that the multiple calls are no longer done, but it only drops speed down from 1.22s to 1.06s in one of my repositories. Probably not worth the additional complexity.

The subject of this issue is:

  • determining whether each of these calls is really necessary
  • reducing or eliminating duplicate calls

magit-status can be quite slow on large repositories, taking multiple seconds each time that I stage a hunk, which really slows down my workflow. I'd like to help make magit snappy for such use cases.

@tarsius
Copy link
Member

tarsius commented Apr 6, 2014

Except possibly on Windows, I think that the bad performance is mostly due to how inefficiently we parse diffs, and to a lesser extend parsing other things too. Also we use a lot of overlays but never disposed of them, making Magit slower over time. I have mostly addressed these issue on the next branch.

Calling Git a lot certainly also isn't optimal but as I said and your benchmark seems to confirm it's probably not all that bad. I am not motivated to cache git output, and then to deal with cache invalidation. Actually I am hoping ffi will appear fairly soon in Emacs, since in the long run I want to use libgit2.

For a long time I have not done anything about bad performance, at least not directly. That has changed recently, but most of that work is only in the next branch for now. Instead of doing benchmarks and then doing micro optimization and caching I have worked hard to cleanup code. Performance wasn't the only goal of such refactoring, so better performance came pretty much for free.

One place where I have refactored mainly for performance is the wazzup buffer. The original implementation was deeply flawed and had terrible performance. Over the years you and others have added hacks to work around the bad performance, i.e. provide a way to filter out branches the user is not interested in. I have thrown all of that out and have instead made one small adjustment: git log isn't called until a branch section is actually expanded. See bc6268c and c4e37e3. Actually we are now using git cherry instead of git log, the fact that this is much slower is a non-issue now.

Also note that the "delay inserting and/or washing" feature used by wazzup is currently hacked on top of the generic section handling. I am currently working on generalizing it so that it could be used elsewhere too, e.g. for initially-collapsed diffs in the status buffer. Related changes will including delaying the refinement of hunks and highlighting of bad whitespace. Together these things should give another big speedup.

In short, there are still things left where we can get better performance by simply "doing the right thing" and so I want to concentrate on that and delay benchmark-motivated changes for a little longer. Not much longer though, the end of the refactoring is finally in sight.

@tarsius tarsius added this to the support/later/needinfo/... milestone Apr 6, 2014
@jwiegley
Copy link
Contributor Author

jwiegley commented Apr 6, 2014

Sounds good! I look forward to seeing the next branch make it's way to master.

@tarsius
Copy link
Member

tarsius commented Apr 6, 2014

By the way good benchmarks would help. It's not that nobody has ever provided any benchmarks that was capable of demonstrating that something was slow. But these benchmarks were never very "scientific", i.e. I got the impression that the benchmarks were run already knowing what the result would be.

For example a good benchmark for refreshing the status buffer would not only show how long each git command is taking or how long each function on magit-status-sections-hook is taking. It would have to do both and then also provide some information about how long (1) we wait for git output (2) wash git output (3) do other things in elisp. It should not only enable us to make statements like "washing git output takes long" or "running git many times takes long". It should also provide insights like "compared to other magit-insert-* functions xy takes very long, even though getting the raw output from git actually is comparatively fast in that case". And then these benchmarks should also be written in a way that would allow them to be added to tests/magit-tests.el. Bonus points if implementing these benchmarks doesn't require modifying the code they are measuring.

@tarsius
Copy link
Member

tarsius commented Apr 11, 2014

On next it now takes fewer calls to git... to be further reduced.

@tarsius tarsius modified the milestones: 2.2.0, support/later/needinfo/... Apr 11, 2014
@tarsius tarsius modified the milestones: 2.1.0, 2.2.0 Apr 16, 2014
@tarsius tarsius added the . label Apr 16, 2014
@tarsius
Copy link
Member

tarsius commented Apr 16, 2014

Closing and adding to my "come back to this eventually" list.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussions We now use the "Discussions" feature for this
Development

No branches or pull requests

2 participants