Speed up git-info #221

Closed
paulmillr opened this Issue Jul 22, 2012 · 34 comments

@paulmillr

So, git-info is currently about 408 lines long, but many themes don't need everything it does.

Actually I don't mind about everything there, but the reason I created this issue is its speed. It's freaking slow.

How about adding light-git-info that will only do: git symbolic-ref HEAD 2> /dev/null (get current branch)?
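A minimal sketch of what such a lightweight helper could look like. The function name light_git_branch is hypothetical and not part of the project; the only git call is the one quoted above.

```shell
# Hypothetical lightweight helper: prints the current branch name and nothing else.
light_git_branch() {
  local ref
  # symbolic-ref fails on a detached HEAD or outside a repository.
  ref=$(git symbolic-ref HEAD 2> /dev/null) || return 1
  # Strip the refs/heads/ prefix to leave the short branch name.
  printf '%s\n' "${ref#refs/heads/}"
}
```

Because it runs a single git process and never walks the work tree, its cost should not depend on repository size.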

@sorin-ionescu

'Freaking slow' doesn't help me make it faster. vcs_info is even larger.

@paulmillr

Right, but I do not think profiling will help much. It just seems logical. When your machine constantly does IO and you have HDD instead of SSD, doing one IO command will be faster than doing 5 (10?).

Also, you seem to have made a typo in this guy's name.

@sorin-ionescu

I do not have an SSD, and, for me, it's fast enough. You can execute git-info off for very large repositories.

I'm going to invite @ColinHebert into this conversation.

@clvv

Try it on a removable usb drive. I reckon that would make a difference.

A solution is to use timeout. But of course it can only be used on one command, not the entire function. Or it may be possible to create a timeout function to set a running-time limit on a shell function. Something like this (link to my dotfile repo, implementation doesn't really work on zsh).
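A rough sketch of the single-command case, assuming the GNU coreutils timeout utility is installed. The wrapper name capped and the one-second limit are made up for illustration; as noted above, this cannot wrap a shell function, only an external command.

```shell
# Hypothetical wrapper: run one external command under a time limit.
# GNU timeout exits with status 124 when the limit is hit.
capped() {
  local limit=$1
  shift
  timeout "$limit" "$@"
}

# Usage (inside a prompt function): give up on git status after one second.
#   status_output=$(capped 1 git status --porcelain 2> /dev/null) || status_output=''
```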

@sorin-ionescu

How does git-info compare to vcs_info? Do you consider that git-info does slightly more than vcs_info in some cases?

@benohara

Hard to explain, but it takes a second for the prompt to display compared to using vcs_info

see http://codestre.am/7668bbd5a1dcb3607ff738b09

@sorin-ionescu

I wonder if vcs_info is caching. @benohara git-info is not that complicated. Don't be afraid to look it over.

@ColinHebert

@benohara, if I had to guess, I would say that it's due to git submodules; they tend to slow down git status, which is the biggest call made in git-info.
You can test that theory with zstyle ':omz:module:git:ignore' submodule 'all'; it should skip git status on your submodules and go way faster.


Regarding the speed of git-info itself I think it could be slightly improved by checking if the appropriate zstyle is set (zstyle -t context style [ strings ...] does that I believe).

For example, no need to run git symbolic-ref -q HEAD if you don't get the branch name or git rev-parse --symbolic-full-name --verify HEAD@{upstream} if you display nothing when the local branch isn't synchronised with the remote branch.

Those are minor improvements, but they should make things slightly faster (I think).
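The gating idea can be sketched in zsh roughly as follows. This is an illustration, not the git-info source: the function name branch_segment is hypothetical, the context string is borrowed from the zstyle examples later in this thread, and zstyle -s (which stores a style's value and fails when the style is unset) is used here instead of zstyle -t, which tests for a true-like value.

```shell
# zsh sketch: only pay for the git call when the theme defined a branch format.
branch_segment() {
  local branch_format ref
  # zstyle -s stores the style value in branch_format and fails if it is unset.
  zstyle -s ':prezto:module:git:info:branch' format branch_format || return 0
  # Reached only when a format exists, so git is not run for themes that skip it.
  ref=$(git symbolic-ref -q HEAD 2> /dev/null) || return 0
  # Applying branch_format to the name (for example with zformat) is omitted here.
  printf '%s\n' "${ref#refs/heads/}"
}
```

The saving per skipped call is one process spawn, which is why the gain is expected to be small next to git status.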


And to answer the initial question, should we have a git-info light; I think that if you need a lighter git-info (ie. with less features) you should consider using vcs_info.

git-info is, as far as I know, not here to be lighter or faster than vcs_info, but to add more features that you couldn't implement in vcs_info without writing terrible code (because there are not enough hooks for everything) or slower code.

I have nothing personal against vcs_info and it works very well if you want to keep your shell simple (and you can use the same style for every VCS). The reason why I don't use it is because I work all day long with git, and I need more info than just "is the local repo dirty" and the current branch name; I think I use 90% of what is executed in git-info.

@sorin-ionescu

I have looked at vcs_info, more specifically VCS_INFO_get_data_git. It does not cache. It does not do anything clever to be faster. It uses git diff-index to get a few things, namely staged, non-staged, and commit in addition to the branch and action.
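For illustration, exit-code checks in this spirit might look like the sketch below. It is not copied from VCS_INFO_get_data_git; the function names are hypothetical, and the checks assume at least one commit exists so that HEAD resolves.

```shell
# Exit-code checks: git only answers yes or no, nothing is listed or counted.
has_staged() {
  local rc=0
  # With --quiet, diff-index exits 1 when the index differs from HEAD.
  git diff-index --cached --quiet HEAD -- 2> /dev/null || rc=$?
  [ "$rc" -eq 1 ]
}

has_unstaged() {
  local rc=0
  # diff-files compares the work tree with the index; exit 1 means differences.
  git diff-files --quiet 2> /dev/null || rc=$?
  [ "$rc" -eq 1 ]
}
```

Any other exit status (for example outside a repository) is treated as "no changes", so a prompt using these helpers stays silent rather than printing errors. Untracked files are not detected by either check.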

git-status is slow. There is no way around it as far as I know. If you do not need a lot of information, use a theme that uses vcs_info. If you want a lot of information, use a theme that uses git-info.

As @ColinHebert said, any changes that can be done to git-info are trivial and are not likely to provide a perceptible increase in speed.

@pbrisbin

For what it's worth, I have a git-info-fast in my branch. It doesn't do some of the remote lookups and runs much faster (measured subjectively of course.) than the existing git-info.

I've been meaning to do a proper pull request (for this and other things) and now that the repo's been split, hopefully that can happen soon.

If anyone's interested, it's sitting here for now.

@ColinHebert

What does git-info-fast do that couldn't be available (easily) with vcs_info?

I mean, as said in this discussion the difference between git-info and vcs_info is that one provides more information while the other has the compatibility system with every (or most of them) VCS.
It seems to me (correct me if I'm wrong) that git-info-fast provides the same amount of details as vcs_info while not being compatible with other VCS systems like git-info.

Not that what you did isn't efficient or useful, but how does this solution compare to the two existing solutions already used by OMZ users?

@sorin-ionescu

@pbrisbin I am not merging that. Other than what @ColinHebert said, it's broken, especially the way it checks if you are inside of a repository.

I am open to making git-info faster without removing functionality, perhaps by using different low level git executables, such as git-diff-index.

@sorin-ionescu

Is there a Git daemon that uses inotify (Linux), FSEvents (Mac OS X), kqueue (Mac OS X, BSD), ReadDirectoryChangesW (Windows) to always be up to date on work tree changes in order for git status to run instantly by not having to walk said tree?

Should we cache git status then use a directory change notification library to update the changed file counts for added, modified, removed, renamed, and so on?

@paulmillr

@sorin-ionescu awesome idea, 👍

@sorin-ionescu

@paulmillr Someone else has had the same idea: inotify daemon speedup for git. Unfortunately, it was not successful.

@sorin-ionescu sorin-ionescu reopened this Oct 2, 2012
@sorin-ionescu

So, who wants to extend kqwait to attempt caching + file system notifications? You cannot expect me to do everything.

@ColinHebert

I'm not really fond of having that inside Prezto. If anything was done I would prefer to see an extension of git itself speeding up the git status, I'm not sure I'm comfortable with having my shell spawning daemons in every git repo I own and caching things weirdly.

I would be all for a new project to replace git status or enhance it. (As I'm trying to play with ruby on my weekends I might try that actually, but it's for fun, don't expect anything)

@sorin-ionescu

You can't do it in Ruby. It's low level kernel stuff. You'll have to do it in C.

@pbrisbin
@sorin-ionescu

So, I've been toying with trying to make git-info faster. I've done a lot of changes on this issue's branch.

Besides the boat load of if statements to test if a zstyle has been defined, it now also lets you choose between classic git-info status (full) and vcs_info status (partial), which only shows indexed (staged), via format code %i, and unindexed (unstaged), via format code %I.

zstyle ':prezto:module:git:info' status 'partial'
zstyle ':prezto:module:git:info:branch' format ':%F{green}%b%f'
zstyle ':prezto:module:git:info:indexed' format ' %B%F{green}i%f%b'
zstyle ':prezto:module:git:info:unindexed' format ' %B%F{blue}I%f%b'
zstyle ':prezto:module:git:info:keys' format \
  'prompt' ' %F{blue}git%b' \
  'rprompt' '%i%I'

Please test this new git-info for speed and bugs.

# Switch to git-info theme.
time (git-info)

# Switch to vcs_info theme.
time (vcs_info)

@ColinHebert

@sorin-ionescu I don't intend to do any low-level stuff; there are already plenty of tools that use inotify and FSEvents. Worst-case scenario, I would have to do some ruby ffi (I would very much like to avoid that anyway).

Plus it would be easier to move to C if a POC can be set up quickly. IMHO only perl, python and ruby are viable languages for this POC, and there is no way I do that in perl, so I'll try with ruby.

@pbrisbin I think it will still be useful when you work with a lot of submodules (which is my case, about 100 submodules in my main project at work)

@sorin-ionescu heh, applying ifs to check if the zstyle is used rings a bell. But anyway, I think our main problem is (and will stay for a while) this git status which is incredibly slow (at least that's what bothers me the most).

@sorin-ionescu

@ColinHebert Well, with %i and %I, you can now have vcs_info status, including its deficiency of not detecting untracked files. The new boat load of if statements, we should probably keep. The vcs_info style status, I'm not too sure.

Benchmarking it against vcs_info themes would be useful.

@sorin-ionescu

The new git-info is slightly faster.

Old:
0.04s user 0.09s system 85% cpu 0.153 total

New (status enabled):
0.04s user 0.08s system 87% cpu 0.138 total

New (status not enabled):
0.02s user 0.05s system 87% cpu 0.085 total

@sorin-ionescu

I've toyed with a peepcode theme clone called peepcode_git_info that uses git-info.

peepcode (vcs_info):
0.04s user 0.07s system 87% cpu 0.124 total

peepcode_git_info (git-info):
0.03s user 0.06s system 86% cpu 0.104 total

It's probably faster because git-info does not have stgit support.

The git-info version is a lot more readable than the vcs_info version.

Comments?

@sorin-ionescu

@ColinHebert How do multiple calls to git ls-files compare to one call to git status --porcelain, I wonder?

@ColinHebert

Hum, I'm not so sure about ls-files; it's really recommended to stay away from it (for scripting). If we want to go with plumbing commands, we should take a look at git diff-index and git diff-files.

I did a really quick test, here is what we would like to have:

added (to the WD/untracked) :

git ls-files -o --exclude-standard

added (to the index):

git diff-index HEAD --name-status --cached (--find-renames)

removed (from the WD):

git diff-files --name-status

removed (from the index):

git diff-index HEAD --name-status --cached (--find-renames)

modified (in the WD):

git diff-files --name-status

modified (in the index):

git diff-index HEAD --name-status --cached (--find-renames)

renamed (in the WD):
NOT RELEVANT

renamed (in the index):

git diff-index HEAD --name-status --cached --find-renames

I haven't checked unmerged files yet. And there is a big problem with all of that: almost all of those commands require HEAD, which doesn't exist until the initial commit is done.
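One possible workaround for the missing HEAD, sketched under the assumption that diffing against the empty tree is acceptable before the first commit (the same fallback that git's sample pre-commit hook uses). The function name diff_base is hypothetical.

```shell
# Pick a tree-ish to diff the index against, even before the first commit.
diff_base() {
  if git rev-parse --verify --quiet HEAD > /dev/null 2>&1; then
    printf 'HEAD\n'
  else
    # No commit yet: use the id of the empty tree, computed rather than hard-coded.
    git hash-object -t tree /dev/null
  fi
}

# Usage: git diff-index --cached --name-status "$(diff_base)"
```

This costs one extra git process per prompt, which works against the goal of the exercise, so it only makes sense if the plumbing route is otherwise faster.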


Overall I think that we should stick with git status which already does the aggregation we're about to do. I'm not sure that doing that ourselves will give better results.
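To make the single-call alternative concrete, here is a sketch that aggregates one run of git status --porcelain (the two-column XY format) into counts. The function name count_porcelain and the output layout are assumptions for illustration; unmerged entries are not treated specially.

```shell
# Reads `git status --porcelain` lines on stdin and prints three counts:
# "indexed unindexed untracked".
count_porcelain() {
  awk '
    { x = substr($0, 1, 1); y = substr($0, 2, 1) }
    x == "?" && y == "?" { untracked++; next }
    x == "!" { next }   # ignored files, only present with --ignored
    { if (x != " ") indexed++; if (y != " ") unindexed++ }
    END { printf "%d %d %d\n", indexed, unindexed, untracked }
  '
}

# Usage: git status --porcelain | count_porcelain
```

A file that is both staged and modified again (status MM) is counted once in each column, which matches how git status itself reports it.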

@sorin-ionescu

Has anybody bothered to test these changes for speed and bugginess?

@sorin-ionescu

I am inviting @skpw into this conversation.

@sorin-ionescu

I have made git-info faster by only computing information when a particular zstyle is defined. However, since git-status is slow and many do not want as much repository information as my theme shows, I have also added a mode, simple, in lieu of complex (feel free to suggest better names), that behaves similarly to vcs_info: it informs of staged and unstaged files. For the purpose of git-info, these shall be known as indexed files and unindexed files, since the %S format code is already in use for stashed files.

Select the mode you want for your theme:

zstyle ':prezto:module:git:info' status 'simple'  # or 'complex'

I have come up with two versions of the simple mode, known as v1 and v2, which I shall discuss next.


v1 behaves similarly to vcs_info, but unlike vcs_info, unindexed also informs of untracked files. I have noticed that many vcs_info themes hack in support for untracked files using a vcs_info hook, since most people consider unindexed and untracked to be one and the same: not in the index. See the peepcode theme for an example. They can be separated, of course; I just chose to follow the hook hack.

The performance between vcs_info and git-info is virtually identical provided that the vcs_info theme also checks for untracked files.

The following format codes are available.

Name       Format Code  Description
indexed    %i           Indexed files indicator
unindexed  %I           Unindexed (including untracked) files indicator

The deficiency of this version of the simple mode is that these format codes have to be set to a coloured UTF-8 character or word. There is no count of indexed and unindexed files like in other contexts.


v2 behaves similarly to the classic git-info and calls the same git porcelain commands as v1 but presents the information computed differently. unindexed no longer mashes together unindexed files and untracked files; they are now split into separate unindexed and untracked contexts. Furthermore, the file count for each context is provided.

This version also transplants two contexts from the complex mode, clean and dirty. Many people just want to know when a repository is dirty by displaying the character.

So, what is dirty?

 dirty = indexed + unindexed + untracked

The above three contexts are initialised to 0 and unless defined in the theme, they are never computed. If dirty to you means unindexed and untracked but not indexed, and you want to show the character you'll have to define the following:

zstyle ':prezto:module:git:info:unindexed' format ' '
zstyle ':prezto:module:git:info:untracked' format ' '
zstyle ':prezto:module:git:info:dirty' format ' %F{red}✗%f'

The following format codes are available.

Name       Format Code  Description
clean      %C           Clean state
dirty      %D           Dirty files count
indexed    %i           Indexed files count
unindexed  %I           Unindexed files count
untracked  %u           Untracked files count

v2 is slightly slower than v1 because for indexed and unindexed, we can no longer rely on exit codes and have to count files.
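To illustrate where the extra cost comes from: counting means git has to emit the full file list, which then has to be counted, instead of stopping at a yes/no exit status. The sketch below uses plumbing commands mentioned earlier in this thread; it is not taken from git-info, the function names are hypothetical, and it assumes HEAD exists.

```shell
# Count-based variant: each helper lists matching files and counts the lines.
count_indexed() {
  git diff-index --cached --name-only HEAD -- 2> /dev/null | wc -l | tr -d ' '
}

count_unindexed() {
  git diff-files --name-only 2> /dev/null | wc -l | tr -d ' '
}

count_untracked() {
  git ls-files --others --exclude-standard 2> /dev/null | wc -l | tr -d ' '
}

# dirty = indexed + unindexed + untracked
count_dirty() {
  echo $(( $(count_indexed) + $(count_unindexed) + $(count_untracked) ))
}
```

File names containing newlines would be miscounted by wc -l; a stricter version would use the -z output options and count NUL bytes.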

Using time (vcs_info) and time (git-info), I have got the following total times, in seconds, in a repository with 1 indexed file, 3 unindexed files, and 1 untracked file.

  • peepcode vcs_info: 0.118
  • peepcode simple v1: 0.120
  • peepcode simple v2: 0.136
  • sorin complex: 0.249

Please vote for or against v1 or v2. You can also suggest your own or none at all. I'm not particularly fond of adding more features to git-info.

@paulmillr

👍 v2

@lunaryorn

👍 v2

@sorin-ionescu

Perhaps minimal and verbose are better names for the two modes than simple and complex.

@gmaghera gmaghera added a commit to gmaghera/prezto that referenced this issue May 19, 2013
@gmaghera gmaghera Merge remote-tracking branch 'upstream/master'
* upstream/master:
  Make gpg-agent and ssh-agent work with each other
  [Fix #425] Rewrite module ssh-agent; rename it to ssh
  [Fix #103] Add documentation for editor
  Remove the git-info SIGINT message
  [Fix #307] Do not auto-off git-info
  Remove ununsed variable
  Clarify Git listing aliases descriptions
  Swap aliases gsd and gsL
  Rename alias gRc to gRp
  [Fix #221] Add a simple git-info
  [#221] Do not format undefined zstyles
  Initialize ahead and behind local variables
  Add rar command to archive module
  Refactor Emacs module
  Load completion for Carton
ac72c9a
@sorin-ionescu

If anybody has got ideas on how to speed it up further, I'm listening. Yes, you'll have to read and comprehend the giant git-info function.

@sorin-ionescu

If all you want to show is a dirty repository indicator, no counts, vcs_info is still your best bet.

@jeffknupp jeffknupp pushed a commit to jeffknupp/prezto that referenced this issue Oct 15, 2013
@sorin-ionescu [#221] Do not format undefined zstyles 18c0876
@jeffknupp jeffknupp pushed a commit to jeffknupp/prezto that referenced this issue Oct 15, 2013
@sorin-ionescu [Fix #221] Add a simple git-info b40c40e