Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

git_prompt_status is slow on cygwin #5486

Closed
TheDauthi opened this issue Oct 1, 2016 · 18 comments
Closed

git_prompt_status is slow on cygwin #5486

TheDauthi opened this issue Oct 1, 2016 · 18 comments
Labels
Area: theme Issue or PR related to a theme Bug Something isn't working Performance Issue or PR about performance Platform: Windows Issue or PR for Windows

Comments

@TheDauthi
Copy link

git_prompt_status is rather slow on cygwin, and on other platforms where spawning a subshell is slow.
Two cases:
(first, just calling the function, second is inside a fresh linux kernel tree. )

TheDauthi@Hera:~$ time (git_prompt_status)
( git_prompt_status; )  0.21s user 0.41s system 97% cpu 0.636 total
TheDauthi@Hera:~/Projects/linux$ time (git_prompt_status)
( git_prompt_status; )  2.41s user 9.50s system 228% cpu 5.216 total

(Not much we can do about the second one, it's due to the slowness of the git status itself)
Identical setup on linux:

billyconn@Hades:~$ time (git_prompt_status)
( git_prompt_status; )  0.00s user 0.01s system 16% cpu 0.047 total
billyconn@Hades:~/Projects/linux$ time (git_prompt_status)
( git_prompt_status; )  0.06s user 0.08s system 85% cpu 0.173 total

Tracking this down, it's the spawning of the greps (which is no shock; cygwin emulation of fork is well-known, and grep is somehow particularly slow). I rewrote this function to use zsh builtins and an ugly bit of sed instead of spawning greps:

Cygwin:

TheDauthi@Hera:~$ time (git_prompt_status)
( git_prompt_status; )  0.07s user 0.04s system 117% cpu 0.102 total
TheDauthi@Hera:~/Projects/linux$ time (git_prompt_status)
( git_prompt_status; )  2.30s user 8.75s system 241% cpu 4.567 total

It's faster on linux, too, but linux was already fast:
Linux:

billyconn@Hades:~$ time (git_prompt_status)           
( git_prompt_status; )  0.00s user 0.00s system 0% cpu 0.008 total
billyconn@Hades:~/Projects/linux$ time (git_prompt_status)
( git_prompt_status; )  0.06s user 0.09s system 104% cpu 0.137 total

https://github.com/TheDauthi/git-prompt-status

I didn't make a pull request; this is in a frequently-used lib so I didn't want to touch it without testing, if there's even interest in using a modified version. I wrote a FEW tests (they're in the test directory), but will try to add more. I also would like to know if/how you typically integrate tests, as what's there is just something I hacked up in a couple of hours.

Not sure how the command works on OSX yet, don't have a machine to test on until Monday. I suspect that the sed statements will need to be re-quoted. Wanted to gauge interest and get feedback before continuing.

@mcornella
Copy link
Member

Feel free to make a PR, I'm all for using all zsh builtins. Thanks!

@mcornella mcornella added Area: theme Issue or PR related to a theme Bug Something isn't working Platform: Windows Issue or PR for Windows labels Oct 3, 2016
@igitur
Copy link

igitur commented Aug 17, 2017

Was this ever PR'ed / merged?

@TheDauthi
Copy link
Author

I made a PR, but it was never merged.

There's also a sed-based version in that linked repo. It's even faster (another multiple) because it only counts unique status lines. But I never got around to testing it on BSD platforms, and it doesn't use ZSH builtins.

@igitur
Copy link

igitur commented Aug 22, 2017

Ok, thanks. Bit of a side question: how did you track down the cause of the slow prompt? Even after using your sed-based version, my prompt it still slow and I suspect it's the damn corporate anti-virus. But I need a way to verify that.

@TheDauthi
Copy link
Author

I started out by looking at the outputs of echo $PROMPT, echo $RPROMPT, echo $RPROMPT2 and echo $PROMPT_COMMAND. That shows what functions are being called before/after each prompt. Testing each function separately using $(function-name) will show which one is taking its time. From there, it's just a matter of taking apart the functions individually.

If you're on Cygwin, I'd give it an 98% chance of being extra subshells or commands (especially grep) being executed. Forking and running commands (including subshells) is extremely slow on cygwin, because it's not really forking, it's creating a process and then cloning the previous command's entire data segment, stack, heap, etc. manually into the new process. The Windows kernel itself does support forking, but that functionality isn't exposed to the win32 subsystem.

If you're on another environment, the only real things I've seen used that killed performance in a prompt was a plugin function that scanned and parsed the history file and rewrote it after every command and one that was trying to do a git fetch after every one.

@igitur
Copy link

igitur commented Aug 23, 2017

@TheDauthi Yes, I'm on babun (cygwin + extras). Turns out it was the anti-virus. It does a real-time scan for every fork, which exacerbates the issue. Thanks for your help.

@TheDauthi
Copy link
Author

TheDauthi commented Aug 23, 2017

That's interesting. I'd suspect that with the extra work with the anti-virus, the pure-shell-parser version would be the faster one. That means that the sed one should be faster in some cases, but the pure shell one would be a more consistent improvement.

Too bad we need the 'stashed' status, or we could cut the number of forks in half. On Cygwin, it might actually be better iterating up the directory tree for that status, looking for .git/refs/stash, since we don't care about anything other than whether there is a stash or not. But that change would be better served by a plugin, since it's a Cygwin-only hack.

EDIT: @igitur the other issue you linked to is exactly what caused me to write this. I also use babun and had switched to the theme I normally use from Linux.

FredDeschenes added a commit to FredDeschenes/dotfiles that referenced this issue Sep 13, 2017
@robbyrussell
Copy link
Member

@TheDauthi could you link to your PR for me?

@TheDauthi
Copy link
Author

#5531.

It also provides more data for someone to use if someone wanted (the count of each type of modification), and would be trivially easy to change the order. There's no extra work being done, it's just a side-effect of how it's being counted now.

I'll be happy to fix any problems found. It's been tested somewhat (linked repo), but that's no guarantee that I didn't miss anything.

@iegik
Copy link

iegik commented Dec 7, 2017

  1. On git status there are hooks lounched and checks for EOL /whitespaces and so on. In Windows there is some troubles with core.autocrlf. So, if You have large project, git will check each file state.... it will be a lot of resource usage.
  2. The main feature (light one) is detecting branch name. It can be done with parsing content of .git/HEAD. Easy.
  3. I think getting state of uncommited files and getting branch/pushed content history state should be seporate process.

Also checkout https://gist.github.com/sindresorhus/3898739.

@TheDauthi
Copy link
Author

  1. Agreed. But there is also significant overhead for the fork itself. You can see what the performance looks like for each portion by running the tests, as one set of tests actually cuts out the git status portion entirely and replaces them with fixture data. When running on real-world git repos, the git portion was faster than the other overhead for small repositories, but grew much slower as the size increased. The PR improves on that problem too, by eliminating all but one of the git status calls. git status itself does not seem to have any hooks, as it's not a modifying request. I could be wrong, in which case I'll go look at the source to see if there's any way of skipping the hook, like there is for committing.

  2. Agreed, but this change does not have that functionality. That is already in a different function. There is also one other reason to not check .git for data that I've never seen used in real life: it's possible to change where the .git directory goes (ie, it's possible to move it outside the repository root directory). I'm assuming that's functionality that's used for read-only/network/ancient filesystems. This is the reason I did not iterate upwards to check for a stash: I did not want to break existing behaviour, even if it's in an edge-case. I actually had to research to see if this was possible when I wrote the code, as my original attempt did exactly what you are suggesting.

  3. Problematic. First: this function is used in several other places (including themes not shipped with oh-my-zsh), which means we really can't change the exposed portions without breaking backwards-compatibility for other people. Second: all of the data is coming from the same git command; duplicating it would negate the goal and works against your first point. I could pass the data over to a separate function to parse the status line, and considered that, but... well, I'm lazy and didn't think it cleaned things up much more. I agree with that they should be separate, if for no other reason than to allow people to put them in different locations in their prompt.

  4. We should use --porcelain to ensure this functionality doesn't break in the future (non-porcelain text can be changed, though it rarely is). I'm aware I'm not using it for stashed - I might want to revisit that, it was just laziness on my part. I should test the other flags to see if it improves speed on large repos on Cygwin. I mostly didn't because I was keeping previous behaviour again, but if it increases speed without changing functionality, I'm all for it.

@iegik
Copy link

iegik commented Dec 12, 2017

Actually, my shell was freezing because of Russel's theme of zsh on function git_prompt_info (I guess it is same as git_prompt_status). I've to disable git plugin and also $PROMPT variable.

@TheDauthi
Copy link
Author

It's different code and does a different thing, but possibly the same underlying problem. We're a bit off topic here, but your point is valid and should be noted for anyone coming across the problem: for a large codebase (or a very slow filesystem), anything calling git status underneath may be slow. I've seen several attempt to fix that, from trying to kill the git status process after a timeout to caching the output. I don't think any of them have been successful yet. I think the current suggestion is to use git config --add oh-my-zsh.hide-status 1 in any slow git repository to make oh-my-zsh ignore building the status for that particular directory.

@CiderAndWhisky
Copy link

I have the same proplem here - Win10, Ubuntu Subsystem, Oh-My-Zsh:
time (git_prompt_status)
--> ( git_prompt_status; ) 0.05s user 6.73s system 253% cpu 2.674 total

This is on an i7 with an ssd, however the git repo resides in /mnt/d/..., so this is on an NTFS device, not the virtual extfs used by the bash.

Any ideas except hide-status?

@CiderAndWhisky
Copy link

My current work-around: Comment the line "prompt_git" at the very end of the agnoster theme (~/.oh-my-zsh/themes/agnoster.zsh-theme) - Snappy as hell, but no branch info any more. :-/

@Shea690901
Copy link

I've solved the problem quiet easy:

git config --global oh-my-zsh.hide-status 1

Et voilá: without any change to any theme git_prompt_status() terminates after checking the configuration...
It's even possible to switch the status on/off depending on the repo ;)

@9Lukas5
Copy link

9Lukas5 commented Mar 26, 2019

I wanted to ask if there's any possibility to have a fix here anytime, or this is just not possible on windows because of technical borders?

I have made a change to my agnoster file according to @iegik suggestion, that i have a git light.
This always shows a green background, but I atleast see the branch name if I'm in a git repo.
Basically, I commented out the most of the git function:

prompt_git() {
  (( $+commands[git] )) || return
  if [[ "$(git config --get oh-my-zsh.hide-status 2>/dev/null)" = 1 ]]; then
    return
  fi
#  local PL_BRANCH_CHAR
#  () {
#    local LC_ALL="" LC_CTYPE="en_US.UTF-8"
    PL_BRANCH_CHAR=$'\ue0a0'         # 
#  }
#  local ref dirty mode repo_path

  if $(git rev-parse --is-inside-work-tree >/dev/null 2>&1); then
    #repo_path=$(git rev-parse --git-dir 2>/dev/null)
#    dirty=$(parse_git_dirty)
    ref=$(git symbolic-ref HEAD 2> /dev/null) || ref="➦ $(git rev-parse --short HEAD 2> /dev/null)"
#    if [[ -n $dirty ]]; then
#      prompt_segment yellow black
#    else
      prompt_segment green $CURRENT_FG
#    fi

#    if [[ -e "${repo_path}/BISECT_LOG" ]]; then
#      mode=" <B>"
#    elif [[ -e "${repo_path}/MERGE_HEAD" ]]; then
#      mode=" >M<"
#    elif [[ -e "${repo_path}/rebase" || -e "${repo_path}/rebase-apply" || -e "${repo_path}/rebase-merge" || -e "${repo_path}/../.dotest" ]]; then
#      mode=" >R>"
#    fi

#    setopt promptsubst
#    autoload -Uz vcs_info

#    zstyle ':vcs_info:*' enable git
#    zstyle ':vcs_info:*' get-revision true
#    zstyle ':vcs_info:*' check-for-changes true
#    zstyle ':vcs_info:*' stagedstr '✚'
#    zstyle ':vcs_info:*' unstagedstr '●'
#    zstyle ':vcs_info:*' formats ' %u%c'
#    zstyle ':vcs_info:*' actionformats ' %u%c'
#    vcs_info
    echo -n "${ref/refs\/heads\//$PL_BRANCH_CHAR }${vcs_info_msg_0_%% }${mode}"
  fi
}

So all I left active would be this:

prompt_git() {
  (( $+commands[git] )) || return
  if [[ "$(git config --get oh-my-zsh.hide-status 2>/dev/null)" = 1 ]]; then
    return
  fi

  PL_BRANCH_CHAR=$'\ue0a0'         # 

  if $(git rev-parse --is-inside-work-tree >/dev/null 2>&1); then
    ref=$(git symbolic-ref HEAD 2> /dev/null) || ref="➦ $(git rev-parse --short HEAD 2> /dev/null)"
      prompt_segment green $CURRENT_FG
    echo -n "${ref/refs\/heads\//$PL_BRANCH_CHAR }${vcs_info_msg_0_%% }${mode}"
  fi
}

@mcornella mcornella added the Performance Issue or PR about performance label Apr 10, 2019
@mcornella mcornella moved this to Backlog in Main project Jan 2, 2022
@robbyrussell
Copy link
Member

Hey there, I'm doing a bit of housekeeping on the project. Thanks for raising this issue—we always love a good edge case! 😄

However, it's not gathering much community interest, so I will close it for now. Don't let this discourage you! We're all for hearing about the weird things. We can't commit to spending too much time tracking each weird thing down ourselves.

@robbyrussell robbyrussell closed this as not planned Won't fix, can't repro, duplicate, stale Oct 13, 2023
@github-project-automation github-project-automation bot moved this from Backlog to Done in Main project Oct 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: theme Issue or PR related to a theme Bug Something isn't working Performance Issue or PR about performance Platform: Windows Issue or PR for Windows
Projects
Archived in project
Development

No branches or pull requests

8 participants