massive performance regression with large git trees #310

Closed
robclark opened this issue Jul 10, 2014 · 16 comments
Labels
component:revision-graph Revision graph type:performance Performance issue

@robclark

Not sure if you ever test tig with the Linux kernel git tree? You should. Somehow since tig 2.0 performance has massively regressed, to the point where it is barely usable. (fwiw, I have 2.0.2 currently)

It doesn't seem to be a problem on smaller git trees.

@jonas
Owner

jonas commented Jul 14, 2014

True. I don't test large git trees, unfortunately. Will try to investigate
when I get some time.

One optimization was added to reduce the log data read from git by using a
custom pretty format. Maybe that is to blame.

Else could you try to disable the revision graph if it is not already. 2.0
contains a lot of improvements that unfortunately also require additional
computations.

@robclark
Author

I do tend to have a lot of remotes and a lot of branches, which I'm sure doesn't help. But from a quick test, it seems to take ~5s to see anything in the main view (after 'Unstaged Commits'), both with and without the graph. And in either case it will sit there (if not interrupted) for several minutes at least, loading older commits. Not sure if there is some way I can configure it to limit the amount of history that it tries to load?

At any rate, other than taking a lot of disk space and network bandwidth for a fresh kernel tree clone, it would be a good thing to keep around for a good real-world stress test. So far I've not found any other git tree that I have this problem with. But the kernel tree is one I spend a lot of time with ;-)

If it is somehow due to new features in tig 2.0, it would be pretty nice if there was a way to configure (per git tree) to somehow fall back to 1.x behaviour.

@jonas
Owner

jonas commented Jul 29, 2014

Oh, another thing to try is to:

set show-changes = no

in your ~/.tigrc to avoid that Tig does a git-update-index before rendering the main view. This is unfortunately done synchronously before loading the rest of the view.

@jonas
Owner

jonas commented Jul 29, 2014

And yes, on my work laptop I used to have the kernel tree lying around for testing. I will make a new clone and test this myself.

@robclark
Author

seems to help a bit.. I guess I should build tig1 so I have a way to compare back to back?

I don't suppose you happen to have a .tigrc that is equiv to tig v1. behaviour? Fwiw, what I am using at the moment:

 set line-graphics = utf-8
 set commit-order = topo         # Order commits topologically
 set git-colors = no             # Do not read Git's color settings.
 set show-changes = no

 # Wrap branch names with () and tags with <>
 set reference-format = (branch) <tag> {remote}

 set main-view = \
    id,width=8 \
    date:short \
    author:abbreviated,width=10 \
    commit-title:width=40,graph=true,refs=true

 #### OLD:
  #set author-width = 5
 #set show-date = short           # Show relative commit date.
 #set show-rev-graph = yes        # Show revision graph?
 #set show-refs = yes             # Show references?
 #set commit-order = topo         # Order commits topologically
 #set read-git-colors = no        # Do not read Git's color settings.
 #set show-line-numbers = no      # Show line numbers?
 #set line-number-interval = 5    # Interval between line numbers
 #set horizontal-scroll = 33%     # Scroll 33% of the view width
 #set blame-options = -C -C    # Blame lines from other files
 #set show-refs = yes
 #set show-id = yes

@jonas
Owner

jonas commented Jul 30, 2014

No, there's only a file for reversing the bindings to v1.

To run v1 when v2 is installed use:

$ TIGRC_SYSTEM= tig-1.2

@jonas
Owner

jonas commented Aug 16, 2014

So I got around to testing this. Assuming:

  • Linux git repo based on v3.16-11452-g88ec63d (with 468098 commits) nicely warmed up.
  • Launching Tig using this command: TIGRC_SYSTEM= TIGRC_USER= tig.
  • Best of three runs, taking the "loading Xs" time reported by Tig (the number shown before it disappears from the view title bar).
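
| Version   | No graph, no changes | No graph, with changes | With graph and changes |
|-----------|----------------------|------------------------|------------------------|
| tig-1.2   | 42 seconds           | 30 seconds             | 40 seconds             |
| tig-1.2.1 | 20 seconds           | 12 seconds             | 31 seconds             |
| tig-2.0.2 | 12 seconds           | 10 seconds             | Infinity*              |
| master    | 12 seconds           | 9 seconds              | Infinity**             |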

* (60+ seconds to load less than 5% of commits)
** (60+ seconds to load less than 10% of commits)

See below for the settings added in .git/config to disable/enable graph and show-changes:

Observations

  • The show-changes actually speeds things up, which is kind of surprising to me.
  • tig-1.2.1 was the first version to not render the revision graph when it was disabled, which clearly speeds things up.
  • tig-2.0 optionally uses a custom git log --pretty=format to further reduce startup time. This clearly has an effect in the no graph, no changes case but doesn't seem to have much effect when show-changes is enabled.
  • The graph rendering algorithm in tig-2.0 is sloooooooooooooooooow.

Run 1 settings: No graph, no changes

[tig] show-rev-graph = false
[tig] main-view = line-number:no,interval=5 id:no date:default author:full commit-title:yes,refs,overflow=no
[tig] show-changes = false

Run 2 settings: No graph, with show-changes

[tig] show-rev-graph = false
[tig] main-view = line-number:no,interval=5 id:no date:default author:full commit-title:yes,refs,overflow=no

Run 3 settings: With graph and show-changes

# No config
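
The numbers above are Tig's own loading times. To separate git's startup latency from Tig's rendering cost, one could time how long `git log` takes to produce its first line with and without `--topo-order`. The following is only a rough sketch of such a measurement, not part of Tig; the helper name `time_to_first_line` is made up for illustration.

```python
import subprocess
import time


def time_to_first_line(cmd, cwd=None):
    """Run cmd and return (seconds until first output line, total seconds).

    The first value is None if the command printed nothing.
    """
    start = time.perf_counter()
    proc = subprocess.Popen(cmd, cwd=cwd, stdout=subprocess.PIPE,
                            stderr=subprocess.DEVNULL)
    first = None
    for _ in proc.stdout:
        # Record the moment the first line arrives, then keep draining
        # the pipe so the child can run to completion.
        if first is None:
            first = time.perf_counter() - start
    proc.wait()
    total = time.perf_counter() - start
    return first, total


# Hypothetical usage inside a large repository:
#   time_to_first_line(["git", "log", "--pretty=oneline"])
#   time_to_first_line(["git", "log", "--pretty=oneline", "--topo-order"])
```

Comparing the first value of each call shows how long git waits before emitting anything; comparing the totals shows the cost of the full traversal.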

@robclark
Author

ahh, thanks for following up on this.. sorry, I'd been kinda busy and hadn't gotten to building tig 1.x yet.

I don't suppose there is any way to see other-heads without graph enabled? I guess looking at the other heads may be expensive (especially when there are a large number of local and fetched remote branches), but it is nice to be able to see things like 'local branch foo is at a particular commit, while remote/foo is several commits behind', and that sort of thing which normally comes along with the graph.

I guess I'll disable graph for now for my kernel trees.. although some sort of lightweight graph w/ algo more similar to tig 1.x might be kinda useful..

one note: I usually use 'z' to stop loading commits once I have something on screen.. although maybe some sort of optional time-based threshold to stop loading could be useful?

@Gnurou

Gnurou commented Oct 6, 2014

I am feeling the pain on this issue as well. Tig 1.x used to show the history immediately on a Linux kernel tree and work in the background (like Rob, I usually pressed 'z' immediately to stop loading commits). With Tig 2.x I have to wait at least 5 seconds before something useful appears on the screen.

I am not really aware of the underlying changes that may have caused this, or the reasons behind them, but this certainly made the hugely useful tig less usable for Linux developers.

@jonas
Owner

jonas commented Oct 6, 2014

The reason is that graph rendering now automatically enables the --topo-order flag, which causes Git to buffer commits until it knows whether it has to reorder certain commits. I added the --topo-order flag to fix issue #238 (and Debian bug #757692), which both reported problems with a corrupted graph due to parent commits having a timestamp newer than that of their children. Since git log --graph and, I believe, gitk both force --topo-order, I opted for doing the same in tig.
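
To illustrate the trade-off, here is a simplified sketch in Python. It is not Git's or Tig's actual code, and the toy history and function names are invented for the example. The date-ordered walk can emit a commit as soon as it is popped, but with a skewed timestamp it shows a parent before one of its children. The topological walk avoids that, at the price of visiting the whole history before emitting anything.

```python
import heapq
from collections import deque


def date_order(parents, dates, heads):
    """Stream commits newest-first; output can start immediately."""
    heap = [(-dates[c], c) for c in dict.fromkeys(heads)]
    heapq.heapify(heap)
    seen = set(c for _, c in heap)
    order = []
    while heap:
        _, commit = heapq.heappop(heap)
        order.append(commit)  # emitted right away, no buffering
        for parent in parents.get(commit, []):
            if parent not in seen:
                seen.add(parent)
                heapq.heappush(heap, (-dates[parent], parent))
    return order


def topo_order(parents, heads):
    """Emit every commit before any of its parents."""
    heads = list(dict.fromkeys(heads))
    # Pass 1: walk the entire history to count children per commit.
    # Nothing can be emitted until this pass is done.
    child_count = {}
    seen = set()
    stack = list(heads)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        child_count.setdefault(commit, 0)
        for parent in parents.get(commit, []):
            child_count[parent] = child_count.get(parent, 0) + 1
            stack.append(parent)
    # Pass 2: emit commits whose children have all been emitted.
    ready = deque(c for c in heads if child_count[c] == 0)
    order = []
    while ready:
        commit = ready.popleft()
        order.append(commit)
        for parent in parents.get(commit, []):
            child_count[parent] -= 1
            if child_count[parent] == 0:
                ready.append(parent)
    return order


# Toy history: D merges B and C, which both descend from A.
# A carries a skewed timestamp that is newer than its children.
PARENTS = {"D": ["B", "C"], "B": ["A"], "C": ["A"], "A": []}
DATES = {"D": 30, "C": 20, "B": 10, "A": 50}
```

With this toy data, `date_order(PARENTS, DATES, ["D"])` places A ahead of its child B, which is the kind of ordering that breaks a graph renderer, while `topo_order(PARENTS, ["D"])` always places A last.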

So this problem can be worked around by disabling graph rendering and you can further skip startup costs by disabling display of dirty tree state in the main view. This can be done by adding the following lines to .git/config in the linux kernel repo:

[tig] main-view = line-number:no,interval=5 id:no date:default author:full,width=20 commit-title:yes,refs,overflow=no
[tig] show-changes = false

How to resolve this I am not sure. On one hand I'd prefer that tig does "the right thing" by default, which is to automatically assume --topo-order when graph rendering is enabled. This obviously puts the burden on you guys to add additional configuration, which unfortunately might lead some users to view tig as broken and slow.

As I mentioned somewhere else, if I remember correctly gitk loads an initial page of commits using the default (fast) order and then restarts git log with the --topo-order flag to ensure a consistent graph. Having tig do this would solve both problems, but I am not sure yet how difficult it is to do. Short term, we could either reintegrate the old graph renderer, which doesn't care about non-monotonic timestamps in the commit graph, or add an option to disable the automatic --topo-order so that (together with #318) something like set main-view-graph = default-order would show the graph using the default order.
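
The two-phase idea could look roughly like the sketch below. This is only an illustration of the approach described above, not how gitk or tig implement it; the function names, the page size of 200 and the chosen `--pretty` format are assumptions. The `run` and `show` callables stand in for process spawning and view rendering.

```python
def log_commands(page_size=200, extra_args=()):
    """Build the two git-log invocations for a two-phase load."""
    base = ["git", "log", "--no-color", "--pretty=raw", "--parents"]
    # Phase 1: a small page in git's default order, available quickly.
    first = base + ["--max-count=%d" % page_size] + list(extra_args)
    # Phase 2: the full history in topological order, which git can
    # only start printing after it has walked the history.
    full = base + ["--topo-order"] + list(extra_args)
    return first, full


def two_phase_load(run, show, page_size=200):
    """Show a quick first page, then replace it with the full load.

    run(cmd) must return an iterable of output lines for cmd;
    show(lines) must replace whatever the view currently displays.
    """
    first, full = log_commands(page_size)
    show(list(run(first)))
    show(list(run(full)))
```

A real implementation would run the second command in the background and swap the view contents once it starts producing output, instead of blocking as this sketch does.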

@Gnurou

Gnurou commented Oct 6, 2014

Oh yes, thanks. Now it's super-fast again. Graph rendering is nice, but for our use-case speed is probably more important. Is there an option that would make tig not specify --topo-order when using graph rendering? That would definitely do the trick for us.

jonas added a commit that referenced this issue Oct 7, 2014
This permits to avoid any initial pause during startup due to git-log
buffering commits for reordering.

References #310
@jonas jonas modified the milestone: tig-2.1 Nov 9, 2014
dcfranca pushed a commit to jerojasro-booking/tig that referenced this issue Nov 27, 2014
jonas added a commit that referenced this issue Feb 6, 2015
@adam-singer

+1

@jonas jonas closed this as completed in 3ae9688 Mar 11, 2015
@jonas
Owner

jonas commented Mar 11, 2015

Please see https://github.com/jonas/tig/blob/master/contrib/large-repo.tigrc for a list of options to speed up Tig in large repos.

@adam-singer

@jonas works great for loading a large repo, works for me. I'm still having problems with the status window not reloading changes with set refresh-mode = auto or set refresh-mode = periodic; I tried a range of values for set refresh-interval (e.g. 15) without any success. time git status returns:

real    0m6.966s
user    0m1.840s
sys     0m2.462s

Doing a shift+r will reload within 5-10 seconds, so I'm not sure what is timing out. Tried looking into it a while back but ran out of time :) Is there a major difference between the code paths of shift+r vs refresh-mode?

@jonas
Owner

jonas commented Mar 12, 2015

@financecoding Maybe this should go into a new issue. However, refresh-mode and shift+r use the same code paths, in that refresh-mode basically sends a refresh request to the status view, same as the keybinding.

@jonas
Owner

jonas commented Mar 12, 2015

Refreshing is broken though ...
