Slow repository browsing response times #1518
Comments
Thanks for your detailed feedback!
In my opinion, the question is not so much what causes this: Gogs simply does not have a cache for repository Git data yet (it reloads everything every time you visit a page). The speed should improve in a future release. Hope my explanation helps you understand. 😄 |
Ah OK, so this is normal? |
At least it is expected for me when:
|
Got it :-) It's really a pretty small repository, it's just spread over a lot of files. |
Yeah, so parsing the file header info takes time; a cache (when implemented) would definitely help! |
Hmm, I wonder -- would it help much if the calls to The calls to |
Hmm, turns out This is related to #684. |
this is currently a show-stopper for something like this repo: https://github.com/gentoo/gentoo |
Performance is enhanced. |
This is still a thing: https://try.gogs.io/Rukenshia/loonix It is impossible to browse this repository properly. Loading the "Documentation" folder takes over 45 seconds for me. I don't know how Gogs handles viewing the tree, but one idea would be to load the file info (last commit on that file/directory) lazily, after the listing itself. I think GitLab does this too. @unknwon can you tell me whether Gogs loads the complete tree of the repository when viewing the repo? |
Yeah, I too think that caching is not the only answer. The answer is to just show a loading icon for the git commit info, gather the information in the background, and show it when it's available, while still allowing other operations that don't need that information. |
Gogs finds only the information necessary for the immediate files in the folder being displayed, not the whole tree. But internally, git itself goes through its entire history looking for each child file/folder to get the most recent commit. Git's data structures are really not made for that kind of query. |
I don't think you need the commit hash, commit message and modification date just to browse the repository, do you (and these are what take so long)? So this information can be loaded while the tree itself is already shown. |
Sorry, I wasn't clear. I meant as opposed to loading the information for the whole tree, as Rukenshia was wondering. I agree that if the information cannot be obtained instantly, preventing the UI from blocking by fetching it in the background is a good idea. |
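To make the background-loading idea from the last few comments concrete, here is a minimal Go sketch. It is not how Gogs works: `Entry`, `CommitLookup` and `LazyListing` are hypothetical names invented for illustration, and the slow per-entry git query is stood in for by a plain function. The listing is usable immediately with empty commit fields, and the fields are filled in as the lookups finish.

```go
package main

import "sync"

// Entry is a hypothetical row of the file listing. Commit stays empty
// until the background lookup for that row has finished.
type Entry struct {
	Name   string
	Commit string
}

// CommitLookup stands in for the slow "last commit for this path" query.
type CommitLookup func(name string) string

// LazyListing holds the rows and fills in commit info in the background.
type LazyListing struct {
	mu      sync.Mutex
	entries []Entry
	done    chan struct{}
}

// NewLazyListing returns at once; the lookups run in a goroutine.
func NewLazyListing(names []string, lookup CommitLookup) *LazyListing {
	l := &LazyListing{done: make(chan struct{})}
	for _, n := range names {
		l.entries = append(l.entries, Entry{Name: n})
	}
	go func() {
		defer close(l.done)
		for i, n := range names {
			c := lookup(n) // slow call, made without holding the lock
			l.mu.Lock()
			l.entries[i].Commit = c
			l.mu.Unlock()
		}
	}()
	return l
}

// Snapshot returns a copy of the rows as they are right now.
func (l *LazyListing) Snapshot() []Entry {
	l.mu.Lock()
	defer l.mu.Unlock()
	out := make([]Entry, len(l.entries))
	copy(out, l.entries)
	return out
}

// Wait blocks until every lookup has finished.
func (l *LazyListing) Wait() { <-l.done }
```

In a web UI the page would render `Snapshot()` right away and fetch the commit columns with a follow-up request, which is roughly what the "loading icon" suggestion amounts to.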
I might use https://github.com/src-d/go-git to test and see if we can get any performance gain. |
I tried at one point to implement the same functionality using the git C API directly (from C code), taking in the whole list of files as input instead of a single one at a time. No matter what I tried, it was still slower than invoking the git process one file at a time, which of course is already too slow. Fundamentally, whatever client is used, git's data structures don't allow this sort of query to be done quickly. |
@cameron314 One git process execution takes under 100ms on my dev machine, and I could speed things up by running an unlimited number of processes at the same time, but this causes many problems on machines with little memory, so right now it is hard-limited to 10 at most. Therefore, I think a cache layer is the ultimate solution, and caching ahead of browsing is another way to improve the viewing experience. |
I have the same problem with a big repo. I uploaded it to try.gogs.io, so you can see the performance in a known machine/config. The same repo works pretty well with cgit (with the cache enabled). Probably the slowest directory in the repo is this: https://try.gogs.io/juanfra684/openbsd-ports/src/master/devel (Gogs Version: 0.9.14.0321 Page: 118641ms Template: 214ms) |
An alternative strategy for very large directories, instead of calling git log for every file, would be to walk the git commit history and check for each commit whether or not a file in the directory was modified in this commit. Using the git command, this would be something like:
This should be much more efficient than calling Furthermore, using a git library instead of making calls to the git command could make this solution reasonably efficient. Another advantage of this approach is that it allows trading off running time for accuracy: there could be a limit on the number of commits to visit. After that, the files for which the date of last modification could not be determined could simply render as "more than x months ago", where As mentioned above, the git data model was not designed for this kind of query, so it seems that eventually some caching will be necessary for both efficient and exact answers for large directories/commit history. |
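A rough Go sketch of the single-walk strategy described above, under stated assumptions: it shells out to one `git log --name-only` for the whole directory instead of one `git log` per file, and the `commit:` marker, `gitLog` and `lastCommits` are names made up for this example (the exact command from the comment is not preserved here). The parser records, for each immediate child of the directory, the first commit that mentions it; since the log is newest-first, that is the most recent one.

```go
package main

import (
	"os/exec"
	"strconv"
	"strings"
)

// commitMarker prefixes each commit line in the log output. It is an
// assumption of this sketch; a path starting with the same text would
// confuse the parser.
const commitMarker = "commit:"

// gitLog runs a single `git log` limited to dir, newest commit first,
// listing the paths each commit changed. maxCommits caps the walk.
func gitLog(repo, dir string, maxCommits int) (string, error) {
	args := []string{"-C", repo, "log", "--name-only",
		"--format=" + commitMarker + "%H", "-n", strconv.Itoa(maxCommits)}
	if dir != "" {
		args = append(args, "--", dir)
	}
	out, err := exec.Command("git", args...).Output()
	return string(out), err
}

// lastCommits maps each immediate child of dir to the newest commit that
// touched it. If want > 0, parsing stops once want children are resolved.
func lastCommits(logOutput, dir string, want int) map[string]string {
	prefix := strings.Trim(dir, "/")
	if prefix != "" {
		prefix += "/"
	}
	result := make(map[string]string)
	current := ""
	for _, line := range strings.Split(logOutput, "\n") {
		line = strings.TrimSpace(line)
		switch {
		case line == "":
			continue
		case strings.HasPrefix(line, commitMarker):
			current = strings.TrimPrefix(line, commitMarker)
			continue
		case !strings.HasPrefix(line, prefix):
			continue // path outside the directory being displayed
		}
		child := strings.TrimPrefix(line, prefix)
		if i := strings.Index(child, "/"); i >= 0 {
			child = child[:i] // nested path: credit the top-level child
		}
		if _, seen := result[child]; !seen {
			result[child] = current
			if want > 0 && len(result) == want {
				break
			}
		}
	}
	return result
}
```

Children that are still missing from the map after a capped walk are the ones that could be rendered as "more than x months ago". Note that git may quote unusual file names in this output, which this sketch does not handle.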
Ok so I did a very preliminary experiment using libgit2 and the approach described above. The code is very ugly and not fully functional, but it already gives an idea of the gain which could be obtained. Testing it on the root directory of the Gentoo repository I am able to cut the running time by a factor of two when using the above approach rather than calling
The gain is not mind-blowing, and it is a design choice to decide to pay in terms of code complexity for it. Ultimately, caching is probably the way to go, and the more I think about it the more I think it wouldn't be too hard to implement. |
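One way such a cache might look, as a sketch only: because a commit ID names immutable content, the result for a (commit, directory) pair never changes, so entries need eviction but never invalidation. `LastCommitCache` and its methods are hypothetical names, and the sketch has no size bound, which a real implementation would need.

```go
package main

import "sync"

// cacheKey identifies one directory listing at one exact commit.
type cacheKey struct {
	commit string // ID of the commit being browsed (e.g. the branch tip)
	dir    string // directory path inside the repository
}

// LastCommitCache memoizes "child name -> last commit" maps.
type LastCommitCache struct {
	mu   sync.Mutex
	data map[cacheKey]map[string]string
}

// NewLastCommitCache returns an empty cache.
func NewLastCommitCache() *LastCommitCache {
	return &LastCommitCache{data: make(map[cacheKey]map[string]string)}
}

// Get returns the cached listing, calling compute only on a miss.
func (c *LastCommitCache) Get(commit, dir string, compute func() map[string]string) map[string]string {
	k := cacheKey{commit, dir}
	c.mu.Lock()
	v, ok := c.data[k]
	c.mu.Unlock()
	if ok {
		return v
	}
	v = compute() // slow path, run without holding the lock
	c.mu.Lock()
	c.data[k] = v
	c.mu.Unlock()
	return v
}
```

A push moves the branch to a new commit ID, so the next page view misses the cache and recomputes; two concurrent misses may both compute, which is wasteful but harmless here.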
A bit late, but I found the source code of my test, if you want to compare implementations. Here you go: https://gist.github.com/cameron314/c9d55a82cc91e45496ab0c38a31e69cb |
Maybe taking a look into how GitLab implements the repository browsing might be worth a try. I just gave it a shot on a Virtual Machine here on my tower PC. GitLab displays the "Files" tab of the Git repository in very little time (~1s) while Gogs needs ~23s to display the (semantically equivalent) "Code" tab. :( |
I don't know about GitLab, but I think GitHub takes the approach of displaying the file list directly and retrieving the hashes afterwards from an API. This would also let us split the problem and focus on optimizing (maybe by caching) the hash retrieval separately. |
Related? #3022 |
@tycho kind of, but not the exact same problem. |
I think it is related in that if we go for a caching solution for this issue, we might as well cache the commit count at the same time and fix both problems at the same time. |
Loading Django repository (~23k commits) with gogs takes about ~5 seconds. |
What do you think about this? It could be a nice feature for large directories like https://github.com/DefinitelyTyped/DefinitelyTyped |
I have made a mirror of |
Is someone working on this? This issue is tagged with "dont send pull request" so I'm not going to send one, but through simple caching I was able to speed up a repository's home (overview) page by 300% on subsequent requests (the first one still takes its time). It's kind of a pain to work with a repository which takes three seconds to load :/ |
My biggest problem with this issue is that in the underlying gogits/git-module, there's an artificial .2s delay for every non-commit object. By commenting that |
I should mention that on my server, browsing a Linux kernel repo clone is slow but possible, while loading tags (the release page) takes a very long time. |
My problem is that if I have plenty of files in a folder (1500, to be precise), it takes a long time to process: almost 13 minutes to open the folder on Gogs :( |
I transferred to gitea, as this issue is still unresolved here...
|
…ork defaultly (#1518) * fix go get sub package and add domain on installation to let go get work defaultly * fix import sequence * fix .git problem
Do we have any progress on this? |
Configuring Gogs, browsing the commit log, searching, pushing/pulling, etc. is all fairly snappy, but browsing the files of a repository is (comparatively) very, very slow.
For example, I have a repository with 30 entries in the root, and 147 folders in a nested folder. The total page time (as seen at the bottom of the page) is ~700ms for the root, and ~4000ms for the larger nested folder. I realize that's still only 23-27ms per item, but the sum total lag is significant, and makes it difficult to browse between folders. The template rendering time itself is negligible (~10ms).
Below are the top results of an ``strace -cfp `pidof gogs` `` for a refresh of the 147-item folder. Note this only shows the time spent in syscalls, which is only about a quarter of the total time (it takes twice as long when `strace` is running):
Gogs version: Gogs version 0.6.3.0802 Beta
Git version: git version 2.5.0
System: Fresh install of CentOS 7 in a single-core VM on a Windows 8 host (Hyper-V). No anti-virus.
Any idea of what might be causing this?