Poor scalability for huge "flat" repositories #684

Closed
borman opened this issue Nov 29, 2014 · 4 comments
Labels
💊 bug Something isn't working

Comments

borman commented Nov 29, 2014

  • Gogs Version: master (0.5.8.1125 Beta)
  • Bug Description:
    1. Create a new repository
    2. Push from a gargantuan repository with ~10k top-level directories and ~100k commits
    3. Navigate to your newly-populated repository
    4. See gogs turn your cpu into a frying pan

It turns out that Gogs issues a subprocess call like git log -1 --pretty=format:%H 713f036503b783df70c81e7174387af214ddcbd1 -- subdir-703 for each top-level directory item, which means at least 10k subprocess invocations per request. The good news is that it neither brings your system down like a fork bomb nor DoSes the whole Gogs instance.
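For contrast with the per-directory calls described above, here is a minimal sketch of getting the same kind of information from a single `git log` pass. It is not what Gogs does. Python is used only for brevity, and the function names and the `\x01` commit marker are assumptions made for this illustration. It also assumes path names without unusual characters (git quotes those by default) and does not handle merge commits specially, so its results may differ from per-path `git log -1` on histories with merges.

```python
import subprocess

# Hypothetical marker: prefixes commit lines so they cannot be mistaken for paths.
COMMIT_MARK = "\x01"


def latest_commit_per_entry(log_text):
    """Map each top-level entry to the newest commit that touched it.

    `log_text` is the newest-first output of
    `git log --name-only --pretty=format:%x01%H`.
    """
    latest = {}
    current = None
    for line in log_text.split("\n"):
        if line.startswith(COMMIT_MARK):
            # A new commit begins; remember its hash.
            current = line[len(COMMIT_MARK):]
        elif line and current is not None:
            # A changed path; keep only its first (top-level) component.
            top = line.split("/", 1)[0]
            # The log is newest-first, so the first commit seen for an entry wins.
            latest.setdefault(top, current)
    return latest


def scan_repository(repo_dir, rev="HEAD"):
    """Run one `git log` for the whole tree instead of one per directory."""
    out = subprocess.run(
        ["git", "-C", repo_dir, "log", "--name-only",
         "--pretty=format:%x01%H", rev],
        check=True, capture_output=True, text=True,
    ).stdout
    return latest_commit_per_entry(out)
```

This trades roughly one subprocess per directory for one subprocess per request, but it still walks the history, so a large repository remains expensive without caching.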

  • How to reproduce:
    The following script may be used to create a model repository:
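#!/bin/bash
# Builds a model repository: $ndirs top-level directories, each modified once
# per round, for ndirs * nrounds commits in total.

ndirs=1000
nrounds=13

git init .

# Create the top-level directories.
for i in $(seq 1 "$ndirs"); do
    mkdir "subdir-$i"
done

# Each round rewrites one file in every directory and commits it separately.
for round in $(seq 1 "$nrounds"); do
    for i in $(seq 1 "$ndirs"); do
        head -c 50 /dev/urandom | base64 > "subdir-$i/data"
        git add "subdir-$i/data"
        git commit -m "modify $i at round $round"
    done
done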

On my laptop, it was enough to experience unacceptable request processing time:

[Macaron] Started GET /borman/bigrepo for 127.0.0.1
[Macaron] Completed /borman/bigrepo 200 OK in 45.588725781s
[Macaron] Started GET /borman/bigrepo for 127.0.0.1
[Macaron] Completed /borman/bigrepo 200 OK in 46.089607608s
[Macaron] Started GET /css/github.min.css for 127.0.0.1
unknwon added the 💊 bug label Nov 30, 2014
unknwon (Member) commented Nov 30, 2014

Thanks for your feedback!

But I think it's too early to make Gogs handle huge repos perfectly. That involves caching techniques, which Gogs doesn't have yet but which are planned.

If you have any ideas on this, please post them as a reply so we can discuss.

borman (Author) commented Nov 30, 2014

It seems GitLab has the same issue; it does nothing special there:
https://gitlab.com/gitlab-org/gitlab-ce/blob/master/app/controllers/projects/refs_controller.rb#L48
https://gitlab.com/gitlab-org/gitlab-ce/blob/master/app/models/repository.rb#L237

I also wanted to check Atlassian Stash source, but they only provide it to paid customers.

How to solve the problem:

  1. Are there any significant reasons (apart from simplicity) why git2go is not used as a git backend? This is actually not a solution, just a way to reduce git overhead. You could reduce I/O complexity from Nfiles * SCAN to a single SCAN, where SCAN is the cost of iterating through all commits. CPU complexity stays the same, O(Nfiles * Ncommits): for each commit, you compare its tree with its parents' trees and mark the entries that differ as updated, which costs O(Nfiles * Nparents), and you can assume Nparents is 1.
    I'd like to try porting Gogs to git2go, since it looks like a nice entry-level project. As for the benefits, it would in the future allow using git object store backends other than plain git packfiles, thanks to libgit2's extensible architecture.
  2. As for caching, you need to cache (repository, path, commit_sha) -> last_modified_commit_sha mappings. Most of the time, you don't need any commit_sha other than master's. Such caching might also be needed for other cross-commit queries.
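The caching idea in point 2 can be sketched as a small in-memory map. This is only an illustration under stated assumptions: the class name and the `resolve` callback are hypothetical, and Python is used for brevity rather than Go.

```python
class LastModifiedCache:
    """In-memory cache of (repository, path, commit_sha) -> last_modified_commit_sha."""

    def __init__(self, resolve):
        # `resolve(repository, path, commit_sha)` is a hypothetical callback that
        # performs the expensive lookup (for example, a `git log -1` call).
        self._resolve = resolve
        self._entries = {}

    def get(self, repository, path, commit_sha):
        key = (repository, path, commit_sha)
        if key not in self._entries:
            # Cache miss: do the expensive lookup once and remember the answer.
            self._entries[key] = self._resolve(repository, path, commit_sha)
        return self._entries[key]
```

Because a commit SHA names immutable history, an entry keyed this way does not go stale, so the cache needs eviction for memory but no invalidation. Keying by branch name instead would require invalidating entries on every push.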

unknwon (Member) commented Nov 30, 2014

> Are there any significant reasons (apart from simplicity) why git2go is not used as a git backend?

I've answered this a thousand times... I guess I should put it in the FAQ. The Go wrapper isn't as good as it looks, and it uses cgo, which we are not comfortable with.

unknwon (Member) commented Dec 9, 2015

Please follow up in #1505.

unknwon closed this as completed Dec 9, 2015
ethantkoenig pushed a commit to ethantkoenig/gogs that referenced this issue Jan 27, 2017
github-actions bot locked as resolved and limited conversation to collaborators May 9, 2022