Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Concurrency Issues #773

Open
7 tasks
pjbgf opened this issue May 20, 2023 · 5 comments
Open
7 tasks

Concurrency Issues #773

pjbgf opened this issue May 20, 2023 · 5 comments
Assignees
Labels
bug Something isn't working no-autoclose Issues/PRs to be ignored by stale bot
Milestone

Comments

@pjbgf
Copy link
Member

pjbgf commented May 20, 2023

go-git is not thread-safe. In some cases, concurrent reads may lead to repositories becoming corrupted or actions to yield invalid results (e.g. not finding an existing ref).

There is quite a bit of work until we can answer: What operations are concurrency safe and I can call from different Goroutines?

I created a branch to start amassing fixes around concurrency issues. As this has a performance cost, I won't propose merging it into master just yet. Folks insterested on using it can consume the changes via:

go get github.com/go-git/go-git/v5@read-concurrency

Current status
  • Support read concurrency
    • Map concurrency issues
      • repository.CommitObjects().Foreach()
      • repository.Tags().Foreach()
    • Create concurrency tests to avoid regression
  • Off-set performance costs for using mutex (or add an opt-in/out flag)
  • Worktree.Add()

Please share in the comments any specific operations you may having issues to use concurrently.

Relates to: #186 #48 #175

@pjbgf pjbgf added the bug Something isn't working label May 20, 2023
@pjbgf pjbgf self-assigned this May 20, 2023
@pjbgf
Copy link
Member Author

pjbgf commented May 21, 2023

The loading of object / pack maps (f3311ea) and the index (377731d) seem to be big culprits of read concurrency. After the changes, I could no longer observe "object not found" when going through all commits and tags concurrently.

@github-actions
Copy link

To help us keep things tidy and focus on the active tasks, we've introduced a stale bot to spot
issues/PRs that haven't had any activity in a while.

This particular issue hasn't had any updates or activity in the past 90 days, so it's been labeled
as 'stale'. If it remains inactive for the next 30 days, it'll be automatically closed.

We understand everyone's busy, but if this issue is still important to you, please feel free to add
a comment or make an update to keep it active.

Thanks for your understanding and cooperation!

@github-actions github-actions bot added the stale Issues/PRs that are marked for closure due to inactivity label Sep 28, 2023
@pjbgf pjbgf added no-autoclose Issues/PRs to be ignored by stale bot and removed stale Issues/PRs that are marked for closure due to inactivity labels Sep 28, 2023
@mpldr
Copy link

mpldr commented Nov 4, 2023

Thanks for the info, at least for my usecase (showing various commits from the Kernel tree), the impact was within what I'd consider to be noise. This could obviously be different for more complex operations, but I do not have the insight into the codebase to properly benchmark this.

@pjbgf pjbgf added this to the v6.0.0 milestone Nov 14, 2023
@d4x1
Copy link

d4x1 commented Feb 6, 2024

I want to collect commits details with go-git/v5, but it spent too much time, so I want to collect commits concurrently.

@pjbgf Are you still working on it? Let me know if I can do some help.

@pjbgf
Copy link
Member Author

pjbgf commented Feb 7, 2024

@d4x1 happy for you to take over, if you can, as I have been quite busy of late. Here are three commits (not merged) that I worked on:

f3311ea
377731d
0a7b552

The main blocker on merging any of those changes are performance regression. So we need to benchmark and call out the impact. If the impact is too great, we may need to add a way for users to opt-out/in.

data-douser added a commit to has-ghas/no-phi-ai that referenced this issue Feb 29, 2024
Sets up concurrent processing of git commits.

Updates the version of go-git library to a specific branch that
supports concurrent reads of the git filesystem.
[go-git/go-git#773
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working no-autoclose Issues/PRs to be ignored by stale bot
Projects
None yet
Development

No branches or pull requests

3 participants