Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GC mwindows on too many open files #2758

Closed
carlosmn opened this issue Dec 11, 2014 · 9 comments · Fixed by #5396
Closed

GC mwindows on too many open files #2758

carlosmn opened this issue Dec 11, 2014 · 9 comments · Fixed by #5396

Comments

@carlosmn
Copy link
Member

We currently attempt to free unused map windows for packfiles when we go over a threshold. We however do not account for the number of open packfiles.

There's been at least two distinct reports of people running into issues when they have repositories which do network operations in small batches. These accumulate a lot of tiny packfiles and we end up having them all open at once, which can reach the system's limit for open file descriptors.

We should have a configurable limit on how many open file descriptors we want to have for packfiles and do a soft-GC similar to what we do for memory (address space).

@alexcrichton
Copy link
Contributor

I believe we're hitting this in Cargo as well. We're using an index on GitHub which has lots of little files and lots of updates over time. Sometimes when we do a git update of the repository it causes a huge number of file descriptors to get opened, often reaching close to the system limit or blowing the system limit.

In tracking this down I found that this program is all that's needed to trigger the behavior, namely starting a revwalk that looks at the repository. I believe this is internally done during the fetch operation for smart transports.

Notably, though, I ran git gc in the repository and the problem goes away entirely. Both print statements in that program print 4, so no extra files are being created.

Out of curiosity, is there something we should be doing in Cargo to mitigate this issue? I'm gonna go run git gc manually in a few repos for now, but it'd be great if we could have Cargo do this for you!

@tommoor
Copy link

tommoor commented Sep 21, 2017

We have the same issue here and it's causing a lot of problems. However we are unable to depend on git being available locally (hence using libgit2) to perform gc. Are there other suggested workarounds?

@josharian
Copy link
Contributor

I just this this as well (I believe), and I have the same issue as @tommoor; I can't necessarily rely on git being available locally to run git repack. And unfortunately I can't alter my ulimit. Are there other workarounds available?

@tommoor
Copy link

tommoor commented Feb 5, 2020

We had to abandon libgit2 unfortunately. We've rewritten in go-git – I'm sure that's not the answer you were hoping for 😞

@josharian
Copy link
Contributor

@tommoor ouch. Thanks for the update. go-git has a repo.RepackObjects method, but I'm not quite ready to rewrite everything in go-git, and it seems silly to use two git libraries. :)

cc @ethomson @pks-t who seem to be the most active libgit2 folks at the moment, in case they have suggestions

@tommoor
Copy link

tommoor commented Feb 5, 2020

If you can do it from a licensing POV I'd recommend bundling the vanilla git binary with your application, it's much faster than both libgit2 and go-git and worth the extra weight if you ask me. Even github desktop bundles it.

@josharian
Copy link
Contributor

It's worse than I thought. Even bundling vanilla git won't suffice, for a long-running process.

Watching lsof, the packfile fds never get closed, even if the underlying file gets deleted. This means means that as long as network operations keep happening, fds will continue to leak, regardless of whether the packfiles on disk get gc'd or not.

@josharian
Copy link
Contributor

It looks like the fix is may be as simple as:

diff --git a/src/mwindow.c b/src/mwindow.c
index 262786a5f..ccb24bdd8 100644
--- a/src/mwindow.c
+++ b/src/mwindow.c
@@ -271,6 +271,9 @@ static git_mwindow *new_window(
        while (git_mwindow__mapped_limit < ctl->mapped &&
                        git_mwindow_close_lru(mwf) == 0) /* nop */;
 
+       while (ctl->open_windows > 20 &&
+               git_mwindow_close_lru(mwf) == 0) /* nop */;
+
        /*
         * We treat `mapped_limit` as a soft limit. If we can't find a
         * window to close and are above the limit, we still mmap the new

Obviously the limit needs to be made configurable. I'll see about putting together a PR. Feedback welcomed in the interim.

josharian added a commit to josharian/libgit2 that referenced this issue Feb 6, 2020
Doing many network operations can lead to lots of small packfiles.
The existing mwindow limit was only the total size of mmap'd files,
not on their number, so having lots of small packfiles could exhaust
the allowed number of open files, particularly on macOS, where
the default ulimit is very low.

Add a limit to the number of files as well as their total size,
and set the default limit low enough that even macOS users should not
hit it during normal use.

Fixes libgit2#2758
josharian added a commit to josharian/libgit2 that referenced this issue Feb 6, 2020
Doing many network operations can lead to lots of small packfiles.
The existing mwindow limit was only the total size of mmap'd files,
not on their number, so having lots of small packfiles could exhaust
the allowed number of open files, particularly on macOS, where
the default ulimit is very low.

Add a limit to the number of files as well as their total size,
and set the default limit low enough that even macOS users should not
hit it during normal use.

Fixes libgit2#2758
@josharian
Copy link
Contributor

PR is #5386.

josharian added a commit to josharian/libgit2 that referenced this issue Feb 6, 2020
Doing many network operations can lead to lots of small packfiles.
The existing mwindow limit was only the total size of mmap'd files,
not on their number, so having lots of small packfiles could exhaust
the allowed number of open files, particularly on macOS, where
the default ulimit is very low.

Add a limit to the number of files as well as their total size,
and set the default limit low enough that even macOS users should not
hit it during normal use.

Fixes libgit2#2758
lhchavez added a commit to lhchavez/libgit2 that referenced this issue Feb 8, 2020
There are some cases in which repositories accrue a large number of
packfiles. The existing mwindow limit applies only to the total size of
mmap'd files, not on their number. This leads to a situation in which
having lots of small packfiles could exhaust the allowed number of open
files, particularly on macOS, where the default ulimit is very low
(256).

This change adds a new configuration parameter
(GIT_OPT_SET_MWINDOW_FILE_LIMIT) that sets the maximum number of open
packfiles, with a default of 128. This is low enough so that even macOS
users should not hit it during normal use.

Based on PR libgit2#5386, originally written by @josharian.

Fixes: libgit2#2758
lhchavez added a commit to lhchavez/libgit2 that referenced this issue Feb 8, 2020
There are some cases in which repositories accrue a large number of
packfiles. The existing mwindow limit applies only to the total size of
mmap'd files, not on their number. This leads to a situation in which
having lots of small packfiles could exhaust the allowed number of open
files, particularly on macOS, where the default ulimit is very low
(256).

This change adds a new configuration parameter
(GIT_OPT_SET_MWINDOW_FILE_LIMIT) that sets the maximum number of open
packfiles, with a default of 128. This is low enough so that even macOS
users should not hit it during normal use.

Based on PR libgit2#5386, originally written by @josharian.

Fixes: libgit2#2758
lhchavez added a commit to lhchavez/libgit2 that referenced this issue Feb 18, 2020
There are some cases in which repositories accrue a large number of
packfiles. The existing mwindow limit applies only to the total size of
mmap'd files, not on their number. This leads to a situation in which
having lots of small packfiles could exhaust the allowed number of open
files, particularly on macOS, where the default ulimit is very low
(256).

This change adds a new configuration parameter
(GIT_OPT_SET_MWINDOW_FILE_LIMIT) that sets the maximum number of open
packfiles, with a default of 128. This is low enough so that even macOS
users should not hit it during normal use.

Based on PR libgit2#5386, originally written by @josharian.

Fixes: libgit2#2758
lhchavez added a commit to lhchavez/libgit2 that referenced this issue Feb 23, 2020
There are some cases in which repositories accrue a large number of
packfiles. The existing mwindow limit applies only to the total size of
mmap'd files, not on their number. This leads to a situation in which
having lots of small packfiles could exhaust the allowed number of open
files, particularly on macOS, where the default ulimit is very low
(256).

This change adds a new configuration parameter
(GIT_OPT_SET_MWINDOW_FILE_LIMIT) that sets the maximum number of open
packfiles, with a default of 128. This is low enough so that even macOS
users should not hit it during normal use.

Based on PR libgit2#5386, originally written by @josharian.

Fixes: libgit2#2758
lhchavez added a commit to lhchavez/libgit2 that referenced this issue Mar 1, 2020
There are some cases in which repositories accrue a large number of
packfiles. The existing mwindow limit applies only to the total size of
mmap'd files, not on their number. This leads to a situation in which
having lots of small packfiles could exhaust the allowed number of open
files, particularly on macOS, where the default ulimit is very low
(256).

This change adds a new configuration parameter
(GIT_OPT_SET_MWINDOW_FILE_LIMIT) that sets the maximum number of open
packfiles, with a default of 128. This is low enough so that even macOS
users should not hit it during normal use.

Based on PR libgit2#5386, originally written by @josharian.

Fixes: libgit2#2758
lhchavez added a commit to lhchavez/libgit2 that referenced this issue Apr 12, 2020
There are some cases in which repositories accrue a large number of
packfiles. The existing mwindow limit applies only to the total size of
mmap'd files, not on their number. This leads to a situation in which
having lots of small packfiles could exhaust the allowed number of open
files, particularly on macOS, where the default ulimit is very low
(256).

This change adds a new configuration parameter
(GIT_OPT_SET_MWINDOW_FILE_LIMIT) that sets the maximum number of open
packfiles, with a default of 128. This is low enough so that even macOS
users should not hit it during normal use.

Based on PR libgit2#5386, originally written by @josharian.

Fixes: libgit2#2758
lhchavez added a commit to lhchavez/libgit2 that referenced this issue Jun 21, 2020
There are some cases in which repositories accrue a large number of
packfiles. The existing mwindow limit applies only to the total size of
mmap'd files, not on their number. This leads to a situation in which
having lots of small packfiles could exhaust the allowed number of open
files, particularly on macOS, where the default ulimit is very low
(256).

This change adds a new configuration parameter
(GIT_OPT_SET_MWINDOW_FILE_LIMIT) that sets the maximum number of open
packfiles, with a default of 128. This is low enough so that even macOS
users should not hit it during normal use.

Based on PR libgit2#5386, originally written by @josharian.

Fixes: libgit2#2758
mechanicker pushed a commit to mechanicker/libgit2 that referenced this issue Jul 16, 2020
There are some cases in which repositories accrue a large number of
packfiles. The existing mwindow limit applies only to the total size of
mmap'd files, not on their number. This leads to a situation in which
having lots of small packfiles could exhaust the allowed number of open
files, particularly on macOS, where the default ulimit is very low
(256).

This change adds a new configuration parameter
(GIT_OPT_SET_MWINDOW_FILE_LIMIT) that sets the maximum number of open
packfiles, with a default of 128. This is low enough so that even macOS
users should not hit it during normal use.

Based on PR libgit2#5386, originally written by @josharian.

Fixes: libgit2#2758
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants