git.repo.base.repo.tree() blocks deletion of repo directory on Windows (fine in Linux) #546

Open
altendky opened this issue Oct 27, 2016 · 6 comments


@altendky

I am checking out repositories into a temporary directory, doing various things with them (including calling tree()), and then deleting the repository directories and the temporary directory. This works fine on Linux but not on Windows. I have created this SSCCE to show the issue. I will take some time now to try to fix it myself, but I like to document my issues first so that others know about them and in case someone else has insight.

Note that I am making an effort (del_rw()) to clear read-only items.
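
The attached t.py is not reproduced in this thread, so the following is only a sketch of what a read-only-clearing `onerror` handler for `shutil.rmtree` commonly looks like. The body of `del_rw` here is an assumption (the traceback only shows that the original calls `os.remove(name)`), and `remove_tree` is a hypothetical helper added for illustration.

```python
import os
import shutil
import stat


def del_rw(function, path, excinfo):
    """onerror hook for shutil.rmtree: make the entry writable, then retry."""
    # Assumed behaviour: clear the read-only bit before retrying the
    # operation (os.unlink, os.rmdir, ...) that shutil.rmtree reported.
    os.chmod(path, stat.S_IWRITE)
    function(path)


def remove_tree(path):
    # Hypothetical helper: delete `path`, retrying read-only entries once.
    shutil.rmtree(path, onerror=del_rw)
```

A handler of this kind only deals with the read-only attribute. It cannot help while something else still holds the file open, which would be consistent with the retry inside `del_rw` failing in the output below.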

I am testing on:

  • Windows 10 Enterprise 64-bit (in VirtualBox)
  • Python 3.5.2 32-bit

t.py.txt

Output:

(venv) C:\Users\IEUser\Desktop\407>python t.py
Traceback (most recent call last):
  File "C:\Users\IEUser\Desktop\407\venv\lib\shutil.py", line 381, in _rmtree_unsafe
    os.unlink(fullname)
PermissionError: [WinError 5] Access is denied: 'C:\\Users\\IEUser\\AppData\\Local\\Temp\\tmprbrn0gen\\tree\\.git\\objects\\pack\\pack-986f782a1c797b94e8a9fa75402bf81fbbc4d537.idx'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "t.py", line 40, in <module>
    shutil.rmtree(dir, onerror=del_rw)
  File "C:\Users\IEUser\Desktop\407\venv\lib\shutil.py", line 488, in rmtree
    return _rmtree_unsafe(path, onerror)
  File "C:\Users\IEUser\Desktop\407\venv\lib\shutil.py", line 378, in _rmtree_unsafe
    _rmtree_unsafe(fullname, onerror)
  File "C:\Users\IEUser\Desktop\407\venv\lib\shutil.py", line 378, in _rmtree_unsafe
    _rmtree_unsafe(fullname, onerror)
  File "C:\Users\IEUser\Desktop\407\venv\lib\shutil.py", line 378, in _rmtree_unsafe
    _rmtree_unsafe(fullname, onerror)
  File "C:\Users\IEUser\Desktop\407\venv\lib\shutil.py", line 383, in _rmtree_unsafe
    onerror(os.unlink, fullname, sys.exc_info())
  File "t.py", line 19, in del_rw
    os.remove(name)
PermissionError: [WinError 5] Access is denied: 'C:\\Users\\IEUser\\AppData\\Local\\Temp\\tmprbrn0gen\\tree\\.git\\objects\\pack\\pack-986f782a1c797b94e8a9fa75402bf81fbbc4d537.idx'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "t.py", line 40, in <module>
    shutil.rmtree(dir, onerror=del_rw)
  File "C:\Users\IEUser\Desktop\407\venv\lib\tempfile.py", line 808, in __exit__
    self.cleanup()
  File "C:\Users\IEUser\Desktop\407\venv\lib\tempfile.py", line 812, in cleanup
    _shutil.rmtree(self.name)
  File "C:\Users\IEUser\Desktop\407\venv\lib\shutil.py", line 488, in rmtree
    return _rmtree_unsafe(path, onerror)
  File "C:\Users\IEUser\Desktop\407\venv\lib\shutil.py", line 378, in _rmtree_unsafe
    _rmtree_unsafe(fullname, onerror)
  File "C:\Users\IEUser\Desktop\407\venv\lib\shutil.py", line 378, in _rmtree_unsafe
    _rmtree_unsafe(fullname, onerror)
  File "C:\Users\IEUser\Desktop\407\venv\lib\shutil.py", line 378, in _rmtree_unsafe
    _rmtree_unsafe(fullname, onerror)
  File "C:\Users\IEUser\Desktop\407\venv\lib\shutil.py", line 378, in _rmtree_unsafe
    _rmtree_unsafe(fullname, onerror)
  File "C:\Users\IEUser\Desktop\407\venv\lib\shutil.py", line 383, in _rmtree_unsafe
    onerror(os.unlink, fullname, sys.exc_info())
  File "C:\Users\IEUser\Desktop\407\venv\lib\shutil.py", line 381, in _rmtree_unsafe
    os.unlink(fullname)
PermissionError: [WinError 5] Access is denied: 'C:\\Users\\IEUser\\AppData\\Local\\Temp\\tmprbrn0gen\\tree\\.git\\objects\\pack\\pack-986f782a1c797b94e8a9fa75402bf81fbbc4d537.idx'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "t.py", line 42, in <module>
    raise PermissionError('Failed with tree (GitPython {})'.format(gitpython_hash)) from e
PermissionError: Failed with tree (GitPython 5149c807ec5f396c1114851ffbd0f88d65d4c84f)
@altendky
Author

After the tree() call, two Git processes are left running. Pausing in the debugger and killing those process trees manually (via Process Explorer) does allow clean deletion of the directories.
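
To illustrate why a lingering process can block deletion, here is a minimal sketch that stands in a plain Python file handle for the handles the git processes presumably hold. The function name and file name are hypothetical. On Windows, removing a file that is still open through an ordinary handle is generally refused with a PermissionError, while POSIX systems allow the unlink.

```python
import os
import tempfile


def can_delete_while_open(directory):
    """Return True if a file can be removed while a handle to it is open."""
    path = os.path.join(directory, "held.bin")
    with open(path, "wb") as handle:
        handle.write(b"data")
        handle.flush()
        try:
            os.remove(path)  # attempted while `handle` is still open
            return True
        except PermissionError:
            return False


with tempfile.TemporaryDirectory() as scratch:
    # The result depends on the platform, so no expected output is given.
    print(can_delete_while_open(scratch))
```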

The 'offending' instances of Git are launched from the stack traces below. I snipped the last few stack frames, which belonged to the PyCharm debugger.

>>> import traceback; traceback.print_stack()
  File "C:\Program Files (x86)\JetBrains\PyCharm Community Edition 2016.2.3\helpers\pydev\pydevd.py", line 1580, in <module>
    globals = debugger.run(setup['file'], None, None, is_module)
  File "C:\Program Files (x86)\JetBrains\PyCharm Community Edition 2016.2.3\helpers\pydev\pydevd.py", line 964, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Program Files (x86)\JetBrains\PyCharm Community Edition 2016.2.3\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/IEUser/Desktop/407/t.py", line 40, in <module>
    tree = repo.tree('master')
  File "c:\users\ieuser\desktop\407\gitpython\git\repo\base.py", line 442, in tree
    return self.rev_parse(text_type(rev) + "^{tree}")
  File "c:\users\ieuser\desktop\407\gitpython\git\repo\fun.py", line 193, in rev_parse
    obj = name_to_object(repo, rev[:start])
  File "c:\users\ieuser\desktop\407\gitpython\git\repo\fun.py", line 130, in name_to_object
    return Object.new_from_sha(repo, hex_to_bin(hexsha))
  File "c:\users\ieuser\desktop\407\gitpython\git\objects\base.py", line 64, in new_from_sha
    oinfo = repo.odb.info(sha1)
  File "c:\users\ieuser\desktop\407\gitpython\git\db.py", line 37, in info
    hexsha, typename, size = self._git.get_object_header(bin_to_hex(sha))
  File "c:\users\ieuser\desktop\407\gitpython\git\cmd.py", line 930, in get_object_header
    cmd = self._get_persistent_cmd("cat_file_header", "cat_file", batch_check=True)
  File "c:\users\ieuser\desktop\407\gitpython\git\cmd.py", line 913, in _get_persistent_cmd
    cmd = self._call_process(cmd_name, *args, **options)
  File "c:\users\ieuser\desktop\407\gitpython\git\cmd.py", line 868, in _call_process
    return self.execute(call, **_kwargs)
  File "c:\users\ieuser\desktop\407\gitpython\git\cmd.py", line 586, in execute
    proc = Popen(command,

and

>>> import traceback; traceback.print_stack()
  File "C:\Program Files (x86)\JetBrains\PyCharm Community Edition 2016.2.3\helpers\pydev\pydevd.py", line 1580, in <module>
    globals = debugger.run(setup['file'], None, None, is_module)
  File "C:\Program Files (x86)\JetBrains\PyCharm Community Edition 2016.2.3\helpers\pydev\pydevd.py", line 964, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Program Files (x86)\JetBrains\PyCharm Community Edition 2016.2.3\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/IEUser/Desktop/407/t.py", line 40, in <module>
    tree = repo.tree('master')
  File "c:\users\ieuser\desktop\407\gitpython\git\repo\base.py", line 442, in tree
    return self.rev_parse(text_type(rev) + "^{tree}")
  File "c:\users\ieuser\desktop\407\gitpython\git\repo\fun.py", line 216, in rev_parse
    obj = to_commit(obj).tree
  File "C:\Users\IEUser\Desktop\407\venv\lib\site-packages\gitdb\util.py", line 237, in __getattr__
    self._set_cache_(attr)
  File "c:\users\ieuser\desktop\407\gitpython\git\objects\commit.py", line 143, in _set_cache_
    binsha, typename, self.size, stream = self.repo.odb.stream(self.binsha)  # @UnusedVariable
  File "c:\users\ieuser\desktop\407\gitpython\git\db.py", line 42, in stream
    hexsha, typename, size, stream = self._git.stream_object_data(bin_to_hex(sha))
  File "c:\users\ieuser\desktop\407\gitpython\git\cmd.py", line 947, in stream_object_data
    cmd = self._get_persistent_cmd("cat_file_all", "cat_file", batch=True)
  File "c:\users\ieuser\desktop\407\gitpython\git\cmd.py", line 913, in _get_persistent_cmd
    cmd = self._call_process(cmd_name, *args, **options)
  File "c:\users\ieuser\desktop\407\gitpython\git\cmd.py", line 868, in _call_process
    return self.execute(call, **_kwargs)
  File "c:\users\ieuser\desktop\407\gitpython\git\cmd.py", line 586, in execute
    proc = Popen(command,

@altendky
Author

altendky commented Oct 27, 2016

Does not seem to work (still throws an exception when deleting the directory; I guess this just means it's not set up as a context manager):

with git.Repo.clone_from(url, dir) as repo:
    with repo.tree('master') as tree:
        pass

Does work (No exception when deleting directory):

repo = git.Repo.clone_from(url, dir)

tree = repo.tree('master')
del repo
del tree

So, perhaps it is not a bug and I simply need to make sure that my repo and tree get deleted?

@altendky
Author

Better 'workaround':

tree = repo.tree('master')
repo.git.clear_cache()
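
One way to make sure this cleanup always runs before the directory is deleted is to wrap it in a small context manager. This is only a sketch under assumptions: `repo_then_cleanup` is a hypothetical helper, not part of GitPython, and it relies only on the `repo.git.clear_cache()` call shown above.

```python
import contextlib
import shutil


@contextlib.contextmanager
def repo_then_cleanup(repo, directory):
    """Yield repo; on exit clear its git cache, then delete directory."""
    try:
        yield repo
    finally:
        # Same call as in the workaround above; it runs even if the body
        # of the with block raises.
        repo.git.clear_cache()
        shutil.rmtree(directory)


# Intended use (requires GitPython; url and dir as in the snippets above):
#
#     with repo_then_cleanup(git.Repo.clone_from(url, dir), dir) as repo:
#         tree = repo.tree('master')
```

The cache is cleared before the rmtree call on purpose, so the persistent git processes are gone by the time the files are removed.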

@Byron
Member

Byron commented Dec 8, 2016

@altendky Thanks for posting your progress here, as I do hope that others will find it useful! It's a known limitation of GitPython and its API, and the best I can offer right now is to handle it manually with some sort of workaround.

efiop added a commit to efiop/dvc that referenced this issue Jun 23, 2019
Workaround for two bugs:

https://bugs.python.org/issue37380

and

gitpython-developers/GitPython#546

Signed-off-by: Ruslan Kuprieiev <kupruser@gmail.com>
TomasTomecek added a commit to TomasTomecek/packit that referenced this issue Oct 7, 2021
This is an attempt to fix

packit/packit-service#1236

The issue is that GitPython can leave lingering `git cat-file` processes
which hold FDs in /sandcastle and hence kernel does not free storage for
us. This causes obscure 'No space left on device' failures.

It's really hard to tell if this is a bug in GitPython or git.

More context:
gitpython-developers/GitPython#546 (comment)

Signed-off-by: Tomas Tomecek <ttomecek@redhat.com>
softwarefactory-project-zuul bot added a commit to packit/packit that referenced this issue Oct 8, 2021
free GitPython resources

This is an attempt to fix
packit/packit-service#1236
The issue is that GitPython can leave lingering git cat-file processes
which hold FDs in /sandcastle and hence kernel does not free storage for
us. This causes obscure 'No space left on device' failures.
It's really hard to tell if this is a bug in GitPython or git.
More context:
gitpython-developers/GitPython#546 (comment)

Packit now cleans up internal git cache to prevent old runs linger git processes in the service and consume scarce resources.

Reviewed-by: Jiri Popelka <None>
n-dusan added a commit to openlawlibrary/taf that referenced this issue Jul 18, 2022
* Basing on #243, we are having some difficulty with pygit2 Repository file
  handlers being open. in `pygit.py` we instantiate the Repository
  class, which holds references to `pack` and `.idx` git objects and
  keeps files open. If taf and cleanup is attempted from same subprocess,
  Repository handler won't be deleted, and cleanup error occurs. Calling
  manual garbage collection did not help. This
  seems to be a known issue throughout the git community, see [1] and
  [2].

  [1] - libgit2/pygit2#596
  [2] - gitpython-developers/GitPython#546

  Several workarounds were proposed by the community, but it seems that
  we can't fully replicate those workarounds. In [2] they attempt to
  workaround the issue by deleting git cache. That works for them
  because their git implementation has file handlers open in pure
  python, while we only do subprocess git calls.

  Our TODO: if these  windows specific error issues keep persisting, figure out how to
  garbage collect pygit2 Repository handler correctly.

  For now settle on error handling, as this issue is
  specific to a subprocess call that calls taf/updater. Though we should
  keep track of progress of these known bugs, so we can fix them if they
  get addressed by pygit2 or gitpython
n-dusan added a commit to openlawlibrary/taf that referenced this issue Jul 19, 2022
* Basing on #243, we are having some difficulty with pygit2 Repository file
  handlers being open. in `pygit.py` we instantiate the Repository
  class, which holds references to `pack` and `.idx` git objects and
  keeps files open. If taf and cleanup is attempted from same subprocess,
  Repository handler won't be deleted, and cleanup error occurs. Calling
  manual garbage collection did not help. This
  seems to be a known issue throughout the git community, see [1] and
  [2].

  [1] - libgit2/pygit2#596
  [2] - gitpython-developers/GitPython#546

  Several workarounds were proposed by the community, but it seems that
  we can't fully replicate those workarounds. In [2] they attempt to
  workaround the issue by deleting git cache. That works for them
  because their git implementation has file handlers open in pure
  python, while we only do subprocess git calls.

  Our TODO: if these  windows specific error issues keep persisting, figure out how to
  garbage collect pygit2 Repository handler correctly.

  For now settle on error handling, as this issue is
  specific to a subprocess call that calls taf/updater. Though we should
  keep track of progress of these known bugs, so we can fix them if they
  get addressed by pygit2 or gitpython

chore: changelog
renatav pushed a commit to openlawlibrary/taf that referenced this issue Jul 20, 2022
* fix: reraise original exception and windows specific cleanup errors

* Basing on #243, we are having some difficulty with pygit2 Repository file
  handlers being open. in `pygit.py` we instantiate the Repository
  class, which holds references to `pack` and `.idx` git objects and
  keeps files open. If taf and cleanup is attempted from same subprocess,
  Repository handler won't be deleted, and cleanup error occurs. Calling
  manual garbage collection did not help. This
  seems to be a known issue throughout the git community, see [1] and
  [2].

  [1] - libgit2/pygit2#596
  [2] - gitpython-developers/GitPython#546

  Several workarounds were proposed by the community, but it seems that
  we can't fully replicate those workarounds. In [2] they attempt to
  workaround the issue by deleting git cache. That works for them
  because their git implementation has file handlers open in pure
  python, while we only do subprocess git calls.

  Our TODO: if these  windows specific error issues keep persisting, figure out how to
  garbage collect pygit2 Repository handler correctly.

  For now settle on error handling, as this issue is
  specific to a subprocess call that calls taf/updater. Though we should
  keep track of progress of these known bugs, so we can fix them if they
  get addressed by pygit2 or gitpython

chore: changelog

* fix: move clean up handling to `on_rm_error`

* Cause of Windows Error permissions is in `on_rm_error`, which can
  trigger and fail in multiple places in updater.py. Since we only put
  out a warning for that, log the warning when calling os.unlink
@gdyrrahitis

  • OS Name: Microsoft Windows 10 Pro
  • OS Version: 10.0.19045
  • Python version: 3.10.9

Workaround

import git


def main():
    # Leaving the with block closes the repo before the directory is removed.
    with git.Repo.clone_from(REPO, local_path) as repo:
        # omitted for brevity
        pass

    print(f"Removing temp destination directory {local_path}")
    git.rmtree(local_path)

@reglim

reglim commented Jan 11, 2024

I had the same problem just with creating a repo and deleting it again.
Thankfully, @gdyrrahitis's solution worked for me 🚀
