Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a new class to write multi-thread safe to a complete directory #32344

Closed
soehms opened this issue Aug 7, 2021 · 16 comments
Closed

Add a new class to write multi-thread safe to a complete directory #32344

soehms opened this issue Aug 7, 2021 · 16 comments

Comments

@soehms
Copy link
Member

soehms commented Aug 7, 2021

This new class is just a variation of the existing class atomic_write taking care of directory specific differences.
For a first application case see #32099.

Component: doctest framework

Keywords: atomic write, directory, temporary

Author: Sebastian Oehms

Branch: 6d4693d

Reviewer: Matthias Koeppe

Issue created by migration from https://trac.sagemath.org/ticket/32344

@soehms soehms added this to the sage-9.4 milestone Aug 7, 2021
@soehms
Copy link
Member Author

soehms commented Aug 7, 2021

Branch: u/soehms/atomic_dir_32344

@soehms
Copy link
Member Author

soehms commented Aug 7, 2021

New commits:

681cf4d32344: initial version

@soehms
Copy link
Member Author

soehms commented Aug 7, 2021

Author: Sebastian Oehms

@soehms
Copy link
Member Author

soehms commented Aug 7, 2021

Commit: 681cf4d

@mkoeppe
Copy link
Member

mkoeppe commented Aug 9, 2021

comment:3

Haven't tried yet, but this part:

+            except OSError:
+                shutil.rmtree(self.target)
+                os.rename(self.tempname, self.target)

looks racy to me

@soehms
Copy link
Member Author

soehms commented Aug 10, 2021

comment:4

Replying to @mkoeppe:

Haven't tried yet, but this part:

+            except OSError:
+                shutil.rmtree(self.target)
+                os.rename(self.tempname, self.target)

looks racy to me

This code is just adapted from the corresponding part of atomic_write. It worked fine for me in all of my tests (using Python 3.9). But indeed, the second patchbot (using Python 3.7) shows (ignored) FileNotFoundError in cleanup:

Running doctests with ID 2021-08-08-00-50-13-492edc8c.
Git branch: patchbot/ticket_merged
Using --optional=build,ccache,debian,dochtml,pip,sage,sage_spkg
Doctesting entire Sage library.
Sorting sources by runtime so that slower doctests are run first....
Doctesting 4343 files using 5 threads.
....
sage -t --long --random-seed=0 src/doc/en/prep/Intro-Tutorial.rst
    [25 tests, 1.54 s]
Exception ignored in: <finalize object at 0x7fd572323bd0; dead>
Traceback (most recent call last):
  File "/usr/lib/python3.7/weakref.py", line 552, in __call__
    return info.func(*info.args, **(info.kwargs or {}))
  File "/usr/lib/python3.7/tempfile.py", line 934, in _cleanup
    _rmtree(name)
  File "/usr/lib/python3.7/shutil.py", line 482, in rmtree
    onerror(os.lstat, path, sys.exc_info())
  File "/usr/lib/python3.7/shutil.py", line 480, in rmtree
    orig_st = os.lstat(path)
FileNotFoundError: [Errno 2] No such file or directory: '/home/lelievre/.sage/temp/pascaline-sage-b/9091/tmp5ve0uefv'
Exception ignored in: <finalize object at 0x7fd572323bd0; dead>
Traceback (most recent call last):
  File "/usr/lib/python3.7/weakref.py", line 552, in __call__
    return info.func(*info.args, **(info.kwargs or {}))
  File "/usr/lib/python3.7/tempfile.py", line 934, in _cleanup
    _rmtree(name)
  File "/usr/lib/python3.7/shutil.py", line 482, in rmtree
    onerror(os.lstat, path, sys.exc_info())
  File "/usr/lib/python3.7/shutil.py", line 480, in rmtree
    orig_st = os.lstat(path)
FileNotFoundError: [Errno 2] No such file or directory: '/home/lelievre/.sage/temp/pascaline-sage-b/9091/tmpgp3b5u31'
Exception ignored in: <finalize object at 0x7fd572323bd0; dead>
Traceback (most recent call last):
  File "/usr/lib/python3.7/weakref.py", line 552, in __call__
    return info.func(*info.args, **(info.kwargs or {}))
  File "/usr/lib/python3.7/tempfile.py", line 934, in _cleanup
    _rmtree(name)
  File "/usr/lib/python3.7/shutil.py", line 482, in rmtree
    onerror(os.lstat, path, sys.exc_info())
  File "/usr/lib/python3.7/shutil.py", line 480, in rmtree
    orig_st = os.lstat(path)
FileNotFoundError: [Errno 2] No such file or directory: '/home/lelievre/.sage/temp/pascaline-sage-b/9091/tmpp9nuidju'
sage -t --long --random-seed=0 src/sage/misc/temporary_file.py
    [87 tests, 1.56 s]
....
----------------------------------------------------------------------
sage -t --long --random-seed=0 src/sage/symbolic/expression.pyx  # 1 doctest failed
----------------------------------------------------------------------
Total time for all tests: 2687.7 seconds
    cpu time: 12460.5 seconds
    cumulative wall time: 12941.0 seconds
Pytest is not installed, skip checking tests that rely on it.

Unfortunately, the traceback doesn't show how this is caused by my code. Do you have any ideas on how to make this part more safe?

@mkoeppe
Copy link
Member

mkoeppe commented Aug 10, 2021

comment:5

I think a first step should be to clarify the semantics in the documentation of this class: This is for creating a directory whose contents are determined uniquely by the directory name. If multiple threads or processes attempt to create it in parallel, then it does not matter which thread created it.

If this is the intended semantics, then the following should be correct:

            # Success: move temporary file to target file
            try:
                os.rename(self.tempname, self.target)
            except FileExistsError:
                # Race: Another thread or process must have created the directory
                pass

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Aug 11, 2021

Changed commit from 681cf4d to 6d4693d

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Aug 11, 2021

Branch pushed to git repo; I updated commit sha1. New commits:

00b7106Merge branch 'u/soehms/atomic_dir_32344' of git://trac.sagemath.org/sage into atomic_dir_32344
6d4693d32344: keep dir if it exists

@soehms
Copy link
Member Author

soehms commented Aug 11, 2021

comment:7

Replying to @mkoeppe:

I think a first step should be to clarify the semantics in the documentation of this class: This is for creating a directory whose contents are determined uniquely by the directory name. If multiple threads or processes attempt to create it in parallel, then it does not matter which thread created it.

If this is the intended semantics, then the following should be correct:

            # Success: move temporary file to target file
            try:
                os.rename(self.tempname, self.target)
            except FileExistsError
                # Race: Another thread or process must have created the directory
                pass

Good idea! If cleaning is necessary then the calling code is responsible. If so, it can still lead to a race of removing the directory but under better controlled conditions.

@mkoeppe mkoeppe modified the milestones: sage-9.4, sage-9.5 Aug 22, 2021
@mkoeppe
Copy link
Member

mkoeppe commented Sep 7, 2021

Reviewer: Matthias Koeppe

@mkoeppe
Copy link
Member

mkoeppe commented Sep 7, 2021

comment:9

LGTM

@soehms
Copy link
Member Author

soehms commented Sep 8, 2021

comment:10

Many thanks!

@vbraun
Copy link
Member

vbraun commented Sep 13, 2021

Changed branch from u/soehms/atomic_dir_32344 to 6d4693d

@fchapoton
Copy link
Contributor

Changed keywords from atomic write directory tempory to atomic write, directory, temporary

@fchapoton
Copy link
Contributor

Changed commit from 6d4693d to none

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants