Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ipaclient: schema cache: Write all schema files in concurrent-safe way #467

Closed
wants to merge 1 commit into from
Closed

Conversation

ghost
Copy link

@ghost ghost commented Feb 15, 2017

@ghost ghost requested a review from tiran February 15, 2017 08:39
@ghost ghost assigned tiran Feb 15, 2017
os.fsync(temp_file.fileno())
try:
os.rename(temp_file.name, path)
except EnvironmentError:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EnvironmentError is wrong on Python 3:

>>> os.rename('foo', 'bar')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: 'foo' -> 'bar'
>>> FileNotFoundError.__mro__
(<class 'FileNotFoundError'>, <class 'OSError'>, <class 'Exception'>, <class 'BaseException'>, <class 'object'>)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK EnvironmentError in python 2 is super class for IOError and OSError, in python3 EnvironmentError and IOError are aliases for OSError. So I believe this is correct way of catching any OS/FS-related error in both python 2 and python 3.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OSError is sufficient. os.rename() can never raise IOError.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since I've first read PEP 3151 (expecially section about Confusing OS-related exceptions [1]) I like to play it safely and always catch EnvironmentError because (like to PEP says):
In fact, it is hard to think of any situation where OSError should be caught but not IOError, or the reverse.

[1] https://www.python.org/dev/peps/pep-3151/#confusing-set-of-os-related-exceptions

@@ -282,6 +282,37 @@ def write_tmp_file(txt):

return fd


@contextmanager
def concurrent_open(path, mode='r'):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find path confusing because path it is not clear if you talk about a directory or file name. How about filename?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

build-in open uses 'name' in python 2 and 'file' in python 3. So why not? I can combine it to 'filename'.

"""
if mode in ('w', 'w+', 'wb', 'wb+'):
with tempfile.NamedTemporaryFile(
mode=mode, prefix=path, delete=False) as temp_file:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prefix is not enough. You have to include the dir arg to ensure that the file is created in the same directory as the target dir. Otherwise both files may end up on two diferent file systems and os.rename() is no longer atomic.

https://docs.python.org/3/library/tempfile.html#tempfile.mkstemp

You also can't just yield the temp file or its file attribute, because with concurrent_open(...) will call TemporaryFile.__exit__, which closes the fd and breaks fsync().

class RenameContext(object):
    def __init__(self, temp_file, destpath):
        self.temp_file = temp_file
        self.destpath = destpath

    def __enter__(self):
        return self.temp_file.file  # raw file

    def __exit__(self, exc, value, tb):
        f = self.temp_file
        f.flush()
        os.fsync(f.fileno())
        f.close()
        try:
            os.rename(f.name, self.destpath)
        except OSError:
            os.unlink(f.name)  # only Windows developers call this remove :)
            raise

And then

filename = os.path.abspath(filename)
directory, prefix = os.path.splitpath(filename)
with tempfile.NamedTemporaryFile(mode=mode, dir=directory, prefix=prefix, delete=False) as f:
    return RenameContext(f, filename)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the prefix you're right, I haven't thought about relative names. I know that rename works only inside single FS otherwise raises error.

Regarding yielding NamedTemporaryFile instance from concurrent_open I'm not sure I understand you. Do you want to say that NamedTemporaryFile.__exit__ is called when yield temp_file is executed on line 300? I don't agree with that because the context haven't finished yet. The problem I see not with line 300 is that I should enclose it with try: finally:.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I didn't see that you wrapped concurrent_open in a @contextmanager. In that case the __exit__ function of NamedTemporaryFile is not called because the function doesn't return the NamedTemporaryFile instance directly but rather a wrapped instance.

try/except OSError is fine. You don't want to call os.unlink() when os.rename() was successfully called.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PS: Without dir argument, the tempfile module uses the system's temp file directory. (TMPDIR, TEMP, TMP, /tmp, /var/tmp). These days temp dir are on a different file system, partly for performance, partly for security reasons.

os.rename(temp_file.name, filename)
except EnvironmentError:
os.unlink(temp_file.name)
raise
Copy link
Member

@tiran tiran Feb 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new try/finally also overrides the destination file when there is an exception. How about:

            try:
                yield temp_file
            except Exception:
                raise
            else:
                temp_file.flush()
                os.fsync(temp_file.fileno())
                temp_file.close()
                os.link(temp_file.name, filename)
            finally:
                os.unlink(temp_file.name)

The os.unlink() function is always executed as last statement, https://www.python.org/dev/peps/pep-0341/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correction, unlink doesn't override destination.

            try:
                yield temp_file
            except Exception:
                raise
            else:
                temp_file.flush()
                os.fsync(temp_file.fileno())
                temp_file.close()
                os.rename(temp_file.name, filename)
            finally:
                try:
                    os.unlink(temp_file.name)
                except OSError as e:
                    if e.errno != errno.ENOENT:
                        raise

@ghost
Copy link
Author

ghost commented Mar 1, 2017

superseded by #488

@ghost ghost added the rejected Pull Request has been rejected label Mar 1, 2017
@ghost ghost closed this Mar 1, 2017
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rejected Pull Request has been rejected
Projects
None yet
1 participant