Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use locking for concurrent file access #52567

Open
wants to merge 2 commits into
base: devel
from

Conversation

@dagwieers
Copy link
Member

dagwieers commented Feb 19, 2019

SUMMARY

This implements locking to be used for modules that are used for concurrent file access, like lineinfile or known_hosts.

It relates to #47322 and implements ansible/proposals#113.
This fixes #30413

This is based on @acalm's existing lineinfile refactoring work and integration tests.

ISSUE TYPE
  • Feature Pull Request
COMPONENT NAME

file locking

ADDITIONAL INFORMATION

The decision to go with Python lockf() (which low-level is fcntl() on Linux) is based on the fact that it supports NFS best (contrary to flock()) and we can control what happens inside the locked-context.

The downside of locking here is that when a lock is held, no other file descriptor to the same file reference can be closed (deliberately or undeliberately) or we lose the lock, see this excellent reference.

On non-Linux platforms (BSD) flock() works reliably on NFS and doesn't suffer from this problem.

There are a few ways we mitigate the risk of accidentally losing this lock (by inadvertent opening and closing the same file within this context):

  1. Check on close if the file was still locked, if it was not, we warn the user that there is an issue.
  2. Test this locking specifically in integration tests so it is widely tested on all CI supported platforms
  3. Ensure that the code is properly documented in every module, so people are reminded of the issue

By choosing lockf() we raise the bar for quality in our modules (which we control) then giving a diffused message of what may not work (locking my fail on NFS filesystems) especially since it can cause data-loss.

For this reason we changed the interface from a FileLock class holding the file descriptor, to a simpler context manager that returns the locked file descriptor open_locked() as seen below

    with open_locked(your_file) as fd:
        content = fd.readlines()

        # do something else with this file descriptor
        # NOTE: do not open the same file in this context to not lose the lock on it when it is closed

Since Ansible modules are single-threaded, the risk of undeliberately opening/closing of the same file is very low as long as we ensure we reuse the existing file descriptor and use atomic_move(), like we tend to do in modules.

For non-Linux operating systems (BSD, Solaris, BSD, AIX, HP-UX) we may have to implement a different locking-mechanism, native to the target.

@dagwieers dagwieers changed the title Use locking for concurrent file access WIP: Use locking for concurrent file access Feb 19, 2019

@dagwieers dagwieers force-pushed the dagwieers:file-locking branch from bd3f9de to 7168508 Feb 19, 2019

Use locking for concurrent file access
This implements locking to be used for modules that are used for
concurrent file access, like lineinfile or known_hosts.
@ansibot

This comment has been minimized.

Copy link
Contributor

ansibot commented Feb 19, 2019

@dagwieers dagwieers force-pushed the dagwieers:file-locking branch 2 times, most recently from 96c1e4d to 5f91197 Feb 19, 2019

@ansibot

This comment has been minimized.

Copy link
Contributor

ansibot commented Feb 19, 2019

@ansibot ansibot added the system label Feb 19, 2019

@ansibot

This comment was marked as resolved.

Copy link
Contributor

ansibot commented Feb 19, 2019

The test ansible-test sanity --test changelog [explain] failed with 1 error:

changelogs/fragments/30413-lineinfile-concurrence_issue.yaml:0:0: invalid section: major_change

The test ansible-test sanity --test pep8 [explain] failed with 2 errors:

lib/ansible/module_utils/common/file.py:120:1: E302 expected 2 blank lines, found 1
lib/ansible/module_utils/common/file.py:163:1: E302 expected 2 blank lines, found 1

click here for bot help

@dagwieers dagwieers force-pushed the dagwieers:file-locking branch 4 times, most recently from bc0efe3 to ca82605 Feb 19, 2019

@bcoca bcoca removed the needs_triage label Feb 19, 2019

@samdoran samdoran removed the needs_triage label Feb 19, 2019

@dagwieers dagwieers force-pushed the dagwieers:file-locking branch 3 times, most recently from 229c295 to 7db3309 Feb 20, 2019

@dagwieers dagwieers force-pushed the dagwieers:file-locking branch from 7db3309 to 84ee25c Feb 20, 2019

@dagwieers dagwieers requested a review from abadger Feb 20, 2019

@dagwieers dagwieers requested a review from sivel Feb 20, 2019

@dagwieers dagwieers force-pushed the dagwieers:file-locking branch 5 times, most recently from a1d042f to 2eafa19 Feb 20, 2019

@ansibot

This comment has been minimized.

Copy link
Contributor

ansibot commented Feb 20, 2019

@ansibot ansibot added the net_tools label Feb 20, 2019

@dagwieers dagwieers force-pushed the dagwieers:file-locking branch 2 times, most recently from b3fec3b to 709bf3a Feb 20, 2019

@dagwieers

This comment has been minimized.

Copy link
Member Author

dagwieers commented Feb 20, 2019

If you want the review the changes to the lineinfile module, it helps to look at unified diff ignore whitespace changes (as a large block was indented to fit into the open_locked() context.
https://github.com/ansible/ansible/pull/52567/files?utf8=%E2%9C%93&diff=unified&w=1

The changes are really minimal.

@dagwieers dagwieers force-pushed the dagwieers:file-locking branch from 709bf3a to 21b5fde Feb 20, 2019

Reinstate lock_timeout
This commit includes:
- New file locking infrastructure for modules
- Enable timeout tests
- Madifications to support concurrency with lineinfile

@dagwieers dagwieers force-pushed the dagwieers:file-locking branch from 21b5fde to 711302d Feb 20, 2019

@ansibot ansibot added core_review and removed needs_revision labels Feb 20, 2019

@dagwieers dagwieers removed the net_tools label Feb 20, 2019

acalm pushed a commit to acalm/ansible that referenced this pull request Feb 20, 2019

@dagwieers dagwieers added this to the 2.8.0 milestone Feb 22, 2019

@ansibot ansibot added the stale_ci label Mar 2, 2019

acalm pushed a commit to acalm/ansible that referenced this pull request Mar 4, 2019

@gundalow gundalow removed this from the DO NOT USE 2.8.0 milestone Mar 8, 2019

@ansibot ansibot added the files label Mar 8, 2019

@bcoca bcoca added this to TODO: Backlog, anything can go here. Anyone can work on this after their approved work is done. in Ansible 2.9 Mar 18, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.