Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: bypass ansible-connection for existing connections #76430

Draft
wants to merge 1 commit into
base: devel
Choose a base branch
from

Conversation

nitzmahone
Copy link
Member

@nitzmahone nitzmahone commented Dec 2, 2021

SUMMARY

fixes #75813

Creating new ansible-connection processes to use existing persisted connections is extremely wasteful and slow, since the newly-created ansible-connection effectively exits immediately in that case. This PR significantly reduces connection overhead for all networking connection plugins, but especially speeds up local actions under host contexts where a persistent connection is in use (since TE was blindly running ansible-connection in cases where no connection was used).

  • skip creation of (expensive!) new process once a persistent connection is going
  • move persistent connection detection logic into main worker
  • defer persistent connection creation/update until the connection is used
  • clean up special cases in TE, speed up dispatch of local-only actions under persistent connection hosts

KNOWN ISSUES:

  • Deferred PlayContext updates have incorrect values (just commented that out for now)- maybe a race where we need to snapshot the playcontext immediately? eg, explicit inventory ansible_user is wrong on playcontext remote_user on subsequent connection round-trips- seems to default to current UNIX user instead. Not sure how much longer we'll need to serialize playcontext anyway, since we're already sending over all the connection vars.
  • Breaks networking's experimental direct module execution without a small patch to the base networking action direct exec case to ensure the connection is activated (since TE used to do this preemptively even when it wasn't necessary):
            if hasattr(self._connection, '_ensure_socket_path'):
                self._connection._ensure_socket_path()

This could be done in the base persistent connection's _connect method , except that it's defined as abstractmethod, so none of the subclasses are currently calling super()._connect().

ISSUE TYPE
  • Bugfix Pull Request
COMPONENT NAME

ansible-connection (et al)

ADDITIONAL INFORMATION

cc @cidrblock @Qalthos

* skip creation of (expensive!) new process once a persistent connection is going
* move persistent connection detect logic into main worker
* defer persistent connection creation until the connection is used (cleans up special cases in TE, speeds up dispatch of local-only actions under persistent connection hosts)
@ansibot ansibot added WIP This issue/PR is a work in progress. Nevertheless it was shared for getting input from peers. affects_2.13 bug This issue/PR relates to a bug. needs_triage Needs a first human triage before being processed. networking Network category support:core This issue/PR relates to code supported by the Ansible Engineering Team. labels Dec 2, 2021
@ansibot ansibot added has_issue stale_ci This PR has been tested by CI more than one week ago. Close and re-open this PR to get it retested. labels Jan 31, 2022
@bcoca
Copy link
Member

bcoca commented Apr 11, 2022

related #77509

# is this even necessary anymore now that we're passing the connection options over directly?
# pcdata = cPickle.dumps(self._play_context.serialize(), protocol=0)
# conn.update_play_context(to_text(pcdata))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that you start the connection before we know the connection details for sure, so getting the data earlier will make it even worse. In some cases (when all is known before task templating) you should have no issues as play_context carries the correct info, but if you need info to change at the task, or worse, loop/delegation levels you'll run into trouble much sooner.

Most of the time this problem can be ignored as plays tend to reuse the same information w/o changing it, specially networking ones.

As you note, since we are passing the connection information and using set_options/get_option on them, we should be able to ignore most of the play_context data, but this requires all connection plugins to correctly source the data (all those in core do, but many of those in collections do not). Why I was planning to add deprecation notices to play_context soon.

setattr(connection, '_socket_path', socket_path)
# HACK: need a better way to make these necessary bits accessible for the deferred connection startup
connection._smuggled_options = options
connection._startconn = start_connection
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of 'starting connection' here we might want to 'reserve socket path', to create it if it doesn't exist and to 'touch it' if it does (making a-connection timeout reset so we avoid a race condition)

@ryanmerolle
Copy link
Contributor

If this was ready soon, would it wait for a .0 release/major release?

@ansibot ansibot added the needs_rebase https://docs.ansible.com/ansible/devel/dev_guide/developing_rebasing.html label Aug 22, 2022
@ansibot ansibot added the needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. label Jul 12, 2023
@ansibot ansibot added needs_ci This PR requires CI testing to be performed. Please close and re-open this PR to trigger CI. and removed needs_rebase https://docs.ansible.com/ansible/devel/dev_guide/developing_rebasing.html labels Oct 24, 2023
@webknjaz
Copy link
Member

ansibot added needs_ci and removed needs_rebase labels

@mkrizek what's happening here?

@ansibot ansibot added needs_rebase https://docs.ansible.com/ansible/devel/dev_guide/developing_rebasing.html and removed needs_ci This PR requires CI testing to be performed. Please close and re-open this PR to trigger CI. labels Oct 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects_2.13 bug This issue/PR relates to a bug. has_issue needs_rebase https://docs.ansible.com/ansible/devel/dev_guide/developing_rebasing.html needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. needs_triage Needs a first human triage before being processed. networking Network category stale_ci This PR has been tested by CI more than one week ago. Close and re-open this PR to get it retested. support:core This issue/PR relates to code supported by the Ansible Engineering Team. WIP This issue/PR is a work in progress. Nevertheless it was shared for getting input from peers.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unnecessary use of persistent connection resulting in increased task execution time
5 participants