Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raise exception if command timeout is triggered #43078

Merged
merged 6 commits into from Aug 2, 2018

Conversation

ganeshrn
Copy link
Member

SUMMARY

Fixes #43076

If persistent connection timeout is triggered, riase
exception which will be send over socket to module code
instead of silently shutting down the socket.

ISSUE TYPE
  • Bugfix Pull Request
COMPONENT NAME

bin/ansible-connection

ANSIBLE VERSION
2.7
ADDITIONAL INFORMATION

Before:

ok: [ios] => {
    "changed": false,
    "invocation": {
        "module_args": {
            "auth_pass": null,
            "authorize": null,
            "commands": [
                "show running-config all"
            ],
            "host": null,
            "interval": 1,
            "match": "all",
            "password": null,
            "port": null,
            "provider": null,
            "retries": 10,
            "ssh_keyfile": null,
            "timeout": null,
            "username": null,
            "wait_for": null
        }
    },
    "stdout": [
        "None"
    ],
    "stdout_lines": [
        [
            "None"
        ]
    ]
}

After:

[ios]: FAILED! => {
    "changed": false,
    "module_stderr": "Traceback (most recent call last):\n  File \"/var/folders/8l/27cd4l3n2vnc2cpk3498__680000gn/T/ansible_BoEKEj/ansible_module_ios_command.py\", line 247, in <module>\n    main()\n  File \"/var/folders/8l/27cd4l3n2vnc2cpk3498__680000gn/T/ansible_BoEKEj/ansible_module_ios_command.py\", line 217, in main\n    responses = run_commands(module, commands)\n  File \"/var/folders/8l/27cd4l3n2vnc2cpk3498__680000gn/T/ansible_BoEKEj/ansible_modlib.zip/ansible/module_utils/network/ios/ios.py\", line 125, in run_commands\n  File \"/var/folders/8l/27cd4l3n2vnc2cpk3498__680000gn/T/ansible_BoEKEj/ansible_modlib.zip/ansible/module_utils/connection.py\", line 149, in __rpc__\nansible.module_utils.connection.ConnectionError: command timeout triggered, timeout value is 1 secs\n",
    "module_stdout": "",
    "msg": "MODULE FAILURE",
    "rc": 1
}

@ansibot ansibot added affects_2.7 This issue/PR affects Ansible v2.7 bug This issue/PR relates to a bug. needs_triage Needs a first human triage before being processed. support:core This issue/PR relates to code supported by the Ansible Engineering Team. traceback This issue/PR includes a traceback. labels Jul 20, 2018
@ganeshrn ganeshrn added networking Network category and removed needs_triage Needs a first human triage before being processed. labels Jul 20, 2018
@ansibot
Copy link
Contributor

ansibot commented Jul 20, 2018

The test ansible-test sanity --test pylint [explain] failed with 3 errors:

bin/ansible-connection:150:8: unreachable Unreachable code
bin/ansible-connection:159:8: unreachable Unreachable code
bin/ansible-connection:168:8: unreachable Unreachable code

click here for bot help

@ansibot ansibot added the needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. label Jul 20, 2018
Copy link
Contributor

@Qalthos Qalthos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put these in a common method in case we want to change it later or add another handler. The handlers are still fine, but something like display_and_quit(msg) would cut down on repetition

% self.connection.get_option('persistent_connect_timeout'), log_only=True)
msg = 'persistent connection idle timeout triggered, timeout value is %s secs' % self.connection.get_option('persistent_connect_timeout')
display.display(msg, log_only=True)
raise Exception(msg)
Copy link
Contributor

@Qalthos Qalthos Jul 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, ansibot is right, you can't raise an exception here and still call shudown. If you really want the exception raised, you'd have to do it after shutdown

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's right I missed it completely. Though the issue here is after shutdown there is no active socket over which message can be sen to module side. Looking at alternate ways to implement this functionality.

Fixes ansible#43076

If persistent connection timeout is triggered, riase
exception which will be send over socket to module code
instead of silently shutting down the socket.
@ganeshrn
Copy link
Member Author

ready_for_review

@ansibot ansibot removed the needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. label Jul 24, 2018
@ganeshrn
Copy link
Member Author

ganeshrn commented Jul 24, 2018

@Qalthos Socket shutdown after timeout is triggered is handled here, that's the reason after raising exception, it is not required to explicitly call shutdown

@ganeshrn
Copy link
Member Author

@samccann As per our offline discussion added more context to error message. New message looks as below

fatal: [ios01]: FAILED! => {
    "changed": false,
    "invocation": {
        "module_args": {
            "auth_pass": null,
            "authorize": null,
            "commands": [
                "show running-config all",
                "show version"
            ],
            "host": null,
            "interval": 1,
            "match": "all",
            "password": null,
            "port": null,
            "provider": null,
            "retries": 10,
            "ssh_keyfile": null,
            "timeout": null,
            "username": null,
            "wait_for": null
        }
    },
    "msg": "command timeout triggered, timeout value is 1 secs.\nFor more details refer https://docs.ansible.com/ansible/latest/network/user_guide/network_debug_troubleshooting.html#timeout-issues"
}

@ganeshrn ganeshrn requested a review from samccann July 25, 2018 18:36
@ganeshrn ganeshrn added this to Docsite work in Ansible-maintained Collections Documentation via automation Jul 25, 2018
@ansibot
Copy link
Contributor

ansibot commented Jul 26, 2018

@samccann
Copy link
Contributor

@ganeshrn - I'm a little worried about embedding a website URL inside an error message. If the website changes, the error message becomes outdated. @acozine your thoughts here? Perhaps the error message should just mention the command(s) that could fix this timeout issue?


def handler(self, signum, frame):
display.display('signal handler called with signal %s' % signum, log_only=True)
self.shutdown()
msg = 'signal handler called with signal %s.\nFor more details refer %s' % (signum, TIMEOUT_TROUBLESHOOT_URL)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The url discussion aside, I really don't think we need it here- this is not a timeout, it's SIGTERM.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch

@ansibot ansibot added the needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. label Jul 26, 2018
@ganeshrn
Copy link
Member Author

ganeshrn commented Jul 27, 2018

@samccann Any thoughts on what should be the error message. As mentioned in the troubleshooting doc there are 3 different ways to increase the timeout value and I am not sure how to fit it in a single error message.

@samccann
Copy link
Contributor

@ganeshrn How about. "See command_timeout, timeout, or ansible_command_timeout options to resolve this issue." At least that way, the customer has a string to look up on docs.ansible.com to find the information. And leaves us the flexibility to restructure content in the future w/o ending up with broken URL links in error messages.

@ganeshrn
Copy link
Member Author

@samccann
The timeout value to use also depends on connection type used (for legacy reasons)

  1. command_timeout, timeout (provider timeout), and ANSIBLE_PERSISTENT_COMMAND_TIMEOUT works with local connection type
  2. command_timeout , ansible_command_timeout and ANSIBLE_PERSISTENT_COMMAND_TIMEOUT work with netconf and network_cli connection type

Mentioning all the 3 options in error message might be confusing to users

@samccann
Copy link
Contributor

Okay then perhaps "See the timeout setting options in the Network Debug and Troubleshooting Guide."

@ganeshrn
Copy link
Member Author

ready_for_review

@ganeshrn ganeshrn merged commit 3f3101d into ansible:devel Aug 2, 2018
Ansible-maintained Collections Documentation automation moved this from Docsite work to Done Aug 2, 2018
@ganeshrn ganeshrn deleted the command_timeout_raise branch August 2, 2018 06:38
@gundalow gundalow added this to Needs Triage in Networking Dec 7, 2018
@ikhan2010 ikhan2010 moved this from Needs Triage to 2.8 MUST in Networking Dec 10, 2018
@ikhan2010 ikhan2010 moved this from 2.8 MUST to Done in Networking Dec 10, 2018
@ansible ansible locked and limited conversation to collaborators Jul 22, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
affects_2.7 This issue/PR affects Ansible v2.7 bug This issue/PR relates to a bug. needs_revision This PR fails CI tests or a maintainer has requested a review/revision of the PR. networking Network category support:core This issue/PR relates to code supported by the Ansible Engineering Team. traceback This issue/PR includes a traceback.
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

ios_command return empty output if persistent socket timeout is triggered
4 participants