[RFE] Add debugging facilities for EmbeddedAnsible (w/ansible-runner) #20243

NickLaMuro · 2020-06-03T21:25:20Z

There is very little transparency with EmbeddedAnsible now that we are using ansible-runner directly instead of deferring to AnsibleTower/AWX. Currently, if something goes wrong and it is not in the ansible STDOUT/JSON output, it won't be made apparent to the end user via the UI or even the logs. Worse, even when there is STDOUT/JSON output, it might not make it to the UI if the playbook failed or timed out, and that information is valuable in troubleshooting the root cause.

An example of this: a misconfigured SSH auth might cause the playbook run in ansible-runner to time out when asking for an SSH password to auth with for a given host, but that might not be reflected in the output presented to the user, and is only noticable when looking at the ansible-runner directory that was generated for that given run.

A temp solution for debugging the above scenario is to comment out this line:

manageiq/lib/ansible/runner/response_async.rb

Line 37 in 44fae84

@response.cleanup_filesystem!

Then restart the appliance and look in the /tmp directory for the artifacts of the run in question. They are usually in a generated /tmp/ansible-runner* dir, which is not easy to track down, and this is definitely not an appropriate solution for a production environment.
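To make the "not easy to track down" part a bit less painful, here is a rough sketch for listing the most recently modified runner directories. The helper name and its arguments are made up for illustration; only the /tmp/ansible-runner* naming comes from the description above.

```ruby
# Hypothetical helper: list ansible-runner working directories, newest first.
# Only the "ansible-runner*" prefix under /tmp is taken from this issue;
# the method name and its arguments are illustrative assumptions.
def recent_runner_dirs(base = "/tmp", limit = 5)
  Dir.glob(File.join(base, "ansible-runner*"))
     .select { |path| File.directory?(path) }   # ignore stray files
     .max_by(limit) { |path| File.mtime(path) } # newest first
end

# Print modification time and path for each candidate directory.
recent_runner_dirs.each { |dir| puts "#{File.mtime(dir)}  #{dir}" }
```

This only helps if the cleanup line has already been commented out, since otherwise the directories are gone by the time you look.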

A simple DEBUG toggle might be a decent first step: it would skip the cleanup_filesystem! call for that given run and allow for inspection to determine further action. Taking it further, presenting that data in the UI might also avoid needing to SSH in, but making this an option for admins only is probably the more secure way to start, and a more tactical approach for end users can be determined in the future.
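A minimal sketch of what such a toggle could look like, assuming a per-run debug flag. The class below is a hypothetical stand-in and not the actual Ansible::Runner response class; only the cleanup_filesystem! name is taken from the code referenced above, and how the flag would be set (settings, env var, UI option) is left open.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical stand-in for the response object. Only cleanup_filesystem!
# is a name from this issue; the class, the debug flag, and base_dir are
# assumptions made for illustration.
class SketchResponse
  attr_reader :base_dir

  def initialize(base_dir, debug: false)
    @base_dir = base_dir
    @debug    = debug
  end

  # Remove the run directory unless debugging was requested for this run.
  # Returns true when the directory was removed, false when it was kept.
  def cleanup_filesystem!
    if @debug
      warn "debug enabled, keeping ansible-runner artifacts in #{base_dir}"
      return false
    end

    FileUtils.remove_entry(base_dir)
    true
  end
end

dir = Dir.mktmpdir("ansible-runner")
SketchResponse.new(dir, debug: true).cleanup_filesystem! # directory is kept
SketchResponse.new(dir).cleanup_filesystem!              # directory is removed
```

Logging the retained path at least gives whoever is debugging a directory to go look at, without having to edit code on the appliance and restart.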


Fryguy · 2020-10-30T14:10:35Z

I realize that this RFE is for the general case, however if we know specific cases can be solved by looking at data from the directory, can we code that in?

An example of this: a misconfigured SSH auth might cause the playbook run in ansible-runner to time out when asking for an SSH password to auth with for a given host, but that might not be reflected in the output presented to the user, and is only noticable when looking at the ansible-runner directory that was generated for that given run.

So in this example, what exactly in the directory is noticable, and if that noticable thing is well defined, then I think we can code up fetching that information as part of the Ansible::Runner run itself. Admittedly, this doesn't really solve the general case, but IMO the general case is the safety net, while "known" cases should be coded for.

NickLaMuro · 2020-10-30T14:55:28Z

So in this example, what exactly in the directory is noticable, and if that noticable thing is well defined

In this case, you are generally looking at env/passwords to see what was defined in the YAML for the SSH password regexp:

is there an entry for "^SSH [pP]assword"?
is it correct?
is it blank?

Or did we write the proper args to ansible_runner to ask for the passwords (which you have to check the command line file for), or was there an SSH key passed? So there are a bunch of places that I check when debugging. That said, some of it most likely could be extracted, but generally having a mechanism which doesn't require me to tell customers "comment out this line here to debug, and restart" would be ideal.
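As a rough idea of what extracting the env/passwords part could look like, here is a sketch that answers the "is there an entry" and "is it blank" questions without printing the secret. It assumes the file is a YAML mapping of prompt regexp to password, as described above; the method name and the returned symbols are made up, and "is it correct?" still needs a human.

```ruby
require "yaml"

# Matches keys in env/passwords that begin with the literal text
# "^SSH [pP]assword" (the prompt regexp mentioned in this issue).
SSH_PROMPT_KEY = /\A\^SSH \[pP\]assword/

# Hypothetical helper: report the state of the SSH password entry for a
# given runner directory. Returns :missing_file, :no_entry, :blank, or
# :present, and never returns the password itself.
def ssh_password_status(runner_dir)
  path = File.join(runner_dir, "env", "passwords")
  return :missing_file unless File.file?(path)

  entries = YAML.safe_load(File.read(path)) || {}
  return :no_entry unless entries.kind_of?(Hash)

  key = entries.keys.find { |k| k.to_s.match?(SSH_PROMPT_KEY) }
  return :no_entry if key.nil?

  entries[key].to_s.strip.empty? ? :blank : :present
end
```

Something like this could run before cleanup and have its result logged, which would cover the known SSH case while the debug toggle remains the safety net for everything else.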

(also, FYI: s/noticable/noticeable/ )

miq-bot · 2023-02-27T00:18:19Z

This issue has been automatically marked as stale because it has not been updated for at least 3 months.

If you can still reproduce this issue on the current release or on master, please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions! More information about the ManageIQ triage process can be found in the triage process documentation.

NickLaMuro changed the title ~~[RFE] Debugging facilities for EmbeddedAnsible (w/ansibe-runner)~~ [RFE] Add debugging facilities for EmbeddedAnsible (w/ansibe-runner) Jun 3, 2020

Fryguy added the enhancement label Jun 3, 2020

Fryguy added this to Backlog in Roadmap Jun 3, 2020

gtanzillo added the help wanted label Jun 4, 2020

Fryguy changed the title ~~[RFE] Add debugging facilities for EmbeddedAnsible (w/ansibe-runner)~~ [RFE] Add debugging facilities for EmbeddedAnsible (w/ansible-runner) Jun 22, 2020

NickLaMuro mentioned this issue Jul 30, 2020

Embedded Ansible playbook service fails if the playbook is from a repository that requires SCM credentials #20400

Closed

NickLaMuro added the core/embedded ansible label Dec 11, 2020

NickLaMuro mentioned this issue Apr 21, 2021

Request and Task Diagnostics #21188

Closed

4 tasks

NickLaMuro mentioned this issue Aug 10, 2021

Add debugging for Ansible::Runner #21369

Merged

miq-bot added the stale label Feb 27, 2023

Fryguy removed the stale label Mar 2, 2023
