[RFE] Add debugging facilities for EmbeddedAnsible (w/ansible-runner) #20243

Open
NickLaMuro opened this issue Jun 3, 2020 · 3 comments

@NickLaMuro
Member

NickLaMuro commented Jun 3, 2020

There is very little transparency with EmbeddedAnsible now that we are using ansible-runner directly instead of deferring to AnsibleTower/AWX. Currently, if something goes wrong and it is not captured in ansible's STDOUT/JSON output, it won't be made apparent to the end user via the UI or even the logs. Worse, even when there is STDOUT/JSON output, it might not make it to the UI if the playbook failed or timed out, and that is exactly the information that is valuable in troubleshooting the root cause.

For example, a misconfigured SSH auth might cause the playbook run in ansible-runner to time out while prompting for an SSH password for a given host, but that might not be reflected in the output presented to the user, and is only noticable when looking at the ansible-runner directory that was generated for that run.

A temp solution for debugging the above scenario is to comment out this line:

@response.cleanup_filesystem!

Then restart the appliance and look in the /tmp directory for the artifacts of the run in question. They are usually in a generated /tmp/ansible-runner* dir, which is not easy to track down, and this is definitely not an appropriate solution for a production environment.
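
As a rough aid for tracking those directories down, here is a minimal sketch that lists the most recently modified /tmp/ansible-runner* directories. The helper name `recent_runner_dirs` and its `limit:` option are hypothetical and not part of ManageIQ or ansible-runner; it only assumes the directory naming pattern mentioned above.

```ruby
require "tmpdir"

# Hypothetical helper (not part of ManageIQ): returns the run directories
# matching "ansible-runner*" under tmp_root, most recently modified first.
def recent_runner_dirs(tmp_root = Dir.tmpdir, limit: 5)
  Dir.glob(File.join(tmp_root, "ansible-runner*"))
     .select { |path| File.directory?(path) } # ignore stray files with the same prefix
     .sort_by { |path| File.mtime(path) }     # oldest first
     .reverse                                 # newest first
     .first(limit)
end
```

Calling `recent_runner_dirs("/tmp")` right after a failed run should put the directory for that run at or near the front of the list, assuming cleanup was skipped for it.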

A simple DEBUG toggle might be a decent first step: it would skip the cleanup_filesystem! call for that run and allow for inspection to determine further action. Taking it further, presenting that data in the UI might also avoid needing to SSH in, but making this an admin-only option might be the more secure way to start, leaving time to determine a more tactical approach for end users in the future.
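
To make the toggle idea concrete, here is a minimal sketch of a cleanup method that leaves the run directory in place when a debug flag is set. `SketchResponse`, its `debug:` argument, and the warning message are all hypothetical stand-ins for illustration; this is not the actual ManageIQ response class, and where the flag would come from (a setting, an env var, a per-run option) is left open.

```ruby
require "fileutils"

# Hypothetical stand-in for the runner response object; not ManageIQ's real class.
class SketchResponse
  attr_reader :base_dir

  def initialize(base_dir, debug: false)
    @base_dir = base_dir
    @debug    = debug
  end

  # Removes the run directory unless debugging is enabled.
  # Returns true if the directory was removed, false if it was kept.
  def cleanup_filesystem!
    if @debug
      # Tell the admin where the artifacts are instead of deleting them.
      warn("ansible-runner artifacts kept for inspection: #{base_dir}")
      return false
    end

    FileUtils.remove_entry(base_dir)
    true
  end
end
```

Logging the kept path also addresses the "not easy to track down" problem, since the admin no longer has to guess which generated directory belongs to the run.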

@NickLaMuro NickLaMuro changed the title [RFE] Debugging facilities for EmbeddedAnsible (w/ansibe-runner) [RFE] Add debugging facilities for EmbeddedAnsible (w/ansibe-runner) Jun 3, 2020
@Fryguy Fryguy added this to Backlog in Roadmap Jun 3, 2020
@Fryguy Fryguy changed the title [RFE] Add debugging facilities for EmbeddedAnsible (w/ansibe-runner) [RFE] Add debugging facilities for EmbeddedAnsible (w/ansible-runner) Jun 22, 2020
@Fryguy
Member

Fryguy commented Oct 30, 2020

I realize that this RFE is for the general case, however if we know specific cases can be solved by looking at data from the directory, can we code that in?

> For example, a misconfigured SSH auth might cause the playbook run in ansible-runner to time out while prompting for an SSH password for a given host, but that might not be reflected in the output presented to the user, and is only noticable when looking at the ansible-runner directory that was generated for that run.

So in this example, what exactly in the directory is noticable? If that noticable thing is well defined, then I think we can code up fetching that information as part of the Ansible::Runner run itself. Admittedly, this doesn't really solve the general case, but IMO the general case is the safety net, and "known" cases should be coded for.

@NickLaMuro
Member Author

> So in this example, what exactly in the directory is noticable, and if that noticable thing is well defined

In this case, generally you are looking at the env/passwords to see what was defined in the YAML for the regexp for:

Or did we write the proper args to ansible_runner to ask for the passwords (which you have to check the command line file for), or was there an SSH key passed? So there are a bunch of places that I check when debugging. That said, some of it most likely could be extracted, but generally having a mechanism that doesn't require me to tell customers "comment out this line here to debug, and restart" would be ideal.
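
For the parts that could be extracted, a minimal sketch of a summary over the run directory might look like the following. It reports which of the files I check are present and which password prompt patterns were written, without exposing the password values. `runner_dir_summary` is a hypothetical helper; env/passwords is the file mentioned above, while env/cmdline and env/ssh_key are assumed from ansible-runner's input directory layout and may not match exactly what we write.

```ruby
require "yaml"

# Paths relative to the run directory. env/passwords is mentioned in this thread;
# env/cmdline and env/ssh_key are ASSUMED from ansible-runner's input layout.
CHECKED_FILES = %w[env/passwords env/cmdline env/ssh_key].freeze

# Hypothetical helper: summarizes which inputs exist for a run, plus the
# prompt patterns from env/passwords (keys only, never the password values).
def runner_dir_summary(run_dir)
  summary = CHECKED_FILES.to_h { |rel| [rel, File.exist?(File.join(run_dir, rel))] }

  prompts = []
  if summary["env/passwords"]
    data    = YAML.safe_load(File.read(File.join(run_dir, "env/passwords")))
    prompts = data.keys if data.is_a?(Hash)
  end

  summary.merge("password_prompts" => prompts)
end
```

Something like this could run before cleanup and be attached to the failure output, which would cover the "known" cases while a debug toggle remains the safety net for everything else.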

(also, FYI: s/noticable/noticeable/ )

@miq-bot
Member

miq-bot commented Feb 27, 2023

This issue has been automatically marked as stale because it has not been updated for at least 3 months.

If you can still reproduce this issue on the current release or on master, please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions! More information about the ManageIQ triage process can be found in the triage process documentation.

@miq-bot miq-bot added the stale label Feb 27, 2023
@Fryguy Fryguy removed the stale label Mar 2, 2023