Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

win_shell / win_command task stuck on execution from 2.5.1 onwards #44905

Closed
madrover opened this issue Aug 30, 2018 · 23 comments
Closed

win_shell / win_command task stuck on execution from 2.5.1 onwards #44905

madrover opened this issue Aug 30, 2018 · 23 comments
Labels
affects_2.6 This issue/PR affects Ansible v2.6 bug This issue/PR relates to a bug. module This issue/PR relates to a module. support:core This issue/PR relates to code supported by the Ansible Engineering Team. windows Windows community

Comments

@madrover
Copy link

madrover commented Aug 30, 2018

SUMMARY

win_shell task gets stuck on execution since version 2.5.1. Same exact task works on previous versions such as 2.4.3 and 2.5.0.

Same behaviors appears with win_command

ISSUE TYPE
  • Bug Report
COMPONENT NAME

win_shell
win_command

ANSIBLE VERSION
ansible 2.6.3
  config file = /Users/miqueladrover/.ansible.cfg
  configured module search path = [u'/Users/miqueladrover/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/lib/python2.7/site-packages/ansible
  executable location = /usr/local/bin/ansible
  python version = 2.7.15 (default, Jun 17 2018, 13:05:56) [GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.39.2)]
CONFIGURATION
DEFAULT_CALLBACK_WHITELIST(/Users/miqueladrover/.ansible.cfg) = ['timer', 'profile_roles', 'profile_tasks']
OS / ENVIRONMENT

OSX as control host, Windows as managed host

STEPS TO REPRODUCE

The task definition is:

    - name: Installing rhapsody
      win_shell: ./rhapsody_installer.exe -q -varfile response.varfile
      args:
        chdir: "{{ rhapsody_temporary_folder }}"
EXPECTED RESULTS

Executing the task up to 2.5.0 the installer executes via win_shell finished and the playbook continues.

    TASK [ansible-rhapsody-engine : Installing rhapsody] ***************************
    task path: /Users/miqueladrover/Projects/ansible/ansible-rhapsody-engine/tasks/Windows_installation.yml:47
    Thursday 30 August 2018  16:16:20 +0200 (0:00:04.607)       0:00:46.304 *******
    Thursday 30 August 2018  16:16:20 +0200 (0:00:04.607)       0:00:46.302 *******
    Using module file /usr/local/lib/python2.7/site-packages/ansible/modules/windows/win_shell.ps1
    <127.0.0.1> ESTABLISH WINRM CONNECTION FOR USER: vagrant on PORT 55986 TO 127.0.0.1
    EXEC (via pipeline wrapper)
    changed: [Rhapsody01] => {
        "changed": true,
        "cmd": "./rhapsody_installer.exe -q -varfile response.varfile",
        "delta": "0:00:22.640596",
        "end": "2018-08-30 02:16:43.843527",
        "rc": 0,
        "start": "2018-08-30 02:16:21.202930",
        "stderr": "",
        "stderr_lines": [],
        "stdout": "Preparing JRE ...\r\nExtracting files...\r\nDownloading ...\r\nExtracting files...\r\nFinishing installation...\r\n",
        "stdout_lines": [
            "Preparing JRE ...",
            "Extracting files...",
            "Downloading ...",
            "Extracting files...",
            "Finishing installation..."
        ]
    }
ACTUAL RESULTS

Executing the same task since 2.5.1 up until 2.6.3, the task gets stuck:

    TASK [ansible-rhapsody-engine : Installing rhapsody] ***************************
    task path: /Users/miqueladrover/Projects/ansible/ansible-rhapsody-engine/tasks/Windows_installation.yml:47
    Thursday 30 August 2018  15:57:40 +0200 (0:00:04.301)       0:00:44.180 *******
    Thursday 30 August 2018  15:57:40 +0200 (0:00:04.301)       0:00:44.178 *******
    Using module file /usr/local/lib/python2.7/site-packages/ansible/modules/windows/win_shell.ps1
    <127.0.0.1> ESTABLISH WINRM CONNECTION FOR USER: vagrant on PORT 55986 TO 127.0.0.1
    EXEC (via pipeline wrapper)

and nothing more is shown. We can see on the managed server that the installer has finished.

@ansibot
Copy link
Contributor

ansibot commented Aug 30, 2018

Files identified in the description:

If these files are inaccurate, please update the component name section of the description or use the !component bot command.

click here for bot help

@ansibot
Copy link
Contributor

ansibot commented Aug 30, 2018

@ansibot ansibot added affects_2.6 This issue/PR affects Ansible v2.6 bug This issue/PR relates to a bug. module This issue/PR relates to a module. needs_triage Needs a first human triage before being processed. support:core This issue/PR relates to code supported by the Ansible Engineering Team. windows Windows community labels Aug 30, 2018
@madrover madrover changed the title win_shell task stuck on execution from 2.5.1 onwards win_shell / win_command task stuck on execution from 2.5.1 onwards Aug 30, 2018
@ansibot
Copy link
Contributor

ansibot commented Aug 30, 2018

Files identified in the description:

If these files are inaccurate, please update the component name section of the description or use the !component bot command.

click here for bot help

@jborean93
Copy link
Contributor

jborean93 commented Aug 30, 2018

While I know you said running it with win_command is the same, you shouldn't be using win_shell to run an executable as you could be coming across escaping issues. The args you have are relatively tame so I doubt that is the issue but just an FYI. As for why this is happening IIRC we changed win_shell and win_command to use the raw CreateProcessW call and bypassed Win32 .NET to help with escaping issues and support stdin in 2.5.0 but there hasn't been a change since.

If it is working on 2.5.0 and not 2.5.1+ then I'm not entirely sure what the problem could be. Are you running the task with become or anything custom?

@ansibot ansibot removed the needs_triage Needs a first human triage before being processed. label Aug 30, 2018
@madrover
Copy link
Author

Thanks for tip about win_command vs win_shell. I recall seeing it as well on the lint rule ANSIBLE0013 but I now notice it's not capturing the win_shell, only shell.

I just tried to execute win_command with become: false on 2.6.3 to ensure I am not using privilege elevation but the outcome is the same and I don't think I'm using anything custom...

@jborean93
Copy link
Contributor

jborean93 commented Aug 30, 2018

@madrover are you able to 100% confirm whether 2.5.0 works and the issue starts with 2.5.1?

You are saying the process is no longer running on the remote host. This is what a running process looks like from Ansible (using procexp to get the process hierarchy)

image

If you have a winrshost.exe it's a likely something is still running host side.

What happens if you run this with the raw task?

@madrover
Copy link
Author

I can totally confirm that this works with Ansible 2.5.0 but not with >= 2.5.1

Using procexp.exe I can see how the installer process I'm launching is being executed but then it just finishes but the Ansible task remains stuck.
image
The process seems to have succeeded because the service I'm installing is actually working even though the Ansible task remains stuck.

On the other hand, I can confirm that using raw on 2.6.3 does the trick, the installer works and the playbook continues:

- name: Installing rhapsody with raw
  raw: "{{ rhapsody_temporary_folder }}\\rhapsody_installer.exe -q -varfile {{ rhapsody_temporary_folder }}\\response.varfile"
  become: false

@joliverm
Copy link

Same result on macOS High Sierra (10.13.6) and Ansible >= 2.5.1 (Ansible 2.5.0 works Ok), when testing the same scripts that @madrover is testing.

@jborean93
Copy link
Contributor

jborean93 commented Aug 31, 2018

@madrover so this shows the rhapsody installer is still running in the background and because that hasn't closed Ansible hasn't returned back from the task. Is that screenshot from a raw task or win_shell/win_command?

We need to figure out why this is happening. Is this a public executable I can download and test with. I'm struggling to understand why 2.5.0 works but a dot release (which only contains bugfixes) does not. Even looking at the git history there's nothing I can see that may have caused this.

@madrover
Copy link
Author

madrover commented Sep 3, 2018

On the previous screenshot I was showing that the requested process was actually being executed via win_command or win_shell, I'm not 100% sure. Once the installer is finished the process under winrshost.exe disappears and I can actually see the java.exe of the service that has just been installed. See:
image
On the other hand, when using the raw model, the installer is not being executed via winrshost.exe and when it finishes the service gets started on the box and the following task in Ansible gets executed.
See:
image

@jborean93
Copy link
Contributor

The raw module still runs under winrshost.exe and is a bit more direct than win_shell/win_command which is run under a few more layers controlled by Ansible.

Is this an installer that is available online that I can play around with?

@madrover
Copy link
Author

madrover commented Sep 3, 2018

This installer is not commonly available online. I might be able to share with you but I need to check on my company beforehand, I'll keep you posted.
Is there anything else that we could do?

@jborean93
Copy link
Contributor

@madrover you can try and run things locally and see how it works, what you would need to do is;

It would be good to know;

  • If the command returns and you get an RC, stdio output back
  • If it ends without the -q or -varfile args and you manually step through the installer
  • If you get any error messages
  • Whether procexp still shows the spawned executable still running from the PowerShell instance?

@madrover
Copy link
Author

madrover commented Sep 4, 2018

You can see in the screenshot the output both in silent or interactive mode and the installer finishes successfully every time.
image
You can also see the spawned executable, but it disappears once the installation is finished...

@jborean93
Copy link
Contributor

Ok so that’s a good sign that the win_shell/win_command modules are fine as they do exactly what you are running locally. The next thing to rule out is running the code from a Runspace and seeing if that works. This is a bit trickier and I’ll see if I can give you a decent set of instructions tomorrow morning (~9 hours). This could account for raw working compared to a module as raw doesn’t run in our normal wrapper.

@jborean93
Copy link
Contributor

Ok so here's how you can run the same code in a Runspace.

$run_module = [ScriptBlock]::Create([System.IO.File]::ReadAllText("Ansible.ModuleUtils.CommandUtil.psm1"))

$ps = [PowerShell]::Create()

$ps.AddCommand("New-Module").AddParameters(@{Name="Ansible.ModuleUtils.CommandUtil"; ScriptBlock=$run_module}) > $null
$ps.AddCommand("Import-Module").AddParameter("WarningAction", "SilentlyContinue") > $null
$ps.AddCommand("Out-Null") > $null
$ps.AddStatement().AddScript('Run-Command -Command "whoami.exe /all"') > $null

$ps.Invoke()
$ps.HadErrors
if ($ps.HadErrors) {
    $ps.Streams.Error
}

Let me know if that works or not.

@madrover
Copy link
Author

madrover commented Sep 5, 2018

I just ran the code you shared with some minor tweaks to add absolute paths to the files and the output has been:

PS C:\Windows\Temp> .\runspace.ps1

Name                           Value
----                           -----
stdout                         Preparing JRE ......
rc                             0
stderr
executable                     C:\windows\temp\rhapsody_installer.exe
False

PS C:\Windows\Temp>

The installer has finished successfully and the installed service is running.

On a related note, I'm liaising with my company so I can share the installer with you to perform some testing but I'm still waiting for the approval...

@madrover
Copy link
Author

madrover commented Sep 7, 2018

@jborean93, I've contacted you privately to provide the installer we're using and do further testing.
Thanks!

@jborean93
Copy link
Contributor

@madrover I've figured out the issue, we changed the behaviour of CreateProcess that win_shell/win_command uses in #37229 to not allocate a new console for the process. This meant that the child processes shared the same conhost as the WinRM process and was found to be more beneficial for various reasons. This change was introduce in 2.5.1 and fixed the issue where non-ascii characters failed to be returned correctly.

The trouble in your scenario is that the RhapsodyMonitor.exe process that is spawned as part of the installer is inheriting the console host of the WinRM process and keeps the root WinRM process running. You can verify this by killing RhapsodyMonitor.exe after the installer finishes and the WinRM process will finish.

I'm toying around with the idea of controlling the CREATE_NEW_CONSOLE flag for win_command and win_shell. This won't help you in the short term as it would only be available in 2.8. To get you going I recommend you use async to run this as it effectively escapes the WinRM booundary. You can simply get this working by adding async: 120 where 120 is the timeout seconds that the async process can run for.

@jborean93
Copy link
Contributor

One thing I should mention, if you have control over the installer, it may be better to have that create the monitor process with the CREATE_NEW_CONSOLE flag and not be reliant on Ansible’s internal processes. I’m going to post the question as to whether we expose the flag field in the command/shell modules at the next working group. Personally the more I think about it the more I think async is the right choice for this issue.

@madrover
Copy link
Author

Awesome, @jborean93!

Unfortunately, I don't have much control about the installer, so modifying the process creation flag would be a complex endeavour.

On the other hand, I have modified the task call to:

    - name: Installing rhapsody
      win_command: ./rhapsody_installer.exe -q -varfile response.varfile
      args:
        chdir: "{{ rhapsody_temporary_folder }}"
      async: 180
      poll: 30

and now the task is happily continuing :)

Many thanks for that!

This ticket can be closed, unless further discussion of the CREATE_NEW_CONSOLE flag needs to be discussed.

@jborean93
Copy link
Contributor

@madrover thanks for the confirmation, glad you've got it working. I've posted a question to see if we want to expose the flag in win_command/win_shell ansible/community#294 (comment). Will close that based on whatever is decided there.

@jborean93
Copy link
Contributor

At the latest WWG we decided not to go ahead with this feature as async can achieve the same result and the differences between winrm and psrp makes it hard to reconcile the console options.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
affects_2.6 This issue/PR affects Ansible v2.6 bug This issue/PR relates to a bug. module This issue/PR relates to a module. support:core This issue/PR relates to code supported by the Ansible Engineering Team. windows Windows community
Projects
None yet
Development

No branches or pull requests

4 participants