Support a long-running listener process on Windows to speed up Ansible #47707
Comments
Hi @trondhindenes, thank you for submitting this issue! |
Files identified in the description: If these files are inaccurate, please update the |
Hey @trondhindenes this is something we've been wanting to implement with persisted connections. Unfortunately the current persistent code that is used for Network modules ( This is definitely something we want to move towards so I'll keep the issue open to start brainstorming ideas. |
nice, good to hear that there's movement here. I did some research a while ago trying to figure out if it's possible to configure the winrm service in such a way that invocations against it reuse the same process instead of spinning up a new one, but I didn't get anywhere. That said, Powershell has the notion of "sessions", so I guess it is supported (I didn't dig into the protocol spec for how sessions work, but I'm sure a spec exists). BTW efforts in this area need to be balanced against the effort of making ssh an alternative to winrm, which could make this whole issue moot. |
This seems to document the "session" apis in WinRM: |
did a bit more digging - I have never really looked at the winrm stuff in Ansible before. I see it persists the shell_id in the winrm Sorry for this turning into a blog. After some more testing, I notice that when persisting the "shell id" between tasks there is perhaps a tiny performance gain, but each "command" still spins up a new powershell.exe process, which seems to be the heaviest part. This is the same behavior as when invoking multiple I'll conclude for now that there's nothing to gain from just persisting the Another thing I'm noticing when comparing commands from a "native" powershell client (with or without a session) is that powershell.exe is actually never started when a native client performs an operation - everything seems to be done thru a process called |
a new connection plugin is created on every task so for WinRM it will send at minimum these messages
If we were to share the connection plugin we could potentially eliminate the first and last message but the remaining will still be there and slow. We could use an agent based process to cache modules and keep PowerShell up and running in a local RunspacePool but this won't happen anytime soon (if it ever does).
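To make the module-caching idea a bit more concrete, here is a rough sketch of how an agent could avoid receiving the same module source on every task. Everything in it (`ModuleCache`, `bytes_to_send`, keying on a SHA-256 digest) is a made-up illustration, not anything that exists in Ansible today.

```python
import hashlib


class ModuleCache:
    """Hypothetical agent-side cache: module source keyed by its SHA-256 digest."""

    def __init__(self):
        self._modules = {}

    @staticmethod
    def digest(source: str) -> str:
        return hashlib.sha256(source.encode("utf-8")).hexdigest()

    def has(self, digest: str) -> bool:
        return digest in self._modules

    def store(self, source: str) -> str:
        key = self.digest(source)
        self._modules[key] = source
        return key


def bytes_to_send(cache: ModuleCache, source: str) -> int:
    """Return how many payload bytes the controller would transmit for one task."""
    key = ModuleCache.digest(source)
    if cache.has(key):
        # The agent already holds this module, so only the digest travels.
        return len(key)
    # First use: send the digest plus the full source and remember it.
    cache.store(source)
    return len(source.encode("utf-8")) + len(key)


cache = ModuleCache()
module_source = "function Get-Thing { 'x' }\n" * 1000  # stand-in for a real module
first = bytes_to_send(cache, module_source)
second = bytes_to_send(cache, module_source)
print(first, second)
```

The second call only costs the 64-character digest, which is the kind of saving a cache like this would be after. It does nothing about the WinRM messages themselves.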
I agree, the original
It also has the added benefit of having PowerShell up and running so the step to create the pipeline is just a matter of it creating a new thread and not a whole new PowerShell process. Right now it is slightly faster than winrm due to this behaviour but we could potentially eke out some more performance. I started looking at using The |
Nice, thanks for the input. For us the cpu usage on targets is actually a bigger problem than speed. We try to run t3.micro for nonprod unless there's a reason to use bigger instances, and they spike hard when we run Ansible against them. So even without the performance (as in speed) gains, a persistent runspace would be enough for us to consider switching connection plugins - at least it's something we'd be more than willing to test. |
oh and btw: My sincerest respect to you for diving into this. I spent 30 mins investigating winrm soap messages and had to stop because I was feeling dizzy ;-) |
@trondhindenes have you tried
Start looking into PSRP and you come across XML embedded in XML, does your head in :P. I'm just happy that it isn't ASN.1 encoded data, had to deal with that for CredSSP auth and that's enough for me. |
will try! That said, I think real perf gains will only be achieved by a long-living session - which is not implemented by psrp today I guess. Anyways, I'll try it! |
Just wanted to share a little bit more on this one: cpu-wise tho, it's pretty much the same. Which leads me to think that the way Ansible sends in commands to a Windows node might be inefficient - as far as I can see, a simple win_service task sends in a string of 63450 characters - altho the win_service module itself is only 20k characters. I'm guessing that Ansible will construct and execute the "global function" set upon each task execution (since there's no state to store the functions if using regular winrm), and that this eats up a lot of cpu. We're essentially re-importing the same set of functions again and again for each task that gets executed on a node. Also noticing that the actual module seems to be encoded and sent in a module_wrapper object which I'm guessing gets decoded and executed. I don't know if that process has been perf-tested, but something tells me this might not be the most efficient (cpu-wise) way of sending a few lines of PS over to a node. Even on a 2-cpu vm on my fairly new laptop I can see a very noticeable cpu spike when Ansible does its thing - far more than a simple "Get-Service" call should cause (imho). |
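Putting numbers on the payload sizes quoted above (the "20k" is approximate, so treat the result as a ballpark):

```python
payload_chars = 63450  # string sent for a simple win_service task (figure from this comment)
module_chars = 20000   # "20k" for the win_service module itself; approximate

# Everything that is not the module: wrapper, shared functions, and so on.
wrapper_chars = payload_chars - module_chars
wrapper_fraction = wrapper_chars / payload_chars

print(f"{wrapper_chars} of {payload_chars} characters ({wrapper_fraction:.0%}) are not the module itself")
```

So roughly two thirds of what gets sent per task is not the module at all. Whether that extra text is what drives the cpu spike is still a guess.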
Yep that's pretty much what I expected and something we are looking to do with persistent connections in the future. Using psrp should mean we don't have to worry about managing and securing our own sockets but may take some hits on the WinRM payloads. Having a persistent runspace means we can;
|
Yup. One area that probably needs fine-tuning is environment variable manipulation. This is easy in Ansible today, since envvars get "reloaded" on every task, but with a long-living session of any kind there would probably have to be some special sauce that took care of reloading the session upon changes in envvars. Shouldn't be too bad tho. In any case, I welcome this. As we're scaling out our Windows estate and trying to keep instance sizes as small as possible, the current limitations in how winrm is used (both the slow execution and high cpu usage) are really painful for us. |
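A toy sketch of the envvar "special sauce", assuming the session can cheaply re-read the machine-level environment before each task. `PersistentSession` and `read_machine_env` are hypothetical names, and a real implementation would have to read the environment on the Windows side rather than from a Python dict.

```python
class PersistentSession:
    """Toy stand-in for a long-lived remote session that caches environment variables."""

    def __init__(self, read_machine_env):
        # read_machine_env: callable returning the current machine-level environment
        self._read_machine_env = read_machine_env
        self.env = dict(read_machine_env())
        self.reloads = 0

    def refresh_env_if_changed(self):
        current = dict(self._read_machine_env())
        if current != self.env:
            # Something (for example an earlier task) changed the environment.
            self.env = current
            self.reloads += 1

    def run(self, task):
        # Check for stale variables before every task, then run it.
        self.refresh_env_if_changed()
        return task(self.env)


machine_env = {"PATH": r"C:\Windows"}
session = PersistentSession(lambda: machine_env)

before = session.run(lambda env: env["PATH"])
machine_env["PATH"] = r"C:\Windows;C:\tools"  # simulate a task that edits PATH
after = session.run(lambda env: env["PATH"])
print(before, after, session.reloads)
```

The point is only that the check can happen per task without tearing the session down; how expensive the comparison is on a real node is an open question.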
btw results from my testing: regular winrm psrp: trond's tcp hack thingy: |
Agreed, we definitely have a few ideas as to how to go about this. It's the next major goal in our minds for improving the performance and this is the area we are targeting. |
awesome, really good news. I'm fine with closing this if you want, please let me know if there's any testing or other type of input I can be of assistance with. |
Let's keep it open, I don't think we have an issue already for this work so good to have something there. Thanks for your investigations so far, it's been really helpful. |
great! Will keep an eye on it! |
I am happy to test too when there is something to try out. |
@dagwieers - I thought you might be interested in this thread too. |
8 months down the line, has anything come of this? |
I would love to see some movement here, we desperately need a more efficient way of communicating with Windows from Ansible. |
Any news about this issue? Ansible on Windows is much slower than on Linux. We don't know what to do to improve our timings on Windows. |
IMHO the first thing that needs to be addressed is the jit compilations that occur on every task. From what I understand it should be possible to do a lot of optimizations there without affecting the overall connection architecture. Maybe @jborean93 has some thoughts? |
I think Kerberos accounts for part of the time wasted on Windows nodes too. |
@doyl54 I haven't seen any indication of that. |
Actually this is my main issue: I've noticed that every action Ansible performs takes a long time while it asks Kerberos whether it has the rights to do it. (I'm using Kerberos authentication to connect to my nodes.) |
This isn't lost on us, there's just no real way of implementing this in a common way and not just as a hack for Windows. There are tentative plans to start implementing those ideas for a future Ansible version, but right now a lot of our development focus for 2.10 is on enabling collections support in Ansible. |
I've been working on a side project that will enable Ansible to work against a JEA endpoint, but it works just as well without JEA so it has a fair amount of overlap with this kind of thing (many of the things @jborean93 mentioned like caching the modules, sending fewer WinRM messages, possibility of pre-compiling the C# utils). I've got working prototypes, just not anything that's quite in a state to share. The split to collections kind of put a wrench in some of the ideas for how to version it and keep it aligned so I'm still waiting to see how a lot of that plays out. |
@trondhindenes have you seen https://docs.ansible.com/ansible/latest/user_guide/windows_performance.html ? I contributed that a little while ago. We do this inside of our cloud instance's startup script (and google Windows instances have it out of the box since GoogleCloudPlatform/compute-image-windows#174). |
While this is a feature we'd like to implement, we have no plans to do so in the near future. |
SUMMARY
Ansible currently uses "native" winrm. This means that each task Ansible sends to a windows node results in the target node spinning up a new process/runspace to host the command(s) invoked by Ansible. This is very slow.
Although it might be considered breaking the agentless nature of Ansible, it would be good to have an option where Ansible would self-install an ephemeral process that could accept Ansible tasks without spinning up a new process/runspace all the time. This process could self-destruct after a given time of inactivity.
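As a minimal sketch of the "self-destruct after inactivity" part only: the listener records when it last handled a task and reports that it should exit once the idle timeout has passed. `IdleShutdownListener` and the 300-second timeout are assumptions for illustration; transport, authentication and the actual task execution are left out entirely.

```python
import time


class IdleShutdownListener:
    """Toy listener that reports it should exit after `idle_timeout` seconds without a task."""

    def __init__(self, idle_timeout, clock=time.monotonic):
        self.idle_timeout = idle_timeout
        self._clock = clock
        self._last_activity = clock()
        self.handled = 0

    def handle(self, task):
        # Any incoming task resets the idle timer.
        self._last_activity = self._clock()
        self.handled += 1
        return task()

    def should_exit(self):
        return self._clock() - self._last_activity >= self.idle_timeout


# A fake clock keeps the example instant; a real process would use time.monotonic.
now = [0.0]
listener = IdleShutdownListener(idle_timeout=300, clock=lambda: now[0])

result = listener.handle(lambda: "ok")
now[0] = 299.0
alive_before_timeout = not listener.should_exit()
now[0] = 300.0
expired_at_timeout = listener.should_exit()
print(result, alive_before_timeout, expired_at_timeout)
```

A real process would poll `should_exit()` in its main loop and clean up after itself when it returns true.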
ISSUE TYPE
COMPONENT NAME
winrm
ADDITIONAL INFORMATION
Just an idea, I don't have any more information.