WinRMTransport Kerberos authentication error but logged as success in windows logs #15669
Comments
This feels like an environmental issue. I have a few suggestions of things that might be worth investigating. Check that your Ansible machine's clock is synchronized with your domain controllers; I have seen random failures when the clocks have drifted. I'd also examine the WinRM configuration on the target Windows box to see if any limits or timeouts might be affecting things. Oh, and maybe the firewall configurations on both ends. What version of Windows are you targeting?
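For anyone wanting to script the clock check suggested above, here is a minimal sketch. It assumes you have already obtained the two timestamps somehow (how you read the domain controller's time is left out), and it uses five minutes as the tolerance because that is the usual Kerberos default; your domain policy may differ. The function name is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Usual Kerberos default tolerance; a domain policy may set another value.
MAX_SKEW = timedelta(minutes=5)


def within_kerberos_skew(local, remote, max_skew=MAX_SKEW):
    """Return True if two timezone-aware timestamps differ by at most max_skew."""
    return abs(local - remote) <= max_skew


# Hypothetical readings from the control machine and a domain controller.
control = datetime(2016, 5, 1, 12, 0, 0, tzinfo=timezone.utc)
dc = datetime(2016, 5, 1, 12, 4, 0, tzinfo=timezone.utc)
print(within_kerberos_skew(control, dc))
```

A drift well under the tolerance, as reported later in this thread, would make clock skew an unlikely cause.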
The times are in sync and using the same NTP source. The firewalls are open on both ends. The error is sporadic: sometimes it works, sometimes it fails. There is nothing of interest logged for WinRM on the Windows side, and I haven't found any limits or timeouts (yet). Most of our Windows targets are Windows 2008R2, including this one. I messed up the formatting in the issue template, let me go fix that.
Ok, I have a couple of other ideas. Is the S2008R2 box fully up to date with Windows Updates? There was a bug in WMF 3.0 when S2008R2 was first released which messed up the memory allocation. There is a hotfix for it, although my preferred fix was to update to Windows Management Framework 4.0. My other ideas boil down to health-checking the hardware itself: checking network cables, hard disk errors, temperature sensors. Other than that, monitoring the CPU and memory usage might at least establish whether anything else is going on at the time of the failures. Hope this helps.
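As a rough aid for the WMF check, the sketch below classifies a host from its PowerShell version string, relying on the fact that WMF 3.0 ships PowerShell 3.0 and WMF 4.0 ships PowerShell 4.0. It assumes you have already collected the output of `$PSVersionTable.PSVersion.ToString()` from each target; the helper name and the sample host names are hypothetical.

```python
def is_wmf3(ps_version):
    """Return True when a PowerShell version string such as '3.0' indicates WMF 3.0."""
    major = int(ps_version.strip().split(".")[0])
    return major == 3


# Hypothetical inventory of version strings gathered from the targets.
reported = {"host-a": "3.0", "host-b": "4.0", "host-c": "2.0"}
needs_review = sorted(name for name, ver in reported.items() if is_wmf3(ver))
print(needs_review)
```

Hosts flagged this way are the ones where the hotfix or an upgrade to WMF 4.0 would be worth looking at.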
I found a memory/CPU issue with one of our hosts, and the Ansible debug run noted the out-of-memory condition. However, since this issue is affecting 12 other Windows servers (since I started tracking it), let's keep this open and see what else I can find. I'll check the version of WMF on the servers in question, and look for the hotfix mentioned if they are running 3.0. Thanks for the suggestion.
Just going through old Windows-related issues. @MichaelBaydoun, were any of the affected hosts running WMF 3.0? needs_info
@jhawkesworth yes, at least some of the hosts were running WMF 3.0. However, we haven't seen this issue in a long time. Currently running Ansible 2.1.4.0. Going to close this as it's no longer an issue for us.
There are a few WinRMTransport issues that are possibly related to this one, but I can't be sure. I'm hoping the debug details in this issue will help solve some of the outstanding WinRMTransport issues.
ISSUE TYPE
ANSIBLE VERSION
CONFIGURATION
[defaults]
callback_whitelist = profile_tasks,timer
forks = 50
host_key_checking = false
inventory = ~/ansible/hosts-dev
max_fail_percentage = 1
pattern = NONE
retry_files_enabled = false
transport = ssh
OS / ENVIRONMENT
From: Red Hat Enterprise Linux Server release 6.7 (Santiago)
To: Windows 2008R2
SUMMARY
Sporadically, but frequently, a WinRM connection is attempted, and 62 seconds later a failure is logged. However, the Windows logs show the login was successful.
STEPS TO REPRODUCE
Unable to reproduce in any reliable way. We see the problem frequently and caught some details during a full debug run.
From the debug run, the setup module is triggering a WinRM connection attempt
I believe the first number, 9020, is unique to this server and attempt. The next 9020 debug entries occur 62 seconds later
It's classified as a Kerberos-based authentication error; however, the Windows security log shows an immediate successful login
which a second later gets special privileges
followed by an immediate successful logout
I looked at the pywinrm code and the timeout is 3600 seconds, so I don't think that's the problem.
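To make the 62-second pause easier to spot across many hosts, the gap between consecutive debug entries from the same worker can be computed with a small script. This is only a sketch: it assumes each debug line starts with a process ID, then an epoch timestamp, then a colon, and the sample lines below are invented for illustration rather than taken from the actual run.

```python
import re
from collections import defaultdict

# Assumed line shape: "<pid> <epoch seconds>: <message>"
DEBUG_LINE = re.compile(r"^\s*(\d+)\s+(\d+\.\d+):\s+(.*)$")


def largest_gaps(lines):
    """Map each pid to the largest pause (in seconds) between its consecutive entries."""
    stamps = defaultdict(list)
    for line in lines:
        match = DEBUG_LINE.match(line)
        if match:  # lines that do not fit the assumed shape are skipped
            stamps[int(match.group(1))].append(float(match.group(2)))
    gaps = {}
    for pid, times in stamps.items():
        times.sort()
        gaps[pid] = max((b - a for a, b in zip(times, times[1:])), default=0.0)
    return gaps


# Hypothetical debug lines, not copied from the real output.
sample = [
    "  9020 1000.00000: hypothetical entry before the connection attempt",
    "  9021 1000.50000: hypothetical entry from another worker",
    "  9020 1062.00000: hypothetical entry when the failure is reported",
]
print(largest_gaps(sample))
```

Workers whose largest gap sits near the same value on every failure would point toward a fixed timeout somewhere rather than random network trouble.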
EXPECTED RESULTS
I expected to see a successful login, followed by the transfer of the setup module to the Windows server, followed by its execution
ACTUAL RESULTS
Actual results with ANSIBLE_DEBUG=yes and -vvvv are posted above