Windows pods are in crashloopback #86

Closed
knabben opened this issue Aug 24, 2021 · 12 comments

Comments

@knabben
Member

knabben commented Aug 24, 2021

Investigate why the Windows pods are in CrashLoopBackOff; logs attached:

vagrant@controlplane:~$ kubectl logs windows-server-iis-7985c648cc-gmtq9 

Success Restart Needed Exit Code      Feature Result                           
------- -------------- ---------      --------------                           
True    No             Success        {Common HTTP Features, Default Documen...
Invoke-WebRequest : The remote name could not be resolved: 
'dotnetbinaries.blob.core.windows.net'
At line:1 char:32
+ ... Web-Server; Invoke-WebRequest -UseBasicParsing -Uri 'https://dotnetbi ...
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (System.Net.HttpWebRequest:Htt 
   pWebRequest) [Invoke-WebRequest], WebException
    + FullyQualifiedErrorId : WebCmdletWebResponseException,Microsoft.PowerShe 
   ll.Commands.InvokeWebRequestCommand
 
C:\ServiceMonitor.exe : The term 'C:\ServiceMonitor.exe' is not recognized as 
the name of a cmdlet, function, script file, or operable program. Check the 
spelling of the name, or if a path was included, verify that the path is 
correct and try again.
At line:1 char:311
+ ... ml>' > C:\inetpub\wwwroot\default.html; C:\ServiceMonitor.exe 'w3svc' ...
+                                             ~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (C:\ServiceMonitor.exe:String) [ 
   ], CommandNotFoundException
    + FullyQualifiedErrorId : CommandNotFoundException

from Jay:

I guess it's failing because it makes a DNS request that can't be resolved, probably because that private network doesn't have public DNS. That's fine; maybe we can just remove the Invoke-WebRequest call from the IIS pod.
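
To confirm that this is a name-resolution problem and not something else, a small check along these lines could be run wherever an interpreter is available (for example on the node, or in a debug container on the same network). This is only a sketch: the function name `can_resolve` is made up for illustration, and it assumes Python is present, which is not the case in the IIS image itself.

```python
import socket


def can_resolve(hostname: str, port: int = 443) -> bool:
    """Return True if the local resolver yields at least one address for hostname."""
    try:
        return len(socket.getaddrinfo(hostname, port)) > 0
    except socket.gaierror:
        # Name resolution failed, the same class of failure as
        # "The remote name could not be resolved" in the pod log.
        return False
```

Calling `can_resolve("dotnetbinaries.blob.core.windows.net")` from two places (node and pod network) and comparing the answers would show whether only the pod side lacks working DNS.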

@jayunit100
Contributor

This is probably my fault; I copy/pasted the wrong Windows YAML from the internet :)

@jayunit100
Contributor

@knabben let's get rid of this ServiceMonitor nonsense?

@jayunit100
Contributor

jayunit100 commented Aug 24, 2021

Let's use this pattern, says @jsturtevant, but DON'T USE THIS IMAGE; use IIS instead?
https://github.com/jsturtevant/windows-k8s-playground/blob/master/deployments/iis/iis-prewarm.yaml

Maybe?

@knabben
Member Author

knabben commented Aug 31, 2021

/assign

@johnSchnake
Contributor

@knabben is there something I can do to help on this? I was just about to report this same issue as I was working on the sonobuoy integration.

@knabben
Member Author

knabben commented Sep 2, 2021

@johnSchnake I'm trying a few images without ServiceMonitor.exe. If you have a more stable one, feel free to replace the one we have in the YAML.

@johnSchnake
Contributor

johnSchnake commented Sep 3, 2021

So I'm trying to just "fix" the existing code and found 2 things of note:

    1. The version of ServiceMonitor that was specified did not exist on the downloads page for some reason (https://github.com/microsoft/IIS.ServiceMonitor/releases)
    2. After updating it to an existing release I still got the same error. However, from the NODE it worked, so I think something is wrong with the networking on the pod.
 Invoke-WebRequest -UseBasicParsing -Uri 'https://dotnetbinaries.blob.core.windows.net/servicemonitor/2.0.1.9/ServiceMonitor.exe'
...
StatusCode        : 200
StatusDescription : OK

Maybe y'all already knew that was the case, but it seemed like progress to me. It seems to me that if we cut out ServiceMonitor, the IIS server may start up, but all the Windows nodes may still be effectively broken.
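
The two findings above (a release that does not exist versus a name that does not resolve) fail in different ways, and it can help to tell them apart explicitly. Below is a rough sketch of that distinction; `classify_download` and its return labels are hypothetical names, and the `opener` parameter exists only so the function can be exercised without network access.

```python
import socket
import urllib.error
import urllib.request


def classify_download(url, opener=urllib.request.urlopen):
    """Classify a download attempt as ok, http-error, dns-failure or network-failure."""
    try:
        with opener(url, timeout=10) as response:
            return "ok" if response.status == 200 else "http-error"
    except urllib.error.HTTPError:
        # The server answered, so DNS and routing work; the URL itself is the problem.
        return "http-error"
    except urllib.error.URLError as err:
        # No HTTP answer at all: check whether name resolution was the cause.
        if isinstance(err.reason, socket.gaierror):
            return "dns-failure"
        return "network-failure"
```

Getting `"ok"` from the node and `"dns-failure"` from the pod network for the same URL would match what is described here: the URL is fine and the pod networking is the part that is broken.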

@johnSchnake
Contributor

https://stackoverflow.com/questions/60885492/cant-access-internet-from-within-windows-gke-pod says that MAC spoofing needs to be enabled, but I'm out of my depth here and unsure whether this is something to be done at the VirtualBox, Vagrant, or Windows level.

Am I off base here or does this seem like a trail to follow?

@knabben
Member Author

knabben commented Sep 4, 2021

We agreed in the last sig-windows pairing session to change the image here to the official IIS one, instead of changing the command in the current YAML.

That's probably a better approach, IMO. We could go with: https://hub.docker.com/_/microsoft-windows-servercore-iis
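
For illustration, here is roughly what a Deployment using an IIS image with no command override could look like, generated as JSON (which `kubectl apply -f` also accepts). Treat it as a sketch under assumptions: the image reference and tag in `ASSUMED_IMAGE` are my guess at the official image and must match the Windows version of the node, the `app` label and container name are invented, and the Deployment name is inferred from the pod name in the logs above.

```python
import json


def iis_deployment(image, name="windows-server-iis", replicas=1):
    """Build a Deployment manifest that runs the image with its default entrypoint."""
    labels = {"app": name}  # hypothetical label, chosen for this sketch
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Schedule only onto Windows nodes.
                    "nodeSelector": {"kubernetes.io/os": "windows"},
                    "containers": [
                        {
                            "name": "iis",
                            "image": image,
                            # No "command": nothing is downloaded at container start.
                            "ports": [{"containerPort": 80}],
                        }
                    ],
                },
            },
        },
    }


# Assumption: adjust the tag to the Windows Server version of the node.
ASSUMED_IMAGE = "mcr.microsoft.com/windows/servercore/iis:windowsservercore-ltsc2019"
manifest_json = json.dumps(iis_deployment(ASSUMED_IMAGE), indent=2)
```

Because the container no longer runs `Invoke-WebRequest` on startup, it does not depend on the pod being able to resolve external names, although that does not fix the underlying pod networking.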

johnSchnake added a commit to johnSchnake/sig-windows-dev-tools that referenced this issue Sep 30, 2021
There was a problem grabbing the service monitor and running it.
In that ticket it was suggested to change the image. This seems
to fix the issue.

Fixes kubernetes-sigs#86

Signed-off-by: John Schnake <jschnake@vmware.com>
@johnSchnake
Contributor

I think this issue is effectively resolved, and we've moved on to other issues related to Windows networking. I think this can be closed, right?

@knabben
Member Author

knabben commented Oct 7, 2021

yes, we changed the pod spec.

/close

@k8s-ci-robot
Contributor

@knabben: Closing this issue.

In response to this:

yes, we changed the pod spec.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
