Windows Service fails due to timeout issue. #1467

Closed
phillipsj opened this issue Jul 27, 2021 · 2 comments

@phillipsj
Contributor

Environmental Info:
RKE2 Version:

rke2.exe version v1.21.3+rke2r1 (2ed0b0d)
go version go1.16.6b7

Node(s) CPU architecture, OS, and Version:

WindowsBuildLabEx                                       : 19041.1.amd64fre.vb_release.191206-1406
WindowsCurrentVersion                                   : 6.3
WindowsEditionId                                        : ServerStandardACor
WindowsInstallationType                                 : Server Core
WindowsInstallDateFromRegistry                          : 7/26/2021 2:10:32 PM
WindowsProductId                                        : 00431-30000-00000-AA117
WindowsProductName                                      : Windows Server Standard

Cluster Configuration:
1 server
1 windows node

Describe the bug:
Running the agent as a Windows service fails with a timeout error, while running the agent directly (not as a service) works.

rke2.exe agent service --add

Steps To Reproduce:
Install RKE2 for Windows and try to run the agent as a Windows service.

Expected behavior:
rke2 starts up as a Windows service and joins the cluster.

Actual behavior:
rke2 starts then fails.

Additional context / logs:

@rancher-max
Contributor

I'm still seeing this fail on v1.21.3-rc1+rke2r2, but it's no longer a timeout. Details below:

Windows Version:

WindowsBuildLabEx                                       : 17763.1.amd64fre.rs5_release.180914-1434
WindowsCurrentVersion                                   : 6.3
WindowsEditionId                                        : ServerDatacenter
WindowsInstallationType                                 : Server
WindowsInstallDateFromRegistry                          : 7/24/2021 6:08:30 AM
WindowsProductId                                        : 00430-00000-00000-AA230
WindowsProductName                                      : Windows Server 2019 Datacenter

Cluster Info:
1 server (ubuntu) running calico cni with strictaffinity

Windows Node Steps:

$ProgressPreference = 'SilentlyContinue'
New-Item -Type Directory c:\usr\local\bin -Force
Invoke-WebRequest -UseBasicParsing https://raw.githubusercontent.com/rancher/rke2/master/install.ps1 -OutFile c:\usr\local\bin\install-rke2.ps1
c:\usr\local\bin\install-rke2.ps1 -Version v1.21.3-rc1+rke2r2
$env:PATH+=";C:\usr\local\bin;C:\var\lib\rancher\rke2\bin"
New-Item -Type Directory c:/etc/rancher/rke2 -Force
notepad c:/etc/rancher/rke2/config.yaml

server: https://<server ip>:9345
token: testsecret

rke2.exe agent service --add

Results from Above:

PS C:\Users\Administrator> Get-Service rke2

Status   Name               DisplayName
------   ----               -----------
Stopped  rke2               rke2

I also noticed there were no rke2 event logs, so I ran:

PS C:\Users\Administrator> Start-Service rke2
PS C:\Users\Administrator> Get-Service rke2

Status   Name               DisplayName
------   ----               -----------
Running  rke2               rke2

...

PS C:\Users\Administrator> Get-Service rke2

Status   Name               DisplayName
------   ----               -----------
Stopped  rke2               rke2

Now the event logs appear to show the correct information, but they stop after running containerd.

time="2021-08-03T19:11:26Z" level=info msg="Pulling runtime image index.docker.io/rancher/rke2-runtime:v1.21.3-rc1-rke2r2"
time="2021-08-03T19:11:29Z" level=info msg="Extracting file bin/calico-ipam.exe to C:\\var\\lib\\rancher\\rke2\\data\\v1.21.3-rc1-rke2r2-56cc29082a90\\bin\\calico-ipam.exe"
...
time="2021-08-03T19:11:37Z" level=info msg="Extracting file bin/wins.exe to C:\\var\\lib\\rancher\\rke2\\data\\v1.21.3-rc1-rke2r2-56cc29082a90\\bin\\wins.exe"
time="2021-08-03T19:11:38Z" level=info msg="Okay, exiting setup."
time="2021-08-03T19:11:38Z" level=info msg="Logging containerd to C:\\var\\lib\\rancher\\rke2\\agent\\containerd\\containerd.log"
time="2021-08-03T19:11:38Z" level=info msg="Running containerd -c C:\\var\\lib\\rancher\\rke2\\agent\\etc\\containerd\\config.toml"

I removed some of the logs for brevity, but that last line is the last event log entry for rke2. The service tries to restart itself and always ends up with the same result in the logs.
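
For anyone reproducing this, the following is a rough sketch of how the service's event log entries and the containerd log could be inspected. It assumes Windows PowerShell 5.1 (where Get-EventLog is available) and that the service writes to the Application log under the source name 'rke2'; the source name is an assumption, while the containerd log path is taken from the log output above.

```powershell
# Assumption: the service registers its events under the source name 'rke2'.
Get-EventLog -LogName Application -Source 'rke2' -Newest 20 |
    Sort-Object TimeGenerated |
    Format-List TimeGenerated, EntryType, Message

# Path taken from the "Logging containerd to ..." line in the log above.
Get-Content 'C:\var\lib\rancher\rke2\agent\containerd\containerd.log' -Tail 50
```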

@rancher-max
Contributor

This works when I follow the steps in the docs exactly. The problem I was having was that PATH was not set in my environment:

[Environment]::SetEnvironmentVariable(
    "Path",
    [Environment]::GetEnvironmentVariable("Path", [EnvironmentVariableTarget]::Machine) + ";c:\var\lib\rancher\rke2\bin;c:\usr\local\bin",
    [EnvironmentVariableTarget]::Machine)

The docs are correct and this is working using v1.21.3-rc4+rke2r2.
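
A minimal sketch for checking whether the machine-level PATH already contains the two directories from the command above. A Windows service does not see changes made to $env:PATH in an interactive session, which is likely why the session-only PATH update in the earlier steps was not enough. The function name Get-MissingPathEntry is hypothetical and not part of RKE2.

```powershell
# Hypothetical helper (not part of RKE2): returns the required directories
# that are absent from a PATH-style string.
function Get-MissingPathEntry {
    param(
        [string]$PathValue,
        [string[]]$Required
    )
    # Split on ';', drop empty entries, and ignore trailing backslashes.
    $entries = @($PathValue -split ';' |
        Where-Object { $_ } |
        ForEach-Object { $_.TrimEnd('\') })
    # -notcontains compares strings case-insensitively by default.
    $Required | Where-Object { $entries -notcontains $_.TrimEnd('\') }
}

# Read the machine-level PATH, which is what services started later inherit.
$machinePath = [Environment]::GetEnvironmentVariable("Path", [EnvironmentVariableTarget]::Machine)
Get-MissingPathEntry -PathValue $machinePath -Required 'c:\var\lib\rancher\rke2\bin', 'c:\usr\local\bin'
```

If the last command prints nothing, both directories are already present at the machine level; any directory it prints still needs to be added before starting the service.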
