Agent install tests failing on main #277

Open
dliappis opened this issue Jun 20, 2024 · 6 comments
@dliappis
Contributor

Seen in https://buildkite.com/elastic/elastic-stack-installers/builds/5429#01903577-fb5d-45f5-8f2f-2ca2bf7a942e via an unrelated PR: #276

Error:

{"log.level":"info","@timestamp":"2024-06-20T12:08:03.107Z","log.origin":{"file.name":"cmd/enroll_cmd.go","file.line":518},"message":"Starting enrollment to URL: https://placeholder:443/","log":{"source":"elastic-agent"},"ecs.version":"1.6.0"}
   [-] Can be installed and uninstalled via MSI, with installargs, in Fleet mode 68.88s (68.87s|9ms)
    [0] Expected $true, but got $false.
    at Has-AgentFleetEnrollmentAttempt | Should -BeTrue, C:\buildkite-agent\builds\bk-agent-prod-gcp-1718883984823248883\elastic\elastic-stack-installers\src\agent-qa\msi.tests.ps1:53
    at <ScriptBlock>, C:\buildkite-agent\builds\bk-agent-prod-gcp-1718883984823248883\elastic\elastic-stack-installers\src\agent-qa\msi.tests.ps1:53
    at Assert-AgentHealthy, C:\buildkite-agent\builds\bk-agent-prod-gcp-1718883984823248883\elastic\elastic-stack-installers\src\agent-qa\msi.tests.ps1:119
    at <ScriptBlock>, C:\buildkite-agent\builds\bk-agent-prod-gcp-1718883984823248883\elastic\elastic-stack-installers\src\agent-qa\msi.tests.ps1:126
    [1] Expected $false, because The agent should have been cleaned up already, but got $true.
    at Is-AgentBinaryPresent | Should -BeFalse -Because "The agent should have been cleaned up already", C:\buildkite-agent\builds\bk-agent-prod-gcp-1718883984823248883\elastic\elastic-stack-installers\src\agent-qa\msi.tests.ps1:72
    at Check-AgentRemnants, C:\buildkite-agent\builds\bk-agent-prod-gcp-1718883984823248883\elastic\elastic-stack-installers\src\agent-qa\msi.tests.ps1:72
    at <ScriptBlock>, C:\buildkite-agent\builds\bk-agent-prod-gcp-1718883984823248883\elastic\elastic-stack-installers\src\agent-qa\msi.tests.ps1:109
WARNING: Found Agent remnants between tests. Removing Agent.
Open file may block uninstall \Device\HarddiskVolume3\Program Files\Elastic\Agent\data\elastic-agent-8.15.0-SNAPSHOT-33598b\logs\elastic-agent-20240620.ndjson with PID 840 opened by elastic-agent
VERBOSE: Invoking msiexec.exe /x {E550A894-5C44-5BEF-9967-A2CD4B022161} /qn  /l*v "C:\buildkite-agent\builds\bk-agent-prod-gcp-1718883984823248883\elastic\elastic-stack-installers\src\agent-qa\logs\20240620-120807-x.log"

Link to CI logs: https://buildkite.com/elastic/elastic-stack-installers/builds/5429#01903577-fb5d-45f5-8f2f-2ca2bf7a942e/3842-4533

@dliappis dliappis added the test label (Everything about testing the build runner, package compiler and packages themselves) on Jun 20, 2024
@strawgate
Contributor

@cmacknz this looks like the agent retries Fleet enrollment perpetually, which stops it from responding to the service manager?

@strawgate
Contributor

strawgate commented Jun 20, 2024

Perhaps related to elastic/elastic-agent#4800, which was backported to 8.14.

@cmacknz
Member

cmacknz commented Jun 20, 2024

Yes, that would be because of elastic/elastic-agent#4727, if we are using --delay-enroll.

The actual root cause is that the enrollment is failing.

@cmacknz
Member

cmacknz commented Jun 20, 2024

We are trying to enroll with a non-existent host?

{"log.level":"error","@timestamp":"2024-06-20T13:05:10.936Z","log.origin":{"file.name":"cmd/run.go","file.line":565},"message":"failed to perform delayed enrollment (will try again): fail to enroll: fail to execute request to fleet-server: lookup placeholder: no such host","log":{"source":"elastic-agent"},"ecs.version":"1.6.0"}
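For anyone triaging similar runs, here is a minimal Go sketch that checks whether an agent ndjson log line is an error caused by an unresolvable host, as in the line above. The `log.level` and `message` field names come from the quoted log; the `logEntry` type and the `isUnresolvableHostError` helper are hypothetical names for illustration and are not part of elastic-agent or the test suite.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// logEntry holds the two fields of interest from one agent ndjson log line.
// The JSON keys match the log lines quoted in this issue.
type logEntry struct {
	Level   string `json:"log.level"`
	Message string `json:"message"`
}

// isUnresolvableHostError (hypothetical helper) reports whether line is an
// error-level entry whose message contains a "no such host" lookup failure.
func isUnresolvableHostError(line string) (bool, error) {
	var e logEntry
	if err := json.Unmarshal([]byte(line), &e); err != nil {
		return false, fmt.Errorf("parse log line: %w", err)
	}
	return e.Level == "error" && strings.Contains(e.Message, "no such host"), nil
}

func main() {
	// Shortened copy of the log line quoted above.
	line := `{"log.level":"error","message":"failed to perform delayed enrollment (will try again): fail to enroll: fail to execute request to fleet-server: lookup placeholder: no such host"}`
	matched, err := isUnresolvableHostError(line)
	fmt.Println(matched, err)
}
```

Matching on the message substring is deliberately crude; it is only meant to separate "the host does not resolve" from other enrollment failures when scanning a log.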

@strawgate
Contributor

strawgate commented Jun 20, 2024

> We are trying to enroll with a non-existent host?
>
> {"log.level":"error","@timestamp":"2024-06-20T13:05:10.936Z","log.origin":{"file.name":"cmd/run.go","file.line":565},"message":"failed to perform delayed enrollment (will try again): fail to enroll: fail to execute request to fleet-server: lookup placeholder: no such host","log":{"source":"elastic-agent"},"ecs.version":"1.6.0"}

Yes, this should not prevent stopping or removal of Elastic Agent as this will be a common scenario post-deployment.

I cannot stop the service or uninstall the agent, via either the MSI or the elastic-agent binary, once the service has started and enrollment is still retrying.

I believe this is an agent bug, not related to the MSI.


dliappis pushed a commit that referenced this issue Jul 15, 2024
…#289)

Backport of #287

This PR adds an alternate step in elastic-stack-installers that the Independent Agent Release will use to build only the Elastic Agent Windows installer (.msi).

The new pipeline step uses a new script that does the following:

Retrieves the MSI artifacts from the Build step
Moves the MSI artifacts to a new directory
Creates a .sha512 file for the MSI file
Saves the MSI and .sha512 artifacts using Buildkite's built-in upload functionality
Sets metadata in the calling pipeline if the TRIGGER_JOB_ID is set
The calling pipeline will then use this metadata to download the saved artifacts

Merged, bypassing checks due to #277