Amazon SSM agent fails to start #554

Closed
tormath1 opened this issue Jan 11, 2024 · 3 comments

Comments

@tormath1

tormath1 commented Jan 11, 2024

Hello,

I am a maintainer of Flatcar Container Linux, a Linux-based OS. We upgraded Amazon SSM Agent from 2.3.1319.0 to 3.2.985.0, and the agent now fails to start, which impacts Flatcar users on AWS:

Initializing new seelog logger
New Seelog Logger Creation Complete
1704967520066534055 [Debug] Start File Watcher On: /etc/amazon/ssm/seelog.xml
1704967520066608958 [Debug] Start Watcher on directory: /etc/amazon/ssm
1704967520066663367 [Debug] [ssm-agent-worker] Current GoMaxProc value - 2
1704967520066714557 [Debug] [ssm-agent-worker] Checking if agent has OnPrem identity type
1704967520066728478 [Info] [ssm-agent-worker] Checking if agent identity type OnPrem can be assumed
1704967520066750635 [Warn] [ssm-agent-worker] failed to read runtime config 'identity_config.json': open /var/lib/amazon/ssm/runtimeconfig/identity_config.json: no such file or directory
1704967520066760431 [Debug] [ssm-agent-worker] Checking if agent has EC2 identity type
1704967520066765411 [Info] [ssm-agent-worker] Checking if agent identity type EC2 can be assumed
1704967520124509707 [Debug] [AuthRegisterService] Determining endpoint for service ssm in region us-west-2
1704967520124660050 [Debug] [EC2Identity] Determining endpoint for service ssm in region us-west-2
1704967520124684850 [Warn] [ssm-agent-worker] failed to read runtime config 'identity_config.json': open /var/lib/amazon/ssm/runtimeconfig/identity_config.json: no such file or directory
1704967520124695295 [Debug] [ssm-agent-worker] Checking if agent has CustomIdentity identity type
1704967520124701698 [Info] [ssm-agent-worker] Checking if agent identity type CustomIdentity can be assumed
1704967520124716273 [Warn] [ssm-agent-worker] failed to read runtime config 'identity_config.json': open /var/lib/amazon/ssm/runtimeconfig/identity_config.json: no such file or directory
1704967520124831803 [Error] [ssm-agent-worker] Agent failed to assume any identity
1704967520124845329 [Error] [ssm-agent-worker] failed to find identity, retrying: failed to find agent identity
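
For anyone triaging a similar log, here is a minimal sketch that keeps only the `Warn` and `Error` entries and makes the timestamps readable. It assumes the leading 19-digit number is nanoseconds since the Unix epoch, which is inferred from the values above rather than from SSM Agent documentation. The names `parse_line` and `problems` are hypothetical helpers, not part of the agent.

```python
import re
from datetime import datetime, timezone

# Matches lines such as:
#   1704967520124831803 [Error] [ssm-agent-worker] Agent failed to assume any identity
# Assumption: the leading integer is nanoseconds since the Unix epoch.
LINE_RE = re.compile(r"^\s*(\d{19}) \[(\w+)\] (?:\[([^\]]+)\] )?(.*)$")


def parse_line(line):
    """Return (utc_datetime, level, component, message), or None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    nanos, level, component, message = m.groups()
    # Integer division avoids float rounding on 19-digit values; sub-second precision is dropped.
    when = datetime.fromtimestamp(int(nanos) // 1_000_000_000, tz=timezone.utc)
    return when, level, component, message


def problems(lines, levels=("Warn", "Error")):
    """Keep only the parsed entries whose level is in `levels`."""
    parsed = (parse_line(line) for line in lines)
    return [p for p in parsed if p is not None and p[1] in levels]


# Three lines copied from the log above.
sample = [
    "New Seelog Logger Creation Complete",
    "1704967520066534055 [Debug] Start File Watcher On: /etc/amazon/ssm/seelog.xml",
    "1704967520124831803 [Error] [ssm-agent-worker] Agent failed to assume any identity",
]

for when, level, component, message in problems(sample):
    print(when.isoformat(), level, component, message)
```

Under that assumption, the entries above decode to 2024-01-11 10:05:20 UTC.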

The instance is started with a role that has the AmazonSSMManagedInstanceCore policy attached. I also tried Fleet Manager's Default Host Management Configuration with this role.

Running the diagnostic tool, I see this:

$ sudo ssm-cli get-diagnostics --output table
┌──────────────────────────────────────┬─────────┬─────────────────────────────────────────────────────────────────────────┐
│ Check                                │ Status  │ Note                                                                    │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────────┤
│ EC2 IMDS                             │ Success │ IMDS is accessible and has instance id i-12345 in region                │
│                                      │         │ us-west-2                                                               │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────────┤
│ Hybrid instance registration         │ Skipped │ Instance does not have hybrid registration                              │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────────┤
│ Connectivity to ssm endpoint         │ Success │ ssm.us-west-2.amazonaws.com is reachable                                │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────────┤
│ Connectivity to ec2messages endpoint │ Success │ ec2messages.us-west-2.amazonaws.com is reachable                        │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────────┤
│ Connectivity to ssmmessages endpoint │ Success │ ssmmessages.us-west-2.amazonaws.com is reachable                        │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────────┤
│ Connectivity to s3 endpoint          │ Success │ s3.us-west-2.amazonaws.com is reachable                                 │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────────┤
│ Connectivity to kms endpoint         │ Success │ kms.us-west-2.amazonaws.com is reachable                                │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────────┤
│ Connectivity to logs endpoint        │ Success │ logs.us-west-2.amazonaws.com is reachable                               │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────────┤
│ Connectivity to monitoring endpoint  │ Success │ monitoring.us-west-2.amazonaws.com is reachable                         │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────────┤
│ AWS Credentials                      │ Success │ Credentials are for                                                     │
│                                      │         │ arn:aws:sts::12345...                                                   │
│                                      │         │ and will expire at 2024-01-11 11:10:10.87810707 +0000 UTC               │
│                                      │         │ m=+3749.157475872                                                       │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────────┤
│ Agent service                        │ Failed  │ Agent is installed as a systemctl service but is not running            │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────────┤
│ Proxy configuration                  │ Skipped │ No proxy configuration detected                                         │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────────┤
│ SSM Agent version                    │ Failed  │ Failed to get SSM Agent version: exit status 2                          │
└──────────────────────────────────────┴─────────┴─────────────────────────────────────────────────────────────────────────┘

I tried to get more logs, without success, and I am not sure whether the following warning is related:

 1704967520066750635 [Warn] [ssm-agent-worker] failed to read runtime config 'identity_config.json': open /var/lib/amazon/ssm/runtimeconfig/identity_config.json: no such file or directory

Additional information:

  • /var/lib/amazon/ssm/ does not even exist.
  • IMDSv2 is required (and it's reachable)
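
The two observations above can be checked with a small script. This is a sketch under stated assumptions, not the agent's own logic: the paths are the ones that appear in this report, the IMDSv2 token flow uses the standard EC2 metadata endpoint and headers, and `missing_paths` and `imdsv2_instance_id` are hypothetical helper names. The `root` parameter exists only so the path check can be pointed at a mounted image or a test directory.

```python
import os
import urllib.request

# Paths taken from the log output and from this report.
EXPECTED_PATHS = (
    "/var/lib/amazon/ssm",
    "/var/lib/amazon/ssm/runtimeconfig",
    "/var/lib/amazon/ssm/runtimeconfig/identity_config.json",
)

# Link-local address of the EC2 instance metadata service.
IMDS = "http://169.254.169.254"


def missing_paths(paths=EXPECTED_PATHS, root="/"):
    """Return the subset of `paths` that does not exist under `root`."""
    return [p for p in paths if not os.path.exists(os.path.join(root, p.lstrip("/")))]


def imdsv2_instance_id(timeout=2.0):
    """Fetch the instance ID through IMDSv2, or return None if IMDS cannot be reached."""
    try:
        # Step 1: request a short-lived session token with a PUT.
        token_req = urllib.request.Request(
            IMDS + "/latest/api/token",
            method="PUT",
            headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
        )
        with urllib.request.urlopen(token_req, timeout=timeout) as resp:
            token = resp.read().decode()
        # Step 2: present the token when reading metadata.
        id_req = urllib.request.Request(
            IMDS + "/latest/meta-data/instance-id",
            headers={"X-aws-ec2-metadata-token": token},
        )
        with urllib.request.urlopen(id_req, timeout=timeout) as resp:
            return resp.read().decode()
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return None
```

On a host where /var/lib/amazon/ssm/ is absent, `missing_paths()` returns all three entries, while `imdsv2_instance_id()` returning an instance ID would match the "EC2 IMDS: Success" row of the diagnostics. The script only gathers facts; it does not establish that the missing directory is the cause of the failure.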

Is there anything you can suggest for debugging this further?

EDIT: I tried Ubuntu with the same role and configuration, and it works as expected.

@armnejad
Contributor

Hello,
Please be aware that SSM Agent does not claim to support Flatcar Linux. That said, I would recommend uninstalling your Agent and then reinstalling the desired version (rather than trying to perform an update). Updating from a much older version to a much newer version can be the cause of issues like the one you are seeing.

@jepio

jepio commented Jan 12, 2024

> Hello, Please be aware that SSM Agent does not claim to support Flatcar Linux. That said, I would recommend uninstalling your Agent and then reinstalling the desired version (rather than trying to perform an update). Updating from a much older version to a much newer version can be the cause of issues like the one you are seeing.

When @tormath1 says "upgraded", he means that this was done during the AMI build process and not on a running instance. So the instance that fails in this way has only ever come up with 3.2.985.0.

@armnejad are you able to share how/when /var/lib/amazon/ssm/ and /var/lib/amazon/ssm/runtimeconfig are populated?

@tormath1
Author

Closing this, as it has been solved on our side.
