Saving Agent Policy fails with "Cannot read existing Message Signing Key pair" #176528

Closed
cxlashey opened this issue Feb 8, 2024 · 6 comments
Labels: bug, Team:Fleet

Comments


cxlashey commented Feb 8, 2024

Kibana version:
8.12.0

Elasticsearch version:
8.12.0

Server OS version:
Ubuntu 20.04.6 LTS

Browser version:
Google Chrome 121.0.6167.160

Browser OS version:
Ubuntu 20.04.6 LTS

Original install method (e.g. download page, yum, from source, etc.):
https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-quickstart.html
ECK deployment. Created the CRDs, Operator, Elasticsearch and Kibana from the web page instructions

Describe the bug:
When adding a Prometheus Integration and hitting "Save and continue", the save fails.
"Configuration Error" "Cannot read existing Message Signing Key pair"

I see a new Agent Policy in the Fleet UI, but it doesn't have any integrations.

Steps to reproduce:

  1. Log in to Kibana
  2. Select "Add Integrations" from home page
  3. Search and select "Prometheus"
  4. "Add Prometheus"
  5. Error message is displayed for first time on "Add Prometheus Integration" page. Bottom right corner popup: "Configuration Error" "Cannot read existing Message Signing Key pair"
  6. Enter Prometheus Settings
  7. Select "Save and Continue"
  8. Browser stays on the same page and same error is displayed. "Configuration Error" "Cannot read existing Message Signing Key pair"
  9. Go to Management --> Fleet --> Agent Policies
  10. A new Agent Policy has been created but there are no integrations

Expected behavior:
New Prometheus integration is created

Screenshots (if relevant):
Forbidden by company policy

Errors in browser console (if relevant):

POST https://localhost:5601/api/fleet/agent_policies?sys_monitoring=true 400 (Bad Request)
fetchResponse @ core.entry.js:1
(anonymous) @ core.entry.js:1
await in (anonymous) (async)
(anonymous) @ core.entry.js:1
(anonymous) @ core.entry.js:1
at @ esUiShared.plugin.js:1
u @ fleet.plugin.js:2
m @ fleet.plugin.js:2
(anonymous) @ fleet.chunk.3.js:3
(anonymous) @ fleet.chunk.3.js:3
await in (anonymous) (async)
onClick @ fleet.chunk.3.js:3
He @ kbn-ui-shared-deps-npm.dll.js:425
Ve @ kbn-ui-shared-deps-npm.dll.js:425
(anonymous) @ kbn-ui-shared-deps-npm.dll.js:425
Cr @ kbn-ui-shared-deps-npm.dll.js:425
Tr @ kbn-ui-shared-deps-npm.dll.js:425
(anonymous) @ kbn-ui-shared-deps-npm.dll.js:425
De @ kbn-ui-shared-deps-npm.dll.js:425
(anonymous) @ kbn-ui-shared-deps-npm.dll.js:425
Dr @ kbn-ui-shared-deps-npm.dll.js:425
en @ kbn-ui-shared-deps-npm.dll.js:425
Jt @ kbn-ui-shared-deps-npm.dll.js:425
t.unstable_runWithPriority @ kbn-ui-shared-deps-npm.dll.js:433
Vo @ kbn-ui-shared-deps-npm.dll.js:425
Le @ kbn-ui-shared-deps-npm.dll.js:425
Zt @ kbn-ui-shared-deps-npm.dll.js:425

Provide logs and/or server output (if relevant):

[2024-02-08T16:57:36.632+00:00][INFO ][plugins.fleet] Beginning fleet setup
[2024-02-08T16:57:36.750+00:00][ERROR][plugins.encryptedSavedObjects] Failed to decrypt "passphrase" attribute: Unsupported state or unable to authenticate data
[2024-02-08T16:57:36.751+00:00][WARN ][plugins.fleet.messageSigningService] failed to get message signing key pair. retrying attempt: 1
[2024-02-08T16:57:37.703+00:00][ERROR][plugins.encryptedSavedObjects] Failed to decrypt "passphrase" attribute: Unsupported state or unable to authenticate data
[2024-02-08T16:57:37.703+00:00][WARN ][plugins.fleet.messageSigningService] failed to get message signing key pair. retrying attempt: 2
[2024-02-08T16:57:38.337+00:00][ERROR][plugins.encryptedSavedObjects] Failed to decrypt "passphrase" attribute: Unsupported state or unable to authenticate data
[2024-02-08T16:57:38.338+00:00][WARN ][plugins.fleet.messageSigningService] failed to get message signing key pair. retrying attempt: 3
[2024-02-08T16:57:40.124+00:00][ERROR][plugins.encryptedSavedObjects] Failed to decrypt "passphrase" attribute: Unsupported state or unable to authenticate data
[2024-02-08T16:57:40.124+00:00][WARN ][plugins.fleet.messageSigningService] failed to get message signing key pair. retrying attempt: 4
[2024-02-08T16:57:40.779+00:00][ERROR][plugins.encryptedSavedObjects] Failed to decrypt "passphrase" attribute: Unsupported state or unable to authenticate data
[2024-02-08T16:57:40.780+00:00][WARN ][plugins.fleet.messageSigningService] failed to get message signing key pair. retrying attempt: 5
[2024-02-08T16:57:43.516+00:00][ERROR][plugins.encryptedSavedObjects] Failed to decrypt "passphrase" attribute: Unsupported state or unable to authenticate data
[2024-02-08T16:57:43.517+00:00][WARN ][plugins.fleet.messageSigningService] failed to get message signing key pair. retrying attempt: 6
[2024-02-08T16:57:44.952+00:00][ERROR][plugins.encryptedSavedObjects] Failed to decrypt "passphrase" attribute: Unsupported state or unable to authenticate data
[2024-02-08T16:57:44.952+00:00][WARN ][plugins.fleet.messageSigningService] failed to get message signing key pair. retrying attempt: 7
[2024-02-08T16:57:45.482+00:00][ERROR][plugins.encryptedSavedObjects] Failed to decrypt "passphrase" attribute: Unsupported state or unable to authenticate data
[2024-02-08T16:57:45.483+00:00][WARN ][plugins.fleet.messageSigningService] failed to get message signing key pair. retrying attempt: 8
[2024-02-08T16:57:45.714+00:00][ERROR][plugins.encryptedSavedObjects] Failed to decrypt "passphrase" attribute: Unsupported state or unable to authenticate data
[2024-02-08T16:57:45.715+00:00][WARN ][plugins.fleet.messageSigningService] failed to get message signing key pair. retrying attempt: 9
[2024-02-08T16:57:46.945+00:00][ERROR][plugins.encryptedSavedObjects] Failed to decrypt "passphrase" attribute: Unsupported state or unable to authenticate data
[2024-02-08T16:57:46.945+00:00][WARN ][plugins.fleet.messageSigningService] failed to get message signing key pair. retrying attempt: 10
[2024-02-08T16:57:47.011+00:00][INFO ][plugins.fleet] Encountered non fatal errors during Fleet setup
[2024-02-08T16:57:47.011+00:00][INFO ][plugins.fleet] {"name":"MessageSigningError","message":"Cannot read existing Message Signing Key pair"}
[2024-02-08T16:57:47.011+00:00][INFO ][plugins.fleet] Fleet setup completed
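
The log excerpt above has a recognizable signature: each encryptedSavedObjects decrypt failure is followed by a messageSigningService retry. A minimal Python sketch for spotting that pattern in a saved Kibana log follows. The function name and the keys of the returned dictionary are illustrative, and the patterns are taken only from the lines quoted in this issue, so other Kibana versions may log differently.

```python
import re

# Patterns copied from the log lines quoted in this issue (Kibana 8.12.0).
DECRYPT_FAILURE = re.compile(
    r'\[plugins\.encryptedSavedObjects\] Failed to decrypt "(?P<attr>[^"]+)" attribute'
)
SIGNING_RETRY = re.compile(
    r"\[plugins\.fleet\.messageSigningService\] failed to get message signing key pair\. "
    r"retrying attempt: (?P<attempt>\d+)"
)


def summarize_signing_key_errors(lines):
    """Count decrypt failures and report the highest retry attempt seen."""
    decrypt_failures = 0
    max_attempt = 0
    for line in lines:
        if DECRYPT_FAILURE.search(line):
            decrypt_failures += 1
        match = SIGNING_RETRY.search(line)
        if match:
            max_attempt = max(max_attempt, int(match.group("attempt")))
    return {"decrypt_failures": decrypt_failures, "max_retry_attempt": max_attempt}
```

Passing the lines of a Kibana log file to `summarize_signing_key_errors` gives a quick indication of whether Fleet setup hit this particular failure; a result with zero decrypt failures suggests a different cause.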

Any additional context:
Originally I couldn't see the list of integrations because Kibana could not connect to the public Elastic Package Registry (EPR). I am using Docker to run the Elastic Package Registry inside our lab. With the change below, I was able to reach the EPR and see the packages.

Steps followed
https://www.elastic.co/guide/en/fleet/8.12/air-gapped.html#air-gapped-diy-epr
kibana.yaml change

  config:
    xpack.fleet.registryUrl: "http://<internal IP>:8080"
cxlashey added the bug label Feb 8, 2024
botelastic bot added the needs-team label Feb 8, 2024

cxlashey commented Feb 8, 2024

I attempted to use the workaround listed in this issue
#171630 (comment)

Following steps were completed successfully

# Create a role with CRUD access to system indices
POST _security/role/system-index-superuser
{
  "indices": [
    {
      "names": [
        "*"
      ],
      "privileges": [
        "all"
      ],
      "allow_restricted_indices": true
    }
  ]
}

# Create a user and assign it that role
POST _security/user/temp_user
{
  "password": "temp_password",
  "roles": [
    "superuser",
    "system-index-superuser"
  ]
}

From another incognito window the following ran successfully


# Check the list of Fleet uninstall tokens, note the results somewhere like a .txt file
GET .kibana_ingest/_search?q=type:fleet-uninstall-tokens

# Delete all uninstall tokens
POST .kibana_ingest/_delete_by_query
{
  "query": {
    "match": { 
      "type": "fleet-uninstall-tokens"
    }
  }
}

The following step resulted in an error message

# Rerun Fleet setup to generate new tokens
POST kbn:/api/fleet/setup

Message response

{
  "isInitialized": true,
  "nonFatalErrors": [
    {
      "name": "MessageSigningError",
      "message": "Cannot read existing Message Signing Key pair"
    }
  ]
}
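
Because the setup endpoint still reports "isInitialized": true, the failure only shows up in nonFatalErrors. A small Python sketch for checking a saved response body for this specific error is below. The function name is hypothetical, and the response shape is assumed to match the one pasted above.

```python
import json


def message_signing_errors(setup_response_text):
    """Return the MessageSigningError entries from a Fleet setup response body."""
    body = json.loads(setup_response_text)
    # Treat a missing "nonFatalErrors" key the same as an empty list.
    return [
        err
        for err in body.get("nonFatalErrors", [])
        if err.get("name") == "MessageSigningError"
    ]
```

An empty list from this helper means the setup response no longer reports the signing key problem, which is a quicker check than retrying the integration in the UI.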

After running all the workaround steps, the same "Message Signing Key" error occurred when adding integrations

dmlemeshko added the Team:Fleet label Feb 9, 2024
@elasticmachine

Pinging @elastic/fleet (Team:Fleet)

botelastic bot removed the needs-team label Feb 9, 2024

kpollich commented Feb 9, 2024

This sounds quite similar to [removed]

@juliaElastic wrote an extensive KB article about this particular error message here, I'll paste the contents

This looks like potentially another instance where ECK is involved and there was potentially some kind of implicit key rotation that occurred during installation or upgrade. [removed] has some more details.


cxlashey commented Feb 9, 2024

Thank you for responding. I'm getting "Page not Found" for both of the github links. The article at support.elastic.co requires an elastic account, which I do not have.


kpollich commented Feb 9, 2024

Ah apologies I have linked to some private repos - I mistook this for an internal support ticket. Sorry about that. I'll paste the knowledge base article below so it's not behind a login.

Issue Description

There is a documented process for rotating the Kibana saved object (SO) encryption key: https://www.elastic.co/guide/en/kibana/current/saved-objects-api-rotate-encryption-key.html
If the encryptionKey was changed manually and the old key was not added to decryptionOnlyKeys, Fleet can no longer decrypt the Message Signing Key, and this prevents any policy changes from being deployed to Agents.

This can be observed by querying the .fleet-policies index and confirming that it doesn't contain the latest revision of the policy stored in the Kibana SO.

Environment

Self-managed deployment.

Cause

Kibana can only decrypt saved objects if the original encryption key is available.

Workaround

NOTE: Before starting this workaround, check if tamper protection is enabled on the Agent policy and whether Elastic Defend integration is used. Check out this guide for more info.

  1. Create a superuser that can delete Kibana saved objects manually (run in the Kibana console)

POST _security/role/fleet_superuser
{
  "indices": [
    {
      "names": [".fleet*", ".kibana*"],
      "privileges": ["all"],
      "allow_restricted_indices": true
    }
  ]
}

POST _security/user/fleet_superuser
{
  "password": "password",
  "roles": ["superuser", "fleet_superuser"]
}

  2. Delete the message signing key SO

# query and save the original message signing key
GET .kibana_ingest/_search?q=type:fleet-message-signing-keys

# delete it (run from a shell; replace ES_HOST:PORT with your Elasticsearch address)
curl -sk -XPOST --user fleet_superuser:password -H 'content-type:application/json' \
    -H 'x-elastic-product-origin:fleet' \
    https://ES_HOST:PORT/.kibana_ingest/_delete_by_query \
    -d '{
    "query": {
      "bool": {"filter": [
        {"match": {"type": "fleet-message-signing-keys"}}
      ]}
    }
  }'

  3. Delete the uninstall token SOs

# query and save the original uninstall tokens
GET .kibana_ingest/_search?q=type:fleet-uninstall-tokens

# delete them (run from a shell; replace ES_HOST:PORT with your Elasticsearch address)
curl -sk -XPOST --user fleet_superuser:password -H 'content-type:application/json' \
    -H 'x-elastic-product-origin:fleet' \
    https://ES_HOST:PORT/.kibana_ingest/_delete_by_query \
    -d '{
    "query": {
      "bool": {"filter": [
        {"match": {"type": "fleet-uninstall-tokens"}}
      ]}
    }
  }'

  4. Restart Kibana (this is needed to regenerate the uninstall tokens)

  5. Verify that there are no more decrypt or message signing key errors

  6. Verify that the uninstall tokens and message signing key are regenerated

GET .kibana_ingest/_search?q=type:fleet-uninstall-tokens
GET .kibana_ingest/_search?q=type:fleet-message-signing-keys

  7. Verify that the latest agent policy revision is created in .fleet-policies and sent to agents

GET .kibana_ingest/_doc/ingest-agent-policies:<policy_id>

# revision_idx should match the revision of the SO above
GET .fleet-policies/_search?q=policy_id:<policy_id>
{
  "size": 1,
  "sort": [
    {
      "revision_idx": {
        "order": "desc"
      }
    }
  ]
}

  8. Delete the superuser role and user

DELETE _security/user/fleet_superuser
DELETE _security/role/fleet_superuser

  9. If Elastic Defend is used, a reinstall is needed:
  • Remove the Elastic Defend integration, which will uninstall all Endpoints (wait for a while to confirm all are gone), then add the integration again

  10. If tamper protection is enabled, the agents need to be reinstalled too. For this, you need the original uninstall tokens used to install the agents.
  • Unenroll the agents with the uninstall tokens
  • Enroll the agents again

If the original uninstall tokens are not available, raise a support ticket; the security team can help uninstall the affected Endpoints.
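
The revision check in the workaround (comparing the policy SO with the newest document in .fleet-policies) can be sketched in Python as follows. This is only an illustration: the function name is hypothetical, and it assumes the SO document nests its attributes under the type name (`_source["ingest-agent-policies"]["revision"]`), which is not confirmed anywhere in this thread. The `revision_idx` field comes from the query in the workaround.

```python
def policy_revision_in_sync(saved_object_doc, fleet_policies_search):
    """Compare the policy SO revision with the newest .fleet-policies revision.

    saved_object_doc: parsed JSON from
        GET .kibana_ingest/_doc/ingest-agent-policies:<policy_id>
    fleet_policies_search: parsed JSON from the .fleet-policies search
        sorted by revision_idx descending with size 1
    """
    # Assumption: SO attributes are nested under the saved object type name.
    so_revision = saved_object_doc["_source"]["ingest-agent-policies"]["revision"]

    hits = fleet_policies_search["hits"]["hits"]
    if not hits:
        # No deployed revision at all, so the policy cannot be in sync.
        return False

    # The search is sorted descending, so the first hit is the newest revision.
    latest_deployed = hits[0]["_source"]["revision_idx"]
    return latest_deployed == so_revision
```

A `False` result corresponds to the symptom described under Issue Description: the policy was saved in Kibana but its latest revision never reached .fleet-policies.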

Resolution

To prevent the issue, follow the key rotation guide instead of manually changing the encryptionKey: https://www.elastic.co/guide/en/kibana/current/saved-objects-api-rotate-encryption-key.html

If the encryptionKey was changed manually, make sure that the old key is listed under decryptionOnlyKeys, e.g.:

xpack.encryptedSavedObjects.encryptionKey: newValue
xpack.encryptedSavedObjects.keyRotation.decryptionOnlyKeys: [oldValue,oldValue2]
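
As a rough way to sanity-check such a configuration before restarting Kibana, the Python sketch below reports which previously used keys would no longer be available for decryption. It assumes the settings have already been loaded into a flat dictionary keyed by the dotted setting names shown above; the function name and that dictionary layout are illustrative, not part of Kibana.

```python
def keys_no_longer_decryptable(settings, previously_used_keys):
    """Return the previously used keys that the given settings cannot decrypt with."""
    prefix = "xpack.encryptedSavedObjects."
    # The current encryption key and every decryption-only key remain usable.
    usable = {settings.get(prefix + "encryptionKey")}
    usable.update(settings.get(prefix + "keyRotation.decryptionOnlyKeys", []))
    return [key for key in previously_used_keys if key not in usable]
```

Any key returned by this helper protects saved objects that Kibana would fail to decrypt, which is the situation that produces the Message Signing Key error in this issue.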

References

Guide to rotate encryption key: https://www.elastic.co/guide/en/kibana/current/saved-objects-api-rotate-encryption-key.html
Kibana SO settings guide: https://www.elastic.co/guide/en/kibana/current/security-settings-kb.html#security-encrypted-saved-objects-settings
Agent tamper protection guide: https://www.elastic.co/guide/en/security/current/agent-tamper-protection.html


cxlashey commented Feb 9, 2024

Thank you! The problem has gone away.

@kpollich kpollich closed this as completed Feb 9, 2024