
Advanced API Key invalidation #53868

Closed
mikecote opened this issue Dec 31, 2019 · 11 comments · Fixed by #82211
Labels
discuss, Feature:Alerting, R&D (Research and development ticket; not meant to produce code, but to make a decision), Team:ResponseOps (Label for the ResponseOps team, formerly the Cases and Alerting teams)

Comments

@mikecote
Contributor

Once the simple invalidation of API keys is resolved (#45144), there will still be some cases where errors may happen; those should be solved in this issue.

The extra cases are:

  • If the underlying alert task is running while the API key is invalidated, the task may throw an error if it needed that key (authentication failure, invalid key, etc)
  • The same applies to actions if they are running while the API key is invalidated
  • If actions are scheduled but not executed before the API key is invalidated, those actions will throw errors if they depended on that key
  • If an action is in a retry state and depended on that key, it will keep failing
@mikecote mikecote added Feature:Alerting Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) labels Dec 31, 2019
@elasticmachine
Contributor

Pinging @elastic/kibana-alerting-services (Team:Alerting Services)

@mikecote
Contributor Author

mikecote commented Apr 22, 2020

Some quick notes:

To solve this, we may have to create an API key from the alert's API key (similar to a token) and delete it after the alert / action finishes execution. We may also have to create a new API key for each action scheduled from an alert, so that when the API key is deleted, we know we're really done with it.

The mechanism that creates an API key from the alert's API key may have to be re-attempted (with a max retries) in the scenario where the API key got invalidated between reading the alert and creating a sub-API key. This scenario is pretty common in the API integration tests and we've had to work around it a lot.
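For illustration only, here is a minimal TypeScript sketch of that retry idea. Every name in it (`readDecryptedAlert`, `createSubApiKey`, `MAX_RETRIES`) is hypothetical, not an actual Kibana API:

```ts
// Hypothetical sketch of re-attempting sub-API key creation when the parent
// key may have just been rotated. None of these names are real Kibana APIs.
type ReadDecryptedAlert = (alertId: string) => Promise<{ apiKey: string }>;
type CreateSubApiKey = (parentApiKey: string) => Promise<string>;

const MAX_RETRIES = 3;

async function createSubApiKeyWithRetry(
  alertId: string,
  readDecryptedAlert: ReadDecryptedAlert,
  createSubApiKey: CreateSubApiKey
): Promise<string> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) {
    // Re-read the alert on every attempt so a freshly rotated key is picked up.
    const { apiKey } = await readDecryptedAlert(alertId);
    try {
      return await createSubApiKey(apiKey);
    } catch (e) {
      // The parent key may have been invalidated between the read and the
      // create call; retry with the latest key.
      lastError = e;
    }
  }
  throw lastError;
}
```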

@mikecote
Contributor Author

mikecote commented May 5, 2020

Turns out using API keys to create sub-API keys doesn't work when using Kibana to create them on behalf of other users. This leaves a few alternatives worth thinking about, but I believe we may hold off until this becomes a more severe issue.

The best scenario to describe the issue: an alert or action takes a while to run, the alert gets updated, and the API key no longer works halfway through.

1. Queue the API key for invalidation

The way I'm thinking, it would involve the following changes at a high level:

  • Store the API key ID unencrypted and the key encrypted in action_task_params, instead of storing both together and encrypted as they are today
  • Alerting plugin to define a saved object type used to queue invalidation of API keys (with the API key ID (unencrypted), alert ID, etc.)
  • Alerting to create an invalidation object for each API key that needs to be invalidated instead of invalidating it immediately
  • A single recurring task (or maybe one per key) to check on a regular interval for any API key IDs that no longer exist in action_task_params and whose alert task is "idle"

This one will require a saved object migration on action_task_params, or some extra code to handle both the old and new formats. It doesn't require external dependencies and it solves all 4 scenarios in the description without having to create more API keys or other objects.
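A rough sketch of what the recurring check could look like, assuming a hypothetical saved-object shape and helper functions (none of these names are the real implementation):

```ts
// Hypothetical shape of a "pending invalidation" saved object.
interface ApiKeyPendingInvalidation {
  apiKeyId: string; // stored unencrypted so it can be searched
  alertId: string;
  createdAt: string;
}

// Body of the recurring task; all dependencies are illustrative stand-ins.
async function runInvalidationSweep(deps: {
  findPendingInvalidations: () => Promise<ApiKeyPendingInvalidation[]>;
  isReferencedByActionTaskParams: (apiKeyId: string) => Promise<boolean>;
  isAlertTaskIdle: (alertId: string) => Promise<boolean>;
  invalidateApiKey: (apiKeyId: string) => Promise<void>;
  deletePendingInvalidation: (apiKeyId: string) => Promise<void>;
}): Promise<void> {
  for (const pending of await deps.findPendingInvalidations()) {
    const stillReferenced = await deps.isReferencedByActionTaskParams(pending.apiKeyId);
    const alertIdle = await deps.isAlertTaskIdle(pending.alertId);
    // Only invalidate once no queued action references the key and the alert
    // task is idle; otherwise leave it for the next sweep.
    if (!stillReferenced && alertIdle) {
      await deps.invalidateApiKey(pending.apiKeyId);
      await deps.deletePendingInvalidation(pending.apiKeyId);
    }
  }
}
```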

2. Look into allowing this scenario (API keys creating sub API keys)?

This would allow an API key to create another API key. It would require external dependencies and research to see if it's worth pursuing.

3. Look into using a token service?

I'm not too familiar with it, but it requires the manage_token cluster privilege, which Kibana probably doesn't have.

https://www.elastic.co/guide/en/elasticsearch/reference/master/token-authentication-services.html

4. Leave functionality as is

This is probably the option for now, until we need to solve it for a scenario like the one above or others. Will mark the issue as blocked / discuss for now.

@kobelb
Contributor

kobelb commented May 5, 2020

The best scenario to describe the issue: an alert or action takes a while to run, the alert gets updated, and the API key no longer works halfway through.

Does the API key no longer work because it was explicitly invalidated by the user? If so, we can only assume the intent is to prevent it from being used for authentication, so having it fail mid-way through processing the alert makes sense.

Users are able to invalidate API Keys using the Elasticsearch APIs; we can't prevent users from directly invalidating API Keys that way, nor force those invalidations to be queued until the keys are no longer in use.

If we tried to circumvent this by cloning the API Key, the clone could just be invalidated also.

@mikecote
Contributor Author

mikecote commented May 5, 2020

Does the API key no longer work because it was explicitly invalidated by the user?

That is definitely one use case which we won't be able to protect ourselves from as you mention.

The other scenario is more about when users update alerts. We immediately create a new API key and invalidate the old one (here: https://github.com/elastic/kibana/blob/master/x-pack/plugins/alerting/server/alerts_client.ts#L300). If the alert was running or had pending / running actions, any of those could fail because it references an API key that the update just invalidated.

The alert would recover from this at the next execution / interval. Some actions may fail to run and never retry because they always reference the old API key.
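To make the race concrete, here is a simplified sketch of the update flow described above (not the actual alerts_client code; the helper names are hypothetical):

```ts
// Simplified sketch of rotating an alert's API key on update.
async function updateAlertApiKey(
  deps: {
    createApiKey: () => Promise<{ id: string; apiKey: string }>;
    saveAlertApiKey: (alertId: string, key: { id: string; apiKey: string }) => Promise<void>;
    invalidateApiKey: (apiKeyId: string) => Promise<void>;
  },
  alertId: string,
  oldApiKeyId: string
): Promise<void> {
  const newKey = await deps.createApiKey();
  await deps.saveAlertApiKey(alertId, newKey);
  // Race window: an alert task or scheduled action that already decrypted the
  // old key will start failing as soon as this call completes.
  await deps.invalidateApiKey(oldApiKeyId);
}
```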

@kobelb
Contributor

kobelb commented May 5, 2020

The other scenario is more about when users update alerts. We immediately create a new API key and invalidate the old one (here: https://github.com/elastic/kibana/blob/master/x-pack/plugins/alerting/server/alerts_client.ts#L300). If the alert was running or had pending / running actions, any of those could fail because it references an API key that the update just invalidated.

Gotcha, that makes sense. Thanks for explaining. 🤔

@kobelb
Contributor

kobelb commented May 5, 2020

  1. Queue the API key for invalidation

This is an interesting proposal. It seems somewhat complicated, but feasible.

  2. Look into allowing this scenario (API keys creating sub API keys)?

I've proposed adding the ability for API Keys to optionally clone themselves to enable other features in the past. We ended up not going with that approach, but it's worth exploring.

  3. Look into using a token service?

The Get token API is primarily consumed by Kibana's auth providers internally to support features like SAML, OIDC, Kerberos, etc. Unfortunately, it's not possible to authenticate with an API Key and use it to get a token.

  4. Prevent the user from updating Alerts while they're running

This is somewhat limiting, and it could be a potentially frustrating UX. However, we could use optimistic concurrency control to make sure we update the API Key that will be used before invalidating the old one, ensuring the alert didn't start running during the process.
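As a purely illustrative sketch of that OCC idea, assuming the version check happens on the alert saved object (the discussion below leaves open exactly where OCC would apply), with hypothetical helpers:

```ts
// Hypothetical OCC-guarded key rotation: the conditional update fails on a
// version conflict, so the old key is only invalidated if nothing else
// touched the alert between the read and the write.
async function rotateApiKeyWithOcc(
  deps: {
    getAlert: (id: string) => Promise<{ id: string; version: string; apiKeyId: string }>;
    createApiKey: () => Promise<{ id: string; apiKey: string }>;
    updateAlert: (
      id: string,
      attrs: { apiKey: string; apiKeyId: string },
      options: { version: string } // conditional write; throws on conflict
    ) => Promise<void>;
    invalidateApiKey: (apiKeyId: string) => Promise<void>;
  },
  alertId: string
): Promise<void> {
  const alert = await deps.getAlert(alertId);
  const newKey = await deps.createApiKey();
  // If the alert changed since the read, this throws and nothing is invalidated.
  await deps.updateAlert(
    alertId,
    { apiKey: newKey.apiKey, apiKeyId: newKey.id },
    { version: alert.version }
  );
  await deps.invalidateApiKey(alert.apiKeyId);
}
```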

@mikecote
Contributor Author

  4. Prevent the user from updating Alerts while they're running

This is somewhat limiting, and it could be a potentially frustrating UX. However, we could use optimistic concurrency control to make sure we update the API Key that will be used before invalidating the old one, ensuring the alert didn't start running during the process.

This sounds interesting. How would the OCC work? Would it be with the task in Task Manager?

@mikecote
Contributor Author

mikecote commented Jul 9, 2020

I created an issue in the Elasticsearch repo to explore option 2.

elastic/elasticsearch#59304

@mikecote mikecote mentioned this issue Aug 11, 2020
@mikecote
Contributor Author

mikecote commented Sep 1, 2020

I created an issue in the Elasticsearch repo to explore option 2.

elastic/elasticsearch#59304

We'll have to come up with a different approach based on timing. I'm thinking of a new option where we simply delete the API key at a future time using Task Manager. The API key ID slated for deletion can be stored in a saved object, and either a single Task Manager task or one task per API key would execute the cleanup.

If something were to fail after this fixed deletion time, we could consider it the same scenario as a user manually deleting the API key, and leave the functionality as is for now.
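A sketch of the "delete at a future time" idea, assuming hypothetical Task Manager helpers (the real scheduling API, task type name, and delay are not defined by this comment):

```ts
const REMOVAL_DELAY_MS = 60 * 60 * 1000; // illustrative; this would be configurable

async function queueApiKeyForDelayedInvalidation(
  deps: {
    savePendingInvalidation: (doc: { apiKeyId: string }) => Promise<{ id: string }>;
    scheduleTask: (task: {
      taskType: string;
      runAt: Date;
      params: Record<string, unknown>;
    }) => Promise<void>;
  },
  apiKeyId: string
): Promise<void> {
  // Persist the key ID so the cleanup survives a Kibana restart.
  const pending = await deps.savePendingInvalidation({ apiKeyId });
  // One task per key is shown here; a single recurring sweep task is the
  // other variant being considered.
  await deps.scheduleTask({
    taskType: 'alerting:invalidate_api_key',
    runAt: new Date(Date.now() + REMOVAL_DELAY_MS),
    params: { pendingInvalidationId: pending.id },
  });
}
```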

@mikecote mikecote added R&D Research and development ticket (not meant to produce code, but to make a decision) and removed blocked labels Sep 8, 2020
@YulNaumenko YulNaumenko self-assigned this Sep 14, 2020
@YulNaumenko
Contributor

After discussion we decided to investigate two approaches:

  1. Use a saved object (SO) to store the API key IDs that should be deleted, and add a configuration option to set the execution interval of a Task Manager (TM) task that reads the data from this SO and removes the keys marked for deletion (a rough config sketch follows below).
  2. Create a task for each API key that deletes it after a configurable waiting time has passed.

As a result, come up with pros and cons for both.
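For approach 1, the configuration option could look roughly like this sketch using Kibana's @kbn/config-schema package; the setting names and defaults here are assumptions, not decisions from this issue:

```ts
import { schema, TypeOf } from '@kbn/config-schema';

// Hypothetical plugin config: how often the TM sweep task runs, and how long
// to wait before a queued key is actually invalidated.
export const configSchema = schema.object({
  invalidateApiKeysTask: schema.object({
    interval: schema.string({ defaultValue: '5m' }),
    removalDelay: schema.string({ defaultValue: '1h' }),
  }),
});

export type AlertingConfig = TypeOf<typeof configSchema>;
```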
