Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Ingest Manager] Agent bulk actions UI #77690

Merged
merged 37 commits into from
Sep 22, 2020

Conversation

jen-huang
Copy link
Contributor

@jen-huang jen-huang commented Sep 16, 2020

Resolves #72712. Resolves #72653.

Summary

This PR adds bulk reassign agents and bulk unenroll agents capabilities. Both the UI and API for these capabilities are license protected (Gold+):

  • Added UI for bulk actions selection per design spec
  • Revised UI for single unenroll agent modal (same modal for bulk unenroll) per design spec
  • Added POST /api/ingest_manager/fleet/agents/bulk_reassign endpoint which expects payload
{
  agents: [array of agent IDs | kuery string],
  policy_id: [ID of new policy to reassign agents to]
}
  • Added POST /api/ingest_manager/fleet/agents/bulk_unenroll endpoint which expects payload
{
  agents: [array of agent IDs | kuery string],
  force: [boolean to force unenroll or not]
}
  • Adjusted GET /api/ingest_manager/fleet/agents API response to include totalInactive property, which is the total number of inactive agents from the returned list of agents (in other words total - totalInactive = # of active agents).
  • Added API integration tests for bulk unenroll to existing unenroll test suite. Added new reassign test suite that includes tests for bulk reassign.

In addition, this PR also includes some refactoring related to supporting this work:

  • class LicenseServer moved to /common, isGoldPlus() method added to it, and now both client and server plugins instantiates it. On server it is a simple service, and on client a useLicense() hook was added to use it.
  • Added normalizeKuery() method for use in server code, DRY-ing up all places where the same code was used.
  • Added findAllSOs() method for use in server code to retrieve all SOs without paging parameters (aka returns all results by paging though pageSize: 10000 limit).

Screenshots

image

image

It's hard to capture all the permutations of this UI, so here are some gifs 😄 Note that I changed table size to be shorter for demo purposes.

Basic table selection behavior

Sep-17-2020 12-37-37

Reassigning agents by manual selection

Sep-17-2020 12-41-58

Force unenroll agents based on current table query (aka select all pages)

Sep-17-2020 12-51-49

Taking actions based on current table query (including when inactive agents are shown)

Sep-17-2020 12-46-02

… action selection UI that may or may not include active agents
@jen-huang jen-huang self-assigned this Sep 17, 2020
@jen-huang jen-huang added release_note:enhancement Team:Fleet Team label for Observability Data Collection Fleet team v7.10.0 v8.0.0 labels Sep 17, 2020
@jen-huang jen-huang marked this pull request as ready for review September 17, 2020 20:04
@jen-huang jen-huang requested a review from a team September 17, 2020 20:04
@elasticmachine
Copy link
Contributor

Pinging @elastic/ingest-management (Team:Ingest Management)

@ph
Copy link
Contributor

ph commented Sep 17, 2020

@jen-huang I can't wait to try that out tomorrow.

};
}, {});
return response.ok({ body });
} catch (e) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mind replacing this with the new defaultIngestErrorHandler pattern?

Like df19026#diff-a4f9e2e344a4b58d7209d56878afed76

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, didn't know about this. I updated here as well as the equivalent single agent handler too for consistency, as well as the unenroll agent(s) handlers. I guess we have a task somewhere to replace and use it in all handlers?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we have a task somewhere to replace and use it in all handlers?

@jen-huang better yet, a PR #77975 🎉

throw error;
}
setIsSubmitting(false);
const successMessage = forceUnenroll
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am struggling to read this nested ternary. Would you mind splitting it up?

| {
agentIds: string[];
}
| {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

const apiKeys: string[] = [];

// Get all API keys that need to be invalidated
agentsToUpdate.map((agent) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe .forEach or for-of since we're not using the return value?

page: currentPage,
});
if (currentPageSavedObjects.length) {
savedObjectResults = [...savedObjectResults, ...currentPageSavedObjects];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use an area method like .concat or .push here? Spreading thousands of objects is taxing on the engine

.send({
policy_id: 'INVALID_ID',
})
.expect(500);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems more like a 4xx error but not a big deal rn.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed and will defer too, we don't have a super great pattern of handling errors thrown from services in route handlers right now

Copy link
Contributor Author

@jen-huang jen-huang Sep 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually apparently we are able to forward the Boom.notFound error code :D updated this assertion to 404

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I did the same in #77975 & @nchaulet agrees #77975 (comment)

@nchaulet
Copy link
Member

Out of curiosity did you tested this with a high number of agents? like 1k? 10k?
I am curious how actions like force unenroll are working as we are doing a lot of thing per agent, invalidating two key, update a SO.

@jen-huang
Copy link
Contributor Author

Good question @nchaulet, I just tested with 1000 and 10,000 agents. For 1000 agents, the force unenroll action took a few seconds but completed well. For 10,000 agents, it failed with various errors that look to be caused by too many parallel ES requests:

[search_phase_execution_exception] response from /_security/api_key: {"error":{"root_cause":[],"type":"search_phase_execution_exception","reason":"","phase":"fetch","grouped":true,"failed_shards":[],"caused_by":{"type":"es_rejected_execution_exception","reason":"rejected execution of org.elasticsearch.action.search.FetchSearchPhase$1@18e5165e on EsThreadPoolExecutor[name = Jens-MacBook-Pro.local/system_read, queue capacity = 2000, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@335e9005[Running, pool size = 4, active threads = 4, queued tasks = 2000, completed tasks = 176131]]"}},"status":429}
server    log   [14:03:47.271] [error][api-key][plugins][security] Failed to invalidate API key as current user: [es_rejected_execution_exception] rejected execution of org.elasticsearch.action.ActionRunnable$2@437aacef on EsThreadPoolExecutor[name = Jens-MacBook-Pro.local/system_read, queue capacity = 2000, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@335e9005[Running, pool size = 4, active threads = 4, queued tasks = 2000, completed tasks = 177222]]
...
server    log   [14:04:21.780] [error][elasticsearch][security] Request error, retrying
DELETE http://localhost:9200/_security/api_key => connect EADDRNOTAVAIL 127.0.0.1:9200 - Local (0.0.0.0:0)
...
server    log   [14:05:55.074] [error][api-key][plugins][security] Failed to invalidate API key as current user: Request Timeout after 30000ms

The ES API keys interface doesn't provide a way to bulk invalidate API keys, so Kibana security interface doesn't either. I'll investigate next week if it is possible to batch the requests or run in them in the background. If you have any suggestions, let me know!

@nchaulet
Copy link
Member

@jen-huang I am not sure how we can implement that, but maybe this can be done as a background task, at some point with a high number of agents it's sure that the request is going to be too long.

We can probably merge it like is it, and create an issue for scaling this.

@jen-huang
Copy link
Contributor Author

jen-huang commented Sep 22, 2020

@nchaulet I adjusted the code so that it invalidates API keys in batches instead, 500 keys at a time. This worked well for 10,000 agents, it took a reasonable amount of time to execute and didn't trigger any Kibana or ES errors 🎉

@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Build metrics

@kbn/optimizer bundle module count

id value diff baseline
ingestManager 551 +3 548

async chunks size

id value diff baseline
ingestManager 1.1MB +12.3KB 1.1MB

page load bundle size

id value diff baseline
ingestManager 518.7KB +4.2KB 514.5KB

distributable file count

id value diff baseline
default 45943 +1 45942

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@jen-huang jen-huang merged commit 48b81a6 into elastic:master Sep 22, 2020
@jen-huang jen-huang deleted the ingest/bulk-actions branch September 22, 2020 21:23
jen-huang added a commit to jen-huang/kibana that referenced this pull request Sep 22, 2020
* Add temporary client-side license service/hook

* Initial pass at bulk actions UI (UI behavior only)

* Initial pass at implementing reassign agent policy by agent IDs

* Allow bulk reassign agent policy by kuery

* Return total inactive agents in list agents API to better handle bulk action selection UI that may or may not include active agents

* Add isGoldPlus method to license service

* Add `normalizeKuery` function

* Add `.findAllSOs` method and refactor bulk reassign to use that

* Initial pass at backend work for bulk unenroll

* Covert unenroll provider to unenroll modal and adjust UI to include force option

* Move license protection to handler level, fix misc bugs

* Add comments about `data` field response in create agent action(s)

* Clean up license service

* Fix i18n

* Add tests for bulk. unenroll

* Add tests for reassign and bulk reassign

* Fix typing

* Adjust single actions icon and text to be consistent

* Fix i18n

* PR feedback

* Increment api key test assertion to account for adding another agent policy to es archiver data

* Fix test

* Fix duplicate declaration after merging

* Add comments to SO find all function

* Batch invalidate API keys requests
# Conflicts:
#	x-pack/plugins/ingest_manager/server/services/index.ts
gmmorris added a commit to gmmorris/kibana that referenced this pull request Sep 23, 2020
* master: (31 commits)
  skip tests for old pacakge (elastic#78194)
  [Ingest Pipelines] Add url generator for ingest pipelines app (elastic#77872)
  [Lens] Rename "telemetry" to "stats" (elastic#78125)
  [CSM] Url search (elastic#77516)
  [Drilldowns] Config to disable URL Drilldown  (elastic#77887)
  [Lens] Combined histogram/range aggregation for numbers (elastic#76121)
  Remove legacy plugins support (elastic#77599)
  'Auto' interval must be correctly calculated for natural numbers (elastic#77995)
  [CSM] fix ingest data retry order messed up (elastic#78163)
  Add response status helpers (elastic#78006)
  Bump react-beautiful-dnd (elastic#78028)
  [Security Solution][Detection Engine] Bubbles up more error messages from ES queries to the UI (elastic#78004)
  Index pattern  - refactor constructor (elastic#77791)
  Add `xpack.security.sameSiteCookies` to docker allow list (elastic#78192)
  Remove [key: string]: any; from IIndexPattern (elastic#77968)
  Remove requirement for manage_index_templates privilege for Index Management (elastic#77377)
  [Ingest Manager] Agent bulk actions UI (elastic#77690)
  [Metrics UI] Add inventory view timeline (elastic#77804)
  Reporting/Docs: Updates for setting to enable CSV Download (elastic#78101)
  Update to latest rum-react (elastic#78193)
  ...
jen-huang added a commit that referenced this pull request Sep 24, 2020
* [Ingest Manager] Agent bulk actions UI (#77690)

* Add temporary client-side license service/hook

* Initial pass at bulk actions UI (UI behavior only)

* Initial pass at implementing reassign agent policy by agent IDs

* Allow bulk reassign agent policy by kuery

* Return total inactive agents in list agents API to better handle bulk action selection UI that may or may not include active agents

* Add isGoldPlus method to license service

* Add `normalizeKuery` function

* Add `.findAllSOs` method and refactor bulk reassign to use that

* Initial pass at backend work for bulk unenroll

* Covert unenroll provider to unenroll modal and adjust UI to include force option

* Move license protection to handler level, fix misc bugs

* Add comments about `data` field response in create agent action(s)

* Clean up license service

* Fix i18n

* Add tests for bulk. unenroll

* Add tests for reassign and bulk reassign

* Fix typing

* Adjust single actions icon and text to be consistent

* Fix i18n

* PR feedback

* Increment api key test assertion to account for adding another agent policy to es archiver data

* Fix test

* Fix duplicate declaration after merging

* Add comments to SO find all function

* Batch invalidate API keys requests
# Conflicts:
#	x-pack/plugins/ingest_manager/server/services/index.ts

* Fix export

* Fix spacing

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release_note:enhancement Team:Fleet Team label for Observability Data Collection Fleet team v7.10.0 v8.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Ingest Manager] Add bulk actions [Ingest Manager] Bulk unenroll and force unenroll
6 participants