
[Fleet] Integration installation issues in multi-space Kibana environment #143388

Closed
kpollich opened this issue Oct 14, 2022 · 13 comments · Fixed by #144066
Labels
bug (Fixes for quality problems that affect the customer experience), QA:Validated (Issue has been validated by QA), Team:Fleet (Team label for Observability Data Collection Fleet team), v8.5.1, v8.6.0

Comments

@kpollich
Member

Summary

When installing integrations in a non-default space, the installed integration is not visible under Installed Integrations in either the default space or the non-default space.

To reproduce

  1. Create a new space "Fleet"
  2. Change your current Kibana space to the new "Fleet" space
  3. Install any integration while in this space
  4. Navigate to the "Installed Integrations" screen
  5. Note that the integration does not appear in the "Installed Integrations" list, and similarly no "Integration Policies" tab exists
  6. Change between the "Default" and "Fleet" spaces and observe the issue persists
Screen.Recording.2022-10-14.at.11.17.40.AM.mov

I do see some interesting error logs during the installation process:

[2022-10-14T11:18:05.462-04:00][DEBUG][plugins.fleet] kicking off bulk install of 1password, system, elastic_agent
[2022-10-14T11:18:05.469-04:00][DEBUG][plugins.fleet] found bundled package for requested install of elastic_agent-1.3.5 - installing from bundled package archive
[2022-10-14T11:18:05.470-04:00][DEBUG][plugins.fleet] kicking off install of 1password-1.6.0 from registry
[2022-10-14T11:18:05.567-04:00][DEBUG][plugins.fleet] setting file list to the cache for elastic_agent-1.3.5
[2022-10-14T11:18:05.567-04:00][DEBUG][plugins.fleet] setting package info to the cache for elastic_agent-1.3.5
[2022-10-14T11:18:05.648-04:00][DEBUG][plugins.fleet] setting package info to the cache for 1password-1.6.0
[2022-10-14T11:18:05.999-04:00][WARN ][plugins.fleet] Not performing package verification as no local verification key found
[2022-10-14T11:18:06.016-04:00][DEBUG][plugins.fleet] setting file list to the cache for 1password-1.6.0
[2022-10-14T11:18:11.073-04:00][WARN ][plugins.fleet] Failure to install package [1password]: [ConcurrentInstallOperationError: Concurrent installation or upgrade of 1password-1.6.0 detected, aborting. Original error: Saved object [tag/managed] conflict]
[2022-10-14T11:18:11.074-04:00][ERROR][plugins.fleet] Concurrent installation or upgrade of 1password-1.6.0 detected, aborting. Original error: Saved object [tag/managed] conflict
[2022-10-14T11:18:11.074-04:00][ERROR][plugins.fleet] Concurrent installation or upgrade of elastic_agent-1.3.5 detected, aborting. Original error: Saved object [tag/managed] conflict
@kpollich added the bug and Team:Fleet labels on Oct 14, 2022
@elasticmachine
Contributor

Pinging @elastic/fleet (Team:Fleet)

@hop-dev self-assigned this on Oct 19, 2022
@hop-dev
Contributor

hop-dev commented Oct 19, 2022

Cause

On package installation we create a "managed" tag, which we then apply to all package assets.

This tag is a saved object itself. Currently we create this saved object with the same hard-coded ID every time. The issue is that IDs for saved objects have to be unique across spaces: tag saved objects are 'multiple-isolated', meaning each tag is restricted to one space but its ID must be unique across all spaces (read more here).

When we install a package in the default space, the managed tag saved object is created in that space. When we then switch to another space and install a package, we first query to see if the managed tag saved object exists. That query only covers saved objects in the current space, so it returns nothing. We then attempt to create the saved object with the hard-coded ID, but this is rejected because a saved object with that ID already exists in the other space.
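
For context, the 'multiple-isolated' behaviour comes from how the saved object type is registered. Here is a rough sketch of what such a registration looks like (illustrative only, not the actual saved object tagging plugin code; the mappings are placeholders):

// Illustrative sketch: a saved object type registered as 'multiple-isolated'.
// Objects of this type live in exactly one space, but their document IDs are
// not prefixed with the space, so a given ID can only be used once across ALL spaces.
coreSetup.savedObjects.registerType({
  name: 'tag',
  hidden: false,
  namespaceType: 'multiple-isolated',
  mappings: {
    properties: {
      name: { type: 'text' },
      description: { type: 'text' },
      color: { type: 'keyword' },
    },
  },
});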

The race condition / Why do we use a hard-coded ID?

The reason we use a hard-coded ID is to prevent a race condition. For example, when installing a package such as apache in a new policy with system monitoring enabled, we install the apache package and the system package semi-concurrently, so both attempt to create the managed tag saved object at the same time. By using a hard-coded ID and setting overwrite to true on creation, we only ever create one tag saved object.
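
To make the overwrite pattern concrete, here is a minimal sketch of that create call against the saved objects client. The 'managed' ID matches the tag/managed conflict in the logs above, but the exact attributes and ID format are assumptions rather than the real Fleet code:

// Sketch only: because the ID is fixed and overwrite is true, two concurrent
// installs in the same space both succeed; the second create simply overwrites
// the first tag instead of failing with an ID conflict.
await savedObjectsClient.create(
  'tag',
  { name: 'Managed', description: 'Tag for Fleet-managed assets', color: '#0077CC' },
  { id: 'managed', overwrite: true }
);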

Possible solutions

1. Make the tag a shared saved object

Update: Raised #143869, not proceeding with this solution. TL;DR: too much work to justify for this bug.

i.e. move from 'multiple-isolated' to 'multiple' for tag saved objects.
Currently the saved object tagging client only handles one space (the current one), so it can't create tags that are shared across multiple spaces. We could request a change so that every time a package is installed we query for the tag saved object as the internal user and amend it with the new spaceId (a rough sketch of what that could look like appears after this list of options).

2. Create a managed tag in every space on installation

2.a include space in the saved object ID 🟠 (See solution 3 in comment below)
We could create a tag in each space. This would mean changing our hard-coded ID to also include the space name, e.g. tag:managed => tag:default-managed. Once created, a space ID cannot be changed, so this wouldn't be brittle. We would need to migrate the existing saved objects to change their IDs to include the space.

2.b Use auto-generated saved object IDs ❌ (Probably not)
I have tried using auto-generated IDs and setting refresh: true on SO creation, but the race condition still occurs (two tags are created when concurrent package installs happen). We could use generated IDs with some other lock in place for creating the saved object to prevent duplicates, but that would have to work across multiple Kibanas (maybe tasks would work?), and I think it would be over-engineered. Also, we generally want to know the ID so we can look the tags up, since the tag name is user-editable and so isn't a reliable lookup.
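
For reference, here is a rough sketch of what solution 1 could look like, assuming tags were changed to namespaceType 'multiple' and using the saved objects client's find and updateObjectsSpaces APIs as the internal user (the lookup by name and the variable names are assumptions, not an actual implementation):

// Sketch of solution 1 (not pursued): find the existing managed tag in any
// space, then share it into the space the package is being installed into.
const { saved_objects: existing } = await internalSoClient.find({
  type: 'tag',
  search: 'Managed',          // assumption: look the tag up by its name
  searchFields: ['name'],
  namespaces: ['*'],          // search across all spaces
});

if (existing.length > 0) {
  await internalSoClient.updateObjectsSpaces(
    [{ type: 'tag', id: existing[0].id }],
    [currentSpaceId],         // spaces to add
    []                        // spaces to remove
  );
}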

Steps forward

I think option 2.a is the way forward as it is the simplest and requires minimal changes; the only complication is the SO migration, which I am looking into now.

@lucabelluccini
Contributor

Hello @hop-dev - by saved objects per space, do we mean the assets or also the Fleet objects?
Does it make sense to have space-aware integrations?
Fleet policies / integrations are not space-aware, as agents are not linked to spaces at the moment, right?

@rudolf
Contributor

rudolf commented Oct 24, 2022

2a has some edge cases which could cause a mess. E.g. importing saved objects that were exported before the ID-renaming upgrade would still reference the old ID and probably contain the old tag, causing duplicate tags. We could fix this by running a migration task every X, but if X is large this causes weird behaviour in the UI where a tag might suddenly disappear.

Isn't (1) closer to what we really want? We want one tag to tag all Fleet-installed saved objects. The fact that users can change the tag colors etc. across spaces is a feature of sharing. Although we're currently in this in-between state with multiple-isolated, it feels wrong to build a kind of space isolation into Fleet's tags.

This makes me wonder about the ideal behaviour of installing packages across spaces more generally. If dashboards were shareable and a package containing a dashboard got installed into a second space, would we want to share the existing dashboard into that space or create a new dashboard there? And what about data views? What does it mean to install a package into a space when the package also contains non-space-isolated "assets", e.g. when a package creates an index and sets up ILM policies?

@hop-dev
Contributor

hop-dev commented Oct 24, 2022

2a has some edge cases which could cause a mess

Agreed. I hadn't thought through the implications of changing saved object IDs; I don't think periodically migrating is desirable for these tags, especially as they are quite inconsequential to the user.

Isn't (1) closer to what we really want?

Yes, on reflection I think so. Is changing from 'multiple-isolated' to 'multiple' a big change? I guess we would need to update the tagging client to be space-aware. I will create an issue for this.

This makes me wonder about the ideal behaviour of installing packages across spaces more generally...

All very good points. We will be looking at this eventually and I agree it's going to be complicated! For now we just want Fleet not to break when installing packages in different spaces; I agree the UX is not great at the minute. Currently this tag saved object issue prevents users from installing any package in a second space once a package has been installed in another.

@hop-dev
Contributor

hop-dev commented Oct 24, 2022

@rudolf I've created this issue for making tags shareable; I'd love to get your opinion on it: #143869

@hop-dev
Contributor

hop-dev commented Oct 25, 2022

@kpollich I've moved this to blocked while I try to figure out a way forward with the core team.

@hop-dev
Contributor

hop-dev commented Oct 25, 2022

@rudolf @kpollich After going round the houses a bit, I am back to proposing a modification of 2.a above.

Solution 3: Create tag saved object in each space, but keep legacy tag + no SO migration

As in 2.a above, use the spaceId in the tag ID, so tag:managed would become tag:managed-default.

However, if tag:managed already exists, then use that instead. This saves us from having to do a fragile and complicated migration. Pseudo code would look like:

const legacyTag = getTag('managed');
let tag = legacyTag ?? getTag('managed-default');
if (!tag) {
  tag = createTag('managed-default');
}

// assign tag to assets
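
A slightly more concrete sketch of the same flow against the saved objects client is below. The getTagById helper, the attributes, and the exact ID format are assumptions for illustration, not the final implementation:

// Sketch: prefer the legacy space-agnostic tag if it exists, otherwise the
// per-space tag, and only create a new tag when neither is found.
async function findOrCreateManagedTag(soClient, spaceId) {
  const legacyTag = await getTagById(soClient, 'managed');
  if (legacyTag) return legacyTag;

  const spaceTag = await getTagById(soClient, `managed-${spaceId}`);
  if (spaceTag) return spaceTag;

  // overwrite: true still protects against concurrent installs racing to
  // create the per-space tag.
  return soClient.create(
    'tag',
    { name: 'Managed', description: 'Tag for Fleet-managed assets', color: '#0077CC' },
    { id: `managed-${spaceId}`, overwrite: true }
  );
}

// Hypothetical helper: treat a "not found" error from the client as undefined.
async function getTagById(soClient, id) {
  try {
    return await soClient.get('tag', id);
  } catch (e) {
    return undefined;
  }
}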

Why avoid migrating?

The migration would have to create the new tag, delete the old one and re-assign the assets to the new tag. This would not be atomic and we would have to do it on plugin start, which would be complex and fragile.

@kpollich
Member Author

Thanks, @hop-dev - the new solution makes sense to me. I appreciate you describing why we can't do this as a migration here.

@hop-dev
Contributor

hop-dev commented Oct 27, 2022

The fix has been backported to 8.5.x as I thought it was severe enough 👍

@kpollich added the v8.6.0, v8.5.1, and QA:Needs Validation labels on Nov 9, 2022
@ghost

ghost commented Nov 10, 2022

Hi @kpollich,

We have re-validated this issue on the latest 8.6.0 SNAPSHOT Kibana Staging environment and found that the issue is fixed.

Build details:

Version: 8.6.0 SNAPSHOT
Build: 58097
Commit: 8094003b4664dbb5c802d90d3e1bda7422327d1f

Below are the observations:

  • When installing integrations in a non-default space, the installed integration is visible under Installed Integrations in both the default space and the non-default space.

Screen Recording:

Spaces.-.Elastic.-.Google.Chrome.2022-11-10.12-52-07_Trim.mp4

Hence, marking this issue as QA: Validated.

Thanks!

@ghost added the QA:Validated label on Nov 10, 2022
@ghost

ghost commented Nov 14, 2022

Hi @kpollich,

We have created one test case for this feature under our Fleet Test Suite:

Please let us know if anything else is required from our end.

Thanks!

@ghost

ghost commented Nov 15, 2022

Hi @kpollich,

We have re-validated this issue on the latest 8.5.1 BC1 Kibana Staging environment and found that the issue is fixed.

Build details:

VERSION: 8.5.1
BUILD: 57136
COMMIT: 87149bfd06f4fe41dbfa7e95461294e9dadfb1d8

Below are the observations:

  • When installing integrations in a non-default space, the installed integration is visible under Installed Integrations in both the default space and the non-default space.

Screen Recording:

Home.-.Elastic.-.Google.Chrome.2022-11-15.16-04-55.mp4

Please let us know if anything is missing from our end.
Thanks.
