[Fleet] Support for Remote Elasticsearch cluster #104986

Closed
12 tasks done
mostlyjason opened this issue Jul 8, 2021 · 48 comments · Fixed by #169252 or #172464
Labels
QA:Validated (Issue has been validated by QA), Team:Fleet (Team label for Observability Data Collection Fleet team)


@mostlyjason
Contributor

mostlyjason commented Jul 8, 2021

We want to allow users to send agent monitoring data to a remote ES cluster. To do this, the user adds a new Elasticsearch output for the remote cluster. It's the same as a regular ES output, but it lets the user manually set a service token for the remote cluster. The service token is passed down statically to Fleet Server, which generates API keys for each agent. Ideally, the service token carries limited permissions.

Fleet Server requirements

Open questions:

  1. Should we allow all the authentication options for ES output https://www.elastic.co/guide/en/beats/filebeat/current/elasticsearch-output.html?
    • We should require a service token so we can generate API keys per agent, which is more secure than shared credentials
  2. Should we provide a UI or allow users to edit the YAML?
    • We should have a UI element allowing users to paste in the service token, and link to docs so they can learn how to generate it

Implementation Tasks

🖌️ = UX input wanted
🟠 = further design or scoping needed

  • -1 Create feature flag
  • 0. 🟠 Create new service account elastic/fleet-server-remote with minimum privileges necessary for agent data to be sent to the cluster [Fleet] Create new service account elastic/fleet-server-remote for remote agent connections elasticsearch#85747 (comment)
    • here is where we currently create the service accounts.
  • 1. Create new remote-elasticsearch output type (backend)
    • 1.a Add validation on create - the remote-elasticsearch output type cannot have is_default (default output for integration data) set to true.
    • [ ] 1.b Add new remote service token encrypted field to agent policy (service_token?).
    • 1.b Add new remote service token as a secret field to output (dependent on Output secrets work being ready )
    • 1.c DO NOT send the service token to the agent, it should only be used by fleet server to generate API keys in remote ES
  • 2. Create remote elasticsearch output form (designs from
    • 2.a Form should have the following:
      • Hostname input
      • Service token input
      • Instructions on how to generate service token using either:
      • “Make this output the default for agent integrations.” should be removed as remote ES does not support integration data
      • Additional settings input
      • If the user has not set xpack.encryptedSavedObjects.encryptionKey then form should be disabled (this has now been added to logstash output as part of [Fleet] Support logstash as an output type in API and Kibana config #125990)
  • 3. 🖌️ Add way to generate a service token for elastic/fleet-server-remote to fleet UI (design/discussion here)
    • UI work to create button (a lot can be directly copied from the fleet server steps)
    • should check for manage_service_account privilege.
    • API to generate elastic/fleet-server-remote token will be clone of existing API (or we could add a ?remote query param to change the service account the token is generated from)
  • 4. Do not allow remote elasticsearch output to be selected as integrations data output (should not be shown in dropdown)
  • 5. 🟠 If fleet server reports the remote cluster is unreachable then display a banner on the fleet page with the error
    • not sure if this is going to be in scope yet, there is some ongoing discussion here
    • this could be reported by the fleet-server agent status, which could go to unhealthy if can't access the remote ES cluster
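Tasks 1.a and 1.c above can be sketched as two small checks. This is a minimal TypeScript sketch, not the Kibana implementation: `OutputLike`, `validateRemoteOutput`, and `toAgentOutput` are hypothetical names, and only the field names `type`, `is_default`, `hosts`, and `service_token` come from this thread.

```typescript
// Hypothetical shape of an output; not the real Kibana saved object type.
interface OutputLike {
  name: string;
  type: "elasticsearch" | "remote-elasticsearch";
  hosts: string[];
  is_default: boolean;
  service_token?: string;
}

// Task 1.a: a remote-elasticsearch output cannot be the default output
// for integration data, and it needs a service token to be usable.
function validateRemoteOutput(output: OutputLike): string[] {
  const errors: string[] = [];
  if (output.type === "remote-elasticsearch") {
    if (output.is_default) {
      errors.push("remote-elasticsearch output cannot be the default output for integration data");
    }
    if (!output.service_token) {
      errors.push("remote-elasticsearch output requires a service token");
    }
  }
  return errors;
}

// Task 1.c: return a copy without the service token, so the token is
// never part of what gets sent down to an agent.
function toAgentOutput(output: OutputLike): OutputLike {
  const copy: OutputLike = { ...output, hosts: [...output.hosts] };
  delete copy.service_token;
  return copy;
}
```

Keeping both checks as pure functions makes them easy to unit test independently of the HTTP route that would call them.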
@mostlyjason added the Team:Fleet label (Team label for Observability Data Collection Fleet team) on Jul 8, 2021
@elasticmachine
Contributor

Pinging @elastic/fleet (Team:Fleet)

@ruflin
Member

ruflin commented Jul 9, 2021

Ideally this would also work with API keys and the user should not have to figure out all the exact permissions for this to work. We would need to figure out how the user could easily retrieve this API key.

@dborodyansky
Contributor

@mostlyjason @jen-huang Updated mockup per our conversation below. Please let me know of any questions or concerns.

image

@hop-dev
Contributor

hop-dev commented Mar 23, 2022

Hi @dborodyansky, if you get a chance: we will need to prevent the user from adding a remote ES output if they haven't set xpack.encryptedSavedObjects.encryptionKey. I'm not sure whether to disable the whole form and show a banner in the flyout. More details under 2.a above:

🖌️ If the user has not set xpack.encryptedSavedObjects.encryptionKey, an error should be shown saying that it must be set to add remote elasticsearch output as we save the service token. Also link to instructions on how to set it? (we had something similar for alerting in 7.9 https://i.stack.imgur.com/BL3UC.png)

@dborodyansky
Contributor

@hop-dev Here is a proposal to show a callout for this case, and disable subsequent fields and save button to prevent errors. What are your thoughts?
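
The behaviour proposed here (show a callout, disable the remaining fields and the save button) could be modelled as a small pure function. A sketch in TypeScript, assuming a hypothetical `RemoteOutputFormState` shape and a boolean that says whether `xpack.encryptedSavedObjects.encryptionKey` is configured; how Kibana exposes that flag to the UI is not shown in this thread.

```typescript
// Hypothetical view state for the remote ES output form.
interface RemoteOutputFormState {
  showEncryptionKeyCallout: boolean;
  fieldsDisabled: boolean;
  saveDisabled: boolean;
}

// Without an encryption key the service token cannot be stored safely,
// so the callout is shown and the rest of the form is locked.
function remoteOutputFormState(
  encryptionKeyConfigured: boolean,
  formIsValid: boolean
): RemoteOutputFormState {
  const blocked = !encryptionKeyConfigured;
  return {
    showEncryptionKeyCallout: blocked,
    fieldsDisabled: blocked,
    saveDisabled: blocked || !formIsValid,
  };
}
```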

image

@jen-huang
Contributor

“Make this output the default for agent integrations.” should be disabled or removed (remote ES does not support integration data)

I did not realize this limitation, but that tracks with the documented proposal. @dborodyansky can we also get this included in the mockups?

@dborodyansky
Contributor

Got it @jen-huang. Removing the toggle seems appropriate in this case. Mockups updated. Source here

@hop-dev
Contributor

hop-dev commented Apr 6, 2022

@dborodyansky I have just realised something missing from the output form design. We will need to instruct the user how to create the service token to paste in the box. There are 2 ways they will be able to do it which I've outlined below.

With the Logstash output we have inline instructions in the create output form, I guess we could do something similar here, but maybe using tabs for the 2 options? Any help would be really appreciated.

Here are the two ways to generate a token:

Option 1. Kibana

If the remote ES has a Kibana instance connected to it, the user will be able to go to Fleet > Settings to generate a service token. I've made a mockup of the UI here

Option 2. Command Line
If the user doesn't have access to Kibana, they can use the following command to generate the token:

bin/elasticsearch-service-tokens create elastic/fleet-server-remote remote-es-token

And then paste it back into the input.
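
Where shell access to the remote node is not available, Elasticsearch also exposes a REST endpoint for creating service account tokens (`POST /_security/service/<namespace>/<service>/credential/token/<token_name>`). The sketch below only builds the request path and reads the token out of a response body; the helper names are hypothetical, and it assumes the `elastic/fleet-server-remote` account proposed in this issue exists on the remote cluster.

```typescript
// Response body of the Elasticsearch create service token API.
interface CreateServiceTokenResponse {
  created: boolean;
  token: { name: string; value: string };
}

// Builds the request path for an account given as "<namespace>/<service>".
function serviceTokenPath(account: string, tokenName: string): string {
  const slash = account.indexOf("/");
  const malformed =
    slash <= 0 ||
    slash === account.length - 1 ||
    account.indexOf("/", slash + 1) !== -1;
  if (malformed) {
    throw new Error(`expected "<namespace>/<service>", got "${account}"`);
  }
  const namespace = encodeURIComponent(account.slice(0, slash));
  const service = encodeURIComponent(account.slice(slash + 1));
  return `/_security/service/${namespace}/${service}/credential/token/${encodeURIComponent(tokenName)}`;
}

// Returns the secret value the user would paste into the output form.
function extractTokenValue(body: CreateServiceTokenResponse): string {
  if (!body.created) {
    throw new Error("service token was not created");
  }
  return body.token.value;
}
```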

@dborodyansky
Contributor

@hop-dev What do you think about the following approach?

image

@hop-dev
Contributor

hop-dev commented Apr 7, 2022

Looks great @dborodyansky! I've just seen there are some associated docs with the command here, could we add a link somewhere?

@hop-dev
Contributor

hop-dev commented Apr 7, 2022

Hi @dborodyansky, I think this is the last design we will need. We will need a place for users to generate a service token in the settings area. I stole our existing service token generation UI and created a mockup of what it could look like in the design doc here. That was only for illustration; I'm not sure the text is correct.

I could also do with some guidance on the behaviour when a user doesn't have the manage_service_account elasticsearch cluster privilege, would we hide the section or disable it?

Let me know if I can provide more info.

@dborodyansky
Contributor

@hop-dev I think what you have created is great as it keeps consistency with token generation elsewhere. I have some minor suggestions, though would defer to copywriters on final copy.

  • Seems "Remote service token" may be better as sentence case, not title case.
  • Omitting the fill prop from the button would balance the affordance in context to other elements on the screen and expected usage. This is not a primary call-to-action for the screen overall.
  • Success callout may need a success message preface such as "Service token generated..."
  • Since we are setting context for the token code already, the additional label over the code block may be redundant.

Regarding privileges, is it likely that a user without manage_service_account privileges would have access to create the remote output in the first place? If not, then removing the section entirely is more suitable. If so, then disabling and clarifying reason provides useful information.

image

@mholttech

I was just looking for this capability. I hope to see it implemented soon, as it would give us more options for deploying and managing Fleet agents and for separating the data into different clusters.

@juliaElastic
Contributor

juliaElastic commented Sep 1, 2023

Reviewed the implementation tasks in the description, and I have some comments/questions:

The rest of the tasks look good.

@kpollich
Member

kpollich commented Sep 1, 2023

I think storing the service token as a secret is best now that we're adding support for secrets.

@juliaElastic
Contributor

@kilfoyle Sure thing, I've set up a demo for Monday.

@IanLee1521

@juliaElastic - if it would be helpful, I'm interested in this feature and would be willing to give feedback and help review the docs that come up if there is a way to do that. Thanks!

@nimarezainia
Contributor

@IanLee1521 will be in touch. thank you.

delanni pushed a commit to delanni/kibana that referenced this issue Nov 6, 2023
## Summary

Resolves elastic#104986

Opening up for review, the feature flag is off for now, and the TODO
items can come in follow up prs.

TODO:
- make service_token a secret field in output - depends on
elastic#157458
- should link to remote elasticsearch docs in UI - depends on
elastic/ingest-docs#530
- remote es connection check and report on UI - depends on fleet-server
to report unhealthy status if can't access the remote ES cluster
- enable feature flag when feature is ready

Added Remote ES output type, support to generate service token for
`fleet-server-remote` account, support to create and edit remote es
output.
Added validation to disallow making remote ES output as default for
integration data.

## How to test locally?
Enable feature flag by adding this to `kibana.dev.yml`:
```
xpack.fleet.enableExperimental: ['remoteESOutput']
```
See e2e test instructions here:
elastic/fleet-server#3051

## Generate service token

Create remote service token API:
```
POST kbn:/api/fleet/service_tokens
{
  "remote": true
}

// kibana logs out
[2023-10-19T16:22:05.776+02:00][DEBUG][plugins.fleet] Creating service token for account elastic/fleet-server-remote
```

## Add/Edit output flyout:
Add output flyout:

<img width="675" alt="image"
src="https://github.com/elastic/kibana/assets/90178898/dafc7d0e-05be-467f-871c-c4256fc833f6">

Edit output flyout:

<img width="660" alt="image"
src="https://github.com/elastic/kibana/assets/90178898/0d58fcfb-8c22-4e27-8719-db86ecba2e8d">

Remote ES output not allowed to be set as integrations data output in
agent policies, only as monitoring output:
<img width="690" alt="image"
src="https://github.com/elastic/kibana/assets/90178898/675279cd-1c89-4069-9e07-e448aa796885">
<img width="683" alt="image"
src="https://github.com/elastic/kibana/assets/90178898/6f67179d-b971-497f-9b04-3d3db5a42976">


Example API call to create/update output:
```
POST kbn:/api/fleet/outputs
{"name":"remote1","type":"remote-elasticsearch","hosts":["http://localhost:9200"],"is_default":false,"is_default_monitoring":false,"config_yaml":"","service_token":"token1","proxy_id":null}

PUT kbn:/api/fleet/outputs/39168010-6db8-11ee-9bf3-ed5492034535
{"name":"remote2","type":"remote-elasticsearch","hosts":["http://localhost:9200"],"is_default":false,"is_default_monitoring":false,"config_yaml":"","service_token":"token2","proxy_id":null}
```
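
For scripting these calls, the request bodies above can be given a type. A sketch only: `RemoteEsOutputBody`, `OutputRequest`, and `buildOutputRequest` are hypothetical names, the field list is copied from the example payloads, and nothing here is taken from Kibana's own type definitions.

```typescript
// Fields as they appear in the example payloads above.
interface RemoteEsOutputBody {
  name: string;
  type: "remote-elasticsearch";
  hosts: string[];
  is_default: false; // remote ES output may not be the default for integration data
  is_default_monitoring: boolean;
  config_yaml: string;
  service_token: string;
  proxy_id: string | null;
}

// Transport-agnostic description of the HTTP call to make.
interface OutputRequest {
  method: "POST" | "PUT";
  path: string;
  body: string;
}

// Create when no id is given, update the existing output otherwise.
function buildOutputRequest(output: RemoteEsOutputBody, id?: string): OutputRequest {
  return id === undefined
    ? { method: "POST", path: "/api/fleet/outputs", body: JSON.stringify(output) }
    : { method: "PUT", path: `/api/fleet/outputs/${encodeURIComponent(id)}`, body: JSON.stringify(output) };
}
```

Typing `is_default` as the literal `false` lets the compiler reject a payload that tries to make a remote output the integrations default, mirroring the server-side validation described in this PR.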

### Checklist

- [x] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
kibanamachine pushed a commit to sabarasaba/kibana that referenced this issue Nov 20, 2023
## Summary

Relates elastic#104986

Hide Remote Elasticsearch output in serverless from Create/Edit output
flyout.

Should we also add validation to prevent creating it in API?


Verified locally by starting kibana in serverless mode:
<img width="751" alt="image"
src="https://github.com/elastic/kibana/assets/90178898/061514f3-25fe-4e52-ad85-194cc612bea7">

### Checklist

- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
kibanamachine added a commit to XavierM/kibana that referenced this issue Nov 23, 2023
## Summary

Related to elastic#104986

Updated doc link to the new remote ES output doc.
Leads to
https://www.elastic.co/guide/en/fleet/master/monitor-elastic-agent.html#external-elasticsearch-monitoring

<img width="682" alt="image"
src="https://github.com/elastic/kibana/assets/90178898/a8b9a6bc-60df-4826-8a6a-0fa45b6011bc">

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
jpdjere pushed a commit to jpdjere/kibana that referenced this issue Nov 28, 2023
## Summary

Related to elastic#104986

Making remote ES output's service_token a secret.

fleet-server change here:
elastic/fleet-server#3051 (comment)

Steps to verify:
- Enable remote ES output and output secrets in `kibana.dev.yml`
locally:
 ```
xpack.fleet.enableExperimental: ['remoteESOutput',
'outputSecretsStorage']
```
- Start es, kibana, fleet-server locally and start a second es locally
 - see detailed steps here: elastic/fleet-server#3051
- Create a remote ES output, verify that the service_token is stored as a secret reference
```
GET .kibana_ingest/_search?q=type:ingest-outputs
```
- Verify that the enrolled agent sends data to the remote ES successfully

<img width="561" alt="image" src="https://github.com/elastic/kibana/assets/90178898/122d9800-a2ec-47f8-97a7-acf64b87172a">
<img width="549" alt="image" src="https://github.com/elastic/kibana/assets/90178898/e1751bdd-5aaf-4f68-9f92-7076b306cdfe">



### Checklist

- [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios
juliaElastic added a commit that referenced this issue Nov 29, 2023
## Summary

Related to #104986

Found a bug in `diffOutputSecretPaths` where the output secret was deleted
when an output was updated without changing its service_token. Added unit
tests to cover the logic.

Steps to verify:
- enable feature flags: `xpack.fleet.enableExperimental:
['remoteESOutput', 'outputSecretsStorage']`
- create a remote es output with a service_token
- check that the service_token is stored as secret in `.fleet-secrets`
- update host in remote es output
- verify that the secret is not deleted
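
The intended behaviour can be illustrated with a simplified diff. This is not the Kibana `diffOutputSecretPaths` implementation; it is a hypothetical sketch that treats each stored secret as a `path` plus a reference `id`, each newly supplied secret as a `path` plus a plain `value`, and keeps an existing secret whenever the update does not supply a new value for that path.

```typescript
interface SecretRef { path: string; id: string }
interface NewSecret { path: string; value: string }

interface SecretDiff {
  toCreate: NewSecret[];
  toDelete: SecretRef[];
  noChange: SecretRef[];
}

// existing: secret references stored on the output before the update.
// incoming: plain-text secrets present in the update request
//           (empty when the update leaves the secrets untouched).
function diffSecrets(existing: SecretRef[], incoming: NewSecret[]): SecretDiff {
  const incomingPaths = new Set<string>();
  for (const secret of incoming) {
    incomingPaths.add(secret.path);
  }
  const diff: SecretDiff = { toCreate: [...incoming], toDelete: [], noChange: [] };
  for (const ref of existing) {
    if (incomingPaths.has(ref.path)) {
      diff.toDelete.push(ref); // replaced by a newly supplied value
    } else {
      diff.noChange.push(ref); // not part of this update: keep it
    }
  }
  return diff;
}
```

With this shape, updating only the host of an output passes an empty `incoming` list, so the stored `service_token` reference ends up in `noChange` rather than `toDelete`.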


### Checklist

- [x] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
@juliaElastic
Contributor

Filed a bug for Agent that output errors are not reflected in unit state: https://github.com/elastic/ingest-dev/issues/2692

juliaElastic added a commit that referenced this issue Dec 5, 2023
## Summary

Closes #104986

Enable feature flags for `remoteESOutput` and `outputSecretsStorage`.

The feature is ready when #172181
and elastic/fleet-server#3127 are merged.

Output secret storage
[issues](#157458) are closed, so
I think the feature flag for that should be enabled too. cc
@jillguyonnet
juliaElastic added a commit that referenced this issue Dec 5, 2023
## Summary

Relates elastic/fleet-server#3116

Relates #104986

Reading latest output health state from
`logs-fleet_server.output_health-default` data stream by output id, and
displaying error state on UI - Edit Output flyout.

Steps to verify:
- enable feature flag `remoteESOutput`
- add `remote_elasticsearch` output, can be a non-existent host for this
test
- add the output as monitoring output of an agent policy
- run fleet-server with the changes
[here](elastic/fleet-server#3116)
- enroll an agent
- wait until fleet-server starts reporting degraded state in the output
health data stream
- open edit output flyout on UI and verify that the error state is
visible
- when the connection is back again (update host to a valid one, or
remote es was temporarily down), the error state goes away
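
Selecting the latest health document per output can be sketched as follows. The document shape (`output`, `state`, `message`, `@timestamp`) and the state names are assumptions for illustration; the actual mapping of `logs-fleet_server.output_health-default` is defined by fleet-server, not by this sketch.

```typescript
// Assumed shape of one output health document.
interface OutputHealthDoc {
  output: string; // output id
  state: "HEALTHY" | "DEGRADED";
  message: string;
  "@timestamp": string; // ISO 8601
}

// Returns the most recent health document for the given output id, if any.
function latestOutputHealth(
  docs: OutputHealthDoc[],
  outputId: string
): OutputHealthDoc | undefined {
  let latest: OutputHealthDoc | undefined;
  for (const doc of docs) {
    if (doc.output !== outputId) continue;
    if (
      latest === undefined ||
      Date.parse(doc["@timestamp"]) > Date.parse(latest["@timestamp"])
    ) {
      latest = doc;
    }
  }
  return latest;
}

// The error state is shown only while the latest report is degraded,
// so it disappears once a newer healthy report arrives.
function shouldShowError(health: OutputHealthDoc | undefined): boolean {
  return health !== undefined && health.state === "DEGRADED";
}
```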

<img width="568" alt="image"
src="https://github.com/elastic/kibana/assets/90178898/46d0cf95-6aa4-4f7c-8608-4362ada4eb6c">

The UI was suggested in the design doc:
https://docs.google.com/document/d/19D0bX7oURf0yms4qemfqDyisw_IYB-OVw4oU-t4lf18/edit#bookmark=id.595r8l91kaq8

### Notes/suggestions:

- We might want to add the output state to the output list as well
(maybe as badges like agent health?) as it's not too visible in the
flyout (have to scroll down).
- Also the error state will be reported earliest when an agent is
enrolled and fleet-server can't create api key, so not immediately when
the output is added. It would be good to show the time of the last state
(e.g. how we display on agents last checkin x minutes ago)
- I think it would be beneficial to display the healthy state too.

Added badges to output list:
<img width="1233" alt="image"
src="https://github.com/elastic/kibana/assets/90178898/07ff06ec-b778-4420-975b-b46a0a18c7cc">

Added healthy state UI to Edit output:
<img width="627" alt="image"
src="https://github.com/elastic/kibana/assets/90178898/4222d849-c957-41d7-9606-b58493264115">


### Checklist

Delete any items that are not applicable to this PR.

- [x] Any text added follows [EUI's writing
guidelines](https://elastic.github.io/eui/#/guidelines/writing), uses
sentence case text and includes [i18n
support](https://github.com/elastic/kibana/blob/main/packages/kbn-i18n/README.md)
- [ ] [Unit or functional
tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)
were updated or added to match the most common scenarios
@amolnater-qasource

Hi Team,

As per the changes under elastic/fleet-server#3116 we have created 02 more testcases under testrail under links:

Please let us know if any other scenario needs to be added from our end.

Thanks!

@amolnater-qasource

amolnater-qasource commented Dec 29, 2023

Hi Team,

As per changes under #173353, we have updated 01 testcase for this feature under Fleet Test suite at link:

cc: @nchaulet

Thanks!

@amolnater-qasource

Hi Team,

We have executed 11 testcases under the Feature test run for the 8.12.0 release at the link:

Status:

PASS: 10
FAIL: 01

Build details:
VERSION: 8.12.0 BC4
BUILD: 70016
COMMIT: c2fda47
Artifact Link: https://staging.elastic.co/8.12.0-e9640208/summary-8.12.0.html

As the testing is completed on this feature, we are marking this as QA:Validated.

Please let us know if anything else is required from our end.
Thanks

@amolnater-qasource added the QA:Validated label (Issue has been validated by QA) and removed the QA:Needs Validation label (Issue needs to be validated by QA) on Jan 2, 2024