Configure Elastic Agent host path volume to point to correct path #5890

Merged
4 commits merged on Jul 29, 2022

Conversation

pebrc
Collaborator

@pebrc pebrc commented Jul 27, 2022

Fixes #4428

Mount the host path volume at the path where Elastic Agent stores its local state. This should ensure that no duplicate ingestion happens while the Agent keeps running on the same version. After version upgrades, the internal sub-paths within the state directory still change (see elastic/elastic-agent#750), so events will be re-ingested.

This is a potentially breaking change because I am now configuring the host path volume independently of the mode the agents are running in (Fleet or stand-alone). It will be breaking for users who have configured a Deployment with multiple replicas where the Kubernetes scheduler places more than one replica on the same K8s node.
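
To make the affected setup concrete, here is a minimal, incomplete sketch of an Agent resource that could hit this case. The resource name, namespace, and version are hypothetical, the references to Kibana and Elasticsearch are omitted, and the field names assume the `agent.k8s.elastic.co/v1alpha1` Agent CRD.

```yaml
# Illustrative fragment only, not a complete manifest.
apiVersion: agent.k8s.elastic.co/v1alpha1
kind: Agent
metadata:
  name: fleet-server      # hypothetical name
  namespace: default      # hypothetical namespace
spec:
  version: 8.3.2          # assumed version, for illustration
  mode: fleet
  fleetServerEnabled: true
  deployment:
    # With more than one replica, the scheduler may place two Pods on the
    # same node, where both would then mount the same host path directory.
    replicas: 2
```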

Can we mitigate the effects of this? We could optimistically assume that Deployments with multiple replicas will be mainly used for Fleet Server and not use the host path volumes in this case.

I tried to think of a test case, but it is tricky to see whether duplicates are ingested without restarting Elastic Agent Pods, and there is no clear indication of ingest state (e.g. have we waited long enough for duplicates to appear?).

@pebrc pebrc added the v2.4.0 label Jul 27, 2022
@botelastic botelastic bot added the triage label Jul 27, 2022
@botelastic botelastic bot removed the triage label Jul 27, 2022
@pebrc pebrc added the >bug Something isn't working label Jul 27, 2022
@barkbay barkbay self-assigned this Jul 28, 2022
Contributor

@barkbay barkbay left a comment

Since we are considering flagging this as a breaking change, and since Agent is still considered "experimental", I'm wondering whether we should also change the path on the host node's filesystem:

  - hostPath:
      path: /var/lib/<ns>/<name>/agent-data
      type: DirectoryOrCreate

I never realised that the first directory created by the operator is named after the namespace. However, /var/lib is not used exclusively by ECK, so I'm wondering whether we should consider the case where it conflicts with an existing directory. For example, GKE nodes have an existing directory named /var/lib/metrics. Creating an Agent in a namespace named metrics would store Agent data in a directory that might have a different purpose and lifecycle. It also does not seem consistent with the naming used by the Agent project, which is /usr/share/elastic-agent/...

Should we change the directory path on the host to something like /usr/share/elastic-agent/server/<ns>/<name>/state ?
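
As an illustration of the conflict described above, this is how the host path shown earlier would resolve for an Agent in a namespace named metrics. The Agent name `my-agent` is hypothetical.

```yaml
# Host path pattern under review: /var/lib/<ns>/<name>/agent-data
# Resolved for a hypothetical Agent "my-agent" in namespace "metrics":
- hostPath:
    path: /var/lib/metrics/my-agent/agent-data   # lands inside GKE's existing /var/lib/metrics
    type: DirectoryOrCreate
```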

@pebrc
Collaborator Author

pebrc commented Jul 28, 2022

Should we change the directory path

Yes I think that sounds reasonable.

/usr/share/elastic-agent/server/<ns>/<name>/state

Why the server bit?

@pebrc
Collaborator Author

pebrc commented Jul 28, 2022

Also, /var/lib seems to be the right place for application state data?

Should we do /var/lib/elastic-agent/<ns>/<name>/state?

@barkbay
Contributor

barkbay commented Jul 28, 2022

Why the server bit?

I was wondering if it would make sense to add the Fleet "type" in the path, but you're right it's not required.

Should we do /var/lib/elastic-agent/<ns>/<name>/state?

The intent was to be consistent with the path in the Pod. It is definitely not a strong opinion, though, and you're right: I think /var/lib is better suited.
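
For reference, a rough sketch of the Pod spec layout this thread converges on. Only the host path comes from the discussion; the volume name, container name, and in-Pod mount path are assumptions made for illustration, and `<ns>` and `<name>` stay as placeholders.

```yaml
# Sketch only: names other than the host path are assumptions.
volumes:
  - name: agent-data                # assumed volume name
    hostPath:
      path: /var/lib/elastic-agent/<ns>/<name>/state
      type: DirectoryOrCreate
containers:
  - name: agent                     # assumed container name
    volumeMounts:
      - name: agent-data
        mountPath: /usr/share/elastic-agent/state   # assumed in-Pod state path
```

Prefixing the host path with `elastic-agent` keeps the namespace-derived directory out of the top level of /var/lib, which addresses the collision concern raised earlier.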

@barkbay
Contributor

barkbay commented Jul 28, 2022

Can we mitigate the effects of this? We could optimistically assume that Deployments with multiple replicas will be mainly used for Fleet Server and not use the host path volumes in this case.

Maybe we could mention it in our documentation as a "limitation"?

Contributor

@barkbay barkbay left a comment

LGTM

@pebrc
Collaborator Author

pebrc commented Jul 29, 2022

run/e2e-tests tags=agent

@pebrc
Collaborator Author

pebrc commented Jul 29, 2022

run/e2e-tests tags=agent

@pebrc pebrc merged commit dcd19e8 into elastic:main Jul 29, 2022
@david-kow david-kow changed the title from "Configure host path volume to point to correct path" to "Configure Elastic Agent host path volume to point to correct path" Aug 3, 2022
fantapsody pushed a commit to fantapsody/cloud-on-k8s that referenced this pull request Feb 7, 2023
Labels
>breaking >bug Something isn't working v2.4.0
Development

Successfully merging this pull request may close these issues.

Adjust Elastic Agent controller to make use of new container state handling in 7.13
2 participants