Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integration tests: test upgrades including grace period #2796

Closed
Tracked by #2176
pchila opened this issue Jun 7, 2023 · 5 comments · Fixed by #2957
Closed
Tracked by #2176

Integration tests: test upgrades including grace period #2796

pchila opened this issue Jun 7, 2023 · 5 comments · Fixed by #2957
Assignees
Labels
enhancement New feature or request Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team

Comments

@pchila
Copy link
Contributor

pchila commented Jun 7, 2023

Describe the enhancement:
In agent integration tests it should be possible to fully test the upgrade process by waiting for the grace period to expire thus confirming that the upgrade is successful.
Currently we cannot do that because the test would time out first (see #2752)
We need to either increase the timeout of the tests or reduce the grace period used by agent in integration tests (maybe as proposed here

Describe a specific use case for the enhancement or feature:
Any upgrade scenario that needs to be tested using integration tests

What is the definition of done?
Upgrade integration test can wait for the upgrade to be completed (including the grace period) without timing out

@pchila pchila added enhancement New feature or request Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team labels Jun 7, 2023
@cmacknz
Copy link
Member

cmacknz commented Jun 7, 2023

+1 to allowing us to reduce the grace period rather than having a test that always takes 10+ minutes.

@pierrehilbert
Copy link
Contributor

Should we add this to this list: #2176 ?

@cmacknz
Copy link
Member

cmacknz commented Jun 7, 2023

Yes

@ycombinator
Copy link
Contributor

ycombinator commented Jun 27, 2023

The general approach I'm thinking of taking here is:

  • Add upgrade watcher settings (grace period, etc.) to Agent configuration (elastic-agent.yml). TBD on whether we should document these and include them in the reference configuration file, as we most likely don't want end-users messing around with these.
  • Have the watch sub-command load the Agent configuration file, read these settings, and use them instead of the hardcoded constants today. The constants will serve as default values for these settings.
  • Provide tooling in the testing framework to modify the default Agent configuration before Agent is installed and run. Upgrade tests can use this tooling to reduce the upgrade watcher grace period, etc. as part of the test.

An alternative approach I considered was introducing the idea of a "developer-only" Agent configuration file, something like elastic-agent.dev.yml, which would reside as a sibling of elastic-agent.yml. If this file were present, it would be read by the upgrade watcher and the configuration within it would be used instead of the hardcoded constants today. We would, of course, not document this developer-only feature in the official Agent docs.

A third approach I'm considering is sort of a hybrid of the two mentioned above. Similar to the second approach, there could still be an optional elastic-agent.dev.yml file. If present, it would be read by the config package and merged with the contents of elastic-agent.yml. The upgrade watcher would then use the merged configuration instead of the hardcoded constants today.

I'm currently leaning towards the third approach, as it seems most extensible while also not polluting the main/production Agent configuration file.

@cmacknz
Copy link
Member

cmacknz commented Jun 28, 2023

TBD on whether we should document these and include them in the reference configuration file, as we most likely don't want end-users messing around with these.

You may as well document it to save future developers time in discovering this exists, I can't see users modifying this unless we give them a reason to do it. I could see us choosing to set this value to zero as a way of disabling the watcher if we ever find a bug in it, which might be useful in support cases.

I think just putting this in the main agent configuration file is fine, adding the .dev.yml file is additional complexity to protect us from an unlikely outcome. Realistically anyone who can figure out what the upgrade watcher configuration does can also figure out the .dev.yml file exists and just try to use that instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants