Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix log level value reported by Agent to Fleet #4838

Merged
merged 9 commits into from
Jun 13, 2024

Conversation

pchila
Copy link
Contributor

@pchila pchila commented Jun 3, 2024

What does this PR do?

This PR fixes the log level reported by Agent when checking in with Fleet.
Since merging #3090, a managed elastic-agent has now 2 sources of configuration for log level:

  1. agent-specific log level setting saved in fleet.enc
  2. log level setting in the Fleet policy itself (with lower priority than option 1.)

Before this change, the agent was sending only the information coming from 1. (with a default value equal to info), now we send back to Fleet the actual log level specified in coordinator state (which is the end result of evaluating the multiple settings with the correct priority)

There has also been a modification of the integration test related to the feature with additional assertions on log level received by Fleet.

Why is it important?

The log level is now correctly displayed in Fleet UI regardless for each agent, regardless of how it's been set (agent settings, policy or default value)

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • [ ] I have added an entry in ./changelog/fragments using the changelog tool
  • I have added an integration test or an E2E test

Disruptive User Impact

How to test this PR locally

  1. Spin up a Stack deployment
  2. Create a policy without any specific log level settings (check in Advanced settings under the Policy's settings Tab
    image
  3. Install and enroll an agent and verify that the default log level reported is info
  4. Set some log level in policy and check in the agent details that the change is reported correctly (Ignore the offline state in the image below, the screenshot was taken after the agent has been uninstalled)
    image
  5. Set some specific (different!) log level in the agent settings tab and check that in a few seconds that change is also reflected in UI
  6. Have fun! Try any combination of setting/unsetting the log level and verify that the agent log level is updated within a reasonable time in the agent details tab

Related issues

Questions to ask yourself

  • How are we going to support this in production?
  • How are we going to measure its adoption?
  • How are we going to debug this?
  • What are the metrics I should take care of?
  • ...

@pchila pchila added bug Something isn't working Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team Team:Elastic-Agent Label for the Agent team backport-skip skip-changelog labels Jun 3, 2024
@pchila pchila self-assigned this Jun 3, 2024
@pchila pchila marked this pull request as ready for review June 3, 2024 19:16
@pchila pchila requested a review from a team as a code owner June 3, 2024 19:16
@elasticmachine
Copy link
Collaborator

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

Copy link
Contributor

@michalpristas michalpristas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@@ -329,6 +329,15 @@ func (f *FleetGateway) execute(ctx context.Context) (*fleetapi.CheckinResponse,
// convert components into checkin components structure
components := f.convertToCheckinComponents(state.Components)

if ecsMeta.Elastic == nil || ecsMeta.Elastic.Agent == nil {
// escMeta struct is incomplete: log a warning
f.log.Warnw("Agent ECSMetadata struct is missing/incomplete", "elastic_ecs_metadata", ecsMeta.Elastic)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't this become a bit chatty in case ecs metadata is in fact incomplete?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can, the fact is that the current implementation of AgentInfo.ECSMetadata() guarantees that those structures exist but they are pointers so I would like to avoid a segmentation fault when something changes...
Maybe I can replace that with a unit test + assertion that AgentInfo.ECSMetadata() always returns initialized Elastic and Elastic.Agent pointers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed in 4973844 in favor of assertions on ECSMetadata() unit tests

@pchila pchila requested a review from michalpristas June 4, 2024 15:13
Copy link
Contributor

@blakerouse blakerouse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically a 1 line change, with lots of tests!

Looks good.

@jlind23
Copy link
Contributor

jlind23 commented Jun 10, 2024

buildkite test it

@jlind23 jlind23 enabled auto-merge (squash) June 10, 2024 14:41
@pchila pchila force-pushed the fix-loglevel-value-for-fleet-checkin branch from 2657ada to 75403ca Compare June 11, 2024 10:00
@ycombinator ycombinator removed the request for review from kaanyalti June 11, 2024 22:00
@jlind23 jlind23 merged commit a206cf7 into elastic:main Jun 13, 2024
12 of 14 checks passed
rdner added a commit that referenced this pull request Jun 14, 2024
rdner added a commit that referenced this pull request Jun 14, 2024
pchila added a commit to pchila/elastic-agent that referenced this pull request Jun 17, 2024
pchila added a commit that referenced this pull request Jun 18, 2024
…" (#4938)" (#4941)

* Revert "Revert "Fix log level value reported by Agent to Fleet (#4838)" (#4938)"

This reverts commit 2b2e8d0.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-skip bug Something isn't working skip-changelog Team:Elastic-Agent Label for the Agent team Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

log_level metadata is not accurate
5 participants