Fix: TF Agent host charm rebuilds COS scrape jobs based on supervisor config. #1027

Merged
rene-oromtz merged 1 commit into main from fix-tf-agent-charm on Apr 22, 2026

Conversation

gntzio (Collaborator) commented Apr 22, 2026

Description

After deploying the Grafana Agent hosts in TEL, the /etc/grafana-agent.yaml config was not properly populated with scrape endpoints.

Following investigation, the suspicion is that since charm.py is instantiated on every Juju event, the in-memory self.scrape_endpoints variable is not consistently populated.

To address this, COSAgentProvider.scrape_configs are now sourced directly from the supervisord configuration.
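
As a rough illustration of the idea described above (deriving scrape jobs from the on-disk supervisord configuration rather than from in-memory state that is lost between Juju events), here is a minimal sketch. It is not the code from this PR. The `METRICS_PORT` environment key, the helper names, and the `localhost` target are all assumptions made for the example; the only thing taken as given is that supervisord uses INI-style `[program:<name>]` sections.

```python
import configparser
from typing import Dict, List

# Assumption: each agent is a [program:<name>] section whose `environment`
# option carries a METRICS_PORT entry. The real charm may use a different key.
PORT_KEY = "METRICS_PORT"


def parse_environment(raw: str) -> Dict[str, str]:
    """Parse a KEY="value",KEY2="value2" string (values without embedded commas)."""
    env = {}
    for pair in raw.split(","):
        key, sep, value = pair.partition("=")
        if sep:
            env[key.strip()] = value.strip().strip("\"'")
    return env


def scrape_jobs_from_supervisor_conf(conf_text: str) -> List[dict]:
    """Build Prometheus-style scrape jobs from supervisord config text."""
    # Interpolation is disabled so supervisord's %(...)s expansions are left untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    jobs = []
    for section in parser.sections():
        if not section.startswith("program:"):
            continue  # skip [supervisord], [unix_http_server], etc.
        env = parse_environment(parser.get(section, "environment", fallback=""))
        port = env.get(PORT_KEY)
        if port is None:
            continue  # program does not expose metrics under this assumption
        jobs.append(
            {
                "job_name": section.split(":", 1)[1],
                "static_configs": [{"targets": [f"localhost:{port}"]}],
            }
        )
    return jobs
```

Because the function reads only the configuration text, it returns the same jobs no matter how many times the charm object has been re-instantiated.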

Resolved issues

CERTTF-1038

Documentation

Charm unit tests.

Web service API changes

N/A

Tests

Charm unit tests.

codecov Bot commented Apr 22, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 77.21%. Comparing base (ebf71bd) to head (da8c9a7).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1027   +/-   ##
=======================================
  Coverage   77.21%   77.21%           
=======================================
  Files         116      116           
  Lines       12065    12065           
  Branches      996      996           
=======================================
  Hits         9316     9316           
  Misses       2520     2520           
  Partials      229      229           
| Flag | Coverage Δ | *Carryforward flag |
|---|---|---|
| agent | 75.78% <ø> (ø) | |
| cli | 91.80% <ø> (ø) | Carriedforward from ebf71bd |
| device | 63.34% <ø> (ø) | Carriedforward from ebf71bd |
| server | 86.92% <ø> (ø) | Carriedforward from ebf71bd |

*This pull request uses carry forward flags.

| Components | Coverage Δ |
|---|---|
| Agent | 75.78% <ø> (ø) |
| CLI | 91.80% <ø> (ø) |
| Common | ∅ <ø> (∅) |
| Device Connectors | 63.34% <ø> (ø) |
| Server | 86.92% <ø> (ø) |

gntzio changed the title from "Fix: TF Agent host charm rebuilds COS scrape jobs based supervisor config." to "Fix: TF Agent host charm rebuilds COS scrape jobs based on supervisor config." on Apr 22, 2026

rene-oromtz (Contributor) left a comment

LGTM, as far as I remember, Grafana Agent required the relation with Prometheus to actually configure the scrape jobs.

Due to the initial TEL deployment without Prometheus, that relation data may never have been populated from our charm side... I think the key is actually the refresh_events addition, which also makes sense because when we add new agents, we should also be refreshing the scrape_jobs with the new agents' config.
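
The refresh point above can be illustrated with a small sketch: if the scrape configuration is supplied as a callable and re-evaluated whenever a refresh event fires, newly added agents show up in the published jobs. `ProviderSketch` below is a hypothetical stand-in written for this example, not the real `COSAgentProvider`, and `on_refresh` is an assumed name.

```python
from typing import Callable, Dict, List


class ProviderSketch:
    """Hypothetical stand-in for a provider that republishes scrape jobs on refresh."""

    def __init__(self, scrape_configs: Callable[[], List[Dict]]):
        self._scrape_configs = scrape_configs
        self.published: List[Dict] = []

    def on_refresh(self) -> None:
        # Re-evaluate the callable each time a refresh event fires,
        # so the published jobs reflect the current set of agents.
        self.published = self._scrape_configs()


# Stand-in for whatever is currently configured on the host.
agents = ["agent001"]


def current_jobs() -> List[Dict]:
    return [{"job_name": name} for name in agents]


provider = ProviderSketch(current_jobs)
provider.on_refresh()      # publishes agent001 only
agents.append("agent002")  # a new agent is configured
provider.on_refresh()      # publishes both agents
```

Without the second refresh, the published list would still contain only the first agent, which matches the stale-configuration symptom discussed in this PR.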

6 comment threads on agent/charms/testflinger-agent-host-charm/src/supervisord.py
gntzio force-pushed the fix-tf-agent-charm branch from 55a02a9 to da8c9a7 on April 22, 2026 16:23

gntzio (Collaborator, Author) commented Apr 22, 2026

@ajzobro, thanks for pointing out potential improvements.
As far as I can see, they are (mostly) not related to the changes in this PR itself.
I would appreciate it if these could be addressed in a separate "improvements" PR.

ajzobro (Collaborator) left a comment

Please address the issues that can easily be addressed within the scope of the code that was introduced here, including tests for the failure conditions.

CERTTF-1041 now exists as a strategy for handling (kicking along) the technical debt this time.

Comment thread agent/charms/testflinger-agent-host-charm/src/supervisord.py

ajzobro (Collaborator) left a comment

If we write the tests for the error cases we will clearly see the issues in the code.

2 comment threads on agent/charms/testflinger-agent-host-charm/src/supervisord.py

ajzobro (Collaborator) commented Apr 22, 2026

I guess I'm not getting any more tests, huh?

ajzobro (Collaborator) left a comment

Agree with creation of 1041 for improved testing and fixing the relied-upon-but-out-of-scope functions.

rene-oromtz (Contributor) commented

Thanks @ajzobro! Agreed that there is a lot to improve. Now that you have already identified what needs to be done, it should be a bit faster to implement in 1041.

rene-oromtz merged commit 426436e into main on Apr 22, 2026
17 checks passed
rene-oromtz deleted the fix-tf-agent-charm branch on April 22, 2026 22:54
3 participants