-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Setup parca-agent for ci-runs #187
Conversation
# Hide the setting of the bearer token to the parca agent. | ||
set +x | ||
{ | ||
sudo snap set parca-agent remote-store-bearer-token="${{PARCA_BEARER_TOKEN}}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The $PARCA_BEARER_TOKEN
is setup in the credentials.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need to set up an || true
in case it fails? I also don't know if this sort of thing ends up in bash history, etc.
fi | ||
|
||
sudo snap install parca-agent --classic | ||
sudo snap set parca-agent metadata-external-labels="machine=ci-run-${{BOOTSTRAP_PROVIDER}}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a thought, we could add some labels like TEST_RUNNER_NAME
to make it easier to correlate to issues we eventually find?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe, I think we just care about the provider/substrate rather what test is running. Otherwise we might start gaming for a specific test and not juju as a whole.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AIUI, this actually sets it up on the host whenever we have an lxd run, is that correct? So this isn't setting up parca-agent in lxd, it is setting up parca-agent on hosts that run lxd.
I have a slight concern about side effects if you are trying to run the CI suite locally. In general, we might want to have a CI step for Jenkins ephemeral bots to install and configure parca-agent, but I'm kind of against doing it as a side effect of running a test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM thanks!
Start collecting profiling data for ci-runs (specifically looking at longer gating test runs). We want the data for 2 reasons: 1. See if there are any potential bottlenecks in terms of performance that we can see when the jujud and jujud-controller is running. 2. We can use profile guided optimisations (PGO) from the runs to apply to the binaries. This currently only applies to lxd deployments. For other providers we will need to install it via cloud-init or bake it into an image. This is an experiment to see if it's useful or not.
b357113
to
7ffa6bb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is going to be of limited use.
Test scenarios won't give enough quality signal to inform profile-guided optimisation.
The controllers are so short-lived and so narrow in focus, that we won't see the kind of issues (growth over time for example) that profiling in-theatre shows.
The problem is, we don't as a Juju team run long-running controllers. Most of our work is driven by features and tearing down controllers is a natural practice. I would say that even some information is better than no information. I will mention that the tests are inherently backwards in the way they run. The concept was to keep one singular controller runner per substrate for the lifetime of the gating tests. At the moment they're destroyed better each test. Making that change alone would ensure that we're recreating the lifetime of a controller with models added and removed, although in a condensed time frame. That would at least elevate the "narrow in focus" concern (which I do agree with). I should have stated that this was intended to be a prototype for LXD to ensure that we are surfacing the parca-agent information to polar signals. |
Start collecting profiling data for ci-runs (specifically looking at longer gating test runs). We want the data for 2 reasons:
This currently only applies to lxd deployments. For other providers we will need to install it via cloud-init or bake it into an image.
This is an experiment to see if it's useful or not.