Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

maint: adding proper smoke test(s) #310

Merged
merged 28 commits into from
Nov 30, 2023
Merged

Conversation

robbkidd
Copy link
Member

@robbkidd robbkidd commented Nov 1, 2023

Which problem is this PR solving?

Description of the changes

Set up a smoke test, details included in the new make targets below:

  • smokey_cluster_create: Create kind cluster
  • smokey_collector_install: Install collector via helm with new collector values, including a filter processor that only keeps echoserver spans from pods with a response code of 405
  • smokey_agent_install: Install agent via helm with new agent values, including usage of a local image (installed if not available in the shell) and export to the collector
  • smokey_echo_job: Install echoserver and run a job that sends a POST request expecting a 405 response
  • smokey_copy_output: After a short wait, copy output from the newly created file saved with telemetry from collector
  • smokey_verify_output: Make assertions via the new verify.bats bats test (similar to previous tests on other distros and in go auto-instrumentation) that the output contains the expected span and attributes
  • smoke: Do all of the things above
  • unsmoke: Tear it all down when done

Also updated PR, main, and release workflows to run the new smoke test.

How to verify that this has the expected result

  • Run make smoke and get a bunch of passing tests - also check out the (git ignored) json file that contains the remaining span that's being tested. make unsmoke to spin it down.

@robbkidd robbkidd added type: maintenance The necessary chores to keep the dust off. version: no bump A PR with maintenance or doc changes that aren't included in a release. labels Nov 1, 2023
@robbkidd robbkidd added this to the Pre beta milestone Nov 1, 2023
@robbkidd robbkidd self-assigned this Nov 1, 2023
@MikeGoldsmith MikeGoldsmith self-assigned this Nov 22, 2023
JamieDanielson and others added 4 commits November 27, 2023 13:35
echoserver is separate for now bc data flows without it (at least local)
debug is for pyroscope so removed for now
not sure we need it here
@JamieDanielson
Copy link
Contributor

I've made a few changes, and can get telemetry to the collector and the file if I set the endpoint based on the IP address of the collector pod, but haven't yet figured out how to set it based on the name.

The latest change uses the helm chart again with a local image - that's currently commented out in Makefile but should probably be uncommented when this is ready to go so it builds in CI.

After running make smoke, if you adjust values in either of the files you can simply update with helm upgrade smokey-agent honeycomb/network-agent --values smoke-tests/agent-helm-values.yaml for agent and helm upgrade smokey-collector open-telemetry/opentelemetry-collector -f smoke-tests/collector-helm-values.yaml for collector. Add a --dry-run flag at the end of the command to see the full rendered output (also here is a collector helm chart example for reference).

I also temporarily moved the echoserver stuff out of the smoke target for now while working on getting it to send to collector, because it was so noisy with local kube-system stuff anyway it didn't seem necessary to wait for the echoserver to spin up. With the current output it will be a pretty large file to work with for testing... but we may be able to get a decent test in with one of the liveness probe events. TBD.

MikeGoldsmith and others added 8 commits November 28, 2023 15:35
For some reason, the agent pod won't resolve DNS name for the collector
k8s service. (Maybe because it's a daemonset pod on the host network?
Different DNS available to it?)

So we need to look up the collector service's IP and use that in the
agent's Helm install values. That's done in a Make function
get_collector_ip that gets called inline to set an env var for envsubst.

BONUS! Conditionally build the docker image if it doesn't exist when
smoke testing. This necessitated breaking up the steps of smoke into
separate targets that all get set as prereqs of smoke.

Co-authored-by: Jamie Danielson <jamiedanielson@honeycomb.io>
Use smoke job with echo for more precise test.
Current output also includes all the live/ready probes in k8s
which can be difficult to pin down. Using the echoserver
command with a specific 404 code makes it easier to pinpoint a span.
Smoke target now includes smoke job, copying of collector
log output, and running the bats test.
Gitignore the copy of the collector log output.
WIP to limlit the test to only the echoserver 404 span.
All scopes have the same hny-network-agent name so it is difficult
to target only a specific resourceSpan for this request.
For now there is a traces-temp-test.json file being used that contains
only the resource span for the echo server (copied from the log output).
Tests check for base requirements like span ID and http attributes,
but also includes custom stuff like resource attributes and headers.
@JamieDanielson
Copy link
Contributor

Current state as of last push: running make smoke will successfully spin up the agent, spin up the collector, spin up the echoserver, run the smoke-job with a 404 (intentional) response, copy output from the collector log file, and run assertions on the trace using bats*.
Big * at the moment: I haven't yet been able to narrow down the test to only look at the echoserver spans. Because of this, the output is somewhat nondeterministic depending on how many health probes succeed while the thing is up. I've copied a resourceSpan for echoserver into a file smoke-tests/traces-temp-test.json so I could write the tests against it.

✓ Agent includes service.name in resource attributes
✓ Agent span includes custom resource attributes
✓ Agent emits a span name '{http.method}' (per semconv)
✓ Agent includes specified headers as attributes
✓ Agent includes k8s source attributes
✓ Agent includes k8s destination attributes
✓ HTTP span includes http.method attribute
✓ HTTP span includes http.target attribute
✓ HTTP span includes http.status_code attribute
✓ HTTP span includes client error attribute with 404 response
✓ Trace ID present in all spans
✓ Span ID present in all spans

12 tests, 0 failures

The smoke test builds the docker image, so remove the extra build step.
Also, we don't need to use QEMU here bc we're not building both arches.
Finally, docker-build was a bit confusing as a job name, especially when
looking at the workflow screen. That is now called build-and-test.
@JamieDanielson JamieDanielson marked this pull request as ready for review November 29, 2023 20:37
@JamieDanielson JamieDanielson requested a review from a team November 29, 2023 20:37
@JamieDanielson
Copy link
Contributor

Note: I renamed the PR workflow job from docker-build to build-and-test, which seems to result in this "expected" status check that will never resolve. If that causes us problems I can revert that name change.

@robbkidd
Copy link
Member Author

I'm not against. We'll need to update the branch protection rules for main to replace docker-build with build-and-test.

Copy link
Contributor

@JamieDanielson JamieDanielson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So because Robb opened this PR he can't approve it. I can approve it but uhh I'm a little biased as I picked it up and did the later work on it 😅 . I'm approving for the green checks - feel free to merge if y'all are good with it or let me know further feedback to be addressed!

Copy link
Contributor

@MikeGoldsmith MikeGoldsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've all worked on this, so we can collectively approve & merge. Good work @robbkidd and @JamieDanielson 🎉

@MikeGoldsmith MikeGoldsmith merged commit 9b5bfcd into main Nov 30, 2023
4 checks passed
@MikeGoldsmith MikeGoldsmith deleted the robb.smoke-em-if-you-got-em branch November 30, 2023 09:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: maintenance The necessary chores to keep the dust off. version: no bump A PR with maintenance or doc changes that aren't included in a release.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add basic smoke test
3 participants