Description
The CI image provided by default is a great way to start tracing CI/CD jobs: https://github.com/agardnerIT/tracepusher/tree/main/samples/gitlab
Problem to solve
When a job uses a custom image, tracepusher is not available in it, and as a user I have to re-enable tracepusher by adding the required installation steps myself. This makes tracepusher harder to use, because I first need to analyze my pipelines for custom container images and then adapt the installation steps for all affected CI/CD jobs.
Example
# CPU waste simulator
build-cpp-stress:
  stage: build
  image: debian:11 ###### THIS
  script:
    - ci/build_cpp_stress.sh
The job fails with
Tasks for jobs with custom images
- Install openssl and generate the IDs
- Install python3 and download tracepusher
This can be achieved with before_script in the default job, and requires checks for an existing installation to avoid wasting too much time while tracing jobs.
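The existing-installation check mentioned above could be sketched roughly as follows. The helper name missing_tools is hypothetical and not part of tracepusher; it only reports which required commands are absent, so that a before_script can skip the package installation when nothing is missing.

```shell
#!/bin/sh
# Hypothetical helper (not part of tracepusher): print each command
# from the argument list that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
    fi
  done
}

# Tools the tracing steps rely on: openssl for the IDs, python3 for tracepusher.
missing=$(missing_tools openssl python3)

if [ -n "$missing" ] || [ ! -f /app/tracepusher.py ]; then
  echo "installation needed"
else
  echo "requirements already present, skipping installation"
fi
```

Checking with command -v takes a few milliseconds, so jobs that already run on the default image pay almost nothing for the guard.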
Proposal
- Extend before_script examples to download tracepusher and all its requirements
- As a second iteration, create a script that detects the distribution from /etc/os-release and runs distribution-specific installation steps. Use that script in before_script.
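A minimal sketch of what the second iteration could look like, under stated assumptions: the function names detect_family and install_requirements are hypothetical, the Alpine package list is an assumption that has not been tested against tracepusher, and only the Debian/Ubuntu package list comes from this issue. The os-release path is a parameter so the detection can be tried against sample files.

```shell
#!/bin/sh
# Hypothetical helper: map the ID field of an os-release file
# to a package-manager family. Defaults to /etc/os-release.
detect_family() {
  os_release_file="${1:-/etc/os-release}"
  if [ ! -r "$os_release_file" ]; then
    echo unknown
    return 0
  fi
  # Extract the value of the ID= line and strip optional quotes.
  id=$(sed -n 's/^ID=//p' "$os_release_file" | tr -d "\"'")
  case "$id" in
    debian|ubuntu) echo apt ;;
    alpine)        echo apk ;;
    *)             echo unknown ;;
  esac
}

# Hypothetical helper: run the installation steps for the detected family.
install_requirements() {
  case "$(detect_family "$@")" in
    apt)
      apt update && apt -y --no-install-recommends install curl wget python3 python3-requests openssl
      ;;
    apk)
      # Assumption: Alpine package names, untested with tracepusher.
      apk add --no-cache curl python3 py3-requests openssl
      ;;
    *)
      echo "Unsupported distribution: install curl, python3, requests and openssl manually" >&2
      return 1
      ;;
  esac
}
```

In a pipeline, before_script would call install_requirements only when /app/tracepusher.py is missing, so jobs on the default image skip it entirely.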
A first test script for detecting Debian/Ubuntu and installing all requirements:
# Do not change the before_script or after_script sections
default:
  before_script: |
    # Install requirements if not using the default image. TODO: Adapt to more distributions
    if [ ! -f /app/tracepusher.py ]; then
      if grep -Eq "Debian|Ubuntu" /etc/os-release; then
        apt update && apt -y --no-install-recommends install curl wget python3 python3-requests openssl
      fi
      curl -L https://github.com/agardnerIT/tracepusher/archive/refs/tags/0.7.0.tar.gz -o /tmp/tracepusher.tar.gz
      tar xzf /tmp/tracepusher.tar.gz -C /tmp/ && mkdir -p /app && cp /tmp/tracepusher*/tracepusher.py /app/
    fi
    echo subspan_start_time=$(date +%s) >> /tmp/vars.env
    echo subspan_id=$(openssl rand -hex 8) >> /tmp/vars.env
  after_script:
    - source /tmp/vars.env # reconstitute env vars from before_script
    - subspan_end_time=$(date +%s)
    - duration=$(( subspan_end_time - subspan_start_time ))
    - python3 /app/tracepusher.py
      --endpoint=$OTEL_COLLECTOR_ENDPOINT
      --service-name=$CI_PROJECT_NAME
      --span-name=$CI_JOB_NAME
      --duration=$duration
      --trace-id=$main_trace_id
      --span-id=$subspan_id
      --parent-span-id=$main_span_id
      --time-shift=True
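GitLab runs after_script in a new shell, separate from before_script and script, which is why the snippet above passes values through /tmp/vars.env. The sketch below mimics that round trip locally. It uses a temporary file in place of /tmp/vars.env, and it reads the span ID from /dev/urandom with od instead of calling openssl rand -hex 8, so it also runs where openssl is missing; both substitutions are assumptions made for illustration.

```shell
#!/bin/sh
# Stand-in for /tmp/vars.env so the sketch does not touch a real job's file.
vars_file=$(mktemp)

# "before_script" phase, run in a subshell to mimic a separate shell context.
(
  echo "subspan_start_time=$(date +%s)" >> "$vars_file"
  # 8 random bytes rendered as 16 hex characters (swapped in for openssl rand -hex 8).
  echo "subspan_id=$(od -An -N8 -tx1 /dev/urandom | tr -d ' \n')" >> "$vars_file"
)

# "after_script" phase: the subshell's variables are gone, so reload them from the file.
. "$vars_file"
subspan_end_time=$(date +%s)
duration=$(( subspan_end_time - subspan_start_time ))
echo "span $subspan_id took ${duration}s"

rm -f "$vars_file"
```

The duration has one-second resolution because date +%s returns whole seconds.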
Example: https://gitlab.com/gitlab-de/use-cases/observability/devsecops-efficiency/slow-pipeline-for-analysis/-/merge_requests/1#note_1455101443
The trace view has other bugs: some spans are not grouped under the pipeline trace ID for some reason.