Skip to content
This repository was archived by the owner on Jun 9, 2025. It is now read-only.

Conversation

@beyang
Copy link
Member

@beyang beyang commented Mar 22, 2020

  • Removes the Jaeger Operator in favor of including the Jaeger k8s files directly in deploy-sourcegraph. (The Jaeger Operator required permissions to create a Custom Resource Definition and added an additional layer of complexity to reason about that wasn't that useful.)
  • Installs the Jaeger all-in-one instance as part of the default Sourcegraph installation.
    • --memory.max-traces=20000 to limit memory consumption and memory/cpu resources of Jaeger resource objects are limited, but sufficient to handle 20000 spans (under the assumption that each span is less than 1KB)
  • Defines Dockerfiles for jaeger-agent and jaeger-all-in-one. These build off the official images, but (1) include common unix commands for debuggability and (2) run as non-root users.
  • Updates docs.

TODO

  • Determine if bind-tools is necessary (no)

Post-merge follow-up

  • Note in the CHANGELOG the increased resource utilization overhead that Jaeger introduces (1/10th of a cpu, 100mb per jaeger-agent, 1/2 cpu, 500mb for the jaeger-all-in-one)
  • Link to the new docs from CHANGELOG
  • Jaeger for Docker Compose deployment
    • Address Stephen's comments pertaining to the Docker Compose environment.
  • Jaeger for Single Docker container deployment

@beyang beyang force-pushed the bl/non-operator-jaeger branch 2 times, most recently from 02248ab to 5e058f2 Compare April 4, 2020 21:20
@beyang beyang changed the title WIP: non-operator Jaeger Jaeger without the Jaeger Operator Apr 4, 2020
@beyang beyang marked this pull request as ready for review April 4, 2020 21:24
@beyang beyang force-pushed the bl/non-operator-jaeger branch 4 times, most recently from 3b2e8a9 to 6d6c6d8 Compare April 5, 2020 02:00
Copy link
Member

@keegancsmith keegancsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Few inline questions. Are there any implications to requiring an extra 100m CPU and mem per replica in our clusters?

RUN apk update
RUN apk add bash curl

COPY --from=base /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could just install ca-certificates in the above line, would seem "safer". Also do we need bind-tools @slimsag? this agent does need to resolve the address of the jaeger collector. However, I don't think bind-tools is needed in k8s?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is bind-tools necessary for DNS resolution? If so, it may be required for the "send spans to external jaeger-collector" use case (which is documented, but I haven't actually tested). I will check that scenario.

apk add ca-certificates is indeed what is run in the standard jaeger-agent image. We could run it here instead of copying the value from the standard jaeger-agent image. Benefit would be we might have slightly more up-to-date certs, but at the cost of not as closely tracking the source image. My preference is to track the base image as closely as possible (and update the base image version if an update is required).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is bind-tools necessary for DNS resolution?

I think at some customers using the deploy-sourcegraph-docker method needed the bind-tools package. I don't think it is needed for k8s clusters, but just wanted to confirm that.

If so, it may be required for the "send spans to external jaeger-collector" use case (which is documented, but I haven't actually tested). I will check that scenario.

Isn't it also needed to send to the collector in the cluster? IE we do a dns lookup, don't use a hardcoded IP.

apk add ca-certificates is indeed what is run in the standard jaeger-agent image . We could run it here instead of copying the value from the standard jaeger-agent image. Benefit would be we might have slightly more up-to-date certs, but at the cost of not as closely tracking the source image. My preference is to track the base image as closely as possible (and update the base image version if an update is required).

Yeah sounds good to just keep this line as it is then.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it also needed to send to the collector in the cluster? IE we do a dns lookup, don't use a hardcoded IP.

I don't think so? At least, the jaeger-agent has no problem talking to the collector in the test cluster I spun up.

Copy link
Member

@emidoots emidoots Apr 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@beyang
Copy link
Member Author

beyang commented Apr 7, 2020

Are there any implications to requiring an extra 100m CPU and mem per replica in our clusters?

Yes, if a cluster is at 100% resource utilization, this might require the addition of another node to the cluster. I will call this out in the docs.

@beyang beyang force-pushed the bl/non-operator-jaeger branch 2 times, most recently from a5e9d89 to 05cee67 Compare April 7, 2020 23:29
@beyang beyang force-pushed the bl/non-operator-jaeger branch from 05cee67 to c02576d Compare April 7, 2020 23:30
FROM jaegertracing/jaeger-agent:$JAEGER_VERSION as base

FROM alpine:3.11.5
RUN apk --no-cache bash curl
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these Dockerfiles ONLY going to be used by deploy-sourcegraph, or are they also going to be used by deploy-sourcegraph-docker?

If the latter, these should all be moved into https://github.com/sourcegraph/sourcegraph/tree/master/docker-images and follow the conventions there.

This is also extremely important because I want us to get all Docker images versioned alongside Sourcegraph (3.14.1 instead of <arbitrary> tags).

# Web HTTP
EXPOSE 16686

VOLUME ["/tmp"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this Docker image ever be used in non-Kubernetes environments? If yes, you MUST create /tmp within the image with the desired permissions or else the container will not be able to write to /tmp in certain environments. See https://sourcegraph.com/github.com/sourcegraph/sourcegraph@730f31fb13b8380ed02fd01fe4a8583e43b029d4/-/blob/cmd/precise-code-intel/api-server/Dockerfile#L30-37

@beyang beyang force-pushed the bl/non-operator-jaeger branch from c298c0a to 96c4cce Compare April 8, 2020 00:07
@beyang beyang force-pushed the bl/non-operator-jaeger branch from 7da1404 to 6b44be1 Compare April 8, 2020 04:37
@beyang beyang force-pushed the bl/non-operator-jaeger branch from 9b4bd53 to d7db009 Compare April 8, 2020 05:30
@beyang
Copy link
Member Author

beyang commented Apr 8, 2020

I will address @slimsag's comments in a follow up to this PR (as I tackle Jaeger in Docker Compose).

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants