diff --git a/DEVELOPING.md b/.github/CONTRIBUTING.md similarity index 100% rename from DEVELOPING.md rename to .github/CONTRIBUTING.md diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md index ee0f1a3f42..be477a5e6f 100644 --- a/.github/pull_request_template.md +++ b/.github/pull_request_template.md @@ -12,6 +12,6 @@ A few sentences describing the overall goals of the pull request's commits. - [ ] I made sure to update `./CHANGELOG.yml`. - [ ] I made sure to add any docs changes required for my change (including release notes). - [ ] My change is adequately tested. - - [ ] I updated `DEVELOPING.md` with any special dev tricks I had to use to work on this code efficiently. + - [ ] I updated `CONTRIBUTING.md` with any special dev tricks I had to use to work on this code efficiently. - [ ] I updated `TELEMETRY.md` if I added, changed, or removed a metric name. - [ ] Once my PR is ready to have integration tests ran, I posted the PR in #telepresence-dev in the datawire-oss slack so that the "ok to test" label can be applied. diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md new file mode 100644 index 0000000000..f9b51d92b0 --- /dev/null +++ b/CONTRIBUTING.md @@ -0,0 +1,385 @@ +# Developing Telepresence 2 + +## Set up your environment + +### Development environment + + - `TELEPRESENCE_REGISTRY` (required) is the Docker registry that + `make push-images` pushes the `tel2` and `telepresence` image to. + For most developers, the easiest thing is to set it to + `docker.io/USERNAME`. + + - `TELEPRESENCE_VERSION` (optional) is the "vSEMVER" string to + compile-in to the binary and Docker image, if set. Otherwise, + `make` will automatically set this based on the current Git commit + and the current time. + + - `DTEST_KUBECONFIG` (optional) is the cluster that is used by tests, + if set. Otherwise the tests will automatically use a K3s cluster + running locally in Docker. 
It is not normally necessary to set + this, but it is useful to set it in order to test against different + Kubernetes versions/configurations than what + https://github.com/datawire/dtest uses. + + - `DTEST_REGISTRY` (optional) is the Docker registry that images are + pushed to by the tests, if set. Otherwise, the tests will + automatically use a registry running locally in Docker + ("localhost:5000"). The tests will push images named `tel2` with + various version tags. It is not necessary to set this unless you + have set `DTEST_KUBECONFIG`. + + If `DTEST_KUBECONFIG` is pointing to a pre-existing cluster, and you + would like the `DTEST_REGISTRY` to point to a private registry that is + hosted in that cluster, then you can use `make private-registry`. It + will deploy a registry and set it up so that it is reachable at + `localhost:5000`, both from the cluster and from the local workstation. + + - `DEV_TELEPRESENCE_VERSION` (optional) if set to a version such as + `v2.12.1-alpha.0`, the integration tests will assume that this version + is pre-built and available, both as a CLI client (accessible from the + current runtime path), and also pre-pushed into a pre-existing cluster + accessible from `DTEST_KUBECONFIG`. In other words, if this is set, + no binaries will be built or pushed, so the development + test cycle + can be quite rapid. + + - `DEV_CLIENT_IMAGE` (optional) can be set to the fully qualified name of + an alternative image to use for the docker image used for the containerized + daemon when running in docker mode. + + - `DEV_MANAGER_IMAGE` (optional) can be set to the fully qualified name of + an alternative image to use for the traffic manager. + + - `DEV_AGENT_IMAGE` (optional) can be set to the fully qualified name of + an alternative image to use for the traffic agent.
+ + - `DEV_USERD_PROFILING_PORT` and `DEV_ROOTD_PROFILING_PORT` (optional) if + set, will cause the `telepresence connect` calls in the integration tests + to start daemons where pprof is enabled (see + [Profiling the daemons](#profiling-the-daemons) below). + +The above environment can optionally be provided in an `itest.yml` file +that is placed adjacent to the normal `config.yml` file used to configure +Telepresence. The `itest.yml` currently has a single entry, +`Env`, which is a map. It can look something like this: + +```yaml +Env: + DEV_TELEPRESENCE_VERSION: v2.12.1-alpha.0 + DTEST_KUBECONFIG: /home/thhal/.kube/testconfig +``` + +The output of `make help` has a bit more information. + +### Running integration tests + +Integration tests can be run using `go test ./integration_test/...`. For +individual tests, use the `-testify.m=` flag. Verbose output using +the `-v` flag is also recommended, because the tests are built with human-readable +output in mind and timestamps can be compared to timestamps found +in the telepresence logs. + +Example of running one test with existing cluster and registry: +``` +make private-registry +export DTEST_KUBECONFIG= +export DTEST_REGISTRY=localhost:5000 +go test ./integration_test/... -v -testify.m=Test_InterceptDetailedOutput +``` + +If you run these tests on a Mac, localhost won't work. Please use the docker hub, or this value for the registry: + +```cli +export DTEST_REGISTRY=host.docker.internal:5000 +``` + +You must also set this in your docker engine settings: + +```json +{ + "insecure-registries": [ + "host.docker.internal:5000" + ] +} +``` + +The test takes about a minute to complete when using an existing cluster +and a private registry created by `make private-registry`.
During that time +it: +- builds the traffic-manager image +- pushes the image to the registry +- builds the client binary +- creates two namespaces for the test +- performs a helm install of a namespace-scoped traffic-manager +- runs the test +- uninstalls the traffic-manager +- deletes the namespaces + +The first two can be omitted (and are omitted when the tests run +from CI) by building the binary using `make build`. +Example of running a test with an existing client and traffic-manager: + +``` +make private-registry +export TELEPRESENCE_VERSION=v2.12.1-alpha.0 +export TELEPRESENCE_REGISTRY=localhost:5000 +make build +make push-images +export DTEST_KUBECONFIG= +export DTEST_REGISTRY=$TELEPRESENCE_REGISTRY +export DEV_TELEPRESENCE_VERSION=$TELEPRESENCE_VERSION + +# Run any number of individual tests with this setup +go test ./integration_test/... -v -testify.m=Test_InterceptDetailedOutput +``` + +The `DEV_TELEPRESENCE_VERSION` tells the integration test that a client and +a traffic-manager of that version have been prebuilt and pushed. This usually +shortens the time for the test by about 20 seconds. + +### Runtime environment + + - The main thing is that in your `~/.config/telepresence/config.yml` + (`~/Library/Application Support/telepresence/config.yml` on macOS) + file you set `images.registry` to match the `TELEPRESENCE_REGISTRY` + environment variable. See + https://www.getambassador.io/docs/telepresence/latest/reference/config/ + for more information. + + - `TELEPRESENCE_VERSION` is the "vSEMVER" string used by the + `telepresence` binary *if* one was not compiled in (for example, if + you're running it with `go run ./cmd/telepresence` rather than + having built it with `make build`). + + - `TELEPRESENCE_AGENT_IMAGE` is the "name:vSEMVER" string used when + Telepresence auto-installs the traffic-manager unless the config.yml + overrides it by defining `images.agentImage`.
+ + - You will need to have a `~/.kube/config` file (or set `KUBECONFIG` to + point to a different file) in order to connect to a cluster; + same as any other Kubernetes tool. + + - You will need to have [mockgen](https://github.com/golang/mock) installed + to generate new or updated testing mocks for interfaces. + +## Blocking Ambassador telemetry +Telemetry to Ambassador Labs can be disabled by having your OS resolve `metriton.datawire.io` to `127.0.0.1`. + +### Windows +`echo "127.0.0.1 metriton.datawire.io" >> c:\windows\system32\drivers\etc\hosts` + +### Linux and macOS +`echo "127.0.0.1 metriton.datawire.io" | sudo tee -a /etc/hosts` + +## Build the binary, push the image + +The easiest thing to do to get going: + +```console +$ TELEPRESENCE_REGISTRY=docker.io/thhal make build push-images # use .\build-aux\winmake.bat build on windows +[make] TELEPRESENCE_VERSION=v2.12.1-19-g37085c2d7-1655891839 +... # Lots of output +2.12.1-19-g37085c2d7-1655891839: digest: sha256:40fe852f8d8026a89f196293f37ae8c462c765c85572150d26263d78c43cdd4b size: 1157 +``` + +This has 3 primary outputs: + 1. The `./build-output/bin/telepresence` executable binary + 2. The `${TELEPRESENCE_REGISTRY}/tel2` Docker image + 3. The `${TELEPRESENCE_REGISTRY}/telepresence` Docker image + +It essentially does 4 separate tasks: + 1. `make build` to build the `./build-output/bin/telepresence` + executable binary + 2. `make tel2-image` to build the `${TELEPRESENCE_REGISTRY}/tel2` Docker + image. + 3. `make client-image` to build the `${TELEPRESENCE_REGISTRY}/telepresence` Docker + image. + 4. `make push-images` to push the `${TELEPRESENCE_REGISTRY}/tel2` and `${TELEPRESENCE_REGISTRY}/telepresence` + Docker images.
+ +You can run any of those tasks separately, but be warned: The +`TELEPRESENCE_VERSION` for all 4 needs to agree, and `make` includes a +timestamp in the default `TELEPRESENCE_VERSION`; if you run the tasks +separately you will need to explicitly set the `TELEPRESENCE_VERSION` +environment variable so that they all agree. + +When working on just the command-line binary, it is often useful to +run it simply using `go run ./cmd/telepresence` rather than compiling +it first; but be warned: When run this way it won't know its own +version number (`telepresence version` will report "v0.0.0-devel") +unless you set the `TELEPRESENCE_VERSION` environment variable; you +will want to set it to the version of a previously-pushed Docker +image. + +You may think that the initial suggestion of running `make build +push-images` all the time (so that every build gets new matching +version numbers) would be terribly slow. However, this is not as slow +as you might think; both `go` and `docker` are very good about reusing +existing builds and avoiding unnecessary work. + +## Run the tests + +Running the tests does *not* require having previously built or pushed +anything. + +The tests make use of `sudo`; it is useful to get in the habit of +running a no-op `sudo` command to pre-emptively prompt for your +password to avoid having to notice when the prompt appears in the test +output. + +```console +$ sudo id +[sudo] password for lukeshu: +uid=0(root) gid=0(root) groups=0(root) + +$ make check-unit +[make] TELEPRESENCE_VERSION=v2.6.7-20-g9de10e316-1655892249 +... +``` + +The first time you run the tests, you should use `make check`, to get +`make` to automatically create the requisite `helm` tool +binaries. However, after that initial run, you can instead use +`gotestsum` or `go test` if you prefer. + +### Test metric collection + +**When running in CI,** `make check-unit` and `make check-integration` will report the result of test +runs to metriton, Ambassador Labs' metrics store.
These reports include the test name, running time, and +result. They are reported by the tool at `tools/src/test-report`. This `test-report` tool will also +visually modify test output; this happens even when running locally, since the JSON output of `go test` +is piped to the tool anyway: + +```console +$ make check-unit +``` + +## Building for Release + +See https://www.notion.so/datawire/To-Release-Telepresence-2-x-x-2752ef26968444b99d807979cde06f2f + +## Updating license documentation + +Run `make generate` and commit changes to `DEPENDENCY_LICENSES.md` and `DEPENDENCIES.md`. + +## Developing on Windows + +### Building on Windows + +We do not currently support using `make` directly to build on Windows. Instead, use `build-aux\winmake.bat` and pass it the same parameters +you would pass to make. `winmake.bat` will run `make` from inside a Docker container, with appropriate parameters to build Windows binaries. + +## Debugging and Troubleshooting + +### Log output + +There are two logs: + - the `connector.log` log file which contains output from the + background-daemon parts of Telepresence that run as your regular + user: the interaction with the traffic-manager and the cluster + (traffic-manager and traffic-agent installs, intercepts, port + forwards, etc.), and + - the `daemon.log` log file which contains output from the parts of + telepresence that run as the "root" administrator user: the + networking changes and services that happen on your workstation. + +The location of both logs is: + + - on macOS: `~/Library/Logs/telepresence/` + - on GNU/Linux: `~/.cache/telepresence/logs/` + - on Windows: `"%USERPROFILE%\AppData\Local\logs"` + +The logs are rotating and a new log is created every time Telepresence +creates a new connection to the cluster, e.g. on `telepresence +connect` after a `telepresence quit` that terminated the last session. + +#### Watching the logs + +A convenient way to watch rotating logs is to use `tail -F +`.
It will automatically and seamlessly follow the +rotation. + +#### Debugging early-initialization errors + +If there's an error from the connector or daemon during early +initialization, it might quit before the logfiles are set up. Perhaps +the problem is even with setting up the logfile itself. + +You can run the `connector-foreground` or `daemon-foreground` commands +directly, to see what they spit out on stderr before dying: + +```console +$ telepresence connector-foreground # or daemon-foreground +``` + +If stdout is a TTY device, they don't set up logfiles and instead log +to stderr. In order to debug the logfile setup, simply pipe the +command to `cat` to trigger the usual logfile setup: + +```console +$ telepresence connector-foreground | cat +``` + +### Profiling the daemons + +The daemons can be profiled using [pprof](https://pkg.go.dev/net/http/pprof). +The profiling is initialized using the following flags: + +```console +$ telepresence quit -s +$ telepresence connect --userd-profiling-port 6060 --rootd-profiling-port 6061 +``` + +If a daemon is started with pprof, then the goroutine stacks and much other +info can be found by connecting your browser to http://localhost:6060/debug/pprof/ +(swap 6060 for whatever port you used with the flags) + +#### Dumping the goroutine stacks + +A dump will be produced in the respective logs for the daemon simply by killing it +with a SIGQUIT signal. On Windows however, using profiling is the only option. + +### RBAC issues + +If you are debugging or working on RBAC-related feature work with +Telepresence, it can be helpful to have a user with limited RBAC +privileges/roles. 
There are many ways you can do this, but the way we +do it in our tests is like so: + +```console +$ kubectl apply -f k8s/client_rbac.yaml +serviceaccount/telepresence-test-developer created +clusterrole.rbac.authorization.k8s.io/telepresence-role created +clusterrolebinding.rbac.authorization.k8s.io/telepresence-clusterrolebinding created + +$ kubectl get sa telepresence-test-developer -o "jsonpath={.secrets[0].name}" +telepresence-test-developer-token- + +$ kubectl get secret telepresence-test-developer-token- -o "jsonpath={.data.token}" > b64_token +$ cat b64_token | base64 --decode + + +$ kubectl config set-credentials telepresence-test-developer --token <plaintext token> +``` + +This creates a ServiceAccount, ClusterRole, and ClusterRoleBinding +which can be used with kubectl (`kubectl config use-context +telepresence-test-developer`) to work in an RBAC-restricted +environment. + +### Errors from `make generate` + +#### Missing go.sum entries +If you get an error like this: + +``` +cd tools/src/go-mkopensource && GOOS= GOARCH= go build -o /home/andres/source/production/telepresence/tools/bin/go-mkopensource $(sed -En 's,^import "(.*)".*,\1,p' pin.go) +missing go.sum entry for module providing package github.com/datawire/go-mkopensource; to add: + go mod download github.com/datawire/go-mkopensource +``` + +Add the missing entries by going to the folder that caused the failure (in this case it's +/home/andres/source/production/telepresence/tools/src/go-mkopensource) and running the command provided by go: + +``` +go mod download github.com/datawire/go-mkopensource +``` diff --git a/MEETING_SCHEDULE.md b/MEETING_SCHEDULE.md index 36d7142992..02c5759e43 100644 --- a/MEETING_SCHEDULE.md +++ b/MEETING_SCHEDULE.md @@ -11,6 +11,6 @@ We hold troubleshooting sessions once a week on Tuesdays, at 2:30 pm Eastern. T The Telepresence Contributors Meeting is held on the first Wednesday of every month at 11am Eastern.
The focus of this meeting is discussion of technical issues related to development of Telepresence. -New contributors are always welcome! Check out our [contributor's guide](DEVELOPING.md) to learn how you can help make Telepresence better. +New contributors are always welcome! Check out our [contributor's guide](CONTRIBUTING.md) to learn how you can help make Telepresence better. **Zoom Meeting Link**: https://us02web.zoom.us/j/6297823847 diff --git a/README.md b/README.md index 84aa8da6af..bf473d1083 100644 --- a/README.md +++ b/README.md @@ -2,6 +2,8 @@ [<img src="https://cncf-branding.netlify.app/img/projects/telepresence/horizontal/color/telepresence-horizontal-color.png" width="80"/>](https://cncf-branding.netlify.app/img/projects/telepresence/horizontal/color/telepresence-horizontal-color.png) +[![Artifact HUB](https://img.shields.io/endpoint?url=https://artifacthub.io/badge/repository/telepresence)](https://artifacthub.io/packages/helm/datawire/telepresence) + Telepresence gives developers infinite scale development environments for Kubernetes. 
Docs: @@ -28,7 +30,7 @@ A few quick ways to start using Telepresence * **Telepresence Quick Start:** [Quick Start](https://www.getambassador.io/docs/telepresence/latest/quick-start/) * **Install Telepresence:** [Install](https://www.getambassador.io/docs/telepresence/latest/install/) -* **Contributor's Guide:** [Guide](https://github.com/telepresenceio/telepresence/blob/release/v2/DEVELOPING.md) +* **Contributor's Guide:** [Guide](https://github.com/telepresenceio/telepresence/blob/release/v2/CONTRIBUTING.md) * **Meetings:** Check out our community [meeting schedule](https://github.com/telepresenceio/telepresence/blob/release/v2/MEETING_SCHEDULE.md) for opportunities to interact with Telepresence developers ## Walkthrough @@ -296,10 +298,10 @@ Containers: /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-svf4h (ro) Conditions: Type Status - Initialized True - Ready True - ContainersReady True - PodScheduled True + Initialized True + Ready True + ContainersReady True + PodScheduled True Volumes: kube-api-access-svf4h: Type: Projected (a volume that contains injected data from multiple sources) @@ -317,11 +319,11 @@ Volumes: Optional: false export-volume: Type: EmptyDir (a temporary directory that shares a pod's lifetime) - Medium: + Medium: SizeLimit: <unset> tel-agent-tmp: Type: EmptyDir (a temporary directory that shares a pod's lifetime) - Medium: + Medium: SizeLimit: <unset> QoS Class: BestEffort Node-Selectors: <none> @@ -342,10 +344,10 @@ Events: Normal Started 7m39s kubelet Started container traffic-agent ``` -Telepresence keeps track of all possible intercepts for containers that have an agent installed in the configmap `telepresence-agents`. +Telepresence keeps track of all possible intercepts for containers that have an agent installed in the configmap `telepresence-agents`. 
```console -$ kubectl describe configmap telepresence-agents +$ kubectl describe configmap telepresence-agents Name: telepresence-agents Namespace: default Labels: app.kubernetes.io/created-by=traffic-manager