For all cases, observe the CPU and memory usage of ws-proxy.
Remember to delete the workspaces after testing, since the heartbeat keeps them active indefinitely.
Port-forward Prometheus (port 9090) to local and graph queries such as:

```
container_memory_rss{pod="<ws-proxy-pod-name>", container="ws-proxy"}
rate(container_cpu_usage_seconds_total{pod="<ws-proxy-pod-name>", container="ws-proxy"}[5m])
```
Prepare a workspace pair A/B like this:

1. We need to keep the workspaces alive, so edit gitpod-cli to build a temporary executable that keeps the heartbeat going: send a heartbeat every 30 seconds.
2. Open workspace A as the target workspace, copy the file produced in step 1 into it, and run it to keep A alive.
3. Open workspace B and repeat step 2.
Test in preview environments, covering several cases:
- Many connections, to learn the per-connection cost in memory and CPU.
  For B, write an SSH connect script with a connection count of 10000:
  `go run main.go -u <workspace_url> -t <owner_token> -n <connection_num> lotconn`
  🟢 After 10000 connections, ws-proxy works fine and the target workspace works fine, but the sender workspace's network broke: `wait: remote command exited without exit status or exit signal` appeared after exec commands. Maybe the SSH gateway still differs from a real sshd in some ways. Fixed with PR.
- Several connections with a huge amount of data going back and forth.
  Simply run scp with a large file (`dd if=/dev/zero of=test bs=1M count=1000`) from A to B and from B to A several times.
  🟢 scp of 1G of data works fine several times.
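The same back-and-forth streaming can be sanity-checked in-process with an in-memory pipe; this is only a sketch of the data path (zeros through a connection, byte count verified on the far side), not the real scp test.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// zeroReader yields an endless stream of zero bytes, like /dev/zero.
type zeroReader struct{}

func (zeroReader) Read(p []byte) (int, error) {
	for i := range p {
		p[i] = 0
	}
	return len(p), nil
}

// stream pushes size zero bytes through an in-memory connection pair and
// returns how many bytes arrived, mimicking scp'ing a dd-generated file.
func stream(size int64) (int64, error) {
	a, b := net.Pipe() // synchronous in-memory full-duplex connection
	go func() {
		defer a.Close()
		io.Copy(a, io.LimitReader(zeroReader{}, size)) // sender side
	}()
	defer b.Close()
	return io.Copy(io.Discard, b) // receiver counts bytes until EOF
}

func main() {
	// 1 MiB here to keep the sketch fast; the real test moved ~1 GB.
	n, err := stream(1 << 20)
	fmt.Println(n, err)
}
```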
- Several connections, to see how long they can stay alive with heartbeat.
  For B, open several terminals, connect to A, and run a long-lived command such as `htop`.
  🟢 7 SSH `htop` sessions worked and stayed stable over 20 hours.
- Drop and reopen connections, to see whether ws-proxy leaks memory on such connections.
  `go run main.go -u <workspace_url> -t <owner_token> -c <concurrent_num> reopen`
  🟠 A memory leak appeared; fixed by this commit.
  After the fix: