
Bug with concurrent docker run calls #88

Closed
jeremysprofile opened this issue May 20, 2024 · 1 comment
Labels: bug (Something isn't working)

Comments

@jeremysprofile (Contributor)

TL;DR: When calling 6 docker run ... & commands, Kubedock sometimes logged fatal error: concurrent map writes. This is not a priority for us (we have migrated away from Kubedock), but I wanted to report it in case you wanted to take action. I realize this is not how most people use Kubedock, so if you want to close this ticket as WONTFIX, that seems reasonable to me.
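
For context, "fatal error: concurrent map writes" is the Go runtime's unrecoverable abort when a plain map is written from multiple goroutines without synchronization, so the six parallel docker run calls are presumably racing on some shared map inside the server. The following is only an illustrative reproduction of that class of error, not kubedock code:

package main

import "sync"

func main() {
	m := map[string]int{} // plain map, no locking
	var wg sync.WaitGroup
	for i := 0; i < 6; i++ { // mirrors the 6 parallel docker run calls
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// Unsynchronized write; the runtime usually aborts here with
			// "fatal error: concurrent map writes" (it is a race, so not necessarily on every run).
			m["container"] = i
		}(i)
	}
	wg.Wait()
}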

Using this function from a Jenkins pipeline:

export DOCKER_HOST=tcp://localhost:2475
CONTAINER_DIRECTORY=/tmp
start_containers() {  # $1: num_containers, $2: image
  local -r num_containers="$1"
  local -r image="$2"
  local i local_port pid exit_code
  local -a pids
  for ((i = 1; i <= num_containers; i++)); do
    (( local_port = 3000 + i ))
    docker run -e "TIMEOUT=60000" -e "CONCURRENT=16" -p "$local_port:3000" --rm -d --name "container$i" "$image" > "$CONTAINER_DIRECTORY/$i" &
    pid=$!
    pids+=("$pid")
  done

  for i in "${!pids[@]}"; do
    pid=${pids[i]}
    # exit the script with the wait's exit code if the container failed to start
    echo "Waiting for container $i to start..."
    wait "$pid" || { exit_code=$?; echo "Container $i failed to start"; exit "$exit_code"; }
  done
}
start_containers "6" "ghcr.io/browserless/chromium:v2.7.1"

In a Kubernetes Pod with a kubedock sidecar set up as shown (ci is the Jenkins container):

apiVersion: v1
kind: Pod
metadata:
  name: "testpants"
spec:
  initContainers:
    # Jenkins won't let you create a configmap alongside the pod, and I don't want another ECR image
    # that is kubedock + the PodTemplate it should use, because that will hide the fact that it exists
    # and make it harder to update. So we use an init container to create the configmap
    - name: kubedock-pod-template
      image: busybox:1.35.0
      command: ["/bin/sh", "-c"]
      args:
        - |
          suffix=$(</dev/urandom tr -dc a-z0-9 | head -c 10)
          cat << EOF > /share/kubedock-pod-template.yaml
          apiVersion: v1
          kind: Pod
          metadata:
            annotations:
              example: kubedock
          spec:
            tolerations:
              - key: target
                operator: Equal
                value: kubedock-e2e-$suffix
                effect: NoSchedule
          EOF
      volumeMounts:
        - name: shared-volume
          mountPath: /share
  containers:
    - name: ci
      image: $OUR_ECR.dkr.ecr.us-east-1.amazonaws.com/deploy/executors/ci-image:active-fe7d1296
      command: ["/tini", "--", "sleep", "infinity"]
      resources:
        requests:
          cpu: "28"
          memory: 52Gi
      volumeMounts:
        - name: shared-volume
          mountPath: /share
    - name: kubedock
      # Please grep for this image when you update it and change it everywhere
      # SHA for 0.16.0
      image: "$OUR_ECR.dkr.ecr.us-east-1.amazonaws.com/docker-hub/joyrex2001/kubedock@sha256:f9203c5da759d3097ea6d555837377d8aea7f954337abd104ac306afd4130de6"
      resources:
        requests:
          cpu: "1"
          memory: "2Gi"
      args:
        - server
        # Allows Kubedock to forward traffic to the Pod (e.g., lets `docker run -p 8080:80 $MYIMAGE` work)
        - --reverse-proxy
        # See the init container
        - --pod-template
        - /share/kubedock-pod-template.yaml
        # Without explicit resource requests, the Pod won't have any
        - --request-cpu
        - "8"
        - --request-memory
        - "8Gi"
        # default timeout of 1m is not sufficient for Karpenter to create a new Node and schedule the Pod
        - --timeout
        - "10m"
        # default grabs the image from public docker hub
        - --initimage
        # SHA for 0.16.0
        - $OUR_ECR.dkr.ecr.us-east-1.amazonaws.com/docker-hub/joyrex2001/kubedock@sha256:f9203c5da759d3097ea6d555837377d8aea7f954337abd104ac306afd4130de6
      volumeMounts:
        - name: shared-volume
          mountPath: /share
  volumes:
    - name: shared-volume
      emptyDir: {}

Full logs from Kubedock attached:
log.txt

joyrex2001 added the bug (Something isn't working) label May 25, 2024
joyrex2001 added a commit that referenced this issue Jul 23, 2024
@joyrex2001 (Owner)

Thanks, very helpful (especially the log). Fixed.
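
(Note for readers hitting the same error: I haven't inspected the referenced commit, but the usual fix for this class of Go error is to guard the shared map with a mutex, or switch to sync.Map. A generic sketch with hypothetical names, not necessarily what the actual commit does:)

package kubedock

import "sync"

// containerRegistry is a hypothetical shared map of container state,
// guarded by a mutex so concurrent create/run requests no longer race.
type containerRegistry struct {
	mu    sync.Mutex
	items map[string]string // hypothetical: container id -> status
}

func (r *containerRegistry) set(id, status string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.items[id] = status
}

func (r *containerRegistry) get(id string) (string, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	status, ok := r.items[id]
	return status, ok
}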
