build: Harden flaky Aeron tests in CI #32242

patriknw · 2023-11-27T14:55:54Z

* increase /dev/shm and use that (by default)
* use default term buffer size
* increase cpu requests, shouldn't matter but corresponds to what we want to use, 2 pods per node

This looks very promising. I have tried it in a GKE cluster and verified with df -h: /dev/shm was 64 MB and is now 1G.

No more "Scheduled sending of heartbeat was delayed".

This wasn't possible when we last tried, in #30601.
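
For readers unfamiliar with how /dev/shm is usually enlarged for a pod: the common approach is a memory-backed emptyDir volume mounted over /dev/shm. The manifest changed in this PR is not shown in the conversation, so the fragment below is only a sketch of that general technique. The pod name, container name, image, and the cpu/memory request values are hypothetical placeholders; only the 1G size comes from the description above.

```yaml
# Illustrative pod spec fragment, NOT the manifest from this PR.
# All names and request values are hypothetical placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: aeron-test-node            # hypothetical
spec:
  containers:
    - name: test-runner            # hypothetical
      image: example/test-image    # placeholder image
      resources:
        requests:
          cpu: "2"                 # placeholder; size to fit the intended pods per node
          memory: 4Gi              # placeholder; must leave room for the tmpfs below
      volumeMounts:
        - name: dshm
          mountPath: /dev/shm      # replaces the small default shared-memory mount
  volumes:
    - name: dshm
      emptyDir:
        medium: Memory             # tmpfs-backed volume
        sizeLimit: 1Gi             # matches the 1G reported by df -h
```

Files written to a memory-backed emptyDir count against the container's memory, which is one reason to raise the memory request alongside the /dev/shm size.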

patriknw · 2023-11-27T15:00:19Z

.github/workflows/nightly-builds.yml

@@ -215,8 +215,6 @@ jobs:
          -Dakka.test.tags.exclude=gh-exclude,gh-exclude-aeron,timing \
          -Dakka.test.multi-in-test=false \
          -Dakka.cluster.assert=on \
-          -Daeron.dir=/opt/volumes/media-driver \
-          -Daeron.term.buffer.length=33554432 \
          clean ${{ matrix.command }}


This job does not run in Kubernetes. It might have the same problem with a too small /dev/shm. Let me try...

Plenty of space, no problem.

Filesystem      Size  Used Avail Use% Mounted on
/dev/root        84G   62G   22G  74% /
tmpfs           7.9G  172K  7.9G   1% /dev/shm
tmpfs           3.2G  1.1M  3.2G   1% /run
tmpfs           5.0M     0  5.0M   0% /run/lock
/dev/sdb15      105M  6.1M   99M   6% /boot/efi
/dev/sda1        63G  4.1G   56G   7% /mnt
tmpfs           1.6G   12K  1.6G   1% /run/user/1001

octonato · 2023-11-27T15:06:14Z

.github/workflows/multi-node.yml

@@ -147,7 +144,8 @@ jobs:
          gcloud components install gke-gcloud-auth-plugin
          gcloud config set compute/region us-central1
          gcloud config set compute/zone us-central1-c
-          ./kubernetes/create-cluster-gke.sh "akka-artery-aeron-cluster-${GITHUB_RUN_ID}"
+          gcloud container clusters get-credentials akka-artery-aeron-cluster-test --zone us-central1-c --project akka-team
+          # ./kubernetes/create-cluster-gke.sh "akka-artery-aeron-cluster-test"


Is this intentional? Not calling the script to create the cluster?

Leftover from my testing, thanks.

pvlugter

LGTM

* more memory request
* separate Aeron run in another workflow to make such test failures more clear

patriknw · 2023-11-28T07:28:21Z

There was an error: "insufficient usable storage for new log of ". I have increased it. I don't know if it accumulates when running all tests? It's supposed to delete the files on shutdown.

patriknw · 2023-11-28T07:31:19Z

I moved the Aeron run into a separate workflow. I hope it shows up so that I can trigger a manual run once I merge this?
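
On the manual-run question: GitHub only offers the "Run workflow" button for a workflow that declares the workflow_dispatch event, and only once the workflow file exists on the default branch, so a new workflow normally becomes manually runnable after the merge. The fragment below is a minimal sketch of such a workflow. The workflow name, job name, runner label, and steps are assumptions for illustration and are not copied from this PR.

```yaml
# Illustrative workflow, NOT the file added in this PR.
# Workflow name, job name, runner and steps are hypothetical.
name: Aeron multi-node tests       # hypothetical
on:
  workflow_dispatch:               # enables manual runs from the Actions tab
jobs:
  aeron-tests:                     # hypothetical job name
    runs-on: ubuntu-22.04
    steps:
      - uses: actions/checkout@v4
      - name: Show shared memory size
        # Quick sanity check that /dev/shm is large enough before the tests start
        run: df -h /dev/shm
```

Keeping the Aeron run in its own workflow means an Aeron-specific failure shows up as a failure of that workflow alone rather than being mixed into the other multi-node results.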

patriknw · 2023-11-28T10:32:36Z

a successful run in https://github.com/akka/akka/actions/runs/7015560435

patriknw force-pushed the wip-dev-shm-patriknw branch from 77334b7 to a50c9e5 Compare November 27, 2023 15:13

build: Harden flaky Aeron tests in CI

bf1d4f0

patriknw force-pushed the wip-dev-shm-patriknw branch from a50c9e5 to bf1d4f0 Compare November 27, 2023 15:14

pvlugter approved these changes Nov 27, 2023

increase /dev/shm

5506475

patriknw merged commit 95d7210 into main Nov 28, 2023
5 checks passed

patriknw deleted the wip-dev-shm-patriknw branch November 28, 2023 07:31

patriknw added this to the 2.9.1 milestone Nov 28, 2023
