Standalone EVE OS containers with RT benchmarking support #7
Merged
uncleDecart merged 15 commits into main on Mar 5, 2026
Signed-off-by: Pavel Abramov <uncle.decart@gmail.com>
Restructure caterpillar and cyclictest into self-contained containers for deployment on EVE OS with full RT benchmarking support.

Build & deploy:
- Add build-all.sh with TAG= and optional registry push support
- SSH_KEY build arg for key-only auth (defaults to ~/.ssh/ztest_key.pub)
- BASE_TAG build arg to pin child images to versioned base
- Print build summary with full FQDN image tags

Base image (Dockerfile.base):
- Add bash, git, procps, ncurses-term, openssh-server
- Configure sshd with pubkey auth, start at container boot
- Add login banner (motd) with Jupyter/SSH/CLI instructions
- Add shell aliases: jupyter-start, rt-preflight, rt-info
- Expose ports 22 (SSH) and 8888 (Jupyter)
- Keep container alive after benchmark via sshd foreground

RT preflight checks (src/rt_preflight.py):
- 14-point validation: PREEMPT_RT, isolcpus, nohz_full, rcu_nocbs, irqaffinity, C-states, intel_pstate, governor, clocksource, NUMA balancing, split_lock, hugepages, capabilities, kernel threads
- Detects cmdline typos (e.g. rocessor.max_cstate)
- PASS/WARN/FAIL output visible in EVE OS cloud log viewer

Container improvements:
- Copy Python code, config, and notebooks into child images
- Add run.interactive flag: tqdm in terminal, brief logs in containers
- Fix detect_cpus to return actual core list from cgroup, not count
- Fix cyclictest: remove redundant chrt (handled by -p 95)
- Remove rdtset references, skip nested docker when run.docker=false
- Fix typo in main.py import (scr -> src)

Signed-off-by: Mikhail Malyshev <mike.malyshev@gmail.com>
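To make the PASS/WARN/FAIL idea concrete, here is a minimal sketch of a cmdline-based preflight check. It is not the code in src/rt_preflight.py: the names `CheckResult`, `parse_cmdline`, `check_cmdline_param`, and `run_cmdline_checks` are hypothetical, and it covers only four of the kernel parameters listed above, reporting WARN when one is absent.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class CheckResult:
    """One preflight result line (hypothetical structure, not from the PR)."""
    name: str
    status: str  # "PASS", "WARN", or "FAIL"
    detail: str


def parse_cmdline(cmdline: str) -> dict[str, str]:
    """Split a kernel command line into {key: value}; bare flags map to ''."""
    params: dict[str, str] = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        params[key] = value
    return params


def check_cmdline_param(params: dict[str, str], name: str) -> CheckResult:
    """PASS when the parameter is present, WARN when it is missing."""
    if name not in params:
        return CheckResult(name, "WARN", "not set on kernel cmdline")
    return CheckResult(name, "PASS", f"{name}={params[name]}")


def run_cmdline_checks(cmdline: str) -> list[CheckResult]:
    """Check a fixed subset of RT-relevant parameters (assumed subset)."""
    params = parse_cmdline(cmdline)
    names = ("isolcpus", "nohz_full", "rcu_nocbs", "irqaffinity")
    return [check_cmdline_param(params, n) for n in names]
```

On a Linux host, passing the contents of /proc/cmdline to `run_cmdline_checks` and printing each result as `[STATUS] name: detail` gives one line per check, a shape that is easy to scan in a log viewer.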
Entrypoint dynamically splits the container cpuset at boot:
- First core → housekeeping (entrypoint, sshd, python, uv)
- Remaining cores → benchmarks only (exported as RT_BENCHMARK_CORES)

All service processes inherit the housekeeping affinity via taskset. detect_cpus() reads RT_BENCHMARK_CORES first, so caterpillar/cyclictest only receive the clean cores. No hardcoded core numbers; fully dynamic from cgroup cpuset at runtime.

Signed-off-by: Mikhail Malyshev <mike.malyshev@gmail.com>
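The split described in this commit can be illustrated with a short Python sketch. The real logic lives in the container entrypoint and is not shown in this PR page, so the function names below (`parse_cpulist`, `split_cpuset`, `format_cpulist`) are hypothetical, and the comma-separated format assumed for RT_BENCHMARK_CORES is a guess.

```python
from __future__ import annotations


def parse_cpulist(text: str) -> list[int]:
    """Expand a kernel-style CPU list such as '0-3,8' into [0, 1, 2, 3, 8]."""
    cpus: set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, sep, hi = part.partition("-")
        if sep:
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(lo))
    return sorted(cpus)


def split_cpuset(cpus: list[int]) -> tuple[int, list[int]]:
    """Return (housekeeping_core, benchmark_cores): first core vs. the rest."""
    if not cpus:
        raise ValueError("empty cpuset")
    ordered = sorted(cpus)
    return ordered[0], ordered[1:]


def format_cpulist(cpus: list[int]) -> str:
    """Render cores as a comma-separated string (assumed env var format)."""
    return ",".join(str(c) for c in cpus)
```

For a container cpuset of `2-5`, this yields housekeeping core 2 and benchmark cores `3,4,5`. With a single-core cpuset the benchmark list comes back empty; how the actual entrypoint handles that case is not stated in the commit message.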
…in sysinfo
- Replace exec sshd -D with sleep infinity to prevent container exit (sshd already running in background, second instance failed on port conflict)
- Move detect_cpus() before sysinfo collection so effective cores are captured in sysinfo.json under new "runtime" section (effective_cores, housekeeping_core, source, config_cores)

Signed-off-by: Mikhail Malyshev <mike.malyshev@gmail.com>
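A rough sketch of how detection and the "runtime" section might fit together follows. It assumes the lookup order stated in the commits and the PR description (RT_BENCHMARK_CORES first, then the cgroup cpuset, then a CPU count). The cgroup v2 path, the `source` strings, the value types, and the `runtime_section` helper are assumptions for illustration, not the PR's implementation.

```python
from __future__ import annotations

import os
from pathlib import Path

# cgroup v2 location of the effective cpuset; assumed, may differ on EVE OS.
CGROUP_CPUSET = "/sys/fs/cgroup/cpuset.cpus.effective"


def _expand(text: str) -> list[int]:
    """Expand '3-5,7' into [3, 4, 5, 7]."""
    cpus: set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, sep, hi = part.partition("-")
        cpus.update(range(int(lo), (int(hi) if sep else int(lo)) + 1))
    return sorted(cpus)


def detect_cpus(env=None, cgroup_file: str = CGROUP_CPUSET) -> tuple[list[int], str]:
    """Return (cores, source), trying env, then cgroup, then a CPU count."""
    env = os.environ if env is None else env
    raw = env.get("RT_BENCHMARK_CORES", "").strip()
    if raw:
        return _expand(raw), "env"
    try:
        text = Path(cgroup_file).read_text().strip()
        if text:
            return _expand(text), "cgroup"
    except OSError:
        pass  # file missing or unreadable: fall through to the count
    return list(range(os.cpu_count() or 1)), "sysconf"


def runtime_section(cores, source, housekeeping_core, config_cores) -> dict:
    """Build the 'runtime' dict using the field names from the commit message."""
    return {
        "effective_cores": cores,
        "housekeeping_core": housekeeping_core,
        "source": source,
        "config_cores": config_cores,
    }
```

Running detection before sysinfo collection, as the commit does, means the recorded cores are the ones the benchmark actually used rather than the ones named in the config.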
After rebase onto main, the demo_mode branch (from hde2e PR) is now in the code path. Containers must explicitly opt out to avoid hitting the DockerHDE2E path instead of DockerTestRunner. Config default remains demo_mode=true for bare-metal HDE2E workflow.
- Remove the cmdline typo detector for 'rocessor.max_cstate'; it was fragile and produced false positives.
- Parse isolcpus flags (managed_irq, domain, io_queue) properly instead of feeding them to _parse_cpulist, which would crash on non-numeric tokens.
- Warn when managed_irq or domain flags are missing; without them the kernel still schedules IRQs and tasks onto isolated cores. Note that when any flag is specified, 'domain' is no longer implied and must be listed explicitly.
- Warn when io_queue flag is missing on kernel 6.17+; this flag prevents block-layer IO completion queues from landing on isolated cores.
- Support open-ended 1-N notation in CPU lists (meaning 'through the last available CPU'), as accepted by the kernel.
- Make _parse_cpulist resilient to non-numeric tokens (skip instead of crashing with ValueError).
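The parsing rules in this commit can be sketched as below. This is an illustration rather than the code in src/rt_preflight.py: `parse_isolcpus`, `missing_flag_warnings`, and the `last_cpu` parameter are hypothetical names, the `KNOWN_FLAGS` set is an assumption, and the kernel-version check for io_queue is left out. The sketch reads the note about 'domain' as meaning that it is implied only when no flags are given.

```python
from __future__ import annotations

# Flags accepted in front of the CPU list (assumed set for this sketch).
KNOWN_FLAGS = {"domain", "managed_irq", "io_queue", "nohz"}


def parse_cpulist(text: str, last_cpu: int) -> list[int]:
    """Expand a CPU list; 'A-N' runs through last_cpu; bad tokens are skipped."""
    cpus: set[int] = set()
    for token in text.split(","):
        token = token.strip()
        if not token:
            continue
        lo, sep, hi = token.partition("-")
        try:
            start = int(lo)
            if not sep:
                end = start
            elif hi == "N":
                end = last_cpu
            else:
                end = int(hi)
        except ValueError:
            continue  # non-numeric token: skip instead of crashing
        cpus.update(range(start, end + 1))
    return sorted(cpus)


def parse_isolcpus(value: str, last_cpu: int) -> tuple[list[str], list[int]]:
    """Separate flag tokens from CPU tokens in an isolcpus= value."""
    flags: list[str] = []
    cpu_tokens: list[str] = []
    for token in value.split(","):
        token = token.strip()
        (flags if token in KNOWN_FLAGS else cpu_tokens).append(token)
    return flags, parse_cpulist(",".join(cpu_tokens), last_cpu)


def missing_flag_warnings(flags: list[str]) -> list[str]:
    """Warn about managed_irq always, and about domain once any flag is set."""
    warnings = []
    if "managed_irq" not in flags:
        warnings.append("isolcpus: managed_irq flag missing")
    if flags and "domain" not in flags:
        warnings.append("isolcpus: domain flag missing (not implied once flags are given)")
    return warnings
```

For example, `parse_isolcpus("managed_irq,domain,2-N", last_cpu=5)` separates the two flags from the range and expands the range through CPU 5.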
uncleDecart reviewed on Mar 3, 2026
# Output directory for results
RUN mkdir -p /tmp/output

ENTRYPOINT ["/entrypoint.sh", "uv", "run", "python", "main.py", "run.docker=false", "pqos.enable=false", "run.stressor=false", "run.interactive=false", "demo.demo_mode=false", "run.command=caterpillar"]
The caterpillar container should include only the caterpillar binary needed to run it. The main Python script is what runs those tests, so calling it from here feels like a cyclic dependency to me.
uncleDecart approved these changes on Mar 3, 2026
LGTM. My only question is about the entrypoint for cyclictest and caterpillar.
…OS) variants

Restore caterpillar/Dockerfile and cyclictest/Dockerfile to their original lightweight docker-host versions. The EVE OS standalone containers (with SSH, Jupyter, cpuset pinning, entrypoint) are now in Dockerfile.eve, Dockerfile.base.eve, caterpillar/Dockerfile.eve, and cyclictest/Dockerfile.eve.

Rename build-all.sh to build-all-eve.sh and update it to reference the .eve Dockerfiles.

Signed-off-by: Mikhail Malyshev <mike.malyshev@gmail.com>
Thank you @rucoder, looks amazing!
Description of the change

Restructure caterpillar and cyclictest into self-contained containers for deployment on EVE OS with full RT benchmarking support. This PR supersedes #5 and includes all its changes (BIOS scraping, detect CPUs, PQOS optional, reboot testing) rebased on top of current main.

Key changes

Base image (Dockerfile.base):
- … jupyter-start, rt-preflight, rt-info
- … sshd foreground (for SSH/Jupyter access)

RT preflight checks (src/rt_preflight.py):
- …

Container improvements:
- detect_cpus() reads effective cores from RT_BENCHMARK_CORES env (set by entrypoint), falls back to cgroup/proc/sysconf
- … sysinfo.json
- … run.docker=false and run.interactive=false for in-container execution
- … chrt (handled by -p 95)
- … pqos.enable flag)

Compatibility with demo mode (from main):
- … main which includes the HDE2E demo mode PR (add codesys hde2e demo: Dockerfile, PLC data bundles, configs, hde2e … #6)
- … demo.demo_mode=false to ensure they take the DockerTestRunner path

Checks and balances

Type of change
- … functionality to not work as expected)

Related stories, issues and pulls
- … main

Security considerations
- … SSH_KEY build arg