Symptom
For the linux-release flavor (-version 20.0.0.l or -version local-linux-release), the per-test Bokeh charts in GeoDMSTestResults/<run>/<test>.html show flatlined memory and CPU usage even for tests that demonstrably did real work. Example: t405_2_NetworkModel_PBL_zonderFence and t405_3_NetworkModel_PBL_metFence each ran for about 400 s, peaked at about 24 GB allocated, and produced output CSVs (visible in the GeoDmsRun /L log), yet the Bokeh memory series stays at ~0% for the whole run.
The status code is correct ("OK"), but the visualisation makes the run look like a no-op.
Cause
profiler/profiler.py invokes the experiment via subprocess.Popen(cmd_parts, ...) and then samples psutil.Process(parent_pid).children(...).
For the Linux flavor the command is wsl -- /opt/.../GeoDmsRun ... (arguments elided). From Windows, psutil sees only the wsl.exe shim, a tiny relay process, and its children. The actual GeoDmsRun Linux process lives inside the WSL VM, under a separate kernel with its own process tree, and is invisible to psutil from Windows.
As a result the sampler records ~0 memory and ~0 CPU, and the Bokeh series flatlines.
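The sampling pattern described above can be sketched roughly as follows. This is an illustration only, not the actual code in profiler/profiler.py: the names tree_rss_bytes and demo are hypothetical, and psutil is imported lazily so the summing helper can be exercised without it. With a wsl -- ... command on Windows, the list of processes contains only wsl.exe and its Windows-side children, so the sum stays near zero.

```python
def tree_rss_bytes(parent):
    """Sum resident memory (bytes) over a process and all its descendants.

    `parent` is expected to behave like psutil.Process: it must provide
    children(recursive=True) and memory_info().rss.
    """
    procs = [parent] + parent.children(recursive=True)
    return sum(p.memory_info().rss for p in procs)


def demo(cmd_parts):
    """Hypothetical driver: launch a command and sample its tree once.

    Untested sketch; a real sampler would loop and handle processes that
    exit between samples (psutil.NoSuchProcess).
    """
    import subprocess
    import psutil  # third-party; assumed to be what the profiler uses

    child = subprocess.Popen(cmd_parts)
    try:
        return tree_rss_bytes(psutil.Process(child.pid))
    finally:
        child.wait()
```

Because the walk starts at the Windows-side pid returned by Popen, nothing running under the WSL kernel can appear in it, however deep the recursion goes.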
Reproduce
python full.py -version local-linux-release -tests t405_2
Open …/20_0_0_l_x64_SF_S1S2S3_OVSRV10/t405_2_NetworkModel_PBL_zonderFence.html. Note flatlines despite the GeoDmsRun /L log showing "Highest allocated: 24713[MB]".
Mitigation options
- WSL-side memory probe: in the sampling loop, when GeoDmsLocalFlavor == "linux-release", additionally run wsl bash -c "cat /proc/<pid>/status" (finding the GeoDmsRun pid via pgrep -f GeoDmsRun once startup completes) and read VmRSS / VmPeak. Roughly 30 lines of code, at the cost of one extra wsl shell-out per sample.
- Mark series as N/A on Linux: emit a clear placeholder ("WSL profiling not measured") in the Bokeh series so users don't read the flat zero as "test was a no-op".
- Move the sampler inside WSL: spawn the WSL-side sampler inline with the GeoDmsRun command (e.g. via a shell wrapper) and merge results back. More invasive.
Recommendation: the WSL-side memory probe, with the N/A placeholder as a fallback when the probe fails.
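A minimal sketch of the WSL-side probe, with a None return that the caller could map to the N/A placeholder. It has not been run against a real WSL setup, and the function names and the runner parameter are hypothetical. One deliberate deviation from the option as written: it uses pgrep -x (exact process-name match) instead of pgrep -f, because -f matches full command lines and would also match the probing bash process, whose own command line contains the string GeoDmsRun.

```python
import re
import subprocess

# Matches lines such as "VmRSS:\t  204800 kB" in /proc/<pid>/status.
_FIELD = re.compile(r"^(VmRSS|VmPeak):\s+(\d+)\s+kB", re.MULTILINE)

# A single command substitution and no shell variables, so it does not
# matter which shell layer ends up expanding it.
_SCRIPT = "cat /proc/$(pgrep -n -x GeoDmsRun)/status"


def parse_proc_status(text):
    """Extract VmRSS and VmPeak (in kB) from the text of /proc/<pid>/status."""
    return {name: int(kb) for name, kb in _FIELD.findall(text)}


def probe_wsl_memory(timeout=5.0, runner=subprocess.run):
    """Return {'VmRSS': kB, 'VmPeak': kB} for the newest GeoDmsRun process
    inside WSL, or None if the probe fails for any reason."""
    try:
        result = runner(
            ["wsl", "bash", "-c", _SCRIPT],
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,  # non-zero exit (e.g. no matching pid) raises
        )
    except (OSError, subprocess.SubprocessError):
        return None
    return parse_proc_status(result.stdout) or None
```

Note that VmPeak is the peak virtual memory size; if peak resident memory is wanted instead, the same status file reports it as VmHWM.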
Workaround until fixed
Trust the "Highest allocated/CommitCharge" line in the GeoDmsRun /L log and the test's status code; ignore the Bokeh memory and CPU series for the .l flavor.
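For scripted checks, the peak can be pulled out of the log text with a small helper like the one below. It is a sketch that assumes the line contains a fragment shaped like the one quoted above, "Highest allocated: 24713[MB]"; the rest of the log format is not known here, and peak_allocated_mb is a hypothetical name.

```python
import re

# Assumed fragment shape, taken from the quoted example: "Highest allocated: 24713[MB]"
_HIGHEST = re.compile(r"Highest allocated:\s*(\d+)\s*\[MB\]")


def peak_allocated_mb(log_text):
    """Return the largest 'Highest allocated: N[MB]' value found, or None."""
    values = [int(v) for v in _HIGHEST.findall(log_text)]
    return max(values) if values else None
```

Taking the maximum covers the case where the line appears more than once in a log.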