Profiler bokeh series flat-lines for the .l (Linux/WSL) flavor #1104

Description

@MaartenHilferink

Symptom

For the linux-release flavor (-version 20.0.0.l or local-linux-release), the per-test Bokeh charts in GeoDMSTestResults/<run>/<test>.html show flatlined memory and CPU usage even for tests that demonstrably did real work. Example: t405_2_NetworkModel_PBL_zonderFence and t405_3_NetworkModel_PBL_metFence each ran ~400 sec, allocated ~24 GB peak, and produced output CSVs (visible in the GeoDmsRun /L log) — but the Bokeh series shows ~0% memory the whole run.

Status code is correct ("OK"), but the visualisation makes the run look like a no-op.

Cause

profiler/profiler.py invokes the experiment via subprocess.Popen(cmd_parts, ...) and then samples psutil.Process(parent_pid).children(...).

For the Linux flavor the command is wsl -- /opt/.../GeoDmsRun ... (arguments elided). psutil sees only the wsl.exe shim on the Windows side, a tiny relay process, and its children. The actual GeoDmsRun Linux process lives inside the WSL VM, with its own kernel and process tree, and is invisible to psutil from Windows.

So the sampler sees ~0 memory / ~0 CPU and the Bokeh series flatlines.
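
To make this blind spot explicit, the sampler could check whether the launched command goes through the wsl launcher. This is only a minimal sketch: launches_via_wsl is a hypothetical helper, not something that exists in profiler/profiler.py, and it assumes cmd_parts is the same list that is passed to subprocess.Popen.

```python
import os


def launches_via_wsl(cmd_parts):
    """Return True when the first element of cmd_parts is the wsl launcher.

    Hypothetical helper: in that case psutil on the Windows side can only
    observe the wsl.exe relay, not the Linux process it starts.
    """
    if not cmd_parts:
        return False
    # Normalise Windows separators so basename works on any host OS.
    exe = os.path.basename(str(cmd_parts[0]).replace("\\", "/")).lower()
    return exe in ("wsl", "wsl.exe")
```

A sampler could use such a check to switch to a WSL-side probe, or to label the series as not measured, instead of plotting the relay's near-zero usage.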

Reproduce

python full.py -version local-linux-release -tests t405_2

Open …/20_0_0_l_x64_SF_S1S2S3_OVSRV10/t405_2_NetworkModel_PBL_zonderFence.html. Note the flatlined series, even though the GeoDmsRun /L log shows "Highest allocated: 24713[MB]".

Mitigation options

  1. WSL-side memory probe: in the sampling loop, when GeoDmsLocalFlavor == "linux-release", additionally run wsl bash -c "cat /proc/<pid>/status" and read VmRSS / VmPeak. Find the GeoDmsRun pid once via pgrep -f GeoDmsRun after startup completes. Roughly 30 lines, at the cost of one extra wsl shell-out per sample.
  2. Mark series as N/A on Linux: emit a clear placeholder ("WSL profiling not measured") in the Bokeh series so users don't read the flat zero as "test was a no-op".
  3. Move the sampler inside WSL: spawn the WSL-side sampler inline with the GeoDmsRun command (e.g. via a shell wrapper) and merge results back. More invasive.

Recommendation: (1), with (2) as a fallback when the probe fails.
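
A rough sketch of what option (1) with the option (2) fallback might look like. The names parse_proc_status and probe_wsl_memory are hypothetical, the timeout value is an assumption, and the pid is assumed to have been found beforehand (for example via pgrep -f GeoDmsRun inside WSL). VmRSS and VmPeak are standard fields of /proc/<pid>/status, reported in kB.

```python
import subprocess


def parse_proc_status(text):
    """Extract VmRSS and VmPeak (in kB) from /proc/<pid>/status text."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in ("VmRSS", "VmPeak"):
            parts = rest.split()
            if parts and parts[0].isdigit():
                fields[key] = int(parts[0])
    return fields


def probe_wsl_memory(pid, timeout=5.0):
    """Read memory figures for a WSL-side pid; return None if the probe fails.

    Hypothetical helper: one wsl shell-out per call, as described in option 1.
    """
    cmd = ["wsl", "bash", "-c", f"cat /proc/{int(pid)}/status"]
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout, check=True
        )
    except (OSError, subprocess.SubprocessError):
        # wsl missing, process already gone, or the call timed out.
        return None
    return parse_proc_status(result.stdout) or None
```

When probe_wsl_memory returns None, the caller can fall back to option (2) and emit the "WSL profiling not measured" placeholder rather than a zero sample.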

Workaround until fixed

Trust the GeoDmsRun /L log's "Highest allocated/CommitCharge" line and the test's status code; ignore the Bokeh memory series for the .l flavor.
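
For scripted checks, the peak figure can be pulled from the log text. This is a sketch only: it assumes the log contains the fragment "Highest allocated: <n>[MB]" exactly as quoted under Reproduce, and peak_allocated_mb is a hypothetical helper name. The rest of the log line format is not known here.

```python
import re

# Pattern based solely on the fragment quoted in this issue.
_PEAK_RE = re.compile(r"Highest allocated:\s*(\d+)\s*\[MB\]")


def peak_allocated_mb(log_text):
    """Return the largest 'Highest allocated' value in MB, or None if absent."""
    matches = _PEAK_RE.findall(log_text)
    return max(int(m) for m in matches) if matches else None
```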
