Add more Python-related leak suppressions. #6370

Merged
JanuszL merged 4 commits into NVIDIA:main from JanuszL:more_lsan_supres on Jun 2, 2026

Conversation

@JanuszL
Contributor

@JanuszL JanuszL commented May 26, 2026

  • After moving to Python 3.12 for the sanitized build,
    new false positives have started appearing. This requires
    adding additional suppression entries and modifying sanitizer
    options.

Category:

Other (e.g. Documentation, Tests, Configuration)

Description:

  • After moving to Python 3.12 for the sanitized build,
    new false positives have started appearing. This requires
    adding additional suppression entries.

Additional information:

Affected modules and functionalities:

  • sanitizer suppression list
  • sanitizer options

Key points relevant for the review:

  • NA

Tests:

  • Existing tests apply
    • all tests with sanitizers on
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

@dali-automaton
Collaborator

CI MESSAGE: [52651772]: BUILD STARTED

@greptile-apps
Contributor

greptile-apps Bot commented May 26, 2026

Greptile Summary

This PR addresses false-positive leak reports that emerged after upgrading the sanitized CI build to Python 3.12, by expanding qa/leak.sup with new suppression entries and tuning the ASAN/LSAN environment setup scripts.

  • qa/leak.sup: Adds suppressions for Python 3.12 runtime internals (_PyImport_*, _PyObject_GC_*, PyRun_*, etc.), external compiler/toolchain processes (gcc, g++, cc1plus, nvcc), and system binaries (sort, tr, fc-list, xargs, yyparse). Removes specific _PyObject_GC_NewVar / _PyObject_GC_New entries now covered by the new wildcard, and fixes the previously flagged missing trailing newline.
  • qa/test_template_impl.sh: Reorders enable_sanitizer() so the Python package-path discovery commands run while ASAN is still deactivated (using the global default start_deactivated=true), then activates ASAN fully with start_deactivated=false for the actual test body. Adds nvidia.nvjpeg2k and nvidia.nvtiff library directories to LD_LIBRARY_PATH as prerequisites for nvimgcodec.
  • qa/TL0_cpu_only/test_nofw.sh / test_pytorch.sh: Same LD_LIBRARY_PATH expansion as test_template_impl.sh, now appending to the existing variable rather than replacing it entirely.
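The suppression entries summarized above follow LeakSanitizer's `leak:<pattern>` format, where each pattern is matched against the function, source file, or module names in a leak's allocation stack. A minimal sketch of such a file and how it is handed to the sanitizer is shown below; the two entries and the temporary file are illustrative assumptions, not the actual content of qa/leak.sup.

```shell
#!/usr/bin/env bash
# Illustrative only: these entries are NOT copied from qa/leak.sup.
sup_file="$(mktemp)"   # hypothetical location used just for this example

cat > "${sup_file}" <<'EOF'
# One wildcard pattern can cover a family of interpreter allocators.
leak:_PyObject_GC_*
# Helper processes spawned by the tests can be matched by binary path.
leak:/usr/bin/xargs
EOF

# LeakSanitizer reads the file named in LSAN_OPTIONS.
export LSAN_OPTIONS="suppressions=${sup_file}"

entry_count="$(grep -c '^leak:' "${sup_file}")"
echo "wrote ${entry_count} suppression entries to ${sup_file}"
```

A wildcard entry makes narrower entries for the same prefix redundant, which is presumably why the specific _PyObject_GC_NewVar / _PyObject_GC_New lines could be dropped.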

Confidence Score: 5/5

Safe to merge — changes are limited to CI sanitizer configuration and suppression lists with no impact on production code.

All changes are confined to QA scripts and the LSAN suppression file. The ASAN reordering in enable_sanitizer() is logically sound: Python helper commands run while ASAN is still deactivated, then ASAN is fully enabled for the actual test body. The previously flagged issues (xargs typo, redundant GC entries) are both addressed. No production logic is touched.
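The ordering described above can be sketched roughly as follows. This is a simplified stand-in, not the real enable_sanitizer(): the function names `package_dir` and `enable_sanitizer_sketch`, the variable `asan_during_discovery`, and the exact option strings are assumptions made for illustration, and the real script also handles LD_PRELOAD, libfakeclose.so, the `/lib` suffixes, and the symbolizer path.

```shell
#!/usr/bin/env bash
# Sketch only: a simplified stand-in for enable_sanitizer(), not the real one.

# Hypothetical helper: print the install directory of a Python package,
# or print nothing when the package cannot be imported.
package_dir() {
    python3 -c 'import importlib, os, sys; m = importlib.import_module(sys.argv[1]); print(os.path.dirname(m.__file__))' "$1" 2>/dev/null
}

enable_sanitizer_sketch() {
    local pkg dir

    # Step 1: helper interpreters run while ASAN is still deactivated.
    export ASAN_OPTIONS="start_deactivated=true"
    asan_during_discovery="${ASAN_OPTIONS}"
    for pkg in nvidia.nvjpeg2k nvidia.nvtiff nvidia.nvimgcodec; do
        dir="$(package_dir "${pkg}" || true)"
        if [ -n "${dir}" ]; then
            # Append to the existing value instead of replacing it.
            export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}${dir}"
        fi
    done

    # Step 2: only now switch ASAN fully on for the test body.
    export ASAN_OPTIONS="start_deactivated=false:detect_leaks=1"
}

enable_sanitizer_sketch
echo "during discovery: ${asan_during_discovery}"
echo "for the test body: ${ASAN_OPTIONS}"
```

Keeping the discovery commands ahead of the final `ASAN_OPTIONS` export means the short-lived Python helpers never run with leak detection active, so they cannot add reports of their own.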

No files require special attention.

Important Files Changed

| Filename | Overview |
|---|---|
| qa/leak.sup | Adds Python 3.12 leak suppressions (Python runtime, compiler toolchain, system binaries), removes specific PyObject_GC entries superseded by the new wildcard, and adds a trailing newline. All previous review issues (xargs typo, redundant GC entries) are addressed. |
| qa/test_template_impl.sh | Reorders enable_sanitizer() so Python helper invocations (nvjpeg2k/nvtiff/nvimgcodec path discovery) run before ASAN_OPTIONS is updated, preventing ASAN from intercepting those helper processes. Also changes start_deactivated from true to false so ASAN is fully active during actual test execution, and adds dependency library paths for nvjpeg2k and nvtiff. |
| qa/TL0_cpu_only/test_nofw.sh | Expands LD_LIBRARY_PATH to include nvjpeg2k and nvtiff library directories as prerequisites for nvimgcodec, now appending to existing LD_LIBRARY_PATH instead of replacing it. |
| qa/TL0_cpu_only/test_pytorch.sh | Same LD_LIBRARY_PATH expansion as test_nofw.sh: adds nvjpeg2k and nvtiff lib paths before nvimgcodec, appending to the existing variable rather than overwriting it. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[test_body_wrapper called] --> B{DALI_ENABLE_SANITIZERS set?}
    B -- No --> C[run test_body normally]
    B -- Yes --> D[enable_sanitizer]

    D --> D1[export PYTHONMALLOC=malloc]
    D1 --> D2[build libfakeclose.so]
    D2 --> D3[set LD_PRELOAD with libasan]
    D3 --> D4["Python: discover nvjpeg2k path\nASAN_OPTIONS still = start_deactivated=true"]
    D4 --> D5["Python: discover nvtiff path\n(ASAN deactivated)"]
    D5 --> D6["Python: discover nvimgcodec path\n(ASAN deactivated)"]
    D6 --> D7["set ASAN_OPTIONS\nstart_deactivated=false, detect_leaks=1\nsuppressions=address.sup"]
    D7 --> D8[set ASAN_SYMBOLIZER_PATH]
    D8 --> E[run test_body]
    E --> F[disable_sanitizer]
    F --> G[process_sanitizers_logs]
    G --> H{ERROR in log?}
    H -- Yes --> I[exit 1]
    H -- No --> J[continue]

Reviews (12): Last reviewed commit: "More leaks suppresed"

Comment thread qa/test_template_impl.sh Outdated
Comment on lines +79 to +81
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:"$(python -c 'import nvidia.nvjpeg2k as n, os; print(os.path.dirname(n.__file__) + "/lib")' 2>/dev/null)"
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:"$(python -c 'import nvidia.nvtiff as n, os; print(os.path.dirname(n.__file__) + "/lib")' 2>/dev/null)"
export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:"$(python -c 'import nvidia.nvimgcodec as n, os; print(os.path.dirname(n.__file__))' 2>/dev/null)"
Contributor

P2 [Nit] The two new lines in enable_sanitizer() use 2>/dev/null without || echo '', while the equivalent lines in test_nofw.sh and test_pytorch.sh use 2>/dev/null || echo ''. Both produce an empty string on failure, so the behaviour is identical, but the inconsistency may surprise a future reader. Consider aligning the style.
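The claim that both forms behave identically can be checked with a small experiment. In the sketch below, `failing_lookup` is a hypothetical stand-in for a `python -c 'import ...'` call that fails, and `/opt/lib` is a made-up starting path; neither comes from the scripts under review.

```shell
#!/usr/bin/env bash
# Stand-in for a failing `python -c 'import ...'`:
# it writes an error to stderr and exits non-zero.
failing_lookup() {
    echo "ModuleNotFoundError" >&2
    return 1
}

# Form used in enable_sanitizer(): stderr discarded, no explicit fallback.
export WITHOUT_FALLBACK="/opt/lib:$(failing_lookup 2>/dev/null)"

# Form used in test_nofw.sh / test_pytorch.sh: explicit empty fallback.
export WITH_FALLBACK="/opt/lib:$(failing_lookup 2>/dev/null || echo '')"

if [ "${WITHOUT_FALLBACK}" = "${WITH_FALLBACK}" ]; then
    echo "both forms expand to the same value: ${WITH_FALLBACK}"
fi
```

Because the substitution sits inside an `export`, the exit status of the line is that of `export` itself, so a failed lookup does not abort the script in either form, even under `set -e`.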

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Contributor Author

Fixed

@JanuszL JanuszL force-pushed the more_lsan_supres branch from 44bf62a to 5bd9383 Compare May 26, 2026 15:17
@rostan-t rostan-t self-assigned this May 26, 2026
@dali-automaton
Collaborator

CI MESSAGE: [52651772]: BUILD FAILED

@dali-automaton
Collaborator

CI MESSAGE: [52687092]: BUILD STARTED

@JanuszL JanuszL force-pushed the more_lsan_supres branch from 5bd9383 to 853a10c Compare May 26, 2026 22:28
@dali-automaton
Collaborator

CI MESSAGE: [52701305]: BUILD FAILED

@dali-automaton
Collaborator

CI MESSAGE: [52702362]: BUILD STARTED

@JanuszL JanuszL force-pushed the more_lsan_supres branch from 853a10c to daf565c Compare May 27, 2026 08:23
@dali-automaton
Collaborator

CI MESSAGE: [52752780]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [52702362]: BUILD FAILED

- After moving to Python 3.12 for the sanitized build,
  new false positives have started appearing. This requires
  adding additional suppression entries and modifying sanitizer
  options.

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@JanuszL JanuszL force-pushed the more_lsan_supres branch from daf565c to fbc6629 Compare May 27, 2026 09:37
@dali-automaton
Collaborator

CI MESSAGE: [52760946]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [52687092]: BUILD FAILED

@dali-automaton
Collaborator

CI MESSAGE: [52760946]: BUILD FAILED

@dali-automaton
Collaborator

CI MESSAGE: [52953311]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [52953745]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [52953745]: BUILD FAILED

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@dali-automaton
Collaborator

CI MESSAGE: [53052736]: BUILD FAILED

@dali-automaton
Collaborator

CI MESSAGE: [53053372]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [53053372]: BUILD FAILED

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@JanuszL JanuszL force-pushed the more_lsan_supres branch from c1d8914 to 2666125 Compare May 30, 2026 18:39
@dali-automaton
Collaborator

CI MESSAGE: [53173151]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [53173151]: BUILD FAILED

@dali-automaton
Collaborator

CI MESSAGE: [53258917]: BUILD STARTED

Comment thread qa/leak.sup Outdated
leak:/usr/bin/sort
leak:/usr/bin/tr
leak:/usr/bin/fc-list
leak:/usr/bin/xarg
Contributor

P1 [Bug] Typo in suppression path: /usr/bin/xarg does not exist on Linux — the tool is /usr/bin/xargs. Because the path never matches any frame, this suppression is silently inert. If xargs genuinely produces a false-positive leak report in CI, the test will continue to fail despite this entry.

Suggested change
- leak:/usr/bin/xarg
+ leak:/usr/bin/xargs
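A typo like this is easy to miss because an entry that matches nothing produces no warning. One way to catch it, sketched below under stated assumptions, is to check that every path-style `leak:` entry names an existing file. The helper `missing_suppression_paths` and the sample entries are hypothetical and not part of the PR; symbol patterns and wildcard entries are skipped rather than validated.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print path-style `leak:` entries that name no existing file.
# Only absolute paths without wildcards are checked; symbol patterns are skipped.
missing_suppression_paths() {
    local sup_file="$1" line path
    while IFS= read -r line; do
        case "${line}" in
            leak:/*) path="${line#leak:}" ;;   # absolute path entry
            *) continue ;;                     # comment, blank line, or symbol pattern
        esac
        case "${path}" in
            *'*'*) continue ;;                 # wildcard paths cannot be checked this way
        esac
        if [ ! -e "${path}" ]; then
            echo "${path}"
        fi
    done < "${sup_file}"
}

# Illustrative input: one valid path, the typo from this thread, one symbol pattern.
example_sup="$(mktemp)"
printf '%s\n' 'leak:/bin/sh' 'leak:/usr/bin/xarg' 'leak:_PyImport_*' > "${example_sup}"

missing="$(missing_suppression_paths "${example_sup}")"
echo "entries pointing at no file: ${missing}"
```

Such a check is only meaningful on the machine image where the sanitized tests run, since the binaries named in the suppression file may be absent elsewhere.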

Contributor Author

Fixed

@JanuszL JanuszL force-pushed the more_lsan_supres branch from 128d706 to c764f6c Compare June 1, 2026 05:06
Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@JanuszL JanuszL force-pushed the more_lsan_supres branch from c764f6c to aef942a Compare June 1, 2026 05:08
@dali-automaton
Collaborator

CI MESSAGE: [53260113]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [53260113]: BUILD PASSED

@JanuszL JanuszL merged commit 96e7e28 into NVIDIA:main Jun 2, 2026
6 checks passed
@JanuszL JanuszL deleted the more_lsan_supres branch June 2, 2026 09:46

4 participants