MAINT: Migrate os.path.* to pathlib.Path in pyrit/#1877
Merged
romanlutz merged 5 commits intoJun 2, 2026
Conversation
Audit found ~22 os.path.* call sites across 11 pyrit/ source files,
plus a few paired sibling APIs (os.makedirs/os.remove/os.unlink).
This change does a mechanical pass to the pathlib equivalents:
- os.path.exists -> Path(...).exists()
- os.path.isfile -> Path(...).is_file()
- os.path.splitext(p)[1] -> Path(p).suffix
- os.path.basename -> Path(...).name
- os.path.join(os.path.expanduser('~'), ...) -> Path.home() / ...
- os.path.getsize -> Path(...).stat().st_size
- os.makedirs(p, exist_ok=True) -> Path(p).mkdir(parents=True, exist_ok=True)
- os.remove / os.unlink -> Path(...).unlink()
Public parameter and attribute types remain str / Optional[str];
Path(...) conversion happens at the boundary so callers passing
strings continue to work unchanged.
In tests/unit/models/test_seed_prompt.py, two tests previously
patched os.path.isfile / os.path.splitext on the source module.
Those mocks would be inert after the migration, so they were
replaced with real files created via the tmp_path fixture.
Ruff PTH ruleset is intentionally left disabled in this PR;
enabling it is a recommended follow-up now that pyrit/ is clean.
Verification:
- ruff check pyrit/ tests/unit/models/test_seed_prompt.py: passed
- ruff format --check (same scope): passed
- pytest tests/unit/{models/test_seed_prompt.py,
models/test_data_type_serializer.py, score,
message_normalizer, common, prompt_target,
prompt_converter}: 3245 passed, 81 skipped
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Path(value).is_file() raises OSError (ENAMETOOLONG: Errno 36 on Linux, Errno 63 on macOS) for strings exceeding the filesystem's name limit, and raises ValueError for embedded null bytes. The previous os.path.isfile(value) silently returned False in both cases, so SeedPrompt construction worked for long text-only values such as the academic-paper jailbreak template at pyrit/datasets/jailbreak/templates/Arth_Singh/context_flood_academic.yaml. The pathlib migration regressed this, causing TestJailbreakInitialization / TestJailbreakAttackGeneration to fail on macos-3.11 and the Linux coverage job. Wrap the is_file() probe in try/except (OSError, ValueError) so the inference falls through to data_type='text', preserving the prior os.path.isfile semantics. Add two unit tests in tests/unit/models/test_seed_prompt.py covering the long-value and null-byte cases. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The diff_cover gate (>=90% on changed lines) was failing because four lines in finally-style cleanup branches added during the os.path -> pathlib migration were not exercised by the existing test suite: - pyrit/models/data_type_serializer.py:208 (local_temp_path.unlink() after Azure storage upload in save_formatted_audio) - pyrit/prompt_converter/add_image_to_video_converter.py:185 (local_temp_path.unlink() in azure_storage_flag finally branch) - pyrit/score/audio_transcript_scorer.py:250 (Path(wav_path).unlink() cleanup when _ensure_wav_format produced a converted temp WAV) - pyrit/score/video_scorer.py:285 (Path(audio_path).unlink() cleanup after successful audio scoring in _score_video_audio_async) Added four tests that patch DB_DATA_PATH / extract_audio_from_video / _ensure_wav_format so the cleanup branches run against real temp files and verify the files are unlinked. Diff coverage now reports 98% on this PR (target: >=90%). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
jsong468
reviewed
Jun 2, 2026
jsong468
reviewed
Jun 2, 2026
jsong468
reviewed
Jun 2, 2026
jsong468
approved these changes
Jun 2, 2026
Applies @jsong468's review suggestions on PR microsoft#1877: the two temp-file cleanup branches in audio_transcript_scorer.py and video_scorer.py were probing Path.exists() before calling Path.unlink(). Collapse both to Path(...).unlink(missing_ok=True), which is race-safe (no TOCTOU gap between the existence check and the unlink) and a single syscall. Behavior is unchanged: if the file is absent the call is a no-op; if present it is removed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Brings in 29 new commits from upstream main. One conflict in pyrit/prompt_converter/add_image_to_video_converter.py: PR microsoft#1878 on main refactored _add_image_to_video into an async wrapper plus a sync helper (_add_image_to_video_sync) invoked via asyncio.to_thread, and guarded the temp-file cleanup with `local_temp_path is not None`. This branch had concurrently migrated the same cleanup from os.remove to local_temp_path.unlink(). Resolved by keeping main's structure (helper split, None guard, dropped the orphan logger.info/return that moved to the wrapper) and applying the pathlib migration on top: `local_temp_path.unlink()` instead of `os.remove(local_temp_path)`. Verification: - ruff check + ruff format --check on pyrit/ and tests/: clean - pytest tests/unit -n auto: 9160 passed, 6 skipped Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
The project convention (and modern Python style) is to use
pathlib.Pathfor filesystem paths, but ~22os.path.*call sites across 11pyrit/source files were still using the legacyos.pathAPI, plus a few paired sibling calls (os.makedirs/os.remove/os.unlink). This PR does a focused, mechanical pass to bring those files into line.Standard 1-for-1 swaps applied throughout:
os.path.exists->Path(...).exists()os.path.isfile->Path(...).is_file()os.path.splitext(p)[1]->Path(p).suffixos.path.basename->Path(...).nameos.path.join(os.path.expanduser('~'), ...)->Path.home() / ...os.path.getsize->Path(...).stat().st_sizeos.makedirs(p, exist_ok=True)->Path(p).mkdir(parents=True, exist_ok=True)os.remove/os.unlink->Path(...).unlink()Public parameter and attribute types are kept as
str/Optional[str]; conversion toPath(...)happens at the boundary, so callers passing strings continue to work unchanged. No public signatures changed.One callee improvement worth flagging:
HuggingFaceChatTargetpreviously builtcache_diras astrand passed it todownload_specific_files_async, whose declared parameter iscache_dir: Path. The runtime now honors the declared type.Ruff's
PTHruleset is infixablebut not yetselectinpyproject.toml. This PR intentionally does not flip that switch; enabling it (or at least the rules that catch the patterns above) is a natural follow-up now thatpyrit/is clean.Tests and Documentation
No new tests added. The audit identified one test file (
tests/unit/models/test_seed_prompt.py) whose@patch("os.path.isfile")/@patch("os.path.splitext")decorators would have become inert after the source migration, so those two tests were updated to use thetmp_pathfixture with real touched files instead.Verification:
ruff check+ruff format --check(fullpyrit/tree): cleanpytest tests/unit -n auto: 8382 passed / 120 skippedCoverage on the touched lines was measured with
pytest tests/unit --cov=pyrit. 6 of 11 source files have every changed line exercised by the unit suite; in the remaining 5 files some of the swapped lines are not directly hit (cleanup branches in scorers/serializers and the HF cache_dir resolution). Each of those is a literal 1-for-1 stdlib equivalence, so behavior is unchanged.No documentation changes required.
Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com