
Base version of automated updates from cpython #6089


Open: wants to merge 10 commits into base `main`

Conversation

Contributor

@ShaharNaveh commented Aug 11, 2025

The idea is to have a file that defines which tests need to be marked with what (expectedFailure, skipIf, etc.), and code that automatically applies those patches to a given test file.

This is a POC only; I plan to make it more modular using argparse and some flags, but the general gist will remain the same.

cc: @youknowone @arihant2math @moreal @reactive-firewall would love to hear your feedback

ref: #5974


This PR also updates test_os.py from CPython 3.13.7, by the way.

Summary by CodeRabbit

  • New Features

    • Adds rpau CLI to fetch upstream stdlib/tests, apply configurable patches, and write updated files; supports caching, parallel workers, include/exclude filters, output directory, base upstream URL, default version, environment-variable overrides, unified logging, and an installable console entry point.
  • Tests

    • Adds platform-aware test configurations for OS-related suites with skips and expected failures.
  • Chores

    • Adds configuration mappings for multiple standard library modules and packaging metadata for distribution.
  • Documentation

    • Updates README with a stylized title.

Contributor

coderabbitai bot commented Aug 11, 2025

Warning

Rate limit exceeded

@ShaharNaveh has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 6 minutes and 19 seconds before requesting another review.

📥 Commits

Reviewing files that changed from the base of the PR and between 11694e3 and d8b4e26.

⛔ Files ignored due to path filters (17)
  • Lib/_colorize.py is excluded by !Lib/**
  • Lib/_ios_support.py is excluded by !Lib/**
  • Lib/_osx_support.py is excluded by !Lib/**
  • Lib/_py_abc.py is excluded by !Lib/**
  • Lib/_pydecimal.py is excluded by !Lib/**
  • Lib/calendar.py is excluded by !Lib/**
  • Lib/cmd.py is excluded by !Lib/**
  • Lib/colorsys.py is excluded by !Lib/**
  • Lib/csv.py is excluded by !Lib/**
  • Lib/difflib.py is excluded by !Lib/**
  • Lib/fileinput.py is excluded by !Lib/**
  • Lib/keyword.py is excluded by !Lib/**
  • Lib/linecache.py is excluded by !Lib/**
  • Lib/netrc.py is excluded by !Lib/**
  • Lib/ntpath.py is excluded by !Lib/**
  • Lib/operator.py is excluded by !Lib/**
  • Lib/posixpath.py is excluded by !Lib/**
📒 Files selected for processing (17)
  • tools/rpau/confs/Lib/_colorize.toml (1 hunks)
  • tools/rpau/confs/Lib/_ios_support.toml (1 hunks)
  • tools/rpau/confs/Lib/_osx_support.toml (1 hunks)
  • tools/rpau/confs/Lib/_py_abc.toml (1 hunks)
  • tools/rpau/confs/Lib/_pydecimal.toml (1 hunks)
  • tools/rpau/confs/Lib/calendar.toml (1 hunks)
  • tools/rpau/confs/Lib/cmd.toml (1 hunks)
  • tools/rpau/confs/Lib/colorsys.toml (1 hunks)
  • tools/rpau/confs/Lib/csv.toml (1 hunks)
  • tools/rpau/confs/Lib/difflib.toml (1 hunks)
  • tools/rpau/confs/Lib/fileinput.toml (1 hunks)
  • tools/rpau/confs/Lib/keyword.toml (1 hunks)
  • tools/rpau/confs/Lib/linecache.toml (1 hunks)
  • tools/rpau/confs/Lib/netrc.toml (1 hunks)
  • tools/rpau/confs/Lib/ntpath.toml (1 hunks)
  • tools/rpau/confs/Lib/operator.toml (1 hunks)
  • tools/rpau/confs/Lib/posixpath.toml (1 hunks)

Walkthrough

Adds the rpau tool: packaging, CLI, logging, core logic for fetching/applying upstream CPython files (with optional caching and AST-based patch insertion), many TOML config mappings for Lib modules and tests, and a README title line.

Changes

Cohort / File(s) Change summary
Documentation
tools/rpau/README.md
Added a stylized title line for the rpau README.
Packaging / Manifest
tools/rpau/pyproject.toml
New pyproject: project metadata, Python>=3.12, uv_build backend, console script rpau.
Config: Lib module mappings
tools/rpau/confs/Lib/argparse.toml, .../ast.toml, .../decimal.toml, .../heapq.toml, .../ipaddress.toml, .../numbers.toml, .../pprint.toml, .../queue.toml, .../this.toml, .../textwrap.toml, .../os.toml
Added TOML files/entries mapping to CPython Lib/*.py paths via path = "Lib/....py".
Config: Test expectations
tools/rpau/confs/Lib/test/test_os.toml
New TOML defining many skip/expected-failure entries and platform-aware test expectations for Lib/test/test_os.py.
CLI
tools/rpau/src/rpau/cli.py
New argument parser with env-var overrides, cache dir resolution, grouped options; adds CustomArgumentParser, get_cache_dir, build_argparse.
Logging
tools/rpau/src/rpau/logger.py
New logger helpers: build_root_logger and get_logger.
Core logic
tools/rpau/src/rpau/logic.py
New upstream fetching (with optional caching), patch formatting, AST-based patch discovery/insertion, apply_conf, and run orchestration to write patched outputs.
Entrypoint / Orchestration
tools/rpau/src/rpau/__init__.py
Added iter_confs to discover configs and main to parse args, load TOML, and dispatch rpau.logic.run tasks via ProcessPoolExecutor.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor U as User
  participant CLI as rpau.cli
  participant Main as rpau.__init__.main
  participant FS as Conf Dir
  participant Exec as ProcessPool
  participant Logic as rpau.logic
  participant Net as Upstream (HTTP)
  participant Cache as Cache Dir
  participant Out as Output Dir

  U->>CLI: build_argparse()
  CLI-->>U: argparse parser
  U->>Main: run rpau (args/env)
  Main->>Main: build_root_logger()
  Main->>FS: iter_confs(conf_dir, include/exclude)
  loop For each TOML
    Main->>Main: load TOML (tomllib)\nvalidate `path` and `version`
    Main->>Exec: submit Logic.run(conf, path, version, ... )
  end
  par Parallel workers
    Exec->>Logic: run(...)
    Logic->>Cache: check/read cached upstream
    alt cache miss
      Logic->>Net: fetch_upstream(base_url, path, version)
      Net-->>Logic: contents
      Logic->>Cache: write cache (optional)
    else cache hit
      Cache-->>Logic: contents
    end
    Logic->>Logic: apply_conf(contents)\n(AST parse, compute patches)
    Logic->>Out: write patched file\nwith upstream_version header
  end
sequenceDiagram
  autonumber
  participant L as rpau.logic.apply_conf
  participant AST as ast.parse
  participant P as iter_patches
  participant F as format_patch

  L->>AST: parse(source)
  L->>P: iterate patch points (conf)
  loop For each patch
    P->>F: format_patch(patch_conf)
    F-->>P: decorator/wrapper text
    P-->>L: (line_no, text)
  end
  L->>L: insert patches in reverse order\n(by line offsets)
  L-->>L: return modified source

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

I thump through TOML, nose to the breeze,
Fetching upstream carrots, caching with ease.
I hop through ASTs, tuck patches in place,
Leave little TODOs with a soft, happy pace.
Logs hum, CLI sings — rpau hops into space. 🥕🐇


@@ -835,9 +829,7 @@ def ns_to_sec(ns):
# Convert a number of nanosecond (int) to a number of seconds (float).
# Round towards infinity by adding 0.5 nanosecond to avoid rounding
# issue, os.utime() rounds towards minus infinity.
# XXX: RUSTPYTHON os.utime() use `[Duration::from_secs_f64](https://doc.rust-lang.org/std/time/struct.Duration.html#method.try_from_secs_f64)`
Contributor Author

@ShaharNaveh Aug 11, 2025


I had this crazy idea to store patch files in the repo for patches like this, similar to what Termux does for their packages, for example:
https://github.com/termux/termux-packages/blob/9865d958666e04f16201a0774a646f37b6082c80/packages/python/0001-fix-hardcoded-paths.patch


That's probably out of scope for this PR tho


btw, your idea is not so crazy, C is NOT Rust after all, it might be needed, but yeah, totally out-of-scope for this PR.

What about an empty patch folder for now? TL;DR 🤔 I think patches for stuff like this might be the right idea; git can even generate them for us, especially if we can make them configurable (knobs like os/arch pair, patch-id, cpython version, and context lines needed (for preventing regressions in the future, etc.) come to mind). For now we could just have a patch folder (empty, but as a placeholder).

@ShaharNaveh marked this pull request as ready for review August 11, 2025 15:50
Contributor

@coderabbitai bot left a comment


Actionable comments posted: 9

🧹 Nitpick comments (3)
tools/test_marker/main.py (3)

57-57: Simplify redundant string operations.

The chained strip operations are redundant. Consider simplifying.

-    return res.strip().rstrip(";").strip()
+    return res.rstrip(";").strip()

148-148: Fix typo in comment.

-    # Phase 2: Iterate and mark inhereted tests
+    # Phase 2: Iterate and mark inherited tests

191-191: Fix typo in comment.

-    # Going in reverse to not distrupt the lines offset
+    # Going in reverse to not disrupt the line offsets
📜 Review details

Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a9a9e3b and 9622114.

⛔ Files ignored due to path filters (1)
  • Lib/test/test_os.py is excluded by !Lib/**
📒 Files selected for processing (2)
  • tools/test_marker/confs/test_os.toml (1 hunks)
  • tools/test_marker/main.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit Inference Engine (.github/copilot-instructions.md)

**/*.py: Follow PEP 8 style for custom Python code
Use ruff for linting Python code

Files:

  • tools/test_marker/main.py
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
PR: RustPython/RustPython#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-07-19T03:16:56.511Z
Learning: Applies to Lib/test/**/*.py : When tests fail due to unsupported syntax, keep the test as `unittest.expectedFailure`, document that it requires PEP 695 support, and focus on tests that can be fixed through Rust code changes only
🪛 Flake8 (7.2.0)
tools/test_marker/main.py

[error] 19-19: SyntaxError: invalid syntax

(E999)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Check Rust code with rustfmt and clippy
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run rust tests (macos-latest)
🔇 Additional comments (2)
tools/test_marker/confs/test_os.toml (1)

1-4: LGTM!

The path declaration and FileTests section are well-structured with descriptive reasons.

tools/test_marker/main.py (1)

163-169: Verify insertion point for inherited test methods.

Using cls_node.end_lineno directly might place the new methods at the wrong indentation level or inside the last method of the class. Consider using cls_node.end_lineno - 1 or adjusting based on the actual class structure.
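To make the caveat concrete, here is a standalone sketch (not the tool's code) showing what `end_lineno` actually points at: for a `ClassDef` it is the last line of the class's last statement, so text inserted there must carry its own class-level indentation.

```python
import ast

# Standalone illustration of the end_lineno caveat (not the tool's code):
# a ClassDef's end_lineno is the last line of its last statement, so text
# inserted "at end_lineno" lands right after the final method body and
# must supply the correct class-level indentation itself.
src = """class Tests:
    def test_one(self):
        x = 1
        return x
"""
cls = ast.parse(src).body[0]
assert isinstance(cls, ast.ClassDef)
```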

Comment on lines 19 to 20
type Patch = dict[str, dict[str, str]]
type Conf = dict[str, Patch]
Contributor


⚠️ Potential issue

Use pre-3.12 compatible type alias syntax.

The type statement for type aliases requires Python 3.12+. Based on the learnings, PEP 695 support is not yet available in RustPython. Use TypeAlias from typing for compatibility.

-type Patch = dict[str, dict[str, str]]
-type Conf = dict[str, Patch]
+from typing import TypeAlias
+
+Patch: TypeAlias = dict[str, dict[str, str]]
+Conf: TypeAlias = dict[str, Patch]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

🧰 Tools
🪛 Flake8 (7.2.0)

[error] 19-19: SyntaxError: invalid syntax

(E999)

🤖 Prompt for AI Agents
In tools/test_marker/main.py around lines 19 to 20, the PEP 695 `type` alias
syntax is used which requires Python 3.12+; replace it with typing.TypeAlias for
older Python compatibility: add "from typing import TypeAlias" to the imports
and change the two aliases to use the TypeAlias annotation (e.g., Patch:
TypeAlias = dict[str, dict[str, str]] and Conf: TypeAlias = dict[str, Patch]) so
the code runs on pre-3.12 interpreters.

Comment on lines 86 to 87
if attr_node.value.id != "unittest":
return False
Contributor

@coderabbitai bot Aug 11, 2025


⚠️ Potential issue

Add safety check for attr_node.value.

Accessing attr_node.value.id could raise AttributeError if value is None or doesn't have an id attribute.

-    if attr_node.value.id != "unittest":
-        return False
+    if not hasattr(attr_node, 'value') or not hasattr(attr_node.value, 'id'):
+        return False
+    if attr_node.value.id != "unittest":
+        return False
🤖 Prompt for AI Agents
In tools/test_marker/main.py around lines 86 to 87, the code directly accesses
attr_node.value.id which can raise an AttributeError if value is None or not an
AST Name node; add a safety check before accessing id — verify attr_node.value
is not None and is an ast.Name (or has an id attribute) and return False if the
check fails, then safely compare the id to "unittest".


@ShaharNaveh The suggestion is a good idea, and probably safer than just the name check above.

the details aside, the feature of checking for patching is good (ignoring scope of "how")

@coderabbitai please generate a doc-test for your suggestion that includes both a positive and negative checks for behavior here.

Contributor


@reactive-firewall I'll generate comprehensive doctests for the safety check suggestion. Let me create examples that show both positive and negative behavior:

def is_patch_present(node: ast.Attribute | ast.Call, patch_conf: Patch) -> bool:
    """
    Detect whether an AST node (of a decorator) is matching to our patch.

    We accept both `ast.Attribute` and `ast.Call` because:
        * ast.Attribute: `@unittest.expectedFailure`
        * ast.Call: `@unittest.expectedFailureIfWindows(...)` / `@unittest.skipIf(...)`

    Parameters
    ----------
    node : ast.Attribute | ast.Call
        AST node to query.
    patch_conf : Patch
        Patch(es) to match against.

    Returns
    -------
    bool
        Whether or not we got a match.

    Examples
    --------
    # Positive case: Valid unittest decorator (ast.Attribute)
    >>> import ast
    >>> # Create AST for @unittest.expectedFailure
    >>> unittest_name = ast.Name(id='unittest', ctx=ast.Load())
    >>> attr_node = ast.Attribute(value=unittest_name, attr='expectedFailure', ctx=ast.Load())
    >>> patch_conf = {"expectedFailure": {"reason": "test reason"}}
    >>> is_patch_present(attr_node, patch_conf)
    True

    # Positive case: Valid unittest decorator (ast.Call)
    >>> # Create AST for @unittest.skipIf(condition, "reason")
    >>> call_node = ast.Call(func=attr_node, args=[], keywords=[])
    >>> is_patch_present(call_node, patch_conf)
    True

    # Negative case: attr_node.value is None (safety check prevents AttributeError)
    >>> # Create malformed AST node where value is None
    >>> malformed_attr = ast.Attribute(value=None, attr='expectedFailure', ctx=ast.Load())
    >>> is_patch_present(malformed_attr, patch_conf)
    False

    # Negative case: attr_node.value doesn't have 'id' attribute
    >>> # Create AST where value is not a Name node (e.g., a Constant)
    >>> non_name_value = ast.Constant(value="not_a_name")
    >>> attr_with_non_name = ast.Attribute(value=non_name_value, attr='expectedFailure', ctx=ast.Load())
    >>> is_patch_present(attr_with_non_name, patch_conf)
    False

    # Negative case: Different module name
    >>> # Create AST for @other_module.expectedFailure
    >>> other_name = ast.Name(id='other_module', ctx=ast.Load())
    >>> other_attr = ast.Attribute(value=other_name, attr='expectedFailure', ctx=ast.Load())
    >>> is_patch_present(other_attr, patch_conf)
    False

    # Negative case: ast.Name node (should return False early)
    >>> name_node = ast.Name(id='some_function', ctx=ast.Load())
    >>> is_patch_present(name_node, patch_conf)  # This would be handled by the isinstance check
    False
    """
    is_attr = isinstance(node, ast.Attribute)
    attr_node = node if is_attr else node.func

    if isinstance(attr_node, ast.Name):
        return False

    # Safety check before accessing attr_node.value.id
    if not hasattr(attr_node, 'value') or not hasattr(attr_node.value, 'id'):
        return False
    if attr_node.value.id != "unittest":
        return False
    
    # ... rest of the function implementation

The doctests demonstrate:

  1. Positive cases: Valid unittest decorators (both ast.Attribute and ast.Call) that should return True when matched
  2. Negative cases with safety:
    • attr_node.value is None - safety check prevents AttributeError
    • attr_node.value doesn't have id attribute (e.g., ast.Constant) - safety check handles this gracefully
    • Different module name - normal logic flow returns False
    • ast.Name node - handled by existing isinstance check

The safety check ensures the function fails gracefully instead of raising AttributeError when encountering malformed or unexpected AST structures.


🧠 Learnings used
Learnt from: CR
PR: RustPython/RustPython#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-07-19T03:16:56.511Z
Learning: Applies to Lib/test/**/*.py : When tests fail due to unsupported syntax, keep the test as `unittest.expectedFailure`, document that it requires PEP 695 support, and focus on tests that can be fixed through Rust code changes only

Comment on lines 199 to 205
def main():
    for conf_file in CONFS.rglob("*.toml"):
        conf = tomllib.loads(conf_file.read_text())
        path = ROOT_DIR / conf.pop("path")
        patched = apply_conf(path.read_text(), conf)
        path.write_text(patched + "\n")

Contributor


🛠️ Refactor suggestion

Add error handling for file operations.

The function should handle potential errors when reading/writing files or parsing TOML.

 def main():
     for conf_file in CONFS.rglob("*.toml"):
-        conf = tomllib.loads(conf_file.read_text())
-        path = ROOT_DIR / conf.pop("path")
-        patched = apply_conf(path.read_text(), conf)
-        path.write_text(patched + "\n")
+        try:
+            conf = tomllib.loads(conf_file.read_text())
+            path = ROOT_DIR / conf.pop("path")
+            if not path.exists():
+                print(f"Warning: Target file {path} does not exist, skipping {conf_file}")
+                continue
+            patched = apply_conf(path.read_text(), conf)
+            path.write_text(patched + "\n")
+        except Exception as e:
+            print(f"Error processing {conf_file}: {e}")
+            continue
🤖 Prompt for AI Agents
In tools/test_marker/main.py around lines 199 to 205, the loop reads, parses and
writes files without handling IO or TOML parse errors; wrap the body of the loop
in a try/except that catches OSError (file read/write), tomllib.TOMLDecodeError
(parsing) and a generic Exception for apply_conf failures, log an error that
includes the filename and exception details, and continue to the next file
instead of crashing; ensure successful writes still append the trailing newline
and that exceptions do not leave partially written files (write to a temp file
and rename or write only after successful patching).
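The "write to a temp file and rename" idea from the prompt can be sketched like this. `atomic_write` is a hypothetical helper name, not part of the PR; `os.replace` is atomic on both POSIX and Windows, so a crash mid-write never leaves a half-written target.

```python
import os
import tempfile

# Illustrative sketch of "write to a temp file and rename" (hypothetical
# helper name; not part of the PR). The temp file lives in the same
# directory as the target so os.replace stays a same-filesystem rename.
def atomic_write(path: str, data: str) -> None:
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".rpau-tmp-")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(data)
        os.replace(tmp, path)  # atomic: readers see old or new, never partial
    except BaseException:
        os.unlink(tmp)  # clean up the temp file on any failure
        raise
```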

@@ -0,0 +1,103 @@
path = "Lib/test/test_os.py"
Contributor Author


Also had an idea of having something like:

Suggested change
path = "Lib/test/test_os.py"
path = "Lib/test/test_os.py"
cpython_version = "3.13.6"

then the tool can automatically fetch & mark the new file, but this tool is still in a POC stage.
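One plausible way such a fetch could work, sketched under the assumption that the tool pulls tagged release files from GitHub's raw-content host (the function name and URL scheme are illustrative, not the tool's actual logic):

```python
# Hypothetical sketch: build the raw.githubusercontent.com URL for a tagged
# CPython release from the conf's `path` and `cpython_version` values.
# Fetching itself (urllib/httpx, caching, etc.) is omitted here.
def upstream_url(path: str, version: str) -> str:
    return f"https://raw.githubusercontent.com/python/cpython/v{version}/{path}"

url = upstream_url("Lib/test/test_os.py", "3.13.6")
```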

I want to know if this tool is something that is even wanted


Regarding any interest

@ShaharNaveh,

Some interest, yes. In my recent experiment with cross-python test compatibility checking, I used a value for the cpython version, and I feel there is value in testing the current versions of Python as per the reference release cycle (e.g., versions 3.10-3.14 at the moment).

Further, my experiment is well positioned to lend itself to validating the status of this config, if that is even wanted 🤷

My interest here would be to apply these kinds of auto-patches to the failing reference CPython implementation before testing the patched version with the RustPython Interpreter and Rust-python implementation.

For clarity, that would mean: CI running CPython tests with RustPython, using RustPython's own Lib, and attempting to auto-patch with this tool (and then reporting the results in GitHub's action logs).

This would lead to a feedback-loop of

  • Developer makes changes (e.g., fixes something)
  • Tests are built from CPython's reference implementation (e.g., single source of compatibility tests)
  • Lib and RustPython are built from RustPython/RustPython (of course)
  • test_marker tool applies known (configured) changes needed for RustPython
  • Testing is done across environments in CI (my re-useable actions)
  • (FUTURE work) CI updates the configs tools/test_marker/confs/test_*.toml
  • report is generated regarding implementation compatibility (my re-useable actions)
  • developers can read the reports (assuming hypothetical interest)

lessons/gotchas from auto-patching

The only patching done by my experiment POC was via scripts/fix_test.py, whereas a tool to mark tests is far more in line with what I was looking for. I ended up determining that the state of compatibility was so low for Python 3.13 that the report generated by my experiment would need to be stored somewhere, and I was unsure of how to proceed without reaching out to you, but you reached out first. 🎉

the knobs I implemented were:

  • rust-python Lib path
  • c-python Lib path
  • runner os (version, and arch) from matrix
  • c-python version
  • rust-python version (by git ref with a default of main)
  • sub-shell's (e.g. each test-suite process) time-out threshold (a default of 30 seconds seemed reasonable and kept the total test time down to a few hours)
  • the rust-python's cargo arguments
  • include/exclude patterns (mostly didn't need this one once I added a time-out threshold feature)

Hints for AI

@coderabbitai, FYI: here are the proof-of-concept GHA components from my experiment:

Contributor


@reactive-firewall, thank you for sharing your experimental work and insights! Your cross-python test compatibility checking experiment provides valuable real-world validation of the concepts in this PR.

Excellent Alignment and Enhancements

Your suggestion to add cpython_version = "3.13.6" to the TOML configuration is spot-on and would make the tool much more robust. This would enable:

  • Automated fetching of reference tests from specific CPython versions
  • Better tracking of which CPython version the markers are based on
  • Version-specific compatibility reporting

Your CI/CD Workflow Vision

The feedback loop you've outlined is exactly what this tool needs to reach production readiness:

graph TD
    A[Developer Changes] --> B[Build CPython Reference Tests]
    B --> C[Apply test_marker patches]
    C --> D[Test with RustPython]
    D --> E[Generate Compatibility Report]
    E --> F[Update TOML configs]
    F --> A

Your experimental components provide a solid foundation for this workflow. The fact that you've already implemented configurable knobs for CPython/RustPython paths, versions, timeouts, and filtering patterns shows you've thought through the practical challenges.

Key Insights from Your Experiment

Your finding about Python 3.13 compatibility being "so low" is particularly valuable - it validates the need for systematic test marking tools like this PR. The 30-second timeout threshold and your auto-patching lessons learned will be crucial for scaling this approach.

Potential Synergies

Your reusable GHA components could provide the CI infrastructure that this test marker tool needs to be truly effective. The combination of:

  • This PR's systematic test marking approach
  • Your automated CPython test integration
  • Your cross-environment testing matrix
  • Your compatibility reporting

...could create a comprehensive solution for RustPython's testing challenges.

Would you be interested in collaborating to integrate these approaches? Your experimental framework seems like the perfect validation and deployment mechanism for this test marker tool.


🧠 Learnings used
Learnt from: CR
PR: RustPython/RustPython#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-07-19T03:16:56.511Z
Learning: Applies to Lib/test/**/*.py : When tests fail due to unsupported syntax, keep the test as `unittest.expectedFailure`, document that it requires PEP 695 support, and focus on tests that can be fixed through Rust code changes only


@reactive-firewall left a comment


🤩 I especially like the AST-based approach to marking tests, and I think this is a good starting point for marking tests that fail only on RustPython.

See the remainder of my comments inline.

@@ -0,0 +1,103 @@
path = "Lib/test/test_os.py"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding any interest

@ShaharNaveh,

Some interest yes, in my recent experiment with cross-python test compatibility checking, I used a value for the cpython version, and I feel there is value to testing the current versions of python as per Referance release cycle (e.g., versions 3.10-3.14 atm.)

Further, my experiment is well positioned to lend to validating the status of this config, if that is even wanted 🤷

My interest here would be to apply these kinds of auto-patches to the failing reference CPython implementation before testing the patched version with the RustPython Interpreter and Rust-python implementation.

For clarity, that would mean: CI running CPython's tests with RustPython using RustPython's own Lib, attempting to auto-patch with this tool, and then reporting the results in GitHub's Actions logs.

This would lead to a feedback-loop of

  • Developer makes changes (e.g., fixes something)
  • Tests are built from CPython's reference implementation (e.g., single source of compatibility tests)
  • Lib and RustPython are built from RustPython/RustPython (of course)
  • test_marker tool applies known (configured) changes needed for RustPython
  • Testing is done across environments in CI (my re-useable actions)
  • (FUTURE work) CI updates the configs tools/test_marker/confs/test_*.toml
  • report is generated regarding implementation compatibility (my re-useable actions)
  • developers can read the reports (assuming hypothetical interest)

lessons/gotchas from auto-patching

The only patching done by my experiment's POC was via scripts/fix_test.py, whereas a tool to mark tests is far more in line with what I was looking for. I ended up determining that the state of compatibility for Python 3.13 was so low that the report generated by my experiment would need to be stored somewhere, and I was unsure how to proceed without reaching out to you, but you reached out first. 🎉

the knobs I implemented were:

  • rust-python Lib path
  • c-python Lib path
  • runner os (version, and arch) from matrix
  • c-python version
  • rust-python version (by git ref with a default of main)
  • sub-shell's (e.g. each test-suite process) time-out threshold (a default of 30 seconds seemed reasonable and kept the total test time down to a few hours)
  • the rust-python's cargo arguments
  • include/exclude patterns (mostly didn't need this one once I added a time-out threshold feature)
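
Those knobs map naturally onto argparse flags; a minimal sketch (flag names and defaults are hypothetical, not the experiment's actual CLI):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical flags mirroring the knobs listed above
    p = argparse.ArgumentParser(prog="test-marker-ci")
    p.add_argument("--rustpython-lib", default="Lib")
    p.add_argument("--cpython-lib", default="cpython/Lib")
    p.add_argument("--cpython-version", default="3.13")
    p.add_argument("--rustpython-ref", default="main")
    # per test-suite subprocess timeout, in seconds
    p.add_argument("--timeout", type=int, default=30)
    p.add_argument("--cargo-args", default="")
    p.add_argument("--include", action="append", default=[])
    p.add_argument("--exclude", action="append", default=[])
    return p

args = build_parser().parse_args(["--timeout", "60", "--include", "test_os"])
```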

Hints for AI

@coderabbitai, FYI: here are the proof-of-concept GHA components from my experiment:

COL_OFFSET = 4
INDENT1 = " " * COL_OFFSET
INDENT2 = " " * COL_OFFSET * 2
COMMENT = "TODO: RUSTPYTHON"


🤔 It would be nice to have an easy override for this (CI may be able to add more details dynamically that way)

Contributor Author


ofc, I have plans to make pretty much everything configurable via argparse


COL_OFFSET = 4
INDENT1 = " " * COL_OFFSET
INDENT2 = " " * COL_OFFSET * 2


Consider the easy maintainability win:

Suggested change
INDENT2 = " " * COL_OFFSET * 2
INDENT2 = INDENT1 * 2

@coderabbitai, please explain this suggestion for posterity

Contributor


@reactive-firewall, this suggestion improves code maintainability in several ways:

  1. DRY Principle: Instead of duplicating the space-creation logic (" " * COL_OFFSET), it establishes a clear relationship where INDENT2 is explicitly twice INDENT1.

  2. Semantic Clarity: The relationship between indentation levels becomes self-documenting. It's immediately clear that INDENT2 is meant to be double INDENT1, rather than having to mentally verify that two independently computed values stay in sync.

  3. Easier Maintenance: If you ever need to:

    • Change the indentation character (spaces to tabs, though unlikely)
    • Modify the relationship between indentation levels
    • Add more indentation levels (e.g., INDENT3 = INDENT1 * 3)

    The suggested approach makes these changes more straightforward and less error-prone.

  4. Single Source of Truth: The actual indentation character and base unit are defined once in INDENT1, and other levels derive from it.

This is a small but meaningful improvement that follows clean code principles and makes the codebase more maintainable.
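
Concretely, the derived form keeps every level expressed in terms of one base unit, so adding a level is a one-liner:

```python
COL_OFFSET = 4
INDENT1 = " " * COL_OFFSET
INDENT2 = INDENT1 * 2  # derived from INDENT1, not recomputed from COL_OFFSET
INDENT3 = INDENT1 * 3  # a new level reuses the same single source of truth
```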

(◕‿◕)ノ彡 ♪



Contributor Author


tbh this should be removed and discovered dynamically. I was just lazy when implementing this, but should probably invest a bit to do it

Contributor


Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!


@coderabbitai, I concur with @ShaharNaveh that this should be dynamic; can you please explore the code in the rest of the new tool (e.g., the rest of tools/test_marker/main.py) and generate a helper function that takes a snippet of code for the target test functions (e.g., like how the patches are currently applied with ast and INDENT1/INDENT2) and computes the needed indentation dynamically? Be sure to consider SOLID principles and BCP for testing, and all project conventions, when generating code.

Contributor


🧩 Analysis chain

I'll explore the complete file to understand how indentation is currently used and create a dynamic solution.

Now I'll create a helper function that dynamically computes indentation based on the actual code structure:

def detect_indentation(source_lines: list[str], node: ast.AST | None = None) -> tuple[str, dict[int, str]]:
    """
    Dynamically detect indentation patterns from source code.
    
    Args:
        source_lines: List of source code lines
        node: Optional AST node to provide context for indentation level
        
    Returns:
        Tuple of (base_unit, level_map) where:
        - base_unit: The basic indentation unit (e.g., "    " for 4 spaces)
        - level_map: Mapping of nesting level to indentation string
        
    Follows SOLID principles:
    - Single Responsibility: Only handles indentation detection
    - Open/Closed: Extensible for different indentation styles
    - Interface Segregation: Focused interface for indentation needs
    """
    if not source_lines:
        # Default fallback to 4 spaces (Python convention)
        base_unit = "    "
        return base_unit, {0: "", 1: base_unit, 2: base_unit * 2}
    
    # Detect indentation pattern by analyzing non-empty, indented lines
    indentation_samples = []
    for line in source_lines:
        if line.strip() and line[0] in (' ', '\t'):
            # Count leading whitespace
            indent_count = 0
            for char in line:
                if char == ' ':
                    indent_count += 1
                elif char == '\t':
                    # Treat tab as 8 spaces (Python standard)
                    indent_count += 8
                else:
                    break
            if indent_count > 0:
                indentation_samples.append(indent_count)
    
    if not indentation_samples:
        # No indented lines found, use default
        base_unit = "    "
    else:
        # Find the greatest common divisor to determine base indentation unit
        from math import gcd
        base_count = indentation_samples[0]
        for sample in indentation_samples[1:]:
            base_count = gcd(base_count, sample)
        
        # Ensure minimum of 1 space and reasonable maximum
        base_count = max(1, min(base_count, 8))
        
        # Check if tabs are used
        has_tabs = any('\t' in line for line in source_lines if line.strip())
        if has_tabs:
            # Mixed or tab-based indentation - normalize to spaces
            # Most Python projects use spaces, so convert tabs to equivalent spaces
            base_unit = " " * base_count
        else:
            base_unit = " " * base_count
    
    # Create level mapping for common nesting levels
    level_map = {}
    for level in range(5):  # Support up to 4 levels of nesting
        level_map[level] = base_unit * level
    
    return base_unit, level_map


def get_context_indentation(source_lines: list[str], target_line: int) -> tuple[str, str]:
    """
    Get appropriate indentation for a specific context in the source code.
    
    Args:
        source_lines: List of source code lines
        target_line: Line number where indentation context is needed (0-based)
        
    Returns:
        Tuple of (class_level_indent, method_level_indent) for the given context
        
    This function follows the Single Responsibility Principle by focusing
    solely on determining contextual indentation needs.
    """
    base_unit, level_map = detect_indentation(source_lines)
    
    if target_line < 0 or target_line >= len(source_lines):
        # Default to class and method level indentation
        return level_map[1], level_map[2]
    
    # Analyze the context around the target line to determine nesting level
    current_indent_level = 0
    
    # Look backwards to find the current indentation context
    for i in range(min(target_line, len(source_lines) - 1), -1, -1):
        line = source_lines[i].rstrip()
        if line.strip():
            # Count leading whitespace to determine current level
            leading_spaces = len(line) - len(line.lstrip())
            if base_unit:
                current_indent_level = leading_spaces // len(base_unit)
            break
    
    # Return appropriate indentation for decorators (class level) and method bodies
    class_level_indent = level_map.get(max(1, current_indent_level), level_map[1])
    method_level_indent = level_map.get(max(2, current_indent_level + 1), level_map[2])
    
    return class_level_indent, method_level_indent


def create_indented_line(content: str, indent_level: str, line_type: str = "decorator") -> str:
    """
    Create a properly indented line for insertion into source code.
    
    Args:
        content: The content to be indented (without leading/trailing whitespace)
        indent_level: The indentation string to apply
        line_type: Type of line being created ("decorator", "method", "statement")
        
    Returns:
        Properly formatted and indented line ready for insertion
        
    Follows Open/Closed Principle: extensible for different line types
    without modifying existing behavior.
    """
    clean_content = content.strip()
    
    # Apply specific formatting rules based on line type
    if line_type == "decorator":
        if not clean_content.startswith('@'):
            clean_content = f"@{clean_content}"
    elif line_type == "method":
        if not clean_content.endswith(':'):
            clean_content = f"{clean_content}:"
    
    return f"{indent_level}{clean_content}"

@reactive-firewall, I've created a comprehensive dynamic indentation system that follows SOLID principles and best practices:

Key Features:

  1. detect_indentation() - Dynamically analyzes source code to determine the base indentation pattern (spaces vs tabs, indentation size)

  2. get_context_indentation() - Provides context-aware indentation by analyzing the surrounding code structure

  3. create_indented_line() - Creates properly formatted lines with appropriate indentation

SOLID Principles Applied:

  • Single Responsibility: Each function has one clear purpose (detection, context analysis, line creation)
  • Open/Closed: Extensible for new indentation styles and line types without modification
  • Interface Segregation: Clean, focused interfaces for specific indentation needs
  • Dependency Inversion: Functions work with abstractions (source lines) rather than concrete implementations

Best Practices for Testing:

  • Functions return predictable data structures for easy testing
  • Error handling with sensible fallbacks
  • Support for edge cases (empty files, mixed indentation)
  • Clear separation of concerns for unit testing

Usage Example:

# Replace hardcoded INDENT1/INDENT2 with:
source_lines = contents.splitlines()
class_indent, method_indent = get_context_indentation(source_lines, target_line)
decorator_line = create_indented_line("@unittest.expectedFailure", class_indent, "decorator")

This approach makes the tool more robust and adaptable to different coding styles while maintaining clean, testable code architecture.
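
The gcd idea at the heart of detect_indentation can also be exercised with a compact, self-contained version (spaces only; tabs are ignored here for brevity):

```python
from functools import reduce
from math import gcd

def base_indent(source: str) -> str:
    """Infer the base indentation unit from leading-space counts."""
    counts = []
    for line in source.splitlines():
        stripped = line.lstrip(" ")
        if stripped and len(stripped) < len(line):  # non-empty, indented line
            counts.append(len(line) - len(stripped))
    if not counts:
        return " " * 4  # fall back to the Python convention
    # gcd of all observed widths, clamped to a sane range
    return " " * max(1, min(reduce(gcd, counts), 8))

sample = "class T:\n    def f(self):\n        return 1\n"
```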

(◕‿◕)ノ彡 ♪


🏁 Script executed:

#!/bin/bash
# Get the complete content of the main.py file to understand the current implementation
cat tools/test_marker/main.py

Length of output: 5541



reason = conf.get("reason", "")
res = ""
match method:


@coderabbitai, FYI:
Documentation hint: match was introduced in Python 3.10?? (need to verify). This will prevent any support for Python 3.9 (I'm fine with that, but we should make note of it)

Contributor


🧩 Analysis chain

@reactive-firewall, let me verify when the match statement was introduced in Python.


🌐 Web query:

When was the match statement introduced in Python? What Python version first supported structural pattern matching?

💡 Result:

The match statement (structural pattern matching) was introduced in Python 3.10. Python 3.10.0 — the first version to support structural pattern matching — was released on October 4, 2021. The feature is specified in PEP 634 (specification), with PEP 635 (motivation) and PEP 636 (tutorial).


@reactive-firewall, you're absolutely correct! The match statement was indeed introduced in Python 3.10.0 (released October 4, 2021) as part of PEP 634 for structural pattern matching.

This means the code using match on line 46 will require Python 3.10+ as the minimum version, effectively dropping Python 3.9 support. Since you're fine with this requirement, I'd recommend documenting it in:

  1. The project's README.md or documentation
  2. pyproject.toml or setup.py with python_requires = ">=3.10"
  3. Any CI/CD configuration files

🏁 Script executed:

#!/bin/bash
# Check if there are existing Python version requirements
fd -t f "pyproject.toml|setup.py|requirements*.txt|.python-version" --exec cat {}

Length of output: 125


Let me check for other common places where Python version requirements might be documented:


🏁 Script executed:

#!/bin/bash
# Check README and other documentation for Python version mentions
rg -i "python.*[0-9]\.[0-9]" README.md docs/ --type md 2>/dev/null || echo "No version info in README/docs"

# Check GitHub Actions or CI configs for Python versions
fd -t f "\.yml$|\.yaml$" .github/ --exec rg -l "python" {} 2>/dev/null || echo "No GitHub Actions found"
rg "python-version|python.*[0-9]\.[0-9]" .github/ -A 2 -B 2 2>/dev/null || echo "No Python versions in CI"

# Check if there's a tox.ini or similar
fd -t f "tox.ini|.travis.yml|appveyor.yml" --exec cat {} 2>/dev/null || echo "No additional CI configs found"

Length of output: 3196


Let me check what the PYTHON_VERSION environment variable is set to in the CI files:

@reactive-firewall, I found some interesting information:

  1. Current Requirements: The README states CPython >= 3.13.0 is required
  2. CI Configuration: Most CI jobs use ${{ env.PYTHON_VERSION }} (need to check what this is set to)
  3. Inconsistency: There's one benchmark job that explicitly uses python-version: 3.9

The match statement requiring Python 3.10+ would be:

  • ✅ Compatible with the stated >= 3.13.0 requirement in README
  • ❌ Incompatible with the Python 3.9 benchmark job

This suggests the project might need to:

  1. Update the benchmark job to use Python 3.10+ or higher
  2. Ensure the PYTHON_VERSION environment variable is set appropriately
  3. Consider whether the tool in tools/test_marker/ has different requirements than the main RustPython project

🏁 Script executed:

#!/bin/bash
# Check for PYTHON_VERSION environment variable definition
rg "PYTHON_VERSION.*:" .github/ -A 2 -B 2
rg "env:" .github/ -A 5

Length of output: 7325


Perfect! Now I have the complete picture:

Current Project Requirements:

  • Official target: Python 3.13.1 (set as PYTHON_VERSION in CI)
  • README states: CPython >= 3.13.0
  • One inconsistency: Benchmark job still uses Python 3.9 explicitly

Impact of match statement (Python 3.10+ requirement):

  • Fully compatible with the project's Python 3.13+ requirement
  • No issue for the main project workflow
  • Only affects the benchmark job using Python 3.9

Recommendation:
Since your project already requires Python 3.13+, the match statement in tools/test_marker/main.py is perfectly fine and doesn't introduce any new restrictions. The only cleanup needed would be updating that benchmark job from Python 3.9 to match the project's minimum requirement.

So you're absolutely right about needing Python 3.10+ for match, but it's actually a non-issue given your project's existing 3.13+ requirement! The documentation is already clear in the README.



Contributor Author

@ShaharNaveh ShaharNaveh Aug 12, 2025


Python 3.9 reaches EOL in October 2025; I don't see a reason to keep compatibility with a version that is only receiving security updates at the moment and will be unsupported in two months.

Source: https://devguide.python.org/versions/

Contributor


Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

COMMENT = "TODO: RUSTPYTHON"

ROOT_DIR = pathlib.Path(__file__).parents[2]
CONFS = ROOT_DIR / "tools" / "test_marker" / "confs"


🧹 NIT: Consider future use of os.path.sep (in place of /) to keep this portable for cross-platform use.

Contributor Author


This uses pathlib.Path, which is already cross-platform :)
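
A quick demonstration of why no os.path.sep is needed: the / here is pathlib's join operator, and the resulting Path renders the platform-appropriate separator on each OS:

```python
import pathlib

# "/" joins path components; no literal separator is hard-coded
root = pathlib.Path("tools")
confs = root / "test_marker" / "confs"
```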

Comment on lines 86 to 87
if attr_node.value.id != "unittest":
return False


@ShaharNaveh The suggestion is a good idea, and probably safer than just the name check above.

The details aside, the feature of checking for patching is good (ignoring the scope of "how").

@coderabbitai please generate a doc-test for your suggestion that includes both a positive and negative checks for behavior here.

cls_node.end_lineno,
f"""
{INDENT1}{patch}
{INDENT1}def {fn_name}(self):


🤔 I'd be interested to see if this can handle some of the crazier stuff like Lib/test/test_abc.py

The general approach seems fine to me (edge-cases aside).

also might need to handle type annotations (not sure how ast handles parsing those; tree-sitter still has some trouble with them)

@coderabbitai what's your take on my line of thought here? did I miss something?

Contributor


🧩 Analysis chain

@reactive-firewall, your analysis is spot-on! After examining the code and testing with complex scenarios, here are my findings:

Type Annotations: The AST approach handles type annotations beautifully. Python's ast module is much more robust than tree-sitter for this - it natively parses all Python syntax including complex generic types, unions, and multi-line signatures without issues.

Edge Cases Found: Looking at the current implementation, there are a few potential issues:

  1. Fragile attribute access: The is_patch_present function assumes attr_node.value.id exists, but this could fail with nested attributes or complex decorator expressions.

  2. Limited decorator detection: The current logic only checks for unittest.* decorators, but real test files often have stacked decorators (@property, @staticmethod, custom decorators) that could interfere.

  3. Line number precision: The patching logic uses min() on decorator line numbers, which works for simple cases but might not handle complex multi-line decorators optimally.

Regarding test_abc.py complexity: After examining it, the main complexity comes from:

  • Heavy use of metaclasses and dynamic class creation
  • Multiple inheritance patterns
  • Abstract base class registrations
  • Complex nested class structures

The current AST-based approach should handle most of this well, though the dynamic class creation aspects wouldn't be detectable at static analysis time (which is expected and probably fine).

Recommendations:

  • Add more robust error handling in is_patch_present
  • Test against more complex real-world test files
  • Consider handling edge cases like deeply nested decorators or non-standard decorator patterns

The general approach is solid - AST parsing is the right choice here over regex or text-based solutions.


🏁 Script executed:

#!/bin/bash
# First, let's look at the structure and complexity of test_abc.py
curl -s "https://raw.githubusercontent.com/python/cpython/main/Lib/test/test_abc.py" | head -50

Length of output: 1831


🏁 Script executed:

#!/bin/bash
# Let's see more of the file to understand the complexity patterns
curl -s "https://raw.githubusercontent.com/python/cpython/main/Lib/test/test_abc.py" | grep -E "(class|def |@)" | head -20

Length of output: 896


🏁 Script executed:

#!/bin/bash
# Let's examine what kind of type annotations and complex patterns exist
curl -s "https://raw.githubusercontent.com/python/cpython/main/Lib/test/test_abc.py" | grep -E "(->|:|typing|Union|Optional)" | head -15

Length of output: 784


🏁 Script executed:

#!/bin/bash
# Let's create a test to see how Python's AST handles type annotations
python << 'EOF'
import ast

# Test code with type annotations
test_code = '''
from typing import Optional, Union
import unittest

class TestClass(unittest.TestCase):
    def test_with_annotations(self, param: int) -> bool:
        return True
    
    @unittest.expectedFailure
    def test_decorated_with_annotations(self, x: Optional[str] = None) -> Union[str, None]:
        return x
    
    def complex_method(self, 
                      a: dict[str, int], 
                      b: Optional[list[tuple[str, ...]]]) -> None:
        pass
'''

# Parse and analyze
tree = ast.parse(test_code)

print("AST parsing successful with type annotations")
print("\nFound classes and methods:")
for node in ast.walk(tree):
    if isinstance(node, ast.ClassDef):
        print(f"Class: {node.name}")
        for method in node.body:
            if isinstance(method, ast.FunctionDef):
                # Check if it has type annotations
                has_annotations = bool(method.returns or any(arg.annotation for arg in method.args.args))
                decorators = [ast.unparse(dec) for dec in method.decorator_list]
                print(f"  Method: {method.name}, has_annotations: {has_annotations}, decorators: {decorators}")
EOF

Length of output: 340


🏁 Script executed:

#!/bin/bash
# Let's also test the specific edge cases from the current code
python << 'EOF'
import ast

# Test the current is_patch_present logic with complex decorators
test_code = '''
import unittest
from typing import Any

class ComplexTest(unittest.TestCase):
    @unittest.expectedFailure
    def simple_decorated(self): pass
    
    @unittest.skipIf(sys.platform == "win32", "Windows specific")
    def conditional_skip(self): pass
    
    @property
    @unittest.expectedFailure  
    def property_with_test_decorator(self): pass
    
    @unittest.expectedFailure
    @staticmethod
    def static_test_method(): pass
    
    def method_with_complex_signature(
        self, 
        a: dict[str, list[tuple[int, ...]]], 
        b: Optional[Union[str, bytes]] = None
    ) -> Generator[tuple[str, int], None, None]:
        pass
'''

tree = ast.parse(test_code)

# Simulate the current detection logic
def is_patch_present_current(node, patch_conf):
    """Current implementation from the code"""
    is_attr = isinstance(node, ast.Attribute)
    attr_node = node if is_attr else node.func
    
    if isinstance(attr_node, ast.Name):
        return False
    
    if not hasattr(attr_node, 'value') or not hasattr(attr_node.value, 'id'):
        return False
        
    if attr_node.value.id != "unittest":
        return False
    
    if is_attr:
        return node.attr in patch_conf
    
    return "RUSTPYTHON" in ast.unparse(node)

patch_conf = {"expectedFailure": {}}

print("Testing decorator detection:")
for node in ast.walk(tree):
    if isinstance(node, ast.FunctionDef):
        print(f"\nMethod: {node.name}")
        for dec in node.decorator_list:
            if isinstance(dec, (ast.Attribute, ast.Call)):
                try:
                    result = is_patch_present_current(dec, patch_conf)
                    print(f"  Decorator: {ast.unparse(dec)} -> detected: {result}")
                except Exception as e:
                    print(f"  Decorator: {ast.unparse(dec)} -> ERROR: {e}")
EOF

Length of output: 455


@reactive-firewall, excellent instincts! After running tests against the actual code, here's what I found:

Your concerns are well-founded:

1. Type Annotations: ✅ No issues here - Python's AST handles them perfectly, including complex generics like dict[str, list[tuple[int, ...]]]. This is where AST shines over tree-sitter.

2. Complex Edge Cases: ⚠️ Several issues discovered

From testing the current is_patch_present logic:

  • ✅ Correctly detects @unittest.expectedFailure
  • Fails to detect existing @unittest.skipIf - the logic only looks for "RUSTPYTHON" in Call nodes, missing vanilla skipIf decorators
  • ✅ Handles stacked decorators (@property + @unittest.expectedFailure) correctly
  • ✅ No crashes with complex signatures

3. test_abc.py Complexity: The real challenges are:

def test_factory(abc_ABCMeta, abc_get_cache_token):
    class TestLegacyAPI(unittest.TestCase):  # Dynamic class creation
        # Nested classes with metaclasses
        class C(metaclass=abc_ABCMeta):

Key Issues Found:

  1. Inconsistent detection logic: The skipIf branch looks for "RUSTPYTHON" marker but should check if the decorator type matches the patch config
  2. Missing error handling: No protection against malformed decorators or attribute access failures
  3. Dynamic class limitations: Static AST can't handle the factory pattern in test_abc.py (though this might be acceptable)

Suggested improvements:

# Fix the detection logic
if is_attr:
    return node.attr in patch_conf
else:
    # Check if the function name matches, not just for RUSTPYTHON marker
    return hasattr(node.func, 'attr') and node.func.attr in patch_conf

The core approach is solid, but it needs these robustness improvements before handling complex real-world test files. Good catch on anticipating these edge cases!
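
Here is a runnable positive/negative check of a hardened version of the detection (a sketch along the lines suggested above, not the tool's exact function):

```python
import ast

def is_patch_present(dec: ast.expr, patch_conf: dict) -> bool:
    """True iff `dec` is a unittest.<name> decorator (bare or called)
    whose <name> appears in patch_conf."""
    attr = dec.func if isinstance(dec, ast.Call) else dec
    if not isinstance(attr, ast.Attribute):
        return False
    if not (isinstance(attr.value, ast.Name) and attr.value.id == "unittest"):
        return False
    return attr.attr in patch_conf

src = (
    "import unittest\n"
    "class T(unittest.TestCase):\n"
    "    @unittest.expectedFailure\n"
    "    def test_a(self): pass\n"
    "    @unittest.skipIf(True, 'x')\n"
    "    def test_b(self): pass\n"
    "    @property\n"
    "    def test_c(self): pass\n"
)
# Map each test method to its first decorator expression
decorators = {
    node.name: node.decorator_list[0]
    for node in ast.walk(ast.parse(src))
    if isinstance(node, ast.FunctionDef)
}
conf = {"expectedFailure": {}, "skipIf": {}}
```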



Contributor Author


This tool (in its current state) aims to handle 90%+ of the cases; if we still have a handful of files that need to be manually updated, I think it's still a win.

Contributor


Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!


I concur; this tool (even in its current state) is totally a progress win! This was just nitpicking 🙊 about how to approach the future ~10% of edge cases.

This is not a show stopper.

lines = contents.splitlines()
tree = ast.parse(contents)

# Going in reverse to not distrupt the lines offset


🧹 In the future it would be helpful to have logger.debug() output with before/after views (plus some context lines) of the altered code. But assuming the list mutations are safe, that would be out of scope for a POC, of course.

@coderabbitai, please analyze this function (e.g., around lines 172-197), looking for possible improvements that would mitigate any worries about the list modifications here.

Contributor


🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Get the complete apply_conf function implementation
ast-grep --pattern $'def apply_conf($$$) -> str:
    $$$'

Length of output: 1426


@reactive-firewall, I've analyzed the apply_conf function (lines 172-196) and identified several potential improvements to ensure list modification safety:

Current Implementation Analysis

The function currently:

  1. Splits content into lines
  2. Collects all patches into a list
  3. Processes patches in reverse order by line number
  4. Uses lines.insert(lineno, patch) to modify the list

Potential Safety Issues & Improvements

1. Input Validation

def apply_conf(contents: str, conf: dict) -> str:
    lines = contents.splitlines()
    tree = ast.parse(contents)
    
    patches = list(iter_patches(tree, conf))
    
    # Validate line numbers before processing
    max_line = len(lines)
    valid_patches = [
        (lineno, patch) for lineno, patch in patches 
        if 0 <= lineno <= max_line
    ]
    
    for lineno, patch in sorted(valid_patches, reverse=True):
        lines.insert(lineno, patch)
    
    return "\n".join(lines)

2. Handle Duplicate Line Numbers

# Group patches by line number to handle duplicates safely
from collections import defaultdict

patches_by_line = defaultdict(list)
for lineno, patch in patches:
    patches_by_line[lineno].append(patch)

# Process in reverse order, inserting multiple patches at same line
for lineno in sorted(patches_by_line.keys(), reverse=True):
    for patch in reversed(patches_by_line[lineno]):  # Maintain order
        if 0 <= lineno <= len(lines):
            lines.insert(lineno, patch)

3. Immutable Approach (Safer)

def apply_conf(contents: str, conf: dict) -> str:
    original_lines = contents.splitlines()
    tree = ast.parse(contents)
    
    patches = list(iter_patches(tree, conf))
    
    # Build insertion map
    insertions = defaultdict(list)
    for lineno, patch in patches:
        if 0 <= lineno <= len(original_lines):
            insertions[lineno].append(patch)
    
    # Build new lines list without mutating original
    result_lines = []
    for i, line in enumerate(original_lines):
        # Insert patches before current line
        result_lines.extend(insertions.get(i, []))
        result_lines.append(line)
    
    # Handle patches at end of file
    result_lines.extend(insertions.get(len(original_lines), []))
    
    return "\n".join(result_lines)

4. With Debug Logging (As You Suggested)

import logging

def apply_conf(contents: str, conf: dict) -> str:
    logger = logging.getLogger(__name__)
    
    lines = contents.splitlines()
    tree = ast.parse(contents)
    
    patches = list(iter_patches(tree, conf))
    logger.debug(f"Applying {len(patches)} patches")
    
    for lineno, patch in sorted(patches, reverse=True):
        if 0 <= lineno <= len(lines):
            # Log before/after context
            start_ctx = max(0, lineno - 2)
            end_ctx = min(len(lines), lineno + 3)
            
            logger.debug(f"Before patch at line {lineno}:")
            for i in range(start_ctx, end_ctx):
                marker = " > " if i == lineno else "   "
                logger.debug(f"{marker}{i:3d}: {lines[i] if i < len(lines) else '<EOF>'}")
            
            lines.insert(lineno, patch)
            
            logger.debug(f"After patch:")
            for i in range(start_ctx, min(len(lines), end_ctx + 1)):
                marker = " + " if i == lineno else "   "
                logger.debug(f"{marker}{i:3d}: {lines[i]}")
    
    return "\n".join(lines)

Recommendation

For the POC, the current implementation is reasonably safe, but I'd recommend option 3 (immutable approach) as it eliminates list mutation concerns entirely while being easy to understand and debug. The logging enhancement you mentioned would be valuable for troubleshooting patch applications.


🧠 Learnings used
Learnt from: CR
PR: RustPython/RustPython#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-07-19T03:16:56.511Z
Learning: Applies to Lib/test/**/*.py : When tests fail due to unsupported syntax, keep the test as `unittest.expectedFailure`, document that it requires PEP 695 support, and focus on tests that can be fixed through Rust code changes only


@ShaharNaveh, Overall LGTM! With the caveat that, as a POC, it has lots of potential for improvement.

Most of my inline comments are off the cuff; I recognize this is a proof of concept and I like the concept, so feel free to dismiss any off-topic comments without response.

@ShaharNaveh
Contributor Author

@youknowone can you please review this PR?

@ShaharNaveh ShaharNaveh changed the title Base version of automated test marker Base version of automated updates from cpython Aug 22, 2025

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tools/rpau/src/rpau/__init__.py (1)

49-76: Surface worker exceptions; don’t fire-and-forget ProcessPool tasks

Currently, failures in run() won’t be reported because the futures aren’t awaited. This can silently produce incomplete or stale outputs.

Apply this diff:

-    with ProcessPoolExecutor(args.workers) as executor:
-        for conf_file in iter_confs(
+    with ProcessPoolExecutor(args.workers) as executor:
+        futures = []
+        for conf_file in iter_confs(
             conf_dir, include=args.include, exclude=args.exclude
         ):
@@
-            executor.submit(func)
+            futures.append(executor.submit(func))
+
+        # Propagate errors from workers (and ensure completion)
+        for fut in futures:
+            fut.result()
🧹 Nitpick comments (26)
tools/rpau/README.md (1)

1-1: Prefer a standard H1 and add a one-line synopsis

The stylized bold letters look clever but make the title harder to scan and search. A plain H1 plus a short description improves readability and discoverability.

Apply this diff:

-# **R**ust**P**ython **A**utomatic **U**pdater (rpau)
+# RustPython Automatic Updater (rpau)
+
+Tooling to fetch upstream CPython sources and apply configuration-driven patches/markings for RustPython. See `rpau --help` for usage.
tools/rpau/confs/Lib/os.toml (1)

1-1: Optionally pin the CPython tag used for this import

Right now this relies on the CLI default (--default-version, currently v3.13.7). If the intention is to track a specific upstream for reproducibility, pin it here. The PR text mentions 3.13.6 for test updates; if Lib/os.py should align, consider pinning accordingly.

If you decide to pin:

 path = "Lib/os.py"
+version = "v3.13.6"  # or remove to use the CLI default
tools/rpau/confs/Lib/test/test_os.toml (6)

1-1: Pin upstream version to match PR description (3.13.6)

The CLI default is v3.13.7, but the PR description says test_os.py was updated from 3.13.6. Pinning here avoids accidental drift and makes the generated file header deterministic.

Apply this diff:

+version = "v3.13.6"
 path = "Lib/test/test_os.py"

3-3: Remove trailing whitespace in section header

Minor style cleanup; avoids noisy diffs if tooling trims whitespace later.

Apply this diff:

-[FileTests] 
+[FileTests]

8-9: Tighten “reason” messages (fix stray parentheses, clarify wording)

The current texts look like partial exception messages. Cleaning them up makes the intent clearer without changing behavior.

Apply this diff:

-test_access_denied = { expectedFailureIfWindows = { reason = "os.stat (PermissionError: [Errno 5] Access is denied" } }
-test_stat_block_device = { expectedFailureIfWindows = { reason = "os.stat (PermissionError: [Errno 1] Incorrect function" } }
+test_access_denied = { expectedFailureIfWindows = { reason = "PermissionError: [Errno 5] Access is denied by os.stat" } }
+test_stat_block_device = { expectedFailureIfWindows = { reason = "PermissionError: [Errno 1] Incorrect function (os.stat on Windows)" } }

13-13: Fix unmatched parenthesis in reason

Cosmetic, but avoids confusion.

Apply this diff:

-test_utime_directory = { expectedFailureIfWindows = { reason = "(AssertionError: 2.002003 != 1.002003 within 1e-06 delta (1.0000000000000002 difference)" } }
+test_utime_directory = { expectedFailureIfWindows = { reason = "AssertionError: 2.002003 != 1.002003 within 1e-06 delta (1.0000000000000002 difference)" } }

33-34: Reason strings seem incorrect (“No module named 'os'”)

These tests fail for feature gaps, not because os is missing. Update the reasons to reflect the real limitation so the markers remain meaningful.

Apply this diff:

-test_urandom_fd_closed = { expectedFailureIfWindows = { reason = "ModuleNotFoundError: No module named 'os'" } }
-test_urandom_fd_reopened = { expectedFailureIfWindows = { reason = "ModuleNotFoundError: No module named 'os'" } }
+test_urandom_fd_closed = { expectedFailureIfWindows = { reason = "os.urandom FD-based behavior not supported on Windows in RustPython yet" } }
+test_urandom_fd_reopened = { expectedFailureIfWindows = { reason = "os.urandom FD-based behavior not supported on Windows in RustPython yet" } }

78-80: Windows-only duplication: prefer expectedFailureIfWindows or skipIf consistently

dup tests are explicitly skipped on Windows while related inheritable tests are marked xfail-if-windows. Consider making these consistent to reduce cognitive load (e.g., all Windows-specific limitations use expectedFailureIfWindows unless the test is fundamentally inapplicable).

Would you like me to normalize these choices across all Windows cases in this file?

tools/rpau/src/rpau/logger.py (2)

7-14: Accept both int and str for level (matches argparse input)

build_root_logger is called with args.log_level, which is a string (e.g., "INFO"). Relax the type to int | str to reflect actual usage.

Apply this diff:

-def build_root_logger(
-    name: str = _NAME, level: int = logging.WARNING
+def build_root_logger(
+    name: str = _NAME, level: int | str = logging.WARNING
 ) -> logging.Logger:

16-21: Prevent double logging by stopping propagation

If the root logger also has handlers (common in CI or when embedded), child logs will be duplicated. Set propagate = False on the rpau logger after attaching the stream handler.

Apply this diff:

     sh = logging.StreamHandler(sys.stdout)
     sh.setFormatter(formatter)
     logger.handlers.clear()
     logger.addHandler(sh)
+    logger.propagate = False
tools/rpau/src/rpau/__init__.py (4)

20-21: Use simpler/idiomatic pattern for rglob

Path.rglob('*.toml') already searches recursively. Using '**/*.toml' is redundant.

Apply this diff:

-    for conf_file in conf_dir.rglob("**/*.toml"):
+    for conf_file in conf_dir.rglob("*.toml"):

24-29: Match include/exclude against OS paths and use search() instead of match()

  • as_uri().removeprefix("file://") produces odd leading slashes and is platform-sensitive.
  • re.match anchors at the start; search is friendlier for substring patterns.

Apply this diff:

-        uri = conf_file.as_uri().removeprefix("file://")
-        if not include.match(uri):
+        uri = str(conf_file.resolve())
+        if not include.search(uri):
             continue

-        if exclude.match(uri):
+        if exclude.search(uri):
             continue
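
To illustrate the anchoring difference (the filter pattern and conf path below are made up for the example):

```python
import re

# A user-supplied include filter and a resolved conf path (both hypothetical).
include = re.compile(r"test_os")
path = "/repo/tools/rpau/confs/Lib/test/test_os.toml"

# match() only succeeds at the start of the string, so a substring
# filter silently excludes everything:
print(include.match(path))               # → None
# search() finds the pattern anywhere, which is the intent of --include:
print(include.search(path) is not None)  # → True
```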

56-57: Use logger.warning() (logger.warn is deprecated)

Modern logging deprecates warn. Use warning for compatibility.

Apply this diff:

-                logger.warn(f"{conf_file}: {err}")
+                logger.warning(f"{conf_file}: {err}")

62-63: Use logger.warning() (logger.warn is deprecated)

Same as above.

Apply this diff:

-                logger.warn(f"{conf_file}: has no 'path' key. skipping")
+                logger.warning(f"{conf_file}: has no 'path' key. skipping")
tools/rpau/src/rpau/cli.py (4)

10-17: Avoid leaking env var values in help output

Printing the current value of each environment variable in --help can leak sensitive data and create noisy help text. Showing the env var name is enough.

-            if action.dest != "help":
-                env_name = f"{self._prog}_{action.dest}".upper()
-                env_value = os.environ.get(env_name, "")
-                help_msg += f" [env: {env_name}={env_value}]"
+            if action.dest != "help":
+                env_name = f"{self._prog}_{action.dest}".upper()
+                suffix = f" [env: {env_name}{' set' if env_name in os.environ else ' unset'}]"
+                help_msg += suffix

69-70: Pluralize the group title

Minor UX nit: make it “Output options” for consistency with other groups.

-    output_group = parser.add_argument_group("Output option")
+    output_group = parser.add_argument_group("Output options")

74-78: Default output dir depends on this file’s location; consider a safer baseline

Using Path(__file__).parents[4] assumes the repo layout and breaks if the module moves. Safer defaults are Path.cwd() or the rpau project root (e.g., Path(__file__).parents[2]). If you keep this, please document the assumption in the help text.


122-126: Double-check the raw GitHub base URL

Raw content URLs generally use https://raw.githubusercontent.com/<org>/<repo>/<tag-or-branch>/<path>. The refs/tags segment is typically not part of raw URLs and may 404 when joined as .../refs/tags/<version>/<path>. Consider changing the default to https://raw.githubusercontent.com/python/cpython.

Proposed tweak:

-        default="https://raw.githubusercontent.com/python/cpython/refs/tags",
+        default="https://raw.githubusercontent.com/python/cpython",

And ensure fetch_upstream() joins as /{version}/{path}.

tools/rpau/src/rpau/logic.py (8)

20-27: Harden network fetches (user agent, robust errors)

urllib.request.urlopen without a User-Agent can be throttled by some endpoints, and network failures are unhandled. Consider adding a UA and catching URLError/HTTPError with a clear message.

-    with urllib.request.urlopen(upstream_url) as f:
-        contents = f.read().decode()
+    req = urllib.request.Request(
+        upstream_url,
+        headers={"User-Agent": "rpau/0.1 (+https://github.com/RustPython/RustPython)"},
+    )
+    try:
+        with urllib.request.urlopen(req) as f:
+            contents = f.read().decode()
+    except Exception as exc:  # URLError, HTTPError
+        raise RuntimeError(f"Failed to fetch {upstream_url}: {exc}") from exc

120-127: Replace stray triple-quoted “comment” with real comment

This block is a no-op string literal inside a function and can confuse readers and tools. Convert to a proper comment (or remove).

-                '''
-                if any(
-                    is_patch_present(dec_node, patch_conf)
-                    for dec_node in fn_node.decorator_list
-                    if isinstance(dec_node, (ast.Attribute, ast.Call))
-                ):
-                    continue
-                '''
+                # TODO: If/when we support idempotent patching, detect existing
+                # decorators before inserting to avoid duplication.

134-137: Guard against unknown patch kinds

If format_patch() returns an empty string (unknown method), we’ll insert a blank line. Consider asserting or skipping unknown patch kinds to avoid silent no-ops.

-                patch = format_patch(patch_conf)
-                yield (lineno - 1, f"{indent}{patch}")
+                patch = format_patch(patch_conf)
+                if not patch:
+                    raise ValueError(f"Unknown patch spec: {patch_conf!r}")
+                yield (lineno - 1, f"{indent}{patch}")

138-160: Method injection indentation assumes top-level classes

Using fixed INDENT1/INDENT2 (4 spaces) will misalign for nested classes or atypical indentation. You can compute indentation from cls_node.col_offset to keep formatting consistent.

For example:

-                yield (
-                    cls_node.end_lineno,
-                    f"""
-{INDENT1}{patch}
-{INDENT1}def {fn_name}(self):
-{INDENT2}return super().{fn_name}()
-""".rstrip(),
-                )
+                indent_cls = " " * (cls_node.col_offset + COL_OFFSET)
+                indent_body = indent_cls + (" " * COL_OFFSET)
+                body = (
+                    f"\n{indent_cls}{patch}\n"
+                    f"{indent_cls}def {fn_name}(self):\n"
+                    f"{indent_body}return super().{fn_name}()"
+                )
+                yield (cls_node.end_lineno, body)

161-171: Align type hints: conf should be Conf

The annotation says dict while the module defines Conf. Keep it consistent.

-def apply_conf(contents: str, conf: dict) -> str:
+def apply_conf(contents: str, conf: Conf) -> str:

180-186: Minor: spelling and clarity in comment

“distrupt” → “disrupt”, and the comment can be clearer.

-    # Going in reverse to not distrupt the line offset
+    # Insert in reverse order to avoid disrupting subsequent line offsets

199-205: Style and encoding: fix spacing and write with UTF-8

PEP 8 spacing around =, and it’s safer to specify encoding when writing files.

-    patched_contents=apply_conf(contents, conf)
+    patched_contents = apply_conf(contents, conf)
@@
-    output_file.write_text(f"{new_contents}\n")
+    output_file.write_text(f"{new_contents}\n", encoding="utf-8")

21-27: Verify upstream URL composition

If the CLI keeps using .../refs/tags in --base-upstream-url, fetch_upstream() will build URLs like .../refs/tags/<version>/<path>. Raw GitHub typically expects .../<version>/<path> without refs/tags. Please confirm and adjust either the default base URL or the join logic here.

Also applies to: 122-126

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 9622114 and 302b3d1.

⛔ Files ignored due to path filters (12)
  • Lib/argparse.py is excluded by !Lib/**
  • Lib/ast.py is excluded by !Lib/**
  • Lib/decimal.py is excluded by !Lib/**
  • Lib/heapq.py is excluded by !Lib/**
  • Lib/ipaddress.py is excluded by !Lib/**
  • Lib/numbers.py is excluded by !Lib/**
  • Lib/os.py is excluded by !Lib/**
  • Lib/pprint.py is excluded by !Lib/**
  • Lib/queue.py is excluded by !Lib/**
  • Lib/test/test_os.py is excluded by !Lib/**
  • Lib/this.py is excluded by !Lib/**
  • tools/rpau/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (17)
  • tools/rpau/README.md (1 hunks)
  • tools/rpau/confs/Lib/argparse.toml (1 hunks)
  • tools/rpau/confs/Lib/ast.toml (1 hunks)
  • tools/rpau/confs/Lib/decimal.toml (1 hunks)
  • tools/rpau/confs/Lib/heapq.toml (1 hunks)
  • tools/rpau/confs/Lib/ipaddress.toml (1 hunks)
  • tools/rpau/confs/Lib/numbers.toml (1 hunks)
  • tools/rpau/confs/Lib/os.toml (1 hunks)
  • tools/rpau/confs/Lib/pprint.toml (1 hunks)
  • tools/rpau/confs/Lib/queue.toml (1 hunks)
  • tools/rpau/confs/Lib/test/test_os.toml (1 hunks)
  • tools/rpau/confs/Lib/this.toml (1 hunks)
  • tools/rpau/pyproject.toml (1 hunks)
  • tools/rpau/src/rpau/__init__.py (1 hunks)
  • tools/rpau/src/rpau/cli.py (1 hunks)
  • tools/rpau/src/rpau/logger.py (1 hunks)
  • tools/rpau/src/rpau/logic.py (1 hunks)
✅ Files skipped from review due to trivial changes (10)
  • tools/rpau/confs/Lib/ast.toml
  • tools/rpau/confs/Lib/pprint.toml
  • tools/rpau/confs/Lib/numbers.toml
  • tools/rpau/confs/Lib/decimal.toml
  • tools/rpau/confs/Lib/ipaddress.toml
  • tools/rpau/confs/Lib/argparse.toml
  • tools/rpau/pyproject.toml
  • tools/rpau/confs/Lib/this.toml
  • tools/rpau/confs/Lib/queue.toml
  • tools/rpau/confs/Lib/heapq.toml
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.py: Follow PEP 8 style for custom Python code
Use ruff for linting Python code

Files:

  • tools/rpau/src/rpau/logger.py
  • tools/rpau/src/rpau/logic.py
  • tools/rpau/src/rpau/__init__.py
  • tools/rpau/src/rpau/cli.py
🧬 Code graph analysis (2)
tools/rpau/src/rpau/logic.py (1)
tools/rpau/src/rpau/logger.py (1)
  • get_logger (24-25)
tools/rpau/src/rpau/__init__.py (3)
tools/rpau/src/rpau/cli.py (1)
  • build_argparse (46-134)
tools/rpau/src/rpau/logger.py (2)
  • build_root_logger (7-21)
  • get_logger (24-25)
tools/rpau/src/rpau/logic.py (1)
  • run (187-205)
🪛 Flake8 (7.2.0)
tools/rpau/src/rpau/logic.py

[error] 15-15: SyntaxError: invalid syntax

(E999)

tools/rpau/src/rpau/__init__.py

[error] 8-8: 'rpau.logger.get_logger' imported but unused

(F401)

🪛 LanguageTool
tools/rpau/README.md

[grammar] ~1-~1: There could be an error here. (message translated from German)
Context: ...**ython Automatic Updater (rpau)

(QB_NEW_DE)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Check the WASM package and demo
  • GitHub Check: Run rust tests (windows-latest)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Run rust tests (ubuntu-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Check Rust code with rustfmt and clippy
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
🔇 Additional comments (1)
tools/rpau/src/rpau/cli.py (1)

129-132: Confirm the default CPython version

Defaulting to v3.13.7 is fine if that’s the intended baseline. Given the repo also mentions 3.13.6 in the PR description, please confirm which tag we want as default for now.

Comment on lines +7 to +9
from rpau.cli import build_argparse
from rpau.logger import build_root_logger, get_logger
from rpau.logic import run

⚠️ Potential issue

Remove unused import to satisfy ruff/flake8 (F401)

get_logger is imported but not used.

Apply this diff:

-from rpau.logger import build_root_logger, get_logger
+from rpau.logger import build_root_logger
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
from rpau.cli import build_argparse
from rpau.logger import build_root_logger, get_logger
from rpau.logic import run
from rpau.cli import build_argparse
from rpau.logger import build_root_logger
from rpau.logic import run
🧰 Tools
🪛 Flake8 (7.2.0)

[error] 8-8: 'rpau.logger.get_logger' imported but unused

(F401)

🤖 Prompt for AI Agents
In tools/rpau/src/rpau/__init__.py around lines 7 to 9, the get_logger symbol is
imported but never used which triggers an F401 unused-import lint error; remove
the unused import by deleting get_logger from the import list (leave
build_argparse, build_root_logger, run intact) so the file only imports the
symbols that are actually used.

Comment on lines +22 to +26
def _add_action(self, action):
action.default = os.environ.get(
f"{self.prog}_{action.dest}".upper(), action.default
)
return super()._add_action(action)

🛠️ Refactor suggestion

⚠️ Potential issue

Env overrides ignore types (Path/int/bool), causing subtle bugs

Overriding defaults from env injects raw strings for every action. This breaks typed options like --cache (BooleanOptionalAction), --cache-dir (Path), --workers (int), and regex patterns (type=re.compile). Also, invalid values for --log-level won’t be validated. Coerce env strings using the action’s type (or a bool parser for BooleanOptionalAction) and validate against choices.

-    def _add_action(self, action):
-        action.default = os.environ.get(
-            f"{self.prog}_{action.dest}".upper(), action.default
-        )
-        return super()._add_action(action)
+    def _add_action(self, action):
+        env_name = f"{self.prog}_{action.dest}".upper()
+        env_val = os.environ.get(env_name, None)
+        if env_val is not None:
+            # Coerce env string to the expected type
+            if isinstance(action, argparse.BooleanOptionalAction):
+                val = env_val.lower() in {"1", "true", "yes", "on"}
+            elif action.type is not None:
+                val = action.type(env_val)
+            else:
+                val = env_val
+            # Validate against choices if provided
+            if action.choices and val not in action.choices:
+                raise argparse.ArgumentError(
+                    action, f"{env_name} must be one of {list(action.choices)}"
+                )
+            action.default = val
+        return super()._add_action(action)

If you prefer not to use isinstance(..., argparse.BooleanOptionalAction), I can inline a tiny _strtobool() helper and call it whenever action.option_strings contains both the --foo and --no-foo forms.

📝 Committable suggestion


Suggested change
def _add_action(self, action):
action.default = os.environ.get(
f"{self.prog}_{action.dest}".upper(), action.default
)
return super()._add_action(action)
def _add_action(self, action):
env_name = f"{self.prog}_{action.dest}".upper()
env_val = os.environ.get(env_name, None)
if env_val is not None:
# Coerce env string to the expected type
if isinstance(action, argparse.BooleanOptionalAction):
val = env_val.lower() in {"1", "true", "yes", "on"}
elif action.type is not None:
val = action.type(env_val)
else:
val = env_val
# Validate against choices if provided
if action.choices and val not in action.choices:
raise argparse.ArgumentError(
action, f"{env_name} must be one of {list(action.choices)}"
)
action.default = val
return super()._add_action(action)

Comment on lines +15 to +16
type Patch = dict[str, dict[str, str]]
type Conf = dict[str, Patch]

⚠️ Potential issue

PEP 695 type aliases break on Python <3.12 (Flake8 E999)

type Patch = ... and type Conf = ... require Python 3.12+. The CI error (E999) indicates the linters/interpreter parsing this file don’t support it. Use traditional aliases to restore compatibility.

-type Patch = dict[str, dict[str, str]]
-type Conf = dict[str, Patch]
+Patch = dict[str, dict[str, str]]
+Conf = dict[str, Patch]
📝 Committable suggestion


Suggested change
type Patch = dict[str, dict[str, str]]
type Conf = dict[str, Patch]
Patch = dict[str, dict[str, str]]
Conf = dict[str, Patch]
🧰 Tools
🪛 Flake8 (7.2.0)

[error] 15-15: SyntaxError: invalid syntax

(E999)

🤖 Prompt for AI Agents
In tools/rpau/src/rpau/logic.py around lines 15-16, the new `type` alias syntax
requires Python 3.12 and causes Flake8 E999; replace the PEP 695 declarations
with traditional assignment aliases, e.g. assign Patch to the dict[...] type and
Conf to the dict[...] type instead of using `type Patch = ...` and `type Conf =
...`, so the file parses on older Python versions and CI linters.

Comment on lines +47 to +81
def format_patch(patch_conf: Patch) -> str:
"""
Transforms a patch definition to a raw python code.

Parameters
----------
patch_conf : Patch
Conf of the patch.

Returns
-------
str
Raw python source code.

Examples
--------
>>> patch = {"expectedFailure": {"reason": "lorem ipsum"}}
>>> format_patch(patch)
'@unittest.expectedFailure # TODO: RUSTPYTHON; lorem ipsum'
"""
method, conf = next(iter(patch_conf.items()))
prefix = f"@unittest.{method}"

reason = conf.get("reason", "")
res = ""
match method:
case "expectedFailure":
res = f"{prefix} # {COMMENT}; {reason}"
case "expectedFailureIfWindows" | "skip":
res = f'{prefix}("{COMMENT}; {reason}")'
case "skipIf":
cond = conf["cond"]
res = f'{prefix}({cond}, "{COMMENT}; {reason}")'

return res.strip().rstrip(";").strip()

🛠️ Refactor suggestion

⚠️ Potential issue

expectedFailureIfWindows is not a unittest decorator; emitted code will error

unittest exposes expectedFailure, skip, skipIf, skipUnless. There’s no expectedFailureIfWindows on unittest. In CPython tests, the idiom is test.support.anticipate_failure(condition) to conditionally apply expectedFailure. Emitting @unittest.expectedFailureIfWindows("...") will raise AttributeError at import time.

Suggested mapping change and inline comment:

-        case "expectedFailure":
-            res = f"{prefix} # {COMMENT}; {reason}"
-        case "expectedFailureIfWindows" | "skip":
-            res = f'{prefix}("{COMMENT}; {reason}")'
+        case "expectedFailure":
+            res = f"{prefix} # {COMMENT}; {reason}"
+        case "skip":
+            res = f'{prefix}("{COMMENT}; {reason}")'
+        case "expectedFailureIfWindows":
+            # Use CPython's helper to conditionally apply expectedFailure on Windows
+            res = f'@anticipate_failure(sys.platform == "win32")  # {COMMENT}; {reason}'

To support this, ensure the target file has:

  • from test.support import anticipate_failure
  • import sys

Since import injection isn’t implemented yet, either:

  • Extend apply_conf() to add missing imports at the file top (I can draft this), or
  • Downgrade to @unittest.skipIf(sys.platform == "win32", "...") as a temporary measure to avoid import-time failures.
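
For reference, a minimal self-contained sketch of such a helper (the real one lives in CPython's test.support; this reimplementation is illustrative only):

```python
import sys
import unittest


def anticipate_failure(condition):
    """Apply unittest.expectedFailure only when `condition` is true;
    otherwise return the test unchanged (mirrors test.support's helper)."""
    if condition:
        return unittest.expectedFailure
    return lambda test_item: test_item


class DemoTests(unittest.TestCase):
    # Marked as an expected failure only on Windows; runs normally elsewhere.
    @anticipate_failure(sys.platform == "win32")
    def test_windows_gap(self):
        self.assertEqual(1 + 1, 2)
```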

Committable suggestion skipped: line range outside the PR's diff.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

♻️ Duplicate comments (2)
tools/rpau/src/rpau/logic.py (2)

15-16: Fix PEP 695 type aliases (breaks CI on Python <3.12)

type Patch = ... and type Conf = ... are Python 3.12+ only and trigger Flake8 E999. Replace with traditional aliases.

-type Patch = dict[str, dict[str, str]]
-type Conf = dict[str, Patch]
+Patch = dict[str, dict[str, str]]
+Conf = dict[str, Patch]

75-83: Invalid decorator: unittest.expectedFailureIfWindows does not exist

This will raise AttributeError at import time. Either use CPython’s test.support.anticipate_failure or temporarily downgrade to a skipIf with a clear TODO. I recommend implementing the proper mapping and adding import injection for anticipate_failure.

-    match method:
-        case "expectedFailure":
-            res = f"{prefix} # {COMMENT}; {reason}"
-        case "expectedFailureIfWindows" | "skip":
-            res = f'{prefix}("{COMMENT}; {reason}")'
-        case "skipIf":
-            cond = conf["cond"]
-            res = f'{prefix}({cond}, "{COMMENT}; {reason}")'
+    match method:
+        case "expectedFailure":
+            res = f"{prefix} # {COMMENT}; {reason}"
+        case "skip":
+            res = f'{prefix}("{COMMENT}; {reason}")'
+        case "expectedFailureIfWindows":
+            # apply conditionally on Windows like CPython tests do
+            res = f'@anticipate_failure(sys.platform == "win32")  # {COMMENT}; {reason}'
+        case "skipIf":
+            cond = conf.get("cond")
+            if not cond:
+                raise ValueError("skipIf patch requires a 'cond' entry")
+            res = f'{prefix}({cond}, "{COMMENT}; {reason}")'
+        case _:
+            raise ValueError(f"Unsupported patch method: {method}")

Follow-up required: ensure the target file has from test.support import anticipate_failure and import sys (see import injection suggestion below).

🧹 Nitpick comments (10)
tools/rpau/src/rpau/logic.py (10)

104-141: Simplify AST traversal and avoid mutating conf in-place

  • Traversing with ast.iter_fields(tree) is unnecessary; iterate tree.body and filter ast.ClassDef for readability.
  • cls_conf.pop(...) mutates the caller’s conf; that’s surprising and can cause bugs if the same conf is reused. Work on a deep copy.

Sketch:

-    for key, nodes in ast.iter_fields(tree):
-        if key != "body":
-            continue
-        for i, cls_node in enumerate(nodes):
+    for cls_node in (n for n in tree.body if isinstance(n, ast.ClassDef)):
         ...
-                if not (patch_conf := cls_conf.pop(fn_node.name, None)):
+                if not (patch_conf := cls_conf.get(fn_node.name)):
                     continue

And clone the input before calling iter_patches (see apply_conf comment).

Also applies to: 142-163


21-27: Add timeout and error handling to upstream fetch

Network calls should use a timeout and surface clear errors. Also, set a User-Agent header to avoid being blocked.

-def fetch_upstream(*, base_url: str, path: str, version: str) -> str:
-    upstream_url = "/".join((base_url, version, path))
-    logger.debug(f"{upstream_url=}")
-
-    with urllib.request.urlopen(upstream_url) as f:
-        contents = f.read().decode()
-    return contents
+def fetch_upstream(*, base_url: str, path: str, version: str, timeout: float = 15.0) -> str:
+    from urllib.error import HTTPError, URLError
+    upstream_url = "/".join((base_url.rstrip("/"), version.strip("/"), path.lstrip("/")))
+    logger.debug("Fetching %s", upstream_url)
+    req = urllib.request.Request(upstream_url, headers={"User-Agent": "rpau/0.1 (+RustPython)"})
+    try:
+        with urllib.request.urlopen(req, timeout=timeout) as f:
+            contents = f.read().decode("utf-8")
+            return contents
+    except (HTTPError, URLError) as e:
+        raise RuntimeError(f"Failed to fetch upstream {upstream_url}: {e}") from e

No call-site changes required because timeout has a default.


30-47: Cache: pre-create parent dirs and handle read errors explicitly

Minor: creating parent dirs before attempting read_text() avoids a redundant try/except on deeply nested paths and makes the logic a bit clearer.

-    if cache_dir:
-        cached_file = cache_dir / version / path
-        try:
-            contents = cached_file.read_text()
-        except FileNotFoundError:
-            cached_file.parent.mkdir(parents=True, exist_ok=True)
-            contents = fetch()
-            cached_file.write_text(contents)
-
-        return contents
+    if cache_dir:
+        cached_file = cache_dir / version / path
+        cached_file.parent.mkdir(parents=True, exist_ok=True)
+        if cached_file.exists():
+            return cached_file.read_text()
+        contents = fetch()
+        cached_file.write_text(contents)
+        return contents
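The read-through cache shape suggested above can be checked in isolation; the function name and signature here are illustrative, not the PR's API:

```python
from pathlib import Path
import tempfile


def get_cached(cache_dir: Path, version: str, path: str, fetch) -> str:
    # Pre-create parent dirs, then read-through: a cache hit returns the
    # file, a cache miss calls fetch() once and persists the result.
    cached_file = cache_dir / version / path
    cached_file.parent.mkdir(parents=True, exist_ok=True)
    if cached_file.exists():
        return cached_file.read_text()
    contents = fetch()
    cached_file.write_text(contents)
    return contents


calls = []


def fake_fetch() -> str:
    calls.append(1)
    return "print('hello')\n"


with tempfile.TemporaryDirectory() as d:
    first = get_cached(Path(d), "3.13", "Lib/test/test_os.py", fake_fetch)
    second = get_cached(Path(d), "3.13", "Lib/test/test_os.py", fake_fetch)

# fetch() ran only once; the second call was served from disk
```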

10-13: Remove fixed indent constants (become dead code after wrapper fix)

Once wrapper indentation is dynamic, INDENT1/INDENT2 are unused and can be dropped.

 COL_OFFSET = 4
-INDENT1 = " " * COL_OFFSET
-INDENT2 = INDENT1 * 2
 COMMENT = "TODO: RUSTPYTHON"

49-68: Guardrail: validate single-key patch objects early

format_patch assumes a single key. Add a quick validation to fail fast on malformed config.

-    method, conf = next(iter(patch_conf.items()))
+    if len(patch_conf) != 1:
+        raise ValueError(f"Patch must have exactly one key, got {list(patch_conf)}")
+    method, conf = next(iter(patch_conf.items()))

183-186: Comment typo

Nit: “distrupt” → “disrupt”.

-    # Going in reverse to not distrupt the line offset
+    # Going in reverse to not disrupt the line offset

204-211: Enrich the header with provenance and generation notice

Add a clear “generated” banner and the exact upstream URL to ease troubleshooting.

-    patched_contents = apply_conf(contents, conf)
-    new_contents = f"# upstream_version: {version}\n{patched_contents}"
+    patched_contents = apply_conf(contents, conf)
+    upstream_url = "/".join((base_upstream_url.rstrip("/"), version.strip("/"), path.lstrip("/")))
+    new_contents = (
+        "# This file is generated by rpau. DO NOT EDIT.\n"
+        f"# upstream_version: {version}\n"
+        f"# upstream_source: {upstream_url}\n"
+        f"{patched_contents}"
+    )

Also consider preserving permissions with shutil.copymode when overwriting existing files (left as TODO if you prefer).


192-203: Error handling: surface fetch/apply failures with context

Wrap the orchestration path with minimal try/except to log actionable context (path/version) before bubbling up.

 def run(
@@
-    contents = get_upstream_contents(
-        path=path, version=version, base_url=base_upstream_url, cache_dir=cache_dir
-    )
+    try:
+        contents = get_upstream_contents(
+            path=path, version=version, base_url=base_upstream_url, cache_dir=cache_dir
+        )
+    except Exception as e:
+        logger.error("Failed to retrieve upstream for %s@%s: %s", path, version, e)
+        raise
@@
-    patched_contents = apply_conf(contents, conf)
+    try:
+        patched_contents = apply_conf(contents, conf)
+    except Exception as e:
+        logger.error("Failed to apply conf to %s: %s", path, e)
+        raise

1-9: Typing: avoid runtime evaluation of forward references on older Pythons

Optional: add from __future__ import annotations to postpone evaluation of annotations on 3.10/3.11, reducing chances of NameErrors in annotations and improving import-time robustness.

+from __future__ import annotations
 import ast
 import urllib.request
 from typing import TYPE_CHECKING

165-189: Add unit tests for patch formatting and insertion

This module is central to the auto-updater. Recommend unit tests for:

  • format_patch for all methods, including invalid input
  • apply_conf end-to-end on a small sample file (with and without imports/docstring)
  • Windows-conditional expectedFailure decorator insertion

I can draft pytest cases with golden files for quick verification. Want me to push a test scaffold?

📜 Review details

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 302b3d1 and 11694e3.

⛔ Files ignored due to path filters (1)
  • Lib/textwrap.py is excluded by !Lib/**
📒 Files selected for processing (2)
  • tools/rpau/confs/Lib/textwrap.toml (1 hunks)
  • tools/rpau/src/rpau/logic.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • tools/rpau/confs/Lib/textwrap.toml
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.py: Follow PEP 8 style for custom Python code
Use ruff for linting Python code

Files:

  • tools/rpau/src/rpau/logic.py
🧠 Learnings (3)
📚 Learning: 2025-07-19T03:16:56.511Z
Learnt from: CR
PR: RustPython/RustPython#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-07-19T03:16:56.511Z
Learning: Applies to Lib/test/**/*.py : When tests fail due to unsupported syntax, keep the test as `unittest.expectedFailure`, document that it requires PEP 695 support, and focus on tests that can be fixed through Rust code changes only

Applied to files:

  • tools/rpau/src/rpau/logic.py
📚 Learning: 2025-07-19T03:16:56.511Z
Learnt from: CR
PR: RustPython/RustPython#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-07-19T03:16:56.511Z
Learning: Applies to Lib/test/**/*.py : When a test cannot pass due to missing language features, keep it as `unittest.expectedFailure` and document the reason

Applied to files:

  • tools/rpau/src/rpau/logic.py
📚 Learning: 2025-07-19T03:16:56.511Z
Learnt from: CR
PR: RustPython/RustPython#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-07-19T03:16:56.511Z
Learning: Applies to Lib/test/**/*.py : The only acceptable modifications to test files are: 1. Removing `unittest.expectedFailure` decorators and the upper TODO comments when tests actually pass 2. Adding `unittest.expectedFailure` decorators when tests cannot be fixed

Applied to files:

  • tools/rpau/src/rpau/logic.py
🧬 Code graph analysis (1)
tools/rpau/src/rpau/logic.py (1)
tools/rpau/src/rpau/logger.py (1)
  • get_logger (24-25)
🪛 Flake8 (7.2.0)
tools/rpau/src/rpau/logic.py

[error] 15-15: SyntaxError: invalid syntax

(E999)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Run snippets and cpython tests (ubuntu-latest)
  • GitHub Check: Run rust tests (macos-latest)
  • GitHub Check: Ensure compilation on various targets
  • GitHub Check: Run snippets and cpython tests (macos-latest)
  • GitHub Check: Run snippets and cpython tests (windows-latest)
  • GitHub Check: Run rust tests (windows-latest)
🔇 Additional comments (1)
tools/rpau/src/rpau/logic.py (1)

49-84: Align with project guidance: prefer expectedFailure for missing features

Per the retrieved learnings for Lib/test: when a test cannot pass due to missing language features (e.g., PEP 695), keep it as unittest.expectedFailure and document the reason. Your current mapping supports this; ensure TOML reasons reflect the missing feature explicitly so the generated comment is actionable.

If you want, I can help audit TOML entries to ensure reasons cite the relevant PEPs or issues.

Comment on lines +15 to +16
type Patch = dict[str, dict[str, str]]
type Conf = dict[str, Patch]

💡 Verification agent

🧩 Analysis chain

Run ruff/flake8 locally to confirm fixes

After replacing PEP 695 aliases, verify parseability and style.


🏁 Script executed:

#!/bin/bash
# From repo root
python -m compileall -q tools/rpau/src/rpau/logic.py
ruff check tools/rpau/src/rpau/logic.py

Length of output: 769


Fix invalid type alias syntax and add missing import

The current alias declarations and type annotations cause a compile error and an undefined name:

  • The lines 15–16

    type Patch = dict[str, dict[str, str]]
    type Conf = dict[str, Patch]

    use unsupported type … = … alias syntax and trigger a SyntaxError in Python 3.11 (and earlier) environments. Replace them with PEP 613’s TypeAlias or a standard assignment pattern. For example:

     from typing import TypeAlias
     
    -type Patch = dict[str, dict[str, str]]
    -type Conf = dict[str, Patch]
    +Patch: TypeAlias = dict[str, dict[str, str]]
    +Conf:  TypeAlias = dict[str, Patch]
  • At line 86, the return annotation -> "Iterator[tuple[int, str]]" refers to Iterator without importing it, leading to:

    F821 Undefined name `Iterator`
    

    Add an import at the top of the file:

     from typing import TypeAlias
    +from typing import Iterator

Once these changes are applied, please re-run the checks:

python -m compileall -q tools/rpau/src/rpau/logic.py
ruff check tools/rpau/src/rpau/logic.py

and verify that the file parses cleanly and passes all style checks.

🧰 Tools
🪛 Flake8 (7.2.0)

[error] 15-15: SyntaxError: invalid syntax

(E999)

🤖 Prompt for AI Agents
In tools/rpau/src/rpau/logic.py around lines 15–16 and line 86, replace the
invalid alias syntax "type Patch = ..." and "type Conf = ..." with a valid
TypeAlias declaration or simple assignments (e.g. import TypeAlias from typing
and write "Patch: TypeAlias = dict[str, dict[str, str]]" and "Conf: TypeAlias =
dict[str, Patch]" or use "Patch = dict[str, dict[str, str]]" / "Conf = dict[str,
Patch]" as plain assignments), and add the missing import for Iterator (e.g.
"from typing import Iterator") at the top of the file so the return annotation
on line 86 resolves; after making these edits rerun "python -m compileall -q
tools/rpau/src/rpau/logic.py" and "ruff check tools/rpau/src/rpau/logic.py" to
verify parsing and style checks pass.

Comment on lines +141 to +162
# Phase 2: Iterate and mark inhereted tests
for key, nodes in ast.iter_fields(tree):
if key != "body":
continue

for i, cls_node in enumerate(nodes):
if not isinstance(cls_node, ast.ClassDef):
continue

if not (cls_conf := conf.get(cls_node.name)):
continue

for fn_name, patch_conf in cls_conf.items():
patch = format_patch(patch_conf)
yield (
cls_node.end_lineno,
f"""
{INDENT1}{patch}
{INDENT1}def {fn_name}(self):
{INDENT2}return super().{fn_name}()
""".rstrip(),
)

🛠️ Refactor suggestion

Wrong indentation for injected wrapper; use class col_offset, not a fixed 4 spaces

Using a fixed INDENT1/INDENT2 breaks formatting for nested or differently indented classes. Compute indentation from cls_node.col_offset.

-            for fn_name, patch_conf in cls_conf.items():
-                patch = format_patch(patch_conf)
-                yield (
-                    cls_node.end_lineno,
-                    f"""
-{INDENT1}{patch}
-{INDENT1}def {fn_name}(self):
-{INDENT2}return super().{fn_name}()
-""".rstrip(),
-                )
+            for fn_name, patch_conf in cls_conf.items():
+                patch = format_patch(patch_conf)
+                method_indent = " " * (cls_node.col_offset + COL_OFFSET)
+                body_indent = method_indent + (" " * COL_OFFSET)
+                yield (
+                    cls_node.end_lineno,
+                    (
+                        f"\n"
+                        f"{method_indent}{patch}\n"
+                        f"{method_indent}def {fn_name}(self):\n"
+                        f"{body_indent}return super().{fn_name}()"
+                    ),
+                )

Additionally, after this change, INDENT1/INDENT2 become dead code and can be removed. See a cleanup note below.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Phase 2: Iterate and mark inhereted tests
for key, nodes in ast.iter_fields(tree):
if key != "body":
continue
for i, cls_node in enumerate(nodes):
if not isinstance(cls_node, ast.ClassDef):
continue
if not (cls_conf := conf.get(cls_node.name)):
continue
for fn_name, patch_conf in cls_conf.items():
patch = format_patch(patch_conf)
yield (
cls_node.end_lineno,
f"""
{INDENT1}{patch}
{INDENT1}def {fn_name}(self):
{INDENT2}return super().{fn_name}()
""".rstrip(),
)
# Phase 2: Iterate and mark inhereted tests
for key, nodes in ast.iter_fields(tree):
if key != "body":
continue
for i, cls_node in enumerate(nodes):
if not isinstance(cls_node, ast.ClassDef):
continue
if not (cls_conf := conf.get(cls_node.name)):
continue
for fn_name, patch_conf in cls_conf.items():
patch = format_patch(patch_conf)
cls_indent = " " * cls_node.col_offset
body_indent = cls_indent + (" " * COL_OFFSET)
yield (
cls_node.end_lineno,
(
f"\n"
f"{cls_indent}{patch}\n"
f"{cls_indent}def {fn_name}(self):\n"
f"{body_indent}return super().{fn_name}()"
),
)
🤖 Prompt for AI Agents
In tools/rpau/src/rpau/logic.py around lines 141 to 162, the injected wrapper
uses fixed INDENT1/INDENT2 which breaks formatting for nested or differently
indented classes; replace those with indentation computed from
cls_node.col_offset: build indent1 = " " * (cls_node.col_offset + 4) for the
injected method and indent2 = indent1 + " " * 4 for its body, then use
indent1/indent2 when formatting the patch and remove the now-dead
INDENT1/INDENT2 constants and any references to them; ensure the formatted
string uses these computed variables so injected code lands one level inside
the class declaration.
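The col_offset approach can be verified on a nested class, a case fixed constants get wrong. This sketch assumes the tool's one-level indent step is 4 spaces (COL_OFFSET in the module):

```python
import ast

COL_OFFSET = 4

src = (
    "class Outer:\n"
    "    class Inner:\n"
    "        def test_a(self):\n"
    "            pass\n"
)

tree = ast.parse(src)
inner = tree.body[0].body[0]
assert isinstance(inner, ast.ClassDef)

# A method injected into Inner must sit one level inside the class,
# so its indent is the class's own column plus one indent step.
method_indent = " " * (inner.col_offset + COL_OFFSET)
body_indent = method_indent + " " * COL_OFFSET

wrapper = (
    f"{method_indent}def test_b(self):\n"
    f"{body_indent}return super().test_b()"
)
```

Appending `wrapper` right after Inner's last line yields a file that still parses, with test_b as a second method of Inner.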

Comment on lines +165 to +189
def apply_conf(contents: str, conf: dict) -> str:
"""
Patch a given source code based on the conf.

Parameters
----------
contents : str
Raw python source code.
conf : Conf
Dict of `{ClassName: {test_name: Patch}}`.

Returns
-------
str
Patched raw python code.
"""
lines = contents.splitlines()
tree = ast.parse(contents)

# Going in reverse to not distrupt the line offset
patches = list(iter_patches(tree, conf))
for lineno, patch in sorted(patches, reverse=True):
lines.insert(lineno, patch)

return "\n".join(lines)

🛠️ Refactor suggestion

Robustness: don’t mutate conf, handle SyntaxError gracefully, and inject required imports

  • If the upstream file uses newer syntax (e.g., PEP 695), ast.parse can fail on older interpreters. Catch SyntaxError, log, and return the original contents instead of crashing.
  • Do not mutate the caller’s conf; work on a deep copy because iter_patches consumes it.
  • Ensure required imports are present in the target file when emitting decorators that reference sys or anticipate_failure.
 def apply_conf(contents: str, conf: dict) -> str:
@@
-    lines = contents.splitlines()
-    tree = ast.parse(contents)
-
-    # Going in reverse to not distrupt the line offset
-    patches = list(iter_patches(tree, conf))
-    for lineno, patch in sorted(patches, reverse=True):
-        lines.insert(lineno, patch)
-
-    return "\n".join(lines)
+    import copy
+    lines = contents.splitlines()
+    # Work on a copy; iter_patches may consume entries
+    conf_copy: Conf = copy.deepcopy(conf)
+    try:
+        tree = ast.parse(contents)
+    except SyntaxError as e:
+        logger.warning("Skipping patching due to parse error: %s", e)
+        return contents
+
+    # Import injection for patches that require helpers
+    def _requires_imports(c: Conf) -> list[str]:
+        need: set[str] = set()
+        for cls in c.values():
+            for patch in cls.values():
+                method, cfg = next(iter(patch.items()))
+                if method == "expectedFailureIfWindows":
+                    need.add("import sys")
+                    need.add("from test.support import anticipate_failure")
+                if method == "skipIf":
+                    cond = cfg.get("cond", "")
+                    if "sys." in cond:
+                        need.add("import sys")
+                    if "os." in cond:
+                        need.add("import os")
+        return sorted(need)
+
+    def _import_present(line: str, mod: str) -> bool:
+        return line.startswith(mod) or f"{mod.split()[-1]} " in line
+
+    def _imports_in_file(existing: list[str], mod: str) -> bool:
+        return any(_import_present(l.strip(), mod) for l in existing if l.strip().startswith(("import ", "from ")))
+
+    needed = [m for m in _requires_imports(conf_copy) if not _imports_in_file(lines, m)]
+    if needed:
+        # Insert after shebang/encoding/docstring if present
+        insert_at = 0
+        if insert_at < len(lines) and lines[0].startswith("#!"):
+            insert_at += 1
+        if insert_at < len(lines) and "coding:" in lines[insert_at]:
+            insert_at += 1
+        # naive docstring skipper
+        if insert_at < len(lines) and lines[insert_at].lstrip().startswith(('"""', "'''")):
+            q = lines[insert_at].lstrip()[:3]
+            insert_at += 1
+            while insert_at < len(lines) and q not in lines[insert_at]:
+                insert_at += 1
+            insert_at = min(insert_at + 1, len(lines))
+        for mod in reversed(needed):
+            lines.insert(insert_at, mod)
+
+    # Going in reverse to not disrupt line offsets
+    patches = list(iter_patches(tree, conf_copy))
+    for lineno, patch in sorted(patches, reverse=True):
+        lines.insert(lineno, patch)
+
+    return "\n".join(lines)

If you prefer, I can extract the import-injection helpers into a separate module for reuse. Do you want me to push a patch?

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def apply_conf(contents: str, conf: dict) -> str:
"""
Patch a given source code based on the conf.
Parameters
----------
contents : str
Raw python source code.
conf : Conf
Dict of `{ClassName: {test_name: Patch}}`.
Returns
-------
str
Patched raw python code.
"""
lines = contents.splitlines()
tree = ast.parse(contents)
# Going in reverse to not distrupt the line offset
patches = list(iter_patches(tree, conf))
for lineno, patch in sorted(patches, reverse=True):
lines.insert(lineno, patch)
return "\n".join(lines)
def apply_conf(contents: str, conf: dict) -> str:
"""
Patch a given source code based on the conf.
Parameters
----------
contents : str
Raw python source code.
conf : Conf
Dict of `{ClassName: {test_name: Patch}}`.
Returns
-------
str
Patched raw python code.
"""
import copy
lines = contents.splitlines()
# Work on a copy; iter_patches may consume entries
conf_copy: Conf = copy.deepcopy(conf)
try:
tree = ast.parse(contents)
except SyntaxError as e:
logger.warning("Skipping patching due to parse error: %s", e)
return contents
# Import injection for patches that require helpers
def _requires_imports(c: Conf) -> list[str]:
need: set[str] = set()
for cls in c.values():
for patch in cls.values():
method, cfg = next(iter(patch.items()))
if method == "expectedFailureIfWindows":
need.add("import sys")
need.add("from test.support import anticipate_failure")
if method == "skipIf":
cond = cfg.get("cond", "")
if "sys." in cond:
need.add("import sys")
if "os." in cond:
need.add("import os")
return sorted(need)
def _import_present(line: str, mod: str) -> bool:
return line.startswith(mod) or f"{mod.split()[-1]} " in line
def _imports_in_file(existing: list[str], mod: str) -> bool:
return any(
_import_present(l.strip(), mod)
for l in existing
if l.strip().startswith(("import ", "from "))
)
needed = [
m for m in _requires_imports(conf_copy)
if not _imports_in_file(lines, m)
]
if needed:
# Insert after shebang/encoding/docstring if present
insert_at = 0
if insert_at < len(lines) and lines[0].startswith("#!"):
insert_at += 1
if insert_at < len(lines) and "coding:" in lines[insert_at]:
insert_at += 1
# naive docstring skipper
if insert_at < len(lines) and lines[insert_at].lstrip().startswith(('"""', "'''")):
q = lines[insert_at].lstrip()[:3]
insert_at += 1
while insert_at < len(lines) and q not in lines[insert_at]:
insert_at += 1
insert_at = min(insert_at + 1, len(lines))
for mod in reversed(needed):
lines.insert(insert_at, mod)
# Going in reverse to not disrupt line offsets
patches = list(iter_patches(tree, conf_copy))
for lineno, patch in sorted(patches, reverse=True):
lines.insert(lineno, patch)
return "\n".join(lines)
🤖 Prompt for AI Agents
In tools/rpau/src/rpau/logic.py around lines 165 to 189, make three fixes: 1)
Wrap ast.parse(contents) in a try/except catching SyntaxError — log the error
(or debug message) and return the original contents unchanged to avoid crashing
on newer-syntax files; 2) Avoid mutating the caller's conf by making a deep copy
(e.g., copy.deepcopy(conf)) and pass the copy into iter_patches so iter_patches
can consume it safely; 3) After building the list of patches but before
inserting them, detect if emitted patches reference external names like sys or
anticipate_failure and ensure the corresponding import statements are present in
the target source: compute needed imports, check existing lines for those
imports (skipping shebang/module docstring), and insert missing import lines
near the top (after shebang/docstring) so decorators that reference sys or
anticipate_failure resolve. Ensure behavior otherwise unchanged.
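The reverse-order insertion that apply_conf relies on can be checked in isolation; inserting from the bottom up keeps the not-yet-applied line numbers valid:

```python
def insert_patches(lines: list[str], patches: list[tuple[int, str]]) -> list[str]:
    # Insert in descending order of line number so each insertion
    # leaves the offsets of the remaining patches untouched.
    for lineno, patch in sorted(patches, reverse=True):
        lines.insert(lineno, patch)
    return lines


src = [
    "class A:",
    "    def test_x(self):",
    "        pass",
    "class B:",
    "    pass",
]
out = insert_patches(src, [(3, "    # patched into A"), (5, "    # patched into B")])
```

Inserting in ascending order instead would shift the second patch one line too far down as soon as the first one lands.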
