Base version of automated updates from cpython #6089
base: main
Conversation
⛔ Files ignored due to path filters (17)
📒 Files selected for processing (17)
Walkthrough
Adds the rpau tool: packaging, CLI, logging, core logic for fetching/applying upstream CPython files (with optional caching and AST-based patch insertion), many TOML config mappings for Lib modules and tests, and a README title line.
Sequence Diagram(s)
sequenceDiagram
autonumber
actor U as User
participant CLI as rpau.cli
participant Main as rpau.__init__.main
participant FS as Conf Dir
participant Exec as ProcessPool
participant Logic as rpau.logic
participant Net as Upstream (HTTP)
participant Cache as Cache Dir
participant Out as Output Dir
U->>CLI: build_argparse()
CLI-->>U: argparse parser
U->>Main: run rpau (args/env)
Main->>Main: build_root_logger()
Main->>FS: iter_confs(conf_dir, include/exclude)
loop For each TOML
Main->>Main: load TOML (tomllib)\nvalidate `path` and `version`
Main->>Exec: submit Logic.run(conf, path, version, ... )
end
par Parallel workers
Exec->>Logic: run(...)
Logic->>Cache: check/read cached upstream
alt cache miss
Logic->>Net: fetch_upstream(base_url, path, version)
Net-->>Logic: contents
Logic->>Cache: write cache (optional)
else cache hit
Cache-->>Logic: contents
end
Logic->>Logic: apply_conf(contents)\n(AST parse, compute patches)
Logic->>Out: write patched file\nwith upstream_version header
end
sequenceDiagram
autonumber
participant L as rpau.logic.apply_conf
participant AST as ast.parse
participant P as iter_patches
participant F as format_patch
L->>AST: parse(source)
L->>P: iterate patch points (conf)
loop For each patch
P->>F: format_patch(patch_conf)
F-->>P: decorator/wrapper text
P-->>L: (line_no, text)
end
L->>L: insert patches in reverse order\n(by line offsets)
L-->>L: return modified source
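The reverse-order insertion step in the second diagram matters: applying patches top-down would shift every later line offset. A minimal sketch of the idea, with a hypothetical helper name and sample input (not the actual rpau code):

```python
def insert_patches(source: str, patches: list[tuple[int, str]]) -> str:
    """Insert patch lines bottom-up so earlier line numbers stay valid."""
    lines = source.splitlines()
    # Sort descending: inserting at the highest line first means the
    # remaining (smaller) line numbers are unaffected by the shift.
    for line_no, text in sorted(patches, reverse=True):
        lines.insert(line_no - 1, text)  # line_no is 1-based, as ast reports it
    return "\n".join(lines)

src = (
    "class T:\n"
    "    def test_a(self):\n"
    "        pass\n"
    "    def test_b(self):\n"
    "        pass"
)
patched = insert_patches(src, [(2, "    # TODO: RUSTPYTHON"), (4, "    # TODO: RUSTPYTHON")])
```

Sorting once and inserting in reverse is the same trick the tool's `main.py` comments describe ("Going in reverse to not disrupt the line offsets").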
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Lib/test/test_os.py
Outdated
@@ -835,9 +829,7 @@ def ns_to_sec(ns):
# Convert a number of nanosecond (int) to a number of seconds (float).
# Round towards infinity by adding 0.5 nanosecond to avoid rounding
# issue, os.utime() rounds towards minus infinity.
# XXX: RUSTPYTHON os.utime() use [`Duration::from_secs_f64`](https://doc.rust-lang.org/std/time/struct.Duration.html#method.try_from_secs_f64)
I had this crazy idea to store patch files in the repo for patches like this. similar to what Termux is doing for their packages, for example:
https://github.com/termux/termux-packages/blob/9865d958666e04f16201a0774a646f37b6082c80/packages/python/0001-fix-hardcoded-paths.patch
That's probably out of scope for this PR tho
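For illustration, a unified diff of the kind Termux stores can be generated without git at all, using the standard library; the file names and contents below are hypothetical:

```python
import difflib

# Hypothetical before/after contents for a stdlib test file.
original = ["import os\n", "print(os.name)\n"]
modified = ["import os\n", "# patched for RustPython\n", "print(os.name)\n"]

# unified_diff yields patch lines ("---", "+++", "@@", context, +/- lines).
patch = "".join(difflib.unified_diff(
    original, modified,
    fromfile="a/Lib/test/example.py",
    tofile="b/Lib/test/example.py",
))
```

`git diff` would produce the same format with real repository paths, which is what makes a `patches/` folder cheap to populate later.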
btw, your idea is not so crazy, C is NOT Rust after all, so it might be needed, but yeah, totally out of scope for this PR. What about an empty patch folder for now?
TL;DR
🤔 I think patches for stuff like this might be the right idea, git can even generate them for us, especially if we can make them configurable (knobs like: os/arch pair, patch-id, cpython version, and context lines needed for preventing regressions in the future, etc., come to mind). For now we could just have a patch folder (empty, but as a placeholder).
Actionable comments posted: 9
🧹 Nitpick comments (3)
tools/test_marker/main.py (3)
57-57: Simplify redundant string operations. The chained strip operations are redundant. Consider simplifying:
- return res.strip().rstrip(";").strip()
+ return res.rstrip(";").strip()
148-148: Fix typo in comment:
- # Phase 2: Iterate and mark inhereted tests
+ # Phase 2: Iterate and mark inherited tests
191-191: Fix typo in comment:
- # Going in reverse to not distrupt the lines offset
+ # Going in reverse to not disrupt the line offsets
📜 Review details
Configuration used: .coderabbit.yml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Lib/test/test_os.py
is excluded by `!Lib/**`
📒 Files selected for processing (2)
tools/test_marker/confs/test_os.toml (1 hunks)
tools/test_marker/main.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit Inference Engine (.github/copilot-instructions.md)
**/*.py:
- Follow PEP 8 style for custom Python code
- Use ruff for linting Python code
Files:
tools/test_marker/main.py
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
PR: RustPython/RustPython#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-07-19T03:16:56.511Z
Learning: Applies to Lib/test/**/*.py : When tests fail due to unsupported syntax, keep the test as `unittest.expectedFailure`, document that it requires PEP 695 support, and focus on tests that can be fixed through Rust code changes only
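The convention in that learning, sketched as a minimal test; the test name, body, and reason text here are illustrative, not from the PR:

```python
import unittest

class TypeAliasSyntaxTests(unittest.TestCase):
    # TODO: RUSTPYTHON; requires PEP 695 support
    @unittest.expectedFailure
    def test_type_statement(self):
        ns: dict = {}
        exec("type Alias = list[int]", ns)  # PEP 695 `type` statement
        self.assertIn("Alias", ns)
```

The decorator keeps the test in the suite while recording that it is expected to fail until the interpreter grows the feature.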
🪛 Flake8 (7.2.0)
tools/test_marker/main.py
[error] 19-19: SyntaxError: invalid syntax
(E999)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Check the WASM package and demo
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Check Rust code with rustfmt and clippy
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run rust tests (macos-latest)
🔇 Additional comments (2)
tools/test_marker/confs/test_os.toml (1)
1-4: LGTM! The path declaration and FileTests section are well-structured with descriptive reasons.
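The conf shape implied by the review (a top-level `path` key plus per-class tables of marked tests) might look like the following; the section and key names are illustrative guesses, not the actual file contents:

```toml
# Hypothetical excerpt; the real tools/test_marker/confs/test_os.toml differs.
path = "Lib/test/test_os.py"

[FileTests]
test_access = { reason = "TODO: RUSTPYTHON; os.access not implemented" }
```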
tools/test_marker/main.py (1)
163-169: Verify insertion point for inherited test methods. Using `cls_node.end_lineno` directly might place the new methods at the wrong indentation level or inside the last method of the class. Consider using `cls_node.end_lineno - 1` or adjusting based on the actual class structure.
tools/test_marker/main.py
Outdated
type Patch = dict[str, dict[str, str]]
type Conf = dict[str, Patch]
Use pre-3.12 compatible type alias syntax.
The `type` statement for type aliases requires Python 3.12+. Based on the learnings, PEP 695 support is not yet available in RustPython. Use `TypeAlias` from `typing` for compatibility.
-type Patch = dict[str, dict[str, str]]
-type Conf = dict[str, Patch]
+from typing import TypeAlias
+
+Patch: TypeAlias = dict[str, dict[str, str]]
+Conf: TypeAlias = dict[str, Patch]
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
from typing import TypeAlias

Patch: TypeAlias = dict[str, dict[str, str]]
Conf: TypeAlias = dict[str, Patch]
🧰 Tools
🪛 Flake8 (7.2.0)
[error] 19-19: SyntaxError: invalid syntax
(E999)
🤖 Prompt for AI Agents
In tools/test_marker/main.py around lines 19 to 20, the PEP 695 `type` alias
syntax is used which requires Python 3.12+; replace it with typing.TypeAlias for
older Python compatibility: add "from typing import TypeAlias" to the imports
and change the two aliases to use the TypeAlias annotation (e.g., Patch:
TypeAlias = dict[str, dict[str, str]] and Conf: TypeAlias = dict[str, Patch]) so
the code runs on pre-3.12 interpreters.
tools/test_marker/main.py
Outdated
if attr_node.value.id != "unittest":
    return False
Add safety check for attr_node.value.
Accessing `attr_node.value.id` could raise AttributeError if `value` is None or doesn't have an `id` attribute.
- if attr_node.value.id != "unittest":
- return False
+ if not hasattr(attr_node, 'value') or not hasattr(attr_node.value, 'id'):
+ return False
+ if attr_node.value.id != "unittest":
+ return False
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if not hasattr(attr_node, 'value') or not hasattr(attr_node.value, 'id'):
    return False
if attr_node.value.id != "unittest":
    return False
🤖 Prompt for AI Agents
In tools/test_marker/main.py around lines 86 to 87, the code directly accesses
attr_node.value.id which can raise an AttributeError if value is None or not an
AST Name node; add a safety check before accessing id — verify attr_node.value
is not None and is an ast.Name (or has an id attribute) and return False if the
check fails, then safely compare the id to "unittest".
@ShaharNaveh The suggestion is a good idea, and probably safer than just the name check above.
the details aside, the feature of checking for patching is good (ignoring scope of "how")
@coderabbitai please generate a doc-test for your suggestion that includes both a positive and negative checks for behavior here.
@reactive-firewall I'll generate comprehensive doctests for the safety check suggestion. Let me create examples that show both positive and negative behavior:
def is_patch_present(node: ast.Attribute | ast.Call, patch_conf: Patch) -> bool:
"""
Detect whether an AST node (of a decorator) is matching to our patch.
We accept both `ast.Attribute` and `ast.Call` because:
* ast.Attribute: `@unittest.expectedFailure`
* ast.Call: `@unittest.expectedFailureIfWindows(...)` / `@unittest.skipIf(...)`
Parameters
----------
node : ast.Attribute | ast.Call
AST node to query.
patch_conf : Patch
Patch(es) to match against.
Returns
-------
bool
Whether or not we got a match.
Examples
--------
# Positive case: Valid unittest decorator (ast.Attribute)
>>> import ast
>>> # Create AST for @unittest.expectedFailure
>>> unittest_name = ast.Name(id='unittest', ctx=ast.Load())
>>> attr_node = ast.Attribute(value=unittest_name, attr='expectedFailure', ctx=ast.Load())
>>> patch_conf = {"expectedFailure": {"reason": "test reason"}}
>>> is_patch_present(attr_node, patch_conf)
True
# Positive case: Valid unittest decorator (ast.Call)
>>> # Create AST for @unittest.skipIf(condition, "reason")
>>> call_node = ast.Call(func=attr_node, args=[], keywords=[])
>>> is_patch_present(call_node, patch_conf)
True
# Negative case: attr_node.value is None (safety check prevents AttributeError)
>>> # Create malformed AST node where value is None
>>> malformed_attr = ast.Attribute(value=None, attr='expectedFailure', ctx=ast.Load())
>>> is_patch_present(malformed_attr, patch_conf)
False
# Negative case: attr_node.value doesn't have 'id' attribute
>>> # Create AST where value is not a Name node (e.g., a Constant)
>>> non_name_value = ast.Constant(value="not_a_name")
>>> attr_with_non_name = ast.Attribute(value=non_name_value, attr='expectedFailure', ctx=ast.Load())
>>> is_patch_present(attr_with_non_name, patch_conf)
False
# Negative case: Different module name
>>> # Create AST for @other_module.expectedFailure
>>> other_name = ast.Name(id='other_module', ctx=ast.Load())
>>> other_attr = ast.Attribute(value=other_name, attr='expectedFailure', ctx=ast.Load())
>>> is_patch_present(other_attr, patch_conf)
False
# Negative case: ast.Name node (should return False early)
>>> name_node = ast.Name(id='some_function', ctx=ast.Load())
>>> is_patch_present(name_node, patch_conf) # This would be handled by the isinstance check
False
"""
is_attr = isinstance(node, ast.Attribute)
attr_node = node if is_attr else node.func
if isinstance(attr_node, ast.Name):
return False
# Safety check before accessing attr_node.value.id
if not hasattr(attr_node, 'value') or not hasattr(attr_node.value, 'id'):
return False
if attr_node.value.id != "unittest":
return False
# ... rest of the function implementation
The doctests demonstrate:
- Positive cases: valid unittest decorators (both `ast.Attribute` and `ast.Call`) that should return `True` when matched
- Negative cases with safety:
  - `attr_node.value` is `None`: the safety check prevents `AttributeError`
  - `attr_node.value` doesn't have an `id` attribute (e.g., `ast.Constant`): the safety check handles this gracefully
  - Different module name: normal logic flow returns `False`
  - `ast.Name` node: handled by the existing isinstance check

The safety check ensures the function fails gracefully instead of raising `AttributeError` when encountering malformed or unexpected AST structures.
🧠 Learnings used
Learnt from: CR
PR: RustPython/RustPython#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-07-19T03:16:56.511Z
Learning: Applies to Lib/test/**/*.py : When tests fail due to unsupported syntax, keep the test as `unittest.expectedFailure`, document that it requires PEP 695 support, and focus on tests that can be fixed through Rust code changes only
tools/test_marker/main.py
Outdated
def main():
    for conf_file in CONFS.rglob("*.toml"):
        conf = tomllib.loads(conf_file.read_text())
        path = ROOT_DIR / conf.pop("path")
        patched = apply_conf(path.read_text(), conf)
        path.write_text(patched + "\n")
🛠️ Refactor suggestion
Add error handling for file operations.
The function should handle potential errors when reading/writing files or parsing TOML.
def main():
for conf_file in CONFS.rglob("*.toml"):
- conf = tomllib.loads(conf_file.read_text())
- path = ROOT_DIR / conf.pop("path")
- patched = apply_conf(path.read_text(), conf)
- path.write_text(patched + "\n")
+ try:
+ conf = tomllib.loads(conf_file.read_text())
+ path = ROOT_DIR / conf.pop("path")
+ if not path.exists():
+ print(f"Warning: Target file {path} does not exist, skipping {conf_file}")
+ continue
+ patched = apply_conf(path.read_text(), conf)
+ path.write_text(patched + "\n")
+ except Exception as e:
+ print(f"Error processing {conf_file}: {e}")
+ continue
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
def main():
    for conf_file in CONFS.rglob("*.toml"):
        try:
            conf = tomllib.loads(conf_file.read_text())
            path = ROOT_DIR / conf.pop("path")
            if not path.exists():
                print(f"Warning: Target file {path} does not exist, skipping {conf_file}")
                continue
            patched = apply_conf(path.read_text(), conf)
            path.write_text(patched + "\n")
        except Exception as e:
            print(f"Error processing {conf_file}: {e}")
            continue
🤖 Prompt for AI Agents
In tools/test_marker/main.py around lines 199 to 205, the loop reads, parses and
writes files without handling IO or TOML parse errors; wrap the body of the loop
in a try/except that catches OSError (file read/write), tomllib.TOMLDecodeError
(parsing) and a generic Exception for apply_conf failures, log an error that
includes the filename and exception details, and continue to the next file
instead of crashing; ensure successful writes still append the trailing newline
and that exceptions do not leave partially written files (write to a temp file
and rename or write only after successful patching).
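The atomic-write point at the end of that prompt can be sketched with `tempfile` plus `os.replace`; this is a generic pattern, not code from the PR:

```python
import os
import tempfile
from pathlib import Path

def atomic_write_text(path: Path, data: str) -> None:
    # Write to a temp file in the same directory, then rename over the
    # target, so a failure mid-write never leaves a truncated file behind.
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as fh:
            fh.write(data)
        os.replace(tmp, path)  # atomic rename on POSIX and Windows
    except BaseException:
        os.unlink(tmp)
        raise

demo_dir = Path(tempfile.mkdtemp())
target = demo_dir / "out.py"
atomic_write_text(target, "patched\n")
```

Keeping the temp file in the same directory matters: `os.replace` is only atomic when source and destination are on the same filesystem.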
tools/test_marker/confs/test_os.toml
Outdated
@@ -0,0 +1,103 @@
path = "Lib/test/test_os.py"
Also had an idea of having something like:
path = "Lib/test/test_os.py"
cpython_version = "3.13.6"
then the tool can automatically fetch & mark the new file, but this tool is still in a POC stage.
I want to know if this tool is something that is even wanted
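If a `cpython_version` key landed, the fetch step could derive the upstream URL from it. A sketch under the assumption that the tool pulls from GitHub's raw-content host, where CPython tags releases as `v<version>`:

```python
def upstream_url(path: str, cpython_version: str) -> str:
    # CPython release tags look like v3.13.6 on GitHub.
    base = "https://raw.githubusercontent.com/python/cpython"
    return f"{base}/v{cpython_version}/{path}"

url = upstream_url("Lib/test/test_os.py", "3.13.6")
```

The actual tool might instead pin a commit hash or use the tarball API; this only shows how the TOML fields would feed the request.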
Regarding any interest
Some interest, yes. In my recent experiment with cross-python test compatibility checking, I used a value for the cpython version, and I feel there is value in testing the current versions of Python per the reference release cycle (e.g., versions 3.10-3.14 at the moment).
Further, my experiment is well positioned to help validate the status of this config, if that is even wanted 🤷
My interest here would be to apply these kinds of auto-patches to the failing reference CPython implementation before testing the patched version with the RustPython interpreter and Rust-python implementation.
For clarity, that would mean: CI running CPython tests, run with RustPython, using RustPython's own Lib, and attempting to auto-patch with this tool (and then reporting the results in GitHub's action logs).
This would lead to a feedback loop of:
- Developer makes changes (e.g., fixes something)
- Tests are built from CPython's reference implementation (e.g., a single source of compatibility tests)
- Lib and RustPython are built from RustPython/RustPython (of course)
- The test_marker tool applies known (configured) changes needed for RustPython
- Testing is done across environments in CI (my re-usable actions)
- (FUTURE work) CI updates the configs in tools/test_marker/confs/test_*.toml
- A report is generated regarding implementation compatibility (my re-usable actions)
- Developers can read the reports (assuming hypothetical interest)

lessons/gotchas from auto-patching
The only patching done by my experiment POC was to use scripts/fix_test.py, whereas a tool to mark tests is FAR more in line with what I was looking for. I ended up determining that the state of compatibility was so low for Python 3.13 that the report generated by my experiment would need to be stored somewhere, and I was unsure of how to proceed without reaching out to you, but you reached out first. 🎉
The knobs I implemented were:
- rust-python Lib path
- c-python Lib path
- runner os (version, and arch) from matrix
- c-python version
- rust-python version (by git ref, with a default of main)
- sub-shell (e.g., each test-suite process) time-out threshold (a default of 30 seconds seemed reasonable and kept the total test time down to a few hours)
- the rust-python cargo arguments
- include/exclude patterns (mostly didn't need this one once I added the time-out threshold feature)

Hints for AI
@coderabbitai, FYI: here are the proof-of-concept GHA components from my experiment:
@reactive-firewall, thank you for sharing your experimental work and insights! Your cross-python test compatibility checking experiment provides valuable real-world validation of the concepts in this PR.
Excellent Alignment and Enhancements
Your suggestion to add `cpython_version = "3.13.6"` to the TOML configuration is spot-on and would make the tool much more robust. This would enable:
- Automated fetching of reference tests from specific CPython versions
- Better tracking of which CPython version the markers are based on
- Version-specific compatibility reporting
Your CI/CD Workflow Vision
The feedback loop you've outlined is exactly what this tool needs to reach production readiness:
graph TD
A[Developer Changes] --> B[Build CPython Reference Tests]
B --> C[Apply test_marker patches]
C --> D[Test with RustPython]
D --> E[Generate Compatibility Report]
E --> F[Update TOML configs]
F --> A
Your experimental components provide a solid foundation for this workflow. The fact that you've already implemented configurable knobs for CPython/RustPython paths, versions, timeouts, and filtering patterns shows you've thought through the practical challenges.
Key Insights from Your Experiment
Your finding about Python 3.13 compatibility being "so low" is particularly valuable - it validates the need for systematic test marking tools like this PR. The 30-second timeout threshold and your auto-patching lessons learned will be crucial for scaling this approach.
Potential Synergies
Your reusable GHA components could provide the CI infrastructure that this test marker tool needs to be truly effective. The combination of:
- This PR's systematic test marking approach
- Your automated CPython test integration
- Your cross-environment testing matrix
- Your compatibility reporting
...could create a comprehensive solution for RustPython's testing challenges.
Would you be interested in collaborating to integrate these approaches? Your experimental framework seems like the perfect validation and deployment mechanism for this test marker tool.
🧠 Learnings used
Learnt from: CR
PR: RustPython/RustPython#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-07-19T03:16:56.511Z
Learning: Applies to Lib/test/**/*.py : When tests fail due to unsupported syntax, keep the test as `unittest.expectedFailure`, document that it requires PEP 695 support, and focus on tests that can be fixed through Rust code changes only
🤩 I especially like the `ast`-based approach to marking tests, and I think this is a good starting point for marking tests that only fail for RustPython.
See the remainder of my comments inline.
tools/test_marker/main.py
Outdated
COL_OFFSET = 4
INDENT1 = " " * COL_OFFSET
INDENT2 = " " * COL_OFFSET * 2
COMMENT = "TODO: RUSTPYTHON"
🤔 This would be nice to have an easy override for (CI may be able to add more details dynamically that way)
ofc, I have plans to make pretty much everything configurable via argparse
tools/test_marker/main.py
Outdated
COL_OFFSET = 4
INDENT1 = " " * COL_OFFSET
INDENT2 = " " * COL_OFFSET * 2
Consider the easy maintainability win:
- INDENT2 = " " * COL_OFFSET * 2
+ INDENT2 = INDENT1 * 2
@coderabbitai, please explain this suggestion for posterity
@reactive-firewall, this suggestion improves code maintainability in several ways:
- DRY Principle: instead of duplicating the space-creation logic (`" " * COL_OFFSET`), it establishes a clear relationship where `INDENT2` is explicitly twice `INDENT1`.
- Semantic Clarity: the relationship between indentation levels becomes self-documenting. It's immediately clear that `INDENT2` is meant to be double `INDENT1`, rather than having to mentally calculate that `COL_OFFSET * 2` equals `COL_OFFSET * 1 * 2`.
- Easier Maintenance: if you ever need to change the indentation character (spaces to tabs, though unlikely), modify the relationship between indentation levels, or add more indentation levels (e.g., `INDENT3 = INDENT1 * 3`), the suggested approach makes these changes more straightforward and less error-prone.
- Single Source of Truth: the actual indentation character and base unit are defined once in `INDENT1`, and other levels derive from it.
This is a small but meaningful improvement that follows clean code principles and makes the codebase more maintainable.
(◕‿◕)ノ彡 ♪
🧠 Learnings used
Learnt from: CR
PR: RustPython/RustPython#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-07-19T03:16:56.511Z
Learning: Applies to Lib/test/**/*.py : When tests fail due to unsupported syntax, keep the test as `unittest.expectedFailure`, document that it requires PEP 695 support, and focus on tests that can be fixed through Rust code changes only
tbh this should be removed and be discovered dynamically. I was just lazy with implementing this, but should probably invest a bit to do it
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
@coderabbitai, I concur with @ShaharNaveh that this should be dynamic; can you please explore the code in the rest of the new tool (e.g., the rest of tools/test_marker/main.py) and generate a helper function that takes a snippet of code for the target test functions (e.g., like how the patches are currently applied with ast and INDENT1/INDENT2) and computes the needed indentation dynamically? Be sure to consider SOLID principles and BCP for testing, and all project conventions when generating code.
🧩 Analysis chain
I'll explore the complete file to understand how indentation is currently used and create a dynamic solution.
Now I'll create a helper function that dynamically computes indentation based on the actual code structure:
def detect_indentation(source_lines: list[str], node: ast.AST | None = None) -> tuple[str, dict[int, str]]:
    """
    Dynamically detect indentation patterns from source code.

    Args:
        source_lines: List of source code lines
        node: Optional AST node to provide context for indentation level

    Returns:
        Tuple of (base_unit, level_map) where:
        - base_unit: The basic indentation unit (e.g., "    " for 4 spaces)
        - level_map: Mapping of nesting level to indentation string

    Follows SOLID principles:
    - Single Responsibility: Only handles indentation detection
    - Open/Closed: Extensible for different indentation styles
    - Interface Segregation: Focused interface for indentation needs
    """
    if not source_lines:
        # Default fallback to 4 spaces (Python convention)
        base_unit = "    "
        return base_unit, {0: "", 1: base_unit, 2: base_unit * 2}

    # Detect indentation pattern by analyzing non-empty, indented lines
    indentation_samples = []
    for line in source_lines:
        if line.strip() and line[0] in (' ', '\t'):
            # Count leading whitespace
            indent_count = 0
            for char in line:
                if char == ' ':
                    indent_count += 1
                elif char == '\t':
                    # Treat tab as 8 spaces (Python standard)
                    indent_count += 8
                else:
                    break
            if indent_count > 0:
                indentation_samples.append(indent_count)

    if not indentation_samples:
        # No indented lines found, use default
        base_unit = "    "
    else:
        # Find the greatest common divisor to determine base indentation unit
        from math import gcd
        base_count = indentation_samples[0]
        for sample in indentation_samples[1:]:
            base_count = gcd(base_count, sample)
        # Ensure minimum of 1 space and reasonable maximum
        base_count = max(1, min(base_count, 8))
        # Tabs (if any) were normalized to spaces above, so emit spaces either way
        base_unit = " " * base_count

    # Create level mapping for common nesting levels
    level_map = {}
    for level in range(5):  # Support up to 4 levels of nesting
        level_map[level] = base_unit * level

    return base_unit, level_map
def get_context_indentation(source_lines: list[str], target_line: int) -> tuple[str, str]:
"""
Get appropriate indentation for a specific context in the source code.
Args:
source_lines: List of source code lines
target_line: Line number where indentation context is needed (0-based)
Returns:
Tuple of (class_level_indent, method_level_indent) for the given context
This function follows the Single Responsibility Principle by focusing
solely on determining contextual indentation needs.
"""
base_unit, level_map = detect_indentation(source_lines)
if target_line < 0 or target_line >= len(source_lines):
# Default to class and method level indentation
return level_map[1], level_map[2]
# Analyze the context around the target line to determine nesting level
current_indent_level = 0
# Look backwards to find the current indentation context
for i in range(min(target_line, len(source_lines) - 1), -1, -1):
line = source_lines[i].rstrip()
if line.strip():
# Count leading whitespace to determine current level
leading_spaces = len(line) - len(line.lstrip())
if base_unit:
current_indent_level = leading_spaces // len(base_unit)
break
# Return appropriate indentation for decorators (class level) and method bodies
class_level_indent = level_map.get(max(1, current_indent_level), level_map[1])
method_level_indent = level_map.get(max(2, current_indent_level + 1), level_map[2])
return class_level_indent, method_level_indent
def create_indented_line(content: str, indent_level: str, line_type: str = "decorator") -> str:
"""
Create a properly indented line for insertion into source code.
Args:
content: The content to be indented (without leading/trailing whitespace)
indent_level: The indentation string to apply
line_type: Type of line being created ("decorator", "method", "statement")
Returns:
Properly formatted and indented line ready for insertion
Follows Open/Closed Principle: extensible for different line types
without modifying existing behavior.
"""
clean_content = content.strip()
# Apply specific formatting rules based on line type
if line_type == "decorator":
if not clean_content.startswith('@'):
clean_content = f"@{clean_content}"
elif line_type == "method":
if not clean_content.endswith(':'):
clean_content = f"{clean_content}:"
return f"{indent_level}{clean_content}"
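To illustrate the detection idea in isolation, here is a condensed, self-contained variant of the GCD-based unit detection (the helper name `detect_base_indent` is hypothetical; this is a sketch, not the full implementation):

```python
from math import gcd


def detect_base_indent(source_lines: list[str]) -> str:
    # Condensed illustration of the GCD idea: the greatest common divisor
    # of all leading-whitespace widths is the base indentation unit.
    samples = [
        len(line) - len(line.lstrip(" "))
        for line in source_lines
        if line.strip() and line.startswith(" ")
    ]
    if not samples:
        return "    "  # fall back to the 4-space Python convention
    base = samples[0]
    for s in samples[1:]:
        base = gcd(base, s)
    return " " * max(1, min(base, 8))


src = ["class T:", "    def f(self):", "        return 1"]
print(repr(detect_base_indent(src)))  # '    '
```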
@reactive-firewall, I've created a comprehensive dynamic indentation system that follows SOLID principles and best practices:
Key Features:
- `detect_indentation()` - Dynamically analyzes source code to determine the base indentation pattern (spaces vs tabs, indentation size)
- `get_context_indentation()` - Provides context-aware indentation by analyzing the surrounding code structure
- `create_indented_line()` - Creates properly formatted lines with appropriate indentation
SOLID Principles Applied:
- Single Responsibility: Each function has one clear purpose (detection, context analysis, line creation)
- Open/Closed: Extensible for new indentation styles and line types without modification
- Interface Segregation: Clean, focused interfaces for specific indentation needs
- Dependency Inversion: Functions work with abstractions (source lines) rather than concrete implementations
Best Practices for Testing:
- Functions return predictable data structures for easy testing
- Error handling with sensible fallbacks
- Support for edge cases (empty files, mixed indentation)
- Clear separation of concerns for unit testing
Usage Example:

```python
# Replace hardcoded INDENT1/INDENT2 with:
source_lines = contents.splitlines()
class_indent, method_indent = get_context_indentation(source_lines, target_line)
decorator_line = create_indented_line("@unittest.expectedFailure", class_indent, "decorator")
```
This approach makes the tool more robust and adaptable to different coding styles while maintaining clean, testable code architecture.
(◕‿◕)ノ彡 ♪
🏁 Script executed:
```bash
#!/bin/bash
# Get the complete content of the main.py file to understand the current implementation
cat tools/test_marker/main.py
```
Length of output: 5541
tools/test_marker/main.py
Outdated
```python
reason = conf.get("reason", "")
res = ""
match method:
```
@coderabbitai, FYI:
Documentation hint: `match` was introduced in Python 3.10?? (need to verify). This will prevent any support for Python 3.9 (but I'm fine with that; we should just make note of it).
🧩 Analysis chain
@reactive-firewall, let me verify when the `match` statement was introduced in Python.
🌐 Web query:
When was the match statement introduced in Python? What Python version first supported structural pattern matching?
💡 Result:
The match statement (structural pattern matching) was introduced in Python 3.10. Python 3.10.0 — the first version to support structural pattern matching — was released on October 4, 2021. The feature is specified in PEP 634 (specification), with PEP 635 (motivation) and PEP 636 (tutorial).
@reactive-firewall, you're absolutely correct! The `match` statement was indeed introduced in Python 3.10.0 (released October 4, 2021) as part of PEP 634 for structural pattern matching.

This means the code using `match` on line 46 will require Python 3.10+ as the minimum version, effectively dropping Python 3.9 support. Since you're fine with this requirement, I'd recommend documenting it in:
- The project's `README.md` or documentation
- `pyproject.toml` or `setup.py` with `python_requires = ">=3.10"`
- Any CI/CD configuration files
🏁 Script executed:
```bash
#!/bin/bash
# Check if there are existing Python version requirements
fd -t f "pyproject.toml|setup.py|requirements*.txt|.python-version" --exec cat {}
```
Length of output: 125
Let me check for other common places where Python version requirements might be documented:
🏁 Script executed:
```bash
#!/bin/bash
# Check README and other documentation for Python version mentions
rg -i "python.*[0-9]\.[0-9]" README.md docs/ --type md 2>/dev/null || echo "No version info in README/docs"

# Check GitHub Actions or CI configs for Python versions
fd -t f "\.yml$|\.yaml$" .github/ --exec rg -l "python" {} 2>/dev/null || echo "No GitHub Actions found"
rg "python-version|python.*[0-9]\.[0-9]" .github/ -A 2 -B 2 2>/dev/null || echo "No Python versions in CI"

# Check if there's a tox.ini or similar
fd -t f "tox.ini|.travis.yml|appveyor.yml" --exec cat {} 2>/dev/null || echo "No additional CI configs found"
```
Length of output: 3196
Let me check what the `PYTHON_VERSION` environment variable is set to in the CI files:
@reactive-firewall, I found some interesting information:
- Current Requirements: The README states `CPython >= 3.13.0` is required
- CI Configuration: Most CI jobs use `${{ env.PYTHON_VERSION }}` (need to check what this is set to)
- Inconsistency: There's one benchmark job that explicitly uses `python-version: 3.9`

The `match` statement requiring Python 3.10+ would be:
- ✅ Compatible with the stated `>= 3.13.0` requirement in the README
- ❌ Incompatible with the Python 3.9 benchmark job

This suggests the project might need to:
- Update the benchmark job to use Python 3.10 or higher
- Ensure the `PYTHON_VERSION` environment variable is set appropriately
- Consider whether the tool in `tools/test_marker/` has different requirements than the main RustPython project
🏁 Script executed:
```bash
#!/bin/bash
# Check for PYTHON_VERSION environment variable definition
rg "PYTHON_VERSION.*:" .github/ -A 2 -B 2
rg "env:" .github/ -A 5
```
Length of output: 7325
Perfect! Now I have the complete picture:
Current Project Requirements:
- Official target: Python 3.13.1 (set as `PYTHON_VERSION` in CI)
- README states: CPython >= 3.13.0
- One inconsistency: the benchmark job still uses Python 3.9 explicitly

Impact of the `match` statement (Python 3.10+ requirement):
- ✅ Fully compatible with the project's Python 3.13+ requirement
- ✅ No issue for the main project workflow
- ❌ Only affects the benchmark job using Python 3.9

Recommendation:
Since your project already requires Python 3.13+, the `match` statement in `tools/test_marker/main.py` is perfectly fine and doesn't introduce any new restrictions. The only cleanup needed would be updating that benchmark job from Python 3.9 to match the project's minimum requirement.

So you're absolutely right about needing Python 3.10+ for `match`, but it's actually a non-issue given your project's existing 3.13+ requirement! The documentation is already clear in the README.
🧠 Learnings used
Learnt from: CR
PR: RustPython/RustPython#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-07-19T03:16:56.511Z
Learning: Applies to Lib/test/**/*.py : When tests fail due to unsupported syntax, keep the test as `unittest.expectedFailure`, document that it requires PEP 695 support, and focus on tests that can be fixed through Rust code changes only
Python 3.9 is going EOL in 2025-10; I don't see a reason to keep compatibility with a version that is only receiving security updates ATM and will be deprecated in 2 months.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
tools/test_marker/main.py
Outdated
```python
COMMENT = "TODO: RUSTPYTHON"

ROOT_DIR = pathlib.Path(__file__).parents[2]
CONFS = ROOT_DIR / "tools" / "test_marker" / "confs"
```
🧹 NIT: Consider future use of `os.path.sep` (in place of `/`) to keep this portable for cross-platform use.
This uses `pathlib.Path`, which is already cross-platform :)
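As a quick illustration of that point (a standalone sketch, not code from the tool), `pathlib.Path`'s `/` operator joins components portably:

```python
import pathlib

# Path's "/" operator builds paths using the correct separator for the
# host OS, so no manual os.path.sep handling is needed.
root = pathlib.Path("tools")
confs = root / "test_marker" / "confs"

# Components are stored separately; rendering picks the OS separator.
print(confs.parts)  # ('tools', 'test_marker', 'confs')
```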
tools/test_marker/main.py
Outdated
```python
if attr_node.value.id != "unittest":
    return False
```
@ShaharNaveh The suggestion is a good idea, and probably safer than just the name check above.
the details aside, the feature of checking for patching is good (ignoring scope of "how")
@coderabbitai please generate a doc-test for your suggestion that includes both positive and negative checks for the behavior here.
tools/test_marker/main.py
Outdated
```python
cls_node.end_lineno,
f"""
{INDENT1}{patch}
{INDENT1}def {fn_name}(self):
```
🤔 I'd be interested to see if this can handle some of the crazier stuff like Lib/test/test_abc.py
The general approach seems fine to me (edge-cases aside).
also might need to handle type-annotations (not sure how ast handles parsing those (tree-sitter has some trouble with it still))
@coderabbitai what's your take on my line of thought here? did I miss something?
🧩 Analysis chain
@reactive-firewall, your analysis is spot-on! After examining the code and testing with complex scenarios, here are my findings:
Type Annotations: The AST approach handles type annotations beautifully. Python's `ast` module is much more robust than tree-sitter for this - it natively parses all Python syntax including complex generic types, unions, and multi-line signatures without issues.
Edge Cases Found: Looking at the current implementation, there are a few potential issues:
1. Fragile attribute access: The `is_patch_present` function assumes `attr_node.value.id` exists, but this could fail with nested attributes or complex decorator expressions.
2. Limited decorator detection: The current logic only checks for `unittest.*` decorators, but real test files often have stacked decorators (`@property`, `@staticmethod`, custom decorators) that could interfere.
3. Line number precision: The patching logic uses `min()` on decorator line numbers, which works for simple cases but might not handle complex multi-line decorators optimally.
Regarding test_abc.py complexity: After examining it, the main complexity comes from:
- Heavy use of metaclasses and dynamic class creation
- Multiple inheritance patterns
- Abstract base class registrations
- Complex nested class structures
The current AST-based approach should handle most of this well, though the dynamic class creation aspects wouldn't be detectable at static analysis time (which is expected and probably fine).
Recommendations:
- Add more robust error handling in `is_patch_present`
- Test against more complex real-world test files
- Consider handling edge cases like deeply nested decorators or non-standard decorator patterns
The general approach is solid - AST parsing is the right choice here over regex or text-based solutions.
🏁 Script executed:
```bash
#!/bin/bash
# First, let's look at the structure and complexity of test_abc.py
curl -s "https://raw.githubusercontent.com/python/cpython/main/Lib/test/test_abc.py" | head -50
```
Length of output: 1831
🏁 Script executed:
```bash
#!/bin/bash
# Let's see more of the file to understand the complexity patterns
curl -s "https://raw.githubusercontent.com/python/cpython/main/Lib/test/test_abc.py" | grep -E "(class|def |@)" | head -20
```
Length of output: 896
🏁 Script executed:
```bash
#!/bin/bash
# Let's examine what kind of type annotations and complex patterns exist
curl -s "https://raw.githubusercontent.com/python/cpython/main/Lib/test/test_abc.py" | grep -E "(->|:|typing|Union|Optional)" | head -15
```
Length of output: 784
🏁 Script executed:
```bash
#!/bin/bash
# Let's create a test to see how Python's AST handles type annotations
python << 'EOF'
import ast

# Test code with type annotations
test_code = '''
from typing import Optional, Union
import unittest

class TestClass(unittest.TestCase):
    def test_with_annotations(self, param: int) -> bool:
        return True

    @unittest.expectedFailure
    def test_decorated_with_annotations(self, x: Optional[str] = None) -> Union[str, None]:
        return x

    def complex_method(self,
                       a: dict[str, int],
                       b: Optional[list[tuple[str, ...]]]) -> None:
        pass
'''

# Parse and analyze
tree = ast.parse(test_code)
print("AST parsing successful with type annotations")
print("\nFound classes and methods:")
for node in ast.walk(tree):
    if isinstance(node, ast.ClassDef):
        print(f"Class: {node.name}")
        for method in node.body:
            if isinstance(method, ast.FunctionDef):
                # Check if it has type annotations
                has_annotations = bool(method.returns or any(arg.annotation for arg in method.args.args))
                decorators = [ast.unparse(dec) for dec in method.decorator_list]
                print(f"  Method: {method.name}, has_annotations: {has_annotations}, decorators: {decorators}")
EOF
```
Length of output: 340
🏁 Script executed:
```bash
#!/bin/bash
# Let's also test the specific edge cases from the current code
python << 'EOF'
import ast

# Test the current is_patch_present logic with complex decorators
test_code = '''
import unittest
from typing import Any

class ComplexTest(unittest.TestCase):
    @unittest.expectedFailure
    def simple_decorated(self): pass

    @unittest.skipIf(sys.platform == "win32", "Windows specific")
    def conditional_skip(self): pass

    @property
    @unittest.expectedFailure
    def property_with_test_decorator(self): pass

    @unittest.expectedFailure
    @staticmethod
    def static_test_method(): pass

    def method_with_complex_signature(
        self,
        a: dict[str, list[tuple[int, ...]]],
        b: Optional[Union[str, bytes]] = None
    ) -> Generator[tuple[str, int], None, None]:
        pass
'''

tree = ast.parse(test_code)

# Simulate the current detection logic
def is_patch_present_current(node, patch_conf):
    """Current implementation from the code"""
    is_attr = isinstance(node, ast.Attribute)
    attr_node = node if is_attr else node.func
    if isinstance(attr_node, ast.Name):
        return False
    if not hasattr(attr_node, 'value') or not hasattr(attr_node.value, 'id'):
        return False
    if attr_node.value.id != "unittest":
        return False
    if is_attr:
        return node.attr in patch_conf
    return "RUSTPYTHON" in ast.unparse(node)

patch_conf = {"expectedFailure": {}}

print("Testing decorator detection:")
for node in ast.walk(tree):
    if isinstance(node, ast.FunctionDef):
        print(f"\nMethod: {node.name}")
        for dec in node.decorator_list:
            if isinstance(dec, (ast.Attribute, ast.Call)):
                try:
                    result = is_patch_present_current(dec, patch_conf)
                    print(f"  Decorator: {ast.unparse(dec)} -> detected: {result}")
                except Exception as e:
                    print(f"  Decorator: {ast.unparse(dec)} -> ERROR: {e}")
EOF
```
Length of output: 455
@reactive-firewall, excellent instincts! After running tests against the actual code, here's what I found:
Your concerns are well-founded:
1. Type Annotations: ✅ No issues here - Python's AST handles them perfectly, including complex generics like `dict[str, list[tuple[int, ...]]]`. This is where AST shines over tree-sitter.
2. Complex Edge Cases:
From testing the current `is_patch_present` logic:
- ✅ Correctly detects `@unittest.expectedFailure`
- ❌ Fails to detect existing `@unittest.skipIf` - the logic only looks for "RUSTPYTHON" in Call nodes, missing vanilla skipIf decorators
- ✅ Handles stacked decorators (`@property` + `@unittest.expectedFailure`) correctly
- ✅ No crashes with complex signatures
3. test_abc.py Complexity: The real challenges are:
```python
def test_factory(abc_ABCMeta, abc_get_cache_token):
    class TestLegacyAPI(unittest.TestCase):  # Dynamic class creation
        ...
    # Nested classes with metaclasses
    class C(metaclass=abc_ABCMeta):
        ...
```
Key Issues Found:
- Inconsistent detection logic: The skipIf branch looks for "RUSTPYTHON" marker but should check if the decorator type matches the patch config
- Missing error handling: No protection against malformed decorators or attribute access failures
- Dynamic class limitations: Static AST can't handle the factory pattern in test_abc.py (though this might be acceptable)
Suggested improvements:
```python
# Fix the detection logic
if is_attr:
    return node.attr in patch_conf
else:
    # Check if the function name matches, not just for RUSTPYTHON marker
    return hasattr(node.func, 'attr') and node.func.attr in patch_conf
```
The core approach is solid, but it needs these robustness improvements before handling complex real-world test files. Good catch on anticipating these edge cases!
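Picking up the earlier doc-test request, here is a minimal positive/negative check of this kind of detection logic; `decorator_matches` is a hypothetical helper sketching the suggested fix, not the tool's actual function:

```python
import ast


def decorator_matches(node: ast.expr, patch_conf: dict) -> bool:
    """Return True if a decorator node refers to a unittest marker in patch_conf.

    Handles both bare attributes (@unittest.expectedFailure) and calls
    (@unittest.skipIf(...)). Hypothetical sketch of the suggested fix.
    """
    target = node.func if isinstance(node, ast.Call) else node
    if not (
        isinstance(target, ast.Attribute)
        and isinstance(target.value, ast.Name)
        and target.value.id == "unittest"
    ):
        return False
    return target.attr in patch_conf


src = """
import unittest

class T(unittest.TestCase):
    @unittest.expectedFailure
    def test_a(self): pass

    @unittest.skipIf(True, "why")
    def test_b(self): pass

    @staticmethod
    def helper(): pass
"""
tree = ast.parse(src)
conf = {"expectedFailure": {}, "skipIf": {}}
found = {}
for node in ast.walk(tree):
    if isinstance(node, ast.FunctionDef):
        # Positive cases: unittest.* markers; negative case: @staticmethod.
        found[node.name] = any(decorator_matches(d, conf) for d in node.decorator_list)

print(found)  # {'test_a': True, 'test_b': True, 'helper': False}
```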
This tool (in its current state) is aimed at handling 90%+ of the cases; if we still have a handful of files that need to be manually updated, I think it's still a win.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
I concur, this tool (including its current state) is totally a progress win! This was just nitpicking 🙊 about how to approach the future ~10% edge-cases.
This is not a show stopper.
tools/test_marker/main.py
Outdated
```python
lines = contents.splitlines()
tree = ast.parse(contents)

# Going in reverse to not disrupt the lines offset
```
🧹 In the future it would be helpful to have `logger.debug()` output with before-after (plus some context lines) of the altered code. But assuming the list mutations are safe... that would be out-of-scope for a POC, of course.

@coderabbitai, please analyze this function (e.g., around lines 172-197) looking for possible improvements that could ensure any worries about list modifications here are mitigated.
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# Get the complete apply_conf function implementation
ast-grep --pattern $'def apply_conf($$$) -> str:
$$$'
```
Length of output: 1426
@reactive-firewall, I've analyzed the `apply_conf` function (lines 172-196) and identified several potential improvements to ensure list modification safety:
Current Implementation Analysis
The function currently:
- Splits content into lines
- Collects all patches into a list
- Processes patches in reverse order by line number
- Uses `lines.insert(lineno, patch)` to modify the list
Potential Safety Issues & Improvements
1. Input Validation
```python
def apply_conf(contents: str, conf: dict) -> str:
    lines = contents.splitlines()
    tree = ast.parse(contents)
    patches = list(iter_patches(tree, conf))

    # Validate line numbers before processing
    max_line = len(lines)
    valid_patches = [
        (lineno, patch) for lineno, patch in patches
        if 0 <= lineno <= max_line
    ]

    for lineno, patch in sorted(valid_patches, reverse=True):
        lines.insert(lineno, patch)

    return "\n".join(lines)
```
2. Handle Duplicate Line Numbers
```python
# Group patches by line number to handle duplicates safely
from collections import defaultdict

patches_by_line = defaultdict(list)
for lineno, patch in patches:
    patches_by_line[lineno].append(patch)

# Process in reverse order, inserting multiple patches at same line
for lineno in sorted(patches_by_line.keys(), reverse=True):
    for patch in reversed(patches_by_line[lineno]):  # Maintain order
        if 0 <= lineno <= len(lines):
            lines.insert(lineno, patch)
```
3. Immutable Approach (Safer)
```python
def apply_conf(contents: str, conf: dict) -> str:
    original_lines = contents.splitlines()
    tree = ast.parse(contents)
    patches = list(iter_patches(tree, conf))

    # Build insertion map
    insertions = defaultdict(list)
    for lineno, patch in patches:
        if 0 <= lineno <= len(original_lines):
            insertions[lineno].append(patch)

    # Build new lines list without mutating original
    result_lines = []
    for i, line in enumerate(original_lines):
        # Insert patches before current line
        result_lines.extend(insertions.get(i, []))
        result_lines.append(line)

    # Handle patches at end of file
    result_lines.extend(insertions.get(len(original_lines), []))

    return "\n".join(result_lines)
```
4. With Debug Logging (As You Suggested)
```python
import logging


def apply_conf(contents: str, conf: dict) -> str:
    logger = logging.getLogger(__name__)
    lines = contents.splitlines()
    tree = ast.parse(contents)
    patches = list(iter_patches(tree, conf))

    logger.debug(f"Applying {len(patches)} patches")
    for lineno, patch in sorted(patches, reverse=True):
        if 0 <= lineno <= len(lines):
            # Log before/after context
            start_ctx = max(0, lineno - 2)
            end_ctx = min(len(lines), lineno + 3)
            logger.debug(f"Before patch at line {lineno}:")
            for i in range(start_ctx, end_ctx):
                marker = " > " if i == lineno else "   "
                logger.debug(f"{marker}{i:3d}: {lines[i] if i < len(lines) else '<EOF>'}")

            lines.insert(lineno, patch)

            logger.debug(f"After patch:")
            for i in range(start_ctx, min(len(lines), end_ctx + 1)):
                marker = " + " if i == lineno else "   "
                logger.debug(f"{marker}{i:3d}: {lines[i]}")

    return "\n".join(lines)
```
Recommendation
For the POC, the current implementation is reasonably safe, but I'd recommend option 3 (immutable approach) as it eliminates list mutation concerns entirely while being easy to understand and debug. The logging enhancement you mentioned would be valuable for troubleshooting patch applications.
tools/test_marker/main.py
Outdated
@ShaharNaveh, Overall LGTM! With the caveat that, as a POC, it has lots of potential for improvements.
Most of my inline comments are off-the-cuff; I recognize this is a proof of concept and I like the concept, so feel free to dismiss any comments that are off-topic without response.
bfb3839 to a6f2324 (Compare)
@youknowone can you please review this PR?
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tools/rpau/src/rpau/__init__.py (1)
`49-76`: Surface worker exceptions; don't fire-and-forget ProcessPool tasks

Currently, failures in `run()` won't be reported because the futures aren't awaited. This can silently produce incomplete or stale outputs.

Apply this diff:

```diff
-    with ProcessPoolExecutor(args.workers) as executor:
-        for conf_file in iter_confs(
+    with ProcessPoolExecutor(args.workers) as executor:
+        futures = []
+        for conf_file in iter_confs(
             conf_dir, include=args.include, exclude=args.exclude
         ):
@@
-            executor.submit(func)
+            futures.append(executor.submit(func))
+
+    # Propagate errors from workers (and ensure completion)
+    for fut in futures:
+        fut.result()
```
🧹 Nitpick comments (26)
tools/rpau/README.md (1)
1-1
: Prefer a standard H1 and add a one-line synopsis

The stylized bold letters look clever but make the title harder to scan and search. A plain H1 plus a short description improves readability and discoverability.
Apply this diff:
```diff
-# **R**ust**P**ython **A**utomatic **U**pdater (rpau)
+# RustPython Automatic Updater (rpau)
+
+Tooling to fetch upstream CPython sources and apply configuration-driven patches/markings for RustPython. See `rpau --help` for usage.
```

tools/rpau/confs/Lib/os.toml (1)
1-1
: Optionally pin the CPython tag used for this importRight now this relies on the CLI default (
--default-version
, currently v3.13.7). If the intention is to track a specific upstream for reproducibility, pin it here. The PR text mentions 3.13.6 for test updates; ifLib/os.py
should align, consider pinning accordingly.If you decide to pin:
path = "Lib/os.py" +version = "v3.13.6" # or remove to use the CLI default
tools/rpau/confs/Lib/test/test_os.toml (6)
1-1
: Pin upstream version to match PR description (3.13.6)The CLI default is v3.13.7, but the PR description says test_os.py was updated from 3.13.6. Pinning here avoids accidental drift and makes the generated file header deterministic.
Apply this diff:
+version = "v3.13.6" path = "Lib/test/test_os.py"
3-3
: Remove trailing whitespace in section headerMinor style cleanup; avoids noisy diffs if tooling trims whitespace later.
Apply this diff:
-[FileTests] +[FileTests]
8-9
: Tighten “reason” messages (fix stray parentheses, clarify wording)The current texts look like partial exception messages. Cleaning them up makes the intent clearer without changing behavior.
Apply this diff:
-test_access_denied = { expectedFailureIfWindows = { reason = "os.stat (PermissionError: [Errno 5] Access is denied" } } -test_stat_block_device = { expectedFailureIfWindows = { reason = "os.stat (PermissionError: [Errno 1] Incorrect function" } } +test_access_denied = { expectedFailureIfWindows = { reason = "PermissionError: [Errno 5] Access is denied by os.stat" } } +test_stat_block_device = { expectedFailureIfWindows = { reason = "PermissionError: [Errno 1] Incorrect function (os.stat on Windows)" } }
13-13
: Fix unmatched parenthesis in reasonCosmetic, but avoids confusion.
Apply this diff:
-test_utime_directory = { expectedFailureIfWindows = { reason = "(AssertionError: 2.002003 != 1.002003 within 1e-06 delta (1.0000000000000002 difference)" } } +test_utime_directory = { expectedFailureIfWindows = { reason = "AssertionError: 2.002003 != 1.002003 within 1e-06 delta (1.0000000000000002 difference)" } }
33-34
: Reason strings seem incorrect (“No module named 'os'”)These tests fail for feature gaps, not because
os
is missing. Update the reasons to reflect the real limitation so the markers remain meaningful.Apply this diff:
-test_urandom_fd_closed = { expectedFailureIfWindows = { reason = "ModuleNotFoundError: No module named 'os'" } } -test_urandom_fd_reopened = { expectedFailureIfWindows = { reason = "ModuleNotFoundError: No module named 'os'" } } +test_urandom_fd_closed = { expectedFailureIfWindows = { reason = "os.urandom FD-based behavior not supported on Windows in RustPython yet" } } +test_urandom_fd_reopened = { expectedFailureIfWindows = { reason = "os.urandom FD-based behavior not supported on Windows in RustPython yet" } }
78-80
: Windows-only duplication: prefer expectedFailureIfWindows or skipIf consistently
dup
tests are explicitly skipped on Windows while related inheritable tests are marked xfail-if-windows. Consider making these consistent to reduce cognitive load (e.g., all Windows-specific limitations useexpectedFailureIfWindows
unless the test is fundamentally inapplicable).Would you like me to normalize these choices across all Windows cases in this file?
tools/rpau/src/rpau/logger.py (2)
7-14
: Accept both int and str for level (matches argparse input)
build_root_logger
is called withargs.log_level
, which is a string (e.g., "INFO"). Relax the type toint | str
to reflect actual usage.Apply this diff:
-def build_root_logger( - name: str = _NAME, level: int = logging.WARNING +def build_root_logger( + name: str = _NAME, level: int | str = logging.WARNING ) -> logging.Logger:
16-21: Prevent double logging by stopping propagation

If the root logger also has handlers (common in CI or when embedded), child logs will be duplicated. Set `propagate = False` on the rpau logger after attaching the stream handler.

Apply this diff:

     sh = logging.StreamHandler(sys.stdout)
     sh.setFormatter(formatter)
     logger.handlers.clear()
     logger.addHandler(sh)
+    logger.propagate = False
tools/rpau/src/rpau/__init__.py (4)
20-21: Use simpler/idiomatic pattern for rglob

`Path.rglob('*.toml')` already searches recursively. Using `'**/*.toml'` is redundant.

Apply this diff:

-    for conf_file in conf_dir.rglob("**/*.toml"):
+    for conf_file in conf_dir.rglob("*.toml"):
24-29: Match include/exclude against OS paths and use search() instead of match()

`as_uri().removeprefix("file://")` produces odd leading slashes and is platform-sensitive. `re.match` anchors at the start; `search` is friendlier for substring patterns.

Apply this diff:

-        uri = conf_file.as_uri().removeprefix("file://")
-        if not include.match(uri):
+        uri = str(conf_file.resolve())
+        if not include.search(uri):
             continue
-        if exclude.match(uri):
+        if exclude.search(uri):
             continue
56-57: Use logger.warning() (logger.warn is deprecated)

Modern logging deprecates `warn`. Use `warning` for compatibility.

Apply this diff:

-        logger.warn(f"{conf_file}: {err}")
+        logger.warning(f"{conf_file}: {err}")
62-63: Use logger.warning() (logger.warn is deprecated)

Same as above.

Apply this diff:

-        logger.warn(f"{conf_file}: has no 'path' key. skipping")
+        logger.warning(f"{conf_file}: has no 'path' key. skipping")

tools/rpau/src/rpau/cli.py (4)
10-17: Avoid leaking env var values in help output

Printing the current value of each environment variable in `--help` can leak sensitive data and creates noisy help text. Showing the env var name is enough.

-        if action.dest != "help":
-            env_name = f"{self._prog}_{action.dest}".upper()
-            env_value = os.environ.get(env_name, "")
-            help_msg += f" [env: {env_name}={env_value}]"
+        if action.dest != "help":
+            env_name = f"{self._prog}_{action.dest}".upper()
+            suffix = f" [env: {env_name}{' set' if env_name in os.environ else ' unset'}]"
+            help_msg += suffix
69-70: Pluralize the group title

Minor UX nit: make it "Output options" for consistency with other groups.

-    output_group = parser.add_argument_group("Output option")
+    output_group = parser.add_argument_group("Output options")
74-78: Default output dir depends on this file's location; consider a safer baseline

Using `Path(__file__).parents[4]` assumes the repo layout and breaks if the module moves. Safer defaults are `Path.cwd()` or the rpau project root (e.g., `Path(__file__).parents[2]`). If you keep this, please document the assumption in the help text.
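If a dynamic default is still desired, one hedged alternative (a sketch only; using `pyproject.toml` as the root marker is an assumption, not something the tool does today) is to walk upwards until a marker file is found instead of hard-coding a parent depth:

```python
from pathlib import Path

def find_repo_root(start: Path, marker: str = "pyproject.toml") -> Path:
    # Walk upwards from `start` until a directory containing `marker` is found;
    # fall back to the current working directory instead of assuming a fixed depth.
    for candidate in (start, *start.parents):
        if (candidate / marker).exists():
            return candidate
    return Path.cwd()
```

This keeps the default meaningful even if the module is moved or installed outside the source tree.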
122-126: Double-check the raw GitHub base URL

Raw content URLs generally use `https://raw.githubusercontent.com/<org>/<repo>/<tag-or-branch>/<path>`. The `refs/tags` segment is typically not part of raw URLs and may 404 when joined as `.../refs/tags/<version>/<path>`. Consider changing the default to `https://raw.githubusercontent.com/python/cpython`.

Proposed tweak:

-    default="https://raw.githubusercontent.com/python/cpython/refs/tags",
+    default="https://raw.githubusercontent.com/python/cpython",

And ensure `fetch_upstream()` joins as `/{version}/{path}`.

tools/rpau/src/rpau/logic.py (8)
20-27: Harden network fetches (user agent, robust errors)

`urllib.request.urlopen` without a User-Agent can be throttled by some endpoints, and network failures are unhandled. Consider adding a UA and catching `URLError`/`HTTPError` with a clear message.

-    with urllib.request.urlopen(upstream_url) as f:
-        contents = f.read().decode()
+    req = urllib.request.Request(
+        upstream_url,
+        headers={"User-Agent": "rpau/0.1 (+https://github.com/RustPython/RustPython)"},
+    )
+    try:
+        with urllib.request.urlopen(req) as f:
+            contents = f.read().decode()
+    except Exception as exc:  # URLError, HTTPError
+        raise RuntimeError(f"Failed to fetch {upstream_url}: {exc}") from exc
120-127: Replace stray triple-quoted "comment" with real comment

This block is a no-op string literal inside a function and can confuse readers and tools. Convert to a proper comment (or remove).

-    '''
-    if any(
-        is_patch_present(dec_node, patch_conf)
-        for dec_node in fn_node.decorator_list
-        if isinstance(dec_node, (ast.Attribute, ast.Call))
-    ):
-        continue
-    '''
+    # TODO: If/when we support idempotent patching, detect existing
+    # decorators before inserting to avoid duplication.
134-137: Guard against unknown patch kinds

If `format_patch()` returns an empty string (unknown method), we'll insert a blank line. Consider asserting or skipping unknown patch kinds to avoid silent no-ops.

-    patch = format_patch(patch_conf)
-    yield (lineno - 1, f"{indent}{patch}")
+    patch = format_patch(patch_conf)
+    if not patch:
+        raise ValueError(f"Unknown patch spec: {patch_conf!r}")
+    yield (lineno - 1, f"{indent}{patch}")
138-160: Method injection indentation assumes top-level classes

Using fixed `INDENT1`/`INDENT2` (4 spaces) will misalign for nested classes or atypical indentation. You can compute indentation from `cls_node.col_offset` to keep formatting consistent.

For example:

-            yield (
-                cls_node.end_lineno,
-                f"""
-{INDENT1}{patch}
-{INDENT1}def {fn_name}(self):
-{INDENT2}return super().{fn_name}()
-""".rstrip(),
-            )
+            indent_cls = " " * (cls_node.col_offset + COL_OFFSET)
+            indent_body = indent_cls + (" " * COL_OFFSET)
+            body = (
+                f"\n{indent_cls}{patch}\n"
+                f"{indent_cls}def {fn_name}(self):\n"
+                f"{indent_body}return super().{fn_name}()"
+            )
+            yield (cls_node.end_lineno, body)
161-171: Align type hints: `conf` should be `Conf`

The annotation says `dict` while the module defines `Conf`. Keep it consistent.

-def apply_conf(contents: str, conf: dict) -> str:
+def apply_conf(contents: str, conf: Conf) -> str:
180-186: Minor: spelling and clarity in comment

"distrupt" → "disrupt", and the comment can be clearer.

-    # Going in reverse to not distrupt the line offset
+    # Insert in reverse order to avoid disrupting subsequent line offsets
199-205: Style and encoding: fix spacing and write with UTF-8

PEP 8 spacing around `=`, and it's safer to specify encoding when writing files.

-    patched_contents=apply_conf(contents, conf)
+    patched_contents = apply_conf(contents, conf)
@@
-    output_file.write_text(f"{new_contents}\n")
+    output_file.write_text(f"{new_contents}\n", encoding="utf-8")
21-27: Verify upstream URL composition

If the CLI keeps using `.../refs/tags` in `--base-upstream-url`, `fetch_upstream()` will build URLs like `.../refs/tags/<version>/<path>`. Raw GitHub typically expects `.../<version>/<path>` without `refs/tags`. Please confirm and adjust either the default base URL or the join logic here.

Also applies to: 122-126
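The join in question can be made robust to either base URL form by normalizing stray slashes before joining; a small sketch (the function name is illustrative, not part of the tool):

```python
def build_raw_url(base_url: str, version: str, path: str) -> str:
    # Strip surrounding slashes so the three parts join cleanly regardless of
    # how the base URL, version tag, or path were written in the CLI/config.
    return "/".join((base_url.rstrip("/"), version.strip("/"), path.lstrip("/")))
```

With this shape, the base URL decision reduces to whether `refs/tags` appears in the default, while the join itself never produces doubled or missing slashes.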
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (12)
- Lib/argparse.py is excluded by `!Lib/**`
- Lib/ast.py is excluded by `!Lib/**`
- Lib/decimal.py is excluded by `!Lib/**`
- Lib/heapq.py is excluded by `!Lib/**`
- Lib/ipaddress.py is excluded by `!Lib/**`
- Lib/numbers.py is excluded by `!Lib/**`
- Lib/os.py is excluded by `!Lib/**`
- Lib/pprint.py is excluded by `!Lib/**`
- Lib/queue.py is excluded by `!Lib/**`
- Lib/test/test_os.py is excluded by `!Lib/**`
- Lib/this.py is excluded by `!Lib/**`
- tools/rpau/uv.lock is excluded by `!**/*.lock`
📒 Files selected for processing (17)
- tools/rpau/README.md (1 hunks)
- tools/rpau/confs/Lib/argparse.toml (1 hunks)
- tools/rpau/confs/Lib/ast.toml (1 hunks)
- tools/rpau/confs/Lib/decimal.toml (1 hunks)
- tools/rpau/confs/Lib/heapq.toml (1 hunks)
- tools/rpau/confs/Lib/ipaddress.toml (1 hunks)
- tools/rpau/confs/Lib/numbers.toml (1 hunks)
- tools/rpau/confs/Lib/os.toml (1 hunks)
- tools/rpau/confs/Lib/pprint.toml (1 hunks)
- tools/rpau/confs/Lib/queue.toml (1 hunks)
- tools/rpau/confs/Lib/test/test_os.toml (1 hunks)
- tools/rpau/confs/Lib/this.toml (1 hunks)
- tools/rpau/pyproject.toml (1 hunks)
- tools/rpau/src/rpau/__init__.py (1 hunks)
- tools/rpau/src/rpau/cli.py (1 hunks)
- tools/rpau/src/rpau/logger.py (1 hunks)
- tools/rpau/src/rpau/logic.py (1 hunks)
✅ Files skipped from review due to trivial changes (10)
- tools/rpau/confs/Lib/ast.toml
- tools/rpau/confs/Lib/pprint.toml
- tools/rpau/confs/Lib/numbers.toml
- tools/rpau/confs/Lib/decimal.toml
- tools/rpau/confs/Lib/ipaddress.toml
- tools/rpau/confs/Lib/argparse.toml
- tools/rpau/pyproject.toml
- tools/rpau/confs/Lib/this.toml
- tools/rpau/confs/Lib/queue.toml
- tools/rpau/confs/Lib/heapq.toml
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.py
: Follow PEP 8 style for custom Python code
Use ruff for linting Python code
Files:
tools/rpau/src/rpau/logger.py
tools/rpau/src/rpau/logic.py
tools/rpau/src/rpau/__init__.py
tools/rpau/src/rpau/cli.py
🧬 Code graph analysis (2)
tools/rpau/src/rpau/logic.py (1)
- tools/rpau/src/rpau/logger.py (1): get_logger (24-25)

tools/rpau/src/rpau/__init__.py (3)
- tools/rpau/src/rpau/cli.py (1): build_argparse (46-134)
- tools/rpau/src/rpau/logger.py (2): build_root_logger (7-21), get_logger (24-25)
- tools/rpau/src/rpau/logic.py (1): run (187-205)
🪛 Flake8 (7.2.0)
tools/rpau/src/rpau/logic.py
[error] 15-15: SyntaxError: invalid syntax
(E999)
tools/rpau/src/rpau/__init__.py
[error] 8-8: 'rpau.logger.get_logger' imported but unused
(F401)
🪛 LanguageTool
tools/rpau/README.md
[grammar] ~1-~1: There could be an error here.
Context: ...**ython Automatic Updater (rpau)
(QB_NEW_DE)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Check the WASM package and demo
- GitHub Check: Run rust tests (windows-latest)
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Run rust tests (ubuntu-latest)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Check Rust code with rustfmt and clippy
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
🔇 Additional comments (1)
tools/rpau/src/rpau/cli.py (1)
129-132: Confirm the default CPython version

Defaulting to `v3.13.7` is fine if that's the intended baseline. Given the repo also mentions 3.13.6 in the PR description, please confirm which tag we want as default for now.
from rpau.cli import build_argparse
from rpau.logger import build_root_logger, get_logger
from rpau.logic import run
Remove unused import to satisfy ruff/flake8 (F401)

`get_logger` is imported but not used.
Apply this diff:
-from rpau.logger import build_root_logger, get_logger
+from rpau.logger import build_root_logger
🧰 Tools
🪛 Flake8 (7.2.0)
[error] 8-8: 'rpau.logger.get_logger' imported but unused
(F401)
🤖 Prompt for AI Agents
In tools/rpau/src/rpau/__init__.py around lines 7 to 9, the get_logger symbol is
imported but never used which triggers an F401 unused-import lint error; remove
the unused import by deleting get_logger from the import list (leave
build_argparse, build_root_logger, run intact) so the file only imports the
symbols that are actually used.
def _add_action(self, action):
    action.default = os.environ.get(
        f"{self.prog}_{action.dest}".upper(), action.default
    )
    return super()._add_action(action)
🛠️ Refactor suggestion
Env overrides ignore types (Path/int/bool), causing subtle bugs

Overriding defaults from env injects raw strings for every action. This breaks typed options like `--cache` (BooleanOptionalAction), `--cache-dir` (Path), `--workers` (int), and regex patterns (type=re.compile). Also, invalid values for `--log-level` won't be validated. Coerce env strings using the action's `type` (or a bool parser for BooleanOptionalAction) and validate against `choices`.
- def _add_action(self, action):
- action.default = os.environ.get(
- f"{self.prog}_{action.dest}".upper(), action.default
- )
- return super()._add_action(action)
+ def _add_action(self, action):
+ env_name = f"{self.prog}_{action.dest}".upper()
+ env_val = os.environ.get(env_name, None)
+ if env_val is not None:
+ # Coerce env string to the expected type
+ if isinstance(action, argparse.BooleanOptionalAction):
+ val = env_val.lower() in {"1", "true", "yes", "on"}
+ elif action.type is not None:
+ val = action.type(env_val)
+ else:
+ val = env_val
+ # Validate against choices if provided
+ if action.choices and val not in action.choices:
+ raise argparse.ArgumentError(
+ action, f"{env_name} must be one of {list(action.choices)}"
+ )
+ action.default = val
+ return super()._add_action(action)
If you prefer not to use `isinstance(..., argparse.BooleanOptionalAction)`, I can inline a tiny `_strtobool()` helper and call it whenever `action.option_strings` contains both the `--foo` and `--no-foo` forms.
type Patch = dict[str, dict[str, str]]
type Conf = dict[str, Patch]
PEP 695 `type` aliases break on Python <3.12 (Flake8 E999)

`type Patch = ...` and `type Conf = ...` require Python 3.12+. The CI error (E999) indicates the linters/interpreter parsing this file don't support it. Use traditional aliases to restore compatibility.
-type Patch = dict[str, dict[str, str]]
-type Conf = dict[str, Patch]
+Patch = dict[str, dict[str, str]]
+Conf = dict[str, Patch]
🧰 Tools
🪛 Flake8 (7.2.0)
[error] 15-15: SyntaxError: invalid syntax
(E999)
🤖 Prompt for AI Agents
In tools/rpau/src/rpau/logic.py around lines 15-16, the new `type` alias syntax
requires Python 3.12 and causes Flake8 E999; replace the PEP 695 declarations
with traditional assignment aliases, e.g. assign Patch to the dict[...] type and
Conf to the dict[...] type instead of using `type Patch = ...` and `type Conf =
...`, so the file parses on older Python versions and CI linters.
def format_patch(patch_conf: Patch) -> str:
    """
    Transforms a patch definition to a raw python code.

    Parameters
    ----------
    patch_conf : Patch
        Conf of the patch.

    Returns
    -------
    str
        Raw python source code.

    Examples
    --------
    >>> patch = {"expectedFailure": {"reason": "lorem ipsum"}}
    >>> format_patch(patch)
    '@unittest.expectedFailure # TODO: RUSTPYTHON; lorem ipsum'
    """
    method, conf = next(iter(patch_conf.items()))
    prefix = f"@unittest.{method}"

    reason = conf.get("reason", "")
    res = ""
    match method:
        case "expectedFailure":
            res = f"{prefix} # {COMMENT}; {reason}"
        case "expectedFailureIfWindows" | "skip":
            res = f'{prefix}("{COMMENT}; {reason}")'
        case "skipIf":
            cond = conf["cond"]
            res = f'{prefix}({cond}, "{COMMENT}; {reason}")'

    return res.strip().rstrip(";").strip()
🛠️ Refactor suggestion
`expectedFailureIfWindows` is not a unittest decorator; emitted code will error

`unittest` exposes `expectedFailure`, `skip`, `skipIf`, `skipUnless`. There's no `expectedFailureIfWindows` on `unittest`. In CPython tests, the idiom is `test.support.anticipate_failure(condition)` to conditionally apply `expectedFailure`. Emitting `@unittest.expectedFailureIfWindows("...")` will raise AttributeError at import time.
Suggested mapping change and inline comment:
- case "expectedFailure":
- res = f"{prefix} # {COMMENT}; {reason}"
- case "expectedFailureIfWindows" | "skip":
- res = f'{prefix}("{COMMENT}; {reason}")'
+ case "expectedFailure":
+ res = f"{prefix} # {COMMENT}; {reason}"
+ case "skip":
+ res = f'{prefix}("{COMMENT}; {reason}")'
+ case "expectedFailureIfWindows":
+ # Use CPython's helper to conditionally apply expectedFailure on Windows
+ res = f'@anticipate_failure(sys.platform == "win32") # {COMMENT}; {reason}'
To support this, ensure the target file has:
from test.support import anticipate_failure
import sys
Since import injection isn't implemented yet, either:

- Extend `apply_conf()` to add missing imports at the file top (I can draft this), or
- Downgrade to `@unittest.skipIf(sys.platform == "win32", "...")` as a temporary measure to avoid import-time failures.
Committable suggestion skipped: line range outside the PR's diff.
Actionable comments posted: 3
♻️ Duplicate comments (2)
tools/rpau/src/rpau/logic.py (2)
15-16: Fix PEP 695 `type` aliases (breaks CI on Python <3.12)

`type Patch = ...` and `type Conf = ...` are Python 3.12+ only and trigger Flake8 E999. Replace with traditional aliases.

-type Patch = dict[str, dict[str, str]]
-type Conf = dict[str, Patch]
+Patch = dict[str, dict[str, str]]
+Conf = dict[str, Patch]
75-83: Invalid decorator: `unittest.expectedFailureIfWindows` does not exist

This will raise AttributeError at import time. Either use CPython's `test.support.anticipate_failure` or temporarily downgrade to a `skipIf` with a clear TODO. I recommend implementing the proper mapping and adding import injection for `anticipate_failure`.

-    match method:
-        case "expectedFailure":
-            res = f"{prefix} # {COMMENT}; {reason}"
-        case "expectedFailureIfWindows" | "skip":
-            res = f'{prefix}("{COMMENT}; {reason}")'
-        case "skipIf":
-            cond = conf["cond"]
-            res = f'{prefix}({cond}, "{COMMENT}; {reason}")'
+    match method:
+        case "expectedFailure":
+            res = f"{prefix} # {COMMENT}; {reason}"
+        case "skip":
+            res = f'{prefix}("{COMMENT}; {reason}")'
+        case "expectedFailureIfWindows":
+            # apply conditionally on Windows like CPython tests do
+            res = f'@anticipate_failure(sys.platform == "win32") # {COMMENT}; {reason}'
+        case "skipIf":
+            cond = conf.get("cond")
+            if not cond:
+                raise ValueError("skipIf patch requires a 'cond' entry")
+            res = f'{prefix}({cond}, "{COMMENT}; {reason}")'
+        case _:
+            raise ValueError(f"Unsupported patch method: {method}")

Follow-up required: ensure the target file has `from test.support import anticipate_failure` and `import sys` (see import injection suggestion below).
🧹 Nitpick comments (10)
tools/rpau/src/rpau/logic.py (10)
104-141: Simplify AST traversal and avoid mutating `conf` in-place

- Traversing with `ast.iter_fields(tree)` is unnecessary; iterate `tree.body` and filter `ast.ClassDef` for readability.
- `cls_conf.pop(...)` mutates the caller's conf; that's surprising and can cause bugs if the same conf is reused. Work on a deep copy.

Sketch:

-    for key, nodes in ast.iter_fields(tree):
-        if key != "body":
-            continue
-        for i, cls_node in enumerate(nodes):
+    for cls_node in (n for n in tree.body if isinstance(n, ast.ClassDef)):
         ...
-            if not (patch_conf := cls_conf.pop(fn_node.name, None)):
+            if not (patch_conf := cls_conf.get(fn_node.name)):
                 continue

And clone the input before calling `iter_patches` (see `apply_conf` comment).

Also applies to: 142-163
21-27: Add timeout and error handling to upstream fetch

Network calls should use a timeout and surface clear errors. Also, set a UA to avoid being blocked.

-def fetch_upstream(*, base_url: str, path: str, version: str) -> str:
-    upstream_url = "/".join((base_url, version, path))
-    logger.debug(f"{upstream_url=}")
-
-    with urllib.request.urlopen(upstream_url) as f:
-        contents = f.read().decode()
-    return contents
+def fetch_upstream(*, base_url: str, path: str, version: str, timeout: float = 15.0) -> str:
+    from urllib.error import HTTPError, URLError
+    upstream_url = "/".join((base_url.rstrip("/"), version.strip("/"), path.lstrip("/")))
+    logger.debug("Fetching %s", upstream_url)
+    req = urllib.request.Request(upstream_url, headers={"User-Agent": "rpau/0.1 (+RustPython)"})
+    try:
+        with urllib.request.urlopen(req, timeout=timeout) as f:
+            contents = f.read().decode("utf-8")
+        return contents
+    except (HTTPError, URLError) as e:
+        raise RuntimeError(f"Failed to fetch upstream {upstream_url}: {e}") from e

No call-site changes required because `timeout` has a default.
30-47: Cache: pre-create parent dirs and handle read errors explicitly

Minor: creating parent dirs before attempting `read_text()` avoids a redundant try/except on deeply nested paths and makes the logic a bit clearer.

-    if cache_dir:
-        cached_file = cache_dir / version / path
-        try:
-            contents = cached_file.read_text()
-        except FileNotFoundError:
-            cached_file.parent.mkdir(parents=True, exist_ok=True)
-            contents = fetch()
-            cached_file.write_text(contents)
-
-    return contents
+    if cache_dir:
+        cached_file = cache_dir / version / path
+        cached_file.parent.mkdir(parents=True, exist_ok=True)
+        if cached_file.exists():
+            return cached_file.read_text()
+        contents = fetch()
+        cached_file.write_text(contents)
+        return contents
10-13: Remove fixed indent constants (become dead code after wrapper fix)

Once wrapper indentation is dynamic, `INDENT1`/`INDENT2` are unused and can be dropped.

 COL_OFFSET = 4
-INDENT1 = " " * COL_OFFSET
-INDENT2 = INDENT1 * 2
 COMMENT = "TODO: RUSTPYTHON"
49-68: Guardrail: validate single-key patch objects early

`format_patch` assumes a single key. Add a quick validation to fail fast on malformed config.

-    method, conf = next(iter(patch_conf.items()))
+    if len(patch_conf) != 1:
+        raise ValueError(f"Patch must have exactly one key, got {list(patch_conf)}")
+    method, conf = next(iter(patch_conf.items()))
183-186: Comment typo

Nit: "distrupt" → "disrupt".

-    # Going in reverse to not distrupt the line offset
+    # Going in reverse to not disrupt the line offset
204-211: Enrich the header with provenance and generation notice

Add a clear "generated" banner and the exact upstream URL to ease troubleshooting.

-    patched_contents = apply_conf(contents, conf)
-    new_contents = f"# upstream_version: {version}\n{patched_contents}"
+    patched_contents = apply_conf(contents, conf)
+    upstream_url = "/".join((base_upstream_url.rstrip("/"), version.strip("/"), path.lstrip("/")))
+    new_contents = (
+        "# This file is generated by rpau. DO NOT EDIT.\n"
+        f"# upstream_version: {version}\n"
+        f"# upstream_source: {upstream_url}\n"
+        f"{patched_contents}"
+    )

Also consider preserving permissions with `shutil.copymode` when overwriting existing files (left as TODO if you prefer).
192-203: Error handling: surface fetch/apply failures with context

Wrap the orchestration path with minimal try/except to log actionable context (path/version) before bubbling up.

 def run(
@@
-    contents = get_upstream_contents(
-        path=path, version=version, base_url=base_upstream_url, cache_dir=cache_dir
-    )
+    try:
+        contents = get_upstream_contents(
+            path=path, version=version, base_url=base_upstream_url, cache_dir=cache_dir
+        )
+    except Exception as e:
+        logger.error("Failed to retrieve upstream for %s@%s: %s", path, version, e)
+        raise
@@
-    patched_contents = apply_conf(contents, conf)
+    try:
+        patched_contents = apply_conf(contents, conf)
+    except Exception as e:
+        logger.error("Failed to apply conf to %s: %s", path, e)
+        raise
1-9: Typing: avoid runtime evaluation of forward references on older Pythons

Optional: add `from __future__ import annotations` to postpone evaluation of annotations on 3.10/3.11, reducing chances of NameErrors in annotations and improving import-time robustness.

+from __future__ import annotations
 import ast
 import urllib.request
 from typing import TYPE_CHECKING
165-189: Add unit tests for patch formatting and insertion

This module is central to the auto-updater. Recommend unit tests for:

- `format_patch` for all methods, including invalid input
- `apply_conf` end-to-end on a small sample file (with and without imports/docstring)
- Windows-conditional expectedFailure decorator insertion

I can draft pytest cases with golden files for quick verification. Want me to push a test scaffold?
📜 Review details
Configuration used: Path: .coderabbit.yml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
- Lib/textwrap.py is excluded by `!Lib/**`
📒 Files selected for processing (2)
- tools/rpau/confs/Lib/textwrap.toml (1 hunks)
- tools/rpau/src/rpau/logic.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- tools/rpau/confs/Lib/textwrap.toml
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.py
: Follow PEP 8 style for custom Python code
Use ruff for linting Python code
Files:
tools/rpau/src/rpau/logic.py
🧠 Learnings (3)
📚 Learning: 2025-07-19T03:16:56.511Z
Learnt from: CR
PR: RustPython/RustPython#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-07-19T03:16:56.511Z
Learning: Applies to Lib/test/**/*.py : When tests fail due to unsupported syntax, keep the test as `unittest.expectedFailure`, document that it requires PEP 695 support, and focus on tests that can be fixed through Rust code changes only
Applied to files:
tools/rpau/src/rpau/logic.py
📚 Learning: 2025-07-19T03:16:56.511Z
Learnt from: CR
PR: RustPython/RustPython#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-07-19T03:16:56.511Z
Learning: Applies to Lib/test/**/*.py : When a test cannot pass due to missing language features, keep it as `unittest.expectedFailure` and document the reason
Applied to files:
tools/rpau/src/rpau/logic.py
📚 Learning: 2025-07-19T03:16:56.511Z
Learnt from: CR
PR: RustPython/RustPython#0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-07-19T03:16:56.511Z
Learning: Applies to Lib/test/**/*.py : The only acceptable modifications to test files are: 1. Removing `unittest.expectedFailure` decorators and the upper TODO comments when tests actually pass 2. Adding `unittest.expectedFailure` decorators when tests cannot be fixed
Applied to files:
tools/rpau/src/rpau/logic.py
🧬 Code graph analysis (1)
tools/rpau/src/rpau/logic.py (1)
- tools/rpau/src/rpau/logger.py (1): get_logger (24-25)
🪛 Flake8 (7.2.0)
tools/rpau/src/rpau/logic.py
[error] 15-15: SyntaxError: invalid syntax
(E999)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Run snippets and cpython tests (ubuntu-latest)
- GitHub Check: Run rust tests (macos-latest)
- GitHub Check: Ensure compilation on various targets
- GitHub Check: Run snippets and cpython tests (macos-latest)
- GitHub Check: Run snippets and cpython tests (windows-latest)
- GitHub Check: Run rust tests (windows-latest)
🔇 Additional comments (1)
tools/rpau/src/rpau/logic.py (1)
49-84: Align with project guidance: prefer expectedFailure for missing features

Per the retrieved learnings for Lib/test: when a test cannot pass due to missing language features (e.g., PEP 695), keep it as `unittest.expectedFailure` and document the reason. Your current mapping supports this; ensure TOML reasons reflect the missing feature explicitly so the generated comment is actionable.

If you want, I can help audit TOML entries to ensure reasons cite the relevant PEPs or issues.
type Patch = dict[str, dict[str, str]]
type Conf = dict[str, Patch]
💡 Verification agent
🧩 Analysis chain
Run ruff/flake8 locally to confirm fixes
After replacing PEP 695 aliases, verify parseability and style.
🏁 Script executed:
#!/bin/bash
# From repo root
python -m compileall -q tools/rpau/src/rpau/logic.py
ruff check tools/rpau/src/rpau/logic.py
Length of output: 769
Fix invalid type alias syntax and add missing import

The current alias declarations and type annotations cause a compile error and an undefined name:

- The lines 15-16

  type Patch = dict[str, dict[str, str]]
  type Conf = dict[str, Patch]

  use unsupported `type … = …` alias syntax and trigger a SyntaxError in Python 3.11 (and earlier) environments. Replace them with PEP 613's `TypeAlias` or a standard assignment pattern. For example:

   from typing import TypeAlias
  -type Patch = dict[str, dict[str, str]]
  -type Conf = dict[str, Patch]
  +Patch: TypeAlias = dict[str, dict[str, str]]
  +Conf: TypeAlias = dict[str, Patch]

- At line 86, the return annotation `-> "Iterator[tuple[int, str]]"` refers to `Iterator` without importing it, leading to:

  F821 Undefined name `Iterator`

  Add an import at the top of the file:

   from typing import TypeAlias
  +from typing import Iterator
from typing import TypeAlias +from typing import Iterator
Once these changes are applied, please re-run the checks:
python -m compileall -q tools/rpau/src/rpau/logic.py
ruff check tools/rpau/src/rpau/logic.py
and verify that the file parses cleanly and passes all style checks.
🧰 Tools
🪛 Flake8 (7.2.0)
[error] 15-15: SyntaxError: invalid syntax
(E999)
🤖 Prompt for AI Agents
In tools/rpau/src/rpau/logic.py around lines 15–16 and line 86, replace the
invalid alias syntax "type Patch = ..." and "type Conf = ..." with a valid
TypeAlias declaration or simple assignments (e.g. import TypeAlias from typing
and write "Patch: TypeAlias = dict[str, dict[str, str]]" and "Conf: TypeAlias =
dict[str, Patch]" or use "Patch = dict[str, dict[str, str]]" / "Conf = dict[str,
Patch]" as plain assignments), and add the missing import for Iterator (e.g.
"from typing import Iterator") at the top of the file so the return annotation
on line 86 resolves; after making these edits rerun "python -m compileall -q
tools/rpau/src/rpau/logic.py" and "ruff check tools/rpau/src/rpau/logic.py" to
verify parsing and style checks pass.
```python
    # Phase 2: Iterate and mark inhereted tests
    for key, nodes in ast.iter_fields(tree):
        if key != "body":
            continue

        for i, cls_node in enumerate(nodes):
            if not isinstance(cls_node, ast.ClassDef):
                continue

            if not (cls_conf := conf.get(cls_node.name)):
                continue

            for fn_name, patch_conf in cls_conf.items():
                patch = format_patch(patch_conf)
                yield (
                    cls_node.end_lineno,
                    f"""
{INDENT1}{patch}
{INDENT1}def {fn_name}(self):
{INDENT2}return super().{fn_name}()
""".rstrip(),
                )
```
🛠️ Refactor suggestion
Wrong indentation for injected wrapper; use class col_offset, not a fixed 4 spaces

Using fixed `INDENT1`/`INDENT2` breaks formatting for nested or differently indented classes. Compute indentation from `cls_node.col_offset`.
```diff
-            for fn_name, patch_conf in cls_conf.items():
-                patch = format_patch(patch_conf)
-                yield (
-                    cls_node.end_lineno,
-                    f"""
-{INDENT1}{patch}
-{INDENT1}def {fn_name}(self):
-{INDENT2}return super().{fn_name}()
-""".rstrip(),
-                )
+            for fn_name, patch_conf in cls_conf.items():
+                patch = format_patch(patch_conf)
+                cls_indent = " " * cls_node.col_offset
+                body_indent = cls_indent + (" " * COL_OFFSET)
+                yield (
+                    cls_node.end_lineno,
+                    (
+                        f"\n"
+                        f"{cls_indent}{patch}\n"
+                        f"{cls_indent}def {fn_name}(self):\n"
+                        f"{body_indent}return super().{fn_name}()"
+                    ),
+                )
```
Additionally, after this change, `INDENT1`/`INDENT2` become dead code and can be removed. See a cleanup note below.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
    # Phase 2: Iterate and mark inhereted tests
    for key, nodes in ast.iter_fields(tree):
        if key != "body":
            continue

        for i, cls_node in enumerate(nodes):
            if not isinstance(cls_node, ast.ClassDef):
                continue

            if not (cls_conf := conf.get(cls_node.name)):
                continue

            for fn_name, patch_conf in cls_conf.items():
                patch = format_patch(patch_conf)
                cls_indent = " " * cls_node.col_offset
                body_indent = cls_indent + (" " * COL_OFFSET)
                yield (
                    cls_node.end_lineno,
                    (
                        f"\n"
                        f"{cls_indent}{patch}\n"
                        f"{cls_indent}def {fn_name}(self):\n"
                        f"{body_indent}return super().{fn_name}()"
                    ),
                )
```
🤖 Prompt for AI Agents
In tools/rpau/src/rpau/logic.py around lines 141 to 162, the injected wrapper
uses fixed INDENT1/INDENT2 which breaks formatting for nested or differently
indented classes; replace those with indentation computed from
cls_node.col_offset: build indent1 = " " * cls_node.col_offset and indent2 =
indent1 + " " * 4 (or indent1 + (" " * default_indent) if default indent
differs), then use indent1/indent2 when formatting the patch and remove the
now-dead INDENT1/INDENT2 constants and any references to them; ensure the
formatted string uses these computed variables so injected code aligns with the
class declaration.
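To make the `col_offset` argument concrete, here is a small standalone sketch (the `Outer`/`Inner` class names are hypothetical, not from the tool) showing that the attribute tracks each class's real indentation, including nested classes:

```python
import ast

# Hypothetical source with a nested class.
source = """\
class Outer:
    class Inner:
        def test_a(self):
            pass
"""

tree = ast.parse(source)
# col_offset is the column where the `class` keyword starts, so an
# injected wrapper can reuse the class's actual indentation instead
# of assuming a fixed 4 spaces.
offsets = {
    node.name: node.col_offset
    for node in ast.walk(tree)
    if isinstance(node, ast.ClassDef)
}
print(offsets)  # {'Outer': 0, 'Inner': 4}

cls_indent = " " * offsets["Inner"]  # column of the `class Inner` statement
body_indent = cls_indent + "    "    # one level deeper
```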
```python
def apply_conf(contents: str, conf: dict) -> str:
    """
    Patch a given source code based on the conf.

    Parameters
    ----------
    contents : str
        Raw python source code.
    conf : Conf
        Dict of `{ClassName: {test_name: Patch}}`.

    Returns
    -------
    str
        Patched raw python code.
    """
    lines = contents.splitlines()
    tree = ast.parse(contents)

    # Going in reverse to not distrupt the line offset
    patches = list(iter_patches(tree, conf))
    for lineno, patch in sorted(patches, reverse=True):
        lines.insert(lineno, patch)

    return "\n".join(lines)
```
🛠️ Refactor suggestion
Robustness: don't mutate `conf`, handle SyntaxError gracefully, and inject required imports

- If the upstream file uses newer syntax (e.g., PEP 695), `ast.parse` can fail on older interpreters. Catch `SyntaxError`, log, and return the original contents instead of crashing.
- Do not mutate the caller's `conf`; work on a deep copy because `iter_patches` consumes it.
- Ensure required imports are present in the target file when emitting decorators that reference `sys` or `anticipate_failure`.
```diff
 def apply_conf(contents: str, conf: dict) -> str:
 @@
-    lines = contents.splitlines()
-    tree = ast.parse(contents)
-
-    # Going in reverse to not distrupt the line offset
-    patches = list(iter_patches(tree, conf))
-    for lineno, patch in sorted(patches, reverse=True):
-        lines.insert(lineno, patch)
-
-    return "\n".join(lines)
+    import copy
+    lines = contents.splitlines()
+    # Work on a copy; iter_patches may consume entries
+    conf_copy: Conf = copy.deepcopy(conf)
+    try:
+        tree = ast.parse(contents)
+    except SyntaxError as e:
+        logger.warning("Skipping patching due to parse error: %s", e)
+        return contents
+
+    # Import injection for patches that require helpers
+    def _requires_imports(c: Conf) -> list[str]:
+        need: set[str] = set()
+        for cls in c.values():
+            for patch in cls.values():
+                method, cfg = next(iter(patch.items()))
+                if method == "expectedFailureIfWindows":
+                    need.add("import sys")
+                    need.add("from test.support import anticipate_failure")
+                if method == "skipIf":
+                    cond = cfg.get("cond", "")
+                    if "sys." in cond:
+                        need.add("import sys")
+                    if "os." in cond:
+                        need.add("import os")
+        return sorted(need)
+
+    def _import_present(line: str, mod: str) -> bool:
+        return line.startswith(mod) or f"{mod.split()[-1]} " in line
+
+    def _imports_in_file(existing: list[str], mod: str) -> bool:
+        return any(_import_present(l.strip(), mod) for l in existing if l.strip().startswith(("import ", "from ")))
+
+    needed = [m for m in _requires_imports(conf_copy) if not _imports_in_file(lines, m)]
+    if needed:
+        # Insert after shebang/encoding/docstring if present
+        insert_at = 0
+        if insert_at < len(lines) and lines[0].startswith("#!"):
+            insert_at += 1
+        if insert_at < len(lines) and "coding:" in lines[insert_at]:
+            insert_at += 1
+        # naive docstring skipper
+        if insert_at < len(lines) and lines[insert_at].lstrip().startswith(('"""', "'''")):
+            q = lines[insert_at].lstrip()[:3]
+            insert_at += 1
+            while insert_at < len(lines) and q not in lines[insert_at]:
+                insert_at += 1
+            insert_at = min(insert_at + 1, len(lines))
+        for mod in reversed(needed):
+            lines.insert(insert_at, mod)
+
+    # Going in reverse to not disrupt line offsets
+    patches = list(iter_patches(tree, conf_copy))
+    for lineno, patch in sorted(patches, reverse=True):
+        lines.insert(lineno, patch)
+
+    return "\n".join(lines)
```
If you prefer, I can extract the import-injection helpers into a separate module for reuse. Do you want me to push a patch?
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
def apply_conf(contents: str, conf: dict) -> str:
    """
    Patch a given source code based on the conf.

    Parameters
    ----------
    contents : str
        Raw python source code.
    conf : Conf
        Dict of `{ClassName: {test_name: Patch}}`.

    Returns
    -------
    str
        Patched raw python code.
    """
    import copy

    lines = contents.splitlines()
    # Work on a copy; iter_patches may consume entries
    conf_copy: Conf = copy.deepcopy(conf)
    try:
        tree = ast.parse(contents)
    except SyntaxError as e:
        logger.warning("Skipping patching due to parse error: %s", e)
        return contents

    # Import injection for patches that require helpers
    def _requires_imports(c: Conf) -> list[str]:
        need: set[str] = set()
        for cls in c.values():
            for patch in cls.values():
                method, cfg = next(iter(patch.items()))
                if method == "expectedFailureIfWindows":
                    need.add("import sys")
                    need.add("from test.support import anticipate_failure")
                if method == "skipIf":
                    cond = cfg.get("cond", "")
                    if "sys." in cond:
                        need.add("import sys")
                    if "os." in cond:
                        need.add("import os")
        return sorted(need)

    def _import_present(line: str, mod: str) -> bool:
        return line.startswith(mod) or f"{mod.split()[-1]} " in line

    def _imports_in_file(existing: list[str], mod: str) -> bool:
        return any(
            _import_present(l.strip(), mod)
            for l in existing
            if l.strip().startswith(("import ", "from "))
        )

    needed = [
        m for m in _requires_imports(conf_copy)
        if not _imports_in_file(lines, m)
    ]
    if needed:
        # Insert after shebang/encoding/docstring if present
        insert_at = 0
        if insert_at < len(lines) and lines[0].startswith("#!"):
            insert_at += 1
        if insert_at < len(lines) and "coding:" in lines[insert_at]:
            insert_at += 1
        # naive docstring skipper
        if insert_at < len(lines) and lines[insert_at].lstrip().startswith(('"""', "'''")):
            q = lines[insert_at].lstrip()[:3]
            insert_at += 1
            while insert_at < len(lines) and q not in lines[insert_at]:
                insert_at += 1
            insert_at = min(insert_at + 1, len(lines))
        for mod in reversed(needed):
            lines.insert(insert_at, mod)

    # Going in reverse to not disrupt line offsets
    patches = list(iter_patches(tree, conf_copy))
    for lineno, patch in sorted(patches, reverse=True):
        lines.insert(lineno, patch)

    return "\n".join(lines)
```
🤖 Prompt for AI Agents
In tools/rpau/src/rpau/logic.py around lines 165 to 189, make three fixes: 1)
Wrap ast.parse(contents) in a try/except catching SyntaxError — log the error
(or debug message) and return the original contents unchanged to avoid crashing
on newer-syntax files; 2) Avoid mutating the caller's conf by making a deep copy
(e.g., copy.deepcopy(conf)) and pass the copy into iter_patches so iter_patches
can consume it safely; 3) After building the list of patches but before
inserting them, detect if emitted patches reference external names like sys or
anticipate_failure and ensure the corresponding import statements are present in
the target source: compute needed imports, check existing lines for those
imports (skipping shebang/module docstring), and insert missing import lines
near the top (after shebang/docstring) so decorators that reference sys or
anticipate_failure resolve. Ensure behavior otherwise unchanged.
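Separately from the robustness concerns, the reverse-sorted insertion trick that `apply_conf` relies on can be demonstrated in isolation (toy lines and patch tuples here, not real `iter_patches` output):

```python
lines = [
    "class A:",
    "    def a(self): ...",
    "class B:",
    "    def b(self): ...",
]
# (lineno, text) pairs computed against the ORIGINAL 1-based numbering;
# list.insert(lineno, ...) places each patch just after that line.
patches = [(2, "    # patched A"), (4, "    # patched B")]

# Inserting bottom-up keeps earlier line numbers valid: each insert
# only shifts lines below it, which have already been handled.
for lineno, patch in sorted(patches, reverse=True):
    lines.insert(lineno, patch)

print(lines)
```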
The idea is to have a file that defines which tests need to be marked with what (`expectedFailure`, `skipIf`, etc.), and code that automatically applies those patches to a given test file.

This is a POC only; I have plans to make it more modular using argparse and some flags, but the general gist will remain the same.

cc: @youknowone @arihant2math @moreal @reactive-firewall, would love to hear your feedback
ref: #5974
This PR also updates `test_os.py` from 3.13.7 btw.

Summary by CodeRabbit
New Features
Tests
Chores
Documentation