Chores by Xmaster6y · Pull Request #2 · Xmaster6y/tdhook

Xmaster6y · 2025-08-14T10:25:54Z

What does this PR do?

Various chores.

Linked Issues

Closes #?
#?

Summary by Sourcery

Revise benchmark configurations, broaden Python compatibility, and introduce a benchmark testing harness

Enhancements:

Adjust default parameters and impact ranges for GPT2 and Integrated Gradients benchmark tasks and tidy task ordering in get_stats

Build:

Lower Python requirement to >=3.10 and add Python 3.10 classifier

CI:

Add Python 3.10 to CI workflow matrix

Tests:

Add scripts/bench/test_stats.py and SLURM launch script for automated benchmark testing

Chores:

Increase SLURM time limit for get-stats.sh

sourcery-ai · 2025-08-14T10:26:00Z

Reviewer's Guide

This PR bundles maintenance updates across benchmark scripts, packaging requirements, CI configurations, scheduler settings, and core code comments, and introduces a new test harness for benchmark statistics.

Entity relationship diagram for TASKS dictionary reordering in get_stats.py

erDiagram
  TASKS {
    string gpt2_cache
    string integrated_gradients
    string lrp
    string mlp_intervene
  }
  gpt2_cache ||--o{ tdhook : uses
  gpt2_cache ||--o{ nnsight : uses
  gpt2_cache ||--o{ transformer_lens : uses
  integrated_gradients ||--o{ tdhook : uses
  integrated_gradients ||--o{ captum_add : uses
  integrated_gradients ||--o{ captum : uses
  lrp ||--o{ tdhook : uses
  lrp ||--o{ zennit : uses
  mlp_intervene ||--o{ tdhook : uses
  mlp_intervene ||--o{ nnsight : uses

Class diagram for impact_parameters and default_parameters changes in gpt2_cache and integrated_gradients

classDiagram
  class GPT2CacheParameters {
    +model_size: ["gpt2", "gpt2-medium", "gpt2-large"]
    +batch_size: [10, 50, 100, 500]
    +variation: ["all"]
  }
  class GPT2CacheDefaults {
    +model_size: "gpt2"
    +batch_size: 50
    +variation: "all"
  }
  class IntegratedGradientsParameters {
    +width: [100, 1000, 5000, 10000]
    +height: [5, 10, 20, 50]
    +batch_size: [10, 100, 1000, 5000]
    +variation: ["multiply", "no-multiply"]
  }

File-Level Changes

Change	Details	Files
Refined benchmark task parameters	Reduced GPT2 model sizes and adjusted its batch size and variation key Corrected batch size values and formatting in integrated_gradients task	`scripts/bench/tasks/gpt2_cache/__init__.py` `scripts/bench/tasks/integrated_gradients/__init__.py`
Reordered TASKS mapping for stats collection	Moved gpt2_cache to the top of TASKS Swapped positions of integrated_gradients and lrp entries	`scripts/bench/get_stats.py`
Broadened Python version support	Lowered required Python minimum from 3.11 to 3.10 Added Python 3.10 classifier Expanded CI matrix to include 3.10	`pyproject.toml` `.github/workflows/ci.yml`
Tweaked scheduler timeouts	Increased SBATCH runtime limit for get-stats.sh from 1h to 10h	`scripts/bench/launch/get-stats.sh`
Added benchmark stats test harness	Created test_stats.py to automate benchmarking runs and save results Introduced test-stats.sh SBATCH launcher script Added uv.lock for dependency consistency	`scripts/bench/test_stats.py` `scripts/bench/launch/test-stats.sh` `uv.lock`
Annotated code with TODOs for future enhancements	Added TODO to move version metadata to packaging Inserted TODO to support kwargs in gradient attribution hook	`src/tdhook/__init__.py` `src/tdhook/attribution/gradient_attribution/common.py`

Tips and commands

Interacting with Sourcery

Trigger a new review: Comment @sourcery-ai review on the pull request.
Continue discussions: Reply directly to Sourcery's review comments.
Generate a GitHub issue from a review comment: Ask Sourcery to create an
issue from a review comment by replying to it. You can also reply to a
review comment with @sourcery-ai issue to create an issue from it.
Generate a pull request title: Write @sourcery-ai anywhere in the pull
request title to generate a title at any time. You can also comment
@sourcery-ai title on the pull request to (re-)generate the title at any time.
Generate a pull request summary: Write @sourcery-ai summary anywhere in
the pull request body to generate a PR summary at any time exactly where you
want it. You can also comment @sourcery-ai summary on the pull request to
(re-)generate the summary at any time.
Generate reviewer's guide: Comment @sourcery-ai guide on the pull
request to (re-)generate the reviewer's guide at any time.
Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
pull request to resolve all Sourcery comments. Useful if you've already
addressed all the comments and don't want to see them anymore.
Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
request to dismiss all existing Sourcery reviews. Especially useful if you
want to start fresh with a new review - don't forget to comment
@sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

Enable or disable review features such as the Sourcery-generated pull request
summary, the reviewer's guide, and others.
Change the review language.
Add, remove or edit custom review instructions.
Adjust other review settings.

Getting Help

Contact our support team for questions or feedback.
Visit our documentation for detailed guides and information.
Keep in touch with the Sourcery team by following us on X/Twitter, LinkedIn or GitHub.

sourcery-ai

Hey there - I've reviewed your changes - here's some feedback:

SEEDS and TASKS are duplicated in get_stats.py and test_stats.py—consider centralizing these constants to reduce maintenance overhead.
The numeric literals "10_00" and "50_00" in the integrated_gradients impact_parameters are unconventional—prefer standard grouping like 1_000 and 5_000 for clarity.
There are lingering TODOs for version metadata and init_target kwargs in tdhook—address these or track them as issues to prevent accumulating technical debt.

Prompt for AI Agents

Please address the comments from this code review:
## Overall Comments
- SEEDS and TASKS are duplicated in get_stats.py and test_stats.py—consider centralizing these constants to reduce maintenance overhead.
- The numeric literals "10_00" and "50_00" in the integrated_gradients impact_parameters are unconventional—prefer standard grouping like 1_000 and 5_000 for clarity.
- There are lingering TODOs for version metadata and init_target kwargs in tdhook—address these or track them as issues to prevent accumulating technical debt.

## Individual Comments

### Comment 1
<location> `scripts/bench/tasks/integrated_gradients/__init__.py:11` </location>
<code_context>

 impact_parameters = {
-    "model_size": ["gpt2", "gpt2-medium", "gpt2-large", "gpt2-xl"],
-    "batch_size": [100, 1000, 10_000, 100_000],
-    "variations": ["all", "specific"],
+    "model_size": ["gpt2", "gpt2-medium", "gpt2-large"],
</code_context>

<issue_to_address>
Possible typo in batch_size values: '10_00' and '50_00' should likely be '1000' and '5000'.

Please update the batch size values to '1000' and '5000' to avoid confusion and ensure consistency.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

poupart and others added 4 commits August 14, 2025 12:19

adjusted tasks

2f39bc9

TODOs

eb49c12

future scripts

4c01e4b

python 3.10 compatibility

bd596a1

Xmaster6y temporarily deployed to ci August 14, 2025 10:25 — with GitHub Actions Inactive

sourcery-ai Bot reviewed Aug 14, 2025

View reviewed changes

Comment thread scripts/bench/tasks/integrated_gradients/__init__.py

typo

ffcdd23

Xmaster6y temporarily deployed to ci August 14, 2025 10:37 — with GitHub Actions Inactive

Xmaster6y merged commit de3f5b7 into main Aug 14, 2025
3 checks passed

Xmaster6y deleted the chores branch August 14, 2025 10:38

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Chores#2

Chores#2
Xmaster6y merged 5 commits intomainfrom
chores

Xmaster6y commented Aug 14, 2025 •

edited by sourcery-ai Bot

Loading

Uh oh!

sourcery-ai Bot commented Aug 14, 2025 •

edited

Loading

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

sourcery-ai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Xmaster6y commented Aug 14, 2025 • edited by sourcery-ai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

Linked Issues

Summary by Sourcery

Uh oh!

sourcery-ai Bot commented Aug 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviewer's Guide

Entity relationship diagram for TASKS dictionary reordering in get_stats.py

Class diagram for impact_parameters and default_parameters changes in gpt2_cache and integrated_gradients

File-Level Changes

Interacting with Sourcery

Customizing Your Experience

Getting Help

Uh oh!

sourcery-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Xmaster6y commented Aug 14, 2025 •

edited by sourcery-ai Bot

Loading

sourcery-ai Bot commented Aug 14, 2025 •

edited

Loading