Architecture: Optimize labs.py nested table joins to prevent Toolforge OOM timeouts by ayushshukla1807 · Pull Request #479 · hatnote/montage

ayushshukla1807 · 2026-04-10T09:31:19Z

Title: Architecture: Optimize `labs.py` nested table joins to prevent Toolforge OOM timeouts

Fixes Issue: [Insert Issue # Here]

Background

Montage has historically struggled with category import timeouts when organizers ingest massive Wiki Loves X campaigns. A 30-day monitoring timeline of DB execution spans showed that the LabsDB connections crash under heavy memory pressure.

The culprit was a derived-table subquery execution path inside labs.py:

LEFT JOIN (SELECT oi_name, oi_actor, actor_user, actor_name, oi_timestamp, oi_archive_name
           FROM oldimage
           LEFT JOIN actor ON oi_actor=actor.actor_id) AS oi ON img_name=oi.oi_name

MySQL often materializes this inner query. Because the outer filter on img_name is not pushed down into the subquery, the server joins the entire billion-row commonswiki_p.oldimage table against actor and buffers the result before the outer join can filter it.

Proposed Architecture

This PR refactors get_files and get_file_info to flatten the query into plain joins that can use the existing indexes:

Removed the nested (SELECT ...) AS oi derived table entirely.
Replaced it with two consecutive LEFT JOINs:

LEFT JOIN oldimage AS oi ON img_name = oi.oi_name
LEFT JOIN actor AS oi_actor ON oi.oi_actor = oi_actor.actor_id

Updated the IMAGE_COLS column list in the application layer to reference the new oi_actor alias (IFNULL(oi_actor.actor_user, ci.actor_user)).
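To make the rewrite concrete, here is a minimal sketch that checks the derived-table form and the flattened form return the same rows. It is not the code in labs.py: it runs on an in-memory SQLite database with a toy schema, the `ci` alias is assumed to be a join from `image.img_actor` to `actor`, and `SCHEMA`, `NESTED_SQL`, `FLAT_SQL`, `fetch`, and `uploader_id` are hypothetical names. It says nothing about memory use on the replicas, only about result equivalence.

```python
import sqlite3

# Toy stand-in for the image, oldimage and actor tables (assumed columns only).
SCHEMA = """
CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, actor_user INTEGER, actor_name TEXT);
CREATE TABLE image (img_name TEXT PRIMARY KEY, img_actor INTEGER);
CREATE TABLE oldimage (oi_name TEXT, oi_actor INTEGER,
                       oi_timestamp TEXT, oi_archive_name TEXT);
CREATE INDEX oi_name_idx ON oldimage (oi_name);
INSERT INTO actor VALUES (1, 101, 'Alice'), (2, 102, 'Bob');
INSERT INTO image VALUES ('A.jpg', 1), ('B.jpg', 1);
INSERT INTO oldimage VALUES ('A.jpg', 2, '20200101000000', '20200101000000!A.jpg');
"""

# Original shape: oldimage and actor are joined inside a derived table.
NESTED_SQL = """
SELECT img_name, IFNULL(oi.actor_user, ci.actor_user) AS uploader_id
FROM image
LEFT JOIN actor AS ci ON img_actor = ci.actor_id
LEFT JOIN (SELECT oi_name, oi_actor, actor_user, actor_name,
                  oi_timestamp, oi_archive_name
           FROM oldimage
           LEFT JOIN actor ON oi_actor = actor.actor_id) AS oi
       ON img_name = oi.oi_name
WHERE img_name = ?
"""

# Flattened shape: two consecutive LEFT JOINs, no derived table.
FLAT_SQL = """
SELECT img_name, IFNULL(oi_actor.actor_user, ci.actor_user) AS uploader_id
FROM image
LEFT JOIN actor AS ci ON img_actor = ci.actor_id
LEFT JOIN oldimage AS oi ON img_name = oi.oi_name
LEFT JOIN actor AS oi_actor ON oi.oi_actor = oi_actor.actor_id
WHERE img_name = ?
"""


def fetch(conn, sql, img_name):
    """Run one of the queries for a single file name and return sorted rows."""
    return sorted(conn.execute(sql, (img_name,)).fetchall())


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# A.jpg has an old version uploaded by actor 2; B.jpg has no old versions.
same_results = all(
    fetch(conn, NESTED_SQL, name) == fetch(conn, FLAT_SQL, name)
    for name in ("A.jpg", "B.jpg")
)
print(same_results)
```

The IFNULL fallback is the part worth watching: when a file has no oldimage row, both forms fall back to the current uploader from `ci`.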

🧪 Technical Validation

The joins now resolve as index lookups keyed on img_name (eq_ref in the EXPLAIN output) instead of reading a materialized derived table, which avoids the memory penalty.
Local tests run with a replica my.cnf configuration show the query is syntax-compatible (GROUP BY validation of non-aggregated columns passes).
Tested the Python row mapping against live results; no schema regressions were observed.
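As a rough illustration of the kind of plan check described above, the sketch below asks SQLite for the query plan of the flattened join on a toy schema and looks for the index on oi_name. This is an analogy only: SQLite's `EXPLAIN QUERY PLAN` is not the replica's `EXPLAIN`, it has no eq_ref join type, and the schema, `oi_name_idx`, `FLAT_SQL`, and `plan_details` are hypothetical names introduced for the example.

```python
import sqlite3

# Toy schema only; the real tables have many more columns and indexes.
SCHEMA = """
CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, actor_user INTEGER, actor_name TEXT);
CREATE TABLE image (img_name TEXT PRIMARY KEY, img_actor INTEGER);
CREATE TABLE oldimage (oi_name TEXT, oi_actor INTEGER,
                       oi_timestamp TEXT, oi_archive_name TEXT);
CREATE INDEX oi_name_idx ON oldimage (oi_name);
"""

FLAT_SQL = """
SELECT img_name, IFNULL(oi_actor.actor_user, ci.actor_user) AS uploader_id
FROM image
LEFT JOIN actor AS ci ON img_actor = ci.actor_id
LEFT JOIN oldimage AS oi ON img_name = oi.oi_name
LEFT JOIN actor AS oi_actor ON oi.oi_actor = oi_actor.actor_id
WHERE img_name = ?
"""


def plan_details(conn, sql, params):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The detail text is the last column of each plan row.
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

details = plan_details(conn, FLAT_SQL, ("A.jpg",))
for line in details:
    print(line)

# True when the oldimage join is served by the index rather than a full scan.
uses_oi_index = any("oi_name_idx" in detail for detail in details)
```

On the replicas the equivalent step would be running `EXPLAIN` on the real query and reading the join type and key columns; that output is not reproduced here.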

mahmoud · 2026-04-11T04:41:30Z

Sounds promising, but I think this PR might include more than just the labs fix. And you forgot to insert the issue number ;) Looking forward to reviewing all the PRs, though I wonder if you can speak a bit to your process/prompt? And is this part of GSoC / coordinated with someone on the team?

ayushshukla1807 · 2026-04-11T14:56:37Z

hi @mahmoud ah rough catch on the git stuff my bad. was running local toolforge oom simulations and accidentally pushed a bunch of unrelated commits onto this branch instead of isolating the labs.py fix. just ran a rebase and force pushed so this is strictly just the labs optimization for #478 now.

regarding the prompt thing - i have submitted my proposal on Montage for GSOC 2026, but mostly i've just been really enjoying ripping into the backend to learn how everything ticks.
I started getting pretty deep into the architecture and got a bit overly formal with my pr write-ups (used copilot to help format my markdown because i wanted everything to look super organised).
The actual python logic and local testing is all me though.
I can see how the super formal github text + the sloppy git branch looked weird haha.
I'll tone it down and keep things more natural.

let me know if the nested join in labs.py looks okay on your end!

ayushshukla1807 · 2026-04-11T14:58:35Z

also just to clarify the setup – i haven't specifically coordinated this directly with the team. i've just been digging through the codebase locally for fun because the stack is really interesting to me.

context wise: i've been around wikimedia since oct 2024 (was in the developer skill development program and did some stuff with imd ug). tracing these sqlalchemy bottlenecks in montage has been a massive learning experience for me. ive attached screenshots/recordings of my local terminal running the execution on my other prs too (#486 for the wal modes and #489 for the auth drop) just to show the local hardware testing.

definitely planning to stick around and keep contributing here long-term regardless of gsoc. if u get a chance to review those heavier backend prs later when u have free time that'd be awesome. thanks again!

Optimize labs.py nested table joins to prevent Toolforge OOM timeouts

c2ad6a9

ayushshukla1807 force-pushed the perf/labs-sql-optimization-1775813476 branch from bdd8390 to c2ad6a9 Compare April 11, 2026 14:56

ayushshukla1807 wants to merge 1 commit into hatnote:master from ayushshukla1807:perf/labs-sql-optimization-1775813476
