Fix M-bridge report generation #778

srivatsankrishnan · 2026-01-17T01:48:03Z

Summary

The report generation looks for the file name megatron_bridge_launcher.log instead of the correct file name cloudai_megatron_bridge_launcher.log as generated by the command generation strategy.

Test Plan

CI/CD
Internal cluster

Additional Notes

Include any other notes or comments about the pull request here. This can include challenges faced, future considerations, or context that reviewers might find helpful.

coderabbitai · 2026-01-17T01:48:10Z

📝 Walkthrough

Walkthrough

The changes update log file naming conventions from megatron_bridge_launcher.log to cloudai_megatron_bridge_launcher.log in report generation logic and corresponding tests. Error messages are also adjusted to reference output directories instead of specific file paths.

Changes

Cohort / File(s)	Summary
Megatron Bridge Log File Updates `src/cloudai/workloads/megatron_bridge/report_generation_strategy.py`, `tests/report_generation_strategy/test_megatron_bridge_report_generation_strategy.py`	Consistent rename of log filename references from `megatron_bridge_launcher.log` to `cloudai_megatron_bridge_launcher.log` across log discovery, metrics retrieval, and report generation paths. Error messages updated to include base output directory context.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 A log file renamed with cloudai's new flair,
From bridge to clouds, the launcher flies through the air,
Consistent and clean, the pattern flows true,
Simple adjustments in the files we renew! ✨

🚥 Pre-merge checks | ✅ 2

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Fix M-bridge report generation' directly describes the main change: correcting the report generation to use the correct log filename (cloudai_megatron_bridge_launcher.log instead of megatron_bridge_launcher.log).
Description check	✅ Passed	The description clearly relates to the changeset by explaining that report generation was using the wrong log filename and that this PR corrects it to use the proper filename generated by the command generation strategy.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

…idge

greptile-apps · 2026-01-17T01:57:10Z

Greptile Summary

Fixed a critical bug where report generation was looking for the wrong log filename. The command generation strategy creates cloudai_megatron_bridge_launcher.log, but the report generation strategy was searching for megatron_bridge_launcher.log, causing report generation to always fail.

Key Changes:

Updated log filename references in get_log_file() and results_file property from megatron_bridge_launcher.log to cloudai_megatron_bridge_launcher.log
Improved error messages to show the output directory path instead of hardcoded incorrect filename
Updated test fixture to use the correct filename, ensuring tests validate actual runtime behavior
Updated copyright year to 2025-2026

Impact:
This fix ensures Megatron-Bridge report generation works correctly by aligning the filename used for reading logs with the filename created during command execution.

Confidence Score: 5/5

This PR is safe to merge with minimal risk - it's a straightforward bug fix with comprehensive test coverage
The change is a simple string replacement to fix a filename mismatch. All references have been updated consistently (4 occurrences in the main file), the test fixture has been updated to match, and the fix aligns with the actual filename generated by the command strategy (verified in slurm_command_gen_strategy.py:115). No other files reference the old filename.
No files require special attention

Important Files Changed

Filename	Overview
src/cloudai/workloads/megatron_bridge/report_generation_strategy.py	Fixed log filename from `megatron_bridge_launcher.log` to `cloudai_megatron_bridge_launcher.log` to match what command generation creates. Also improved error messages to show directory path instead of hardcoded wrong filename.
tests/report_generation_strategy/test_megatron_bridge_report_generation_strategy.py	Updated test fixture to create log file with correct `cloudai_megatron_bridge_launcher.log` filename, ensuring tests validate actual runtime behavior.

Sequence Diagram

sequenceDiagram
    participant CmdGen as Command Generation Strategy
    participant FS as File System
    participant RepGen as Report Generation Strategy
    
    Note over CmdGen,RepGen: Before this PR
    CmdGen->>FS: Create cloudai_megatron_bridge_launcher.log
    RepGen->>FS: Look for megatron_bridge_launcher.log
    FS-->>RepGen: File not found ❌
    RepGen->>RepGen: Error: Cannot generate report
    
    Note over CmdGen,RepGen: After this PR
    CmdGen->>FS: Create cloudai_megatron_bridge_launcher.log
    RepGen->>FS: Look for cloudai_megatron_bridge_launcher.log
    FS-->>RepGen: File found ✓
    RepGen->>RepGen: Parse metrics and generate report

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@src/cloudai/workloads/megatron_bridge/report_generation_strategy.py`:
- Around line 31-39: The literal log filename is duplicated in get_log_file and
the results_file property; add a class-level constant (e.g., LOG_FILENAME =
"cloudai_megatron_bridge_launcher.log") to the MegatronBridge report generation
class and use that constant in get_log_file (construct log =
self.test_run.output_path / self.LOG_FILENAME) and in results_file (fallback
path using self.LOG_FILENAME) so the filename is defined in one place and can be
reused by tests.

In
`@tests/report_generation_strategy/test_megatron_bridge_report_generation_strategy.py`:
- Line 41: Replace the hard-coded filename
"cloudai_megatron_bridge_launcher.log" in the test with the strategy's filename
constant so the test follows the source of truth; import
MegatronBridgeReportGenerationStrategy (or the class that defines the filename
constant) and use its filename constant (e.g., FILENAME) when writing/reading
(refer to MegatronBridgeReportGenerationStrategy and its filename constant) to
keep the test synchronized with the implementation.

src/cloudai/workloads/megatron_bridge/report_generation_strategy.py

tests/report_generation_strategy/test_megatron_bridge_report_generation_strategy.py

srivatsankrishnan added 2 commits January 16, 2026 17:41

fix bug in report generation log name

094a306

fix unit test

c2d3ec3

srivatsankrishnan added 5 commits January 16, 2026 17:48

Merge branch 'main' into m-bridge

94e41bb

fix bug in report generation log name

07d13e1

fix unit test

59de221

fix copyright year

91f7b35

Merge remote-tracking branch 'refs/remotes/origin/m-bridge' into m-br…

28435e5

…idge

srivatsankrishnan marked this pull request as ready for review January 17, 2026 01:54

srivatsankrishnan requested review from alexmanle, amaslenn and jeffnvidia as code owners January 17, 2026 01:54

alexmanle approved these changes Jan 17, 2026

View reviewed changes

coderabbitai bot reviewed Jan 17, 2026

View reviewed changes

src/cloudai/workloads/megatron_bridge/report_generation_strategy.py Show resolved Hide resolved

tests/report_generation_strategy/test_megatron_bridge_report_generation_strategy.py Show resolved Hide resolved

srivatsankrishnan merged commit 3f0d873 into NVIDIA:main Jan 17, 2026
5 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix M-bridge report generation #778

Fix M-bridge report generation #778

Uh oh!

srivatsankrishnan commented Jan 17, 2026

Uh oh!

coderabbitai bot commented Jan 17, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

greptile-apps bot commented Jan 17, 2026

Sequence Diagram

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Fix M-bridge report generation #778

Fix M-bridge report generation #778

Uh oh!

Conversation

srivatsankrishnan commented Jan 17, 2026

Summary

Test Plan

Additional Notes

Uh oh!

coderabbitai bot commented Jan 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

greptile-apps bot commented Jan 17, 2026

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

coderabbitai bot commented Jan 17, 2026 •

edited

Loading