Skip to content

Conversation

@srivatsankrishnan
Copy link
Contributor

Summary

The report generation looks for the file name megatron_bridge_launcher.log instead of the correct file name cloudai_megatron_bridge_launcher.log as generated by the command generation strategy.

Test Plan

CI/CD
Internal cluster

Additional Notes

Include any other notes or comments about the pull request here. This can include challenges faced, future considerations, or context that reviewers might find helpful.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Jan 17, 2026

📝 Walkthrough

Walkthrough

The changes update log file naming conventions from megatron_bridge_launcher.log to cloudai_megatron_bridge_launcher.log in report generation logic and corresponding tests. Error messages are also adjusted to reference output directories instead of specific file paths.

Changes

Cohort / File(s) Summary
Megatron Bridge Log File Updates
src/cloudai/workloads/megatron_bridge/report_generation_strategy.py, tests/report_generation_strategy/test_megatron_bridge_report_generation_strategy.py
Consistent rename of log filename references from megatron_bridge_launcher.log to cloudai_megatron_bridge_launcher.log across log discovery, metrics retrieval, and report generation paths. Error messages updated to include base output directory context.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 A log file renamed with cloudai's new flair,
From bridge to clouds, the launcher flies through the air,
Consistent and clean, the pattern flows true,
Simple adjustments in the files we renew! ✨

🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Fix M-bridge report generation' directly describes the main change: correcting the report generation to use the correct log filename (cloudai_megatron_bridge_launcher.log instead of megatron_bridge_launcher.log).
Description check ✅ Passed The description clearly relates to the changeset by explaining that report generation was using the wrong log filename and that this PR corrects it to use the proper filename generated by the command generation strategy.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@srivatsankrishnan srivatsankrishnan marked this pull request as ready for review January 17, 2026 01:54
@greptile-apps
Copy link
Contributor

greptile-apps bot commented Jan 17, 2026

Greptile Summary

Fixed a critical bug where report generation was looking for the wrong log filename. The command generation strategy creates cloudai_megatron_bridge_launcher.log, but the report generation strategy was searching for megatron_bridge_launcher.log, causing report generation to always fail.

Key Changes:

  • Updated log filename references in get_log_file() and results_file property from megatron_bridge_launcher.log to cloudai_megatron_bridge_launcher.log
  • Improved error messages to show the output directory path instead of hardcoded incorrect filename
  • Updated test fixture to use the correct filename, ensuring tests validate actual runtime behavior
  • Updated copyright year to 2025-2026

Impact:
This fix ensures Megatron-Bridge report generation works correctly by aligning the filename used for reading logs with the filename created during command execution.

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk - it's a straightforward bug fix with comprehensive test coverage
  • The change is a simple string replacement to fix a filename mismatch. All references have been updated consistently (4 occurrences in the main file), the test fixture has been updated to match, and the fix aligns with the actual filename generated by the command strategy (verified in slurm_command_gen_strategy.py:115). No other files reference the old filename.
  • No files require special attention

Important Files Changed

Filename Overview
src/cloudai/workloads/megatron_bridge/report_generation_strategy.py Fixed log filename from megatron_bridge_launcher.log to cloudai_megatron_bridge_launcher.log to match what command generation creates. Also improved error messages to show directory path instead of hardcoded wrong filename.
tests/report_generation_strategy/test_megatron_bridge_report_generation_strategy.py Updated test fixture to create log file with correct cloudai_megatron_bridge_launcher.log filename, ensuring tests validate actual runtime behavior.

Sequence Diagram

sequenceDiagram
    participant CmdGen as Command Generation Strategy
    participant FS as File System
    participant RepGen as Report Generation Strategy
    
    Note over CmdGen,RepGen: Before this PR
    CmdGen->>FS: Create cloudai_megatron_bridge_launcher.log
    RepGen->>FS: Look for megatron_bridge_launcher.log
    FS-->>RepGen: File not found ❌
    RepGen->>RepGen: Error: Cannot generate report
    
    Note over CmdGen,RepGen: After this PR
    CmdGen->>FS: Create cloudai_megatron_bridge_launcher.log
    RepGen->>FS: Look for cloudai_megatron_bridge_launcher.log
    FS-->>RepGen: File found ✓
    RepGen->>RepGen: Parse metrics and generate report
Loading

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/cloudai/workloads/megatron_bridge/report_generation_strategy.py`:
- Around line 31-39: The literal log filename is duplicated in get_log_file and
the results_file property; add a class-level constant (e.g., LOG_FILENAME =
"cloudai_megatron_bridge_launcher.log") to the MegatronBridge report generation
class and use that constant in get_log_file (construct log =
self.test_run.output_path / self.LOG_FILENAME) and in results_file (fallback
path using self.LOG_FILENAME) so the filename is defined in one place and can be
reused by tests.

In
`@tests/report_generation_strategy/test_megatron_bridge_report_generation_strategy.py`:
- Line 41: Replace the hard-coded filename
"cloudai_megatron_bridge_launcher.log" in the test with the strategy's filename
constant so the test follows the source of truth; import
MegatronBridgeReportGenerationStrategy (or the class that defines the filename
constant) and use its filename constant (e.g., FILENAME) when writing/reading
(refer to MegatronBridgeReportGenerationStrategy and its filename constant) to
keep the test synchronized with the implementation.

@srivatsankrishnan srivatsankrishnan merged commit 3f0d873 into NVIDIA:main Jan 17, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants