-
Notifications
You must be signed in to change notification settings - Fork 42
Model Name/ModeL size for verify configs #772
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Model Name/ModeL size for verify configs #772
Conversation
|
Warning Rate limit exceeded@srivatsankrishnan has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 7 minutes and 1 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughEnforces non-empty Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Poem
🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @src/cloudai/workloads/megatron_bridge/megatron_bridge.py:
- Around line 43-44: The PR made model_name and model_size required with
Field(min_length=1) but still allows whitespace-only values; add a field
validator like the existing hf_token validator to strip and reject
empty/whitespace-only strings for "model_name" and "model_size" (use
@field_validator("model_name","model_size", mode="after") in the same Pydantic
model, name it e.g. validate_model_fields, strip the value, raise
ValueError(f"cmd_args.{info.field_name} cannot be empty.") if the result is
empty, and return the stripped string) so parsing will fail on whitespace-only
input.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
src/cloudai/workloads/megatron_bridge/megatron_bridge.pytests/slurm_command_gen_strategy/test_megatron_bridge_slurm_command_gen_strategy.py
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2026-01-05T22:23:56.998Z
Learnt from: srivatsankrishnan
Repo: NVIDIA/cloudai PR: 767
File: conf/experimental/megatron_bridge/test/b200/megatron_bridge_qwen_30b.toml:38-38
Timestamp: 2026-01-05T22:23:56.998Z
Learning: For the Megatron Bridge workload, the hf_token field in test TOML configurations is intentionally left empty (hf_token = "") to fail validation if not explicitly set by the user. CloudAI does not distribute secret keys, so users must provide their own Hugging Face token, and the validation failure is by design to enforce this.
Applied to files:
src/cloudai/workloads/megatron_bridge/megatron_bridge.pytests/slurm_command_gen_strategy/test_megatron_bridge_slurm_command_gen_strategy.py
📚 Learning: 2025-12-16T19:47:41.994Z
Learnt from: amaslenn
Repo: NVIDIA/cloudai PR: 754
File: src/cloudai/_core/registry.py:226-234
Timestamp: 2025-12-16T19:47:41.994Z
Learning: In this repository, prefer expressing behavioral documentation through tests rather than docstrings. Tests act as living, verified documentation. Reserve docstrings for interfaces or high-level descriptions, and avoid duplicating behavior that is already covered by tests.
Applied to files:
src/cloudai/workloads/megatron_bridge/megatron_bridge.pytests/slurm_command_gen_strategy/test_megatron_bridge_slurm_command_gen_strategy.py
🧬 Code graph analysis (1)
tests/slurm_command_gen_strategy/test_megatron_bridge_slurm_command_gen_strategy.py (1)
src/cloudai/workloads/megatron_bridge/megatron_bridge.py (1)
MegatronBridgeCmdArgs(26-89)
🔇 Additional comments (1)
tests/slurm_command_gen_strategy/test_megatron_bridge_slurm_command_gen_strategy.py (1)
86-88: LGTM! Test correctly updated for new required fields.The test now provides
model_nameandmodel_sizeto satisfy the new validation requirements while maintaining its focus on verifying that emptyhf_tokenis rejected. This change is necessary and consistent with other test methods in the file that already provide these fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Overview
Greptile Summary
Added Pydantic validation for model_name and model_size fields to catch empty/whitespace-only values during config parsing instead of at runtime.
- Added
min_length=1constraint tomodel_nameandmodel_sizeField definitions - Implemented
validate_model_fieldscustom validator that strips whitespace and rejects empty strings - Updated test to provide required fields, but missing dedicated test cases for the new validation logic
- Validation follows the same pattern as existing
hf_tokenvalidator (strips and returns cleaned value) - Successfully moves error detection from runtime to parse-time as intended
Confidence Score: 4/5
- This PR is safe to merge with minimal risk
- The validation logic is straightforward and follows existing patterns in the codebase. The changes successfully move validation from runtime to parse-time, improving error detection. Minor deduction for missing test coverage of the new validation logic.
- Consider adding test cases in
tests/slurm_command_gen_strategy/test_megatron_bridge_slurm_command_gen_strategy.pyto verify the new validation behavior
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| src/cloudai/workloads/megatron_bridge/megatron_bridge.py | 4/5 | Added min_length=1 constraint and custom validator for model_name and model_size to ensure they cannot be empty or whitespace-only, catching configuration errors earlier during parsing |
| tests/slurm_command_gen_strategy/test_megatron_bridge_slurm_command_gen_strategy.py | 3/5 | Updated existing test to provide required model_name and model_size fields, but missing test cases for the new validation logic |
Sequence Diagram
sequenceDiagram
participant User
participant ConfigParser
participant Pydantic
participant MegatronBridgeCmdArgs
participant FieldValidator
User->>ConfigParser: Load TOML config
ConfigParser->>Pydantic: model_validate(config_data)
Pydantic->>MegatronBridgeCmdArgs: Create instance
Note over Pydantic,MegatronBridgeCmdArgs: Field validation phase
Pydantic->>MegatronBridgeCmdArgs: Check min_length=1 on model_name
alt Empty string
MegatronBridgeCmdArgs-->>Pydantic: ValidationError
Pydantic-->>User: Error caught during parsing
else Non-empty string
MegatronBridgeCmdArgs->>FieldValidator: validate_model_fields(model_name)
FieldValidator->>FieldValidator: Strip whitespace
alt Whitespace-only
FieldValidator-->>Pydantic: ValueError
Pydantic-->>User: Error caught during parsing
else Valid value
FieldValidator->>MegatronBridgeCmdArgs: Return stripped value
end
end
Pydantic->>MegatronBridgeCmdArgs: Check min_length=1 on model_size
MegatronBridgeCmdArgs->>FieldValidator: validate_model_fields(model_size)
FieldValidator->>MegatronBridgeCmdArgs: Return stripped value
MegatronBridgeCmdArgs-->>Pydantic: Validated instance
Pydantic-->>User: Success (or ValidationError)
tests/slurm_command_gen_strategy/test_megatron_bridge_slurm_command_gen_strategy.py
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Overview
Greptile Summary
This PR adds Pydantic validation for model_name and model_size fields to catch empty/whitespace values at parse time rather than runtime.
Key Changes:
- Changed field definitions from
Field(default="")toField(min_length=1)making them required - Added custom validator that strips whitespace and rejects empty strings after stripping
- Updated test to include required fields when validating
hf_token - Added comprehensive tests for empty string and whitespace-only validation
Impact:
- Validation errors now occur during config parsing (
verify-configs,run,dry-run) instead of during command generation - Provides better user experience by catching errors earlier
- Values with leading/trailing whitespace are automatically stripped (consistent with
hf_tokenvalidator) - Runtime checks in
slurm_command_gen_strategy.py(lines 237-240) are now unreachable but harmless
Confidence Score: 4/5
- This PR is safe to merge with minimal risk - it moves validation earlier in the pipeline and improves error handling
- Score reflects thorough testing, clear implementation that achieves stated goals, and only minor stylistic suggestions around error message consistency. No functional bugs found.
- No files require special attention - both changed files have appropriate test coverage and follow existing patterns
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| src/cloudai/workloads/megatron_bridge/megatron_bridge.py | 4/5 | Added Pydantic validation for model_name and model_size fields to ensure they are non-empty strings, moving validation from runtime to parse time. Changes are sound and well-tested. |
| tests/slurm_command_gen_strategy/test_megatron_bridge_slurm_command_gen_strategy.py | 5/5 | Added comprehensive test coverage for the new validation logic, including tests for empty strings and whitespace-only values. Tests correctly verify both Pydantic's min_length validation and the custom validator. |
Sequence Diagram
sequenceDiagram
participant User
participant TOML as TOML Config File
participant Pydantic as Pydantic Validator
participant CmdArgs as MegatronBridgeCmdArgs
participant MinLength as min_length Validator
participant Custom as Custom Validator
participant CmdGen as Command Generator
User->>TOML: Define model_name & model_size
TOML->>Pydantic: Parse config (verify-configs/run/dry-run)
alt Field is missing
Pydantic-->>User: Error: Field required
else Field is empty string ""
Pydantic->>MinLength: Check min_length=1
MinLength-->>User: Error: String should have at least 1 character
else Field is whitespace only " "
Pydantic->>MinLength: Check min_length=1
MinLength->>Custom: Pass (length >= 1)
Custom->>Custom: Strip whitespace
Custom->>Custom: Check if empty
Custom-->>User: Error: cmd_args.model_name cannot be empty
else Field is valid " qwen3 "
Pydantic->>MinLength: Check min_length=1
MinLength->>Custom: Pass
Custom->>Custom: Strip to "qwen3"
Custom->>CmdArgs: Return stripped value
CmdArgs->>CmdGen: Generate command
Note over CmdGen: Runtime checks (lines 237-240)<br/>now unreachable
end
| model_name: str = Field(min_length=1) | ||
| model_size: str = Field(min_length=1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The min_length=1 constraint combined with the custom validator that strips whitespace creates inconsistent error messages:
- Empty string
"": Pydantic's generic "String should have at least 1 character" - Whitespace
" ": Custom message "cmd_args.model_name cannot be empty."
Consider removing min_length=1 to provide consistent error messages for all invalid inputs. The custom validator alone is sufficient to catch both cases.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Summary
Currently we don't do strict
pydanticvalidation of all the fields by design to allow users to add/remove new flags as and when they are available. This is particularly needed feature for fast moving frameworks such as Nemo/AIDyanno and M-bridge now.However, Model familiy and Model name is required and user needs to provide this field. Currently the additional check for passing empty strings (or removing from toml file as done by this RM) was happening during run time causing a stack trace. This PR ensure this is caught during parsing using
verify-configsor usingrunordry-runmodel.Reference: RM 4829207
Test Plan
CI/CD
Additional Notes