-
Notifications
You must be signed in to change notification settings - Fork 0
Add package_todo.yml
format validation to validate command
#20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Adds validation to ensure package_todo.yml files maintain the correct serialization format, preventing manual edits from breaking the standard format and causing noise when running `pks update`. - Add PackageTodoFormatChecker validator that compares existing files with their expected serialized format - Integrate validator into existing validation system - Add comprehensive tests with fixtures for both valid and invalid formats - Show context-aware error messages suggesting correct update command based on packs_first_mode This prevents issues where mass search-replace operations (like renaming packs) result in incorrectly formatted files that create large changesets when people run `pks update`.
Add detailed doc comments explaining the package_todo.yml format validation functionality to help future maintainers understand: - How the validation works (deserialize -> re-serialize -> compare) - Why it's needed (prevent format inconsistencies from manual edits) - Common causes of validation failures - The role of serialize_package_todo in maintaining format consistency Documentation covers: - Module-level overview of the validation purpose - Function-level details for validation logic - Test documentation explaining test scenarios - Implementation details for the serialization format
Replace sequential iteration with par_iter() to validate package_todo.yml files in parallel, improving performance for large codebases with many packages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds validation to ensure package_todo.yml files maintain the correct serialization format, preventing manual edits from breaking the standard format and causing noise when running pks update
.
- Adds a new PackageTodoFormatChecker validator that compares existing files with their expected serialized format
- Integrates the validator into the existing validation system using parallel processing for performance
- Includes comprehensive tests with fixtures for both valid and invalid formats
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
Show a summary per file
File | Description |
---|---|
tests/package_todo_format_test.rs | Test suite validating correct and incorrect package_todo.yml formats |
tests/fixtures/incorrectly_formatted_package_todo/* | Test fixture with incorrectly ordered violations to test validation failure |
tests/fixtures/contains_package_todo/packs/foo/package_todo.yml | Updated existing fixture with YAML separator for consistency |
src/packs/package_todo.rs | Made serialize_package_todo function public and added comprehensive documentation |
src/packs/checker/package_todo.rs | New validator implementation for package_todo.yml format checking |
src/packs/checker.rs | Integration of new validator into validation system |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
Replace &String parameters with &str to avoid unnecessary heap allocations and improve API design. Addresses feedback from code review. - Change serialize_package_todo parameter from &String to &str - Change header function parameter from &String to &str - Remove unnecessary to_string() call when passing pack_name
/// | ||
/// This function is used both for writing new package_todo.yml files and for | ||
/// format validation to ensure existing files match the expected format. | ||
pub(crate) fn serialize_package_todo( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not super familiar with Rust's modules. Is this a code smell? I didn't want to dive into the "how should we organize and expose these types/methods" rabbit hole just yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems reasonable, if you don't want the method to be part of the public API. If you really want to limit the usage you can declare pub(super)
so only this and the parent module can use it, but that's likely overkill for this project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside from the unwrap
, this looks safe to ship as is. I've left additional comments on how to use anyhow
for consistency within pks
.
It's been a while since I've written any Rust though, so it would be good to have a second set of eyes on this.
src/packs/checker/package_todo.rs
Outdated
package_todo_path: &Path, | ||
pack_name: &str, | ||
packs_first_mode: bool, | ||
) -> Result<(), String> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/packs/checker/package_todo.rs
Outdated
|
||
// Compare the current content with the expected serialized format | ||
if current_content != expected_content { | ||
return Err(format!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If using anyhow
, this would need to be written as
return Err(format!( | |
return Err(anyhow!( |
with a use anyhow::anyhow
in the imports above.
src/packs/checker/package_todo.rs
Outdated
let current_content = | ||
fs::read_to_string(package_todo_path).map_err(|e| { | ||
format!("Failed to read {}: {}", package_todo_path.display(), e) | ||
})?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With anyhow
, you wouldn't need the map_err
here
let current_content = | |
fs::read_to_string(package_todo_path).map_err(|e| { | |
format!("Failed to read {}: {}", package_todo_path.display(), e) | |
})?; | |
let current_content = fs::read_to_string(package_todo_path)?; |
src/packs/checker/package_todo.rs
Outdated
let package_todo: PackageTodo = serde_yaml::from_str(¤t_content) | ||
.map_err(|e| { | ||
format!("Failed to parse {}: {}", package_todo_path.display(), e) | ||
})?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With anyhow
, you wouldn't need the map_err
here
let package_todo: PackageTodo = serde_yaml::from_str(¤t_content) | |
.map_err(|e| { | |
format!("Failed to parse {}: {}", package_todo_path.display(), e) | |
})?; | |
let package_todo: PackageTodo = serde_yaml::from_str(¤t_content)?; |
src/packs/checker/package_todo.rs
Outdated
.par_iter() | ||
.filter_map(|pack| { | ||
let package_todo_path = | ||
pack.yml.parent().unwrap().join("package_todo.yml"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.unwrap()
can lead to runtime panics. It's best to explicitly handle the None
case.
/// Returns [`None`] if the path terminates in a root or prefix, or if it's
/// the empty string.
If, in this context, the parent will never be one of the above cases, we can instead use expect("parent will always exist")
to provide the same behavior as unwrap with additional context as to why it's safe.
&pack.name, | ||
configuration.packs_first_mode, | ||
) | ||
.err() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If using anyhow
you'll need to convert these into String
.err() | |
.map_err(|e| e.to_string()).err |
/// | ||
/// This function is used both for writing new package_todo.yml files and for | ||
/// format validation to ensure existing files match the expected format. | ||
pub(crate) fn serialize_package_todo( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems reasonable, if you don't want the method to be part of the public API. If you really want to limit the usage you can declare pub(super)
so only this and the parent module can use it, but that's likely overkill for this project.
src/packs/checker.rs
Outdated
use crate::packs::pack::write_pack_to_disk; | ||
use crate::packs::pack::Pack; | ||
use crate::packs::package_todo; | ||
use crate::packs::package_todo as package_todo_module; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would try to differentiate the names of these two modules more, if not at a higher level, then at least for this alias. It's not clear what the difference is.
If one is meant to be public and the other internal, it may be worth restructuring the code to create an create::internal::
namespace.
@@ -0,0 +1,125 @@ | |||
//! Package todo format validation checker. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see now. We've introduced a new package_todo
module when there was already an existing one in the checker
namespace. That's why you're having namespace collisions.
None | ||
} else { | ||
Some(validation_errors) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is totally legit and there's nothing wrong with it, but I would prefer pattern matching in general.
It's also just good to know how to pattern match a slice.
match &validation_errors[..] {
[] => None,
_ => Some(validation_errors)
}
I whipped up a more complete example of matching over slices for you.
https://play.rust-lang.org/?version=stable&mode=debug&edition=2024&gist=7932bced185feb6f53dc7b23194c3466
To be 100% clear, I don't actually have a problem with the if/else
here.
You were just asking about idiomatic rust so it seemed to be a learning opportunity to bring this up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! This is exactly the kind of feedback I'm looking for. Thank you!
)); | ||
} | ||
|
||
Ok(()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a place I would definitely prefer pattern matching over an if return
.
The if return
structure leaves room to wedge code in between this and returning the Ok
, which can encourage complexity in the future as more early returns get added in here.
match current_content {
expected_content => Ok(()),
_ => {
Err(/*...*/)
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much for taking this on! I'll leave the syntax and such to the rust experts, but this looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be nice to have a test that ensures packs without a package todo file are correctly skipped. I wouldn't consider this a blocker though!
src/packs/checker/package_todo.rs
Outdated
return Err(format!( | ||
"Package todo file {} is not in the expected format. Please run `{}` to fix it.", | ||
package_todo_path.display(), | ||
if packs_first_mode { "pks update" } else { "bin/packwerk update-todo" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can definitely be improved, but we have https://github.com/rubyatscale/pks/blob/main/src/packs/bin_locater.rs for identifying the executable.
- Remove unnecessary 'as package_todo_module' alias in checker.rs - Rename package_todo.rs to todo_format.rs for clarity - Replace unwrap() with explicit expect() for better error handling Co-authored-by: Sweta Sanghavi <sweta.sanghavi@gusto.com> Co-authored-by: Denis Kisselev <denis.kisselev@gusto.com> Co-authored-by: Martin Emde <martin.emde@gusto.com>
…tion - Migrate from custom Result<(), String> to anyhow::Result<()> for consistency with codebase - Remove manual map_err calls and leverage anyhow's automatic error conversion - Use bin_locater::packs_bin_name() for command suggestions in packs_first_mode - Maintain backward compatibility with bin/packwerk update-todo for legacy mode Addresses PR feedback to align error handling patterns and centralize command generation.
- Rename test file: package_todo_format_test.rs → todo_format_validation_test.rs - Add test for violations existing but no package_todo.yml files - Add test for clean case with no violations and no todo files - Correct test #2 to use privacy_violation_overrides fixture with actual violations - Ensure validator correctly skips packs without todo files in all scenarios
Adds validation to ensure package_todo.yml files maintain the correct serialization format, preventing manual edits from breaking the standard format and causing noise when running
pks update
.Changes
Test plan
This prevents issues where mass search-replace operations (like renaming packs) result in incorrectly formatted files that create large changesets when people run
pks update
.