feat(cli): add edit command for in-place localization file modifications #6
- Introduced a new `Edit` command with a `set` subcommand to add, update, or remove localization keys.
- Implemented validation for input files and optional parameters.
- Enhanced error handling for edit operations.

- Enhanced language resolution logic in the `set` subcommand to handle single-language files more gracefully.
- Added error handling for missing language specification in multi-language contexts.

- Added a `dry_run` option to the `set` subcommand, allowing users to preview changes without writing to files.
- Refactored `run_edit_set_command` to use an `EditSetOptions` struct for better argument management.
- Enhanced tests to verify that dry-run operations do not modify input files.

- Modified the `set` subcommand to accept multiple input files using glob patterns.
- Enhanced validation to prevent using the `--output` option with multiple inputs.
- Updated tests to cover new functionality for handling multiple input files and glob patterns.
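A minimal sketch of the `--output` guard from the commit above, assuming the glob patterns have already been expanded into a list of paths. The function name `validate_output_usage` and the error wording are hypothetical and not taken from this PR:

```rust
/// Hypothetical helper (not from this PR): reject `--output` when more than
/// one input file was resolved, since a single output path would be ambiguous.
fn validate_output_usage(inputs: &[String], output: Option<&str>) -> Result<(), String> {
    if output.is_some() && inputs.len() > 1 {
        return Err(format!(
            "--output cannot be used with multiple input files ({} resolved)",
            inputs.len()
        ));
    }
    Ok(())
}
```

With a single input, or with no `--output` at all, the check passes and each file can be edited in place.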
…tures
- Enhanced README.md to reflect the addition of the edit command in the CLI, detailing its functionalities for adding, updating, and removing localization keys.
- Updated langcodec-cli/README.md to include comprehensive examples and options for the edit command, emphasizing support for multiple input files and glob patterns.

- Introduced a `--continue-on-error` flag to the `set` subcommand, allowing the processing of all input files even if some fail.
- Enhanced error handling to report failures at the end while continuing with valid files.
- Updated README.md to document the new option and its functionality.
- Added tests to verify the behavior of the continue-on-error feature.

- Added a summary output for processed, successful, and failed files during the execution of the `set` subcommand.
- Introduced a mechanism to skip missing input files while continuing processing.
- Improved error handling to provide clearer feedback on the number of failures at the end of the command execution.
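The continue-on-error behaviour and the end-of-run summary can be sketched as follows. This is an illustration under assumptions, not the PR's code: the `Summary` struct, its field names, and `process_all` are hypothetical, and the per-file edit is passed in as a closure so the sketch stays self-contained:

```rust
/// Hypothetical per-run counters; the field names are illustrative only.
#[derive(Debug, Default, PartialEq)]
struct Summary {
    processed: usize,
    succeeded: usize,
    failed: usize,
}

/// Applies `apply` to each input path. With `continue_on_error` set, failures
/// are counted and processing moves on; otherwise the loop stops at the first
/// failure. The caller decides how to report `failed` at the end.
fn process_all<F>(inputs: &[&str], continue_on_error: bool, mut apply: F) -> Summary
where
    F: FnMut(&str) -> Result<(), String>,
{
    let mut summary = Summary::default();
    for &input in inputs {
        summary.processed += 1;
        match apply(input) {
            Ok(()) => summary.succeeded += 1,
            Err(err) => {
                summary.failed += 1;
                eprintln!("failed to process {input}: {err}");
                if !continue_on_error {
                    break;
                }
            }
        }
    }
    summary
}
```

Returning the counters instead of an error keeps the reporting decision in one place: the caller can print the totals and then exit non-zero if `failed` is greater than zero.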
- Added a new function `unescape_strings_minimal` to handle specific escape sequences in string values, ensuring that double backslashes and apostrophes are correctly processed without introducing extra escapes.
- Updated the `Parser` implementation to use the new unescaping function when processing string values.
- Added unit tests to verify the unescaping logic across various escape scenarios.

- Updated the string parsing logic to directly convert substrings to `String` instead of using a separate unescaping function.
- Enhanced the escape handling for backslashes to preserve their count before apostrophes.
- Adjusted unit tests to verify correct escaping behavior for strings containing apostrophes and backslashes.

… values
- Introduced a new test to verify that trailing spaces in string values are preserved during parsing.
- Updated existing test for unescaping strings to improve readability by formatting the assertion across multiple lines.

- Removed the `multiline_values_to_one_line` function and integrated its logic directly into the parsing process for better clarity and efficiency.
- Introduced a new `parse_strings_content` function to handle the parsing of string content, enhancing modularity and maintainability.
- Updated the `from_reader` method to read the entire input at once, simplifying the handling of UTF-8 strings.
- Enhanced comment handling and key-value pair extraction to ensure accurate parsing of .strings files.

- Updated the parsing logic to use a new `parse_quoted_utf8` function for improved handling of quoted strings, preserving backslashes and non-ASCII characters accurately.
- Simplified the `escape_strings_token` function to better manage escape sequences and backslashes, ensuring correct preservation of string literals.
- Added a new test case to verify the parsing of strings with trailing spaces and non-ASCII characters, enhancing overall test coverage.
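To illustrate what "preserving backslashes and non-ASCII characters" means for a quoted token, here is a minimal sketch. It is not the PR's `parse_quoted_utf8`; the name `split_quoted` and its return shape are assumptions made for this example. It scans by `char` so slicing always lands on UTF-8 boundaries, and it returns the raw contents without unescaping:

```rust
/// Hypothetical helper: given text that starts with `"`, return the raw
/// contents up to the first unescaped closing quote, plus the remainder.
/// Backslashes, trailing spaces, and non-ASCII characters are kept verbatim.
fn split_quoted(input: &str) -> Option<(&str, &str)> {
    let rest = input.strip_prefix('"')?;
    let mut escaped = false;
    for (idx, ch) in rest.char_indices() {
        if escaped {
            // The previous character was a backslash; this one is literal.
            escaped = false;
            continue;
        }
        match ch {
            '\\' => escaped = true,
            // `"` is one byte, so `idx + 1` is a valid char boundary.
            '"' => return Some((&rest[..idx], &rest[idx + 1..])),
            _ => {}
        }
    }
    None // no closing quote found
}
```

Tracking a single `escaped` flag handles runs of backslashes without counting them backwards from each quote.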
- Enhanced the `Parser` implementation to extract the language from the header of .strings files, ensuring accurate language representation.
- Introduced a new function `try_skip_langcodec_header` to detect and skip the auto-generated header at the start of the file, improving parsing efficiency.
- Updated the `parse_strings_content` function to handle the detection of the header and manage the state of seen pairs during parsing.
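A minimal sketch of reading one header field, based on the `//: Language: English` line format shown in the parser diff further down. The helper name `header_field` is hypothetical and this is not the PR's `try_skip_langcodec_header`:

```rust
/// Hypothetical helper: parse a header line of the form `//: Key: Value`
/// into a trimmed `(key, value)` pair. Ordinary `//` comments return `None`.
fn header_field(line: &str) -> Option<(String, String)> {
    let rest = line.trim().strip_prefix("//:")?;
    let (key, value) = rest.split_once(':')?;
    Some((key.trim().to_string(), value.trim().to_string()))
}
```

Splitting only at the first colon after the prefix leaves any later colons inside the value.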

- Reformatted the error handling logic in the `run_edit_set_command` function for better readability.
- Enhanced clarity by using a multi-line format for the error message when file processing fails, maintaining consistent code style.
Pull Request Overview
Introduces a comprehensive "edit" command to the CLI for unified in-place localization file modifications. The edit command consolidates add, update, and remove operations into a single interface that supports glob patterns, dry-run previews, and multi-file processing.
- Adds an `edit set` command for unified add/update/remove operations on localization files
- Implements robust parsing and escaping improvements for the .strings format to handle edge cases
- Provides comprehensive CLI integration with glob support, validation, and error handling
Reviewed Changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| langcodec/src/plural_rules.rs | Formatting improvements to multi-line assert statements |
| langcodec/src/formats/strings.rs | Major rewrite of .strings parser with improved Unicode handling and escaping |
| langcodec-cli/tests/edit_cli_tests.rs | Comprehensive test suite for edit command functionality |
| langcodec-cli/src/main.rs | CLI integration of edit command with argument parsing |
| langcodec-cli/src/edit.rs | Core implementation of edit command logic |
| langcodec-cli/README.md | Documentation updates for edit command |
| README.md | High-level documentation updates mentioning edit functionality |
| use std::collections::HashMap; | ||
| use std::fs::File; | ||
| use std::io::Read; | ||
| // keep imports minimal; actual Read trait is used via fully qualified call above |
This comment is misleading since there's no 'fully qualified call above' - the Read trait import was removed. Either remove this comment or clarify what it refers to.
| // keep imports minimal; actual Read trait is used via fully qualified call above | |
| // keep imports minimal; actual Read trait is used via fully qualified call below |
| fn from_reader<R: std::io::BufRead>(reader: R) -> Result<Self, Error> { | ||
| let mut file_content = reader.lines().collect::<Result<Vec<_>, _>>()?.join("\n"); | ||
|
|
||
| Format::multiline_values_to_one_line(&mut file_content); | ||
|
|
||
| // For simplicity, we assume there are no multi-line comments and in-line comments in the file. | ||
| let lines = file_content.lines().collect::<Vec<_>>(); | ||
|
|
||
| let mut header = HashMap::<String, String>::new(); | ||
|
|
||
| let mut last_comment: Option<&str> = None; | ||
|
|
||
| // strings pair pattern: "key" = "value"; | ||
| let pairs = lines | ||
| .iter() | ||
| .filter_map(|line| { | ||
| let trimmed = line.trim(); | ||
| if trimmed.starts_with("//:") { | ||
| // This is a header line, we can extract metadata from it. | ||
| // | ||
| // Example: "//: Language: English" | ||
| let parts: Vec<&str> = trimmed.splitn(3, ':').collect(); | ||
| if parts.len() == 3 { | ||
| let key = parts[1].trim().to_string(); | ||
| let value = parts[2].trim().to_string(); | ||
| header.insert(key, value); | ||
| } | ||
| return None; // Skip header lines | ||
| } else if trimmed.is_empty() | ||
| || trimmed.starts_with("/*") | ||
| || trimmed.starts_with("//") | ||
| { | ||
| last_comment = Some(trimmed); | ||
| return None; // Skip empty lines and comments | ||
| } | ||
|
|
||
| let parts: Vec<&str> = trimmed.splitn(3, '=').collect(); | ||
| if parts.len() != 2 { | ||
| return None; // Invalid line format | ||
| } | ||
|
|
||
| let key = parts[0].trim().trim_matches('"').to_string(); | ||
| // Take the right-hand side up to the terminating semicolon, ignoring any inline comments afterward | ||
| let rhs = parts[1].trim(); | ||
| let rhs_before_semicolon = if let Some(idx) = rhs.find(';') { | ||
| &rhs[..idx] | ||
| } else { | ||
| rhs | ||
| }; | ||
| let rhs_trimmed = rhs_before_semicolon.trim(); | ||
|
|
||
| // Expect a quoted string value; handle empty string and ignore any trailing inline comments | ||
| let mut value = String::new(); | ||
| if !rhs_trimmed.is_empty() { | ||
| if rhs_trimmed.starts_with('"') { | ||
| // Find the last unescaped quote to close the value | ||
| let mut last_quote_pos: Option<usize> = None; | ||
| let bytes = rhs_trimmed.as_bytes(); | ||
| let mut i = 1; // start after the first quote | ||
| while i < bytes.len() { | ||
| if bytes[i] == b'"' { | ||
| // count preceding backslashes | ||
| let mut backslashes = 0; | ||
| let mut j = i; | ||
| while j > 0 && bytes[j - 1] == b'\\' { | ||
| backslashes += 1; | ||
| j -= 1; | ||
| } | ||
| if backslashes % 2 == 0 { | ||
| last_quote_pos = Some(i); | ||
| break; | ||
| } | ||
| } | ||
| i += 1; | ||
| } | ||
| if let Some(end_pos) = last_quote_pos { | ||
| // Safe slicing at UTF-8 boundaries because we only slice on quote bytes | ||
| value = rhs_trimmed[1..end_pos].to_string(); | ||
| } else { | ||
| // Malformed line without closing quote; treat as empty | ||
| value = String::new(); | ||
| } | ||
| } else { | ||
| // Not a quoted value; treat as empty to be permissive | ||
| value = String::new(); | ||
| } | ||
| } | ||
|
|
||
| let comment = match last_comment { | ||
| Some(comment) if comment.starts_with("/*") || comment.starts_with("//") => { | ||
| Some(comment.trim().to_string()) | ||
| } | ||
| _ => None, | ||
| }; | ||
|
|
||
| // Clear the last_comment after using it | ||
| if comment.is_some() { | ||
| last_comment = None; | ||
| } | ||
|
|
||
| Some(Pair { | ||
| key, | ||
| value, | ||
| comment, | ||
| }) | ||
| }) | ||
| .collect(); | ||
|
|
||
| // Extract language from header if available | ||
| let language = &header.get("Language").cloned().unwrap_or_default(); | ||
|
|
||
| // Read entire input into a string (UTF-8 expected here; UTF-16 handled in read_from) | ||
| let mut reader = reader; |
The reader is being used after a mutable borrow. This will fail because reader was already borrowed mutably on line 36. Consider using reader directly instead of reborrowing.
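For reference, a minimal sketch of the read-everything-at-once pattern the new lines use, assuming the reader is taken by value as in the `from_reader` signature. The function name `read_all` is hypothetical. Note that `let mut reader = reader;` rebinds the owned value as mutable; it does not create a second borrow. The `Read` method is reached through a fully qualified call, so only `BufRead` needs importing:

```rust
use std::io::BufRead;

/// Hypothetical helper: read the whole input into one `String`.
/// `reader` is owned, so rebinding it as `mut` is a move, not a reborrow.
fn read_all<R: BufRead>(reader: R) -> std::io::Result<String> {
    let mut reader = reader;
    let mut content = String::new();
    // `BufRead` has `Read` as a supertrait, so this fully qualified call
    // works without importing `std::io::Read`.
    std::io::Read::read_to_string(&mut reader, &mut content)?;
    Ok(content)
}
```

This compiles only if the reader has not been consumed earlier in the same function, for example by `reader.lines()`, which takes the reader by value.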
| // Strings format only supports singular translations | ||
| // with plain text values. | ||
| match Translation::plain_translation(entry.value) { | ||
| // Strings format only supports singular translations. Preserve the value verbatim. |
The comment on line 500 mentions 'Preserve the value verbatim' but the code still processes the Translation enum. Update the comment to accurately reflect what the code does - it extracts the singular value from the Translation.
This pull request introduces a new unified "edit" command to the CLI, enabling in-place add, update, and remove operations for localization files across multiple formats and inputs. The change includes documentation updates, new command-line options, and a robust implementation that supports glob patterns, dry-run previews, error handling, and per-file processing. The edit command is now fully integrated into the CLI alongside existing commands.
New CLI features and documentation:
- Added the `edit` command to both `README.md` and `langcodec-cli/README.md`, describing its unified add/update/remove functionality, support for glob patterns, multiple inputs, and preview mode. Detailed usage examples and options are now included.

Edit command implementation:
- Added `langcodec-cli/src/edit.rs` with the core logic for handling the `edit set` operation, including input validation, multi-file processing, error reporting, dry-run support, and format-specific write-back.

CLI integration:
- Added the `edit` command and subcommand structure in `langcodec-cli/src/main.rs`, including argument parsing, option handling, and invocation from the main CLI entrypoint.