
Refactor: consolidate string utilities #2306

Closed

bobzhang
Contributor

Summary

  • consolidate surrogate pair utilities into string/view.mbt
  • remove now-unused string/utils.mbt

Testing

  • moon fmt
  • moon info
  • moon check
  • moon test

https://chatgpt.com/codex/tasks/task_e_68550e1d79c08320a0139902120114fd

<details>

<summary> Missing documentation for surrogate pair utilities </summary>

**Category**
Maintainability
**Code Snippet**
fn is_leading_surrogate(c : Int) -> Bool
fn is_trailing_surrogate(c : Int) -> Bool
fn code_point_of_surrogate_pair(leading : Int, trailing : Int) -> Char
**Recommendation**
Add documentation comments (`///` lines after the `///|` block marker) explaining each function's purpose, parameters, and return value.
**Reasoning**
These functions deal with UTF-16 surrogate pairs, a subtle part of Unicode, and should be documented so that other developers understand their purpose and proper usage.

</details>
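
A minimal sketch of what such documentation could look like. The signatures come from the snippet above and the body of code_point_of_surrogate_pair is the one shown elsewhere in this review; the bodies of the two predicates are assumptions based on the standard UTF-16 surrogate ranges, not the code from the PR.

```moonbit
///|
/// Returns `true` if `c` is a UTF-16 leading (high) surrogate,
/// i.e. a code unit in the range 0xD800..=0xDBFF.
/// (Body is an assumption; the PR only shows the signature.)
fn is_leading_surrogate(c : Int) -> Bool {
  0xD800 <= c && c <= 0xDBFF
}

///|
/// Returns `true` if `c` is a UTF-16 trailing (low) surrogate,
/// i.e. a code unit in the range 0xDC00..=0xDFFF.
/// (Body is an assumption; the PR only shows the signature.)
fn is_trailing_surrogate(c : Int) -> Bool {
  0xDC00 <= c && c <= 0xDFFF
}

///|
/// Combines a leading and a trailing surrogate into the supplementary
/// code point (U+10000..=U+10FFFF) that the pair encodes.
///
/// Both arguments are expected to be valid surrogates; the result is
/// not meaningful otherwise.
fn code_point_of_surrogate_pair(leading : Int, trailing : Int) -> Char {
  ((leading - 0xD800) * 0x400 + trailing - 0xDC00 + 0x10000).unsafe_to_char()
}
```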

<details>

<summary> No bounds checking in code_point_of_surrogate_pair </summary>

**Category**
Correctness
**Code Snippet**
fn code_point_of_surrogate_pair(leading : Int, trailing : Int) -> Char {
  ((leading - 0xD800) * 0x400 + trailing - 0xDC00 + 0x10000).unsafe_to_char()
}
**Recommendation**
Add parameter validation to ensure that leading and trailing are valid surrogate values:
```moonbit
fn code_point_of_surrogate_pair(leading : Int, trailing : Int) -> Char {
  if !is_leading_surrogate(leading) || !is_trailing_surrogate(trailing) {
    // abort takes a message; panic() does not accept one
    abort("Invalid surrogate pair")
  }
  ((leading - 0xD800) * 0x400 + trailing - 0xDC00 + 0x10000).unsafe_to_char()
}
```
**Reasoning**
The function calls unsafe_to_char() without validating its inputs, so invalid surrogate values could yield a Char that is not a valid Unicode code point.

</details>
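
To make the concern concrete, here is a small worked example of the arithmetic, written as a MoonBit test block. It assumes a recent toolchain in which assert_eq and assert_true are called without a `!` suffix. The sample values are illustrative and not taken from the PR: U+1F600 is encoded in UTF-16 as the pair 0xD83D 0xDE00, while 0x41 and 0x42 are ordinary code units.

```moonbit
///|
test "surrogate pair arithmetic" {
  // Valid pair: 0xD83D 0xDE00 decodes to U+1F600.
  // (0x3D * 0x400) + 0x200 + 0x10000 = 0xF400 + 0x200 + 0x10000
  let valid = (0xD83D - 0xD800) * 0x400 + 0xDE00 - 0xDC00 + 0x10000
  assert_eq(valid, 0x1F600)
  // Invalid input: 'A' and 'B' are not surrogates. The same formula
  // produces a negative number, which is not a Unicode code point.
  let invalid = (0x41 - 0xD800) * 0x400 + 0x42 - 0xDC00 + 0x10000
  assert_true(invalid < 0)
}
```

Passing such a value to unsafe_to_char() is what the recommended check is meant to prevent.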
<details>

<summary> Magic numbers in surrogate pair calculation </summary>

**Category**
Maintainability
**Code Snippet**
((leading - 0xD800) * 0x400 + trailing - 0xDC00 + 0x10000).unsafe_to_char()
**Recommendation**
Extract magic numbers into named constants:
```moonbit
let surrogate_offset = 0x10000
let surrogate_shift = 0x400
```
**Reasoning**
The surrogate pair calculation uses several magic numbers whose purpose isn't immediately clear. Named constants would make the code more maintainable and self-documenting

</details>
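
One way the function might read with the constants applied. Only surrogate_offset and surrogate_shift are named in the recommendation; leading_surrogate_base and trailing_surrogate_base are hypothetical names added here for illustration.

```moonbit
///|
/// First code point outside the Basic Multilingual Plane.
let surrogate_offset = 0x10000

///|
/// 2^10: each leading surrogate covers 1024 trailing values.
let surrogate_shift = 0x400

///|
/// First leading (high) surrogate code unit. Hypothetical name.
let leading_surrogate_base = 0xD800

///|
/// First trailing (low) surrogate code unit. Hypothetical name.
let trailing_surrogate_base = 0xDC00

///|
fn code_point_of_surrogate_pair(leading : Int, trailing : Int) -> Char {
  // Upper 10 bits come from the leading surrogate, lower 10 from the trailing one.
  let high = (leading - leading_surrogate_base) * surrogate_shift
  let low = trailing - trailing_surrogate_base
  (high + low + surrogate_offset).unsafe_to_char()
}
```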

@coveralls
Collaborator

Pull Request Test Coverage Report for Build 7402

Details

  • 3 of 3 (100.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 89.886%

Totals Coverage Status
Change from base Build 7401: 0.0%
Covered Lines: 8505
Relevant Lines: 9462

bobzhang closed this on Jul 3, 2025