Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes panic on DOMString::strip_leading_and_trailing_ascii_whitespace #21978

Merged
merged 1 commit into from Nov 16, 2018

Conversation

jimberlage
Copy link
Contributor

@jimberlage jimberlage commented Oct 18, 2018

This changes DOMString::strip_leading_and_trailing_ascii_whitespace to handle multi-byte unicode characters.


  • There are tests for these changes OR
  • These changes do not require tests because _____

This change is Reviewable

@highfive
Copy link

Thanks for the pull request, and welcome! The Servo team is excited to review your changes, and you should hear from @asajeffrey (or someone else) soon.

@highfive
Copy link

Heads up! This PR modifies the following files:

  • @asajeffrey: components/script/dom/bindings/str.rs
  • @KiChjang: components/script/dom/bindings/str.rs

@highfive highfive added the S-awaiting-review There is new code that needs to be reviewed. label Oct 18, 2018
@highfive
Copy link

warning Warning warning

  • These commits include an empty title element (<title></title>). Consider adding appropriate metadata.

Copy link
Member

@SimonSapin SimonSapin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks! Comments inline.

Some(idx) => idx + 1,
match self.0.rfind(|ref c| !char::is_ascii_whitespace(c)) {
Some(idx) => {
let last_non_whitespace = self.0.char_indices().find(|pair| pair.0 > idx);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This unnecessarily goes though the whole string from the start to idx. I think something like this would also work and be more efficient. The variable names are also slightly wrong (but already were before this PR).

            Some(last_non_whitespace) => {
                let after_last_non_whitespace = last_non_whitespace +
                    // `unwrap` never panics here because `rfind` found at least one char.
                    self.0[last_non_whitespace..].chars().next().unwrap().len_utf8();
                self.0.truncate(after_last_non_whitespace)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I’ve typed all this, I realize this whole match block can be removed and better written as:

self.0.truncate(self.0.trim_end_matches(|c| c.is_ascii_whitespace()).len());

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @SimonSapin for spamming your comment, I'm trying to understand servo I have on (maybe stupid :-)) question. With trim_end_matches a &str is created and the length of it is used. Would it also be possible to use trim_matches and replace the String member of the DOMString with the resulting &str or is the member used also somewhere else?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.0 = trim_matches(…).to_string() would work, but it makes a new memory allocation and copies the preserved part of the string. This algorithm reuses the existing allocation and, if first_non_whitespace happens to be zero, does not need to do any copying.

Some(idx) => {
let last_non_whitespace = self.0.char_indices().find(|pair| pair.0 > idx);
if last_non_whitespace.is_some() {
self.0.truncate(last_non_whitespace.unwrap().0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’d recommend if let Some(x) = y instead of is_some + unwrap, but with the change suggested above this branch is gone.

@nox
Copy link
Contributor

nox commented Oct 18, 2018

@jdm Do we really need a unit test for that? I would rather we don't add more unit tests but will defer to you for the decision on whether to keep them in this PR.

@@ -7,6 +7,7 @@
#[cfg(test)] extern crate script;
#[cfg(test)] extern crate servo_url;

#[cfg(test)] mod dom;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just make a top-level module named str, no need to follow the module hierarchy of the script crate.

@jdm
Copy link
Member

jdm commented Oct 18, 2018

I would move the unit test cases into the integration test.

Copy link
Contributor Author

@jimberlage jimberlage left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 3 files reviewed, 3 unresolved discussions (waiting on @SimonSapin, @nox, and @asajeffrey)


components/script/dom/bindings/str.rs, line 187 at r1 (raw file):

Previously, SimonSapin (Simon Sapin) wrote…

Now that I’ve typed all this, I realize this whole match block can be removed and better written as:

self.0.truncate(self.0.trim_end_matches(|c| c.is_ascii_whitespace()).len());
</blockquote></details>

Done.
___
*[components/script/dom/bindings/str.rs, line 189 at r1](https://reviewable.io/reviews/servo/servo/21978#-LP3qt50UcnoV6yAPM36-r1-189:-LP8Z0xXBM97cDBU-oPY:b-896fix) ([raw file](https://github.com/servo/servo/blob/4ade61c35b85e9d122ed87e45d16ce355c9a9a35/components/script/dom/bindings/str.rs#L189)):*
<details><summary><i>Previously, SimonSapin (Simon Sapin) wrote…</i></summary><blockquote>

I’d recommend `if let Some(x) = y` instead of `is_some` + `unwrap`, but with the change suggested above this branch is gone.
</blockquote></details>

Done.
___
*[tests/unit/script/lib.rs, line 10 at r1](https://reviewable.io/reviews/servo/servo/21978#-LP3qt50UcnoV6yAPM37-r1-10:-LP8Z2VPA3VrBDZaFsa1:b-896fix) ([raw file](https://github.com/servo/servo/blob/4ade61c35b85e9d122ed87e45d16ce355c9a9a35/tests/unit/script/lib.rs#L10)):*
<details><summary><i>Previously, nox (Anthony Ramine) wrote…</i></summary><blockquote>

Just make a top-level module named `str`, no need to follow the module hierarchy of the script crate.
</blockquote></details>

Done.


<!-- Sent from Reviewable.io -->

@jdm jdm assigned jdm and unassigned asajeffrey Oct 29, 2018
@jdm jdm added S-needs-squash Some (or all) of the commits in the PR should be combined. and removed S-awaiting-review There is new code that needs to be reviewed. labels Oct 29, 2018
@jdm
Copy link
Member

jdm commented Oct 29, 2018

This looks great! Please squash the two commits into one, and then it will be ready to merge!

@highfive highfive added the S-awaiting-review There is new code that needs to be reviewed. label Nov 8, 2018
@jdm
Copy link
Member

jdm commented Nov 8, 2018

@bors-servo r+

@bors-servo
Copy link
Contributor

📌 Commit 68228c0 has been approved by jdm

@highfive highfive added S-awaiting-merge The PR is in the process of compiling and running tests on the automated CI. and removed S-awaiting-review There is new code that needs to be reviewed. S-needs-squash Some (or all) of the commits in the PR should be combined. labels Nov 8, 2018
@bors-servo
Copy link
Contributor

⌛ Testing commit 68228c0 with merge 0175b2b...

bors-servo pushed a commit that referenced this pull request Nov 8, 2018
Fixes panic on DOMString::strip_leading_and_trailing_ascii_whitespace

<!-- Please describe your changes on the following line: -->
This changes `DOMString::strip_leading_and_trailing_ascii_whitespace` to handle multi-byte unicode characters.

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `__` with appropriate data: -->
- [X] `./mach build -d` does not report any errors
- [X] `./mach test-tidy` does not report any errors
- [X] These changes fix #21963 (github issue number if applicable).

<!-- Either: -->
- [X] There are tests for these changes OR
- [ ] These changes do not require tests because _____

<!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.-->

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/21978)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

💔 Test failed - linux-dev

@highfive highfive added S-tests-failed The changes caused existing tests to fail. and removed S-awaiting-merge The PR is in the process of compiling and running tests on the automated CI. labels Nov 8, 2018
@jdm
Copy link
Member

jdm commented Nov 8, 2018

Diff in /home/servo/buildbot/slave/linux-dev/build/components/script/dom/bindings/str.rs at line 182:
             return;
         }
 
-        let trailing_whitespace_len = self.0.trim_end_matches(|ref c| char::is_ascii_whitespace(c)).len();
+        let trailing_whitespace_len = self
+            .0
+            .trim_end_matches(|ref c| char::is_ascii_whitespace(c))
+            .len();
         self.0.truncate(trailing_whitespace_len);
         if self.0.is_empty() {
             return;
Run `./mach fmt` to fix the formatting

@highfive highfive added S-awaiting-review There is new code that needs to be reviewed. and removed S-tests-failed The changes caused existing tests to fail. labels Nov 8, 2018
@jdm
Copy link
Member

jdm commented Nov 16, 2018

@bors-servo r+

@bors-servo
Copy link
Contributor

📌 Commit 1f99bf4 has been approved by jdm

@highfive highfive added S-awaiting-merge The PR is in the process of compiling and running tests on the automated CI. and removed S-awaiting-review There is new code that needs to be reviewed. labels Nov 16, 2018
bors-servo pushed a commit that referenced this pull request Nov 16, 2018
Fixes panic on DOMString::strip_leading_and_trailing_ascii_whitespace

<!-- Please describe your changes on the following line: -->
This changes `DOMString::strip_leading_and_trailing_ascii_whitespace` to handle multi-byte unicode characters.

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `__` with appropriate data: -->
- [X] `./mach build -d` does not report any errors
- [X] `./mach test-tidy` does not report any errors
- [X] These changes fix #21963 (github issue number if applicable).

<!-- Either: -->
- [X] There are tests for these changes OR
- [ ] These changes do not require tests because _____

<!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.-->

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/21978)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

⌛ Testing commit 1f99bf4 with merge 831f966...

@bors-servo
Copy link
Contributor

@bors-servo bors-servo merged commit 1f99bf4 into servo:master Nov 16, 2018
@highfive highfive removed the S-awaiting-merge The PR is in the process of compiling and running tests on the automated CI. label Nov 16, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

assertion failed: self.is_char_boundary(new_len) in DOMString::strip_leading_and_trailing_ascii_whitespace
8 participants