Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 21810/improve validation methods #24626

Merged

Conversation

@glowe
Copy link

glowe commented Nov 3, 2019

This is a start at addressing #21810. I'm putting these changes out early to get some feedback on the following items:

  1. I added unit tests for the validation methods mentioned in #21810, because I couldn't tell whether any of the existing WPT tests covered them. Are these tests worthwhile? Are any of them unnecessary?
  2. I changed the implementation for is_valid_floating_point_number_string so that it passed the tests. The previous version of the function wasn't restrictive enough (it allowed certain whitespace characters before the number string).
  3. I changed the catch-all condition in htmlinputelement.rs to account for the remaining input types that don't have a value sanitization algorithm. This last change seems good to me since we won't be able to add a new input type without adding it to the case and checking the spec for an algorithm.

  • ./mach build -d does not report any errors
  • ./mach test-tidy does not report any errors
  • These changes fix #21810
  • There are tests for these changes
@highfive
Copy link

highfive commented Nov 3, 2019

Thanks for the pull request, and welcome! The Servo team is excited to review your changes, and you should hear from @asajeffrey (or someone else) soon.

@highfive
Copy link

highfive commented Nov 3, 2019

Heads up! This PR modifies the following files:

  • @asajeffrey: components/script/dom/htmlinputelement.rs, components/script/dom/bindings/str.rs
  • @KiChjang: components/script/dom/htmlinputelement.rs, components/script/dom/bindings/str.rs
@jdm
jdm approved these changes Nov 12, 2019
Copy link
Member

jdm left a comment

I'm in favour of unit tests for small, self-contained routines like this.

/// https://html.spec.whatwg.org/multipage/#valid-month-string
pub fn is_valid_month_string(&self) -> bool {
parse_month_string(&self.0).is_ok()

This comment has been minimized.

@jdm

jdm Nov 12, 2019

Member

Whoops, I missed reading this commit. While in general I'm in favour of using state machines for clarity of purpose and avoiding invalid states, I find this one to be difficult to follow. I'm not sure if it's the names of the states, or the repetition between them, or something else, but my eyes keep glazing over when I try to follow the logic. Given how straightforward the specification text is, I'm not exactly sure what to recommend.

This comment has been minimized.

@jdm

jdm Nov 12, 2019

Member

My reading of the specification is that the parsing algorithm for month strings is the same as the validation algorithm for month strings, so I don't see a good reason to reimplement this particular algorithm. Presumably there are other input types where they diverge?

This comment has been minimized.

@glowe

glowe Nov 20, 2019

Author

My reading of the specification is that the parsing algorithm for month strings is the same as the validation algorithm for month strings, so I don't see a good reason to reimplement this particular algorithm. Presumably there are other input types where they diverge?

Actually, the tests passed for the original implementations of the date validation functions, so I judge them to be equivalent to the specifications. The original issue #21810 recommended reimplementing all of these functions, but I think that recommendation was based on the premise that the functions might be incorrect. I've gone ahead and reverted the change to is_valid_month_string.

The lone validation function with an issue was is_valid_floating_point_number_string, so I've fixed it. Please take another look and let me know what you think.

@glowe glowe force-pushed the glowe:issue-21810/improve-validation-methods branch from 1986ddb to 8abbae1 Nov 20, 2019
@glowe glowe changed the title [WIP] Issue 21810/improve validation methods Issue 21810/improve validation methods Nov 20, 2019
@glowe glowe marked this pull request as ready for review Nov 20, 2019
@jdm
jdm approved these changes Nov 20, 2019
@jdm
Copy link
Member

jdm commented Nov 20, 2019

@bors-servo r+
Thanks!

@bors-servo
Copy link
Contributor

bors-servo commented Nov 20, 2019

📌 Commit 8abbae1 has been approved by jdm

@highfive highfive assigned jdm and unassigned asajeffrey Nov 20, 2019
@bors-servo
Copy link
Contributor

bors-servo commented Nov 20, 2019

Testing commit 8abbae1 with merge da3f021...

bors-servo added a commit that referenced this pull request Nov 20, 2019
…r=jdm

Issue 21810/improve validation methods

<!-- Please describe your changes on the following line: -->
This is a start at addressing #21810. I'm putting these changes out early to get some feedback on the following items:

1. I added unit tests for the validation methods mentioned in #21810, because I couldn't tell whether any of the existing WPT tests covered them. Are these tests worthwhile? Are any of them unnecessary?
2. I changed the implementation for `is_valid_floating_point_number_string` so that it passed the tests. The previous version of the function wasn't restrictive enough (it allowed certain whitespace characters before the number string).
3. I changed the catch-all condition in `htmlinputelement.rs` to account for the remaining input types that don't have a value sanitization algorithm. This last change seems good to me since we won't be able to add a new input type without adding it to the case and checking the spec for an algorithm.
---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `___` with appropriate data: -->
- [x] `./mach build -d` does not report any errors
- [x] `./mach test-tidy` does not report any errors
- [x] These changes fix #21810

<!-- Either: -->
- [x] There are tests for these changes

<!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.-->

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->
@bors-servo
Copy link
Contributor

bors-servo commented Nov 20, 2019

💔 Test failed - status-taskcluster

@jdm
Copy link
Member

jdm commented Nov 20, 2019

Hmm, looks like we now fail a test that we used to pass:

{
    "status": "FAIL", 
    "group": "default", 
    "message": "assert_equals: expected \"\" but got \"2e308\"", 
    "stack": "@http://web-platform.test:8000/html/semantics/forms/the-input-element/number.html:50:7\nTest.prototype.step@http://web-platform.test:8000/resources/testharness.js:1931:25\ntest@http://web-platform.test:8000/resources/testharness.js:544:30\n@http://web-platform.test:8000/html/semantics/forms/the-input-element/number.html:43:5\n", 
    "subtest": "value >= Number.MAX_VALUE", 
    "test": "/html/semantics/forms/the-input-element/number.html", 
    "line": 60091, 
    "action": "test_result", 
    "expected": "PASS"
}

This can be reproduced with ./mach test-wpt tests/wpt/web-platform-tests/html/semantics/forms/the-input-element/number.html.

@glowe
Copy link
Author

glowe commented Nov 27, 2019

This can be reproduced with ./mach test-wpt tests/wpt/web-platform-tests/html/semantics/forms/the-input-element/number.html.

Thanks for identifying the failing test for me.

We are now failing on a bounds check: 2e308 exceeds Number.MAX_VALUE (1.7976931348623157e+308). I could be reading this wrong, but the specification for a valid floating point number (https://html.spec.whatwg.org/multipage/common-microsyntaxes.html#valid-floating-point-number) doesn't mention a bounds. However the specification for parsing (https://html.spec.whatwg.org/multipage/common-microsyntaxes.html#rules-for-parsing-floating-point-number-values) does mention it in that the converted value must be a IEEE 754 double-precision floating-point value.

Interestingly, I tried this specific test on Firefox 70.0.1 and it failed (http://wpt.live/html/semantics/forms/the-input-element/number.html). However, this test passes in the latest Chrome and Safari browsers.

Here's what I propose we do next (let me know what you think):

1.) I'll update the code to handle the bounds check, probably by delegating to the parsing code after verifying that there's no leading whitespace.
2.) I'm now wondering about the value of the unit tests I added. I didn't notice these web platform tests before and it seems like the unit tests are duplicative and potentially misleading. I'm leaning towards dropping the unit tests for this PR.
3.) The web platform tests for number inputs are incomplete for the leading whitespace check:

{value: " 1", expected: "", testname: "value with a leading whitespace"},

The specification mentions that the whitespace characters can be TAB, LF, FF, CR, or SPACE (https://infra.spec.whatwg.org/#ascii-whitespace). Our previous implementation only handled SPACEs. Would it be worth me filing an issue (and potentially a PR) to add tests for the other characters?

EDIT:
I've dealt with item 1. For item 2, I removed all unit tests except the one that demonstrates the existing issue with is_valid_floating_point_number_string (i.e., allowing other leading whitespace characters).

glowe added 2 commits Dec 1, 2019
Replaced catch-all with explicit case for inputs that do not have
a value sanitization algorithm. This should prevent us from
forgetting to implement a sanitization for an input, since they
must all be accounted for in the match expression.
Fixes an issue where DOMString::is_valid_floating_point_number_string
was returning true for strings that began with whitespace characters-
TAB, LF, FF, or CR. Also added a unit test to cover this since the
corresponding web-platform-tests are incomplete.
@glowe glowe force-pushed the glowe:issue-21810/improve-validation-methods branch from 8abbae1 to 576f51f Dec 1, 2019
@jdm
Copy link
Member

jdm commented Dec 3, 2019

Those choices seem reasonable! You can also include the WPT changes in this PR, because we automatically sync them upstream when they merge in our vendored copy.

@jdm
jdm approved these changes Dec 3, 2019
@jdm
Copy link
Member

jdm commented Dec 3, 2019

glowe added 2 commits Dec 3, 2019
Add cases for all leading ASCII whitespace characters.
glowe
This test is no longer necessary since the cases tested are also tested
by the number input web platform test.
@glowe
Copy link
Author

glowe commented Dec 3, 2019

Those choices seem reasonable! You can also include the WPT changes in this PR, because we automatically sync them upstream when they merge in our vendored copy.

Thanks! I did not know that. I've enhanced the number input WPT to include those additional cases and deleted the unit test since it's redundant coverage.

@servo-wpt-sync
Copy link
Collaborator

servo-wpt-sync commented Dec 3, 2019

Opened new PR for upstreamable changes.

Completed upstream sync of web-platform-test changes at web-platform-tests/wpt#20575.

@jdm
Copy link
Member

jdm commented Dec 3, 2019

@bors-servo r+
Thanks!

@bors-servo
Copy link
Contributor

bors-servo commented Dec 3, 2019

📌 Commit b9ec6f9 has been approved by jdm

@bors-servo
Copy link
Contributor

bors-servo commented Dec 3, 2019

Testing commit b9ec6f9 with merge 7ba88e8...

bors-servo added a commit that referenced this pull request Dec 3, 2019
…r=jdm

Issue 21810/improve validation methods

<!-- Please describe your changes on the following line: -->
This is a start at addressing #21810. I'm putting these changes out early to get some feedback on the following items:

1. I added unit tests for the validation methods mentioned in #21810, because I couldn't tell whether any of the existing WPT tests covered them. Are these tests worthwhile? Are any of them unnecessary?
2. I changed the implementation for `is_valid_floating_point_number_string` so that it passed the tests. The previous version of the function wasn't restrictive enough (it allowed certain whitespace characters before the number string).
3. I changed the catch-all condition in `htmlinputelement.rs` to account for the remaining input types that don't have a value sanitization algorithm. This last change seems good to me since we won't be able to add a new input type without adding it to the case and checking the spec for an algorithm.
---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `___` with appropriate data: -->
- [x] `./mach build -d` does not report any errors
- [x] `./mach test-tidy` does not report any errors
- [x] These changes fix #21810

<!-- Either: -->
- [x] There are tests for these changes

<!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.-->

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->
@bors-servo
Copy link
Contributor

bors-servo commented Dec 3, 2019

☀️ Test successful - status-taskcluster
Approved by: jdm
Pushing 7ba88e8 to master...

@bors-servo bors-servo merged commit b9ec6f9 into servo:master Dec 3, 2019
1 of 2 checks passed
1 of 2 checks passed
Community-TC (pull_request) TaskGroup: failure
Details
homu Test successful
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

6 participants
You can’t perform that action at this time.