Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upParse Srcset Attribute #18313
Parse Srcset Attribute #18313
Conversation
highfive
commented
Aug 30, 2017
|
Thanks for the pull request, and welcome! The Servo team is excited to review your changes, and you should hear from @cbrewster (or someone else) soon. |
highfive
commented
Aug 30, 2017
|
Heads up! This PR modifies the following files:
|
|
Just had a quick look. This looks good so far. Nice work! But, this still requires a lot of tidying up. Please fix the build warnings and run |
| let(spaces, position) = collect_sequence_characters(position, |c| *c ==',' || char::is_whitespace(*c)); | ||
| let x = spaces.find(','); | ||
| match x { | ||
| Some(val) => println!("Parse Error"), |
This comment has been minimized.
This comment has been minimized.
wafflespeanut
Aug 31, 2017
Member
There shouldn't be any println! unless we really need it somewhere. Ideally, this should be ignoring parse errors completely and return only the values that have been parsed successfully (if any). Otherwise, it should simply be an empty vector.
This comment has been minimized.
This comment has been minimized.
iamneha
Sep 4, 2017
Author
Contributor
Currently being used solely for debugging purpose and will be removed when PR is finalized
| } | ||
| } | ||
| } | ||
| let mut error = Error::No; |
This comment has been minimized.
This comment has been minimized.
wafflespeanut
Aug 31, 2017
Member
This could simply be a bool, since there are only two possible values, and we can use bools easily in if conditionals.
| let mut char_iter = descriptor.chars(); | ||
| let (digits, remaining) = collect_sequence_characters(&descriptor, is_ascii_digit); | ||
| let valid_non_negative_integer = parse_unsigned_integer(digits.chars()); | ||
| let has_w = if let "w" = remaining { |
This comment has been minimized.
This comment has been minimized.
wafflespeanut
Aug 31, 2017
Member
Instead of having a complicated pattern match and a variable binding, we should use remaining == "w" in the if conditionals below.
| }; | ||
| if valid_non_negative_integer.is_ok() && has_w { | ||
| //not support sizes attribute | ||
| if width != None && density != None { |
This comment has been minimized.
This comment has been minimized.
wafflespeanut
Aug 31, 2017
Member
Nit: Let's use is_some() and is_none() here and everywhere (appropriately). Note that there are plenty of useful methods for Option<T> and Result<T, E>
| error = Error::Yes; | ||
| } | ||
| let result = parse_unsigned_integer(char_iter.clone()); | ||
| if result.is_err() { |
This comment has been minimized.
This comment has been minimized.
wafflespeanut
Aug 31, 2017
Member
If error becomes a bool, then we could write,
error = result.is_err();
if let Ok(w) = result {
width = Some(w);
}| } | ||
|
|
||
| // add the counts of spaces that we collect to advance the start index | ||
| let space_len = spaces.char_indices().count(); |
This comment has been minimized.
This comment has been minimized.
wafflespeanut
Aug 31, 2017
Member
We should go for chars only if we're concerned with unicode codepoints. If it's just spaces, then we could simply use spaces.len()
| let(url, spaces) = collect_sequence_characters(position, |c| !char::is_whitespace(*c)); | ||
|
|
||
| // add the counts of urls that we parse to advance the start index | ||
| start += url.char_indices().count(); |
This comment has been minimized.
This comment has been minimized.
wafflespeanut
Aug 31, 2017
Member
Why should we use char_indices instead of chars here? (and below?). We should prefer the latter, unless we're using the indices during the iteration.
| continue; | ||
| } | ||
| Some((idx, c @ ',')) => { | ||
| position.chars().enumerate(); |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
Also, please leave steps from spec in comments, so that it's easier to address/change parts of code later. |
| } else { | ||
| false | ||
| }; | ||
| if valid_non_negative_integer.is_ok() && has_w { |
This comment has been minimized.
This comment has been minimized.
wafflespeanut
Aug 31, 2017
•
Member
This should be !has_w, or more simply, we should check has_w && width.is_some() first (and error if it's true) before checking the non-negative integer thingy.
During descriptor parsing, we ensure that there are no repetitions in descriptors, which means there shouldn't be any usage of w, x or h more than once - i.e., if the value ends with h and we already have a height value in the descriptor, then it's a parse error.
|
Step 13 in the spec looks funny.
If it's a parse error, then there's no reason to parse it. The value for future-compat-h is not being used after parsing (probably because it's planned for the future?), but that shouldn't mean that the entire descriptor should be rejected with a parse error. We should probably skip the value and still use the descriptor? /cc @jdm |
|
Sorry, but which issue does this PR solve? |
| false | ||
| }; | ||
| let valid_floating_point = parse_double(digits); | ||
| let has_x = if remaining == "x" { |
This comment has been minimized.
This comment has been minimized.
| } else { | ||
| false | ||
| }; | ||
| let has_h = if remaining == "h" { |
This comment has been minimized.
This comment has been minimized.
| }; | ||
| if valid_non_negative_integer.is_ok() && has_w { | ||
| //not support sizes attribute | ||
| if width.is_some() && density.is_some() { |
This comment has been minimized.
This comment has been minimized.
| let result = parse_unsigned_integer(char_iter.clone()); | ||
| error = result.is_err(); | ||
| if let Ok(w) = result { | ||
| width = Some(w); |
This comment has been minimized.
This comment has been minimized.
creativcoder
Sep 5, 2017
•
Contributor
As we are just pulling the value out and converting it to option type (i.e, Some(w)), we can use the ok method like: width = result.ok(). Calling ok() on a Result<T> type converts it to an Option<T>. ok() method does what you are currently doing here.
| if future_compat_h.is_some() && width.is_none() { | ||
| error = true; | ||
| } | ||
| if error == false { |
This comment has been minimized.
This comment has been minimized.
creativcoder
Sep 5, 2017
Contributor
As error now is just a boolean we can reduce the conditional check to just this:
if !error {
// stuff
}| let char_iter = descriptor.chars(); | ||
| let (digits, remaining) = collect_sequence_characters(&descriptor, is_ascii_digit); | ||
| let valid_non_negative_integer = parse_unsigned_integer(digits.chars()); | ||
| let has_w = if remaining == "w" { |
This comment has been minimized.
This comment has been minimized.
| @@ -10,7 +10,7 @@ interface HTMLImageElement : HTMLElement { | |||
| [CEReactions] | |||
| attribute DOMString src; | |||
| // [CEReactions] | |||
| // attribute DOMString srcset; | |||
This comment has been minimized.
This comment has been minimized.
| let mut current_descriptor = String::new(); | ||
| let mut state = ParseState::InDescriptor; | ||
| let mut char_stream = position.chars().enumerate(); | ||
| //let mut last_index = 0; |
This comment has been minimized.
This comment has been minimized.
| error = true; | ||
| } | ||
| if error == false { | ||
| if width.is_some() || density.is_some() { |
This comment has been minimized.
This comment has been minimized.
creativcoder
Sep 5, 2017
•
Contributor
Step 15 of spec is written in a misleading way. Due to this check (if width.is_some() || density.is_some() {), the case of a single srcset string without any descriptor/density : ( i.e., small_image.jpg), does not parses correctly and returns an empty vec. This is a valid srcset string as confirmed by neha.
The spec does not mean that we should append ImageSource only if width and density is present.
Rather it should just check if error is false and just append whatever is there in the width and density variable along with url.
So basically this :
if !error {
// append stuff to candidates
} else {
println!("parsed error");
}|
Merge commits are not accepted. Please remove them and rebase against master instead. |
|
Made a preliminary review, I have not gotten too far into the spec before I started finding problems. It would also be super helpful if you can add step annotations inside your function body, this will leave a good note of which part of the spec that the code is attempting to follow. |
|
|
||
| pub fn collect_sequence_characters<F>(s: &str, predicate: F) -> (&str, &str) | ||
| where F: Fn(&char) -> bool | ||
| { |
This comment has been minimized.
This comment has been minimized.
| let mut candidates: Vec<ImageSource> = Vec::new(); | ||
| while start < input.len() { | ||
| let position = &input[start..]; | ||
| let(spaces, position) = collect_sequence_characters(position, |c| *c ==',' || char::is_whitespace(*c)); |
This comment has been minimized.
This comment has been minimized.
KiChjang
Sep 5, 2017
Member
nit: space after let.
Also, ignoring the point where you're supposed to throw a parse error when you encounter a comma, is this essentially position.filter(char::is_whitespace).collect()?
| return (s, ""); | ||
| } | ||
|
|
||
| pub fn parse_a_srcset_attribute(input: String) -> Vec<ImageSource> { |
This comment has been minimized.
This comment has been minimized.
KiChjang
Sep 5, 2017
Member
Is there a reason why you need a String here instead of a &str?
Also, since this function can return an error, the return type should be Fallible<Vec<ImageSource>>.
This comment has been minimized.
This comment has been minimized.
KiChjang
Sep 5, 2017
Member
Should also link to https://html.spec.whatwg.org/multipage/#parsing-a-srcset-attribute.
|
|
||
| pub fn parse_a_srcset_attribute(input: String) -> Vec<ImageSource> { | ||
| let mut start = 0; | ||
| let mut candidates: Vec<ImageSource> = Vec::new(); |
This comment has been minimized.
This comment has been minimized.
| pub den: Option<f64>, | ||
| } | ||
|
|
||
| pub fn collect_sequence_characters<F>(s: &str, predicate: F) -> (&str, &str) |
This comment has been minimized.
This comment has been minimized.
KiChjang
Sep 5, 2017
Member
Link to https://infra.spec.whatwg.org/#collect-a-sequence-of-code-points.
I'm also doubtful whether this function belongs here. It sounds like script/dom/bindings/str.rs is a better place.
| { | ||
| for (i, ch) in s.chars().enumerate() { | ||
| if !predicate(&ch) { | ||
| return (&s[0..i], &s[i..]) |
This comment has been minimized.
This comment has been minimized.
KiChjang
Sep 5, 2017
•
Member
This does not look correct; the spec says:
While position doesn’t point past the end of input and the code point at position within input meets the condition condition:
Append that code point to the end of result.
Advance position by 1.
In other words, whenever the condition (in the code it is named predicate instead) is fulfilled, the character that is fulfilling the condition must be appended to result.
Currently, your code would return immediately when it discovers the first character that returns false from invoking predicate, causing the remaining characters to not be parsed according to the spec.
| let(spaces, position) = collect_sequence_characters(position, |c| *c ==',' || char::is_whitespace(*c)); | ||
| let x = spaces.find(','); | ||
| match x { | ||
| Some(val) => println!("Parse Error"), |
This comment has been minimized.
This comment has been minimized.
KiChjang
Sep 5, 2017
Member
As previously stated, you should return an Error here instead of println!.
This comment has been minimized.
This comment has been minimized.
jdm
Sep 5, 2017
Member
As previously discussed with Neha, the spec does not actually want us to abort parsing when a parse error is encountered.
| @@ -9,8 +9,8 @@ interface HTMLImageElement : HTMLElement { | |||
| attribute DOMString alt; | |||
| [CEReactions] | |||
| attribute DOMString src; | |||
| // [CEReactions] | |||
| // attribute DOMString srcset; | |||
| [CEReactions] | |||
This comment has been minimized.
This comment has been minimized.
KiChjang
Sep 5, 2017
Member
In order to generate appropriate bindings for the return type, you'll need to add the [Throws] attribute here.
| @@ -10,6 +10,7 @@ pub use dom::bindings::cell::DOMRefCell; | |||
| pub use dom::bindings::js::JS; | |||
| pub use dom::node::Node; | |||
| pub use dom::bindings::refcounted::TrustedPromise; | |||
| pub use std::string::String; | |||
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
| @@ -0,0 +1,72 @@ | |||
| use script::test::String; | |||
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
30c6228
to
956ac15
|
Going good so far. Nice work! A few more steps towards merging. I think some of my comments should be thrown as warnings by rustc. We should get rid of the warnings just as we do the errors. |
| InParens, | ||
| AfterDescriptor, | ||
| } | ||
| #[derive(Debug)] |
This comment has been minimized.
This comment has been minimized.
| pub url: String, | ||
| pub descriptor: Descriptor, | ||
| } | ||
| #[derive(PartialEq)] |
This comment has been minimized.
This comment has been minimized.
| pub wid: Option<u32>, | ||
| pub den: Option<f64>, | ||
| } | ||
| #[derive(Clone, Copy, HeapSizeOf, JSTraceable)] |
This comment has been minimized.
This comment has been minimized.
| @@ -978,3 +1004,177 @@ fn image_dimension_setter(element: &Element, attr: LocalName, value: u32) { | |||
| let value = AttrValue::Dimension(value.to_string(), dim); | |||
| element.set_attribute(&attr, value); | |||
| } | |||
|
|
|||
| pub fn collect_sequence_characters<F>(s: &str, predicate: F) -> (&str, &str) | |||
| where F: Fn(&char) -> bool | |||
This comment has been minimized.
This comment has been minimized.
wafflespeanut
Sep 24, 2017
Member
Nit: Improper indentation (strip out spaces for this and the following lines)
| let mut candidates: Vec<ImageSource> = vec![]; | ||
| while start < input.len() { | ||
| let position = &input[start..]; | ||
| let (spaces, position) = collect_sequence_characters(position, |c| *c ==',' || char::is_whitespace(*c)); |
This comment has been minimized.
This comment has been minimized.
| } | ||
| if !error { | ||
| let descriptor = Descriptor { wid: width, den: density }; | ||
| let imageSource = ImageSource { url: url, descriptor: descriptor }; |
This comment has been minimized.
This comment has been minimized.
| let empty_vec = Vec::new(); | ||
| assert_eq!(parse_a_srcset_attribute("small-image.jpg 320w 1.1x"), empty_vec); | ||
| } | ||
| #[test] |
This comment has been minimized.
This comment has been minimized.
| let has_x = remaining == "x"; | ||
| let has_h = remaining == "h"; | ||
| if valid_non_negative_integer.is_ok() && has_w { | ||
| let result = parse_unsigned_integer(char_iter.clone()); |
This comment has been minimized.
This comment has been minimized.
wafflespeanut
Sep 24, 2017
Member
Why should we parse this again if we already have the result in valid_non_negative_integer? We could just unwrap it, right?
| width = Some(w); | ||
| } | ||
| } else if valid_floating_point.is_ok() && has_x { | ||
| let result = parse_double(digits); |
This comment has been minimized.
This comment has been minimized.
wafflespeanut
Sep 24, 2017
Member
Ditto on unnecessary re-parsing. We should unwrap from valid_floating_point
| density = Some(x); | ||
| } | ||
| } else if valid_non_negative_integer.is_ok() && has_h { | ||
| let result = parse_unsigned_integer(char_iter.clone()); |
This comment has been minimized.
This comment has been minimized.
| } | ||
| } | ||
| return (s, ""); | ||
| } |
This comment has been minimized.
This comment has been minimized.
wafflespeanut
Sep 28, 2017
Member
Here's the correct indentation:
pub fn collect_sequence_characters<F>(s: &str, predicate: F) -> (&str, &str)
where F: Fn(&char) -> bool
{
for (i, ch) in s.chars().enumerate() {
if !predicate(&ch) {
return (&s[0..i], &s[i..])
}
}
return (s, "");
}|
Sorry, I just realized that you haven't finished updating the code. |
|
I've squashed these commits in #18716. |
Parse srcset attribute values Squashed version of #18313. --- - [x] `./mach build -d` does not report any errors - [x] `./mach test-tidy` does not report any errors - [x] These changes fix (partially) #11416 - [x] There are tests for these changes <!-- Reviewable:start --> --- This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/18716) <!-- Reviewable:end -->
Parse srcset attribute values Squashed version of #18313. --- - [x] `./mach build -d` does not report any errors - [x] `./mach test-tidy` does not report any errors - [x] These changes fix (partially) #11416 - [x] There are tests for these changes <!-- Reviewable:start --> --- This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/18716) <!-- Reviewable:end -->
iamneha commentedAug 30, 2017
•
edited
./mach build -ddoes not report any errors./mach test-tidydoes not report any errorsThis change is