Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse Srcset Attribute #18313

Closed
wants to merge 2 commits into from
Closed

Parse Srcset Attribute #18313

wants to merge 2 commits into from

Conversation

@iamneha
Copy link
Contributor

iamneha commented Aug 30, 2017


  • ./mach build -d does not report any errors
  • ./mach test-tidy does not report any errors
  • These changes fix #__ (github issue number if applicable).
  • There are tests for these changes OR
  • These changes do not require tests because _____

This change is Reviewable

@highfive
Copy link

highfive commented Aug 30, 2017

Thanks for the pull request, and welcome! The Servo team is excited to review your changes, and you should hear from @cbrewster (or someone else) soon.

@highfive
Copy link

highfive commented Aug 30, 2017

Heads up! This PR modifies the following files:

  • @bholley: components/style/str.rs
  • @canaltinova: components/style/str.rs
  • @KiChjang: components/script/test.rs, components/script/dom/webidls/HTMLImageElement.webidl, components/script/dom/htmlimageelement.rs
  • @fitzgen: components/script/test.rs, components/script/dom/webidls/HTMLImageElement.webidl, components/script/dom/htmlimageelement.rs
  • @emilio: components/style/str.rs
@jdm jdm assigned jdm and unassigned cbrewster Aug 30, 2017
Copy link
Member

wafflespeanut left a comment

Just had a quick look. This looks good so far. Nice work! But, this still requires a lot of tidying up. Please fix the build warnings and run ./mach test-tidy to help cleanup the code.

let(spaces, position) = collect_sequence_characters(position, |c| *c ==',' || char::is_whitespace(*c));
let x = spaces.find(',');
match x {
Some(val) => println!("Parse Error"),

This comment has been minimized.

@wafflespeanut

wafflespeanut Aug 31, 2017

Member

There shouldn't be any println! unless we really need it somewhere. Ideally, this should be ignoring parse errors completely and return only the values that have been parsed successfully (if any). Otherwise, it should simply be an empty vector.

This comment has been minimized.

@iamneha

iamneha Sep 4, 2017

Author Contributor

Currently being used solely for debugging purpose and will be removed when PR is finalized

}
}
}
let mut error = Error::No;

This comment has been minimized.

@wafflespeanut

wafflespeanut Aug 31, 2017

Member

This could simply be a bool, since there are only two possible values, and we can use bools easily in if conditionals.

let mut char_iter = descriptor.chars();
let (digits, remaining) = collect_sequence_characters(&descriptor, is_ascii_digit);
let valid_non_negative_integer = parse_unsigned_integer(digits.chars());
let has_w = if let "w" = remaining {

This comment has been minimized.

@wafflespeanut

wafflespeanut Aug 31, 2017

Member

Instead of having a complicated pattern match and a variable binding, we should use remaining == "w" in the if conditionals below.

};
if valid_non_negative_integer.is_ok() && has_w {
//not support sizes attribute
if width != None && density != None {

This comment has been minimized.

@wafflespeanut

wafflespeanut Aug 31, 2017

Member

Nit: Let's use is_some() and is_none() here and everywhere (appropriately). Note that there are plenty of useful methods for Option<T> and Result<T, E>

error = Error::Yes;
}
let result = parse_unsigned_integer(char_iter.clone());
if result.is_err() {

This comment has been minimized.

@wafflespeanut

wafflespeanut Aug 31, 2017

Member

If error becomes a bool, then we could write,

error = result.is_err();
if let Ok(w) = result {
    width = Some(w);
}
}

// add the counts of spaces that we collect to advance the start index
let space_len = spaces.char_indices().count();

This comment has been minimized.

@wafflespeanut

wafflespeanut Aug 31, 2017

Member

We should go for chars only if we're concerned with unicode codepoints. If it's just spaces, then we could simply use spaces.len()

let(url, spaces) = collect_sequence_characters(position, |c| !char::is_whitespace(*c));

// add the counts of urls that we parse to advance the start index
start += url.char_indices().count();

This comment has been minimized.

@wafflespeanut

wafflespeanut Aug 31, 2017

Member

Why should we use char_indices instead of chars here? (and below?). We should prefer the latter, unless we're using the indices during the iteration.

continue;
}
Some((idx, c @ ',')) => {
position.chars().enumerate();

This comment has been minimized.

@wafflespeanut

wafflespeanut Aug 31, 2017

Member

What's this?

This comment has been minimized.

@jdm

jdm Sep 12, 2017

Member

This should be removed, since it does not do anything.

@wafflespeanut
Copy link
Member

wafflespeanut commented Aug 31, 2017

Also, please leave steps from spec in comments, so that it's easier to address/change parts of code later.

} else {
false
};
if valid_non_negative_integer.is_ok() && has_w {

This comment has been minimized.

@wafflespeanut

wafflespeanut Aug 31, 2017

Member

This should be !has_w, or more simply, we should check has_w && width.is_some() first (and error if it's true) before checking the non-negative integer thingy.

During descriptor parsing, we ensure that there are no repetitions in descriptors, which means there shouldn't be any usage of w, x or h more than once - i.e., if the value ends with h and we already have a height value in the descriptor, then it's a parse error.

@iamneha iamneha force-pushed the iamneha:parseSrcset branch from ecbae6f to 26c7c30 Sep 4, 2017
@wafflespeanut
Copy link
Member

wafflespeanut commented Sep 4, 2017

Step 13 in the spec looks funny.

If the descriptor consists of a valid non-negative integer followed by a U+0068 LATIN SMALL LETTER H character

This is a parse error.

If it's a parse error, then there's no reason to parse it. The value for future-compat-h is not being used after parsing (probably because it's planned for the future?), but that shouldn't mean that the entire descriptor should be rejected with a parse error. We should probably skip the value and still use the descriptor?

/cc @jdm

@KiChjang
Copy link
Member

KiChjang commented Sep 5, 2017

Sorry, but which issue does this PR solve?

@iamneha iamneha mentioned this pull request Sep 5, 2017
7 of 7 tasks complete
@iamneha
Copy link
Contributor Author

iamneha commented Sep 5, 2017

@KiChjang we are solving srcset attribute from issue Support responsive images #11416

false
};
let valid_floating_point = parse_double(digits);
let has_x = if remaining == "x" {

This comment has been minimized.

@creativcoder

creativcoder Sep 5, 2017

Contributor

we can just do this: let has_x = remaining == "x";

} else {
false
};
let has_h = if remaining == "h" {

This comment has been minimized.

@creativcoder

creativcoder Sep 5, 2017

Contributor

Same as above

};
if valid_non_negative_integer.is_ok() && has_w {
//not support sizes attribute
if width.is_some() && density.is_some() {

This comment has been minimized.

@creativcoder

creativcoder Sep 5, 2017

Contributor

you can do this error = width.is_some() && density.is_some();

let result = parse_unsigned_integer(char_iter.clone());
error = result.is_err();
if let Ok(w) = result {
width = Some(w);

This comment has been minimized.

@creativcoder

creativcoder Sep 5, 2017

Contributor

As we are just pulling the value out and converting it to option type (i.e, Some(w)), we can use the ok method like: width = result.ok(). Calling ok() on a Result<T> type converts it to an Option<T>. ok() method does what you are currently doing here.

if future_compat_h.is_some() && width.is_none() {
error = true;
}
if error == false {

This comment has been minimized.

@creativcoder

creativcoder Sep 5, 2017

Contributor

As error now is just a boolean we can reduce the conditional check to just this:

if !error {
// stuff
}
let char_iter = descriptor.chars();
let (digits, remaining) = collect_sequence_characters(&descriptor, is_ascii_digit);
let valid_non_negative_integer = parse_unsigned_integer(digits.chars());
let has_w = if remaining == "w" {

This comment has been minimized.

@creativcoder

creativcoder Sep 5, 2017

Contributor

we can just do this: let has_w = remaining == "w";

@@ -10,7 +10,7 @@ interface HTMLImageElement : HTMLElement {
[CEReactions]
attribute DOMString src;
// [CEReactions]
// attribute DOMString srcset;

This comment has been minimized.

@creativcoder

creativcoder Sep 5, 2017

Contributor

Nit: Please remove this commented line

let mut current_descriptor = String::new();
let mut state = ParseState::InDescriptor;
let mut char_stream = position.chars().enumerate();
//let mut last_index = 0;

This comment has been minimized.

@creativcoder

creativcoder Sep 5, 2017

Contributor

Nit: Please remove the commented line

error = true;
}
if error == false {
if width.is_some() || density.is_some() {

This comment has been minimized.

@creativcoder

creativcoder Sep 5, 2017

Contributor

Step 15 of spec is written in a misleading way. Due to this check (if width.is_some() || density.is_some() {), the case of a single srcset string without any descriptor/density : ( i.e., small_image.jpg), does not parses correctly and returns an empty vec. This is a valid srcset string as confirmed by neha.

The spec does not mean that we should append ImageSource only if width and density is present.
Rather it should just check if error is false and just append whatever is there in the width and density variable along with url.

So basically this :

if !error {
// append stuff to candidates
} else {
println!("parsed error");
}
@iamneha iamneha force-pushed the iamneha:parseSrcset branch from 26c7c30 to 36e8270 Sep 5, 2017
@KiChjang
Copy link
Member

KiChjang commented Sep 5, 2017

Merge commits are not accepted. Please remove them and rebase against master instead.

Copy link
Member

KiChjang left a comment

Made a preliminary review, I have not gotten too far into the spec before I started finding problems.

It would also be super helpful if you can add step annotations inside your function body, this will leave a good note of which part of the spec that the code is attempting to follow.


pub fn collect_sequence_characters<F>(s: &str, predicate: F) -> (&str, &str)
where F: Fn(&char) -> bool
{

This comment has been minimized.

@KiChjang

KiChjang Sep 5, 2017

Member

nit: weird indentation

let mut candidates: Vec<ImageSource> = Vec::new();
while start < input.len() {
let position = &input[start..];
let(spaces, position) = collect_sequence_characters(position, |c| *c ==',' || char::is_whitespace(*c));

This comment has been minimized.

@KiChjang

KiChjang Sep 5, 2017

Member

nit: space after let.

Also, ignoring the point where you're supposed to throw a parse error when you encounter a comma, is this essentially position.filter(char::is_whitespace).collect()?

return (s, "");
}

pub fn parse_a_srcset_attribute(input: String) -> Vec<ImageSource> {

This comment has been minimized.

@KiChjang

KiChjang Sep 5, 2017

Member

Is there a reason why you need a String here instead of a &str?

Also, since this function can return an error, the return type should be Fallible<Vec<ImageSource>>.


pub fn parse_a_srcset_attribute(input: String) -> Vec<ImageSource> {
let mut start = 0;
let mut candidates: Vec<ImageSource> = Vec::new();

This comment has been minimized.

@KiChjang

KiChjang Sep 5, 2017

Member

Not important, but vec![] is equivalent to Vec::new().

pub den: Option<f64>,
}

pub fn collect_sequence_characters<F>(s: &str, predicate: F) -> (&str, &str)

This comment has been minimized.

@KiChjang

KiChjang Sep 5, 2017

Member

Link to https://infra.spec.whatwg.org/#collect-a-sequence-of-code-points.

I'm also doubtful whether this function belongs here. It sounds like script/dom/bindings/str.rs is a better place.

{
for (i, ch) in s.chars().enumerate() {
if !predicate(&ch) {
return (&s[0..i], &s[i..])

This comment has been minimized.

@KiChjang

KiChjang Sep 5, 2017

Member

This does not look correct; the spec says:

While position doesn’t point past the end of input and the code point at position within input meets the condition condition:

Append that code point to the end of result.

Advance position by 1.

In other words, whenever the condition (in the code it is named predicate instead) is fulfilled, the character that is fulfilling the condition must be appended to result.

Currently, your code would return immediately when it discovers the first character that returns false from invoking predicate, causing the remaining characters to not be parsed according to the spec.

let(spaces, position) = collect_sequence_characters(position, |c| *c ==',' || char::is_whitespace(*c));
let x = spaces.find(',');
match x {
Some(val) => println!("Parse Error"),

This comment has been minimized.

@KiChjang

KiChjang Sep 5, 2017

Member

As previously stated, you should return an Error here instead of println!.

This comment has been minimized.

@jdm

jdm Sep 5, 2017

Member

As previously discussed with Neha, the spec does not actually want us to abort parsing when a parse error is encountered.

@@ -9,8 +9,8 @@ interface HTMLImageElement : HTMLElement {
attribute DOMString alt;
[CEReactions]
attribute DOMString src;
// [CEReactions]
// attribute DOMString srcset;
[CEReactions]

This comment has been minimized.

@KiChjang

KiChjang Sep 5, 2017

Member

In order to generate appropriate bindings for the return type, you'll need to add the [Throws] attribute here.

@@ -10,6 +10,7 @@ pub use dom::bindings::cell::DOMRefCell;
pub use dom::bindings::js::JS;
pub use dom::node::Node;
pub use dom::bindings::refcounted::TrustedPromise;
pub use std::string::String;

This comment has been minimized.

@KiChjang

KiChjang Sep 5, 2017

Member

This should not be necessary.

This comment has been minimized.

@jdm

jdm Sep 12, 2017

Member

Let's remove this.

@@ -0,0 +1,72 @@
use script::test::String;

This comment has been minimized.

@KiChjang

KiChjang Sep 5, 2017

Member

This should not be necessary.

This comment has been minimized.

@jdm

jdm Sep 12, 2017

Member

Let's remove this.

@KiChjang KiChjang assigned KiChjang and unassigned jdm Sep 5, 2017
@jdm jdm unassigned KiChjang Sep 5, 2017
@iamneha iamneha force-pushed the iamneha:parseSrcset branch 3 times, most recently from 30c6228 to 956ac15 Sep 16, 2017
@iamneha iamneha changed the title WIP(Parse srcset) - Need to add more test cases Parse Srcset Attribute Sep 20, 2017
@iamneha iamneha force-pushed the iamneha:parseSrcset branch from 4d1f7fe to 3292fd6 Sep 21, 2017
Copy link
Member

wafflespeanut left a comment

Going good so far. Nice work! A few more steps towards merging. I think some of my comments should be thrown as warnings by rustc. We should get rid of the warnings just as we do the errors.

InParens,
AfterDescriptor,
}
#[derive(Debug)]

This comment has been minimized.

@wafflespeanut

wafflespeanut Sep 24, 2017

Member

Nit: missing newline

pub url: String,
pub descriptor: Descriptor,
}
#[derive(PartialEq)]

This comment has been minimized.

@wafflespeanut

wafflespeanut Sep 24, 2017

Member

Nit: newline

pub wid: Option<u32>,
pub den: Option<f64>,
}
#[derive(Clone, Copy, HeapSizeOf, JSTraceable)]

This comment has been minimized.

@wafflespeanut

wafflespeanut Sep 24, 2017

Member

Nit: newline

@@ -978,3 +1004,177 @@ fn image_dimension_setter(element: &Element, attr: LocalName, value: u32) {
let value = AttrValue::Dimension(value.to_string(), dim);
element.set_attribute(&attr, value);
}

pub fn collect_sequence_characters<F>(s: &str, predicate: F) -> (&str, &str)
where F: Fn(&char) -> bool

This comment has been minimized.

@wafflespeanut

wafflespeanut Sep 24, 2017

Member

Nit: Improper indentation (strip out spaces for this and the following lines)

let mut candidates: Vec<ImageSource> = vec![];
while start < input.len() {
let position = &input[start..];
let (spaces, position) = collect_sequence_characters(position, |c| *c ==',' || char::is_whitespace(*c));

This comment has been minimized.

@wafflespeanut

wafflespeanut Sep 24, 2017

Member

Nit: missing space (should be *c == ',')

}
if !error {
let descriptor = Descriptor { wid: width, den: density };
let imageSource = ImageSource { url: url, descriptor: descriptor };

This comment has been minimized.

@wafflespeanut

wafflespeanut Sep 24, 2017

Member

Nit: camel case - image_source

let empty_vec = Vec::new();
assert_eq!(parse_a_srcset_attribute("small-image.jpg 320w 1.1x"), empty_vec);
}
#[test]

This comment has been minimized.

@wafflespeanut

wafflespeanut Sep 24, 2017

Member

Nit: newline between functions please

let has_x = remaining == "x";
let has_h = remaining == "h";
if valid_non_negative_integer.is_ok() && has_w {
let result = parse_unsigned_integer(char_iter.clone());

This comment has been minimized.

@wafflespeanut

wafflespeanut Sep 24, 2017

Member

Why should we parse this again if we already have the result in valid_non_negative_integer? We could just unwrap it, right?

width = Some(w);
}
} else if valid_floating_point.is_ok() && has_x {
let result = parse_double(digits);

This comment has been minimized.

@wafflespeanut

wafflespeanut Sep 24, 2017

Member

Ditto on unnecessary re-parsing. We should unwrap from valid_floating_point

density = Some(x);
}
} else if valid_non_negative_integer.is_ok() && has_h {
let result = parse_unsigned_integer(char_iter.clone());

This comment has been minimized.

@wafflespeanut

wafflespeanut Sep 24, 2017

Member

Ditto on re-parsing.

}
}
return (s, "");
}

This comment has been minimized.

@wafflespeanut

wafflespeanut Sep 28, 2017

Member

Here's the correct indentation:

pub fn collect_sequence_characters<F>(s: &str, predicate: F) -> (&str, &str)
    where F: Fn(&char) -> bool
{
    for (i, ch) in s.chars().enumerate() {
        if !predicate(&ch) {
            return (&s[0..i], &s[i..])
        }
    }
    
    return (s, "");
}
@wafflespeanut
Copy link
Member

wafflespeanut commented Sep 28, 2017

Sorry, I just realized that you haven't finished updating the code. 😅

@iamneha iamneha force-pushed the iamneha:parseSrcset branch from 43c332f to 84a88d3 Sep 29, 2017
@jdm jdm mentioned this pull request Oct 2, 2017
4 of 4 tasks complete
@jdm
Copy link
Member

jdm commented Oct 2, 2017

I've squashed these commits in #18716.

@jdm jdm closed this Oct 2, 2017
bors-servo added a commit that referenced this pull request Oct 3, 2017
Parse srcset attribute values

Squashed version of #18313.

---
- [x] `./mach build -d` does not report any errors
- [x] `./mach test-tidy` does not report any errors
- [x] These changes fix (partially) #11416
- [x] There are tests for these changes

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/18716)
<!-- Reviewable:end -->
bors-servo added a commit that referenced this pull request Oct 4, 2017
Parse srcset attribute values

Squashed version of #18313.

---
- [x] `./mach build -d` does not report any errors
- [x] `./mach test-tidy` does not report any errors
- [x] These changes fix (partially) #11416
- [x] There are tests for these changes

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/18716)
<!-- Reviewable:end -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

7 participants
You can’t perform that action at this time.