Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse Srcset Attribute #18313

Closed
wants to merge 2 commits into from
Closed
Changes from 1 commit
Commits
File filter...
Filter file types
Jump to…
Jump to file
Failed to load files.

Always

Just for now

Next

Parse srcset attribute

  • Loading branch information
iamneha committed Sep 29, 2017
commit 84a88d3e13812d604891282140928f35b056de51
@@ -55,9 +55,28 @@ use std::cell::{Cell, RefMut};
use std::default::Default;
use std::i32;
use std::sync::{Arc, Mutex};
use style::attr::{AttrValue, LengthOrPercentageOrAuto};
use style::attr::{AttrValue, LengthOrPercentageOrAuto, parse_double, parse_unsigned_integer};
use style::str::is_ascii_digit;
use task_source::TaskSource;

enum ParseState {

This comment has been minimized.

Copy link
@wafflespeanut

wafflespeanut Sep 24, 2017

Member

#[allow(dead_code)] should be used only when we have something that we don't use right now, but we have plans for it in the future. That's not the case of ParseState, so we should remove it. Also, since ParseState is not stored in any DOM struct, we don't have to derive HeapSizeOf and JSTraceable for it.

InDescriptor,
InParens,
AfterDescriptor,
}

#[derive(Debug, PartialEq)]
pub struct ImageSource {
pub url: String,
pub descriptor: Descriptor,
}

#[derive(Debug, PartialEq)]
pub struct Descriptor {
pub wid: Option<u32>,
pub den: Option<f64>,
}

#[derive(Clone, Copy, HeapSizeOf, JSTraceable)]

This comment has been minimized.

Copy link
@wafflespeanut

wafflespeanut Sep 24, 2017

Member

Nit: newline

#[allow(dead_code)]
enum State {
@@ -739,6 +758,11 @@ impl HTMLImageElementMethods for HTMLImageElement {
// https://html.spec.whatwg.org/multipage/#dom-img-src
make_setter!(SetSrc, "src");

// https://html.spec.whatwg.org/multipage/#parsing-a-srcset-attribute
make_url_getter!(Srcset, "srcset");
// https://html.spec.whatwg.org/multipage/#parsing-a-srcset-attribute
make_setter!(SetSrcset, "srcset");

// https://html.spec.whatwg.org/multipage/#dom-img-crossOrigin
fn GetCrossOrigin(&self) -> Option<DOMString> {
reflect_cross_origin_attribute(self.upcast::<Element>())
@@ -978,3 +1002,177 @@ fn image_dimension_setter(element: &Element, attr: LocalName, value: u32) {
let value = AttrValue::Dimension(value.to_string(), dim);
element.set_attribute(&attr, value);
}

/// Collect sequence of code points
pub fn collect_sequence_characters<F>(s: &str, predicate: F) -> (&str, &str)

This comment has been minimized.

Copy link
@KiChjang

KiChjang Sep 5, 2017

Member

Link to https://infra.spec.whatwg.org/#collect-a-sequence-of-code-points.

I'm also doubtful whether this function belongs here. It sounds like script/dom/bindings/str.rs is a better place.

This comment has been minimized.

Copy link
@wafflespeanut

wafflespeanut Sep 24, 2017

Member

Nit: Doc comment on what this function does?

where F: Fn(&char) -> bool
{
for (i, ch) in s.chars().enumerate() {
if !predicate(&ch) {
return (&s[0..i], &s[i..])
}
}

This comment has been minimized.

Copy link
@wafflespeanut

wafflespeanut Sep 28, 2017

Member

Here's the correct indentation:

pub fn collect_sequence_characters<F>(s: &str, predicate: F) -> (&str, &str)
    where F: Fn(&char) -> bool
{
    for (i, ch) in s.chars().enumerate() {
        if !predicate(&ch) {
            return (&s[0..i], &s[i..])
        }
    }
    
    return (s, "");
}

return (s, "");
}

/// Parse an `srcset` attribute - https://html.spec.whatwg.org/multipage/#parsing-a-srcset-attribute.
pub fn parse_a_srcset_attribute(input: &str) -> Vec<ImageSource> {
let mut url_len = 0;
let mut candidates: Vec<ImageSource> = vec![];
while url_len < input.len() {
let position = &input[url_len..];
let (spaces, position) = collect_sequence_characters(position, |c| *c == ',' || char::is_whitespace(*c));
// add the length of the url that we parse to advance the start index
let space_len = spaces.char_indices().count();

This comment has been minimized.

Copy link
@wafflespeanut

wafflespeanut Aug 31, 2017

Member

We should go for chars only if we're concerned with unicode codepoints. If it's just spaces, then we could simply use spaces.len()

url_len += space_len;
if position.is_empty() {
return candidates;
}
let (url, spaces) = collect_sequence_characters(position, |c| !char::is_whitespace(*c));
// add the counts of urls that we parse to advance the start index

This comment has been minimized.

Copy link
@jdm

jdm Sep 12, 2017

Member

add the length of the url that we parse to advance the start index

url_len += url.chars().count();
let comma_count = url.chars().rev().take_while(|c| *c == ',').count();
let url: String = url.chars().take(url.chars().count() - comma_count).collect();
// add 1 to start index, for the comma
url_len += comma_count + 1;
let (space, position) = collect_sequence_characters(spaces, |c| char::is_whitespace(*c));
let space_len = space.len();

This comment has been minimized.

Copy link
@wafflespeanut

wafflespeanut Sep 24, 2017

Member

We don't seem to be using spaces_len anywhere else, so let's remove this.

url_len += space_len;
let mut descriptors = Vec::new();

This comment has been minimized.

Copy link
@wafflespeanut

wafflespeanut Sep 24, 2017

Member

Nit: Our function couldn't get bigger. Let's drop a newline here for isolating a different span of code.

let mut current_descriptor = String::new();
let mut state = ParseState::InDescriptor;
let mut char_stream = position.chars().enumerate();
let mut buffered: Option<(usize, char)> = None;
loop {
let next_char = buffered.take().or_else(|| char_stream.next());
if next_char.is_some() {
url_len += 1;
}
match state {
ParseState::InDescriptor => {
match next_char {
Some((_, ' ')) => {
if !current_descriptor.is_empty() {
descriptors.push(current_descriptor.clone());
current_descriptor = String::new();
state = ParseState::AfterDescriptor;
}
continue;
}
Some((_, ',')) => {
if !current_descriptor.is_empty() {
descriptors.push(current_descriptor.clone());
}
break;
}
Some((_, c @ '(')) => {
current_descriptor.push(c);
state = ParseState::InParens;
continue;
}
Some((_, c)) => {
current_descriptor.push(c);
}
None => {
if !current_descriptor.is_empty() {
descriptors.push(current_descriptor.clone());
}
break;
}
}
}
ParseState::InParens => {
match next_char {
Some((_, c @ ')')) => {
current_descriptor.push(c);
state = ParseState::InDescriptor;
continue;
}
Some((_, c)) => {
current_descriptor.push(c);
continue;
}
None => {
if !current_descriptor.is_empty() {
descriptors.push(current_descriptor.clone());
}
break;
}
}
}
ParseState::AfterDescriptor => {
match next_char {
Some((_, ' ')) => {
state = ParseState::AfterDescriptor;
continue;
}
Some((idx, c)) => {
state = ParseState::InDescriptor;
buffered = Some((idx, c));

This comment has been minimized.

Copy link
@wafflespeanut

wafflespeanut Sep 24, 2017

Member

This is probably the only place where we're making use of idx - so, let's replace all instances of idx with _ in the patterns above.

continue;
}
None => {
if !current_descriptor.is_empty() {
descriptors.push(current_descriptor.clone());
}
break;
}
}
}
}
}

let mut error = false;

This comment has been minimized.

Copy link
@wafflespeanut

wafflespeanut Sep 24, 2017

Member

Nit: newline please

let mut width: Option<u32> = None;
let mut density: Option<f64> = None;
let mut future_compat_h: Option<u32> = None;

This comment has been minimized.

Copy link
@jdm

jdm Sep 12, 2017

Member

We should declare these inside the loop, or else they will never be reset. I suspect that parsing foo.jpg 2w, bar.jpg 1.0x would fail right now.

This comment has been minimized.

Copy link
@iamneha

iamneha Sep 12, 2017

Author Contributor

It is not failing I tried this also and added test for this also

for descriptor in descriptors {
let (digits, remaining) = collect_sequence_characters(&descriptor, |c| is_ascii_digit(c) || *c == '.');
let valid_non_negative_integer = parse_unsigned_integer(digits.chars());

This comment has been minimized.

Copy link
@jdm

jdm Sep 12, 2017

Member

We probably don't need .chars() here.

let has_w = remaining == "w";
let valid_floating_point = parse_double(digits);
let has_x = remaining == "x";
let has_h = remaining == "h";
if valid_non_negative_integer.is_ok() && has_w {

This comment has been minimized.

Copy link
@wafflespeanut

wafflespeanut Aug 31, 2017

Member

This should be !has_w, or more simply, we should check has_w && width.is_some() first (and error if it's true) before checking the non-negative integer thingy.

During descriptor parsing, we ensure that there are no repetitions in descriptors, which means there shouldn't be any usage of w, x or h more than once - i.e., if the value ends with h and we already have a height value in the descriptor, then it's a parse error.

let result = valid_non_negative_integer;
error = result.is_err();
if width.is_some() || density.is_some() {
error = true;
}
if let Ok(w) = result {
width = Some(w);

This comment has been minimized.

Copy link
@creativcoder

creativcoder Sep 5, 2017

Contributor

As we are just pulling the value out and converting it to option type (i.e, Some(w)), we can use the ok method like: width = result.ok(). Calling ok() on a Result<T> type converts it to an Option<T>. ok() method does what you are currently doing here.

}
} else if valid_floating_point.is_ok() && has_x {
let result = valid_floating_point;
error = result.is_err();
if width.is_some() || density.is_some() || future_compat_h.is_some() {
error = true;
}
if let Ok(x) = result {
density = Some(x);
}
} else if valid_non_negative_integer.is_ok() && has_h {
let result = valid_non_negative_integer;
error = result.is_err();
if density.is_some() || future_compat_h.is_some() {
error = true;
}
if let Ok(h) = result {
future_compat_h = Some(h);
}
} else {
error = true;
}
}
if future_compat_h.is_some() && width.is_none() {
error = true;
}
if !error {
let descriptor = Descriptor { wid: width, den: density };
let image_source = ImageSource { url: url, descriptor: descriptor };
candidates.push(image_source);
}
}
candidates
}
@@ -9,8 +9,8 @@ interface HTMLImageElement : HTMLElement {
attribute DOMString alt;
[CEReactions]
attribute DOMString src;
// [CEReactions]
// attribute DOMString srcset;

This comment has been minimized.

Copy link
@creativcoder

creativcoder Sep 5, 2017

Contributor

Nit: Please remove this commented line

[CEReactions]

This comment has been minimized.

Copy link
@KiChjang

KiChjang Sep 5, 2017

Member

In order to generate appropriate bindings for the return type, you'll need to add the [Throws] attribute here.

attribute DOMString srcset;
[CEReactions]
attribute DOMString? crossOrigin;
[CEReactions]
@@ -58,7 +58,8 @@ pub fn split_commas<'a>(s: &'a str) -> Filter<Split<'a, char>, fn(&&str) -> bool
s.split(',').filter(not_empty as fn(&&str) -> bool)
}

fn is_ascii_digit(c: &char) -> bool {
/// Character is ascii digit
pub fn is_ascii_digit(c: &char) -> bool {
match *c {
'0'...'9' => true,
_ => false,
ProTip! Use n and p to navigate between commits in a pull request.
You can’t perform that action at this time.