Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return selector after consuming combinator #115

Merged
merged 2 commits into from
May 25, 2017

Conversation

jjcarstens
Copy link
Contributor

Since the change to parse all tokens, a selector is returned when encountering a :comma token.

When we consume_combinators, that calls do_parse which gets a selector and continues on parsing tokens. However, the selector is returned as a result of matching :comma token and removes it from the remaining tokens. So when parsing continues after consume_combinator, it continues parsing the previous selector instead of starting a new one.

This changes so consume_combinator functions return a selector tuple since they will get a fully completed selector after parsing.

# before the change
iex> "a b.foo c, ul a[class=bar]" |> Floki.SelectorTokenizer.tokenize |> Floki.SelectorParser.parse |> length
1
iex> "a b.foo c, ul a[class=bar]" |> Floki.SelectorTokenizer.tokenize |> Floki.SelectorParser.parse
[%Floki.Selector{attributes: [], classes: [],
  combinator: %Floki.Combinator{match_type: :descendant,
   selector: %Floki.Selector{attributes: [], classes: ["foo"],
    combinator: %Floki.Combinator{match_type: :descendant,
     selector: %Floki.Selector{attributes: [%Floki.AttributeSelector{attribute: "class",
        match_type: :equal, value: "bar"}], classes: [], combinator: nil,
      id: nil, namespace: nil, pseudo_classes: [], type: "a"}}, id: nil,
    namespace: nil, pseudo_classes: [], type: "ul"}}, id: nil, namespace: nil,
  pseudo_classes: [], type: "a"}]

# With proposed changes
iex> "a b.foo c, ul a[class=bar]" |> Floki.SelectorTokenizer.tokenize |> Floki.SelectorParser.parse |> length
2
iex> "a b.foo c, ul a[class=bar]" |> Floki.SelectorTokenizer.tokenize |> Floki.SelectorParser.parse
[%Floki.Selector{attributes: [], classes: [],
  combinator: %Floki.Combinator{match_type: :descendant,
   selector: %Floki.Selector{attributes: [], classes: ["foo"],
    combinator: %Floki.Combinator{match_type: :descendant,
     selector: %Floki.Selector{attributes: [], classes: [], combinator: nil,
      id: nil, namespace: nil, pseudo_classes: [], type: "c"}}, id: nil,
    namespace: nil, pseudo_classes: [], type: "b"}}, id: nil, namespace: nil,
  pseudo_classes: [], type: "a"},
 %Floki.Selector{attributes: [], classes: [],
  combinator: %Floki.Combinator{match_type: :descendant,
   selector: %Floki.Selector{attributes: [%Floki.AttributeSelector{attribute: "class",
      match_type: :equal, value: "bar"}], classes: [], combinator: nil, id: nil,
    namespace: nil, pseudo_classes: [], type: "a"}}, id: nil, namespace: nil,
  pseudo_classes: [], type: "ul"}]

I think this covers all the missing cases, but would definitely love close 👀 and any suggestions you believe may still need to be tested.

Since [the change to parse all tokens a selector is returned when encountering a `:comma` token](https://github.com/philss/floki/pull/106/files#diff-7f26dd84090995cad0112f38a801b18bR32). However, when we `consume_combinators`, that calls `do_parse` which gets a selector and continues on parsing tokens. However, the selector is returned as a result of matching `:comma` token and removes it from the remaining tokens. So when parsing continues after `consume_combinator`, it continues parsing the previous selector instead of starting a new one.

This changes so `consume_combinator` functions return a selector tuple since they will get a fully completed selector after parsing.
@mischov
Copy link
Contributor

mischov commented May 25, 2017

Good catch. Your changes should be correct.

In the code that #106 is based on I append a comma onto the tokens returned from consuming a combinator, forcing a return of the selector when I try to continue to parse the selector (because reasons).

The behavior of appending a comma to the tokens was overkill in Floki so I removed it, but I (clearly) forgot to change the code to return the {selector, remaining_tokens} tuple like you have instead of continuing to parse the selector.

@philss philss merged commit c2187af into philss:master May 25, 2017
@philss
Copy link
Owner

philss commented May 25, 2017

I think this is correct! Thank you again, @jjcarstens!

@philss
Copy link
Owner

philss commented May 25, 2017

This fix was released in version 0.17.2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants