Add a more robust representation of selectors #20

Closed
5 tasks done
philss opened this issue Sep 4, 2015 · 4 comments


@philss
Owner

philss commented Sep 4, 2015

The current implementation does not allow searching for elements with a compound selector, like a.some-class or .some-class.another-class. This is because it is coupled to the idea of one search per selector at a time, which means I can't match a "tag" and a "class" on the same element at the same time.

We need to provide a basic infrastructure to be able to search using groups of selectors.
This could bring huge flexibility to Floki, since it would be possible to truly simulate the "jQuery" selector (or rather Sizzle, the jQuery selector engine).

Examples of queries to support:

  • Floki.find("a.foo")
  • Floki.find(".foo.bar")
  • Floki.find(".foo[data-js=bar]")
  • Floki.find("a.foo[data='baz.html']")
  • Floki.find("a b.foo")
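
To make the idea concrete, here is a minimal sketch of how one compound selector such as a.foo[data='baz.html'] could be represented and matched against a single node. It assumes nodes are {tag, attributes, children} tuples; the SelectorSketch module, its fields and its functions are hypothetical names for illustration, not Floki's API.

```elixir
defmodule SelectorSketch do
  # Hypothetical representation of one compound selector.
  # tag: nil means "any tag"; attributes are {name, value} pairs that must match exactly.
  defstruct tag: nil, classes: [], attributes: []

  def new(opts \\ []), do: struct!(__MODULE__, opts)

  # An element matches only when ALL parts of the selector match the same node.
  def matches?({tag, attrs, _children}, %__MODULE__{} = selector) do
    tag_match?(tag, selector.tag) and
      classes_match?(attrs, selector.classes) and
      attributes_match?(attrs, selector.attributes)
  end

  # Text nodes and anything else that is not an element never match.
  def matches?(_other, _selector), do: false

  defp tag_match?(_tag, nil), do: true
  defp tag_match?(tag, expected), do: tag == expected

  defp classes_match?(attrs, classes) do
    present =
      case List.keyfind(attrs, "class", 0) do
        {"class", value} -> String.split(value)
        nil -> []
      end

    Enum.all?(classes, &(&1 in present))
  end

  defp attributes_match?(attrs, expected) do
    Enum.all?(expected, fn {name, value} ->
      List.keyfind(attrs, name, 0) == {name, value}
    end)
  end
end

link = {"a", [{"class", "foo bar"}, {"data", "baz.html"}], ["text"]}

# "a.foo"
SelectorSketch.matches?(link, SelectorSketch.new(tag: "a", classes: ["foo"]))
# → true

# "b.foo"
SelectorSketch.matches?(link, SelectorSketch.new(tag: "b", classes: ["foo"]))
# → false
```

Keeping the tag, classes and attributes in one structure is what lets a single pass over the tree check all of them on the same element.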

The following examples can be implemented as well, but since they depend on a more robust representation of the HTML tree, they should be implemented in the future:

  • Floki.find("a + b") (matches b when it immediately follows a sibling a)
  • Floki.find("a > b") (matches b that is a direct child of a)
  • Floki.find("a:first-child") (matches a when a is the first child of its parent)
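
These three queries all need to know where a node sits among its siblings or under its parent. A rough sketch of that, assuming nodes are {tag, attributes, children} tuples and matching on tag names only, could look like the following. CombinatorSketch and its functions are hypothetical names, and the top-level nodes are treated as one sibling group.

```elixir
defmodule CombinatorSketch do
  # "parent > child": direct element children of every parent_tag element.
  def child(tree, parent_tag, child_tag) do
    tree
    |> elements()
    |> Enum.filter(&tag?(&1, parent_tag))
    |> Enum.flat_map(fn {_tag, _attrs, children} ->
      Enum.filter(children, &tag?(&1, child_tag))
    end)
  end

  # "prev + next": next_tag elements whose previous element sibling is prev_tag.
  def adjacent(tree, prev_tag, next_tag) do
    tree
    |> sibling_groups()
    |> Enum.flat_map(fn siblings ->
      siblings
      |> Enum.chunk_every(2, 1, :discard)
      |> Enum.filter(fn [prev, next] -> tag?(prev, prev_tag) and tag?(next, next_tag) end)
      |> Enum.map(fn [_prev, next] -> next end)
    end)
  end

  # "tag:first-child": tag elements that are the first element of their sibling group.
  def first_child(tree, tag) do
    tree
    |> sibling_groups()
    |> Enum.flat_map(fn
      [first | _rest] -> if tag?(first, tag), do: [first], else: []
      [] -> []
    end)
  end

  # Every list of element siblings: the top-level nodes plus each element's children.
  defp sibling_groups(tree) do
    roots = List.wrap(tree)
    nested = roots |> elements() |> Enum.map(fn {_tag, _attrs, children} -> children end)
    Enum.map([roots | nested], fn group -> Enum.filter(group, &element?/1) end)
  end

  # All element nodes, depth-first in document order.
  defp elements(nodes) when is_list(nodes), do: Enum.flat_map(nodes, &elements/1)
  defp elements({_tag, _attrs, children} = node), do: [node | elements(children)]
  defp elements(_other), do: []

  defp element?({_tag, _attrs, _children}), do: true
  defp element?(_other), do: false

  defp tag?({tag, _attrs, _children}, expected), do: tag == expected
  defp tag?(_other, _expected), do: false
end

tree = [{"div", [], [{"a", [], ["x"]}, "text", {"b", [], ["y"]}, {"b", [], ["z"]}]}]

CombinatorSketch.adjacent(tree, "a", "b")
# → [{"b", [], ["y"]}]

CombinatorSketch.child(tree, "div", "b")
# → [{"b", [], ["y"]}, {"b", [], ["z"]}]

CombinatorSketch.first_child(tree, "a")
# → [{"a", [], ["x"]}]
```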

This issue is related to #18

@philss philss added this to the 1.0 milestone Sep 4, 2015
@edmz

edmz commented Sep 4, 2015

Just curious, in your experience, is Floki.find("a + b") harder to implement than Floki.find("a.foo")?

@philss
Owner Author

philss commented Sep 4, 2015

@edmz I think it's harder because for Floki.find("a + b") we need the concept of a "parent" selector and a "parent lookup". Today it's possible to perform a descendant search using Floki.find("a b"), but this is easier because we only look up elements inside the matches of the previous selector.

Floki.find("a.foo") would require multiple matches on the same element, which makes it easier than the previous example since we don't need to check the position of the target element, or its parent/child relationships, inside the HTML tree.
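
For illustration, the descendant search for "a b" described above can be sketched as a nested lookup: find every match of the first selector, then search only inside its children for the rest. This is a simplified sketch that matches on tag names only and assumes {tag, attributes, children} tuples; DescendantSketch is a hypothetical module, not how Floki is implemented.

```elixir
defmodule DescendantSketch do
  # tags is the selector chain, e.g. ["a", "b"] for the query "a b".
  def find(tree, [tag]), do: find_tag(tree, tag)

  def find(tree, [tag | rest]) do
    tree
    |> find_tag(tag)
    # Only the children of each match are searched for the rest of the chain.
    |> Enum.flat_map(fn {_tag, _attrs, children} -> find(children, rest) end)
  end

  # All elements with the given tag, at any depth, in document order.
  defp find_tag(nodes, tag) when is_list(nodes), do: Enum.flat_map(nodes, &find_tag(&1, tag))

  defp find_tag({node_tag, _attrs, children} = node, tag) do
    matches = find_tag(children, tag)
    if node_tag == tag, do: [node | matches], else: matches
  end

  defp find_tag(_other, _tag), do: []
end

tree = [
  {"div", [],
   [
     {"a", [], [{"span", [], [{"b", [], ["hit"]}]}]},
     {"b", [], ["miss"]}
   ]}
]

DescendantSketch.find(tree, ["a", "b"])
# → [{"b", [], ["hit"]}]
```

Note that no parent or sibling information is needed here, which is why this case is simpler than "a + b". One caveat of this naive version: when matches of the first selector are nested inside each other, the same descendant is returned once per enclosing match.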

@philss
Owner Author

philss commented Sep 12, 2015

Just to let you know: I'm working on this. It's taking shape and I should have a working version soon.

philss added a commit that referenced this issue Sep 17, 2015
This is a huge refactor in the `Floki.find/2` function that enables more
complex searches using a mix of selectors. You can mix selectors
as you normally would in other tools like jQuery, or when applying rules
with CSS selectors.

Examples of queries now supported:
- "a.foo"
- ".foo.bar"
- ".baz[data='something']"
- "[title][href$='.html']"
- "a b.foo c"

To achieve this, it was necessary to write a tokenizer and a parser for
the input selector. It's quite easy to understand after reading this
article by Andrea Leopardi (@whatyouhide) about tokenizing and parsing
in Elixir:
http://andrealeopardi.com/posts/tokenizing-and-parsing-in-elixir-using-leex-and-yecc/

The tokenizer partially covers the CSS3 selectors spec, which you can
find at http://www.w3.org/TR/css3-selectors/

Known issues:
- There is no support for pseudo-selectors;
- The only combinator supported is the descendant combinator;
- If there is a group of selectors in the same query, and two selectors
  match the same node, this node will appear twice in the resulting
  list.

Closes #18 and #20.
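
As a rough illustration of the tokenizing step mentioned in the commit message, here is a small hand-written tokenizer in plain Elixir. It is only a sketch: the commit refers to an article about leex and yecc, whereas this version swaps in regular expressions, and the token names and the TokenizerSketch module are assumptions made for the example.

```elixir
defmodule TokenizerSketch do
  # Turns a selector string into {:ok, tokens} or {:error, reason}.
  def tokenize(selector), do: scan(String.trim(selector), [])

  defp scan("", acc), do: {:ok, Enum.reverse(acc)}

  defp scan(input, acc) do
    # Try each rule in order and take the first one that matches the start of the input.
    case Enum.find_value(rules(), &try_rule(&1, input)) do
      {token, rest} -> scan(rest, [token | acc])
      nil -> {:error, {:unexpected_input, input}}
    end
  end

  defp try_rule({regex, build}, input) do
    case Regex.run(regex, input) do
      [matched | captures] -> {build.(captures), String.replace_prefix(input, matched, "")}
      nil -> nil
    end
  end

  # Hypothetical token set covering the examples in this thread.
  # Whitespace is kept as a token because it is the descendant combinator.
  defp rules do
    [
      {~r/^\s+/, fn _ -> :space end},
      {~r/^\.([\w-]+)/, fn [name] -> {:class, name} end},
      {~r/^\[/, fn _ -> :open_bracket end},
      {~r/^\]/, fn _ -> :close_bracket end},
      {~r/^\$=/, fn _ -> :suffix_match end},
      {~r/^=/, fn _ -> :equal end},
      {~r/^'([^']*)'/, fn [value] -> {:quoted, value} end},
      {~r/^"([^"]*)"/, fn [value] -> {:quoted, value} end},
      {~r/^([\w-]+)/, fn [name] -> {:identifier, name} end}
    ]
  end
end

TokenizerSketch.tokenize("a.foo[data='baz.html']")
# → {:ok,
#    [
#      {:identifier, "a"},
#      {:class, "foo"},
#      :open_bracket,
#      {:identifier, "data"},
#      :equal,
#      {:quoted, "baz.html"},
#      :close_bracket
#    ]}

# Unsupported combinators are rejected instead of being silently skipped.
TokenizerSketch.tokenize("a > b")
# → {:error, {:unexpected_input, "> b"}}
```

A parser would then group these tokens into one compound selector per whitespace-separated segment.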
@philss
Owner Author

philss commented Sep 17, 2015

The merge of #22 closes this issue. I will open new issues related to the selectors that are missing.

@philss philss closed this as completed Sep 17, 2015