This repository has been archived by the owner on Oct 19, 2021. It is now read-only.

Experiment with chainable selectors and rewriters #3

Closed
wants to merge 2 commits into from

Conversation

plexus
Owner

@plexus plexus commented Jun 26, 2013

This commit introduces a first attempt at coming up with a chainable interface
for describing transformations of Hexp trees.

Hexp::Node::Selector

The idea is to have certain methods that select certain nodes in a tree. At
this point there is one such method, select, which takes a block that receives
a node and returns a boolean.

  node.select {|child_node| child_node.class? 'strong' }

The result is a Hexp::Node::Selector. It is Enumerable, so one can call #to_a, #map,
etc. to loop over the nodes in the tree that have class="strong".
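
A minimal sketch of how such a selector could work, using a toy node struct rather than the real Hexp classes. The names ToyNode and ToySelector are hypothetical and not part of Hexp; the sketch only assumes that a selector walks the tree depth-first and yields the nodes for which the block returns true.

```ruby
# Toy stand-in for a Hexp node: a tag, an attribute hash, and child nodes.
ToyNode = Struct.new(:tag, :attrs, :children) do
  # True when the space-separated class attribute contains +name+.
  def class?(name)
    attrs.fetch('class', '').split.include?(name)
  end

  # Returns a selector instead of an array (overrides Struct#select).
  def select(&block)
    ToySelector.new(self, block)
  end
end

# Wraps a root node and a predicate; Enumerable supplies #to_a, #map, etc.
class ToySelector
  include Enumerable

  def initialize(root, predicate)
    @root = root
    @predicate = predicate
  end

  # Depth-first walk, yielding only the nodes the predicate accepts.
  def each(&block)
    return to_enum(:each) unless block
    walk(@root, &block)
  end

  private

  def walk(node, &block)
    block.call(node) if @predicate.call(node)
    node.children.each { |child| walk(child, &block) }
  end
end

tree = ToyNode.new(:div, {}, [
  ToyNode.new(:p, { 'class' => 'strong intro' }, []),
  ToyNode.new(:p, {}, [ToyNode.new(:span, { 'class' => 'strong' }, [])])
])

strong_tags = tree.select { |n| n.class?('strong') }.map(&:tag)
```

Because only #each is defined, everything else (#map, #to_a, #first, ...) comes from Enumerable for free.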

Hexp::Node::Rewriter

This can be combined with the concept of "rewriting" a tree. Here the block receives
a node, and the return value of the block takes the place of the original node in the
new, rewritten tree.

This return value can be a single node or an array of nodes (zero or more). Nil is
treated specially in that it is considered a no-op: when the block returns nil, the
original node is simply kept in the tree unaltered.

This example will give every <p> the class "para".

  node.rewrite do |child_node|
    if child_node.tag == :p
      child_node.attr('class', 'para')
    end
  end
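
The three return-value cases (single node, array, nil) can be sketched on a toy tree as follows. RNode and the free-standing rewrite function are hypothetical stand-ins, not Hexp's implementation. The sketch also assumes that the block is called for descendants only, not for the root, and that children are rewritten before their parents.

```ruby
# Toy node; not the real Hexp::Node.
RNode = Struct.new(:tag, :attrs, :children)

# Returns a new tree. The block is called for every descendant of +node+,
# children first. Its return value decides what replaces that descendant:
#   nil            -> keep the (already rewritten) node as is
#   a single node  -> replace it with that node
#   an array       -> splice in zero or more nodes
def rewrite(node, &block)
  new_children = node.children.flat_map do |child|
    rewritten = rewrite(child, &block)
    result = block.call(rewritten)
    if result.nil?
      [rewritten]
    elsif result.is_a?(Array)
      result
    else
      [result]
    end
  end
  RNode.new(node.tag, node.attrs, new_children)
end

tree = RNode.new(:div, {}, [
  RNode.new(:p, {}, []),
  RNode.new(:hr, {}, []),
  RNode.new(:span, {}, [])
])

result = rewrite(tree) do |child|
  case child.tag
  when :p  then RNode.new(:p, child.attrs.merge('class' => 'para'), child.children)
  when :hr then []   # an empty array removes the node
  end                # any other tag falls through to nil: the node is kept
end
```

The original tree is never modified; rewrite builds a new node for every level it passes through.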

Combined

Nodes and Selectors both respond to #rewrite. In the latter case only the nodes
that match the selection criterion are passed to the block, so the former example
can be rewritten as

  node.select  {|child| child.tag == :p}
      .rewrite {|child| child.attr('class', 'para') }

The result is an instance of Hexp::Node::Rewriter, which lazily evaluates the
transformation when converted to a Hexp. It includes Hexp::DSL, so it can be
used as any Hexp::Node (tag, attributes, to_html, etc).
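
To illustrate what "lazily evaluates" could mean here, the sketch below defers the tree walk until a node-like reader is first called and then caches the result. LNode, LazyRewriter and the calls counter are hypothetical; the real Hexp::Node::Rewriter may be structured quite differently. For brevity the block may only return a node or nil.

```ruby
# Toy node; #to_node lets plain nodes and lazy rewriters be used alike.
LNode = Struct.new(:tag, :attrs, :children) do
  def to_node
    self
  end
end

class LazyRewriter
  attr_reader :calls

  def initialize(source, &block)
    @source = source
    @block = block
    @calls = 0 # counts block invocations, only to make the laziness visible
  end

  # Forces the transformation once and caches the resulting node.
  def to_node
    @node ||= transform(@source.to_node)
  end

  # Node-like readers delegate to the evaluated result.
  def tag
    to_node.tag
  end

  def attrs
    to_node.attrs
  end

  def children
    to_node.children
  end

  private

  def transform(node)
    new_children = node.children.map do |child|
      rewritten = transform(child)
      @calls += 1
      @block.call(rewritten) || rewritten # nil keeps the node
    end
    LNode.new(node.tag, node.attrs, new_children)
  end
end

tree = LNode.new(:div, {}, [LNode.new(:p, {}, [])])

lazy = LazyRewriter.new(tree) do |child|
  LNode.new(:p, child.attrs.merge('class' => 'para'), child.children) if child.tag == :p
end

calls_before = lazy.calls          # the block has not run yet
first_child  = lazy.children.first # first reader call forces the rewrite
calls_after  = lazy.calls
```

Building the rewriter costs nothing; the work happens on first use and is not repeated afterwards.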

These two can form the basic building blocks for a richer API. As a tryout they
already implement #wrap and #attr, so to take every paragraph, give it the class
'para', and wrap it in a div with class 'wrapper' you can write

  node.select {|child| child.tag == :p}
      .attr('class', 'para')
      .wrap(:div, 'class' => 'wrapper')
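
One way helpers like these could be layered on a single rewrite primitive is sketched below. CNode and Chain are hypothetical names, and the sketch assumes that every step in the chain applies to the originally selected nodes, in order; it is not the code from this PR.

```ruby
# Toy node; not the real Hexp::Node.
CNode = Struct.new(:tag, :attrs, :children)

class Chain
  def initialize(root, predicate, steps = [])
    @root = root
    @predicate = predicate
    @steps = steps
  end

  # The one primitive: queue a block mapping a matched node to its replacement.
  def rewrite(&step)
    Chain.new(@root, @predicate, @steps + [step])
  end

  # Convenience helpers expressed purely in terms of #rewrite.
  def attr(name, value)
    rewrite { |n| CNode.new(n.tag, n.attrs.merge(name => value), n.children) }
  end

  def wrap(tag, attrs = {})
    rewrite { |n| CNode.new(tag, attrs, [n]) }
  end

  # Builds the new tree; the source tree is left untouched.
  def to_node
    apply(@root)
  end

  private

  def apply(node)
    children = node.children.map do |child|
      rewritten = apply(child)
      next rewritten unless @predicate.call(child)
      @steps.reduce(rewritten) { |acc, step| step.call(acc) }
    end
    CNode.new(node.tag, node.attrs, children)
  end
end

tree = CNode.new(:body, {}, [CNode.new(:p, {}, []), CNode.new(:h1, {}, [])])

result = Chain.new(tree, ->(n) { n.tag == :p })
              .attr('class', 'para')
              .wrap(:div, 'class' => 'wrapper')
              .to_node
```

Each helper is a one-liner on top of #rewrite, which suggests further helpers would be equally small.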

The future

The next step will be to leverage SASS's CSS parser to implement a CSS selector,
then the above would become

  node.css('p')
      .attr('class', 'para')
      .wrap(:div, 'class' => 'wrapper')

tl;dr drinks for feedback, I mean it

API design is hard, so I'm desperately looking for feedback on this. I will
personally buy you a beer (or other beverage of your choosing), at Eurucamp,
Arrrrcamp, any time in Berlin or at the first occasion that presents itself,
for every constructive comment on this PR.

@plexus
Owner Author

plexus commented Jun 26, 2013

note that the build will certainly fail due to metrics dropping, that's fine for now, I'll clean it up

@coveralls

Coverage Status

Coverage remained the same when pulling 7d1ad9f on chainable_api_rfc into 628cbd2 on master.

@coveralls

Coverage Status

Coverage remained the same when pulling 36974b6 on chainable_api_rfc into 628cbd2 on master.

@PragTob

PragTob commented Jun 27, 2013

So at first I was like "Ok what would I need that for?" but the later comments made it clearer. I like the future version, but I tend to like things that look like the jQuery API since I really love the jQuery API.

However jQuery is mainly used for interactive alteration of stuff. Hexp is built to generate HTML in the first place. I mean why modify nodes afterwards and not just create them like this in the first place? I can see instances where that would be DRYer, which is cool but I dunno if it would be enough for people to change the way they write HTML - missing the killer feature part. Maybe I'm just tired.

Cheers,
Tobi

@plexus
Owner Author

plexus commented Jun 27, 2013

Hey Tobi! Nice to see you here :)

You are right about jQuery, I have a mental note that I should take a better look at it and mimic some of that API. One big difference is that one of the constraints I'm putting on Hexp is that all values are immutable, whereas jQuery has a mutable DOM to play with. But that's mostly an implementation issue, the APIs could still resemble one another.

You are also right that typically you would be mostly just generating/composing things the way you want them. But I can think of several use cases where having this orthogonal way of altering existing trees is great. In fact after preventing XSS I think this is the killer feature for Hexp, I just need to come up with some better examples.

For example :

  • You have a shopping cart system, and then a plugin that adds discount code functionality. The plugin is completely separated from the core implementation, it can just add that extra field in the tree
  • Adding edit buttons or in-place-editing when logged in as admin
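
The shopping cart case could look roughly like this on a toy tree. PNode, cart_form and add_discount_field are hypothetical names; the point is only that the plugin transforms the finished tree without the core code knowing about it.

```ruby
# Toy node; not the real Hexp::Node.
PNode = Struct.new(:tag, :attrs, :children)

# Core: knows nothing about discount codes.
def cart_form
  PNode.new(:form, { 'id' => 'cart' }, [
    PNode.new(:input, { 'name' => 'quantity' }, []),
    PNode.new(:button, { 'type' => 'submit' }, [])
  ])
end

# Plugin: returns a copy of the tree with an extra field inserted
# before the submit button of the form with id "cart".
def add_discount_field(node)
  children = node.children.map { |child| add_discount_field(child) }
  if node.tag == :form && node.attrs['id'] == 'cart'
    field = PNode.new(:input, { 'name' => 'discount_code' }, [])
    index = children.index { |c| c.tag == :button } || children.length
    children.insert(index, field) # children is a fresh array from #map
  end
  PNode.new(node.tag, node.attrs, children)
end

page = PNode.new(:body, {}, [cart_form])
with_plugin = add_discount_field(page)
field_names = with_plugin.children.first.children.map { |c| c.attrs['name'] }
```

No HTML is parsed or re-generated along the way; the plugin works on the same data structure the core produced.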

@til might have some more to say about this, since he's actually working on a very component based system.

@PragTob

PragTob commented Jun 27, 2013

Related to adding functionality in to existing HTML docs: spree/deface

Those use cases are indeed better, thanks!

edit: scumbag github doesn't make a link :-( so here it is: https://github.com/spree/deface

@plexus
Owner Author

plexus commented Jun 27, 2013

Ah yes, great example! And it depends on Nokogiri, so I'm guessing they parse the output, alter the Nokogiri DOM, then generate HTML again. A similar approach is used by https://github.com/Wardrop/Formless to populate forms with values from the request.

This going back and forth between parsing and generating is expensive. The first Ruby web framework I contributed to, Nitro, did something similar. Views were processed in several passes, each basically applying XML transformations, so you could do things like this

<ul>
  <li foreach="@users as user">{{user.name}}</li>
</ul>

I'm just making up syntax now, don't remember exactly. The thing is that each pass would parse, alter, generate, which was horribly slow. (I'm not saying Hexp is super fast, I haven't benchmarked at all, but altering arrays and hashes should be lighter than parsing HTML. And the immutable data structures should provide a good basis for optimization.)

@plexus
Owner Author

plexus commented Jun 27, 2013

Relatedly : Drupal's Form API is also interesting. You build a data structure that represents your form, and Drupal will take care of rendering it. This is at a higher level than HTML. Modules can reference forms from other modules by an id, and then alter them.

I'm no longer a big fan of Drupal but this is something they really got right and that I miss in Rails. This form description is again used as a pre-processing step when the request comes in, so only fields in your forms are passed through, and lengths/types are enforced. So you have one source of truth for what you have in three places in Rails: form helper in views, parameter checking in the controller (e.g. strong_params, or just to_i/to_s), and validations on models. And you get a consistent UI because people don't manually write all their form HTML :)

Basically this is a good example of "data structures over strings", where "data structure" is not synonymous to "syntax tree". It's simply a higher level representation of your app's semantics, and something that you can pass around and work with.

@til

til commented Jun 29, 2013

The Drupal form builder example sounds like an interesting use case for Hexp. It probably is often the case that HTML generation is only one feature of dealing with a tree of higher level objects that are central to the application, in that example form elements, or rather model attributes. In the classic CMS use case, one might deal with a tree of page elements. Rendering them to plain HTML is one feature, rendering them to WYSIWYG editable HTML is another feature, serializing their content another one. The latter having potentially nothing to do with Hexp anymore, but somehow a tree of page elements and a tree of HTML nodes seem very similar things to me, just with different levels of detail.

I guess a sufficiently real-world example is necessary to drive the further development of the Hexp API, maybe you want to experiment with a drupal-like form builder? Maybe something that automatically renders forms for models that have ROM attribute definitions?

And some random API remarks: I find using jquery selectors so common that I wouldn't use it as method name, but rather have select or even rewrite accept it as single optional argument:

doc.rewrite('article > h1') { |h1| h1.text = h1.text.upcase }

Also would it be possible to not look at the return value of the rewrite block, but allow direct node modifications inside the block? I find the special case nil return value cumbersome, and would prefer something like this:

doc.rewrite('article > h2') { |h2| h2.remove! }

Also, why not use [] for attributes, e.g. node[:class] = 'foo' instead of node.attr('class', 'foo')?

@plexus
Owner Author

plexus commented Jun 29, 2013

Hey Til, thanks for joining the discussion.

I also think a more concrete example of one of these use cases will help drive things forward, so I played a bit with implementing a form builder API, see https://gist.github.com/plexus/5892947

The interesting stuff is

IssueForm = Form.build do
  textfield  :title,    title: "Title"
  select     :priority, title: "Priority",
                        options: Issue::PRIORITIES
end

In your controller you can now do things like

IssueForm.new(Issue.new).to_html

# or

form = IssueForm.new(params)
Issue.create(form.values) # filtered and type-cast based on form definition
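
The gist itself is not reproduced here. As an independent toy sketch of how a form definition could double as a parameter filter, consider the following. SketchForm, its Field struct and SKETCH_PRIORITIES are hypothetical, and the filtering rules (drop undeclared keys, reject select values that are not among the options) are assumptions rather than the gist's behaviour.

```ruby
class SketchForm
  Field = Struct.new(:name, :kind, :title, :options)

  class << self
    # Evaluates the definition block against a fresh subclass.
    def build(&definition)
      klass = Class.new(self)
      klass.class_eval(&definition)
      klass
    end

    def fields
      @fields ||= []
    end

    def textfield(name, title:)
      fields << Field.new(name, :text, title, nil)
    end

    def select(name, title:, options:)
      fields << Field.new(name, :select, title, options)
    end
  end

  def initialize(params)
    @params = params
  end

  # Only declared fields survive; select values must be one of the options.
  def values
    self.class.fields.each_with_object({}) do |field, out|
      next unless @params.key?(field.name)
      raw = @params[field.name].to_s
      next if field.kind == :select && !field.options.include?(raw)
      out[field.name] = raw
    end
  end
end

SKETCH_PRIORITIES = %w[low normal high].freeze

IssueSketchForm = SketchForm.build do
  textfield :title,    title: 'Title'
  select    :priority, title: 'Priority', options: SKETCH_PRIORITIES
end

form = IssueSketchForm.new(title: 'Crash on save', priority: 'urgent', admin: 'true')
values = form.values
```

Here the undeclared admin key and the invalid priority are both dropped, so only the title reaches the model layer.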

Will reply to your other suggestions later. In short : I definitely want to add #[] for accessing attributes, but not necessarily for setting them. All hexp objects are immutable at the moment, and that's a constraint I am very reluctant to drop.
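
A small sketch of what read-only #[] on an immutable node could look like. FrozenNode is a hypothetical class, not Hexp's; it assumes that "setting" an attribute means returning a new node, as #attr does in the examples above.

```ruby
class FrozenNode
  attr_reader :tag, :attributes

  def initialize(tag, attributes = {})
    @tag = tag
    @attributes = attributes.dup.freeze
    freeze
  end

  # Read access via #[]; there is deliberately no #[]=.
  def [](name)
    attributes[name.to_s]
  end

  # "Setting" returns a new node and leaves the receiver untouched.
  def attr(name, value)
    FrozenNode.new(tag, attributes.merge(name.to_s => value))
  end
end

original = FrozenNode.new(:p, 'class' => 'intro')
updated  = original.attr(:class, 'para')
```

With this shape, node[:class] reads naturally, while node[:class] = 'foo' fails with NoMethodError instead of silently mutating shared structure.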

@plexus
Owner Author

plexus commented Jul 9, 2013

Adding links here for reference, this would be great to implement with hexp

http://begriffs.github.io/showpiece
https://github.com/begriffs/showpiece

@plexus plexus mentioned this pull request Jul 22, 2013
@plexus plexus closed this Jul 27, 2013
@plexus
Owner Author

plexus commented Jul 29, 2013

Basic CSS selectors are now implemented, and the rewriter/selector stuff has been refactored, covered by tests and is now on master.
