Experiment with chainable selectors and rewriters #3
This commit introduces a first attempt at a chainable interface for describing transformations of Hexp trees.

The idea is to have methods that select certain nodes in a tree. At this point there is one such method, select, which takes a block that receives a node and returns a boolean.

    node.select {|child_node| child_node.class? 'strong' }

The result is a Hexp::Node::Selector. It is Enumerable, so one can call #to_a, #map, etc. to loop over the nodes in the tree that have class="strong".

This can be combined with the concept of "rewriting" a tree. Here the block receives a node, and the return value of the block takes the place of the original node in the new, rewritten tree. This return value can be a single node or an array of nodes (zero or more). Nil is treated specially in that it is considered a no-op: when the block returns nil, the original node is simply kept in the tree unaltered.

This example will give every <p> the class "para".

    node.rewrite do |child_node|
      if child_node.tag == :p
        child_node.attr('class', 'para')
      end
    end

Nodes and Selectors both respond to #rewrite. In the latter case only the nodes that match the selection criterion are passed to the block, so the former example can be rewritten as

    node.select {|child| child.tag == :p}
        .rewrite {|child| child.attr('class', 'para') }

The result is an instance of Hexp::Node::Rewriter, which lazily evaluates the transformation when converted to a Hexp. It includes Hexp::DSL, so it can be used as any Hexp::Node (tag, attributes, to_html, etc).

These two can form the basic building blocks for a richer API. As a tryout they already implement #wrap and #attr, so to take every paragraph, give it the class 'para', and wrap it in a div with class 'wrapper' you can write

    node.select {|child| child.tag == :p}
        .attr('class', 'para')
        .wrap(:div, 'class' => 'wrapper')

The next step would be to leverage SASS's CSS parser to implement a CSS selector. The above would then become

    node.css('p')
        .attr('class', 'para')
        .wrap(:div, 'class' => 'wrapper')

API design is hard, so I'm desperately looking for feedback on this. I will personally buy you a beer (or other beverage of your choosing), at Eurucamp, Arrrrcamp, any time in Berlin or at the first occasion that presents itself, for every constructive comment on this PR.
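To make the return-value rules for rewrite concrete, here is a minimal, self-contained sketch. It does not use Hexp at all: ToyNode is a hypothetical stand-in, and details such as the bottom-up traversal order and the fact that the root node itself is never passed to the block are assumptions of this sketch, not statements about how Hexp::Node behaves.

```ruby
# Toy stand-in for a Hexp node (hypothetical, NOT the real Hexp::Node).
# Children are either ToyNode instances or plain strings (text).
class ToyNode
  attr_reader :tag, :attributes, :children

  def initialize(tag, attributes = {}, children = [])
    @tag = tag
    @attributes = attributes.dup.freeze
    @children = children.dup.freeze
    freeze # instances are immutable, mirroring the constraint on Hexp values
  end

  # Return a copy with one attribute set; the receiver is left untouched.
  def attr(name, value)
    ToyNode.new(tag, attributes.merge(name => value), children)
  end

  # Collect all descendant element nodes for which the block is truthy.
  def select(&block)
    children.grep(ToyNode).flat_map do |child|
      (block.call(child) ? [child] : []) + child.select(&block)
    end
  end

  # Build a new tree. Each descendant element is passed to the block, which
  # returns nil (keep), one node (replace) or an array (splice zero or more).
  def rewrite(&block)
    new_children = children.flat_map do |child|
      next [child] unless child.is_a?(ToyNode) # text passes through
      rewritten = child.rewrite(&block)        # assumption: children first
      result = block.call(rewritten)
      if result.nil? then [rewritten]
      elsif result.is_a?(Array) then result
      else [result]
      end
    end
    ToyNode.new(tag, attributes, new_children)
  end
end

doc = ToyNode.new(:div, {}, [
  ToyNode.new(:p, {}, ["one"]),
  ToyNode.new(:hr, {}, []),
  ToyNode.new(:p, {}, ["two"])
])

# A node replaces the original; nil (the implicit else branch) keeps it.
with_class = doc.rewrite { |node| node.attr('class', 'para') if node.tag == :p }

# An empty array removes the node from the new tree.
without_hr = doc.rewrite { |node| [] if node.tag == :hr }
```

Because every operation returns a new tree, doc still has its three original children after both calls.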
Note that the build will certainly fail because the metrics dropped. That's fine for now, I'll clean it up. |
So at first I was like "Ok what would I need that for?" but the later comments made it clearer. I like the future version, but I tend to like things that look like the jQuery API since I really love the jQuery API. However, jQuery is mainly used for interactive alteration of stuff, while Hexp is built to generate HTML in the first place. I mean, why modify nodes afterwards and not just create them like this in the first place? I can see instances where that would be DRYer, which is cool, but I dunno if it would be enough for people to change the way they write HTML - missing the killer feature part. Maybe I'm just tired. Cheers, |
Hey Tobi! Nice to see you here :) You are right about jQuery, I have a mental note that I should take a better look at it and mimic some of that API. One big difference is that one of the constraints I'm putting on Hexp is that all values are immutable, whereas jQuery has a mutable DOM to play with. But that's mostly an implementation issue, the APIs could still resemble one another. You are also right that typically you would mostly just be generating/composing things the way you want them. But I can think of several use cases where having this orthogonal way of altering existing trees is great. In fact, after preventing XSS, I think this is the killer feature for Hexp. I just need to come up with some better examples. For example :
@til might have some more to say about this, since he's actually working on a very component based system. |
Related to adding functionality in to existing HTML docs: spree/deface Those use cases are indeed better, thanks! edit: scumbag github doesn't make a link :-( so here it is: https://github.com/spree/deface |
Ah yes, great example! And it depends on Nokogiri, so I'm guessing they parse the output, alter the Nokogiri DOM, then generate HTML again. A similar approach is used by https://github.com/Wardrop/Formless to populate forms with values from the request. This going back and forth between parsing and generating is expensive. The first Ruby web framework I contributed to, Nitro, did something similar. Views were processed in several passes, each basically applying XML transformations, so you could do things like this

    <ul>
      <li foreach="@users as user">{{user.name}}</li>
    </ul>

I'm just making up syntax now, don't remember exactly. The thing is that each pass would parse, alter, generate, which was horribly slow. (I'm not saying Hexp is super fast, I haven't benchmarked at all, but altering arrays and hashes should be lighter than parsing HTML. And the immutable data structures should provide a good basis for optimization.) |
Relatedly: Drupal's Form API is also interesting. You build a data structure that represents your form, and Drupal will take care of rendering it. This is at a higher level than HTML. Modules can reference forms from other modules by an id, and then alter them. I'm no longer a big fan of Drupal but this is something they really got right and that I miss in Rails. This form description is used again as a pre-processing step when the request comes in, so only fields in your forms are passed through, and lengths/types are enforced. So you have one source of truth for what you have in three places in Rails: form helpers in views, parameter checking in the controller (e.g. strong_params, or just to_i/to_s), and validations on models. And you get a consistent UI because people don't manually write all their form HTML :) Basically this is a good example of "data structures over strings", where "data structure" is not synonymous with "syntax tree". It's simply a higher level representation of your app's semantics, and something that you can pass around and work with. |
The Drupal form builder example sounds like an interesting use case for Hexp. It is probably often the case that HTML generation is only one feature of dealing with a tree of higher level objects that are central to the application, in that example form elements, or rather model attributes. In the classic CMS use case, one might deal with a tree of page elements. Rendering them to plain HTML is one feature, rendering them to WYSIWYG-editable HTML is another feature, serializing their content another one. The latter potentially has nothing to do with Hexp anymore, but somehow a tree of page elements and a tree of HTML nodes seem very similar things to me, just with different levels of detail. I guess a sufficiently real-world example is necessary to drive the further development of the Hexp API. Maybe you want to experiment with a Drupal-like form builder? Maybe something that automatically renders forms for models that have ROM attribute definitions?

And some random API remarks: I find using jQuery selectors so common that I wouldn't use it as a method name, but rather have

    doc.rewrite('article > h1') { |h1| h1.text = h1.text.upcase }

Also, would it be possible to not look at the return value of the rewrite block, but allow direct node modifications inside the block? I find the special case nil return value cumbersome, and would prefer something like this:

    doc.rewrite('article > h2') { |h2| h2.remove! }

Also, why not use |
Hey Til, thanks for joining the discussion. I also think a more concrete example of one of these use cases will help drive things forward, so I played a bit with implementing a form builder API, see https://gist.github.com/plexus/5892947

The interesting stuff is

    IssueForm = Form.build do
      textfield :title, title: "Title"
      select :priority, title: "Priority",
                        options: Issue::PRIORITIES
    end

In your controller you can now do things like

    IssueForm.new(Issue.new).to_html
    # or
    form = IssueForm.new(params)
    Issue.create(form.values) # filtered and type-cast based on form definition

Will reply to your other suggestions later. In short: I definitely want to add #[] for accessing attributes, but not necessarily for setting them. All Hexp objects are immutable at the moment, and that's a constraint I am very reluctant to drop. |
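For readers who do not want to open the gist, the "filtered and type-cast based on form definition" part can be sketched roughly as follows. This is not the code from the gist: only the names Form.build, textfield, select and #values come from the comment above, while Field, fields, the cast lambdas and the PRIORITIES constant are assumptions made up for this illustration. Rendering (to_html) and wrapping a model instance are left out.

```ruby
# Hypothetical sketch of a form definition used as the single source of
# truth for which request parameters are accepted. NOT the gist's code.
class Form
  Field = Struct.new(:name, :title, :cast)

  class << self
    # Form.build { ... } returns a new Form subclass with its own field list.
    def build(&definition)
      klass = Class.new(self)
      klass.instance_eval(&definition)
      klass
    end

    def fields
      @fields ||= []
    end

    def textfield(name, title:)
      fields << Field.new(name, title, ->(value) { value.to_s })
    end

    # Values that are not among the allowed options are rejected (cast to nil).
    def select(name, title:, options:)
      cast = ->(value) { options.include?(value.to_s) ? value.to_s : nil }
      fields << Field.new(name, title, cast)
    end
  end

  def initialize(input)
    @input = input # assumed to be a Hash-like params object
  end

  # Only declared fields survive; each value goes through the field's cast.
  def values
    self.class.fields.each_with_object({}) do |field, result|
      raw = @input[field.name] || @input[field.name.to_s]
      value = field.cast.call(raw) unless raw.nil?
      result[field.name] = value unless value.nil?
    end
  end
end

PRIORITIES = %w[low normal high].freeze # stand-in for Issue::PRIORITIES

IssueForm = Form.build do
  textfield :title, title: "Title"
  select :priority, title: "Priority", options: PRIORITIES
end

params = { "title" => "Crash on save", "priority" => "high", "admin" => "1" }
IssueForm.new(params).values
# => { title: "Crash on save", priority: "high" }
```

The undeclared "admin" parameter never reaches the model, which is the one-source-of-truth idea from the Drupal comparison earlier in the thread.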
Adding links here for reference, this would be great to implement with hexp http://begriffs.github.io/showpiece |
Basic CSS selectors are now implemented, and the rewriter/selector stuff has been refactored, covered by tests and is now on master. |