Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Replace HTML parser with a new parser that recognizes attribute types #90

Closed
patrick-steele-idem opened this Issue Jun 17, 2015 · 55 comments

Comments

Projects
None yet
@patrick-steele-idem
Copy link
Contributor

patrick-steele-idem commented Jun 17, 2015

Currently, Marko is using an HTML parser. An HTML parser treats every attribute value as a String type regardless of how it is written.

For example, with an HTML parser the following lines are all equivalent:

<my-component foo=123/>
<my-component foo="123"/>
<my-component foo='123'/>

As a result of this limitation, Marko relies on custom tag schemas to associate a type with an attribute value or the developer must use the ${<javascript-expression} syntax. Tag schemas are great for documentation but they are a pain to maintain and they make usage of a custom tag ambiguous. Not to mention, the ${<javascript-expression} syntax can only be used with attributes that have a string type (not expression, boolean, etc.).

The author of the <my-component> tag can separately declare the type for an attribute and developer viewing a template wouldn't know the attribute type unless you consulted the tag schema:
It would be great if the parser used by Marko treated all attribute values as JavaScript expressions. Spaces will end an attribute value unless the space is in a quoted string or within a parenthesis/square brackets/curly braces block. Finally, we would support ${<javascript-expression>} usage within ES6 Template strings (i.e. strings "quoted" using backticks). The following illustrates what is being proposed.

<my-component number=1/>
<my-component variable=name/>
<my-component complex-expression=1+2/>
<my-component complex-expression-with-spaces=(a + b)/>
<my-component simple-string="hello"/>
<my-component simple-string='hello'/>
<my-component dynamic-string=`hello ${name}`/>
<my-component string-concatenated=("hello "+name)/>
<my-component boolean=true/>
<my-component array=[1, 2, 3]/>
<my-component object={hello: 'world'}/>
<my-component function-call=data.foo()/>
<my-component complex-function-call=(data.foo() + data.bar())/>

Now that we are using a custom parser, we can also improve the syntax for conditionals and looping as @onemrkarthik suggested:

<div if(x > 5)>
   Do something
</div>

<ul>
   <li for(color in colors)>
       ${color}
   </li>
</ul>

<ul>
   <li for(color in colors; status-var=loop)>
       ${loop.getIndex+1}) ${color}
   </li>
</ul>

Here's what for-loops and conditionals would look like as tags:

<if(x > 5)>
    <div>
        Do something
    </div>
</if>

<ul>
    <for(color in colors)>
        <li>
            ${color}
        </li>
    </for>
</ul>

<ul>
    <for(color in colors; status-var=loop)>
        <li>
            ${loop.getIndex+1}) ${color}
        </li>
    </for>
</ul>

Thoughts? Concerns?

@SunnyGurnani

This comment has been minimized.

Copy link

SunnyGurnani commented Jun 17, 2015

How about any JavaScript Object and JavaScript Arrays ?

my-component dynamic-int="${1==1? 25 : 26}"

@viviangledhill

This comment has been minimized.

Copy link

viviangledhill commented Jun 17, 2015

I think it's a good idea to remove the noise of extra files. Making this part of the (marko-)html makes sense.
👍

@SunnyGurnani I assume that should also work as Patrick mentioned support for the ${...} syntax.

@patrick-steele-idem

This comment has been minimized.

Copy link
Contributor Author

patrick-steele-idem commented Jun 17, 2015

Hey @SunnyGurnani, as for JavaScript arrays and objects, we should make sure the following works as well:

<my-component array=[1, 2, 3]/>
<my-component object={hello: 'world'}/>

I'll updated the proposal. Thanks for the feedback

@pswar

This comment has been minimized.

Copy link

pswar commented Jun 17, 2015

If you want to consider attribute value as JS expression, would you also support calling functions?

<my-component count=results.count() />
@patrick-steele-idem

This comment has been minimized.

Copy link
Contributor Author

patrick-steele-idem commented Jun 17, 2015

Hi @pswar, good question. Yes, any valid JavaScript expression is allowed on the right-hand side and will be copied to the compiled JavaScript template verbatim. I'll update the proposal to make this clear. Thanks for the feedback

@philidem

This comment has been minimized.

Copy link
Member

philidem commented Jun 18, 2015

Do you think that for parsing simplicity we should require expressions to be wrapped with parentheses? If you can get it work without parentheses that would be great but I could see how that might be tricky.

For example, consider:

<my-component object=someFunc (1) />

That space after someFunc might look like a new attribute to the parser.

This would be safer in all cases:

<my-component object=(someFunc (1)) />
@patrick-steele-idem

This comment has been minimized.

Copy link
Contributor Author

patrick-steele-idem commented Jun 18, 2015

Hey @philidem, I see where you are coming from, but requiring parenthesis would actually make things more confusing I think. It would be a bad idea to require that variable names be wrapped in parenthesis like the following:

<my-component variable=(foo) />

The following is better:

<my-component variable=foo />

In this case, foo just happens to be a simple variable expression (not a function call).

I think using a space (outside of a code block) is the best way to delimit attributes.

@vedam

This comment has been minimized.

Copy link

vedam commented Jun 19, 2015

Hi,
maybe it's worth considering the upcoming ES6 Template-Strings.
https://babeljs.io/docs/learn-es2015/#template-strings

If the proposal is a thought about getting the parser more towards js, the following:

<my-component dynamic-string='hello ${name}'/>

would lead towards:

<my-component template-string=`hello ${name}`/>

(yes, the backticks) if I got it right.
Not to talk about multiline strings.
And we'll be able to omit string-concatenation

<my-component string-concatenated=("hello "+name)/>

or am I completely wrong
greets Achim

@patrick-steele-idem

This comment has been minimized.

Copy link
Contributor Author

patrick-steele-idem commented Jun 19, 2015

Hi @vedam, you bring up a good question regarding the use of backticks for template strings. Adopting backticks would make things more like JavaScript, but my initial reaction is that why not allow number quoted strings to be treated as template strings? We are processing all of these strings at compile-time to find the dynamic parts so there is no performance hit at runtime. Also worth mentioning is that we already support dynamic parts in normal quoted strings so switching to backticks would make things potentially problematic for developers migrating to the new version (migration is also a very important topic that needs to be discussed since this proposal would definitely be a breaking change). The only negative about not using backticks is that developers would need to escape the $ character inside quoted strings if it is not intended for a dynamic part. I'm leaning towards not supporting backticks, but I would like to get more thoughts from others since I think it is definitely worth discussing some more.

@vedam

This comment has been minimized.

Copy link

vedam commented Jun 19, 2015

@patrick-steele-idem, I understand the migration-argument completely. For me as a total marko-noob, my thoughts were totally js-driven. It's the language I'm most familiar with, loving the new possibilities
and features of the upcoming es6. And with something like these...

<my-component string-concatenated=("hello "+name)/>
<my-component complex-function-call=(data.foo() + data.bar())/>

...I have to leave my way of js-thinking towards some marko-specials. If everything after the "=" is familiar js I don't have to think about marko. It just works.
Maybe this kind of thinking is too naive.

@patrick-steele-idem

This comment has been minimized.

Copy link
Contributor Author

patrick-steele-idem commented Jun 19, 2015

Hey @vedam, I'm actually okay with supporting backticks, but I would want to transpile the code to its ES5 equivalent so that it works in older browsers that do not support that ES6 feature.

For example,

`foo ${name}`

Would be transpiled to the following:

'foo ' + name

I think the end result would be the same. I agree with your sentiment that the right-hand should just be JavaScript...this makes Marko more intuitive and more powerful.

If we agree to support backticks, then the question becomes: should we also support dynamic parts in normal quoted strings or should dynamic parts only be allowed in ES6 template strings?

@philidem

This comment has been minimized.

Copy link
Member

philidem commented Jun 19, 2015

👍 for supporting string templates with backticks

I think usage of backticks is a good idea and I think the Marko compiler could easily convert the string template to ES5 JavaScript.

Let's make this happen!

@philidem

This comment has been minimized.

Copy link
Member

philidem commented Jun 19, 2015

To @patrick-steele-idem's question, I would suggest only support variable replacement in string templates which are delimited with backticks. That is, don't do variable replacement within strings delimited with " and '.

Since migration tool seems necessary, I think we should convert any existing string attribute values to ES6 templates (using backticks).

@SunnyGurnani

This comment has been minimized.

Copy link

SunnyGurnani commented Jun 21, 2015

My Suggestion would be to allow all javascript expression in ${} only to avoid any confusion.

With proposed method it would be very confusing for anyone who is new to Marko to figure out which is plain text and which is javascript expression.

@patrick-steele-idem

This comment has been minimized.

Copy link
Contributor Author

patrick-steele-idem commented Jun 22, 2015

Hi @SunnyGurnani, what is being proposed is that the right-hand side is always a JavaScript expression. If it is valid JavaScript expression then it can go on the right hand side and it will actually be copied into the compiled JS verbatim. Even if right-hand value is a simple string it would still be a valid JavaScript expression. I think this will make things the least confusing.

The only special case is for attributes that are interpreted at compile-time (e.g. for="color in ['red', 'green', 'blue'])

Seem reasonable?

@pswar

This comment has been minimized.

Copy link

pswar commented Jun 22, 2015

As i think more about it, i think what @SunnyGurnani proposed makes more sense and less confusing.

Few things to consider:

  1. 'hello ${name}' is not a valid JS expression
  2. If you want to actually have a string "[1, 2, 3]" as attribute value in mystring=[1, 2, 3] then it would have to be encoded which would make it look odd, otherwise it becomes array value.
@patrick-steele-idem

This comment has been minimized.

Copy link
Contributor Author

patrick-steele-idem commented Jun 22, 2015

Thanks for the comments @pswar. Some related thoughts:

  1. I think we are saying the same thing. I'm in favor of only allowing ${...} in strings that use back ticks as "quotes" (not in strings quoted using double or single quotes). For example:
<my-component my-dynamic-string=`Hello ${data.name}!`/>

On top of that, we would transpile ES6 template strings to an ES5 equivalent.

  1. If you want to have a string of "[1, 2, 3]" then you would just do: mystring="[1, 2, 3]". If you want an Array of [1, 2, 3] then you would just omit the quotes: myarray=[1, 2, 3]
@pswar

This comment has been minimized.

Copy link

pswar commented Jun 22, 2015

👍

@patrick-steele-idem

This comment has been minimized.

Copy link
Contributor Author

patrick-steele-idem commented Jun 22, 2015

NOTE: I have updated the proposal to reflect that ${...} should only be allowed in ES6 template strings.

@SunnyGurnani

This comment has been minimized.

Copy link

SunnyGurnani commented Jun 22, 2015

Thanks @pswar and @patrick-steele-idem

JavaScript has given us the freedom from Classes and Types with current proposal we are taking away all that beauty and simplicity. I personally would prefer simplicity ${} for all expressions and let library handle the rest.

Current proposal suggest different implementation based on individual Types as following. However in JavaScript we never define type in the parameter or declare variable with type.

For dynamic string use ES6 strings
component attribute=`hello ${name}'

Single function don't use anything
component attribute=data.foo()

For Multiple Functions use ()
component attribute=(data.foo() + data.bar())

For Object use {}
component attribute={hello: 'world'}

For Array use []
component attribute=[1,2,3]

@pswar I don't see any problem in following expression. Hello is static and ${name} is JavaScript expression. Please correct me if I am wrong?
component attribute="hello ${name}"

@DanCech

This comment has been minimized.

Copy link

DanCech commented Jun 22, 2015

I was following up until the proposal that ${...} only be allowed within backtick-quoted attribute values, which would be a significant BC break. I understand the desire to simplify, but that seems a step too far to me vs retaining the existing behavior where quoted attribute values are scanned for ${...} replacements.

@patrick-steele-idem

This comment has been minimized.

Copy link
Contributor Author

patrick-steele-idem commented Jun 22, 2015

Hi @DanCech, this is still a developing proposal so nothing is set in stone. My thinking is that this is definitely going to be a breaking/major change and old templates will need to be migrated to work correctly with the new Marko version. We would want to make the migration as easy as possible so at a minimum we would need to provide a tool that would automatically update existing Marko templates to conform to the new parser by doing things such as the following:

Old:

<my-component foo="Hello ${data.name}"/>

New:

<my-component foo=`Hello ${data.name}`/>

As another example, if a custom "foo" attribute is declared as an "expression" type then the following conversion would happen:

Old:

<my-component foo="data.name"/>

New:

<my-component foo=data.name/>

Beyond that, if you are concerned that it would be hard to unlearn the old handling of ${...} in attribute values or that you still prefer all ${...} to be processed in all strings then that would be a good argument for going back to the old behavior and allowing ${...} in all quoted strings. Honestly, I could go either way since I see the value in both choices.

@patrick-steele-idem

This comment has been minimized.

Copy link
Contributor Author

patrick-steele-idem commented Jun 22, 2015

Hi @SunnyGurnani, I think there might be a misunderstanding because with the updated proposal we want to go closer to the simplicity of pure JavaScript. Instead of tag schemas, we are saying that an attribute value is simply a pure JavaScript expression and nothing more and nothing less. Marko has always been close to JavaScript by design but with the latest proposal we want to take it even closer. That is, Marko aims to be a perfect blend of JavaScript and HTML :)

@patrick-steele-idem

This comment has been minimized.

Copy link
Contributor Author

patrick-steele-idem commented Jun 22, 2015

Also, @SunnyGurnani, the parenthesis are only needed in situations where a "complex" expression has spaces that would have otherwise ended the attribute value.

@DanCech

This comment has been minimized.

Copy link

DanCech commented Jun 22, 2015

@patrick-steele-idem I understand where you're coming from, and in the end either way will work. I guess the way I was looking at the interpolation within quoted attributes question was that it's simple to think of replacement working the same way there as it does within CDATA sections, since they're both "text" parts of HTML.

One question this does bring up is how the different quoting styles in the template would affect the quoting in the generated HTML. For example, does this:

<my-component foo=`Hello ${data.name}`/>

compile to:

<my-component foo="Hello Dan" />

or:

<my-component foo='Hello Dan' />

or something else?

@patrick-steele-idem

This comment has been minimized.

Copy link
Contributor Author

patrick-steele-idem commented Jun 22, 2015

The idea is that the foo attribute in the following Marko template:

<my-component foo=`Hello ${data.name}`/>

Would compile to the following property definition in JavaScript code:

foo: "Hello " + data.name

NOTE: there would no signs of the backticks in the final compiled output because we want the code to work in ES5 and not just ES6

We could absolutely interpolate ${...} in normal quoted JavaScript strings, but then we are slightly changing the rules of JavaScript which may be surprising to some users if we say "the right-hand side of an attribute is just JavaScript".

@SunnyGurnani

This comment has been minimized.

Copy link

SunnyGurnani commented Jun 22, 2015

I got your point @patrick-steele-idem

I think then it would help to remove/depricate ${} from the language.
is there a need to use ES6 backticks ? Cant we just use following ?
my-component string-concatenated=("hello "+name)

This would avoid confusion and make Marko much easier to learn for people new to language.

@patrick-steele-idem

This comment has been minimized.

Copy link
Contributor Author

patrick-steele-idem commented Jun 22, 2015

Hi @SunnyGurnani, technically we don't need to support ${...} but just just makes the code easier to write and read compared to using string concatenation.

It did occur to me that we also currently support $<var-name> (without the curly braces) which is technically not part of the ES6 template strings spec. We could require ${<var-name>} but that seems like it would require extra code for end users unnecessarily. That is something we should also discuss. Thanks for the feedback so far!

@onemrkarthik

This comment has been minimized.

Copy link

onemrkarthik commented Jul 30, 2015

For conditional operators, how about something like this:

<div if(x > 5)>
   Do something
</div>

instead of

<div if=(x > 5)>
   Do something
</div>

For iterators,

<ul>
   <li for(color in colors)>
       ${color}
   </li>
</ul>

instead of

<ul>
    <li for="color in colors">
        ${color}
    </li>
</ul>
@patrick-steele-idem

This comment has been minimized.

Copy link
Contributor Author

patrick-steele-idem commented Jul 31, 2015

Great suggestion, @onemrkarthik. I like that the for(color in colors) and if(x > 5) directives match up very nicely to pure JavaScript. I'm going to update the original proposal to include your suggestions. Thanks!

@onemrkarthik

This comment has been minimized.

Copy link

onemrkarthik commented Jul 31, 2015

@patrick-steele-idem
I think the suggestion needs to be fixed to

<div if(x > 5)>
   Do something
</div>

I see an '=' after the if, in the suggestion.

@patrick-steele-idem

This comment has been minimized.

Copy link
Contributor Author

patrick-steele-idem commented Jul 31, 2015

Fixes. Thanks, @onemrkarthik

@kristianmandrup

This comment has been minimized.

Copy link
Contributor

kristianmandrup commented Aug 13, 2015

My gut reaction to this proposal was: "OH NO, NOT ANGULAR 2 SYNTAX!!!"

But reading a bit further on, I fully understand the reasoning. More power, less limits!!!
However I don't think it should be "all or nothin", breaking existing templates etc.
My proposal:
Allow the HTML "flavor" described above, but also allow/enable it while keeping in full HTML attribute compatibility mode, ie x="y".

<div x=(5-2*sqr(2))/>

AND

<div x.e="(5-2*sqr(2))"/>

Where the .e means that the internals should be evaluated as pure javascript and returned as:

<div x=(5-2*sqrt(2))/>

Then evaluated as javascript to finally render <div x="12"/>

It think this combines the best of both worlds (and would allow me to keep using jade to marko compilation!). The same principle would apply for string interpolation etc., ie. simply having .e attribute name postfix means a first "compilation" phase, basically just stripping the "s and understanding the intention as javascript evaluation :)

@patrick-steele-idem

This comment has been minimized.

Copy link
Contributor Author

patrick-steele-idem commented Aug 13, 2015

Great feedback @kristianmandrup. What are your thoughts on the fragmentation that would result from supporting two different syntaxes?

@philidem

This comment has been minimized.

Copy link
Member

philidem commented Sep 11, 2015

FYI, I started working on a new parser that will better understand JavaScript expressions in HTML attributes (e.g. <custom message="Hello ${name}!" count=100 large=true complex=(a + b)>). The key difference between the existing HTML parsers is that attributes will not need to be surrounded by quotes (quotes should only be used if the attribute value will be of type string).

See https://github.com/philidem/htmljs-parser

The base HTML language is handled just fine but I am still working on:

  • Placeholders within content: <div>${xyz}</div>
  • Placeholders within attribute string expressions: <div class="${xyz}">
  • Placeholders within placeholders: <div class="${"abc ${def}" + 123}"> (hopefully wouldn't be used very often)
  • Better flow control syntax: <for(a in b)>, <div for(a in b)>, and <div if(a === b)>

There's also going to be additional work inside Marko to switch to new parser and there will be some breaking changes that need to be addressed with a template migration tool perhaps.

If anyone has some additional thoughts about the Marko syntax and improvements that could be made, then please let me know! Thoughts and feedback on the parser are also welcome.

@philidem philidem self-assigned this Sep 11, 2015

@patrick-steele-idem

This comment has been minimized.

Copy link
Contributor Author

patrick-steele-idem commented Sep 11, 2015

Nice work, @philidem! Thank you for the hard work to make this happen. This will be a huge improvement to Marko and it will be nice to be able to do away the marko-tag.json files in most cases.

We'll definitely have to think about transforming old templates. The big thing is that if an attribute is declared as an expression then we will need to transform the template to drop the attribute quotes. For example:

Old:

<my-custom-tag data="data.someFunction()">

New:

<my-custom-tag data=(data.someFunction())>

We'll also need to transform if, for, else-if and else(?) attributes to use the new directive syntax.

I'm sure there will be some other transformations as well, but it is definitely doable.

Thanks again.

@kristianmandrup

This comment has been minimized.

Copy link
Contributor

kristianmandrup commented Sep 11, 2015

Looks very cool :) Good job!!!

@kristianmandrup

This comment has been minimized.

Copy link
Contributor

kristianmandrup commented Sep 17, 2015

Hi @philidem,

When can we expect to begin experimenting with this or contribute?
How do we hook the parser into markojs to replace the default Parser?
Please add usage/install instructions. Cheers :)

@philidem

This comment has been minimized.

Copy link
Member

philidem commented Sep 17, 2015

Still actively working on the parser and I'll push some more changes later today. There's still a lot of heavy work to do in Marko to leverage the parser and I think we're leaning toward making breaking changes while also providing a migration tool.

So nothing in the immediate future will be available but still working hard to get this done! I'll probably have to get feedback from @patrick-steele-idem when it is time make changes to Marko. Perhaps we can document the plan to see if we can help from the community as well.

@patrick-steele-idem

This comment has been minimized.

Copy link
Contributor Author

patrick-steele-idem commented Sep 17, 2015

We could definitely start on the migration tool right now so that it would be ready and well-tested in time for the switch to the new parser. That's an area where we could definitely use some more help from the community. I'll open another Github issue for that task.

@patrick-steele-idem

This comment has been minimized.

Copy link
Contributor Author

patrick-steele-idem commented Oct 15, 2015

Thoughts on the following?:

<var foo=1 hello='World'/>

Output JavaScript:

var foo=1;
var hello='World';

Old style:

<var name="foo" value="1"/>
<var name="hello" value="'World'"/>
@philidem

This comment has been minimized.

Copy link
Member

philidem commented Oct 15, 2015

Yes, like <var foo=1 hello='World'/>!

Also, I would like to propose this:

Declare function:

<function doSomething(a, b, c)>
</function>

Call function:

<doSomething(a, b, c) />
@patrick-steele-idem

This comment has been minimized.

Copy link
Contributor Author

patrick-steele-idem commented Oct 15, 2015

Also, we currently support simple conditionals in Marko:

<div class="{?data.isActive;active}">

How should we support that with the new parser?

This should of course work in the new parser, but it is a little uglier:

<div class=data.isActive?'active':''>
@yomed

This comment has been minimized.

Copy link
Contributor

yomed commented Oct 15, 2015

re: multiple variables, I don't know. I think most javascript styles these days prefer writing out one variable per line anyway.

@patrick-steele-idem

This comment has been minimized.

Copy link
Contributor Author

patrick-steele-idem commented Oct 15, 2015

@yomed one variable per line is a coding style. Doesn't really matter to the compiler and it could go either way.

@yomed

This comment has been minimized.

Copy link
Contributor

yomed commented Oct 15, 2015

@patrick-steele-idem Sure, and it isn't necessarily bad to support multiple styles, it's just additional things to maintain, document, etc.

re: conditionals - I actually think that looks better. It is uglier, but it's standard js.

@patrick-steele-idem

This comment has been minimized.

Copy link
Contributor Author

patrick-steele-idem commented Oct 15, 2015

Thanks for the feedback @yomed. I'm leaning towards standard JS as well.

@mlrawlings

This comment has been minimized.

Copy link
Member

mlrawlings commented Feb 4, 2016

Regarding conditionals, depending on how a falsy value is handled for the class attribute, the following would work... and it looks quite nice.

<div class=(data.isActive && 'active')>
@patrick-steele-idem

This comment has been minimized.

Copy link
Contributor Author

patrick-steele-idem commented Feb 4, 2016

I agree @mlrawlings. That's definitely allowed in Marko v3 and would work as expected. Thanks for pointing that out. I'm definitely looking forward to the new Marko v3 syntax :)

@patrick-steele-idem patrick-steele-idem added this to the 3.0-alpha milestone Feb 11, 2016

@patrick-steele-idem

This comment has been minimized.

Copy link
Contributor Author

patrick-steele-idem commented Feb 11, 2016

This feature has been implemented.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.