Inline Markup Literals #26
Conversation
I like this in general, but I'm not sure it's possible to implement such parsing in a straightforward way, given Haxe's current parser. The inline markup block should ideally come as a single token from the lexer, but the lexer has no idea what the parser expects or whether it's at the "start of the expression", so when it finds |
An issue is that a tag can be valid Haxe code. https://try.haxe.org/#Ca85f line 6: |
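The same ambiguity exists in most C-family languages; here is a minimal JavaScript analogue (not from the thread) showing that tag-looking source can already be a valid expression:

```javascript
// JavaScript analogue (not from the thread): source that reads like the
// start of a tag is already a valid chained-comparison expression, so a
// lexer alone cannot tell whether '<' opens markup or compares values.
const a = 1, b = 2, c = 0;

// Lexically this contains "<b>", yet it is plain comparison arithmetic:
// (a < b) yields true, then (true > c) coerces true to 1, and 1 > 0 is true.
const looksLikeMarkup = a<b>c;

console.log(looksLikeMarkup); // → true
```

Only the parser knows whether `a<b>c` sits in expression position, which is why the lexer alone cannot emit a markup token here.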
Maybe some other chars could be used instead of |
The proposal states:
Which means |
What about But yeah, we could just use some other characters; the easiest (I guess) would be to introduce a new one like |
The |
@ibilon I think it's the Anyway, one of the easy ways would be to require a special character to make the lexer enter tag-parsing mode, not unlike how it's done for strings. E.g. |
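That "special character puts the lexer into raw mode" idea can be sketched in a few lines (JavaScript toy, with made-up `<#`/`#>` marker characters; this is not how the Haxe lexer is actually written):

```javascript
// Toy lexer sketch: the made-up markers "<#" and "#>" switch the lexer into
// a raw mode, just as a quote character does for strings. Everything between
// the markers becomes one opaque token, so the lexer never has to interpret
// '<', '/' or '>' inside the markup.
function tokenize(src) {
  const tokens = [];
  let i = 0;
  while (i < src.length) {
    if (src.startsWith("<#", i)) {
      const end = src.indexOf("#>", i + 2);
      if (end < 0) throw new Error("unterminated markup literal");
      tokens.push({ kind: "markup", value: src.slice(i + 2, end).trim() });
      i = end + 2;
    } else if (/\s/.test(src[i])) {
      i++; // skip whitespace
    } else {
      tokens.push({ kind: "char", value: src[i++] });
    }
  }
  return tokens;
}

const toks = tokenize('x = <# <div class="a">hi</div> #> ;');
console.log(toks.find(t => t.kind === "markup").value);
// → <div class="a">hi</div>
```

The point of the marker is that the decision is purely lexical: no cooperation with the parser (and no shared mutable state) is needed.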
I'd prefer something like |
@ibilon to build on what the others said: yes,
Matters are conceptually simple (as for implementation, that seems to be a different matter): only when the parser begins parsing a new expression does it allow a tag to occur, and there is no room for ambiguity there, because

@nadako I've tried to get some understanding of how the parser works and found very little that would make me hopeful. The one solution I see is that the parser may be able to set some shared mutable state to tell the lexer to expect a markup literal, but peeking ahead would defeat this hack entirely.

So I see three options:
I think the first approach is a disaster waiting to happen, but you tell me. The third solution is the most pragmatic and I could live with it. No point in obsessing over one character. I'm not really looking forward to having to explain for the tenth time why the backtick is there, but it's a lot better than nothing and we can still look into getting rid of it in the future. That said, I'm actually intrigued by the second one (I have never seriously considered it until this point). There's the issue of having |
But yes, why not |
Not directly related, but I actually have a (working) prototype which allows an alternative calling syntax for macros via metadata like
which gets translated to
or
which gets translated to something like |
@nadako's remarks are quite valid; you need to detail how this integrates with the existing Haxe lexer. Appending @ does not seem very elegant. Actually we can detect an XML start in parse_expr by matching

For the record, I found that both Scala and VB have XML literals. Also, it might be misleading to allow what looks like "XML literals" without actually enforcing XML (or HTML) syntax. I can see how one would complain about the compiler not giving an error when writing the following:
So maybe using XML tags with this is a bad idea after all.
I don't have the slightest idea how the lexer and parser work. I can whip up an implementation for hscript if that's of any help ^^
Agreed.
If there is any such thing as a "raw mode", then that's the simplest solution. Still, I have to stress that an opening tag is nothing but
That depends on what "problem for IDEs" means. Getting completion to work is quite possible, as the little gif shows. Syntax highlighting is more of a challenge, as already mentioned at the end. It depends very strongly on how the IDE approaches the matter. I think an approach to highlighting that is as broad as this proposal may have a fair chance of providing a decent result. Then again I'm no expert on the matter. Perhaps @nadako can comment.
And how would that ever happen? According to the proposal, the above will give a compiler error unless a macro processes it. If the macro expects XML syntax, it will produce an error (the most naive one will just produce the exception thrown by
There's a lot of super hairy stuff in actual HTML and XML. So if you then go for half an XML or HTML parser (which one would you choose?), you're going to ingrain incompatibilities with the actual standard into the syntax. The most pragmatic approach is thus to make it the user's problem. If they want all the bells and whistles, they use something like dom4. Or whatever. And when parsing speed becomes a problem (and eval offsets that threshold quite a bit), my understanding is that they can write the parser in OCaml, plug it in, and call it from a macro (right?).
Actually, I'm taking back my previous comment. We could ensure that, when compiled without going through macros, XML literals are checked against XML by the compiler and parsed in a way similar to Haxe smart strings. This way we get both something that doesn't require macros and something that can be used by macros in a smart manner. I still have one particular issue: your proposal leaves attribute syntax undefined, but it means the following would also be invalid syntax:
Maybe we should enforce some XML attribute syntax in order to distinguish an actual node from something else.
I personally prefer not to include the XML parsing part in the compiler, mainly because of the looooooong release cycle. I find myself always in an "I want to use Haxe nightly but I can't use Haxe nightly" dilemma. For example, I want to use Haxe nightly for these fixes:
But I can't because HaxeFoundation/haxe#6321 breaks all my code. Sorry for being a bit off-track here, but my point is: please don't embed functionality in the compiler if there are other alternatives. I think the same reason drove the team to move SPOD out into a macro library.
@kevinresol I would agree in general not to put too many things in the compiler if it can be avoided (SPOD required some compiler-specific support a long time ago - before the macro era - so it can be safely removed now). However, I don't think that having a syntax that requires a macro and leads to a compiler error otherwise is a good idea, especially if there's some behavior that can be expected by the end user. It seems obvious to me that the following code should compile if it's considered valid syntax:
@ncannasse that's one of my ideas (which I haven't shared with anybody yet): to use interpolation syntax as a straightforward solution for embedded DSLs (yes, I did notice this proposal is about "markup literals" ;)). It would require generation of an "interpolation AST", so to speak, that would be available during macro processing. If there were no macros transforming it, every target would simply generate runtime interpolation, exactly as it's done now. This way every "inline DSL" would just be a string by default. An added bonus would be an easy way to find out which actual Haxe variables are referenced inside the interpolation/DSL string. I'm omitting for now the need to find out which macro to run on any given interpolation usage. Do you think it's a viable solution to this problem?
I still don't really know how people expect this to actually be implemented without implementing a full XML parser. Counting opening and closing XML tags is all nice and fun until you have

I'm not saying that I'm strictly against something like this, but I'm still not sure if XML is the solution.
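To illustrate why, here is a naive depth-counting scanner (a JavaScript sketch, not an actual proposal) that an ordinary attribute value fools into closing too early:

```javascript
// Naive balance counter: +1 on "<div", -1 on "</div", stop at depth zero.
// It deliberately ignores attribute strings, comments and CDATA, which is
// exactly why this kind of counting cannot work in general.
function naiveExtract(src) {
  let depth = 0, i = 0;
  while (i < src.length) {
    if (src.startsWith("</div", i)) {
      depth--;
      i += 5;
      if (depth === 0) return src.slice(0, src.indexOf(">", i) + 1);
    } else if (src.startsWith("<div", i)) {
      depth++;
      i += 4;
    } else {
      i++;
    }
  }
  return null;
}

// The attribute value "</div>" makes the counter stop mid-attribute:
const src = '<div title="</div>">text</div>';
console.log(naiveExtract(src));
// → <div title="</div>   (instead of the whole literal)
```

Handling this correctly means understanding quoted attributes, comments, CDATA, and so on - i.e. a real (X)ML parser.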
Something I think should be given some thought is how to deal with code comments. If the rule is

```haxe
var x = <div>
// </div>; // Am I closing tag, content or comment?
</div>;
```
As I understand it, the idea is to leave the Haxe domain upon

But yes, literals in general are also a good point: in order to parse this correctly, you'd still have to understand e.g. string literals in order to not close on |
IMHO, I don't like mixing markup languages into any language like Haxe. But I think it could be a cool feature if implemented in an isolated context, like a block plus a built-in macro, look:

```haxe
function foo()
{
    var myXml = @:markup(XML) {
        <node foo="bar"/>
    }
    var myYaml = @:markup(YAML) {
        foo:
          - "bar"
          - "far"
    }
}
```

This method provides modularity and extensibility. I'll try to explain my strange idea :) For this method we need one non-breaking compiler modification - a new type of
What is that? And I can say with confidence that any markup in the future could be implemented on top of it. An abstract example:

```haxe
class Main
{
    static function main()
    {
        var myValue = 42;
        var myXml = @:markup(XML) {
            <?xml version="1.0" encoding="utf-8"?>
            <x>
                <ml value=$myValue/>
            </x>
        }
        $type(myXml); // Xml

        var myYaml = @:markup(YAML) {
            foo:
              - "bar"
              - ${myValue}
        }
    }
}

// very stupid simple example of a custom builder
class CustomMarkupBuilder
{
    public static macro function init()
    {
        // similar to Compiler.addGlobalMetadata, and something like
        // Context.onAfterTyping but "before"
        Compiler.setCustomMarkupGenerator(":markup", "XML", buildXml);
        Compiler.setCustomMarkupGenerator(":markup", "YAML", buildYaml);
    }

    public static macro function buildXml(srcExpr:UnsafeBlockExpr):Expr
    {
        var src = srcExpr.toString();
        var xml = Xml.parse(src);
        return macro Xml.parse(${srcExpr});
    }

    public static macro function buildYaml(srcExpr:UnsafeBlockExpr):Expr
        return macro null;
}
```

and call it with:

```
--macro "CustomMarkupBuilder.init()"
```

This is only my hopes and dreams. But I'm sure it would be very cool!
Any news here?
This type of markup from the Nemerle programming language (http://nemerle.org/About) looks very clean:

```
def title = "Programming language authors";
def authors = ["Anders Hejlsberg", "Simon Peyton-Jones"];

// 'xml' - macro from the Nemerle.Xml.Macro library which allows inlining
// XML literals into Nemerle code
def html = xml <#
  <html>
    <head>
      <title>$title</title>
    </head>
    <body>
      <ul $when(authors.Any())>
        <li $foreach(author in authors)>$author</li>
      </ul>
    </body>
  </html>
#>
Trace.Assert(html.GetType().Equals(typeof(XElement)));
WriteLine(html.GetType());
```

I rather like a few aspects of this. First, the operators '<#' and '#>' as markup(?) delimiters are very readable. Second, formatting the XML as a "here" doc (a la bash) makes for very clean code. Third, the keyword (rather than a @macro(something) { stuff here }) is shorter, easier to type, and makes the context very clear.

So, more in a Haxe parlance, I would see it like this:

```haxe
import Sys;

class Test {
    static function main() {
        var title = "Programming language authors";
        var authors:Array<String> = ["Nicolas", "Simon", "<it>et al</it>"];
        var xml:String = @lang(xml <#
            <html>
              <head>
                <title>$title</title>
              </head>
              <body>
                ${if (authors.size()) @lang(xml <#
                  <ul>
                    ${for (a in authors) @lang(xml <# <li>$a</li> #>)}
                  </ul>
                #>)}
              </body>
            </html>
        #>);
        trace(xml);
    }
}
```

It's not quite as clean because of the nice $when and $foreach that Nemerle has, but it's still fairly nice. What @lang() does is:
The big benefit with this approach is that the Haxe compiler doesn't have to understand the embedded language. Can this be done with a macro already? I don't know; it seems like it, except possibly for the token parsing. (It would be a pretty hefty macro to do parsing and validation!)
The way Reason handles JSX is pretty clean and well-defined. Might be a good start, and it would still allow being somewhat extendable through macros, as long as attribute values and children are valid Haxe expressions.
I agree, for the most part. (I like the '<#', '#>' operators -- can't say why.) I was just trying to figure out how to do it without adding a new language keyword for every type of markup that we should ignore (or forcing us to pick just one...). Also, we really don't want to have to add markup parsing and evaluation in the compiler. |
@nadako Are you thinking of compile-time or run-time reentrancy issues? |
The |
@EricBishton Reentrancy is a syntax problem, so it is compile time.
I like this idea, it's bold and focuses on keeping things straightforward and simple. I suggest we consider this as the implementation approach for this proposal and let the core team vote on this, unless we want to wait a couple more months... anyone? :)
I think that's very short-sighted. While today JSX is the hot thing, it might be something else entirely in two years. Language design is about building solid bridges for the future, not single-use wooden ladders.
Yeah, I agree to that in general. It's just that I don't really see how we can implement arbitrary syntaxes without pluggable parser/lexer or ugly and useless (for jsx at least) heredoc syntax. |
@nadako - I don't think you can. To get arbitrary syntaxes, you need to be able to arbitrarily extend the parser; something has to parse it. And there must be a way to delineate it, whether using a heredoc delimiter or a meta. In truth, @fzzr- and I have shown very similar solutions (his UnsafeBlock example also uses string interpolation). From an IDE implementation point of view, having a dedicated operator and being able to treat the inserted language as a string is the simpler implementation. Treating it as an UnsafeBlock is harder, but really only because we can't detect it directly in the lexer (like we can a new operator). At that point, we either have to have meta support in the lexer (which is not good) or all supportable syntaxes have to be Haxe-syntax compatible (which is untenable).
As another thought... In truth, we don't require a meta at all. We can create the new operator similar to a markdown tag:

```haxe
import Sys;

class Test {
    static function main() {
        var title = "Programming language authors";
        var authors:Array<String> = ["Nicolas", "Simon", "<it>et al</it>"];
        var xml:String = <#xml
            <html>
              <head>
                <title>$title</title>
              </head>
              <body>
                ${if (authors.size()) <#xml
                  <ul>
                    ${for (a in authors) <#xml <li>$a</li> #>}
                  </ul>
                #>}
              </body>
            </html>
        #>;
        trace(xml);
    }
}
```

Then all we've really done is create a new string type that allows/requires an (OCaml) compiler extension to parse and/or verify it. If the plugin isn't available (or there is no name immediately following the opening operator), it's still just a string to Haxe. And, frankly, this is all people are really asking for: an easier way to embed their markup, which often requires its own string delimiters. (I still like '<#' and '#>' better than triple backticks, though.)
@EricBishton The Haxe parser has to parse |
I was under the impression that the new plugin system allows this... no? |
It is true, the Haxe lexer does have to be extended to use Second, the content is not unparsed. It is a Haxe string, subject to normal string interpolation, and therefore must(?) be nested. (And, if we want an option for a non-interpolated block, we can add
BTW, we can come up with all sorts of delimiter combinations. Perhaps:

```haxe
var xml:String = {'xml
    <html>
      <head>
        <title>$title</title>
      </head>
      <body>
        ${if (authors.size()) {'xml
          <ul>
            ${for (a in authors) {'xml <li>$a</li>'} }
          </ul>
        '} }
      </body>
    </html>
'};
```

I kind of get lost looking for the closing

In the end, there are two goals:
To do that, we need to:
|
I am against a plugin system where one has to use OCaml. The whole point of Haxe is to have one language that rules them all, so that would be a weird contradiction. Also, I wonder how many people are willing to learn OCaml to make such a plugin. I think if everything were structured and available in the well-known macro context, that would be more convenient, no?
I agree wholeheartedly. Unfortunately, that's not what is currently implemented and available; we already have the OCaml plugin architecture in the compiler. I think (though I don't know) that it would be harder to implement a parser/lexer/validator in a Haxe macro than it is in OCaml. As I think @ousado said when the plugin architecture was implemented, "All we need now is an OCaml back-end for Haxe." My corollary is, "All we need is a Haxe (or Neko, or HL) binding for OCaml."
@ncannasse IMHO you aren't right on this question. Nowadays, using JSX and React-like libraries is the fastest, most convenient, and most advanced way to develop web applications (and some types of mobile ones). I think it will continue to evolve, and there's no reason it will be forgotten. So we need more advanced support for JSX and other alien syntax in Haxe (or a mechanism that can provide it) than just a macro function with a string param. This could help popularize Haxe among potential JS-target users. It's a really big audience.
One cool use case that is not React, would be to see haxeui and other ui libs use it :)
Apart from adoption and time, I don't know how we could get a good measure of whether something is just a As for |
What about using a syntax similar to ES6 template literals? Template literals are ES6's answer to DSLs:

```haxe
macro function jsx(string, expressions, ...) {
    // parse jsx, return react object expressions
}

function render() {
    var message = 'hello world';
    return jsx`
        <div>$message</div>
    `;
}

function renderSomethingMoreComplex() {
    return jsx`
        <section>${ render() }</section>
        <ul>
            ${ [1,2,3].map(n -> jsx`<li>$n</li>`) }
        </ul>
    `;
}
```
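For comparison, here is how a tagged template actually behaves in ES6 (runnable JavaScript; the `jsx` tag below is a hypothetical stand-in that just reassembles the string rather than building React elements):

```javascript
// A tag function receives the literal's string parts and the interpolated
// values separately, so a real `jsx` tag could parse the markup once and
// splice the values in as child nodes instead of stringifying them.
function jsx(strings, ...values) {
  return strings.reduce(
    (out, part, i) => out + part + (i < values.length ? String(values[i]) : ""),
    ""
  ).trim();
}

const message = "hello world";
const rendered = jsx`<div>${message}</div>`;
console.log(rendered); // → <div>hello world</div>
```

The key property for DSL embedding is that the tag sees the raw string pieces and the Haxe-side equivalent would see the interpolated expressions as typed AST nodes.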
Which part about this proposal locks us into a single DSL?
I'm inclined to doubt that ES6 developers who're looking for familiarity will go for Haxe when they can use FlowType or TypeScript, especially since both of them have JSX support. I don't mind template literals. It's a neat feature, although I'm not sure of the improvement here, beyond having a third type of quote that means you can use single and double quotes in the string. You can already do:

```haxe
function render() {
    var message = 'hello world';
    return @jsx'
        <div>$message</div>
    ';
}
```
@kevinresol Thanks. I've just noticed and removed the question) |
I wonder if the following will parse or not according to the proposal:
|
I actually wanted to brood about this a little while longer, but @mrcdk's initiative kinda forced my hand to present an alternative now.
Rendered version here.