Final changes to LESv3 #52
I'm changing a bunch of things, mainly in order to finalize LESv3. However, the changes to operator precedence/classification affect both LESv2 and LESv3, partly because both languages share precedence-choosing code but also for the sake of consistency. These changes are not committed as of today.
!! to suffix operator
Wow, that sounds like a lot of work! This all seems super reasonable to me. I'm especially happy about you deciding to store
I also like your idea of supporting arbitrary suffixes for numeric literals. How does that interoperate with custom literals? Is
So I'm thinking there should be a new
Which reminds me of another change I forgot to mention - I've changed the type marker
Hmmm. Adding a
And people probably aren't going to check
Um, three things. First, there is supposed to be no semantic difference between
Second, there are multiple type markers that produce, say, a 32-bit integer. You can write
I think it's interesting that suffixes don't always make a difference semantically and that all custom literals with unknown suffixes are parsed as strings. I also seem to recall that you said you would've liked to parse custom literals that are actually integers/floats as what they are instead of strings.
Maybe the root cause of most custom literal–related issues is that custom literals and value types are orthogonal, but they're represented using the same prefix/suffix scheme.
Here's a strawman proposal: drop the
I know it's a completely different approach, but I feel like it solves a lot of issues with custom literals. What do you think?
I'm not going to call this
I probably don't understand your proposal because you refer to a
So let me first demolish my faulty interpretation of your strawman.
[edit: struck out because arguing against a false interpretation of a strawman is such a silly exercise]
Hmm, reading over your "advantages list" it seems that what you really want to propose is a "units" thing, not a "custom literal" thing. Please don't confuse the two, they are totally different concepts that would exist for totally different reasons. The "Idea" regarding units that I wrote down earlier is, as I implied, imperfect. It was an idea about co-opting custom literals to represent units - consider it an interesting hack that one could use if LES does not support units in any other way.
I'm not particularly invested in the
I'm sorry that I didn't explain what I meant in depth. I'll try to sketch my motivations for proposing to use a
First off, right now a prefix/suffix can mean one of two things.
Furthermore, it's the job of some separate component outside of the printer/parser to interpret custom literals. So we probably shouldn't hide the custom literal suffix in a field of a literal node (like
I think a call to
Does that make more sense? Would you like me to elaborate on some specific aspect of this proposal?
They are not always represented by strings.
Type markers are what make custom literals custom. I'm not sure that you understand why I created custom literals and type markers, so let me explain.
The vision is this: there will be X LES parsers written by Y people in Z languages. I would hope that someday Z>20, you know? Now, each programming environment is different, so each one will have a set of literals that it supports... and another set that it does not support. Most will have built-in support for string, float64, float32, int32 and characters (though not necessarily proper 21-bit characters), but not all of them. Many will support int64, BigInt, Symbol and small integers (byte, short, etc.). Some will support regexes, some might even support unums.
The purpose of custom literals is to handle literals in a uniform way across the myriad environments where LES may be used, in a way that is (1) future-proof and (2) allows unknown literals to be round-tripped by lexers that don't understand them. I want to avoid a rigid framework that says "these are the standard literals which everyone should support, and everything else is non-standard". Instead, quite the opposite: the only thing that will be required from an LES parser is the ability to store a literal as a string. Everything else is optional.
And today it occurred to me that lexers should by convention support a plug-in literal parsing system so users can expand (or even clear) the set of supported literal types. Probably the parser collection should even be a separate object from the lexer itself.
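To make the plug-in idea a bit more concrete, here is a minimal Python sketch. It is not code from LES or Loyc: the names `Literal` and `LiteralParsers`, and the type markers `i32` and `unum`, are made up for illustration. It only shows the shape of the idea: a parser collection that lives outside the lexer, can be extended or cleared by the user, and hands back the original text whenever a type marker has no registered parser.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Literal:
    """Hypothetical literal node: a value plus the type marker it was written with."""
    value: Any        # parsed value, or the original text if no parser was registered
    type_marker: str  # kept so a printer could write the literal back out


class LiteralParsers:
    """Hypothetical parser collection, deliberately separate from any lexer."""

    def __init__(self) -> None:
        self._parsers: Dict[str, Callable[[str], Any]] = {}

    def register(self, type_marker: str, parser: Callable[[str], Any]) -> None:
        # Expand the set of supported literal types.
        self._parsers[type_marker] = parser

    def clear(self) -> None:
        # Remove every parser, so all literals are kept as strings.
        self._parsers.clear()

    def parse(self, text: str, type_marker: str) -> Literal:
        parser = self._parsers.get(type_marker)
        if parser is None:
            # Unknown type marker: store the text unchanged.
            return Literal(text, type_marker)
        return Literal(parser(text), type_marker)


parsers = LiteralParsers()
parsers.register("i32", int)          # "i32" is an illustrative marker name

print(parsers.parse("123", "i32"))    # Literal(value=123, type_marker='i32')
print(parsers.parse("123", "unum"))   # Literal(value='123', type_marker='unum')

parsers.clear()
print(parsers.parse("123", "i32"))    # Literal(value='123', type_marker='i32')
```

Because the collection is its own object, a lexer would only need a reference to it, and two lexers in the same program could share one collection or use different ones.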
If a literal type is unknown, then it should be stored as a string. It should not be stored as a number even if it is written as a number, partly because in general it's infeasible. For example, consider this number:
Some environments may be able to store this in a numeric form, but most can't. So rather than attempt to parse it, I believe it's better to keep it as a string. And remember, if you want to write a negative literal, it must be written as a string:
So the use of string versus number notation is not a strong signal about whether the user wants it to be parsed as a string or as a number.
In fact, some users will explicitly want to avoid parsing literals because a parsed literal contains less information than the original literal, e.g.
If your goal is to load an LES file and, say, find and replace all instances of
Now, in the design of LES we must consider not only the conversion of LES to Loyc trees, but the reverse conversion also. So here are a couple of things to consider:
So, given what I had in mind, I don't understand why you are proposing a system that involves two suffixes like
Oh, I see. I was under the impression that by "custom literals" you meant something like C++'s user-defined literals. Thanks for clearing that up.
A plug-in literal parsing system for the lexer sounds promising. I'm not sure if
added a commit
Sep 27, 2017
I'm tweaking the precedence of
Btw I once proposed