-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NaN and infinity float literals #16
Conversation
I've been busy... sorry about the delay. You've made an excessive number of whitespace changes, which is very distracting. Um... wish you hadn't done that. There is already a function |
Yeah, detecting Negative NaNs sure do exist, and I may have overlooked them when adding NaN literals to the lexer. But even with negative NaN literals, there would still be a whole bunch of NaNs that we can't encode right now. But according to the Wikipedia article you mentioned, any 32-bit float whose bits look like I'm really sorry about the whitespace. It kind of happened by accident. I suppose I could just kill the current named float literal implementation by reverting all changes to Is that something you'd like me to do? Also, what about NaN? Adding nothing more than a negative NaN literal just feels like putting a Band-Aid on a gaping wound. |
Well, you could do it yourself or wait for me, but I probably won't do it soon. I am ill right now and my wedding is on Saturday. |
I'm okay with not being able to represent most nans, for now, since needing more than one is quite unusual. Supposing there were a special literal to represent all the nans bitwise, it still makes sense to have a default NaN with a straightforward name like I have no opinion about whether |
All right. I rolled back my previous named float literal implementation, and replaced it with some Furthermore, I have also fixed an edge case bug that arose from reserving I decided against adding Congrats on the wedding, by the way. Get well soon. |
NaN and infinity float literals
Hey, I finally got around to reviewing your code. Nice work! I'm happy with it just the way it is, which is really nice 'cause I can just click github's handy "Merge" button. Sorry I took so long! I got proper internet access 10 days ago btw... |
Thanks for taking the time to review my changes! Oh, and sorry about the whole whitespace thing. I'll try too keep that in mind in the future. I wonder if monodevelop has a feature similar to that faintly visible spaces thing (I've been using an Ubuntu desktop lately, so Visual Studio's not much of an option for me right now). By the way, do you think there's any chance of getting the BLT format upstreamed at some point? It's proven really useful to me as a reliable and compact representation of |
To be honest... I really didn't understand the BLT format as you originally described it. I just had a feeling that it was more complicated than necessary and... well, I didn't feel like expending the effort necessary to understand it. Currently, I can't find the link to the BLT info. I'm going to level with you... I've been battling depression over the past couple of months, but I recently had a burst of energy, which I put into work on the EC# parser again. This uncovered a bug in LLLPG, and unfortunately I realized that I had no idea how to fix the bug. The problem seems to be that LLLPG is too complicated, especially its prediction analysis. I'd like to rewrite it to use a simplified grammar representation that eliminates loops and other "gratuitous" elements. Maybe then, I thought, I could understand how my own algorithm worked well enough to not only fix that pesky bug, but maybe optimize it and subtly change the way zero-width assertions work... however, when I started working on that, I ran out of energy again. I keep hoping that the next feature I do is going to finally attract some external attention - what I thought was a small feature, a macro to implement pattern recognition in C#, was my next chance to convince people to sit up and notice EC#. But since a bug in LLLPG is blocking progress, that small feature has become a lot of work and I'm having trouble solving it. Really what I should be doing right now is changing LES so that you don't need a semicolon after a closing brace, and then trying to convince the WebAssembly guys to use LES instead of an ad-hoc notation. Ack, the power went out. Boo Philippines. Yay laptop! Will send this when the power comes back... huh, the water's out too. How about that. |
I'm really sorry to hear that you're feeling that way. All parser generators, including LLLPG, are indistinguishable from magic to me. There's no way I can provide any advice on that, so I'll present my thoughts on the other topics you mentioned. Look, I've been reading the WebAssembly design discussions myself, but I doubt adding additional complexity to LES in the form of optional semicolons will make it a more attractive format to the WebAssembly folks. Seems to me like they picked S-expressions because of their (percieved) simplicity. They're using a wide variety of programming languages over there, and mandating all WebAssembly tools to implement a full-fledged LES parser/printer would place a relatively heavy burden on non-.NET implementations. None of that does in any way diminish Loyc - or LES -, though. I like LES, but no format is going to be all things to all people. Fortunately, Loyc's architecture makes it both feasible and easy to create custom printers/parsers. Using Loyc as a library is fun, too. I had a very positive experience with it when I was creating an on-disk IR format for Flame. Same thing when I created If you really want the WebAssembly file format to be LES-like, then I'd recommend offering them a regular, easy-to-implement subset of LES that is exactly what WebAssembly needs, and nothing more. It's okay to aim for world domination with LES, but you can hardly expect that to happen instantaneously. World domination is a gradual process, after all. Just keep moving in the right direction, and you'll get there eventually. Here's a link to my sad attempt at describing the BLT format: Core/Loyc.Binary/README.md. I have no idea what's supposed to be in a readme for a file format, to be honest. The example in that readme is probably confusing as hell, too. But that's just me being bad at explaining how BLT works. I would gladly answer any and all questions about BLT. It's not an overly complex format, and its parser is just 412 lines of text, which I think is acceptable. I'm pretty happy about BLT file sizes. Statically linking the core Flame libraries results in a 791.6 kB DLL (CIL) vs an 810.3 kB BLT (IR) vs a 7.1 MB LES (IR). I though BLT would be a cool addition to Loyc, because binary formats are convenient for inter-program communication. I needed a binary format for Flame, so I can imagine that (future) Loyc users may want one, too. |
Uhh... What part(s) of LES would you say they don't need? Perhaps attributes aren't needed (that's the main difference between Loyc trees and s-exprs), but having them might turn out to be useful. While making semicolons optional will slightly complicate the parser, I figure that moving LES syntax closer to Javascript would please them, as they are experienced browser devs and therefore likely to be "traditionalist". It's good that you provide an example in the BLT document, because I generally cannot understand specs without seeing a detailed example first. The problem I have understanding the
|
I'd say ditching space-sensitivity and Python mode is a good idea, even if the latter has not been implemented yet. Both seem like hacks, which may be undesirable in a format that's supposed to last "forever." Attributes are nice (and easy to implement on a syntactic level), but they also complicate things when it comes to semantics: which attributes can a WebAssembly VM safely ignore, and which attributes does it recognize? WebAssembly wasn't designed with attributes in mind, so maybe they're more trouble than they're worth. IIRC, PNaCl was undone in part by its dependence on the ever-changing LLVM IR, which made it virtually impossible for the PNaCl project to upgrade to a newer version of LLVM without breaking compatibility. It makes sense for them to be looking for a more stable format now. You're obviously not done making changes to LES yet. The WebAssembly group wants a stable, well-defined file format that'll last for a long time. A mini-version of LES with exactly those properties should work nicely. As for the BLT file: I am, embarassingly enough, unable to compile Loyc on this machine (Windows-specific pre-build commands, I reckon), which includes the
As you can see, a node template defines a data structure of sorts. The "structure" of a BLT's data is defined by the template table, whereas the actual data is defined by the top-level node list. Looks like I'm still using 32-bit integers to encode id indices in |
Update: got (most) Loyc projects to compile on Linux. Pushed to my fork. |
Hmmm. So looking at template #0, "CallIdNode" (byte 1) refers to "A call node whose target is an id node". The second byte means there are 2 children; the next four bytes refer to symbol 0 which is "+", and its arguments have "type 5" which is one of the 17 types in the README file of BLT (of which 15 are literal types, one is for identifiers, and one is for references to templates.) Hence the bytes I should have mentioned this before, but when doing examples of this type, it's best to use disjoint numbers to avoid confusion, e.g. Hmm. It makes me uncomfortable that there is a special optimization for "A call node whose target is an id node", and in particular that the optimization is necessary. If the target were encoded the same way as the arguments, your file format would require the "top-level node list" to include three extra bytes for 3 references to "+" ($0). This is not unprecedented - Also, Wasm discussions make me wonder whether the binary format should be canonical and "multi-level", i.e. have a multi-stage encode/decode pipeline. BLT is currently flexible, allowing a single Loyc tree to be encoded in numerous ways. JSStats (who somehow isn't really getting along with the other devs, I'm not sure why) has proposed reducing degrees of freedom so that two different "level 0 encoded" Wasm modules would never convert to the same text form. This makes sense to me, since their level 0 is the "uncompressed" binary form, and canonicality could allow "directly" comparing binaries produced from different sources. "Level 1" would be a compression layer, and thus the same module could have infinite level-1 encodings. (There's one drawback to mandating canonicality: a small overhead in the decoder to detect non-canonical files.) I'm sorry I was slow to respond again. The lack of interest in Loyc tools from everyone but you, combined with that blocking bug in LLLPG, and personal issues, has sapped my energy. I wish I could hang out with you for a hackathon or something, maybe I'd get motivated again. Belgium! This world is too darn big. I'm curious, what do you do for a living? I'm currently unemployed. There aren't any special build steps in Loyc.sln btw. The only obvious potential problem is that there are more configurations than just "Debug" and "Release", ones for .NET 3.5, .NET 4 and .NET 4.5. I've been meaning to make a solution based on PCLs instead (at the sacrifice of finally dropping support for Visual Studio 2010 and .NET 3.5). How did you fix your problem (building on Linux)? |
Actually let's do any more discussion of BLT in #17. |
This PR adds the following named floating-point literals. There names and values are listed below. I went will all-lowercase names because of the precedent set by
@null
,@void
,@true
and@false
.@inf_d
-double.PositiveInfinity
-@inf_d
-double.NegativeInfinity
@nan_d
-double.NaN
@inf_f
-float.PositiveInfinity
-@inf_f
-float.NegativeInfinity
@nan_f
-float.NaN
I think I also fixed a bug in
BaseLexer.Match(int, int)
, which surfaced while I was rewriting the grammar. It seemed like theif
andelse
cases were reversed there, and the lexer LLLPG generated wouldn't accept the new float literals until I changed that.I also threw in some test cases for these new float literals, which seem to work fine.
Any comments or suggestions are most welcome.
PS: Visual Studio kept complaining that the end line sequences in
BaseLexer.cs
were inconsistent (which is probably my fault), so I told it to go ahead and reformat all line endings in that file as LF, which, at first glance, seemed to be the line ending scheme the other files were using. Sorry about that.