Introduce SemiScript. #11

alloy · 2012-04-15T21:36:43Z

After much discussion, it became clear that to support this library far into the future, we would need to abstract its requirements into a language of their own, which can be compiled down to different languages.

Currently the compiler generates valid code for JavaScript and CoffeeScript, but could easily be extended to support additional languages.

The compiler has been written in C for optimal ROFLScale performance.

Related #8.

* Allows us to support popular languages, now, and far into the future.
* Translates to JavaScript and CoffeeScript, but is modular enough to easily be extended to support other languages.

alloy · 2012-04-15T21:40:13Z

Btw, the option parsing has been implemented very naively on purpose; I think it can provide us with quite a few bike-shedding hours in the future.

madrobby · 2012-04-15T21:49:54Z

+1

showell · 2012-04-15T22:04:24Z

Great idea, and nice work.

It should be easy to extend this concept to other languages. Sorry, this is just a sketch (untested):

    if (
        (strcmp(argv[1], "-ada") == 0) ||
        (strcmp(argv[1], "-algol") == 0) ||
        (strcmp(argv[1], "-c") == 0) ||
        (strcmp(argv[1], "-cpp") == 0) ||
        (strcmp(argv[1], "-java") == 0) ||
        (strcmp(argv[1], "-js") == 0) ||
        (strcmp(argv[1], "-pascal") == 0) ||
        (strcmp(argv[1], "-perl") == 0) ||
        (strcmp(argv[1], "-php") == 0)
    ) {
        semicolon = 1;
    }
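
The chain of comparisons above can also be written as a table-driven loop, which makes adding a language a one-line change and guards against a missing argument. This is only a sketch: the names `semicolon_flags` and `needs_semicolon` are hypothetical and do not come from the PR.

```c
#include <stddef.h>
#include <string.h>

/* Flags for target languages that end statements with a semicolon.
   The list mirrors the sketch above; it is illustrative, not the PR's code. */
static const char *const semicolon_flags[] = {
    "-ada", "-algol", "-c", "-cpp", "-java",
    "-js", "-pascal", "-perl", "-php",
};

/* Returns 1 if `flag` names a semicolon language, 0 otherwise.
   A NULL flag (no argument given) is treated as "no". */
static int needs_semicolon(const char *flag)
{
    size_t count = sizeof semicolon_flags / sizeof semicolon_flags[0];

    if (flag == NULL)
        return 0;
    for (size_t i = 0; i < count; i++) {
        if (strcmp(flag, semicolon_flags[i]) == 0)
            return 1;
    }
    return 0;
}
```

A caller could then write `semicolon = needs_semicolon(argc > 1 ? argv[1] : NULL);` instead of dereferencing `argv[1]` unconditionally.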

lrz · 2012-04-15T22:25:30Z

Well, multiple strcmp calls are going to impact the promised ROFLScale performance. It would be more efficient to use a perfect hash table here (one can be generated using gperf(1)). I would certainly throw a few Belgian dollars at a Kickstarter project that promised to contribute such a change.
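
For readers who have not seen a perfect hash lookup, here is a hand-written sketch of the idea. It is not gperf output and not the code from this PR: the hash function (string length plus an offset for the first character) and the offsets were picked by hand for these nine names, and `is_semicolon_language` is a hypothetical helper. Only the table names `asso_values` and `wordlist` follow gperf's conventions.

```c
#include <stddef.h>
#include <string.h>

#define MAX_HASH_VALUE 9

/* Per-character offsets, stored +1 so that 0 can mean "no keyword starts
   with this byte". One entry per possible byte value, as in gperf output. */
static const unsigned char asso_values[256] = {
    ['a'] = 5, ['c'] = 1, ['j'] = 1, ['p'] = 3,
};

/* Keyword table indexed by hash value; slot 0 is never used. */
static const char *const wordlist[MAX_HASH_VALUE + 1] = {
    "", "c", "js", "cpp", "java", "php", "perl", "ada", "pascal", "algol",
};

/* Returns 1 if `name` (without the leading dash) is one of the keywords.
   hash(name) = strlen(name) + offset of its first character. */
static int is_semicolon_language(const char *name)
{
    size_t hash;
    unsigned char assoc;

    if (name == NULL || name[0] == '\0')
        return 0;
    assoc = asso_values[(unsigned char)name[0]];
    if (assoc == 0)
        return 0;                      /* no keyword starts with this byte */
    hash = strlen(name) + assoc - 1;
    if (hash > MAX_HASH_VALUE)
        return 0;
    /* At most one candidate per hash value, so a single strcmp suffices. */
    return strcmp(name, wordlist[hash]) == 0;
}
```

Each lookup costs one table read and at most one `strcmp`, at the price of the 256-entry offset table and the pointer array.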

lrz · 2012-04-15T22:28:44Z

compiler.c

+
+        // Comment, ignore the rest of the line.
+        case '#':
+          comment = 1;


That's probably not going to work if the sharp character is used in a string (or properly escaped), right? It might be a better idea to tokenize the file contents first. As performance is an issue here I recommend using the Amazon Mechanical Turk.
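
To make the concern concrete, here is a minimal sketch of the extra state a scanner would need so that a `#` inside a string, or after a backslash, is not treated as a comment. It assumes single- and double-quoted strings with backslash escapes; the actual SemiScript grammar is not specified in this thread, and `find_comment_start` is a hypothetical name.

```c
#include <stddef.h>

/* Returns the index of the '#' that starts a comment on `line`,
   or -1 if the line has none. A '#' inside a quoted string, or one
   preceded by a backslash, does not count. */
static long find_comment_start(const char *line)
{
    char quote = 0;   /* the open quote character, or 0 outside strings */
    int escaped = 0;  /* 1 if the previous character was a backslash */

    for (size_t i = 0; line[i] != '\0' && line[i] != '\n'; i++) {
        char c = line[i];

        if (escaped) {
            escaped = 0;        /* this character is taken literally */
        } else if (c == '\\') {
            escaped = 1;
        } else if (quote != 0) {
            if (c == quote)
                quote = 0;      /* closing quote */
        } else if (c == '"' || c == '\'') {
            quote = c;          /* opening quote */
        } else if (c == '#') {
            return (long)i;
        }
    }
    return -1;
}
```

This still looks at one character at a time, but carries two pieces of state between characters. It does not handle multi-line strings or any other token type.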

prototype · 2012-04-15

Indeed, it's not going to work, because it only considers one character at a time (with too little state info) and therefore does not parse the grammar for comments properly. I'd normally recommend using a more elaborate finite state automaton for the latter, but crowdsourcing is definitely more cloud 3.0. Beware of using jQuery files though: the excessive use of dollar signs might trick the tokenizer into thinking you are part of a billion dollar photo filter company, which could drive up the pricing per token parsed.

lrz · 2012-04-15

Hmm, you make a valid point, @prototype. It would probably be safer to perform a DSE optimization pass (dollar sign elimination) on the js code before submitting it to Amazon. I believe there was a paper recently published in The Journal of Machine Learning Research about this technique.

I would recommend writing the optimization pass using LLVM, since it provides all the necessary foundations.

alloy · 2012-04-15

Guys, I suggest you create a fork if you want to add support for strings, or anything other than a semicolon or sharp character. As it is, there is no support for them and as such your code would be at fault.

lrz · 2012-04-15

So this is how you thank people working on your code? This is open source, we have the right to open meaningful discussions about any project. I can see that you are trying to create a culture of exclusion here. I do hope that you will properly apologize for this atrocious behavior.

alloy · 2012-04-15

I will do no such thing. Open source means that I get to dictate what you will be using, not some democratic model from your hippy lala-world.

lrz · 2012-04-15

I vigorously protest against this dictatorial behavior. You should be open to our concerns day and night. I am going to write a blog post about this story and submit it to Hacker News.

That said, you are contradicting yourself. In your previous comment you proudly state that the code only supports semicolon or sharp characters, yet you are deliberately checking for \n. I suspect you have bigger plans for the code and that you have already started working on our suggestions, with the goal of delivering a proprietary product and selling it to Facebook.

alloy · 2012-04-15T23:01:13Z

@showell I like the suggestion, but @lrz raises a valid point. Any takers?

prototype · 2012-04-15T23:10:49Z

I do have some ethical concerns though with introducing semicolons back into our javascript. Do we really want to be responsible for a new generation of emo developers?

lrz · 2012-04-15T23:20:20Z

Is that valid javascript code? I believe that emo is a reserved keyword in the grammar.

alloy · 2012-04-15T23:23:19Z

It would definitely be cool to be able to have Turing-complete code like that in SemiScript.

showell · 2012-04-15T23:32:16Z

@lrz I would commit some of my own Belgian money to this project, but I'm afraid Belgium stopped printing its own currency in 1995, and that was AGES ago.

Also, "emo" is only supported as a keyword in certain browser implementations of what we colloquially call "javascript." It is not part of the ECMAScript standard. As far as I know, the committee hasn't even reviewed the draft proposal.

raggi · 2012-04-15T23:36:15Z

Don't you have to wrap all this in a (function(){;})(); ??? ECOMMON

alloy · 2012-04-16T00:20:30Z

@showell & @lrz Done.

showell · 2012-04-16T00:53:57Z

@alloy By my calculation, your use of gperf introduces at least 256 bytes of memory overhead into the solution for a minimal speed enhancement. You can look at asso_values to see what I mean. In addition, each word in wordlist necessitates a 4-byte pointer due to the extra level of indirection, not to mention an extra 30 bytes for the five empty strings necessary to make the overly clever hash function work.

Having said that, I do think it's a promising patch. Next time, don't even bother submitting compiler.gperf. The gperf transcompiler generates ugly, non-idiomatic C code. I'd rather just work with the C directly and avoid the extra build steps. I see that you automated it to some degree in the Makefile, but what happens if I forget to run the makefile? Talk about a debugging nightmare.

Also, what if I mistype "pascal" in the "%%" section? At that point, I'm gonna have to drop down into a C debugger to figure out what's broken. So I might as well have just learned C.

Sorry to be so harsh. I totally understand if you want to use gperf for your own projects, and think it's cool, but I don't want to have to learn it myself. I'm perfectly happy with C. I don't even want to hear about gperf.

alloy · 2017-07-24T12:14:35Z

@showell I’ve slept on this for 1925 nights and have come to the conclusion that you’re probably right. Additionally, since then I’ve started using TypeScript, so nowadays I would probably re-use its compiler infrastructure and get compile-time reasoning about semicolons for free.

madrobby · 2017-07-24T21:20:23Z

Dammit @alloy, I was just about to merge this. Well, I guess not then. Your utter lack of patience in waiting a little bit for pull requests to be applied is disappointing.

showell · 2017-07-24T23:51:28Z

@alloy @madrobby This whole PR has been an utter fiasco.

UNSUBSCRIBE!

Oh wait, I think the CI build is still running. Don't unsubscribe me till we see what it turns up. And I'll ask our PM to schedule a postmortem.

Introduce SemiScript.

f9a916a

* Allows us to support popular languages, now, and far into the future.
* Translates to JavaScript and CoffeeScript, but is modular enough to easily be extended to support other languages.

lrz reviewed Apr 15, 2012

alloy added 2 commits April 16, 2012 02:07

Support more languages as per @showell's suggestion, with gperf for s…

966e5ea

…peed as per @lrz's suggestion.

Print an error when a syntax error occurs.

f3b36c3

alloy closed this Jul 24, 2017
