Introduce SemiScript. #11
* Allows us to support popular languages, now and far into the future.
* Translates to JavaScript and CoffeeScript, but is modular enough to easily be extended to support other languages.
Btw, the option parsing has been implemented very naively on purpose; I think it can provide us with quite a few bike-shedding hours in the future.
+1
Great idea, and nice work. It should be easy to extend this concept to other languages. Sorry, this is just a sketch (untested):
Well, multiple strcmp statements are going to impact the promised ROFLScale performance. It would be more efficient to use a perfect hash table here (one can be generated using gperf(1)). I would certainly throw a few Belgian dollars into a Kickstarter project that promised to contribute such a change.
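To make the suggestion concrete, here is a minimal hand-written sketch of the perfect-hash idea: one hash computation and at most one strcmp per lookup, instead of a chain of comparisons. It is not gperf output and not the code from this pull request. The target names, the `lookup_target` function, and the `wordlist` table are hypothetical; the thread only mentions JavaScript, CoffeeScript, and "pascal".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical target set, assumed for illustration only. */
enum target {
    TARGET_NONE = -1,
    TARGET_PASCAL,
    TARGET_JAVASCRIPT,
    TARGET_COFFEESCRIPT
};

struct entry {
    const char *name;
    enum target target;
};

/*
 * The slot index is the low two bits of the first byte:
 * 'p' (112) -> 0, 'j' (106) -> 2, 'c' (99) -> 3.
 * Slot 1 is unused and holds an empty placeholder entry.
 */
static const struct entry wordlist[4] = {
    { "pascal",       TARGET_PASCAL },
    { "",             TARGET_NONE },
    { "javascript",   TARGET_JAVASCRIPT },
    { "coffeescript", TARGET_COFFEESCRIPT },
};

/* Returns the matching target, or TARGET_NONE for unknown names. */
static enum target lookup_target(const char *name)
{
    const struct entry *e = &wordlist[(unsigned char)name[0] & 3];

    /* The hash only selects a candidate; one strcmp confirms it. */
    if (e->target != TARGET_NONE && strcmp(name, e->name) == 0)
        return e->target;
    return TARGET_NONE;
}
```

The hash works only because the three assumed names happen to differ in the low two bits of their first byte. Adding a fourth name would likely require a new hash function, which is the part gperf automates.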
// Comment, ignore the rest of the line.
case '#':
  comment = 1;
That's probably not going to work if the sharp character is used in a string (or properly escaped), right? It might be a better idea to tokenize the file contents first. As performance is an issue here, I recommend using Amazon Mechanical Turk.
Indeed, it's not going to work, because it only considers one character at a time (with too little state info) and therefore does not parse the grammar for comments properly. I'd normally recommend using a more elaborate finite state automaton for the latter, but crowdsourcing is definitely more cloud 3.0. Beware of using jQuery files though: the excessive use of dollar signs might mislead the tokenizer into thinking you are part of a billion-dollar photo filter company, and could drive up pricing per token parsed.
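For reference, a slightly more elaborate automaton could look like the sketch below. It is an illustration under assumptions, not the code from this pull request: the `strip_comments` function and its states are hypothetical, and it assumes single- or double-quoted strings with backslash escapes. It does not handle a backslash-escaped sharp character outside a string.

```c
#include <stddef.h>

/* States of the hypothetical comment-stripping automaton. */
enum state { CODE, COMMENT, STRING, STRING_ESCAPE };

/*
 * Copies src to dst, dropping '#' comments that start outside a string
 * literal. The newline that ends a comment is kept. dst must have room
 * for strlen(src) + 1 bytes. Returns the number of bytes written, not
 * counting the terminating NUL.
 */
static size_t strip_comments(const char *src, char *dst)
{
    enum state st = CODE;
    char quote = 0;   /* which quote character opened the current string */
    size_t n = 0;

    for (; *src != '\0'; src++) {
        char c = *src;

        switch (st) {
        case CODE:
            if (c == '#') {
                st = COMMENT;
                continue;          /* drop the '#' itself */
            }
            if (c == '"' || c == '\'') {
                quote = c;
                st = STRING;
            }
            break;
        case COMMENT:
            if (c != '\n')
                continue;          /* drop the comment text */
            st = CODE;
            break;
        case STRING:
            if (c == '\\')
                st = STRING_ESCAPE;
            else if (c == quote)
                st = CODE;
            break;
        case STRING_ESCAPE:
            st = STRING;           /* the escaped character is copied as-is */
            break;
        }
        dst[n++] = c;
    }
    dst[n] = '\0';
    return n;
}
```

Four states are enough here because the only context needed is whether the current character sits in code, in a comment, in a string, or directly after a backslash inside a string.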
Hmm, you make a valid point, @prototype. It would probably be safer to perform a DSE optimization pass (dollar sign elimination) on the js code before submitting it to Amazon. I believe there was a paper recently published in The Journal of Machine Learning Research about this technique.
I would recommend writing the optimization pass using LLVM, since it provides all the necessary foundations.
Guys, I suggest you create a fork if you want to add support for strings, or for anything other than a semicolon or sharp character. As it is, there is no support for them, and as such your code would be at fault.
So this is how you thank people working on your code? This is open source, we have the right to open meaningful discussions about any project. I can see that you are trying to create a culture of exclusion here. I do hope that you will properly apologize for this atrocious behavior.
I will do no such thing. Open source means that I get to dictate what you will be using, not some democratic model from your hippy lala-world.
I vigorously protest against this dictatorial behavior. You should be open to our concerns day and night. I am going to write a blog post about this story and submit it to Hacker News.
That said, you are contradicting yourself. In your previous comment you proudly state that the code only supports semicolon or sharp characters, yet you are deliberately checking for \n. I suspect you have bigger plans for the code and have already started working on our suggestions, with the goal of delivering a proprietary product and selling it to Facebook.
Is that valid JavaScript code? I believe that emo is a reserved keyword in the grammar.
It would definitely be cool to be able to have Turing-complete code like that in SemiScript.
@lrz I would commit some of my own Belgian money to this project, but I'm afraid Belgium stopped printing its own currency in 1995, and that was AGES ago. Also, "emo" is only supported as a keyword in certain browser implementations of what we colloquially call "javascript." It is not part of the ECMAScript standard. As far as I know, the committee hasn't even reviewed the draft proposal.
Don't you have to wrap all this in a |
@alloy By my calculation, your use of gperf introduces at least 256 bytes of memory overhead into the solution for a minimal speed enhancement. You can look at asso_values to see what I mean. In addition, each word in wordlist necessitates a 4-byte pointer due to the extra level of indirection, not to mention an extra 30 bytes for the five empty strings necessary to make the overly clever hash function work.

Having said that, I do think it's a promising patch. Next time, don't even bother submitting compiler.gperf. The gperf transcompiler generates ugly, non-idiomatic C code. I'd rather just work with the C directly and avoid the extra build steps. I see that you automated it to some degree in the Makefile, but what happens if I forget to run the Makefile? Talk about a debugging nightmare. Also, what if I mistype "pascal" in the "%%" section? At that point, I'm gonna have to drop down into a C debugger to figure out what's broken. So I might as well have just learned C.

Sorry to be so harsh. I totally understand if you want to use gperf for your own projects, and think it's cool, but I don't want to have to learn it myself. I'm perfectly happy with C. I don't even want to hear about gperf.
@showell I’ve slept on this for 1925 nights and have come to the conclusion that you’re probably right. Additionally, since then I’ve started using TypeScript, so nowadays I would probably reuse its compiler infrastructure and get compile-time reasoning about semicolons for free.
Dammit @alloy, I was just about to merge this. Well, I guess not then. Your utter lack of patience in waiting a little bit for pull requests to be applied is disappointing.
After much discussion, it became clear that to support this library far into the future, we would need to abstract the requirements into a language of their own, which can be compiled down to different languages.
Currently the compiler generates valid code for JavaScript and CoffeeScript, but could easily be extended to support additional languages.
The compiler has been written in C for optimal ROFLScale performance.
Related #8.