Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

issue with semicolon-less javascript #56

Closed
einars opened this issue Jul 26, 2011 · 9 comments
Closed

issue with semicolon-less javascript #56

einars opened this issue Jul 26, 2011 · 9 comments

Comments

@einars
Copy link
Contributor

einars commented Jul 26, 2011

a++
foo()

converted to

a++foo()

@atrigent
Copy link

Could you guys maybe update this issue? The given code now seems to be converted to:

a++
foo()

...which also doesn't make sense, although it does have the same meaning as the original which is good. In any case, could the weird indentation be fixed?

@bitwiseman
Copy link
Member

This is a problem of due to our processing tokens with almost no look-ahead... Hm, but if you're specifying preserve_newlines, and you enter this, I think it makes sense to preserve it.

We can be sure that the reverse is true (http://www.ecma-international.org/ecma-262/5.1/#sec-7.9.1) :

a
++foo()

Even in this case, where it is a syntax error, it looks like newline is equivalent to semicolon. I'll add something for this tomorrow.

@atrigent
Copy link

Ah I see, if you uncheck "Preserve empty lines?" on the website, you do get the behaviour described in the original report.

Ok, so there are a couple issues here. First of all, "preserve empty lines" seems to be misleading. There are no empty lines in this:

a++
b

...and yet, that option changes the behaviour. That doesn't make any sense. Second of all, it seems to me that the newline should be preserved in any case, because a++b is ambiguous and not valid Javascript.

In general, I don't see why weird corner case keep showing up that break the beautifier. The semicolon insertion rules that you linked to are very simple. Why can't you just implement them and then insert an extra line break every time you would add (or you encounter) a semicolon?

@bitwiseman
Copy link
Member

So, yeah... Have you looked at the code? 😄 I understand what I said wasn't the most reasuring answer. But frankly, it's not the most reassuring code.

Opened #199 and #200.

@bitwiseman
Copy link
Member

The problem is in the first part of the first of those three rule (http://www.ecma-international.org/ecma-262/5.1/#sec-7.9.1):

When, as the program is parsed from left to right, a token (called the offending token) is encountered
that is not allowed by any production of the grammar, then ...

With not fully parsing as noted in #200, "any production of the grammar" is not infomation that we have in the current implementation. It has gotten better over time, but until #200 is fixed, we're going to keep hitting edge cases.

@atrigent
Copy link

I see. In #200, you note that this is also a feature. Is that because not doing the full parsing is faster?

@einars
Copy link
Contributor Author

einars commented Mar 23, 2013

No. Even though the current parser is dumb because that was the fastest to hack together (even though it has been quite costly feature-wise in the long run; #200 is huge and kind of long overdue), "proper" parsers are generally quite intolerant by default and love perfect sources without errors. The beautifier currently formats everything thrown at it (even when that's not a proper js — and often with great results), the most notable exceptions being with semicolon-less javascript; it'd be a huge setback to fail with javascript parse errors if the source can't be parsed as a proper javascript.

@atrigent
Copy link

But what is the use of Javascript-that-is-not-Javascript (i.e. Javascript that does not parse)? Is the idea that people would be able to use the beautifier while they are writing their code when it might not be complete and thus not parse correctly? That seems like a pretty uncommon case, and I'd argue not one that should be supported anyways - people should learn to make their own code beautiful. I mostly use this for understanding obfuscated/minified code, which obviously has to parse because it actually gets run.

@einars
Copy link
Contributor Author

einars commented Mar 23, 2013

The users find their uses. Maybe it just makes them feel good when the application doesn't scold them about missing quotes and braces, but just robustly reformats their code as good as it can.

Of course, it's crappy when some valid semicolon-less javascript gets misformatted. That would be great to fix, but it's currently impossible, without the switch to a full-fledged javascript parser.

While we don't have one, I can suggest you take a look at uglifyjs — it has an excellent javascript parser and a beautifier mode, it might work well for you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants