Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't add newlines when concatenating JS #419

Closed
wants to merge 2 commits into from
Closed

Conversation

bouk
Copy link
Member

@bouk bouk commented Nov 9, 2016

This ensures sourcemaps don't get misaligned

For review @schneems

@SamSaffron @danshultz I know you're interested in this

Fixes #388
Fixes #416

This ensures sourcemaps don't get misaligned
@matthewd
Copy link
Member

matthewd commented Nov 9, 2016

This won't go so well if a file ends with //

elsif source[source.size - 1].ord != 0x0A
buf << "\n"
unless string_end_with_semicolon?(source)
buf << ";"
end

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm - that was a simple fix.

I was under the impression (which may have been wrongly made) that we may have some sources that do not end in a newline requiring us to need to append one.

I did not realize this line elsif source[source.size - 1].ord != 0x0A becomes no longer needed although based on the tests, that feels correct.

@danshultz
Copy link

Thanks for diving into this - I don't think I realized that elsif source[source.size - 1].ord != 0x0A was possibly no longer necessary.

@bouk
Copy link
Member Author

bouk commented Nov 9, 2016

Very good point @matthewd. I wonder if we maybe should make the semicolon/newline appending step a postprocessor, so we can also modify the source map

@TylerHorth
Copy link

TylerHorth commented Nov 9, 2016

I've run some tests and I think this has fixed the issue. I was originally concerned that if the files didn't end in a newline, that the last and first lines would be concatenated together. But, it looks like the directive processor already ensures files end in new lines.

Since newlines are automatically added, files ending in comments shouldn't be a problem either.

I do have one minor concern. As the semicolon is inserted after the newline, the first line of the next file will be offset by 1 column.


Test files:

sub
├── a.js
├── directory.js
└── modules
    └── something.js

1 directory, 3 files

directory.js:

//= require ./a
//= require ./modules/something
console.log('sub/directory.js');
a();

a.js: (note: does not end with new line)

function a() {
  console.log('sub/a.js')  
}

modules/something.js:

console.log("sub/directory/modules/something.js")

Compiled output:

function a() {
  console.log('sub/a.js')
}
;console.log("sub/directory/modules/something.js")
;

console.log('sub/directory.js');
a();

Mappings:

{:source=>"a.js", :generated=>[1, 0], :original=>[1, 0]},
{:source=>"a.js", :generated=>[2, 0], :original=>[2, 0]},
{:source=>"a.js", :generated=>[3, 0], :original=>[3, 0]},
{:source=>"something.js", :generated=>[4, 0], :original=>[1, 0]},
{:source=>"directory.js", :generated=>[5, 0], :original=>[1, 0]},
{:source=>"directory.js", :generated=>[6, 0], :original=>[2, 0]},
{:source=>"directory.js", :generated=>[7, 0], :original=>[3, 0]},
{:source=>"directory.js", :generated=>[8, 0], :original=>[4, 0]}

As you can see, the semicolon inserted before something.js has caused it's mapping to be slightly off ({:source=>"something.js", :generated=>[4, 0], :original=>[1, 0]} should have generated=>[4,1]). This is very minor and is probably not worth the effort to fix.

The ideal generated output would probably be:

function a() {
  console.log('sub/a.js')
};
console.log("sub/directory/modules/something.js");


console.log('sub/directory.js');
a();

But this becomes more complicated when you need to deal with comments and stuff.

If we can confirm that the directive processor will always be run before concatenation (and thus files will always end in a newline) then I think this is safe to ship.

Refactor concat_javascript_sources to avoid re-encoding UTF32 data
@bouk
Copy link
Member Author

bouk commented Nov 9, 2016

I've refactored concat_javascript_sources.

I realized that because Sprockets uses UTF8, we don't need to re-encode to UTF-32 if there's multibytes in the code, because any multibyte sequences don't use any of the first 127 ASCII bytes. So instead I scan the string byte-by-byte.

I have also made it so that it inserts the semicolon at the first location that's non-whitespace at the end of the string. This ensures we're not offsetting things. If the previous line was a comment it'll end up with an extra semicolon at the end, which is harmless.

I'm not a big fan of this piece of code:

unless idx.nil?
  source = source.dup
  source.insert(idx, ';')
end

But I haven't figured out a better way to do it, we need to append to the source because inserting into the buffer could mean O(N^2) performance if there's any multibytes, and I need to dup in case the string is a frozen literal (would it be kosher to add a .frozen? in there?).

Looking forward to feedback, @TylerHorth could you see if this fixes the one char offset?

@TylerHorth
Copy link

This looks a lot better, besides the quirk with comments.

function a() {
  console.log('sub/a.js')
}//;
console.log("sub/directory/modules/something.js");


console.log('sub/directory.js');
a();

I did some research to see if this would be a problem, but it looks like the javascript interpreter inserts semicolons automatically on every newline. So if we end files on a newline anyway, is it even necessary to insert them at all?

@bouk
Copy link
Member Author

bouk commented Nov 10, 2016

Files aren't guaranteed to end in a newline, because the directive processor happens before any transforms. The babel transformer never adds a newline at the end, for example (which is babel's fault)

@TylerHorth
Copy link

If that's the case, would something like this be adequate? Or are there cases where we would need to add a semicolon even with a newline?

def concat_javascript_sources(buf, source)
  buf << 0x0A if buf.bytesize > 0 && buf.getbyte(-1) != 0x0A
  buf << source
end

@schneems
Copy link
Member

I dropped the ball here. Is this good to go or does it need more work?

@sun617 sun617 mentioned this pull request Dec 30, 2016
@marcusmalmberg
Copy link

Any news on this one?

@schneems
Copy link
Member

Fixed in #515

@schneems schneems closed this Nov 20, 2017
@jeremy jeremy deleted the remove-newline-adding branch March 12, 2018 23:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants