New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Trim output size & complexity #34
Conversation
I realised all of the internal function calls should be replaced by string concatenation, so I did that. Also, I uploaded the real-world localization data I'm working with to a gist, along with the resulting |
Rather than repeat I'm working on a few further changes on top of the ones I've included here, adding an In that branch, I've also allowed with 83f0931 the top-level entries in With all of that, I'm currently seeing JavaScript output that adds about 30% to the size of the source JSON, while v0.1.7's output adds about 300%. |
👍 Awesome @eemeli ! |
Looking through all these. Look great from the bird's eye view. One quick question:
|
MessageFormat.locale["in"] = function(n) { | ||
return 'other'; | ||
}; | ||
MessageFormat.locale.in=function(n){return "other"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, this will likely need to move back to ["in"]
for compat reasons.
I think I'm good with this. I'm gonna pull it an change that one spot myself, so I don't forget about it. |
Trim output size & complexity
I think this could potentially be the cause of issue #38 |
return d[k]; | ||
} | ||
// plural | ||
p=function(d,k,o,l,p){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @eemeli - So I hadn't realized before that these were global functions. Can we at least namespace them on MessageFormat
? I know it's longer, but after gzip it should only cost 1 byte per repeat. Feels very odd leaking single letter global variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drat, you're right. I'd been mainly looking at the include code, and had missed that those functions' versions in lib/messageformat.dev.js
were missing a var
prefix. So they end up as globals, when they really shouldn't and needn't. The compiled code is fine, though, there the single-letter functions only exist within the wrapping anonymous function. Do you want me to fix this and do a pull request, or will you do it yourself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So when we actually compile a message and it references p
- wouldn't that
break if these weren't global?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, since they're valid within the scope of the wrapping function. When used directly, compile()
returns a function that's coming from a scope where p
& co. are valid; when precompiled into output JavaScript, the functions are included from lib/messageformat.include.js
.
The only way it breaks is if you use an external wrapper to the library that calls precompile()
, and tries to parse the results without the utility functions. But if you're doing that, it should break really easily and obviously, and hence be fixable.
Actually, in the precompiled output, the only global we end up with is i18n
, even MessageFormat
is scoped only within the wrapping function. As its only member is the locale
array, we could consider getting rid of that as well, and use something like var locale = { 'en': ... };
instead, but those are currently hard-coded into locale/*.js
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yea. That's how I use it, I just hadn't upgraded my old stuff to use the new smaller hotness. I'm writing some bindings for handlebars to open source and I probably want to make it to where there's only a single global. I definitely don't want to restrict things to only work with the command line tools, etc.
We do only end up with the global i18n
when we do the precompile, but it still tries to call several of the functions that no longer exist. So that's probably not ideal.
If you know of an easy fix, I'd love a pull request. Otherwise I can get into how some of this new stuff works tomorrow or the next day.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately I've no experience with Handlebars, so I don't really know how the extension needs to be structured to make sure that the utility functions are available in the proper scope.
If the issue is with the Handlebars-precompiled output, can't you use build()
from bin/messageformat.js
? That should get you working output, with a single definable global, and the utility functions as local variables within that. Or is the problem something else entirely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea. The handlebars part was an unneeded detail.
I think compiled functions should work by running them (even after saving
them as strings into files and losing scope, etc) with only a MessageFormat
runtime as a single global (or module). It shouldn't require a packaged
build in that manner. It trades a tiny bit of space for a lot of output
flexibility. The new version I think breaks all current integrations I've
done because of the way I precompile (I've never actually used the command
line tool outside of testing it). So I'd be excited to fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're going to make the utility functions public, they'll need to be defined as window.i18n.p = ...
in lib/messageformat.include.js
as MessageFormat
is not a global in the precompiled code. You'll also therefore need to know about i18n
in lib/messageformat.dev.js
, which is currently not a requirement. Or you'll need to switch from the current configurable namespace that defaults to window.i18n
to a fixed MessageFormat
global namespace.
I wouldn't do any of that, tbh. Unless you're planning on inlining everything back in to the output of precompile()
, you'll still need a custom function for storing functions as strings in order to accommodate for different locales, and if you have that (build()
in bin/messageformat.js
), why not keep it as is, which already enables the use of the single-letter local functions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spose I think it should pretty much just work like it did before. Any code that wasn't explicitly in the precompiled function called a MessageFormat property function. If we need to change the name of a local variable inside the precompiled function that's fine.
I think it's pretty weird to have references to single-letter functions and stuff. I also think it's weird to create a separate namespace for them. They essentially make up the 'messageformat runtime' vs. the 'messageformat full parser and compiler'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I fixed the issues with #45. Just making the utility functions local to the wrapping function in lib/messageformat.dev.js
wasn't enough since new Function()
doesn't end up in its calling scope, as I'd assumed. Hence the solution's a bit different.
The output of compile()
is now independent of all external factors, as the utility functions are explicitly included in its scope. Calling toString()
on its results isn't quite enough to create portable code, though; you need to include the output of MessageFormat.LocalFunctions()
as well. Technically that could be fixable, but I'd recommend not doing so, as it would encourage duplicating code.
With these changes, any real-world sized localization is likely to shrink by a huge margin. With the sort-of realistic test case I've been using, the output is now 70% smaller than with version 0.1.7. The included examples stay at roughly their current size, as they are rather minimal cases.
The main reason this is possible is by including a small set of utility functions in the generated code, which then allows for removing duplicate code, making some of the inner function calls into strings, inlining all the generated local variables, and some other minor changes.
I've also minified the output slightly (including all the simpler locales), but the output should still be at least as readable as it was before. Programmatically minifying it can probably shave another 10-20% from it, but at the cost of legibility.
The utility functions are included in two places; a new file
lib/messageformat.include.js
and near the top oflib/messageformat.dev.js
. They differ only in amounts of comments and white space, to make the latter easier to read.I had to change a couple of the test cases slightly, as they were referring to code that now gets called from a different function. All tests should be valid for all commits.