Add cached format functions for 3x faster formatting. #317

Merged
merged 3 commits into from Jun 6, 2012

Projects

None yet

5 participants

@timrwood
Member

Not ready to pull in, opening discussion.

TODO: inline L LL LLL LLLL LT tokens.

Some speed tests:

http://jsperf.com/date-formatting/8
http://jsperf.com/momentjs-cached-format-functions

All make test-moment tests are passing now, need to add support for L LL LLL LLLL LT for all tests to pass.

File size difference
-108b minified
+75b gzipped

Though there are probably some byte squeezing techniques we could add.

@timrwood timrwood Cached format functions like xaprb
Still need to inline L LL LLL LLLL LT functions…
6529d56
@timrwood
Member

Also, we can probably do something similar with the parser.

@rockymeza
Contributor

This could definitely use some real commenting. I know what this is doing, but only because I read about the different approaches to date formatting that you investigated. This is a huge booby trap for somebody who may wish to extend moment's formatting.

Additionally, perhaps this could be an opportunity to make the formatter extensible on the run. Because formatters are now just strings in a hash, why not allow users to add their own formats?

Member

Yeah, if everyone thinks this should be merged in, I'll definitely add more inline comments.

Adding user created tokens sounds like a great idea. We'll need to recompile the regex whenever tokens are added and invalidate any previously generated functions as well, but its definitely possible now!

We'll also probably need to sort the tokens by length (longest to shortest) so if someone adds a token like Y = 'blah' it doesn't format "YYYY" to "blahblahblahblah".

What are your thoughts about replacing existing tokens? Should people be able to do moment.addToken("MM" : "t.year()") and have it replace the existing MM token? On one hand I can see how it would be useful if someone wanted to completely change the token system to strftime or php or true LDML tokens, but on the other hand, it could cause some confusion if a third party plugin changed the tokens and you weren't aware of it.

I personally feel the original moment format tokens should never change, but if people want a completely different token language they should use something like the moment-strftime wrapper and not the core formatting tokens.

Member

My take on that is to value flexibility over protection. If people want to break stuff in all sorts of horrible ways, I think that's up to them. They won't stumble on this by accident.

Member

Yeah, I guess so.

Contributor

I don't think we need to actively pursue this end though. However, if we keep it in mind when we are changing the formatter, we can gradually build toward an extensible one. I don't think there is any demand for this and I don't think there is any reason to build this feature yet.

Member

Cool, it does add a bit of work when we need to expose the tokens. Every time a token is changed or added, we need to recompile the regex and destroy any previously created functions.

Maybe we can push it out to the 2.0 release.

@adambrunner adambrunner commented on the diff May 28, 2012
moment.js
@@ -23,7 +23,9 @@
aspNetJsonRegex = /^\/?Date\((\-?\d+)/i,
// format tokens
- formattingTokens = /(\[[^\[]*\])|(\\)?(Mo|MM?M?M?|Do|DDDo|DD?D?D?|dddd?|do?|w[o|w]?|YYYY|YY|a|A|hh?|HH?|mm?|ss?|SS?S?|zz?|ZZ?|LT|LL?L?L?)/g,
+ formattingTokens = /(\[[^\[]*\])|(\\)?(Mo|MM?M?M?|Do|DDDo|DD?D?D?|dddd?|do?|w[o|w]?|YYYY|YY|a|A|hh?|HH?|mm?|ss?|SS?S?|zz?|ZZ?)/g,
+ localFormattingTokens = /(LT|LL?L?L?)/g,
@adambrunner
adambrunner May 28, 2012 Contributor

You can save 1 byte by using /(LT|L{1,4})/g

@dmitriy-kiriyenko
dmitriy-kiriyenko May 28, 2012

I also believe that /L{1,4}/ works faster then /LL?L?L?/.

And same pattern used in the line above. I'd say, abused =)

@timrwood
timrwood May 28, 2012 Member

Performance looks about the same between /(LT|L{1,4})/g and /(LT|LL?L?L?)/g. The saved byte is good though!

http://jsperf.com/regex-optional-vs-length

@adambrunner adambrunner commented on the diff May 28, 2012
moment.js
@@ -23,7 +23,9 @@
aspNetJsonRegex = /^\/?Date\((\-?\d+)/i,
// format tokens
- formattingTokens = /(\[[^\[]*\])|(\\)?(Mo|MM?M?M?|Do|DDDo|DD?D?D?|dddd?|do?|w[o|w]?|YYYY|YY|a|A|hh?|HH?|mm?|ss?|SS?S?|zz?|ZZ?|LT|LL?L?L?)/g,
+ formattingTokens = /(\[[^\[]*\])|(\\)?(Mo|MM?M?M?|Do|DDDo|DD?D?D?|dddd?|do?|w[o|w]?|YYYY|YY|a|A|hh?|HH?|mm?|ss?|SS?S?|zz?|ZZ?)/g,
@adambrunner
adambrunner May 28, 2012 Contributor

I've played a bit and this expression can be shortened to this:
formattingTokens = /(\[[^\[]*\])|(\\)?([MD]{1,4}|d|dddd?|ww?|[MDdw]o|DDDo|YYYY|YY|a|A|hh?|HH?|mm?|ss?|SS?S?|zz?|ZZ?)/g,

It can save 10 bytes.

Explanation of the changes:
MM?M?M?|DD?D?D? became [MD]{1,4}
Mo|Do|do|wo became [MDdw]o
dddd?|do? became d|dddd
w[o|w]? became ww?

@dmitriy-kiriyenko
dmitriy-kiriyenko May 28, 2012

Can't understand the two latest. Looks like you're missing o.

@adambrunner
adambrunner May 28, 2012 Contributor

It isn't. The "do" and "wo" is covered by the second expression "[MDdw]o".

@timrwood
timrwood May 28, 2012 Member

Nice work! [MDdw]o will need to go before [MD]{1,4} and d|dddd?|ww? otherwise it will match just the M of Mo.

Also, we are adding a dd token in another pull request so [MD]{1,4} can become [MDd]{1,4} and drop another 5 bytes.

@timrwood
timrwood May 28, 2012 Member

Oh wait, [MD]{1,4} would also match MMDD and not split it into two different formatting tokens. It would need to be M{1,4}|D{1,4}

@adambrunner
adambrunner May 29, 2012 Contributor

You're right, my fault! So the final pattern should be /(\[[^\[]*\])|(\\)?([MDdw]o|DDDo|d{1,4}|D{1,4}|M{1,4}|ww?|YYYY|YY|a|A|hh?|HH?|mm?|ss?|S{1,3}|zz?|ZZ?)/g, if you introduce the dd format too. And if I didn't miss anything again.

@timrwood timrwood referenced this pull request May 29, 2012
Closed

1.7.0 Changelog #288

@timrwood
Member

For exposing formatting tokens, I was thinking of using this syntax.

moment.formatToken('MM', 'p(t.month()+1,2)'); // set a format token
moment.formatToken('MM'); // 'p(t.month()+1,2)' (get a format token)

Alternatively, we could use token, but that might be insufficiently named as parsing tokens are not affected. I don't plan on exposing parsing tokens (and it would be difficult with the incremental parser) but we may need to in the future...

moment.token('MM', 'p(t.month()+1,2)'); // set a format token
moment.token('MM'); // 'p(t.month()+1,2)' (get a format token)
@rockymeza
Contributor

I think that the cached formatters should go out in 1.7, but if we ever do do extensible formatting that should not. The 3x boost for 1.7 seems to be ready already

@timrwood timrwood merged commit 4996b6b into develop Jun 6, 2012
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment