Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[refactor] improve Date parsing performance #5259

Merged
merged 11 commits into from Jul 25, 2018

Conversation

Projects
None yet
1 participant
@kares
Copy link
Member

commented Jul 24, 2018

as an attempt to speed-up #5255
... this gets slightly faster on HotSpot, still noticeably slower than MRI (which has these in C)

Graal might be in a better spot doing EA, since the frame writes aren't there ...

kares added some commits Jul 23, 2018

use internal str.sub! for date parsing (without frame info)
MRI has all of the date code in C and also avoids $~ and friends
... expected to noticeably improve Date.parse/_parse performance
review regexp match?-ing - MRI doesn't do dynamic dispatch
str.match? always does 'real' Regexp#match? no matter of patches
+ also do not use same (throw away) array for different things
[refactor] avoid rest of =~ with a match private helper
... now we can not blame $frame "globals" for being slow
@kares

This comment has been minimized.

Copy link
Member Author

commented Jul 24, 2018

this, actually, doesn't do that much (as @enebo's work) :
BASELINE from #5255

Date.parse('2018-07-17', false) [3000000x]            56.580000   0.060000  56.640000 ( 55.751624)
Date._parse('2018-07-17 21:20:55') [3000000x]         74.200000   0.190000  74.390000 ( 72.432050)
Date.iso8601('1999-12-31T00:00:00') [3000000x]        10.870000   0.020000  10.890000 ( 10.519134)
DateTime.iso8601('1999-12-31T19:20:06') [3000000x]    34.040000   0.060000  34.100000 ( 33.126543)

master

                                                         user     system      total        real
Date.parse('2018-07-17', false) [3000000x]          43.760000   0.060000  43.820000 ( 42.784604)
Date._parse('2018-07-17 21:20:55') [3000000x]       56.410000   0.070000  56.480000 ( 54.952706)
Date.iso8601('1999-12-31T00:00:00') [3000000x]       7.950000   0.000000   7.950000 (  7.710798)
DateTime.iso8601('1999-12-31T19:20:06') [3000000x]  15.660000   0.020000  15.680000 ( 15.259132)

we're now at :

                                                         user     system      total        real
Date.parse('2018-07-17', false) [3000000x]          42.400000   0.050000  42.450000 ( 41.576222)
Date._parse('2018-07-17 21:20:55') [3000000x]       55.870000   0.100000  55.970000 ( 54.521819)
Date.iso8601('1999-12-31T00:00:00') [3000000x]       7.550000   0.020000   7.570000 (  7.338553)
DateTime.iso8601('1999-12-31T19:20:06') [3000000x]  14.680000   0.020000  14.700000 ( 14.298500)
@kares

This comment has been minimized.

Copy link
Member Author

commented Jul 24, 2018

actually not much of a difference on GraalVM (-Xcompile.invokedynamic same results wout indy) :
master

---------------------------------------------------------------------------- total: 89.100000sec

                                                         user     system      total        real
Date.parse('2018-07-17', false) [3000000x]          26.960000   0.040000  27.000000 ( 25.664247)
Date._parse('2018-07-17 21:20:55') [3000000x]       35.230000   0.040000  35.270000 ( 33.491879)
Date.iso8601('1999-12-31T00:00:00') [3000000x]       6.350000   0.020000   6.370000 (  5.316840)
DateTime.iso8601('1999-12-31T19:20:06') [3000000x]  11.310000   0.040000  11.350000 ( 10.668665)

PR branch :

---------------------------------------------------------------------------- total: 86.280000sec

                                                         user     system      total        real
Date.parse('2018-07-17', false) [3000000x]          28.000000   0.040000  28.040000 ( 26.218359)
Date._parse('2018-07-17 21:20:55') [3000000x]       35.980000   0.030000  36.010000 ( 33.780362)
Date.iso8601('1999-12-31T00:00:00') [3000000x]       5.370000   0.000000   5.370000 (  4.348643)
DateTime.iso8601('1999-12-31T19:20:06') [3000000x]  10.100000   0.020000  10.120000 (  9.489929)
[refactor] move _parse_day internal to native
this seems to make a difference - indy doesn't get better ...
its ~ the same if the left-over .rb impl gets to use `subs()`
@kares

This comment has been minimized.

Copy link
Member Author

commented Jul 24, 2018

attempted to move a piece into native (^^ above commit), and it makes a difference :

                                                         user     system      total        real
Date.parse('2018-07-17', false) [3000000x]          38.370000   0.070000  38.440000 ( 37.646926)
Date._parse('2018-07-17 21:20:55') [3000000x]       49.630000   0.100000  49.730000 ( 48.354896)
Date.iso8601('1999-12-31T00:00:00') [3000000x]       7.140000   0.030000   7.170000 (  6.946499)
DateTime.iso8601('1999-12-31T19:20:06') [3000000x]  14.990000   0.010000  15.000000 ( 14.599548)

indy doesn't improve - also tried keeping _parse_day in .rb using subs to avoid frame vars - same result so its not doing any Ruby method inlining there anyways

                                                         user     system      total        real
Date.parse('2018-07-17', false) [3000000x]          44.070000   0.060000  44.130000 ( 42.912267)
Date._parse('2018-07-17 21:20:55') [3000000x]       55.780000   0.080000  55.860000 ( 54.245675)
Date.iso8601('1999-12-31T00:00:00') [3000000x]       7.060000   0.040000   7.100000 (  6.572138)
DateTime.iso8601('1999-12-31T19:20:06') [3000000x]  13.780000   0.020000  13.800000 ( 13.163179)

graal

                                                         user     system      total        real
Date.parse('2018-07-17', false) [3000000x]          26.260000   0.040000  26.300000 ( 25.555931)
Date._parse('2018-07-17 21:20:55') [3000000x]       34.770000   0.060000  34.830000 ( 33.686670)
Date.iso8601('1999-12-31T00:00:00') [3000000x]       5.130000   0.020000   5.150000 (  4.944249)
DateTime.iso8601('1999-12-31T19:20:06') [3000000x]  11.220000   0.000000  11.220000 ( 10.884185)

kares added some commits Jul 25, 2018

[refactor] move most Date._parse (delegates) into native
with indy on GraalVM we (almost) meet MRI 2.5 performance
... except `Date.parse('2018-07-17', false)` being 2x slower
do HAVE checks before various parses + cleanup Date _parse(s)
gets us close to MRI performance -> pretty much resolves GH-5255

... also added a b|B check (which MRI doesn;t have) for parse_bc
move Date's parse_time to native - matched up with MRI
... regexp was slightly different in JRuby's format.rb
[perf] improve Date._parse for dates - don't do time parsing
3x speed improvement for raw dates `Date._parse('2000-01-01')`
@kares

This comment has been minimized.

Copy link
Member Author

commented Jul 25, 2018

so this should be resolved, matching up with MRI's C got us to (indy ~ 5% slower) :

                                                         user     system      total        real
Date.parse('2018-07-17', false) [3000000x]          17.840000   0.020000  17.860000 ( 17.523784)
Date._parse('2018-07-17 21:20:55') [3000000x]       25.030000   0.020000  25.050000 ( 24.388277)
Date.iso8601('1999-12-31T00:00:00') [3000000x]       7.170000   0.020000   7.190000 (  6.963285)
DateTime.iso8601('1999-12-31T19:20:06') [3000000x]  13.670000   0.010000  13.680000 ( 13.336531)

for reference MRI 2.5 :

Date.parse('2018-07-17', false) [3000000x]            12.222756   0.000000  12.222756 ( 12.222796)
Date._parse('2018-07-17 21:20:55') [3000000x]         29.789987   0.000000  29.789987 ( 29.790399)
Date.iso8601('1999-12-31T00:00:00') [3000000x]         9.213094   0.000000   9.213094 (  9.213815)
DateTime.iso8601('1999-12-31T19:20:06') [3000000x]     9.899020   0.000000   9.899020 (  9.899302)
@kares

This comment has been minimized.

Copy link
Member Author

commented Jul 25, 2018

... and with commit 4109d58 (additional checks for avoiding a regexp) we're now also good with raw dates:

                                                         user     system      total        real
Date.parse('2018-07-17', false) [3000000x]           6.370000   0.000000   6.370000 (  6.208427)
Date._parse('2018-07-17 21:20:55') [3000000x]       26.010000   0.040000  26.050000 ( 25.394114)
Date.iso8601('1999-12-31T00:00:00') [3000000x]       7.430000   0.010000   7.440000 (  7.237803)
DateTime.iso8601('1999-12-31T19:20:06') [3000000x]  15.140000   0.020000  15.160000 ( 14.765716)

DateTime's iso8601 would be worth looking at next, also not all methods were moved to native ...

@kares kares changed the title [refactor] date parse internals to avoid $frame vars [refactor] improve Date parsing performance Jul 25, 2018

@kares

This comment has been minimized.

Copy link
Member Author

commented Jul 25, 2018

@enebo should be good to go, if you do not have any objections ... not planning more work atm
might look into DateTime iso parsing later, but this good leave on master by than.

@kares kares merged commit a5f85ca into jruby:master Jul 25, 2018

1 check failed

continuous-integration/travis-ci/pr The Travis CI build failed
Details

@kares kares added this to the JRuby 9.2.1.0 milestone Jul 25, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.