Performance improvements #134

Closed
wants to merge 7 commits into
from

Projects

None yet

3 participants

@rkh
Member
rkh commented Dec 5, 2010

General algorithms remain untouched. In a routing app returning strings directly, I get up to 70% performance improvement. In an app rendering a template I still get a 20% performance boost. This number is likely to decrease for real world apps (since they don't actually spend much time in Sinatra, but rather in templates and models). If I got time, I'll benchmark against real apps. All test pass.

rkh added some commits Dec 3, 2010
@rkh rkh Add string caching.
By caching strings that are encountered on every request we reduce the number
of objects being allocated on every request both improving our memory
footprint and performance, since the GC won't kick in that often.

For a simple sinatra app (one route returning a string), I get those numbers
without string caching:

request   concurrency   req/s    failures
==========================================
1000      1             794      0
1000      11            793      0
1000      21            863      0
1000      31            870      0
1000      41            703      0
1000      51            741      0
1000      61            752      0
1000      71            742      0
1000      81            305      0
1000      91            684      0

With string caching:

request   concurrency   req/s    failures
==========================================
1000      1             952      0
1000      11            1106     0
1000      21            1156     0
1000      31            953      0
1000      41            1030     0
1000      51            1140     0
1000      61            1156     0
1000      71            1123     0
1000      81            1119     0
1000      91            1097     0

For benchmarking I used Thin's abc (which in turn uses ab).
Original proposal by Matt Aimonetti, see GH #105.
61632c4
@rkh rkh Use escape_utils if available.
This improves Rack's performance when escaping/unescaping URLs.
For a simple sinatra app (one route returning a string), I get those numbers
without escape_utils:

request   concurrency   req/s    failures
==========================================
1000      1             952      0
1000      11            1106     0
1000      21            1156     0
1000      31            953      0
1000      41            1030     0
1000      51            1140     0
1000      61            1156     0
1000      71            1123     0
1000      81            1119     0
1000      91            1097     0

And these with escape_utils:

request   concurrency   req/s    failures
==========================================
1000      1             964      0
1000      11            1178     0
1000      21            1229     0
1000      31            1206     0
1000      41            1236     0
1000      51            1230     0
1000      61            1222     0
1000      71            1206     0
1000      81            1224     0
1000      91            1190     0

For benchmarking I used Thin's abc (which in turn uses ab).
bfce7aa
@rkh rkh Use hwia if available.
This improves performance when generating indifferent hashes (happens at least
once every request).

For a simple sinatra app (one route with simple params accessing), I get those
numbers without hwia:

request   concurrency   req/s    failures
==========================================
1000      1             936      0
1000      11            957      0
1000      21            876      0
1000      31            914      0
1000      41            942      0
1000      51            956      0
1000      61            945      0
1000      71            846      0
1000      81            954      0
1000      91            999      0

And with hwia:

request   concurrency   req/s    failures
==========================================
1000      1             932      0
1000      11            1047     0
1000      21            1023     0
1000      31            1178     0
1000      41            1021     0
1000      51            1158     0
1000      61            1438     0
1000      71            1122     0
1000      81            1029     0
1000      91            1018     0
1000      96            1098     0

For benchmarking I used Thin's abc (which in turn uses ab).
df62aac
@rkh rkh Revert "Use hwia if available."
This reverts commit df62aac.

Breaks one test. Probably related to string/symbol mapping hwia is doing
internally. Have to dig deeper.
a14502c
@rkh rkh If a simple string is given as pattern, don't create a regexp.
Without:

request   concurrency   req/s    failures
==========================================
1000      1             1017     0
1000      11            1010     0
1000      21            916      0
1000      31            925      0
1000      41            1199     0
1000      51            1206     0
1000      61            1237     0
1000      71            1316     0
1000      81            1275     0
1000      91            1343     0

With patch:

request   concurrency   req/s    failures
==========================================
1000      1             1622     0
1000      11            1544     0
1000      21            1459     0
1000      31            1632     0
1000      41            1488     0
1000      51            1646     0
1000      61            1595     0
1000      71            1544     0
1000      81            1426     0
1000      91            1556     0

I suspect the performance difference being even bigger on rbx.
453f6f6
@rkh rkh Improve performance of methods geneterated by #set.
The produced method are not only faster, but for some reason (probably by not
creating too many closures), the GC kicks in far less often.

Without:

request   concurrency   req/s    failures
==========================================
1000      1             1034     0
1000      6             1171     0
1000      11            1205     0
1000      16            1221     0
1000      21            1223     0
1000      26            1207     0
1000      31            1226     0
1000      36            169      0 <= GC
1000      41            1205     0
1000      46            1197     0
1000      51            1240     0
1000      56            1235     0
1000      61            600      0 <= GC
1000      66            1123     0
1000      71            1222     0
1000      76            1223     0
1000      81            270      0 <= GC
1000      86            1017     0
1000      91            1194     0
1000      96            581      0

With optimization:

request   concurrency   req/s    failures
==========================================
1000      1             1245     0
1000      6             1402     0
1000      11            1382     0
1000      16            1424     0
1000      21            1363     0
1000      26            1432     0
1000      31            1254     0
1000      36            1432     0
1000      41            1394     0
1000      46            1682     0
1000      51            1428     0
1000      56            1434     0
1000      61            1446     0
1000      66            1434     0
1000      71            1428     0
1000      76            1429     0
1000      81            184      0 <= GC
1000      86            1432     0
1000      91            1257     0
1000      96            1424     0
088fe7e
@rkh rkh On first incoming request in production mode, eliminates some closure…
…s we

don't have to keep but access frequently.

Without patch:

    request   concurrency   req/s    failures
    ==========================================
    1000      1             1058     0
    1000      11            1181     0
    1000      21            1265     0
    1000      31            1258     0
    1000      41            1079     0
    1000      51            1211     0
    1000      61            1239     0
    1000      71            1232     0
    1000      81            1234     0
    1000      91            1149     0

With patch:

    request   concurrency   req/s    failures
    ==========================================
    1000      1             1142     0
    1000      11            1501     0
    1000      21            1385     0
    1000      31            1779     0
    1000      41            1404     0
    1000      51            1685     0
    1000      61            1764     0
    1000      72            1371     0
    1000      81            1333     0
    1000      91            1374     0
f092e5a
@shtirlic
shtirlic commented Dec 7, 2010

@rkh did u do performance tests after the all these commits (Thin's abc)?
Please post results if u have it.

@rkh
Member
rkh commented Jan 8, 2011

Performance differs extremely. For instance, 453f6f6 actually is a performance issue on 1.9.2. If we could access the AST of a closure, we could do partial evaluation (which would improve performance by an order of magnitude, I suspect).

@cactus
cactus commented Jan 24, 2011

I am curious as to why this never made it in, especially the part about using constants instead of strings for hash access of header names.

@rkh
Member
rkh commented Jan 24, 2011

This will make it in. I just feel like the patches are no where ready and some should even be reverted. Ultimately, I would really love to play with something like partial evaluation to get even bigger speed boosts. However, if I won't have the time to look into this before 1.2.0 I was meaning to cherry pick at least 61632c4 and 088fe7e. I think that f092e5a is a bit dangerous, 453f6f6 can even cause a slow down on 1.9.2 (while it gives a boost on 1.8.7) and bfce7aa should be rather taken care of by the user.

@cactus
cactus commented Jan 25, 2011

Thanks for the info!
:)

@cactus
cactus commented on a14502c Feb 24, 2011

I looked into this a bit, and I had two breaking tests.

One was a trivial case where the result was an assert_equal and a hash was expected but a StrHash was returned (hwia hash class is StrHash).

The other breaking test is a bit stranger (test_transitions_to_the_next_matching_route_on_pass). It seems that when a pass is used, the params are not being restored properly before handing to the next route, if hwia is used. I tracked this down to the behavior of .merge in hwia. This code should illustrate the issue more clearly than I could probably explain.

>> require 'hwia'
>> h = StrHash.new
>> h['a'] = 'b'
>> i = Hash.new
>> i['c'] = 'd'
>> i.merge(h)
 => {"c"=>"d", "a"=>"b"} 
>> i 
 => {"c"=>"d"}
# note! Non-destructive merge is not merge!
>> h.merge(i)
 => {"a"=>"b", "c"=>"d"} 
>> h 
 => {"a"=>"b", "c"=>"d"}
#note! merge behaves like merge! in hwai

This results in the merge call in process_routes @params = @original_params.merge(params) actually setting @original_params instead of just using it, if hwia is used. gross! :/

All that said, I was doing some googling, and found this.
https://groups.google.com/group/sinatrarb/browse_thread/thread/906af6133cf72f9c/a30253d5e40971b5?#a30253d5e40971b5

So it seems that back in 2008, it was a choice to not use hwia. Presumably to remove the dependency (and possibly this pathological merge case).
I don't know what the performance characteristics look like with and without hwia. If it is decided to use it again, it might make sense to either submit this upstream as a bug, or work around it (and document that fact).

created upstream ticket with hwia project just in case this is in fact a bug (and not working as designed).
methodmissing/hwia#1

Member
rkh replied Feb 24, 2011

Thanks for looking into this.

I think the HWIA mentioned on the ML is actually Rails' pure ruby implementation, not the hwia gem, which hooks heavily into Ruby's internals.

@kytrinyx kytrinyx deleted the performance branch Mar 24, 2015
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment