rexml parsing 100k file 4x slower than MRI #102

Closed
stepheneb opened this Issue Nov 29, 2009 · 4 comments

Comments

Projects
None yet
4 participants
Contributor

stepheneb commented Nov 29, 2009

Comparing rexml performance on MRI 1.8.6 and Rubinius 1.0.0-rc1 (1).

Parse a 98k XML document and count one type of leaf element (466 entries) 100 times (2)

    $ ruby xml_benchmark_rexml.rb 
                user     system      total        real
    rexml   8.990000   0.540000   9.530000 (  9.609303)
    rexml   8.900000   0.540000   9.440000 (  9.555908)

    $ bin/rbx xml_benchmark_rexml.rb 
                user     system      total        real
    rexml  44.513807   0.000000  44.513807 ( 44.513791)
    rexml  41.221796   0.000000  41.221796 ( 41.221803)

1: Tested versions of Ruby
MRI 1.8.6 (2008-08-11 rev 287), platform: universal-darwin9.0
rubinius 1.0.0-rc1 (1.8.7 0f83f82 2009-11-25 JI) [i686-apple-darwin9.8.0]

2: xml_benchmark_rexml.rb and associated xml data file here
http://svn.concord.org/svn/projects/trunk/common/ruby/xml_benchmarks/

FYI, this is extracted from larger set described here: http://gist.github.com/244822

I ran rbx with -P and got the following (top 5 anyway)

  %   cumulative   self                self     total
 time   seconds   seconds      calls  ms/call  ms/call  name
------------------------------------------------------------
  28.97   46.76     23.44     297000     0.08     0.16  REXML::Parsers::BaseParser#pull
  19.29   77.78     15.61        200    78.03   388.89  REXML::Document#build
  12.59   10.19     10.19        749    13.61    13.61  GC.collect_mature
   6.56    5.35      5.31    2745455     0.00     0.00  Kernel#equal?
   3.72    3.05      3.01    1541683     0.00     0.00  Kernel#kind_of?
Owner

evanphx commented Mar 26, 2010

We need to address this eventually.

Member

kronos commented Mar 31, 2010

Rubinius 69b2a03

Rehearsal -----------------------------------------
rexml  21.906701   0.000000  21.906701 ( 21.906644)
------------------------------- total: 21.906701sec

            user     system      total        real
rexml  19.418585   0.000000  19.418585 ( 19.418560)

ruby 1.8.7 (2009-12-24 patchlevel 248) [i686-darwin10.0.0]

Rehearsal -----------------------------------------
rexml   9.630000   0.100000   9.730000 (  9.794298)
-------------------------------- total: 9.730000sec

            user     system      total        real
rexml   9.660000   0.120000   9.780000 (  9.898224)

ruby 1.9.1p376 (2009-12-07 revision 26041) [i386-darwin10]

Rehearsal -----------------------------------------
rexml  15.050000   0.300000  15.350000 ( 15.610518)
------------------------------- total: 15.350000sec

            user     system      total        real
rexml  14.900000   0.300000  15.200000 ( 15.424011)
Owner

evanphx commented Mar 31, 2010

Rubinius is currently about 2x slower now. We need to profile and work on the JIT to improve it, most likely. Closing this an issue because the new work will likely be post-1.0.

This issue was closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment