Optimize Source#read_until method (#135)
Optimize `Source#read_until` method.

## Benchmark
```
RUBYLIB= BUNDLER_ORIG_RUBYLIB= /Users/naitoh/.rbenv/versions/3.3.0/bin/ruby -v -S benchmark-driver /Users/naitoh/ghq/github.com/naitoh/rexml/benchmark/parse.yaml
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin22]
Calculating -------------------------------------
                         before       after  before(YJIT)  after(YJIT)
                 dom      9.877       9.992        15.605       17.559 i/s -     100.000 times in 10.124592s 10.008017s 6.408031s 5.695167s
                 sax     22.903      25.151        39.482       50.846 i/s -     100.000 times in 4.366300s 3.975922s 2.532822s 1.966706s
                pull     25.940      30.474        44.685       61.450 i/s -     100.000 times in 3.855070s 3.281511s 2.237879s 1.627346s
              stream     25.255      29.500        41.819       53.605 i/s -     100.000 times in 3.959539s 3.389825s 2.391256s 1.865505s

Comparison:
                              dom
         after(YJIT):        17.6 i/s
        before(YJIT):        15.6 i/s - 1.13x  slower
               after:        10.0 i/s - 1.76x  slower
              before:         9.9 i/s - 1.78x  slower

                              sax
         after(YJIT):        50.8 i/s
        before(YJIT):        39.5 i/s - 1.29x  slower
               after:        25.2 i/s - 2.02x  slower
              before:        22.9 i/s - 2.22x  slower

                             pull
         after(YJIT):        61.4 i/s
        before(YJIT):        44.7 i/s - 1.38x  slower
               after:        30.5 i/s - 2.02x  slower
              before:        25.9 i/s - 2.37x  slower

                           stream
         after(YJIT):        53.6 i/s
        before(YJIT):        41.8 i/s - 1.28x  slower
               after:        29.5 i/s - 1.82x  slower
              before:        25.3 i/s - 2.12x  slower

```

- YJIT=ON : 1.13x - 1.38x faster
- YJIT=OFF : 1.01x - 1.17x faster
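
The two ranges above follow from dividing each "after" throughput by the matching "before" throughput. A minimal sketch of that arithmetic, using the i/s figures from the benchmark table (the `RESULTS` hash and `speedup` helper are illustrative names, not part of the commit):

```ruby
# Throughput in iterations per second, copied from the benchmark table above.
RESULTS = {
  "dom"    => { before: 9.877,  after: 9.992,  before_yjit: 15.605, after_yjit: 17.559 },
  "sax"    => { before: 22.903, after: 25.151, before_yjit: 39.482, after_yjit: 50.846 },
  "pull"   => { before: 25.940, after: 30.474, before_yjit: 44.685, after_yjit: 61.450 },
  "stream" => { before: 25.255, after: 29.500, before_yjit: 41.819, after_yjit: 53.605 },
}.freeze

# Speedup is the ratio of the new throughput to the old one.
def speedup(before, after)
  after / before
end

RESULTS.each do |name, r|
  plain = speedup(r[:before], r[:after])
  yjit  = speedup(r[:before_yjit], r[:after_yjit])
  puts format("%-6s YJIT=OFF %.2fx  YJIT=ON %.2fx", name, plain, yjit)
end
```

The smallest and largest ratios in each column give the endpoints of the ranges quoted in the list.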

Co-authored-by: Sutou Kouhei <kou@clear-code.com>
naitoh and kou committed Jun 3, 2024
1 parent 3e3893d commit 037c16a
Showing 1 changed file with 13 additions and 2 deletions.
15 changes: 13 additions & 2 deletions lib/rexml/source.rb
@@ -34,6 +34,16 @@ class Source
     attr_reader :line
     attr_reader :encoding

+    module Private
+      PRE_DEFINED_TERM_PATTERNS = {}
+      pre_defined_terms = ["'", '"']
+      pre_defined_terms.each do |term|
+        PRE_DEFINED_TERM_PATTERNS[term] = /#{Regexp.escape(term)}/
+      end
+    end
+    private_constant :Private
+    include Private
+
     # Constructor
     # @param arg must be a String, and should be a valid XML document
     # @param encoding if non-null, sets the encoding of the source to this
@@ -69,7 +79,8 @@ def read(term = nil)
     end

     def read_until(term)
-      data = @scanner.scan_until(/#{Regexp.escape(term)}/)
+      pattern = Private::PRE_DEFINED_TERM_PATTERNS[term] || /#{Regexp.escape(term)}/
+      data = @scanner.scan_until(pattern)
       unless data
         data = @scanner.rest
         @scanner.pos = @scanner.string.bytesize
@@ -179,7 +190,7 @@ def read(term = nil)
     end

     def read_until(term)
-      pattern = /#{Regexp.escape(term)}/
+      pattern = Private::PRE_DEFINED_TERM_PATTERNS[term] || /#{Regexp.escape(term)}/
       term = encode(term)
       begin
         until str = @scanner.scan_until(pattern)
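
The idea behind the change can be tried outside REXML. The sketch below assumes nothing beyond Ruby's standard `strscan` library; `TERM_PATTERNS`, `pattern_for`, and the top-level `read_until` are hypothetical stand-ins for the constant and methods in the diff, not REXML's actual API. It shows that a cached term reuses one `Regexp` object, while any other term still builds a new one on each call.

```ruby
require "strscan"

# Hypothetical stand-in for the cache in the diff: terms known in advance
# map to Regexp objects that are built once, at load time.
TERM_PATTERNS = ["'", '"'].to_h { |term| [term, /#{Regexp.escape(term)}/] }.freeze

# Return the cached pattern when there is one, otherwise build a new Regexp.
def pattern_for(term)
  TERM_PATTERNS[term] || /#{Regexp.escape(term)}/
end

# Hypothetical helper with the same shape as Source#read_until: consume up to
# and including term, or return the rest of the input when term is absent.
def read_until(scanner, term)
  data = scanner.scan_until(pattern_for(term))
  unless data
    data = scanner.rest
    scanner.terminate
  end
  data
end

scanner = StringScanner.new(%q(value" rest))
puts read_until(scanner, '"')
# The same object comes back for a cached term...
puts pattern_for('"').equal?(pattern_for('"'))
# ...whereas an uncached term goes through the interpolated literal again.
puts pattern_for(">").equal?(pattern_for(">"))
```

Quote characters are a plausible choice for the cache because attribute values end at one of them, so `read_until` is likely to see those two terms far more often than any other; the fallback keeps arbitrary terms working as before.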