Changed processing in REXML::Parsers::BaseParser#pull_event from regular expression to processing using StringScanner.

## Why
Improve maintainability by reworking the parsing process so that it proceeds step by step using StringScanner#scan rather than regular expression matching.
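
To illustrate the style of parsing this refers to, here is a minimal sketch of a tokenizer driven by `StringScanner#scan`. It is not the REXML implementation: `next_token` and `tokenize` are hypothetical names, and the patterns cover only a tiny, attribute-free subset of XML.

```ruby
require "strscan"

# Hypothetical helper (not part of REXML): consumes one token at the
# scanner's current position and returns [type, value], or nil at the end.
def next_token(scanner)
  return nil if scanner.eos?

  # StringScanner#scan only matches at the current position, so each
  # branch either consumes a token or leaves the position untouched.
  if scanner.scan(/<!--(.*?)-->/m)
    [:comment, scanner[1]]
  elsif scanner.scan(/<\/([^\s>]+)\s*>/)
    [:end_element, scanner[1]]
  elsif scanner.scan(/<([^\s\/>]+)\s*>/)
    [:start_element, scanner[1]]
  elsif scanner.scan(/[^<]+/)
    [:text, scanner.matched]
  else
    raise ArgumentError, "unexpected input at position #{scanner.pos}"
  end
end

# Hypothetical driver: collects every token from the given string.
def tokenize(xml)
  scanner = StringScanner.new(xml)
  tokens = []
  while (token = next_token(scanner))
    tokens << token
  end
  tokens
end
```

Because the scanner tracks its own position, each branch reads as "try this construct here", which is the maintainability benefit the change is aiming for.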

## Changed
- Added Source#string= method for error message output.
- Added TestParseDocumentTypeDeclaration#test_no_name test case.
- Within the `intSubset` of DOCTYPE, declarations starting with "<!" now take into account `Comment`s, which also begin with "<!".
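
As a rough illustration of the last point, the sketch below classifies the items of an internal subset and checks for `<!--` before the other `<!` declarations. `classify_int_subset` is a hypothetical helper, not REXML's code, and it simplifies heavily: it assumes declarations contain no `>` inside quoted values.

```ruby
require "strscan"

# Hypothetical classifier (not part of REXML) for the contents of a DOCTYPE
# internal subset. Returns one symbol per markupdecl or PEReference found.
def classify_int_subset(source)
  scanner = StringScanner.new(source)
  kinds = []
  until scanner.eos?
    next if scanner.skip(/\s+/) # DeclSep: whitespace between declarations

    if scanner.scan(/<!--/)
      # Comments also start with "<!", so they are checked first.
      raise ArgumentError, "unterminated comment" unless scanner.scan_until(/-->/)
      kinds << :comment
    elsif scanner.scan(/<!(ELEMENT|ATTLIST|ENTITY|NOTATION)\b/)
      kind = scanner[1].downcase.to_sym
      # Simplification: assumes no ">" appears inside the declaration body.
      raise ArgumentError, "unterminated declaration" unless scanner.scan_until(/>/)
      kinds << kind
    elsif scanner.scan(/<\?/)
      raise ArgumentError, "unterminated PI" unless scanner.scan_until(/\?>/)
      kinds << :pi
    elsif scanner.scan(/%[^;\s]+;/)
      kinds << :pe_reference
    else
      raise ArgumentError, "malformed declaration at position #{scanner.pos}"
    end
  end
  kinds
end
```

For example, `classify_int_subset("<!ELEMENT a (#PCDATA)>\n<!-- note -->")` yields `[:element, :comment]`.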

[intSubset Spec]
https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-doctypedecl
> [28] 	doctypedecl   ::= '<!DOCTYPE' S Name (S ExternalID)? S? ('[' intSubset ']' S?)? '>'

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-intSubset
> [28b] intSubset   ::=  (markupdecl | DeclSep)*

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-markupdecl
> [29]  markupdecl   ::= elementdecl | AttlistDecl | EntityDecl | NotationDecl | PI | Comment

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-elementdecl
> [45]  elementdecl   ::=   '<!ELEMENT' S Name S contentspec S? '>'

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-AttlistDecl
> [52] 	AttlistDecl   ::=   '<!ATTLIST' S Name AttDef* S? '>'

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EntityDecl
> [70] 	EntityDecl   ::=   GEDecl | PEDecl
> [71] 	GEDecl	   ::=   '<!ENTITY' S Name S EntityDef S? '>'
> [72] 	PEDecl	   ::=   '<!ENTITY' S '%' S Name S PEDef S? '>'

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-NotationDecl
> [82] 	NotationDecl   ::=   '<!NOTATION' S Name S (ExternalID | PublicID) S? '>'

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-PI
> [16] 	PI	   ::=   '<?' PITarget (S (Char* - (Char* '?>' Char*)))? '?>'

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-Comment
> [15] 	Comment	   ::=   '<!--' ((Char - '-') | ('-' (Char - '-')))* '-->'

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-DeclSep
> [28a] DeclSep	   ::=   PEReference | S

https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-PEReference
> [69]  PEReference   ::=   '%' Name ';'
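
Production [28] requires a `Name` after `<!DOCTYPE`, which is the situation a test such as `test_no_name` exercises. The following is a minimal sketch of that check under stated assumptions: `doctype_name` is a hypothetical helper, the error messages are invented for illustration, and the `Name` pattern is an ASCII-only simplification of the one in the specification.

```ruby
require "strscan"

# Hypothetical helper (not part of REXML): returns the Name declared by a
# DOCTYPE declaration, or raises when the required Name is missing.
def doctype_name(source)
  scanner = StringScanner.new(source)
  unless scanner.scan(/<!DOCTYPE/)
    raise ArgumentError, "not a DOCTYPE declaration"
  end

  # Production [28]: '<!DOCTYPE' S Name ...
  # Simplified, ASCII-only Name pattern (an assumption for this sketch).
  unless scanner.scan(/\s+([A-Za-z_:][\w:.\-]*)/)
    raise ArgumentError, "malformed DOCTYPE: name is missing"
  end
  scanner[1]
end
```

With this sketch, `doctype_name('<!DOCTYPE r SYSTEM "urn:x">')` returns `"r"`, while `doctype_name("<!DOCTYPE>")` raises.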

[Benchmark]

```
RUBYLIB= BUNDLER_ORIG_RUBYLIB= /Users/naitoh/.rbenv/versions/3.3.0/bin/ruby -v -S benchmark-driver /Users/naitoh/ghq/github.com/naitoh/rexml/benchmark/parse.yaml
ruby 3.3.0 (2023-12-25 revision 5124f9ac75) [arm64-darwin22]
Calculating -------------------------------------
                         before       after  before(YJIT)  after(YJIT)
                 dom     11.336      10.673        17.347       17.951 i/s -     100.000 times in 8.821568s 9.369098s 5.764700s 5.570586s
                 sax     31.543      31.015        47.585       51.672 i/s -     100.000 times in 3.170243s 3.224280s 2.101494s 1.935272s
                pull     37.364      35.844        58.475       62.661 i/s -     100.000 times in 2.676348s 2.789845s 1.710137s 1.595896s
              stream     34.795      35.401        49.736       57.144 i/s -     100.000 times in 2.874007s 2.824748s 2.010624s 1.749971s

Comparison:
                              dom
         after(YJIT):        18.0 i/s
        before(YJIT):        17.3 i/s - 1.03x  slower
              before:        11.3 i/s - 1.58x  slower
               after:        10.7 i/s - 1.68x  slower

                              sax
         after(YJIT):        51.7 i/s
        before(YJIT):        47.6 i/s - 1.09x  slower
              before:        31.5 i/s - 1.64x  slower
               after:        31.0 i/s - 1.67x  slower

                             pull
         after(YJIT):        62.7 i/s
        before(YJIT):        58.5 i/s - 1.07x  slower
              before:        37.4 i/s - 1.68x  slower
               after:        35.8 i/s - 1.75x  slower

                           stream
         after(YJIT):        57.1 i/s
        before(YJIT):        49.7 i/s - 1.15x  slower
               after:        35.4 i/s - 1.61x  slower
              before:        34.8 i/s - 1.64x  slower
```

- YJIT=ON : 1.03x - 1.15x faster
- YJIT=OFF : 0.94x - 1.02x of the previous speed (slightly slower for dom, sax, and pull; slightly faster for stream)
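
The ratios in this summary can be derived from the i/s figures in the benchmark table. The short script below shows the arithmetic; the constant and method names are illustrative only.

```ruby
# Iterations per second copied from the benchmark table above.
RESULTS = {
  dom:    { before: 11.336, after: 10.673, before_yjit: 17.347, after_yjit: 17.951 },
  sax:    { before: 31.543, after: 31.015, before_yjit: 47.585, after_yjit: 51.672 },
  pull:   { before: 37.364, after: 35.844, before_yjit: 58.475, after_yjit: 62.661 },
  stream: { before: 34.795, after: 35.401, before_yjit: 49.736, after_yjit: 57.144 },
}.freeze

# Ratio of "after" throughput to "before" throughput, rounded to 2 places.
# Values above 1.0 mean the new code is faster.
def speedup(before, after)
  (after / before).round(2)
end

# Returns [min, max] speedup across all parsers for one pair of columns.
def speedup_range(before_key, after_key)
  RESULTS.values.map { |r| speedup(r[before_key], r[after_key]) }.minmax
end

puts "YJIT=ON : %.2fx - %.2fx" % speedup_range(:before_yjit, :after_yjit)
puts "YJIT=OFF: %.2fx - %.2fx" % speedup_range(:before, :after)
```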

Co-authored-by: Sutou Kouhei <kou@clear-code.com>
naitoh and kou committed Feb 25, 2024
1 parent 0656925 commit bc05d26
Showing 3 changed files with 195 additions and 163 deletions.