Parse POMs with SAX parser #1029

Merged
merged 9 commits into from Jan 23, 2019

@alexarchambault alexarchambault commented Jan 23, 2019

No description provided.

@alexarchambault alexarchambault force-pushed the topic/pom-sax-parser branch from 310200a to 24955d3 Jan 23, 2019

@alexarchambault alexarchambault commented Jan 23, 2019

Benchmark results for the new SAX parser (as of 24955d3)

$ sbt 'benchmark/jmh:run -i 20 -wi 20 -f1 -t1'
[info] Benchmark                              Mode  Cnt    Score    Error  Units
[info] ParseTests.parseApacheParent           avgt   20    0.783 ±  0.011  ms/op
[info] ParseTests.parseSparkParent            avgt   20    2.936 ±  0.037  ms/op
[info] ParseTests.parseSparkParentMavenModel  avgt   20    0.692 ±  0.001  ms/op
[info] ParseTests.parseSparkParentXmlDom      avgt   20    2.288 ±  0.002  ms/op
[info] ParseTests.parseSparkParentXmlSax      avgt   20    0.997 ±  0.001  ms/op
[info] ProcessingTests.sparkSql               avgt   20    0.647 ±  0.001  ms/op
[info] ResolutionDomTests.coursierCli         avgt   20    4.235 ±  0.035  ms/op
[info] ResolutionDomTests.sparkSql            avgt   20  187.636 ±  0.389  ms/op
[info] ResolutionTests.coursierCli            avgt   20    3.856 ±  0.021  ms/op
[info] ResolutionTests.sparkSql               avgt   20  168.218 ±  0.384  ms/op
[success] Total time: 4048 s, completed Jan 23, 2019 2:28:56 PM

parseSparkParentXmlDom and parseSparkParentXmlSax are basically String => Project functions, going from the content of the org.apache.spark:spark-parent_2.12:2.4.0 POM to a coursier.core.Project, either via an AST of the XML or via a SAX-based parser. The SAX parser cuts that time by more than half (2.288 → 0.997 ms/op, roughly a 56% reduction).
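
To make the DOM vs SAX distinction concrete, here is a minimal sketch in Java using only the JDK's javax.xml.parsers API. It is not coursier's parser: the class name PomCoordinates, the helpers viaDom / viaSax / isCoordinate, and the tiny inline POM are all made up for illustration, and it only extracts the top-level groupId / artifactId / version rather than a full Project. The DOM route materializes the whole tree before reading anything; the SAX route keeps just an element stack and a text buffer while events stream by, which is presumably where the saving comes from.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only; not part of coursier.
class PomCoordinates {

    // DOM route: build the full tree, then read the direct children of <project>.
    static Map<String, String> viaDom(String pom) throws Exception {
        Element root = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(pom)))
            .getDocumentElement();
        Map<String, String> out = new HashMap<>();
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node n = children.item(i);
            if (n.getNodeType() == Node.ELEMENT_NODE && isCoordinate(n.getNodeName())) {
                out.put(n.getNodeName(), n.getTextContent().trim());
            }
        }
        return out;
    }

    // SAX route: no tree is built; the handler tracks the open-element stack
    // and keeps only the text of the elements it cares about.
    static Map<String, String> viaSax(String pom) throws Exception {
        Map<String, String> out = new HashMap<>();
        Deque<String> path = new ArrayDeque<>();
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes attrs) {
                path.push(qName);
                text.setLength(0);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                path.pop();
                // One element left on the stack means a direct child of <project> just closed.
                if (path.size() == 1 && isCoordinate(qName)) {
                    out.put(qName, text.toString().trim());
                }
                text.setLength(0);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(pom)), handler);
        return out;
    }

    static boolean isCoordinate(String name) {
        return name.equals("groupId") || name.equals("artifactId") || name.equals("version");
    }

    public static void main(String[] args) throws Exception {
        // Made-up POM; the <parent> block checks that nested groupIds are ignored.
        String pom = "<project><parent><groupId>org.parent</groupId></parent>"
            + "<groupId>org.example</groupId><artifactId>demo</artifactId>"
            + "<version>1.0</version></project>";
        System.out.println(viaDom(pom).equals(viaSax(pom)));
    }
}
```

Both functions return the same map for the same input, so they are interchangeable from the caller's side, much like the two String => Project benchmarks compared here. The difference is only in how much intermediate structure gets allocated along the way.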

ResolutionDomTests.sparkSql and ResolutionTests.sparkSql run whole resolutions of org.apache.spark:spark-sql_2.12:2.4.0, with all metadata served from memory (no network or file I/O). The gain is only about 10% here (187.636 → 168.218 ms/op). That is kind of disappointing next to the parsing-only comparison above, which looks at a more atomic operation.

@alexarchambault alexarchambault merged commit 6d0e242 into master Jan 23, 2019
2 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed
@alexarchambault alexarchambault deleted the topic/pom-sax-parser branch Jan 23, 2019