Parse POMs with SAX parser #1029

Merged
merged 9 commits into from Jan 23, 2019

alexarchambault
Member

No description provided.

@alexarchambault
Member Author

Benchmark results for the new SAX parser (as of 24955d3)

$ sbt 'benchmark/jmh:run -i 20 -wi 20 -f1 -t1'
[info] Benchmark                              Mode  Cnt    Score    Error  Units
[info] ParseTests.parseApacheParent           avgt   20    0.783 ±  0.011  ms/op
[info] ParseTests.parseSparkParent            avgt   20    2.936 ±  0.037  ms/op
[info] ParseTests.parseSparkParentMavenModel  avgt   20    0.692 ±  0.001  ms/op
[info] ParseTests.parseSparkParentXmlDom      avgt   20    2.288 ±  0.002  ms/op
[info] ParseTests.parseSparkParentXmlSax      avgt   20    0.997 ±  0.001  ms/op
[info] ProcessingTests.sparkSql               avgt   20    0.647 ±  0.001  ms/op
[info] ResolutionDomTests.coursierCli         avgt   20    4.235 ±  0.035  ms/op
[info] ResolutionDomTests.sparkSql            avgt   20  187.636 ±  0.389  ms/op
[info] ResolutionTests.coursierCli            avgt   20    3.856 ±  0.021  ms/op
[info] ResolutionTests.sparkSql               avgt   20  168.218 ±  0.384  ms/op
[success] Total time: 4048 s, completed Jan 23, 2019 2:28:56 PM

parseSparkParentXmlDom and parseSparkParentXmlSax are essentially String => Project functions, going from the content of the POM of org.apache.spark:spark-parent_2.12:2.4.0 to a coursier.core.Project, either via an AST of the XML or via a SAX-based parser. The SAX parser cuts that time by more than 50% (from 2.288 ms/op to 0.997 ms/op).
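To illustrate the SAX side of that comparison, here is a minimal sketch of a String => project function built on the JDK's SAX API. It is not the parser added in this PR: `MiniProject`, `MiniPomHandler` and `parsePom` are hypothetical names, `MiniProject` is a simplified stand-in for coursier.core.Project, and only the top-level groupId, artifactId and version elements are read. The point is that the handler fills in fields as parse events arrive, without building an intermediate XML tree first.

```scala
import java.io.StringReader
import javax.xml.parsers.SAXParserFactory
import org.xml.sax.{Attributes, InputSource}
import org.xml.sax.helpers.DefaultHandler
import scala.collection.mutable

object SaxPomSketch {

  // Hypothetical, simplified stand-in for coursier.core.Project
  final case class MiniProject(groupId: String, artifactId: String, version: String)

  // Collects the text of <groupId>, <artifactId> and <version> when they are
  // direct children of <project>; everything else is ignored.
  final class MiniPomHandler extends DefaultHandler {
    private val wanted = Set("groupId", "artifactId", "version")
    private var path = List.empty[String] // open elements, innermost first
    private val text = new java.lang.StringBuilder
    private val fields = mutable.Map.empty[String, String]

    override def startElement(uri: String, localName: String, qName: String, attributes: Attributes): Unit = {
      path = qName :: path
      text.setLength(0)
    }

    // May be called several times for a single text node, hence the buffer
    override def characters(ch: Array[Char], start: Int, length: Int): Unit =
      text.append(ch, start, length)

    override def endElement(uri: String, localName: String, qName: String): Unit = {
      path match {
        case name :: "project" :: Nil if wanted(name) =>
          fields(name) = text.toString.trim
        case _ =>
      }
      path = path.tail
      text.setLength(0)
    }

    def result: Option[MiniProject] =
      for {
        g <- fields.get("groupId")
        a <- fields.get("artifactId")
        v <- fields.get("version")
      } yield MiniProject(g, a, v)
  }

  // String => Option[MiniProject], with no intermediate XML tree
  def parsePom(content: String): Option[MiniProject] = {
    val handler = new MiniPomHandler
    val parser = SAXParserFactory.newInstance().newSAXParser()
    parser.parse(new InputSource(new StringReader(content)), handler)
    handler.result
  }
}
```

A groupId nested under a parent element is skipped by the path check, so only the project's own coordinates are kept in this sketch.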

ResolutionDomTests.sparkSql and ResolutionTests.sparkSql run whole resolutions of org.apache.spark:spark-sql_2.12:2.4.0, with all metadata served from memory (no network or file I/O). The gain is only around 10% here (187.636 ms/op down to 168.218 ms/op), which is somewhat disappointing compared to the gain in the parsing-only comparison above, which measures a more atomic operation.
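The relative gains can be recomputed from the scores in the benchmark table. The sketch below is only arithmetic on those published numbers; `BenchmarkGains` and `reduction` are illustrative names, not part of the benchmark module.

```scala
object BenchmarkGains {

  // Relative reduction in average time per operation, as a fraction of the baseline
  def reduction(baselineMs: Double, newMs: Double): Double =
    (baselineMs - newMs) / baselineMs

  // parseSparkParentXmlDom vs parseSparkParentXmlSax
  val parseGain: Double = reduction(2.288, 0.997)

  // ResolutionDomTests.sparkSql vs ResolutionTests.sparkSql
  val sparkSqlGain: Double = reduction(187.636, 168.218)

  // ResolutionDomTests.coursierCli vs ResolutionTests.coursierCli
  val coursierCliGain: Double = reduction(4.235, 3.856)

  def main(args: Array[String]): Unit = {
    println(f"parse spark-parent POM: ${parseGain * 100}%.1f%%")
    println(f"resolve spark-sql:      ${sparkSqlGain * 100}%.1f%%")
    println(f"resolve coursier-cli:   ${coursierCliGain * 100}%.1f%%")
  }
}
```

This gives roughly 56% for POM parsing alone, against roughly 10% and 9% for the two full resolutions.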

@alexarchambault alexarchambault merged commit 6d0e242 into master Jan 23, 2019
@alexarchambault alexarchambault deleted the topic/pom-sax-parser branch January 23, 2019 16:03