Skip to content

Ensure constant time random access in ElemNameSeqReader.#544

Merged
eed3si9n merged 1 commit into
eed3si9n:masterfrom
omatzcoveo:sequence-parsing-performance
Nov 30, 2020
Merged

Ensure constant time random access in ElemNameSeqReader.#544
eed3si9n merged 1 commit into
eed3si9n:masterfrom
omatzcoveo:sequence-parsing-performance

Conversation

@omatzcoveo

Copy link
Copy Markdown
Contributor

Hi there,

Parsing large sequences has performance issues : I traced it back to a quadratic behavior in a loop using ElemNameSeqReader.

ElemNameSeqReader's use of isDefinedAt presuppose constant time random access in its internal seq, but it ends up doing it on a List, I believe fed from AnyElemNameParser. This diff both builds a Vector instead of a List from AnyElemNameParser, and enforces a Vector in ElemNameSeqReader internals. This is successful in fixing the performance issue we've observed at my company.

Let me know if you need me to do something else in the pull request (do I have to do something special to update src_managed/scalaxb/scalaxb.scala ?).

Thanks

@eed3si9n

Copy link
Copy Markdown
Owner

@omatzcoveo Thanks for the contribution. Could you fix the failing integration tests please?

@omatzcoveo omatzcoveo force-pushed the sequence-parsing-performance branch from 159e645 to 1065537 Compare November 27, 2020 16:02
@omatzcoveo

Copy link
Copy Markdown
Contributor Author

Last friday I opted to reduce the scope of the change in order not to have any impact on tests and such, and only keep the important bit that removes the quadratic behavior. Thanks!

@eed3si9n eed3si9n merged commit 1d123ed into eed3si9n:master Nov 30, 2020
@eed3si9n

Copy link
Copy Markdown
Owner

Nicely done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants