Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of the parser's failure path #374

Merged
merged 2 commits into from
Mar 26, 2024
Merged

Conversation

qwwdfsad
Copy link
Contributor

@qwwdfsad qwwdfsad commented Mar 22, 2024

  • Previously, an exception was instantiated on each ParseError and the parsing failure was completely dominated by 'fillInStacktrace' method
  • Now all exception messages are aggregated into a single message and only one instance of ParseException is thrown
  • An improvement on the basic benchmark is ~300%

The changes are quite code-local and worth doing -- we are basically avoiding redundant work and making a vector for potential denial of service attacks much less.

Benchmark:

val zonedDtFormat = DateTimeComponents.Format {
    dateTime(LocalDateTime.Formats.ISO)
    offset(UtcOffset.Formats.ISO)
    char('[')
    timeZoneId()
    char(']')
}

val input = "2008-06-03T11:05:30.123456789+01:00[Mars/New_York]"

@Benchmark
fun fail(): Any? {
    try {
        return zonedDtFormat.parse(input)
    } catch (e: Throwable) {
        return e
    }
}

Profile before:
image

Profile after:
image

* Previously, an exception was instantiated on each ParseError and the parsing failure was completely dominated by 'fillInStacktrace' method
* Now all exception messages are aggregated into a single message and only one instance of ParseException is thrown
* An improvement on the basic benchmark is ~300%
@qwwdfsad
Copy link
Contributor Author

A few things:

  • It's easy to push it further by avoiding ParseException.fillInStackTrace at all -- I can do that if you are okay with the idea
  • I'm not sure neither why sorting of error messages is required nor what these messages can be used for. I'd guess it's up to you to decide on that

@dkhalanskyjb
Copy link
Contributor

I'm not sure neither why sorting of error messages is required nor what these messages can be used for.

As I've had to write a lot of various datetime formats and check why they don't work, I was consistently frustrated by Java's nonchalant "I can't parse the passed string starting from character 14." Why can't it? What did it expect, what did it get?

So, these messages are for when you write a format and aren't sure why it doesn't work. The messages are sorted so that the most successful parsing paths are at the top. You're more likely to be interested in the format that almost worked than the format that failed immediately, though there are some exceptions to this.

It's easy to push it further by avoiding ParseException.fillInStackTrace at all -- I can do that if you are okay with the idea

Sure. Also, if these exceptions are such a bother performance-wise, then we can rework the code to throw one exception fewer: let match be inline and accept an onError: (List<ParseError>) -> Nothing lambda; then, we'd be able to build a DateTimeFormatException with the required message directly, so ParseException wouldn't even be needed.

@qwwdfsad
Copy link
Contributor Author

Thanks, that's an important piece of trivia, I've added the comment.

if these exceptions are such a bother performance-wise, then we can rework the code to throw one exception fewer

I (or somebody else) will handle it in a separate PR.
The performance hit comes not from the exception, but from the fillInStackTrace that can easily be overridden. Typically, it's much easier to do (esp. if the exception is not caught to user code) than doing anything else (e.g. figuring lambdas and inlining)

@qwwdfsad qwwdfsad removed the request for review from dkhalanskyjb March 26, 2024 11:24
@qwwdfsad qwwdfsad merged commit 3bc18fb into master Mar 26, 2024
Copy link
Contributor

@dkhalanskyjb dkhalanskyjb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice race condition!

@qwwdfsad qwwdfsad deleted the unhappy-path-perf branch March 26, 2024 11:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants