Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#7676 Don't use cache for Event sprintf, use efficient implementation instead #7692

Closed
wants to merge 1 commit into from

Conversation

original-brownbear
Copy link
Member

@original-brownbear original-brownbear commented Jul 14, 2017

Fixes #7676 and generally improves the performance of org.logstash.StringInterpolation#evaluate by:

  • Don't use regex to find the contents between %{}, it's way too inefficient for a simple case like this with all its allocations

  • Flatten logic out of the many helper classes

  • Use a reusable, thread-local StringBuilder to serialize into

  • Don't use .substring for comparisons

  • static ObjectMapper

  • org.joda.time.format.DateTimeFormat#forPattern not being cached is not an issue, since it's cached in org.joda.time.format.DateTimeFormat#cPatternCache

  • Added benchmark

    • ~15% faster for a few cases I ran, even for repeated patterns with interpolation + couldn't find any case even with date formatting, where this implementation is slower than the cached one
    • no leaking on dynamic patterns

PS: All tests still pass and this thing has good coverage it seems :P

@jordansissel
Copy link
Contributor

Your PR description has lots of nice details, but your commit doesn't have this same information. Can you copy the PR description into your commit message?

@original-brownbear
Copy link
Member Author

@jordansissel sure, sec :)

@original-brownbear
Copy link
Member Author

@jordansissel done :)


if(pos <= template.length() - 1) {
compiledTemplate.add(new StaticNode(template.substring(pos)));
if (template.regionMatches(open + 2, "+%s", 0, 3)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoding 3 for length gives you interesting results that are differnt from Logstash 5.5. I think you need to check the whole length (close - open - 1 ?)

outcome input v5.5.0 this pr
same +%s 1500508897 1500509113
different +%ss %43 1500509162
different +%s d y %30 20 2017 1500509234

In all cases above, since the first 3 chars are "%+s" it assumes unix epoch time and ignores the rest.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jordansissel huh good find, let me a add a test + fix for this case :)

…stash.StringInterpolation#evaluate` by:

* Don't use regex to find the contents between `%{}`, it's way too inefficient for a simple case like this with all its allocations
* Flatten logic out of the many helper classes
* Use a reusable, thread-local `StringBuilder` to serialize into
* Don't use `.substring` for comparisons
* `static` `ObjectMapper`
* `org.joda.time.format.DateTimeFormat#forPattern` not being cached is not an issue, since it's cached in `org.joda.time.format.DateTimeFormat#cPatternCache`

* Added benchmark
   * ~15% faster for a few cases I ran, even for repeated patterns with interpolation + couldn't find any case even with date formatting, where this implementation is slower than the cached one
   *  no leaking on dynamic patterns
@original-brownbear
Copy link
Member Author

@jordansissel fixed and test for the problematic case added :)

Copy link
Contributor

@jordansissel jordansissel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@elasticsearch-bot
Copy link

Armin Braun merged this into the following branches!

Branch Commits
6.x 92c9b1f
master 96ff365

elasticsearch-bot pushed a commit that referenced this pull request Jul 25, 2017
…tringInterpolation#evaluate` by:

* Don't use regex to find the contents between `%{}`, it's way too inefficient for a simple case like this with all its allocations
* Flatten logic out of the many helper classes
* Use a reusable, thread-local `StringBuilder` to serialize into
* Don't use `.substring` for comparisons
* `static` `ObjectMapper`
* `org.joda.time.format.DateTimeFormat#forPattern` not being cached is not an issue, since it's cached in `org.joda.time.format.DateTimeFormat#cPatternCache`

* Added benchmark
   * ~15% faster for a few cases I ran, even for repeated patterns with interpolation + couldn't find any case even with date formatting, where this implementation is slower than the cached one
   *  no leaking on dynamic patterns

Fixes #7692
@original-brownbear
Copy link
Member Author

@jordansissel thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants