Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance bug in BulkHandlers #3104

Open
SheliakLyr opened this issue Jul 2, 2024 · 2 comments
Open

Performance bug in BulkHandlers #3104

SheliakLyr opened this issue Jul 2, 2024 · 2 comments

Comments

@SheliakLyr
Copy link

My application often uses bulk requests for indexing data in ES. According to the profiler, a surprising amount of CPU time is spent in BulkHandlers.buildBulkHttpBody (mkString). Initially I thought that memory allocation might be the cause, but the following code has a much simpler problem:

private[bulk] def buildBulkHttpBody(bulk: BulkRequest): String = {
    val builder = StringBuilder.newBuilder
    val rows: Iterator[String] = BulkBuilderFn(bulk)
    rows.addString(builder, "", "\n", "")
    builder.append("\n") // es seems to require a trailing new line as well
    builder.mkString
  }

Notice, how the StringBuilder is defined:

class StringBuilder(...) extends AbstractSeq[Char]

Calling mkString on builder creates a String by iterating on all individual characters. This is very slow. I think that the initial intention was to use toString or result.

I have a number of alternative solutions:

  1. Simply replace mkString by result().
  2. Replace the whole method by:
BulkBuilderFn(bulk).mkString("", "\n", "\n")
  1. Avoid the "big string" allocation completely by passing a collection of strings into HttpEntity (best, but more complicated).
@Philippus
Copy link
Owner

Philippus commented Jul 2, 2024

For me mkString is an alias to toString, so not sure if that will help. But, replacing the whole method with BulkBuilderFn(bulk).mkString("", "\n", "\n") seems to work fine. Do you want to create a PR, or shall I?

@SheliakLyr
Copy link
Author

SheliakLyr commented Jul 2, 2024

Maybe you checked it on a different scala version. I am using 2.13.14, where:

  def result() = underlying.toString

  override def toString: String = result()

  // mkString is inherited

Please go ahead with a PR, I am too busy today.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants