[SPARK-39146][CORE][SQL] Introduce local singleton for ObjectMapper that may be reused #37999

Closed
wants to merge 14 commits into from

Conversation

LuciferYang
Contributor

@LuciferYang LuciferYang commented Sep 26, 2022

What changes were proposed in this pull request?

This PR introduces local singletons for the Jackson ObjectMapper instances that can be reused in Spark code, to reduce the cost of repeatedly creating an ObjectMapper.

Why are the changes needed?

Minor performance improvement.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Pass GitHub Actions

@LuciferYang
Contributor Author

LuciferYang commented Sep 26, 2022

I wrote a micro-benchmark to test Jackson ObjectMapper reads and writes:

https://github.com/LuciferYang/spark/blob/objectMapper/sql/catalyst/src/test/scala/org/apache/spark/sql/connector/catalog/JacksonBenchmark.scala

  def testReadJsonToMap(valuesPerIteration: Int): Unit = {
    val input =
      """
        |{"mergeDir":"/a/b/c/mergeDirName","attemptId":"appattempt_1648454518011_994053_000001"}
      """.stripMargin

    val benchmark = new Benchmark("Test read json to map",
      valuesPerIteration, output = output)

    benchmark.addCase("Test Multiple") { _: Int =>
      for (_ <- 0L until valuesPerIteration) {
        val mapper = new ObjectMapper()
        mapper.registerModule(DefaultScalaModule)
        mapper.readValue(input, classOf[mutable.HashMap[String, String]])
      }
    }

    val mapper = new ObjectMapper()
    mapper.registerModule(DefaultScalaModule)
    benchmark.addCase("Test Single") { _: Int =>
      for (_ <- 0L until valuesPerIteration) {
        mapper.readValue(input, classOf[mutable.HashMap[String, String]])
      }
    }

    benchmark.run()
  }

  def testWriteMapToJson(valuesPerIteration: Int): Unit = {

    val map: mutable.HashMap[String, String] = new mutable.HashMap[String, String]()
    map.put("mergeDir", "/a/b/c/mergeDirName")
    map.put("attemptId", "yarn_appattempt_1648454518011_994053_000001")


    val benchmark = new Benchmark("Test write map to json",
      valuesPerIteration, output = output)

    benchmark.addCase("Test Multiple") { _: Int =>
      for (_ <- 0L until valuesPerIteration) {
        val mapper = new ObjectMapper()
        mapper.registerModule(DefaultScalaModule)
        mapper.writeValueAsString(map)
      }
    }

    val mapper = new ObjectMapper()
    mapper.registerModule(DefaultScalaModule)
    benchmark.addCase("Test Single") { _: Int =>
      for (_ <- 0L until valuesPerIteration) {
        mapper.writeValueAsString(map)
      }
    }

    benchmark.run()
  }

  def testCreateObjectMapper(valuesPerIteration: Int): Unit = {

    val benchmark = new Benchmark("Test create ObjectMapper",
      valuesPerIteration, output = output)

    benchmark.addCase("Test create ObjectMapper") { _: Int =>
      for (_ <- 0L until valuesPerIteration) {
        val mapper = new ObjectMapper()
        mapper.registerModule(DefaultScalaModule)
      }
    }

    benchmark.run()
  }

  override def runBenchmarkSuite(mainArgs: Array[String]): Unit = {
    val valuesPerIteration = 10000

    testCreateObjectMapper(valuesPerIteration = valuesPerIteration)
    testWriteMapToJson(valuesPerIteration = valuesPerIteration)
    testReadJsonToMap(valuesPerIteration = valuesPerIteration)
  }

I ran it with GitHub Actions (GA):

OpenJDK 64-Bit Server VM 1.8.0_345-b01 on Linux 5.15.0-1020-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
Test create ObjectMapper:                 Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Test create ObjectMapper                            729            738           8          0.0       72886.8       1.0X

OpenJDK 64-Bit Server VM 1.8.0_345-b01 on Linux 5.15.0-1020-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
Test write map to json:                   Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Test Multiple                                      2673           2748         107          0.0      267259.3       1.0X
Test Single                                           5              5           1          2.2         451.0     592.7X

OpenJDK 64-Bit Server VM 1.8.0_345-b01 on Linux 5.15.0-1020-azure
Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz
Test read json to map:                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Test Multiple                                     11701          11745          61          0.0     1170133.9       1.0X
Test Single                                           6              8           1          1.5         648.2    1805.3X

From the test results, we should use a singleton Jackson ObjectMapper, because creating a new ObjectMapper instance appears to be expensive.

@LuciferYang LuciferYang marked this pull request as draft September 26, 2022 09:02
@LuciferYang LuciferYang changed the title [SPARK-39146][CORE] Introduce JacksonUtils to use singleton Jackson ObjectMapper [WIP][SPARK-39146][CORE] Introduce JacksonUtils to use singleton Jackson ObjectMapper Sep 26, 2022
@LuciferYang LuciferYang changed the title [WIP][SPARK-39146][CORE] Introduce JacksonUtils to use singleton Jackson ObjectMapper [SPARK-39146][CORE][SQL][K8S] Introduce JacksonUtils to use singleton Jackson ObjectMapper Sep 26, 2022
@srowen
Member

srowen commented Sep 26, 2022

My concern is that ObjectMapper, while thread-safe, is synchronized in some methods, IIRC. This could introduce contention for locks. Is the perf win really compelling? I wonder if we can reuse ObjectMapper inside classes where it matters for perf and not try to share one instance so widely.

@LuciferYang
Contributor Author

OK, let me add a multi-threaded comparison to check this.

@LuciferYang
Contributor Author

In the serial read/write scenario, the benefits are obvious:

  • Read scenario: using a singleton is about 1,800x faster than creating an ObjectMapper every time (1805.3X in the table above)
  • Write scenario: using a singleton is about 590x faster than creating an ObjectMapper every time (592.7X in the table above)
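
The Relative column in the benchmark tables is a ratio of per-row times, so it reads as a multiplier rather than a percentage. A minimal sketch of that arithmetic, using the per-row figures from the serial tables above (the object and method names are made up for illustration):

```scala
object SpeedupFromBenchmark {
  // Per-row times in nanoseconds, copied from the serial benchmark tables.
  val writeMultipleNs = 267259.3
  val writeSingleNs = 451.0
  val readMultipleNs = 1170133.9
  val readSingleNs = 648.2

  /** Speedup factor: how many times faster the candidate is than the baseline. */
  def speedup(baselineNs: Double, candidateNs: Double): Double =
    baselineNs / candidateNs

  def main(args: Array[String]): Unit = {
    println(f"write speedup: ${speedup(writeMultipleNs, writeSingleNs)}%.1fx")
    println(f"read speedup:  ${speedup(readMultipleNs, readSingleNs)}%.1fx")
  }
}
```

Because the inputs here are the rounded per-row values, the printed ratios can differ from the table's 592.7X and 1805.3X in the last digit.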

@LuciferYang
Contributor Author

I wrote a multi-threaded test as follows:

https://github.com/LuciferYang/spark/blob/fe26455668a904400e4b81ec696b7cd3abb3c923/sql/catalyst/src/test/scala/org/apache/spark/sql/connector/catalog/MJacksonBenchmark.scala

  def testWriteMapToJson(valuesPerIteration: Int, threads: Int): Unit = {

    val map = Map("intValue" -> 1,
      "longValue" -> 2L,
      "doubleValue" -> 3.0D,
      "stringValue" -> "4",
      "floatValue" -> 5.0F,
      "booleanValue" -> true)


    val benchmark = new Benchmark(s"Test $threads threads write map to json",
      valuesPerIteration, output = output)

    val multi = Array.fill(threads)({
      val ret = new ObjectMapper()
      ret.registerModule(DefaultScalaModule)
      ret
    })

    benchmark.addCase("Test use multi mapper") { _: Int =>
      val latch = new CountDownLatch(valuesPerIteration)
      val executor = ThreadUtils.newDaemonFixedThreadPool(threads, "multi")
      for (i <- 0 until valuesPerIteration) {
        executor.submit(new Runnable {
          override def run(): Unit = {
            val idx = i % threads
            multi(idx).writeValueAsString(map)
            latch.countDown()
          }
        })
      }
      latch.await()
      executor.shutdown()
    }

    val mapper = new ObjectMapper()
    mapper.registerModule(DefaultScalaModule)
    val singleton = Array.fill(threads)(mapper)
    benchmark.addCase("Test use singleton mapper") { _: Int =>
      val latch = new CountDownLatch(valuesPerIteration)
      val executor = ThreadUtils.newDaemonFixedThreadPool(threads, "singleton")
      for (i <- 0 until valuesPerIteration) {
        executor.submit(new Runnable {
          override def run(): Unit = {
            val idx = i % threads
            singleton(idx).writeValueAsString(map)
            latch.countDown()
          }
        })
      }
      latch.await()
      executor.shutdown()
    }

    benchmark.run()
  }

  def testReadJsonToMap(valuesPerIteration: Int, threads: Int): Unit = {

    val input = {
      val map = Map("intValue" -> 1,
        "longValue" -> 2L,
        "doubleValue" -> 3.0D,
        "stringValue" -> "4",
        "floatValue" -> 5.0F,
        "booleanValue" -> true)
      val mapper = new ObjectMapper()
      mapper.registerModule(DefaultScalaModule)
      mapper.writeValueAsString(map)
    }

    val benchmark = new Benchmark(s"Test $threads threads read json to map",
      valuesPerIteration, output = output)

    val multi = Array.fill(threads)({
      val ret = new ObjectMapper()
      ret.registerModule(DefaultScalaModule)
      ret
    })

    benchmark.addCase("Test use multi mapper") { _: Int =>
      val latch = new CountDownLatch(valuesPerIteration)
      val executor = ThreadUtils.newDaemonFixedThreadPool(threads, "multi")
      for (i <- 0 until valuesPerIteration) {
        executor.submit(new Runnable {
          override def run(): Unit = {
            val idx = i % threads
            multi(idx).readValue(input, classOf[mutable.HashMap[String, String]])
            latch.countDown()
          }
        })
      }
      latch.await()
      executor.shutdown()
    }

    val mapper = new ObjectMapper()
    mapper.registerModule(DefaultScalaModule)
    val singleton = Array.fill(threads)(mapper)

    benchmark.addCase("Test use singleton mapper") { _: Int =>
      val latch = new CountDownLatch(valuesPerIteration)
      val executor = ThreadUtils.newDaemonFixedThreadPool(threads, "singleton")
      for (i <- 0 until valuesPerIteration) {
        executor.submit(new Runnable {
          override def run(): Unit = {
            val idx = i % threads
            singleton(idx).readValue(input, classOf[mutable.HashMap[String, String]])
            latch.countDown()
          }
        })
      }
      latch.await()
      executor.shutdown()
    }

    benchmark.run()
  }

The results from GA are as follows:

OpenJDK 64-Bit Server VM 1.8.0_345-b01 on Linux 5.15.0-1020-azure
Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
Test 5 threads read json to map:          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Test use multi mapper                              1927           1959          46          0.5        1927.2       1.0X
Test use singleton mapper                          1727           1884         222          0.6        1727.0       1.1X

OpenJDK 64-Bit Server VM 1.8.0_345-b01 on Linux 5.15.0-1020-azure
Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
Test 10 threads read json to map:         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Test use multi mapper                              1411           1465          76          0.7        1411.2       1.0X
Test use singleton mapper                          1297           1322          36          0.8        1296.7       1.1X

OpenJDK 64-Bit Server VM 1.8.0_345-b01 on Linux 5.15.0-1020-azure
Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
Test 20 threads read json to map:         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Test use multi mapper                              1570           1598          41          0.6        1569.7       1.0X
Test use singleton mapper                          1433           1455          31          0.7        1432.7       1.1X

OpenJDK 64-Bit Server VM 1.8.0_345-b01 on Linux 5.15.0-1020-azure
Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
Test 5 threads write map to json:         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Test use multi mapper                               804            948         183          1.2         803.7       1.0X
Test use singleton mapper                           778            954         167          1.3         777.7       1.0X

OpenJDK 64-Bit Server VM 1.8.0_345-b01 on Linux 5.15.0-1020-azure
Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
Test 10 threads write map to json:        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Test use multi mapper                               784            865          75          1.3         784.0       1.0X
Test use singleton mapper                           777            954         179          1.3         776.6       1.0X

OpenJDK 64-Bit Server VM 1.8.0_345-b01 on Linux 5.15.0-1020-azure
Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
Test 20 threads write map to json:        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Test use multi mapper                               777            841          96          1.3         777.4       1.0X
Test use singleton mapper                           782            932         134          1.3         782.1       1.0X

I also used a bare-metal server to test with more threads. The results are as follows:

OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 4.14.0_1-0-0-45
Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz
Test 5 threads read json to map:          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Test use multi mapper                              1197           1338         200          0.8        1197.1       1.0X
Test use singleton mapper                          1023           1277         360          1.0        1022.8       1.2X

OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 4.14.0_1-0-0-45
Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz
Test 10 threads read json to map:         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Test use multi mapper                              1193           2305        1573          0.8        1193.0       1.0X
Test use singleton mapper                          1048           1487         621          1.0        1047.6       1.1X

OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 4.14.0_1-0-0-45
Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz
Test 20 threads read json to map:         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Test use multi mapper                              1113           1491         535          0.9        1113.3       1.0X
Test use singleton mapper                          1351           1455         147          0.7        1351.0       0.8X


OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 4.14.0_1-0-0-45
Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz
Test 50 threads read json to map:         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Test use multi mapper                              1003           1471         661          1.0        1003.3       1.0X
Test use singleton mapper                          1000           1330         468          1.0         999.8       1.0X


OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 4.14.0_1-0-0-45
Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz
Test 75 threads read json to map:         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Test use multi mapper                              1019           1345         461          1.0        1018.7       1.0X
Test use singleton mapper                          1024           1241         306          1.0        1024.3       1.0X

OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 4.14.0_1-0-0-45
Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz
Test 5 threads write map to json:         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Test use multi mapper                               842           1474         894          1.2         842.0       1.0X
Test use singleton mapper                           959           1585         884          1.0         959.5       0.9X

OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 4.14.0_1-0-0-45
Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz
Test 10 threads write map to json:        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Test use multi mapper                              1280           1844         797          0.8        1280.2       1.0X
Test use singleton mapper                           889           1634        1054          1.1         888.5       1.4X

OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 4.14.0_1-0-0-45
Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz
Test 20 threads write map to json:        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Test use multi mapper                               858            957         132          1.2         857.6       1.0X
Test use singleton mapper                           896           1625        1031          1.1         895.6       1.0X


OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 4.14.0_1-0-0-45
Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz
Test 50 threads write map to json:        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Test use multi mapper                               884            960         115          1.1         884.2       1.0X
Test use singleton mapper                           843           1227         626          1.2         843.3       1.0X


OpenJDK 64-Bit Server VM 1.8.0_312-b07 on Linux 4.14.0_1-0-0-45
Intel(R) Xeon(R) Gold 6271C CPU @ 2.60GHz
Test 75 threads write map to json:        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Test use multi mapper                              1094           1109          22          0.9        1093.7       1.0X
Test use singleton mapper                           872           1257         658          1.1         871.5       1.3X

@LuciferYang
Contributor Author

@srowen From the above test results, there is no significant performance difference between one mapper shared by all threads (a global singleton) and separate mappers (local singletons).

From a code perspective, thread safety does not appear to rely on locks; it seems to be guaranteed by creating a new JsonParser/JsonGenerator for every serialization and deserialization.

https://github.com/FasterXML/jackson-databind/blob/492334da383b0677c19b08964fddad18230546af/src/main/java/com/fasterxml/jackson/databind/ObjectMapper.java#L3838-L3851

https://github.com/FasterXML/jackson-databind/blob/492334da383b0677c19b08964fddad18230546af/src/main/java/com/fasterxml/jackson/databind/ObjectMapper.java#L3647-L3658

So there seems to be no problem with a global instance. ObjectMapper is only used as a configuration template, so I am OK with either a global singleton or a local singleton.
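
To illustrate the point about sharing, here is a minimal sketch of one configured ObjectMapper used from several threads at once. The object name `SharedMapperDemo`, the method `parseConcurrently`, and the timeout are assumptions for illustration, not code from this PR. It also assumes the mapper is fully configured before it is shared and never reconfigured afterwards.

```scala
import java.util.concurrent.{Callable, Executors, TimeUnit}

import scala.collection.mutable

import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule

object SharedMapperDemo {
  // One mapper for all threads; module registration happens once, up front.
  private val mapper: ObjectMapper = {
    val m = new ObjectMapper()
    m.registerModule(DefaultScalaModule)
    m
  }

  /** Parses `json` `tasks` times on a pool of `threads` worker threads. */
  def parseConcurrently(
      json: String,
      threads: Int,
      tasks: Int): Seq[mutable.HashMap[String, String]] = {
    val pool = Executors.newFixedThreadPool(threads)
    try {
      // Submit every task first, then wait for the results.
      val futures = (0 until tasks).map { _ =>
        pool.submit(new Callable[mutable.HashMap[String, String]] {
          override def call(): mutable.HashMap[String, String] =
            mapper.readValue(json, classOf[mutable.HashMap[String, String]])
        })
      }
      futures.map(_.get(30, TimeUnit.SECONDS))
    } finally {
      pool.shutdown()
    }
  }
}
```

Every task goes through the same `mapper` reference, which is the usage pattern the "singleton mapper" benchmark case exercises.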

@LuciferYang LuciferYang marked this pull request as ready for review September 27, 2022 11:49
@srowen
Member

srowen commented Sep 27, 2022

To be clear I was saying it is already thread-safe but the issue could be lock contention. It may not be an issue in Spark, and/or fixed in Jackson, but I'm looking at posts like https://medium.com/feedzaitech/when-jackson-becomes-a-parallelism-bottleneck-f1440a50b429

I agree it doesn't look like there is evidence of contention here, so probably worth the perf improvement.

@LuciferYang
Contributor Author

FasterXML/jackson-core#349 (comment)

FasterXML/jackson-core@67add8c#diff-190cbec71e87394830d19fa2fea51b3bc324aa5fe694fc036ef85d1ad39d528f

It seems that Jackson 2.8.7 fixed the problem mentioned in the post. If we still have doubts about similar issues, I can limit the changes as you suggested.
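
For anyone who wants to confirm which jackson-core release is on their classpath, a small sketch follows. The object name `JacksonCoreVersionCheck` and the helper `hasFix` are hypothetical; the 2.8.7 threshold is taken from the comment above and is not verified independently here.

```scala
import com.fasterxml.jackson.core.JsonFactory

object JacksonCoreVersionCheck {
  /** True when major.minor.patch is the same as or newer than 2.8.7. */
  def hasFix(major: Int, minor: Int, patch: Int): Boolean =
    Ordering[(Int, Int, Int)].gteq((major, minor, patch), (2, 8, 7))

  def main(args: Array[String]): Unit = {
    // JsonFactory reports the version of the jackson-core jar it was loaded from.
    val v = new JsonFactory().version()
    val ok = hasFix(v.getMajorVersion, v.getMinorVersion, v.getPatchLevel)
    println(s"jackson-core $v, at or above 2.8.7: $ok")
  }
}
```

The comparison is done on the numeric components only, so snapshot or release-candidate suffixes are ignored.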

@LuciferYang
Contributor Author

LuciferYang commented Sep 27, 2022

> I wonder if we can reuse ObjectMapper inside classes where it matters for perf and not try to share one instance so widely.

Following this principle, RebaseDateTime, DataSourceV2Utils and FileDataSourceV2 can reuse an ObjectMapper at class scope; there is no opportunity for reuse in the other 3 files.

@srowen
Member

srowen commented Sep 27, 2022

I'm OK with the change. I guess I'd slightly prefer keeping the change more 'local' by having a singleton in each of the classes that need this. That minimizes the scope. I don't feel strongly about it.
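
A "local" singleton of this kind could look roughly like the sketch below: the mapper lives in a companion object, so it is created once per JVM but stays private to the class that needs it. `MergeInfoReader` is a hypothetical class used only for illustration; it is not one of the files touched by this PR.

```scala
import scala.collection.mutable

import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule

// Hypothetical class: every instance reuses the mapper held by its companion.
class MergeInfoReader {
  import MergeInfoReader.mapper

  def parse(json: String): mutable.HashMap[String, String] =
    mapper.readValue(json, classOf[mutable.HashMap[String, String]])
}

object MergeInfoReader {
  // Built lazily on first use; not visible outside this class and its companion.
  private lazy val mapper: ObjectMapper = {
    val m = new ObjectMapper()
    m.registerModule(DefaultScalaModule)
    m
  }
}
```

Keeping the mapper private limits the scope of the change: other classes cannot reconfigure it, and any contention is confined to callers of this one class.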

@LuciferYang LuciferYang changed the title [SPARK-39146][CORE][SQL][K8S] Introduce JacksonUtils to use singleton Jackson ObjectMapper [SPARK-39146][CORE][SQL]Introduce JacksonUtils to use singleton Jackson ObjectMapper Sep 27, 2022
@@ -39,11 +38,10 @@ import org.apache.spark.annotation.DeveloperApi
class ErrorClassesJsonReader(jsonFileURLs: Seq[URL]) {
  assert(jsonFileURLs.nonEmpty)

  private lazy val mapper = Utils.withScalaModuleMapper
Member

Wouldn't we want these in a companion object?

   * Return a new `ObjectMapper` with `ClassTagExtensions`.
   * The mapper registers `DefaultScalaModule` by default.
   */
  def withScalaModuleMapper: ObjectMapper with ClassTagExtensions = {
Member

OK, I don't know if we need to factor out a utility method just for this; I'd be OK inlining this in the ~3 places it is used.

Contributor Author

Yes, it seems a little redundant, so it has been removed.

@LuciferYang LuciferYang changed the title [SPARK-39146][CORE][SQL]Introduce JacksonUtils to use singleton Jackson ObjectMapper [SPARK-39146][CORE][SQL] Introduce local singleton for ObjectMapper that may be reused Sep 27, 2022
@LuciferYang
Contributor Author

GA passed

@srowen
Member

srowen commented Sep 29, 2022

Merged to master

@srowen srowen closed this in 9440742 Sep 29, 2022
@LuciferYang
Contributor Author

thanks @srowen
