[SPARK-27216][CORE] Upgrade RoaringBitmap to 0.7.45 #24157
This UT only works after #24156 is fixed. Now it's easy to reproduce by replacing
Since the current RoaringBitmap can't be serialized and deserialized correctly with the unsafe KryoSerializer, the first thing I can think of is replacing this data structure, either entirely or only when unsafe Kryo is in use. What do you think?
Does this need to be serialized? I wouldn't think so if it doesn't work!
val bitmap2: RoaringBitmap = safeSer.deserialize(safeSer.serialize(bitmap))
assert(bitmap2.equals(bitmap))

conf.set("spark.kryo.unsafe", "true")
You are changing the conf, which is also used by other tests in the suite, so the execution order of these tests now matters. If this test runs first and the others run later, they might fail.
It can be moved to a totally new suite. I will update it.
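One way to avoid the shared-conf problem raised in this thread is to build a fresh `SparkConf` for every round trip instead of mutating the suite-level one. The following is only a sketch under that assumption: `UnsafeKryoRoundTrip` and `roundTrip` are hypothetical names, not part of this patch, and the result for `unsafe = true` is deliberately not asserted because that is the behavior under investigation.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.serializer.KryoSerializer
import org.roaringbitmap.RoaringBitmap

// Hypothetical helper (not from the patch): every call gets its own conf,
// so no other test can observe the spark.kryo.unsafe setting.
object UnsafeKryoRoundTrip {
  def roundTrip(bitmap: RoaringBitmap, unsafe: Boolean): RoaringBitmap = {
    // loadDefaults = false keeps system properties out of the conf.
    val conf = new SparkConf(false)
      .set("spark.kryo.unsafe", unsafe.toString)
    val ser = new KryoSerializer(conf).newInstance()
    ser.deserialize[RoaringBitmap](ser.serialize(bitmap))
  }
}
```

A test could then compare `roundTrip(bitmap, unsafe = false)` and `roundTrip(bitmap, unsafe = true)` against the original bitmap without depending on the order in which tests run.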
Can one of the admins verify this patch?
Err, scratch that, I was looking at entirely the wrong thing. I'm also confused here -- so far, this change is just the failing UT, right? You will add the actual fix to behavior as part of this PR?
@srowen @squito I've added another UT which is the minimized dataset from our product issue.
if (buf.size == 0) {
  // throwFetchFailedException(blockId, address, new IOException(msg))
}
After that, the testing
@LantaoJin I'm still confused by the status of this -- it seems it's just test changes, not behavior changes, but it sounds like you are saying some behavior is just broken. It's labeled as a WIP, but you've also pinged people for review. Are you looking for help in determining the right fix? If so, it would help us if you could give a more complete description of what goes wrong. I don't see anything obviously wrong with unsafe Kryo and RoaringBitmap -- you could try serializing a tiny bitmap and see if the bits make sense. Or do you believe this by itself is actually the complete change?
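The suggestion to serialize a tiny bitmap and inspect the bits could be sketched roughly as below. This uses RoaringBitmap's own `serialize`/`deserialize` methods as a baseline, not the Kryo path discussed in this PR; `TinyBitmapCheck`, `toBytes`, `fromBytes`, and `hex` are hypothetical names introduced only for illustration.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}
import org.roaringbitmap.RoaringBitmap

// Hypothetical helper: baseline round trip through RoaringBitmap's native format.
object TinyBitmapCheck {
  // Write the bitmap with its own serializer and return the raw bytes.
  def toBytes(bitmap: RoaringBitmap): Array[Byte] = {
    val bos = new ByteArrayOutputStream()
    val out = new DataOutputStream(bos)
    bitmap.serialize(out)
    out.flush()
    bos.toByteArray
  }

  // Rebuild a bitmap from bytes produced by toBytes.
  def fromBytes(bytes: Array[Byte]): RoaringBitmap = {
    val bitmap = new RoaringBitmap()
    bitmap.deserialize(new DataInputStream(new ByteArrayInputStream(bytes)))
    bitmap
  }

  // Hex dump, so two serialized forms can be compared by eye.
  def hex(bytes: Array[Byte]): String =
    bytes.map(b => f"${b & 0xff}%02x").mkString(" ")
}
```

Comparing `hex(toBytes(bitmap))` with the bytes produced by the safe and unsafe Kryo serializers for the same small bitmap might show where the two encodings diverge.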
Since

package org.apache.spark.sql

import org.apache.spark.internal.config
import org.apache.spark.internal.config.Kryo._
import org.apache.spark.internal.config.SERIALIZER
import org.apache.spark.sql.internal.SQLConf
import org.apache.spark.sql.test.SharedSQLContext

class SQLQueryWithKryoSuite extends QueryTest with SharedSQLContext {

  override protected def sparkConf = super.sparkConf
    .set(SERIALIZER, "org.apache.spark.serializer.KryoSerializer")
    .set(KRYO_USE_UNSAFE, true)

  test("kryo unsafe data quality issue") {
    // This issue can be reproduced when:
    // 1. KryoSerializer is enabled
    // 2. spark.kryo.unsafe is set to true
    // 3. HighlyCompressedMapStatus is used, since it uses RoaringBitmap
    // 4. spark.sql.shuffle.partitions is set to 6000; with the supplied data, 6000 triggers the issue
    // 5. The zero-size-block fetch failure exception in ShuffleBlockFetcherIterator is commented out;
    //    otherwise this job will fail with a FetchFailedException.
    withSQLConf(
      SQLConf.SHUFFLE_PARTITIONS.key -> "6000",
      config.SHUFFLE_MIN_NUM_PARTS_TO_HIGHLY_COMPRESS.key -> "-1") {
      withTempView("t") {
        val df = spark.read.parquet(testFile("test-data/dates.parquet")).toDF("date")
        df.createOrReplaceTempView("t")
        // The total row count must equal the sum of the per-date counts.
        checkAnswer(
          sql("SELECT COUNT(*) FROM t"),
          sql(
            """
              |SELECT SUM(a) FROM
              |(
              |SELECT COUNT(*) a, date
              |FROM t
              |GROUP BY date
              |)
            """.stripMargin))
      }
    }
  }
}
@LantaoJin you should be able to reopen this, or it will reopen if you push a new commit.
Sorry, I can't reopen it because of a force push. I opened a new PR, #24264, as an update.
That's fine, I can reopen them too, but you already have a new PR.
What changes were proposed in this pull request?
HighlyCompressedMapStatus uses RoaringBitmap to record the empty blocks, but RoaringBitmap can't be serialized and deserialized correctly with the unsafe KryoSerializer.
How was this patch tested?
Added a UT.
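To illustrate why a corrupted bitmap matters here, the sketch below shows a simplified map status that records empty blocks in a RoaringBitmap and reports an average size for everything else. It is an assumption-laden stand-in, not Spark's actual HighlyCompressedMapStatus: `SketchMapStatus`, `fromSizes`, and `sizeForBlock` are hypothetical names, and details such as how the real class handles very large blocks are omitted.

```scala
import org.roaringbitmap.RoaringBitmap

// Hypothetical, simplified stand-in; NOT Spark's HighlyCompressedMapStatus.
final case class SketchMapStatus(emptyBlocks: RoaringBitmap, avgSize: Long) {
  // A block flagged as empty reports size 0; all others report the average.
  def sizeForBlock(reduceId: Int): Long =
    if (emptyBlocks.contains(reduceId)) 0L else avgSize
}

object SketchMapStatus {
  def fromSizes(sizes: Array[Long]): SketchMapStatus = {
    val empty = new RoaringBitmap()
    var total = 0L
    var nonEmpty = 0
    sizes.zipWithIndex.foreach { case (size, i) =>
      if (size > 0) { total += size; nonEmpty += 1 } else empty.add(i)
    }
    // Average over non-empty blocks only.
    val avg = if (nonEmpty > 0) total / nonEmpty else 0L
    SketchMapStatus(empty, avg)
  }
}
```

In a structure like this, a bitmap that comes back from deserialization with different bits would make a non-empty block look empty or an empty block look non-empty, which would be consistent with the wrong counts and zero-size block fetches described in this conversation.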