[spark] support merge-read between kv snapshot and log for primary-key table#2523
wuchong merged 4 commits into apache:main from
Conversation
wuchong
left a comment
@YannByron thanks for the contribution.
I rebased the branch and appended a commit to address my minor comments. Will merge it if you don't have concerns.
```scala
private def createSortMergeReader(): SortMergeReader = {
  // Create key encoder for primary keys
  val keyEncoder = encode.KeyEncoder.of(rowType, tableInfo.getPhysicalPrimaryKeys, null)
```
In the latest main branch, we’ve refactored KeyEncoder.
Could you please rebase onto the latest main and use the KeyEncoder.ofPrimaryKey(...) method? Otherwise, the key encoding won’t align with the keys stored in RocksDB, leading to incorrect query results.
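To illustrate why the read-side encoder must match the write-side encoder, here is a hypothetical, self-contained sketch (not the Fluss `KeyEncoder` API): two encoders that serialize the same logical primary key into different byte layouts produce keys that never match, so lookups against the snapshot store silently miss.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only: encodeA stands in for the encoding used when
// writing keys to the store; encodeB for a mismatched encoding used at read time.
public class KeyEncodingMismatch {

    // Encoder A: length-prefixed UTF-8 bytes.
    static byte[] encodeA(String key) {
        byte[] utf8 = key.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + utf8.length).putInt(utf8.length).put(utf8).array();
    }

    // Encoder B: raw UTF-8 bytes, no prefix.
    static byte[] encodeB(String key) {
        return key.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        // Written with encoder A (byte arrays stringified so they can be map keys).
        store.put(Arrays.toString(encodeA("pk-1")), "row-1");

        // Read back with encoder B: same logical key, different bytes, no match.
        String hit = store.get(Arrays.toString(encodeB("pk-1")));
        System.out.println(hit == null ? "miss" : "hit"); // prints "miss"
    }
}
```

The same effect in the real code path would surface not as an exception but as incorrect (empty or stale) query results, which is what makes the rebase onto `KeyEncoder.ofPrimaryKey(...)` important.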
```java
public SortMergeReader(
    @Nullable int[] projectedFields,
    int[] pkIndexes,
    @Nullable CloseableIterator<LogRecord> lakeRecordIterator,
```
Could you rename this parameter and the member variable lakeRecordIterator to snapshotRecordIterator? That name better reflects how Spark uses it.
```scala
// Collect all log records until logStoppingOffset
val allLogRecords = mutable.ArrayBuffer[ScanRecord]()
```
We need to fetch by size to avoid OOM when the log store has a huge number of records.
@Yohahaha However, we need to sort the changelog, which requires buffering all changelog. Therefore, fetching by size doesn't make much sense in this context. If the changelog is truly huge, we may need to consider supporting spilling the changelog buffer to local disk.
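The trade-off discussed above can be sketched in miniature. This is a hypothetical illustration of the merge-read idea, not the PR's implementation: the kv snapshot stream is already sorted by key, but the changelog must be fully buffered so the last change per key wins before the two sides are merged, which is why bounded-size fetching alone does not help.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of merging a sorted kv snapshot with a buffered changelog.
public class MergeReadSketch {

    // A simplified record: key, value, and whether it is a deletion.
    record Record(int key, String value, boolean delete) {}

    static List<Record> mergeRead(List<Record> snapshot, List<Record> log) {
        // Buffer the entire changelog, keeping only the latest change per key.
        // This buffering is the memory cost discussed above; spilling it to
        // local disk would be the escape hatch for a truly huge changelog.
        Map<Integer, Record> latest = new LinkedHashMap<>();
        for (Record r : log) {
            latest.put(r.key(), r);
        }

        // Start from the snapshot (already sorted by key) and apply changes.
        TreeMap<Integer, Record> merged = new TreeMap<>();
        for (Record r : snapshot) {
            merged.put(r.key(), r);
        }
        for (Record r : latest.values()) {
            if (r.delete()) {
                merged.remove(r.key());
            } else {
                merged.put(r.key(), r);
            }
        }
        return new ArrayList<>(merged.values());
    }

    public static void main(String[] args) {
        List<Record> snapshot = List.of(new Record(1, "a", false), new Record(2, "b", false));
        List<Record> log = List.of(
            new Record(2, "b2", false),   // update of key 2
            new Record(3, "c", false),    // insert of key 3
            new Record(1, null, true));   // delete of key 1
        // Keys 2 (updated) and 3 (inserted) survive; key 1 is deleted.
        System.out.println(mergeRead(snapshot, log));
    }
}
```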
@YannByron I found a bug while testing reads of PK tables: it fails when the last column is used as the primary key. The current cases in SparkPrimaryKeyTableReadTest all use the first column and the partition column as the primary key.

```scala
test("Spark Read: primary key table with last pk") {
  withTable("t") {
    sql("CREATE TABLE t (id int, name string, pk int, pk2 string) TBLPROPERTIES('primary.key'='pk,pk2')")
    checkAnswer(sql("SELECT * FROM t"), Nil)
    sql("INSERT INTO t VALUES (1, 'a', 10, 'x'), (2, 'b', 20, 'y')")
    checkAnswer(sql("SELECT * FROM t ORDER BY id"), Row(1, "a", 10, "x") :: Row(2, "b", 20, "y") :: Nil)
  }
}
```

The above case will fail.
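As a hypothetical illustration of the scenario that test exercises (not the actual fix), extracting primary-key fields by their absolute column indexes works regardless of whether the pk columns sit at the front or the end of the schema; a reader that implicitly assumes the key columns come first would only pass the existing tests.

```java
import java.util.Arrays;

// Hypothetical helper: pull primary-key fields out of a row by column index.
public class PkExtraction {

    static Object[] extractPk(Object[] row, int[] pkIndexes) {
        Object[] key = new Object[pkIndexes.length];
        for (int i = 0; i < pkIndexes.length; i++) {
            key[i] = row[pkIndexes[i]];
        }
        return key;
    }

    public static void main(String[] args) {
        // Schema (id int, name string, pk int, pk2 string) with primary key
        // (pk, pk2) -- the trailing columns, as in the failing test above.
        Object[] row = {1, "a", 10, "x"};
        int[] pkIndexes = {2, 3};
        System.out.println(Arrays.toString(extractPk(row, pkIndexes))); // prints "[10, x]"
    }
}
```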
Thank you @Yohahaha, could you open a pull request to add and fix the test case?
Purpose
Linked issue: close #2427
Brief change log
Tests
API and Format
Documentation