Digests of data queries may mismatch even though mutation queries return equal mutations #1165
A data query doesn't include metadata such as tombstones and timestamps. It comes with a digest, which should differ if the metadata (or, of course, the main data) differs. For read-repair to make progress, we want the digest to differ only if there is something to repair. Read-repair does a mutation query and sends out diffs to fix the digest mismatch. That increases read latency, so we would like to avoid it unless really necessary.
The current digest calculation algorithm includes data which may no longer exist after the mutation is compacted, for example cells covered by higher-level tombstones or GC-able tombstones. This may lead to mismatched digests even though the mutations returned by mutation queries, which always compact, are equal.
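
The mismatch can be illustrated with a toy model. This is only a sketch, not the actual implementation: `Partition`, `compact`, and `digest` are hypothetical names, a partition is reduced to one partition-level tombstone timestamp plus a map of cells, and GC-able tombstones are left out for brevity.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Partition:
    # Hypothetical simplified model: one partition-level tombstone
    # and a map of cell name -> (write timestamp, value).
    tombstone_ts: Optional[int] = None
    cells: Dict[str, Tuple[int, str]] = field(default_factory=dict)


def compact(p: Partition) -> Partition:
    """Drop cells covered by the partition tombstone."""
    if p.tombstone_ts is None:
        live = dict(p.cells)
    else:
        live = {
            name: (ts, value)
            for name, (ts, value) in p.cells.items()
            if ts > p.tombstone_ts
        }
    return Partition(p.tombstone_ts, live)


def digest(p: Partition) -> str:
    """Hash the tombstone and every cell in a deterministic order."""
    h = hashlib.sha256()
    h.update(repr(p.tombstone_ts).encode())
    for name in sorted(p.cells):
        ts, value = p.cells[name]
        h.update(f"{name}:{ts}:{value};".encode())
    return h.hexdigest()


# Replica A still stores a cell that the tombstone already covers.
replica_a = Partition(tombstone_ts=10, cells={"c1": (5, "old"), "c2": (20, "new")})
# Replica B has already dropped that cell.
replica_b = Partition(tombstone_ts=10, cells={"c2": (20, "new")})

# Digests over uncompacted data disagree...
raw_mismatch = digest(replica_a) != digest(replica_b)
# ...while digests over compacted data agree, so there is nothing to repair.
compacted_match = digest(compact(replica_a)) == digest(compact(replica_b))
```

In this model, hashing only what survives compaction makes the two replicas agree, which is the property the digest needs for read-repair to trigger only when there is something to repair.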