Conversation
|
FYI, I've seen this. Will need some time to understand and review the changes. |
asap-query-engine/src/precompute_operators/datasketches_kll_accumulator.rs
Outdated
Show resolved
Hide resolved
|
Just thinking out loud here. At a very high-level. I think the PR content looks okay but (a) it's mixing multiple different objectives, (b) is difficult to review because it's so large and touches many different files. Some thoughts about how we can split this into more manageable reviewable chunks:
Pls share your thoughts @zzylol @GnaneshGnani |
|
@milindsrivastava1997 - This sounds good. I would suggest two PRs for adding sketchlib to |
|
@GnaneshGnani I would actually like to keep the changes to the same sketch in asap-sketch-ingest and asap-query-engine in the same PR. Instead of splitting by module, you can split by sketch. So you can have 1 PR per sketch or if that's too much overhead, we can do 1 PR for the first sketch and then another for all the remaining ones. I think this is better just so that the repo is in a runnable state at all times. |
|
@milindsrivastava1997, is this ok? PR 1: Rust CI changes |
|
Sounds great @GnaneshGnani |
#124 #137
Core sketch types
CountMinBackend,KllBackend, andCountMinWithHeapBackendenums withLegacyandSketchlibvariantsbackendfield to choose between the two implementations#[serde(skip)]on backend fields; wire format remains matrix-based for compatibilitySketch-core
from_legacy_matrix(),sketch(),sketch_mut()accessorssketch_bytes(),count(); updatemerge,merge_refs, serialization, and Clonefrom_legacy_matrix(),sketch_mut(),sketch_matrix(),topk_heap_items()accessorsKllSketch(no structural changes)#[ctor::ctor]to force legacy mode in testsQuery engine
UDF templates
countminsketch_count.rs.j2,countminsketch_sum.rs.j2,countminsketchwithheap_topk.rs.j2datasketcheskll_.rs.j2,hydrakll_.rs.j2Testing
SKETCH_CORE_CMS_IMPL,SKETCH_CORE_CMWH_IMPL,SKETCH_CORE_KLL_IMPL(default: sketchlib)ARROYO_SKETCH_CMS_IMPL,ARROYO_SKETCH_CMWH_IMPL,ARROYO_SKETCH_KLL_IMPL(default: sketchlib)