Currently, our MinHashing scheme falls back to an LSH scheme for approximate MinHashing. This reduces data replication from n to b (where n is the number of elements and b is the number of buckets). However, more efficient approximate LSH schemes can reduce replication further. We should add a method such as multi-probe LSH:
Lv, Qin, et al. "Multi-probe LSH: efficient indexing for high-dimensional similarity search." Proceedings of the 33rd international conference on Very large data bases. VLDB Endowment, 2007.
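To make the proposal concrete, here is a minimal sketch of the probing idea, under several assumptions. It uses the simple step-wise probing sequence (perturbing the query's bucket key by ±1 in a few coordinates) rather than the query-directed sequence the paper recommends, and it uses Euclidean hashes of the form floor((a·v + b) / w) rather than MinHash signatures, so how to adapt the perturbation step to our MinHash buckets is still an open design question. All names (`make_hash`, `probe_keys`, `build_table`, `query`) are hypothetical and do not refer to existing code in this project.

```python
import itertools
import math
import random
from collections import defaultdict


def make_hash(dim, m, w, rng):
    """Return g(v) = (h_1(v), ..., h_m(v)) with h_i(v) = floor((a_i . v + b_i) / w).

    Assumption: Euclidean (p-stable) hashes, not MinHash signatures.
    """
    a = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(m)]
    b = [rng.uniform(0.0, w) for _ in range(m)]

    def g(v):
        return tuple(
            math.floor((sum(x * y for x, y in zip(a_i, v)) + b_i) / w)
            for a_i, b_i in zip(a, b)
        )

    return g


def probe_keys(key, max_steps):
    """Yield the home bucket key, then every key that differs from it
    by +1 or -1 in at most `max_steps` coordinates (step-wise probing)."""
    yield key
    for n in range(1, max_steps + 1):
        for positions in itertools.combinations(range(len(key)), n):
            for deltas in itertools.product((-1, 1), repeat=n):
                probe = list(key)
                for pos, delta in zip(positions, deltas):
                    probe[pos] += delta
                yield tuple(probe)


def build_table(points, g):
    """Insert each point once into a single table (no replication)."""
    table = defaultdict(list)
    for idx, v in enumerate(points):
        table[g(v)].append(idx)
    return table


def query(table, g, q, max_steps=1):
    """Collect candidate indices from the home bucket and its probed neighbours."""
    candidates = set()
    for key in probe_keys(g(q), max_steps):
        candidates.update(table.get(key, ()))
    return candidates
```

In this sketch each point is stored once per table, and the extra recall comes from looking up several nearby buckets at query time. That is the trade the multi-probe approach relies on: more lookups per query in exchange for fewer tables, and therefore less replication.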