Hi, I want to get the top k elements with MinHashLSH, but it failed. For example, I set 'k=3', but I got ('result: ', ['21', '28', '51', '1', '82', '3', '91', '69', '86', '85']), whose length is larger than 3. My demo is below:
from datasketch import MinHashLSHForest

def query_topk(l, query_doc, k):
    forest = MinHashLSHForest(num_perm=256)
    count = 0
    for i in l:
        forest.add(str(count), i)
        count += 1
    forest.index()
    result = forest.query(query_doc, k)
    return result
Here l is a list of MinHash objects and query_doc is a MinHash.
Is there anything wrong?
By the way, must the input be a list of strings? What if my input is a vector?
Thanks for your patience,
Another question: does this implementation only support text? If each of my inputs is a list of floats, e.g. [[1,2,3],[1.2,2.3,2.1]], will this work?
Sincerely,
Thanks for raising the issue. I just fixed it in 1.2.1.
As for your question: MinHash takes bytes as input, so as long as you can convert the object (e.g., integers, strings, floats, lists) into bytes, it works with MinHash. For example:
# For a set of floats, e.g. {1.3, 123.4, 32.9, 3.1415926, ...}
minhash.update(struct.pack("f", 3.1415926))

# EVERY ELEMENT in your input set is a LIST of float
# e.g. {[1.34, 1.3, 343.0, 123.9], [2.3, 23.2, 86.8], ...}
minhash.update(struct.pack("4f", *[1.34, 1.3, 343.0, 123.9]))
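To make the conversion step concrete, here is a minimal sketch using only the standard library. The helper names float_to_bytes and float_list_to_bytes are hypothetical, chosen for illustration, and are not part of datasketch. The sketch builds the format string from the list length, so lists of different sizes (such as the 4-element and 3-element lists above) can be packed without hard-coding "4f".

```python
import struct


def float_to_bytes(x):
    """Pack one float as a 4-byte single-precision value (native byte order)."""
    return struct.pack("f", x)


def float_list_to_bytes(values):
    """Pack a list of floats of any length; the count goes into the format string."""
    return struct.pack(f"{len(values)}f", *values)


# A set whose elements are single floats.
float_set = {1.3, 123.4, 32.9, 3.1415926}
encoded_floats = [float_to_bytes(x) for x in float_set]

# A collection whose elements are lists of floats with differing lengths.
list_set = [[1.34, 1.3, 343.0, 123.9], [2.3, 23.2, 86.8]]
encoded_lists = [float_list_to_bytes(v) for v in list_set]

# Each bytes object can then be passed to minhash.update(...),
# in the same way as the struct.pack(...) calls shown above.
```

Note that two elements count as equal only if their packed bytes are identical, so floats that differ slightly (for example 1.3 and 1.3000001) are treated as different set elements. If that is not what you want, one option is to round or bucket the values before packing.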