[Question] Hamming distance C++ example #39
Comments
I made an example for Hamming distance. Because it uses the data in this repository, run it from the repository's top directory. The number of dimensions must be 16 for 128 bits (16 bytes of 8 bits each). I hope this example makes sense.
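As a rough illustration of why 128 bits correspond to 16 dimensions, here is a minimal standalone sketch of Hamming distance over byte vectors. It is not the example attached to this comment and it does not use NGT; the function name `hammingDistance` is hypothetical.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of NGT: counts the bit positions in which
// two equally sized byte vectors differ. A 128-bit descriptor is stored as
// 16 bytes, i.e. 16 "dimensions" of 8 bits each.
inline unsigned hammingDistance(const std::vector<std::uint8_t>& a,
                                const std::vector<std::uint8_t>& b) {
    assert(a.size() == b.size());  // both vectors must have the same dimension
    unsigned distance = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        // XOR leaves a 1 in every differing bit; count those bits per byte.
        distance += static_cast<unsigned>(std::bitset<8>(a[i] ^ b[i]).count());
    }
    return distance;
}
```

Two 16-byte vectors that differ only in one fully inverted byte would give a distance of 8 under this sketch.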
|
Fantastic. Thank you very much, Dr. Iwasaki. So going by your example, the way we input data to the index (in chunks as floats, strings, etc.) doesn't affect the algorithm, right? (That is perfectly logical, but implementation details can sometimes introduce oddities.) |
You're welcome. Now I think Index::append() is confusing when using Hamming distance. Each float value is forcibly converted to an unsigned char variable. This means that a float value must be non-negative and smaller than 256.
I am not sure how you want to apply Hamming distance to your floats and strings, because Hamming distance is a bitwise measure. |
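The range constraint described above can be sketched as a standalone check. This does not call NGT; `toBytes` is a hypothetical helper, and rejecting non-integral values is an assumption of this sketch (a plain cast would silently truncate them instead).

```cpp
#include <cmath>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not part of NGT: converts float values to bytes and
// rejects anything an unsigned char cannot represent exactly.
inline std::vector<std::uint8_t> toBytes(const std::vector<float>& values) {
    std::vector<std::uint8_t> bytes;
    bytes.reserve(values.size());
    for (float v : values) {
        // Written so that NaN also fails the range test.
        const bool inRange = (v >= 0.0f && v < 256.0f);
        // Assumption of this sketch: fractional values are treated as errors.
        if (!inRange || std::floor(v) != v) {
            throw std::out_of_range("value does not fit in an unsigned char");
        }
        bytes.push_back(static_cast<std::uint8_t>(v));
    }
    return bytes;
}
```

Note that 16 floats occupy 64 bytes while 16 unsigned chars occupy 16 bytes, so a 128-bit descriptor needs one value per byte, not one value per 4 bytes.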
Indeed, I was wondering whether some memory alignment or input data separation depended on the element count when adding data to the index. If so, having N/4 4-byte elements instead of N 1-byte elements could cause problems or unexpected behavior. Well, in your example the number of elements is originally the same and the extra bytes are discarded later.
Oh, I overlooked this. A uint8 overload would be a lot clearer, yes, and since the underlying function doing the memory allocation/copying is already templated, the required changes are minimal. |
To resolve the confusion, I updated Index::append() so that it can be used with a uint8 argument. You can write it as in the example below.
|
Below is a sample for Hamming distance with the latest version, for your reference.
|
Looks good, thanks again! By the way, ONNG takes a really long time to build, around 15-20 minutes on my computer, which I assume is expected. But what about PANNG? Does it build any faster? |
PANNG can be built faster than ONNG with the recommended number of edges, 100. But first, you may want to reduce the number of edges of ONNG (ANNG) to build it faster, because 100 edges are too many for most datasets. The number of edges can be specified with NGT::Property::edgeSizeForCreation or the -E option of the ngt create command. |
Happy New Year. Running some more benchmarks, when I construct an ONNG index by setting: it actually takes less time than if I construct an ANNG index with 10-20 edges and then run the optimizer over it, which, if I understand correctly, constructs a PANNG:
Is this correct? Am I misunderstanding something? Thanks as always for your assistance. |
Happy New Year. I look forward to working with you again this year. I experimentally added the type ONNG ( PANNG can be created with this command. The index constructed in the following manner might be almost the same as PANNG, but I am not so sure.
|
Thanks as always for your assistance. I'm finishing up an academic publication regarding ANN algorithms and their use for Augmented Reality, and I wanted to publicly thank you for your assistance, if you are OK with that. |
I'm glad to help you! |
Hi,
Do you have any example of NGT usage in C++ with Hamming distance?
I'm currently creating an index with these properties for use with a 128-bit query:
While it does compile and run, no result is found. I'm appending data as floats, which should be the same size as a uint, but maybe that's incorrect.
Using L2 and floats works fine, so I'm probably doing something wrong on my end. An example would really help me.
Thanks in advance.