Recall stays 0.0001 #2

GertjanBrouwer · 2019-08-06T12:57:26Z

I am trying to generate a graph using efanna_graph. I have a dataset of 4000 images. I computed the SIFT descriptors for all of them, wrote them to a .fvecs file, and used that file to generate the graph. Unfortunately, the efanna_graph recall never went above 0.0001. I believe the issue is in the way I write the descriptors to .fvecs.

I have tried writing .fvecs in multiple ways. The code I am using now is this: write_to_fvecs. As you can see, after calculating the descriptors for each image and concatenating them into a single array, I write them to .fvecs using:
vectorArray.astype(np.int32).tofile('./my_sift_descriptors.fvecs')
As you can see, I use np.int32, which seems wrong to me. The reason for using np.int32 is as follows.

First I tried writing to file like this:
vectorArray.astype().tofile('./vanbeeklederwaren_astype_int32.fvecs')
But when I start efanna_graph test_nndescent I get this message: "data dimension: 1124073472
Floating point exception (core dumped)".

Then I tried writing to the file like this (which seems to me to be the correct way to do it):
vectorArray.astype(np.float32).tofile('./vanbeeklederwaren_astype_int32.fvecs')
But again when running efanna_graph I get this message: "data dimension: 1124073472
Floating point exception (core dumped)".
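
The reported dimension is not random. As a small sketch (the helper name float_bits_as_uint32 is made up for illustration), reinterpreting the 4 bytes of the float32 value 128.0 as an unsigned 32-bit integer gives exactly the number in the message, which suggests the reader is treating a float header as an integer:

```python
import struct


def float_bits_as_uint32(value):
    """Pack value as a little-endian float32 and reread the same bytes as uint32."""
    return struct.unpack('<I', struct.pack('<f', value))[0]


# 128.0 as float32 has the bit pattern 0x43000000.
print(float_bits_as_uint32(128.0))  # 1124073472
```

This only explains where the dimension value comes from; it says nothing about the cause of the crash that follows it.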

Then I used this snippet: read_fvecs, which you can use to read fvecs files in Python. I used it to read the first 4 bytes of 4 different files:

The fvecs file provided by TexMex showed the first 4 bytes to be of type float32 with the value 1.8e-43.

The file saved without specifying a type was also of type float32 but displayed 128.0 when printed.

The file saved as float32 was also of type float32 and also displayed 128.0 when printed.

The file saved as int32 was also of type float32 but displayed 1.8e-43 when printed.
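
The four readings above are easier to compare when the first 4 bytes are decoded both as an integer and as a float. The following is a minimal sketch, assuming a little-endian file; peek_header is a hypothetical helper, not part of efanna_graph or of the read_fvecs snippet:

```python
import struct


def peek_header(path):
    """Return the first 4 bytes of a file decoded as (int32, float32)."""
    with open(path, 'rb') as f:
        raw = f.read(4)
    if len(raw) != 4:
        raise ValueError("file is shorter than 4 bytes")
    as_int = struct.unpack('<i', raw)[0]
    as_float = struct.unpack('<f', raw)[0]
    return as_int, as_float
```

A header that holds the integer 128 decodes to (128, about 1.8e-43), while a header that holds the float 128.0 decodes to (1124073472, 128.0). Under that reading, 1.8e-43 for the TexMex file is the integer 128 viewed as a float32.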

I assumed the last file should be correct, so I continued: I calculated all my descriptors, saved them to .fvecs, and started efanna_graph. However, the training did not go as expected and the recall never went above 0.0001. The parameters I used: 200 200 20 10 100.

I can't seem to find a solution. Can you please provide the snippet you use to compute SIFT descriptors and save them to a .fvecs file?

Thank you.

fc731097343 · 2019-08-07T02:04:51Z

The way we read the file is to read the first number of each vector (the dimension, 128 in your case) as an "unsigned int" in C++. Then we read the vector as a sequence of floats in C++ (float32 in Python). Please see any test*.cpp in the test dir for an example.

I think you should write these two parts separately.
Specifically, you should write an int32 then write 128 float32 for each row of the matrix. Hope it can solve your problem.
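
The layout described above can be mirrored in Python to check a file before handing it to efanna_graph. This is only a sketch under some assumptions: it is not the project's C++ reader, it assumes a little-endian machine where array('f') is 4 bytes wide, and the names read_fvecs and max_dim are made up for illustration:

```python
import struct
from array import array


def read_fvecs(path, max_dim=1_000_000):
    """Read every vector of an .fvecs file into a list of array('f')."""
    vectors = []
    with open(path, 'rb') as f:
        while True:
            header = f.read(4)
            if not header:
                break  # clean end of file
            if len(header) != 4:
                raise ValueError("truncated dimension header")
            # Each record starts with the dimension as an unsigned 32-bit int.
            (dim,) = struct.unpack('<I', header)
            if dim == 0 or dim > max_dim:
                raise ValueError(f"implausible dimension {dim}")
            # The dimension is followed by dim float32 components.
            vec = array('f')
            try:
                vec.fromfile(f, dim)
            except EOFError:
                raise ValueError("truncated vector") from None
            vectors.append(vec)
    return vectors
```

The max_dim guard rejects a header such as 1124073472 early, rather than attempting to read several gigabytes for a single vector.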

GertjanBrouwer · 2019-08-07T11:38:07Z

Thank you very much, I misread the documentation and saved every value as int32. For anyone else with this problem, I use this to save the descriptors of a single image:
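from array import array
# output_file must be a file object opened in binary mode ('wb')
for descriptor in descriptors:
    # 4-byte integer header holding the dimension
    dimension_array = array('i', [128])
    dimension_array.tofile(output_file)
    # followed by the 128 components as float32
    float_array = array('f', descriptor)
    float_array.tofile(output_file)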
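
A quick way to sanity-check a file written like this is its size: every record takes 4 bytes for the dimension plus 4 bytes per component, so 516 bytes for a 128-dimensional descriptor. The helpers below are a hypothetical sketch of that check, not something efanna_graph provides:

```python
import os


def expected_fvecs_size(num_vectors, dim):
    """Size in bytes of an .fvecs file with num_vectors vectors of dimension dim."""
    return num_vectors * (4 + 4 * dim)


def count_vectors(path, dim):
    """Infer the number of vectors from the file size, or raise if it does not fit."""
    record_size = 4 + 4 * dim
    size = os.path.getsize(path)
    if size % record_size != 0:
        raise ValueError(
            f"{size} bytes is not a multiple of the {record_size}-byte record size"
        )
    return size // record_size
```

If count_vectors raises, or returns a number different from the number of descriptors written, the file does not follow the one-header-per-vector layout.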

GertjanBrouwer closed this as completed Aug 7, 2019
