
What is the structure of the split-#-#-#-minsc-2.0.h5 file? #28

Open
4xle opened this issue Nov 18, 2018 · 30 comments

@4xle

4xle commented Nov 18, 2018

I'm working on creating a documented training dataset pipeline for this network to see how different datasets might induce different learning outcomes. I've been able to follow the process for generating training data using VisualSFM fairly well, with some educated guesses combined with the descriptions in the paper and the code documentation, which is pretty good. I am attempting to recreate the training pipeline as closely as possible to the original work, with some additional documentation, and I would not mind turning it into a PR once it is finished and I know I've got it right.

However, there is a gap in my understanding regarding this section:

# Use only valid indices to ascertain mutual exclusiveness
id_file_name = train_data_dir + "split-"
id_file_name += str(param.dataset.nTrainPercent) + "-"
id_file_name += str(param.dataset.nValidPercent) + "-"
id_file_name += str(param.dataset.nTestPercent) + "-"
id_file_name += ("minsc-" +
                 str(param.dataset.fMinKpSize) +
                 ".h5")

if mode == "train":
    id_key = "indices_train"
elif mode == "valid":
    id_key = "indices_val"
elif mode == "test":
    id_key = "indices_test"
print(id_file_name)
with h5py.File(id_file_name, "r") as id_file:
    id_2_keep = np.asarray(id_file[id_key])

# ind_2_keep = np.in1d(dataset[2], id_2_keep)
# ind_2_keep += dataset[2] < 0

# loop through files to figure out how many valid items we have
# pdb.set_trace()  # for tracking of the dataset

This collection of valid indices and the concept of mutually excluding data are not referred to anywhere in the paper or elsewhere in the code, or if they are, I cannot find it. I can tell by the name that it is supposed to be some collection of ids for point features (my guess would be SfM point IDs?), but given its location in the code it is unclear how those values would be determined. Also, the access section for the dump files:

# Use loadh5 and turn it back to original cur_data_set
with h5py.File(final_dump_file_name, "r") as dump_file:
    cur_ids = dump_file["2"].value

which runs before the comparison is done, has no obvious correlation with the "valid_keypoints" and "other_keypoints" keys in the kp-minsc files, so I'm drawing a blank as to what those values could be.
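
For what it's worth, my reading of the commented-out lines is a membership check along these lines (a toy reconstruction, not verified against the real pipeline, and the guess that the ids are SfM point IDs is mine):

import numpy as np

# Stand-ins: dataset[2] would hold per-patch ids (negative for non-feature
# patches?), and id_2_keep is the split's index list from the h5 file.
ids = np.array([3, -1, 7, 42, -1, 5])
id_2_keep = np.array([3, 5, 100], dtype=np.uint32)

keep = np.in1d(ids, id_2_keep)  # keep patches whose id is in this split
keep |= ids < 0                 # non-feature patches are kept regardless
print(keep)                     # [ True  True False False  True  True]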

Could someone who is familiar with these files provide:

  1. A quick overview of how the split-#-#-#-minsc-2.0.h5 valid indices are determined for a data-subset?
  2. A short breakdown of the structure of the dump files to make sure that I am not missing any additional components they require?

Thanks,
4xle

@4xle
Author

4xle commented Nov 19, 2018

We split the data into training and validation sets, discarding views of training points on the validation set and vice-versa
(from the ECCV paper)

Going back over the paper again, I think it's possible that this is the section that the above code covers, but I'd still appreciate confirmation on this point if anyone can provide it.

@kmyi
Contributor

kmyi commented Nov 19, 2018

Sorry, we cannot support that part for now.

@4xle
Author

4xle commented Nov 19, 2018

I'm not asking for any kind of code support.

At this point I've made some progress, and I believe I have all the intermediate h5 training files, including the split-kp-minsc file, correctly generated by combining my own code with the documentation provided and the paper (both of which were very helpful).

My request is for just enough information regarding the h5 file compositions to make sure that I am building the network from the same kind of inputs in the same way the paper describes, in order to validate the training process and to avoid misrepresenting what the network is and isn't capable of learning.

It would assist me and others in validating your results if details about the h5 file structures generated during training could be provided somewhere. A brief set of key:value descriptions for the file structures would likely be sufficient. Once I'm certain I've got the training pipeline correctly implemented (which I think I do), I would happily contribute the code I have for building custom training datasets as a pull request.

@kmyi
Contributor

kmyi commented Nov 19, 2018

@etrulls do you have any of these things left? e.g. for a small example?

@qiuweibo

Hi,

I am also planning to train LIFT on the KITTI dataset, since my project is about traffic objects, so I would really appreciate hearing about any progress you make on training with your own datasets!

Regards,
Weibo

@17zhuhongbao

@4xle Hello, have you solved this problem? I am running into the same issue. Could I discuss it with you?

@4xle
Author

4xle commented Jan 9, 2019

I think I have it figured out, but I'm still waiting to hear back from @etrulls to confirm that I've got it right. It's possible I have something very similar but not exactly the same, which would give me different results.

Once I'm certain I've got it working properly I plan to submit a detailed PR.

@qiuweibo

qiuweibo commented Mar 5, 2019

I'm not asking for any kind of code support. […]

Hi,

Did you figure out training with your own datasets? If so, maybe we can discuss this problem a bit.
Thanks so much.

Regards,
Weibo.

@etrulls
Member

etrulls commented Mar 7, 2019

Sorry, I missed this (feel free to send an e-mail directly next time). It's just a list with the indices of the points used for training, validation, and testing.

>>> f = h5py.File('piccadilly/split-60-20-20-minsc-2.0-fix.h5', 'r')

>>> [k for k in f]
[u'indices_test', u'indices_train', u'indices_val']

>>> f['indices_train'].value
array([[     1],
       [     2],
       [     5],
       ...,
       [175961],
       [175962],
       [175963]], dtype=uint32)

Example: https://www.dropbox.com/s/uynuszoatimsbe1/split-60-20-20-minsc-2.0-fix.h5?dl=0
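
For illustration only (this is not the original splitting code), a file with the same layout could be generated like this; the point count and seed here are made up:

import h5py
import numpy as np

num_points = 175964                     # hypothetical number of SfM points
perm = np.random.RandomState(0).permutation(num_points).astype(np.uint32)
n_train, n_val = int(0.6 * num_points), int(0.2 * num_points)

with h5py.File('split-60-20-20-minsc-2.0.h5', 'w') as f:
    f['indices_train'] = np.sort(perm[:n_train]).reshape(-1, 1)
    f['indices_val'] = np.sort(perm[n_train:n_train + n_val]).reshape(-1, 1)
    f['indices_test'] = np.sort(perm[n_train + n_val:]).reshape(-1, 1)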

@hudsonmartins

I think I have it figured out but I'm still waiting to hear back from @etrulls […]

Hi @4xle, do you still plan to submit a PR about the training pipeline?

@punisher220

Excuse me, @4xle
I have been stuck on constructing the LIFT training dataset. Have you solved the problem of generating all the data required for LIFT training?
Could I discuss it with you? Thank you.

@4xle
Author

4xle commented Apr 13, 2020

@punisher220 Certainly, though if we can keep the discussion on here it may help other people.

@hudsonmartins I did intend to submit a PR, but thought I had lost the work due to a bad hard drive. However, past me appears to have had a better backup policy than I remembered implementing, and I have found a copy of at least some of the changes I made for tooling, though they may no longer be up to date. Assuming I can figure out what I did and document the changes, I will make that PR.

@4xle
Author

4xle commented Apr 13, 2020

PR #42 has been made.

@punisher220

@4xle Thank you for your hard work on the PR.

@punisher220

Excuse me, @4xle
I am sorry to bother you after you uploaded your PR files.

I still cannot solve the training data construction problem.
I got stuck creating the "kpfilename" files at Line 841 in helper.py, whose names end with "-kp-minsc-2.0.h5".
I put image files (including a .jpg, .mat, and .sift file for each image) into train_data_dir,
but I am not sure whether they are enough to construct the training data h5 files.
I need to confirm whether I have missed some important files from the VisualSFM pipeline.
That is my first question.

Then, as eccv.py shows, at Line 410 the code calls get_scale_hist(train_data_dir, param) in helper.py,
because I did not generate the "hist_file_path" file.
Next, at Line 843 in helper.py, the code requires the "kpfilename" file to be readable (opened with "r") to continue with the hist file generation.
My second question is: how do I obtain the "kpfilename" files from the files in train_data_dir?
I did not find any other place where those "kpfilename" files are opened for writing (with "w").

I noticed that from Line 836 to Line 840 in helper.py you wrote a comment about using the readAsciiSiftFile function to load the sift file, but I still could not work out how to generate the "-kp-minsc-2.0.h5" files.

Could you please help me with more hints? Thank you.
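
For reference, my current rough understanding of the hist step, once the kp files exist, is something like the sketch below (the scale column index and the output dataset names are my guesses, so please correct me):

import h5py
import numpy as np

scales = []
for kp_file_name in ["img1-kp-minsc-2.0.h5"]:   # hypothetical file list
    with h5py.File(kp_file_name, "r") as f:
        for key in ("valid_keypoints", "other_keypoints"):
            kp = np.asarray(f[key])
            scales.append(kp[:, 2])             # guessed: scale in column 2
all_scales = np.concatenate(scales)

hist, edges = np.histogram(all_scales, bins=100)
with h5py.File("scales-histogram-minsc-2.0.h5", "w") as f:
    f["histogram"] = hist                       # guessed dataset names
    f["bin_edges"] = edges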

@punisher220

Excuse me, @kmyid

At present, with the help of the PR from 4xle and the example folder uploaded by etrulls, I have come closer to constructing the LIFT training data.
But I still have problems that leave me confused.
But I still have problems which make me confused.

1. In your helper.py (in the datasets/eccv2016 folder) there is a function named get_scale_hist(train_data_dir, param) that generates the scales-histogram-minsc-2.0.h5 file.
At Line 811 of that helper.py, it tries to load files ending with "_P.mat"; I noticed 4xle also does this in the PR.
However, I can only get files ending with ".mat" and ".sift" from the VisualSFM pipeline.
I wonder whether your "_P.mat" files are the same as the ".mat" files I got from VisualSFM.

(1) If so: I tried to load my ".mat" files as you did at Line 812, with scipy.io.loadmat(kp_file_name)['TFeatures'], but an error occurred:

raise ValueError('Unknown mat file type, version %s, %s' % ret)
ValueError: Unknown mat file type, version 8, 8

Then I thought my ".mat" files might be ASCII files, so I tried to load them with readAsciiSiftFile(filepath) from 4xle's PR helper.py,
but another error occurred:

(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc6 in position 8: invalid continuation byte

Now I have no idea how to load the ".mat" files (or the ".sift" files). This is the main problem for me.

(2) If not: how did you get those "_P.mat" files that scipy.io.loadmat can load? I even suspect we may be using different VisualSFM software.
Or did you use some processing step to write those mat files?

2. I wonder how to get the "img_name-kp-minsc-2.0.h5" files; they have 4 groups: other_descriptors, other_keypoints, valid_descriptors, and valid_keypoints.
Even with the help of 4xle's PR, I still cannot work it out.
Could you please help me?
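
One check that helped me narrow down (1): scipy.io.loadmat only understands MATLAB MAT-files, and a v5 MAT-file starts with an ASCII "MATLAB 5.0 MAT-file" header, so a quick sniff shows whether a file is a MATLAB file at all (my guess is that VisualSFM's own .mat match files are a different, custom binary format):

def looks_like_matlab_mat(path):
    # MATLAB v5 MAT-files begin with an ASCII text header; anything else
    # (e.g. a custom binary format) will make scipy.io.loadmat fail.
    with open(path, "rb") as f:
        return f.read(19) == b"MATLAB 5.0 MAT-file"

print(looks_like_matlab_mat("some_image.mat"))  # hypothetical file name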

@4xle
Author

4xle commented Apr 17, 2020 via email

@4xle
Author

4xle commented Apr 17, 2020

@punisher220 The _P.mat files should be the same as the .mat files. The readAsciiSiftFile function was for reading the .sift files, but I think in the end it is not called anywhere (it was primarily for debugging/development).

@kmyid I've found some additional notes & scripts for pre-processing data prior to training; I'll clean them up and add them to the PR, then finish cleaning up the PR.
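
For anyone who does need to parse an ASCII keypoint file: assuming it follows Lowe's layout (a "<nkeys> 128" header, then per keypoint a "row col scale orientation" line followed by 128 descriptor integers), a minimal reader would look like the sketch below. That layout is an assumption on my part, and the binary .sift files VSFM writes need a different reader:

import numpy as np

def read_ascii_sift(path):
    # Lowe-style ASCII keypoints: header "<nkeys> <dim>", then per keypoint
    # row, col, scale, orientation followed by <dim> descriptor values.
    tokens = open(path).read().split()
    nkeys, dim = int(tokens[0]), int(tokens[1])
    data = np.asarray(tokens[2:2 + nkeys * (4 + dim)], dtype=np.float32)
    data = data.reshape(nkeys, 4 + dim)
    return data[:, :4], data[:, 4:]   # keypoints, descriptors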

@punisher220

Hello, @4xle
Thank you for your update to the PR.
I tested your convert_vsfm_to_h5.py with my path argument, covering the nvm file and the mat & sift files,
but at Line 140 I am still missing the kpfilename file ending with "-kps.h5".
You said in the PR that you are testing the whole process, so I guess you are working hard on that.

@kmyid
I have to admit that constructing the training dataset for LIFT is a complex process, and VisualSFM is a necessary part of it. I have a question about the VSFM pipeline.

The matching for every image pair (generating the .mat and .sift files)
consumes quite a long time in VisualSFM,
especially when you put in over 1000 images.

Even with GPU acceleration, it took me nearly 30 hours to finish the matching step for a 1700-image dataset in VisualSFM. What a desperate wait.

I wonder whether there are other approaches to speed up the matching process in the VSFM settings,
or did you run the VSFM pipeline on a GPU with enough compute power that you did not have to wait so long?

@4xle
Author

4xle commented Apr 21, 2020

@punisher220 You can tweak the parameters VSFM uses for matching to reduce the GPU computation load, though note that this may affect the quality of the final results. If your input is video, you can use the sequence match option to greatly reduce the number of match operations.

VSFM is also more affected by available RAM than VRAM: if I recall correctly, it does the keypoint identification on the GPU but the matching on the CPU. A bigger GPU will finish the identification faster and keep more images in VRAM to identify against, but then the CPU is slammed with a matching backlog by comparison, especially if the images are large and have many keypoints.

You can call VSFM on the command line without the GUI to save some of the overhead it incurs, or you can write scripts which compute the keypoints more efficiently yourself and call those directly from VSFM, or run them against your dataset and import the results with the images so VSFM only does the matching, or even do the matching yourself and save it for VSFM to load. However, VSFM is pretty fast as it is for what it does, so unless you have access to parallelized resources, YMMV.

On a separate note, this is the second time that this project has killed a hard drive of mine through excessive use, it seems. I was rebuilding the Piccadilly dataset in VSFM to test the code with; however, that will be on hold until I can figure out how to get the drive replaced, as I was using a remote machine which is not currently physically accessible due to COVID.

@kmyi
Contributor

kmyi commented Apr 21, 2020

On a separate note, this is the second time that this project has killed a hard drive of mine through excessive use […]

I am very sorry to hear that. Yes, the code is nowhere near perfect :( Could be much better...

@4xle
Author

4xle commented Apr 21, 2020 via email

@punisher220

Hello, I have a small question about the model information VisualSFM writes to the nvm file.

I tried VisualSFM with several datasets (just hundreds of images),
and the 3D reconstruction always generated an nvm file with multiple models.
I tried to merge those models but could not combine them into a single model.
I did the matching with the VSFM default settings and put the whole dataset into VSFM,
but I found it hard to generate a single-model nvm file from a travel-image dataset in VSFM.

I want to confirm:
when you generated the LIFT training data from Piccadilly,
did you choose the biggest (or most suitable) model to generate the training h5 files,
or did you just use the whole multi-model nvm file as VisualSFM generated it?

Thank you.

And @4xle, thanks for your reply. I am sorry to hear that you have hit a hurdle due to COVID.

@etrulls
Member

etrulls commented Apr 29, 2020

We probably just used the largest. As long as you have enough data you'll be fine. You probably want more variation in terms of different scenes anyway.

@4xle
Author

4xle commented Apr 29, 2020 via email

@4xle
Author

4xle commented Apr 29, 2020 via email

@punisher220

One thing I am sure about: when you finish the 3D reconstruction in VSFM, the nvm file contains the information for all models. No matter which .ply model files you choose to generate from VSFM, the nvm file does not change.

With a small dataset, VSFM's reconstruction gave me a single-model nvm file,
but with travel-image datasets the nvm file contains many models.

Based on that, I want to confirm whether it is proper to use the multi-model nvm file directly, or to pick out the data of the largest model from the nvm file and continue with that.

However, either way, I still cannot work out how to generate the h5 files required for training.
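
In case it helps anyone at the model-picking step, here is a rough token-based sketch for finding the largest model (untested; it assumes the documented NVM_V3 layout, where each model is a camera count, the camera lines, a point count, and the point lines, a count of 0 ends the list, and image file names contain no spaces):

def nvm_model_camera_counts(path):
    with open(path, "r", errors="ignore") as f:
        assert f.readline().startswith("NVM_V3")
        tokens = f.read().split()
    counts, i = [], 0
    while i < len(tokens):
        ncam = int(tokens[i]); i += 1
        if ncam == 0:                   # an empty model ends the list
            break
        i += ncam * 11                  # name, focal, quat(4), center(3), dist, 0
        npts = int(tokens[i]); i += 1
        for _ in range(npts):
            nmeas = int(tokens[i + 6])  # xyz(3) + rgb(3) come before the count
            i += 7 + 4 * nmeas          # each measurement: img idx, feat idx, x, y
        counts.append(ncam)
    return counts                       # largest model: counts.index(max(counts))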

@ZeroSteven618

Hello, I tested part of the tf-lift repo with the model provided, and the result is impressive.
But I cannot get the training pipeline done, even with the help of the PR from 4xle (thank you for your work).

I am trying to complete training with a small dataset to pin down the steps necessary for LIFT training,
but I failed. I can only get the sift and mat files from VSFM, and then I have no idea what comes next.
I followed the PR to get all the required h5 files, but convert_vsfm_to_h5.py needs an h5 file ending with "-kps.h5" that I do not know how to generate, so I could not go further.

How can this complex process be completed? I admit the question may be hard to solve, but I would sincerely appreciate any further help.
Thank you.

@ZeroSteven618

Hi, @4xle
Sincere thanks for your PR, but I cannot get LIFT training working with the current PR due to my poor programming skills.
I did not try training on the Piccadilly dataset because it is too huge; I used a smaller dataset, also of tourism images.
But I ran into the same trouble as @punisher220 in convert_vsfm_to_h5.py in the pre-processing part, and I could not work it out.
Could you please provide more help in the PR?
Thank you in advance.

@ZeroSteven618

Hi @etrulls @kmyid @4xle, I am sorry to bother you.
I am wondering how the pre-processing for LIFT network training is implemented in code.
I understand the huge amount of work the pre-processing for the Piccadilly experiment involved, including the hard drive space and time it consumed, so it was understandably inconvenient for you to upload and host the huge data.
But I still cannot finish the training part, and I have several questions.
The LIFT paper describes creating the training dataset as:

We split the data into training and validation sets, discarding views of training points on the validation set and vice-versa. To build the positive training samples we consider only the feature points that survive the SfM reconstruction process. To extract patches that do not contain any distinctive feature point, as required by our training method, we randomly sample image regions that contain no SIFT features, including those that were not used by SfM

My reading of this part is that you labelled feature points using VSFM as follows:
the feature points that survive SfM into the model are positive samples, and the others are negative samples.
Then you used the information from the VSFM files to generate the h5 files for LIFT training.
Unfortunately, I have no clear idea how to implement this in code; my rough guess at the negative sampling step is sketched below.
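
This is only my guess at the idea, not the authors' code (the minimum-distance threshold is arbitrary, and kp_xy stands for all SIFT keypoint locations in the image):

import numpy as np

def sample_non_feature_points(kp_xy, img_w, img_h, n, min_dist=10.0, seed=0):
    # Rejection-sample random locations far from every SIFT keypoint.
    rng = np.random.RandomState(seed)
    out = []
    while len(out) < n:
        p = rng.rand(2) * [img_w, img_h]
        if np.min(np.linalg.norm(kp_xy - p, axis=1)) > min_dist:
            out.append(p)
    return np.array(out)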

Question 1:
The nvm file can easily be read as text, but the mat and sift files cannot be read with scipy.io.loadmat().
In your original helper.py I found that you used scipy.io.loadmat()["TFeatures"] to load the mat files, but I tried that and failed.
I wonder whether something is wrong with the scipy version in my pip environment; it is 1.2.0.
Is that right, or the same as yours?

Question 2:
I wonder whether it is necessary to read and use the mat and sift files to generate all of the pre-processing h5 files for training
(I guess they are necessary, especially for the "-kp-minsc-2.0.h5" file in your example folder, but I am not sure),
or whether the nvm file alone is enough.

Question 3:
The pre-processing should finish by generating these h5 files:

- the list of point indices for each split (the split-60-20-20-minsc-2.0.h5 file)
- a keypoint file for each jpg (the -kp-minsc-2.0.h5 files)
- a histogram of all keypoint scales across all images (the scales-histogram-minsc-2.0.h5 file)

Did I miss something in the pre-processing part?
After the pre-processing, the score map is computed from the image patches in the detector; that step relates to the other files in the example folder, and I think it is already done in the current code you provided.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants