FlowNet Project Documentation
Follow the documentation at https://github.com/lmb-freiburg/flownet2/
Run the network
Everything is done through the main.py script. Run
python main.py -h for general help, or
python main.py <command> -h for help on a specific command.
- To run the network on an image pair and get the flow:
python main.py run <prototxt> <input_weights> <image0> <image1> <output_path_flow>
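The flownet2 tools save flow in the Middlebury .flo format (that is what the flownet2 repo uses; check your output files if in doubt). A minimal reader sketch:

```python
import struct

import numpy as np


def read_flo(path):
    """Read a Middlebury .flo file.

    Layout: float32 magic 202021.25, int32 width, int32 height,
    then height * width * 2 float32 values, interleaved (u, v).
    """
    with open(path, "rb") as f:
        magic = struct.unpack("<f", f.read(4))[0]
        if abs(magic - 202021.25) > 1e-3:
            raise ValueError("not a .flo file: bad magic %r" % magic)
        w, h = struct.unpack("<ii", f.read(8))
        data = np.frombuffer(f.read(w * h * 2 * 4), dtype="<f4")
    return data.reshape(h, w, 2)
```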
- To run the network on multiple pairs:
Create a text file containing a list of pairs (path_image1, path_image2), then run
main.py run_multiple <prototxt> <weights> <listfile> <output_dir>
Add the --save-images flag to save the flow, the two input images and the warped image1, all as image files.
This will also save a text file in the output directory, containing several metrics (flow median, flow mean, L2 distance between the warped image and the original), as well as the paths of the output images and flows.
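The list file for run_multiple can be generated from a directory of frames; a sketch (the exact separator main.py expects is an assumption here — check how your copy parses the file):

```python
from pathlib import Path


def write_pair_list(image_dir, list_path, pattern="*.png"):
    """Write one line per consecutive frame pair: 'path0 path1'.

    Assumption: main.py run_multiple expects two space-separated
    paths per line; adjust the separator if it parses differently.
    """
    frames = sorted(Path(image_dir).glob(pattern))
    pairs = list(zip(frames, frames[1:]))
    with open(list_path, "w") as f:
        for a, b in pairs:
            f.write(f"{a} {b}\n")
    return len(pairs)
```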
- If you have consecutive images, you can provide a list file containing only the single images, one per line, and run
main.py consecutive <prototxt> <weights> <listfile> <output_dir>
Make a video
To make a video from a list of images or flow images, you can use ffmpeg; see my make_video.sh script.
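make_video.sh is essentially an ffmpeg one-liner; a sketch of the command it might wrap (the frame-name pattern, frame rate and output name are assumptions, adapt them to your files):

```python
def ffmpeg_command(pattern="flow_%04d.png", fps=10, output="flow.mp4"):
    """Build an ffmpeg command turning a numbered image sequence
    into an H.264 video (yuv420p keeps most players happy)."""
    return ["ffmpeg", "-framerate", str(fps), "-i", pattern,
            "-pix_fmt", "yuv420p", output]

# run it with e.g.: subprocess.run(ffmpeg_command(), check=True)
```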
Train the network
- To train the network, basically follow the flownet2 documentation. You have to:
- create the dataset (in the lmdb format), using the command
/build/tools/convert_imageset_and_flow.bin (see the make_lmdbs.sh script in the flownet2/data directory)
- create your train.prototxt file from one of the flownet2 models (you have to change the training / testing lmdb paths)
- create your solver.prototxt (one is provided in flownet2/models)
- train with caffe:
caffe train --solver solver.prototxt
You can also train using my function
main.py train [-h] prototxt input_weights output_weights iterations log_file
Tools
I have a few python scripts to run computations on optical flow; check tools.py. The most interesting functions are:
- apply_flow_reverse: warp an image, given a flow.
- tools to build a dataset:
- generate_two_images_and_flow, which, from one image, generates two translated sub-images and the corresponding flow.
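A rough idea of what apply_flow_reverse does — this is a nearest-neighbour sketch, not the actual tools.py implementation (which may interpolate or handle borders differently):

```python
import numpy as np


def warp_backward(image, flow):
    """Backward-warp `image` with `flow` (H x W x 2, (dx, dy) in pixels):
    each output pixel (x, y) samples the input at (x + dx, y + dy).
    Nearest-neighbour sampling; out-of-frame samples are clamped."""
    h, w = flow.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    sx = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    sy = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return image[sy, sx]
```

Warping image1 with the flow from image0 to image1 should approximately reconstruct image0, which is what the warp-based metrics rely on.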
TODO: on each dataset, compute for every image the flow relative to the first image, and the warp. Then compute the L2 distance between the warp and the original:
- with the black bands
- without the black bands (normalised by the number of pixels)
- with the black bands replaced by the mean of the image
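The three variants in the TODO could be computed along these lines (treating exactly-zero pixels in the warped image as the black bands is an assumption; the real criterion may differ):

```python
import numpy as np


def warp_errors(original, warped):
    """RMS error between the warped image and the original, three ways:
    with the black bands, without them (normalised by the number of
    valid pixels), and with the bands replaced by the image mean.
    Assumption: exactly-zero warped pixels are the black bands."""
    original = original.astype(np.float64)
    warped = warped.astype(np.float64)
    diff2 = (original - warped) ** 2
    valid = warped != 0  # assumption: zeros mark out-of-frame pixels
    with_bands = np.sqrt(diff2.mean())
    without_bands = np.sqrt(diff2[valid].mean()) if valid.any() else 0.0
    filled = np.where(valid, warped, original.mean())
    mean_filled = np.sqrt(((original - filled) ** 2).mean())
    return with_bands, without_bands, mean_filled
```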
Models
The models are listed in the flownet2 paper: https://arxiv.org/pdf/1612.01925.pdf
- FlowNet2: the biggest model, best results but the slowest (~5 s per flow here, though it should be much faster according to the paper).
- FlowNet2-s: the fastest model, but bad results.
- In between those two: FlowNet2-CSS, FlowNet2-SS.
3D reconstruction
You can use the optical flow to perform 3D reconstruction. You need the Point Cloud Library for this.
I copied three files from the RECONSTRUCT library and adapted them to work with optical flows obtained with flownet.
To use it: modify the paths at the end of the reconstruct.py file.
You have to give the two images, the flow, and the output paths.
Then run python reconstruct.py. It will save a 3D point cloud in pcd format.
You can then visualize it with pcl_viewer from the Point Cloud Library.
You can also change the sensor parameters (sensor size and focal length) in the processing.py file, line 49.
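For reference, the pcd files that pcl_viewer reads follow PCL's ASCII format; a minimal writer sketch (field layout per the PCD v0.7 file-format documentation — reconstruct.py's own output code may differ):

```python
def write_pcd(points, path):
    """Write an iterable of (x, y, z) points as an ASCII .pcd file
    (PCD v0.7 header), viewable with pcl_viewer."""
    points = list(points)
    n = len(points)
    header = "\n".join([
        "# .PCD v0.7 - Point Cloud Data file format",
        "VERSION 0.7",
        "FIELDS x y z",
        "SIZE 4 4 4",
        "TYPE F F F",
        "COUNT 1 1 1",
        f"WIDTH {n}",
        "HEIGHT 1",
        "VIEWPOINT 0 0 0 1 0 0 0",
        f"POINTS {n}",
        "DATA ascii",
    ])
    with open(path, "w") as f:
        f.write(header + "\n")
        for x, y, z in points:
            f.write(f"{x} {y} {z}\n")
```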
Summary of results
I tried to fine-tune FlowNet2-S and FlowNet2-SS on the translation dataset, but I never got better results than with the default weights (except on translated images); visually the results were very bad.
I didn't manage to fine-tune the big FlowNet2 network.
Other approaches we could use to fine-tune on our dataset:
- Use the same method they used, but with the lake images as a background, and copy paste natural objects on the foreground (trees, houses).
- We could even use two or three layers of depth that move relative to one another, to simulate a translation of the boat.
- Try to train using the warp loss. I almost did this but had no time to finish.