This is an implementation of the keypoint network proposed in "Discovery of Latent 3D Keypoints via End-to-end Geometric Reasoning". Given a single 2D image of a known class, this network can predict a set of 3D keypoints that are consistent across viewing angles of the same object and across object instances. These keypoints and their detectors are discovered and learned automatically, without keypoint location supervision.
Each set contains:
- train.txt, a list of tfrecords used for training.
- dev.txt, a list of tfrecords used for validation.
- test.txt, a list of tfrecords used for testing.
- projection.txt, storing the global 4x4 camera projection matrix.
- job.txt, a list of the ShapeNet object IDs contained in each tfrecord.
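To illustrate what the global 4x4 camera projection matrix in projection.txt does, here is a minimal sketch of projecting a 3D keypoint to 2D image coordinates. The matrix values and the `project` helper below are illustrative assumptions, not the actual contents or API of this repository:

```python
# Minimal sketch: apply a 4x4 camera projection matrix to a 3D point.
# The matrix below is a toy example, not the one stored in projection.txt.

def project(P, xyz):
    """Project a 3D point through a 4x4 matrix and dehomogenize."""
    x, y, z = xyz
    p = [x, y, z, 1.0]  # homogeneous coordinates
    out = [sum(P[r][c] * p[c] for c in range(4)) for r in range(4)]
    w = out[3]
    return (out[0] / w, out[1] / w)  # 2D image-plane coordinates

# Toy perspective matrix: the last row makes w = -z, so points in front
# of the camera (negative z, OpenGL-style convention) project normally.
P = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, -1.0, 0.0]]

u, v = project(P, (2.0, 4.0, -2.0))
print(u, v)  # -> 1.0 2.0
```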
```
main.py --model_dir=MODEL_DIR --dset=DSET
```
where MODEL_DIR is a folder for storing model checkpoints (see tf.estimator), and DSET should point to the folder containing the tfrecords (download above).
```
main.py --model_dir=MODEL_DIR --input=INPUT --predict
```
where MODEL_DIR is the model checkpoint folder, and INPUT is a folder containing PNG or JPEG test images. We trained the network with a total batch size of 256 (8 x 32 replicas). You may need to tune the learning rate if your batch size is different.
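One common heuristic for adjusting the learning rate to a different batch size is the linear scaling rule; this is a general assumption on our part, not something prescribed by this repository. A minimal sketch:

```python
# Hypothetical helper: scale a base learning rate tuned for batch size 256
# linearly with the actual batch size (linear scaling rule heuristic).

REFERENCE_BATCH_SIZE = 256  # total batch size used to train the network

def scaled_learning_rate(base_lr, batch_size):
    """Return base_lr scaled linearly by batch_size / 256."""
    return base_lr * batch_size / REFERENCE_BATCH_SIZE

print(scaled_learning_rate(1e-3, 256))  # -> 0.001
print(scaled_learning_rate(1e-3, 64))   # -> 0.00025
```

Whether linear scaling is appropriate depends on the optimizer and schedule; treat it as a starting point for tuning, not a guarantee.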
(This is not an officially supported Google product)