CMSC848F Assignment 2: Single View to 3D

Goals: In this assignment, you will explore loss and decoder functions for regressing to voxel, point cloud, and mesh representations from single-view RGB input.

0. Setup

Please download and extract the dataset from here. After unzipping, set the appropriate path references in the dataset_location.py file here

# Better do this after you've secured a GPU.
conda create -n pytorch3d-env python=3.9
conda activate pytorch3d-env
conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 cudatoolkit=11.3 -c pytorch
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
conda install pytorch3d -c pytorch3d
pip install numpy PyMCubes matplotlib

Make sure you have installed the packages listed in requirements.txt. This assignment requires the GPU build of PyTorch.
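
Before training anything, it is worth confirming that the CUDA build is actually active; a minimal sanity check:

import torch
print(torch.__version__)          # should report the version installed above
print(torch.version.cuda)         # expected: 11.3 with the install commands above
print(torch.cuda.is_available())  # should print True on a GPU node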

How to use GPUs on UMIACS cluster?

1. Exploring loss functions (15 points)

This section involves defining loss functions for fitting voxels, point clouds, and meshes.

1.1. Fitting a voxel grid (5 points)

In this subsection, we will define a binary cross-entropy loss that can help us fit a 3D binary voxel grid. Define the loss function here in the losses.py file. For this you can use the pre-defined losses in the PyTorch library.
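
A minimal sketch of such a loss, assuming a signature like voxel_loss(voxel_src, voxel_tgt) where voxel_src holds raw (pre-sigmoid) predictions and voxel_tgt is the binary target grid (both names are assumptions):

import torch

def voxel_loss(voxel_src, voxel_tgt):
    # binary_cross_entropy_with_logits applies the sigmoid internally,
    # which is numerically more stable than sigmoid + binary_cross_entropy
    return torch.nn.functional.binary_cross_entropy_with_logits(
        voxel_src, voxel_tgt.float()
    )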

Run python fit_data.py --type 'vox' to fit the source voxel grid to the target voxel grid.

Visualize the optimized voxel grid alongside the ground-truth voxel grid using the tools learned in the previous section.
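
One convenient way to visualize a voxel grid is to convert it to a mesh with PyMCubes (installed in step 0) and render or export that mesh; a sketch, assuming a (D, H, W) occupancy tensor with values in [0, 1]:

import mcubes

def voxels_to_obj(voxels, path, isovalue=0.5):
    # marching cubes extracts the isosurface as a triangle mesh
    verts, faces = mcubes.marching_cubes(voxels.detach().cpu().numpy(), isovalue)
    mcubes.export_obj(verts, faces, path)  # writes a standard .obj file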

1.2. Fitting a point cloud (5 points)

In this subsection, we will define a chamfer loss that can help us fit a 3D point cloud. Define the loss function here in the losses.py file. We expect you to write your own code for this rather than use PyTorch3D's built-in chamfer utilities; however, you are allowed to use functions inside pytorch3d.ops.knn, such as knn_gather or knn_points.
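
A sketch of a symmetric chamfer loss built on knn_points, assuming (B, N, 3) source and target tensors:

from pytorch3d.ops.knn import knn_points

def chamfer_loss(point_cloud_src, point_cloud_tgt):
    # squared distance from each point to its nearest neighbor in the other cloud
    dists_src = knn_points(point_cloud_src, point_cloud_tgt, K=1).dists  # (B, N, 1)
    dists_tgt = knn_points(point_cloud_tgt, point_cloud_src, K=1).dists  # (B, M, 1)
    return dists_src.mean() + dists_tgt.mean()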

Run python fit_data.py --type 'point' to fit the source point cloud to the target point cloud.

Visualize the optimized point cloud alongside the ground-truth point cloud using the tools learned in the previous section.

1.3. Fitting a mesh (5 points)

In this subsection, we will define an additional smoothness loss that can help us fit a mesh. Define the loss function here in the losses.py file.

For this you can use the pre-defined losses in the PyTorch3D library.
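
PyTorch3D ships a ready-made Laplacian smoothing term, which is one reasonable choice here; a sketch:

from pytorch3d.loss import mesh_laplacian_smoothing

def smoothness_loss(mesh_src):
    # penalizes each vertex's offset from the centroid of its neighbors,
    # discouraging spiky, irregular geometry
    return mesh_laplacian_smoothing(mesh_src, method="uniform")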

Run python fit_data.py --type 'mesh' to fit the source mesh to the target mesh.

Visualize the optimized mesh alongside the ground-truth mesh using the tools learned in the previous section.

2. Reconstructing 3D from single view (85 points)

This section involves training a single-view-to-3D pipeline for voxels, point clouds, and meshes. Refer to the save_freq argument in train_model.py to control how often model checkpoints are saved.

We also provide pretrained ResNet18 image features to reduce the computation and GPU resources required. Use the --load_feat argument to use these features during training and evaluation. This is False by default; only enable it if you are having trouble getting GPU resources. You can also enable training on a CPU via the device argument. Indicate in your submission if you had to use either of these options.
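
For example (illustrative only; check train_model.py for the exact flag names and defaults):

python train_model.py --type 'vox' --save_freq 500
python train_model.py --type 'vox' --load_feat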

2.1. Image to voxel grid (20 points)

In this subsection, we will define a neural network to decode binary voxel grids. Define the decoder network here in the model.py file, then reference your decoder here in the same file.

We have provided a decoder network in model.py, but you can also modify it as you wish.
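
If you do modify it, one common pattern maps the 512-d ResNet18 feature to occupancy logits through a fully-connected stack; a hypothetical sketch (the 32^3 grid size is an assumption):

import torch.nn as nn

class VoxelDecoder(nn.Module):
    def __init__(self, feat_dim=512, grid_size=32):
        super().__init__()
        self.grid_size = grid_size
        self.fc = nn.Sequential(
            nn.Linear(feat_dim, 1024), nn.ReLU(),
            nn.Linear(1024, 2048), nn.ReLU(),
            nn.Linear(2048, grid_size ** 3),  # raw logits; pair with BCE-with-logits
        )

    def forward(self, feat):
        g = self.grid_size
        return self.fc(feat).view(-1, g, g, g)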

Run python train_model.py --type 'vox' to train the single-view-to-voxel-grid pipeline; feel free to tune the hyperparameters as needed.

After training, visualize the input RGB image, the ground-truth voxel grid, and the predicted voxel grid in eval_model.py using: python eval_model.py --type 'vox' --load_checkpoint

You need to add the respective visualization code in eval_model.py.

On your webpage, you should include visuals of any three examples in the test set. For each example, show the input RGB image, a render of the predicted 3D voxel grid, and a render of the ground-truth mesh.

2.2. Image to point cloud (20 points)

In this subsection, we will define a neural network to decode point clouds. Similar to the above, define the decoder network here in the model.py file, then reference your decoder here in the same file.
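
A hypothetical decoder sketch, assuming a 512-d image feature and an n_points-sized output (both sizes are assumptions):

import torch.nn as nn

class PointDecoder(nn.Module):
    def __init__(self, feat_dim=512, n_points=5000):
        super().__init__()
        self.n_points = n_points
        self.fc = nn.Sequential(
            nn.Linear(feat_dim, 1024), nn.ReLU(),
            nn.Linear(1024, 4096), nn.ReLU(),
            nn.Linear(4096, n_points * 3),
            nn.Tanh(),  # keeps predictions inside a normalized [-1, 1] cube
        )

    def forward(self, feat):
        return self.fc(feat).view(-1, self.n_points, 3)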

Run python train_model.py --type 'point' to train the single-view-to-point-cloud pipeline; feel free to tune the hyperparameters as needed.

After training, visualize the input RGB image, the ground-truth point cloud, and the predicted point cloud in eval_model.py using: python eval_model.py --type 'point' --load_checkpoint

You need to add the respective visualization code in eval_model.py.

On your webpage, you should include visuals of any three examples in the test set. For each example, show the input RGB image, a render of the predicted 3D point cloud, and a render of the ground-truth mesh.

2.3. Image to mesh (20 points)

In this subsection, we will define a neural network to decode meshes. Similar to the above, define the decoder network here in the model.py file, then reference your decoder here in the same file.
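
A common design predicts per-vertex offsets that deform a template such as pytorch3d.utils.ico_sphere; a hypothetical sketch (2562 vertices corresponds to subdivision level 4):

import torch.nn as nn

class MeshDecoder(nn.Module):
    def __init__(self, feat_dim=512, n_verts=2562):
        super().__init__()
        self.n_verts = n_verts
        self.fc = nn.Sequential(
            nn.Linear(feat_dim, 1024), nn.ReLU(),
            nn.Linear(1024, 2048), nn.ReLU(),
            nn.Linear(2048, n_verts * 3),
            nn.Tanh(),
        )

    def forward(self, feat):
        # apply the result to the template mesh with Meshes.offset_verts
        return self.fc(feat).view(-1, self.n_verts, 3)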

Run python train_model.py --type 'mesh' to train the single-view-to-mesh pipeline; feel free to tune the hyperparameters as needed. We also encourage you to try different mesh initializations here.

After training, visualize the input RGB image, the ground-truth mesh, and the predicted mesh in eval_model.py using: python eval_model.py --type 'mesh' --load_checkpoint

You need to add the respective visualization code in eval_model.py.

On your webpage, you should include visuals of any three examples in the test set. For each example, show the input RGB image, a render of the predicted mesh, and a render of the ground-truth mesh.

2.4. Quantitative comparisons (10 points)

Quantitatively compare the F1 scores of 3D reconstruction for meshes vs. point clouds vs. voxel grids. Provide an intuitive explanation justifying the comparison.

For evaluation, you can run: python eval_model.py --type voxel|mesh|point --load_checkpoint

On your webpage, you should include the F1-score curve at different thresholds for the voxel grid, point cloud, and mesh networks. The plot is saved as eval_{type}.png.
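
For reference, the F1 score at a threshold t treats a predicted point as correct if it lies within t of some ground-truth point (precision), and a ground-truth point as recovered if it lies within t of some prediction (recall); a sketch of the computation on sampled points:

from pytorch3d.ops.knn import knn_points

def f1_at_threshold(pred_points, gt_points, t):
    # knn_points returns squared distances, hence the sqrt
    d_pred = knn_points(pred_points, gt_points, K=1).dists.sqrt()
    d_gt = knn_points(gt_points, pred_points, K=1).dists.sqrt()
    precision = (d_pred < t).float().mean()
    recall = (d_gt < t).float().mean()
    return 2 * precision * recall / (precision + recall + 1e-8)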

2.5. Analyse the effects of hyperparameter variations (5 points)

Analyse the results by varying a hyperparameter of your choice, for example n_points, vox_size, w_chamfer, or the initial mesh (ico_sphere). Try to be unique and conclusive in your analysis.

2.6. Interpret your model (10 points)

Simply seeing final predictions and numerical evaluations is not always insightful. Can you create some visualizations that help highlight what your learned model does? Be creative and think about what visualizations would help you gain insight. There is no 'right' answer, although reading some papers for inspiration might give you ideas.
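
One idea among many: color each predicted point by its distance to the nearest ground-truth point, which highlights exactly where the model fails; a sketch of the per-point error computation:

from pytorch3d.ops.knn import knn_points

def per_point_error(pred_points, gt_points):
    # (B, N) distances; map them through a colormap and pass them to your
    # point cloud renderer as per-point colors
    return knn_points(pred_points, gt_points, K=1).dists.sqrt().squeeze(-1)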
