CMSC848F Assignment 2: Single View to 3D

Goals: In this assignment, you will explore loss and decoder functions for regressing to voxel, point cloud, and mesh representations from single-view RGB input.

0. Setup

Please download and extract the dataset from here. After unzipping, set the appropriate path references in the dataset_location.py file here

# Better do this after you've secured a GPU.
conda create -n pytorch3d-env python=3.9
conda activate pytorch3d-env
conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 cudatoolkit=11.3 -c pytorch
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
conda install pytorch3d -c pytorch3d
pip install numpy PyMCubes matplotlib

Make sure you have installed the packages listed in requirements.txt. This assignment requires the GPU build of PyTorch.
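
Before training anything, it is worth confirming that the CUDA build is actually active; a minimal sanity check:

import torch
print(torch.__version__)          # should report the version installed above
print(torch.version.cuda)         # expected: 11.3 with the install commands above
print(torch.cuda.is_available())  # should print True on a GPU node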

How to use GPUs on UMIACS cluster?

1. Exploring loss functions (15 points)

This section involves defining loss functions for fitting voxels, point clouds, and meshes.

1.1. Fitting a voxel grid (5 points)

In this subsection, we will define a binary cross-entropy loss that can help us fit a 3D binary voxel grid. Define the loss function here in the losses.py file. For this you can use the pre-defined losses in the PyTorch library.
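
A minimal sketch of such a loss, assuming a signature like voxel_loss(voxel_src, voxel_tgt) where voxel_src holds raw (pre-sigmoid) predictions and voxel_tgt is the binary target grid (both names are assumptions):

import torch

def voxel_loss(voxel_src, voxel_tgt):
    # binary_cross_entropy_with_logits applies the sigmoid internally,
    # which is numerically more stable than sigmoid + binary_cross_entropy
    return torch.nn.functional.binary_cross_entropy_with_logits(
        voxel_src, voxel_tgt.float()
    )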

Run python fit_data.py --type 'vox' to fit the source voxel grid to the target voxel grid.

Visualize the optimized voxel grid alongside the ground-truth voxel grid using the tools learned in the previous section.
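
One convenient way to visualize a voxel grid is to convert it to a mesh with PyMCubes (installed in step 0) and render or export that mesh; a sketch, assuming a (D, H, W) occupancy tensor with values in [0, 1]:

import mcubes

def voxels_to_obj(voxels, path, isovalue=0.5):
    # marching cubes extracts the isosurface as a triangle mesh
    verts, faces = mcubes.marching_cubes(voxels.detach().cpu().numpy(), isovalue)
    mcubes.export_obj(verts, faces, path)  # writes a standard .obj file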

1.2. Fitting a point cloud (5 points)

In this subsection, we will define a chamfer loss that can help us fit a 3D point cloud. Define the loss function here in the losses.py file. We expect you to write your own code for this rather than use PyTorch3D's built-in chamfer utilities; however, you are allowed to use functions inside pytorch3d.ops.knn, such as knn_gather or knn_points.
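
A sketch of a symmetric chamfer loss built on knn_points, assuming (B, N, 3) source and target tensors:

from pytorch3d.ops.knn import knn_points

def chamfer_loss(point_cloud_src, point_cloud_tgt):
    # squared distance from each point to its nearest neighbor in the other cloud
    dists_src = knn_points(point_cloud_src, point_cloud_tgt, K=1).dists  # (B, N, 1)
    dists_tgt = knn_points(point_cloud_tgt, point_cloud_src, K=1).dists  # (B, M, 1)
    return dists_src.mean() + dists_tgt.mean()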

Run python fit_data.py --type 'point' to fit the source point cloud to the target point cloud.

Visualize the optimized point cloud alongside the ground-truth point cloud using the tools learned in the previous section.

1.3. Fitting a mesh (5 points)

In this subsection, we will define an additional smoothness loss that can help us fit a mesh. Define the loss function here in the losses.py file.

For this you can use the pre-defined losses in the PyTorch3D library.
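
PyTorch3D ships a ready-made Laplacian smoothing term, which is one reasonable choice here; a sketch:

from pytorch3d.loss import mesh_laplacian_smoothing

def smoothness_loss(mesh_src):
    # penalizes each vertex's offset from the centroid of its neighbors,
    # discouraging spiky, irregular geometry
    return mesh_laplacian_smoothing(mesh_src, method="uniform")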

Run python fit_data.py --type 'mesh' to fit the source mesh to the target mesh.

Visualize the optimized mesh alongside the ground-truth mesh using the tools learned in the previous section.

2. Reconstructing 3D from single view (85 points)

This section involves training a single-view-to-3D pipeline for voxels, point clouds, and meshes. Refer to the save_freq argument in train_model.py to control how often model checkpoints are saved.

We also provide pretrained ResNet18 image features to reduce the computation and GPU resources required. Use the --load_feat argument to use these features during training and evaluation. This is False by default; only enable it if you are having trouble getting GPU resources. You can also enable training on a CPU via the device argument. Indicate in your submission if you had to use either of these options.
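
For example (illustrative only; check train_model.py for the exact flag names and defaults):

python train_model.py --type 'vox' --save_freq 500
python train_model.py --type 'vox' --load_feat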

2.1. Image to voxel grid (20 points)

In this subsection, we will define a neural network to decode binary voxel grids. Define the decoder network here in the model.py file, then reference your decoder here in the same file.

We have provided a decoder network in model.py, but you can also modify it as you wish.
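
If you do modify it, one common pattern maps the 512-d ResNet18 feature to occupancy logits through a fully-connected stack; a hypothetical sketch (the 32^3 grid size is an assumption):

import torch.nn as nn

class VoxelDecoder(nn.Module):
    def __init__(self, feat_dim=512, grid_size=32):
        super().__init__()
        self.grid_size = grid_size
        self.fc = nn.Sequential(
            nn.Linear(feat_dim, 1024), nn.ReLU(),
            nn.Linear(1024, 2048), nn.ReLU(),
            nn.Linear(2048, grid_size ** 3),  # raw logits; pair with BCE-with-logits
        )

    def forward(self, feat):
        g = self.grid_size
        return self.fc(feat).view(-1, g, g, g)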

Run python train_model.py --type 'vox' to train the single-view-to-voxel-grid pipeline; feel free to tune the hyperparameters as needed.

After training, visualize the input RGB image, the ground-truth voxel grid, and the predicted voxel grid in eval_model.py using: python eval_model.py --type 'vox' --load_checkpoint

You need to add the respective visualization code in eval_model.py.

On your webpage, you should include visuals of any three examples in the test set. For each example, show the input RGB image, a render of the predicted 3D voxel grid, and a render of the ground-truth mesh.

2.2. Image to point cloud (20 points)

In this subsection, we will define a neural network to decode point clouds. Similar to the above, define the decoder network here in the model.py file, then reference your decoder here in the same file.
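
A hypothetical decoder sketch, assuming a 512-d image feature and an n_points-sized output (both sizes are assumptions):

import torch.nn as nn

class PointDecoder(nn.Module):
    def __init__(self, feat_dim=512, n_points=5000):
        super().__init__()
        self.n_points = n_points
        self.fc = nn.Sequential(
            nn.Linear(feat_dim, 1024), nn.ReLU(),
            nn.Linear(1024, 4096), nn.ReLU(),
            nn.Linear(4096, n_points * 3),
            nn.Tanh(),  # keeps predictions inside a normalized [-1, 1] cube
        )

    def forward(self, feat):
        return self.fc(feat).view(-1, self.n_points, 3)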

Run python train_model.py --type 'point' to train the single-view-to-point-cloud pipeline; feel free to tune the hyperparameters as needed.

After training, visualize the input RGB image, the ground-truth point cloud, and the predicted point cloud in eval_model.py using: python eval_model.py --type 'point' --load_checkpoint

You need to add the respective visualization code in eval_model.py.

On your webpage, you should include visuals of any three examples in the test set. For each example, show the input RGB image, a render of the predicted 3D point cloud, and a render of the ground-truth mesh.

2.3. Image to mesh (20 points)

In this subsection, we will define a neural network to decode meshes. Similar to the above, define the decoder network here in the model.py file, then reference your decoder here in the same file.
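
A common design predicts per-vertex offsets that deform a template such as pytorch3d.utils.ico_sphere; a hypothetical sketch (2562 vertices corresponds to subdivision level 4):

import torch.nn as nn

class MeshDecoder(nn.Module):
    def __init__(self, feat_dim=512, n_verts=2562):
        super().__init__()
        self.n_verts = n_verts
        self.fc = nn.Sequential(
            nn.Linear(feat_dim, 1024), nn.ReLU(),
            nn.Linear(1024, 2048), nn.ReLU(),
            nn.Linear(2048, n_verts * 3),
            nn.Tanh(),
        )

    def forward(self, feat):
        # apply the result to the template mesh with Meshes.offset_verts
        return self.fc(feat).view(-1, self.n_verts, 3)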

Run python train_model.py --type 'mesh' to train the single-view-to-mesh pipeline; feel free to tune the hyperparameters as needed. We also encourage you to try different mesh initializations here.

After training, visualize the input RGB image, the ground-truth mesh, and the predicted mesh in eval_model.py using: python eval_model.py --type 'mesh' --load_checkpoint

You need to add the respective visualization code in eval_model.py.

On your webpage, you should include visuals of any three examples in the test set. For each example, show the input RGB image, a render of the predicted mesh, and a render of the ground-truth mesh.

2.4. Quantitative comparisons (10 points)

Quantitatively compare the F1 scores of 3D reconstruction for meshes vs. point clouds vs. voxel grids. Provide an intuitive explanation justifying the comparison.

For evaluation, you can run: python eval_model.py --type voxel|mesh|point --load_checkpoint

On your webpage, you should include the F1-score curve at different thresholds for the voxel grid, point cloud, and mesh networks. The plot is saved as eval_{type}.png.
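
For reference, the F1 score at a threshold t treats a predicted point as correct if it lies within t of some ground-truth point (precision), and a ground-truth point as recovered if it lies within t of some prediction (recall); a sketch of the computation on sampled points:

from pytorch3d.ops.knn import knn_points

def f1_at_threshold(pred_points, gt_points, t):
    # knn_points returns squared distances, hence the sqrt
    d_pred = knn_points(pred_points, gt_points, K=1).dists.sqrt()
    d_gt = knn_points(gt_points, pred_points, K=1).dists.sqrt()
    precision = (d_pred < t).float().mean()
    recall = (d_gt < t).float().mean()
    return 2 * precision * recall / (precision + recall + 1e-8)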

2.5. Analyse the effects of hyperparameter variations (5 points)

Analyse the results by varying a hyperparameter of your choice, for example n_points, vox_size, w_chamfer, or the initial mesh (ico_sphere). Try to be unique and conclusive in your analysis.

2.6. Interpret your model (10 points)

Simply seeing final predictions and numerical evaluations is not always insightful. Can you create some visualizations that help highlight what your learned model does? Be creative and think about what visualizations would help you gain insight. There is no 'right' answer, although reading some papers for inspiration might give you ideas.
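
One idea among many: color each predicted point by its distance to the nearest ground-truth point, which highlights exactly where the model fails; a sketch of the per-point error computation:

from pytorch3d.ops.knn import knn_points

def per_point_error(pred_points, gt_points):
    # (B, N) distances; map them through a colormap and pass them to your
    # point cloud renderer as per-point colors
    return knn_points(pred_points, gt_points, K=1).dists.sqrt().squeeze(-1)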
