<a href="https://colab.research.google.com/github/kuock0129/GPU-Accelerated-3D-Machine-Learning/blob/main/HW1/3DMLGPU_HW1_Part1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 3DMLGPU - HW1
In this assignment, we will learn more about 3D ML datasets, including ShapeNet, ModelNet, S3DIS, and Objaverse (which were discussed in L2), and tools and techniques for processing 3D ML models.
We will also take a deep dive into two of the pioneering 3DML point-cloud models: PointNet and PointNet++.

The HW1 assignment consists of 2 parts, provided in two Jupyter notebooks.

*   3DMLGPU_HW1_part1.ipynb (*which you are currently reading*)
*   3DMLGPU_HW1_part2.ipynb

**Submission of Questions & Answers (QnA)**: Put your answers/results/analysis in a separate Word/PDF report and submit it along with this completed notebook. We expect brief, to-the-point answers to the questions asked in this notebook.

**Notes:** You may not be able to complete HW1 in one sitting. We recommend connecting your Google Drive so you can periodically save model weights, visualizations, results, etc. Complete one section at a time.
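One way to follow the note above is to copy an output folder to a mounted Drive after each section. The sketch below assumes Drive is mounted at `/content/drive` (in Colab this is done with `google.colab.drive.mount`). The helper `backup_outputs` and the destination folder name are hypothetical and not part of the assignment code.

```python
import shutil
from pathlib import Path


def backup_outputs(src_dir, dst_dir):
    """Copy src_dir into dst_dir (overwriting existing files) and list the files now in dst_dir."""
    src, dst = Path(src_dir), Path(dst_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"nothing to back up: {src}")
    # dirs_exist_ok=True lets the same backup be repeated after every section
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return sorted(p.relative_to(dst).as_posix() for p in dst.rglob("*") if p.is_file())


# In Colab, mount Drive first, then back up a folder (uncomment to use):
# from google.colab import drive
# drive.mount("/content/drive")
# backup_outputs("/content/pointnet.pytorch/utils/cls",
#                "/content/drive/MyDrive/3dmlgpu_hw1/cls")  # hypothetical destination
```

Running the backup at the end of each section keeps checkpoints available if the Colab runtime is recycled.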

This homework has some open-ended components, so it is OK if some things don't work out completely. Explain which sections you were not able to complete and why. Based on your attempts, we might provide extra credit. However, you are expected to complete at least the mandatory portions (as per the instructions) for full credit. The bonus portions are mentioned at the end.


## ShapeNet dataset
First, we will download a subset of the ShapeNetCore dataset, which includes only part of the part-segmentation data, and get some understanding of its structure.

Please refer to https://shapenet.org/ to read more about this dataset.

In [None]:
# we created a gdrive zip for you to download the dataset
import gdown
import zipfile

file_id = "13JRSnAkHJxABk0xs8ULk5-6zPdaLTijQ"
url = f"https://drive.google.com/uc?id={file_id}"
output_zip = "shapenet.zip"

print("Downloading dataset from Google Drive...")
gdown.download(url, output_zip, quiet=False)

# Extract the downloaded zip file into the current directory
print("Extracting the dataset...")
with zipfile.ZipFile(output_zip, 'r') as zip_ref:
    zip_ref.extractall()

print("Dataset downloaded at '/content/shapenetcore_partanno_segmentation_benchmark_v0' folder.")

### Task-1 : ShapeNet Dataset (10 points)
Now that you have the data, select any 2 categories and visualize the point cloud. Then do an analysis on the number of data points, number of classes etc.

You may want to use matplotlib, `plotly.graph_objects`, or some other standard Python library.

Answer the questions:
- Describe the file format/structure for the individual data items (objects) in the dataset.
- Is this the representation in which the dataset was originally published?
- How is this data different from a regular image?
- Which libraries did you use to visualize the point cloud?
- How many classes are there in this dataset? How many data points are there in each class?
- Any additional insights you want to add.
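As a starting point for this task, the sketch below reads one object and counts shapes per category. It assumes a layout that you should verify against the extracted folder: a `synsetoffset2category.txt` file with one `name id` pair per line, point files under `<id>/points/*.pts` with `x y z` per line, and label files under `<id>/points_label/*.seg` with one integer per line. All function names are hypothetical, and `plot_cloud` needs matplotlib installed.

```python
from collections import Counter
from pathlib import Path


def load_pts(path):
    """Read a whitespace-separated point file into a list of (x, y, z) tuples."""
    points = []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) >= 3:
                points.append(tuple(float(v) for v in parts[:3]))
    return points


def load_seg(path):
    """Read one integer part label per non-empty line."""
    with open(path) as f:
        return [int(line) for line in f if line.strip()]


def summarize(points, labels=None):
    """Point count, axis-aligned bounding box, and (optionally) points per part label."""
    xs, ys, zs = zip(*points)
    info = {
        "num_points": len(points),
        "bbox_min": (min(xs), min(ys), min(zs)),
        "bbox_max": (max(xs), max(ys), max(zs)),
    }
    if labels is not None:
        info["label_counts"] = dict(Counter(labels))
    return info


def count_shapes_per_category(root):
    """Return {category name: number of .pts files}; assumes the layout described above."""
    root = Path(root)
    counts = {}
    with open(root / "synsetoffset2category.txt") as f:
        for line in f:
            if line.strip():
                name, synset = line.split()
                counts[name] = len(list((root / synset / "points").glob("*.pts")))
    return counts


def plot_cloud(points, labels=None, title=""):
    """3D scatter plot; points are colored by part label when labels are given."""
    import matplotlib.pyplot as plt  # imported lazily so the loaders work without it

    xs, ys, zs = zip(*points)
    ax = plt.figure().add_subplot(projection="3d")
    ax.scatter(xs, ys, zs, c=labels, s=1)
    ax.set_title(title)
    plt.show()
```

For example, `summarize(load_pts(p), load_seg(s))` on a matching `.pts`/`.seg` pair gives the numbers needed for the per-object part of the analysis.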

In [None]:
# TODO: Your code here



## Task-2 : PointNet Model

### Task-2, Part 1 : Understand PointNet (12 points)
Understand the implementation of PointNet. The code we are looking at is from https://github.com/fxia22/pointnet.pytorch

Go through this repo to understand the implementation of the model. Specifically, briefly answer the following questions:

- How is the data loaded? Look at the `ShapeNetDataset` class in `pointnet/dataset.py`. If you had to train this model on another dataset, what are the key components (in terms of data loading/processing) that you would have to implement?
- Understand the PointNet architecture. What are `STN3d` and `STNkd` in `pointnet/model.py`? What is the difference between the classification and segmentation models defined in `model.py`?
- Look at `train_classification.py` and `train_segmentation.py`. What is the difference in terms of loss function, optimizer, scheduler, and training pipeline when we train for classification vs. segmentation? Is the data preprocessing the same for both?

In [None]:
# Setup and Install
!git clone https://github.com/fxia22/pointnet.pytorch
%cd /content/pointnet.pytorch
!pip install -e .

Cloning into 'pointnet.pytorch'...
remote: Enumerating objects: 213, done.
remote: Total 213 (delta 0), reused 0 (delta 0), pack-reused 213 (from 2)
Receiving objects: 100% (213/213), 229.91 KiB | 15.33 MiB/s, done.
Resolving deltas: 100% (125/125), done.
/content/pointnet.pytorch
Obtaining file:///content/pointnet.pytorch
  Preparing metadata (setup.py) ... done
Collecting plyfile (from pointnet==0.0.1)
  Downloading plyfile-1.1-py3-none-any.whl.metadata (2.1 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch->pointnet==0.0.1)
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch->pointnet==0.0.1)
  Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.4.127 (from torch->pointnet==0.0.1)
  Downloading nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1

In [None]:
%cd /content/pointnet.pytorch/scripts
!bash build.sh

/content/pointnet.pytorch/scripts
/content/pointnet.pytorch/scripts


### Task-2, Part 2 : Train PointNet on ShapeNet (15 points)
Now that you have the model and a decent understanding of the data, let's train this model on the ShapeNet dataset. This part might need a GPU to train. You can also train on a CPU by changing some settings.

Look at the parameters. We recommend training for >= 2 but <= 10 epochs.

Answer the questions:
- Print out the model architecture. How many layers does the model have, and what are their types and sizes? How many parameters does it have? If necessary, write a code snippet to calculate the number of parameters.
- How does the actual model architecture compare to the architecture described in the original paper?
- What accuracy did you obtain for classification? Is the model overfitting? How can you show whether it is overfitting or not? [Hint: modify the code to plot train/test metrics using TensorBoard]
- What accuracy did you obtain for segmentation? Is the model overfitting? How can you show whether it is overfitting or not? [Same hint]
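To sanity-check a parameter count by hand, the arithmetic for a stack of pointwise 1D convolutions with batch normalization can be sketched as follows. The layer widths `3 -> 64 -> 128 -> 1024` are an assumption based on a reading of the shared MLP in `pointnet/model.py`; check them against the architecture you print. The helper names are hypothetical.

```python
def conv1d_params(in_ch, out_ch, kernel_size=1, bias=True):
    """Learnable parameters of a 1D convolution: weights plus optional biases."""
    return in_ch * out_ch * kernel_size + (out_ch if bias else 0)


def batchnorm_params(channels):
    """Learnable scale and shift per channel (running statistics are not parameters)."""
    return 2 * channels


# Assumed widths of the shared per-point MLP; replace with the printed ones.
widths = [3, 64, 128, 1024]

conv_total = sum(conv1d_params(a, b) for a, b in zip(widths, widths[1:]))
bn_total = sum(batchnorm_params(c) for c in widths[1:])

print("conv parameters:", conv_total)     # 256 + 8320 + 132096 = 140672
print("batchnorm parameters:", bn_total)  # 128 + 256 + 2048 = 2432
```

For the instantiated PyTorch model, the usual idiom `sum(p.numel() for p in model.parameters() if p.requires_grad)` gives the total directly, and the hand count above is a way to cross-check individual layers.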

In [None]:
#[TODO] your code here



Now let's get into training.

In [None]:
%cd /content/pointnet.pytorch/utils
!python train_classification.py --dataset ../../../content/shapenetcore_partanno_segmentation_benchmark_v0/ \
 --nepoch=5 --dataset_type shapenet

/content/pointnet.pytorch/utils
Namespace(batchSize=32, num_points=2500, workers=4, nepoch=5, outf='cls', model='', dataset='../../../content/shapenetcore_partanno_segmentation_benchmark_v0/', dataset_type='shapenet', feature_transform=False)
Random Seed:  1755
{'Airplane': 0, 'Bag': 1, 'Cap': 2, 'Car': 3, 'Chair': 4, 'Earphone': 5, 'Guitar': 6, 'Knife': 7, 'Lamp': 8, 'Laptop': 9, 'Motorbike': 10, 'Mug': 11, 'Pistol': 12, 'Rocket': 13, 'Skateboard': 14, 'Table': 15}
{'Airplane': 4, 'Bag': 2, 'Cap': 2, 'Car': 4, 'Chair': 4, 'Earphone': 3, 'Guitar': 3, 'Knife': 2, 'Lamp': 4, 'Laptop': 2, 'Motorbike': 6, 'Mug': 2, 'Pistol': 3, 'Rocket': 3, 'Skateboard': 3, 'Table': 3} 4
{'Airplane': 0, 'Bag': 1, 'Cap': 2, 'Car': 3, 'Chair': 4, 'Earphone': 5, 'Guitar': 6, 'Knife': 7, 'Lamp': 8, 'Laptop': 9, 'Motorbike': 10, 'Mug': 11, 'Pistol': 12, 'Rocket': 13, 'Skateboard': 14, 'Table': 15}
{'Airplane': 4, 'Bag': 2, 'Cap': 2, 'Car': 4, 'Chair': 4, 'Earphone': 3, 'Guitar': 3, 'Knife': 2, 'Lamp': 4, 'Lapto

In [None]:
!python train_segmentation.py --dataset ../../../content/shapenetcore_partanno_segmentation_benchmark_v0 --nepoch=5

Namespace(batchSize=32, workers=4, nepoch=5, outf='seg', model='', dataset='../../../content/shapenetcore_partanno_segmentation_benchmark_v0', class_choice='Chair', feature_transform=False)
Random Seed:  2606
{'Chair': 0}
{'Airplane': 4, 'Bag': 2, 'Cap': 2, 'Car': 4, 'Chair': 4, 'Earphone': 3, 'Guitar': 3, 'Knife': 2, 'Lamp': 4, 'Laptop': 2, 'Motorbike': 6, 'Mug': 2, 'Pistol': 3, 'Rocket': 3, 'Skateboard': 3, 'Table': 3} 4
{'Chair': 0}
{'Airplane': 4, 'Bag': 2, 'Cap': 2, 'Car': 4, 'Chair': 4, 'Earphone': 3, 'Guitar': 3, 'Knife': 2, 'Lamp': 4, 'Laptop': 2, 'Motorbike': 6, 'Mug': 2, 'Pistol': 3, 'Rocket': 3, 'Skateboard': 3, 'Table': 3} 4
2658 704
classes 4
[0: 0/83] train loss: 1.480904 accuracy: 0.236888
[0: 0/83] test loss: 1.403002 accuracy: 0.009575
[0: 1/83] train loss: 1.373366 accuracy: 0.374787
[0: 2/83] train loss: 1.338310 accuracy: 0.369837
[0: 3/83] train loss: 1.268816 accuracy: 0.414950
[0: 4/83] train loss: 1.204275 accuracy: 0.514125
[0: 5/83] train loss: 1.2171
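As an alternative to TensorBoard, the per-batch lines printed above can be parsed into train/test curves. The sketch below assumes the log format shown in the output (`[epoch: batch/total] train|test loss: ... accuracy: ...`) and that the output was saved to a text file, for example by piping the training command through `tee`. The function names are hypothetical.

```python
import re
from collections import defaultdict

# Color codes around "test" may appear with or without the escape character.
ANSI = re.compile(r"\x1b?\[[0-9;]*m")
LINE = re.compile(
    r"\[(\d+):\s*(\d+)/(\d+)\]\s*(train|test)\s+loss:\s*([0-9.]+)\s+accuracy:\s*([0-9.]+)"
)


def parse_log(text):
    """Turn matching log lines into records; incomplete lines are skipped."""
    records = []
    for raw in text.splitlines():
        match = LINE.search(ANSI.sub("", raw))
        if match:
            epoch, batch, _total, split, loss, acc = match.groups()
            records.append({
                "epoch": int(epoch),
                "batch": int(batch),
                "split": split,
                "loss": float(loss),
                "accuracy": float(acc),
            })
    return records


def mean_accuracy_per_epoch(records):
    """Return {(epoch, split): mean accuracy over the logged batches}."""
    sums = defaultdict(lambda: [0.0, 0])
    for r in records:
        entry = sums[(r["epoch"], r["split"])]
        entry[0] += r["accuracy"]
        entry[1] += 1
    return {key: total / n for key, (total, n) in sums.items()}
```

A train accuracy that keeps rising across epochs while the test accuracy stalls or drops is the pattern to look for when arguing that the model is overfitting.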

### Task-2, Part 3 : Visualize Part-Segmentations (10 points)
Visualize the part segmentation results. Look at `utils/show_seg.py` and write a code snippet to visualize segmented regions.

Submit several figures showing visualization of segmented object instance results.

Question: Where does the model tend to fail? Why do you think that is the case?
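To back up the failure analysis with numbers, a per-part intersection-over-union can be computed from predicted and ground-truth labels of one object. This is a plain-Python sketch with hypothetical function names. It scores a part that is absent from both prediction and ground truth as 1.0, which is one common convention and an assumption here.

```python
def per_part_iou(pred, target, num_parts):
    """Return {part id: IoU} for two equal-length sequences of per-point labels."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = {}
    for part in range(num_parts):
        inter = sum(1 for p, t in zip(pred, target) if p == part and t == part)
        union = sum(1 for p, t in zip(pred, target) if p == part or t == part)
        # Part missing from both prediction and ground truth: counted as perfect.
        ious[part] = 1.0 if union == 0 else inter / union
    return ious


def worst_parts(ious, k=1):
    """Part ids sorted from lowest to highest IoU, truncated to k entries."""
    return sorted(ious, key=ious.get)[:k]


pred = [0, 0, 1, 1, 2]
target = [0, 1, 1, 1, 2]
scores = per_part_iou(pred, target, num_parts=4)
print(scores)
print(worst_parts(scores))
```

Parts with consistently low IoU across several objects are good candidates to show in the submitted figures.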

## Task-3 : PointNet++ Model

Paper: https://proceedings.neurips.cc/paper_files/paper/2017/file/d8bf84be3800d12f74d8b05e9b89836f-Paper.pdf

Now that you are prepared to tackle the difficulties of life (and ML models), we will look at this repo for PointNet++:

https://github.com/yanx27/Pointnet_Pointnet2_pytorch


### Task 3, Part 1 (10 pt)
Answer the following questions after you understand the repo and paper:
- What is the difference between PointNet and PointNet++? Explain your answer in terms of model architecture, loss function, training pipeline, etc.
- Comparing the two strictly from the code perspective, what differences do you notice between the previous PointNet repo and this one?
- The PointNet++ model has several options: with/without normals, SSG, MSG, etc. Explain how the options differ and what specific architecture design approaches were used to support each one.
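PointNet++ builds its hierarchy by sampling centroids and grouping the points around them. The sketch below illustrates that idea with farthest point sampling and radius-based grouping, using one radius (as in single-scale grouping) or several (as in multi-scale grouping). It is a plain-Python illustration with hypothetical function names, not the repo's batched implementation: the start index is fixed instead of random, and neighbor lists are not padded to a fixed size.

```python
import math


def farthest_point_sampling(points, n_samples, start=0):
    """Greedy FPS: repeatedly pick the point farthest from all points chosen so far."""
    if n_samples <= 0 or not points:
        return []
    n_samples = min(n_samples, len(points))
    chosen = [start]
    # dist[i] = distance from point i to its nearest chosen point
    dist = [math.dist(p, points[start]) for p in points]
    while len(chosen) < n_samples:
        nxt = max(range(len(points)), key=dist.__getitem__)
        chosen.append(nxt)
        dist = [min(d, math.dist(p, points[nxt])) for d, p in zip(dist, points)]
    return chosen


def ball_query(points, center, radius, max_neighbors):
    """Indices of up to max_neighbors points within radius of center."""
    inside = [i for i, p in enumerate(points) if math.dist(p, center) <= radius]
    return inside[:max_neighbors]


def group(points, centroid_idx, radii, max_neighbors):
    """For each centroid, one neighbor list per radius (one radius ~ SSG, several ~ MSG)."""
    return [
        [ball_query(points, points[c], r, max_neighbors) for r in radii]
        for c in centroid_idx
    ]


pts = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (10, 0, 0)]
centroids = farthest_point_sampling(pts, 2)
print(centroids)
print(group(pts, centroids, radii=[1.5, 2.5], max_neighbors=8))
```

Comparing the neighbor lists at the two radii shows why multi-scale grouping costs more compute but sees both fine and coarse context around each centroid.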


In [None]:
%cd /content/
!git clone https://github.com/yanx27/Pointnet_Pointnet2_pytorch.git

/content
Cloning into 'Pointnet_Pointnet2_pytorch'...
remote: Enumerating objects: 842, done.
remote: Total 842 (delta 0), reused 0 (delta 0), pack-reused 842 (from 1)
Receiving objects: 100% (842/842), 68.77 MiB | 12.09 MiB/s, done.
Resolving deltas: 100% (485/485), done.
Updating files: 100% (59/59), done.


#### Datasets for PointNet++
This PointNet++ implementation is configured to use several datasets for the 3D tasks it performs:

*   ModelNet40 for object classification
*   ShapeNet for part segmentation
*   S3DIS for scene semantic segmentation

After downloading these datasets during the next portion of the assignment, inspect them and answer the following questions:

1.   Describe the file format/structure for the individual data items (objects) in the dataset.
2.   Is this the representation in which the dataset was originally published? If not, what is the difference?
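When inspecting text-based point files, a small helper that reports the row count, the number of columns, and the delimiter can speed things up. The sketch below makes no assumption about a particular dataset layout. It only handles plain-text files with one record per line, and `describe_point_file` is a hypothetical name.

```python
def describe_point_file(path, max_lines=None):
    """Return (rows, sorted column counts, delimiter) for a one-record-per-line text file."""
    rows = 0
    columns = set()
    delimiter = None
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            if delimiter is None:
                # Decide once, from the first non-empty line
                delimiter = "," if "," in line else "whitespace"
            fields = line.split(",") if delimiter == "," else line.split()
            columns.add(len(fields))
            rows += 1
            if max_lines is not None and rows >= max_lines:
                break
    return rows, sorted(columns), delimiter
```

A file that reports 6 columns, for instance, likely stores more than coordinates per point, which is worth confirming against the dataset documentation before describing the format in the report.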




In [None]:
# We will use Kaggle (via kagglehub) to download the ModelNet40 dataset
import kagglehub
# Download latest version
path = kagglehub.dataset_download("chenxaoyu/modelnet-normal-resampled")
print("Path to dataset files:", path)

Path to dataset files: /kaggle/input/modelnet-normal-resampled


In [None]:
# Copy the dataset into the repo's data directory so the training scripts can find it
%cd /content/Pointnet_Pointnet2_pytorch/
!mkdir -p data
# `path` is the location returned by kagglehub above; it can differ between environments
!cp -r {path}/modelnet40_normal_resampled/ data/modelnet40_normal_resampled/

/content/Pointnet_Pointnet2_pytorch


### Task 3, Part 2 : Train PointNet++ (20 pt)
*NOTE: this code repo uses the name PointNet2 = PointNet++*

Train PointNet2 on the classification task using the ModelNet40 dataset and on the part-segmentation task using the ShapeNet dataset.

In [None]:
# we will work on the pointnet2 class of models
# explore `train_classification.py` to understand more about the arguments
!python train_classification.py --model pointnet2_cls_ssg --log_dir pointnet2_cls_ssg --epoch 2

PARAMETER ...
Namespace(use_cpu=False, gpu='0', batch_size=24, model='pointnet2_cls_ssg', num_category=40, epoch=2, learning_rate=0.001, num_point=1024, optimizer='Adam', log_dir='pointnet2_cls_ssg', decay_rate=0.0001, use_normals=False, process_data=False, use_uniform_sample=False)
Load dataset ...
Traceback (most recent call last):
  File "/content/Pointnet_Pointnet2_pytorch/train_classification.py", line 232, in <module>
    main(args)
  File "/content/Pointnet_Pointnet2_pytorch/train_classification.py", line 121, in main
    train_dataset = ModelNetDataLoader(root=data_path, args=args, split='train', process_data=args.process_data)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/content/Pointnet_Pointnet2_pytorch/data_utils/ModelNetDataLoader.py", line 63, in __init__
    self.cat = [line.rstrip() for line in open(self.catfile)]
                                          ^^^^^^^^^^^^^^^^^^
FileNotFoundError: [
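The traceback above is a `FileNotFoundError` raised while opening `self.catfile`, which usually means the dataset folder is not where the loader expects it. A quick check before training can save a failed run. The file names below are assumptions based on a reading of `data_utils/ModelNetDataLoader.py`, so verify them against the repo. `missing_files` is a hypothetical helper.

```python
from pathlib import Path

# Assumed to be the index files the ModelNet40 loader opens; check the loader source.
EXPECTED = [
    "modelnet40_shape_names.txt",
    "modelnet40_train.txt",
    "modelnet40_test.txt",
]


def missing_files(root, expected=EXPECTED):
    """Return the expected file names that are not present directly under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_file()]


# Example, run from the repo root:
# print(missing_files("data/modelnet40_normal_resampled"))
```

An empty list means the index files are in place. Otherwise, re-check the source path used in the copy step.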

In [None]:
!python test_classification.py --log_dir pointnet2_cls_ssg

For segmentation, we need to download the ShapeNet dataset with surface normals, i.e. `shapenetcore_partanno_segmentation_benchmark_v0_normal`.

In [None]:
# Download the dataset from Kaggle
path = kagglehub.dataset_download("mitkir/shapenet")
print("Path to dataset files:", path)

Downloading from https://www.kaggle.com/api/v1/datasets/download/mitkir/shapenet?dataset_version_number=1...
100%|██████████| 1.36G/1.36G [00:20<00:00, 72.5MB/s]
Extracting files...
Path to dataset files: /root/.cache/kagglehub/datasets/mitkir/shapenet/versions/1


In [None]:
!cp -r /root/.cache/kagglehub/datasets/mitkir/shapenet/versions/1/shapenetcore_partanno_segmentation_benchmark_v0_normal/ data/shapenetcore_partanno_segmentation_benchmark_v0_normal/

Let's start training the part-segmentation model.

In [None]:
!python train_partseg.py --model pointnet2_part_seg_msg --normal --log_dir pointnet2_part_seg_msg --epoch 5

PARAMETER ...
Namespace(model='pointnet2_part_seg_msg', batch_size=16, epoch=5, learning_rate=0.001, gpu='0', optimizer='Adam', log_dir='pointnet2_part_seg_msg', decay_rate=0.0001, npoint=2048, normal=True, step_size=20, lr_decay=0.5)
The number of training data is: 13998
The number of test data is: 2874
No existing model, starting training from scratch...
Epoch 1 (1/5):
Learning rate:0.001000
BN momentum updated to: 0.100000
  9% 78/874 [00:57<09:44,  1.36it/s]
Traceback (most recent call last):
  File "/content/Pointnet_Pointnet2_pytorch/train_partseg.py", line 305, in <module>
    main(args)
  File "/content/Pointnet_Pointnet2_pytorch/train_partseg.py", line 193, in main
    seg_pred, trans_feat = classifier(points, to_categorical(label, num_classes))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)

In [None]:
!python test_partseg.py --normal --log_dir pointnet2_part_seg_msg

#### Train pointnet2 on S3DIS dataset for semantic segmentation task

We need to download the 3D indoor parsing dataset (S3DIS)

(s3dis.tar.gz) https://huggingface.co/datasets/Pointcept/s3dis-compressed

and save into `data/s3dis/Stanford3dDataset_v1.2_Aligned_Version/`




As with the ShapeNet dataset, we have also created a Google Drive zip for S3DIS, so the following code should handle the data download.

In [None]:
%cd /content/Pointnet_Pointnet2_pytorch
!rm -rf data/s3dis data/stanford_indoor3d/

/content/Pointnet_Pointnet2_pytorch


In [None]:
import os
import zipfile
import gdown
# Shareable link: https://drive.google.com/file/d/1tZcA-_k8eyvgSOqfkzTDEfdaQPtLiBbv/view?usp=sharing
# File ID taken from the shareable link above
file_id = "1tZcA-_k8eyvgSOqfkzTDEfdaQPtLiBbv"
# Direct-download URL for gdown
url = f"https://drive.google.com/uc?id={file_id}"

output_file = "Stanford3dDataset_v1.2_Aligned_Version.zip"

gdown.download(url, output_file, quiet=False)

# Specify the target directory where you want to extract the archive
target_dir = "data/s3dis"
os.makedirs(target_dir, exist_ok=True)

print("Extracting the dataset...")
with zipfile.ZipFile(output_file, 'r') as zip_ref:
    zip_ref.extractall(path=target_dir)

print(f"Archive extracted to: {target_dir}")

Extracting the dataset...
Archive extracted to: data/s3dis


In [None]:
%cd data_utils
!python collect_indoor3d_data.py

Streaming output truncated to the last 5000 lines.
/content/Pointnet_Pointnet2_pytorch/data/s3dis/Stanford3dDataset_v1.2_Aligned_Version/Area_4/office_12/Annotations/clutter_6.txt
/content/Pointnet_Pointnet2_pytorch/data/s3dis/Stanford3dDataset_v1.2_Aligned_Version/Area_4/office_12/Annotations/clutter_14.txt
/content/Pointnet_Pointnet2_pytorch/data/s3dis/Stanford3dDataset_v1.2_Aligned_Version/Area_4/office_12/Annotations/clutter_21.txt
/content/Pointnet_Pointnet2_pytorch/data/s3dis/Stanford3dDataset_v1.2_Aligned_Version/Area_4/office_12/Annotations/wall_3.txt
/content/Pointnet_Pointnet2_pytorch/data/s3dis/Stanford3dDataset_v1.2_Aligned_Version/Area_4/office_12/Annotations/bookcase_1.txt
/content/Pointnet_Pointnet2_pytorch/data/s3dis/Stanford3dDataset_v1.2_Aligned_Version/Area_4/office_12/Annotations/clutter_15.txt
/content/Pointnet_Pointnet2_pytorch/data/s3dis/Stanford3dDataset_v1.2_Aligned_Version/Area_4/office_12/Annotations/wall_1.txt
/content/Pointnet_Pointnet2_pytorc

Processed data will be saved in `data/stanford_indoor3d/`.

In [None]:
## Check model in ./models
## e.g., pointnet2_ssg
%cd /content/Pointnet_Pointnet2_pytorch
!python train_semseg.py --model pointnet2_sem_seg --test_area 5 --log_dir pointnet2_sem_seg --epoch 2

/content/Pointnet_Pointnet2_pytorch
  self.file_list = [d for d in os.listdir(root) if d.find('Area_%d' % test_area) is -1]
  self.file_list = [d for d in os.listdir(root) if d.find('Area_%d' % test_area) is not -1]
PARAMETER ...
Namespace(model='pointnet2_sem_seg', batch_size=16, epoch=2, learning_rate=0.001, gpu='0', optimizer='Adam', log_dir='pointnet2_sem_seg', decay_rate=0.0001, npoint=4096, step_size=10, lr_decay=0.7, test_area=5)
start loading training data ...
 50% 102/204 [00:48<00:48,  2.12it/s]
Traceback (most recent call last):
  File "/content/Pointnet_Pointnet2_pytorch/train_semseg.py", line 294, in <module>
    main(args)
  File "/content/Pointnet_Pointnet2_pytorch/train_semseg.py", line 97, in main
    TRAIN_DATASET = S3DISDataset(split='train', data_root=root, num_point=NUM_POINT, test_area=args.test_area, block_size=1.0, sample_rate=1.0, transform=None)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In [None]:
!python test_semseg.py --log_dir pointnet2_sem_seg --test_area 5 --visual

Visualization results will be saved in `log/sem_seg/pointnet2_sem_seg/visual/`, and you can view these .obj files with MeshLab or other tools.
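If MeshLab is not at hand, the exported files can also be inspected in Python. The sketch below assumes each vertex line has the form `v x y z r g b`, which is an assumption about what the test script writes, so open one file in a text editor to confirm. Lines that do not match are skipped, and both function names are hypothetical.

```python
from collections import Counter


def read_colored_obj(path):
    """Return (vertices, colors) from lines of the form 'v x y z r g b'."""
    vertices, colors = [], []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) >= 7 and parts[0] == "v":
                vertices.append(tuple(float(v) for v in parts[1:4]))
                colors.append(tuple(int(float(v)) for v in parts[4:7]))
    return vertices, colors


def color_histogram(colors):
    """Number of vertices per RGB color, most frequent first."""
    return Counter(colors).most_common()
```

Since each semantic class is drawn in its own color, the histogram gives a rough count of points per predicted class in a room.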

## Task-4 : Objaverse Dataset

*   Now proceed to the second notebook for HW1_Part2_Objaverse

