## Data Patch Generation for Fuji-SfM dataset and PFuji-Size PCL Base Files

Our task with this notebook is to essentially generate PCL patches with apple and tree class data points with approximately similar distributions of data points from each class in those patches.

#### 1. Generation of 3D Data Patch Dataset

First, we normalize _(z-transform)_ each of the base numpy point cloud datasets to re-center the datasets.
Second, we compare the data distribution of each datasets and make sure that the generated data patches have similar distribution during the generation stage. 
Also, we drop the color RGB information for 30 % of the data points for each class to have more robust model training.
Finally, we save this generated dataset for future segmentation model training usage.

#### 2. Generation of 3D Data Patch Dataset with Extra Features

Post this, we also generate a dataset variant with calculated normal information to append extra features into the generated point cloud dataset. And, also randomly drop 30 % of these calculated normals to have more robust segmentation model training.

#### 3. Visualization of Data Patches in the 3D Space

This visualization helps in verifying the upsampling logic used to upsample the apple point cloud data and further look into the generated data patches qualitatively.

In [None]:
# installing open3d package
!pip install open3d

In [3]:
# for loading the dataset into the runtime
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [4]:
# processed pcl numpy arrays of fuji apple datasets
!ls drive/MyDrive/point-cloud-prototyping/datasets/*processed*.zip

drive/MyDrive/point-cloud-prototyping/datasets/fuji-sfm-pcl-processed.zip
drive/MyDrive/point-cloud-prototyping/datasets/pfuji-size-pcl-processed-2018.zip
drive/MyDrive/point-cloud-prototyping/datasets/pfuji-size-pcl-processed-2020.zip


In [None]:
# unzip these files into the current runtime location for further processing
!unzip drive/MyDrive/point-cloud-prototyping/datasets/fuji-sfm-pcl-processed.zip -d .
!unzip drive/MyDrive/point-cloud-prototyping/datasets/pfuji-size-pcl-processed-2018.zip -d .
!unzip drive/MyDrive/point-cloud-prototyping/datasets/pfuji-size-pcl-processed-2020.zip -d .

In [6]:
# import statements for this script
import os
import random
import numpy as np
import open3d as o3d
from tqdm import tqdm
from copy import deepcopy

In [163]:
# copy files to other directory
import shutil
# list all files in the directory
from os import listdir
from os.path import isfile, join

In [7]:
# fixing randomization seed values
random.seed(16)
np.random.seed(16)

In [124]:
# utility functions for patch generation task
# PC max size for visualization for plotly
MAX_PC_SIZE = 40960 # RandLA-Net's input dimension size

# procedural logic for the upsampling function

# separate out apple class sub array with rgb values
# build a KDTree of the separated out sub array
# loop over upsampling_factor number of points
# then random point selection based on index
# post that, query the nearest neighbor to that random point
# preserve rgb information, separate array for actual upsampling operation

# also, update the whole PC to create new upsampling temporary PC
# after certain number of upsampled new PC size chunk is created
# for better upsampling consistency.

def upsample_minor_class(pcl_arr, upsampling_factor = 1.75):
    assert upsampling_factor >= 1.0 and upsampling_factor <= 3.0
    pcl_arr_cls = pcl_arr[pcl_arr[:, -1] == 1]
    pcl_arr_bg = pcl_arr[pcl_arr[:, -1] == 0]
    # open3d pcl processing
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(pcl_arr_cls[:,0:3])
    # building KD-Tree for NN query
    pcd_tree = o3d.geometry.KDTreeFlann(pcd)
    upsample_count = int((upsampling_factor - 1.0)*pcl_arr_cls.shape[0])
    # upsampled points nested list
    upsampled_points = []
    umsample_limit_flg = 0
    for i in tqdm(range(upsample_count)):
        rand_ind = random.randint(0,pcl_arr_cls.shape[0]-1) 
        [k, idx_vals, _] = pcd_tree.search_knn_vector_3d(pcd.points[rand_ind], 4)
        idx_vals = idx_vals[1:]
        rand_neighbor = random.choice(idx_vals)
        # creating new upsampled points
        arr_vals = (pcl_arr_cls[rand_ind] + pcl_arr_cls[rand_neighbor]) / 2
        upsampled_points.append(list(arr_vals))
        if len(upsampled_points) > 0.8 * pcl_arr_cls.shape[0] and umsample_limit_flg == 0:
            pcl_arr_cls =  np.concatenate((pcl_arr_cls, np.array(upsampled_points)), axis=0)
            pcd = o3d.geometry.PointCloud()
            pcd.points = o3d.utility.Vector3dVector(pcl_arr_cls[:,0:3])
            # building KD-Tree for NN query
            pcd_tree = o3d.geometry.KDTreeFlann(pcd)
            umsample_limit_flg = 1
            upsampled_points = []
    if len(upsampled_points) != 0:
        pcl_arr_cls =  np.concatenate((pcl_arr_cls, np.array(upsampled_points)), axis=0)
    pcl_arr_final = np.concatenate((pcl_arr_cls, pcl_arr_bg), axis=0)
    del pcd
    del pcd_tree
    return pcl_arr_final

# downsampling the PC size for standardization of visualized 'cls' & 'bg' PC size
def downsample_pcl(pcl_arr, downsampling_factor = 0.75):
    assert downsampling_factor >= 0.2 and downsampling_factor < 1.0
    downsampled_pc_count = int(downsampling_factor * pcl_arr.shape[0])
    if downsampled_pc_count > MAX_PC_SIZE:
        downsampled_pc_count = MAX_PC_SIZE
    idx_cls = np.random.randint((pcl_arr.shape[0] - 1),
                                size = downsampled_pc_count)
    pcl_arr = pcl_arr[idx_cls,:] 
    return pcl_arr

In [10]:
# checking the location of the unzipped numpy arrays
!ls *.npy

fuji-sfm-pcl-processed.npy	   pfuji-size-pcl-processed-2020.npy
pfuji-size-pcl-processed-2018.npy


In [18]:
# loading arrays for basic exploratory quantitative analysis
fuji_sfm_arr = np.load('fuji-sfm-pcl-processed.npy')
pfuji_size_2018_arr = np.load('pfuji-size-pcl-processed-2018.npy')
pfuji_size_2020_arr = np.load('pfuji-size-pcl-processed-2020.npy')
print(fuji_sfm_arr.shape, pfuji_size_2018_arr.shape, pfuji_size_2020_arr.shape)

(7153845, 7) (16473962, 7) (14213504, 7)


In [39]:
# checking ratio and fraction of total pc occupied by the apple class point clouds
print(fuji_sfm_arr[fuji_sfm_arr[:,6] == 1][:, 1].shape[0] / fuji_sfm_arr[fuji_sfm_arr[:,6] == 0][:, 0].shape[0], \
    fuji_sfm_arr[fuji_sfm_arr[:,6] == 1][:, 1].shape[0] / (fuji_sfm_arr[fuji_sfm_arr[:,6] == 0][:, 0].shape[0] + fuji_sfm_arr[fuji_sfm_arr[:,6] == 1][:, 1].shape[0]))

print(pfuji_size_2018_arr[pfuji_size_2018_arr[:,6] == 1][:, 1].shape[0] / pfuji_size_2018_arr[pfuji_size_2018_arr[:,6] == 0][:, 0].shape[0], \
    pfuji_size_2018_arr[pfuji_size_2018_arr[:,6] == 1][:, 1].shape[0] / (pfuji_size_2018_arr[pfuji_size_2018_arr[:,6] == 0][:, 0].shape[0] + pfuji_size_2018_arr[pfuji_size_2018_arr[:,6] == 1][:, 1].shape[0]))

print(pfuji_size_2020_arr[pfuji_size_2020_arr[:,6] == 1][:, 1].shape[0] / pfuji_size_2020_arr[pfuji_size_2020_arr[:,6] == 0][:, 0].shape[0], \
    pfuji_size_2020_arr[pfuji_size_2020_arr[:,6] == 1][:, 1].shape[0] / (pfuji_size_2020_arr[pfuji_size_2020_arr[:,6] == 0][:, 0].shape[0] + pfuji_size_2020_arr[pfuji_size_2020_arr[:,6] == 1][:, 1].shape[0]))

# twice the upsampling added for each patch with more than 10K data points
# and apple class objects being present

0.5389213119620704 0.3501941962678811
0.4086897735419506 0.29012049438987414
0.29237205647537456 0.22622901432328016


In [12]:
# checking the pcl data limit points to create window paritions for patch generation
# x-axis coordinate limits
print(np.min(fuji_sfm_arr[:,0]), np.max(fuji_sfm_arr[:,0]), (np.max(fuji_sfm_arr[:,0]) - np.min(fuji_sfm_arr[:,0])) )   
# y-axis coordinate limits
print(np.min(fuji_sfm_arr[:,1]), np.max(fuji_sfm_arr[:,1]), (np.max(fuji_sfm_arr[:,1]) - np.min(fuji_sfm_arr[:,1])) )
# z-axis coordinate limits
print(np.min(fuji_sfm_arr[:,2]), np.max(fuji_sfm_arr[:,2]), (np.max(fuji_sfm_arr[:,2]) - np.min(fuji_sfm_arr[:,2])) )

11.4000082 13.14999771 1.749989509999999
67.25000763 75.0 7.749992370000001
308.75 312.49996948 3.7499694800000043


In [32]:
# checking the pcl data limit points to create window paritions for patch generation
# x-axis coordinate limits
print(np.min(pfuji_size_2018_arr[:,0]), np.max(pfuji_size_2018_arr[:,0]), (np.max(pfuji_size_2018_arr[:,0]) - np.min(pfuji_size_2018_arr[:,0])) )   
# y-axis coordinate limits
print(np.min(pfuji_size_2018_arr[:,1]), np.max(pfuji_size_2018_arr[:,1]), (np.max(pfuji_size_2018_arr[:,1]) - np.min(pfuji_size_2018_arr[:,1])) )
# z-axis coordinate limits
print(np.min(pfuji_size_2018_arr[:,2]), np.max(pfuji_size_2018_arr[:,2]), (np.max(pfuji_size_2018_arr[:,2]) - np.min(pfuji_size_2018_arr[:,2])) )

-0.75 0.74999917 1.4999991700000002
-0.24999963 2.9999995200000003 3.2499991500000003
8.73e-06 3.24999905 3.24999032


In [33]:
# checking the pcl data limit points to create window paritions for patch generation
# x-axis coordinate limits
print(np.min(pfuji_size_2020_arr[:,0]), np.max(pfuji_size_2020_arr[:,0]), (np.max(pfuji_size_2020_arr[:,0]) - np.min(pfuji_size_2020_arr[:,0])) )   
# y-axis coordinate limits
print(np.min(pfuji_size_2020_arr[:,1]), np.max(pfuji_size_2020_arr[:,1]), (np.max(pfuji_size_2020_arr[:,1]) - np.min(pfuji_size_2020_arr[:,1])) )
# z-axis coordinate limits
print(np.min(pfuji_size_2020_arr[:,2]), np.max(pfuji_size_2020_arr[:,2]), (np.max(pfuji_size_2020_arr[:,2]) - np.min(pfuji_size_2020_arr[:,2])) )

-0.99996859 0.99989963 1.9998682200000002
-0.5 2.75 3.25
3.37e-06 2.99999905 2.99999568


In [19]:
# create a 30 % rgb info dropper function
def random_rgb_drop(pcl_arr, drop_frac = 0.3):
    pcl_arr_cls = pcl_arr[pcl_arr[:,6] == 1]
    pcl_arr_bg = pcl_arr[pcl_arr[:,6] == 0]

    idx_cls = np.random.choice(np.arange(pcl_arr_cls.shape[0]), replace=False,
                           size=int(pcl_arr_cls.shape[0] * drop_frac))
    idx_bg = np.random.choice(np.arange(pcl_arr_bg.shape[0]), replace=False,
                           size=int(pcl_arr_bg.shape[0] * drop_frac))
    
    pcl_arr_cls[idx_cls, 3:6] = 0.0
    pcl_arr_bg[idx_bg, 3:6] = 0.0

    pcl_arr =  np.concatenate((pcl_arr_cls, pcl_arr_bg), axis=0)
    return pcl_arr

In [20]:
# drop rgb information for the pcl base arrays in proper to their class sizes
fuji_sfm_arr[:, 3:6] = fuji_sfm_arr[:, 3:6]/255
fuji_sfm_arr = random_rgb_drop(fuji_sfm_arr)
pfuji_size_2018_arr = random_rgb_drop(pfuji_size_2018_arr)
pfuji_size_2020_arr = random_rgb_drop(pfuji_size_2020_arr)

In [35]:
# create normal estimation function and 
# create a 30 % normal info dropper function
def norm_estimate_and_drop(pcl_arr, drop_frac = 0.3):
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(pcl_arr[:,0:3])
    pcd.estimate_normals() # computationally intensive step
    pcd_arr = np.column_stack((np.asarray(pcd.points), pcl_arr[:,3:6], np.asarray(pcd.normals),
                               pcl_arr[:, 6]))
    del pcd # clear object from the memory

    pcd_arr_cls = pcd_arr[pcd_arr[:,9] == 1]
    pcd_arr_bg = pcd_arr[pcd_arr[:,9] == 0]

    idx_cls = np.random.choice(np.arange(pcd_arr_cls.shape[0]), replace=False,
                           size=int(pcd_arr_cls.shape[0] * drop_frac))
    idx_bg = np.random.choice(np.arange(pcd_arr_bg.shape[0]), replace=False,
                           size=int(pcd_arr_bg.shape[0] * drop_frac))
    
    pcd_arr_cls[idx_cls, 6:9] = 0.0
    pcd_arr_bg[idx_bg, 6:9] = 0.0

    pcd_arr =  np.concatenate((pcd_arr_cls, pcd_arr_bg), axis=0)

    del pcd_arr_cls # clear object from the memory
    del pcd_arr_bg # clear object from the memory

    return pcd_arr

In [36]:
fuji_sfm_norm_arr = norm_estimate_and_drop(fuji_sfm_arr)

In [40]:
pfuji_size_2018_norm_arr = norm_estimate_and_drop(pfuji_size_2018_arr)
pfuji_size_2020_norm_arr = norm_estimate_and_drop(pfuji_size_2020_arr)

In [43]:
print(fuji_sfm_norm_arr.shape, fuji_sfm_arr.shape)
print(pfuji_size_2018_norm_arr.shape, pfuji_size_2018_arr.shape)
print(pfuji_size_2020_norm_arr.shape, pfuji_size_2020_arr.shape)

(7153845, 10) (7153845, 7)
(16473962, 10) (16473962, 7)
(14213504, 10) (14213504, 7)


In [15]:
# if data patch directories not created, create those directories
# both for normal estimate and only with rgb information
FUJI_SFM_TEMP_DIR = 'fuji-data-patches'
NFUJI_SFM_TEMP_DIR = 'fuji-norm-patches'


if not os.path.exists(FUJI_SFM_TEMP_DIR):
    os.makedirs(FUJI_SFM_TEMP_DIR)
if not os.path.exists(NFUJI_SFM_TEMP_DIR):
    os.makedirs(NFUJI_SFM_TEMP_DIR)

In [91]:
# patch window size generation for the fuji sfm dataset
wind_sfm_x = ( 13.15 - 11.4 ) / 3
wind_sfm_y = ( 75.00 - 67.25 ) / 12
wind_sfm_z = ( 312.50 - 308.75 ) / 6
print(wind_sfm_x, wind_sfm_y, wind_sfm_z)

0.5833333333333334 0.6458333333333334 0.625


In [92]:
# patch window size generation for the pfuji size 2018 dataset
wind_pfuji_2018_x = abs(( - 0.75 - 0.75 ) / 4)
wind_pfuji_2018_y = abs(( -0.25 - 3.0 ) / 9)
wind_pfuji_2018_z = abs(( 0 - 3.25 ) / 9)
print(wind_pfuji_2018_x, wind_pfuji_2018_y, wind_pfuji_2018_z)

0.375 0.3611111111111111 0.3611111111111111


In [93]:
# patch window size generation for the pfuji size 2020 dataset
wind_pfuji_2020_x = abs(( - 1.0 - 1.0 ) / 4)
wind_pfuji_2020_y = abs(( -0.5 - 2.75 ) / 7)
wind_pfuji_2020_z = abs(( 0 - 3.0 ) / 6)
print(wind_pfuji_2020_x, wind_pfuji_2020_y, wind_pfuji_2020_z)

0.5 0.4642857142857143 0.5


In [96]:
# global data patch counter for fuji datasets
DATA_PATCH_COUNTER = 0
NORM_DATA_PATCH_COUNTER = 0

In [104]:
# patch generator function to generate the pcl data patches for model training
def patch_generator(pcl_arr, x_rng, y_rng, z_rng, \
                    wind_x, wind_y, wind_z, patch_path, patch_counter):
    for i in range(0, x_rng):
        for j in range(0, y_rng):
            for k in range(0, z_rng):
                pc_temp = pcl_arr[
                                ( pcl_arr[:,0] > np.min(pcl_arr[:,0]) + wind_x*i ) &
                                ( pcl_arr[:,0] <= np.min(pcl_arr[:,0]) + wind_x*(i+1) ) &

                                ( pcl_arr[:,1] > np.min(pcl_arr[:,1]) + wind_y*j ) &
                                ( pcl_arr[:,1] <= np.min(pcl_arr[:,1]) + wind_y*(j+1) ) &

                                ( pcl_arr[:,2] > np.min(pcl_arr[:,2]) + wind_z*k ) &
                                ( pcl_arr[:,2] <= np.min(pcl_arr[:,2]) + wind_z*(k+1) )
                        ]

                if pc_temp.shape[0] > 8192 and np.max(pc_temp[:,-1]) == 1.0:
                    try:
                        pc_temp = downsample_pcl(pc_temp)
                        pc_temp = upsample_minor_class(pc_temp)
                        patch_counter = patch_counter + 1
                        file_name = patch_path + '/data_patch_'+str(patch_counter)+'.npy'
                        np.save(file_name, pc_temp)
                    except:
                        pass

    return (patch_counter + 1)

In [None]:
DATA_PATCH_COUNTER = patch_generator(fuji_sfm_arr, 3, 12, 6,
                                     wind_sfm_x, wind_sfm_y, wind_sfm_z,
                                     FUJI_SFM_TEMP_DIR, DATA_PATCH_COUNTER)

In [None]:
DATA_PATCH_COUNTER = patch_generator(pfuji_size_2018_arr, 4, 9, 9,
                                     wind_pfuji_2018_x, wind_pfuji_2018_y, wind_pfuji_2018_z,
                                     FUJI_SFM_TEMP_DIR, DATA_PATCH_COUNTER + 1)

In [None]:
DATA_PATCH_COUNTER = patch_generator(pfuji_size_2020_arr, 4, 7, 6,
                                     wind_pfuji_2020_x, wind_pfuji_2020_y, wind_pfuji_2020_z,
                                     FUJI_SFM_TEMP_DIR, DATA_PATCH_COUNTER)

In [None]:
!ls fuji-norm-patches

In [None]:
NORM_DATA_PATCH_COUNTER = patch_generator(fuji_sfm_norm_arr, 3, 12, 6,
                                     wind_sfm_x, wind_sfm_y, wind_sfm_z,
                                     NFUJI_SFM_TEMP_DIR, NORM_DATA_PATCH_COUNTER)
                                     
NORM_DATA_PATCH_COUNTER = patch_generator(pfuji_size_2018_norm_arr, 4, 9, 9,
                                     wind_pfuji_2018_x, wind_pfuji_2018_y, wind_pfuji_2018_z,
                                     NFUJI_SFM_TEMP_DIR, NORM_DATA_PATCH_COUNTER)
                                     
NORM_DATA_PATCH_COUNTER = patch_generator(pfuji_size_2020_norm_arr, 4, 7, 6,
                                     wind_pfuji_2020_x, wind_pfuji_2020_y, wind_pfuji_2020_z,
                                     NFUJI_SFM_TEMP_DIR, NORM_DATA_PATCH_COUNTER)

In [138]:
# list all the created data patch files
data_patches_file_lst = [f for f in listdir(FUJI_SFM_TEMP_DIR) if isfile(join(FUJI_SFM_TEMP_DIR, f))]
norm_data_patches_file_lst = [f for f in listdir(NFUJI_SFM_TEMP_DIR) if isfile(join(NFUJI_SFM_TEMP_DIR, f))]

In [187]:
# create final dataset directories for train, valid, test split
# both for normal estimate and only with rgb information
FUJI_SFM_DIR = 'fuji-dataset'
FUJI_SFM_TRN_DIR = 'fuji-dataset/train'
FUJI_SFM_VLD_DIR = 'fuji-dataset/valid'
FUJI_SFM_TST_DIR = 'fuji-dataset/test'

if not os.path.exists(FUJI_SFM_DIR):
    os.makedirs(FUJI_SFM_TRN_DIR)
    os.makedirs(FUJI_SFM_VLD_DIR)
    os.makedirs(FUJI_SFM_TST_DIR)

NFUJI_SFM_DIR = 'fuji-norm-dataset'
NFUJI_SFM_TRN_DIR = 'fuji-norm-dataset/train'
NFUJI_SFM_VLD_DIR = 'fuji-norm-dataset/valid'
NFUJI_SFM_TST_DIR = 'fuji-norm-dataset/test'

if not os.path.exists(NFUJI_SFM_DIR):
    os.makedirs(NFUJI_SFM_DIR)
    os.makedirs(NFUJI_SFM_TRN_DIR)
    os.makedirs(NFUJI_SFM_VLD_DIR)
    os.makedirs(NFUJI_SFM_TST_DIR)

In [159]:
# random split of the lists to prepare training splits
dp_fuji_val =  set(random.sample(data_patches_file_lst, 20))
data_patches_file_lst = set(data_patches_file_lst) - dp_fuji_val
dp_fuji_test =  set(random.sample(data_patches_file_lst, 20))
data_patches_file_lst = set(data_patches_file_lst) - dp_fuji_test

dp_fuji_val = list(dp_fuji_val)
dp_fuji_test = list(dp_fuji_test)
data_patches_file_lst = list(data_patches_file_lst)

since Python 3.9 and will be removed in a subsequent version.
  dp_fuji_val =  set(random.sample(data_patches_file_lst, 20))
since Python 3.9 and will be removed in a subsequent version.
  dp_fuji_test =  set(random.sample(data_patches_file_lst, 20))


In [162]:
print(dp_fuji_val)
print(dp_fuji_test)

['data_patch_234.npy', 'data_patch_150.npy', 'data_patch_188.npy', 'data_patch_361.npy', 'data_patch_398.npy', 'data_patch_217.npy', 'data_patch_338.npy', 'data_patch_162.npy', 'data_patch_53.npy', 'data_patch_196.npy', 'data_patch_164.npy', 'data_patch_3.npy', 'data_patch_276.npy', 'data_patch_45.npy', 'data_patch_110.npy', 'data_patch_156.npy', 'data_patch_103.npy', 'data_patch_23.npy', 'data_patch_181.npy', 'data_patch_84.npy']
['data_patch_123.npy', 'data_patch_97.npy', 'data_patch_362.npy', 'data_patch_344.npy', 'data_patch_51.npy', 'data_patch_261.npy', 'data_patch_264.npy', 'data_patch_357.npy', 'data_patch_209.npy', 'data_patch_154.npy', 'data_patch_363.npy', 'data_patch_350.npy', 'data_patch_377.npy', 'data_patch_393.npy', 'data_patch_171.npy', 'data_patch_121.npy', 'data_patch_95.npy', 'data_patch_345.npy', 'data_patch_369.npy', 'data_patch_232.npy']


In [164]:
# random split of the lists to prepare training splits
norm_dp_fuji_val =  set(random.sample(norm_data_patches_file_lst, 20))
norm_data_patches_file_lst = set(norm_data_patches_file_lst) - norm_dp_fuji_val
norm_dp_fuji_test =  set(random.sample(norm_data_patches_file_lst, 20))
norm_data_patches_file_lst = set(norm_data_patches_file_lst) - norm_dp_fuji_test

norm_dp_fuji_val = list(norm_dp_fuji_val)
norm_dp_fuji_test = list(norm_dp_fuji_test)
norm_data_patches_file_lst = list(norm_data_patches_file_lst)

since Python 3.9 and will be removed in a subsequent version.
  norm_dp_fuji_test =  set(random.sample(norm_data_patches_file_lst, 20))


In [165]:
print(norm_dp_fuji_val)
print(norm_dp_fuji_test)

['data_patch_6.npy', 'data_patch_256.npy', 'data_patch_132.npy', 'data_patch_65.npy', 'data_patch_109.npy', 'data_patch_327.npy', 'data_patch_54.npy', 'data_patch_298.npy', 'data_patch_297.npy', 'data_patch_343.npy', 'data_patch_368.npy', 'data_patch_379.npy', 'data_patch_45.npy', 'data_patch_249.npy', 'data_patch_131.npy', 'data_patch_236.npy', 'data_patch_352.npy', 'data_patch_40.npy', 'data_patch_347.npy', 'data_patch_58.npy']
['data_patch_165.npy', 'data_patch_143.npy', 'data_patch_36.npy', 'data_patch_161.npy', 'data_patch_277.npy', 'data_patch_64.npy', 'data_patch_235.npy', 'data_patch_334.npy', 'data_patch_363.npy', 'data_patch_375.npy', 'data_patch_248.npy', 'data_patch_393.npy', 'data_patch_121.npy', 'data_patch_296.npy', 'data_patch_330.npy', 'data_patch_336.npy', 'data_patch_185.npy', 'data_patch_24.npy', 'data_patch_372.npy', 'data_patch_84.npy']


In [178]:
# three separate copy loops for the three data splits
for file_name in data_patches_file_lst:
    full_file_name = os.path.join(FUJI_SFM_TEMP_DIR, file_name)
    if os.path.isfile(full_file_name):
        shutil.copy(full_file_name, FUJI_SFM_TRN_DIR)

for file_name in dp_fuji_val:
    full_file_name = os.path.join(FUJI_SFM_TEMP_DIR, file_name)
    if os.path.isfile(full_file_name):
        shutil.copy(full_file_name, FUJI_SFM_VLD_DIR)

for file_name in dp_fuji_test:
    full_file_name = os.path.join(FUJI_SFM_TEMP_DIR, file_name)
    if os.path.isfile(full_file_name):
        shutil.copy(full_file_name, FUJI_SFM_TST_DIR)

In [213]:
!ls fuji-dataset/valid

data_patch_103.npy  data_patch_164.npy	data_patch_234.npy  data_patch_398.npy
data_patch_110.npy  data_patch_181.npy	data_patch_23.npy   data_patch_3.npy
data_patch_150.npy  data_patch_188.npy	data_patch_276.npy  data_patch_45.npy
data_patch_156.npy  data_patch_196.npy	data_patch_338.npy  data_patch_53.npy
data_patch_162.npy  data_patch_217.npy	data_patch_361.npy  data_patch_84.npy


In [195]:
# three separate copy loops for the three data splits
for file_name in norm_data_patches_file_lst:
    full_file_name = os.path.join(NFUJI_SFM_TEMP_DIR, file_name)
    if os.path.isfile(full_file_name):
        shutil.copy(full_file_name, NFUJI_SFM_TRN_DIR)

for file_name in norm_dp_fuji_val:
    full_file_name = os.path.join(NFUJI_SFM_TEMP_DIR, file_name)
    if os.path.isfile(full_file_name):
        shutil.copy(full_file_name, NFUJI_SFM_VLD_DIR)

for file_name in norm_dp_fuji_test:
    full_file_name = os.path.join(NFUJI_SFM_TEMP_DIR, file_name)
    if os.path.isfile(full_file_name):
        shutil.copy(full_file_name, NFUJI_SFM_TST_DIR)

In [None]:
# moving the processed point cloud datasets into the drive
# TODO: update the below paths based on your project setup
!zip -r drive/MyDrive/point-cloud-prototyping/datasets/fuji-complete-dataset.zip  fuji-dataset/

In [None]:
# TODO: update the below paths based on your project setup
!zip -r drive/MyDrive/point-cloud-prototyping/datasets/fuji-norm-complete-dataset.zip  fuji-norm-dataset/