# Object Extraction (Racks) for Semantic Segmentation Neural Networks

### The following code utilizes the PointBluePython (LabelPC) library to process facilities and extract their racks into text files.

The code reads from a root directory, `facilities_rootdir`, that contains one subdirectory per facility, named after the facility. Each subdirectory holds the facility's JSON file and its LAS or LAZ file: `facilities_rootdir/Facility_Name/...`

Each output text file stores one point per line as `x y z r g b` (float, float, float, int, int, int), with the values separated by single spaces. The points are a random sample of the scan, and the sample size is configurable.
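The line layout described above can be illustrated with a minimal sketch. The helper names `format_point` and `parse_point` are hypothetical and not part of LabelPC; only the format string matches the one used by the extraction loop in this notebook.

```python
def format_point(x, y, z, r, g, b):
    """Format one point as 'x y z r g b': float coordinates, integer colors."""
    return "%f %f %f %d %d %d" % (x, y, z, r, g, b)


def parse_point(line):
    """Parse a line produced by format_point back into typed values."""
    x, y, z, r, g, b = line.split()
    return float(x), float(y), float(z), int(r), int(g), int(b)


line = format_point(1.5, -2.25, 0.0, 255, 128, 0)
print(line)               # 1.500000 -2.250000 0.000000 255 128 0
print(parse_point(line))  # (1.5, -2.25, 0.0, 255, 128, 0)
```

Note that `%f` keeps six decimal places, so coordinates are written with sub-millimeter precision when the units are meters.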

#### The structure of the created dataset files is as follows 

`facilities_rootdir/dataOutput/FacilityName_N/Annotations/...`, where `N` is the facility number (its 1-based position in the processing order).

Each facility is split into plain-text point cloud files: one file per rack, named after its rack class, plus a general file called `clutter` for everything else.

This data can later be fed to the [Pointnet2 Neural Network](https://github.com/marcofariasmx/Pointnet_Warehouse_pytorch) for semantic segmentation.
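The rack/clutter split described above can be sketched on toy data. This is only an illustration under simplifying assumptions: `partition_points` is a hypothetical helper, points are plain tuples rather than a DataFrame, and each point goes to the first box that contains it.

```python
def partition_points(points, boxes):
    """Assign each point to the first box containing it, otherwise to 'clutter'."""
    groups = {name: [] for name in boxes}
    groups["clutter"] = []
    for x, y, z in points:
        for name, (xmin, xmax, ymin, ymax, zmin, zmax) in boxes.items():
            if xmin <= x <= xmax and ymin <= y <= ymax and zmin <= z <= zmax:
                groups[name].append((x, y, z))
                break
        else:
            # No box claimed the point, so it belongs to the clutter class
            groups["clutter"].append((x, y, z))
    return groups


# Box order: xmin, xmax, ymin, ymax, zmin, zmax
boxes = {
    "select-rack_1": (0, 2, 0, 1, 0, 3),
    "select-rack_2": (5, 7, 0, 1, 0, 3),
}
points = [(1.0, 0.5, 1.0), (6.0, 0.5, 2.0), (3.5, 0.5, 0.1)]
groups = partition_points(points, boxes)
print({name: len(pts) for name, pts in groups.items()})
# {'select-rack_1': 1, 'select-rack_2': 1, 'clutter': 1}
```

Each key corresponds to one output file, so a point never appears in more than one annotation file.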


**Note:** Make sure you run this script in LabelPC's environment to ensure compatibility with its library.

In [1]:
import sys
import os
from collections import defaultdict

In [2]:
sys.path.append('../')

In [3]:
from WarehouseDataStructures.Scan import Scan
from Utils.full_file_normals_calculation import normals_calculation

### Execute the code in charge of reading facilities' data

#### Read over the facilities (assumes all facilities, with their respective files and file structure, are in a single directory)

In [5]:
#facilities_path = './facilities_rootdir/'
facilities_path = r'C:\Users\M0x1\Downloads\FacilitiesX10New/'

facilities_list = os.listdir(facilities_path)

print(facilities_list)

['142728079488_US - CA, Los Angeles - Los Palos (Big Bear)', '143039459344_US - NJ, Logan Township - Crossroads', '153924853363_US - TX, El Paso - Railroad', '194913221226_CAN - AB, Calgary - 5555 78th Ave SE (Calgary Foothills)', '194915139195_CAN - AB, Lethbridge - 585 41st St North', '194916756196_CAN - AB, Edmonton - 12536 62 St (Northlands)', '196524033787_CAN - MB, Winnipeg - 200 Dawson Rd N', '196524856938_CAN - QC, Montreal - 5757 Chemin St. Francois', '196527723206_CAN - ON, Brampton - 107 Walker Dr', '196527859613_CAN - QC, Saint-Laurent - 6100 Côte de Liesse']


In [6]:
json_filespath = []
for annotated_facility in facilities_list:
    files_names = os.listdir(facilities_path + annotated_facility)
    for file_name in files_names:
        if file_name.endswith('.json'):
            json_filespath.append(facilities_path + annotated_facility + '/' + file_name)
    
print(json_filespath)

['C:\\Users\\M0x1\\Downloads\\FacilitiesX10New/142728079488_US - CA, Los Angeles - Los Palos (Big Bear)/US - CA, Los Angeles (Big Bear) - Los Palos.json', 'C:\\Users\\M0x1\\Downloads\\FacilitiesX10New/143039459344_US - NJ, Logan Township - Crossroads/loganTownship_1cm_normals.json', 'C:\\Users\\M0x1\\Downloads\\FacilitiesX10New/153924853363_US - TX, El Paso - Railroad/railroad_1cm_normals.json', 'C:\\Users\\M0x1\\Downloads\\FacilitiesX10New/194913221226_CAN - AB, Calgary - 5555 78th Ave SE (Calgary Foothills)/FoothHills_Calgary_1cm_normals.json', 'C:\\Users\\M0x1\\Downloads\\FacilitiesX10New/194915139195_CAN - AB, Lethbridge - 585 41st St North/Lethbridge_1cm_nomals.json', 'C:\\Users\\M0x1\\Downloads\\FacilitiesX10New/194916756196_CAN - AB, Edmonton - 12536 62 St (Northlands)/edmontonNorth_1cm_normals.json', 'C:\\Users\\M0x1\\Downloads\\FacilitiesX10New/196524033787_CAN - MB, Winnipeg - 200 Dawson Rd N/Dawson_1cm_normals.json', 'C:\\Users\\M0x1\\Downloads\\FacilitiesX10New/196524856938

In [7]:
def get_rack_bounding_box(rack, tolerance = 0):
    """
    Return the bounding box coordinates of a rack, optionally padded by a tolerance.
    
    
    Parameters
    -------------
    rack: BaseRack
        rack object to get the bounding box for
    tolerance: float
        tolerance number for the bounding box in meters.
        e.g: .15 = extra 15 centimeters to each side of the bounding box.
        
    Return
    -------------
    xmin, xmax, ymin, ymax, zmin, zmax: float
        The coordinates of the rack's bounding box.
    """
    xmin = min(vertex[0] for vertex in rack.vertices)
    xmax = max(vertex[0] for vertex in rack.vertices)
    ymin = min(vertex[1] for vertex in rack.vertices)
    ymax = max(vertex[1] for vertex in rack.vertices)
    try:
        zmin = rack.base_bottoms[0][0][0] #rack.bottom_height
    except Exception:  # fall back to z = 0 when base_bottoms is missing or empty
        zmin = 0
    zmax = sum(rack.base_heights[0][0]) #rack.top_height
    if zmax == 0:
        zmax = sum(rack.base_bottoms[0][0])
    if zmax == 0:
        zmax = 3
    if zmax < zmin:
        zmax = zmin
        zmin = 0
    
        
    # Pad the box outward: subtract the tolerance from the minima and add it to the maxima
    return xmin - tolerance, xmax + tolerance, ymin - tolerance, ymax + tolerance, zmin - tolerance, zmax + tolerance
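As a minimal sketch of what "extra tolerance on each side" means for point selection, the toy example below pads a box outward and filters points with it. `expand_box` and `points_in_box` are hypothetical helpers written for illustration, not LabelPC functions.

```python
def expand_box(box, tolerance=0.0):
    """Grow a box (xmin, xmax, ymin, ymax, zmin, zmax) by tolerance on every side."""
    xmin, xmax, ymin, ymax, zmin, zmax = box
    return (xmin - tolerance, xmax + tolerance,
            ymin - tolerance, ymax + tolerance,
            zmin - tolerance, zmax + tolerance)


def points_in_box(points, box):
    """Keep the (x, y, z) points that fall inside the box, borders included."""
    xmin, xmax, ymin, ymax, zmin, zmax = box
    return [p for p in points
            if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax and zmin <= p[2] <= zmax]


points = [(0.5, 0.5, 1.0), (1.1, 0.5, 1.0), (3.0, 3.0, 1.0)]
box = (0.0, 1.0, 0.0, 1.0, 0.0, 2.0)
print(points_in_box(points, box))                    # [(0.5, 0.5, 1.0)]
print(points_in_box(points, expand_box(box, 0.15)))  # [(0.5, 0.5, 1.0), (1.1, 0.5, 1.0)]
```

With a tolerance of 0.15 the point at x = 1.1, just outside the annotated box, is captured as well, which can help when annotations are slightly tighter than the scanned rack.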

In [8]:
def map_to_255(value):
    """
    Maps a floating-point value from the range 0 to 1 to the range 0 to 255.
    
    Parameters
    -------------
    value: float
        value to map
        
    Return
    -------------
    mapped_value: int
        Number from 0 to 255.
    """
    scaled_value = value * 255
    mapped_value = int(scaled_value)
    return mapped_value
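`map_to_255` assumes its input lies in the range 0 to 1. If the values can fall outside that range (for example, if the r, g, b columns hold signed normal components, which is an assumption about what `normals_calculation` returns), the result leaves the 0 to 255 range. A minimal sketch of a clamped variant follows; `map_to_255_clamped` is a hypothetical name, not part of the notebook.

```python
def map_to_255_clamped(value):
    """Clamp value to [0, 1], then map it to an integer in [0, 255]."""
    clamped = min(max(value, 0.0), 1.0)
    return int(clamped * 255)


print(map_to_255_clamped(0.5))   # 127
print(map_to_255_clamped(1.2))   # 255
print(map_to_255_clamped(-0.3))  # 0
```

Clamping discards the sign of negative inputs, so whether it is appropriate depends on how the downstream network is expected to interpret these channels.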

In [9]:
def calculate_normals(scan, k=100, r=.15):
    """
    Calculates the normals for the given scan.

    Calculate the normal direction for each point currently loaded in
    ``scan`` and store the components in the r, g, b columns of the point
    cloud DataFrame, overwriting the original colors. Before that, the
    mean of the original r, g, b values is saved in the "user_data"
    column. The curvature and local neighborhood density values are
    stored in the "class" and "intensity" columns respectively.

    Parameters
    -------------
    scan: Scan
        scan whose point cloud is updated in place
    k: int
        number of neighbors passed to normals_calculation
    r: float
        radius passed to normals_calculation
    """
    print("Calculating normals for scan file...")
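    # Keep the mean of the original colors before r, g, b are overwritten with the normals.
    # .loc avoids the chained assignment, which may silently write to a temporary copy.
    scan.point_cloud.points.loc[:, 'user_data'] = scan.point_cloud.points[['r', 'g', 'b']].sum(axis=1) / 3.0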
    normals, curvature, density = normals_calculation(scan.point_cloud.xyz, k, r)
    scan.point_cloud.points.loc[:, 'r'] = normals[:, 0]
    scan.point_cloud.points.loc[:, 'g'] = normals[:, 1]
    scan.point_cloud.points.loc[:, 'b'] = normals[:, 2]
    scan.point_cloud.points.loc[:, 'class'] = curvature
    scan.point_cloud.points.loc[:, 'intensity'] = density
    print("Normals calculation done!")

#### Create dictionaries and methods to store the gathered information

In [10]:
global_racks_dict = {}
facility_racks_dict = {}

In [11]:
# Function to add objects to the inner dictionary
def add_object(inner_dict, obj_type, obj):
    if obj_type in inner_dict:
        inner_dict[obj_type].append(obj)  # If the object type exists, add the object to the existing list
    else:
        inner_dict[obj_type] = [obj]  # If the object type doesn't exist, create a new list with the object

# Function to add facilities to the main dictionary
def add_facility(facility_dict, facility_name):
    facility_dict[facility_name] = {}  # Create an empty dictionary for the facility
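The two helpers above build a dictionary of dictionaries of lists (facility name, then rack label, then rack objects). As a sketch of the same shape, the toy example below uses `collections.defaultdict` so that missing keys are created on first access; the facility names and rack strings are made up for illustration.

```python
from collections import defaultdict

# facility name -> rack label -> list of rack objects
nested = defaultdict(lambda: defaultdict(list))
nested["Facility A"]["select_rack"].append("rack-1")
nested["Facility A"]["select_rack"].append("rack-2")
nested["Facility B"]["drive_in_rack"].append("rack-3")

# Count the racks of each label per facility
counts = {
    name: {label: len(racks) for label, racks in inner.items()}
    for name, inner in nested.items()
}
print(counts)
# {'Facility A': {'select_rack': 2}, 'Facility B': {'drive_in_rack': 1}}
```

One difference to keep in mind: with `defaultdict`, a facility without any saved rack never appears in the dictionary, whereas `add_facility` registers it with an empty inner dictionary.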


#### Execute loop to iterate over every facility and extract its data

Number of points to load for each facility:

In [12]:
num_points_to_load = 1000000 #20000000

In [13]:
for index, facility_file in enumerate(json_filespath):
    scan = Scan(filename=facility_file, n_points=num_points_to_load)
    shapes = scan.shapes
    
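    # Add facility to the facility racks dict (assumes exactly one JSON file per facility,
    # so that json_filespath and facilities_list share the same order)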
    add_facility(facility_racks_dict, facilities_list[index])
    
    #Calculate normals for scan file
    calculate_normals(scan)

    # Create a dictionary of the facility's racks
    racks_dict = {}
    for rack in shapes.racks:
        if rack.label in racks_dict:
            racks_dict[rack.label].append(rack)
        else:
            racks_dict[rack.label] = [rack]
            
        if rack.label in global_racks_dict:
            global_racks_dict[rack.label].append(rack)
        else:
            global_racks_dict[rack.label] = [rack]
            
    # Every value in racks_dict is a list of racks, so the total is the sum of the list lengths
    facility_racks_count = sum(len(racks) for racks in racks_dict.values())

    print("Facility's rack count: ", facility_racks_count)
    
    original_point_cloud_df = scan.point_cloud.points
    point_cloud_df = original_point_cloud_df.copy()
    
    directory = facilities_path + 'dataOutput/' + facilities_list[index] + '_' + str(index+1)
    
    # Create the output directory tree if it does not exist yet
    os.makedirs(directory + '/Annotations', exist_ok=True)
        
    rack_counter = 1
    
    for key in racks_dict.keys():
        for idx, rack in enumerate(racks_dict[key]):
            xmin, xmax, ymin, ymax, zmin, zmax = get_rack_bounding_box(rack, tolerance=0)

            
            # Create a new DataFrame of the racks points within the bounding box limits
            bounding_box_df = point_cloud_df[(point_cloud_df['x'] >= xmin) & (point_cloud_df['x'] <= xmax) &
                                         (point_cloud_df['y'] >= ymin) & (point_cloud_df['y'] <= ymax) &
                                         (point_cloud_df['z'] >= zmin) & (point_cloud_df['z'] <= zmax)].copy()
            
            # Racks with fewer than 200 points are not useful for training, so they are not written
            if len(bounding_box_df.index) >= 200:
                # Write the output text file
                obj_name = str(rack.label).replace("_", "-" )
                output_text_file = obj_name +'_'+ str(idx+1) + ".txt"
                with open(directory + '/Annotations/' + output_text_file, 'w') as file:
                    for _, row in bounding_box_df.iterrows():
                        file.write("%f %f %f %d %d %d\n" % (row['x'], row['y'], row['z'], map_to_255(row['r']), map_to_255(row['g']), map_to_255(row['b'])))

                # Add rack object to the dict of facilities and their respective racks (dict of dict of lists)
                add_object(facility_racks_dict[facilities_list[index]], rack.label, rack)
            
            print("Saved: Facility ", str(index+1), " / ", str(len(json_filespath)), " rack ", str(rack_counter), "/", str(facility_racks_count))
            rack_counter += 1
            
            # Get the indices of rows to be removed
            indices_to_remove = bounding_box_df.index

            # Remove the rows from the original DataFrame
            point_cloud_df = point_cloud_df.drop(indices_to_remove)
            
            print("Points left: ", len(point_cloud_df.index))
            
    # Create clutter class pointcloud txt file
    output_text_file = 'clutter' + '_' + str(1) + ".txt"
    with open(directory + '/Annotations/' + output_text_file, 'w') as file:
        for _, row in point_cloud_df.iterrows():
            file.write("%f %f %f %d %d %d\n" % (row['x'], row['y'], row['z'], map_to_255(row['r']), map_to_255(row['g']), map_to_255(row['b'])))
    
    # Create general pointcloud txt file
    output_text_file = 'All-points-facility' + '_' + str(index+1) + ".txt"
    with open(directory + '/' + output_text_file, 'w') as file:
        for _, row in original_point_cloud_df.iterrows():
            file.write("%f %f %f %d %d %d\n" % (row['x'], row['y'], row['z'], map_to_255(row['r']), map_to_255(row['g']), map_to_255(row['b'])))
    
    print("Saved general facility file")

Successful points load of  US - CA, Los Angeles (Big Bear) - Los Palos.laz




Calculating normals for scan file...
Normals calculation done!
Facility's rack count:  86
Saved: Facility  1  /  10  rack  1 / 86
Points left:  996041
Saved: Facility  1  /  10  rack  2 / 86
Points left:  990627
Saved: Facility  1  /  10  rack  3 / 86
Points left:  986289
Saved: Facility  1  /  10  rack  4 / 86
Points left:  979715
Saved: Facility  1  /  10  rack  5 / 86
Points left:  977962
Saved: Facility  1  /  10  rack  6 / 86
Points left:  972220
Saved: Facility  1  /  10  rack  7 / 86
Points left:  969053
Saved: Facility  1  /  10  rack  8 / 86
Points left:  960907
Saved: Facility  1  /  10  rack  9 / 86
Points left:  957132
Saved: Facility  1  /  10  rack  10 / 86
Points left:  951517
Saved: Facility  1  /  10  rack  11 / 86
Points left:  948206
Saved: Facility  1  /  10  rack  12 / 86
Points left:  944333
Saved: Facility  1  /  10  rack  13 / 86
Points left:  939013
Saved: Facility  1  /  10  rack  14 / 86
Points left:  935736
Saved: Facility  1  /  10  rack  15 / 86
Points lef

Saved: Facility  2  /  10  rack  44 / 52
Points left:  687946
Saved: Facility  2  /  10  rack  45 / 52
Points left:  680413
Saved: Facility  2  /  10  rack  46 / 52
Points left:  672363
Saved: Facility  2  /  10  rack  47 / 52
Points left:  664334
Saved: Facility  2  /  10  rack  48 / 52
Points left:  656675
Saved: Facility  2  /  10  rack  49 / 52
Points left:  649075
Saved: Facility  2  /  10  rack  50 / 52
Points left:  639242
Saved: Facility  2  /  10  rack  51 / 52
Points left:  631738
Saved: Facility  2  /  10  rack  52 / 52
Points left:  624015
Saved general facility file
Successful points load of  railroad_1cm_normals.laz
Calculating normals for scan file...
Normals calculation done!
Facility's rack count:  97
Saved: Facility  3  /  10  rack  1 / 97
Points left:  1000662
Saved: Facility  3  /  10  rack  2 / 97
Points left:  1000662
Saved: Facility  3  /  10  rack  3 / 97
Points left:  1000502
Saved: Facility  3  /  10  rack  4 / 97
Points left:  996484
Saved: Facility  3  /  10

Saved: Facility  4  /  10  rack  22 / 60
Points left:  884830
Saved: Facility  4  /  10  rack  23 / 60
Points left:  876309
Saved: Facility  4  /  10  rack  24 / 60
Points left:  869567
Saved: Facility  4  /  10  rack  25 / 60
Points left:  866974
Saved: Facility  4  /  10  rack  26 / 60
Points left:  866471
Saved: Facility  4  /  10  rack  27 / 60
Points left:  865925
Saved: Facility  4  /  10  rack  28 / 60
Points left:  862338
Saved: Facility  4  /  10  rack  29 / 60
Points left:  860026
Saved: Facility  4  /  10  rack  30 / 60
Points left:  851575
Saved: Facility  4  /  10  rack  31 / 60
Points left:  847303
Saved: Facility  4  /  10  rack  32 / 60
Points left:  846507
Saved: Facility  4  /  10  rack  33 / 60
Points left:  838325
Saved: Facility  4  /  10  rack  34 / 60
Points left:  835520
Saved: Facility  4  /  10  rack  35 / 60
Points left:  827767
Saved: Facility  4  /  10  rack  36 / 60
Points left:  823561
Saved: Facility  4  /  10  rack  37 / 60
Points left:  823087
Saved: F

Saved: Facility  7  /  10  rack  38 / 150
Points left:  936567
Saved: Facility  7  /  10  rack  39 / 150
Points left:  935713
Saved: Facility  7  /  10  rack  40 / 150
Points left:  934709
Saved: Facility  7  /  10  rack  41 / 150
Points left:  933949
Saved: Facility  7  /  10  rack  42 / 150
Points left:  933738
Saved: Facility  7  /  10  rack  43 / 150
Points left:  932649
Saved: Facility  7  /  10  rack  44 / 150
Points left:  931763
Saved: Facility  7  /  10  rack  45 / 150
Points left:  930337
Saved: Facility  7  /  10  rack  46 / 150
Points left:  929573
Saved: Facility  7  /  10  rack  47 / 150
Points left:  929504
Saved: Facility  7  /  10  rack  48 / 150
Points left:  928179
Saved: Facility  7  /  10  rack  49 / 150
Points left:  926588
Saved: Facility  7  /  10  rack  50 / 150
Points left:  925790
Saved: Facility  7  /  10  rack  51 / 150
Points left:  924831
Saved: Facility  7  /  10  rack  52 / 150
Points left:  923785
Saved: Facility  7  /  10  rack  53 / 150
Points left: 

Saved: Facility  8  /  10  rack  15 / 63
Points left:  957481
Saved: Facility  8  /  10  rack  16 / 63
Points left:  956298
Saved: Facility  8  /  10  rack  17 / 63
Points left:  953567
Saved: Facility  8  /  10  rack  18 / 63
Points left:  949723
Saved: Facility  8  /  10  rack  19 / 63
Points left:  947865
Saved: Facility  8  /  10  rack  20 / 63
Points left:  945435
Saved: Facility  8  /  10  rack  21 / 63
Points left:  944022
Saved: Facility  8  /  10  rack  22 / 63
Points left:  941320
Saved: Facility  8  /  10  rack  23 / 63
Points left:  939542
Saved: Facility  8  /  10  rack  24 / 63
Points left:  935435
Saved: Facility  8  /  10  rack  25 / 63
Points left:  932445
Saved: Facility  8  /  10  rack  26 / 63
Points left:  930390
Saved: Facility  8  /  10  rack  27 / 63
Points left:  927996
Saved: Facility  8  /  10  rack  28 / 63
Points left:  923501
Saved: Facility  8  /  10  rack  29 / 63
Points left:  922000
Saved: Facility  8  /  10  rack  30 / 63
Points left:  917479
Saved: F

Points left:  885272
Saved: Facility  9  /  10  rack  81 / 116
Points left:  881485
Saved: Facility  9  /  10  rack  82 / 116
Points left:  878070
Saved: Facility  9  /  10  rack  83 / 116
Points left:  874925
Saved: Facility  9  /  10  rack  84 / 116
Points left:  873760
Saved: Facility  9  /  10  rack  85 / 116
Points left:  871646
Saved: Facility  9  /  10  rack  86 / 116
Points left:  871213
Saved: Facility  9  /  10  rack  87 / 116
Points left:  870046
Saved: Facility  9  /  10  rack  88 / 116
Points left:  867590
Saved: Facility  9  /  10  rack  89 / 116
Points left:  867590
Saved: Facility  9  /  10  rack  90 / 116
Points left:  867030
Saved: Facility  9  /  10  rack  91 / 116
Points left:  864070
Saved: Facility  9  /  10  rack  92 / 116
Points left:  864070
Saved: Facility  9  /  10  rack  93 / 116
Points left:  864070
Saved: Facility  9  /  10  rack  94 / 116
Points left:  864070
Saved: Facility  9  /  10  rack  95 / 116
Points left:  861265
Saved: Facility  9  /  10  rack  9

Build a dictionary of the racks found in every facility and use it to divide the facilities into five areas whose counts of each rack type are as equal as possible. This promotes a better-balanced dataset for training and testing.

In [14]:
def create_facility_sets(facility_dict):
    # Create a dictionary to store the total count of objects for each object type across all facilities
    object_type_counts = defaultdict(int)
    
    # Calculate the total count of objects for each object type across all facilities
    for inner_dict in facility_dict.values():
        for obj_type, objects in inner_dict.items():
            object_type_counts[obj_type] += len(objects)
    
    # Sort the facilities based on the difference in object type counts compared to the average count
    average_counts = {obj_type: count / len(facility_dict) for obj_type, count in object_type_counts.items()}
    sorted_facilities = sorted(facility_dict, key=lambda facility: sum(abs(len(objects) - average_counts[obj_type]) for obj_type, objects in facility_dict[facility].items()))
    
    # Divide the sorted facilities into five sets
    facility_sets = [[] for _ in range(5)]
    set_index = 0
    for facility in sorted_facilities:
        facility_sets[set_index].append(facility)
        set_index = (set_index + 1) % 5
    
    # Print each set with their facilities and object type counts
    for i, facility_set in enumerate(facility_sets, start=1):
        print(f"Set {i}:")
        for facility in facility_set:
            object_types = facility_dict[facility]
            object_type_counts = {obj_type: len(objects) for obj_type, objects in object_types.items()}
            print(f"  Facility {facility}: {object_type_counts}")
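The sort-then-deal strategy used above can be traced on a small example. The sketch below is a simplified stand-in, not the notebook's function: `split_into_sets` is a hypothetical name, it takes rack counts directly instead of lists of rack objects, and it returns the sets instead of printing them.

```python
def split_into_sets(counts_by_facility, n_sets):
    """Sort facilities by deviation from the average counts, then deal them round-robin."""
    totals = {}
    for counts in counts_by_facility.values():
        for label, count in counts.items():
            totals[label] = totals.get(label, 0) + count
    averages = {label: total / len(counts_by_facility) for label, total in totals.items()}

    def deviation(name):
        # Only rack types present in the facility contribute to its score
        return sum(abs(count - averages[label])
                   for label, count in counts_by_facility[name].items())

    ordered = sorted(counts_by_facility, key=deviation)
    sets = [[] for _ in range(n_sets)]
    for position, name in enumerate(ordered):
        sets[position % n_sets].append(name)
    return sets


toy = {
    "A": {"select_rack": 10},
    "B": {"select_rack": 30, "drive_in_rack": 4},
    "C": {"select_rack": 20},
}
print(split_into_sets(toy, 2))  # [['C', 'B'], ['A']]
```

Here the averages are 20 select racks and about 1.33 drive-in racks per facility, so the deviations are 0 for C, 10 for A, and about 12.67 for B. Dealing the sorted facilities in turn pairs the most typical facility with the least typical one. This is a heuristic and does not guarantee equal rack counts per set.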

In [15]:
create_facility_sets(facility_racks_dict)

Set 1:
  Facility 196524856938_CAN - QC, Montreal - 5757 Chemin St. Francois: {'select_rack': 58, 'suspended_rack': 1}
  Facility 153924853363_US - TX, El Paso - Railroad: {'drive_in_rack': 23, 'select_rack': 55, 'gravity_feed_rack': 1}
Set 2:
  Facility 194913221226_CAN - AB, Calgary - 5555 78th Ave SE (Calgary Foothills): {'select_rack': 56, 'suspended_rack': 1}
  Facility 142728079488_US - CA, Los Angeles - Los Palos (Big Bear): {'select_rack': 86}
Set 3:
  Facility 143039459344_US - NJ, Logan Township - Crossroads: {'select_rack': 52}
  Facility 194916756196_CAN - AB, Edmonton - 12536 62 St (Northlands): {'select_rack': 29, 'suspended_rack': 2, 'drive_in_rack': 3}
Set 4:
  Facility 196527859613_CAN - QC, Saint-Laurent - 6100 Côte de Liesse: {'select_rack': 43, 'suspended_rack': 3, 'gravity_feed_rack': 6}
  Facility 194915139195_CAN - AB, Lethbridge - 585 41st St North: {'select_rack': 10, 'drive_in_rack': 4}
Set 5:
  Facility 196527723206_CAN - ON, Brampton - 107 Walker Dr: {'selec

Print the total number of racks of each type across all facilities.

In [16]:
def calculate_total_counts(facility_dict):
    total_counts = defaultdict(int)
    
    for inner_dict in facility_dict.values():
        for obj_type, objects in inner_dict.items():
            total_counts[obj_type] += len(objects)
    
    return total_counts

In [17]:
total_counts = calculate_total_counts(facility_racks_dict)

In [18]:
# Print the total counts
for obj_type, count in total_counts.items():
    print(f"{obj_type}: {count}")

select_rack: 584
drive_in_rack: 40
gravity_feed_rack: 23
suspended_rack: 7
