# **General information**

This notebook slices the original training, validation and testing images of the general dataset into 500 x 500 sub-frames and saves them in dedicated directories. The resulting sub-frames are used throughout the creation of the detection model.

### **Step #01 - Importing relevant Python libraries**

In [5]:
import os  # used in the final step to count the image files

from utils.Subframes import Subframes, subexport

### **Step #02 - Creating directories to save the sub-frames**

Three new directories are created, one each for the training, validation and testing sub-frames.

In [6]:
!mkdir -p /general_dataset/sub_frames_500/train
!mkdir -p /general_dataset/sub_frames_500/val
!mkdir -p /general_dataset/sub_frames_500/test
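
For environments where the `!mkdir` shell magic is not available, the same directories could be created from Python. The following is a minimal sketch, not part of the original notebook: `make_split_dirs` is a hypothetical helper name, and the demonstration runs in a temporary directory rather than under `/general_dataset`.

```python
import tempfile
from pathlib import Path


def make_split_dirs(base_dir, splits=("train", "val", "test")):
    """Create one sub-directory per split under base_dir and return their paths."""
    created = []
    for split in splits:
        split_dir = Path(base_dir) / split
        # parents=True and exist_ok=True mirror the behaviour of `mkdir -p`.
        split_dir.mkdir(parents=True, exist_ok=True)
        created.append(split_dir)
    return created


# Demonstration in a temporary directory; in this notebook the base directory
# would be "/general_dataset/sub_frames_500".
demo_root = tempfile.mkdtemp()
demo_dirs = make_split_dirs(Path(demo_root) / "sub_frames_500")
print([d.name for d in demo_dirs])
```

Calling the helper a second time on the same base directory is harmless, because existing directories are left untouched.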

### **Step #03 - Defining the variables used to create the sub-frames**

The following variables need to be defined:
- Directories containing the original, unsliced training, validation and testing images;
- Annotation files for the original, unsliced training, validation and testing images;
- Output directories for the training, validation and testing sub-frames;
- The sub-frame width and height.

In [7]:
train_images_directory = "/general_dataset/train"
train_annotations_file = "/general_dataset/groundtruth/json/big_size/train_big_size_A_B_E_K_WH_WB.json"
train_subframes_directory = "/general_dataset/sub_frames_500/train"

val_images_directory = "/general_dataset/val"
val_annotations_file = "/general_dataset/groundtruth/json/big_size/val_big_size_A_B_E_K_WH_WB.json"
val_subframes_directory = "/general_dataset/sub_frames_500/val"

test_images_directory = "/general_dataset/test"
test_annotations_file = "/general_dataset/groundtruth/json/big_size/test_big_size_A_B_E_K_WH_WB.json"
test_subframes_directory = "/general_dataset/sub_frames_500/test"

subframe_width = 500
subframe_height = 500
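
The nine path variables above follow one naming pattern per split, so they could also be derived from the split name. The sketch below is only an illustration: `split_paths` and its dictionary keys are hypothetical names, and the pattern is inferred from the values shown in this cell.

```python
def split_paths(split, root="/general_dataset", subframe_size=500):
    """Build the three paths used for one split ("train", "val" or "test")."""
    return {
        # Directory holding the original, unsliced images.
        "images_directory": f"{root}/{split}",
        # Annotation file of the original, unsliced images.
        "annotations_file": (
            f"{root}/groundtruth/json/big_size/"
            f"{split}_big_size_A_B_E_K_WH_WB.json"
        ),
        # Output directory for the sub-frames.
        "subframes_directory": f"{root}/sub_frames_{subframe_size}/{split}",
    }


paths = {split: split_paths(split) for split in ("train", "val", "test")}
print(paths["val"]["annotations_file"])
```

Keeping the pattern in one place makes it harder for the three splits to drift apart when a path changes.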

### **Step #04 - Creating the sub-frames for the training dataset**

In [8]:
# Creating subframes from the images stored in the training dataset.
json_dic = subexport(
        img_dir = train_images_directory,
        img_anno_path = train_annotations_file,
        sfm_width = subframe_width,
        sfm_height = subframe_height,
        sfm_output_dir = train_subframes_directory,
        sfm_overlap = False,
        sfm_strict = True,
        print_rate = 50,
        sfm_object_only = True,
        sfm_anno_export = True
    )

--------------------------------------
Starting creation of the sub-frames...
--------------------------------------
Image [0   /928 ] done.
Image [50  /928 ] done.
Image [100 /928 ] done.
Image [150 /928 ] done.
Image [200 /928 ] done.
Image [250 /928 ] done.
Image [300 /928 ] done.
Image [350 /928 ] done.
Image [400 /928 ] done.
Image [450 /928 ] done.
Image [500 /928 ] done.
Image [550 /928 ] done.
Image [600 /928 ] done.
Image [650 /928 ] done.
Image [700 /928 ] done.
Image [750 /928 ] done.
Image [800 /928 ] done.
Image [850 /928 ] done.
Image [900 /928 ] done.
------------------------------------
Finished creation of the sub-frames.
------------------------------------
--------------------------------------
Elapsed time : 0:10:55
--------------------------------------



File 'coco_subframes.json' correctly saved at '/general_dataset/sub_frames_500/train'.
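
To make the idea of non-overlapping sub-frames concrete, the sketch below computes the pixel boxes of a regular grid of tiles. It is not the implementation used by `subexport`: `tile_boxes` is a hypothetical helper, the image size is made up, and edge tiles are simply clipped to the image border here, whereas the way `sfm_strict = True` treats the borders is not shown in this notebook.

```python
def tile_boxes(image_width, image_height, tile_width, tile_height):
    """Return (x_min, y_min, x_max, y_max) boxes covering the image without overlap."""
    boxes = []
    for y in range(0, image_height, tile_height):
        for x in range(0, image_width, tile_width):
            # Assumption of this sketch: tiles on the right and bottom edges
            # are clipped so that they never extend past the image.
            x_max = min(x + tile_width, image_width)
            y_max = min(y + tile_height, image_height)
            boxes.append((x, y, x_max, y_max))
    return boxes


# Hypothetical 1200 x 1000 image sliced into 500 x 500 tiles.
boxes = tile_boxes(1200, 1000, 500, 500)
print(len(boxes))  # 3 columns x 2 rows
print(boxes[2])    # right-most tile of the first row
```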



### **Step #05 - Creating the sub-frames for the validation dataset**

In [9]:
# Creating subframes from the images stored in the validation dataset.
json_dic = subexport(
        img_dir = val_images_directory,
        img_anno_path = val_annotations_file,
        sfm_width = subframe_width,
        sfm_height = subframe_height,
        sfm_output_dir = val_subframes_directory,
        sfm_overlap = False,
        sfm_strict = True,
        print_rate = 50,
        sfm_object_only = False,
        sfm_anno_export = True
    )

--------------------------------------
Starting creation of the sub-frames...
--------------------------------------
Image [0   /111 ] done.
Image [50  /111 ] done.
Image [100 /111 ] done.
------------------------------------
Finished creation of the sub-frames.
------------------------------------
--------------------------------------
Elapsed time : 0:03:24
--------------------------------------

File 'coco_subframes.json' correctly saved at '/general_dataset/sub_frames_500/val'.
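
A quick way to inspect an exported annotation file is to count its entries. The sketch below assumes that `coco_subframes.json` follows the usual COCO layout with top-level `images`, `annotations` and `categories` lists, which the file name suggests but the notebook does not confirm. `summarize_coco` is a hypothetical helper, and the demonstration uses a tiny stand-in file with made-up entries.

```python
import json
import os
import tempfile


def summarize_coco(json_path):
    """Count the entries of a COCO-style annotation file."""
    with open(json_path) as handle:
        coco = json.load(handle)
    # Missing keys are counted as empty lists rather than raising an error.
    return {
        "images": len(coco.get("images", [])),
        "annotations": len(coco.get("annotations", [])),
        "categories": len(coco.get("categories", [])),
    }


# Stand-in file with made-up content, written to a temporary directory.
demo = {
    "images": [{"id": 1, "file_name": "a.JPG"}, {"id": 2, "file_name": "b.JPG"}],
    "annotations": [{"id": 1, "image_id": 1, "category_id": 1, "bbox": [10, 20, 30, 40]}],
    "categories": [{"id": 1, "name": "A"}],
}
demo_path = os.path.join(tempfile.mkdtemp(), "coco_subframes.json")
with open(demo_path, "w") as handle:
    json.dump(demo, handle)

summary = summarize_coco(demo_path)
print(summary)
```

In this notebook the function would be pointed at the file saved in `val_subframes_directory`.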



### **Step #06 - Creating the sub-frames for the testing dataset**

In [10]:
# Creating subframes from the images stored in the testing dataset.
json_dic = subexport(
        img_dir = test_images_directory,
        img_anno_path = test_annotations_file,
        sfm_width = subframe_width,
        sfm_height = subframe_height,
        sfm_output_dir = test_subframes_directory,
        sfm_overlap = False,
        sfm_strict = True,
        print_rate = 50,
        sfm_object_only = False,
        sfm_anno_export = True
    )

--------------------------------------
Starting creation of the sub-frames...
--------------------------------------
Image [0   /258 ] done.
Image [50  /258 ] done.
Image [100 /258 ] done.
Image [150 /258 ] done.
Image [200 /258 ] done.
Image [250 /258 ] done.
------------------------------------
Finished creation of the sub-frames.
------------------------------------
--------------------------------------
Elapsed time : 0:07:30
--------------------------------------

File 'coco_subframes.json' correctly saved at '/general_dataset/sub_frames_500/test'.
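
The three `subexport` calls differ only in their paths and in `sfm_object_only`, so they could be driven from a small table of settings. The sketch below is one possible arrangement, not the notebook's method: `export_all_splits` and `fake_export` are hypothetical names, and the export function is passed in as an argument so that the example runs without the `utils.Subframes` module. The keyword names and values are copied from the calls above.

```python
def export_all_splits(export_fn, split_settings, width=500, height=500):
    """Call export_fn once per split with the keyword arguments used in this notebook."""
    results = {}
    for split, settings in split_settings.items():
        results[split] = export_fn(
            img_dir=settings["img_dir"],
            img_anno_path=settings["img_anno_path"],
            sfm_width=width,
            sfm_height=height,
            sfm_output_dir=settings["sfm_output_dir"],
            sfm_overlap=False,
            sfm_strict=True,
            print_rate=50,
            sfm_object_only=settings["sfm_object_only"],
            sfm_anno_export=True,
        )
    return results


def fake_export(**kwargs):
    # Stand-in for subexport, used only in this demonstration:
    # it returns the keyword arguments it received.
    return kwargs


root = "/general_dataset"
split_settings = {
    split: {
        "img_dir": f"{root}/{split}",
        "img_anno_path": f"{root}/groundtruth/json/big_size/{split}_big_size_A_B_E_K_WH_WB.json",
        "sfm_output_dir": f"{root}/sub_frames_500/{split}",
        # Only the training call above sets sfm_object_only to True.
        "sfm_object_only": split == "train",
    }
    for split in ("train", "val", "test")
}

calls = export_all_splits(fake_export, split_settings)
print(calls["train"]["sfm_object_only"], calls["val"]["sfm_object_only"])
```

In the notebook, `subexport` would be passed in place of `fake_export`.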



### **Step #07 - Validating the sub-frames created for the training, validation and testing datasets**

In [11]:
def count_jpg_files(directory):
    # Count the files in `directory` whose name ends with ".JPG" (case-sensitive).
    return sum(1 for file in os.listdir(directory) if file.endswith(".JPG"))

# Number of original images and of generated sub-frames for each dataset.
validation_dict = {
    "train_images_count": count_jpg_files(train_images_directory),
    "train_subframes_count": count_jpg_files(train_subframes_directory),
    "val_images_count": count_jpg_files(val_images_directory),
    "val_subframes_count": count_jpg_files(val_subframes_directory),
    "test_images_count": count_jpg_files(test_images_directory),
    "test_subframes_count": count_jpg_files(test_subframes_directory)
}

print("Results of the validation phase:\n"
      "   - With {} images in the training dataset...\n"
      "         {} subframes were created.\n\n"
      "   - With {} images in the validation dataset...\n"
      "         {} subframes were created.\n\n"
      "   - With {} images in the testing dataset...\n"
      "         {} subframes were created."
      .format(validation_dict["train_images_count"], validation_dict["train_subframes_count"], validation_dict["val_images_count"], 
      validation_dict["val_subframes_count"], validation_dict["test_images_count"], validation_dict["test_subframes_count"]))

Results of the validation phase:
   - With 928 images in the training dataset...
         4029 subframes were created.

   - With 111 images in the validation dataset...
         9781 subframes were created.

   - With 258 images in the testing dataset...
         22266 subframes were created.
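
The counts above are easier to compare as sub-frames per original image. The short calculation below uses only the numbers printed by the validation cell. The much lower ratio for the training dataset is consistent with the training call being the only one that sets `sfm_object_only = True`; the parameter name suggests that sub-frames without annotated objects are discarded, but that reading is an assumption, since the behaviour of `subexport` is not documented in this notebook.

```python
# (original images, sub-frames) as reported by the validation cell.
counts = {
    "train": (928, 4029),
    "val": (111, 9781),
    "test": (258, 22266),
}

ratios = {split: subframes / images for split, (images, subframes) in counts.items()}

for split, ratio in ratios.items():
    print(f"{split}: {ratio:.1f} sub-frames per image")
# train: 4.3 sub-frames per image
# val: 88.1 sub-frames per image
# test: 86.3 sub-frames per image
```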
