### Steps taken to verify that the starting "reference" point for the annotations in the LabelMe-generated JSON files is the top left corner.

1) Opened the .json and .png files for a spectrogram that contained a boat annotation that was approximately the width of the full spectrogram. 
    - Image file name: 20190206T083004-File-22_20Hz.png (attached in the directory where this notebook is saved)
    - Annotation JSON file name: 20190206T083004-File-22.json
2) Determined whether the smallest x value is ~0. If the smallest x value is clearly larger than 0, this confirms the hypothesis that the "reference" point is the top left corner of the image, because the spectrogram itself does not start at the left edge of the image.
    - LabelMe rectangle annotations are saved in the format ([x1,y1], [x2,y2])
    - Smallest x value is ~318.57, confirming the hypothesis.
    - Excerpt from json annotation file: 
            {
            "label": "boat",
            "points": [
                [
                318.57142857142856,
                526.7619047619048
                ],
                [
                2163.809523809524,
                1057.7142857142858
                ]
            ],
            "group_id": null,
            "shape_type": "rectangle",
            "flags": {}
            }
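
The check in step 2 can be sketched in a few lines of Python. This is a minimal sketch, assuming the annotation file follows the usual LabelMe layout with a top-level `shapes` list; the inline `sample` string reproduces only the excerpt above, and the helper name `smallest_x` is hypothetical.

```python
import json

# Inline copy of the excerpt above, wrapped in the top-level "shapes" list
# that LabelMe files are assumed to use.
sample = """
{
  "shapes": [
    {
      "label": "boat",
      "points": [
        [318.57142857142856, 526.7619047619048],
        [2163.809523809524, 1057.7142857142858]
      ],
      "group_id": null,
      "shape_type": "rectangle",
      "flags": {}
    }
  ]
}
"""


def smallest_x(annotation, label="boat"):
    """Return the smallest x coordinate over all shapes with the given label."""
    xs = [
        point[0]
        for shape in annotation["shapes"]
        if shape["label"] == label
        for point in shape["points"]
    ]
    return min(xs)


annotation = json.loads(sample)
min_x = smallest_x(annotation)
print(f"Smallest x value: {min_x:.2f}")
```

To run the same check against the real annotation file, load it with `json.load` instead of parsing the inline string.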


### Steps taken to determine approximate location for left and right edges of the spectrogram within the image file.

1) Annotated 10 images in Roboflow with a bounding box that exactly outlines the border of the spectrogram within the image.
    - Images used were from the images that Chris and Mitchell used when they created the annotations 1-2 years ago.
    - Labeled as "test-spect-loc" in the msds-capstone-test Roboflow project
2) Created a version of the dataset in Roboflow that only included the "test-spect-loc" annotations
3) Exported the annotations from Roboflow in YOLOv8 format to a zip folder
4) Extracted all the .txt files and filtered out the empty .txt files
5) Based on the YOLOv8 .txt format, determined average, min, and max values for the pixel location of the left and right edges of the spectrogram


Results (in pixels):
    avg_left_edges: 309.532, min_left_edges: 305.99499999999983, max_left_edges: 313.9949999999999
    avg_right_edges: 2173.266, min_right_edges: 2166.8999999999996, max_right_edges: 2176.38



In [1]:
!pip install Pillow



In [27]:
from PIL import Image

# Path to image created by Chris when Chris & Mitchell created annotations 1-2 years ago
image_path = '/Users/debbiesubocz/capstone/rf_upload_labeled_cb_iter3/20160113T133004-File-25_20Hz.png'

# Open the image
image = Image.open(image_path)

# Get the size of the image
img_width, img_height = image.size

# Print the size
print(f"The image size is {img_width}x{img_height} pixels.")


The image size is 2400x1200 pixels.


#### Verified that the 2400x1200 pixel size for the images was consistent across the following images:
 - Old Fred Olsen
 - New Fred Olsen
 - Azura
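
A size check like this can be repeated over a whole folder of images. The following is a sketch rather than the code used for the verification above: it swaps Pillow for a direct read of the PNG header (standard library only), and the helper names `png_size` and `sizes_by_count` are hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(path):
    """Read (width, height) from the IHDR chunk at the start of a PNG file."""
    with open(path, "rb") as f:
        header = f.read(24)
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError(f"{path} does not look like a PNG file")
    # Width and height are two big-endian 32-bit integers right after "IHDR"
    width, height = struct.unpack(">II", header[16:24])
    return width, height


def sizes_by_count(paths):
    """Map each distinct (width, height) to the number of files with that size."""
    counts = {}
    for path in paths:
        size = png_size(path)
        counts[size] = counts.get(size, 0) + 1
    return counts
```

If every image has the same dimensions, `sizes_by_count` returns a dictionary with a single key such as `(2400, 1200)`.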

In [4]:
# Path to where roboflow export was saved
path_to_export = "/Users/debbiesubocz/Downloads/msds-capstone-test.v3i.yolov8/train/labels"

#### Find all the .txt files that contain the annotations added in Roboflow around the border of the spectrogram

In [26]:
import os

# Specify the directory containing the .txt files
directory = path_to_export

# Initialize lists to hold the contents and names of the non-empty text files
non_empty_files_contents = []
non_empty_files = []

# Loop through each file in the directory
for filename in os.listdir(directory):
    # Check if the file is a .txt file
    if filename.endswith('.txt'):
        # Construct the full file path
        file_path = os.path.join(directory, filename)
        
        # Open and read the file
        with open(file_path, 'r') as file:
            content = file.read().strip()
            
            # If the file is not empty, record its content and file name
            if content:
                non_empty_files_contents.append(content)
                non_empty_files.append(filename)

# At this point, non_empty_files_contents contains the contents of all non-empty .txt files in the directory
print("Contents of non-empty text files:", non_empty_files_contents)
print("File names: ", non_empty_files)


Contents of non-empty text files: ['0 0.517375 0.4812166666666667 0.7780833333333333 0.8157749999999999', '0 0.5170041666666666 0.4814833333333333 0.7756791666666667 0.8196416666666667', '0 0.5179958333333333 0.4824583333333334 0.7776583333333333 0.8215749999999999', '0 0.5158833333333332 0.48291666666666666 0.7739833333333334 0.8258416666666667', '0 0.5173666666666666 0.4809333333333333 0.7764 0.8185416666666666', '0 0.5176625 0.48145 0.7769958333333333 0.8162416666666666', '0 0.5180875 0.48060833333333336 0.776175 0.821225\n0 0.5176708333333334 0.48306666666666664 0.7770041666666666 0.8186583333333334', '0 0.5180458333333333 0.48199166666666665 0.7744291666666667 0.8173166666666667', '0 0.516625 0.48224166666666674 0.77825 0.8244833333333333', '0 0.51645 0.48268333333333335 0.7779041666666667 0.8220333333333334']
File names:  ['20190122T053004-File-17_png.rf.29703389148b636251eeb6867e90e5ea.txt', '20190120T083004-File-16_png.rf.07cc0a37eef72b3ab402b72fb1e64e78.txt', '20181231T040004-

#### Use the coordinates in the .txt files to determine average, min, and max values for the right and left edges of the spectrogram.

Note: YOLO coordinates are normalized to the image dimensions (values between 0 and 1). Each line has the format: (class label, x coordinate of the bounding box center, y coordinate of the bounding box center, width of the bounding box, height of the bounding box)
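
As an illustration of the format, the sketch below converts a single YOLO annotation line into pixel coordinates. The function name `yolo_line_to_pixel_box` is hypothetical; the sample line is the first annotation from the export printed above, and the 2400x1200 image size is the one measured earlier in this notebook.

```python
def yolo_line_to_pixel_box(line, img_width, img_height):
    """Convert one YOLO annotation line to (left, top, right, bottom) in pixels."""
    _label, x_center, y_center, width, height = line.split()
    x_center, y_center = float(x_center), float(y_center)
    width, height = float(width), float(height)

    # The normalized center and size give the edges; scaling by the image
    # dimensions turns them into pixel positions.
    left = (x_center - width / 2) * img_width
    right = (x_center + width / 2) * img_width
    top = (y_center - height / 2) * img_height
    bottom = (y_center + height / 2) * img_height
    return left, top, right, bottom


# First annotation from the export above, on a 2400x1200 image
line = "0 0.517375 0.4812166666666667 0.7780833333333333 0.8157749999999999"
left, top, right, bottom = yolo_line_to_pixel_box(line, 2400, 1200)
print(f"left={left:.1f}, top={top:.1f}, right={right:.1f}, bottom={bottom:.1f}")
```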

In [25]:
left_edges = []
right_edges = []
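for ne_file in non_empty_files_contents:
    # Split on whitespace: [class, x_center, y_center, width, height, ...]
    # Note: only the first annotation line of each file is used,
    # even if the file contains more than one annotation
    ne_file_splt = ne_file.split()
    x_center = float(ne_file_splt[1])
    width = float(ne_file_splt[3])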

    x_left_edge_spectrogram = x_center - (width/2)
    x_right_edge_spectrogam = x_center + (width/2)

    left_edges.append(x_left_edge_spectrogram)
    right_edges.append(x_right_edge_spectrogam)

    print(f'left-edge-spect: {x_left_edge_spectrogram}, x_right_edge_spectrogam: {x_right_edge_spectrogam}')

avg_left_edges = (sum(left_edges) / len(left_edges)) * img_width
avg_right_edges = (sum(right_edges) / len(right_edges)) * img_width
max_left_edges = max(left_edges) * img_width
min_left_edges = min(left_edges) * img_width
max_right_edges = max(right_edges) * img_width
min_right_edges = min(right_edges) * img_width

print(f'\navg_left_edges: {avg_left_edges}, min_left_edges: {min_left_edges}, max_left_edges: {max_left_edges}')
print(f'avg_right_edges: {avg_right_edges}, min_right_edges: {min_right_edges}, max_right_edges: {max_right_edges}')

left-edge-spect: 0.12833333333333335, x_right_edge_spectrogam: 0.9064166666666666
left-edge-spect: 0.12916458333333325, x_right_edge_spectrogam: 0.90484375
left-edge-spect: 0.12916666666666665, x_right_edge_spectrogam: 0.906825
left-edge-spect: 0.12889166666666657, x_right_edge_spectrogam: 0.9028749999999999
left-edge-spect: 0.12916666666666665, x_right_edge_spectrogam: 0.9055666666666666
left-edge-spect: 0.12916458333333336, x_right_edge_spectrogam: 0.9061604166666667
left-edge-spect: 0.13000000000000006, x_right_edge_spectrogam: 0.906175
left-edge-spect: 0.13083124999999995, x_right_edge_spectrogam: 0.9052604166666667
left-edge-spect: 0.1275, x_right_edge_spectrogam: 0.90575
left-edge-spect: 0.1274979166666666, x_right_edge_spectrogam: 0.9054020833333334

avg_left_edges: 309.532, min_left_edges: 305.99499999999983, max_left_edges: 313.9949999999999
avg_right_edges: 2173.266, min_right_edges: 2166.8999999999996, max_right_edges: 2176.38
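
One of the label files printed above holds two annotation lines, so a variant that reads every line of every file may be useful. The following is a minimal sketch, not the code that produced the results above: the function name `edge_stats` is hypothetical, and `sample_contents` holds only two of the file contents copied from the earlier output.

```python
def edge_stats(file_contents, img_width):
    """Summarize left/right box edges (in pixels) over every annotation line."""
    left_edges, right_edges = [], []
    for content in file_contents:
        for line in content.splitlines():
            fields = line.split()
            if len(fields) != 5:
                continue  # skip blank or malformed lines
            x_center, width = float(fields[1]), float(fields[3])
            left_edges.append((x_center - width / 2) * img_width)
            right_edges.append((x_center + width / 2) * img_width)

    if not left_edges:
        raise ValueError("no annotation lines found")

    def summarize(values):
        return {"avg": sum(values) / len(values), "min": min(values), "max": max(values)}

    return {
        "count": len(left_edges),
        "left": summarize(left_edges),
        "right": summarize(right_edges),
    }


sample_contents = [
    "0 0.517375 0.4812166666666667 0.7780833333333333 0.8157749999999999",
    "0 0.5180875 0.48060833333333336 0.776175 0.821225\n"
    "0 0.5176708333333334 0.48306666666666664 0.7770041666666666 0.8186583333333334",
]
stats = edge_stats(sample_contents, img_width=2400)
print(stats["count"], stats["left"], stats["right"])
```

Passing the full `non_empty_files_contents` list instead of `sample_contents` would include the second annotation from the two-line file in the statistics.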
