# Image Resizing of the Breast Ultrasound Image Dataset

Here, we resize the images to a uniform resolution. The images in the original dataset vary in size, averaging about 500 x 500 pixels. Because each pixel acts as a feature and most algorithms require a fixed number of features per sample, we create datasets with uniform resolution. The primary dataset has a resolution of 512 x 512. Additionally, we create downsampled datasets at 256 x 256, 128 x 128, and 64 x 64 to test our algorithms.

Note that image augmentation (flipping along the Y axis) was performed before this resizing step, so the benign and malignant classes are nearly balanced (437 vs. 420 images, respectively).
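The augmentation step is handled elsewhere (the `flip_imgs` helper is imported from `utils` but not shown here). As a rough illustration only, flipping along the Y axis mirrors the image left to right; the hypothetical `flip_horizontal` function below does this with NumPy slicing, and `cv2.flip(img, 1)` gives the same result. The balance figure uses the class counts reported in this notebook.

```python
import numpy as np


def flip_horizontal(img: np.ndarray) -> np.ndarray:
    """Mirror an image about its vertical (Y) axis so left and right swap.

    Hypothetical helper for illustration; not the notebook's flip_imgs.
    """
    # Reverse the column order, then copy into a contiguous array.
    return np.ascontiguousarray(img[:, ::-1])


toy = np.array([[1, 2, 3],
                [4, 5, 6]], dtype=np.uint8)
flipped = flip_horizontal(toy)
print(flipped)

# Class balance after augmentation, using the counts reported in this notebook.
num_benign, num_malignant = 437, 420
benign_share = 100 * num_benign / (num_benign + num_malignant)
print(f"{benign_share:.2f}% benign")
```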

### Import packages

In [1]:
import os
import re
import random
from pathlib import Path
from pprint import pprint

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import cv2

from utils import init_img_dict, get_file_dicts, filter_files, find_mask, print_ndarray_info
from utils import img_read, img_write, img_resize, img_flip, comp_fft, histogram_equalization
from utils import display_img, display_img_list_3, display_3_imgs, display_3_hist, resize_imgs, flip_imgs, append_img_data

### Create image list (with dict object for each image) for the 3 classes

In [2]:
normal_img_dir = './Dataset_BUSI_with_GT/normal'
benign_img_dir = './Dataset_BUSI_with_GT/benign'
malignant_img_dir = './Dataset_BUSI_with_GT/malignant'

# Get a list of images in the images directory
normal_img_list = get_file_dicts(normal_img_dir)
num_normal_img = len(normal_img_list)
print(f"Number of images in normal dataset: {num_normal_img}")

benign_img_list = get_file_dicts(benign_img_dir)
num_benign_img = len(benign_img_list)
print(f"Number of images in benign dataset: {num_benign_img}")

malignant_img_list = get_file_dicts(malignant_img_dir)
num_malignant_img = len(malignant_img_list)
print(f"Number of images in malignant dataset: {num_malignant_img}")

# We will not consider normal images for our analysis.
# Malignant is considered positive and Benign is considered negative
num_total_img = num_benign_img + num_malignant_img

print()
print(f"% of benign images (negative) in the dataset: {100*num_benign_img/num_total_img:0.2f}% ")
print(f"% of malignant images (positive) in the dataset: {100*num_malignant_img/num_total_img:0.2f}% ")
print(f"Total number of images (positive + negative) in the dataset: {num_total_img} \n")

Number of images in normal dataset: 133
Number of images in benign dataset: 437
Number of images in malignant dataset: 420

% of benign images (negative) in the dataset: 50.99% 
% of malignant images (positive) in the dataset: 49.01% 
Total number of images (positive + negative) in the dataset: 857 



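### Resize the images of all 3 classes

In [3]:
# Resize the entire dataset (3 classes)
img_size = 64

# Output directory names carry the target resolution as a suffix
normal_img_out_dir = f'./Dataset_BUSI_with_GT/normal_{img_size}'
benign_img_out_dir = f'./Dataset_BUSI_with_GT/benign_{img_size}'
malignant_img_out_dir = f'./Dataset_BUSI_with_GT/malignant_{img_size}'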

resize_imgs(normal_img_list, normal_img_out_dir, img_size)
resize_imgs(benign_img_list, benign_img_out_dir, img_size)
resize_imgs(malignant_img_list, malignant_img_out_dir, img_size)

print("Completed")

Number of images resized to 64 resolution: 133
Number of images resized to 64 resolution: 437
Number of images resized to 64 resolution: 420
Completed
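The cell above produces only the 64 x 64 dataset. One way the other resolutions mentioned in the introduction could be generated in a single pass is to build the output directory for every class and size up front and loop over them. This is a sketch under assumptions: `build_output_dirs` is a hypothetical helper, the `<class>_<size>` naming simply follows the directories used above, and the commented-out call assumes `resize_imgs` accepts the same arguments as in the previous cell.

```python
from pathlib import Path


def build_output_dirs(base_dir, classes, sizes):
    """Map each (class, size) pair to an output directory named <class>_<size>."""
    return {
        (cls, size): Path(base_dir) / f"{cls}_{size}"
        for size in sizes
        for cls in classes
    }


out_dirs = build_output_dirs(
    "./Dataset_BUSI_with_GT",
    classes=["normal", "benign", "malignant"],
    sizes=[512, 256, 128, 64],
)

for (cls, size), out_dir in out_dirs.items():
    print(f"{cls:>9} @ {size:>3} -> {out_dir.as_posix()}")
    # In the notebook, this is where the resize call would go, e.g.:
    # resize_imgs(img_lists[cls], str(out_dir), size)  # img_lists is hypothetical
```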
