# Breast cancer detection from thermal imaging

The main purpose of this project is to develop a comprehensive decision support system for breast cancer screening.

## Library import
This section imports the libraries used throughout the notebook. Note that some of the libraries used by this program are declared in the files found in `src/scripts/*.py`.

In [None]:
# Modules are reloaded automatically before any code in this notebook is executed
%reload_ext autoreload
%autoreload 2

In [None]:
from scripts import *

In [None]:
computer.check_available_devices(ignore=True) # Check available devices

## Data selection
For the model to work correctly, the images must first be extracted and saved in the `data` folder.

This folder contains two labeled subfolders that hold all the images to be used:
```
data
├── healthy
└── sick
```

In [None]:
random_state = 42

In [None]:
data = Data("./data/") # Data imported into a table

data.images.head(3) # Display first 3 rows
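
The internals of the `Data` class are not shown in this notebook. As a rough illustration of how a labeled folder layout like the one above can be turned into a table, the sketch below walks the label folders and builds one record per file. The function name `list_labeled_images` and the file names are hypothetical, and the demo uses a temporary folder rather than the real `data` folder.

```python
import tempfile
from pathlib import Path


def list_labeled_images(root):
    """Return one {"path", "label"} record per file in each label folder under root."""
    records = []
    for label_dir in sorted(Path(root).iterdir()):
        if not label_dir.is_dir():
            continue
        for image_path in sorted(label_dir.iterdir()):
            if image_path.is_file():
                # The folder name (e.g. "healthy" or "sick") is used as the label
                records.append({"path": str(image_path), "label": label_dir.name})
    return records


# Demo on a throwaway folder that mimics the healthy/sick layout (hypothetical files)
with tempfile.TemporaryDirectory() as tmp:
    for label, name in [("healthy", "a.jpg"), ("sick", "b.jpg")]:
        folder = Path(tmp) / label
        folder.mkdir()
        (folder / name).touch()
    records = list_labeled_images(tmp)
```

A list of records like this can be passed directly to `pandas.DataFrame` to obtain a table with `path` and `label` columns.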

## Transformation
In the transformation stage, the data is adapted to suit the problem being solved.

First, the data loaded above is split into a training set, used to fit the models, and a test set, used to check the results.

In [None]:
data.training, data.test = data.train_test_split(test_size=0.15, random_state=random_state, stratify=True) # Split data into train and test
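
How `data.train_test_split` works internally is not shown here. A minimal sketch of a stratified 85/15 split, assuming it behaves like scikit-learn's `train_test_split` with the `stratify` argument, could look as follows. The label array is made up for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical labels: 140 healthy (0) and 60 sick (1) samples
labels = np.array([0] * 140 + [1] * 60)
indices = np.arange(len(labels))

# Stratifying on the labels keeps the healthy/sick ratio similar in both subsets
train_idx, test_idx = train_test_split(
    indices, test_size=0.15, random_state=42, stratify=labels
)

train_sick_ratio = labels[train_idx].mean()
test_sick_ratio = labels[test_idx].mean()
```

With stratification, both `train_sick_ratio` and `test_sick_ratio` stay close to the overall proportion of sick samples, which matters when one class is much smaller than the other.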

In [None]:
# The category distribution is shown for the original, training, and test data
data.count_labels(data.images, "Original")
data.count_labels(data.training, "Training")
data.count_labels(data.test, "Test")

### Creation of generators
Once the data is split, transformation techniques are applied to it to expand the dataset in real time while the model is training. To validate the models properly, the training and validation data is divided into k consecutive folds, while the test dataset remains fixed.

In [None]:
training_validation_generator = data.training_validation_generator(n_splits=5, random_state=random_state) # Generate training and validation generators
test_generator = data.test_generator() # Generate test generator
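
The fold logic inside `data.training_validation_generator` is not shown in this notebook. The sketch below illustrates the general idea of splitting the training data into five train/validation pairs, assuming a stratified and shuffled split with scikit-learn's `StratifiedKFold`. The labels are made up, and the real-time augmentation step is not covered.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical training labels: 60 healthy (0) and 40 sick (1) samples
labels = np.array([0] * 60 + [1] * 40)
samples = np.arange(len(labels))

# Assumption: folds are shuffled with a fixed seed and stratified by label
splitter = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

# Each entry is a (training indices, validation indices) pair, one per fold
folds = [(train_idx, val_idx) for train_idx, val_idx in splitter.split(samples, labels)]
```

Every sample appears in exactly one validation fold, so `folds[index][0]` and `folds[index][1]` play the same roles as the training and validation generators indexed later in the notebook.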

### Filter creation
Once the necessary generators have been created, the filters that each model will apply to its input images during training are defined.

In [None]:
# TODO: Apply grid search
filters = {
	"original": lambda x: x,
	"high": lambda x: data.get_image_tensor(x, (330, 0, 0), (360, 255, 255)) + data.get_image_tensor(x, (0, 0, 0), (60, 255, 255)),
	"medium": lambda x: data.get_image_tensor(x, (60, 0, 0), (130, 255, 255)),
	"low": lambda x: data.get_image_tensor(x, (130, 0, 0), (330, 255, 255))
}

data.show_images(training_validation_generator[0][0], filters, size=3, name="Training") # Show some images from the training generator
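
The behaviour of `data.get_image_tensor` is not shown here. Judging by its arguments, it appears to keep the pixels whose HSV values fall between a lower and an upper bound. The sketch below illustrates that idea for the hue channel only, using the standard library `colorsys` module and NumPy. The function name `hue_band_mask` and the three-pixel test image are hypothetical.

```python
import colorsys

import numpy as np


def hue_band_mask(rgb_image, hue_min, hue_max):
    """Zero out every pixel whose hue (in degrees, 0-360) lies outside [hue_min, hue_max]."""
    pixels = rgb_image.reshape(-1, 3) / 255.0
    # colorsys returns the hue in [0, 1); scale it to degrees
    hues = np.array([colorsys.rgb_to_hsv(*pixel)[0] * 360.0 for pixel in pixels])
    keep = ((hues >= hue_min) & (hues <= hue_max)).reshape(rgb_image.shape[:2])
    return rgb_image * keep[..., None]


# Hypothetical 1x3 image: one pure red, one pure green, and one pure blue pixel
image = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255]]], dtype=np.uint8)

# Same hue bands as the filters above: red wraps around 0, so it needs two ranges
high = hue_band_mask(image, 330, 360) + hue_band_mask(image, 0, 60)
medium = hue_band_mask(image, 60, 130)
low = hue_band_mask(image, 130, 330)
```

In a typical thermal color palette, warm hues (reds and yellows) mark the hottest regions and cool hues (blues) the coldest, which is why the "high" filter combines the two ranges on either side of 0 degrees.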

## Data mining
This section applies techniques capable of extracting useful patterns from the data and then evaluates them.

### Model creation
The models used throughout the project are created here. Three types of models are used, one per temperature filter (high, medium, and low), with five models of each type.

Keep in mind that the number of models of each type depends on the number of folds, that is, on the number of generators available.

In [None]:
high_model = [Model(path=f"./output/state_{random_state}/high/fold_{index}", filter=filters["high"]) for index in range(len(training_validation_generator))] # High temperature models creation
medium_model = [Model(path=f"./output/state_{random_state}/medium/fold_{index}", filter=filters["medium"]) for index in range(len(training_validation_generator))] # Medium temperature models creation
low_model = [Model(path=f"./output/state_{random_state}/low/fold_{index}", filter=filters["low"]) for index in range(len(training_validation_generator))] # Low temperature models creation

In [None]:
[high_model[index].compile() for index in range(len(high_model))] # Compile the high temperature models
[medium_model[index].compile() for index in range(len(medium_model))] # Compile the medium temperature models
[low_model[index].compile() for index in range(len(low_model))] # Compile the low temperature models

### Model training
The created models are trained for the specified number of epochs.

In [None]:
[high_model[index].fit(training_validation_generator[index][0], training_validation_generator[index][1], epochs=600, verbose=False) for index in range(len(high_model))] # Train the high temperature models
[medium_model[index].fit(training_validation_generator[index][0], training_validation_generator[index][1], epochs=600, verbose=False) for index in range(len(medium_model))] # Train the medium temperature models
[low_model[index].fit(training_validation_generator[index][0], training_validation_generator[index][1], epochs=600, verbose=False) for index in range(len(low_model))] # Train the low temperature models

### Model evaluation
The trained models are evaluated using the generators created earlier. In this case, the best weights obtained during training are used.
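
How the `Model` class decides which weights are "best" is not shown in this notebook. A common criterion, assumed here, is to keep the weights from the epoch with the lowest validation loss. The sketch below shows only that selection step. The function name `best_epoch` and the loss history are made up.

```python
def best_epoch(validation_losses):
    """Return (epoch index, loss) for the lowest validation loss; ties keep the earliest epoch."""
    index = min(range(len(validation_losses)), key=lambda i: validation_losses[i])
    return index, validation_losses[index]


# Hypothetical validation loss recorded after each of five epochs
history = [0.71, 0.52, 0.48, 0.55, 0.60]
epoch, value = best_epoch(history)
```

Here the loss starts rising again after the third epoch, so the weights saved at index 2 would be the ones restored for evaluation.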

In [None]:
# Evaluate the high temperature model
for index in range(len(high_model)):
	high_model[index].evaluate(training_validation_generator[index][0], title="train_generator", path=None)
	high_model[index].evaluate(training_validation_generator[index][1], title="validation_generator", path=None)
	high_model[index].evaluate(test_generator, title="test_generator", path=None)

# Evaluate the medium temperature model
for index in range(len(medium_model)):
	medium_model[index].evaluate(training_validation_generator[index][0], title="train_generator", path=None)
	medium_model[index].evaluate(training_validation_generator[index][1], title="validation_generator", path=None)
	medium_model[index].evaluate(test_generator, title="test_generator", path=None)

# Evaluate the low temperature model
for index in range(len(low_model)):
	low_model[index].evaluate(training_validation_generator[index][0], title="train_generator", path=None)
	low_model[index].evaluate(training_validation_generator[index][1], title="validation_generator", path=None)
	low_model[index].evaluate(test_generator, title="test_generator", path=None)

### Obtaining the weighted average

The three models trained above are combined, using differential evolution to find the optimal distribution of weights for future predictions.

In [None]:
join_models = [Join(high_model[index], medium_model[index], low_model[index], path=f"./output/state_{random_state}/join/fold_{index}") for index in range(len(high_model))] # Models are joined

In [None]:
[join_models[index].get_weighted_average(test_generator, iterations=100, tolerance=1e-7) for index in range(len(high_model))] # Compute the weighted average
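
The implementation of `Join.get_weighted_average` is not shown here. The sketch below illustrates the general technique with SciPy's `differential_evolution`: it searches for three non-negative weights that minimize the binary cross-entropy of the weighted average of the model outputs. The prediction values and targets are made up, and mapping `iterations` and `tolerance` to SciPy's `maxiter` and `tol` is an assumption.

```python
import numpy as np
from scipy.optimize import differential_evolution

# Hypothetical "sick" probabilities predicted by each model for six images
high = np.array([0.9, 0.8, 0.3, 0.2, 0.7, 0.1])
medium = np.array([0.6, 0.4, 0.5, 0.6, 0.5, 0.4])
low = np.array([0.5, 0.5, 0.5, 0.5, 0.5, 0.5])
targets = np.array([1, 1, 0, 0, 1, 0])
predictions = np.stack([high, medium, low])


def combine(weights):
    """Weighted average of the three model outputs, with weights normalized to sum to 1."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / (weights.sum() + 1e-12)
    return weights @ predictions


def loss(weights):
    """Binary cross-entropy between the combined prediction and the targets."""
    probs = np.clip(combine(weights), 1e-7, 1 - 1e-7)
    return -np.mean(targets * np.log(probs) + (1 - targets) * np.log(1 - probs))


result = differential_evolution(
    loss, bounds=[(0, 1)] * 3, maxiter=100, tol=1e-7, seed=42
)
best_weights = result.x / result.x.sum()
```

Because the loss depends only on the normalized weights, the raw solution `result.x` is rescaled at the end so that `best_weights` sums to 1.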

In [None]:
[join_models[index].evaluate(test_generator, title="test_generator") for index in range(len(high_model))] # Evaluate the weighted average

### Grad-CAM
For each test image, an activation map is displayed that highlights the regions that contributed most to the prediction made by the convolutional network.

In [None]:
# The activation map is displayed
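for index, image in data.test.iterrows():
	# join_models is a list with one joined model per fold, so each one is called in turn
	for join_model in join_models:
		join_model.visualize_heatmap(image)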
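
The implementation of `visualize_heatmap` is not shown in this notebook. The core Grad-CAM computation can be sketched with NumPy alone, assuming the activations of the last convolutional layer and the gradients of the class score with respect to them have already been extracted from the network. The function name `grad_cam` and the tiny input arrays are hypothetical.

```python
import numpy as np


def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap from (H, W, C) activations and matching gradients."""
    # Channel importance: global average of the gradients over the spatial dimensions
    channel_weights = gradients.mean(axis=(0, 1))
    # Weighted sum of the feature maps, followed by ReLU to keep positive evidence only
    heatmap = np.maximum(activations @ channel_weights, 0)
    # Scale to [0, 1] for display, unless the map is entirely zero
    peak = heatmap.max()
    return heatmap / peak if peak > 0 else heatmap


# Hypothetical 2x2 feature maps with two channels
activations = np.zeros((2, 2, 2))
activations[0, 0, 0] = 2.0  # channel 0 fires in the top-left corner
activations[1, 1, 1] = 3.0  # channel 1 fires in the bottom-right corner

gradients = np.ones((2, 2, 2))
gradients[..., 1] = -1.0  # channel 1 lowers the class score

heatmap = grad_cam(activations, gradients)
```

In this toy case only the top-left cell remains active, because the channel that fires in the bottom-right corner has a negative weight and is removed by the ReLU. In practice the heatmap is resized to the input resolution and overlaid on the original image.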