### https://www.kaggle.com/competitions/rsna-breast-cancer-detection/overview

#### Goal of the Competition
The goal of this competition is to identify breast cancer. You'll train your model on mammograms obtained from regular screening.

Your work improving the automation of detection in screening mammography may enable radiologists to be more accurate and efficient, improving the quality and safety of patient care. It could also help reduce costs and unnecessary medical procedures.

#### Context
According to the WHO, breast cancer is the most commonly occurring cancer worldwide. In 2020 alone, there were 2.3 million new breast cancer diagnoses and 685,000 deaths. Yet breast cancer mortality in high-income countries has dropped by 40% since the 1980s, when health authorities implemented regular mammography screening in age groups considered at risk. Early detection and treatment are critical to reducing cancer fatalities, and your machine learning skills could help streamline the process radiologists use to evaluate screening mammograms.

Currently, early detection of breast cancer requires the expertise of highly trained human observers, making screening mammography programs expensive to conduct. A looming shortage of radiologists in several countries will likely worsen this problem. Mammography screening also leads to a high incidence of false-positive results, which can cause unnecessary anxiety, inconvenient follow-up care, extra imaging tests, and sometimes a need for tissue sampling (often a needle biopsy).

The competition host, the Radiological Society of North America (RSNA), is a non-profit organization that represents 31 radiologic subspecialties from 145 countries around the world. RSNA promotes excellence in patient care and health care delivery through education, research, and technological innovation.

Your efforts in this competition could help extend the benefits of early detection to a broader population. Greater access could further reduce breast cancer mortality worldwide.


In [12]:
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications.resnet50 import ResNet50


In [2]:
# Load the ResNet50 model pretrained on ImageNet
resnet = ResNet50(weights='imagenet', include_top=False, input_shape=(150, 150, 3))


Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5


In [3]:
# Add new layers on top of the pretrained ResNet50 model
x = resnet.output
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dense(256, activation='relu')(x)
x = tf.keras.layers.Dropout(0.5)(x)
predictions = tf.keras.layers.Dense(1, activation='sigmoid')(x)
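
As a rough illustration of what the pooling layer in the new head does, the NumPy sketch below averages a feature map over its two spatial axes, leaving one value per channel. The 5 x 5 x 2048 shape is an assumption about what ResNet50 produces for a 150 x 150 input with `include_top=False`; `resnet.output_shape` is the place to confirm it. The array itself is random stand-in data, not model output.

```python
import numpy as np

# Stand-in for a batch of backbone feature maps: (batch, height, width, channels).
# The 5 x 5 x 2048 shape is assumed here, not read from the model.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 5, 5, 2048))

# Global average pooling: one mean per channel, taken over height and width.
pooled = features.mean(axis=(1, 2))

print(pooled.shape)  # (4, 2048)
```

Under that assumption, the `Dense(256)` layer that follows sees a 2048-long vector per image, so it would hold 2048 * 256 + 256 = 524,544 trainable parameters.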


In [4]:
# Create the final model
model = tf.keras.models.Model(inputs=resnet.input, outputs=predictions)


In [5]:
# Compile the model
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
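
Accuracy on its own can look strong when positive cases are rare, which is typical of screening data. The pure-Python sketch below uses made-up labels (not competition data) and a hypothetical model that predicts "negative" for every case, to show accuracy next to precision and recall. The helper names are illustrative only.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def summarize(y_true, y_pred):
    """Return (accuracy, precision, recall); undefined ratios are reported as 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Made-up example: 2 positive cases among 100.
y_true = [1] * 2 + [0] * 98
always_negative = [0] * 100

print(summarize(y_true, always_negative))  # (0.98, 0.0, 0.0)
```

Keras ships `tf.keras.metrics.Precision`, `tf.keras.metrics.Recall`, and `tf.keras.metrics.AUC`, which could be passed in the `metrics` list alongside `'accuracy'` if this kind of imbalance is a concern.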


In [6]:

# Define the data generators for training and validation
train_datagen = ImageDataGenerator(rescale=1./255, 
                                   shear_range=0.2, 
                                   zoom_range=0.2, 
                                   horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
        'data/train',
        target_size=(150, 150),
        batch_size=32,
        class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
        'data/validation',
        target_size=(150, 150),
        batch_size=32,
        class_mode='binary')


FileNotFoundError: [Errno 2] No such file or directory: 'data/train'
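
The error above means `data/train` does not exist relative to the notebook's working directory. `flow_from_directory` expects a root folder containing one subfolder per class, so a small check before building the generators can surface the problem earlier. The sketch below is a hypothetical helper, not part of Keras; the class folder names `0` and `1` are an assumption, and the demo runs against a temporary directory rather than the real data.

```python
import tempfile
from pathlib import Path


def count_images_per_class(root):
    """Return {class_folder_name: file_count} for each subdirectory of root."""
    root = Path(root)
    if not root.is_dir():
        raise FileNotFoundError(f"Expected an image directory at {root}")
    counts = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(1 for f in class_dir.iterdir() if f.is_file())
    return counts


# Demo on a throwaway directory tree with assumed class folders "0" and "1".
with tempfile.TemporaryDirectory() as tmp:
    train_root = Path(tmp) / "train"
    for name in ("0", "1"):
        (train_root / name).mkdir(parents=True)
    (train_root / "0" / "a.png").touch()
    demo_counts = count_images_per_class(train_root)

print(demo_counts)  # {'0': 1, '1': 0}
```

Calling `count_images_per_class('data/train')` before the generator cell would raise the same `FileNotFoundError` with a clearer message, and an empty class folder would show up as a zero count.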

In [7]:
# Train the model (the step counts assume 2000 training and 800 validation images at batch size 32)
history = model.fit(
        train_generator,
        steps_per_epoch=2000 // 32,
        epochs=50,
        validation_data=validation_generator,
        validation_steps=800 // 32)


NameError: name 'train_generator' is not defined
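
This `NameError` is a knock-on effect: the generator cell failed, so `train_generator` was never created. Separately, the step counts in the `fit` call are plain integer arithmetic, and floor division silently drops a partial final batch. The sketch below works through the numbers from that call; `steps_for` is a hypothetical helper, and the sample counts 2000 and 800 are simply the constants that appear in the cell.

```python
import math


def steps_for(num_samples, batch_size, drop_remainder=True):
    """Number of batches per epoch, with or without the partial last batch."""
    if drop_remainder:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


train_steps = steps_for(2000, 32)                             # 62
val_steps = steps_for(800, 32)                                # 25
train_steps_full = steps_for(2000, 32, drop_remainder=False)  # 63

# Samples left unused per epoch when the remainder is dropped.
unused_train_samples = 2000 - train_steps * 32                # 16

print(train_steps, val_steps, train_steps_full, unused_train_samples)
```

With 2000 training images, 62 steps cover 1984 of them per epoch; 800 validation images divide evenly into 25 batches.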

In [8]:
# Save the trained model
model.save('breast_cancer_model.h5')


In [9]:
# Load the saved model
loaded_model = tf.keras.models.load_model('breast_cancer_model.h5')


In [10]:
# Make predictions on new images
image_path = 'data/test/0/abc.jpg' # replace with your own image path
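image = tf.keras.preprocessing.image.load_img(image_path, target_size=(150, 150))
image_array = tf.keras.preprocessing.image.img_to_array(image)
image_array = image_array / 255.0 # same rescaling as the training generator
image_array = tf.expand_dims(image_array, 0) # create batch dimension
probability = loaded_model.predict(image_array)[0][0] # sigmoid output for class index 1
# flow_from_directory assigns class indices alphabetically by folder name,
# so this mapping assumes the benign folder sorts before the malignant one
if probability < 0.5:
    print('Benign')
else:
    print('Malignant')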


FileNotFoundError: [Errno 2] No such file or directory: 'data/test/0/abc.jpg'
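
The last step of the prediction cell turns a single sigmoid output into a label with a fixed 0.5 cutoff. The sketch below isolates that decision as a hypothetical helper, so the threshold can be varied without touching the model. The label order follows the cell above, and the example probabilities are made up.

```python
def label_from_probability(probability, threshold=0.5, labels=("Benign", "Malignant")):
    """Map a sigmoid output in [0, 1] to a label; values below threshold get labels[0]."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability must be in [0, 1], got {probability}")
    return labels[0] if probability < threshold else labels[1]


print(label_from_probability(0.3))                 # Benign
print(label_from_probability(0.3, threshold=0.2))  # Malignant
```

Lowering the threshold flags more images as malignant, which trades extra false positives for fewer missed cases. Where to set it is a design choice that a validation set would need to inform.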