<a href="https://colab.research.google.com/github/aidigitalmillionaire/AI-Agents-for-Medical-Diagnostics/blob/main/Kaggle_AI_Report_Medical_Imaging_Competitions.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
# IMPORTANT: RUN THIS CELL IN ORDER TO IMPORT YOUR KAGGLE DATA SOURCES,
# THEN FEEL FREE TO DELETE THIS CELL.
# NOTE: THIS NOTEBOOK ENVIRONMENT DIFFERS FROM KAGGLE'S PYTHON
# ENVIRONMENT SO THERE MAY BE MISSING LIBRARIES USED BY YOUR
# NOTEBOOK.
import kagglehub
nghihuynh_kaggle_ai_report_23_path = kagglehub.dataset_download('nghihuynh/kaggle-ai-report-23')

print('Data source import complete.')


![](https://i.postimg.cc/cJMzRFcK/header.png)

<span style='font-size:12px; font-family:Verdana;'>Kaggle medical imaging competitions typically involve three primary tasks: object detection, classification, and segmentation. With the emergence of transformer models, these tasks have witnessed remarkable advancements. Transformers, which revolutionized natural language processing in 2017, have now found applications in medical imaging DL models. Adopting transformers has led to enhanced performance in various medical imaging competitions. In this report, I focus on benchmarking the top Kaggle solutions in classification, object detection, and segmentation for advanced medical imaging modalities such as MRI, CT, and X-rays. Specifically, I will emphasize recent deep learning-based methods for these tasks, highlighting their methodology designs and performances in handling volumetric imaging data. By reviewing and analyzing these top solutions, I aim to provide insights for our community on effectively adapting artificial intelligence (AI) techniques for medical imaging. Overall, this report can serve as a valuable reference for our community to gauge the advancement of Deep Learning in medical imaging competitions in the past five years.</span>

# <div style="padding:14px;color:white;margin:0;font-family:Georgia;font-size:30px;text-align:left;display:fill;border-radius:5px;background-color:#00b4d8;overflow:hidden">1. Introduction</div>

<span style='font-size:12px; font-family:Verdana;'>Artificial intelligence (AI) has been at the core of many remarkable developments in our society. AI is now being incorporated into nearly every industry in the hope of maximizing productivity, efficiency, and accuracy. Over the past few years, I have observed a significant positive impact of AI, especially deep learning (DL), on medical imaging.</span>

<span style='font-size:12px; font-family:Verdana;'>Medical imaging is a non-invasive technique for imaging the interior of the body for clinical analysis and medical intervention <a href='https://en.wikipedia.org/wiki/Medical_imaging'>[1]</a>. Different modalities such as magnetic resonance imaging (MRI), computed tomography (CT), and X-rays provide versatile information, ranging from structure and morphology to physiological function <a href='https://www.frontiersin.org/articles/10.3389/fradi.2021.781868/full'>[2]</a>.</span>

<span style='font-size:12px; font-family:Verdana;'>The first part of this report will focus on top DL model types in Kaggle medical imaging competitions. The second part will review some of the top solutions in depth with special attention to the importance of those competitions. </span>


In [None]:
%matplotlib inline
import gc, os, sys, time
import pandas as pd, numpy as np
from pathlib import Path
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.ticker import MaxNLocator
from IPython.display import HTML, display
import plotly.express as px
import plotly.graph_objects as go
from sklearn.preprocessing import StandardScaler
import scipy

# <div style="padding:14px;color:white;margin:0;font-family:Georgia;font-size:30px;text-align:left;display:fill;border-radius:5px;background-color:#00b4d8;overflow:hidden">2. Data Collection</div>

<span style='font-size:12px; font-family:Verdana;'>Data were collected from the `meta-kaggle` dataset. The `Model detail` and `Model type` columns were compiled manually from the top 10 writeups for each competition. All other columns come from the `meta-kaggle` dataset.</span>

<span style='font-size:12px; font-family:Verdana;'>Only medical imaging competitions were selected. Data cleaning was performed on the metadata to remove rows with a missing `Model detail`.</span>

<span style='font-size:12px; font-family:Verdana;'>Intracranial Hemorrhage Detection and OSIC Pulmonary Fibrosis Progression were removed from the metadata because they are outliers.</span>

<span style='font-size:12px; font-family:Verdana;'>Data collection, cleaning and extraction were all performed using Excel.</span>
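
<span style='font-size:12px; font-family:Verdana;'>The cleaning itself was done in Excel, so the following is only a rough pandas equivalent of the two steps described above, not the procedure actually used. The toy rows, the `Title` column name, and the exact competition titles in `excluded` are assumptions made for illustration. Only `Model type` and `Model detail` are column names taken from this report.</span>

```python
import pandas as pd

# Toy stand-in for the metadata table. The rows and the "Title" column
# are hypothetical; they are not taken from Meta_data_competitions.csv.
raw = pd.DataFrame({
    "Title": [
        "Competition A",
        "Competition B",
        "RSNA Intracranial Hemorrhage Detection",
        "Competition C",
    ],
    "Model type": ["CNN", "UNet", "CNN", "CNN"],
    "Model detail": ["detail A", "detail B", "detail C", None],
})

# Competitions the report excludes as outliers (titles are assumed spellings).
excluded = [
    "RSNA Intracranial Hemorrhage Detection",
    "OSIC Pulmonary Fibrosis Progression",
]

cleaned = (
    raw.dropna(subset=["Model detail"])                  # drop rows without a model detail
       .loc[lambda df: ~df["Title"].isin(excluded)]      # drop the excluded competitions
       .reset_index(drop=True)
)
print(cleaned)
```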

In [None]:
medical_competitions = pd.read_csv('/kaggle/input/kaggle-ai-report-23/Meta_data_competitions.csv')
metrics = medical_competitions.Metric.unique()
medical_competitions.head()

# <div style="padding:14px;color:white;margin:0;font-family:Georgia;font-size:30px;text-align:left;display:fill;border-radius:5px;background-color:#00b4d8;overflow:hidden">3. Kaggle Medical Imaging Competitions</div>

<span style='font-size:12px; font-family:Verdana;'>Medical image analysis has become a major research field in biomedical research, with thousands of papers published on various image analysis topics, including classification, object detection, and segmentation <a href='https://pubmed.ncbi.nlm.nih.gov/27503079/'>[5]</a>. However, validation and evaluation of new methods were based on the authors’ personal data sets, rendering fair and direct comparison of the solutions impossible <a href='https://www.sciencedirect.com/science/article/abs/pii/0734189X86900836?via%3Dihub'>[6]</a>.</span>

<span style='font-size:12px; font-family:Verdana;'>To address fairness and bias in research, there has been a growing interest in organizing biomedical challenges. The first grand challenge in this domain was organized during the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) in 2007 <a href='https://pubmed.ncbi.nlm.nih.gov/19211338/'>[7]</a>. This competition marked a significant turning point in research practices, leading to changes in how new methods are validated and evaluated.</span>
    
<span style='font-size:12px; font-family:Verdana;'>Over time, research practice began to change, and the number of challenges organized annually has increased steadily. Since 2018, Kaggle has become a prominent platform for hosting medical imaging competitions with large prizes. These competitions often involve classification, object detection, and segmentation of medical images, as shown in <b>Figure 1</b>. By hosting these competitions on Kaggle, organizers provide researchers and data scientists with standardized datasets, evaluation metrics, and a common platform for benchmarking their algorithms. This approach enables fair and direct comparisons between different algorithms, fostering advancements in medical image analysis.</span>

![](https://i.postimg.cc/0QXcnFKN/2.png)

## <div style="padding:14px;color:white;margin:0;font-family:Georgia;font-size:30px;text-align:left;display:fill;border-radius:5px;background-color:#00b4d8;overflow:hidden">3.1 Object Detection</div>

<span style='font-size:12px; font-family:Verdana;'>Medical object detection is the task of identifying and localizing objects within a medical image. Instead of assigning a single label to the whole image, a detector determines where each object is, which requires generating Regions Of Interest (ROI) that contain the objects. In medical imaging, these objects can correspond to anatomical structures (e.g. organs) or anomalies (e.g. pulmonary nodules) <a href='https://www.imaios.com/en/resources/blog/introduction-to-deep-learning-model-types-for-object-detection-in-medical-imaging'>[12]</a>.</span>

**SIIM-FISABIO-RSNA COVID-19 Detection**

**Goal**: identify and localize COVID-19 abnormalities on chest radiographs.

**Image Modality**: chest X-rays (radiographs), provided as DICOM files [[8]](https://www.kaggle.com/code/andradaolteanu/siim-covid-19-box-detect-dcm-metadata). A radiograph projects the body onto a single 2D image. A Computerized Tomography scan (CT or CAT scan), by comparison, uses computers and rotating X-ray machines to create cross-sectional images that show the soft tissues, blood vessels, and bones in more detail.

**Dates**: May 17, 2021 to Aug 9, 2021

**1,305** teams, **1,786** competitors, **32,307** submissions

**Evaluation**: mAP

**Reward**: $100,000

![](https://i.postimg.cc/0N8nsPpQ/object-detection.png)
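
<span style='font-size:12px; font-family:Verdana;'>Mean average precision (mAP) depends on deciding whether a predicted box matches a ground-truth box, which is usually done with the Intersection over Union (IoU) of the two boxes: a prediction counts as a hit when its IoU exceeds a chosen threshold (0.5 is a common choice). The sketch below computes IoU only, not the full mAP. The `(x_min, y_min, x_max, y_max)` box format and the example coordinates are assumptions for illustration.</span>

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region (zero if the boxes do not overlap)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


# Hypothetical predicted and ground-truth boxes on a chest radiograph
predicted = (10, 10, 50, 50)
ground_truth = (30, 30, 70, 70)
iou = box_iou(predicted, ground_truth)
print(f"IoU = {iou:.3f}")
```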

## <div style="padding:14px;color:white;margin:0;font-family:Georgia;font-size:30px;text-align:left;display:fill;border-radius:5px;background-color:#00b4d8;overflow:hidden">3.2 Classification</div>

<span style='font-size:12px; font-family:Verdana;'>Medical Image Classification is a task in medical image analysis that involves classifying medical images, such as X-rays, MRI scans, and CT scans, into different categories based on the type of image or the presence of specific structures or diseases. The goal is to use computer algorithms to automatically identify and classify medical images based on their content, which can help in diagnosis, treatment planning, and disease monitoring <a href='https://paperswithcode.com/task/medical-image-classification#:~:text=Medical%20Image%20Classification%20is%20a,of%20specific%20structures%20or%20diseases.'>[11]</a>.</span>

**RSNA Screening Mammography Breast Cancer Detection**

**Goal**: identify breast cancer.

**Image Modality**: mammography, which uses low-energy X-rays to examine the human breast for diagnosis and screening.

**Dates**: Nov 28, 2022 to Feb 27, 2023

**1,687** teams, **2,146** competitors, **45,911** submissions

**Evaluation**: pF1

**Reward**: $50,000

![](https://i.postimg.cc/4d0QVcj7/classification.png)
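
<span style='font-size:12px; font-family:Verdana;'>The pF1 metric is a probabilistic F1 score: instead of counting thresholded predictions, it sums the predicted probabilities to obtain "soft" true-positive and false-positive counts. The following is a minimal sketch of that idea, assuming binary labels and probabilities in [0, 1]. The function name and the example values are illustrative, and the official competition implementation may differ in details.</span>

```python
import numpy as np


def probabilistic_f1(labels, probabilities):
    """Probabilistic F1: precision and recall built from summed probabilities."""
    labels = np.asarray(labels, dtype=float)
    probabilities = np.clip(np.asarray(probabilities, dtype=float), 0.0, 1.0)

    soft_tp = np.sum(probabilities * labels)        # probability mass on true positives
    soft_fp = np.sum(probabilities * (1 - labels))  # probability mass on negatives
    n_positive = labels.sum()

    # No positives, or no probability assigned to them: score is zero
    if n_positive == 0 or soft_tp == 0:
        return 0.0

    precision = soft_tp / (soft_tp + soft_fp)
    recall = soft_tp / n_positive
    return float(2 * precision * recall / (precision + recall))


# Hypothetical labels (1 = cancer) and predicted probabilities
labels = [1, 0, 0, 1]
probabilities = [0.9, 0.2, 0.1, 0.5]
score = probabilistic_f1(labels, probabilities)
print(f"pF1 = {score:.4f}")
```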

## <div style="padding:14px;color:white;margin:0;font-family:Georgia;font-size:30px;text-align:left;display:fill;border-radius:5px;background-color:#00b4d8;overflow:hidden">3.3 Segmentation</div>

<span style='font-size:12px; font-family:Verdana;'>Image segmentation is a computer vision task in which we label specific regions of pixels in an image with their corresponding classes. Medical image segmentation is a crucial example of this domain and offers numerous benefits for clinical use.</span>

<span style='font-size:12px; font-family:Verdana;'>Image segmentation tasks can be classified into two categories: semantic segmentation and instance segmentation <a href='https://arxiv.org/abs/1910.07655'>[9]</a>, <a href='https://arxiv.org/abs/2001.05566'>[10]</a>. Semantic segmentation is the process of labeling one or more specific regions of interest in an image. This process treats multiple objects within a single category as one entity. In contrast, instance segmentation is the process of detecting and delineating each object of interest in an image. This process is a combination of object detection and semantic segmentation. However, it differs from semantic segmentation because it gives a unique label to every instance of a particular object in the image.</span>
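
<span style='font-size:12px; font-family:Verdana;'>The difference between the two label types can be illustrated on a toy mask. In the sketch below, the semantic mask marks every foreground pixel with the same class, while connected-component labeling with `scipy.ndimage.label` gives each blob its own id. Connected components are only a simple stand-in here for what an instance segmentation model outputs, and the mask values are made up for illustration.</span>

```python
import numpy as np
from scipy import ndimage

# Semantic mask: 1 = object class, 0 = background.
# Both blobs share the same class label.
semantic_mask = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
], dtype=np.uint8)

# Instance mask: each connected blob receives a unique integer id (1, 2, ...)
instance_mask, n_instances = ndimage.label(semantic_mask)

print("Classes in semantic mask:", np.unique(semantic_mask))
print("Number of instances:", n_instances)
print(instance_mask)
```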

**UW-Madison GI Tract Image Segmentation**

**Goal**: segment the stomach and intestines on MRI scans.

**Image Modality**: MRI scans. Magnetic resonance imaging (MRI) is a medical imaging technique used in radiology to form pictures of the anatomy and the physiological processes of the body. MRI scanners use strong magnetic fields, magnetic field gradients, and radio waves to generate images of the organs in the body.

**Dates**: Apr 14, 2022 to Jul 14, 2022

**1,548** teams, **2,078** competitors, **40,956** submissions

**Evaluation**: mean Dice coefficient

**Reward**: $25,000

![](https://i.postimg.cc/65J1Ps17/segmentation.png)
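
<span style='font-size:12px; font-family:Verdana;'>The Dice coefficient measures the overlap between a predicted mask and a ground-truth mask as twice the intersection divided by the sum of the two mask sizes. Below is a minimal sketch for a single pair of binary masks. Returning 1.0 when both masks are empty is a convention assumed here, and competitions may define that case differently. The example masks are made up.</span>

```python
import numpy as np


def dice_coefficient(pred_mask, true_mask):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)

    denominator = pred.sum() + true.sum()
    if denominator == 0:
        return 1.0  # both masks empty: treated as a perfect match (assumed convention)

    intersection = np.logical_and(pred, true).sum()
    return float(2.0 * intersection / denominator)


# Hypothetical predicted and ground-truth masks for one organ on one slice
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
dice = dice_coefficient(pred_mask, true_mask)
print(f"Dice = {dice:.3f}")
```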

# <div style="padding:14px;color:white;margin:0;font-family:Georgia;font-size:30px;text-align:left;display:fill;border-radius:5px;background-color:#00b4d8;overflow:hidden">4. Deep Learning Models in Medical Imaging</div>

<span style='font-size:12px; font-family:Verdana;'>Deep learning methods have become the dominant approach in medical image analysis. Convolutional neural networks (CNNs) have played a pivotal role and remain at the forefront of research and development. Prominent CNN architectures such as ResNets, EfficientNets, and DenseNets have shown strong performance in classification. YOLOs and R-CNNs have proven effective in object detection, while U-Nets dominate segmentation, as shown in <b>Figures 2 and 3</b>. RNN architectures such as LSTM and GRU, in contrast, have performed well on sequential medical images, such as video or dynamic modalities like functional MRI (fMRI) (see <b>Figures 2 and 3</b>). Vision Transformers, which replace convolutions with an attention mechanism, have recently exceeded the performance of CNNs in many tasks, but they need enormous amounts of training data, even more than CNNs. Because medical image data is scarce, they are not commonly used in medical imaging competitions, as shown in <b>Figure 3</b>. Nevertheless, the potential of Vision Transformers in medical imaging analysis remains promising. As the availability of annotated medical image datasets increases, these models may become more prevalent and demonstrate their full capabilities in solving challenging medical imaging tasks.</span>

![](https://i.postimg.cc/8kBhTqzf/3.png)
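
<span style='font-size:12px; font-family:Verdana;'>To make the attention mechanism mentioned above more concrete, here is a small NumPy sketch of single-head scaled dot-product self-attention over image patches. It is not a full Vision Transformer: the projection weights are random rather than learned, there are no positional embeddings, and the image size, patch size, and embedding dimension are arbitrary values chosen for illustration.</span>

```python
import numpy as np


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a set of tokens."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])           # similarity of every patch pair
    scores -= scores.max(axis=-1, keepdims=True)      # numerical stability for softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)    # each row sums to 1
    return weights @ v, weights


rng = np.random.default_rng(0)
image = rng.random((8, 8))   # toy single-channel "scan"
patch = 4

# Split the image into four non-overlapping 4x4 patches and flatten each into a token
tokens = image.reshape(2, patch, 2, patch).swapaxes(1, 2).reshape(4, patch * patch)

dim = 8  # embedding dimension (arbitrary)
w_q, w_k, w_v = (rng.standard_normal((patch * patch, dim)) for _ in range(3))

output, weights = self_attention(tokens, w_q, w_k, w_v)
print(output.shape, weights.shape)
```

<span style='font-size:12px; font-family:Verdana;'>Every patch attends to every other patch, so the attention matrix grows quadratically with the number of patches. A convolution, by comparison, only looks at a local neighborhood.</span>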

In [None]:
# Figure 3: Share of each model type across all top solutions (in percent)
models = medical_competitions['Model type'].value_counts().sort_values(ascending=False).reset_index()
models.columns = ['Model type','Count']
models['Count'] = np.round(models['Count']/models['Count'].sum()*100,2)

fig = px.bar(models, x='Model type', y='Count',
             text='Count',
             color_discrete_sequence=['#00b4d8'],
             height=500, width=900,
             title='Figure 3: Overview of Top Model Types in Medical Imaging Competitions',
             template='plotly_white')
fig.update_traces(textposition='outside')
fig.update_xaxes(title='Model Type')
fig.update_yaxes(title='Percentage %')
fig.update_traces(width=0.50,marker_line_color = 'black', marker_line_width = 2)
fig.update_layout(title_y=0.02, title_x=0.1)
fig.show()

<span style='font-size:12px; font-family:Verdana;'>Over the past five years, CNN and UNet models have been used in numerous medical imaging competitions and have consistently delivered strong results, as shown in <b>Figures 4 and 5</b>. CNNs have primarily been employed for classification and object detection because they are excellent feature extractors: they can classify medical images without complicated and expensive feature engineering <a href='https://journalofbigdata.springeropen.com/articles/10.1186/s40537-019-0276-2'>[3]</a>. UNet, the most widespread image segmentation architecture, has attracted tremendous attention from academic and industrial researchers <a href='https://arxiv.org/pdf/2211.14830.pdf'>[4]</a>. As a result, UNets have emerged as highly effective tools for segmentation tasks across various medical image modalities, as shown in <b>Figure 5</b>.</span>

In [None]:
# Figure 4: Share of each model type per year (in percent of all top solutions)
models = medical_competitions.groupby(['Model type','Year'], as_index=False)['Model type'].value_counts()
models.columns = ['Model type','Year','Count']
models['Count'] = np.round(models['Count']/models['Count'].sum()*100,2)


fig = px.bar(models, x='Year', y='Count', color='Model type',
             barmode='group',color_discrete_sequence=['#00b4d8','#edede9','#d6ccc2',
                                                      '#f5ebe0','#fb8b24' ],
             height=500, width=800,opacity=0.9,
             title='Figure 4: Distribution of top model types used in top 10% solutions from 2018 to 2023',
             template='plotly_white')
fig.update_xaxes(title='Year')
fig.update_yaxes(title='Percentage %')
fig.update_traces(width=0.12,marker_line_color = 'black', marker_line_width = 2)
fig.update_layout(bargap=0.4,
                  legend_title_text='',
                  legend=dict(orientation='h',
                              yanchor="top",
                              y=1.1,
                              xanchor="left",
                              x=0.0
                             ),
                   title_y=0.02, title_x=0.01

                )
fig.show()

In [None]:
# Figure 5: Share of each model type per task (in percent of all top solutions)
models = medical_competitions.groupby(['Model type','Task'], as_index=False)['Model type'].value_counts()
models.columns = ['Model type','Task','Count']
models['Count'] = np.round(models['Count']/models['Count'].sum()*100,2)

fig = px.bar(models, x='Task', y='Count', color='Model type',
             barmode='group',color_discrete_sequence=['#00b4d8','#edede9','#d6ccc2',
                                                      '#f5ebe0','#fb8b24' ],
             height=500, width=800,opacity=0.9,
             title='Figure 5: Distribution of top model types used in top 10% solutions in different tasks',
             template='plotly_white')

fig.update_xaxes(title='Task')  # the x-axis shows the task; model type is encoded by color
fig.update_yaxes(title='Percentage %')

fig.update_traces(width=0.12,marker_line_color = 'black', marker_line_width = 2)
fig.update_layout(bargap=0.4,
                  legend_title_text='',
                  legend=dict(orientation='h',
                              yanchor="top",
                              y=1.1,
                              xanchor="left",
                              x=0.0
                             ),
                  title_y=0.02, title_x=0.01
                )
fig.show()

# <div style="padding:14px;color:white;margin:0;font-family:Georgia;font-size:30px;text-align:left;display:fill;border-radius:5px;background-color:#00b4d8;overflow:hidden">5. Conclusion</div>

<span style='font-size:12px; font-family:Verdana;'>In this report, I presented an overview of the advancement of AI in Kaggle medical imaging competitions. Over the past few years, Kaggle has provided a platform with a variety of real-life medical image datasets for benchmarking algorithms. Deep learning methods, particularly convolutional neural networks (CNNs) and UNet models, have proven effective on medical image modalities such as MRI, CT scans, and X-rays in these competitions. Recently, Vision Transformers have shown promising applications, including in medical image analysis. Nevertheless, despite their potential, Vision Transformers are not widely used in medical imaging competitions.</span>



# <div style="padding:20px;color:white;margin:0;font-family:Georgia;font-size:30px;text-align:left;display:fill;border-radius:5px;background-color:#00b4d8;overflow:hidden">References</div>

[[1] Medical Imaging](https://en.wikipedia.org/wiki/Medical_imaging)

[[2] Review and Prospect: Artificial Intelligence in Advanced Medical Imaging](https://www.frontiersin.org/articles/10.3389/fradi.2021.781868/full)

[[3] Deep convolutional neural network based medical image classification for disease diagnosis](https://journalofbigdata.springeropen.com/articles/10.1186/s40537-019-0276-2)

[[4] Medical Image Segmentation Review: The Success of U-Net](https://arxiv.org/pdf/2211.14830.pdf)

[[5] 20th anniversary of the Medical Image Analysis journal (MedIA)](https://pubmed.ncbi.nlm.nih.gov/27503079/)

[[6] Anything you can do, I can do better (No you can't)](https://www.sciencedirect.com/science/article/abs/pii/0734189X86900836?via%3Dihub)

[[7] Comparison and evaluation of methods for liver segmentation from CT datasets](https://pubmed.ncbi.nlm.nih.gov/19211338/)

[[8] 😷SIIM Covid-19: Box Detect & .dcm metadata](https://www.kaggle.com/code/andradaolteanu/siim-covid-19-box-detect-dcm-metadata)

[[9] Deep Semantic Segmentation of Natural and Medical Images: A Review](https://arxiv.org/abs/1910.07655)

[[10] Image Segmentation Using Deep Learning: A Survey](https://arxiv.org/abs/2001.05566)

[[11] Medical Image Classification](https://paperswithcode.com/task/medical-image-classification#:~:text=Medical%20Image%20Classification%20is%20a,of%20specific%20structures%20or%20diseases.)

[[12] Detection and segmentation in medical imaging: types of deep learning models](https://www.imaios.com/en/resources/blog/introduction-to-deep-learning-model-types-for-object-detection-in-medical-imaging)

In [None]:
# Submission
submission_df = pd.read_csv("/kaggle/input/2023-kaggle-ai-report/sample_submission.csv")
# Use a single .loc[row, column] call so the assignment writes to the DataFrame itself
submission_df.loc[0, 'value'] = 'Kaggle Competitions'
submission_df.loc[1, 'value'] = 'https://www.kaggle.com/code/nghihuynh/kaggle-ai-report-medical-imaging-competitions'
submission_df.loc[2, 'value'] = 'https://www.kaggle.com/code/ahsuna123/kaggle-ai-report-healthcare-surge/comments#2336493'
submission_df.loc[3, 'value'] = 'https://www.kaggle.com/code/bnzn261029/kaggle-competitions-reflect-the-development-of-ai/comments#2344474'
submission_df.loc[4, 'value'] = 'https://www.kaggle.com/code/omarrajaa/image-data-report/comments#2344544'

submission_df.to_csv('submission.csv', index=False)
submission_df.head()