# RSNA2024 LSDC Submission Baseline
This notebook is forked from [this baseline](https://www.kaggle.com/code/itsuki9180/rsna2024-lsdc-submission-baseline). In the [previous notebook](https://www.kaggle.com/code/itsuki9180/rsna2024-lsdc-making-dataset), the original author selected the images to use and exported them to PNG. The original author then trained an EfficientNet_B4 model on these images.

I decided to change the model to see whether it improves the score. I chose DenseNet201 because it has roughly the same number of parameters and model size as EfficientNet_B4, so I expect the Kaggle GPU to handle it without needing an extra machine.

### My other Notebooks
- [RSNA2024 LSDC Making Dataset](https://www.kaggle.com/code/itsuki9180/rsna2024-lsdc-making-dataset) 
- [RSNA2024 LSDC Training DenseNet](https://www.kaggle.com/code/hugowjd/rsna2024-lsdc-training-densenet) 
- [RSNA2024 LSDC Submission DenseNet](https://www.kaggle.com/code/hugowjd/rsna2024-lsdc-densenet-submission) <- you're reading now

### Reference:
* [Huang, G., Liu, Z., Van Der Maaten, L., and Weinberger, K. Q. Densely connected convolutional networks. CVPR, 2017.](https://arxiv.org/abs/1608.06993)
* [Mingxing Tan and Quoc V. Le. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. ICML 2019.](https://arxiv.org/abs/1905.11946)

### Future Improvement
* I'm confident that other models can be trained on these images. To try one, change the ***MODEL_NAME*** parameter in the **Config** section. Below are some CNN models suited to image classification, with their approximate parameter counts.
  * ResNet:
    * ResNet-18: ~11.7 million parameters
    * **ResNet-34: ~21.8 million parameters**
    * **ResNet-50: ~25.6 million parameters**
    * ResNet-101: ~44.5 million parameters
    * ResNet-152: ~60 million parameters
  * VGG:
    * VGG-16: ~138 million parameters
    * VGG-19: ~143 million parameters
  * Inception Networks:
    * Inception v1 (GoogleNet): ~6.8 million parameters
    * Inception v3: ~23.8 million parameters
  * DenseNet:
    * DenseNet-121: ~8 million parameters
    * **DenseNet-169: ~14 million parameters**
        * Waiting to do
    * **DenseNet-201: ~20 million parameters** 
        * My Submission LB: 0.61
        * [10 folds submission LB](https://www.kaggle.com/code/sadidul012/densenet201-submission): 0.6
    * **DenseNet-161: ~28.7 million parameters**
        * 10 folds(This notebook V9): 0.59
    * DenseNet-264: ~33 million parameters
  * MobileNets (parameters can vary significantly with changes in alpha and resolution multipliers):
    * MobileNetV1 (1.0 224): ~4.2 million parameters
    * MobileNetV2 (1.0 224): ~3.5 million parameters
    * MobileNetV3 Large: ~5.4 million parameters
  * Vision Transformers (ViT):
    * ViT-B/16 (base model with patch size 16x16): ~86 million parameters
  * Xception:
    * **Xception: ~22.9 million parameters**
        * my test LB : 0.66
  * EfficientNet
    * EfficientNet-B0: ~5.3 million parameters
    * EfficientNet-B1: ~7.8 million parameters
    * EfficientNet-B2: ~9.2 million parameters
        * [EFNetV2 LB](https://www.kaggle.com/code/shubhamcodez/rsna-efficientnet-starter-notebook): 1.01 
    * EfficientNet-B3: ~12 million parameters
        * [Original Notebook](https://www.kaggle.com/code/itsuki9180/rsna2024-lsdc-submission-baseline) LB: 0.69
    * **EfficientNet-B4: ~19 million parameters**
        * [Original Notebook](https://www.kaggle.com/code/itsuki9180/rsna2024-lsdc-submission-baseline) LB: 0.70
    * EfficientNet-B5: ~30 million parameters
    * EfficientNet-B6: ~43 million parameters
    * EfficientNet-B7: ~66 million parameters
* The original author noted that the dataset-making process can be improved.
* I trained only 5 **folds** for 10 **epochs**; you can change these parameters, but watch out for overfitting.

# Import Libraries

In [1]:
import os
import gc
import sys
from PIL import Image
import cv2
import math, random
import numpy as np
import pandas as pd
import glob  # imported as a module because the code below calls glob.glob(...)
from tqdm import tqdm
import matplotlib.pyplot as plt
from sklearn.model_selection import KFold

from collections import OrderedDict

import torch
import torch.nn.functional as F
from torch import nn
from torch.utils.data import DataLoader, Dataset
from torch.optim import AdamW

import timm
from timm.utils import ModelEmaV2
from transformers import get_cosine_schedule_with_warmup

import albumentations as A

import re
import pydicom

In [2]:
rd = '/kaggle/input/rsna-2024-lumbar-spine-degenerative-classification'

# Config

In [3]:
DENSE201_DIR = f'/kaggle/input/densenet-weights-for-rsna-2024/'
DENSE161_DIR = f'/kaggle/input/densenet161-weights-rsna2024/'
device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
N_WORKERS = os.cpu_count()
USE_AMP = True
SEED = 8620

IMG_SIZE = [512, 512]
IN_CHANS = 30
N_LABELS = 25
N_CLASSES = 3 * N_LABELS

N_FOLDS = 5

# MODEL_NAME = "tf_efficientnet_b4.ns_jft_in1k"
# DENSE_MODEL_NAME = "densenet201"
DENSE_MODEL_NAME = 'densenet161.tv_in1k'
BATCH_SIZE = 1

In [5]:
device = torch.device('cuda:0') if torch.cuda.is_available() else torch.device('cpu')
device

device(type='cuda', index=0)

In [6]:
df = pd.read_csv(f'{rd}/test_series_descriptions.csv')
df.head()

Unnamed: 0,study_id,series_id,series_description
0,44036939,2828203845,Sagittal T1
1,44036939,3481971518,Axial T2
2,44036939,3844393089,Sagittal T2/STIR


In [7]:
study_ids = list(df['study_id'].unique())

In [8]:
sample_sub = pd.read_csv(f'{rd}/sample_submission.csv')

In [9]:
LABELS = list(sample_sub.columns[1:])
LABELS

['normal_mild', 'moderate', 'severe']

In [10]:
CONDITIONS = [
    'spinal_canal_stenosis', 
    'left_neural_foraminal_narrowing', 
    'right_neural_foraminal_narrowing',
    'left_subarticular_stenosis',
    'right_subarticular_stenosis'
]

LEVELS = [
    'l1_l2',
    'l2_l3',
    'l3_l4',
    'l4_l5',
    'l5_s1',
]
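
The two lists above define the 25 labels per study (5 conditions x 5 levels). As a minimal sketch of how they combine into submission row IDs, the snippet below repeats the lists and uses the example study ID from the test CSV shown earlier. The variable names `study_id` and `row_ids` are illustrative only. The ordering (conditions in the outer loop, levels in the inner loop) mirrors the inference loop later in this notebook, where label `k` reads logits `3*k` to `3*k + 2`.

```python
CONDITIONS = [
    'spinal_canal_stenosis',
    'left_neural_foraminal_narrowing',
    'right_neural_foraminal_narrowing',
    'left_subarticular_stenosis',
    'right_subarticular_stenosis',
]
LEVELS = ['l1_l2', 'l2_l3', 'l3_l4', 'l4_l5', 'l5_s1']

# Example study ID taken from test_series_descriptions.csv above.
study_id = '44036939'

# Outer loop over conditions, inner loop over levels.
row_ids = [f'{study_id}_{cond}_{level}' for cond in CONDITIONS for level in LEVELS]

print(len(row_ids))
print(row_ids[0])
print(row_ids[5])
```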

In [11]:
def atoi(text):
    return int(text) if text.isdigit() else text

def natural_keys(text):
    return [ atoi(c) for c in re.split(r'(\d+)', text) ]
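
The helper above is used to order DICOM file paths by their numeric parts. A plain string sort would place `10.dcm` before `2.dcm`, which would scramble the slice order. Here is a small illustration with made-up file names (the names are assumptions, not files from the dataset):

```python
import re

def atoi(text):
    return int(text) if text.isdigit() else text

def natural_keys(text):
    # Split into digit and non-digit runs so numeric parts compare as integers.
    return [atoi(c) for c in re.split(r'(\d+)', text)]

# Hypothetical file names, for illustration only.
paths = ['10.dcm', '2.dcm', '1.dcm', '21.dcm']

lexicographic = sorted(paths)
natural = sorted(paths, key=natural_keys)

print(lexicographic)
print(natural)
```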

# Define Dataset

In [12]:
class RSNA24TestDataset(Dataset):
    def __init__(self, df, study_ids, phase='test', transform=None):
        self.df = df
        self.study_ids = study_ids
        self.transform = transform
        self.phase = phase
    
    def __len__(self):
        return len(self.study_ids)
    
    def get_img_paths(self, study_id, series_desc):
        pdf = self.df[self.df['study_id']==study_id]
        pdf_ = pdf[pdf['series_description']==series_desc]
        allimgs = []
        for i, row in pdf_.iterrows():
            pimgs = glob.glob(f'{rd}/test_images/{study_id}/{row["series_id"]}/*.dcm')
            # Sort the collected paths in natural (numeric-aware) order
            pimgs = sorted(pimgs, key=natural_keys)
            allimgs.extend(pimgs)
        # Return a list containing all image paths
        return allimgs
    
    def read_dcm_ret_arr(self, src_path):
        """
        Read a DICOM image file and return the processed image array.

        This method reads a DICOM medical image file, normalizes it, resizes it, and returns the processed image array.

        Args:
        src_path (str): Path to the DICOM image file.

        Returns:
        numpy.ndarray: The processed image array.
        """
        # Read the DICOM file with pydicom
        dicom_data = pydicom.dcmread(src_path)
        # Get the pixel array of the DICOM image
        image = dicom_data.pixel_array
        # Min-max normalize to the range [0, 255]
        image = (image - image.min()) / (image.max() - image.min() + 1e-6) * 255
        img = cv2.resize(image, (IMG_SIZE[0], IMG_SIZE[1]), interpolation=cv2.INTER_CUBIC)
        # Make sure the resized image has the expected shape
        assert img.shape==(IMG_SIZE[0], IMG_SIZE[1])
        return img

    def __getitem__(self, idx):
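        """
        Fetch a data sample by index.

        This method implements random access into the dataset. It picks the study at the given index
        and loads three types of images from it: Sagittal T1, Sagittal T2/STIR, and Axial T2.
        If a transform is configured, it is applied. Finally, the image data and the study ID are returned.

        Args:
        idx (int): Index into the dataset identifying the study to load.

        Returns:
        tuple: A two-element tuple:
               1. The formatted image data (numpy array).
               2. The unique identifier of the study (string).
        """
        # Initialize a zero array to hold the image data
        x = np.zeros((IMG_SIZE[0], IMG_SIZE[1], IN_CHANS), dtype=np.uint8)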
        st_id = self.study_ids[idx]        
        
        # Sagittal T1
        # Load the Sagittal T1 images
        allimgs_st1 = self.get_img_paths(st_id, 'Sagittal T1')
        if len(allimgs_st1)==0:
            print(st_id, ': Sagittal T1, has no images')
        
        else:
            # Compute the step size for sampling evenly across the image series
            step = len(allimgs_st1) / 10.0
            # Compute the sampling start point st and end point end
            st = len(allimgs_st1)/2.0 - 4.0*step
            end = len(allimgs_st1)+0.0001
            # Load and stack the images
            for j, i in enumerate(np.arange(st, end, step)):
                try:
                    ind2 = max(0, int((i-0.5001).round()))
                    # Read the DICOM image, cast it to np.uint8, and store it in the matching channel of x
                    img = self.read_dcm_ret_arr(allimgs_st1[ind2])
                    x[..., j] = img.astype(np.uint8)
                except Exception:
                    print(f'failed to load on {st_id}, Sagittal T1')
            
        # Sagittal T2/STIR
        # Load the Sagittal T2/STIR images
        allimgs_st2 = self.get_img_paths(st_id, 'Sagittal T2/STIR')
        if len(allimgs_st2)==0:
            print(st_id, ': Sagittal T2/STIR, has no images')
            
        else:
            step = len(allimgs_st2) / 10.0
            st = len(allimgs_st2)/2.0 - 4.0*step
            end = len(allimgs_st2)+0.0001
            # Load and stack the images into channels 10-19
            for j, i in enumerate(np.arange(st, end, step)):
                try:
                    ind2 = max(0, int((i-0.5001).round()))
                    img = self.read_dcm_ret_arr(allimgs_st2[ind2])
                    x[..., j+10] = img.astype(np.uint8)
                except Exception:
                    print(f'failed to load on {st_id}, Sagittal T2/STIR')
            
        # Axial T2
        # Load the Axial T2 images
        allimgs_at2 = self.get_img_paths(st_id, 'Axial T2')
        if len(allimgs_at2)==0:
            print(st_id, ': Axial T2, has no images')
            
        else:
            step = len(allimgs_at2) / 10.0
            st = len(allimgs_at2)/2.0 - 4.0*step
            end = len(allimgs_at2)+0.0001
            # Load and stack the images into channels 20-29
            for j, i in enumerate(np.arange(st, end, step)):
                try:
                    ind2 = max(0, int((i-0.5001).round()))
                    img = self.read_dcm_ret_arr(allimgs_at2[ind2])
                    x[..., j+20] = img.astype(np.uint8)
                except Exception:
                    print(f'failed to load on {st_id}, Axial T2')
            
            
        if self.transform is not None:
            x = self.transform(image=x)['image']

        x = x.transpose(2, 0, 1)
                
        return x, str(st_id)
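
To make the slice-sampling arithmetic in `__getitem__` easier to follow, here is a pure-Python re-creation of the index computation. It is only a sketch: the function name `sample_indices` is my own, and it replaces `np.arange` with an explicit count, assuming `np.arange(st, end, step)` yields `ceil((end - st) / step)` values. With `step = n / 10` and `st = n/2 - 4*step = 0.1*n`, the loop visits `0.1*n, 0.2*n, ..., n`, so each series contributes 10 slices spread over its whole length.

```python
import math

def sample_indices(n_slices):
    """Re-create the slice indices picked for a series with n_slices images."""
    step = n_slices / 10.0
    st = n_slices / 2.0 - 4.0 * step
    end = n_slices + 0.0001
    # Number of values np.arange(st, end, step) would produce.
    count = math.ceil((end - st) / step)
    # Same rounding as int((i - 0.5001).round()), clamped at zero.
    return [max(0, int(round(st + k * step - 0.5001))) for k in range(count)]

for n in (5, 10, 15):
    print(n, sample_indices(n))
```

For short series (fewer than 10 slices) the same slice is picked more than once, and for long series slices are skipped at a regular stride.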

In [13]:
transforms_test = A.Compose([
    A.Resize(IMG_SIZE[0], IMG_SIZE[1]),
    A.Normalize(mean=0.5, std=0.5)
])
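
For reference, `A.Normalize(mean=0.5, std=0.5)` with the default `max_pixel_value=255.0` maps 8-bit pixel values to roughly `[-1, 1]`. The sketch below reproduces the documented Albumentations formula, `(img - mean * max_pixel_value) / (std * max_pixel_value)`, on single values in plain Python. The helper name `normalize_pixel` is hypothetical and not part of Albumentations.

```python
def normalize_pixel(pixel, mean=0.5, std=0.5, max_pixel_value=255.0):
    # Same arithmetic as A.Normalize, applied to one scalar pixel value.
    return (pixel - mean * max_pixel_value) / (std * max_pixel_value)

for p in (0, 127.5, 255):
    print(p, normalize_pixel(p))
```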

In [14]:
test_ds = RSNA24TestDataset(df, study_ids, transform=transforms_test)
test_dl = DataLoader(
    test_ds, 
    batch_size=1, 
    shuffle=False,
    num_workers=N_WORKERS,
    pin_memory=True,
    drop_last=False
)

# Define Model

In [15]:
class RSNA24Model(nn.Module):
    def __init__(self, model_name, in_c=30, n_classes=75, pretrained=True, features_only=False):
        super().__init__()
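        # Create the requested model, configuring pretrained weights, feature-extraction mode, input channels, and number of classes.
        # The architecture is built with the timm library for this medical-imaging task.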
        self.model = timm.create_model(
                                    model_name,
                                    pretrained=pretrained, 
                                    features_only=features_only,
                                    in_chans=in_c,
                                    num_classes=n_classes,
                                    global_pool='avg'
                                    )
    
    def forward(self, x):
        y = self.model(x)
        return y

# Load Models

In [16]:
models = []

In [17]:
import glob
DENSE_CKPT_PATHS = glob.glob(f'{DENSE161_DIR}best_wll_model_fold-*.pt')
DENSE_CKPT_PATHS = sorted(DENSE_CKPT_PATHS)

In [18]:
for i, cp in enumerate(DENSE_CKPT_PATHS):
    print(f'loading {cp}...')
    model = RSNA24Model(DENSE_MODEL_NAME, IN_CHANS, N_CLASSES, pretrained=False)
    model.load_state_dict(torch.load(cp))
    model.eval()
    # Cast the model to half precision (float16) to speed up inference
    model.half()
    model.to(device)
    models.append(model)

loading /kaggle/input/densenet161-weights-rsna2024/best_wll_model_fold-0.pt...
loading /kaggle/input/densenet161-weights-rsna2024/best_wll_model_fold-1.pt...
loading /kaggle/input/densenet161-weights-rsna2024/best_wll_model_fold-2.pt...
loading /kaggle/input/densenet161-weights-rsna2024/best_wll_model_fold-3.pt...
loading /kaggle/input/densenet161-weights-rsna2024/best_wll_model_fold-4.pt...
loading /kaggle/input/densenet161-weights-rsna2024/best_wll_model_fold-5.pt...
loading /kaggle/input/densenet161-weights-rsna2024/best_wll_model_fold-6.pt...
loading /kaggle/input/densenet161-weights-rsna2024/best_wll_model_fold-7.pt...
loading /kaggle/input/densenet161-weights-rsna2024/best_wll_model_fold-8.pt...
loading /kaggle/input/densenet161-weights-rsna2024/best_wll_model_fold-9.pt...


# Inference loop

In [19]:
# Enable automatic mixed precision (AMP) for inference
autocast = torch.cuda.amp.autocast(enabled=USE_AMP, dtype=torch.half)
y_preds = []
row_names = []

with tqdm(test_dl, leave=True) as pbar:
    with torch.no_grad():
        for idx, (x, si) in enumerate(pbar):
            x = x.to(device)
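            # Initialize the array that accumulates this study's predictions
            pred_per_study = np.zeros((25, 3))

            # Build the row names used to organize and identify the results
            for cond in CONDITIONS:
                for level in LEVELS:
                    row_names.append(si[0] + '_' + cond + '_' + level)

            # Run the forward passes under automatic mixed precision
            with autocast:
                for m in models:
                    # Get the model's prediction
                    y = m(x)[0]
                    # For each label, take its 3 logits and compute softmax probabilities
                    for col in range(N_LABELS):
                        pred = y[col*3:col*3+3]
                        y_pred = pred.float().softmax(0).cpu().numpy()
                        pred_per_study[col] += y_pred / len(models)
                y_preds.append(pred_per_study)

# Concatenate all predictions into a single array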
y_preds = np.concatenate(y_preds, axis=0)

100%|██████████| 1/1 [00:03<00:00,  3.05s/it]
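
The ensembling in the loop above can be sketched without PyTorch: each model outputs 75 logits, every consecutive group of 3 is passed through a softmax, and the resulting probabilities are averaged over the models. The snippet below is a simplified pure-Python illustration. The function names and the two toy "models" (`model_a`, `model_b`) are made up for the example and are not real model outputs.

```python
import math

N_LABELS = 25  # 5 conditions x 5 levels, as in the Config cell

def softmax(logits):
    # Subtract the maximum for numerical stability.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_probs(per_model_logits):
    """Average per-label softmax probabilities over several models."""
    n_models = len(per_model_logits)
    probs = [[0.0, 0.0, 0.0] for _ in range(N_LABELS)]
    for logits in per_model_logits:
        for col in range(N_LABELS):
            p = softmax(logits[col * 3: col * 3 + 3])
            for c in range(3):
                probs[col][c] += p[c] / n_models
    return probs

# Two hypothetical models: one with no preference, one favoring
# 'normal_mild' for the first label only.
model_a = [0.0] * (3 * N_LABELS)
model_b = [2.0, 0.0, 0.0] + [0.0] * (3 * N_LABELS - 3)

probs = ensemble_probs([model_a, model_b])
print(len(probs), len(probs[0]))
print([round(p, 4) for p in probs[0]])
```

Because each model's softmax sums to 1 and the weights `1 / n_models` also sum to 1, every averaged row remains a valid probability distribution.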


# Make Submission

In [20]:
sub = pd.DataFrame()
sub['row_id'] = row_names
sub[LABELS] = y_preds
sub.head(25)

Unnamed: 0,row_id,normal_mild,moderate,severe
0,44036939_spinal_canal_stenosis_l1_l2,0.34353,0.405086,0.251384
1,44036939_spinal_canal_stenosis_l2_l3,0.143158,0.426515,0.430327
2,44036939_spinal_canal_stenosis_l3_l4,0.118542,0.326268,0.555191
3,44036939_spinal_canal_stenosis_l4_l5,0.28939,0.274641,0.435969
4,44036939_spinal_canal_stenosis_l5_s1,0.830504,0.097165,0.072331
5,44036939_left_neural_foraminal_narrowing_l1_l2,0.519408,0.450863,0.029729
6,44036939_left_neural_foraminal_narrowing_l2_l3,0.319323,0.575623,0.105054
7,44036939_left_neural_foraminal_narrowing_l3_l4,0.230049,0.516572,0.25338
8,44036939_left_neural_foraminal_narrowing_l4_l5,0.145875,0.459954,0.394171
9,44036939_left_neural_foraminal_narrowing_l5_s1,0.195298,0.375256,0.429446


In [21]:
sub.to_csv('submission.csv', index=False)
pd.read_csv('submission.csv').head()

Unnamed: 0,row_id,normal_mild,moderate,severe
0,44036939_spinal_canal_stenosis_l1_l2,0.34353,0.405086,0.251384
1,44036939_spinal_canal_stenosis_l2_l3,0.143158,0.426515,0.430327
2,44036939_spinal_canal_stenosis_l3_l4,0.118542,0.326268,0.555191
3,44036939_spinal_canal_stenosis_l4_l5,0.28939,0.274641,0.435969
4,44036939_spinal_canal_stenosis_l5_s1,0.830504,0.097165,0.072331


# Conclusion
Across this series of notebooks, we created the dataset, trained the model, and ran inference.

Handling the dataset in this competition is a bit complicated, so there may be a better way.

I think there are many other areas to improve in my notebook. I hope you can learn from my notebook and get a better score.