# Introduction
Welcome to the [VinBigData Chest X-ray Abnormalities Detection](https://www.kaggle.com/c/vinbigdata-chest-xray-abnormalities-detection/data) competition.

First of all, I would like to mention that I used https://www.kaggle.com/drcapa/chest-x-ray-starter as a reference to create this Notebook.  

There are some great explanations in past Kaggle Notebooks, so you may want to refer to those as well.
https://www.kaggle.com/zahaviguy/what-are-lung-opacities

I am a beginner in both Kaggle and English, so I would like to grow from now on.  


# Basic knowledge of X-ray
The degree to which X-rays are absorbed depends on the X-rays' wavelength and on the thickness, density, and atomic composition of the subject. When a living body is exposed to X-rays, the film shows different degrees of blackness depending on how much is absorbed. Bones appear whitest because they absorb X-rays and let few pass through, while the lung fields appear relatively black because the lungs contain a lot of air.  
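
The dependence on thickness and material is commonly summarized by the Beer-Lambert law, I = I0 * exp(-mu * x), where mu is the attenuation coefficient of the material and x is its thickness. The snippet below is only a rough sketch of that relationship: the coefficients in `MU_ILLUSTRATIVE` are made-up values chosen to show the trend, not measured data, and the function name is hypothetical.

```python
import math

def transmitted_fraction(mu_per_cm, thickness_cm):
    """Beer-Lambert law: fraction of the X-ray intensity left after a material."""
    return math.exp(-mu_per_cm * thickness_cm)

# Hypothetical attenuation coefficients (1/cm), for illustration only
MU_ILLUSTRATIVE = {"air-filled lung": 0.05, "soft tissue": 0.2, "bone": 0.5}

# Compare how much of the beam passes through 2 cm of each material
fractions = {name: transmitted_fraction(mu, 2.0)
             for name, mu in MU_ILLUSTRATIVE.items()}
for name, frac in fractions.items():
    print(f"{name}: {frac:.2f} of the beam passes through 2 cm")
```

The less radiation reaches the film, the whiter that region appears, which is why bone looks white and air-filled lung looks dark.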
**Note that the images in this competition include black-and-white inverted images in addition to standard ones.**
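
One possible way to handle inverted images is sketched below. It assumes that inverted files are marked with the DICOM `PhotometricInterpretation` value `MONOCHROME1` (I have not checked this for every file in the competition data), and the helper name `to_standard_polarity` is hypothetical. With pydicom, the value would be read from `data_file.PhotometricInterpretation`.

```python
import numpy as np

def to_standard_polarity(pixels, photometric_interpretation):
    """Return pixels so that higher values are brighter (MONOCHROME2 convention)."""
    pixels = np.asarray(pixels)
    if photometric_interpretation == "MONOCHROME1":
        # MONOCHROME1 displays the lowest value as white, so flip the range
        return pixels.max() - pixels
    return pixels

# Toy 2x2 "image" standing in for data_file.pixel_array
toy = np.array([[0, 100], [200, 4095]])
flipped = to_standard_polarity(toy, "MONOCHROME1")
unchanged = to_standard_polarity(toy, "MONOCHROME2")
```

After this step, bones should appear white and air should appear black in every image, which matches the descriptions in the sections that follow.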


# Table of Contents

In this Notebook, we will briefly discuss the following 15 classes ("No finding" plus 14 types of abnormality).
I have prioritized brevity and ease of understanding over medical precision. If anything is expressed incorrectly, I would appreciate your comments.

- [1. No finding](#1.-No-finding)
- [2. Aortic enlargement](#2.-Aortic-enlargement)
- [3. Atelectasis](#3.-Atelectasis)
- [4. Calcification](#4.-Calcification)
- [5. Cardiomegaly](#5.-Cardiomegaly)
- [6. Consolidation](#6.-Consolidation)
- [7. ILD](#7.-ILD)
- [8. Infiltration](#8.-Infiltration)
- [9. Lung Opacity](#9.-Lung-Opacity)
- [10. Nodule/Mass](#10.-Nodule/Mass)
- [11. Other lesion](#11.-Other-lesion)
- [12. Pleural effusion](#12.-Pleural-effusion)
- [13. Pleural thickening](#13.-Pleural-thickening)
- [14. Pneumothorax](#14.-Pneumothorax)
- [15. Pulmonary fibrosis](#15.-Pulmonary-fibrosis)

In [None]:
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.patches  # explicit import so matplotlib.patches.Rectangle is available
import seaborn as sns
import pydicom as dicom
import cv2

import warnings
warnings.filterwarnings("ignore")

In [None]:
path = '/kaggle/input/vinbigdata-chest-xray-abnormalities-detection/'
#os.listdir(path)

In [None]:
train_data = pd.read_csv(path+'train.csv')
samp_subm = pd.read_csv(path+'sample_submission.csv')

In [None]:
#print('Number train samples:', len(train_data.index))
#print('Number test samples:', len(samp_subm.index))

In [None]:
fig, ax = plt.subplots(1, 1, figsize=(12, 4))
# Count the annotations per class once and reuse the result
class_counts = train_data['class_name'].value_counts()
ax.bar(class_counts.index, class_counts.values)
# Rotate the class names so they do not overlap
ax.tick_params(axis='x', rotation=90)
ax.set_title('Distribution of the labels')
ax.grid()
plt.show()

# Show Examples

In [None]:
def plot_example(idx_list):
    """Plot three training images side by side, each with its bounding box."""
    fig, axs = plt.subplots(1, 3, figsize=(15, 10))
    fig.subplots_adjust(hspace=.1, wspace=.1)
    axs = axs.ravel()
    for i in range(3):
        # One row of train.csv corresponds to one annotation
        row = train_data.loc[idx_list[i]]
        # Read the DICOM file and take its raw pixel values
        data_file = dicom.dcmread(path+'train/'+row['image_id']+'.dicom')
        img = data_file.pixel_array
        axs[i].imshow(img, cmap='gray')
        axs[i].set_title(row['class_name'])
        axs[i].set_xticklabels([])
        axs[i].set_yticklabels([])
        # 'No finding' rows have no bounding box to draw
        if row['class_name'] != 'No finding':
            p = matplotlib.patches.Rectangle((row['x_min'], row['y_min']),
                                             row['x_max']-row['x_min'],
                                             row['y_max']-row['y_min'],
                                             ec='r', fc='none', lw=2.)
            axs[i].add_patch(p)

# 1. No finding

[Back to top](#Table-of-Contents)

No abnormal findings are present on the x-ray image.  
This is a normal image, and it serves as the baseline for telling abnormal images apart.

In [None]:
idx_list = train_data[train_data['class_id']==14][0:3].index.values
plot_example(idx_list)

# 2. Aortic enlargement

[Back to top](#Table-of-Contents)

Aortic enlargement is known as a sign of an aortic aneurysm.
It tends to occur in the ascending aorta.
In general, the term aneurysm is used when the axial diameter is >5.0 cm for the ascending aorta and >4.0 cm for the descending aorta.  
(Please compare the images below with the normal images.)

In [None]:
idx_list = train_data[train_data['class_id']==0][9:12].index.values
plot_example(idx_list)

# 3. Atelectasis
[Back to top](#Table-of-Contents)

Atelectasis is a condition in which part or all of a lung contains no air and has collapsed. A common cause of atelectasis is obstruction of the bronchi.  

In atelectasis, the affected area shows increased density on chest x-ray (usually whiter; blacker on black-and-white inverted images).  

In [None]:
idx_list = train_data[train_data['class_id']==1][0:3].index.values
plot_example(idx_list)

# 4. Calcification
[Back to top](#Table-of-Contents)

Many diseases or conditions can cause calcification on chest x-ray. Calcium may be deposited in areas where previous inflammation of the lungs or pleura has healed, in the aorta due to atherosclerosis, or in mediastinal lymph nodes.  

On the image, calcification is characterized by a density similar to that of bone.  

In [None]:
idx_list = train_data[train_data['class_id']==2][0:3].index.values
plot_example(idx_list)

# 5. Cardiomegaly
[Back to top](#Table-of-Contents)

Cardiomegaly (an enlarged heart) can be caused by many conditions, including hypertension, coronary artery disease, infections, inherited disorders, and cardiomyopathies.  

Cardiomegaly is usually diagnosed when the ratio of the heart's width to the width of the chest is more than 50%. This diagnostic criterion may be an essential basis for this competition.  

*Comment from readers*  
**The criterion of a heart-to-lung ratio > 0.5 for diagnosing cardiomegaly is only valid if the X-ray is taken while the patient is standing. If the patient is sitting or lying in bed, this criterion cannot be used.
A useful way to check is to look for air in the stomach: if there is none, the patient is not standing.**  
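
The 50% criterion can be written as a small calculation. This is only a sketch: the function names and the pixel widths are hypothetical, and the dataset does not provide the chest width directly, so it would have to be measured or estimated separately (the width of a Cardiomegaly bounding box could perhaps stand in for the heart width, which is an assumption on my part).

```python
def cardiothoracic_ratio(heart_width, chest_width):
    """Ratio of the heart's width to the chest's width (same unit for both)."""
    if chest_width <= 0:
        raise ValueError("chest_width must be positive")
    return heart_width / chest_width

def suggests_cardiomegaly(heart_width, chest_width, threshold=0.5):
    """True when the ratio exceeds the usual 50% criterion."""
    return cardiothoracic_ratio(heart_width, chest_width) > threshold

# Hypothetical widths in pixels, for illustration only
ctr = cardiothoracic_ratio(1100, 2000)
print(f"ratio = {ctr:.2f}, suggests cardiomegaly: {suggests_cardiomegaly(1100, 2000)}")
```

As the readers' comment points out, this check only makes sense for images taken with the patient standing.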


In [None]:
idx_list = train_data[train_data['class_id']==3][0:3].index.values
plot_example(idx_list)

# 6. Consolidation
[Back to top](#Table-of-Contents)

Consolidation is formally referred to as air space consolidation. It is a decrease in the lung's transparency to X-rays, caused by fluid, cells, or tissue replacing the air in the alveoli.

On X-rays, the density of the lung field is increased and the pulmonary blood vessels are no longer visible, but black bronchi can be seen against the white background; this is called an "air bronchogram." Because air remains in the bronchi, they do not absorb X-rays and appear black, so black and white are reversed relative to normal lung fields.  

In [None]:
idx_list = train_data[train_data['class_id']==4][0:3].index.values
plot_example(idx_list)

# 7. ILD
[Back to top](#Table-of-Contents)

ILD stands for "Interstitial Lung Disease."  
Interstitial lung disease is a general term for many conditions in which the interstitial space is injured. The interstitial space refers to the walls of the alveoli (air sacs in the lungs) and the space around the blood vessels and small airways.

Chest radiographic findings include ground-glass opacities (i.e., an area of hazy opacification), linear reticular shadows, and granular shadows.  

In [None]:
idx_list = train_data[train_data['class_id']==5][0:3].index.values
plot_example(idx_list)

# 8. Infiltration
[Back to top](#Table-of-Contents)

Infiltration of a fluid component into the alveoli produces an infiltrative shadow (infiltration).  

It is difficult, and in some cases impossible, to distinguish from consolidation.  
Please see https://allnurses.com/consolidation-vs-infiltrate-vs-opacity-t483538/.  

In [None]:
idx_list = train_data[train_data['class_id']==6][0:3].index.values
plot_example(idx_list)

# 9. Lung Opacity
[Back to top](#Table-of-Contents)

Lung opacity is defined as any area in the chest radiograph that is whiter than it should be.  
Lung opacity is a loose term.
Please see https://www.kaggle.com/zahaviguy/what-are-lung-opacities.

It is difficult, and in some cases impossible, to distinguish from infiltration or consolidation.
Please see https://allnurses.com/consolidation-vs-infiltrate-vs-opacity-t483538/.

(Infiltration, consolidation, and opacity will vary depending on who labels them, so I personally question how meaningful it is to distinguish and predict them accurately.)
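
One way to look at this labeler dependence is to compare how each reader distributes their annotations across the three similar classes. The sketch below uses a small made-up table; it assumes the annotation table has a `rad_id` column identifying the labeler, which is an assumption about the file layout, and the rows are invented for illustration.

```python
import pandas as pd

# Made-up annotations; 'rad_id' is assumed to identify the labeler
toy = pd.DataFrame({
    "rad_id": ["R1", "R1", "R2", "R2", "R2", "R3"],
    "class_name": ["Infiltration", "Lung Opacity", "Consolidation",
                   "Lung Opacity", "Lung Opacity", "Infiltration"],
})

similar = ["Consolidation", "Infiltration", "Lung Opacity"]
subset = toy[toy["class_name"].isin(similar)]

# Share of each class among one labeler's annotations (each row sums to 1)
per_reader = pd.crosstab(subset["rad_id"], subset["class_name"], normalize="index")
print(per_reader.round(2))
```

If the rows of such a table differ a lot between labelers on the real data, that would support the concern above.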

In [None]:
idx_list = train_data[train_data['class_id']==7][0:3].index.values
plot_example(idx_list)

# 10. Nodule/Mass
[Back to top](#Table-of-Contents)

A nodule/mass is a round opacity that appears on a chest X-ray image; it is usually called a nodule when it is less than 3 cm in diameter and a mass when it is larger. It can be seen in primary lung cancer, metastases from cancers elsewhere in the body (such as colon or kidney cancer), tuberculosis, pulmonary mycosis, non-tuberculous mycobacterial infection, old (healed) pneumonia, and benign tumors.
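
The 3 cm size can be related to a bounding box if the physical size of a pixel is known. The following is a rough sketch under several assumptions: the helper names are hypothetical, the pixel spacing (in mm) is passed in by hand and would need to come from the DICOM metadata if it is available, the example numbers are invented, and the longest side of the box is used as a crude stand-in for the diameter.

```python
def lesion_diameter_cm(x_min, y_min, x_max, y_max, row_spacing_mm, col_spacing_mm):
    """Approximate a lesion's diameter by the longest side of its bounding box."""
    width_mm = (x_max - x_min) * col_spacing_mm
    height_mm = (y_max - y_min) * row_spacing_mm
    return max(width_mm, height_mm) / 10.0

def size_category(diameter_cm, threshold_cm=3.0):
    """Below the threshold is usually called a nodule, otherwise a mass."""
    return "nodule" if diameter_cm < threshold_cm else "mass"

# Hypothetical box (pixels) and pixel spacing (mm per pixel)
d = lesion_diameter_cm(1000, 800, 1150, 920, row_spacing_mm=0.14, col_spacing_mm=0.14)
print(f"about {d:.1f} cm: {size_category(d)}")
```

Sizes measured on a radiograph are also affected by magnification, so a value like this is only an estimate.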

In [None]:
idx_list = train_data[train_data['class_id']==8][0:3].index.values
plot_example(idx_list)

# 11. Other lesion
[Back to top](#Table-of-Contents)

"Other lesion" covers all abnormalities that do not fall into any other category. Examples include bone penetrating images, fractures, and subcutaneous emphysema.

In [None]:
idx_list = train_data[train_data['class_id']==9][0:3].index.values
plot_example(idx_list)

# 12. Pleural effusion
[Back to top](#Table-of-Contents)

Pleural effusion is the accumulation of fluid outside the lungs in the chest cavity. The lungs are located in the chest cavity, surrounded by the chest wall. The outside of the lungs is covered by a thin membrane called the pleura. The pleura consists of two layers, one on the chest wall (parietal pleura) and the other covering the lungs (visceral pleura). The fluid that accumulates between these layers is called pleural effusion.  

The findings of pleural effusion vary widely. They also vary depending on whether the radiograph is taken in the upright or supine position.  
The most common findings are as follows: **elevation of the diaphragm on one side, flattening of the diaphragm, or blunting of the angle between the ribs and the diaphragm (typically to more than 30 degrees)**.

In [None]:
idx_list = train_data[train_data['class_id']==10][0:3].index.values
plot_example(idx_list)

# 13. Pleural thickening
[Back to top](#Table-of-Contents)

The pleura is the membrane that covers the lungs, and the change in the thickness of the pleura is called pleural thickening. It is often seen in the uppermost part of the lung field (the apex of the lung).  

"It typically involves the apex of the lung, which is called ‘pulmonary apical cap’. On chest X-rays, the apical cap is an irregular density located at the extreme apex and is less than 5 mm in width".  
*Saito, A. et al. Pleural thickening on screening chest X-rays: a single institutional study. Respir Res 20, 138 (2019). https://doi.org/10.1186/s12931-019-1116-9*

In [None]:
idx_list = train_data[train_data['class_id']==11][0:3].index.values
plot_example(idx_list)

# 14. Pneumothorax
[Back to top](#Table-of-Contents)

A pneumothorax is a condition in which air leaks from the lungs and accumulates in the chest cavity. When air leaks and accumulates there, the chest cannot expand outward like a balloon because of the ribs. Instead, the lung is pushed by the air and becomes smaller. In other words, a pneumothorax is a situation where air leaks from the lungs and the lungs become smaller (collapsed).  

In a chest radiograph of a pneumothorax, the collapsed lung is whiter than normal, and the area where the lung is gone is uniformly black. In addition, the edge of the lung may appear as a line.



In [None]:
idx_list = train_data[train_data['class_id']==12][1:4].index.values
plot_example(idx_list)

# 15. Pulmonary fibrosis
[Back to top](#Table-of-Contents)

Pulmonary fibrosis is a condition in which inflammation of the lung interstitium, from various causes, leads to thickening and hardening of the walls, fibrosis, and scarring.  

The fibrotic areas lose their air content, which often results in dense cord shadows or granular shadows.

In [None]:
idx_list = train_data[train_data['class_id']==13][3:6].index.values
plot_example(idx_list)

## Thank you for reading my Notebook
(Please feel free to upvote.)