# SIIM: Step-by-Step Image Detection for Beginners
## Part 4(mini). Multi-Output Regression

👉 Part 1. [EDA to Preprocessing](https://www.kaggle.com/songseungwon/siim-covid-19-detection-10-step-tutorial-1)

👉 Part 2. [Basic Modeling - Simplest Image Classification Models using Keras](https://www.kaggle.com/songseungwon/siim-covid-19-detection-10-step-tutorial-2)

👉 Part 3(mini). [Preprocessing for Multi-Output Regression that Detect Opacities](https://www.kaggle.com/songseungwon/siim-covid-19-detection-mini-part-preprocess)

> index
```
Step 1. Load Train Data Table
     1-a. extract data with only one opacity
     1-b. extract image paths
Step 2. Load Image Dataset
     2-a. Data Preprocessing
Step 3. Modeling
     3-a. Train-valid split
     3-b. Modeling
     3-c. Training
     3-d. Evaluation
```


This model is trained only on images that contain exactly one opacity.

The input X holds the images, and the target Y holds four values per image: the coordinates of the bounding box drawn around the opacity.

Because the model predicts all four values at once, this simple setup is a small introduction to multi-output regression.
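
To make the data layout concrete, here is a minimal sketch with synthetic arrays. The names `X_demo` and `Y_demo` and the random values are placeholders for illustration, not the real SIIM data; only the shapes mirror what this notebook builds later.

```python
import numpy as np

# Synthetic stand-ins (not the real dataset): 8 grayscale images of
# 256x256 pixels, and one row of 4 box coordinates per image.
rng = np.random.default_rng(0)
X_demo = rng.random((8, 256, 256))
Y_demo = rng.random((8, 4)) * 256

# One sample = one image paired with one 4-value target row.
print(X_demo[0].shape)  # (256, 256)
print(Y_demo[0].shape)  # (4,)
```

Each row of the target has four numbers, so the model needs four output units instead of one.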

## Step 1. Load Train Data Table

In [None]:
import pandas as pd

In [None]:
train_df = pd.read_csv('../input/siim-covid19-preprocessed-datasettrain/train_full_info.csv')
train_df.head()

In [None]:
train_df.drop(columns=['Unnamed: 0'], inplace=True)

### 1-a. extract data with only one opacity

In [None]:
train_df[train_df.OpacityCount == 1]

### 1-b. extract image paths

In [None]:
train_df[train_df.OpacityCount == 1]['path']

In [None]:
img_path_array = train_df[train_df.OpacityCount == 1]['path'].values
img_path_array[:5]
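
The same filter `OpacityCount == 1` is typed out several times in this notebook. As a hedged sketch, the filtered table could be stored once and reused. The toy rows and the names `toy_df`, `one_opacity`, and `toy_paths` are made up for illustration; only the column names `path` and `OpacityCount` come from the table above.

```python
import pandas as pd

# Toy table with made-up rows; only the column names match the notebook.
toy_df = pd.DataFrame({
    "path": ["a.png", "b.png", "c.png", "d.png"],
    "OpacityCount": [1, 2, 1, 0],
})

# Filter once, then reuse the result for every later selection.
one_opacity = toy_df[toy_df["OpacityCount"] == 1]
toy_paths = one_opacity["path"].to_list()

print(toy_paths)  # ['a.png', 'c.png']
```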

## Step 2. Load Image Dataset

In [None]:
import matplotlib.pyplot as plt
import numpy as np

In [None]:
plt.imread(img_path_array[1]).shape

In [None]:
np.empty((256,256),dtype=int)

In [None]:
len(img_path_array)

In [None]:
imgs = []
total = len(img_path_array)
for i, path in enumerate(img_path_array, start=1):
    imgs.append(plt.imread(path))
    # Report progress every 100 images and once more when the last image is loaded.
    if i == total:
        print('{}/{} - done!'.format(i, total))
    elif i % 100 == 0:
        print('{}/{}'.format(i, total))

In [None]:
X_train = np.array(imgs)
X_train.shape

In [None]:
X_train = X_train[:,:,:,np.newaxis]
X_train.shape
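
The indexing with `np.newaxis` adds a trailing channel axis, which the convolution layers used later expect. A minimal sketch on a dummy array (the name `batch` and the zeros are placeholders, assuming the loaded images are 2D grayscale arrays):

```python
import numpy as np

# Dummy batch: 5 grayscale images, no channel axis yet.
batch = np.zeros((5, 256, 256))

# Append a channel axis of length 1 at the end.
batch_with_channel = batch[:, :, :, np.newaxis]

print(batch.shape)               # (5, 256, 256)
print(batch_with_channel.shape)  # (5, 256, 256, 1)
```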

### 2-a. Data Preprocessing

In [None]:
train_df[train_df.OpacityCount==1].iloc[:,-4:].apply(lambda x : x.str.strip('[]'))

In [None]:
# Strip the surrounding brackets from the last four columns, then convert them to float.
remove_brk_y = train_df[train_df.OpacityCount==1].iloc[:,-4:].apply(lambda x : x.str.strip('[]'))
remove_brk_y = remove_brk_y.astype('float')

In [None]:
remove_brk_y.info()

In [None]:
Y_train = np.array(remove_brk_y)
Y_train.shape
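
The bracket stripping above works because `str.strip('[]')` removes any `[` or `]` characters from both ends of a string. A small plain-Python sketch of the same idea, using made-up values (the strings in `raw_values` are hypothetical, not taken from the dataset):

```python
# Hypothetical cell contents: one number wrapped in brackets, stored as text.
raw_values = ["[789.28]", "[582.43]", "[1026.65]"]

# Strip the brackets from both ends, then convert the remaining text to float.
parsed = [float(s.strip("[]")) for s in raw_values]

print(parsed)  # [789.28, 582.43, 1026.65]
```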

## Step 3. Modeling

### 3-a. Train-valid split

In [None]:
from sklearn.model_selection import train_test_split
X_train, X_valid, Y_train, Y_valid = train_test_split(
    X_train, Y_train, test_size=0.3, random_state=42)

In [None]:
print('Shape of X_train : ', X_train.shape)
print('Shape of Y_train : ', Y_train.shape)
print('Shape of X_valid : ', X_valid.shape)
print('Shape of Y_valid : ', Y_valid.shape)
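
To show what a shuffled 70/30 split does to paired arrays, here is a rough NumPy-only sketch. It is not scikit-learn's implementation, and the names `X_toy`, `Y_toy`, and `order` are placeholders; it only illustrates that images and targets are shuffled with the same indices so each pair stays together.

```python
import numpy as np

n_samples = 10
X_toy = np.arange(n_samples).reshape(n_samples, 1)  # stand-in for images
Y_toy = X_toy * 10                                   # stand-in for targets

# Shuffle the sample indices once, then cut them into two groups.
rng = np.random.default_rng(42)
order = rng.permutation(n_samples)
n_valid = round(0.3 * n_samples)  # roughly 30% for validation
valid_idx, train_idx = order[:n_valid], order[n_valid:]

# Index X and Y with the same indices so pairs stay aligned.
X_tr, X_va = X_toy[train_idx], X_toy[valid_idx]
Y_tr, Y_va = Y_toy[train_idx], Y_toy[valid_idx]

print(X_tr.shape, X_va.shape)  # (7, 1) (3, 1)
```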


### 3-b. Modeling

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dropout, Dense

model = Sequential([
    Conv2D(16,(3,3),activation='relu',input_shape=(256,256,1)),
    MaxPooling2D(2,2),
    Conv2D(32,(3,3),activation='relu'),
    MaxPooling2D(2,2),
    Conv2D(64,(3,3),activation='relu'),
    MaxPooling2D(2,2),
    Conv2D(128,(3,3),activation='relu'),
    MaxPooling2D(2,2),
    Flatten(),
    Dropout(0.5),
    Dense(128,activation='relu'),
    Dense(32,activation='relu'),
    Dense(4,activation='linear')
])

In [None]:
model.summary()
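
The spatial sizes reported by the summary can be checked by hand. The sketch below assumes the Keras defaults used above: `Conv2D` with a 3x3 kernel, stride 1, and no padding, and `MaxPooling2D` with a 2x2 window and stride 2. The helper names `conv_out` and `pool_out` are made up for this illustration.

```python
def conv_out(size, kernel=3):
    # 'valid' convolution with stride 1 shrinks each side by kernel - 1.
    return size - kernel + 1

def pool_out(size, pool=2):
    # 2x2 max pooling with stride 2 halves each side, rounding down.
    return size // pool

size = 256
for filters in (16, 32, 64, 128):
    size = pool_out(conv_out(size))
    print(filters, size)
# 16 127
# 32 62
# 64 30
# 128 14

# Flatten turns the last feature map (size x size x 128) into one vector.
flat_units = size * size * 128
print(flat_units)  # 25088
```

Most of the trainable parameters therefore sit in the first `Dense` layer, which connects 25088 inputs to 128 units.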

In [None]:
model.compile(optimizer='adam',loss='mse',metrics=['mae'])
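
With four outputs, the `mse` loss and the `mae` metric are averaged over all four coordinates. A small worked example in NumPy, using made-up box values (`y_true` and `y_pred` are hypothetical numbers, not model output):

```python
import numpy as np

# One made-up target box and one made-up prediction, 4 values each.
y_true = np.array([[100.0, 100.0, 200.0, 200.0]])
y_pred = np.array([[110.0, 90.0, 205.0, 200.0]])

errors = y_pred - y_true           # [[10, -10, 5, 0]]

mse = np.mean(errors ** 2)         # (100 + 100 + 25 + 0) / 4
mae = np.mean(np.abs(errors))      # (10 + 10 + 5 + 0) / 4

print(mse)  # 56.25
print(mae)  # 6.25
```

MAE stays in the same units as the targets, so it is easier to read than MSE when judging how far off a predicted box is.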

In [None]:
from tensorflow.keras.callbacks import ModelCheckpoint

In [None]:
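# Note: newer Keras releases (Keras 3) expect weights-only checkpoints to end in
# '.weights.h5'; rename the file if ModelCheckpoint rejects this path.
filepath = 'my_checkpoint.ckpt'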
cp = ModelCheckpoint(
    filepath=filepath,
    save_weights_only=True,
    save_best_only=True,
    monitor='val_loss',
    verbose=1
)
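
Conceptually, `save_best_only=True` with `monitor='val_loss'` means the weights are written only when the validation loss improves on the best value seen so far. The plain-Python sketch below imitates that rule on made-up loss values; it is an illustration of the idea, not the Keras implementation, and `val_losses` and `saved_epochs` are hypothetical names.

```python
# Made-up validation losses for five epochs.
val_losses = [9.0, 7.5, 8.1, 6.9, 7.0]

best = float("inf")
saved_epochs = []
for epoch, val_loss in enumerate(val_losses, start=1):
    # A checkpoint would be written only when val_loss improves.
    if val_loss < best:
        best = val_loss
        saved_epochs.append(epoch)

print(saved_epochs)  # [1, 2, 4]
print(best)          # 6.9
```

This is why loading the checkpoint after training restores the weights from the best epoch rather than the last one.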

### 3-c. Training

In [None]:
model.fit(
    X_train, Y_train,
    validation_data=(X_valid,Y_valid),
    epochs=12,
    callbacks=[cp]
)

### 3-d. Evaluation

In [None]:
model.load_weights(filepath)
model.evaluate(X_valid, Y_valid)

---
If there are any mistakes, please feel free to give feedback! Thank you!