## Introduction

In general, a stethoscope is an acoustic medical instrument that doctors use to diagnose problems in the heart or lungs.

However, with the rise of deep learning in image processing and audio processing, I believe a deep learning model can reach an acceptable level of accuracy in disease diagnosis. If we get to that state, the stethoscope can help patients, or anyone who wants to monitor their health, and inform the doctor when something anomalous is found. It might also be very helpful during a pandemic, when we need to limit contact as much as possible.

## Technical Discussion

I've played with this dataset and found it to be very imbalanced: there are many subjects with COPD and only a few samples of the other classes. I first tried to classify the disease itself. Even though the overall accuracy was quite high (~95%), some classes had very low accuracy. That made me change the approach and classify healthy/unhealthy first.
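The gap between overall accuracy and per-class accuracy can be illustrated with a small sketch. The label counts below are hypothetical and are not taken from this dataset; the "classifier" simply predicts the majority class every time.

```python
from collections import Counter

# Hypothetical labels: 95 majority-class samples and 5 minority-class samples.
y_true = ["COPD"] * 95 + ["Healthy"] * 5
# A degenerate classifier that always predicts the majority class.
y_pred = ["COPD"] * 100

# Overall accuracy: fraction of all samples predicted correctly.
overall_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Per-class recall: fraction of each class's samples predicted correctly.
class_totals = Counter(y_true)
class_hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
per_class_recall = {c: class_hits[c] / n for c, n in class_totals.items()}

print(overall_accuracy)   # 0.95
print(per_class_recall)   # {'COPD': 1.0, 'Healthy': 0.0}
```

A model can therefore look strong on total accuracy while being useless for the rare classes, which is why the per-class numbers matter here.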

I use xresnet18 to classify spectrogram images, carefully stratifying the classes across the train/valid datasets and using oversampling to handle the imbalance. The code is quite neat thanks to fast.ai.

In [None]:
! pip install -Uqq fastai
! pip install -qq torchaudio==0.7.0
! pip install -qq librosa

In [None]:
from fastai.vision.all import *
import torchaudio
import pathlib
import librosa
from IPython.display import Audio
import librosa.display
import numpy as np
import pandas as pd
import os
import random
import warnings
from collections import Counter
import matplotlib.pyplot as plt
import seaborn as sns

warnings.filterwarnings("ignore")
%config Completer.use_jedi = False

Helper methods for getting the audio files, getting the labels, and configuring the audio processing

In [None]:
mypath = "../input/respiratory-sound-database/Respiratory_Sound_Database/Respiratory_Sound_Database/audio_and_txt_files/"
filenames = get_files(mypath, extensions='.wav')
filenames

In [None]:
p_diag = pd.read_csv("../input/respiratory-sound-database/Respiratory_Sound_Database/Respiratory_Sound_Database/patient_diagnosis.csv",header=None) # patient diagnosis file
p_diag.head()

In [None]:
# configuration for audio processing
n_fft=1024
hop_length=256
target_rate=44100
num_samples=int(target_rate)
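As a rough sanity check on these settings, the arithmetic below works out the time and frequency resolution they imply. It assumes a centered STFT (which I believe is the torchaudio default) and the two-second clip length that `get_x` uses later; the variable names `clip_samples`, `n_frames`, and so on are only for illustration.

```python
n_fft = 1024
hop_length = 256
target_rate = 44100
clip_samples = 2 * target_rate  # two-second clip, matching num_samples*2 in get_x

# Number of STFT frames for a centered transform: 1 + floor(samples / hop).
n_frames = 1 + clip_samples // hop_length

# Width of one linear-frequency bin before the mel filterbank is applied.
freq_resolution_hz = target_rate / n_fft

# Duration covered by one analysis window and by one hop.
window_ms = 1000 * n_fft / target_rate
hop_ms = 1000 * hop_length / target_rate

print(n_frames)                        # 345
print(round(freq_resolution_hz, 2))    # 43.07
print(round(window_ms, 2), round(hop_ms, 2))  # 23.22 5.8
```

With 256 mel bands, each two-second clip would then become a spectrogram of roughly 256 x 345 values before resizing.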

In [None]:
## Label each sample as Healthy or Unhealthy based on the patient's diagnosis
def get_y(path):
    # The first three characters of the file name are the patient ID
    disease = p_diag[p_diag[0] == int(path.stem[:3])][1].values[0]
    return "Healthy" if disease == "Healthy" else "Unhealthy"

In [None]:
## Collect all audio files. Only files sampled at 44100 Hz are kept, because resampling takes too much time :(
def get_items(path): 
    fns = [fn for fn in get_files(path, extensions='.wav') if torchaudio.load_wav(fn)[1] == target_rate]
    return fns

In [None]:
## Helper transforms to convert an audio array to a mel spectrogram in decibels
au2spec = torchaudio.transforms.MelSpectrogram(sample_rate=target_rate,n_fft=n_fft, hop_length=hop_length, n_mels=256)
ampli2db = torchaudio.transforms.AmplitudeToDB()

In [None]:
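def get_x(path, target_rate=target_rate, num_samples=num_samples*2):
    # Load the waveform; resample only if the file is not already at target_rate
    x, rate = torchaudio.load_wav(path)
    if rate != target_rate:
        x = torchaudio.transforms.Resample(orig_freq=rate, new_freq=target_rate, resampling_method='sinc_interpolation')(x)
    # Keep the first channel and scale 16-bit sample values to roughly [-1, 1]
    x = x[0] / 32768
    x = x.numpy()
    sample_total = x.shape[0]
    # Pick a random start, skipping the first second and leaving three seconds at the end;
    # max() keeps the range valid for clips shorter than four seconds
    randstart = random.randint(target_rate, max(target_rate, sample_total - target_rate*3))
    x = x[randstart:num_samples+randstart]
    # Zero-pad (or trim) to exactly num_samples samples
    x = librosa.util.fix_length(x, size=num_samples)
    torch_x = torch.tensor(x)
    # Mel spectrogram in decibels
    spec = au2spec(torch_x)
    spec_db = ampli2db(spec)
    spec_db = spec_db.data.squeeze(0).numpy()
    # Min-max scale to the 0-255 range used for images
    spec_db = spec_db - spec_db.min()
    spec_db = spec_db/spec_db.max()*255
    return spec_db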
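The two array manipulations inside `get_x` (taking a fixed-length random window and rescaling to the 0-255 range) can be sketched in isolation with NumPy. This is a simplified illustration, not the notebook's exact logic: the hypothetical helpers `random_crop_or_pad` and `to_image_range` crop anywhere in the clip instead of skipping the first and last seconds, and they guard against a constant input.

```python
import numpy as np

def random_crop_or_pad(x, num_samples, rng):
    """Return exactly num_samples values: a random crop if x is long, zero-padded if short."""
    if len(x) > num_samples:
        start = int(rng.integers(0, len(x) - num_samples + 1))
        return x[start:start + num_samples]
    return np.pad(x, (0, num_samples - len(x)))

def to_image_range(spec_db):
    """Min-max scale an array to [0, 255]; a constant array maps to all zeros."""
    shifted = spec_db - spec_db.min()
    peak = shifted.max()
    return shifted / peak * 255 if peak > 0 else shifted

rng = np.random.default_rng(0)
long_clip = np.arange(10, dtype=float)
short_clip = np.ones(3)

cropped = random_crop_or_pad(long_clip, 4, rng)   # 4 consecutive values from long_clip
padded = random_crop_or_pad(short_clip, 4, rng)   # [1., 1., 1., 0.]
scaled = to_image_range(np.array([[-80.0, -40.0], [-20.0, 0.0]]))
print(cropped.shape, padded, scaled.min(), scaled.max())
```

Because the crop start is random, every call to `get_x` on the same file can yield a different window, which acts as a light form of data augmentation.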

In [None]:
## Getting all files and labels
items = get_items(mypath)
labels = [get_y(item) for item in items]
Counter(labels)

The label counts above confirm that the dataset is very imbalanced.

The `TrainTestSplitter` below stratifies by label, which guarantees that the classes are split in the same proportion between the training and validation sets and avoids the problem of ending up with only Unhealthy samples in the validation set. It wraps scikit-learn's `train_test_split`; details can be found here: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html


In [None]:
test_size=0.3
splitter = TrainTestSplitter(test_size=test_size, random_state=42, stratify=labels)
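To show what stratification does, here is a minimal standard-library sketch of a stratified split. It is not scikit-learn's implementation; the helper `stratified_split` and the label counts are hypothetical and only illustrate the idea of splitting each class separately.

```python
import random
from collections import Counter, defaultdict

def stratified_split(labels, test_size=0.3, seed=42):
    """Return (train_idx, valid_idx) with roughly the same class ratio in both."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, valid_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Split each class on its own; keep at least one validation sample per class.
        n_valid = max(1, round(len(idxs) * test_size))
        valid_idx.extend(idxs[:n_valid])
        train_idx.extend(idxs[n_valid:])
    return sorted(train_idx), sorted(valid_idx)

# Hypothetical, heavily imbalanced label list.
labels = ["Unhealthy"] * 90 + ["Healthy"] * 10
train_idx, valid_idx = stratified_split(labels)
valid_counts = Counter(labels[i] for i in valid_idx)
print(valid_counts)  # Counter({'Unhealthy': 27, 'Healthy': 3})
```

A purely random split of the same list could, by chance, put very few or no Healthy samples in the validation set, which would make the validation metrics meaningless for that class.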

In [None]:
db = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_x=get_x,
    get_y=get_y,
    splitter=splitter,
    item_tfms=[Resize(256)])

In [None]:
dsets = db.datasets(items)

In [None]:
dsets

To overcome the class imbalance, we weight each sample's probability of appearing in a batch using `WeightedDL`.

In [None]:
count = Counter(labels)
wgts = [1/count[dsets.vocab[label]] for img, label in dsets.train]
wgts[:10]

Each weight in the list above is proportional to the probability that the corresponding file will appear in a batch.
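The effect of inverse-frequency weights can be checked with a small standard-library simulation. The label counts are hypothetical, and the sketch assumes sampling with replacement; it does not use fastai's loader, it only reproduces the weighting idea.

```python
import random
from collections import Counter

# Hypothetical imbalanced label list (9:1).
labels = ["Unhealthy"] * 900 + ["Healthy"] * 100
count = Counter(labels)

# One weight per sample: the inverse of its class frequency.
weights = [1 / count[label] for label in labels]

# After normalization, each class carries the same total probability mass.
total = sum(weights)
healthy_mass = sum(w for w, label in zip(weights, labels) if label == "Healthy") / total

# Draw samples with replacement according to the weights.
rng = random.Random(0)
drawn = rng.choices(labels, weights=weights, k=10_000)
healthy_fraction = drawn.count("Healthy") / len(drawn)

print(round(healthy_mass, 3))      # 0.5
print(round(healthy_fraction, 2))  # close to 0.5
```

Rare-class files are drawn more often than common-class files, so they repeat across batches; this is oversampling rather than the creation of new data.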

In [None]:
dls = db.dataloaders(items, num_workers=2, dl_type=WeightedDL, wgts=wgts)

To confirm that the classes are balanced within each batch, we draw one batch and check that the class distribution is roughly equal.

In [None]:
x, y = dls.one_batch()

In [None]:
sum(y)/len(y)
## Fraction of samples with label index 1; ~50% => the batch is balanced, we are fine here

In [None]:
dls.show_batch()

In [None]:
## We use xresnet18 as the model
learn = cnn_learner(dls, xresnet18, metrics=error_rate)

In [None]:
## Train the model
learn.fine_tune(10)

In [None]:
learn.show_results()

In [None]:
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(figsize=(12,12), dpi=60)