<a href="https://colab.research.google.com/github/KKAARRIIMM15/Karim-El-deeb/blob/main/CovidCough.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**This project shows how audio processing can be used in complex medical diagnoses.**
Recorded audio signals can carry information that a human listener cannot perceive. The human auditory system typically detects frequencies between about 20 Hz and 20,000 Hz (20 kHz), whereas a recorded signal can cover frequencies outside that range, depending on the recording equipment and sampling rate.

**I used audio processing techniques to distinguish COVID-19-positive coughs from COVID-19-negative coughs using an LSTM neural network.**

### The dataset contains audio recordings of coughs from people who have COVID-19 and people who do not

The dataset is from Kaggle: [link dataset](https://www.kaggle.com/datasets/pranaynandan63/covid-19-cough-sounds)

In [None]:
from google.colab import drive
drive.mount('/content/drive')

import librosa
import numpy as np
import tensorflow as tf


In [None]:
import os

## **I convert the mel spectrogram to a decibel scale and then extract the MFCC features from this decibel-scaled spectrogram**

#     The reasons are:
1- Perceptual Scaling: The human auditory system is more sensitive to differences in loudness at lower intensity levels than at higher intensity levels. By converting the spectrogram to dB scale, you're aligning the representation with human perception, making it more relevant for tasks that involve human auditory perception

2- Highlighting Relative Changes: Converting to dB emphasizes relative changes in energy rather than absolute energy levels. This is particularly useful when analyzing features like harmonics, formants, or other spectral characteristics, as these can be more meaningful in relative terms

3- Noise Floor Interpretation: In the dB scale, a lower threshold (often referred to as the "noise floor") can be more easily identified, which can be crucial for distinguishing between signal and noise, and for setting appropriate detection or threshold levels in various applications
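
The decibel conversion is a logarithm of a power ratio. The sketch below is a simplified stand-in for what `librosa.power_to_db` computes (it leaves out the `top_db` clipping step). The function name `power_to_db_sketch` and the example values are assumptions made for illustration, not part of the notebook.

```python
import numpy as np

def power_to_db_sketch(power, ref=1.0, amin=1e-10):
    """Convert power values to decibels: 10 * log10(power / ref).

    Simplified illustration only. Values below `amin` are floored
    so the logarithm never receives zero.
    """
    power = np.maximum(np.asarray(power, dtype=float), amin)
    return 10.0 * np.log10(power) - 10.0 * np.log10(max(ref, amin))

# Each 10x step in power adds the same 10 dB, whatever the absolute level.
levels_db = power_to_db_sketch([0.001, 0.01, 0.1, 1.0])
steps_db = np.diff(levels_db)
```

Equal ratios become equal differences on the dB scale, which is why the conversion emphasizes relative changes in energy.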

In [None]:

def MFCCs( y, sr ):

  S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128,
                                      fmax=8000)

  MFCC = librosa.feature.mfcc( S=librosa.power_to_db(S)   )

  positiveCoefficient  = np.where(MFCC > 0, MFCC, 0)       # Keep the positive MFCC values; set everything else to 0
  negativeCoefficient  = np.where(MFCC < 0, MFCC, 0)       # Keep the negative MFCC values; set everything else to 0

  pstv_parts = np.array_split(positiveCoefficient.flatten(), 600 )
  ngtv_parts = np.array_split(negativeCoefficient.flatten(), 600 )

  pstv = [ ]
  ngtv = [ ]

  i = 0
  while( i < len(pstv_parts) ):

      if np.any(pstv_parts[i]):
        pstv.append( pstv_parts[i].mean() / 400 )

      else:
        mean_value = 0.0
        pstv.append( mean_value )

      if np.any(ngtv_parts[i]):
        ngtv.append( ngtv_parts[i].mean() / 400)

      else:
        mean_value = 0.0
        ngtv.append( mean_value )

      i = i + 1

  pstv = np.array(pstv)
  ngtv = np.array(ngtv)


  concatenated_arr = np.concatenate(( pstv , ngtv ))
  concatenated_arr = np.round(concatenated_arr , decimals = 5 )

  return concatenated_arr
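
The fixed-length pooling used in `MFCCs` (split the values into a fixed number of chunks, then average each chunk) can be seen in isolation in the sketch below. `chunk_means` is a hypothetical helper written for this illustration; it mirrors the loop above, including the `0.0` fallback for chunks that are empty or all zeros.

```python
import numpy as np

def chunk_means(values, n_chunks):
    """Average `values` over `n_chunks` nearly equal chunks.

    np.array_split always returns exactly `n_chunks` pieces, so the
    output length does not depend on the recording length. Chunks that
    are empty (input shorter than n_chunks) or all zero become 0.0.
    """
    chunks = np.array_split(np.asarray(values, dtype=float), n_chunks)
    return np.array([c.mean() if np.any(c) else 0.0 for c in chunks])

long_input = chunk_means(np.arange(10), 4)    # chunks of size 3, 3, 2, 2
short_input = chunk_means([1.0, 2.0], 4)      # two of the four chunks are empty
```

Because every recording is reduced to the same number of values, recordings of different durations produce feature vectors of identical length.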



### **Extract the spectral wave bandwidth from the raw audio signal. Spectral bandwidth is a measure of the width of the frequency range in a signal's spectrum**

#  Here are the reasons why I considered using spectral bandwidth in COVID-19 cough detection

1-**Cough Sound Variability**: Cough sounds can vary widely based on factors such as the presence of mucus, the force of the cough, and underlying health conditions. Spectral bandwidth can help capture this variability by quantifying the spread of frequencies present in the cough sound.

2-**Characterizing Cough Types**: Different types of coughs, such as dry coughs and wet coughs, may exhibit distinct spectral characteristics. Spectral bandwidth could be used to differentiate between these types based on the distribution of frequencies.

3-**Respiratory Condition Detection**: Certain respiratory conditions can manifest with specific changes in the acoustic properties of cough sounds. By analyzing spectral bandwidth, you might be able to identify patterns associated with conditions like bronchitis, pneumonia, or even early signs of chronic obstructive pulmonary disease (COPD).

4-**Severity Assessment**: The severity of respiratory conditions can affect the acoustic features of cough sounds. Changes in spectral bandwidth might correlate with the severity of the condition, providing a potential way to assess the health status of the individual.
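
To make the definition concrete, the sketch below computes the spectral bandwidth of a single frame with NumPy, as the magnitude-weighted standard deviation of frequency around the spectral centroid. This matches the order-2 definition that `librosa.feature.spectral_bandwidth` uses by default, but it is only a single-frame illustration on synthetic two-tone signals. The function name and the test signals are assumptions, not part of the notebook.

```python
import numpy as np

def frame_bandwidth(frame, sr):
    """Magnitude-weighted spread of frequencies around the centroid (Hz)."""
    magnitude = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    weights = magnitude / magnitude.sum()          # normalize weights to sum to 1
    centroid = np.sum(freqs * weights)
    return np.sqrt(np.sum(weights * (freqs - centroid) ** 2))

sr = 8000
t = np.arange(800) / sr
# Two equal tones at 500 Hz and 1500 Hz: centroid 1000 Hz, spread 500 Hz.
two_tones = np.sin(2 * np.pi * 500 * t) + np.sin(2 * np.pi * 1500 * t)
# Moving the tones closer together narrows the bandwidth.
close_tones = np.sin(2 * np.pi * 900 * t) + np.sin(2 * np.pi * 1100 * t)

wide_bw = frame_bandwidth(two_tones, sr)
narrow_bw = frame_bandwidth(close_tones, sr)
```

A signal whose energy is spread over a wide range of frequencies yields a larger value than one whose energy is concentrated near a single frequency.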

In [None]:

def acousticWaveBandwidth( y , sr ):

  rowData = librosa.feature.spectral_bandwidth( y=y, sr=sr )
  spectral_bandwidth_parts = np.array_split( (rowData.flatten() ), 300)

  specBandwidth = [ ]
  k = 0
  while( k < len(spectral_bandwidth_parts) ):

    if np.any(spectral_bandwidth_parts[k]):
      specBandwidth.append(spectral_bandwidth_parts[k].mean() )

    else:
      mean_value = 0.0
      specBandwidth.append( mean_value )

    k = k + 1

  specBandwidth = ( np.array(specBandwidth) / 2000 )
  specBandwidth = np.round( specBandwidth , decimals = 5 )

  return specBandwidth


## **Extract the spectral wave contrast from the raw audio signal.**
**Spectral contrast is an acoustic feature used to describe the difference in magnitudes between the peaks and valleys in the spectrum of an audio signal. It provides information about how prominent certain frequency bands are relative to others, which can be valuable for understanding the tonal characteristics of sound. Spectral contrast can be especially useful in differentiating sounds with distinct spectral components or identifying variations in the tonal quality of audio signals.**

# Here are the reasons why I considered using spectral contrast in cough analysis

1-**Harmonic-to-Noise Ratio**: Spectral contrast can offer insights into the balance between harmonic components and noise in cough sounds. Distinct spectral contrast patterns might emerge for different types of coughs, such as dry coughs (with fewer harmonic components) and wet coughs (with more noise due to mucus).

2-**Cough Sound Characterization**: Spectral contrast can help distinguish between different characteristics of cough sounds, such as the intensity, raspiness, or breathiness. These attributes can be indicative of underlying health conditions.

3-**Detecting Respiratory Disorders**: Respiratory conditions can introduce changes in the spectral characteristics of cough sounds. Abnormalities in spectral contrast might be associated with conditions like asthma, bronchitis, or pneumonia.
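
The peak-versus-valley idea can be sketched for a single frequency band as follows. This is a simplified illustration, not the exact procedure of `librosa.feature.spectral_contrast` (which works on octave-spaced sub-bands of each frame). The function name, the `fraction` parameter, and the example bands are assumptions made for this sketch.

```python
import numpy as np

def band_contrast_db(band_power, fraction=0.2):
    """Peak-to-valley contrast of one frequency band, in decibels.

    Peak   = mean of the largest `fraction` of power values in the band.
    Valley = mean of the smallest `fraction` of power values in the band.
    """
    ordered = np.sort(np.asarray(band_power, dtype=float))
    k = max(1, int(round(fraction * len(ordered))))
    valley = ordered[:k].mean()
    peak = ordered[-k:].mean()
    eps = 1e-10                      # avoid taking the log of zero
    return 10.0 * np.log10((peak + eps) / (valley + eps))

# A band with sharp peaks above a quiet floor (tonal, harmonic-like).
peaky_band = np.array([0.01, 0.01, 1.0, 0.01, 0.01, 1.0, 0.01, 0.01, 0.01, 0.01])
# A band where energy is spread evenly (noise-like).
flat_band = np.full(10, 0.2)

peaky_contrast = band_contrast_db(peaky_band)
flat_contrast = band_contrast_db(flat_band)
```

The band with distinct peaks gives a high contrast, while the evenly filled band gives a contrast near zero.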

In [None]:

def acousticWaveContrast( y , sr ):

  rowData = librosa.feature.spectral_contrast( y=y, sr=sr )
  contrast_parts = np.array_split( (rowData.flatten() ), 1000)

  spectralContrast = [ ]
  k = 0
  while( k < len(contrast_parts) ):

    if np.any(contrast_parts[k]):
      spectralContrast.append(contrast_parts[k].mean() )

    else:
      mean_value = 0.0
      spectralContrast.append( mean_value )

    k = k + 1

  spectralContrast = ( np.array(spectralContrast) / 50 )
  spectralContrast = np.round( spectralContrast , decimals = 5 )

  return spectralContrast


### **Spectral centroid is a fundamental audio feature that describes the "center of mass", or magnitude-weighted average frequency, of a sound spectrum. It measures where the center of the distribution of spectral energy in an audio signal lies.**

# Here are the reasons why I considered using spectral centroid in cough analysis

1-**Distinctive Acoustic Signatures:** COVID-19 can cause a range of respiratory symptoms, including coughing. Different respiratory infections and conditions might result in coughs with distinct spectral characteristics. Spectral centroid could help identify specific tonal patterns associated with COVID-19 coughs.

2-**Tonal Quality:** Spectral centroid can capture the tonal characteristics of cough sounds. COVID-19 coughs might have specific tonal qualities or changes in spectral centroid that distinguish them from coughs associated with other respiratory conditions or healthy coughs.

3-**Differentiating from Other Coughs:** Respiratory illnesses can have similar symptoms. Spectral centroid analysis might help differentiate COVID-19 coughs from coughs caused by other viral infections, allergies, or common colds based on their unique tonal qualities.

4-**Severity Assessment:** COVID-19 severity can vary greatly among individuals. Changes in the spectral centroid of cough sounds over time might correlate with the progression of the disease and its severity.
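
As a minimal sketch of the "center of mass" idea, the code below computes the centroid of a single frame with NumPy and shows how adding high-frequency content pulls it upward. The function name and the synthetic signals are assumptions for illustration; the notebook itself relies on `librosa.feature.spectral_centroid`, which applies the same weighted average frame by frame.

```python
import numpy as np

def frame_centroid(frame, sr):
    """Magnitude-weighted mean frequency of one frame, in Hz."""
    magnitude = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    return np.sum(freqs * magnitude) / np.sum(magnitude)

sr = 8000
t = np.arange(800) / sr
low_tone = np.sin(2 * np.pi * 300 * t)                      # energy only at 300 Hz
brighter = low_tone + 0.5 * np.sin(2 * np.pi * 3000 * t)    # add a weaker 3000 Hz component

low_centroid = frame_centroid(low_tone, sr)
bright_centroid = frame_centroid(brighter, sr)
```

With weights of 1 and 0.5, the second signal's centroid is (300 * 1 + 3000 * 0.5) / 1.5 = 1200 Hz, well above the 300 Hz of the pure low tone.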

In [None]:


def acoustic_Centroid( y , sr ):
  centroid = librosa.feature.spectral_centroid( y=y, sr=sr )   # one centroid value per frame
  centroid_parts = np.array_split(centroid[0], 300)

  Centroid = [ ]
  k = 0
  while( k < len(centroid_parts) ):

    if np.any(centroid_parts[k]):
      Centroid.append(centroid_parts[k].mean() )

    else:
      mean_value = 0.0
      Centroid.append( mean_value )

    k = k + 1

  Centroid = ( np.array(Centroid) / 5000 )
  Centroid = np.round( Centroid , decimals = 5 )

  return ( Centroid )


**Zero-crossing rate (ZCR) is a feature that quantifies how often the amplitude of an audio signal changes sign, meaning how many times the waveform crosses the zero-amplitude line within a given time frame. This feature is particularly useful for understanding the overall shape and noisiness of an audio signal.**

# Here are the reasons why I considered using ZCR in cough analysis

1-**Respiratory Condition Detection:** ZCR values might change in coughs associated with specific respiratory conditions. Analyzing ZCR could help identify patterns associated with conditions like bronchitis, pneumonia, or chronic obstructive pulmonary disease (COPD).

2-**Monitoring Disease Progression:** Tracking changes in ZCR over time for individuals with respiratory conditions can provide insights into disease progression. Shifts in ZCR values might indicate changes in the characteristics of cough sounds as the condition evolves.

3-**Objective Assessment:** ZCR provides an objective measure of how often the waveform of a cough sound changes sign, reducing subjectivity in traditional assessments of cough quality.

4-**Feature Combination:** Combining ZCR with other acoustic features, such as spectral centroid or Mel-frequency cepstral coefficients, can create a more comprehensive feature set for accurate cough sound analysis.
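
The sketch below counts zero crossings over a whole signal with NumPy and compares a pure tone with white noise. It is a simplified illustration: `librosa.feature.zero_crossing_rate` computes the rate per frame rather than over the full signal, and the helper name and test signals here are assumptions.

```python
import numpy as np

def zero_crossings(signal):
    """Count sign changes between consecutive samples."""
    negative = np.signbit(signal)
    return int(np.count_nonzero(negative[1:] != negative[:-1]))

sr = 8000
t = np.arange(sr) / sr                                  # one second of samples
tone = np.sin(2 * np.pi * 100 * t + 0.1)                # 100 Hz tone: two crossings per cycle
noise = np.random.default_rng(0).standard_normal(sr)    # white noise changes sign far more often

tone_zcr = zero_crossings(tone) / len(tone)
noise_zcr = zero_crossings(noise) / len(noise)
```

The tone crosses zero about twice per cycle, while the noise crosses on roughly half of all samples, which is why ZCR is often read as a rough indicator of noisiness.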

In [None]:

def ZCR( y, sr  ):
  zero_crossing_rate = librosa.feature.zero_crossing_rate(y)
  ZCR_parts = np.array_split(zero_crossing_rate[0], 300)

  ZCRate = [ ]
  k = 0
  while( k < len(ZCR_parts) ):

    if np.any(ZCR_parts[k]):
      ZCRate.append(ZCR_parts[k].mean() )

    else:
      mean_value = 0.0
      ZCRate.append( mean_value )

    k = k + 1

  ZCRate = ( np.array(ZCRate) )
  ZCRate = np.round( ZCRate , decimals = 5 )

  return  ZCRate


**Combine all customized cough audio features ---> (Audio Feature Extraction )**

In [None]:

def extractCoughFeatures( y, sr ):

  BW = acousticWaveBandwidth( y , sr )
  mfccs = MFCCs( y , sr )
  cntr = acousticWaveContrast( y , sr )
  AC = acoustic_Centroid( y , sr )
  zcr = ZCR( y, sr  )

  concatenated_arr = np.concatenate(( BW  , cntr , mfccs , AC , zcr  ))    # Combine all audio features into one array before feeding it to the neural network
  concatenated_arr = np.round(concatenated_arr , decimals = 5 )

  return concatenated_arr.reshape(62, 50)
  # I reshaped the 3100 feature values into 62 time steps (sequence positions) with 50 features each, which allows the LSTM to capture long-range dependencies.
  # Reshaping the input into a sequence format gives the LSTM a more suitable representation from which to learn and extract meaningful patterns.
  # A flattened input would lose this sequential structure, so the reshape lets the LSTM use the ordering information that is critical for analyzing sequential data.
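
The `reshape(62, 50)` call only works because the five feature vectors add up to exactly 62 × 50 values. The sketch below checks that arithmetic and shows which rows each feature occupies after the row-major reshape. The lengths are read off the `np.array_split` calls in the functions above; the dictionary and variable names are illustrative.

```python
# Feature lengths produced by the functions above, in concatenation order.
feature_sizes = {
    "bandwidth": 300,    # acousticWaveBandwidth
    "contrast": 1000,    # acousticWaveContrast
    "mfcc": 1200,        # MFCCs: 600 positive means + 600 negative means
    "centroid": 300,     # acoustic_Centroid
    "zcr": 300,          # ZCR
}
total = sum(feature_sizes.values())

n_steps, n_features = 62, 50
assert total == n_steps * n_features    # otherwise reshape(62, 50) would raise

# Row-major reshape: each feature fills a contiguous block of rows.
row_ranges = {}
start = 0
for name, size in feature_sizes.items():
    rows = size // n_features
    row_ranges[name] = (start, start + rows - 1)
    start += rows
```

Each of the 62 steps therefore holds 50 consecutive values of a single feature type: rows 0 to 5 are bandwidth, 6 to 25 contrast, 26 to 49 MFCC, 50 to 55 centroid, and 56 to 61 ZCR.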

# **Playing two negative COVID-19 samples**

In [None]:
from IPython.display import Audio

# Play a negative COVID-19 audio file
Audio("drive//MyDrive//Covid-19//Negative//7_Negative_male_42.wav")

In [None]:
from IPython.display import Audio
# Play a negative COVID-19 audio file
Audio("drive//MyDrive//Covid-19//Negative//16_Negative_male_52.wav")

# **Playing two positive COVID-19 samples**

In [None]:
from IPython.display import Audio

# Play a Positive COVID-19 audio file
Audio("drive//MyDrive//Covid-19//Positive//618_Positive_male_57.wav")

In [None]:
from IPython.display import Audio

# Play a Positive COVID-19 audio file
Audio("drive//MyDrive//Covid-19//Positive//553_Positive_male_22.wav")

In [None]:
covidFeatures = [ ];         covidLabel = [ ]

negativeFeatures = [ ];     negativeLabel = [ ]


**load the negative COVID-19 cough audio records**

In [None]:

directory = "drive//MyDrive//Covid-19//Negative"
audio_files = os.listdir(directory)   # get all audio files from the folder "Negative"
for file_name in audio_files:
  y, sr = librosa.load( os.path.join(directory, file_name) )        # y = audio signal  ,   sr = sample rate
  negativeFeatures.append( extractCoughFeatures(y , sr) )
  negativeLabel.append(0)


**load the Positive COVID-19 cough audio records**

In [None]:

directory = "drive//MyDrive//Covid-19//Positive"
audio_files = os.listdir(directory)  # get all audio files from the folder "Positive"
for file_name in audio_files:
  y, sr = librosa.load( os.path.join(directory, file_name) )        # y = audio signal  ,   sr = sample rate
  covidFeatures.append( extractCoughFeatures(y , sr) )
  covidLabel.append(1)


In [None]:
from sklearn.model_selection import train_test_split
import numpy as np


voice = np.concatenate(( covidFeatures , negativeFeatures  ), axis=0)
cls = np.concatenate(( covidLabel , negativeLabel   ), axis=0)


xTrain, xval, yTrain, yval = train_test_split(voice, cls , test_size=0.2, random_state=42)  # hold out 20% of the data for validation

### **I built a Long Short-Term Memory (LSTM) network, a type of neural network commonly used in audio data analysis. LSTMs are effective at capturing temporal dynamics and dependencies in audio data**

### **LSTM is used in audio data analysis for several reasons:**

1- **Sequential Dependencies:** Audio data often has sequential dependencies, where the order of data points matters. LSTMs are designed to handle such sequences and are capable of capturing patterns and dependencies over time, making them suitable for tasks involving audio signals.

2- **Time Series Analysis:** Audio signals, such as speech, music, or environmental sounds, are essentially time series data. LSTMs are effective in modeling and predicting time series patterns, including variations in amplitude, frequency, and temporal dynamics.

3- **Temporal Modeling:** LSTMs excel at modeling temporal relationships, making them well-suited for tasks that involve changes over time, such as phoneme recognition, music generation, and sound event detection.

4- **Long-Term Dependencies:** LSTMs can capture long-range dependencies in audio data. For example, in speech recognition, the context of previous phonemes can influence the interpretation of the current phoneme. LSTMs can remember such context over longer sequences.

5- **Feature Learning:** LSTMs can learn to extract relevant features from raw audio data, reducing the need for handcrafted feature engineering. This is particularly useful for capturing nuanced characteristics of audio, such as pitch, timbre, and rhythm.

6- **Robustness to Variability:** Audio data can exhibit variability due to different speakers, environmental conditions, or instruments. LSTMs can learn to handle such variability by identifying invariant patterns across different instances.

7- **Emotion Recognition:** LSTMs can capture subtle variations in vocal tone, pitch, and rhythm that convey emotional information in speech. They are used for tasks like emotion recognition and sentiment analysis in audio.

8- **Physiological Signal Analysis:** In healthcare, LSTMs can be used to analyze physiological signals like heartbeats, breathing patterns, and EEG signals. These signals have temporal characteristics that can be captured using LSTMs.
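
To show how an LSTM carries context from one step to the next, here is a minimal NumPy sketch of a single LSTM time step applied across a sequence. It illustrates the standard gate equations with random weights; it is not the Keras implementation, and the function names, the gate stacking order, and the random input are assumptions made for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W: (4*units, input_dim), U: (4*units, units), b: (4*units,)
    Rows are stacked as [input gate, forget gate, candidate, output gate].
    """
    units = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0 * units:1 * units])        # input gate: how much new information to write
    f = sigmoid(z[1 * units:2 * units])        # forget gate: how much old cell state to keep
    g = np.tanh(z[2 * units:3 * units])        # candidate values
    o = sigmoid(z[3 * units:4 * units])        # output gate: how much cell state to expose
    c_t = f * c_prev + i * g                   # cell state carries the long-term memory
    h_t = o * np.tanh(c_t)                     # hidden state is the output of this step
    return h_t, c_t

rng = np.random.default_rng(0)
input_dim, units, n_steps = 50, 64, 62          # shapes of the first LSTM layer in this notebook
W = rng.normal(scale=0.1, size=(4 * units, input_dim))
U = rng.normal(scale=0.1, size=(4 * units, units))
b = np.zeros(4 * units)

h = np.zeros(units)
c = np.zeros(units)
sequence = rng.normal(size=(n_steps, input_dim))    # random stand-in for one feature matrix
for x_t in sequence:
    h, c = lstm_step(x_t, h, c, W, U, b)
```

The cell state `c` is updated additively at every step, which is what lets information from early steps survive to the end of the sequence.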

In [None]:

from keras import layers
from keras.models import Sequential
from keras.layers import LSTM, Embedding, Dense
from tensorflow import keras
import matplotlib.pyplot as plt
from keras import datasets, layers, models, losses
from keras.optimizers import Adam
from keras.layers import BatchNormalization


initializer = tf.keras.initializers.lecun_normal()

model = models.Sequential()

model.add(LSTM(units=64, input_shape=( 62, 50 )  , return_sequences=True , activation='tanh', recurrent_activation='sigmoid'))
model.add(LSTM(units=16  , activation='tanh', recurrent_activation='sigmoid'))

# While searching, I found an activation function called SELU that performs better than ReLU when dealing with negative values.
# The reason is that ReLU maps every negative input to 0, while SELU produces non-zero outputs for both positive and negative inputs.

model.add(layers.Dense(16, activation="selu" ))
model.add(Dense(units=2, activation='softmax'))
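
The difference between the two activations on negative inputs can be checked with a few lines of NumPy. This is a sketch of the standard SELU formula with its usual constants, written out by hand for illustration; the model above uses the built-in Keras `"selu"` activation.

```python
import numpy as np

# Standard SELU constants.
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805

def relu(x):
    return np.maximum(x, 0.0)

def selu(x):
    x = np.asarray(x, dtype=float)
    # np.minimum keeps exp() from overflowing on large positive inputs.
    negative_branch = SELU_ALPHA * (np.exp(np.minimum(x, 0.0)) - 1.0)
    return SELU_SCALE * np.where(x > 0, x, negative_branch)

inputs = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
relu_out = relu(inputs)    # negative inputs are all mapped to 0
selu_out = selu(inputs)    # negative inputs keep a non-zero, negative output
```

Because SELU stays non-zero for negative inputs, it still passes a gradient there, whereas ReLU passes none.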



In [None]:
optimiz = Adam(learning_rate=0.003)

model.compile(optimizer= optimiz , loss='SparseCategoricalCrossentropy', metrics=['accuracy'])

model.summary()

Model: "sequential_12"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 lstm_24 (LSTM)              (None, 62, 64)            29440     
                                                                 
 lstm_25 (LSTM)              (None, 16)                5184      
                                                                 
 dense_24 (Dense)            (None, 16)                272       
                                                                 
 dense_25 (Dense)            (None, 2)                 34        
                                                                 
Total params: 34,930
Trainable params: 34,930
Non-trainable params: 0
_________________________________________________________________
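
The parameter counts in the summary can be reproduced by hand. The sketch below uses the usual formulas for LSTM and dense layers; the helper names are illustrative, and the layer sizes are taken from the model definition above.

```python
def lstm_params(input_dim, units):
    """4 gates, each with input weights, recurrent weights, and a bias."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    """One weight per input-output pair plus one bias per output."""
    return input_dim * units + units

layer_params = [
    lstm_params(50, 64),     # lstm_24: 50 features per step -> 64 units
    lstm_params(64, 16),     # lstm_25: 64 inputs -> 16 units
    dense_params(16, 16),    # dense_24
    dense_params(16, 2),     # dense_25
]
total_params = sum(layer_params)
```

These values match the `Param #` column and the total of 34,930 reported by `model.summary()`.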


In [None]:
history = model.fit( xTrain , yTrain , epochs=50, batch_size=30 , validation_data=(xval, yval)  )

Epoch 1/50
...
Epoch 50/50


In [None]:
model.save("covidCough.h5")