<a href="https://colab.research.google.com/github/HernanDL/Noise-Cancellation-Using-GenAI/blob/main/Noise_Cancellation_Using_Generative_AI_(Colab).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Noise Cancellation Using Generative AI
This project implements a noise cancellation system using a Generative Adversarial Network (GAN). It takes noisy audio as input, generates an inverse (phase-inverted) waveform, and adds the two so that the noise cancels, ideally leaving near-silence.

In this notebook, we'll:
1. Load audio and preprocess it.
2. Convert it to a spectrogram (frequency-domain representation).
3. Build and train a GAN model for predicting inverse waveforms.
4. Visualize the results (waveforms and spectrograms).
5. Optionally integrate with external APIs (like OpenAI).
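
The cancellation idea itself can be sketched without any model. The NumPy example below uses a synthetic 200 Hz tone (an assumption for illustration, not the notebook's audio; `sr_demo`, `noise`, and `anti_noise` are hypothetical names) to show that a perfectly inverted copy sums to zero, while an inverted copy that arrives one sample late leaves a residual.

```python
import numpy as np

sr_demo = 16000                                  # assumed sample rate for the demo
t = np.arange(sr_demo) / sr_demo                 # one second of sample times
noise = 0.5 * np.sin(2 * np.pi * 200 * t)        # synthetic stand-in for the noise

def rms(x):
    """Root-mean-square level of a signal."""
    return float(np.sqrt(np.mean(x ** 2)))

# Ideal case: the inverse waveform is the exact negation of the noise
anti_noise = -noise
residual_ideal = noise + anti_noise

# Imperfect case: the same inverse, delayed by a single sample
anti_noise_late = -np.roll(noise, 1)
residual_late = noise + anti_noise_late

# Attenuation achieved by the delayed inverse, in decibels
attenuation_db = 20 * np.log10(rms(noise) / rms(residual_late))
print(f"ideal residual RMS: {rms(residual_ideal):.3g}")
print(f"attenuation with one-sample delay: {attenuation_db:.1f} dB")
```

With these settings the one-sample delay limits the attenuation to roughly 22 dB, which is why the predicted inverse has to match the noise in timing as well as in amplitude.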


In [None]:
# Step 1: Install Required Libraries
!pip install numpy scipy librosa soundfile matplotlib torch torchaudio

## Step 2: Load and Preprocess Audio Data
Let's load the audio file using `librosa` and visualize the waveform.

**For Colab:** We will upload the `.wav` file directly to the Colab environment.

In [None]:
import librosa
import librosa.display
import numpy as np
import matplotlib.pyplot as plt
import soundfile as sf
from google.colab import files

# Upload a .wav file to the Colab runtime
uploaded = files.upload()
audio_file = list(uploaded.keys())[0]  # Get the uploaded file name

# Function to load the audio file
def load_audio(file_path):
    audio, sr = librosa.load(file_path, sr=None)
    return audio, sr

# Function to plot the waveform
def plot_waveform(audio, sr):
    plt.figure(figsize=(10, 4))
    librosa.display.waveshow(audio, sr=sr)
    plt.title('Waveform')
    plt.show()

# Load and visualize the audio
audio, sr = load_audio(audio_file)
plot_waveform(audio, sr)

## Step 3: Convert Audio to Spectrogram
We convert the audio waveform into a spectrogram using the Short-Time Fourier Transform (STFT) for feature extraction.
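
To make the STFT step more concrete, here is a simplified NumPy sketch of what the transform computes. It assumes a 2048-sample Hann window and a hop of 512 samples (librosa's default `n_fft` and `hop_length`), uses a synthetic 440 Hz tone, and is not meant to reproduce `librosa.stft` exactly; `simple_stft` and the `*_demo` names are hypothetical.

```python
import numpy as np

def simple_stft(signal, n_fft=2048, hop=512):
    """Windowed FFT of overlapping frames; returns (1 + n_fft // 2, n_frames)."""
    window = np.hanning(n_fft)
    padded = np.pad(signal, n_fft // 2)          # zero-pad so frames are centred
    n_frames = 1 + (len(padded) - n_fft) // hop
    frames = np.stack(
        [padded[i * hop : i * hop + n_fft] * window for i in range(n_frames)],
        axis=1,
    )
    return np.fft.rfft(frames, axis=0)           # one column of spectra per frame

sr_demo = 8000
t = np.arange(sr_demo) / sr_demo
tone = np.sin(2 * np.pi * 440 * t)               # synthetic test tone

stft_demo = simple_stft(tone)
magnitude_demo = np.abs(stft_demo)               # what the spectrogram keeps
phase_demo = np.angle(stft_demo)                 # what the spectrogram discards

# The strongest frequency bin should sit close to 440 Hz
peak_bin = int(magnitude_demo.mean(axis=1).argmax())
peak_hz = peak_bin * sr_demo / 2048
print(stft_demo.shape, round(peak_hz, 1))
```

Taking `np.abs` keeps only the magnitude, so the phase has to be estimated again later when the spectrogram is turned back into audio.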

In [None]:
# Convert audio waveform to spectrogram
def audio_to_spectrogram(audio, sr):
    stft = librosa.stft(audio)
    spectrogram = np.abs(stft)
    return spectrogram

# Plot the spectrogram
def plot_spectrogram(spectrogram, sr):
    plt.figure(figsize=(10, 4))
    librosa.display.specshow(librosa.amplitude_to_db(spectrogram, ref=np.max), sr=sr, x_axis='time', y_axis='log')
    plt.colorbar(format='%+2.0f dB')
    plt.title('Spectrogram')
    plt.show()

# Example: Convert audio to spectrogram and visualize
spectrogram = audio_to_spectrogram(audio, sr)
plot_spectrogram(spectrogram, sr)

## Step 4: Build the Generative Model (GAN)
We'll implement the Generator and Discriminator models using PyTorch to predict an inverse waveform.
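
Before wiring a spectrogram into convolutional layers, it helps to check how the layers change its size. The sketch below applies the standard output-size formulas for `nn.Conv2d` and `nn.ConvTranspose2d` (dilation 1, no output padding), assuming the 4x4 kernels with stride 2 and padding 1 used in this notebook and a 1025-bin spectrogram (from `n_fft=2048`); the helper names are hypothetical.

```python
def conv_out(size, kernel=4, stride=2, padding=1):
    """Output length of a strided convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

def conv_transpose_out(size, kernel=4, stride=2, padding=1):
    """Output length of a transposed convolution along one dimension."""
    return (size - 1) * stride - 2 * padding + kernel

freq_bins = 1025                      # 1 + 2048 // 2 frequency bins
down = conv_out(freq_bins)            # after the stride-2 Conv2d
up = conv_transpose_out(down)         # after the stride-2 ConvTranspose2d
print(freq_bins, "->", down, "->", up)

# Even sizes survive the round trip unchanged
even_round_trip = conv_transpose_out(conv_out(1024))
print(1024, "->", even_round_trip)
```

An odd dimension comes back one element short, so the generator output may need padding (or the input cropping) before it is compared with the input spectrogram.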

In [None]:
import torch
import torch.nn as nn

# Generator Model
class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()
        self.model = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            # Additional layers can be added here
            nn.ConvTranspose2d(64, 1, kernel_size=4, stride=2, padding=1),
            nn.Tanh()
        )

    def forward(self, x):
        return self.model(x)

# Discriminator Model
class Discriminator(nn.Module):
    def __init__(self):
        super(Discriminator, self).__init__()
        self.model = nn.Sequential(
            nn.Conv2d(2, 64, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            # Additional layers can be added here
            nn.Conv2d(64, 1, kernel_size=4, stride=2, padding=1),
            nn.Sigmoid()
        )

    def forward(self, x):
        return self.model(x)

# Initialize models
generator = Generator()
discriminator = Discriminator()

## Step 5: Define Training Process
We'll set up the training loop for both the generator and the discriminator. The generator tries to produce an inverse waveform for each noisy input, and the discriminator learns to distinguish generated inverses from real ones, which pushes the generator's output toward the real inverse.
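
As one possible shape for a single adversarial update, here is a minimal sketch. It assumes a conditional setup in which the discriminator sees the noisy input stacked with a candidate inverse (two channels), and that a target inverse spectrogram is available for each noisy input. The function `gan_step`, the tiny stand-in models, and the random tensors are all hypothetical and exist only to make the example run.

```python
import torch
import torch.nn as nn

def gan_step(gen, disc, noisy, target, opt_g, opt_d, criterion):
    """Run one discriminator update and one generator update."""
    # Discriminator: real (noisy, target) pairs -> 1, generated pairs -> 0
    opt_d.zero_grad()
    fake = gen(noisy)
    pred_real = disc(torch.cat([noisy, target], dim=1))
    pred_fake = disc(torch.cat([noisy, fake.detach()], dim=1))  # no grad into gen
    loss_d = (criterion(pred_real, torch.ones_like(pred_real))
              + criterion(pred_fake, torch.zeros_like(pred_fake)))
    loss_d.backward()
    opt_d.step()

    # Generator: try to make the discriminator label generated pairs as real
    opt_g.zero_grad()
    pred_fake = disc(torch.cat([noisy, fake], dim=1))
    loss_g = criterion(pred_fake, torch.ones_like(pred_fake))
    loss_g.backward()
    opt_g.step()
    return loss_g.item(), loss_d.item()

# Tiny stand-in models and random data, for demonstration only
torch.manual_seed(0)
demo_gen = nn.Sequential(nn.Conv2d(1, 1, 3, padding=1), nn.Tanh())
demo_disc = nn.Sequential(nn.Conv2d(2, 1, 3, padding=1), nn.Sigmoid())
demo_opt_g = torch.optim.Adam(demo_gen.parameters(), lr=0.0002)
demo_opt_d = torch.optim.Adam(demo_disc.parameters(), lr=0.0002)

noisy_batch = torch.rand(4, 1, 16, 16)   # fake "noisy spectrogram" batch
target_batch = -noisy_batch              # placeholder "inverse" targets

demo_loss_g, demo_loss_d = gan_step(
    demo_gen, demo_disc, noisy_batch, target_batch,
    demo_opt_g, demo_opt_d, nn.BCELoss(),
)
print(f"Loss_G: {demo_loss_g:.4f}, Loss_D: {demo_loss_d:.4f}")
```

Detaching the generated batch during the discriminator update keeps that step from changing the generator; the generator is then updated against a fresh discriminator prediction.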

In [None]:
# Loss function and optimizers
criterion = nn.BCELoss()  # Binary Cross Entropy loss for GAN
optimizer_G = torch.optim.Adam(generator.parameters(), lr=0.0002)
optimizer_D = torch.optim.Adam(discriminator.parameters(), lr=0.0002)

# Training loop
def train(model_G, model_D, data_loader, optimizer_G, optimizer_D, criterion, num_epochs=100):
    for epoch in range(num_epochs):
        for i, (noisy, clean) in enumerate(data_loader):
            # Optimize the discriminator
            optimizer_D.zero_grad()
            loss_D = criterion(model_D(...), ...)
            loss_D.backward()
            optimizer_D.step()

            # Optimize the generator
            optimizer_G.zero_grad()
            loss_G = criterion(model_G(...), ...)
            loss_G.backward()
            optimizer_G.step()

        print(f"Epoch [{epoch + 1}/{num_epochs}], Loss_G: {loss_G.item():.4f}, Loss_D: {loss_D.item():.4f}")

# Note: The actual data loader would need to provide batches of noisy and clean pairs of audio.
# Replace '...' with appropriate training code for your data loader.

## Step 6: Inference and Visualization of Results
After training, we can generate the inverse waveform and visualize the effect on the original noisy audio.
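
Cancellation depends on phase as well as magnitude, which matters here because the spectrogram stores magnitudes only and the phase must be estimated when converting back to audio. The sketch below uses synthetic white noise (not the notebook's audio) and a random phase as a deliberately poor stand-in for a phase estimate; it is not what Griffin-Lim would produce. All names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
noise_demo = rng.standard_normal(4096)           # synthetic white noise

spectrum = np.fft.rfft(noise_demo)
magnitude = np.abs(spectrum)

def energy(x):
    """Total signal energy (sum of squared samples)."""
    return float(np.sum(x ** 2))

# Inverse rebuilt from the true magnitude and the true phase
exact_inverse = -np.fft.irfft(spectrum, n=len(noise_demo))

# Inverse rebuilt from the true magnitude but a random phase
random_phase = rng.uniform(-np.pi, np.pi, size=magnitude.shape)
wrong_phase_inverse = -np.fft.irfft(
    magnitude * np.exp(1j * random_phase), n=len(noise_demo)
)

# Residual energy relative to the original noise (0 means perfect cancellation)
ratio_exact = energy(noise_demo + exact_inverse) / energy(noise_demo)
ratio_wrong = energy(noise_demo + wrong_phase_inverse) / energy(noise_demo)
print(f"exact phase: {ratio_exact:.2e}, random phase: {ratio_wrong:.2f}")
```

With the correct phase the residual is numerically zero, while the random-phase inverse adds energy instead of removing it, so a low residual after combining the waveforms is a useful check on the whole pipeline.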

In [None]:
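# Function to reconstruct audio from a magnitude spectrogram
def spectrogram_to_audio(spectrogram, sr, length=None):
    # Griffin-Lim estimates the missing phase, so the result is an approximation
    return librosa.griffinlim(spectrogram, length=length)

# Example: Generate inverse waveform and combine with original
generator.eval()
with torch.no_grad():
    # The generator expects a float tensor shaped (batch, channel, freq, time)
    spec_in = torch.from_numpy(spectrogram).float()[None, None]
    spec_out = generator(spec_in)
    # The stride-2 layers shrink odd-sized dimensions by one; pad back to the input shape
    pad_time = spec_in.shape[-1] - spec_out.shape[-1]
    pad_freq = spec_in.shape[-2] - spec_out.shape[-2]
    spec_out = nn.functional.pad(spec_out, (0, pad_time, 0, pad_freq))

# Tanh outputs lie in [-1, 1], but Griffin-Lim needs non-negative magnitudes
inverse_spectrogram = spec_out[0, 0].abs().numpy()
inverse_waveform = spectrogram_to_audio(inverse_spectrogram, sr, length=len(audio))
combined_waveform = audio + inverse_waveform

# Plot original and combined waveforms to visualize the cancellation effect
plot_waveform(audio, sr)
plot_waveform(combined_waveform, sr)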

## Step 7: (Optional) Integration with External APIs
If you'd prefer to use external APIs such as OpenAI or Google AI, you can integrate them as follows:

In [None]:
import openai

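# Example: call an external audio model. This is hypothetical: `openai.Audio.create`
# and the 'output_audio' field are placeholders, not a confirmed OpenAI endpoint.
def call_openai_audio_model(audio):
    response = openai.Audio.create(audio_file=audio)  # plus any further parameters (...)
    return response['output_audio']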

# Example usage
output_audio = call_openai_audio_model('path_to_noisy_audio.wav')