<a href="https://colab.research.google.com/github/RDGopal/IB9AU-2026/blob/main/MLM3_Fundamental_Concepts_of_Diffusion.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This notebook illustrates how a bimodal distribution, when subjected to repeated additions of Gaussian noise, gradually transforms into a unimodal (single-peaked) distribution. The mechanism is convolution: adding independent Gaussian noise to every sample convolves the distribution's density with a Gaussian kernel. The means of the two components stay where they are, but each iteration widens both components, and once their spread is large relative to the distance between the means, the two distinct peaks merge into a single, broader peak.
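
To make the convolution argument concrete, here is a short worked form. The notation is introduced only for illustration: weights $w_1, w_2$, component means $\mu_1, \mu_2$, a common initial standard deviation $\sigma_0$, and a per-step noise standard deviation $s$. Since a sum of independent Gaussian variables is Gaussian with the variances added, the density after $t$ noise steps is

$$
p_t(x) = w_1\,\mathcal{N}\!\left(x;\ \mu_1,\ \sigma_0^2 + t s^2\right) + w_2\,\mathcal{N}\!\left(x;\ \mu_2,\ \sigma_0^2 + t s^2\right).
$$

With the values used in this notebook ($\sigma_0 = 1$, $s = 0.5$), each component has variance $1 + 0.25\,t$ after $t$ steps. After 100 steps that is a variance of 26, or a standard deviation of about 5.1, which exceeds the distance of 5 between the two means.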

First, we define two normal distributions with different means and combine them to create a bimodal distribution. This serves as our initial dataset.

In [None]:
# We want a bimodal distribution of a single variable. We will define the distribution and draw samples from it

import numpy as np
import matplotlib.pyplot as plt

# Define the parameters of the two normal distributions
mean1 = 0
std1 = 1
mean2 = 5
std2 = 1

# Define the proportion of samples from each distribution
proportion1 = 0.6
proportion2 = 0.4

# Number of samples to draw
n_samples = 1000

# Draw samples from each normal distribution
samples1 = np.random.normal(mean1, std1, int(n_samples * proportion1))
samples2 = np.random.normal(mean2, std2, int(n_samples * proportion2))

# Combine the samples to create the bimodal distribution
bimodal_samples = np.concatenate((samples1, samples2))

# Shuffle the samples to mix them randomly
np.random.shuffle(bimodal_samples)

# Plot the histogram of the bimodal distribution
plt.hist(bimodal_samples, bins=50, density=True, alpha=0.6, color='g')
plt.title('Bimodal Distribution')
plt.xlabel('Value')
plt.ylabel('Density')
plt.show()
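
For reference, the density these samples are drawn from can be written down directly, which makes it possible to overlay a theoretical curve on the histogram. The following is a minimal sketch: `normal_pdf` and `mixture_pdf` are helper names introduced here for illustration, not part of the original notebook, and the grid range is an assumption. The parameter values are copied from the cell above.

```python
import numpy as np

# Parameters copied from the notebook cell above.
mean1, std1, proportion1 = 0, 1, 0.6
mean2, std2, proportion2 = 5, 1, 0.4


def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / std
    return np.exp(-0.5 * z**2) / (std * np.sqrt(2.0 * np.pi))


def mixture_pdf(x):
    """Density of the two-component mixture (hypothetical helper)."""
    return (proportion1 * normal_pdf(x, mean1, std1)
            + proportion2 * normal_pdf(x, mean2, std2))


# Evaluate the density on a grid wide enough to cover both components.
x_grid = np.linspace(-6.0, 11.0, 1701)
density = mixture_pdf(x_grid)

# Sanity check: a density should integrate to approximately 1.
dx = x_grid[1] - x_grid[0]
total_mass = float(np.sum(density) * dx)
print(f"Approximate integral of the density: {total_mass:.4f}")
```

Calling `plt.plot(x_grid, density)` after the `plt.hist(...)` line and before `plt.show()` would draw this curve on top of the histogram.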


Next, we define a function `add_noise` that takes a sample and a standard deviation, and adds random Gaussian noise to each data point. This function will be used in subsequent steps to simulate the 'blurring' effect.

In [None]:
# Define a function that takes a sample (here, the 1000 points drawn from the
# bimodal distribution above) and a standard deviation. Each point xi is
# replaced with xi + noise, where the noise is drawn from a normal distribution
# with mean 0 and the given standard deviation (0.5 in this notebook).
# The output is the noisy sample, with the same number of points as the input.

import matplotlib.pyplot as plt
import numpy as np
def add_noise(sample, std_dev):
  """
    Adds Gaussian noise to each element of a sample.

    Args:
      sample: A numpy array of data points.
      std_dev: The standard deviation of the Gaussian noise to add.

    Returns:
      A numpy array of data points with added noise.
  """
  noise = np.random.normal(0, std_dev, len(sample))
  noisy_sample = sample + noise
  return noisy_sample
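
One property worth checking is that independent Gaussian noise adds variance: a single call with standard deviation `s` should raise the sample variance by roughly `s**2`, and `k` successive calls by roughly `k * s**2`. The sketch below repeats the `add_noise` definition so that it runs on its own. The seed, the sample size, and the variable names are assumptions chosen for illustration.

```python
import numpy as np


def add_noise(sample, std_dev):
    """Same behaviour as the notebook's add_noise, repeated so this block runs alone."""
    return sample + np.random.normal(0, std_dev, len(sample))


np.random.seed(0)  # fixed seed (an assumption) so the check is repeatable

# A large standard-normal sample keeps the variance estimates stable.
original = np.random.normal(0, 1, 200_000)

# One step of noise with standard deviation 0.5 should add about 0.25 of variance.
one_step = add_noise(original, 0.5)
gain_one_step = one_step.var() - original.var()

# Four successive steps should add about 4 * 0.25 = 1.0 of variance.
four_steps = original
for _ in range(4):
    four_steps = add_noise(four_steps, 0.5)
gain_four_steps = four_steps.var() - original.var()

print(f"Variance gained after 1 step:  {gain_one_step:.3f} (theory: 0.25)")
print(f"Variance gained after 4 steps: {gain_four_steps:.3f} (theory: 1.00)")
```

The same reasoning suggests that `k` calls with standard deviation `s` are statistically equivalent to one call with standard deviation `s * sqrt(k)`, which is why the repeated application later in the notebook steadily widens the distribution.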



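Here, we apply the `add_noise` function once to our initial bimodal samples, using a noise standard deviation of 0.5. Each component's standard deviation grows from 1 to sqrt(1 + 0.25) ≈ 1.12, so the plot shows two slightly wider peaks that are still clearly separated.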

In [None]:
# Apply the function to the bimodal_samples with a standard deviation of 0.5
noisy_bimodal_samples = add_noise(bimodal_samples, 0.5)

# Plot the noisy distribution to see the effect of a single noise addition
plt.hist(noisy_bimodal_samples, bins=50, density=True, alpha=0.6, color='b')
plt.title('Bimodal Distribution with Noise')
plt.xlabel('Value')
plt.ylabel('Density')
plt.show()

Finally, we repeatedly apply the `add_noise` function. In each iteration, the output of the previous step becomes the input for the current step. By plotting the distribution at each step, we can observe the gradual transition from a bimodal to a unimodal distribution as the noise accumulates.

In [None]:
# Run the function repeatedly (100 times), with the output of each iteration
# becoming the input to the next. Plot the empirical distribution after every iteration.

import matplotlib.pyplot as plt
# Set the number of iterations
num_iterations = 100

# Set the standard deviation for the noise function
noise_std_dev = 0.5

# Initialize the samples for the first iteration
current_samples = bimodal_samples

# Loop through the iterations
for i in range(num_iterations):
  # Apply the add_noise function to the current samples
  current_samples = add_noise(current_samples, noise_std_dev)

  # Plot the histogram for the current iteration
  plt.hist(current_samples, bins=50, density=True, alpha=0.6, color='r')
  plt.title(f'Empirical Distribution - Iteration {i+1}')
  plt.xlabel('Value')
  plt.ylabel('Density')
  plt.show()
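
Because the density after each iteration is known in closed form for this setup (two normal components whose variance grows by `noise_std_dev**2` per step), the point at which the second peak disappears can also be located without sampling. The sketch below counts the local maxima of that theoretical density on a grid. `density_after` and `count_modes` are hypothetical helpers, and the grid range and resolution are assumptions. The histograms above are noisy, so the visual transition in the plots may appear somewhat earlier or later than the theoretical one.

```python
import numpy as np

# Parameters copied from the notebook.
mean1, mean2 = 0.0, 5.0
initial_std = 1.0
proportion1, proportion2 = 0.6, 0.4
noise_std_dev = 0.5
num_iterations = 100

# Grid on which the theoretical density is evaluated (an assumption).
x_grid = np.linspace(-20.0, 25.0, 4501)


def density_after(t):
    """Theoretical mixture density after t noise steps (hypothetical helper)."""
    std_t = np.sqrt(initial_std**2 + t * noise_std_dev**2)
    norm = std_t * np.sqrt(2.0 * np.pi)
    comp1 = np.exp(-0.5 * ((x_grid - mean1) / std_t) ** 2) / norm
    comp2 = np.exp(-0.5 * ((x_grid - mean2) / std_t) ** 2) / norm
    return proportion1 * comp1 + proportion2 * comp2


def count_modes(y):
    """Count interior grid points that are strictly higher than both neighbours."""
    return int(np.sum((y[1:-1] > y[:-2]) & (y[1:-1] > y[2:])))


# Index 0 is the original distribution; index t is the density after t steps.
modes_per_step = [count_modes(density_after(t)) for t in range(num_iterations + 1)]

# First iteration at which the theoretical density has a single peak.
first_unimodal = next(t for t, m in enumerate(modes_per_step) if m == 1)
print(f"Modes before any noise: {modes_per_step[0]}")
print(f"First iteration with a single theoretical peak: {first_unimodal}")
```

Counting modes on the theoretical curve rather than on the histograms avoids having to smooth the sampled data, at the cost of relying on the assumption that the components remain exactly normal.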