
# Differential Privacy: Laplace and Exponential Mechanisms

## Introduction

Analyzing data about people requires balancing individual privacy against data utility.
Differential privacy provides a mathematical framework for this balance: it limits how much any single individual's record can influence the output of a statistical analysis,
so that published results reveal little about any one person.
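
For reference, the standard formal statement of this guarantee can be written as follows. The symbols \( M \) (a randomized mechanism), \( D \) and \( D' \) (two datasets differing in a single record), and \( S \) (any set of possible outputs) are notation introduced here for illustration. A mechanism \( M \) is ε-differentially private if, for all such \( D \), \( D' \), and \( S \):

\[ \Pr[M(D) \in S] \le e^{ε} \, \Pr[M(D') \in S] \]

A smaller ε forces the two output distributions closer together, which means stronger privacy.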

This notebook focuses on two primary mechanisms used in **differential privacy**:
1. **Laplace Mechanism** - Adds calibrated noise to the answers of numerical queries.
2. **Exponential Mechanism** - Selects one outcome from a discrete set (for example, categories or rankings), with probabilities weighted by a utility score.

We will generate **synthetic datasets** to demonstrate these concepts clearly.

---


## Importing Necessary Libraries

In [None]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import random

# Set seed for reproducibility
random.seed(42)
np.random.seed(42)



## Laplace Mechanism

The **Laplace Mechanism** adds noise to the answers of numerical queries (e.g., sum, mean, count) to ensure differential privacy.
The noise follows a **Laplace distribution** centered at zero, whose scale depends on the query's sensitivity and on a privacy parameter **epsilon (ε)**.

The noise scale is proportional to **1/ε**, meaning that a **higher ε results in less noise** (and a weaker privacy guarantee), and vice versa.
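
The relationship between ε and the amount of noise can be checked empirically. The snippet below is a minimal sketch, assuming the noise scale is `sensitivity / epsilon` with an illustrative sensitivity of 1; the variable names are hypothetical and not part of the notebook's own code.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator so the sketch is repeatable
sensitivity = 1.0               # assumed sensitivity, for illustration only
n_draws = 100_000

mean_abs_noise = {}
for epsilon in [0.1, 0.5, 1.0, 5.0]:
    scale = sensitivity / epsilon
    noise = rng.laplace(loc=0.0, scale=scale, size=n_draws)
    # For a Laplace distribution, the expected absolute value equals the scale
    mean_abs_noise[epsilon] = float(np.mean(np.abs(noise)))
    print(f"epsilon={epsilon:>4}: scale={scale:.2f}, mean |noise|={mean_abs_noise[epsilon]:.3f}")
```

The average noise magnitude should track the scale closely, shrinking as ε grows.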

### Laplace Noise Formula:

Given a query function with sensitivity Δf (the largest amount its output can change when a single record is added or removed), the noise is drawn from:

\[ \text{Lap}(b) \]

where \( b = \frac{Δf}{ε} \).
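
As a small worked case, consider a counting query: adding or removing one record changes a count by at most 1, so Δf = 1 and, with ε = 0.5, the scale is b = 1 / 0.5 = 2. The sketch below assumes this setup; `private_count` and the toy `ages` list are hypothetical names chosen for illustration.

```python
import numpy as np

def private_count(data, predicate, epsilon, rng):
    """Return a noisy count of the records that satisfy `predicate`."""
    true_count = sum(1 for x in data if predicate(x))
    sensitivity = 1.0              # one record changes a count by at most 1
    scale = sensitivity / epsilon  # b = Δf / ε
    return true_count + rng.laplace(loc=0.0, scale=scale)

rng = np.random.default_rng(0)
ages = [23, 35, 41, 29, 52, 67, 38, 44]  # hypothetical toy data
noisy = private_count(ages, lambda age: age >= 40, epsilon=0.5, rng=rng)
print(f"true count = 4, noisy count = {noisy:.2f}")
```

The noise is added once to the aggregate answer, not to each record.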

Now, let's generate a **synthetic dataset** and apply the Laplace mechanism.


In [None]:

# Generate a synthetic dataset
num_samples = 1000
true_values = np.random.randint(100, 200, num_samples)  # True dataset values

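# Function to add Laplace noise
def laplace_mechanism(value, sensitivity, epsilon):
    """Add Laplace noise with scale sensitivity / epsilon to a scalar or array."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = sensitivity / epsilon
    # np.shape handles both scalars and arrays (len() would fail on a scalar)
    noise = np.random.laplace(0, scale, np.shape(value))
    return value + noise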

# Apply Laplace Mechanism
epsilon = 1.0  # Privacy budget
sensitivity = 10  # Illustrative value; a formal guarantee needs the query's true sensitivity
noisy_values = laplace_mechanism(true_values, sensitivity, epsilon)  # Independent noise per value

# Visualize the results
plt.figure(figsize=(8, 5))
plt.hist(true_values, bins=30, alpha=0.5, label="Original Data")
plt.hist(noisy_values, bins=30, alpha=0.5, label="Noisy Data (Laplace)")
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.legend()
plt.title("Impact of Laplace Mechanism on Data")
plt.show()
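
The cell above perturbs every value individually. A common alternative is to add noise once to an aggregate statistic. The following is a minimal sketch of a noisy mean, assuming the values are clipped to known bounds and the dataset size is public, so that replacing one record moves the mean by at most `(upper - lower) / n`. The function name `private_mean` and its parameters are hypothetical.

```python
import numpy as np

def private_mean(values, lower, upper, epsilon, rng):
    """Release the mean of `values` with Laplace noise (dataset size assumed public)."""
    clipped = np.clip(values, lower, upper)       # enforce the assumed bounds
    sensitivity = (upper - lower) / len(clipped)  # max change from replacing one record
    scale = sensitivity / epsilon
    return float(np.mean(clipped) + rng.laplace(loc=0.0, scale=scale))

rng = np.random.default_rng(0)
values = rng.integers(100, 200, size=1000)  # synthetic data in [100, 199]
noisy_mean = private_mean(values, lower=100, upper=199, epsilon=1.0, rng=rng)
print(f"true mean = {values.mean():.2f}, noisy mean = {noisy_mean:.2f}")
```

Because the sensitivity of the mean shrinks with the number of records, the noisy mean stays close to the true mean for a dataset of this size.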



## Exponential Mechanism

The **Exponential Mechanism** is used when the query output is categorical or ranking-based rather than numerical.
Instead of adding noise to a numerical answer, it **assigns a probability** to each possible output **based on a utility function** and samples one output from that distribution.

### Probability of Selecting an Outcome:

For an outcome \( o \) with utility function \( u(o, D) \):

\[ P(o) \propto \exp\left( \frac{ε \, u(o, D)}{2 Δu} \right) \]

where:
- **ε** is the privacy budget.
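- **u(o, D)** is the utility score of outcome **o** on dataset **D**.
- **Δu** is the sensitivity of the utility function: the largest change in any outcome's utility when a single record in **D** changes.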
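
To see how ε shapes these probabilities, the sketch below evaluates the formula for a few values of ε. The utility scores and the sensitivity of 1 are assumptions made for illustration, and `exponential_probabilities` is a hypothetical helper name.

```python
import numpy as np

def exponential_probabilities(utilities, sensitivity, epsilon):
    """Selection probabilities proportional to exp(ε · u / (2 Δu))."""
    scores = epsilon * np.asarray(utilities, dtype=float) / (2.0 * sensitivity)
    scores -= scores.max()  # shift for numerical stability; normalized result is unchanged
    weights = np.exp(scores)
    return weights / weights.sum()

utilities = [10, 5, 1]  # hypothetical utility scores
sensitivity = 1.0       # assumed Δu, for illustration only
for epsilon in [0.01, 1.0, 10.0]:
    probs = exponential_probabilities(utilities, sensitivity, epsilon)
    print(f"epsilon={epsilon:>5}: {np.round(probs, 3)}")
```

With a very small ε the choice is close to uniform, while a large ε concentrates almost all probability on the highest-utility outcome.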

Now, let's implement the **Exponential Mechanism** on a synthetic dataset.


In [None]:

# Define a set of categorical outcomes
categories = ["High", "Medium", "Low"]

# Define a simple utility function
def utility_function(category):
    return {"High": 10, "Medium": 5, "Low": 1}[category]

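# Implementing the Exponential Mechanism
def exponential_mechanism(categories, utility_function, epsilon):
    utilities = np.array([utility_function(c) for c in categories], dtype=float)
    # The utilities here do not depend on a dataset, so the utility range is used
    # as a stand-in for the sensitivity Δu (an illustrative choice).
    sensitivity = np.max(utilities) - np.min(utilities)
    scores = (epsilon * utilities) / (2 * sensitivity)
    # Subtract the max before exponentiating for numerical stability
    probabilities = np.exp(scores - np.max(scores))
    probabilities /= np.sum(probabilities)
    return np.random.choice(categories, p=probabilities)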

# Apply the Exponential Mechanism multiple times
epsilon = 1.0  # Privacy budget
selected_categories = [exponential_mechanism(categories, utility_function, epsilon) for _ in range(1000)]

# Visualizing the empirical selection frequencies over 1000 draws
pd.Series(selected_categories).value_counts(normalize=True).plot(kind="bar", figsize=(8, 5), color="skyblue")
plt.xlabel("Category")
plt.ylabel("Selection Frequency")
plt.title("Exponential Mechanism: Empirical Distribution of Selected Categories")
plt.show()
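
In the cell above the utility scores are fixed constants. In a typical application the utility depends on the dataset. The sketch below assumes the goal is to pick the most frequent category, with utility u(o, D) equal to the number of records labeled o, so that adding or removing one record changes any utility by at most 1 (Δu = 1). The function `private_mode` and the example records are hypothetical.

```python
import numpy as np
from collections import Counter

def private_mode(data, candidates, epsilon, rng):
    """Pick an approximately most frequent candidate via the exponential mechanism."""
    counts = Counter(data)
    utilities = np.array([counts[c] for c in candidates], dtype=float)  # u(o, D) = count of o
    sensitivity = 1.0  # one added or removed record changes any count by at most 1
    scores = epsilon * utilities / (2.0 * sensitivity)
    weights = np.exp(scores - scores.max())  # shifted for numerical stability
    probabilities = weights / weights.sum()
    index = rng.choice(len(candidates), p=probabilities)
    return candidates[index], probabilities

rng = np.random.default_rng(0)
candidates = ["High", "Medium", "Low"]
data = ["High"] * 50 + ["Medium"] * 45 + ["Low"] * 5  # hypothetical records
choice, probabilities = private_mode(data, candidates, epsilon=0.5, rng=rng)
print(choice, np.round(probabilities, 3))
```

Categories with similar counts receive similar probabilities, so the mechanism may return a close runner-up instead of the true mode. That randomness is what provides the privacy protection.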



## Conclusion

In this notebook, we explored **two key mechanisms of Differential Privacy**:

1. **Laplace Mechanism**: Used for numerical queries by adding calibrated noise from the Laplace distribution.
2. **Exponential Mechanism**: Used for categorical outputs by assigning probabilities based on a utility function.

These techniques help preserve **individual privacy** while still allowing meaningful analysis. Understanding these methods is crucial for designing **privacy-preserving machine learning and statistical analyses**.

Thank you for following along! 🚀
