# Day 93: Privacy-Preserving Machine Learning

## Introduction

In an era where data drives innovation, protecting individual privacy while extracting valuable insights has become one of the most critical challenges in machine learning. Privacy-preserving machine learning (PPML) encompasses a set of techniques that enable us to train models, make predictions, and analyze data without exposing sensitive information.

Consider the healthcare industry: hospitals want to collaborate on building better diagnostic models, but they cannot share patient records due to privacy regulations like HIPAA and GDPR. Or think about financial institutions that need to detect fraud patterns across multiple banks without revealing transaction details. Privacy-preserving ML techniques make these scenarios possible.

The tension between data utility and privacy protection has spawned innovative cryptographic and algorithmic approaches. From **homomorphic encryption** that allows computation on encrypted data, to **secure multi-party computation** that enables collaborative learning without data sharing, to **differential privacy** that adds controlled noise to protect individuals, these techniques form the foundation of privacy-aware AI systems.

### Why This Matters

- **Regulatory Compliance**: GDPR, CCPA, HIPAA, and other regulations mandate privacy protection
- **Trust & Ethics**: Users increasingly demand transparency about how their data is used
- **Collaborative ML**: Organizations can jointly train models without compromising proprietary data
- **Federated Learning Foundation**: Privacy techniques are essential for distributed learning systems

### Learning Objectives

By the end of this lesson, you will:

1. Understand the fundamental privacy threats in machine learning (model inversion, membership inference, etc.)
2. Explore cryptographic approaches: homomorphic encryption and secure multi-party computation
3. Implement differential privacy mechanisms to protect sensitive data
4. Apply privacy-preserving techniques to real-world ML workflows
5. Evaluate the privacy-utility tradeoff in practical scenarios

## Theory: Foundations of Privacy-Preserving Machine Learning

### 1. Privacy Threats in Machine Learning

Before diving into solutions, let's understand the threats:

#### Model Inversion Attacks
An adversary can reconstruct training data by querying a model. If a face recognition model outputs confidence scores, attackers can optimize input images to maximize confidence for a specific person, effectively reconstructing their face.

#### Membership Inference Attacks
Given a data point and a model, an attacker determines whether that data point was in the training set. This is problematic for sensitive datasets (e.g., medical records, financial data).

#### Training Data Extraction
Large language models have been shown to memorize and regurgitate training data verbatim, potentially exposing private information like credit card numbers or personal identifiers.

---

### 2. Differential Privacy (DP)

**Differential Privacy** provides a mathematical guarantee that the presence or absence of any single individual's data has only a bounded effect on the distribution of the model's output.

#### Formal Definition

A randomized mechanism $\mathcal{M}$ satisfies $(\epsilon, \delta)$-differential privacy if for all datasets $D_1$ and $D_2$ differing in at most one element, and for every set of possible outputs $S$:

$$P[\mathcal{M}(D_1) \in S] \leq e^\epsilon \cdot P[\mathcal{M}(D_2) \in S] + \delta$$

**Parameters:**
- $\epsilon$ (epsilon): Privacy budget. Smaller values = stronger privacy (typical: 0.1 to 10)
- $\delta$ (delta): Failure probability, the small chance that the $\epsilon$ bound does not hold (typically chosen much smaller than $\frac{1}{n}$, e.g. $\frac{1}{n^2}$, where $n$ is dataset size)
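
As a concrete check of the definition, consider randomized response, a classic mechanism that is not otherwise covered in this lesson: each person reports their true bit with probability 3/4 and the flipped bit otherwise. The sketch below (function names are illustrative) computes the exact output probabilities for two neighbouring inputs and checks the inequality with $\delta = 0$, where testing each individual output is sufficient.

```python
import math

def randomized_response_probs(true_bit, p_truth=0.75):
    """Return {reported_bit: probability} for one person's answer."""
    return {true_bit: p_truth, 1 - true_bit: 1 - p_truth}

def satisfies_pure_dp(probs_a, probs_b, epsilon, tol=1e-12):
    """Check P_a[o] <= e^eps * P_b[o] for every output o, in both directions."""
    bound = math.exp(epsilon)
    return all(
        probs_a[o] <= bound * probs_b[o] + tol
        and probs_b[o] <= bound * probs_a[o] + tol
        for o in probs_a
    )

# Neighbouring "datasets": the same person holding bit 0 versus bit 1
p0 = randomized_response_probs(0)
p1 = randomized_response_probs(1)

print(satisfies_pure_dp(p0, p1, epsilon=math.log(3)))  # True: ratio 0.75 / 0.25 = 3
print(satisfies_pure_dp(p0, p1, epsilon=0.5))          # False: e^0.5 is below 3
```

The worst-case probability ratio is exactly 3, so this mechanism is $\ln 3$-differentially private and no smaller $\epsilon$ works.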

#### Mechanisms for Achieving DP

**1. Laplace Mechanism** (for numerical queries):
$$\tilde{f}(D) = f(D) + \text{Lap}\left(\frac{\Delta f}{\epsilon}\right)$$

where $\Delta f$ is the sensitivity (max change in output when one record changes).

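**2. Gaussian Mechanism** (gives $(\epsilon, \delta)$-DP and composes well across many queries):
$$\tilde{f}(D) = f(D) + \mathcal{N}\left(0, \frac{2\Delta f^2 \ln(1.25/\delta)}{\epsilon^2}\right)$$

Here $\Delta f$ is the $L_2$ sensitivity, and this calibration is derived for $0 < \epsilon < 1$.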
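
The variance in the formula above translates directly into a noise standard deviation. The following is a minimal sketch, assuming an $L_2$ sensitivity and the classical calibration (which is derived for $0 < \epsilon < 1$); the function names are illustrative.

```python
import numpy as np

def gaussian_sigma(l2_sensitivity, epsilon, delta):
    """Noise standard deviation: sqrt(2 ln(1.25/delta)) * sensitivity / epsilon."""
    return l2_sensitivity * np.sqrt(2 * np.log(1.25 / delta)) / epsilon

def gaussian_mechanism(true_value, l2_sensitivity, epsilon, delta, rng):
    """Release true_value plus Gaussian noise calibrated to (epsilon, delta)."""
    sigma = gaussian_sigma(l2_sensitivity, epsilon, delta)
    return true_value + rng.normal(0.0, sigma, size=np.shape(true_value))

rng = np.random.default_rng(0)
true_count = 120  # a counting query changes by at most 1 per record
noisy_count = gaussian_mechanism(true_count, l2_sensitivity=1.0,
                                 epsilon=0.5, delta=1e-5, rng=rng)
print(f"sigma = {gaussian_sigma(1.0, 0.5, 1e-5):.2f}, noisy count = {noisy_count:.1f}")
```

Halving $\epsilon$ doubles the standard deviation, while shrinking $\delta$ only increases it logarithmically.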

**3. Exponential Mechanism** (for selecting from discrete sets):
Used when adding noise to numerical output isn't appropriate (e.g., choosing best model parameters).
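
A minimal sketch of the exponential mechanism, assuming the standard form in which a candidate is sampled with probability proportional to $\exp(\epsilon \cdot \text{score} / (2\Delta))$. The candidate names and scores below are made up for illustration.

```python
import numpy as np

def exponential_mechanism(candidates, scores, epsilon, sensitivity, rng):
    """Sample a candidate with probability proportional to exp(eps * score / (2 * sensitivity))."""
    scores = np.asarray(scores, dtype=float)
    logits = epsilon * scores / (2.0 * sensitivity)
    logits -= logits.max()          # shift for numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()
    index = rng.choice(len(candidates), p=probs)
    return candidates[index], probs

rng = np.random.default_rng(0)
candidates = ["model_a", "model_b", "model_c"]
# Hypothetical scores: number of validation examples each model classifies correctly.
# Changing one validation record changes each count by at most 1, so sensitivity = 1.
scores = [30, 25, 5]
choice, probs = exponential_mechanism(candidates, scores, epsilon=1.0,
                                      sensitivity=1.0, rng=rng)
print(choice, np.round(probs, 4))
```

Higher-scoring candidates are exponentially more likely to be chosen, but every candidate keeps a non-zero probability, which is what hides any single record's influence.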

#### DP-SGD (Differentially Private Stochastic Gradient Descent)

For training neural networks with privacy:

1. **Clip gradients** per sample: $\bar{g}_i = g_i / \max(1, \frac{\|g_i\|_2}{C})$
2. **Add Gaussian noise**: $\tilde{g} = \frac{1}{N}\left(\sum_{i=1}^{N} \bar{g}_i + \mathcal{N}(0, \sigma^2 C^2 I)\right)$
3. **Update model**: $\theta_{t+1} = \theta_t - \eta \tilde{g}$

The privacy budget is tracked using the **privacy accounting** framework (typically using Rényi DP or moments accountant).
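
The three steps above can be written as one vectorized NumPy update. This is a sketch of a single step only: it assumes the per-sample gradients are already available as an `(N, d)` array, and it does no privacy accounting. The name `noise_multiplier` stands for $\sigma$ and is an illustrative choice.

```python
import numpy as np

def dp_sgd_step(theta, per_sample_grads, clip_norm, noise_multiplier, lr, rng):
    """One DP-SGD update from an (N, d) array of per-sample gradients."""
    norms = np.linalg.norm(per_sample_grads, axis=1, keepdims=True)
    # Step 1: rescale each gradient so its L2 norm is at most clip_norm (C)
    clipped = per_sample_grads / np.maximum(1.0, norms / clip_norm)
    # Step 2: sum, add N(0, sigma^2 C^2 I) noise, then divide by the batch size
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=theta.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / len(per_sample_grads)
    # Step 3: ordinary gradient descent update with the noisy gradient
    return theta - lr * noisy_mean

rng = np.random.default_rng(0)
theta = np.zeros(2)
grads = np.array([[3.0, 4.0],    # norm 5, will be clipped to norm 1
                  [0.3, 0.4]])   # norm 0.5, left unchanged
theta_new = dp_sgd_step(theta, grads, clip_norm=1.0,
                        noise_multiplier=1.1, lr=0.1, rng=rng)
print(theta_new)
```

Clipping bounds each example's contribution to the sum by $C$, which is what makes the Gaussian noise scale $\sigma C$ meaningful.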

---

### 3. Homomorphic Encryption (HE)

Homomorphic encryption allows computation on encrypted data without decryption. The result, when decrypted, matches the result of operations on plaintext.

#### Types of Homomorphic Encryption

**Partially Homomorphic Encryption (PHE):**
- RSA: supports multiplication
- Paillier: supports addition
- Example: $E(a) \oplus E(b) = E(a + b)$, where $\oplus$ is an operation on ciphertexts (for Paillier, multiplication modulo $n^2$)

**Somewhat Homomorphic Encryption (SWHE):**
- Limited depth of operations (noise accumulates)

**Fully Homomorphic Encryption (FHE):**
- Arbitrary computation on encrypted data
- Schemes: BFV, CKKS, TFHE
- Practical for some workloads but computationally expensive (slowdowns of several orders of magnitude, on the order of 10,000x, are commonly cited)
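
To make the additive property of Paillier concrete, here is a toy pure-Python sketch. It assumes tiny hard-coded primes and the common simplification $g = n + 1$, so it is insecure and meant only to show that multiplying ciphertexts adds the underlying plaintexts. The function names are illustrative.

```python
import math
import random

def paillier_keygen(p=293, q=433):
    """Toy key generation with tiny, insecure primes (illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse (Python 3.8+); valid because g = n + 1
    return (n, n + 1), (lam, mu)

def encrypt(pub, m, rng):
    n, g = pub
    n_sq = n * n
    while True:  # pick a random r that is invertible modulo n
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def add_encrypted(pub, c1, c2):
    """Homomorphic addition: multiply ciphertexts modulo n^2."""
    n, _ = pub
    return (c1 * c2) % (n * n)

rng = random.Random(0)
pub, priv = paillier_keygen()
c_sum = add_encrypted(pub, encrypt(pub, 17, rng), encrypt(pub, 25, rng))
print(decrypt(pub, priv, c_sum))  # 42
```

The party doing the addition never sees 17 or 25, only ciphertexts; only the private-key holder can recover the sum.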

#### CKKS Scheme (for Approximate Arithmetic)

Used for ML applications:

$$E(m_1) \otimes E(m_2) = E(m_1 \times m_2)$$
$$E(m_1) \oplus E(m_2) = E(m_1 + m_2)$$

**Key insight**: Allows approximate computations (suitable for floating-point ML operations).

---

### 4. Secure Multi-Party Computation (SMPC)

SMPC allows multiple parties to jointly compute a function on their private inputs without revealing them to each other.

#### Secret Sharing

**Additive Secret Sharing:**
Split secret $x$ into $n$ shares: $x = x_1 + x_2 + \ldots + x_n$

Each party holds $x_i$; no single party knows $x$.

**Example: Computing Sum Privately**
- Alice has $a$, Bob has $b$, Carol has $c$
- Split each value into 3 shares
- Distribute shares so each party has one share of each value
- Each computes local sum, then reveal and combine
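
The private-sum example above can be sketched in a few lines. This version assumes shares are taken modulo a fixed prime (so that an individual share reveals nothing about the secret) and uses made-up input values.

```python
import secrets

PRIME = 2**61 - 1  # modulus for the shares; an assumption of this sketch

def share(secret, n_parties=3):
    """Split a secret into n additive shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    return sum(shares) % PRIME

inputs = {"Alice": 52, "Bob": 17, "Carol": 31}
all_shares = [share(value) for value in inputs.values()]

# Party i receives the i-th share of every input and adds them locally
local_sums = [sum(s[i] for s in all_shares) % PRIME for i in range(3)]

# Only the local sums are revealed; combining them yields the total
total = reconstruct(local_sums)
print(total)  # 100
```

Each party sees three uniformly random-looking numbers, yet the combined result equals $a + b + c$.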

#### Garbled Circuits

Used for arbitrary boolean circuits:
1. Convert function to boolean circuit
2. "Garble" the circuit (encrypt truth tables)
3. Execute using oblivious transfer
4. Reveal only the output

**Cost**: $O(|C|)$ where $|C|$ is circuit size. Practical for small functions.
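
The garbling step can be illustrated for a single AND gate. The sketch below is a simplification under stated assumptions: SHA-256 of the two input labels serves as a one-time pad, an all-zero tag marks the correctly decrypted row, and oblivious transfer is skipped (the evaluator is simply handed its input labels). All names are illustrative.

```python
import hashlib
import secrets

LABEL_BYTES = 16
TAG = bytes(LABEL_BYTES)  # all-zero tag marks a correct decryption

def _pad(label_a, label_b):
    """Derive a 32-byte one-time pad from the two input labels."""
    return hashlib.sha256(label_a + label_b).digest()

def _xor(x, y):
    return bytes(a ^ b for a, b in zip(x, y))

def garble_and_gate():
    """Return (wire labels, garbled table) for one AND gate."""
    labels = {
        wire: (secrets.token_bytes(LABEL_BYTES), secrets.token_bytes(LABEL_BYTES))
        for wire in ("a", "b", "out")
    }
    table = []
    for bit_a in (0, 1):
        for bit_b in (0, 1):
            out_label = labels["out"][bit_a & bit_b]
            pad = _pad(labels["a"][bit_a], labels["b"][bit_b])
            table.append(_xor(pad, out_label + TAG))
    secrets.SystemRandom().shuffle(table)  # hide which row encodes which inputs
    return labels, table

def evaluate(table, label_a, label_b):
    """Holding one label per input wire, recover exactly one output label."""
    pad = _pad(label_a, label_b)
    for row in table:
        plain = _xor(pad, row)
        if plain[LABEL_BYTES:] == TAG:
            return plain[:LABEL_BYTES]
    raise ValueError("no row decrypted with these labels")

labels, table = garble_and_gate()
for a in (0, 1):
    for b in (0, 1):
        out_label = evaluate(table, labels["a"][a], labels["b"][b])
        # Only the garbler, who knows both output labels, can decode the bit
        print(a, b, "->", labels["out"].index(out_label))
```

The evaluator only ever sees random-looking labels; mapping the output label back to a bit is the "reveal only the output" step.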

---

### 5. Federated Learning with Privacy

Combining federated learning (Day 91) with privacy:

**Privacy Enhancements:**
1. **Secure Aggregation**: Mask or encrypt model updates so the server learns only their aggregate
2. **Differential Privacy**: Add noise to gradients before sharing
3. **Homomorphic Encryption**: Aggregate encrypted gradients on server
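
One common way to realize secure aggregation is pairwise masking: every pair of clients agrees on a random mask, one adds it and the other subtracts it, so the masks cancel in the sum. The sketch below is a simplification, assuming integer (quantized) updates, arithmetic modulo $2^{32}$, and masks drawn from a shared generator instead of a key-agreement protocol; it also ignores client dropouts.

```python
import numpy as np

MODULUS = 2**32  # all arithmetic is done modulo this value (an assumption)

def masked_updates(updates, rng):
    """Add pairwise masks that cancel when all masked updates are summed."""
    n = len(updates)
    masked = [np.asarray(u, dtype=np.int64) % MODULUS for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            # Mask known only to clients i and j
            mask = rng.integers(0, MODULUS, size=masked[i].shape, dtype=np.int64)
            masked[i] = (masked[i] + mask) % MODULUS
            masked[j] = (masked[j] - mask) % MODULUS
    return masked

rng = np.random.default_rng(0)
updates = [np.array([1, 2, 3]), np.array([10, 20, 30]), np.array([100, 200, 300])]
masked = masked_updates(updates, rng)

# The server sees only the masked vectors, yet their sum is the true aggregate
aggregate = sum(masked) % MODULUS
print(aggregate)  # [111 222 333]
```

Each individual masked update looks uniformly random to the server, which is why only the aggregate is learned.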

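**Privacy Budget Management:**

Under basic sequential composition, the privacy losses of $T$ rounds add up:
$$\epsilon_{\text{total}} = \sum_{t=1}^{T} \epsilon_t$$

This sum is an upper bound; tighter accounting methods (Rényi DP, moments accountant) usually give a smaller total. Use adaptive clipping and noise scheduling to minimize privacy loss.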
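
To see how quickly a budget is consumed, the sketch below compares the simple sum with the advanced composition bound as stated in Dwork & Roth, $\epsilon' = \sqrt{2T\ln(1/\delta')}\,\epsilon + T\epsilon(e^{\epsilon} - 1)$, for $T$ rounds that each cost the same $\epsilon$. The per-round values are made up for illustration.

```python
import math

def basic_composition(eps_per_round, rounds):
    """Total epsilon under basic sequential composition."""
    return eps_per_round * rounds

def advanced_composition(eps_per_round, rounds, delta_slack):
    """Total epsilon under advanced composition, at the cost of an extra delta_slack."""
    return (math.sqrt(2 * rounds * math.log(1 / delta_slack)) * eps_per_round
            + rounds * eps_per_round * (math.exp(eps_per_round) - 1))

eps_per_round, rounds = 0.01, 1000
print(f"basic:    {basic_composition(eps_per_round, rounds):.2f}")
print(f"advanced: {advanced_composition(eps_per_round, rounds, delta_slack=1e-5):.2f}")
```

For many small steps the advanced bound grows roughly with $\sqrt{T}$ instead of $T$, which is the motivation for the tighter accountants used in DP-SGD libraries.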

---

### 6. Privacy-Utility Tradeoff

There's always a tradeoff between privacy and model accuracy:

- **Strong privacy** (low $\epsilon$) → More noise → Lower accuracy
- **Weak privacy** (high $\epsilon$) → Less noise → Higher accuracy

**Key considerations:**
- Dataset size (larger datasets tolerate more noise)
- Model complexity (simpler models may be more robust to noise)
- Task sensitivity (medical diagnosis vs. movie recommendations)
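
The dataset-size point can be quantified for a Laplace-protected mean. If values are bounded within a range $R$, the sensitivity of the mean is $R/n$, and the expected absolute value of $\text{Lap}(b)$ noise is $b$, so the expected error is $R/(n\epsilon)$. A small sketch, using an illustrative range of 45,000:

```python
def expected_abs_error_of_private_mean(value_range, n_records, epsilon):
    """Expected |noise| for a Laplace-protected mean of bounded values."""
    sensitivity = value_range / n_records   # one record moves the mean by at most this
    return sensitivity / epsilon            # E|Lap(b)| = b with b = sensitivity / epsilon

for n in (10, 1_000, 100_000):
    err = expected_abs_error_of_private_mean(45_000, n, epsilon=1.0)
    print(f"n = {n:>7,}: expected error = {err:,.2f}")
```

At the same $\epsilon$, the error shrinks in proportion to $1/n$, which is why larger datasets can afford stronger privacy.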

In [None]:
# Import necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_breast_cancer, make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
import warnings
warnings.filterwarnings('ignore')

# Set random seeds for reproducibility
np.random.seed(42)

# Configure plotting
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")

print("✓ Libraries imported successfully")
print("✓ Privacy-preserving ML lesson environment ready")

## Implementation 1: Differential Privacy with Laplace Mechanism

Let's implement a basic differential privacy mechanism for protecting aggregate statistics.

In [None]:
class LaplaceMechanism:
    """
    Implements the Laplace Mechanism for differential privacy.
    Adds Laplace noise calibrated to sensitivity and epsilon.
    """
    
    def __init__(self, epsilon, sensitivity):
        """
        Parameters:
        - epsilon: Privacy budget (smaller = more private)
        - sensitivity: Maximum change in output when one record changes
        """
        self.epsilon = epsilon
        self.sensitivity = sensitivity
        self.scale = sensitivity / epsilon
    
    def add_noise(self, true_value):
        """Add Laplace noise to protect privacy"""
        noise = np.random.laplace(0, self.scale)
        return true_value + noise
    
    def release_statistic(self, data, query_func):
        """
        Release a differentially private statistic.
        
        Parameters:
        - data: Input dataset
        - query_func: Function to compute statistic (e.g., mean, sum)
        """
        true_result = query_func(data)
        private_result = self.add_noise(true_result)
        return private_result, true_result

# Example: Computing average salary with privacy
np.random.seed(42)
salaries = np.array([50000, 55000, 60000, 65000, 70000, 75000, 80000, 85000, 90000, 95000])

print("=" * 60)
print("DIFFERENTIAL PRIVACY EXAMPLE: Average Salary")
print("=" * 60)
print(f"\nTrue average salary: ${np.mean(salaries):,.2f}")

# Test different epsilon values
epsilons = [0.1, 1.0, 5.0, 10.0]
sensitivity = 45000 / len(salaries)  # Max change if one salary changes by $45k

for eps in epsilons:
    mechanism = LaplaceMechanism(epsilon=eps, sensitivity=sensitivity)
    private_mean, _ = mechanism.release_statistic(salaries, np.mean)
    print(f"\nε = {eps:4.1f}: ${private_mean:,.2f} (privacy: {'strong' if eps < 1 else 'moderate' if eps < 5 else 'weak'})")

print("\n" + "=" * 60)
print("Note: Smaller epsilon = stronger privacy but more noise")
print("=" * 60)

In [None]:
# Visualization: Privacy-Utility Tradeoff
np.random.seed(42)

# Simulate the effect of epsilon on query accuracy
true_mean = 70000
n_trials = 100
epsilons = np.logspace(-1, 2, 50)  # 0.1 to 100
sensitivity = 45000 / 10

results = []
for epsilon in epsilons:
    errors = []
    for _ in range(n_trials):
        mechanism = LaplaceMechanism(epsilon=epsilon, sensitivity=sensitivity)
        private_mean, _ = mechanism.release_statistic(np.array([true_mean]), lambda x: x[0])
        error = abs(private_mean - true_mean)
        errors.append(error)
    results.append(np.mean(errors))

# Create figure with subplots
fig, axes = plt.subplots(1, 2, figsize=(15, 5))

# Plot 1: Mean Absolute Error vs Epsilon
axes[0].plot(epsilons, results, linewidth=2.5, color='#e74c3c')
axes[0].fill_between(epsilons, results, alpha=0.3, color='#e74c3c')
axes[0].set_xscale('log')
axes[0].set_xlabel('Privacy Budget (ε)', fontsize=12, fontweight='bold')
axes[0].set_ylabel('Mean Absolute Error ($)', fontsize=12, fontweight='bold')
axes[0].set_title('Privacy-Utility Tradeoff\n(Lower ε = More Privacy, Higher Error)', 
                  fontsize=13, fontweight='bold', pad=15)
axes[0].grid(True, alpha=0.3)
axes[0].axvline(x=1.0, color='green', linestyle='--', label='ε=1 (Strong Privacy)', linewidth=2)
axes[0].axvline(x=10.0, color='orange', linestyle='--', label='ε=10 (Weak Privacy)', linewidth=2)
axes[0].legend(fontsize=10)

# Plot 2: Distribution of noisy results for different epsilons
test_epsilons = [0.5, 2.0, 10.0]
colors = ['#e74c3c', '#f39c12', '#2ecc71']

for i, eps in enumerate(test_epsilons):
    mechanism = LaplaceMechanism(epsilon=eps, sensitivity=sensitivity)
    samples = [mechanism.add_noise(true_mean) for _ in range(1000)]
    axes[1].hist(samples, bins=50, alpha=0.5, label=f'ε={eps}', color=colors[i], density=True)

axes[1].axvline(x=true_mean, color='black', linestyle='--', linewidth=2, label='True Value')
axes[1].set_xlabel('Noisy Query Result ($)', fontsize=12, fontweight='bold')
axes[1].set_ylabel('Density', fontsize=12, fontweight='bold')
axes[1].set_title('Distribution of Private Results\n(More noise at lower ε)', 
                  fontsize=13, fontweight='bold', pad=15)
axes[1].legend(fontsize=10)
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("\n📊 Key Observations:")
print("  • Lower epsilon (ε) adds more noise → stronger privacy, less accuracy")
print("  • Higher epsilon adds less noise → weaker privacy, more accuracy")
print("  • Choosing ε requires balancing privacy needs vs. utility requirements")

## Implementation 2: Differentially Private Machine Learning

Now let's implement a privacy-preserving logistic regression model using gradient clipping and noise addition (simplified DP-SGD).

In [None]:
# Load dataset (Breast Cancer classification)
data = load_breast_cancer()
X, y = data.data, data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Normalize features
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

print("=" * 60)
print("DATASET: Breast Cancer Classification")
print("=" * 60)
print(f"Training samples: {len(X_train)}")
print(f"Test samples: {len(X_test)}")
print(f"Features: {X_train.shape[1]}")


class DPLogisticRegression:
    """
    Differentially Private Logistic Regression using gradient clipping and noise.
    Simplified implementation of DP-SGD principles.
    """
    
    def __init__(self, epsilon=1.0, delta=1e-5, clip_norm=1.0, learning_rate=0.01):
        self.epsilon = epsilon
        self.delta = delta
        self.clip_norm = clip_norm
        self.learning_rate = learning_rate
        self.weights = None
        self.bias = None
        
    def _sigmoid(self, z):
        return 1 / (1 + np.exp(-np.clip(z, -500, 500)))
    
    def _clip_gradient(self, gradient):
        """Clip gradient to bounded L2 norm"""
        norm = np.linalg.norm(gradient)
        if norm > self.clip_norm:
            return gradient * (self.clip_norm / norm)
        return gradient
    
    def _add_noise(self, gradient, n_samples):
        """Add Gaussian noise calibrated to privacy budget"""
        # Simplified noise calculation: epsilon is spent on every step, so the total
        # privacy cost of training is larger (proper DP-SGD uses privacy accounting)
        noise_scale = self.clip_norm * np.sqrt(2 * np.log(1.25 / self.delta)) / self.epsilon
        noise = np.random.normal(0, noise_scale / n_samples, gradient.shape)
        return gradient + noise
    
    def fit(self, X, y, epochs=100, batch_size=32):
        n_samples, n_features = X.shape
        self.weights = np.zeros(n_features)
        self.bias = 0
        
        for epoch in range(epochs):
            # Shuffle data
            indices = np.random.permutation(n_samples)
            X_shuffled = X[indices]
            y_shuffled = y[indices]
            
            # Mini-batch training
            for i in range(0, n_samples, batch_size):
                X_batch = X_shuffled[i:i+batch_size]
                y_batch = y_shuffled[i:i+batch_size]
                
                # Compute predictions
                predictions = self._sigmoid(np.dot(X_batch, self.weights) + self.bias)
                
                # Compute per-sample gradients over [weights, bias] and clip them jointly,
                # so the bias update is covered by the same sensitivity bound
                errors = predictions - y_batch
                grad = np.zeros(n_features + 1)
                
                for j in range(len(X_batch)):
                    sample_grad = np.append(X_batch[j] * errors[j], errors[j])
                    grad += self._clip_gradient(sample_grad)
                
                grad /= len(X_batch)
                
                # Add noise to the full gradient (bias included) for privacy
                grad = self._add_noise(grad, len(X_batch))
                
                # Update weights and bias from the noisy gradient
                self.weights -= self.learning_rate * grad[:-1]
                self.bias -= self.learning_rate * grad[-1]
    
    def predict(self, X):
        predictions = self._sigmoid(np.dot(X, self.weights) + self.bias)
        return (predictions >= 0.5).astype(int)
    
    def score(self, X, y):
        predictions = self.predict(X)
        return accuracy_score(y, predictions)


# Train models with different privacy levels
print("\n" + "=" * 60)
print("COMPARING NON-PRIVATE VS PRIVATE MODELS")
print("=" * 60)

# Non-private baseline
baseline_model = LogisticRegression(max_iter=1000, random_state=42)
baseline_model.fit(X_train, y_train)
baseline_acc = baseline_model.score(X_test, y_test)
print(f"\n✓ Non-Private Model: {baseline_acc:.4f} accuracy")

# Private models with different epsilon values
privacy_configs = [
    (0.5, "Strong Privacy"),
    (2.0, "Moderate Privacy"),
    (10.0, "Weak Privacy")
]

for epsilon, label in privacy_configs:
    dp_model = DPLogisticRegression(epsilon=epsilon, clip_norm=1.0, learning_rate=0.01)
    dp_model.fit(X_train, y_train, epochs=50, batch_size=32)
    dp_acc = dp_model.score(X_test, y_test)
    accuracy_loss = baseline_acc - dp_acc
    print(f"✓ DP Model (ε={epsilon:4.1f}): {dp_acc:.4f} accuracy | {label} | Loss: {accuracy_loss:.4f}")

print("\n" + "=" * 60)
print("💡 Compare the accuracies above to see the privacy-accuracy tradeoff.")
print("   Lower epsilon → More privacy → More noise → Typically lower accuracy")
print("=" * 60)

## Hands-On Activity: Privacy Attack Simulation

Let's simulate a **membership inference attack** to understand why privacy-preserving techniques matter. Then, we'll demonstrate how differential privacy protects against such attacks.

### Scenario
Can an attacker determine if a specific record was in the training set by observing model predictions?

### Your Task
1. Train two models: one with a specific record, one without
2. Compare their predictions on that record
3. Show how DP makes this attack harder

In [None]:
# Membership Inference Attack Simulation
np.random.seed(42)

# Create synthetic dataset
X_full, y_full = make_classification(n_samples=500, n_features=20, n_informative=15, 
                                      n_redundant=5, random_state=42)
X_full = StandardScaler().fit_transform(X_full)

# Select a target record to attack
target_idx = 100
X_target = X_full[target_idx:target_idx+1]
y_target = y_full[target_idx:target_idx+1]

# Create two datasets: with and without target
X_with_target = X_full
y_with_target = y_full

X_without_target = np.delete(X_full, target_idx, axis=0)
y_without_target = np.delete(y_full, target_idx)

print("=" * 70)
print("MEMBERSHIP INFERENCE ATTACK SIMULATION")
print("=" * 70)
print(f"Target record index: {target_idx}")
print(f"Dataset with target: {len(X_with_target)} samples")
print(f"Dataset without target: {len(X_without_target)} samples")


def membership_attack(X_train_with, y_train_with, X_train_without, y_train_without, 
                      X_target, y_target, private=False, epsilon=1.0):
    """
    Perform membership inference attack.
    Returns confidence scores for both models.
    """
    if private:
        # Train DP models
        model_with = DPLogisticRegression(epsilon=epsilon, learning_rate=0.01)
        model_with.fit(X_train_with, y_train_with, epochs=30, batch_size=32)
        
        model_without = DPLogisticRegression(epsilon=epsilon, learning_rate=0.01)
        model_without.fit(X_train_without, y_train_without, epochs=30, batch_size=32)
        
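        # Probability assigned to the target's true label (consistent with the non-private branch)
        p1_with = model_with._sigmoid(np.dot(X_target, model_with.weights) + model_with.bias)[0]
        p1_without = model_without._sigmoid(np.dot(X_target, model_without.weights) + model_without.bias)[0]
        pred_with = p1_with if y_target[0] == 1 else 1 - p1_with
        pred_without = p1_without if y_target[0] == 1 else 1 - p1_without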
    else:
        # Train non-private models
        model_with = LogisticRegression(max_iter=1000, random_state=42)
        model_with.fit(X_train_with, y_train_with)
        
        model_without = LogisticRegression(max_iter=1000, random_state=42)
        model_without.fit(X_train_without, y_train_without)
        
        # Get prediction probabilities
        pred_with = model_with.predict_proba(X_target)[0, y_target[0]]
        pred_without = model_without.predict_proba(X_target)[0, y_target[0]]
    
    # Attack metric: difference in confidence
    confidence_gap = abs(pred_with - pred_without)
    return pred_with, pred_without, confidence_gap


print("\n" + "-" * 70)
print("ATTACK 1: Non-Private Models")
print("-" * 70)

pred_with, pred_without, gap = membership_attack(
    X_with_target, y_with_target, 
    X_without_target, y_without_target,
    X_target, y_target, 
    private=False
)

print(f"Model WITH target confidence:    {pred_with:.4f}")
print(f"Model WITHOUT target confidence: {pred_without:.4f}")
print(f"Confidence gap:                  {gap:.4f}")
print(f"\n🚨 Attack Result: {'MEMBER DETECTED' if gap > 0.1 else 'INCONCLUSIVE'}")
print(f"   (Large gap indicates target was likely in training set)")


print("\n" + "-" * 70)
print("ATTACK 2: Differentially Private Models (ε=1.0)")
print("-" * 70)

pred_with_dp, pred_without_dp, gap_dp = membership_attack(
    X_with_target, y_with_target,
    X_without_target, y_without_target,
    X_target, y_target,
    private=True, epsilon=1.0
)

print(f"DP Model WITH target confidence:    {pred_with_dp:.4f}")
print(f"DP Model WITHOUT target confidence: {pred_without_dp:.4f}")
print(f"Confidence gap:                     {gap_dp:.4f}")
print(f"\n✅ Attack Result: {'MEMBER DETECTED' if gap_dp > 0.1 else 'PROTECTED - INCONCLUSIVE'}")
print(f"   (Smaller gap = privacy protection working)")


print("\n" + "=" * 70)
print("📊 ATTACK EFFECTIVENESS COMPARISON")
print("=" * 70)
print(f"Non-Private Gap:  {gap:.4f} ({'Vulnerable' if gap > 0.1 else 'Protected'})")
print(f"Private Gap (DP): {gap_dp:.4f} ({'Vulnerable' if gap_dp > 0.1 else 'Protected'})")
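if gap > 0:  # avoid dividing by zero when the non-private gap vanishes
    print(f"\nGap Reduction: {((gap - gap_dp) / gap * 100):.1f}%")
print("\n💡 Differential privacy bounds how much any single record can change the model's output distribution.")
print("   A single pair of noisy models is only one sample, so repeat the comparison over several seeds before drawing conclusions.")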
print("=" * 70)

## Key Takeaways

✅ **Privacy Threats are Real**: Machine learning models are vulnerable to attacks like membership inference, model inversion, and training data extraction.

✅ **Differential Privacy Provides Mathematical Guarantees**: The (ε, δ)-DP definition ensures that any single individual's data has negligible impact on model outputs.

✅ **Privacy Has a Cost**: There's always a tradeoff between privacy (controlled by ε) and model utility/accuracy. Lower ε = stronger privacy but more noise.

✅ **Multiple Approaches Exist**: 
   - **Differential Privacy**: Add calibrated noise to data or gradients
   - **Homomorphic Encryption**: Compute on encrypted data
   - **Secure Multi-Party Computation**: Collaborative learning without data sharing

✅ **DP-SGD for Deep Learning**: Gradient clipping + noise addition enables training neural networks with formal privacy guarantees.

✅ **Real-World Impact**: Privacy-preserving ML is critical for healthcare, finance, federated learning, and regulatory compliance (GDPR, HIPAA).

✅ **Implementation Considerations**: 
   - Choose ε based on sensitivity of data
   - Larger datasets tolerate noise better
   - Privacy budget depletes with each query/training iteration
   - Use privacy accounting frameworks (Rényi DP, moments accountant)

**Remember**: Privacy is not an afterthought; it should be built into the ML pipeline from the start!

## Further Resources

### Research Papers & Foundational Work
- [The Algorithmic Foundations of Differential Privacy (Dwork & Roth)](https://www.cis.upenn.edu/~aaroth/Papers/privacybook.pdf) - Comprehensive textbook on DP theory
- [Deep Learning with Differential Privacy (Abadi et al., 2016)](https://arxiv.org/abs/1607.00133) - Original DP-SGD paper
- [Membership Inference Attacks Against Machine Learning Models (Shokri et al., 2017)](https://arxiv.org/abs/1610.05820) - Seminal attack paper

### Libraries & Tools
- [Opacus (PyTorch)](https://opacus.ai/) - DP training for PyTorch models with privacy accounting
- [TensorFlow Privacy](https://github.com/tensorflow/privacy) - Google's DP library for TensorFlow
- [PySyft](https://github.com/OpenMined/PySyft) - Framework for encrypted, privacy-preserving ML
- [TenSEAL](https://github.com/OpenMined/TenSEAL) - Library for homomorphic encryption in Python
- [Diffprivlib (IBM)](https://github.com/IBM/differential-privacy-library) - General-purpose DP library

### Courses & Tutorials
- [Privacy-Preserving Machine Learning (Coursera)](https://www.coursera.org/learn/uva-dasi-privacy-preserving-machine-learning) - University of Virginia
- [Applied Privacy for Data Science (MOOC)](https://opendp.org/) - OpenDP Project tutorials
- [Differential Privacy Blog by Google](https://developers.googleblog.com/2019/09/enabling-developers-and-organizations.html) - Practical DP implementation

### Regulatory & Ethics Resources
- [GDPR Official Text](https://gdpr-info.eu/) - European data protection regulation
- [NIST Privacy Framework](https://www.nist.gov/privacy-framework) - U.S. privacy engineering guidelines
- [AI Ethics Guidelines (IEEE)](https://standards.ieee.org/industry-connections/ec/autonomous-systems/) - Ethical considerations

### Next Steps
- **Day 94**: Explore differential privacy techniques in depth
- **Day 95**: Learn about distributed training with gradient synchronization
- **Advanced Topic**: Study federated learning with secure aggregation (combining Day 91 + 93 concepts)