<a href="https://colab.research.google.com/github/AmanPriyanshu/DP-HyperparamTuning/blob/main/RL_DP_Demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Demo for RL-DP-Project:

## SET-UP

In [1]:
!nvidia-smi

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.



In [6]:
!git clone https://github.com/AmanPriyanshu/DP-HyperparamTuning.git ./RL_DP_Project

Cloning into './RL_DP_Project'...
remote: Enumerating objects: 53, done.

## Importing Everything:

In [5]:
!pip install opacus

Collecting opacus
  Downloading opacus-0.14.0-py3-none-any.whl (114 kB)
Installing collected packages: opacus
Successfully installed opacus-0.14.0


In [8]:
from opacus import PrivacyEngine
from tqdm.notebook import tqdm
import pandas as pd
import numpy as np
import torch
import random
import os
from sklearn.datasets import make_classification
import warnings
warnings.filterwarnings("ignore")

## Reading the Data:

In [9]:
class ClassificationDataset(torch.utils.data.Dataset):
	def __init__(self, x, y):
		self.x = x
		self.y = y

	def __len__(self):
		return len(self.x)

	def __getitem__(self, idx):
		return torch.from_numpy(self.x[idx].astype(np.float32)), torch.from_numpy(np.array([self.y[idx]]).astype(np.float32))

In [10]:
def load_sklearn(val_split=0.2):
	x, y = make_classification(n_samples=4000, n_features=8, n_informative=2, n_redundant=2, n_classes=2, n_clusters_per_class=2, flip_y=0.15, class_sep=1.5, hypercube=True, shift=0.0, shuffle=True, random_state=0)
	split = int((1-val_split)*len(x))  # index separating the training rows from the held-out rows
	train_x, train_y, test_x, test_y = x[:split], y[:split], x[split:], y[split:]
	return train_x, train_y, test_x, test_y

In [11]:
def load_dataset():
	train_x, train_y, test_x, test_y = load_sklearn()
	train_dataset = ClassificationDataset(train_x, train_y)
	test_dataset = ClassificationDataset(test_x, test_y)
	return train_dataset, test_dataset

In [12]:
train_dataset, test_dataset = load_dataset()
print("Training:", type(train_dataset), "Size:", len(train_dataset))
print("Testing:", type(test_dataset), "Size:", len(test_dataset))

Training: <class '__main__.ClassificationDataset'> Size: 3200
Testing: <class '__main__.ClassificationDataset'> Size: 800
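The sizes printed above follow from the split index used in `load_sklearn`. A minimal sketch of that arithmetic, using a hypothetical helper `split_index` that is not part of the notebook or the repository:

```python
def split_index(n_samples, val_split=0.2):
    """Index at which the first (1 - val_split) fraction of the rows ends."""
    return int((1 - val_split) * n_samples)


n_samples = 4000                  # n_samples passed to make_classification
cut = split_index(n_samples)      # rows [0, cut) go to training, the rest to testing
n_train, n_test = cut, n_samples - cut
print(n_train, n_test)
```

Because the cut is a plain slice, the split relies on `make_classification(shuffle=True)` having already shuffled the rows.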


## Creating a Torch-Model

In [13]:
def get_model():
	model = torch.nn.Sequential(
			torch.nn.Linear(8, 4),
			torch.nn.ReLU(),
			torch.nn.Linear(4, 1),
			torch.nn.Sigmoid(),
		)
	return model

In [14]:
print(get_model())

Sequential(
  (0): Linear(in_features=8, out_features=4, bias=True)
  (1): ReLU()
  (2): Linear(in_features=4, out_features=1, bias=True)
  (3): Sigmoid()
)
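As a quick sanity check on the model size, the number of trainable parameters can be worked out by hand from the layer shapes printed above. The helper `linear_params` below is a hypothetical name introduced only for this sketch; `ReLU` and `Sigmoid` contribute no parameters.

```python
def linear_params(in_features, out_features, bias=True):
    """Weights plus (optionally) one bias per output unit of a linear layer."""
    return in_features * out_features + (out_features if bias else 0)


# (in_features, out_features) of the two Linear layers shown in the printout
layers = [(8, 4), (4, 1)]
total = sum(linear_params(i, o) for i, o in layers)
print(total)
```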


## Loading our Algorithms:

In [15]:
from RL_DP_Project.experiment.train_single_model import Experiment
from RL_DP_Project.algorithms.bayesian_optimization import Bayesian
from RL_DP_Project.algorithms.grid_search_algorithm import GridSearch
from RL_DP_Project.algorithms.evolutionary_optimization import EvolutionaryOptimization
from RL_DP_Project.algorithms.reinforcement_learning_optimization import RLOptimization

## Running A DP-Experiment

In [16]:
def run_sample():
	criterion = torch.nn.BCELoss()
	train_dataset, test_dataset = load_dataset()
	e = Experiment(get_model, criterion, train_dataset, test_dataset)
	results = e.run_experiment(1, 0.001)  # presumably (nm, lr), matching the nm, lr column order used in the evaluation below
	print()
	print("RESULTS:")
	for key, item in results.items():
		print(key+":", round(item, 4))

In [17]:
run_sample()

{'type': 'training', 'epoch': 1, 'loss': 0.69, 'acc': 0.56}: 100%|██████████| 400/400 [00:04<00:00, 83.70it/s]
{'type': 'testing', 'epoch': 1, 'loss': 0.6925, 'acc': 0.5537}: 100%|██████████| 100/100 [00:00<00:00, 216.89it/s]
{'type': 'training', 'epoch': 2, 'loss': 0.6755, 'acc': 0.5856}: 100%|██████████| 400/400 [00:04<00:00, 81.59it/s]
{'type': 'testing', 'epoch': 2, 'loss': 0.6776, 'acc': 0.5787}: 100%|██████████| 100/100 [00:00<00:00, 203.11it/s]
{'type': 'training', 'epoch': 3, 'loss': 0.6602, 'acc': 0.6125}: 100%|██████████| 400/400 [00:04<00:00, 84.09it/s]
{'type': 'testing', 'epoch': 3, 'loss': 0.6614, 'acc': 0.5988}: 100%|██████████| 100/100 [00:00<00:00, 187.62it/s]
{'type': 'training', 'epoch': 4, 'loss': 0.6436, 'acc': 0.6347}: 100%|██████████| 400/400 [00:04<00:00, 81.83it/s]
{'type': 'testing', 'epoch': 4, 'loss': 0.6441, 'acc': 0.6162}: 100%|██████████| 100/100 [00:00<00:00, 153.76it/s]
{'type': 'training', 'epoch': 5, 'loss': 0.6252, 'acc': 0.6613}: 100%|██████████| 40


RESULTS:
eps: 1.2102
train_loss: 0.3983
val_loss: 0.3892
train_acc: 0.8809
val_acc: 0.8862





## Creating our Reward Function:

In [18]:
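def calculate_reward(eps, train_loss, val_loss, alpha_u=0.5, alpha_p=0.5):
	# Weighted sum of a privacy term exp(-eps) and a utility term exp(-val_loss).
	# Each term lies in (0, 1] for non-negative inputs; train_loss is accepted but not used.
	return alpha_p*np.exp(-(eps)) + alpha_u*np.exp(-(val_loss))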
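To get a feel for the scale of this reward, the sketch below evaluates the same formula on the numbers printed by `run_sample()` earlier (`eps: 1.2102`, `val_loss: 0.3892`). The function name `reward` is a stand-in introduced here, and `math.exp` replaces `np.exp` only so the snippet runs without NumPy.

```python
import math


def reward(eps, val_loss, alpha_u=0.5, alpha_p=0.5):
    # Same expression as calculate_reward, minus the unused train_loss argument.
    return alpha_p * math.exp(-eps) + alpha_u * math.exp(-val_loss)


sample = reward(eps=1.2102, val_loss=0.3892)  # values from the single DP run above
ideal = reward(eps=0.0, val_loss=0.0)         # upper bound with the default weights
print(round(sample, 4), ideal)
```

With the default weights the reward is bounded above by 1, and a lower privacy budget `eps` raises it just as a lower validation loss does.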

## Creating Functions to Run Optimizers:

In [19]:
def run_grid_search():
	criterion = torch.nn.BCELoss()
	train_dataset, test_dataset = load_dataset()
	e = Experiment(get_model, criterion, train_dataset, test_dataset)
	gs = GridSearch(e.run_experiment, calculate_reward, 10, search_space_nm=[2, 5], search_space_lr=[0.001, 0.05])
	progress = gs.run()
	return progress

In [20]:
def run_bayesian():
	criterion = torch.nn.BCELoss()
	train_dataset, test_dataset = load_dataset()
	e = Experiment(get_model, criterion, train_dataset, test_dataset)
	b = Bayesian(e.run_experiment, calculate_reward, 100, search_space_nm=[2, 5], search_space_lr=[0.001, 0.05])
	progress = b.run()
	return progress

In [21]:
def run_evolutionary_optimization():
	criterion = torch.nn.BCELoss()
	train_dataset, test_dataset = load_dataset()
	e = Experiment(get_model, criterion, train_dataset, test_dataset)
	eo = EvolutionaryOptimization(e.run_experiment, calculate_reward, 10, search_space_nm=[2, 5], search_space_lr=[0.001, 0.05])
	progress = eo.run()
	return progress

In [22]:
def run_reinforcement_learning_optimization():
	criterion = torch.nn.BCELoss()
	train_dataset, test_dataset = load_dataset()
	e = Experiment(get_model, criterion, train_dataset, test_dataset)
	rl = RLOptimization(e.run_experiment, calculate_reward, 10, search_space_nm=[2, 5], search_space_lr=[0.001, 0.05])
	progress = rl.run()
	return progress

## Running Each Optimizer:

In [23]:
gs_progress = run_grid_search()

100%|██████████| 100/100 [23:36<00:00, 14.17s/it]
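The 100 iterations reported here are consistent with a 10 x 10 grid over the two search spaces. The sketch below assumes evenly spaced values on each axis; that spacing is an assumption about `GridSearch`, not something confirmed from the repository, although it does reproduce the `nm` value of 4.666667 that appears for grid search in the final results table. The helper `linspace` is a plain-Python stand-in written for this example.

```python
def linspace(lo, hi, n):
    """n evenly spaced values from lo to hi, both endpoints included."""
    step = (hi - lo) / (n - 1)
    return [lo + k * step for k in range(n)]


nm_values = linspace(2, 5, 10)         # search_space_nm=[2, 5]
lr_values = linspace(0.001, 0.05, 10)  # search_space_lr=[0.001, 0.05]

# Every (nm, lr) pair is one DP training run.
grid = [(nm, lr) for nm in nm_values for lr in lr_values]
print(len(grid))
```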


In [24]:
bo_progress = run_bayesian()

100%|██████████| 100/100 [22:30<00:00, 13.50s/it, best loss: 0.21964238103345235]
{'lr': 0.0024137336369208263, 'nm': 4.877549655447803}
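The `best loss` in the progress bar appears to be `1 - reward`. This is inferred from the numbers alone and is not confirmed from the library code. The sketch below checks it against the Bayesian row of the final results table (`eps` 0.153917, `val_loss` 0.351870, `reward` 0.780358), using the reward formula defined earlier.

```python
import math

best_loss = 0.21964238103345235        # copied from the progress bar above
implied_reward = 1.0 - best_loss       # assumption: loss = 1 - reward

# Recompute the reward from the eps and val_loss reported in the results table.
eps, val_loss = 0.153917, 0.351870
recomputed = 0.5 * math.exp(-eps) + 0.5 * math.exp(-val_loss)

print(round(implied_reward, 6), round(recomputed, 4))
```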


In [25]:
eo_progress = run_evolutionary_optimization()

{'gen_num': 0, 'lr': 0.0206, 'nm': 2.3, 'eps': 0.3326, 'val_loss': 0.5004, 'reward': 0.6617}: 100%|██████████| 10/10 [02:18<00:00, 13.89s/it]

{'gen_num': 0, 'reward_mean': 0.6923151689180784, 'reward_max': 0.7569686079096618}


{'gen_num': 1, 'lr': 0.0206, 'nm': 4.1, 'eps': 0.176, 'val_loss': 0.5832, 'reward': 0.6984}: 100%|██████████| 9/9 [02:05<00:00, 13.95s/it]

{'gen_num': 1, 'reward_mean': 0.7300564713494908, 'reward_max': 0.7534460038120583}


{'gen_num': 2, 'lr': 0.0353, 'nm': 2.3, 'eps': 0.3326, 'val_loss': 0.4973, 'reward': 0.6626}: 100%|██████████| 9/9 [02:05<00:00, 13.95s/it]

{'gen_num': 2, 'reward_mean': 0.7306871899130688, 'reward_max': 0.7557390548625252}


{'gen_num': 3, 'lr': 0.0402, 'nm': 3.8, 'eps': 0.1885, 'val_loss': 0.5336, 'reward': 0.7073}: 100%|██████████| 9/9 [02:04<00:00, 13.84s/it]

{'gen_num': 3, 'reward_mean': 0.7295607055249926, 'reward_max': 0.7555506264692051}


{'gen_num': 4, 'lr': 0.0206, 'nm': 2.0, 'eps': 0.3944, 'val_loss': 0.4933, 'reward': 0.6423}: 100%|██████████| 9/9 [02:04<00:00, 13.84s/it]

{'gen_num': 4, 'reward_mean': 0.7305160926447191, 'reward_max': 0.7642557124915101}


{'gen_num': 5, 'lr': 0.0059, 'nm': 3.8, 'eps': 0.1885, 'val_loss': 0.4333, 'reward': 0.7383}: 100%|██████████| 9/9 [02:04<00:00, 13.88s/it]

{'gen_num': 5, 'reward_mean': 0.7453124008784301, 'reward_max': 0.7661512924830864}


{'gen_num': 6, 'lr': 0.0304, 'nm': 3.5, 'eps': 0.2047, 'val_loss': 0.564, 'reward': 0.6919}: 100%|██████████| 9/9 [02:03<00:00, 13.75s/it]

{'gen_num': 6, 'reward_mean': 0.7448884226340449, 'reward_max': 0.7633982450772268}


{'gen_num': 7, 'lr': 0.0353, 'nm': 4.1, 'eps': 0.176, 'val_loss': 0.7356, 'reward': 0.6589}: 100%|██████████| 9/9 [02:04<00:00, 13.82s/it]

{'gen_num': 7, 'reward_mean': 0.7376926754126129, 'reward_max': 0.7676821222215464}


{'gen_num': 8, 'lr': 0.0108, 'nm': 4.4, 'eps': 0.166, 'val_loss': 0.4717, 'reward': 0.7355}: 100%|██████████| 9/9 [02:04<00:00, 13.87s/it]

{'gen_num': 8, 'reward_mean': 0.748433337716076, 'reward_max': 0.764924089273108}


{'gen_num': 9, 'lr': 0.0402, 'nm': 4.4, 'eps': 0.166, 'val_loss': 1.1178, 'reward': 0.587}: 100%|██████████| 9/9 [02:03<00:00, 13.76s/it]

{'gen_num': 9, 'reward_mean': 0.7282234614522305, 'reward_max': 0.7659901341604181}





In [26]:
rl_progress = run_reinforcement_learning_optimization()

100%|██████████| 10/10 [02:19<00:00, 13.93s/it]
100%|██████████| 10/10 [02:18<00:00, 13.85s/it]
100%|██████████| 10/10 [02:19<00:00, 13.93s/it]
100%|██████████| 10/10 [02:20<00:00, 14.05s/it]
100%|██████████| 10/10 [02:18<00:00, 13.90s/it]
100%|██████████| 10/10 [02:17<00:00, 13.74s/it]
100%|██████████| 10/10 [02:20<00:00, 14.01s/it]
100%|██████████| 10/10 [02:19<00:00, 13.90s/it]
100%|██████████| 10/10 [02:17<00:00, 13.76s/it]
100%|██████████| 10/10 [02:18<00:00, 13.84s/it]


## Evaluating the Results:

In [27]:
eo_progress_ext = np.concatenate(eo_progress, 0)
rl_progress_ext = np.concatenate(rl_progress, 0)

In [28]:
print("Maximum Reward Achieved by Each Algorithm:")
algorithms = [gs_progress, bo_progress, eo_progress_ext, rl_progress_ext]
max_rewards_index = [np.argmax(i.T[-1]) for i in algorithms]
max_rewards = pd.DataFrame(np.stack([i[index] for i, index in zip(algorithms, max_rewards_index)]))
max_rewards.columns = 'nm, lr, eps, train_loss, val_loss, train_acc, val_acc, reward'.split(', ')
pd.set_option("display.max_rows", None, "display.max_columns", None)
print(max_rewards)

Maximum Reward Achieved by Each Algorithm:
         nm        lr       eps  train_loss  val_loss  train_acc   val_acc    reward
0  4.666667  0.001000  0.158780    0.389207  0.378369    0.88500  0.880313  0.769081
1  4.877550  0.002414  0.153917    0.368038  0.351870    0.89125  0.884062  0.780358
2  4.379012  0.003984  0.166633    0.394809  0.372726    0.89000  0.895000  0.767682
3  4.666667  0.001000  0.158780    0.381342  0.373430    0.88250  0.885000  0.770777
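The rows follow the order of the `algorithms` list: grid search, Bayesian optimization, evolutionary optimization, reinforcement learning. A minimal sketch of ranking them by reward, with the values copied from the table and the dictionary labels chosen here for readability:

```python
# Best reward per algorithm, in the row order of the table above.
rewards = {
    "grid search": 0.769081,
    "bayesian": 0.780358,
    "evolutionary": 0.767682,
    "reinforcement learning": 0.770777,
}

ranking = sorted(rewards, key=rewards.get, reverse=True)
best = ranking[0]
spread = rewards[best] - rewards[ranking[-1]]  # gap between best and worst

for name in ranking:
    print(f"{name}: {rewards[name]:.6f}")
```

These figures come from a single run of each optimizer, and the gap between the best and worst reward is small, so the ranking should be read with that in mind.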
