# Tuning Fufi hyperparameters with W&B Sweep 🧹 🐶
The idea is to tune the hyperparameters automatically with a W&B sweep.
To avoid spending an eternity on an exhaustive `grid` search, we start with a `random` search and then refine the most promising region with a `grid` search.

## W&B Setup
🪄 Install the `wandb` library and log in

Start by installing the library and logging in to your free account.

In [47]:
!pip install wandb -qU
# Log in to your W&B account
import wandb
wandb.login()

True

## Installing libraries 📚

In [None]:
# useful for import notebook
!pip install import-ipynb
!pip install gym==0.25.2
#needed from March
!pip install numpy==1.23.5



## Setting things up for the environment 🌍 🪖

In [None]:
import os

#saving current directory just to be sure
content_dir = os.getcwd()

#cloning Fufi repo from git
!git clone https://github.com/Gaianeve/gym-Fufi.git
#installing things
!pip install /content/gym-Fufi

In [None]:
# Enter the environment directory
%cd /content/gym-Fufi
# Actually importing the library for our environment
import gym_Fufi

In [None]:
#get back to content directory so I save everything there
%cd ..
!pwd

## Importing libraries and functions 📚


In [50]:
#libraries
import argparse
import random
import time
from distutils.util import strtobool
import gym
import numpy as np

import torch
import torch.nn as nn
import torch.optim as optim
import torch.optim.lr_scheduler as lr_scheduler
from torch.distributions.categorical import Categorical
from torch.utils.tensorboard import SummaryWriter
from torchsummary import summary

## Load the needed functions from files 📡
Cloning the files directly from git, so I don't have to upload them by hand.

In [51]:
#get files from git
!git clone https://github.com/Gaianeve/FUFONE.git

Cloning into 'FUFONE'...
remote: Enumerating objects: 265, done.
remote: Counting objects: 100% (114/114), done.
remote: Compressing objects: 100% (110/110), done.
remote: Total 265 (delta 66), reused 4 (delta 4), pack-reused 151
Receiving objects: 100% (265/265), 8.37 MiB | 7.50 MiB/s, done.
Resolving deltas: 100% (140/140), done.


In [52]:
!pwd
%cd FUFONE/PPO
from environment import vectorize_env
from agent_class import Agent
from agent_utils import anneal, collect_data, GAE, PPO_train_agent, evaluate_agent

#back to main directory
%cd ..
#import main function from file
from main_function_sweep import main

/content
/content/FUFONE/PPO
/content/FUFONE


In [53]:
#back to content directory
%cd ..

/content
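
An alternative to changing directory before each import is to put the cloned folders on `sys.path`. The following is only a sketch, assuming the repository was cloned to `/content/FUFONE` as the output above shows; `add_import_dirs` is a hypothetical helper, not part of the repo, and it has not been checked against modules that rely on the current working directory.

```python
import sys
from pathlib import Path


def add_import_dirs(*dirs):
    """Prepend existing directories to sys.path so their modules can be imported."""
    added = []
    for d in dirs:
        path = str(Path(d).resolve())
        # Skip folders that do not exist or are already on the path.
        if Path(path).is_dir() and path not in sys.path:
            sys.path.insert(0, path)
            added.append(path)
    return added


# Assumed clone location (taken from the cell output above); missing folders are skipped.
add_import_dirs("/content/FUFONE", "/content/FUFONE/PPO")
```

With the folders on the path, the `from environment import vectorize_env` style imports should work from any working directory.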


## Define the sweep 📑
We start with a random search, so each parameter is given a distribution to sample from:
*   `batch_size` ➡ Quantized log-uniform (`q_log_uniform_values`). Returns `round(X / q) * q`, where `X` is drawn from `log_uniform_values`. In other words, a log-uniform distribution rounded to multiples of `q`.
*   `ent_coef` ➡ Continuous uniform distribution between 0.01, the default value, and 1, the value that works best so far.
*   `num_envs` ➡ Same distribution as `batch_size`.
*   `learning_rate` ➡ Continuous uniform distribution between 1.5e-7 and 1.5e-4.
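
To make the quantized log-uniform rule above concrete, here is a minimal pure-Python sketch of how such a sample could be drawn. It only mimics the formula `round(X / q) * q`; it is not W&B's own sampler, and `sample_q_log_uniform` is a hypothetical helper name.

```python
import math
import random


def sample_q_log_uniform(min_value, max_value, q, rng=random):
    """Draw X log-uniformly in [min_value, max_value], then round it to a multiple of q."""
    # Sampling uniformly in log space and exponentiating makes X log-uniform.
    log_x = rng.uniform(math.log(min_value), math.log(max_value))
    x = math.exp(log_x)
    # Quantize: round(X / q) * q
    return round(x / q) * q


# Same bounds and q as batch_size in the sweep configuration.
rng = random.Random(0)
samples = [sample_q_log_uniform(32, 4096, 2, rng) for _ in range(5)]
print(samples)
```

Because the sampling is uniform in log space, small batch sizes are drawn about as often per octave as large ones, which is usually what one wants for a parameter spanning several orders of magnitude.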




In [60]:
import numpy as np
import random

# Define sweep config
sweep_configuration = {
    "method": "random",
    "name": "sweep_Fufi",
    "metric": {"goal": "maximize", "name": "episodic_return"},
    "parameters": {
       "batch_size": {'distribution': 'q_log_uniform_values','q': 2,'min': 32,'max': 4096},
       "ent_coef": {'distribution': 'uniform', 'min': 0.01,'max': 1},
       "num_envs":  {'distribution': 'q_log_uniform_values','q': 2,'min': 4,'max': 16},
       "learning_rate":  {'distribution': 'uniform', 'min': 1.5e-7,'max': 1.5e-4}
    },
}




In [61]:
#print the result
import pprint
pprint.pprint(sweep_configuration)

{'method': 'random',
 'metric': {'goal': 'maximize', 'name': 'episodic_return'},
 'name': 'sweep_Fufi',
 'parameters': {'batch_size': {'distribution': 'q_log_uniform_values',
                               'max': 4096,
                               'min': 32,
                               'q': 2},
                'ent_coef': {'distribution': 'uniform', 'max': 1, 'min': 0.01},
                'learning_rate': {'distribution': 'uniform',
                                  'max': 0.00015,
                                  'min': 1.5e-07},
                'num_envs': {'distribution': 'q_log_uniform_values',
                             'max': 16,
                             'min': 4,
                             'q': 2}}}
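
Once the random search has narrowed down a promising region, the follow-up grid search mentioned at the top could be configured along these lines. This is a sketch only: every candidate value below is a placeholder chosen for illustration, not a result of the random sweep, and the name `sweep_Fufi_grid` is an assumption.

```python
# Hypothetical follow-up grid sweep: every value list below is a placeholder.
grid_sweep_configuration = {
    "method": "grid",
    "name": "sweep_Fufi_grid",  # assumed name
    "metric": {"goal": "maximize", "name": "episodic_return"},
    "parameters": {
        "batch_size": {"values": [64, 128, 256]},
        "ent_coef": {"values": [0.01, 0.1, 1.0]},
        "num_envs": {"values": [4, 8, 16]},
        "learning_rate": {"values": [1.5e-5, 5e-5, 1.5e-4]},
    },
}

# A grid sweep runs every combination, so count them before launching.
n_runs = 1
for spec in grid_sweep_configuration["parameters"].values():
    n_runs *= len(spec["values"])
print(n_runs)  # 81
```

Counting the combinations first is a cheap way to check that the grid is small enough to finish in reasonable time.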


## Run main with the sweep 🏃 🧹
The `wandb.sweep` function initializes the sweep using the configuration. The `wandb.agent` function runs the sweep, executing the `sweep_main` function for each set of parameters.

📚 **Handling Parameters in Script**: In `sweep_main`, `wandb.init()` initializes a run. The script then overwrites the parsed args with the sweep parameters from `wandb.config` before calling the main function.

📚 **Note**: Added `if __name__ == "__main__":`. This ensures that `main` is called only when the script is executed directly, not when it is imported as a module.
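
The parameter hand-off described above can be sketched without a live W&B run. In this sketch a plain dict stands in for `wandb.config` and an `argparse.Namespace` stands in for the result of `parse_args()`; `apply_sweep_config`, `SWEEP_KEYS`, and all the values are hypothetical.

```python
import argparse

# The four hyperparameters tuned by the sweep.
SWEEP_KEYS = ("learning_rate", "batch_size", "ent_coef", "num_envs")


def apply_sweep_config(args, config, keys=SWEEP_KEYS):
    """Copy the swept hyperparameters from `config` onto `args`, in place."""
    for key in keys:
        setattr(args, key, config[key])
    return args


# Stand-ins for the parse_args() defaults and for wandb.config (values are made up).
args = argparse.Namespace(
    learning_rate=2.5e-4, batch_size=512, ent_coef=0.01, num_envs=4, seed=1
)
fake_config = {"learning_rate": 1e-4, "batch_size": 54, "ent_coef": 0.5, "num_envs": 8}

apply_sweep_config(args, fake_config)
print(args)
```

Arguments that are not part of the sweep, such as `seed` here, keep their parsed defaults.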

In [62]:
if __name__ == "__main__":
    sweep_id = wandb.sweep(sweep=sweep_configuration, project="Fufi_sweep")  # Set up the sweep

    def sweep_main():
        with wandb.init() as run:
            args = parse_args()
            # Update args with sweep parameters
            args.learning_rate = wandb.config.learning_rate
            args.batch_size = wandb.config.batch_size
            args.ent_coef = wandb.config.ent_coef
            args.num_envs = wandb.config.num_envs

            main()

Create sweep with ID: p0y5snia
Sweep URL: https://wandb.ai/cartpole_maria_gaia/Fufi_sweep/sweeps/p0y5snia


In [None]:
wandb.agent(sweep_id, function=sweep_main, count=30)

[34m[1mwandb[0m: Agent Starting Run: awuz427q with config:
[34m[1mwandb[0m: 	batch_size: 54
[34m[1mwandb[0m: 	ent_coef: 0.8569368546754683
[34m[1mwandb[0m: 	learning_rate: 0.00011016910379135655
[34m[1mwandb[0m: 	num_envs: 8




