# Tuning Fufi hyperparameters with W&B Sweep 🧹 🐶
The idea is to tune the hyperparameters automatically with a W&B sweep, ending with a `grid` search.
To avoid spending an eternity on it, we start with a `random` search and then follow up with a `grid` search over the promising region.

## W&B Setup
🪄 Install the `wandb` library and log in

Start by installing the library and logging in to your free account.

In [1]:
!pip install wandb -qU
# Log in to your W&B account
import wandb
wandb.login()

wandb: Logging into wandb.ai. (Learn how to deploy a W&B server locally: https://wandb.me/wandb-server)
wandb: You can find your API key in your browser here: https://wandb.ai/authorize
wandb: Paste an API key from your profile and hit enter, or press ctrl+c to quit:

 ··········

wandb: Appending key for api.wandb.ai to your netrc file: /root/.netrc


True

## Installing libraries 📚

In [1]:
# lets us import other notebooks as modules
!pip install import-ipynb
!pip install gym==0.25.2
# pinned NumPy version (needed since March)
!pip install numpy==1.23.5



## Setting things up for the environment 🌍 🪖

In [2]:
import os

#saving current directory just to be sure
content_dir = os.getcwd()

#cloning Fufi repo from git
!git clone https://github.com/Gaianeve/gym-Fufi.git
#installing things
!pip install /content/gym-Fufi

Cloning into 'gym-Fufi'...
remote: Enumerating objects: 194, done.
remote: Counting objects: 100% (87/87), done.
remote: Compressing objects: 100% (87/87), done.
remote: Total 194 (delta 36), reused 0 (delta 0), pack-reused 107
Receiving objects: 100% (194/194), 70.91 KiB | 1.39 MiB/s, done.
Resolving deltas: 100% (59/59), done.
Processing ./gym-Fufi
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> See above for output.
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  Preparing metadata (setup.py) ... error
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.


In [3]:
# Enter the environment directory
%cd /content/gym-Fufi
# Actually importing the library for our environment
import gym_Fufi

/content/gym-Fufi


In [4]:
#get back to content directory so I save everything there
%cd ..
!pwd

/content
/content




## Importing libraries and functions 📚


In [5]:
#libraries
import argparse
import random
import time
from distutils.util import strtobool
import gym
import numpy as np

import torch
import torch.nn as nn
import torch.optim as optim
import torch.optim.lr_scheduler as lr_scheduler
from torch.distributions.categorical import Categorical
from torch.utils.tensorboard import SummaryWriter
from torchsummary import summary

## Load the needed functions from files 📡
The files are cloned directly from git, so I don't have to upload them by hand.

In [6]:
#get files from git
!git clone https://github.com/Gaianeve/FUFONE.git



Cloning into 'FUFONE'...
remote: Enumerating objects: 268, done.
remote: Counting objects: 100% (117/117), done.
remote: Compressing objects: 100% (113/113), done.
remote: Total 268 (delta 68), reused 4 (delta 4), pack-reused 151
Receiving objects: 100% (268/268), 8.37 MiB | 7.22 MiB/s, done.
Resolving deltas: 100% (142/142), done.


In [14]:
!pwd
%cd FUFONE/PPO
from environment import vectorize_env
from agent_class import Agent
from agent_utils import anneal, collect_data, GAE, PPO_train_agent, evaluate_agent

#back to main directory
%cd ..
#import main function from file
from main_function_sweep import main, parse_args

/content
/content/FUFONE/PPO
/content/FUFONE




In [16]:
#back to content directory
%cd ..

/


## Define the sweep 📑
We start with a random search, so each parameter is given a distribution to sample from:
*   `batch_size` ➡ Quantized log uniform (`q_log_uniform_values`). Returns `round(X / q) * q`, where `X` is drawn from `log_uniform_values`. In practice, a log-uniform distribution rounded to multiples of `q`.
*   `ent_coef` ➡ Continuous uniform distribution (`uniform`) between 0.01, the default value, and 1, the value that works best so far.
*   `num_envs` ➡ Same as `batch_size`.
*   `learning_rate` ➡ Continuous uniform distribution (`uniform`) between 1.5e-7 and 1.5e-4.
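
To make the `round(X / q) * q` formula concrete, here is a minimal pure-Python sketch of a quantized log-uniform draw. It only mirrors the formula quoted above and is not W&B's actual sampler; the helper name `sample_q_log_uniform` and the fixed seed are assumptions made for illustration.

```python
import math
import random


def sample_q_log_uniform(low, high, q, rng):
    """Draw round(X / q) * q, where X is log-uniform on [low, high]."""
    # Log-uniform: sample uniformly in log space, then exponentiate.
    x = math.exp(rng.uniform(math.log(low), math.log(high)))
    return round(x / q) * q


rng = random.Random(0)  # fixed seed, only so this sketch is reproducible
batch_sizes = [sample_q_log_uniform(32, 4096, 2, rng) for _ in range(1000)]
num_envs = [sample_q_log_uniform(4, 16, 2, rng) for _ in range(1000)]

# Every draw is a multiple of q and stays inside the requested range.
print(min(batch_sizes), max(batch_sizes))
print(sorted(set(num_envs)))
```

With `q = 2`, every sampled `batch_size` and `num_envs` is an even integer, which matches the values the agent reports when it starts a run.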




In [9]:
import numpy as np
import random

# Define sweep config
sweep_configuration = {
    "method": "random",
    "name": "sweep_Fufi",
    "metric": {"goal": "maximize", "name": "episodic_return"},
    "parameters": {
       "batch_size": {'distribution': 'q_log_uniform_values','q': 2,'min': 32,'max': 4096},
       "ent_coef": {'distribution': 'uniform', 'min': 0.01,'max': 1},
       "num_envs":  {'distribution': 'q_log_uniform_values','q': 2,'min': 4,'max': 16},
       "learning_rate":  {'distribution': 'uniform', 'min': 1.5e-7,'max': 1.5e-4}
    },
}
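
One thing worth keeping in mind about the `uniform` choice for `learning_rate`: the range spans three orders of magnitude, so a plain uniform draw rarely lands in the lower decades. The sketch below just works out the arithmetic for the range in the config; the cut-off `CUT` is a hypothetical threshold picked for illustration, and the comparison with a log-uniform draw is an aside, not a change to the sweep above.

```python
import math

LOW, HIGH = 1.5e-7, 1.5e-4  # learning_rate range from the sweep config
CUT = 1.5e-5                # hypothetical threshold: one decade below HIGH

# Probability that a single draw lands below CUT under each distribution.
p_uniform = (CUT - LOW) / (HIGH - LOW)
p_log_uniform = (math.log(CUT) - math.log(LOW)) / (math.log(HIGH) - math.log(LOW))

print(f"uniform:     {p_uniform:.3f}")      # uniform:     0.099
print(f"log-uniform: {p_log_uniform:.3f}")  # log-uniform: 0.667
```

So roughly nine out of ten uniform draws fall in the top decade of the range. If the small learning rates matter, a log-scaled distribution such as `log_uniform_values` may be worth trying.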


In [10]:
#print the result
import pprint
pprint.pprint(sweep_configuration)

{'method': 'random',
 'metric': {'goal': 'maximize', 'name': 'episodic_return'},
 'name': 'sweep_Fufi',
 'parameters': {'batch_size': {'distribution': 'q_log_uniform_values',
                               'max': 4096,
                               'min': 32,
                               'q': 2},
                'ent_coef': {'distribution': 'uniform', 'max': 1, 'min': 0.01},
                'learning_rate': {'distribution': 'uniform',
                                  'max': 0.00015,
                                  'min': 1.5e-07},
                'num_envs': {'distribution': 'q_log_uniform_values',
                             'max': 16,
                             'min': 4,
                             'q': 2}}}
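
For the later `grid` stage, the configuration could look roughly like the sketch below. A grid sweep needs an explicit list of `values` for each parameter instead of a continuous distribution. The sweep name and every number here are placeholders, not results from the random search.

```python
import math

# Hypothetical follow-up: a small grid around values that looked good in the
# random search. Every number below is a placeholder, not a result.
grid_sweep_configuration = {
    "method": "grid",
    "name": "sweep_Fufi_grid",  # hypothetical name
    "metric": {"goal": "maximize", "name": "episodic_return"},
    "parameters": {
        "batch_size": {"values": [256, 512, 1024]},
        "ent_coef": {"values": [0.5, 0.7, 1.0]},
        "num_envs": {"values": [4, 8]},
        "learning_rate": {"values": [5e-5, 1e-4]},
    },
}

# A grid sweep visits every combination, so the number of runs is the product
# of the list lengths.
n_runs = math.prod(
    len(p["values"]) for p in grid_sweep_configuration["parameters"].values()
)
print(n_runs)  # 3 * 3 * 2 * 2 = 36
```

Counting the combinations up front is a cheap way to check that the grid stays affordable before launching it.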


## Run main with the sweep 🏃 🧹
The `wandb.sweep` function initializes the sweep using the configuration. The `wandb.agent` function runs the sweep, executing the `sweep_main` function for each set of parameters.

📚 **Handling Parameters in Script**: In `sweep_main`, `wandb.init()` initializes a run. The script updates `args` with the parameters sampled by the sweep (`wandb.config`) and then calls the main function.

📚 **Note**: The `if __name__ == "__main__":` guard was added to ensure that main is called only when the script is executed directly, not when it is imported as a module.
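
The override step can be sketched in isolation, without a live W&B run. In the sketch below a plain dict stands in for `wandb.config`, and the helper `apply_sweep_overrides`, the `SWEPT_KEYS` tuple, and all default values except `ent_coef=0.01` are assumptions made for illustration; the real defaults come from `parse_args()`. Whether `main` then needs `args` passed in explicitly depends on its signature in `main_function_sweep`, which is not shown here.

```python
from argparse import Namespace

SWEPT_KEYS = ("learning_rate", "batch_size", "ent_coef", "num_envs")


def apply_sweep_overrides(args, config, keys=SWEPT_KEYS):
    """Copy swept values from `config` onto `args`, leaving other fields alone."""
    for key in keys:
        if key in config:
            setattr(args, key, config[key])
    return args


# Stand-ins: a plain dict plays the role of wandb.config, and the defaults
# below are placeholders rather than the real output of parse_args().
defaults = Namespace(learning_rate=2.5e-4, batch_size=512, ent_coef=0.01,
                     num_envs=4, seed=1)
sampled = {"learning_rate": 4.87e-5, "batch_size": 898, "ent_coef": 0.72,
           "num_envs": 6}

args = apply_sweep_overrides(defaults, sampled)
print(args)
```

Looping over a fixed tuple of keys keeps the list of swept parameters in one place, so adding a parameter to the sweep only needs one extra entry.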

In [22]:
import wandb
if __name__ == "__main__":
    sweep_id = wandb.sweep(sweep=sweep_configuration, project="Fufi_sweep")  # Set up the sweep

    def sweep_main():
        with wandb.init() as run:
            args = parse_args()
            # Update args with sweep parameters
            args.learning_rate = wandb.config.learning_rate
            args.batch_size = wandb.config.batch_size
            args.ent_coef = wandb.config.ent_coef
            args.num_envs = wandb.config.num_envs

            main()

Create sweep with ID: w6znkcdg
Sweep URL: https://wandb.ai/cartpole_maria_gaia/Fufi_sweep/sweeps/w6znkcdg


In [None]:
wandb.agent(sweep_id, function=sweep_main, count=10)

wandb: Agent Starting Run: 4rrkc1ho with config:
wandb: 	batch_size: 898
wandb: 	ent_coef: 0.7220375134685579
wandb: 	learning_rate: 4.867026938330826e-05
wandb: 	num_envs: 6





| Metric | History |
| --- | --- |
| charts/SPS | ▁▃▄▅▅▆▆▇▆▆▇▇▇▇▇▇▇▇▇▇▇▇█▇████████████████ |
| charts/episodic_length | ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▂▂▃▂▂▂▁▄▂▃▃▆▄▄▃▃█▃▃▄▄▂ |
| charts/episodic_return | ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▂▂▃▂▂▂▁▄▂▃▃▆▄▄▃▃█▃▃▄▄▂ |
| charts/learning_rate | ███▇▇▇▇▇▇▆▆▆▆▆▆▅▅▅▅▅▄▄▄▄▄▄▃▃▃▃▃▂▂▂▂▂▂▁▁▁ |
| episodic_return | ▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▂▃▃▃▂▄▁▅▃▃▃█▂▅▃▄▃▆▃▃▃▂▃ |
| global_step | ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▂▂▂▂▂▂▃▃▃▄▄▄▄▅▅▅▆▆▆▆▇▇▇██ |
| losses/approx_kl | ▄██▆▅▇▅▃▄▅▄▃▃▂▃▃▃▂▃▂▂▁▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ |
| losses/clipfrac | ▁▆▇▅▄█▅▂▂▃▃▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ |
| losses/entropy | █▆▅▃▂▁▂▂▁▁▂▂▂▃▃▃▃▃▂▃▄▄▃▄▄▄▄▃▄▄▃▄▃▃▄▄▄▄▄▄ |
| losses/explained_variance | ▁▅█▅▂▆▄▃▂▆▄▆▃▃▄▃▃▃▃▂▂▃▄▃▃▁▃▅▂▃▂▃▃▃▃▂▃▃▃▃ |

| Metric | Final value |
| --- | --- |
| charts/SPS | 1563.0 |
| charts/episodic_length | 201.0 |
| charts/episodic_return | 201.0 |
| charts/learning_rate | 0.0 |
| episodic_return | 201.0 |
| global_step | 499712.0 |
| losses/approx_kl | 0.0 |
| losses/clipfrac | 0.0 |
| losses/entropy | 5.20797 |
| losses/explained_variance | 3e-05 |


wandb: Agent Starting Run: 7bsdsjpe with config:
wandb: 	batch_size: 340
wandb: 	ent_coef: 0.5511765739819863
wandb: 	learning_rate: 0.00013562004536524507
wandb: 	num_envs: 10



