# Homework 1: Intro to Deep RL with Single Agent Training Environments

The goal of this assignment is to gain hands-on experience with the key components of Reinforcement Learning (RL) environments. 

For more details, please check out [HW1.md](../HW1.md).

## Setup

You will need to make a copy of this notebook in your Google Drive before you can edit the homework files. You can do so with **File &rarr; Save a copy in Drive**.

In [None]:
#@title Mount Your Google Drive
#@markdown Your work will be stored in a Google Drive folder called `rl_class` by default, so that a Colab instance timeout does not delete your edits.

import os
from google.colab import drive
drive.mount('/content/gdrive')

In [None]:
#@title Setup Mount Symlink

# Raw string: the backslash escapes the space for the shell commands below
DRIVE_PATH = r'/content/gdrive/My\ Drive/rl_class'
DRIVE_PYTHON_PATH = DRIVE_PATH.replace('\\', '')
if not os.path.exists(DRIVE_PYTHON_PATH):
  %mkdir $DRIVE_PATH

## the space in `My Drive` causes some issues,
## make a symlink to avoid this
SYM_PATH = '/content/rl_class'
if not os.path.exists(SYM_PATH):
  !ln -s $DRIVE_PATH $SYM_PATH

In [None]:
#@title Apt Install Requirements

#@markdown Run each section with Shift+Enter

#@markdown Double-click on section headers to show code.

!apt update
!apt install -y --no-install-recommends \
        build-essential \
        curl \
        git \
        gnupg2 \
        make \
        cmake \
        ffmpeg \
        swig \
        libz-dev \
        unzip \
        zlib1g-dev \
        libglfw3 \
        libglfw3-dev \
        libxrandr2 \
        libxinerama-dev \
        libxi6 \
        libxcursor-dev \
        libgl1-mesa-dev \
        libgl1-mesa-glx \
        libglew-dev \
        libosmesa6-dev \
        lsb-release \
        ack-grep \
        patchelf \
        wget \
        xpra \
        xserver-xorg-dev \
        xvfb


In [None]:
#@title Clone Homework Repo

%cd $SYM_PATH
!git clone https://<TOKEN>@<Your Personalized GitHub Classroom Assignment link>
# Use a GitHub Personal Access Token as the password. The token is equivalent to a password, so do not expose it or commit it!
%cd <Your Personalized GitHub Classroom Assignment repo>
%pip install swig
%pip install -e .
%pip install -r requirements_colab.txt


In [None]:
# # In case we need to install the dependencies manually
# !pip install  tensorboard  torch swig gymnasium[box2d] ray[rllib] scikit-image pygame numba PyYAML GitPython tensorflow_probability

## Editing Code

To edit code, click the folder icon in the left menu, navigate to the file you want (`multigrid/...`), and double-click it to open an editor. An active Colab instance times out after about 12 hours (sooner if you close your browser window). Your edits are synced to Google Drive, so you won't lose work when an instance times out, but you will need to re-mount your Google Drive and re-install packages on every new instance.
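
On a fresh instance it can help to check which setup steps still need to be re-run. The following is a minimal sketch, not part of the homework code: `workspace_status` is a hypothetical helper, and the default paths assume the mount point and symlink used in the Setup cells.

```python
import os


def workspace_status(mount_point="/content/gdrive", sym_path="/content/rl_class"):
    """Report whether the Drive mount and the workspace symlink are in place.

    Both default paths are assumptions taken from the Setup cells.
    """
    return {
        # The mount point only exists as a directory once Drive is mounted
        "drive_mounted": os.path.isdir(mount_point),
        # The symlink must exist and its target must be reachable
        "symlink_ready": os.path.islink(sym_path) and os.path.exists(sym_path),
    }


print(workspace_status())
```

If either value is `False`, re-run the corresponding Setup cell before editing or training.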

In [None]:
#@title Imports
from __future__ import annotations

import json
import pathlib
from pathlib import Path
import os
import subprocess
import numpy as np
from dataclasses import dataclass, asdict, field
from types import SimpleNamespace
import git
from IPython.display import Image

import ray
from multigrid.envs import *
from multigrid.utils.training_utilis import algorithm_config, get_checkpoint_dir
from multigrid.scripts.train import configure_algorithm, train
from multigrid.scripts.visualize import main_evaluation


In [None]:
#@title Fix Variables

# Set the working directory to the repo root
REPO_ROOT = subprocess.check_output(['git', 'rev-parse', '--show-toplevel']).strip().decode('utf-8')
os.chdir(REPO_ROOT)

SUBMISSION_CONFIG_FILE = sorted(
    Path("submission").expanduser().glob("**/submission_config.json"), key=os.path.getmtime
)[-1]

with open(SUBMISSION_CONFIG_FILE, "r") as file:
    submission_config_data = file.read()

submission_config = json.loads(submission_config_data)

SUBMITTER_NAME = submission_config["name"]

CURRENT_DIR = os.getcwd()
# `head.commit` is a Commit object; `hexsha` is its hash as a string
GIT_COMMIT_HASH = git.Repo(REPO_ROOT).head.commit.hexsha
TAGS = {"user_name": SUBMITTER_NAME, "git_commit_hash": GIT_COMMIT_HASH}
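
The lookup above picks the most recently modified `submission_config.json` and raises an `IndexError` if none is found. A minimal sketch of the same lookup with an explicit error message follows; `latest_config` is a hypothetical helper (not part of the `multigrid` package), and the demo runs on a throwaway directory rather than on `submission/`.

```python
import os
import tempfile
from pathlib import Path


def latest_config(root, name="submission_config.json"):
    """Return the most recently modified file called `name` under `root`."""
    candidates = list(Path(root).glob(f"**/{name}"))
    if not candidates:
        raise FileNotFoundError(f"no {name} found under {root}")
    return max(candidates, key=os.path.getmtime)


# Demo: two config files with explicitly set modification times
with tempfile.TemporaryDirectory() as tmp:
    older = Path(tmp, "a", "submission_config.json")
    newer = Path(tmp, "b", "submission_config.json")
    for offset, path in enumerate((older, newer)):
        path.parent.mkdir(parents=True)
        path.write_text("{}")
        os.utime(path, (1_000 + offset, 1_000 + offset))
    picked_parent = latest_config(tmp).parent.name

print(picked_parent)  # the folder holding the newer file
```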


## For Agent Training



In [None]:
#@title Training Arguments
@dataclass
class Args:

  #@markdown agent config
  num_agents: int = 1 #@param {type: "integer"}
  algo: str = "PPO"  #@param {type: "string"}
  framework: str = "torch" #@param ['torch', 'tf2']
  lstm: bool = False #@param {type:"boolean"}
  training_scheme: str = "DTDE"  #@param ['CTCE', 'DTDE', 'CTDE']

  #@markdown environment config
  env: str = "MultiGrid-CompetativeRedBlueDoor-v3-DTDE-Red-Single"  #@param {type: "string"}

  #@markdown training config
  num_workers: int = 10  #@param {type: "integer"}
  num_gpus: int = 0 #@param {type: "integer"}
  lr: float = 0.001  #@param {type: "float"}
  # NOTE Please only keep the checkpoints that you want to submit
  save_dir: str = "submission/ray_results/" #@param {type: "string"}
  load_dir: str = None #@param {type:"string", default:None}
  user_name: str = "<Your Name>" #@param {type: "string"}
  experiment_name: str = "my_experiment" #@param {type: "string"}
  mlflow_tracking_uri: str = "submission/mlflow/" #@param {type: "string"}
  checkpoint_freq: int = 20 #@param {type: "integer"}
  num_timesteps: float = 1e6 #@param {type: "number"}
  seed: int = 1 #@param {type: "integer"}
  local_mode: bool = False  #@param {type:"boolean"}

  def to_namespace(self):
    return SimpleNamespace(**asdict(self))




In [None]:
#@title Set up Training Arguments
args = Args().to_namespace()
print(args) # Prints the values of all attributes

config = configure_algorithm(args)
stop_conditions = {"timesteps_total": args.num_timesteps}
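
The argument pattern used above (a dataclass of defaults converted to a `SimpleNamespace`) can be tried in isolation. This is a minimal sketch: `DemoArgs` is a hypothetical stand-in with only three of the fields, and it does not call `configure_algorithm`.

```python
from dataclasses import dataclass, asdict
from types import SimpleNamespace


@dataclass
class DemoArgs:  # hypothetical, trimmed-down stand-in for Args
    algo: str = "PPO"
    lr: float = 0.001
    num_timesteps: float = 1e6

    def to_namespace(self):
        # asdict() gives a plain dict; SimpleNamespace exposes it as attributes
        return SimpleNamespace(**asdict(self))


# Override one default at construction time, keep the rest
demo = DemoArgs(lr=3e-4).to_namespace()
stop = {"timesteps_total": demo.num_timesteps}
print(demo.algo, demo.lr, stop)
```

Passing keyword arguments to the dataclass constructor is one way to change a setting from code instead of editing the form fields.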


## Initialize and Show TensorBoard Before Training

Filter tags for key performance metrics:

`episode_len_mean|ray/tune/episode_reward_mean|episode_reward_min|entropy|vf|loss|kl|cpu|ram`
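
To preview which metrics such a filter would keep, the expression can be tested with Python's `re` module. This is a sketch under two assumptions: the tag names in `sample_tags` are hypothetical examples, and the filter is treated as an unanchored regular-expression search.

```python
import re

FILTER = (
    "episode_len_mean|ray/tune/episode_reward_mean|episode_reward_min"
    "|entropy|vf|loss|kl|cpu|ram"
)
pattern = re.compile(FILTER)

# Hypothetical tag names, for illustration only
sample_tags = [
    "ray/tune/episode_reward_mean",
    "ray/tune/episode_len_mean",
    "ray/tune/info/learner/default_policy/learner_stats/entropy",
    "ray/tune/info/learner/default_policy/learner_stats/vf_loss",
    "ray/tune/perf/cpu_util_percent",
    "ray/tune/timers/sample_time_ms",
]

# A tag is kept if any alternative matches anywhere inside it
shown = [tag for tag in sample_tags if pattern.search(tag)]
hidden = [tag for tag in sample_tags if not pattern.search(tag)]
print(len(shown), "shown;", len(hidden), "hidden")
```

Because each alternative matches as a substring, short terms such as `ram` or `vf` can also match longer tag names that merely contain them.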

In [None]:
# Load the TensorBoard notebook extension
%load_ext tensorboard

In [None]:
# Start TensorBoard and point `--logdir` at `save_dir`, i.e. `/content/gdrive/MyDrive/rl_class/week-1-intro-to-deep-rl-and-agent-training-environment/submission/ray_results/`
%tensorboard --logdir /content/rl_class/<Your Personalized GitHub Classroom Assignment repo>/submission/ray_results

## Execute training

In [None]:
#@title Execute Training
train(
    algo=args.algo,
    config=config,
    stop_conditions=stop_conditions,
    save_dir=args.save_dir,
    load_dir=args.load_dir,
    local_mode=args.local_mode,
    experiment_name=args.experiment_name,
    training_scheme=args.training_scheme,
)

# NOTE: Clear your training outputs before you submit your notebook to reduce file size and improve readability

In [None]:
# NOTE: Manually shut down Ray if needed
ray.shutdown()

## Submission for Task 3 - Monitor and Track Agent Training with Tensorboard and Save Out Visualization from Evaluation

1. Take screenshots of your TensorBoard plots that highlight your performance metrics.
2. Embed your images here in Colab.
3. Save only the best checkpoint and video in the /submission folder, and push them to your repo.


In [None]:
from google.colab import files
from IPython.display import Image

uploaded = files.upload()

# Assuming a single image file is uploaded
for fn in uploaded.keys():
  print('User uploaded file "{name}" with length {length} bytes'.format(name=fn, length=len(uploaded[fn])))
  display(Image(fn))

In [None]:
#@title Your Tensorboard Screenshots Go Here
...

## Agent Evaluation

#### *Before running the evaluation, we recommend restarting the kernel to avoid environment issues that linger from training.*

In [None]:

#@title Evaluation Arguments

@dataclass
class EvalArgs:

  #@markdown agent config
  algo: str = "PPO"  #@param {type: "string"}
  framework: str = "torch" #@param ['torch', 'tf2']
  lstm: bool = False #@param {type: "boolean"}

  #@markdown environment config
  env: str = "MultiGrid-CompetativeRedBlueDoor-v3-DTDE-Red-Single"  #@param {type: "string"}
  env_config: dict = field(default_factory=dict)  # default_factory creates a new dict for each instance; built-in `dict` avoids an undefined `Dict`

  #@markdown Evaluation config
  num_episodes: int = 10 #@param {type: "integer"}
  load_dir: str = "submission/ray_results/MultiGrid-CompetativeRedBlueDoor-v3-DTDE-Red-Single/" #@param {type: "string"}
  save_dir: str = "submission/evaluation_reports/" #@param {type: "string"}
  gif: str = "DTDE-Red-Single-testing" #@param {type: "string"}
  render_mode: str = "human" #@param {type: "string"}

  def to_namespace(self):
    return SimpleNamespace(**asdict(self))

In [None]:
#@title Set up Evaluation Arguments

eval_args = EvalArgs().to_namespace()
print(eval_args)  # Prints the values of all attributes

In [None]:
#@title Execute Evaluation
exported_gif_filename = main_evaluation(args=eval_args)

In [None]:
#@title Visualize and Display Evaluated Agent Behaviors

# Load the GIF
display(Image(filename=exported_gif_filename))

# Recommendations for Homework Submission
First, back up and download your notebook from Google Colab.

In [None]:
#@title Make sure you are already in YOUR_REPO_FOLDER, i.e. /content/rl_class/multigrid
!pwd

In [None]:
#@title Configure Git
!git config --global user.email "your-email@example.com"
!git config --global user.name "Your Name"


In [None]:
#@title Move the Notebook to the Repo Folder (Optional)
!mv "your-homework1.ipynb" "/content/YOUR_REPO_FOLDER/notebooks"


In [None]:
#@title Add, Commit and Push Changes to GitHub Classroom

!git add .
!git commit -m "Your commit message"
!git push origin main
