### Cloning the repository and changing directory

1. Clone the repository from GitHub.
2. Navigate to the 'generative-models' directory inside the cloned repository.

In [None]:
!git clone https://github.com/morel-g/generative-models.git
%cd generative-models

### Install necessary libraries

To run the provided code successfully, we first install several essential Python libraries and dependencies.

In [None]:
%%capture
!pip install pytorch_lightning
!pip install datasets
!pip install einops
!pip install diffusers
!pip install geoopt
!pip install cartopy
!pip install mlflow

#### Install libraries for RL datasets (optional)

##### Install for Kaggle notebooks

In [None]:
%%capture
import os
if not os.path.exists('.mujoco_setup_complete'):
  # Get the prereqs
  !apt-get -qq update
  !apt-get -qq install -y libosmesa6-dev libgl1-mesa-glx libglfw3 libgl1-mesa-dev libglew-dev patchelf
  # Get Mujoco
  !mkdir -p ~/.mujoco
  !wget -q https://mujoco.org/download/mujoco210-linux-x86_64.tar.gz -O mujoco.tar.gz
  !tar -zxf mujoco.tar.gz -C "$HOME/.mujoco"
  !rm mujoco.tar.gz
  # Add it to the actively loaded path and the bashrc path (these only do so much)
  !echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HOME/.mujoco/mujoco210/bin' >> ~/.bashrc 
  !echo 'export LD_PRELOAD=$LD_PRELOAD:/usr/lib/x86_64-linux-gnu/libGLEW.so' >> ~/.bashrc 
  # THE ANNOYING ONE, FORCE IT INTO LDCONFIG SO WE ACTUALLY GET ACCESS TO IT THIS SESSION
  !echo "/root/.mujoco/mujoco210/bin" > /etc/ld.so.conf.d/mujoco_ld_lib_path.conf
  !ldconfig
  # Install Mujoco-py
  !pip3 install -U 'mujoco-py<2.2,>=2.1'
  # run once
  !touch .mujoco_setup_complete

try:
  if _mujoco_run_once:
    pass
except NameError:
  _mujoco_run_once = False
if not _mujoco_run_once:
  # Add it to the actively loaded path and the bashrc path (these only do so much)
  try:
    os.environ['LD_LIBRARY_PATH']=os.environ['LD_LIBRARY_PATH'] + ':/root/.mujoco/mujoco210/bin'
  except KeyError:
    os.environ['LD_LIBRARY_PATH']='/root/.mujoco/mujoco210/bin'
  try:
    os.environ['LD_PRELOAD']=os.environ['LD_PRELOAD'] + ':/usr/lib/x86_64-linux-gnu/libGLEW.so'
  except KeyError:
    os.environ['LD_PRELOAD']='/usr/lib/x86_64-linux-gnu/libGLEW.so'
  # presetup so we don't see output on first env initialization
  import mujoco_py
  _mujoco_run_once = True
# Sources for this code block:
# https://gist.github.com/BuildingAtom/3119ac9c595324c8001a7454f23bf8c8
# https://www.kaggle.com/code/mmdalix/openai-gym-mujoco-env-setup-and-training-2022/notebook
!pip install git+https://github.com/Farama-Foundation/d4rl@master#egg=d4rl

import os
os.environ.pop('LD_PRELOAD', None)

##### Install for Colab notebooks

In [None]:
%%capture
# Installations primarily needed for MuJoCo
!apt-get install -y \
    libgl1-mesa-dev \
    libgl1-mesa-glx \
    libglew-dev \
    libosmesa6-dev \
    software-properties-common
!apt-get install -y patchelf
%pip install -f https://download.pytorch.org/whl/torch_stable.html \
                free-mujoco-py \
                gym \
                git+https://github.com/rail-berkeley/d4rl.git \
                mediapy

---

### Running the main script with configuration files

**Configuration files**: Execute the `main.py` script with one of the configuration files provided in `config_files/`.
Each file contains the parameters and settings for a run.

**Output location**: After execution, the results are saved in `../outputs/version_i/figures/`, where `i` denotes the version number of the current run.
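
To locate the figures of the most recent run without browsing the folders by hand, something like the following sketch may help. It assumes that run folders are named `version_<integer>` directly under `../outputs` and that the highest number is the latest run; the helper name `latest_figures_dir` is hypothetical and not part of the repository.

```python
import re
from pathlib import Path


def latest_figures_dir(outputs_root="../outputs"):
    """Return the figures directory of the highest-numbered run, or None.

    Assumption: runs live in folders named `version_<integer>`.
    """
    root = Path(outputs_root)
    if not root.is_dir():
        return None
    best_version, best_dir = -1, None
    for run_dir in root.iterdir():
        match = re.fullmatch(r"version_(\d+)", run_dir.name)
        if match and run_dir.is_dir() and int(match.group(1)) > best_version:
            best_version, best_dir = int(match.group(1)), run_dir
    return None if best_dir is None else best_dir / "figures"


# Prints None when no run has been executed yet
print(latest_figures_dir())
```

The version numbers are compared as integers rather than as strings, so `version_10` is ranked after `version_2`.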

**MLflow integration**: MLflow is an open-source platform for managing the machine learning lifecycle. It offers tools for logging parameters, code versions, metrics, and artifacts, which makes it easier to track and compare experiments. MLflow can be used with the present code; by default, the MLflow repository is `../mlrun`. To use MLflow in a Colab environment, see the MLflow section at the end of this notebook.

---

#### Note on progress bar display in notebook/Colab

When running the script in a notebook/Colab environment, you might encounter an issue with the progress bar display. Specifically, the validation progress bar may be displayed on a new line at each update. This behavior is a result of the interaction between the notebook/Colab and the progress bar implementation in PyTorch Lightning.

**Possible solutions**:
1. **Using a different progress bar**: Some libraries offer notebook-specific progress bars that display better in these environments. You could consider integrating one of these if the progress bar issue with PyTorch Lightning remains problematic.
2. **Disabling the progress bar**: If the display issue becomes too distracting, consider disabling the progress bar by adjusting the respective parameter in the configuration file.
3. **Running locally**: If possible, run the script in a local environment (like a terminal) where the progress bar display works as expected.

---



To run a model on a toy dataset, use the configuration files provided in **`config_files/toy/`**.



In [None]:
CONFIG_FILE="config_files/toy/score_toy_config"
!python main.py --config_file $CONFIG_FILE

To run a model on an image dataset, use the configuration files provided in **`config_files/img/`**.

Make sure a GPU is available before launching the run.
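
One quick way to check for a GPU before launching a long run is sketched below. It only looks for an NVIDIA device through the `nvidia-smi` command-line tool, which is an assumption about the runtime; the helper name `nvidia_gpu_visible` is hypothetical.

```python
import shutil
import subprocess


def nvidia_gpu_visible():
    """Return True if `nvidia-smi` is installed and lists at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        # `nvidia-smi -L` prints one line per detected GPU
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU visible:", nvidia_gpu_visible())
```

Once PyTorch is installed, `torch.cuda.is_available()` is the more direct check, since it reports what the training code itself will see.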

In [None]:
CONFIG_FILE="config_files/img/score_fashion_mnist_config.py"
!python main.py --config_file $CONFIG_FILE

To generate samples from a specific checkpoint, use the `main_viz.py` script.

In [None]:
CKPT_PATH = "path/to/some/checkpoint.ckpt"
!python main_viz.py -c $CKPT_PATH -gpu 0

### MLflow integration for Colab environment

MLflow is an open-source platform for managing the machine learning lifecycle. It offers tools for logging parameters, code versions, metrics, and artifacts, which makes it easier to track and compare models. This section shows how to use MLflow in a Colab notebook for experiment tracking.

MLflow's tracking server usually runs locally. In a Colab environment, however, the server runs on a remote machine in the cloud and is not directly accessible from your local machine. `pyngrok` is a Python wrapper for ngrok, a utility that creates a secure tunnel to localhost. This tunnel lets you access the MLflow tracking server running on Colab from your local web browser. First, install `pyngrok`:

In [None]:
!pip install pyngrok --quiet

Then run MLflow through pyngrok's tunnel. You need to create an account at https://dashboard.ngrok.com/ to obtain an authtoken.

In [None]:
import os

# Set the MLflow tracking URI to a local directory
os.environ['MLFLOW_TRACKING_URI'] = 'file:///content/mlruns'
# Start the MLflow tracking server in the background
get_ipython().system_raw("mlflow ui --backend-store-uri /content/mlruns &")


# create remote tunnel using ngrok.com to allow local port access
# borrowed from https://colab.research.google.com/github/alfozan/MLflow-GBRT-demo/blob/master/MLflow-GBRT-demo.ipynb#scrollTo=4h3bKHMYUIG6

from pyngrok import ngrok

# Terminate open tunnels if exist
ngrok.kill()

# Set the authtoken (paste your own token into NGROK_AUTH_TOKEN)
# Get your authtoken from https://dashboard.ngrok.com/auth
NGROK_AUTH_TOKEN = ""
ngrok.set_auth_token(NGROK_AUTH_TOKEN)

# Set up a tunnel to the mlflow ui port 5000
public_url = ngrok.connect(addr="5000")
print("MLflow Tracking UI:", public_url)
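
Because `mlflow ui` is started in the background, it may need a few seconds before it accepts connections, and the public URL can return an error until then. The sketch below polls the local port until something is listening. It uses only the standard library; the helper name `wait_for_port` and the default timeout values are assumptions, not part of MLflow or pyngrok.

```python
import socket
import time


def wait_for_port(port, host="127.0.0.1", timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds.

    Returns False if no connection succeeds before `timeout` seconds elapse.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connection means a server is listening on the port
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, calling `wait_for_port(5000)` before opening the tunnel returns `True` once the MLflow UI is reachable on port 5000, and `False` if it has not come up within the timeout.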