<a href="https://colab.research.google.com/github/Strojove-uceni/2024-final-letadylka-prochazka-belohlavek/blob/main/demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Demo file
Authors: Michal Belohlavek, Tomas Prochazka
## Intro
Welcome to the demo notebook, where you can run the project with minimal effort and see the results and visualisations for yourself. While this is an easy and pleasant way to try out the neural network, we strongly encourage everyone to run the project as it was intended, expand upon it, and improve it. If you are looking for a version of the code commented in minute detail, please take a look at our implementation in the GitHub repo.
## Abstract
This project was created and submitted as the final semester project for the Machine Learning 2 class at FNSPE CTU. We build on the structure and code from the authors of Multi-Agent Reinforcement Learning in Graphs (reference in README). We modified, reshaped, and extended the original implementation to make it applicable to the problem of free routing and plane path navigation. The core aim of this project is to provide a fast neural network that navigates multiple planes along a graph with dynamically changing weights, with the goal of reaching the target as quickly as possible while avoiding collisions. In this demo, we demonstrate our work, provide an overview of the techniques used, and give advice on how to select well-tuned hyperparameters.

*For additional details, see our presentation on GitHub.*
## Short overview of the ML techniques used

## How to run the project
### Selecting parameters
### Advice for parameter selection

## Our results


Download files from our Git repository




In [None]:
# Download the source files, data, and config from GitHub
!wget -q https://raw.githubusercontent.com/Strojove-uceni/2024-final-letadylka-prochazka-belohlavek/main/src/environment.py

!wget -q https://raw.githubusercontent.com/Strojove-uceni/2024-final-letadylka-prochazka-belohlavek/main/src/eval.py

!wget -q https://raw.githubusercontent.com/Strojove-uceni/2024-final-letadylka-prochazka-belohlavek/main/src/network.py

!wget -q https://raw.githubusercontent.com/Strojove-uceni/2024-final-letadylka-prochazka-belohlavek/main/src/routing.py

!wget -q https://raw.githubusercontent.com/Strojove-uceni/2024-final-letadylka-prochazka-belohlavek/main/src/policy.py

!wget -q https://raw.githubusercontent.com/Strojove-uceni/2024-final-letadylka-prochazka-belohlavek/main/src/model.py

!wget -q https://raw.githubusercontent.com/Strojove-uceni/2024-final-letadylka-prochazka-belohlavek/main/src/wrapper.py

!wget -q https://raw.githubusercontent.com/Strojove-uceni/2024-final-letadylka-prochazka-belohlavek/main/data/adj_mat_fixed.npy

!wget -q https://raw.githubusercontent.com/Strojove-uceni/2024-final-letadylka-prochazka-belohlavek/main/data/dist_mat_fixed.npy

!wget -q https://raw.githubusercontent.com/Strojove-uceni/2024-final-letadylka-prochazka-belohlavek/main/data/sparse_points_fixed.json

!wget -q https://raw.githubusercontent.com/Strojove-uceni/2024-final-letadylka-prochazka-belohlavek/main/src/config.yaml

!pip install pyyaml
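
Because `wget -q` is silent, a wrong URL (for example a GitHub page URL instead of a raw file URL) can leave an HTML page saved under the expected file name, and the error only surfaces later in `np.load` or `yaml.safe_load`. The following is a minimal sketch of a sanity check, not part of the project; the helper names `looks_like_npy` and `looks_like_html` are hypothetical. It relies on the fact that every `.npy` file starts with the six bytes `\x93NUMPY`.

```python
from pathlib import Path

NPY_MAGIC = b"\x93NUMPY"  # every .npy file starts with these six bytes


def looks_like_npy(path):
    """Return True if the file starts with the NumPy .npy magic string."""
    with open(path, "rb") as fh:
        return fh.read(len(NPY_MAGIC)) == NPY_MAGIC


def looks_like_html(path):
    """Return True if the file appears to be an HTML page."""
    with open(path, "rb") as fh:
        head = fh.read(512).lstrip().lower()
    return head.startswith((b"<!doctype html", b"<html"))


# Check only the files that are actually present in the working directory.
for name in ("adj_mat_fixed.npy", "dist_mat_fixed.npy"):
    if Path(name).exists() and not looks_like_npy(name):
        print(f"{name} is not a valid .npy file; re-check the download URL")
for name in ("sparse_points_fixed.json", "config.yaml"):
    if Path(name).exists() and looks_like_html(name):
        print(f"{name} looks like an HTML page; re-check the download URL")
```

If nothing is printed, the files that exist passed the check; missing files are skipped rather than reported.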

Import the files

In [None]:
import numpy as np
import json

adj_mat = np.load('adj_mat_fixed.npy')
dist_mat = np.load('dist_mat_fixed.npy')
with open('sparse_points_fixed.json', 'r') as file:
    sparse_points = json.load(file)

import yaml

with open('config.yaml', 'r') as file:
    config = yaml.safe_load(file)

main.py adjusted for evaluation only

In [None]:
from network import Network
from environment import reset_and_get_sizes

import numpy as np
import torch
import torch.nn.functional as F
import torch.optim as optim
import copy
import traceback
import json
import networkx as nx
import matplotlib.pyplot as plt
from collections import defaultdict
from tqdm import tqdm

from model import DGN, MLP, CommNet, NetMon, DQN
from routing import Routing

from wrapper import NetMonWrapper
from policy import ShortestPath, EpsilonGreedy
from pathlib import Path
from torch.utils.tensorboard.writer import SummaryWriter
from util import (
    dim_str_to_list,
    filter_dict,
    get_state_dict,
    interpolate_model,
    load_state_dict,
    set_attributes,
    set_seed,
)
from eval import evaluate
import sys

In [None]:
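# Normalize the distance matrix: rescale edge lengths to [new_min, new_max]
dist_mat[adj_mat == 0] = 0              # zero out entries without an edge
d_min = dist_mat[dist_mat > 0].min()    # shortest existing edge (avoids shadowing the built-in min)
d_max = dist_mat.max()                  # longest edge (avoids shadowing the built-in max)

new_min = 1
new_max = 10
dist_mat = ((dist_mat - d_min) / (d_max - d_min)) * (new_max - new_min) + new_min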

config['only_eval']['eval'] = True # overwrite any setting to evaluate only
if config['only_eval']['eval']:
    assert Path(config['only_eval']['model_path']).exists()
    loaded_dict = torch.load(config['only_eval']['model_path'], map_location='cpu')
    loaded_model_arg_values = loaded_dict["args"]
    loaded_model_arg_values['only_eval'] = {}
    loaded_model_arg_values['only_eval']['eval'] = True
    loaded_model_arg_values['only_eval']['model_path'] = config['only_eval']['model_path']
    config = loaded_model_arg_values
    config['evaluation']['episodes'] = 10
    config['evaluation']['episode_steps'] = 100
    config['training']['mini_batch_size'] = 8
    config['training']['sequence_length'] = 5
    config['netmon']['iterations'] = 3
    config['device'] = 'cpu'
    for key in config:
        if key == "device":
            print(f"{key}:  {config[key]}")
            continue
        print(key)
        for subkey in config[key]:
            print(f"\t{subkey}: {config[key][subkey]}")

cbase = config['base']
cnetmon = config['netmon']
device = config['device']
ctar_update = config['target_update']
ceval = config['evaluation']
ctraining = config['training']
ceps = config['epsilon_greedy']

# Define network environment
network = Network(adj_mat, dist_mat, sparse_points)


# Define type of environment
env = Routing(network, cbase['n_planes'], cbase['env_var'], adj_mat, dist_mat, k=cbase['n_neighbors'], enable_action_mask=False)

# Define activation function
activation_function = getattr(F, cbase['activ_f'])

# Dynamically resets the environment
n_agents, agent_obs_size, n_nodes, node_obs_size = reset_and_get_sizes(env)

print("Sizes before netmon:")
print("Agent observation size: ", agent_obs_size)
print("Node observation size: ", node_obs_size)

# Use NetMon - init is rather long :)
netmon = NetMon(
    node_obs_size,       # 'in_features' in init
    cnetmon['dim'],      # 'hidden_features' in init
    cnetmon['enc_dim'],  # 'encoder_units' in init
    iterations=cnetmon['iterations'],
    activation_fn=activation_function,
    rnn_type=cnetmon['rnn_type'],
    rnn_carryover=cnetmon['rnn_carryover'],
    agg_type=cnetmon['agg_type'],
    output_neighbor_hidden=cnetmon['neighbor'],
    output_global_hidden=cnetmon['global'],
).to(device)  # Move to device


# Get observations from the environment
summary_node_obs = torch.tensor(env.get_node_observation(), dtype=torch.float32, device=device).unsqueeze(0)
summary_node_adj = torch.tensor(env.get_nodes_adjacency(), dtype=torch.float32, device=device).unsqueeze(0)
summary_node_agent = torch.tensor(env.get_node_agent_matrix(), dtype=torch.float32, device=device).unsqueeze(0)
# Summarizes our current model - just to have it somewhere
netmon_summary = netmon.summarize(summary_node_obs, summary_node_adj, summary_node_agent)
node_state_size = netmon.get_state_size()

node_aux_size = 0 if env.get_node_aux() is None else len(env.get_node_aux()[0]) # = n_waypoints

# Now we wrap the whole netmon class with a Wrapper - agents will use observations from netmon
env = NetMonWrapper(env, netmon, cnetmon['start_up_iters'])
_, agent_obs_size, _, _ = reset_and_get_sizes(env)  # Observation length

print("Sizes after netmon:")
print(f"Node state size: {node_state_size}")        # 256
print(f"Agent observation size with netmon: {agent_obs_size}")  # 3263
print(f"Node auxiliary size: {node_aux_size}")      # 0



# In_features are 'agent_obs_size'
# 'env.action_space.n' is equal to the number of neighbors - choices, 'num_actions' in DGN definition
cdgn = config['dgn']
if cbase['model_type'] == "dgn":
    model = DGN(agent_obs_size, cdgn['hidden_dim'], env.action_space.n, cdgn['heads'], cdgn['att_layers'], activation_function, cdgn['kv_values']).to(device)
elif cbase['model_type'] == "comm_net":
    ccom_net = config['commnet']
    model = CommNet(agent_obs_size, cdgn['hidden_dim'], env.action_space.n, comm_rounds=ccom_net['comm_rounds'], activation_fn=activation_function).to(device)
elif cbase['model_type'] == "dqn":
    model = DQN(agent_obs_size, cdgn['hidden_dim'], env.action_space.n, activation_function).to(device)
else:
    raise ValueError("Invalid model type")


# print(config)
# Load the model parameters for quick evaluation
if config['only_eval']['eval']:
    assert Path(config['only_eval']['model_path']).exists()
    load_state_dict(
        torch.load(config['only_eval']['model_path'], map_location=device),
        model,
        netmon,
    )
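
The cell above begins by rescaling the edge lengths with a min-max transform, $d' = \frac{d - d_{\min}}{d_{\max} - d_{\min}}(b - a) + a$ with $a = 1$ and $b = 10$. As a rough standalone illustration on a toy 3-node graph (the function name `rescale_edge_weights` and the toy values are hypothetical, not taken from the repo):

```python
import numpy as np


def rescale_edge_weights(dist, adj, new_min=1.0, new_max=10.0):
    """Min-max rescale edge lengths: shortest edge -> new_min, longest -> new_max.

    Entries without an edge are set to 0 first and then pass through the
    same affine map as every other entry, mirroring the cell above.
    """
    dist = np.where(adj == 0, 0.0, dist.astype(float))
    lo = dist[dist > 0].min()  # shortest existing edge
    hi = dist.max()            # longest edge
    return (dist - lo) / (hi - lo) * (new_max - new_min) + new_min


# Toy graph: three fully connected nodes, no self-loops.
adj = np.array([[0, 1, 1],
                [1, 0, 1],
                [1, 1, 0]])
dist = np.array([[0.0, 20.0, 50.0],
                 [20.0, 0.0, 110.0],
                 [50.0, 110.0, 0.0]])

scaled = rescale_edge_weights(dist, adj)
print(scaled)
```

In this toy case the edges of length 20, 50, and 110 map to 1, 4, and 10. Note that the zeroed non-edge entries are also transformed and end up below `new_min` (here -1), so they no longer equal 0 after the rescaling.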


In [None]:
model_tar = copy.deepcopy(model).to(device)     # Create a deep copy of the current model == DGN

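# load_state_dict() updates the model in place and returns a key-matching report,
# so its result must not be assigned back to `model`
model.load_state_dict(torch.load('model.pth', map_location=device))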
model_has_state = hasattr(model, "state")
aux_model = None

policy = EpsilonGreedy(env, model, env.action_space.n, epsilon=ceps['epsilon'], step_before_train=ctraining['step_before_train'], epsilon_update_freq=ceps['epsilon_update_freq'], epsilon_decay=ceps['epsilon_decay'])

if config['only_eval']['eval']:
    model.eval()
    netmon.eval()
    print("loaded")
    print("Performing Evaluation")
    metrics = evaluate(env, policy, ceval['episodes'], ceval['episode_steps'],
                      True, "eval_dict", ceval['output_detailed'], ceval['output_node_state_aux']
                      )
    print(json.dumps(metrics, indent=4, sort_keys=True, default=str))
    sys.exit(0)
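
The `EpsilonGreedy` policy used above comes from the repo's `policy.py`. As a generic illustration of the technique only (not the repo's implementation), epsilon-greedy selection picks a random action with probability epsilon and the highest-valued action otherwise. The function names, the multiplicative decay rule, and the `floor` parameter below are assumptions made for this sketch.

```python
import random


def epsilon_greedy_action(q_values, epsilon, rng=random):
    """With probability epsilon return a uniformly random action index,
    otherwise return the index of the largest Q-value."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def decay_epsilon(epsilon, decay, floor=0.01):
    """Multiplicative decay with a lower bound (an assumed schedule)."""
    return max(floor, epsilon * decay)


q_values = [0.1, 0.9, 0.3]  # one hypothetical Q-value per neighbouring node
greedy = epsilon_greedy_action(q_values, epsilon=0.0)    # always exploits
explore = epsilon_greedy_action(q_values, epsilon=1.0)   # always explores
print(greedy, explore, decay_epsilon(1.0, 0.5))
```

With `epsilon=0.0` the choice is purely greedy, which is the deterministic end of the trade-off; larger values trade exploitation for exploration.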

In [None]:
# Evaluate
comment = "_"
if hasattr(env, "env_var"):
    comment += f"R{env.env_var.value}"
comment += "_netmon"
writer = SummaryWriter(comment=comment)
print("Performing evaluation:")
metrics = evaluate(
    env,
    policy,
    ceval['episodes'],
    ceval['episode_steps'],
    Path(writer.get_logdir()) /"eval",
    ceval['output_detailed'],
    ceval['output_node_state_aux']
)
paths_to_save = env.save_paths()
print(json.dumps(metrics, indent = 4, sort_keys=True, default=str))

for plane in env.planes:
    print(plane.paths)

env.plot_trajectory()
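
The `json.dumps(..., default=str)` calls above pass `default=str` so that values the JSON encoder cannot serialize natively are converted to strings instead of raising an error. A small sketch of that behaviour follows; the metric names and values are made up for illustration and are not the output of `evaluate`.

```python
import json
from pathlib import PurePosixPath

# Hypothetical metrics dictionary containing a value json cannot encode natively.
example_metrics = {"log_dir": PurePosixPath("runs/eval"), "reward": 1.5}

try:
    json.dumps(example_metrics)
    failed_without_default = False
except TypeError:
    # Path objects are not JSON serializable by default.
    failed_without_default = True

# default=str is called for every object the encoder does not know how to handle.
text = json.dumps(example_metrics, indent=4, sort_keys=True, default=str)
print(text)
```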