## Tutorial 1: Overview of *RSA-RL*
This tutorial gives an overview of ***RSA-RL***.  
*RSA-RL* consists of three components: an ***Agent***, an ***Environment*** built around a ***Network***, and a ***Requester***. 
Let's import the necessary components. 

In [15]:
!pip install git+https://github.com/Optical-Networks-Group/rsa-rl.git

Collecting git+https://github.com/Optical-Networks-Group/rsa-rl.git
  Cloning https://github.com/Optical-Networks-Group/rsa-rl.git to /tmp/pip-req-build-gl3yejvy
  Running command git clone -q https://github.com/Optical-Networks-Group/rsa-rl.git /tmp/pip-req-build-gl3yejvy
Building wheels for collected packages: rsarl
  Building wheel for rsarl (setup.py) ... done
  Created wheel for rsarl: filename=rsarl-1.0.0-cp36-none-any.whl size=340387 sha256=2a06ed17eff706aeda089dda0d0fecc92104e8497a48c767b085c2e3082dfd62
  Stored in directory: /tmp/pip-ephem-wheel-cache-ebd43xtz/wheels/15/a4/71/b2231a5f0b14ef5ff7de5c7c1ad520f474351e764dfc5c0f3e
Successfully built rsarl


In [None]:
import numpy as np
from rsarl.envs import DeepRMSAEnv
from rsarl.requester import UniformRequester
from rsarl.networks import SingleFiberNetwork
from rsarl.agents.ksp_agents import KSP_FF_Agent

## Experimental Settings
First, we select the network topology (the ***Network***) and the request generator (the ***Requester***),  
and then build the ***Environment*** with a random seed for reproducibility. 
In this tutorial, the ***Network*** type is ***SingleFiberNetwork*** and the topology is the *National Science Foundation (NSF)* network.

In [None]:
# build network: topology-name, the number of slots, whether to consider weighted edges or not
net = SingleFiberNetwork("nsf", n_slot=100, is_weight=True)
# build requester
requester = UniformRequester(
    net.n_nodes,
    avg_service_time=10,
    avg_request_arrival_rate=12)
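
The two rate parameters above can be read in standard teletraffic terms. The sketch below is not the `UniformRequester` implementation: it assumes Poisson arrivals, exponentially distributed service times, and a uniformly chosen source-destination pair, and the names `offered_load` and `sample_request` are hypothetical. Under that reading, the settings above would correspond to an offered load of 12 × 10 = 120 Erlangs.

```python
import random


def offered_load(avg_request_arrival_rate, avg_service_time):
    """Offered traffic in Erlangs: arrival rate times mean holding time."""
    return avg_request_arrival_rate * avg_service_time


def sample_request(rng, n_nodes, avg_service_time, avg_request_arrival_rate):
    """Hypothetical sampler (assumption, not the rsarl code)."""
    # Distinct source and destination, chosen uniformly at random
    src, dst = rng.sample(range(n_nodes), 2)
    # Poisson arrivals imply exponential inter-arrival times
    inter_arrival = rng.expovariate(avg_request_arrival_rate)
    # Exponential holding time with the given mean
    service_time = rng.expovariate(1.0 / avg_service_time)
    return src, dst, inter_arrival, service_time


rng = random.Random(0)
print(offered_load(12, 10))  # 120
# n_nodes=14 is an illustrative value, not read from the library
print(sample_request(rng, n_nodes=14, avg_service_time=10, avg_request_arrival_rate=12))
```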

In [None]:
# Reward is +1 if assignment succeeds; otherwise -1. 
env = DeepRMSAEnv(net, requester)
# set the seed for reproducibility
env.seed(0)

Next, we create an *Agent* that implements the ***K-shortest-path and first-fit algorithm*** with *k*=5. 
An Agent that uses K-shortest-path routing calculates the k shortest paths in advance. 

In [None]:
# build agent
agent = KSP_FF_Agent(k=5)
# pre-calculate the k shortest paths for every pair of nodes
agent.prepare_ksp_table(net)
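
To make the heuristic concrete, here is a minimal, library-independent sketch of first-fit spectrum assignment over a list of candidate paths. It is not the `KSP_FF_Agent` code: the function names, the `link_slots` dictionary, and the convention that `True` marks a free slot are all assumptions made for illustration. The candidate paths are assumed to be pre-sorted from shortest to longest, as a k-shortest-path table would provide.

```python
def first_fit(slot_maps, n_req_slots):
    """Lowest start index of n_req_slots contiguous slots free on every link, or None."""
    n_slot = len(slot_maps[0])
    # A slot is usable only if it is free on all links of the path
    free = [all(link[i] for link in slot_maps) for i in range(n_slot)]
    run = 0  # length of the current run of usable slots
    for i, is_free in enumerate(free):
        run = run + 1 if is_free else 0
        if run == n_req_slots:
            return i - n_req_slots + 1
    return None


def ksp_first_fit(candidate_paths, link_slots, n_req_slots):
    """Try candidate paths in order; return (path, start_slot) or None if blocked."""
    for path in candidate_paths:
        links = list(zip(path, path[1:]))  # consecutive node pairs
        start = first_fit([link_slots[link] for link in links], n_req_slots)
        if start is not None:
            return path, start
    return None


# Toy example: True = free slot (illustrative data)
link_slots = {
    (0, 2): [False, False, False, True],
    (0, 1): [False, True, True, True],
    (1, 2): [True, True, True, False],
}
paths = [[0, 2], [0, 1, 2]]  # shortest first
print(ksp_first_fit(paths, link_slots, n_req_slots=2))  # ([0, 1, 2], 1)
```

The direct path `[0, 2]` has only one free slot, so the request falls through to the second candidate, where slots 1 and 2 are free on both links.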

## Evaluation Loop
That completes the preparation. Let's evaluate the *Agent*. 

In [None]:
# exp settings
n_requests = 10000

# metrics
n_blocking = 0
total_reward = 0

obs = env.reset()
for _ in range(n_requests):
    # Get action from observation
    act = agent.act(obs)
    # Do action and get next state
    obs, reward, done, info = env.step(act)
    # Start a new episode when the current one ends
    if done:
        obs = env.reset()
        
    # calc performance
    n_blocking += 0 if info["is_success"] else 1
    total_reward += reward

    
print(f'Blocking Probability: {n_blocking / n_requests * 100}')
print(f'Total Rewards: {total_reward}')

Blocking Probability: 7.12
Total Rewards: 8576.0
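
The two printed figures are linked by the reward definition (+1 for a successful assignment, -1 otherwise): the total reward equals the number of successes minus the number of blocked requests, i.e. `n_requests - 2 * n_blocking`. The short check below uses helper names that are hypothetical and exist only for this illustration.

```python
def total_reward_from_blocking(n_requests, n_blocking):
    """Each success adds +1 and each blocked request adds -1."""
    return (n_requests - n_blocking) - n_blocking


def blocking_from_total_reward(n_requests, total_reward):
    """Invert the relation above to recover the number of blocked requests."""
    return (n_requests - total_reward) / 2


# 7.12 % of 10000 requests were blocked, i.e. 712 requests
print(total_reward_from_blocking(10000, 712))     # 8576
print(blocking_from_total_reward(10000, 8576.0))  # 712.0
```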


Congratulations! 
You have just evaluated an *Agent* with a simple loop. 
As this program shows, *RSA-RL* ships with well-known heuristic algorithms. 

### Convenient Library 1: *evaluator*
*RSA-RL* provides a convenience module, ***evaluator***, whose *evaluation* function runs the evaluation loop shown above. 
It returns ***experiences***, a log of the interactions between the *Environment* and the *Agent*. 

In [None]:
from rsarl.evaluator import evaluation

In [None]:
env.reset()
experiences = evaluation(env, agent, n_requests)

In [None]:
# calc performance
n_blocking = sum([0 if x.is_success else 1 for x in experiences])
total_reward = sum([x.reward for x in experiences])

print(f'Blocking Probability: {n_blocking / n_requests * 100}')
print(f'Total Rewards: {total_reward}')

Blocking Probability: 7.12
Total Rewards: 8576.0
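
The metric calculation above relies only on the `is_success` and `reward` attributes of each experience. The sketch below reproduces it on a stand-in record so that the arithmetic can be checked without running the simulator. The `Experience` namedtuple here is hypothetical and carries just those two fields; the real object may hold more.

```python
from collections import namedtuple

# Hypothetical stand-in with only the two attributes used in this tutorial
Experience = namedtuple("Experience", ["is_success", "reward"])


def blocking_and_reward(experiences):
    """Return (blocking probability in percent, total reward)."""
    n_blocking = sum(1 for x in experiences if not x.is_success)
    total_reward = sum(x.reward for x in experiences)
    return n_blocking / len(experiences) * 100, total_reward


log = [
    Experience(True, 1.0),
    Experience(False, -1.0),
    Experience(True, 1.0),
    Experience(True, 1.0),
]
print(blocking_and_reward(log))  # (25.0, 2.0)
```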


### Convenient Library 2: *summary*
*RSA-RL* also provides a convenience function, ***summary***, that calculates three metrics: 

- **Blocking probability**
- **Slot utilization**
- **Total rewards**

The *summary* function computes these metrics from the *experiences* returned by the *evaluation* function.  

In [None]:
from rsarl.evaluator import summary
# calc performance
blocking_prob, avg_util, total_reward = summary(experiences)

print(f'Blocking Probability: {blocking_prob}')
print(f'Avg. Slot-utilization: {avg_util}')
print(f'Total Rewards: {total_reward}')

Blocking Probability: 7.12
Avg. Slot-utilization: 0.4321618636363636
Total Rewards: 8576.0
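
The tutorial does not spell out how slot utilization is defined. One plausible reading, shown here purely as an assumption, is the fraction of occupied slots over all links, averaged over the snapshots taken after each request. The function names and the `True` = occupied convention are hypothetical and may differ from what `summary` actually does.

```python
def slot_utilization(link_slots):
    """Fraction of occupied slots in one snapshot; True marks an occupied slot."""
    total = sum(len(slots) for slots in link_slots)
    used = sum(sum(slots) for slots in link_slots)  # True counts as 1
    return used / total


def average_utilization(snapshots):
    """Mean of the per-snapshot utilization."""
    return sum(slot_utilization(s) for s in snapshots) / len(snapshots)


# Two snapshots of a toy network with two links of four slots each
snapshots = [
    [[True, False, False, False], [True, True, False, False]],  # 3 of 8 used
    [[True, True, True, False], [True, True, False, False]],    # 5 of 8 used
]
print(average_utilization(snapshots))  # 0.5
```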


### Convenient Library 3: *batch_evaluator* and *batch_summary*
*RSA-RL* also provides batch versions of the functions above. 
To use them, the *Environment* must first be converted to one of the following batch-type *Environments*. 
 
- `SerialVectorEnv`
- `MultiprocessVectorEnv`

To do so, pass the built *Environment* to the `make_serial_vector_env` (or `make_multiprocess_vector_env`) function. 

In [None]:
from rsarl.envs import make_multiprocess_vector_env, make_serial_vector_env
from rsarl.evaluator import batch_warming_up, batch_evaluation

In [None]:
seed = 0
n_envs = 5
# build batch-env
envs = make_serial_vector_env(env, n_envs, seed, test=True)
# envs = make_multiprocess_vector_env(env, n_envs, seed, test=True)

In [None]:
envs.reset()
# To process some requests before evaluation (a warm-up phase), 
# call batch_warming_up first. 
batch_warming_up(envs, agent, n_requests=3000)
# evaluation
experiences = batch_evaluation(envs, agent, n_requests=n_requests)

In [None]:
# calc performance
from rsarl.evaluator import batch_summary
blocking_probs, avg_utils, total_rewards = batch_summary(experiences)

for env_id, (blocking_prob, avg_util, total_reward) in enumerate(zip(blocking_probs, avg_utils, total_rewards)):
    print(f'[{env_id}-th ENV]Blocking Probability: {blocking_prob}')
    print(f'[{env_id}-th ENV]Avg. Slot-utilization: {avg_util}')
    print(f'[{env_id}-th ENV]Total Rewards: {total_reward}')

[0-th ENV]Blocking Probability: 7.22
[0-th ENV]Avg. Slot-utilization: 0.4413706363636364
[0-th ENV]Total Rewards: 8556.0
[1-th ENV]Blocking Probability: 5.66
[1-th ENV]Avg. Slot-utilization: 0.4281894090909091
[1-th ENV]Total Rewards: 8868.0
[2-th ENV]Blocking Probability: 6.7299999999999995
[2-th ENV]Avg. Slot-utilization: 0.43052704545454545
[2-th ENV]Total Rewards: 8654.0
[3-th ENV]Blocking Probability: 6.67
[3-th ENV]Avg. Slot-utilization: 0.43662577272727277
[3-th ENV]Total Rewards: 8666.0
[4-th ENV]Blocking Probability: 7.000000000000001
[4-th ENV]Avg. Slot-utilization: 0.43451663636363635
[4-th ENV]Total Rewards: 8600.0
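
Per-environment results like these are often condensed into a mean and a spread. The sketch below does that with the standard-library `statistics` module, using the blocking probabilities printed above (rounded to two decimals). The `summarize` helper is a hypothetical name, not part of *RSA-RL*.

```python
from statistics import mean, stdev


def summarize(values):
    """Return (mean, sample standard deviation) of per-environment results."""
    return mean(values), stdev(values)


# Blocking probabilities (%) of the five environments, rounded
blocking_probs = [7.22, 5.66, 6.73, 6.67, 7.00]
avg, spread = summarize(blocking_probs)
print(f"Blocking Probability: {avg:.3f} +/- {spread:.3f} %")
```

The same helper can be applied to `avg_utils` and `total_rewards`.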


## Conclusion
That's all! 
This tutorial gave an overview of the *RSA-RL* components. 
The next tutorial demonstrates how to develop your own heuristic *Agent*. 