# Gym random agent attacking a chain-like network

## Chain network
We consider a computer network of Windows and Linux machines where each machine has a vulnerability
granting access to another machine, following this pattern:

    Start ---> (Linux ---> Windows --->  ... Linux ---> Windows)*  ---> Linux[Flag]

The network is parameterized by the length of the central Linux-Windows chain.
The start node leaks the credentials to connect to all other nodes:

- For each `XXX ---> Windows` section, the XXX node has:
    - a local vulnerability exposing the RDP password to the Windows machine
    - several other trap vulnerabilities (high cost with no outcome)
- For each `XXX ---> Linux` section, the Windows node has:
    - a local vulnerability exposing the SSH password to the Linux machine
    - several other trap vulnerabilities (high cost with no outcome)

The chain is terminated by one node with a flag (reward).
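
The layout described above can be sketched in a few lines of plain Python. This is only an illustration of the pattern, not the simulator's own construction code: the helper names `chain_layout` and `chain_edges`, the node naming scheme, and the assumption that the chain length is an even number of machines are all hypothetical.

```python
from typing import List, Tuple


def chain_layout(size: int) -> List[Tuple[str, str]]:
    """Return (node name, role) pairs for a chain with `size` central machines.

    Hypothetical helper: names and the even-size requirement are assumptions
    made for this sketch, not taken from the simulator.
    """
    if size < 0 or size % 2 != 0:
        raise ValueError("size must be a non-negative even number")
    nodes = [("start", "Start")]
    # Central chain: Linux, Windows, Linux, Windows, ...
    for i in range(1, size + 1):
        os_name = "Linux" if i % 2 == 1 else "Windows"
        nodes.append((f"{i}_{os_name}", os_name))
    # The chain is terminated by a Linux node holding the flag.
    nodes.append((f"{size + 1}_Linux", "Linux[Flag]"))
    return nodes


def chain_edges(nodes: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Each node grants access to the next one in the chain."""
    return [(a[0], b[0]) for a, b in zip(nodes, nodes[1:])]


layout = chain_layout(4)
print(" ---> ".join(name for name, _ in layout))
```

With `size=4` the sketch yields six nodes: the start node, four alternating Linux and Windows machines, and the final Linux node with the flag.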

## Benchmark
The following plot shows the mean cumulative reward over time, with a band of one standard deviation, as a random agent attacks the network.
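
As a rough sketch of how such statistics could be derived, the snippet below computes the per-step mean and standard deviation of the cumulative reward across several episodes. The function name `cumulative_reward_stats` and the toy reward lists are hypothetical, and the use of the population standard deviation is an assumption; the plot may have been produced differently.

```python
from itertools import accumulate
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def cumulative_reward_stats(
    episodes: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Per-step mean and population std of cumulative reward across episodes."""
    # Turn each per-step reward list into a running total.
    curves = [list(accumulate(rewards)) for rewards in episodes]
    # Only compare steps that every episode reached.
    horizon = min(len(curve) for curve in curves)
    means = [mean([curve[t] for curve in curves]) for t in range(horizon)]
    stds = [pstdev([curve[t] for curve in curves]) for t in range(horizon)]
    return means, stds


# Toy data: two episodes of four steps each (made-up rewards).
episodes = [[0, 10, 0, 5], [0, 0, 20, 0]]
means, stds = cumulative_reward_stats(episodes)
print(means)
print(stds)
```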

In [None]:
%%HTML
<img src="random_plot.png" width="300">

In [None]:
import sys
import logging
import gym
from gym import spaces
import numpy as np
import networkx as nx
import cyberbattle.simulation.actions as actions
import cyberbattle._env.cyberbattle_env as cyberbattle_env
import cyberbattle.agents.random_agent as random_agent
from cyberbattle._env.defender import ScanAndReimageCompromisedMachines
import cyberbattle.custom_scenarios.mfo.mfo as mfo
import importlib

# Reload the modules so that local edits are picked up without restarting the kernel
importlib.reload(actions)
importlib.reload(cyberbattle_env)
importlib.reload(mfo)

logging.basicConfig(stream=sys.stdout, level=logging.ERROR, format="%(levelname)s: %(message)s")

In [None]:
# chainpattern.create_network_chain_link(2)

In [None]:
gym_env = gym.make(
    'CyberBattleChain-v0',
    size=4,
    attacker_goal=cyberbattle_env.AttackerGoal(own_atleast=0, own_atleast_percent=1.0),
    defender_constraint=cyberbattle_env.DefenderConstraint(maintain_sla=0.80),
    defender_agent=ScanAndReimageCompromisedMachines(
        probability=0.2,
        scan_capacity=2,
        scan_frequency=5))

In [None]:
gym_env.environment

In [None]:
gym_env.environment.network.nodes

In [None]:
gym_env.action_space

In [None]:
gym_env.action_space.sample()

In [None]:
gym_env.observation_space.sample()

In [None]:
# Sample 100 valid actions; the results are discarded
for _ in range(100):
    gym_env.sample_valid_action()

In [None]:
random_agent.run_random_agent(1, 10000, gym_env)

In [None]:
# Take one step with a randomly sampled valid action
observation, reward, done, info = gym_env.step(gym_env.sample_valid_action())

In [None]:
observation