### Simple Example 3 - Manual Control

By default, this example issues a few "move forward" actions to two of the agents (agents 0 and 1), in a separate window.

If you uncomment the "input" line below, it opens a text box in the Jupyter notebook, allowing basic manual control.

For example, enter `0 2 s<enter>` to tell agent 0 to move forward and then step the environment.

You should be able to see the red agent step forward, and get a reward from the env, looking like this:

`Rewards:  {0: -1.0, 1: -1.0}   [done= {0: False, 1: False, '__all__': False} ]`

Note that this example is set up to use the straightforward "PIL" renderer - without the special SBB artwork!
The agent observations are displayed as squares of varying sizes, with a paler version of the agent colour.  The targets are half-size squares in the full agent colour.

You can switch to the "PILSVG" renderer, which is prettier but currently renders the agents one step behind, because it needs to know which way each agent is turning.  This can be confusing if you are debugging step-by-step.

The image below is what the separate window should look like.

![simple_example_3.png](simple_example_3.png)

(See also: simple_rendering_demo.ipynb)

In [1]:
import random
import numpy as np
import time

In [2]:
from flatland.envs.rail_generators import sparse_rail_generator
from flatland.envs.line_generators import sparse_line_generator
from flatland.envs.observations import TreeObsForRailEnv
from flatland.envs.predictions import ShortestPathPredictorForRailEnv
from flatland.envs.rail_env import RailEnv
from flatland.utils.rendertools import RenderTool, AgentRenderVariant

In [3]:
from IPython.display import HTML, display, clear_output
import ipywidgets as ipw
from io import BytesIO
import PIL
from matplotlib import pyplot as plt
from matplotlib.animation import FuncAnimation
  
def create_rendering_area():
    rendering_area = ipw.Image()
    display(rendering_area)
    return rendering_area

def render_env_to_image(flatland_renderer):
    flatland_renderer.render_env(show=False, show_observations=False)
    image = flatland_renderer.get_image()
    return image

def render_env(flatland_renderer, rendering_area : ipw.Image):
    pil_image = PIL.Image.fromarray(render_env_to_image(flatland_renderer))
    if rendering_area is None:
        clear_output(wait=False)
        display(pil_image)
        return

    # convert numpy to PIL to png-format bytes  
    with BytesIO() as fOut:
        pil_image.save(fOut, format="png")
        byPng = fOut.getvalue()

    # set the png bytes as the image value; 
    # this updates the image in the browser.
    rendering_area.value=byPng

In [4]:
random.seed(1)
np.random.seed(1)

In [5]:
nAgents = 3
n_cities = 2
max_rails_between_cities = 2
max_rails_in_city = 4
seed = 0
env = RailEnv(
        width=20,
        height=30,
        rail_generator=sparse_rail_generator(
            max_num_cities=n_cities,
            seed=seed,
            grid_mode=True,
            max_rails_between_cities=max_rails_between_cities,
            max_rail_pairs_in_city=max_rails_in_city
        ),
        line_generator=sparse_line_generator(),
        number_of_agents=nAgents,
        obs_builder_object=TreeObsForRailEnv(max_depth=3, predictor=ShortestPathPredictorForRailEnv())
    )

init_observation = env.reset()

In [6]:
# Step the environment once, then print the tree observation for every agent
obs, all_rewards, done, _ = env.step({0: 0})
for i in range(env.get_num_agents()):
    env.obs_builder.util_print_obs_subtree(tree=obs[i])

 Direction  root :  0 ,  0 ,  0 ,  0 ,  0 ,  0 ,  45.0 ,  0 ,  0 ,  0 ,  1.0 ,  0
	 Direction  L : -np.inf
	 Direction  F :  inf ,  inf ,  inf ,  inf ,  2 ,  7 ,  38.0 ,  0 ,  0 ,  0 ,  1.0 ,  0
		 Direction  L :  inf ,  inf ,  inf ,  inf ,  8 ,  9 ,  36.0 ,  0 ,  0 ,  0 ,  1.0 ,  0
			 Direction  L : -np.inf
			 Direction  F :  inf ,  inf ,  inf ,  inf ,  31 ,  40 ,  5.0 ,  0 ,  0 ,  0 ,  1.0 ,  0
			 Direction  R :  inf ,  inf ,  inf ,  inf ,  10 ,  42 ,  47.0 ,  0 ,  0 ,  0 ,  1.0 ,  0
			 Direction  B : -np.inf
		 Direction  F :  inf ,  inf ,  inf ,  inf ,  8 ,  40 ,  47.0 ,  0 ,  0 ,  0 ,  1.0 ,  0
			 Direction  L :  inf ,  inf ,  inf ,  inf ,  41 ,  42 ,  45.0 ,  0 ,  0 ,  0 ,  1.0 ,  0
			 Direction  F :  inf ,  inf ,  inf ,  inf ,  41 ,  42 ,  45.0 ,  0 ,  0 ,  0 ,  1.0 ,  0
			 Direction  R : -np.inf
			 Direction  B : -np.inf
		 Direction  R : -np.inf
		 Direction  B : -np.inf
	 Direction  R : -np.inf
	 Direction  B : -np.inf
 Direction  root :  0 ,  0 ,  0 ,  0 ,  0 ,  0 , 

### Manual control

Commands are space-separated tokens:

s = perform step

q = quit

[agent id] [action], where the action is 1 (turn left + move), 2 (move forward) or 3 (turn right + move)
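The command syntax above can be sketched as a small standalone parser, which may help when checking a command string before sending it to the environment. This is a minimal sketch, not the notebook's own code: `Action` and `parse_command` are hypothetical names, the codes 1 to 3 come from the description above, and codes 0 (do nothing) and 4 (stop) are assumed to match flatland's `RailEnvActions`.

```python
from enum import IntEnum
from typing import Dict, List


class Action(IntEnum):
    """Local stand-in for the action codes (hypothetical; 0 and 4 are assumptions)."""
    DO_NOTHING = 0
    MOVE_LEFT = 1      # turn left + move
    MOVE_FORWARD = 2   # move forward
    MOVE_RIGHT = 3     # turn right + move
    STOP_MOVING = 4


def parse_command(cmd: str) -> List[Dict[int, Action]]:
    """Return one action dict per 's' token in the command string.

    'q' stops parsing; actions not yet followed by 's' are discarded.
    Raises ValueError for a dangling agent id or an unknown action code.
    """
    steps: List[Dict[int, Action]] = []
    pending: Dict[int, Action] = {}
    tokens = cmd.split()
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token == "q":
            break
        if token == "s":
            # Close the current step and start collecting the next one.
            steps.append(pending)
            pending = {}
            i += 1
            continue
        if i + 1 >= len(tokens):
            raise ValueError(f"agent id {token!r} has no action")
        pending[int(token)] = Action(int(tokens[i + 1]))
        i += 2
    return steps


# The default command used in this example: agents 0 and 1 move forward, then step.
for step_actions in parse_command("0 2 1 2 s"):
    print({agent: action.name for agent, action in step_actions.items()})
```

Each returned dict could be passed to `env.step(...)` in turn. Unlike the loop in the next cell, this sketch rejects an agent id that is not followed by an action instead of failing with an `IndexError`.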

In [7]:
rendering_area = create_rendering_area()

env_renderer = RenderTool(env, gl="PIL",
                                  agent_render_variant=AgentRenderVariant.AGENT_SHOWS_OPTIONS_AND_BOX,
                                  show_debug=True,
                                  screen_height=750,
                                  screen_width=750)



for step in range(10):

    # This is an example command, setting agent 0's action to 2 (move forward), and agent 1's action to 2, 
    # then stepping the environment.
    cmd = "0 2 1 2 s"
    
    # uncomment this input statement if you want to try interactive manual commands
    # cmd = input(">> ")
    
    cmds = cmd.split(" ")

    action_dict = {}

    i = 0
    while i < len(cmds):
        if cmds[i] == 'q':
            import sys

            sys.exit()
        elif cmds[i] == 's':
            obs, all_rewards, done, _ = env.step(action_dict)
            action_dict = {}
            print("Rewards: ", all_rewards, "  [done=", done, "]")
        else:
            agent_id = int(cmds[i])
            action = int(cmds[i + 1])
            action_dict[agent_id] = action
            i = i + 1
        i += 1

    render_env(env_renderer, rendering_area)

Image(value=b'')

Rewards:  {0: 0, 1: 0, 2: 0}   [done= {0: False, 1: False, 2: False, '__all__': False} ]
Rewards:  {0: 0, 1: 0, 2: 0}   [done= {0: False, 1: False, 2: False, '__all__': False} ]
Rewards:  {0: 0, 1: 0, 2: 0}   [done= {0: False, 1: False, 2: False, '__all__': False} ]
Rewards:  {0: 0, 1: 0, 2: 0}   [done= {0: False, 1: False, 2: False, '__all__': False} ]
Rewards:  {0: 0, 1: 0, 2: 0}   [done= {0: False, 1: False, 2: False, '__all__': False} ]
Rewards:  {0: 0, 1: 0, 2: 0}   [done= {0: False, 1: False, 2: False, '__all__': False} ]
Rewards:  {0: 0, 1: 0, 2: 0}   [done= {0: False, 1: False, 2: False, '__all__': False} ]
Rewards:  {0: 0, 1: 0, 2: 0}   [done= {0: False, 1: False, 2: False, '__all__': False} ]
Rewards:  {0: 0, 1: 0, 2: 0}   [done= {0: False, 1: False, 2: False, '__all__': False} ]
Rewards:  {0: 0, 1: 0, 2: 0}   [done= {0: False, 1: False, 2: False, '__all__': False} ]
