# Naive Agent in the NumberLine Environment

This is the simplest agent exploring the NumberLine environment. The naive agent moves randomly and memorizes where it has been in a tree. The NumberLine environment has 5 available actions, corresponding to +0, +1, -1, +10, and -9; with these five actions the naive agent traverses the integers. Once it has explored at random for 1000 actions, you can ask it to move anywhere on the number line it has seen before. Moving on the number line is exactly the same thing as manipulating the environment, since in this environment the only thing the agent can manipulate is its position. Still, no matter the complexity of the environment, the agent can be thought of as merely traversing its state space, just as it traverses the integers in this simple environment.
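To make the setup concrete, here is a minimal sketch of what such an environment could look like. `ToyNumberLine` and `ACTION_DELTAS` are hypothetical names invented for illustration and are not part of the `sensorimotor` package; the index-to-delta mapping follows the action table given later in this notebook.

```python
import random

# Hypothetical stand-in for the NumberLine environment (not the real class).
# Action index -> change in position, following the notebook's action table.
ACTION_DELTAS = {0: 0, 1: +1, 2: -1, 3: +10, 4: -9}


class ToyNumberLine:
    """Minimal deterministic number line: the state is a single integer."""

    def __init__(self, start=0):
        self.start = start
        self.state = start

    def reset(self):
        # Return to the starting position and report it as the observation.
        self.state = self.start
        return self.state

    def step(self, action):
        # Each action simply adds its delta to the current position.
        self.state += ACTION_DELTAS[action]
        return self.state


# A short random walk, analogous to the exploration phase.
toy_env = ToyNumberLine()
rng = random.Random(0)
position = toy_env.reset()
for _ in range(5):
    position = toy_env.step(rng.choice(list(ACTION_DELTAS)))
```

Because every action is a fixed offset, the whole state of this toy environment is the agent's position, which is why moving and manipulating the environment coincide here.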

First we import the agent and the environment...

In [1]:
from sensorimotor.agents import NaiveSensorimotor
from sensorimotor.envs import NumberLine

import time

Initialize the Environment and the Agent...

In [2]:
env = NumberLine()
env.seed(0)
agent = NaiveSensorimotor(env)

Training: allow the agent to explore (at random in this case)...

In [3]:
for i_episode in range(1):
    obs = env.reset()
    #env.render()
    for t_timesteps in range(1000):
        action = agent.random_step(obs)
        obs, reward, done, info = env.step(action)
        
        # notice it's moving through the environment state space...
        time.sleep(.001)
        print(obs, end='\r')

1451

Now that it's done training (via its random walk), inspect which state of the environment it ended up on...

In [4]:
#entire_tree = agent.previous
final_state = agent.previous.name
final_state

135

Let's look at a state of the environment it visited prior to its current location...

In [5]:
three_b4_final = agent.previous.parent.parent.parent.name
three_b4_final

137

Let's inspect how it got there by looking at the full detail of this node (the naive agent makes an explicit memory every time it sees a new state)...

In [6]:
agent.previous

Node('/root/0/-9/-18/-27/-26/-27/-17/-17/-17/-16/-6/-15/-24/-14/-4/6/7/7/6/16/15/16/17/18/19/10/10/20/21/31/32/31/31/22/23/24/15/16/7/6/7/17/16/17/17/27/28/27/18/19/10/11/12/3/-6/-5/-6/-15/-16/-25/-26/-25/-15/-14/-13/-13/-12/-12/-21/-21/-21/-20/-20/-21/-22/-21/-22/-21/-30/-39/-48/-49/-48/-47/-48/-57/-57/-56/-65/-64/-63/-62/-63/-62/-63/-53/-53/-53/-62/-63/-63/-63/-63/-62/-62/-71/-72/-71/-71/-71/-71/-70/-79/-79/-78/-87/-86/-87/-86/-95/-96/-105/-95/-96/-86/-87/-88/-87/-88/-88/-97/-106/-115/-116/-117/-116/-115/-105/-106/-115/-115/-115/-124/-124/-133/-123/-113/-114/-123/-123/-122/-131/-130/-130/-129/-128/-118/-119/-118/-127/-136/-145/-135/-134/-135/-135/-135/-144/-145/-145/-154/-163/-153/-153/-162/-152/-151/-151/-141/-131/-131/-121/-121/-111/-101/-91/-92/-92/-82/-81/-80/-89/-88/-78/-79/-69/-69/-68/-68/-67/-67/-76/-85/-86/-87/-96/-86/-87/-77/-78/-77/-77/-78/-78/-79/-78/-78/-77/-67/-76/-66/-67/-68/-68/-58/-58/-59/-49/-58/-57/-58/-58/-58/-67/-68/-69/-68/-68/-67/-76/-77/-76/-75/-65/-66/-75/-65/

Notice the last action it took (to get to its current state) is listed as 'edge' above.
```
    0 = do nothing
    1 = +1
    2 = -1
    3 = +10
    4 = -9
```
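The explicit memory shown in the node path can be pictured as a chain of nodes, each storing the observed state and the action ("edge") that led to it. The sketch below is an assumption about the general idea, not the agent's actual implementation; `MemoryNode` and `remember` are hypothetical names.

```python
class MemoryNode:
    """Hypothetical memory node: one observed state plus the action taken to reach it."""

    def __init__(self, name, edge=None, parent=None):
        self.name = name        # observed state
        self.edge = edge        # action index that led here from the parent
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """State names from the root down to this node."""
        names = []
        node = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names[::-1]


def remember(previous, action, observation):
    """Record a new observation as a child of the previous node and return it."""
    return MemoryNode(observation, edge=action, parent=previous)


root = MemoryNode(0)
node = remember(root, 4, -9)    # action 4 is -9
node = remember(node, 4, -18)   # action 4 again
print(node.path(), node.edge)
```

Walking `parent` links from any node recovers the sequence of states that led to it, which is the same navigation used with `agent.previous.parent` earlier.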
And let's ask the agent to figure out how to get from something it has seen before to the last state of the environment that it saw...

In [7]:
print('going from', three_b4_final, 'to', final_state, 'using the environment actions...')
print(agent.get_path(target=final_state, start=three_b4_final))
print('...which correspond to...')
print([
    {0: '+0', 1: '+1', 2: '-1', 3: '+10', 4: '-9'}.get(action, '+0')
    for action in agent.get_path(target=final_state, start=three_b4_final)])

going from 137 to 135 using the environment actions...
[2, 2]
...which correspond to...
['-1', '-1']


Pretty cool, the agent knows how to manipulate the environment from one state to produce another state (that is, it knows how to traverse the state space of the environment, at least in this case)...
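As a quick sanity check, the returned action path can be replayed by hand using the action table. This is a small sketch; `ACTION_DELTAS` and `replay` are hypothetical helpers, not functions from the `sensorimotor` package.

```python
# Action index -> change in position, following the notebook's action table.
ACTION_DELTAS = {0: 0, 1: +1, 2: -1, 3: +10, 4: -9}


def replay(start, actions):
    """Apply each action's delta in order and return the resulting position."""
    state = start
    for action in actions:
        state += ACTION_DELTAS[action]
    return state


print(replay(137, [2, 2]))  # 135
```

Starting from 137, two `-1` actions land on 135, matching the path the agent reported.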

Now let's actually ask it to do so. We'll reset the state to something it has seen before...

In [8]:
agent.reset(three_b4_final)

137

Then we'll ask it to execute the steps to get to the last state of the environment it saw...

In [9]:
agent.do(final_state, verbose=True)

[2, 2]


135

Let's try a longer environment manipulation: from the first thing the agent ever saw to the last thing the agent ever saw...

In [10]:
print(agent.get_path(target=final_state, start=0))

[3, 3, 4, 3, 3, 2, 3, 4, 1, 2, 4, 3, 4, 4, 1, 4, 3, 3, 3, 1, 1, 2, 2, 2, 4, 2, 4, 3, 1, 1, 2, 4, 3, 3, 1, 1, 1, 3, 2, 3, 4, 3, 4, 1, 3, 1, 1, 2, 2, 4, 3, 4, 1, 2, 2, 3, 2, 1, 4, 3, 2, 4, 3, 1, 3, 3, 1, 2, 2, 4, 3, 4, 3, 4, 2, 3, 3, 4, 3, 1, 4, 1, 3, 3, 2, 2, 2, 2, 1, 3, 1, 1, 4, 1, 4, 2, 1, 3, 4, 4, 2, 1, 3, 3, 4, 3, 2, 3, 2, 4, 1, 4, 4, 1, 3, 3, 3, 3, 1, 4, 4, 4, 4, 1, 1, 3, 1, 3, 4, 1, 4, 1, 4, 1, 2, 1, 1, 1, 2, 1, 4, 3, 4, 2, 4, 2, 2, 4, 3, 2, 3, 2, 4, 1, 4, 4, 3, 4, 1, 1, 2, 2, 2, 3, 2, 3, 1, 4, 2, 2, 1, 3, 1, 1, 4, 3, 3, 2, 2, 4, 2, 4, 2, 3, 1, 1, 1, 3, 2, 3, 2, 1, 1, 2, 4, 2, 2, 1, 2, 4, 2, 2, 2, 2, 2, 2, 2, 1, 2, 3, 2, 2, 4, 2, 1, 1, 2, 1, 1, 3, 1, 3, 4, 2, 4, 2, 1, 4, 2, 3, 1, 2, 3, 1, 2, 1, 1, 2, 4, 4, 3, 2, 2, 4, 4, 3, 2, 1, 3, 3, 4, 3, 3, 1, 4, 4, 3, 1, 4, 3, 4, 3, 2, 4, 3, 3, 4, 2, 1, 2, 4, 3, 1, 2, 1, 1, 4, 2, 1, 2, 2, 4, 1, 3, 2, 2, 1, 2, 1, 3, 2, 1, 2, 3, 4, 2, 2, 2, 3, 1, 2, 3, 4, 3, 4, 4, 3, 3, 1, 3, 4, 2, 4, 3, 2, 4, 1, 2, 3, 2, 1, 4, 3, 3, 1, 4]


The action path above may be shorter than 1000 steps because the agent looks for the shortest path it has ever seen between the two state representations.
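One common way to find a shortest path through remembered transitions is a breadth-first search. The sketch below assumes the memory is a dictionary of `state -> {action: next_state}`; this is an illustrative assumption, and `shortest_action_path` is a hypothetical function, not how `agent.get_path` is necessarily implemented.

```python
from collections import deque


def shortest_action_path(transitions, start, target):
    """Breadth-first search over remembered transitions.

    Returns the shortest list of actions from start to target,
    or None if the target was never reached from start.
    """
    queue = deque([start])
    came_from = {start: None}  # state -> (previous state, action taken)
    while queue:
        state = queue.popleft()
        if state == target:
            break
        for action, nxt in transitions.get(state, {}).items():
            if nxt not in came_from:
                came_from[nxt] = (state, action)
                queue.append(nxt)
    if target not in came_from:
        return None
    # Walk back from the target to the start, collecting actions.
    path = []
    state = target
    while came_from[state] is not None:
        state, action = came_from[state]
        path.append(action)
    return path[::-1]


# Remembered walk 0 -(+10)-> 10 -(-9)-> 1 -(+1)-> 2, plus a seen shortcut 0 -(+1)-> 1.
transitions = {0: {3: 10, 1: 1}, 10: {4: 1}, 1: {1: 2}, 2: {}}
print(shortest_action_path(transitions, 0, 2))  # [1, 1]
```

The search only uses transitions that were actually observed, so the result is the shortest path within the agent's memory, not necessarily the shortest path possible in the environment.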

We'll close the environment...

In [11]:
env.close()

## Review

The naive agent makes explicit memories; it doesn't generalize in any way. It doesn't understand patterns. It can't detect that one kind of action is exactly the opposite of another kind of action. It can't extrapolate or draw conclusions. It is not intelligent.

If the environment is small, the agent is able to memorize it and produce any configuration of it that you would like. This is the essential role of any sensorimotor inference engine: it can manipulate the environment it is connected to merely by being shown the state of the environment you would like to see.

To scale a sensorimotor agent so that it can manipulate a deterministic environment of any size and complexity, we need to infuse it with more intelligence.

In [12]:
# import anytree
# print(anytree.RenderTree(agent.root, style=anytree.render.AsciiStyle()))