In [1]:
from engine.api.cartesian import *


# Understanding how agents learn through communication

Agents can use communication to achieve two kinds of goals.  On one hand, an agent might communicate in order to improve itself.  On the other, multiple agents might work together to solve a problem that no single agent could solve alone.  Both of these ideas play into the general notion that agents can learn how to attain some goal or minimize a cost.  

Presumably, interaction between agents could cause one of them to change its goals or redefine its cost function.  Agents might also share with each other their knowledge or awareness of the environment they exist in. 

To study the kinds of interactions and the kinds of learning that might happen between multiple agents, I have designed a virtual world that multiple agents can interact in.  This notebook explores some ideas for the world, and outlines possible games or goals we could design for.


### Key terms:

  - Environment - A virtual world that agents can interact with.  It may provide sensory inputs, like a sense of position in space or a way to send/receive messages.
  - Location - A collection of spatial integer coordinates, where each coordinate references a specific position in space.  This may be a relative or absolute position in space.
  - Agent - A piece of code that interacts with the environment to perform actions or receive sensory input.


  - For slightly more detail, see engine/base_classes.py

## Virtual world:

The `CartesianEnvironment` is 

    A traversable n-dimensional Cartesian coordinate space of pre-defined shape
    that defines the ways agents can interact within this space

    Agents use the environment to send each other messages,
    check the state of the grid, and move to new locations.
    
Some facts about the traversable space for a CartesianEnvironment:
  - A coordinate in n-dimensional space consists of n integers, where each integer corresponds to a discrete location.  2D space is (x, y).  A 3D coordinate is (x, y, z).  
  - There are no walls; the space wraps around at every edge.  For instance, in a 2D space, all 4 corners are connected.
  - Coordinates are not used directly or even explicitly defined, but Locations are!  One or more coordinates may make up a Location.
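
The "no walls" property can be sketched with modular arithmetic. This is a minimal illustration in plain NumPy, not the engine's implementation; the helper name `wrap` is hypothetical.

```python
import numpy as np

def wrap(coords, grid_shape):
    """Map integer coordinates onto a grid with no walls (hypothetical helper)."""
    # np.mod with a positive divisor never returns a negative value,
    # so stepping off one edge re-enters from the opposite edge.
    return np.mod(np.asarray(coords), np.asarray(grid_shape))

grid_shape = (3, 3)
print(wrap((3, 1), grid_shape))    # one step past the last row -> [0 1]
print(wrap((-1, -1), grid_shape))  # diagonally off corner (0, 0) -> [2 2]
```

Under this assumption, corner (0, 0) of a 3x3 grid is a single step away from corner (2, 2), which is one way to read "all 4 corners are connected".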
  
A Location allows us to define collections of coordinates in coordinate space.  For instance, a location in 2D space might consist of two coordinates: [(0, 0), (0, 1)].  

  - `RelLoc`  - A relative location denotes an offset from some undefined reference point.  For instance, (-1, 0) might mean one coordinate above.
  - `AbsLoc`  - An absolute location is defined against a fixed, assumed reference point.  For instance, (0, 0) could be the origin of a Cartesian space.
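
One way to picture how the two kinds of location combine is to apply a single relative offset to every coordinate of an absolute location. The sketch below uses plain NumPy arrays rather than the `RelLoc` and `AbsLoc` classes; the function `translate` is hypothetical, and the wrap-around rule is an assumption taken from the "no walls" description.

```python
import numpy as np

def translate(abs_coords, rel_offset, grid_shape):
    """Shift every coordinate of an absolute location by one relative offset.

    Hypothetical helper: it only illustrates the idea and is not
    taken from the engine's source.
    """
    abs_coords = np.atleast_2d(abs_coords)   # shape (n_coords, n_dims)
    return (abs_coords + np.asarray(rel_offset)) % np.asarray(grid_shape)

body = [(0, 0), (0, 1)]   # a location made of two coordinates
moved = translate(body, (-1, 0), grid_shape=(3, 3))
print(moved.tolist())     # [[2, 0], [2, 1]]
```

Both coordinates move "one coordinate above", and because they start in the first row they re-enter on the last row.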


## Agents

    An agent interacts with the environment to perform actions or receive
    sensory input.  It is identifiable by an integer AgentId.

    Agents:
      - Can exist at a Location, so they may span many coordinates and can in fact be distributed beings.
      - Decide how to query the environment to get information from it.  
        - To an Agent, the Environment is just an API.  
        - An agent might ask only for relative positioning, or it might be aware of absolute position.
        - It might not even know where its own body parts are located, or might have limited knowledge of where other agents are.
      - Ask the environment to process requests to move in the space and send messages to other agents.
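
The idea that the Environment is "just an API" to an Agent can be sketched with two toy classes. Everything here is hypothetical: `ToyEnvironment`, `ToyAgent`, and the method names `move`, `send`, `inbox`, and `position` are stand-ins chosen for illustration, and they are not the signatures of the engine's `process_request` or `agent_inbox`.

```python
from collections import defaultdict

class ToyEnvironment:
    """Hypothetical stand-in for an environment; not the engine's class."""

    def __init__(self, grid_shape):
        self.grid_shape = grid_shape
        self._positions = {}               # agent id -> coordinate tuple
        self._inboxes = defaultdict(list)  # agent id -> received messages

    def add_agent(self, agent_id, position):
        self._positions[agent_id] = tuple(position)

    def move(self, agent_id, offset):
        # The environment, not the agent, owns the position and applies the move.
        old = self._positions[agent_id]
        self._positions[agent_id] = tuple(
            (p + o) % s for p, o, s in zip(old, offset, self.grid_shape))

    def send(self, to_id, message):
        self._inboxes[to_id].append(message)

    def inbox(self, agent_id):
        return list(self._inboxes[agent_id])

    def position(self, agent_id):
        return self._positions[agent_id]

class ToyAgent:
    """Repeats one fixed move and tells a peer about each step."""

    def __init__(self, agent_id, offset, peer_id):
        self.agent_id, self.offset, self.peer_id = agent_id, offset, peer_id

    def next_action(self, env):
        env.move(self.agent_id, self.offset)
        env.send(self.peer_id, (self.agent_id, "moved", self.offset))

env = ToyEnvironment((3, 3))
env.add_agent(1, (1, 1))
env.add_agent(2, (0, 0))
ToyAgent(1, (1, 0), peer_id=2).next_action(env)
print(env.position(1))  # (2, 1)
print(env.inbox(2))     # [(1, 'moved', (1, 0))]
```

The agent never touches the grid directly; it only issues requests, which leaves the environment free to restrict what each agent can see or do.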
      


## Games (not implemented):

1. A Traveling Salesman game: Each coordinate in the Environment must be visited at least once, using the minimum number of moves and the minimum amount of communication.
1. Sequence Prediction: Agents learn to predict the next movements of other agents in the environment.
1. Self Awareness: A distributed agent (like a column of 3 coordinates) must learn how to move without separating itself.  Once it can do this, it should figure out how to efficiently clean the space.
  - Presumably, we could get agents to learn to work together to visit all squares.  For instance, a Toroid and a Cylinder in 3D space might learn to travel together.
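
Since none of these games is implemented, the following is only a sketch of how the Traveling Salesman game might be scored. It assumes a flat array of per-coordinate visit counts; the function names and the `message_weight` parameter are hypothetical.

```python
import numpy as np

def all_visited(visit_counts):
    """True once every coordinate has been visited at least once."""
    return bool(np.all(np.asarray(visit_counts) >= 1))

def game_cost(n_moves, n_messages, message_weight=1.0):
    """Hypothetical cost: moves plus weighted messages (lower is better)."""
    return n_moves + message_weight * n_messages

counts = np.array([0, 1, 0, 0, 0, 0, 0, 1, 0])  # 3x3 grid, two squares visited
print(all_visited(counts))                       # False
print(all_visited(np.ones(9, dtype=int)))        # True
print(game_cost(n_moves=12, n_messages=3, message_weight=0.5))  # 13.5
```

A game loop could run agents until `all_visited` returns True and then compare strategies by `game_cost`; how moves and messages should be weighted against each other is an open design choice.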


## Tasks:

- Build out Cartesian Environment, Location
- Design some simple Agents that interact with the space
- Implement a way to visualize this space.

# Simple Example

In [2]:
# Create Cartesian Environment of 2 dimensions, and 9 coordinates
env2 = CartesianEnvironment((3, 3))

# Create an agent that moves down 1 coordinate at a time.
agent = DeterministicAgent([(1,0)])

# Place this agent at coordinate (1,1)
env2.add_agent(agent.ID, AbsLoc((1,1)))

print('created agent with ID: %s' % agent.ID)

created agent with ID: 140240950951720


In [3]:
# initially, the agent has visited no coordinates 
env2.visit_counts()

array([0, 0, 0, 0, 0, 0, 0, 0, 0])

In [4]:
# Show the location of this agent on the grid
list(env2.agents_at_location())

[(140240950951720, AbsLoc([[1, 1]]))]

In [5]:

# Have agent interact with its environment twice
for _ in range(2):
    # It will ask to move down 1 coordinate
    agent.next_action(env2)

    # Let's see what happened
    print('agent locations', list(env2.agent_location(agent.ID)))
    print('visit counts', '\n', env2.visit_counts())

agent locations [AbsLoc([2, 1])]
visit counts 
 [0 0 0 0 0 0 0 1 0]
agent locations [AbsLoc([0, 1])]
visit counts 
 [0 1 0 0 0 0 0 1 0]
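
The flat visit counts above appear to follow row-major order: the entry for (2, 1) sits at index 7 and the entry for (0, 1) at index 1. Assuming that ordering, NumPy can convert between grid coordinates and flat indices. The array below is copied from the output above rather than read from the environment.

```python
import numpy as np

grid_shape = (3, 3)

# Coordinate -> index in the flat array (row-major order is assumed).
print(np.ravel_multi_index((2, 1), grid_shape))  # 7
print(np.ravel_multi_index((0, 1), grid_shape))  # 1

# Flat array -> grid view, then the coordinates of the visited squares.
counts = np.array([0, 1, 0, 0, 0, 0, 0, 1, 0])
grid = counts.reshape(grid_shape)
print(grid)
visited = np.argwhere(grid > 0)
print(visited.tolist())                          # [[0, 1], [2, 1]]
```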


### API

In [6]:
print('\n'.join(x for x in dir(env2) if not x.startswith('_')))

add_agent
agent_inbox
agent_location
agent_locations_nearby
agents_at_location
grid_ndim
grid_shape
neighbors
process_request
visit_counts
visit_counts_nearby


In [7]:
import numpy as np
env22 = CartesianEnvironment((10,10))

ad22 = DeterministicAgent(np.random.randint(-1, 2, (5, 3, 2)))
startloc = AbsLoc(np.random.randint(0, 3, (3, 2)), 2)
env22.add_agent(ad22.ID, startloc)
list(env22.agents_at_location())

[(140240951047728, AbsLoc([[1, 1],
         [2, 1],
         [2, 2]]))]

In [8]:
ad22.next_action(env22)
list(env22.agents_at_location())
env22.visit_counts()

array([0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0])

In [9]:
ad22.next_action(env22)
list(env22.agents_at_location())
env22.visit_counts()

array([0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
       0, 0, 0, 0, 0, 0, 0, 0])
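
A 100-element flat array is hard to read by eye. As a sketch, the nonzero positions can be converted back to grid coordinates, again assuming row-major order. The indices below were read off the second output above, not queried from `env22`.

```python
import numpy as np

grid_shape = (10, 10)
counts = np.zeros(100, dtype=int)
# Nonzero positions read off the second output above.
counts[[1, 11, 20, 33, 42, 90]] = 1

flat_idx = np.flatnonzero(counts)
rows, cols = np.unravel_index(flat_idx, grid_shape)
coords = list(zip(rows.tolist(), cols.tolist()))
print(coords)
# [(0, 1), (1, 1), (2, 0), (3, 3), (4, 2), (9, 0)]
```

The visit at (9, 0) is consistent with one part of the agent stepping off the first row and re-entering on the last row.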

### 3D grid

In [10]:
env3 = CartesianEnvironment((2,2,2))
a3 = DeterministicAgent([
        RelLoc((0,0,1)),
        RelLoc((0,0,1)),
        RelLoc((0,1,0))
        ])

env3.add_agent(a3.ID, AbsLoc((0,0,0)))
print(env3.visit_counts())

print('---')
for _ in range(5):
    a3.next_action(env3)
    print(env3.visit_counts().reshape(env3.grid_shape))
print('---') 
print(env3.visit_counts())

[0 0 0 0 0 0 0 0]
---
[[[0 1]
  [0 0]]

 [[0 0]
  [0 0]]]
[[[1 1]
  [0 0]]

 [[0 0]
  [0 0]]]
[[[1 1]
  [1 0]]

 [[0 0]
  [0 0]]]
[[[1 1]
  [1 1]]

 [[0 0]
  [0 0]]]
[[[1 1]
  [2 1]]

 [[0 0]
  [0 0]]]
---
[1 1 2 1 0 0 0 0]
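
The output above can be reproduced without the engine under three assumptions: the agent cycles through its move list, the grid wraps at its edges, and the starting square is not counted as a visit. This is a sketch of that reading, not the engine's code.

```python
import itertools
import numpy as np

grid_shape = (2, 2, 2)
moves = [(0, 0, 1), (0, 0, 1), (0, 1, 0)]  # same offsets as the cell above
position = np.array((0, 0, 0))
counts = np.zeros(grid_shape, dtype=int)    # the starting square is not counted

# Cycle through the move list, wrapping at the edges of the grid.
for _, offset in zip(range(5), itertools.cycle(moves)):
    position = (position + np.array(offset)) % np.array(grid_shape)
    counts[tuple(position)] += 1

print(counts.ravel())  # [1 1 2 1 0 0 0 0]
```

The second move wraps from (0, 0, 1) back to (0, 0, 0), and the fifth move returns to (0, 1, 0), which is why that square ends with a count of 2.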


### Display
The display uses OpenGL (via Pyglet).

In [11]:
d2 = cartesian_display(env2, scale=(50,50))
d2.schedule(agent.next_action, 1, env=env2)
d2.run()  # note: the display renders in a window, not in the notebook

In [12]:
d22 = cartesian_display(env22, scale=(50,50))
d22.schedule(ad22.next_action, 1, env=env22)
d22.run()  # note: the display renders in a window, not in the notebook

In [13]:
env = CartesianEnvironment((200, 200))

a1 = DeterministicAgent(np.random.randint(-1, 2, (1000, 100, 2)))
env.add_agent(a1.ID, n_coord=100)
a2 = DeterministicAgent(np.random.randint(-1, 2, (10000, 1, 2)))
env.add_agent(a2.ID, np.random.randint(0, 200, (1, 2)))

d = cartesian_display(env, refresh_interval=1/10, scale=(3, 3),
                      visit_counts=True, agents=True)

def run_agents(a1, a2, env):
    a1.next_action(env)
    a2.next_action(env)

d.schedule(run_agents, None, a1, a2, env)

d.run()  # note: the display renders in a window, not in the notebook


In [None]:
env = CartesianEnvironment((500, 500))

n_coord = 10
a1 = DeterministicAgent(np.random.randint(-1, 2, (70, n_coord, 2)))
env.add_agent(a1.ID, n_coord=n_coord)
n_coord = 20
a2 = DeterministicAgent(np.random.randint(-1, 2, (80, n_coord, 2)))
env.add_agent(a2.ID, n_coord=n_coord)

d = cartesian_display(env, refresh_interval=1/10, scale=(1, 1),
                      visit_counts=True, agents=True)

def run_agents(a1, a2, env):
    a1.next_action(env)
    a2.next_action(env)

d.schedule(run_agents, None, a1, a2, env)

d.run()  # note: the display renders in a window, not in the notebook

In [14]:
#d.window.close()
#d1.unschedule()
d.unschedule(all_displays=True)