# Example: Playing Pong

A useful way to study behaviour is through small, simple "toy" environments.  Within these environments, an actor that is not yet capable of behaving well may *learn* to do so through interaction and some gentle prodding in the right direction.  This process is called *[reinforcement learning]( https://en.wikipedia.org/wiki/Reinforcement_learning )*.

In this article, we'll see how a simple table-top game can be modelled within EDEN, to be used in closed-loop interaction experiments.

The game has a rectangular two-dimensional field, and a ball (or puck) bouncing around it.  There are two "player" entities, each moving a paddle horizontally in the field and trying to deflect the ball by tracking it with the paddle.

The actors provided by this example are not neural networks but hard-wired linear control laws.  Arguably, these are good enough for the purpose, so solving the game with neural networks is not that impressive as an AI task; nonetheless, it may be useful for studying neurons.

## Modelling the play environment

### The field

As a form of "tennis" in the wide sense, the playing field is a rectangle of a certain *width* and *height*. To simplify the math, we'll be using the *half* width (`hw`) and *half* height (`hh`) as measures, and take (0,0) as the coordinates of the field's middle. This is where we'll serve the ball from as well. (A variant is serving from a paddle, which would give the players more time to react.)

A ball of a certain *radius* will be bouncing around this field. If it touches ±`hw` it is reflected; if it touches ±`hh` one of the players scores and the game continues with a serve.


### The paddles

The paddles are also shaped as rectangles, which are moved around on *horizontal* rails, by a *control force* that a *player* suggests for each of them.  
Since the players are not part of the playing field, they interact with the paddle controls through `<VariableRequirement>`s, as we'll see in the following.

So then, the constants for all the above are:

In [None]:
import numpy as np
field_hw = 1.0 # half width
field_hh = 1.5 # half height

paddle_hw = 0.20 # likewise for paddle
paddle_hh = 0.05

paddle_y = 1 # distance from middle at whose +- the paddles lie

ball_r = 0.05 # ball radius

## Showing the playing field

Here's what the field will look like, following the above.

In [None]:
import matplotlib.pyplot as plt
def draw_field(ax, px, py, p1x, p2x, s1=None, s2=None, time=None):
    ax.set_xlim(-field_hw, +field_hw)
    ax.set_ylim(-field_hh, +field_hh)
    ax.set_aspect('equal') # don't distort the playing field
    from matplotlib.patches import Rectangle, Circle
    ax.add_patch(Rectangle((p1x-paddle_hw,-paddle_y-paddle_hh), paddle_hw*2, paddle_hh*2, linewidth=0, color='k'))
    ax.add_patch(Rectangle((p2x-paddle_hw,+paddle_y-paddle_hh), paddle_hw*2, paddle_hh*2, linewidth=0, color='k'))
    ax.add_patch(Circle((px, py), ball_r, ec="red", facecolor='orange', linewidth=.5))
    # ax.axhline(-paddle_y); ax.axhline(paddle_y) to taste
    if time is not None: ax.set_xlabel(f'Time: {time:.3f} sec')
    if s1 is not None: ax.text(0.5, 0, f'Score: {s1:.0f}', ha='center', va='bottom', transform=ax.transAxes)
    if s2 is not None: ax.text(0.5, 1, f'Score: {s2:.0f}', ha='center', va='top', transform=ax.transAxes)

# Show an example
draw_field(plt.gca(), 0.3, 0.4, -0.5, 0.6)

### Showing the ball's trace through the field

It is easier to perceive the fast-moving ball if we add a trail of its recent positions, with colour and width fading as it goes back in time.

However, none of the easily accessible graphics tools can handle this concept without nasty artifacts; we'll have to construct the ribbon from polygonal vertices and edges instead.  The algorithm goes as follows:

In [None]:
N = 4 # segments
t = np.linspace(0,2*np.pi,N+1)
xy = np.vstack([np.cos(t), np.sin(t)]).T
wmax = .1; wmin = .03
width = (wmax-wmin)*(1 - t/(2*np.pi)) + wmin
def intersect_2d(p00, p01, p10, p11):
    '''Get the lerp factors to intersect 2d segs'''
    den= ((p00[0]-p01[0])*(p10[1]-p11[1]) - (p00[1]-p01[1])*(p10[0]-p11[0]))
    f0 = ((p00[0]-p10[0])*(p10[1]-p11[1]) - (p00[1]-p10[1])*(p10[0]-p11[0])) / den
    f1 = ((p00[0]-p10[0])*(p00[1]-p01[1]) - (p00[1]-p10[1])*(p00[0]-p01[0])) / den
    return f0,f1
def fancy_streamline(ax, xy, width, value, **kwargs):
    '''Draw a thick ribbon of variable width and value.  By Sotirios Panagiotou'''
    N = len(width)
    if N < 2: return None # can't draw a line like this
    def segify(xy):
        ''' Duplicate the elements except for the first and the last.'''
        return xy.repeat(2, axis=0)[1:-1]

    d = np.diff(xy, axis=0) 
    n = d / (np.linalg.norm(d,axis=-1)[:,None] + 1e-20) # normals
    ww = np.vstack([width[:-1], width[1:]]).T.flatten() # width replicated
    tt = np.vstack([value[:-1], value[1:]]).T.flatten() # value replicated
    nn = (n.repeat(2, axis=0)*ww[:,None])[:,::-1]*[-1, 1] # binormals
    xyxy = segify(xy) # points replicated for start+end
    pp = np.vstack([(xyxy+nn),(xyxy-nn)])
    
    #replace inner-side points with intersection point of segs
    for i in range(0,len(pp)-3,2): 
        if i + 2 == len(pp)/2: continue
        f0, f1 =  intersect_2d(pp[i], pp[i + 1], pp[i + 2], pp[i + 3])
        if 0 <= f0 <= 1 and 0 <= f1 <= 1 :
            pp[i+1] , pp[i+2] = 2*[(1-f0)*pp[i]+f0*pp[i+1]]
    # ax.plot(*xy.T); ax.plot(*pp[0:len(pp)//2].T); ax.plot(*pp[len(pp)//2:].T)
    # Gouraud shading will still show mach bands (why?) along the strip (try with hsv), oh well.
    tris = [ [xyxy.shape[0]+i+1, i, i+1, ] for i in range((N-1)*2-1)]
    tris+= [ [xyxy.shape[0]+i, xyxy.shape[0]+i+1, i, ] for i in range((N-1)*2-1)]
    im = ax.tripcolor(pp[:,0], pp[:,1],  np.hstack([tt,tt]), triangles=tris, shading="gouraud", **kwargs)
    return im

im = fancy_streamline(plt.gca(), xy, width, t, cmap='gray', edgecolor='#0f0f0f00')
plt.axis('equal');plt.colorbar(im);#plt.show()

## Game Dynamics

It wouldn't be fun if things didn't change; here is how they change.

### Walls

Whenever the ball hits one of the side walls (at ±`hw`), its horizontal velocity is reversed and its position is mirrored about the wall.
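
As a rough illustration of this rule outside LEMS, the reflection can be sketched in plain Python. The function name `reflect_on_side_walls` is hypothetical, and the sketch assumes that both walls take the ball's radius into account; the default values mirror the constants defined earlier.

```python
def reflect_on_side_walls(px, vx, hw=1.0, ball_r=0.05):
    """Mirror the ball's x position and velocity if it went past a side wall."""
    limit = hw - ball_r  # farthest the ball's centre may travel from the middle
    if px > limit:       # went past the right wall
        return 2 * limit - px, -vx
    if px < -limit:      # went past the left wall
        return -2 * limit - px, -vx
    return px, vx        # still inside: nothing changes


# A ball that overshot the right wall is put back inside, moving leftwards
print(reflect_on_side_walls(0.97, 3.0))
```

Mirroring the position, rather than just clamping it, keeps the overshoot past the wall as distance travelled back into the field.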

### Paddle movement

Each paddle receives an input force from a player (with a game-specific cap on the maximum amplitude), which accelerates it toward one direction or the other.  The paddles are constrained between the field's two side walls, and a paddle's speed is set to zero whenever it hits a wall.
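
A minimal sketch of one paddle update in plain Python follows. It assumes a semi-implicit Euler step, which is not necessarily the integration scheme that EDEN applies, and the name `step_paddle` is hypothetical; the default force cap and sizes follow the values used in this example.

```python
def step_paddle(x, v, force, dt, max_force=600.0, hw=1.0, paddle_hw=0.20):
    """Advance one paddle by a time step dt and return its new (x, v)."""
    a = max(-max_force, min(max_force, force))  # cap the requested force
    v = v + a * dt                              # accelerate
    x = x + v * dt                              # then move
    limit = hw - paddle_hw                      # farthest the paddle's centre can go
    if x > limit:                               # hit the right wall: stop there
        x, v = limit, 0.0
    elif x < -limit:                            # hit the left wall: stop there
        x, v = -limit, 0.0
    return x, v


# An excessive request is capped to max_force before being applied
print(step_paddle(0.0, 0.0, force=1e6, dt=0.01))
```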


### Paddle hits 

When the ball hits a paddle, its vertical speed is reversed.  Its horizontal speed gets more interesting, depending on which part of the paddle the ball bounced on: it is reduced to a fraction of what it was the closer the hit is to the paddle's middle, and at the same time it is incremented by a fraction of the total speed the further the hit is from the paddle's center.
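
The rule can be written down as a small Python function, as a sketch. The name `bounce_off_paddle` and the parameter names `loss` and `gain` are hypothetical stand-ins for the two fractions mentioned above, with defaults taken from this example's field model.

```python
import math


def bounce_off_paddle(vx, vy, px, paddle_x, paddle_hw=0.20, loss=0.5, gain=0.40):
    """Return the ball's (vx, vy) right after bouncing off a paddle."""
    speed = math.hypot(vx, vy)             # total speed before the bounce
    offset = (px - paddle_x) / paddle_hw   # 0 at the middle, about +-1 at the edges
    # Keep (1 - loss) of vx at the middle, all of it at the edges ...
    kept = vx * ((1 - loss) + loss * abs(offset))
    # ... and add a share of the total speed, pointing away from the middle
    added = speed * gain * offset
    return kept + added, -vy


# A hit on the paddle's middle halves the horizontal speed
print(bounce_off_paddle(2.0, -3.0, px=0.1, paddle_x=0.1))
```

Hitting with the paddle's edge is therefore the way to steer the ball sideways and speed it up.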


### Serving

When the ball reaches the vertical end of the field behind one of the paddles, a point is awarded to the opposite side's player and the ball is put back in the middle of the playing field. The ball stays still for a short period, after which it is launched toward one or the other paddle in a random direction and play resumes.
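
One way to sketch the launch in Python is shown below, with a vertical bias so that the ball never travels purely sideways. The function name `serve_velocity` is hypothetical; the serve speed of 4 units per second and the bias factor of 0.4 are the values used by this example's field model, and the treatment of the exact zero of the sine is an assumption.

```python
import math
import random


def serve_velocity(serve_speed=4.0, phase=None):
    """Pick a launch velocity (vx, vy) from a random angle, biased vertically."""
    if phase is None:
        phase = random.uniform(0.0, 2 * math.pi)
    # Push vy further in whichever vertical direction the angle already points
    sign = 1.0 if math.sin(phase) > 0 else -1.0
    vx = serve_speed * math.cos(phase)
    vy = serve_speed * (math.sin(phase) + 0.4 * sign)
    return vx, vy


print(serve_velocity())
```

With this bias, the vertical speed at launch is never smaller than 0.4 times the serve speed, so every serve reaches one of the paddles in bounded time.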


## Implementing the playing field

Let's cast all those play rules into a LEMS description with the various `<Parameter>`s and `<StateVariable>`s:

In [None]:
%%writefile PongField.nml
<neuroml>
<!-- Let's make a new ComponentType ❗ it's a big one because it runs a whole pong-tennis field. -->
<ComponentType name="PongField">
    <Constant name="sec" dimension="time" value="1 sec"/>
    <Constant name="per_sec" dimension="per_time" value="1 per_s"/> <!-- LATER meters per second -->
    <Parameter name="pi" dimension="none" value="3.14159"/> <!-- Add the dimensionless constant π -->
    
    <Parameter name="hw" dimension="none" description="half-width"/>
    <Parameter name="hl" dimension="none" description="half-length (the field's half-height)"/>
    
    <Property name="ball_r"   dimension="none" defaultValue="0.05" description="Radius of the ball"/>
    <Property name="max_ball_speed" dimension="per_time" defaultValue="100 per_s" description="Don't let the ball run too fast, or it might clip through the paddle"/>
    <Property name="min_ball_vertical_speed" dimension="per_time" defaultValue="1 per_s" description="Don't let the ball run too slow, or it might take ages to bounce through the field"/>
 
    <Property name="paddle_y" dimension="none" defaultValue="1" description="Location of the paddle"/>
    <Property name="paddle_hw" dimension="none" defaultValue="0.20" description="Half-width  of the paddle"/>
    <Property name="paddle_hh" dimension="none" defaultValue="0.05" description="Half-height of the paddle"/>
    
    <Property name="paddle_h_gain_factor"   dimension="none" defaultValue="0.40" description="Add some of total speed to H-speed when bouncing on the paddle's sides."/>
    <Property name="paddle_h_loss_factor"   dimension="none" defaultValue="0.5" description="Decimate H-speed when bouncing on the paddle's middle."/>
    <Property name="max_paddle_force_per_s2" dimension="none" defaultValue="600" description="Maximum possible paddle acceleration"/>
    
    <Property name="serve_speed" dimension="per_time" defaultValue="4 per_s" description="Speed to serve at"/>
    <Property name="serve_duration" dimension="time" defaultValue="100 ms" description="Time between point and serve"/>

    <VariableRequirement name="paddle_1_a_per_s2" dimension="none" description="Requested paddle force from player 1."/>
    <VariableRequirement name="paddle_2_a_per_s2" dimension="none" description="Requested paddle force from player 2."/>
    
    <Dynamics>
        <StateVariable name="px" exposure="bx" dimension="none"  description="H-Location of ball"/>
        <StateVariable name="py" exposure="by" dimension="none"  description="V-Location of ball"/>
        <StateVariable name="vx" exposure="vx" dimension="per_time" description="H-speed of ball"/>
        <StateVariable name="vy" exposure="vy" dimension="per_time" description="V-speed of ball"/>
        <StateVariable name="paddle_1_x" dimension="none"  description="H-Location of paddle 1 (bottom)"/>
        <StateVariable name="paddle_2_x" dimension="none"  description="H-Location of paddle 2 (top)   "/>
        <StateVariable name="paddle_1_v" dimension="per_time" description="H-Speed of paddle 1 (bottom)"/>
        <StateVariable name="paddle_2_v" dimension="per_time" description="H-Speed of paddle 2 (top)   "/>
        
        <StateVariable name="score_1" exposure="score_1" dimension="none" description="Score for player 1 (bottom)"/>
        <StateVariable name="score_2" exposure="score_2" dimension="none" description="Score for player 2 (top)   "/>
        <StateVariable name="serve_clock" exposure="serve_clock" dimension="time" description="Timer for serve (negative to disable)"/>
        <!--
        <DerivedVariable name="paddle_1_a_per_s2" dimension="none" value="40*(px-paddle_1_x) + 900*(2*random(1)-1)/2" description="Requested control force for player 1"/>
        <DerivedVariable name="paddle_2_a_per_s2" dimension="none" value="30*(px-paddle_2_x) + 900*(2*random(1)-1)/2" description="Requested control force for player 2"/>-->
        <!-- Limit paddle force to what's "feasible", LATER dimensional abs function? -->
        <ConditionalDerivedVariable name="paddle_1_aeff_per_s2" dimension="none" description="Effective control force for player 1">
            <Case condition="paddle_1_a_per_s2 .gt. +max_paddle_force_per_s2" value="+max_paddle_force_per_s2"/>
            <Case condition="paddle_1_a_per_s2 .lt. -max_paddle_force_per_s2" value="-max_paddle_force_per_s2"/>
            <Case value="paddle_1_a_per_s2"/>
        </ConditionalDerivedVariable>
        <ConditionalDerivedVariable name="paddle_2_aeff_per_s2" dimension="none" description="Effective control force for player 2">
            <Case condition="paddle_2_a_per_s2 .gt. +max_paddle_force_per_s2" value="+max_paddle_force_per_s2"/>
            <Case condition="paddle_2_a_per_s2 .lt. -max_paddle_force_per_s2" value="-max_paddle_force_per_s2"/>
            <Case value="paddle_2_a_per_s2"/>
        </ConditionalDerivedVariable>
        
        <!-- Keep track of 2-D speed -->
        <DerivedVariable name="ball_speed" dimension="per_time" value="sqrt((vx/per_sec)^2+(vy/per_sec)^2)*per_sec" description="Magnitude of speed vector"/>
        
        <!-- And sundry -->
        <DerivedVariable name="random_phase" dimension="none" value="random(2*pi)" description="Have a named random variable to use the same value in two places"/>
        <ConditionalDerivedVariable name="serve_clock_rate" dimension="per_time" description="Advance only when it's not elapsed">
            <Case condition="serve_clock >= 0*sec" value="1 * per_sec"/>
            <Case value="0"/>
        </ConditionalDerivedVariable>

        <!-- Move the ball and paddles around, and occasionally count time -->
        <TimeDerivative variable="px" value="vx"/> <TimeDerivative variable="py" value="vy"/>
        <TimeDerivative variable="paddle_1_x" value="paddle_1_v"/> <TimeDerivative variable="paddle_1_v" value="paddle_1_aeff_per_s2 * per_sec * per_sec"/>
        <TimeDerivative variable="paddle_2_x" value="paddle_2_v"/> <TimeDerivative variable="paddle_2_v" value="paddle_2_aeff_per_s2 * per_sec * per_sec"/>
        <TimeDerivative variable="serve_clock" value="serve_clock_rate"/>
  
        <!-- bounce on walls -->
        <OnCondition test="px + ball_r .gt. hw">
            <StateAssignment variable="px" value="2*(hw - ball_r) - px"/>
            <StateAssignment variable="vx" value="-vx"/>
        </OnCondition>
        <OnCondition test="px - ball_r .lt. -hw"> <!-- note: greater-than is allowed in  https://www.w3.org/TR/xml/#syntax -->
            <StateAssignment variable="px" value="-2*(hw - ball_r) - px"/>
            <StateAssignment variable="vx" value="-vx"/>
        </OnCondition>
  
        <!-- score on non-walls -->
        <OnCondition test="py > hl">
            <StateAssignment variable="score_1" value="score_1 + 1"/>
            <StateAssignment variable="serve_clock" value="0*sec"/>
            <StateAssignment variable="px" value="0"/> <StateAssignment variable="py" value="0"/>
            <StateAssignment variable="vx" value="0"/> <StateAssignment variable="vy" value="0"/>
        </OnCondition>
        <OnCondition test="-hl > py">
            <StateAssignment variable="score_2" value="score_2 + 1"/>
            <StateAssignment variable="serve_clock" value="0*sec"/>
            <StateAssignment variable="px" value="0"/> <StateAssignment variable="py" value="0"/>
            <StateAssignment variable="vx" value="0"/> <StateAssignment variable="vy" value="0"/>
        </OnCondition>
  
        <!-- bounce on paddles. Could also add up some of the paddle's speed -->
        <!-- fix the hitbox to be perfectly circular if you want -->
        <OnCondition test="( (paddle_hw + ball_r) > abs( px - paddle_1_x)  ) .and. ( (paddle_hh + ball_r) > abs(py + paddle_y) )">
            <StateAssignment variable="vy" value="-vy"/>
            <StateAssignment variable="py" value="-(paddle_y - ball_r - paddle_hh)"/>
            <!-- funny math: give a way to manipulate the ball's direction based on distance from the center of the paddle -->
            <StateAssignment variable="vx" value="vx*((1-paddle_h_loss_factor)+ (paddle_h_loss_factor)*abs((px - paddle_1_x) / paddle_hw)) + (ball_speed * paddle_h_gain_factor * (px - paddle_1_x) / paddle_hw)"/>
        </OnCondition>
        <OnCondition test="( (paddle_hw + ball_r) > abs( px - paddle_2_x)  ) .and. ( (paddle_hh + ball_r) > abs(py - paddle_y) )">
            <StateAssignment variable="vy" value="-vy"/>
            <StateAssignment variable="py" value="+(paddle_y - ball_r - paddle_hh)"/>
            <!-- funny math: give a way to manipulate the ball's direction based on distance from the center of the paddle -->
            <StateAssignment variable="vx" value="vx*((1-paddle_h_loss_factor)+ (paddle_h_loss_factor)*abs((px - paddle_2_x) / paddle_hw)) + (ball_speed * paddle_h_gain_factor * (px - paddle_2_x) / paddle_hw)"/>
        </OnCondition>
        
        <!-- constrain paddles -->
        <OnCondition test="(paddle_hw + paddle_1_x) > hw">
            <StateAssignment variable="paddle_1_v" value="0*paddle_1_v"/> 
            <StateAssignment variable="paddle_1_x" value="hw - paddle_hw"/> <!-- or to taste, eg reflect -->
        </OnCondition>
        <OnCondition test="-hw > (-paddle_hw + paddle_1_x)">
            <StateAssignment variable="paddle_1_v" value="0*paddle_1_v"/> 
            <StateAssignment variable="paddle_1_x" value="-hw + paddle_hw"/> <!-- or to taste, eg reflect -->
        </OnCondition>
        <OnCondition test="(paddle_hw + paddle_2_x) > hw">
            <StateAssignment variable="paddle_2_v" value="0*paddle_2_v"/> 
            <StateAssignment variable="paddle_2_x" value="hw - paddle_hw"/> <!-- or to taste, eg reflect -->
        </OnCondition>
        <OnCondition test="-hw > (-paddle_hw + paddle_2_x)">
            <StateAssignment variable="paddle_2_v" value="0*paddle_2_v"/> 
            <StateAssignment variable="paddle_2_x" value="-hw + paddle_hw"/> <!-- or to taste, eg reflect -->
        </OnCondition>
        
        <!--  serve mechanics -->
        <OnCondition test="serve_clock > serve_duration">
            <StateAssignment variable="vx" value="serve_speed * cos(random_phase)"/>
            <StateAssignment variable="vy" value="serve_speed * (sin(random_phase) + 0.4*(2*H(sin(random_phase))-1))"/>
            <StateAssignment variable="serve_clock" value="-1 * sec"/> <!-- disable --> 
        </OnCondition>
        <OnCondition test="ball_speed > max_ball_speed">
            <StateAssignment variable="vx" value="vx * max_ball_speed/ball_speed"/>
            <StateAssignment variable="vy" value="vy * max_ball_speed/ball_speed"/>
        </OnCondition>
        <OnCondition test="(abs(vy/per_sec) .lt. min_ball_vertical_speed/per_sec) .and. (serve_clock .lt. 0)"> 
            <StateAssignment variable="vy" value="min_ball_vertical_speed * (2*H(vy/per_sec)-1)"/> <!-- a sign function would look nicer, if it was in lems -->
        </OnCondition>

        <!--  and set initial conditions -->
        <OnStart>
            <StateAssignment variable="px" value="+0.7"/>
            <StateAssignment variable="py" value="-0.5"/>
            <StateAssignment variable="vx" value="3*per_sec"/> <!-- or to random5 --> 
            <StateAssignment variable="vy" value="5*per_sec"/> <!-- or to random3 --> 
            <StateAssignment variable="paddle_1_x" value="0"/> <StateAssignment variable="paddle_1_v" value="0 * per_sec"/>
            <StateAssignment variable="paddle_2_x" value="-0.5"/> <StateAssignment variable="paddle_2_v" value="0 * per_sec"/>
            <StateAssignment variable="serve_clock" value="-1 * sec"/> <!-- serve clock disabled: the ball starts already in play --> 
        </OnStart>
        
        <!-- done! -->
    </Dynamics>
</ComponentType>
</neuroml>

## Implementing the players

For illustration purposes (and since setting up an SNN to competently control the paddle appears *tricky*, even though any living creature could manage), we'll implement the players as *proportional control* governors which chase after where the ball is now.  Keep in mind that the effective command is capped in the playing field, to simulate real-life constraints.  And because that would be too perfect, we'll introduce *noise* (simulating "tremor" and general uncertainty) that perturbs the control enough to make it interesting.
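
Before turning to LEMS, the control law can be sketched as a plain Python function. The name `requested_force` and its argument names are hypothetical; the default gain and noise intensity are the default values of the player component in this example, and the noise is assumed to be uniform, drawn anew on every call. The cap on the force is left out here because the playing field applies it.

```python
import random


def requested_force(ball_x, paddle_x, gain=100.0, noise_intensity=900.0, rng=random):
    """Proportional control: push the paddle toward the ball, plus uniform noise."""
    deviation = ball_x - paddle_x                          # where it isn't minus where it is
    noise = noise_intensity * (2 * rng.random() - 1) / 2   # uniform in +-noise_intensity/2
    return gain * deviation + noise


# Without noise, the command is simply proportional to the deviation
print(requested_force(0.5, 0.2, noise_intensity=0.0))
```

Passing a seeded `random.Random` instance as `rng` makes the noise reproducible.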

We could have implemented all this inside one LEMS component, but here we'll break it down since we want our own non-trivial system (a *spiking neural network* perhaps?) to work the paddles.  The playing field (or *environment*) is separated from the *controller*, with the only communication flows being observation (in whichever form) and control command (in whichever form).

Here, then, is the *proportional control* rule expressed in LEMS.  Note that the same `<ComponentType>` can be used for both players.  
**Note**: Even though the component is static (the paddle's command force is simply [where it isn't minus where it is]( https://web.archive.org/web/20030813200740/http://www.afmissileers.org/AFMissileers/pdf/NL1997/Dec97.pdf#page=5 )), we continuously assign a `StateVariable` with an always-holding `<OnCondition>` so that the `<VariableRequirement>` can observe it.

In [None]:
%%writefile PongPlayer.nml
<neuroml>
<!-- Let's make a new ComponentType ❗ for the player. -->
<ComponentType name="PongPlayer">
    
    <Property name="force_per_deviation" dimension="none" defaultValue="100" description="Weight of (ball to paddle center) on force command"/>
    <Property name="noise_intensity"     dimension="none" defaultValue="900" description="Weight of noise on force command"/>
    
    <VariableRequirement name="paddle_x" dimension="none" description="The horizontal coordinate of the paddle (where the paddle is)."/>
    <VariableRequirement name="ball_x"   dimension="none" description="The horizontal coordinate of the ball (where the paddle isn't), to blindly track."/>
    <Dynamics>
        <!-- NOTE: should be a derived variable, but eden doesn't allow VariableReferences to non-state variables yet. LATER make that happen. -->
        <StateVariable name="request_a_per_s2" exposure="request_a_per_s2" dimension="none"  description="Requested paddle force"/>
        <OnCondition test="2 + 2 == 4">
            <!-- NOTE: the effect of this formula is sensitive to timestep ! See the Ornstein-Uhlenbeck noise example for a dt-corrected random source. -->
            <StateAssignment variable="request_a_per_s2" value="force_per_deviation*(ball_x-paddle_x) + noise_intensity*(2*random(1)-1)/2"/>
        </OnCondition>
    </Dynamics>
</ComponentType>
</neuroml>

## Putting it all together

The next step is to add the playing field and the players in a simulation.  Since there is no hard `<Requirement>` or dependency on other entities, for "cells" we'll just instantiate 1 field and 2 players as `<population>`s.  A creative modeller could also have formulated them as synapses or input sources attached to a random cell, but the way shown here is closer to the intention.

In [None]:
%%writefile Pong.nml
<neuroml>
<include file="PongField.nml" />
<include file="PongPlayer.nml" />

<!-- Add a pong field type, and players -->
<PongField id="MyFirstWhat" hw="1" hl="1.5" />
<PongPlayer id="MyPlayer" noise_intensity="900" />

<network id="Net" type="networkWithTemperature" temperature="37degC" >
    <!-- Add said environment and players -->
    <population id="Pong" component="MyFirstWhat" size="1"/>
    <population id="Players" component="MyPlayer" size="2"/>
</network>

<Simulation id="Sim" length="10 s" step="0.1 ms" target="Net" seed="20190109" > <!-- seed used -->

    <!-- Use EdenCustomSetup ❗ -->
    <EdenCustomSetup filename="Pong_CustomSetup.gen.txt"/>
    
    <!-- Use EdenOutputFile to sample less frequently than on every timestep -->
    <EdenOutputFile id="MyFirstOutputFile" href="results.gen.txt" format="ascii_v0" sampling_interval="5 msec">
        <OutputColumn id="px" quantity="Pong[0]/px"/>
        <OutputColumn id="py" quantity="Pong[0]/py"/>
        <OutputColumn id="p1" quantity="Pong[0]/paddle_1_x"/>
        <OutputColumn id="p2" quantity="Pong[0]/paddle_2_x"/>
        <OutputColumn id="s1" quantity="Pong[0]/score_1"/>
        <OutputColumn id="s2" quantity="Pong[0]/score_2"/>
    </EdenOutputFile>
</Simulation>
<Target component="Sim"/>
</neuroml>

We'll be using `<EdenCustomSetup>` to set up what we cannot through pure NeuroML.

Since we are using `<VariableRequirement>`s, for them to work we'll have to *point* their target for each instance.  That means to point the field's paddles to the players' requested control force, and also the players to the ball's horizontal position, and also the current position of the paddle for each player.

While at it, we add some variation between the two players' control stiffness; otherwise they would both move their paddles in exactly the same way (even under noise interference).

In [None]:
%%writefile Pong_CustomSetup.gen.txt

set cell Pong all all paddle_1_a_per_s2 Players[0]/request_a_per_s2 # paddles need to know ...
set cell Pong all all paddle_2_a_per_s2 Players[1]/request_a_per_s2 # what the players want

set cell Players all all ball_x Pong[0]/px # they both track px
set cell Players all all paddle_x multi # they use different paddles
values Pong[0]/paddle_1_x Pong[0]/paddle_2_x 
set cell Players all all force_per_deviation multi  # different behaviours otherwise they'll track exactly the same way. NEXT: allow values line with comment.
values 40 30 

## Displaying results

Let's run this simulation:

In [None]:
!pip install -q eden-simulator

In [None]:
from eden_simulator import runEden; from matplotlib import pyplot as plt
results = runEden('Pong.nml')
t = results['t']; px, py, p1, p2, s1, s2 = (results[f'Pong[0]/{x}'] for x in ('px', 'py', 'paddle_1_x', 'paddle_2_x', 'score_1', 'score_2'))
plt.plot(results['t'],px, label='Ball X')
plt.plot(results['t'],py, label='Ball Y')
plt.plot(results['t'],p1, label='Paddle 1 X')
plt.plot(results['t'],p2, label='Paddle 2 X')
# plt.plot(results['t'],s1, label='Player 1 score')
# plt.plot(results['t'],s2, label='Player 2 score')
plt.xlabel('Time (sec)'); plt.legend();
# plt.xlim((0.16, 0.17))
# plt.xlim((0.95, 1))

Each line helps us visualise the game:
- The `Ball X` line shows how the ball bounces between walls.  Notice that the "bouncing period" may change when the ball hits a paddle and gets a different `vx`.
- The `Ball Y` line shows how the ball bounces between paddles. If it exceeds the usual range in either direction, it's probably a point!
- `Paddle 1` and `Paddle 2` show how the two paddles try to follow `Ball X`. It's not clear when it's a hit or not, but we have `Ball Y` for that.
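
To confirm whether an excursion of `Ball Y` really was a point, the recorded score traces can be inspected directly. The helper below is a sketch; the name `scoring_times` is hypothetical, and it assumes that a score trace only ever increases, as `score_1` and `score_2` do in this model.

```python
import numpy as np


def scoring_times(t, score):
    """Return the sample times at which the score trace steps up."""
    t = np.asarray(t, dtype=float)
    score = np.asarray(score, dtype=float)
    stepped = np.diff(score) > 0   # True where the score rises between samples
    return t[1:][stepped]          # report the time of the sample after the rise


# Toy trace: the score goes from 0 to 1 between t=1 and t=2
print(scoring_times([0, 1, 2, 3], [0, 0, 1, 1]))
```

With the results loaded above, `scoring_times(t, s1)` and `scoring_times(t, s2)` would list when each player scored.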

Let's use the routines made [previously]( #Showing-the-playing-field ) to better visualise the game's progression:

In [None]:
fig, ax = plt.subplots()
time_fade = 0.1 # how long (in time) should the ball's trace be?
def animate(time):
    ax.clear()
    # Get the recent trace for the ball
    ii = np.logical_and((time - time_fade) < t , t <= time) # could also do binary search
    ii, = np.where(ii)
    # Cut off instant large transitions caused by scoring, keep only what comes after the last big one (if it exists)
    cut_locations = abs(np.diff(py[ii])) > field_hh/2.1
    cut_locations = [1]+list(cut_locations) # add guard value to keep whole sequence, if there's no reset
    back_idx = -1-(np.argmax(cut_locations[::-1])) # search from end to start for the latest cut
    ii = ii[back_idx:] #
    # Set up the motion line 
    tt = t[ii]; xy = np.vstack( (px[ii], py[ii])).T
    wmax = ball_r*.8; wmin = ball_r*.3
    weight = (tt - (time- time_fade))/time_fade
    width = (wmin) + (wmax-wmin)*weight
    fancy_streamline(ax, xy, width, weight*.3, cmap='gray_r', clim=(0,1), alpha=1)
    draw_field(ax, *xy[-1], p1[ii][-1], p2[ii][-1], s1[ii][-1], s2[ii][-1], time)

if 0 == 1: animate(0.13) # useful for debugging
else:
    import matplotlib.animation as animation
    frames_per_sec = 20
    sim_per_real = 1/2
    sim_duration = 10
    ani = animation.FuncAnimation(fig, animate, frames=np.linspace(0, sim_duration, int(sim_duration*frames_per_sec/sim_per_real))[:], interval = (1000/frames_per_sec), repeat_delay=1000)
    
    plt.close() #to avoid the redundant pyplot of only the first frame coming out?
    from IPython.display import display, HTML
    display(HTML(ani.to_jshtml()))
    # Consider also:
    # writer = animation.FFMpegWriter( fps=frames_per_sec, bitrate=1800)
    # ani.save("movie.mp4", writer=writer)

Seems like it's really working.  🏓

## Next steps

The Pong game is set up, but the players follow a simple linear control rule.  Think of neural networks that can learn to play the game competitively (also using hints like score and the opponent's paddle position), see how they perform, *persevere* and study the factors that lead to skill.

Consider how to *anticipate* the ball's course (including wall bounces), use the paddle's edges to build up speed, and possibly even (if the paddles' locations are available) fake out the strike and/or adapt to the opponent's weaknesses.

Also think of rendering the image into pixels (how would one go about it? hint: sample the x,y with different activation thresholds provided through `<EdenCustomSetup>`...), and stimulate a visual-motor-cortex circuit like in [Anwar et al. 2022]( https://doi.org/10.1371/journal.pone.0265808 ) ([code]( https://github.com/NathanKlineInstitute/SMARTAgent )).

See also a similar [experiment]( https://nest-simulator.readthedocs.io/en/stable/auto_examples/pong/run_simulations.html ) using the NEST simulator.

Remember to compare performance with at least a better linear controller (say, PID) as well.  Was all this effort and computational intensity to run competent neural networks worth it?  How so?
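
As a starting point for such a baseline, a textbook discrete PID controller could be sketched as follows. The class name `PID` and its gains are hypothetical and untuned; the error would be the ball-to-paddle deviation and `dt` the control time step. This is an illustration in plain Python, not a component of the model above.

```python
class PID:
    """Textbook discrete PID controller (a sketch, gains must be tuned)."""

    def __init__(self, kp, ki=0.0, kd=0.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0       # accumulated error over time
        self.prev_error = None    # no derivative on the very first call

    def update(self, error, dt):
        """Return the control command for the current error."""
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


pid = PID(kp=2.0, ki=1.0, kd=0.5)
print(pid.update(1.0, dt=0.1), pid.update(0.5, dt=0.1))
```

The derivative term is what lets such a controller react to the ball's motion rather than only to its current position.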

---

Now as the last thing, we'll fetch a screenshot to use in the documentation's [example gallery]( gallery.rst ). 

In [None]:
fig, ax = plt.subplots()
animate(0.13)
ax.axes.xaxis.set_visible(False) # plt.axis('off')
ax.axes.yaxis.set_visible(False)
import io
from PIL import Image
with io.BytesIO() as buf:
    plt.savefig(buf, bbox_inches='tight', dpi=144); buf.seek(0);
    im = Image.open(buf); im.load() # bytes content would be opened as filename ...
im = im.rotate(-90, expand=True)

In [None]:
# Stage 1: Crop
frame_data = [ im ]
frame_data = [ x.crop((int(.1*x.size[0]), int(.0*x.size[1]), int(.9*x.size[0]), x.size[1])) for x in frame_data ]

# Stage 2: Resize
for x in frame_data: x.thumbnail((240,200))

aabuf = io.BytesIO()
frame_data[0].save(aabuf, format='png')
print(len(aabuf.getvalue()))

# to lossless recompression
import oxipng
oxipng_opts = {'level':6, 'bit_depth_reduction':False}
aa = oxipng.optimize_from_memory(aabuf.getvalue(), **oxipng_opts)
print('Optimized PNG size:',len(aa))

import IPython
display(IPython.display.Image(data=aabuf.getvalue()))

with open('_static/thumb_example_pong.png','wb') as f: f.write(aa)

And after the last thing, move the generated figures to their right place and clean up after this notebook. 

In [None]:
import os, shutil
for name in os.listdir('.'):
    if ('.gen.' in name
        or name.endswith('.xml')
        or name.endswith('.nml')):
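        # rmtree only removes directories; plain files need os.remove
        if os.path.isdir(name): shutil.rmtree(name, ignore_errors=True)
        else: os.remove(name)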
# for name in os.listdir('.'): print(name)