---
date: 2024-05-01
title: "Skyrms Signals Summary and Models"
subtitle: "learning language games"
keywords: [game theory, signaling games, partial pooling, evolution, reinforcement learning, signaling systems, evolution of language]
---


In [@Skyrms2010signals] philosopher and mathematician Brian Skyrms discusses how one can extend the concept of a signaling game into a full-fledged signaling system and, to some extent, a rudimentary language.

Like many others, I found Signals to be a fascinating little book worth reading at least a couple of times. While Skyrms starts with a basic exposition motivated by Greek philosophers, he eventually dives deep into areas like reinforcement learning, replicator dynamics, mean field games and other mathematical fields without much introduction. In places the monograph seems incomplete and may require hunting down the papers in the bibliography and possibly more recent work by the same authors.

I slowly noticed it being cited in more and more of the papers I read. This suggested that more people were on the same path of thinking about how to equip their problem solving with a signaling system or, better yet, to evolve a more sophisticated language.

I went back several times to review the chapter on complex signals, which I feel is the most interesting for real-world applications. I began to think that the Lewis games are too rudimentary, since the signaling systems evolved or learned from them are basically n-k maps of signals to meanings.

What I wanted was a recipe for agents that need to quickly evolve and teach/learn a language for efficient communication.

I wanted to go through the relevant papers he covers in this area and then see if there were newer results he did not cover. This turned out to be a bit of a challenge. In the meantime I also took some courses on RL and even tried a couple of ideas from this book at work. I think I should summarize at least some of the more interesting results from the book.

Besides a summary I also want to try to implement some of the keystone models in the book to see if I can derive the reductionist simple language learning game.

## 1. Signals

### Big Research Questions

**Q1. How can interacting individuals spontaneously learn to signal?**

**Q2. How can species spontaneously evolve signaling systems?**

## Sender-Receiver

> There are two players, the sender and the receiver.\
> Nature chooses a state at random and the sender observes the state chosen.\
> The sender then sends a signal to the receiver, who cannot observe the state directly but does observe the signal.\
> The receiver then chooses an act, the outcome of which affects them both, with the payoff depending on the state.\
> Both have pure common interest—they get the same payoff—and there is exactly one “correct” act for each state.\
> In the correct act-state combination they both get positive payoff; otherwise payoff is zero.\
> The simplest case is one where there are the same number of states, acts, and signals.

A separating equilibrium is called a signaling system.

> If we start with a pair of sender and receiver strategies, and switch the messages around the same way in both, we get the same payoffs. In particular, permutation of messages takes one signaling-system equilibrium into another.

We can understand a signaling system as an encoding lookup table for the sender and a decoding lookup table for the receiver that is the inverse of the first. Written as permutation matrices, the product of the two is the identity matrix, and each permutation of the rows of the identity matrix gives a valid signaling system.
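The lookup-table view can be checked with a few lines of NumPy. This is only a sketch under my own conventions: `permutation_matrix` is a hypothetical helper, rows of the sender matrix index states and columns index signals, and the receiver matrix is taken to be the transpose (which is the inverse for a permutation matrix).

```python
import itertools
import numpy as np

def permutation_matrix(perm):
    """Return a matrix whose row i has a single 1 in column perm[i]."""
    n = len(perm)
    m = np.zeros((n, n), dtype=int)
    m[np.arange(n), perm] = 1
    return m

n = 3
systems = []
for perm in itertools.permutations(range(n)):
    sender = permutation_matrix(perm)   # state -> signal
    receiver = sender.T                 # signal -> act (inverse lookup table)
    # encoding followed by decoding recovers every state
    assert np.array_equal(sender @ receiver, np.eye(n, dtype=int))
    systems.append((sender, receiver))

print(len(systems))  # number of signaling systems for n states: n!
```

For three states this enumerates the 3! = 6 signaling systems, all of which earn the same payoff in the basic Lewis game.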

**Q3. Is there a most salient signaling system?**

Salience is a concept from Schelling's game theory which suggests that one solution to a coordination problem might be naturally better than others (e.g. meeting a relative at the airport). This can be due to an externality to the pure coordination problem. Salience can also arise from non-uniformity of the state distribution, by giving less frequent states longer signals based on binary coding. The salience hierarchy might be grounded in risk: more urgent messages might be shorter and learned before the longer ones.

My thoughts on salience:

- Salience would arise in nature through the non-uniform distribution of states, 
  which is ignored in most papers, leading to equally salient signaling systems. 
  When the states are not uniformly distributed then the signals will not be 
  uniformly distributed. The more common states should have more common signals, 
  e.g. if snakes are more common than eagles then the signal for snake should be 
  shorter/simpler/learned earlier than the signal for eagle. In another location
  the distributions could be reversed, leading to a different salience hierarchy.
- Another way of seeing this is that salience would arise in nature to minimize
  risks for the sender, who could become a target for a predator by sending a signal.
- Two other sources of salience are the risk of making mistakes and the cost of 
  sending a signal.
- Finally, there is nothing stopping salience from being a function of all these
  factors through a product of their probabilities, though this is more easily
  expressed in the language of fitness. Salience will select the language whose
  speakers gain the highest expected progeny (fitness) by avoiding risks, conserving
  energy and avoiding miscommunication in their habitat.
- If the speakers migrate they might benefit from a language that is salient in 
  multiple habitats. This is a form of generalization.
- If there are different costs for encoding and decoding then salience will be 
  a function of the product of the encoding and decoding costs. This is a form of 
  cost minimization. In this scenario there may be a competition between the sender and
  the receiver to minimize their costs, but the sender has the upper hand since the
  sender chooses the signal. The sender is the causal agent in the signaling system.
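To make the "more common states get shorter signals" idea concrete, here is a minimal sketch using a binary Huffman code. The state probabilities (and the extra `leopard` state) are made up for illustration, and `huffman_code_lengths` is a hypothetical helper, not something from the book.

```python
import heapq

def huffman_code_lengths(probabilities):
    """Return {state: code length in bits} for a binary Huffman code."""
    # heap entries are (probability, tie_breaker, {state: depth so far})
    heap = [(p, i, {state: 0}) for i, (state, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, d1 = heapq.heappop(heap)
        p2, _, d2 = heapq.heappop(heap)
        # merging two subtrees pushes every state in them one level deeper
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

# assumed habitat: snakes are common, leopards are rare
states = {"snake": 0.6, "eagle": 0.3, "leopard": 0.1}
lengths = huffman_code_lengths(states)
print(lengths)
```

In a habitat where the probabilities are reversed the same procedure would hand the short signal to a different state, which is the location-dependent salience hierarchy described in the first bullet.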
  

**Q4. How can two agents with different signaling find a SS that is midway between them (including systems with both shared and unique states)?**

-   It's fairly clear that under the rules of the Lewis game all valid signaling systems are isomorphic and none is more salient.
-   In nature salience might arise, and the system leading to the greatest fitness in its users would be the most salient.
-   To find a signaling system that is midway between two signaling systems we could use the Cayley distance between the two permutations. This is the minimum number of transpositions required to transform one permutation into another. A median permutation would be one that lies at half the Cayley distance from each signaling system.
- If the systems have salience we may want to also keep the most salient signals intact and now we have a more complex optimization problem. We could use the KL divergence between the two signaling systems to estimate the distance of the signaling distribution from a separating distribution.

The Cayley distance between two permutations is the minimum number of transpositions required to transform one permutation into another. It is a metric on the symmetric group.
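The Cayley distance is easy to compute: it equals n minus the number of cycles of the relative permutation that takes one permutation to the other. A small sketch, assuming permutations are given as lists where `sigma[i]` is the signal used for state `i` (the function name is my own):

```python
def cayley_distance(sigma, tau):
    """Minimum number of transpositions needed to turn sigma into tau."""
    n = len(sigma)
    # relative permutation: sends sigma[i] to tau[i]
    rel = [0] * n
    for i in range(n):
        rel[sigma[i]] = tau[i]
    # count the cycles of the relative permutation
    seen = [False] * n
    cycles = 0
    for start in range(n):
        if not seen[start]:
            cycles += 1
            j = start
            while not seen[j]:
                seen[j] = True
                j = rel[j]
    return n - cycles

print(cayley_distance([0, 1, 2], [1, 0, 2]))  # one swap apart
print(cayley_distance([0, 1, 2], [1, 2, 0]))  # a 3-cycle needs two swaps
```

A midway system between two signaling systems could then be searched for among permutations whose distance to each is about half the distance between them.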

**Information in signals**

**Q5. How can we minimally extend this framework to handle Errors and Deception**

> Signals carry information. The natural way to measure the information in a signal is to measure the extent that the use of that particular signal changes probabilities. Accordingly, there are two kinds of information in the signals in Lewis sender-receiver games: information about what state the sender has observed and information about what act the receiver will take. The first kind of information measures effectiveness of the sender’s use of signals to discriminate states; the second kind measures the effectiveness of the signal in changing the receiver’s probabilities of action.

- [ ] TODO: estimate information content of each signal for sender and receiver for separating and partial pooling cases
- [ ] TODO: use entropy for message level estimates of sender and receiver under separating signal, a synonym, a homonym.
- [ ] TODO: use the KL divergence to estimate the distance of the signaling distribution from a separating distribution.
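As a first step toward these TODOs, here is a minimal sketch of the information a signal carries about the state, measured as the KL divergence of the posterior over states (given the signal) from the prior. The matrix layout `sender[state][signal] = P(signal | state)` and the function names are my own assumptions.

```python
import math

def posterior_given_signal(prior, sender, signal):
    """P(state | signal) via Bayes' rule; sender[state][signal] = P(signal | state)."""
    joint = [prior[s] * sender[s][signal] for s in range(len(prior))]
    total = sum(joint)  # P(signal); must be > 0 for the signal to be used
    return [j / total for j in joint]

def information_in_signal(prior, sender, signal):
    """KL divergence in bits of the posterior over states from the prior."""
    post = posterior_given_signal(prior, sender, signal)
    return sum(q * math.log2(q / p) for q, p in zip(post, prior) if q > 0)

prior = [0.5, 0.5]
separating = [[1.0, 0.0], [0.0, 1.0]]  # each state has its own signal
pooling = [[1.0, 0.0], [1.0, 0.0]]     # both states send signal 0

print(information_in_signal(prior, separating, 0))  # 1 bit
print(information_in_signal(prior, pooling, 0))     # 0 bits
```

The same function applied to the receiver's act probabilities would give the second kind of information mentioned in the quote.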

Actually there are a number of extensions one would like to consider for the Lewis framework:

1.  bottlenecks
    1.  more states than signals - this is the interesting case and where complex signaling systems should arise
    2.  more signals than states - this is the case where synonyms can arise
2.  basic logical reasoning, conjunctions, disjunctions, negations
3.  multiple senders and/or receivers
    1.  rewarding coordination (each state requires different actions from the agents - they are learning different receiver maps)
    2.  rewarding correlated equilibrium (the sender lets the receivers pick from correlated states at random, allowing the receivers to avoid the penalty of miscoordination)
    3. networks of agents per the Goyal model in ch 11 and 13

Complex signals:

1.  conjunction of signals,
2.  ordered signals,
3.  recursive signals, group

### Evolution

We first see two competing signaling systems being tested in a population.

[@hofbauer1998evolutionary] Population dynamics can be used to identify which dynamic equilibria are stable or unstable given an initial population of strategies.

There is a figure showing the field dynamics with basins of attraction arising from the population dynamics equations.

We also see symmetry breaking select one signaling system over the other.

$$
\frac{dp(A)}{dt}=p(A)[U(A)-U]
$$

where

-   U(A) is the average payoff to strategy A and
-   U is the average payoff in the population.
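To see the symmetry breaking in the equation above, here is a minimal Euler-integration sketch for a population split between two signaling systems A and B. It assumes a user of A gets payoff 1 when meeting another A user and 0 when meeting a B user (and vice versa), so U(A) = p and U(B) = 1 - p; the step size and function names are my own choices.

```python
def replicator_step(p, dt=0.01):
    """One Euler step of dp/dt = p * (U(A) - U) for two signaling systems."""
    u_a = p          # A users only succeed against other A users
    u_b = 1.0 - p    # B users only succeed against other B users
    u_mean = p * u_a + (1.0 - p) * u_b
    return p + dt * p * (u_a - u_mean)

def simulate(p0, steps=5000, dt=0.01):
    """Integrate the replicator equation starting from share p0 of system A."""
    p = p0
    for _ in range(steps):
        p = replicator_step(p, dt)
    return p

print(simulate(0.6))  # drifts toward 1: everyone uses A
print(simulate(0.4))  # drifts toward 0: everyone uses B
print(simulate(0.5))  # the unstable mixed equilibrium
```

Any small deviation from the 50/50 split is amplified, so the population ends up in one of the two basins of attraction.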


In [None]:
from pylab import *

alpha, beta = 1, 1
xvalues, yvalues = meshgrid(arange(0, 2.1, 0.1), arange(0, 2.1, 0.1))
xdot = xvalues * alpha - beta
ydot = yvalues * alpha - beta
streamplot(xvalues, yvalues, xdot, ydot)
show()

We have a discussion of how signals might arise.

## Evolution


In [None]:
import itertools
import functools
from mesa import Agent, Model
from mesa.time import StagedActivation, RandomActivation
from mesa.datacollection import DataCollector
import matplotlib.pyplot as plt

# agent_roles
r_nature = 'nature'
r_sender = 'sender'
r_receiver = 'receiver'

## Lewis Signaling Game Model

The Lewis signaling game is a model of communication between two agents, a sender and a receiver.
Nature picks a state, the sender observes the state, chooses a signal, and sends the signal to the receiver who then takes an action based on the signal.
If the action of the receiver matches the state observed by the sender, both agents get a reward of 1; otherwise, they get a reward of 0.


In [None]:
class HerrnsteinRL():
    '''
                                    The Urn model
     nature            sender                 reciever     reward
                       
    | (0) | --{0}-->  | (0_a)  | --{a}--> | (a_0) | --{0}-->   1   
    |     |           | (0_b)  | --{b}    | (a_1) | --{1}-->   0
    |     |           +--------+    | +-->+-------+
    |     |                         +-|-+  
    | (1) | --{1}-->  | (1_a)  | --{a}+ +>| (b_0) | --{1}-->   1
    |     |           | (1_b)  | --{b}--->| (b_1) | --{0}-->   0
    +-----+           +--------+          +-------+
    
    
    Herrnstein Urn algorithm
    ------------------------
    
    1. nature picks a state 
    2. sender gets the state, chooses a signal by picking a ball in choose_option() from the state's urn
    3. receiver gets the signal, chooses an action by picking a ball in choose_option() from the signal's urn
    4. the balls in the urns are incremented if action == state
    5. repeat
    
    '''
    def __init__(self, options, learning_rate=1.0,verbose=False,name='Herrnstein matching law', balls=None):
        
        # filter options in choose option by input
        self.verbose = verbose
        self.name=name
        self.learning_rate = learning_rate
        self.options = options
        if balls is not None:
          self.balls = balls
        else:
          self.balls = {option: 1.0 for option in self.options}
        if self.verbose:
          print(f'LearningRule.__init__(Options: {options})')
    
    def get_filtered_urn(self, filter):
      ''' filters urn's options by prefix and normalizes the weights
          usage:
          urn = urn.get_filtered_urn(1)
          choice = model.random.choices(list(urn.keys()), weights=list(urn.values()), k=1)[0]
      '''
      assert type(filter) == int, f"filter must be a int"
      filtered_options = [key for key in self.balls.keys() if key[0] == filter]
      if not filtered_options:
        raise ValueError(f"No options found with filter {filter}")
      if self.verbose:
        print(f"in get_filtered_urn({filter=}) --- filtered_options: {filtered_options=}")
      filtered_balls = {opt: self.balls[opt] for opt in filtered_options}
      if self.verbose:
        print(f"in get_filtered_urn({filter=}) --- filtered_balls: {filtered_balls=}")
      total = functools.reduce(lambda a,b: a+b, filtered_balls.values())
      #total = sum(filtered_balls.values())
      if self.verbose:
        print(f"in get_filtered_urn({filter=}) --- total: {total=}")
      assert total > 0.0, f"total weights is {total=} after {filter=} on {self.balls}"      
      normalized_balls = {option: weight / total for option, weight in filtered_balls.items()}
      if self.verbose:
        print(f"in get_filtered_urn({filter=}) --- returning : {normalized_balls=}")
      return normalized_balls
     
    def choose_option(self,filter,random):
        ''' chooses an option from the urn based on the filter and the random choice
            
            usage:
            urn.choose_option(filter=1,random=model.random)
        '''
       
        urn = self.get_filtered_urn(filter)
        if random:
          options = random.choices(list(urn.keys()), weights=list(urn.values()),k=1)
          option = options[0]
          
          if self.verbose:
            print(f'in HerrnsteinRL.choose_option({filter=}) --- chose {option=} from {urn=}')

          return option
        else:
          raise Exception(f"random must be a random number generator")
        
    def update_weights(self, option, reward):
        old_balls = self.balls[option]
        self.balls[option] += self.learning_rate * reward 
        if self.verbose:
          print(f"Updated weight for option {option}: {old_balls} -> {self.balls[option]}")

In [None]:
class LewisAgent(Agent):
  
    def __init__(self, unique_id, model, game, role, verbose=False):
        super().__init__(unique_id, model)
        self.role = role #( one of nature, sender, receiver)
        self.verbose = verbose
        self.game = game
        self.messages = []
        self.actions = []
        if role == "sender":
          self.urn = HerrnsteinRL(model.states_signals, learning_rate=1.0,verbose=verbose,name='state_signal_weights')
        elif role == "receiver":
          self.urn = HerrnsteinRL(model.signals_actions, learning_rate=1.0,verbose=verbose,name='signal_action_weights')
        else:
          self.urn = None
        
    def step(self):
      # reset agent state before step
      self.messages = []
      self.actions = []

    def gen_state(self)-> None:
        if self.role == r_nature:
          self.current_state = self.model.random.choice(self.model.states)  # use the agent's own model, not a global
          if self.verbose:
                print(f"Nature {self.unique_id} set state {self.current_state}")
                
    @property
    def state(self):
        if self.role == r_nature:
          return self.current_state

    def choose_signal(self, filter):
        # sanity checks for filter
        assert type(filter) == int, "filter must be an int"
        assert filter in self.model.states, "filter must be a valid state"

        if self.role != r_sender:
          raise Exception("Only sender can send signals")
        self.option = self.urn.choose_option(filter=filter,random=self.model.random)
        signal = self.option[1] # the prefix is the urn context we want the suffix
        assert type(signal) == int, f"signal {signal=} must be a int"
        self.signal = signal
        if self.verbose:
              print(f"Sender {self.unique_id} got filter {filter} choose option: {self.option} and signaled: {self.signal}")
        return self.signal
          

    def send_signal(self, filter, receiver):
        ''' 
            # Message sending logic:
            1. sender chooses a signal based on the state
            2. sender sends the signal to the receiver
        '''
        if self.role != r_sender:
          raise Exception(f"Only sender can send signals")
         
        assert type(filter) == int, f"filter must be a int"
        assert filter in self.model.states, f"filter must be a valid state"
        signal = self.choose_signal(filter=filter)
        assert signal is not None, f"signal must be a valid signal"
        if self.verbose:
          print(f"Sender {self.unique_id} chose signal: {signal}")
        receiver.messages.append(signal)
        if self.verbose:
          print(f"Sender {self.unique_id} sends signal: {signal} to receiver {receiver.unique_id}")

    def fuse_actions(self,actions):
        ''' 
            # Message fusion logic:
            1. single message:  if there is only one signal then the action is the action associated with the signal
            2. ordered messages: if there are multiple signals then the action is the number from the string associated with the concatenated signal
               if there are two signals possible per message we concatenate and convert the binary string to a number
            3. if the messages are sets we could perform an intersection and take the action associated with the intersection 
               currently this is not implemented
            4. support for recursive signals is currently under research.
        ''' 
        if self.role != r_receiver:
          raise Exception(f"Only receiver can set actions")
        
        if len(actions) == 1: # single action no need to fuse
          return actions[0]
        else:
          # fuse the actions into a binary number
          action = 0
          # if there are multiple signals
          for i in range(len(actions)):
            action += actions[i]*(2**i)
          if self.verbose:
              print(f"Receiver {self.unique_id} fused actions : {self.actions} into action: {action}")
          return action

    def decode_message(self,signal):
        ''' first we need to get the filtered urn for the signal
            and then choose the option based on the urn'''
        if self.role != r_receiver:
          raise Exception(f"Only receiver can decode messages")
        option = self.urn.choose_option(filter=signal,random=self.model.random)
        action = option[1]
        if self.verbose:
              print(f"in decode_message({signal=}) Receiver {self.unique_id} got option: {option} and decoded action: {action}")
        return action

    def set_action(self):
        ''' first we need to use the urn to decode the signals 
            then need to fuse them to get the action '''
        if self.role != r_receiver:
          raise Exception(f"Only receiver can set the action")
        self.actions = []
        for signal in self.messages:
          self.actions.append(self.decode_message(signal))          
        self.action = self.fuse_actions(self.actions)
        # which option to reinforce 
        self.option = (self.messages[0],self.action)
        if self.verbose:
              print(f"Receiver {self.unique_id} received signals: {self.messages} and action: {self.action}")
              
    def set_reward(self,reward):
        if self.role not in [r_receiver,r_sender]:
          raise Exception(f"Only sender and receiver can set rewards")
        self.reward = reward
        if self.verbose:
            print(f"{self.role} {self.unique_id} received reward: {self.reward}")
                
    def calc_reward(self,correct_action):
        if self.role != r_receiver:
          raise Exception(f"Only receiver can calculate rewards")
        self.reward = 1 if self.action == correct_action else 0        

class SignalingGame(Model):
  
    # TODO: add support for 
    # 1. bottlenecks
    # 2. rename k to state_count
    # 3. state_per_sender = state_count/sender_count 
    # 4. partitioning states by signals => state/sender_count

    def __init__(self, game_count=2, senders_count=1, recievers_count=1, state_count=3,signal_count=3,verbose=True):
        super().__init__()
        self.verbose = verbose
        self.step_counter = 0
        self.schedule = RandomActivation(self)
        
        
        # Define the states, signals, and actions
        self.states   = [i for i in range(state_count)]
        print(f'{self.states=}')
        self.signals  = [i for i in range(signal_count)]
        print(f'{self.signals=}')
        self.actions  = [i for i in range(state_count)]
        print(f'{self.actions=}')
        
        # e.g., 1 -> 1, 2 -> 2, ...
        self.states_signals =  [(state,signal) for state in self.states for signal in self.signals]
        print(f'{self.states_signals=}')
        self.signals_actions = [(signal,action) for signal in self.signals for action in self.actions] 
        print(f'{self.signals_actions=}')
        
        # Agents

        self.uid=0
        self.senders_count=senders_count
        self.recievers_count=recievers_count

        # Games each game has a nature, senders and receivers
        self.games = []
        # Create games        
        for i in range(game_count):
            game = {
              r_nature: None,
              r_sender: [],
              r_receiver: []
            }
            
            # create nature agent
            game[r_nature] = LewisAgent(self.uid, self, game=i,role = r_nature,verbose=self.verbose)
            self.schedule.add(game[r_nature])
            self.uid += 1
            
            # create sender agents
            for j in range(senders_count):
                sender = LewisAgent(self.uid, self, game=i,role = r_sender,verbose=self.verbose)
                game[r_sender].append(sender)
                self.schedule.add(sender)
                self.uid +=1
                
            # create receiver agents
            for k in range (recievers_count):
                reciever = LewisAgent(self.uid, self, game=i,role = r_receiver,verbose=self.verbose)
                game[r_receiver].append(reciever)
                self.schedule.add(reciever)
                self.uid +=1
                
            self.games.append(game)

        # initialize once, outside the game-creation loop
        self.total_reward = 0
        

        # Define what data to collect
        self.datacollector = DataCollector(
            model_reporters={"TotalReward": lambda m: m.total_reward},  # A function to call 
            agent_reporters={"Reward": "reward"}  # An agent attribute
        )

    def compute_total_reward(self,model):
        return 
        
    def step(self):
      
        for agent in self.schedule.agents:
            # reset agent state before step
            agent.step()
            
        for game_counter, game in enumerate(self.games):
            if self.verbose:
                print(f"--- Step {self.step_counter} Game {game_counter} ---")
            nature = game[r_nature]
            nature.gen_state()
            state = nature.current_state
            assert type(state) == int, f"state must be a int"
            assert state in self.states, f"state must be a valid state"
            if self.verbose:
                print(f"in model.step() --- game {game_counter} --- Nature {nature.unique_id} set state {state} in game {game_counter}")
            for sender in game[r_sender]:
                for receiver in game[r_receiver]:                    
                    sender.send_signal(filter = state, receiver=receiver)
            for receiver in game[r_receiver]:
                assert receiver.role == r_receiver, f"receiver role must be receiver not {receiver.role}"
                receiver.set_action()
                if self.verbose:
                    print(f"in model.step() --- game {game_counter} --- Receiver {receiver.unique_id} action: {receiver.action}")
                receiver.calc_reward(correct_action=state)
                reward = receiver.reward
                assert type(reward) == int, f"reward must be a int not {type(reward)}"
                assert reward in [0,1], f"reward must be 0 or 1 not {reward}"
                if self.verbose:
                    print(f"in model.step() --- game {game_counter} --- Receiver {receiver.unique_id} received reward: {receiver.reward}")
            
            for agent in itertools.chain(game[r_sender],game[r_receiver]):
                agent.set_reward(reward)
                if self.verbose:
                    print(f"in model.step() --- game {game_counter} --- Sender {agent.unique_id} received reward: {reward}")
                agent.urn.update_weights(agent.option, reward)

            #print(f'in model.step() --- game {game_counter}, {self.expected_rewards(game)=}')
                    # Collect data
        
        self.total_reward += sum(agent.reward for agent in self.schedule.agents if agent.role == r_receiver)

        self.datacollector.collect(self)
        self.step_counter += 1  # keep the counter printed in verbose mode in sync


    def expected_rewards(self,game):
      return 0.25

    def run_model(self, steps):

        """Run the model until the end condition is reached. Overload as
        needed.
        """
        while self.running:
            self.step()
            steps -= 1
            if steps == 0:
                self.running = False

In [None]:
# Running the model
state_count= 3  # Number of states, signals, and actions
signal_count= 3
steps = 1000

model = SignalingGame(senders_count=1,recievers_count=1,state_count=state_count,signal_count=signal_count,verbose=True,game_count=2)
model.run_model(steps)  # Run the model for the desired number of steps

# Get the reward data
reward_data = model.datacollector.get_model_vars_dataframe()

# Plot the data
plt.figure(figsize=(10, 8))
plt.plot(reward_data['TotalReward'])
plt.xlabel('Step')
plt.ylabel('Total Reward')
plt.title('Total Reward over Time')
plt.grid(True)  # Add gridlines
plt.xlim(left=0)  # Start x-axis from 0
plt.ylim(bottom=0)  # Start y-axis from 0; total reward can reach steps * game_count
plt.show()     

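In this simulation the agents learn by reinforcement: after each successful round a ball is added to the sender's and the receiver's urn for the option they just used. With three equiprobable states and uniform initial urns the success rate starts at chance (1/3) and should rise as the urns become skewed toward the rewarded options.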

Players in Lewis signaling games can reach three types of equilibria:

1.  Separating equilibrium, in which the receiver fully recovers the state from the signal and can take the appropriate action.
2.  Partial pooling equilibrium, in which *synonyms* or *homophones* prevent the receiver from always recovering the state.
3.  Full pooling equilibrium, in which all signals are the same and the agents are unable to communicate.

A one-word synonym for "desired qualities", derived from "desire", that is used in academic literature is "desiderata".

Skyrms next considers bottlenecks, which are cases where there are more signals than actions or vice versa.

-   In the case of more signals than actions, successful learning will result in a partial pooling equilibrium with some synonyms.
-   In the case of more actions than signals, the best an agent can learn is a partial pooling equilibrium with homophones.

Both synonyms and homophones have drawbacks, however:

While synonyms increase the cognitive load and the number of signals that need to be learned, they do not prevent the recovery of the state being communicated. Homophones require the receiver to select an interpretation at random, leading to lower payoffs, since a receiver that is unable to recover the state cannot reliably select the correct action. If the number of signals is the same as the number of actions, the pigeonhole principle guarantees that for every synonym there must be a homophone.
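The difference between the two bottlenecks shows up directly in the expected payoff. The sketch below uses made-up strategies and my own representation: `sender[state]` maps each signal to the probability of sending it, and `receiver[signal]` is the act chosen.

```python
def expected_payoff(state_probs, sender, receiver):
    """Expected payoff when the correct act for state i is act i."""
    total = 0.0
    for state, p_state in enumerate(state_probs):
        for signal, p_signal in sender[state].items():
            if receiver[signal] == state:
                total += p_state * p_signal
    return total

# Synonyms: 2 states, 3 signals; signals 1 and 2 both mean state 1.
synonym_sender = [{0: 1.0}, {1: 0.5, 2: 0.5}]
synonym_receiver = {0: 0, 1: 1, 2: 1}

# Homophones: 3 states, 2 signals; signal 1 is used for states 1 and 2.
homophone_sender = [{0: 1.0}, {1: 1.0}, {1: 1.0}]
homophone_receiver = {0: 0, 1: 1}  # the receiver has to guess; here it picks act 1

print(expected_payoff([0.5, 0.5], synonym_sender, synonym_receiver))
print(expected_payoff([1/3, 1/3, 1/3], homophone_sender, homophone_receiver))
```

With synonyms the state is always recovered, while with the homophone one of the three equiprobable states is always missed, whichever way the receiver resolves the ambiguity.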

If we consider that for recoverability we need actions and signals to be fully correlated, it is easy to see that each failure to correlate actions to signals results in a (partial) pooling solution. Thus there are far more partial pooling equilibria than separating equilibria, and it is no surprise that natural language is rife with homophones and synonyms.

In light of the fact that partial pooling equilibria far outnumber the separating ones, with and without bottlenecks, setting up and later learning a separating signaling system with minimal homophones/synonyms is not a trivial task. (If we also factor in the cost/risk of miscommunication, some homophones are clearly worse than others.)

-   Evolution, for example, may not be the best way to do this.

-   Researchers have only very basic algorithms for doing so, and these leave room for improvement in terms of convergence rate and sample efficiency.

Although this is not considered explicitly, it is easy to see that there are far more partial pooling equilibria than separating ones.

We can now proceed to discuss the desiderata for learning algorithms.

Note (dropout algorithm): Introducing bottlenecks into neural networks tends to improve their ability to generalize by forcing them to avoid memorizing inputs and to come up with more resilient representations. This suggests that partial pooling equilibria may play a more significant role in structured/complex signaling systems.

## Desiderata for learning algorithms of signaling systems

1.  State recovery - we prefer the algorithm to learn a separating equilibrium and to avoid pooling equilibria with homophones.
2.  Convergence - we want the algorithm to quickly converge to the equilibrium.
3.  Sample efficiency - we want the algorithm to learn after minimal exposure to stimuli.
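These desiderata can be measured on a stripped-down version of the urn scheme used by the `HerrnsteinRL` class above, without Mesa. The sketch below is my own minimal setup (two states, unit reward, a fixed seed, and a 1000-round window for the success rate); it is meant as a measuring stick rather than a faithful copy of the model.

```python
import random

def run_urn_learning(n=2, rounds=10000, seed=0):
    """Urn reinforcement learning in an n-state Lewis game; returns per-round rewards."""
    rng = random.Random(seed)
    sender_urns = [[1.0] * n for _ in range(n)]    # sender_urns[state][signal]
    receiver_urns = [[1.0] * n for _ in range(n)]  # receiver_urns[signal][act]
    rewards = []
    for _ in range(rounds):
        state = rng.randrange(n)
        signal = rng.choices(range(n), weights=sender_urns[state])[0]
        act = rng.choices(range(n), weights=receiver_urns[signal])[0]
        reward = 1 if act == state else 0
        # reinforce only the choices that were actually made
        sender_urns[state][signal] += reward
        receiver_urns[signal][act] += reward
        rewards.append(reward)
    return rewards

rewards = run_urn_learning()
early_rate = sum(rewards[:100]) / 100      # sample efficiency proxy
late_rate = sum(rewards[-1000:]) / 1000    # state recovery proxy
print(early_rate, late_rate)
```

Comparing `early_rate` and `late_rate` across learning rules, or counting the rounds needed to cross a success threshold, gives a rough handle on convergence and sample efficiency.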

Some questions:

-   How different is the task of creating a signaling system from that of learning it?

    -   The main difference perhaps is that one party has a mapping and it is up to the second to learn it. They can't find unused symbols and match them to a new state.

    -   There may be many speakers, so making changes will be costly.

-   Can switching the roles of sender and receiver give better outcomes in learning?

    -   This may change for different extensions.

-   If there are multiple agents learning, can they create or learn the signaling system better or faster?

    -   What if they have groups with established signaling systems?

    -   How can they find a new set of mappings with minimal permutation from their original ones?

-   If the states used for reward are not random, are there better schedules for learning?

What if each agent already has knowledge of a working signaling system? Does adding more players seem to help?

# 4 Evolution

The three essential factors in Darwin’s account are

1.  natural variation - mutation, gene flow via migration, genetic drift and recombination in sexual reproduction.
2.  differential reproduction - [@Taylor1978ESS] replicator dynamics
3.  inheritance

### ESS

In [@Smith1973LogicAnimalConflict] the authors introduced a novel solution concept, the ESS or evolutionarily stable strategy, improving on the notion of the Nash equilibrium by replacing agent-level play dominance with statistical dominance of strategies.

::: {#ex-ess-hak-dove}
## ESS Motivating Example Hawk Dove Game

|          | Hawk | Dove |
|----------|------|------|
| **Hawk** | 0    | 3    |
| **Dove** | 1    | 2    |

: Hawk Dove Game

This explains why hyper-aggressive Hawk types, who can defeat the more peaceful Dove types, do not wipe them out. Hawks have an advantage if there are mostly Doves. Once they are in a majority, Hawk-Hawk interactions lead to serious injury and death. ESS is a frequency-dependent equilibrium.
:::

## ESS Criteria

In [@Smith1973LogicAnimalConflict] the authors introduce the following criteria in terms of payoffs for a strategy to be an ESS.

A strategy, S, is evolutionarily stable if for any other strategy, M, either:

1.  Fitness (S played against S) \> Fitness (M played against S) or:
2.  Fitnesses are equal against S, but Fitness(S against M) \> Fitness(M against M)

Under the first condition mutants are expelled quickly; under the second they are expelled more slowly.
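The two conditions can be checked numerically against the Hawk Dove table above. This is only a minimal sketch: the helper names `payoff`, `is_ess_against` and `is_ess`, the tolerance, and the grid of mutant strategies are assumptions made here for illustration, and a grid search is a sanity check rather than a proof.

```python
import numpy as np

# Row player's payoffs from the Hawk Dove table; rows/cols are (Hawk, Dove).
A = np.array([[0.0, 3.0],
              [1.0, 2.0]])

def payoff(p, q):
    """Expected payoff to mixed strategy p when played against mixed strategy q."""
    return float(np.asarray(p) @ A @ np.asarray(q))

def is_ess_against(s, m, tol=1e-9):
    """Apply the two ESS conditions to incumbent s and a single mutant m."""
    if payoff(s, s) > payoff(m, s) + tol:            # condition 1
        return True
    if abs(payoff(s, s) - payoff(m, s)) <= tol:      # condition 2
        return payoff(s, m) > payoff(m, m) + tol
    return False

def is_ess(s, grid=101):
    """Test s against a grid of mutant mixtures (hypothetical helper)."""
    mutants = [(h, 1.0 - h) for h in np.linspace(0.0, 1.0, grid)]
    return all(is_ess_against(s, m) for m in mutants if not np.allclose(m, s))

print("pure Hawk:", is_ess((1.0, 0.0)))
print("pure Dove:", is_ess((0.0, 1.0)))
print("half/half:", is_ess((0.5, 0.5)))
```

With these payoffs neither pure strategy passes, while the even mixture passes through the second condition, which matches the frequency-dependent reading of the example.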

## Differential Reproduction - Replicator dynamics

Replicator dynamics is driven by Darwinian fitness, the expected number of progeny.

So $fitness \sim \mathbb E(|progeny|)$, and in a large population you get, on average, what you expect. For strategy $S$ the population share evolves in discrete time as

$$
x_{t}(S) = \frac{x_{t-1}(S) \times fitness(S)}{mean\_fitness}
$$

and for continuous time[^2]

[^2]: I think that we should consider a Lewis hierarchy of games based on Lewis games with\
    a. logic\
    b. conjunctive signals

$$
\frac{dx}{dt} = x (fitness(S) - {mean\_fitness})
$$
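A minimal sketch of the discrete-time update, using the Hawk Dove payoffs from the ESS example as the fitness function. That choice of game, the starting populations, and the helper names `replicator_step` and `run` are assumptions made here for illustration.

```python
import numpy as np

# Payoffs from the Hawk Dove table; rows/cols are (Hawk, Dove).
A = np.array([[0.0, 3.0],
              [1.0, 2.0]])

def replicator_step(x):
    """One step of x_t(S) = x_{t-1}(S) * fitness(S) / mean_fitness."""
    fitness = A @ x               # expected payoff of each strategy against the population
    mean_fitness = x @ fitness    # population-average fitness
    return x * fitness / mean_fitness

def run(x0, steps=200):
    """Iterate the update from the initial population shares x0."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = replicator_step(x)
    return x

for start in ([0.1, 0.9], [0.9, 0.1]):
    print(start, "->", np.round(run(start), 4))
```

Both starting points drift to the same interior mix of Hawks and Doves, which is the frequency-dependent equilibrium discussed above. A population made up only of Hawks has zero mean fitness under this table, so the sketch avoids that starting point.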

The main outcomes of this chapter are that for a two state/signal/action Lewis game

1.  Multiple isomorphic signaling systems, which we could call languages, will arise, leading to a population of agents split equally between them.
2.  In a population of agents whose fitness depends on use of the language, the stable state is one in which just one of the languages is used by the entire population. Other equilibria are unstable, which leads to spontaneous symmetry breaking and a gradual drift of the population towards one of the stable states.
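The symmetry breaking in the second outcome can be illustrated with a deliberately simplified model. It assumes a single population in which a share `x` uses language 1 and the rest use language 2, agents are paired at random, and a pair earns 1 only when both use the same language. The function name `language_share` and the step count are hypothetical.

```python
def language_share(x, steps=50):
    """Iterate the discrete replicator update for the share x of language 1."""
    for _ in range(steps):
        f1, f2 = x, 1.0 - x                 # fitness = chance of meeting a same-language partner
        mean = x * f1 + (1.0 - x) * f2      # population-average fitness
        x = x * f1 / mean
    return x

for x0 in (0.49, 0.50, 0.51):
    print(x0, "->", round(language_share(x0), 4))
```

The even split is a fixed point, but any small deviation from it grows until one language takes over the whole population.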

Notes:

1.  The analysis fails to consider spatial dynamics. It seems that in a local pocket of language 1, agents with language 2 might have lower fitness.
2.  There is a cost of switching, and agents typically are not born with a fully formed language ability. They need to learn a language, which has costs and requires access to signalers who use that language.
3.  In reality *pidgins* and *creoles* are often formed. These are languages that mix two or more languages. This is a partial pooling equilibrium. The existence of creoles suggests that the population dynamics of language formation are more complex than the simple Lewis game.

## Language integration problem

### **Problem Definition**

Given a set of signaling systems $\{\pi_1,\pi_2,\ldots,\pi_n\}$, find a permutation $\pi_m$ such that:

$$
\pi_m =\arg \min_\pi \sum_{i=1}^n d(\pi,\pi_i)
$$

where $d$ is the Cayley distance between permutations, i.e. the minimum number of transpositions required to transform one permutation into another.

### **Solution Approach**

Finding the exact median permutation is a computationally challenging task because the problem is NP-hard. However, there are heuristic and approximation methods to approach this problem. One common approach is to use a greedy algorithm that iteratively improves a candidate solution based on the distances to all permutations in the set.

Here is a simple heuristic approach to estimate a solution:

1.  **Start with an Initial Guess**: You can start with any permutation, such as $\pi_1$ or any permutation randomly chosen from the set.

2.  **Iterative Improvement**:

    -   For each element in the permutation, consider swapping it with every other element.
    -   Calculate the new total distance after each possible swap.
    -   If a swap results in a lower total distance, make the swap permanent.
    -   Repeat this process until no improving swaps are found.

This approach doesn't guarantee an optimal solution but can often produce a good approximation in a reasonable time frame.

Here's a Python function that demonstrates this basic heuristic:


```python
import itertools

def cayley_distance(pi, sigma):
    """Calculate the Cayley distance between two permutations."""
    count = 0
    temp = list(pi)
    for i in range(len(pi)):
        while temp[i] != sigma[i]:
            swap_index = temp.index(sigma[i])
            temp[i], temp[swap_index] = temp[swap_index], temp[i]
            count += 1
    return count

def median_permutation(permutations):
    n = len(permutations[0])  # Assuming all permutations are of the same length
    current = list(permutations[0])  # Start with the first permutation as an initial guess
    improving = True

    while improving:
        improving = False
        best_distance = sum(cayley_distance(current, p) for p in permutations)
        for i, j in itertools.combinations(range(n), 2):
            current[i], current[j] = current[j], current[i]  # Swap elements
            new_distance = sum(cayley_distance(current, p) for p in permutations)
            if new_distance < best_distance:
                best_distance = new_distance
                improving = True
            else:
                current[i], current[j] = current[j], current[i]  # Swap back if no improvement

    return current

# Example usage
permutations = [
    [1, 2, 3, 4],
    [2, 1, 4, 3],
    [1, 3, 4, 2],
    [4, 3, 2, 1]
]
print("Median permutation:", median_permutation(permutations))
```

# Learning

Two types of learning are considered.

1.  Evolutionary learning uses knowledge hard-coded into the genome of the agents. Learning happens through replicator dynamics incorporating randomization followed by natural selection. Other biologically inspired ideas like mutation and the use of a fitness function can also come into play.

    The downside of evolution is that it takes many generations for many structures to emerge. (Richard Dawkins states that the evolution of different morphologies of the eye is quick, taking only 80 generations to evolve in a simulation from the most rudimentary light-sensitive cell, and elsewhere suggests that 8 generations are needed to see changes in this type of framework.)

2.  RL refers to the type of learning from experience by an organism during its lifetime.

3.  Noam Chomsky and other linguists hypothesize that language learning faculties are to a large extent passed on through evolution, and that for this reason individuals can learn languages from a rather minimal amount of stimulus. This has also been a reason why many in their field abandoned their work on solving linguistics and went on to research the mysteries of the human brain. I feel that to a large extent this book demonstrates that scientifically the notion of the brain requiring a specialized mechanism to evolve/learn complex language is an unnecessary assumption. (Of course it is possible that the brain has co-evolved together with language and that such mechanisms do exist.)

    1.  In one sense the book starts with very simple systems of communication with just a lexicon.

    2.  The formation of more complex systems with syntax is treated in chapter 12, but the results there seem aimed at satisfying a mathematician or a philosopher, without delving into the linguistic niceties that might satisfy a linguist.

    3.  However the Lewis game needs only a small tweak (the receiver getting multiple partial signals) to allow a signaling system with a grammar to emerge via Roth-Erev RL. We can also make the categorical statement that this type of RL is a general purpose learning mechanism, not a language specific one.

In agents we have learning that is based on evolution and requires subsequent generations of agents becoming fitter.

Here are two conceptual ideas to base RL on

Law of effect

:   Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be more likely to recur. (Edward Thorndike, *Animal Intelligence*, 1911)

Law of practice

:   Learning slows down as reinforcements accrue

## Roth–Erev RL alg:

1.  set starting weight for each option
2.  weights evolve by addition of rewards gotten
3.  probability of choosing an alternative is proportional to its weight.

```python RE-RL
# Assumes Mesa 2.x, where mesa.time.StagedActivation is available.
from mesa import Agent, Model
from mesa.time import StagedActivation
import random

class LearningRule:
    """Roth-Erev urns: one set of weights per context (a state or a signal)."""

    def __init__(self, contexts, options, learning_rate=0.1):
        # Start with equal weights for all options in every context
        self.weights = {c: {o: 1.0 for o in options} for c in contexts}
        self.learning_rate = learning_rate

    def update_weights(self, context, option, reward):
        # Add the reward (scaled by the learning rate) to the option chosen in this context
        old_weight = self.weights[context][option]
        self.weights[context][option] += self.learning_rate * reward
        print(f"Updated weight for {context} -> {option}: {old_weight} -> {self.weights[context][option]}")

    def choose_option(self, context):
        # Select an option with probability proportional to its weight
        urn = self.weights[context]
        return random.choices(list(urn), weights=list(urn.values()))[0]

class LewisAgent(Agent):
    def __init__(self, unique_id, model, contexts, options):
        super().__init__(unique_id, model)
        self.message = None
        self.action = None
        self.learning_rule = LearningRule(contexts, options, learning_rate=0.1)

    # No-op defaults so that every agent supports every stage
    def send(self):
        pass

    def receive(self):
        pass

    def calc_reward(self):
        pass

    def set_reward(self):
        print(f"Agent {self.unique_id} received reward: {self.model.reward}")

class Sender(LewisAgent):
    def send(self):
        state = self.model.get_state()
        self.message = self.learning_rule.choose_option(state)  # Draw a signal from this state's urn
        print(f"Sender {self.unique_id} sends signal for state {state}: {self.message}")

    def update_learning(self):
        # Reinforce the (state, signal) pair that was just used
        self.learning_rule.update_weights(self.model.current_state, self.message, self.model.reward)

class Receiver(LewisAgent):
    def receive(self):
        # Single-sender game: act on the first sender's signal
        self.signal = self.model.senders[0].message
        self.action = self.learning_rule.choose_option(self.signal)

    def calc_reward(self):
        correct_action = self.model.states_actions[self.model.current_state]
        self.model.reward = 1 if self.action == correct_action else 0  # Reward is shared via the model
        print(f"Receiver {self.unique_id} calculated reward: {self.model.reward} for action {self.action}")

    def update_learning(self):
        # Reinforce the (signal, action) pair that was just used
        self.learning_rule.update_weights(self.signal, self.action, self.model.reward)

class SignalingGame(Model):
    def __init__(self, senders_count=1, receivers_count=1, state_count=3):
        super().__init__()
        k = state_count
        self.current_state = None
        self.reward = 0

        self.states = list(range(k))                     # States are simply numbers
        self.signals = [chr(65 + i) for i in range(k)]   # Signals are characters
        self.actions = list(range(k))
        self.states_actions = {i: i for i in range(k)}   # Mapping states to correct actions

        self.senders = [Sender(i, self, self.states, self.signals) for i in range(senders_count)]
        self.receivers = [Receiver(i + senders_count, self, self.signals, self.actions) for i in range(receivers_count)]

        self.schedule = StagedActivation(self, stage_list=['send', 'receive', 'calc_reward', 'set_reward', 'update_learning'])
        for agent in self.senders + self.receivers:
            self.schedule.add(agent)

    def get_state(self):
        return self.current_state

    def step(self):
        self.current_state = random.choice(self.states)
        print(f"New state of the world: {self.current_state}")
        self.schedule.step()

# Running the model
model = SignalingGame(senders_count=1, receivers_count=1, state_count=3)
for i in range(10):
    print(f"--- Step {i+1} ---")
    model.step()
```

## Bush–Mosteller RL

1.  If an act is chosen and a reward is received, its probability is incremented by adding some fraction $\alpha$ of the distance between the original probability and probability one

    $$
    pr_{new}(A)=(1-\alpha)\,pr_{old}(A) + \alpha \cdot 1
    $$

2.  The probabilities of the alternative actions are decremented so that everything adds to one
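The two steps can be sketched as a single update function. This is a minimal illustration: the function name `bush_mosteller_update`, the default `alpha`, and the choice to shrink the alternatives proportionally by `(1 - alpha)` are assumptions made here, and the basic version below leaves the probabilities unchanged when no reward arrives.

```python
def bush_mosteller_update(probs, chosen, alpha=0.1, rewarded=True):
    """Return updated choice probabilities after one trial.

    probs maps each act to its current probability; chosen is the act just played.
    """
    if not rewarded:
        return dict(probs)                                     # no reward, no change
    new = {act: (1 - alpha) * p for act, p in probs.items()}   # shrink every act
    new[chosen] += alpha                                       # move the chosen act toward one
    return new

probs = {"A": 0.5, "B": 0.3, "C": 0.2}
print(bush_mosteller_update(probs, "A", alpha=0.1))
```

Because every probability is first scaled by `(1 - alpha)` and `alpha` is then added to the chosen act, the total stays at one without a separate normalization step.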

## Goldilocks RL

We consider if there is a Goldilocks point in the RL exploration exploitation dilemma which has a good balance of the two.

-   If we stop learning too fast we are **too cold**

-   If we explore too much we are **too hot**

-   At the limit is the Goldilocks RL point

**Q: is there a Goldilocks RL algorithm?**

-   Roth–Erev, Thompson sampling & UCB don't get stuck

-   Epsilon greedy is too hot

-   Bush–Mosteller is too cold

## RL variants:

-   BM variants like dynamically adjusting aspiration levels

-   exponential response rule. The basic idea is to make probabilities proportional to the exponential of past reinforcements. [@Blume2002]

-   best response dynamics, aka Cournot dynamics
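The exponential response rule mentioned above can be sketched as a softmax over accumulated reinforcements. The `temperature` parameter and the function names `exponential_response` and `choose` are assumptions added here for illustration and are not taken from [@Blume2002].

```python
import math
import random

def exponential_response(reinforcements, temperature=1.0):
    """Choice probabilities proportional to exp(accumulated reinforcement / temperature)."""
    top = max(reinforcements.values())    # subtract the max for numerical stability
    expd = {a: math.exp((r - top) / temperature) for a, r in reinforcements.items()}
    total = sum(expd.values())
    return {a: v / total for a, v in expd.items()}

def choose(reinforcements, temperature=1.0):
    """Sample one act according to the exponential response probabilities."""
    probs = exponential_response(reinforcements, temperature)
    return random.choices(list(probs), weights=list(probs.values()))[0]

past = {"A": 2.0, "B": 1.0, "C": 0.0}
print(exponential_response(past))
print(choose(past))
```

Unlike Roth–Erev, where probabilities are proportional to the weights themselves, here a fixed difference in reinforcement gives a fixed ratio of probabilities.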

## Beyond the book:

-   Work investigating RL for this task \[citation needed\] also suggests that Roth-Erev with forgetting leads to more efficient learning.
-   Another paper \[citation needed\] suggests that learning with a certain prior can be better than Roth-Erev learning.

Adding Learning


```python
# Baseline Lewis game with a fixed (hard-coded) signaling system and no learning yet.
# Assumes Mesa 2.x, where mesa.time.StagedActivation is available.
from mesa import Agent, Model
from mesa.time import StagedActivation
import random

class LewisAgent(Agent):

    def __init__(self, unique_id, model):
        super().__init__(unique_id, model)
        self.message = None
        self.action = None
        self.reward = 0

    # No-op defaults so that every agent supports every stage
    def send(self):
        pass

    def receive(self):
        pass

    def calc_reward(self):
        pass

    def set_reward(self):
        self.reward = self.model.reward
        # Placeholder for learning logic
        print(f"Agent {self.unique_id} received reward: {self.reward}")

class Sender(LewisAgent):

    def send(self):
        state = self.model.get_state()
        # Fixed map from states to signals (to be replaced by a learned one)
        self.message = self.model.states_signals[state]
        print(f"Sender {self.unique_id} sends signal for state {state}: {self.message}")

class Receiver(LewisAgent):

    def receive(self):
        self.received_signals = [sender.message for sender in self.model.senders]
        # Fixed map from signals to actions; all senders see the same state,
        # so the first signal is enough
        self.action = self.model.signals_actions[self.received_signals[0]]

    def calc_reward(self):
        correct_action = self.model.states_actions[self.model.current_state]
        self.model.reward = 1 if self.action == correct_action else 0

class SignalingGame(Model):
    def __init__(self, senders_count=1, receivers_count=1, state_count=3):
        super().__init__()
        k = state_count
        self.reward = 0

        # e.g., 0 -> A, 1 -> B, ...
        self.states_signals = {i: chr(65 + i) for i in range(k)}

        # e.g., A -> 0, B -> 1, ...
        self.signals_actions = {chr(65 + i): i for i in range(k)}

        # state 0 needs action 0, state 1 needs action 1, ...
        self.states_actions = {i: i for i in range(k)}

        self.current_state = None

        # Create agents: senders first, then receivers
        self.senders = [Sender(uid, self) for uid in range(senders_count)]
        self.receivers = [Receiver(senders_count + j, self) for j in range(receivers_count)]

        self.schedule = StagedActivation(
            self,
            stage_list=['send', 'receive', 'calc_reward', 'set_reward']
        )
        for agent in self.senders + self.receivers:
            self.schedule.add(agent)

    def get_state(self):
        return self.current_state

    def step(self):
        self.current_state = random.choice(list(self.states_signals.keys()))
        print(f"New state of the world: {self.current_state}")
        self.schedule.step()

# Running the model
k = 3  # Number of states, signals, and actions
steps = 10
model = SignalingGame(senders_count=2, receivers_count=1, state_count=k)
for i in range(steps):
    print(f"--- Step {i+1} ---")
    model.step()
```

# 11. Networks I: Logic and Information Processing

## Logic

## Information processing

### Inventing the code Game

The world has, say, four states {S1...S4}. In this extended Lewis game an agent is the receiver of two messages, each a partial specification of the state: the first says {s1\|\|s2} or {s3\|\|s4} and the second says {s1\|\|s3} or {s2\|\|s4}. The agent needs to process the two messages together to get the full state specification and take the appropriate action in response to get a reward!

The added problem here is that the messages (one of two flags, and one of two other flags) do not have an established meaning, so the content of the signals needs to evolve together with the inference.

The sender can be two agents or one agent with a complex message.

Jeffrey Barrett showed (Barrett 2007a, 2007b) that this can be learned with Roth-Erev RL.

This is more interesting if there are errors:

-   Is a 10% chance of senders making mistakes compatible with only 3% errors by the receiver?! Skyrms explains this by the inference being like taking a vote in a Condorcet signaling system.

-   receiver errors are considered in [@Nowak1999] where the authors claim they lead to syntax formation.
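The Condorcet reading of the 10% and 3% figures can be checked with a little arithmetic. The sketch below assumes a binary question, independent senders with the same error rate, and a receiver that follows a strict majority. The number of senders is an assumption made here, not something stated above, and `majority_error` is a hypothetical helper.

```python
from math import comb

def majority_error(n_senders, p_error):
    """Probability that a strict majority of n independent senders is wrong."""
    need = n_senders // 2 + 1
    return sum(
        comb(n_senders, k) * p_error**k * (1 - p_error)**(n_senders - k)
        for k in range(need, n_senders + 1)
    )

for n in (1, 3, 5):
    print(n, "senders:", round(majority_error(n, 0.10), 4))
```

Under these assumptions three senders who each err 10% of the time give a majority that errs a little under 3% of the time, which is in line with the figures quoted above.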

# 12. Complex Signals and Compositionality

CCSS

:   complex composable signaling systems

-   The use of complex signals is not unique to humans.

-   In [@Nowak1999] the authors make a case that complex signals can increase the fidelity of information transmission, by preventing simple signals from getting crowded together as the space of potential signals gets filled up. Also, some complex signaling systems should be simpler to learn (*can we specify a maximally learnable family?*) and to process information with.

-   CCSS are considered to confer greater Darwinian fitness in contexts where *rich information processing is important.*

    -   Q: **Is there a metric for measuring the advantage and or the importance of such information processing needs?**

-   In [@batali1998] the author investigates the emergence of complex signals in populations of neural nets.

-   In [@Kirby2000] the author extends the model to a small population of interacting artificial agents.

-   These two papers assume structured meanings like \<John, loves, Mary\>. But I am more interested in the ability to evolve arbitrary structures like a sketch map of resources, a distribution of prices, a small bitmap etc.

-   Skyrms takes a similar reductionist POV: finding how to evolve a complex signaling system with minimal departure from the Lewis signaling game and other models already covered....

-   It is suggested that the "Inventing the code Game" is a sufficient framework for creating basic composable messages, if the receiver treats a sequence of two partial signals as a conjunction that can be integrated into one full message!

    -   Red \> Top

    -   Green\> Bottom

    -   Yellow\> Left

    -   Blue \> Right

    to signal the state of \<bottom, left\> a sender can send \<green,yellow\> or \<yellow,green\> and the receiver can compose them.

-   But if it is also possible to evolve and learn an order for signals, a richer form of composability becomes possible: subject–predicate or operator–sentence.

-   Sensitivity to temporal order is something many organisms have already developed in responding to perceptual signals.

-   More generally, we can say that temporal pattern recognition is a fundamental mechanism for anticipating the future.
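The colour example in the list above can be written out as set intersection. This is a minimal sketch assuming a hypothetical 2x2 world of (row, column) states. The names `STATES`, `MEANING` and `compose` are illustrative only.

```python
# Hypothetical 2x2 world: each state is a (row, column) pair.
STATES = {(r, c) for r in ("top", "bottom") for c in ("left", "right")}

# The colour code from the list above: each signal picks out a subset of states.
MEANING = {
    "red":    {s for s in STATES if s[0] == "top"},
    "green":  {s for s in STATES if s[0] == "bottom"},
    "yellow": {s for s in STATES if s[1] == "left"},
    "blue":   {s for s in STATES if s[1] == "right"},
}

def compose(signals):
    """Conjunction of partial signals = intersection of the state sets they denote."""
    result = set(STATES)
    for sig in signals:
        result &= MEANING[sig]
    return result

print(compose(["green", "yellow"]))
print(compose(["yellow", "green"]))   # order does not matter for a conjunction
```

Because intersection is commutative, this kind of composition carries no information in the order of the signals, which is why order sensitivity has to be a separate, additional mechanism.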

Skyrms points out that temporal order is another mechanism that evolves and that they come together.

Unfortunately Skyrms seems to get sidetracked once he has raised the point about order, and does not explain how order sensitivity evolves in "Making the code game".

```python
# Several senders each report a partial view of the state; the receiver intersects them.
# Assumes Mesa 2.x, where mesa.time.StagedActivation is available.
from mesa import Agent, Model
from mesa.time import StagedActivation
import random

class LewisAgent(Agent):

    def __init__(self, unique_id, model):
        super().__init__(unique_id, model)
        self.message = None
        self.action = None
        self.reward = 0

    # No-op defaults so that every agent supports every stage
    def send(self):
        pass

    def receive(self):
        pass

    def calc_reward(self):
        pass

    def set_reward(self):
        self.reward = self.model.reward
        # Placeholder for learning logic
        print(f"Agent {self.unique_id} received reward: {self.reward}")

class Sender(LewisAgent):

    def send(self):
        state = self.model.get_state()
        # A single state is a str; a partial observation is a set of candidate states
        states = {state} if isinstance(state, str) else state
        # Fixed map from states to signals (to be replaced by a learned one)
        self.message = {self.model.states_signals[s] for s in states}
        print(f"Sender {self.unique_id} sends signal for state {state}: {self.message}")

class Receiver(LewisAgent):

    def receive(self):
        self.received_signals = [sender.message for sender in self.model.senders]
        print(f'{self.received_signals=}')
        # Map each sender's signals to candidate actions, then intersect across senders
        candidates = [
            {self.model.signals_actions[sig] for sig in signal_set}
            for signal_set in self.received_signals
        ]
        self.action = set.intersection(*candidates)
        print(f"Receiver {self.unique_id} action : {self.action}")

    def calc_reward(self):
        correct_action = self.model.states_actions[self.model.current_state]
        # Reward only when the candidates narrow down to exactly the correct action
        self.model.reward = 1 if self.action == {correct_action} else 0

class SignalingGame(Model):
    def __init__(self, senders_count=1, receivers_count=1, state_count=3):
        super().__init__()
        self.senders_count = senders_count
        self.receivers_count = receivers_count
        self.reward = 0

        self.states = [f'{i}' for i in range(state_count)]
        self.signals = [chr(65 + i) for i in range(state_count)]
        self.actions = [f'{i}' for i in range(state_count)]

        # Fixed conventions (no learning yet): state i -> signal i -> action i
        self.states_signals = dict(zip(self.states, self.signals))
        self.signals_actions = dict(zip(self.signals, self.actions))
        self.states_actions = dict(zip(self.states, self.actions))

        self.current_state = None

        # Create agents: senders first, then receivers
        self.senders = [Sender(uid, self) for uid in range(senders_count)]
        self.receivers = [Receiver(senders_count + j, self) for j in range(receivers_count)]

        self.schedule = StagedActivation(
            self,
            stage_list=['send', 'receive', 'calc_reward', 'set_reward']
        )
        for agent in self.senders + self.receivers:
            self.schedule.add(agent)

    def step(self):
        self.current_state = random.choice(self.states)
        print(f"New state of the world: {self.current_state}")
        self.schedule.step()

    def get_state(self):
        if self.senders_count == 1:
            return self.current_state
        # With several senders, each sees the true state plus one random distractor
        return {self.current_state, random.choice(self.states)}

# Running the model
state_count = 3  # Number of states, signals, and actions
steps = 10
model = SignalingGame(senders_count=2, receivers_count=1, state_count=state_count)
for i in range(steps):
    print(f"--- Step {i+1} ---")
    model.step()
```

## Some thoughts

1.  Learning in the original Lewis language games is exponential in the size of the lexicon. It would seem that some complex signaling systems should have an orders of magnitude advantage in learning rate compared to the original variants. Let's consider a Lewis signaling system with 27 signals.\
    The learning cost is $O(e^{27})$, where $e^{27}\approx 5.3\times10^{11}$

2.  Under a conjunctive structure with three messages a lexicon of 9 signals would be required.\
    The learning cost is $O(e^{9})$, where $e^{9}\approx 8.1\times10^{3}$

3.  Say we have a VSO complex signal with 3 signals per positional POS category. This leads to a 27 signal lexicon under the original scheme. Using the complex system only three signals need to be learned.\
    So the learning cost is $O(e^{3})$, where $e^{3} \approx 20$

    If we factor learning time in as part of the costs of signaling we should expect complex signaling systems to emerge quickly. In this case partial pooling states are acceptable and even desirable: each signal now has three meanings depending on its position.

4.  In NLP we never see such a perfect utilization of a signaling system where all syntactically valid messages are semantically meaningful. On the other hand NLP allows nesting, so that a sequence like V(VSO)(VSO) corresponds to 3^8^ messages, and adding a sub-category modifier prefix (MVMSMO) leads to 3^6^ = 729 signals without

5.  For a simulation, some predators can be introduced into the environment, and nearby agents will signal their presence. Receivers who take the appropriate action will survive. Those that do not may die. Agents have longevity and must learn the language. When agents die they are replaced by infants with uniform (untrained) signaling weights.

6.  Another point that seems obvious is that if we learn/evolve the lexicon just one new word at a time the task becomes trivial. We just need to learn one new state-to-signal mapping and one new signal-to-action mapping. Learning just one is a one-to-one matching. If we have some sense of the salience of the signals we can add them in that order and keep increasing fitness, until we reach some marginal rate of fitness where new signals do almost nothing for our survival.

7.  If we can evolve a complex signaling system we can move to next steps like optimizing our lexicon and grammar for:
    1.  minimizing communications errors, (error detection and correction)
    2.  maximizing information transmission. (compression)
    3.  minimizing cost of acquisition. (acquisition)
    4.  the trade-off between grammatical generalization and easy learnability vs. making the system harder to learn but more efficient for communication.
    5.  how do we handle inference (for logic)
    6.  how do we take advantage of predictability for partial messages
    7.  what about a convention for grammar - useful for agents that need to exchange data in different formats efficiently.
    8.  Costs of morphotactics - can we do all this in practice with human sound systems. Can we figure out metrics for human languages.
    9.  Given a (human) language tree can we posit a most parsimonious path for its evolution.
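The back-of-the-envelope costs in items 1 to 3 above can be reproduced directly. The sketch takes the rough cost model from those items at face value (effort growing like $e$ raised to the number of signals to be coordinated), and `learning_cost` is a hypothetical helper name.

```python
import math

def learning_cost(lexicon_size):
    """Rough cost model from the items above: effort grows like e**(lexicon size)."""
    return math.exp(lexicon_size)

cases = [
    ("flat lexicon", 27),
    ("conjunctive, 3 slots x 3 signals", 9),
    ("positional, 3 reused signals", 3),
]
for label, size in cases:
    print(f"{label:34s} e^{size:<2d} = {learning_cost(size):.3g}")
```

The gap between the flat lexicon and the positional system is roughly ten orders of magnitude under this cost model, which is the advantage the argument above relies on.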


In [@Skyrms2010signals] the author discusses how Lewis signaling games can be viewed as a mechanism by which a rudimentary signaling system can give rise to a simple language.

The languages arising from Lewis signaling games are limited.


---
tldr:

The book is aptly named Signals. It looks primarily at signaling systems, not at anything resembling a fully fledged language.

- What is missing -
  - real distribution of states - very large set of messages
  - real distribution of messages - a subset of phonology
  - a communication protocol over a lossy and risky channel which can lead to
    - sender may err (most of the RL models in the book, unlike mine, lead to making mistakes at a certain frequency)
      - mistakes can happen at either end
      - Lewis games have a shared reward but mistakes will impact agents asymmetrically
        - receivers taking the wrong action will face greater risks
        - senders making mistakes can
    - sender
    - transmission errors
    - risk related to message length
    - risk associated with mistakes in the protocol.

    - possibility of partial messages and pressure towards shorter messages
      - consideration of "learning by correcting mistakes"
  - a grammar
  - group dynamics - there are many languages. Languages may have emerged over and over in different locations and groups seem to
    have played a major role here. I believe that a basic model of many agents should lead to rapid
  - some constraints
    - distributional constraints already mentioned
    - basic language learning must be rapid even if learning the full system takes years.
      - morphology speeds up learning of the lexicon
        - learning k lexemes + a morphology of m productions => $k \times m$ lexicon with a coordination cost of $k+m$
      - syntax allows more efficient use of the

The main formalism is the Lewis signaling game. It covers in essence nothing more than coordinating a shared lexicon of signals.
There is a lot of effort made to consider the separating equilibria.
But there are many more separating equilibria and some of these are not only better suited for communication but also for the
emergence of categories, a type of message that can be specialized/disambiguated using another signal. However this requires
setting up a communication protocol. Lewis signaling can help with the coordination task for that by picking a subset of
protocols from many possible options. But there is a second, more salient part - what is the protocol?