# Agent-based models and path dependency

## Agent-based models

Another kind of generative or simulation-based model which can offer insights into the dynamics of complexity is the agent-based model.

* An agent-based model is just a simulation of a number of agents (a bit like imaginary characters) who act according to some rules in an environment that has rules of its own.

* Simple agent-based models are sometimes associated with automata because they are often implemented in simplified "worlds" akin to those of classic cellular automata like Conway's Game of Life.
    * But there are a variety of more complex approaches to agent-based models: they need not be simple, deterministic, or discrete, and they can have sophisticated rules.

### Using agent-based models

Although it might be tempting to jump into complex agent-based models, there are good reasons to work with a minimal model, such as explainability and calibration.

We can borrow the automaton idea -- and substitute rules that we would like to investigate -- to see if our rules lead to favorable outcomes. With an experimental platform like this, we can then adjust our rules in the hope of creating different outcomes.

### Schelling Segregation Model

To get a feel for how this approach might be informative, we'll look at one of the earliest and most famous agent-based models: the __Schelling Segregation Model__.

Allen Downey summarizes this model/thought experiment brilliantly in his book *Think Complexity*:

> The Schelling model of the world is a grid where each cell represents a house. The houses are occupied by two kinds of agents, labeled red and blue, in roughly equal numbers. About 10% of the houses are empty.
>
> At any point in time, an agent might be happy or unhappy, depending on the other agents in the neighborhood, where the “neighborhood” of each house is the set of eight adjacent cells. In one version of the model, agents are happy if they have at least two neighbors like themselves, and unhappy if they have one or zero.
>
> The simulation proceeds by choosing an agent at random and checking to see whether they are happy. If so, nothing happens; if not, the agent chooses one of the unoccupied cells at random and moves.
>
> You will not be surprised to hear that this model leads to some segregation, but you might be surprised by the degree. From a random starting point, clusters of similar agents form almost immediately. The clusters grow and coalesce over time until there are a small number of large clusters and most agents live in homogeneous neighborhoods.
>
> If you did not know the process and only saw the result, you might assume that the agents were racist, but in fact all of them would be perfectly happy in a mixed neighborhood.

Let's implement this model to see
* what a simple agent-based model implementation looks like
* how the "homophily index," or fraction of similar neighbors required for happiness, affects the overall segregation of the grid

In [None]:
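import numpy as np
import matplotlib.pyplot as plt

size = 15               # the grid is size x size cells
homophily_index = 0.3   # fraction of similar neighbors an agent needs to be happy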

In [None]:
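def make_grid(size):
    # Draw one uniform random number per cell, then bucket it:
    # ~45% group 1, ~45% group 2, and the remaining ~10% empty (0).
    grid = np.random.uniform(0, 1, (size, size))
    grid[grid > 0.55] = 1
    grid[grid < 0.45] = 2
    grid[grid < 1] = 0   # whatever is still in [0.45, 0.55] becomes empty
    return grid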

grid = make_grid(size)
grid

In [None]:
plt.magma()
plt.imshow(grid)
plt.colorbar()

In [None]:
np.argwhere(grid==0)

In [None]:
import random

def pick_random_agent(grid):
    agent_locations = np.argwhere(grid != 0)
    loc_index = random.randint(0, agent_locations.shape[0]-1)
    return (agent_locations[loc_index][0], agent_locations[loc_index][1])

def pick_empty_loc(grid):
    empty_locations = np.argwhere(grid == 0)
    loc_index = random.randint(0, empty_locations.shape[0]-1)
    return (empty_locations[loc_index][0], empty_locations[loc_index][1])

In [None]:
agent = pick_random_agent(grid)
agent

In [None]:
agent_group = grid[agent[0], agent[1]]
agent_group

In [None]:
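# Clamp the window at 0: for an agent in row or column 0, a start index of -1 would give an empty slice
neighborhood = grid[max(agent[0]-1, 0):agent[0]+2, max(agent[1]-1, 0):agent[1]+2]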
neighborhood

In [None]:
similar_neighbors_locs = (neighborhood == agent_group)
similar_neighbors_locs

In [None]:
# Subtract 1 so the agent does not count itself
similar_neighbors = similar_neighbors_locs.sum() - 1
similar_neighbors

In [None]:
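def do_update(grid):
    row, col = pick_random_agent(grid)
    agent_group = grid[row, col]
    # 3x3 window around the agent; clamp at 0 so edge agents don't get an empty slice
    neighborhood = grid[max(row-1, 0):row+2, max(col-1, 0):col+2]
    # Subtract 1 so the agent does not count itself
    similar_neighbors = (neighborhood == agent_group).sum() - 1
    # Edge agents have fewer than 8 neighbors but are still scored out of 8
    is_happy = (similar_neighbors / 8) > homophily_index
    if not is_happy:
        # Unhappy agents move to a randomly chosen empty cell
        new_loc = pick_empty_loc(grid)
        grid[row, col] = 0
        grid[new_loc[0], new_loc[1]] = agent_group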

In [None]:
plt.imshow(grid)

In [None]:
for i in range(10 * size**2):
    do_update(grid)
    
plt.imshow(grid)

In [None]:
size = 100

grid = make_grid(size)
plt.imshow(grid)

In [None]:
for i in range(2 * size**2):
    do_update(grid)
    
plt.imshow(grid)

In [None]:
for i in range(2 * size**2):
    do_update(grid)
    
plt.imshow(grid)

In [None]:
for i in range(2 * size**2):
    do_update(grid)
    
plt.imshow(grid)

In [None]:
homophily_index = 0.4

size = 100

grid = make_grid(size)
plt.imshow(grid)

In [None]:
for i in range(4 * size**2):
    do_update(grid)
    
plt.imshow(grid)

It would be interesting to plot the homophily index vs. the number of iterations before a particular segregation level is met, but that is a bit beyond what we have time for today.
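A first step toward that plot is to put a number on "segregation level." Here is a minimal sketch, assuming we define it as the average share of each agent's occupied neighboring cells that hold an agent of the same group. The function name `mean_similarity` and this particular definition are choices made for illustration; other segregation measures would work too.

```python
import numpy as np

def mean_similarity(grid):
    """Average, over agents with at least one occupied neighbor, of the
    fraction of those neighbors belonging to the agent's own group.
    Cells equal to 0 are treated as empty."""
    rows, cols = grid.shape
    # Pad with empty cells so edge agents simply have fewer neighbors
    padded = np.pad(grid, 1, constant_values=0)
    same = np.zeros(grid.shape)
    occupied = np.zeros(grid.shape)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the agent's own cell
            # View of the grid shifted by (dr, dc): each cell sees one neighbor
            shifted = padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]
            same += (shifted == grid) & (grid != 0)
            occupied += (shifted != 0)
    has_neighbors = (grid != 0) & (occupied > 0)
    return (same[has_neighbors] / occupied[has_neighbors]).mean()

# A fully mixed 2x2 block versus a uniform block
print(mean_similarity(np.array([[1, 2], [2, 1]])))
print(mean_similarity(np.ones((5, 5))))
```

Calling `mean_similarity(grid)` every few hundred calls to `do_update` would give a curve of segregation over iterations for each value of `homophily_index`.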

### Takeaway

What are some takeaways from this experiment?

We can test hypotheses about real-world phenomena within a highly artificial "small world" and still learn critical insights.

* For example, we might want to test the following hypothesis:
    * "Modest homophily values like 30% are insufficient to generate segregation -- something else is necessary."
    * *We can see that the hypothesis is clearly false.*

Of course, the model cannot tell you how to manage your society, business, or project. But it can provide indicators you can use when designing for target outcomes.

### More recent research

Since Schelling's work, numerous researchers have used agent-based models (ABMs) to explore a wide range of phenomena.

__Joshua Epstein and Agent Zero__

In *Agent_Zero: Toward Neurocognitive Foundations for Generative Social Science*, Joshua Epstein introduces a new theoretical agent called Agent Zero, which is an attempt to ground social science in neurocognitive processes. Epstein's work focuses on creating computational models that can generate a wide array of social phenomena. Some of the phenomena that Epstein generates and discusses in the book include how
* a jury can unanimously vote to convict when only a minority of participants believe the defendant is guilty
* diversity in "trigger points" can make a mob more likely to turn violent
* soldiers can become susceptible to committing mass killings and other atrocities

Book: https://press.princeton.edu/books/hardcover/9780691158884/agentzero

Online models (will likely not make sense until after reading the book): http://modelingcommons.org/browse/one_model/5982#model_tabs_browse_info

__Stefani Crabtree__

https://stefanicrabtree.com/agent-based-modeling-work/

#### How does this connect to the distributions we talked about earlier?

Although it is convenient to formulate and render cellular automata like these in a "grid world," they can actually be interpreted as graphical models, so they are not as far away from networks as they might appear at first. And, as areas of the "grid" are assimilated to one factor or another, there are multiplicative effects: for example, in this model, the colored areas (~ 2-dimensional) become overwhelmingly larger than the frontiers (~ 1-dimensional).

The homophily model we've looked at here might just as easily describe users of one or another mobile phone or communications platform (in non-geographical cases, the dimensions may represent other aspects of a social or product space) -- so there are definitely applications in business.

## Exploring path dependence

Simple models may evaluate a distribution of outcomes for an individual, team, firm, or other group over a series of choices.

For example, when choosing what product to prioritize for the next quarter, projections might assign probabilities and expected profits to different market scenarios and product choices. A business unit might choose to focus on the product with highest expected profit across the projected business scenarios.

However -- as anyone familiar with the consequences of technical debt can tell you -- your next choice is rarely made with a blank-slate starting point. We all have to live with the consequences of our previous choices, and that can change the expected outcome dramatically. 

* In other words, our outcomes are not purely dependent on a current decision. They are dependent on the path of prior steps in the outcome space.

### Simple investment model

We'll take a look at a simple investment (or gambling) model which produces reliable positive returns when viewed from an average (or expectation) perspective, but yields ruinous losses when viewed from the path-dependent perspective of any actual investor (or gambler).

__The business proposition__

* 50/50 risk of success or failure
* Success returns 50 cents on the dollar (i.e., \\$1 invested returns \\$1.50)
* Failure produces a loss of 40 cents (i.e., in the failure scenario, one recoups \\$0.60 from each \\$1 invested)

Traditional expectation:

In [None]:
0.5 * 1.5 + 0.5 * 0.60

We can simulate that to get a better idea of the deviation from the ideal average:

In [None]:
sample_size = range(100, 10000, 100)

outcomes = []

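for i in sample_size:
    # Each draw is one independent $1 bet: 50% pays 1.50, 50% pays 0.60
    draws = np.random.uniform(0, 1, i)
    draws[draws > 0.5] = 1.50
    draws[draws < 1] = 0.6   # everything not already set to 1.50
    outcomes.append(draws.mean())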
    
plt.plot(sample_size, outcomes)

So it looks like, even with small samples or "bad luck," we should do pretty well with this sort of investment.

__Ensemble average vs. time average__

But this form of average assumes that we start in the same position prior to each investment or bet.

* It's a bit like looking at hundreds or thousands of individuals or firms each making one bet. On average, they will (collectively) do well!

But let's change our perspective for a moment and look at one individual or firm making a sequence of bets/investments, each time staking whatever the previous bet left them with.

* If they make $2n$ investments, we would expect about $n$ to yield the \\$1.50 and the other $n$ to yield the \\$0.60
* So the end result would be $(1.5)^n*(0.6)^n = [(1.5)(0.6)]^n = 0.9^n$

Wait ... $0.9^n$ doesn't look very good. In fact, it goes to zero very quickly as $n$ grows.
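The same comparison can be made per bet. As a quick check (the variable names here are illustrative), the arithmetic mean of the two multipliers is what the ensemble average sees, while the geometric mean is the typical growth factor a single investor experiences per bet when the full stake is reinvested:

```python
import math

up, down = 1.5, 0.6   # multipliers for success and failure

# Ensemble view: average multiplier across many independent one-off bets
arithmetic_mean = 0.5 * up + 0.5 * down

# Time view: typical per-bet growth factor along one investor's path
geometric_mean = math.sqrt(up * down)

# Equivalent statement in logs: the expected log-growth per bet is negative
log_growth = 0.5 * math.log(up) + 0.5 * math.log(down)

print(arithmetic_mean, geometric_mean, log_growth)
```

An arithmetic mean above 1 together with a geometric mean below 1 is exactly the situation in which the ensemble grows while the typical individual path shrinks.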

Just to be sure, let's simulate this as well:

In [None]:
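steps = 200
simulations = 10000
# One row per simulated investor, one column per bet
draws = np.random.uniform(0, 1, (simulations, steps))
draws[draws > 0.5] = 1.50
draws[draws < 1] = 0.6
# Final wealth per investor: the product of all of that investor's multipliers
outcomes = draws.prod(axis=1)
plt.hist(outcomes, bins=100)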

Just for comparison, here is our expected value after `steps` investments:

In [None]:
expected = 1.05 ** steps
expected

In [None]:
outcomes[outcomes < 0.1].size / simulations

In [None]:
outcomes[outcomes < 1].size / simulations

In [None]:
outcomes[outcomes > 2].size / simulations

In [None]:
outcomes[outcomes >= expected].size / simulations
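The simulated fractions can be cross-checked without random numbers. This is a sketch under the same assumptions as the simulation (independent 50/50 bets, full reinvestment); the helper names are made up for this example. It finds how many wins out of `steps` are needed just to break even, then sums the exact binomial probability of getting at least that many.

```python
import math

def min_wins_to_break_even(steps, up=1.5, down=0.6):
    # Smallest k with up**k * down**(steps - k) >= 1, checked in logs
    for k in range(steps + 1):
        if k * math.log(up) + (steps - k) * math.log(down) >= 0:
            return k
    return None

def prob_at_least(k, steps, p=0.5):
    # Exact binomial tail: P(number of wins >= k)
    return sum(math.comb(steps, j) * p**j * (1 - p)**(steps - j)
               for j in range(k, steps + 1))

steps = 200
k = min_wins_to_break_even(steps)   # 112 for steps = 200
print(k, prob_at_least(k, steps))
```

The printed probability should be close to the simulated fraction of investors who do not end up below 1.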

__A dramatic view of the "lifelines" of a number of agents facing a similar set of options__

<img src='images/ergo.webp' width=700>

From: https://www.nature.com/articles/s41567-019-0732-0
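A rough version of such lifelines can be generated for the bet described above. This sketch is an assumption about how one might do it, not the code behind the figure; the function name `simulate_lifelines`, the number of agents, and the seed are arbitrary choices.

```python
import numpy as np

def simulate_lifelines(n_agents=20, steps=200, seed=0):
    """Return an (n_agents, steps) array of each agent's wealth over time,
    starting from 1 and reinvesting everything on every bet."""
    rng = np.random.default_rng(seed)
    wins = rng.random((n_agents, steps)) > 0.5
    multipliers = np.where(wins, 1.5, 0.6)
    # Running product along each row = that agent's wealth after each bet
    return np.cumprod(multipliers, axis=1)

paths = simulate_lifelines()
print("median final wealth:", np.median(paths[:, -1]))
print("mean final wealth:  ", paths[:, -1].mean())
```

Plotting each row of `paths` on a logarithmic y-axis (for example with `plt.plot(paths.T)` followed by `plt.yscale("log")`) gives one line per agent.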

#### Takeaway

When does this occur in real life?

Although our specific numbers in the present example are contrived, path dependence is a critical factor in many real-world systems:
* economic actors
* health outcomes
* hiring and promotion
* education
* criminal justice
* participation in risk-taking and investment activities

__How does this connect to the distributions and patterns we've been talking about?__

Notice that, in the path-dependent case,
* we have a *series of multiplied values which are not independent*
    * (since each multiplication is applied to, and so depends on, the prior state)
* where, in the ensemble expectation, we *assumed* that all of the events (values being multiplied) are independent
    * (they only depend on the "rules of the game" -- every trial starts with 1 dollar)
    
Once again, we see a compounding effect leading to drastically large (or small) numbers. 

A concrete example is insurance pools. A sufficiently large and diverse business can "self insure" anything from employee health costs to its own fleet of vehicles. Such self insurance can work, provided the losses are independent enough that the ensemble average holds.

If a company's employees were all concentrated in an area with common health hazards (say, contaminated air or ground water) then the sequence of repeated health-cost losses would not be independent -- risk would be magnified as health losses compound over time.

__How do we use this knowledge?__

Any time we are looking to achieve an "average" result over time, we can ask whether the steps are truly independent. As a technology example, we may have a device deployed in the field which features high uptime (long time between failures).
* To achieve long-term reliability, we want to ensure that the device is as stateless as possible when it recovers.
* If a device retains state (e.g., internal storage or config) which affects its future success (after recovering from a failure), then the sequence of failures becomes path dependent.
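The difference between the two bullets can be illustrated with a toy simulation. Everything here is hypothetical: the function `count_failures`, the failure probabilities, and the assumption that each past failure adds a fixed amount (`carryover`) to the chance of the next one. A stateless device corresponds to `carryover=0`.

```python
import numpy as np

def count_failures(draws, base_fail, carryover):
    """Count failures over len(draws) time steps.
    draws: uniform random numbers in [0, 1), one per time step.
    carryover: how much each past failure raises the future failure
    probability (0 means the device recovers to a clean state)."""
    failures = 0
    for u in draws:
        p_fail = min(1.0, base_fail + carryover * failures)
        if u < p_fail:
            failures += 1
    return failures

rng = np.random.default_rng(0)
draws = rng.random(1000)   # same "luck" for both devices

stateless = count_failures(draws, base_fail=0.01, carryover=0.0)
stateful = count_failures(draws, base_fail=0.01, carryover=0.005)
print(stateless, stateful)
```

Because both devices see the same random draws, any gap between the two counts comes only from the retained state.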