# Batch Analysis for hawk/dove simulation with multiple risk attitudes, no adjustment

- What % of the time do agents play Hawk, by risk attitude? In particular, how often do risk-inclined (R=1) and risk-avoidant (R=8) agents play Hawk?
- Cumulative wealth analysis by risk attitude

To track how often agents play Hawk, we need to collect data for every round.

This data was generated by running:
```console
simulatingrisk/hawkdovemulti/batch_run.py --params no_adjustment --agent-data --collect-data every_round
```

Each row in the data file represents one agent's play in one round.

In [2]:
import polars as pl


# df = pd.read_csv("../../data/hawkdovemulti/2025-07-22T170057_747737_agent.csv")
# df = pd.read_csv("../../data/hawkdovemulti/2025-07-23T135510_859267_agent.csv")
# which batch run data to use; use variable to ensure we use matching agent and model data
batch_run_date = "2025-07-24T120337_924060"

# load agent data, drop unneeded columns, and add numeric value 1 for played hawk, 0 for played dove

df = (
    pl.read_csv(f"../../data/hawkdovemulti/{batch_run_date}_agent.csv")
        .drop("risk_level_changed")  # drop risk_level_changed; not relevant here (no adjustment = no changes)
        .rename({'risk_level': 'risk_attitude'})  # code still uses risk_level internally; relabel as risk attitude
        .with_columns(
            # add a numeric field to turn choice of play to 1/0 hawk, for aggregation            
            played_hawk=pl.when(pl.col("choice").eq("hawk")).then(1).otherwise(0)
        )
)
df.head()

RunId,iteration,Step,AgentID,risk_attitude,choice,points,played_hawk
i64,i64,i64,i64,i64,str,i64,i32
1,1,1,0,3,"""dove""",12,0
1,1,1,1,8,"""dove""",12,0
1,1,1,2,8,"""hawk""",12,1
1,1,1,3,4,"""hawk""",18,1
1,1,1,4,8,"""hawk""",15,1


## Percent of the time agents play Hawk, by risk attitude

What % of the time do risk-inclined (R=1) and risk-avoidant (R=8) agents play Hawk?

- From observation, the guess is >90% for R=1 and <10% for R=8, but we want statistics for this: out of X trials, in how many does R=1 play Hawk more than 90% of the time?
- Also useful to have statistics such as "R=2 played Hawk between 80% and 90% of the time", or whatever the result is.
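
The per-run question in the first bullet can be sketched as follows. This is a minimal illustration in plain Python on made-up toy rows, not the notebook's polars pipeline; the helper names `pct_hawk_per_run` and `share_of_runs` and the 90% threshold are assumptions for illustration.

```python
from collections import defaultdict

# Toy rows shaped like the agent data: (RunId, risk_attitude, played_hawk).
# The values are invented for illustration and are not simulation results.
rows = (
    [(1, 1, 1)] * 10                 # run 1: R=1 plays Hawk 10 of 10 times
    + [(2, 1, 1)] * 4 + [(2, 1, 0)]  # run 2: R=1 plays Hawk 4 of 5 times
    + [(1, 8, 0)] * 3 + [(1, 8, 1)]  # run 1: R=8 plays Hawk 1 of 4 times
)


def pct_hawk_per_run(rows):
    """Return {(run_id, risk_attitude): percent of plays that were Hawk}."""
    counts = defaultdict(lambda: [0, 0])  # key -> [n_plays, n_plays_hawk]
    for run_id, risk_attitude, played_hawk in rows:
        counts[(run_id, risk_attitude)][0] += 1
        counts[(run_id, risk_attitude)][1] += played_hawk
    return {key: 100 * n_hawk / n_plays for key, (n_plays, n_hawk) in counts.items()}


def share_of_runs(per_run, risk_attitude, condition):
    """Fraction of runs in which the given risk attitude's Hawk % meets `condition`."""
    values = [pct for (_, risk), pct in per_run.items() if risk == risk_attitude]
    return sum(1 for pct in values if condition(pct)) / len(values)


per_run = pct_hawk_per_run(rows)
# In what share of runs does R=1 play Hawk more than 90% of the time?
r1_over_90 = share_of_runs(per_run, 1, lambda pct: pct > 90)
print(per_run[(1, 1)], per_run[(2, 1)], r1_over_90)
```

The same grouping (by run and risk attitude rather than by risk attitude alone) could be applied to the full agent data to answer the question for real runs.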


In [3]:
# each row in the data frame is a play by an agent on the grid
# group by risk level, then:
# - count the number of rows (= total number of plays)
# - sum the played_hawk field (= number of times played hawk)
# - calculate percent of turns played hawk

hawk_by_risk_attitude = (
    df.group_by("risk_attitude")
        .agg(n_plays=pl.col("played_hawk").count(), n_plays_hawk=pl.col("played_hawk").sum())
    .with_columns(
        pct_plays_hawk=pl.col("n_plays_hawk").truediv(pl.col("n_plays")).mul(100)
    )
)

hawk_by_risk_attitude

risk_attitude,n_plays,n_plays_hawk,pct_plays_hawk
i64,u32,i32,f64
9,13146745,178159,1.355157
3,15198831,10693764,70.359122
0,13280811,13101490,98.649774
6,15209556,4506272,29.6279
7,13058019,1725094,13.210993
1,12726041,11838227,93.023643
4,17567250,10583899,60.247899
8,12692401,882273,6.951191
2,13098799,11371104,86.810279
5,17605297,6979686,39.645375


In [4]:
# output core fields as nicely styled table

(hawk_by_risk_attitude
    .select("risk_attitude", "pct_plays_hawk")
    .sort("risk_attitude")
    .rename({"risk_attitude": "Risk Attitude", "pct_plays_hawk": "% plays Hawk"})
    .style.tab_header(title="% of time agents play Hawk by Risk Attitude")
    .fmt_number("% plays Hawk", decimals=1)
)

% of time agents play Hawk by Risk Attitude
Risk Attitude,% plays Hawk
0,98.6
1,93.0
2,86.8
3,70.4
4,60.2
5,39.6
6,29.6
7,13.2
8,7.0
9,1.4


In [5]:
import altair as alt

alt.Chart(hawk_by_risk_attitude).mark_bar(width=10).encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),                                                          
    y=alt.Y("pct_plays_hawk", title="% time plays Hawk")
).properties(title="% of time agents play Hawk")

Context: how many runs are these numbers drawn from?



In [6]:
total_unique_runs = len(df["RunId"].unique())
total_iterations = len(df["iteration"].unique())
n_combinations = int(total_unique_runs / total_iterations)

# Step is the round count indicator
longest_run = df["Step"].max()    # highest across all runs
# average of max value for each run: group by run, get max Step, then average; this returns a dataframe, so get the first Step value
average_run = df.group_by("RunId").agg(pl.col("Step").max()).mean()["Step"].first()

print(f"""{total_unique_runs:,} total unique runs; {total_iterations} iterations of {n_combinations} different parameter combinations.

Longest run: {longest_run} steps
Average run: {average_run:.1f} steps
""")


13,500 total unique runs; 100 iterations of 135 different parameter combinations.

Longest run: 109 steps
Average run: 44.9 steps



### Analysis filtered by simulation parameters

In [7]:
# identify the last round of each run
# for both wealth analysis and model parameters, we want to look at the last round (Step) of each run

last_round_df = df.group_by("RunId").agg(pl.col("Step").max())
last_round_df.head()

RunId,Step
i64,i64
8789,57
8146,31
12996,31
9983,31
11421,31


In [8]:
# load model data and filter to last round for each run
full_model_df = pl.read_csv(f"../../data/hawkdovemulti/{batch_run_date}_model.csv")
model_df = last_round_df.join(full_model_df, on=['RunId', 'Step'], how="left")
# limit to only those fields that are needed for our analysis
model_df = model_df.select("RunId", "iteration", "risk_distribution", "play_neighborhood", "observed_neighborhood", "grid_size")
model_df.head()

RunId,iteration,risk_distribution,play_neighborhood,observed_neighborhood,grid_size
i64,i64,str,i64,i64,i64
8789,89,"""skewed right""",8,4,5
8146,46,"""skewed right""",8,8,5
12996,96,"""bimodal""",4,24,5
9983,83,"""skewed right""",4,8,5
11421,21,"""bimodal""",8,4,5


In [9]:
print(f"""Simulation parameters:

Grid size: {', '.join(str(n) for n in sorted(model_df["grid_size"].unique()))}
Initial risk distribution: {', '.join(str(n) for n in sorted(model_df["risk_distribution"].unique()))}
Play neighborhood size: {', '.join(str(n) for n in sorted(model_df["play_neighborhood"].unique()))}
Observed neighborhood size: {', '.join(str(n) for n in sorted(model_df["observed_neighborhood"].unique()))}

""")

Simulation parameters:

Grid size: 5, 10, 25
Initial risk distribution: bimodal, normal, skewed left, skewed right, uniform
Play neighborhood size: 4, 8, 24
Observed neighborhood size: 4, 8, 24
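
As a cross-check on the run counts reported earlier, the parameter values listed above multiply out to the number of combinations. This sketch assumes these four parameters are the only ones varied in the batch run and that each combination was run for 100 iterations, as the earlier output states; the variable names are illustrative.

```python
from itertools import product

# Parameter values as printed in the simulation parameter summary
grid_sizes = [5, 10, 25]
risk_distributions = ["bimodal", "normal", "skewed left", "skewed right", "uniform"]
play_neighborhoods = [4, 8, 24]
observed_neighborhoods = [4, 8, 24]

# Every combination of the four parameters
combinations = list(
    product(grid_sizes, risk_distributions, play_neighborhoods, observed_neighborhoods)
)
n_combinations = len(combinations)

iterations = 100  # iterations per combination, from the run summary
total_runs = n_combinations * iterations
print(f"{n_combinations} combinations x {iterations} iterations = {total_runs:,} runs")
```

This agrees with the 135 combinations and 13,500 unique runs computed from the agent data.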




In [10]:
# join agent data with model data so we can filter by starting parameters
agent_df_params = df.join(model_df, on=["RunId", "iteration"], how="left")

#### Grid size


In [11]:
hawk_by_gridsize_riskattitude = (
    agent_df_params.group_by("grid_size", "risk_attitude")
         .agg(n_plays=pl.col("played_hawk").count(), n_plays_hawk=pl.col("played_hawk").sum())
    .with_columns(
        pct_plays_hawk=pl.col("n_plays_hawk").truediv(pl.col("n_plays")).mul(100)
    )
)

alt.Chart(hawk_by_gridsize_riskattitude).mark_bar(width=10).encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),                                                          
    y=alt.Y("pct_plays_hawk", title="% time plays Hawk")
).facet(column=alt.Column("grid_size", title="Grid Size")).properties(title="% of time agents play Hawk")


#### Play neighborhood


In [12]:
hawk_by_playnhood_riskattitude = (
    agent_df_params.group_by("play_neighborhood", "risk_attitude")
         .agg(n_plays=pl.col("played_hawk").count(), n_plays_hawk=pl.col("played_hawk").sum())
    .with_columns(
        pct_plays_hawk=pl.col("n_plays_hawk").truediv(pl.col("n_plays")).mul(100)
    )
)

alt.Chart(hawk_by_playnhood_riskattitude).mark_bar(width=10).encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),                                                          
    y=alt.Y("pct_plays_hawk", title="% time plays Hawk")
).facet(column=alt.Column("play_neighborhood", title="Play Neighborhood")).properties(title="% of time agents play Hawk")


#### Observed neighborhood


In [13]:
hawk_by_obsnhood_riskattitude = (
    agent_df_params.group_by("observed_neighborhood", "risk_attitude")
         .agg(n_plays=pl.col("played_hawk").count(), n_plays_hawk=pl.col("played_hawk").sum())
    .with_columns(
        pct_plays_hawk=pl.col("n_plays_hawk").truediv(pl.col("n_plays")).mul(100)
    )
)

alt.Chart(hawk_by_obsnhood_riskattitude).mark_bar(width=10).encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),                                                          
    y=alt.Y("pct_plays_hawk", title="% time plays Hawk")
).facet(column=alt.Column("observed_neighborhood", title="Observed Neighborhood")).properties(title="% of time agents play Hawk")


#### Initial risk distribution


In [14]:
hawk_by_initialdist_riskattitude = (
    agent_df_params.group_by("risk_distribution", "risk_attitude")
         .agg(n_plays=pl.col("played_hawk").count(), n_plays_hawk=pl.col("played_hawk").sum())
    .with_columns(
        pct_plays_hawk=pl.col("n_plays_hawk").truediv(pl.col("n_plays")).mul(100)
    )
)

alt.Chart(hawk_by_initialdist_riskattitude).mark_bar(width=10).encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),                                                          
    y=alt.Y("pct_plays_hawk", title="% time plays Hawk")
).facet(facet=alt.Facet("risk_distribution", title="Initial Risk Distribution"), columns=3
).properties(title="% of time agents play Hawk")


## Cumulative Wealth analysis


- mean and quartiles for wealth by R
- does mean vary between Rs or is it roughly the same?
  - quartiles look different, but need a statistic; esp. compare lower-R quartile to higher-R quartile
     - expect/hope that the lower quartile is higher for R1 than R8, and the upper quartile is higher for R8 than R1
     - what's going on in the middle?


-----

Because our simulations run for different numbers of rounds before stopping, and because they use a range of play neighborhood sizes (which affects total payoff), we scale points (wealth) by play neighborhood size and run length so that wealth is comparable across simulations.
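
The scaling can be written as a one-line formula. The sketch below mirrors the expression used in the next cell (points divided by play neighborhood, divided by the number of rounds, times 100); the function name `scaled_points` is an assumption for illustration, and the example values are taken from the first row of the last-round table in this notebook.

```python
def scaled_points(points, play_neighborhood, steps):
    """Points per neighbor per round, multiplied by 100."""
    return points / play_neighborhood / steps * 100


# An agent with 687 points after 57 rounds in a play neighborhood of 8
example = scaled_points(687, 8, 57)

# The same points total earned over a shorter run scales to a larger value
shorter_run = scaled_points(687, 8, 31)

print(round(example, 6), shorter_run > example)
```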


In [15]:
# combine the last round dataframe and full agent dataframe to get just the last round
agents_last_round_df = (
    last_round_df.join(df, on=['RunId', 'Step'], how="left")
        # join on model parameters, for filtering and scaling by play neighborhood
        .join(model_df, on=["RunId", "iteration"])
        .with_columns(
            # calculate a scaled points value so we can compare across runs with different length and play neighborhood
            scaled_points=pl.col("points").truediv(pl.col("play_neighborhood")).truediv(pl.col("Step")).mul(100)
        )
    
)

# hard to compare points across runs, since it depends on how long the simulation ran; scale points by number of runs
# agents_last_round_df['scaled_points'] = agents_last_round_df.apply(lambda x: (x['points'] / x['Step'])*10, axis=1)

# merge with model parameters, for filtering
# agents_last_round_df = pd.merge(agents_last_round_df, model_df, on=["RunId", "iteration"])

agents_last_round_df.head(10)

RunId,Step,iteration,AgentID,risk_attitude,choice,points,played_hawk,risk_distribution,play_neighborhood,observed_neighborhood,grid_size,scaled_points
i64,i64,i64,i64,i64,str,i64,i32,str,i64,i64,i64,f64
8789,57,89,0,2,"""hawk""",687,1,"""skewed right""",8,4,5,150.657895
8789,57,89,1,7,"""dove""",689,0,"""skewed right""",8,4,5,151.096491
8789,57,89,2,2,"""hawk""",771,1,"""skewed right""",8,4,5,169.078947
8789,57,89,3,7,"""dove""",608,0,"""skewed right""",8,4,5,133.333333
8789,57,89,4,9,"""dove""",687,0,"""skewed right""",8,4,5,150.657895
8789,57,89,5,8,"""dove""",714,0,"""skewed right""",8,4,5,156.578947
8789,57,89,6,5,"""hawk""",628,1,"""skewed right""",8,4,5,137.719298
8789,57,89,7,7,"""dove""",686,0,"""skewed right""",8,4,5,150.438596
8789,57,89,8,3,"""dove""",603,0,"""skewed right""",8,4,5,132.236842
8789,57,89,9,6,"""dove""",693,0,"""skewed right""",8,4,5,151.973684


First plot unscaled wealth distribution by risk attitude.

In [16]:
# our data has a lot of rows; enable vegafusion so altair can calculate quartiles for us
alt.data_transformers.enable("vegafusion")

alt.Chart(agents_last_round_df).mark_boxplot(extent="min-max").encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),
    y=alt.Y("points", title="Wealth"),
).properties(title="Cumulative Wealth (unscaled) by Risk Attitude")

Now plot scaled wealth distribution by risk attitude.

In [17]:
wealthchart_title = alt.TitleParams(
    "Cumulative Wealth by Risk Attitude",
    subtitle=["Wealth scaled by simulation length and play neighborhood"]
)

alt.Chart(agents_last_round_df, title=wealthchart_title).mark_boxplot(extent="min-max").encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),
    y=alt.Y("scaled_points", title="Wealth"),
)

In [18]:
# Wanted to try a violin plot for comparison, but Altair/Vega can't handle the scale of our data,
# and I couldn't find an example of how to precompute the values to feed into an Altair mark area chart.


For comparison's sake, plot wealth at round 31 across all simulations, scaled only by play neighborhood.

This distribution looks similar to the scaled wealth plot.    

In [19]:
# what if we look at wealth at round 31 across simulations?

agents_round31_df = df.filter(pl.col("Step").eq(31)).join(model_df, on=["RunId", "iteration"]).with_columns(
    scaled_points=pl.col("points").truediv(pl.col("play_neighborhood")).mul(8)
)

alt.Chart(agents_round31_df).mark_boxplot(extent="min-max").encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),
    y=alt.Y("scaled_points", title="Wealth"),
).properties(title="Cumulative wealth at round 31 across runs")

Calculate quartiles and other statistics, and output as a table for reference.


In [20]:
# Altair boxplot is calculating quartiles for us, but we can calculate them directly as well

wealth_by_risk_attitude = agents_last_round_df.group_by("risk_attitude").agg(
    min=pl.col("scaled_points").min(), 
    max=pl.col("scaled_points").max(),
    mean=pl.col("scaled_points").mean(), 
    Q1=pl.col("scaled_points").quantile(0.25),
    Q2=pl.col("scaled_points").quantile(0.5),
    Q3=pl.col("scaled_points").quantile(0.75),
).sort("risk_attitude")

In [21]:
# output wealth by risk attitude as nicely styled table

(wealth_by_risk_attitude
    .rename({"risk_attitude": "Risk Attitude"})
    .style.tab_header(title="Cumulative Wealth by Risk Attitude", subtitle="Wealth scaled by simulation length and play neighborhood")
    .fmt_number(decimals=1)
    .fmt_number("Risk Attitude", decimals=0)
)

Cumulative Wealth by Risk Attitude
Wealth scaled by simulation length and play neighborhood
Risk Attitude,min,max,mean,Q1,Q2,Q3
0,0.0,300.0,148.7,114.9,149.7,174.6
1,0.0,300.0,144.0,113.2,147.6,167.8
2,0.0,300.0,145.4,112.5,147.6,170.6
3,0.0,300.0,143.4,110.8,137.5,166.0
4,0.0,300.0,141.7,109.9,134.7,157.3
5,1.7,300.0,139.4,114.3,137.5,150.8
6,43.2,300.0,143.3,125.4,143.0,154.3
7,64.7,300.0,147.0,137.0,149.3,158.1
8,87.1,300.0,151.0,140.3,150.4,161.7
9,96.8,203.2,150.3,141.4,150.0,160.9
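
One simple statistic for comparing the spread between risk attitudes is the interquartile range (Q3 minus Q1). The sketch below computes it from the Q1 and Q3 values copied from the table above; the dictionary layout and variable names are assumptions for illustration, and it is only one possible way to summarize the difference.

```python
# (Q1, Q3) of scaled wealth by risk attitude, copied from the table above
quartiles = {
    0: (114.9, 174.6),
    1: (113.2, 167.8),
    2: (112.5, 170.6),
    3: (110.8, 166.0),
    4: (109.9, 157.3),
    5: (114.3, 150.8),
    6: (125.4, 154.3),
    7: (137.0, 158.1),
    8: (140.3, 161.7),
    9: (141.4, 160.9),
}

# Interquartile range per risk attitude, rounded to match the table's precision
iqr = {risk: round(q3 - q1, 1) for risk, (q1, q3) in quartiles.items()}

for risk, spread in iqr.items():
    print(f"R={risk}: IQR = {spread}")
```

On these figures the interquartile range narrows from 54.6 for R=1 to 21.4 for R=8, so the middle half of wealth outcomes is more tightly clustered for the more risk-avoidant agents.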


### Analysis filtered by simulation parameters

How does the wealth distribution vary based on other starting parameters?

#### Grid size

In [22]:
alt.Chart(agents_last_round_df).mark_boxplot(extent="min-max").encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),
    y=alt.Y("scaled_points", title="Wealth"),
).facet(column=alt.Column("grid_size", title="Grid Size")).properties(title=wealthchart_title)

In [83]:
wealthchart_title

TitleParams({
  subtitle: ['Wealth scaled by simulation length and play neighborhood'],
  text: 'Cumulative Wealth by Risk Attitude'
})

In [159]:
# calculate mean & quartiles for risk attitude by grid size
wealth_by_risk_grid = agents_last_round_df.group_by("risk_attitude", "grid_size").agg(
    min=pl.col("scaled_points").min(), 
    max=pl.col("scaled_points").max(),
    mean=pl.col("scaled_points").mean(), 
    Q1=pl.col("scaled_points").quantile(0.25),
    Q2=pl.col("scaled_points").quantile(0.5),
    Q3=pl.col("scaled_points").quantile(0.75),
).sort("grid_size", "risk_attitude")


selection = alt.selection_point(fields=['grid_size'], bind='legend')

wealth_mean_grid = alt.Chart(wealth_by_risk_grid).mark_line(interpolate="monotone").encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),
    y=alt.Y("mean", title="Wealth (mean)").scale(zero=False),
    color=alt.Color("grid_size:N", title="Grid Size"),
    opacity=alt.when(selection).then(alt.value(1.0)).otherwise(alt.value(0.4))
).add_params(
    selection
)

wealth_spread_grid = alt.Chart(wealth_by_risk_grid).mark_area(interpolate="monotone").encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),
    y=alt.Y("Q3").scale(zero=False),
    y2="Q1",
    color=alt.Color("grid_size:N", title="Grid Size"),
    opacity=alt.when(selection).then(alt.value(0.3)).otherwise(alt.value(0.1))
).add_params(
    selection
)

# combine the charts for multiple ways to view
grid_wealth_title = wealthchart_title.copy()
grid_wealth_title['text'] += " and Grid Size — Mean and Quartiles"

(wealth_mean_grid | wealth_spread_grid | (wealth_mean_grid + wealth_spread_grid)
).resolve_legend(color="shared").properties(title=grid_wealth_title)

Click on items in the Grid Size legend to change the chart opacity to focus on a particular group.

In [160]:
# turn  grid size into label for context in the table
(wealth_by_risk_grid.with_columns(grid_size=pl.lit("Grid Size: ").add(pl.col("grid_size").cast(pl.datatypes.String)))
    .rename({"risk_attitude": "Risk Attitude"})
    .style.tab_header(title="Cumulative Wealth by Risk Attitude and Grid Size", subtitle="Wealth scaled by simulation length and play neighborhood")
     .tab_stub(rowname_col="Risk Attitude", groupname_col="grid_size")
    .fmt_number(decimals=1)
    .fmt_number("Risk Attitude", decimals=0)
)

Cumulative Wealth by Risk Attitude and Grid Size
Wealth scaled by simulation length and play neighborhood
Risk Attitude,min,max,mean,Q1,Q2,Q3
Grid Size: 5
0,0.0,300.0,148.2,125.0,150.0,162.0
1,0.0,300.0,142.6,113.2,148.6,161.6
2,0.0,300.0,143.5,109.4,147.6,167.9
3,1.4,300.0,141.0,104.4,137.0,162.1
4,2.4,300.0,139.7,103.2,133.0,156.9
5,4.2,300.0,137.4,104.8,137.2,150.4
6,64.0,300.0,140.4,119.8,141.4,152.4
7,79.9,300.0,144.9,135.0,148.3,156.0
8,88.8,300.0,149.3,139.1,150.0,160.0


#### Play neighborhood

In [115]:
alt.Chart(agents_last_round_df).mark_boxplot(extent="min-max").encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),
    y=alt.Y("scaled_points", title="Wealth"),
).facet(column=alt.Column("play_neighborhood", title="Play Neighborhood")
).properties(title=wealthchart_title)

In [141]:
# calculate mean & quartiles for risk attitude by play neighborhood
wealth_by_risk_playnhood = agents_last_round_df.group_by("risk_attitude", "play_neighborhood").agg(
    min=pl.col("scaled_points").min(), 
    max=pl.col("scaled_points").max(),
    mean=pl.col("scaled_points").mean(), 
    Q1=pl.col("scaled_points").quantile(0.25),
    Q2=pl.col("scaled_points").quantile(0.5),
    Q3=pl.col("scaled_points").quantile(0.75),
).sort("play_neighborhood", "risk_attitude")


playn_selection = alt.selection_point(fields=['play_neighborhood'], bind='legend')


wealth_mean_playnhood = alt.Chart(wealth_by_risk_playnhood).mark_line(interpolate="monotone").encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),
    y=alt.Y("mean", title="Wealth (mean)").scale(zero=False),
    color=alt.Color("play_neighborhood:N", title="Play Neighborhood"),
    opacity=alt.when(playn_selection).then(alt.value(1.0)).otherwise(alt.value(0.4))
).add_params(
    playn_selection
)

wealth_spread_playnhood = alt.Chart(wealth_by_risk_playnhood).mark_area(interpolate="monotone").encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),
    y=alt.Y("Q3").scale(zero=False),
    y2="Q1",
    color=alt.Color("play_neighborhood:N", title="Play Neighborhood"),
    opacity=alt.when(playn_selection).then(alt.value(0.3)).otherwise(alt.value(0.1))
).add_params(
    playn_selection
)

playn_wealth_title = wealthchart_title.copy()
playn_wealth_title['text'] += " and Play Neighborhood — Mean and Quartiles"

# combine the charts for multiple ways to view
(wealth_mean_playnhood | wealth_spread_playnhood | (wealth_mean_playnhood + wealth_spread_playnhood)
).resolve_legend(color="shared").properties(title=playn_wealth_title)

Click on items in the Play Neighborhood legend to change the chart opacity to focus on a particular group.

In [161]:
(wealth_by_risk_playnhood.with_columns(play_neighborhood=pl.lit("Play Neighborhood: ").add(pl.col("play_neighborhood").cast(pl.datatypes.String)))
    .rename({"risk_attitude": "Risk Attitude"})
    .style.tab_header(title="Cumulative Wealth by Risk Attitude and Play Neighborhood", subtitle="Wealth scaled by simulation length and play neighborhood")
     .tab_stub(rowname_col="Risk Attitude", groupname_col="play_neighborhood")
    .fmt_number(decimals=1)
    .fmt_number("Risk Attitude", decimals=0)
)

Cumulative Wealth by Risk Attitude and Play Neighborhood
Wealth scaled by simulation length and play neighborhood
Risk Attitude,min,max,mean,Q1,Q2,Q3
Play Neighborhood: 4
0,0.0,300.0,151.7,79.8,150.0,220.2
1,0.0,300.0,148.6,88.7,149.5,191.1
2,0.0,300.0,150.6,99.2,149.2,212.1
3,0.0,300.0,147.7,101.8,129.5,186.6
4,0.0,300.0,145.4,102.1,125.5,175.0
5,1.7,300.0,140.1,104.7,126.6,151.6
6,43.2,300.0,144.4,113.7,137.3,161.3
7,64.7,300.0,146.4,125.8,148.7,162.3
8,87.7,300.0,150.6,128.2,150.0,169.4


#### Observed neighborhood

In [118]:
alt.Chart(agents_last_round_df).mark_boxplot(extent="min-max").encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),
    y=alt.Y("scaled_points", title="Wealth"),
).facet(column=alt.Column("observed_neighborhood", title="Observed Neighborhood")
).properties(title=wealthchart_title)

In [143]:
# calculate mean & quartiles for risk attitude by observed neighborhood
wealth_by_risk_obsvnhood = agents_last_round_df.group_by("risk_attitude", "observed_neighborhood").agg(
    min=pl.col("scaled_points").min(), 
    max=pl.col("scaled_points").max(),
    mean=pl.col("scaled_points").mean(), 
    Q1=pl.col("scaled_points").quantile(0.25),
    Q2=pl.col("scaled_points").quantile(0.5),
    Q3=pl.col("scaled_points").quantile(0.75),
).sort("observed_neighborhood", "risk_attitude")


obsvn_selection = alt.selection_point(fields=['observed_neighborhood'], bind='legend')


wealth_mean_obsvnhood = alt.Chart(wealth_by_risk_obsvnhood).mark_line(interpolate="monotone").encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),
    y=alt.Y("mean", title="Wealth (mean)").scale(zero=False),
    color=alt.Color("observed_neighborhood:N", title="Observed Neighborhood"),
    opacity=alt.when(obsvn_selection).then(alt.value(1.0)).otherwise(alt.value(0.4))
).add_params(
    obsvn_selection
)

wealth_spread_obsvnhood = alt.Chart(wealth_by_risk_obsvnhood).mark_area(interpolate="monotone").encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),
    y=alt.Y("Q3").scale(zero=False),
    y2="Q1",
    color=alt.Color("observed_neighborhood:N", title="Observed Neighborhood"),
    opacity=alt.when(obsvn_selection).then(alt.value(0.3)).otherwise(alt.value(0.1))
).add_params(
    obsvn_selection
)

obsvn_wealth_title = wealthchart_title.copy()
obsvn_wealth_title['text'] += " and Observed Neighborhood — Mean and Quartiles"

# combine the charts for multiple ways to view
(wealth_mean_obsvnhood | wealth_spread_obsvnhood | (wealth_mean_obsvnhood + wealth_spread_obsvnhood)
).resolve_legend(color="shared").properties(title=obsvn_wealth_title)

Click on items in the Observed Neighborhood legend to change the chart opacity to focus on a particular group.

In [162]:
(wealth_by_risk_obsvnhood.with_columns(observed_neighborhood=pl.lit("Observed Neighborhood: ").add(pl.col("observed_neighborhood").cast(pl.datatypes.String)))
    .rename({"risk_attitude": "Risk Attitude"})
    .style.tab_header(title="Cumulative Wealth by Risk Attitude and Observed Neighborhood", subtitle="Wealth scaled by simulation length and play neighborhood")
     .tab_stub(rowname_col="Risk Attitude", groupname_col="observed_neighborhood")
    .fmt_number(decimals=1)
    .fmt_number("Risk Attitude", decimals=0)
)

Cumulative Wealth by Risk Attitude and Observed Neighborhood
Wealth scaled by simulation length and play neighborhood
Risk Attitude,min,max,mean,Q1,Q2,Q3
Observed Neighborhood: 4
0,0.0,300.0,153.1,125.5,150.0,183.9
1,38.7,300.0,151.0,118.2,148.6,174.3
2,38.7,300.0,154.9,124.6,149.6,183.1
3,67.7,300.0,152.8,124.2,145.3,168.5
4,64.4,300.0,154.6,124.9,147.7,169.8
5,71.8,300.0,148.2,127.3,144.0,154.4
6,69.4,300.0,150.5,129.7,145.8,157.7
7,82.8,300.0,148.8,137.7,149.4,156.5
8,87.1,300.0,150.8,138.3,150.0,158.6


#### Initial risk distribution

In [117]:
alt.Chart(agents_last_round_df).mark_boxplot(extent="min-max").encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),
    y=alt.Y("scaled_points", title="Wealth"),
).facet(facet=alt.Facet("risk_distribution", title="Initial Risk Distribution"), columns=3
).properties(title=wealthchart_title)

In [130]:
# calculate mean & quartiles for risk attitude by initial risk distribution
wealth_by_risk_dist = agents_last_round_df.group_by("risk_attitude", "risk_distribution").agg(
    min=pl.col("scaled_points").min(), 
    max=pl.col("scaled_points").max(),
    mean=pl.col("scaled_points").mean(), 
    Q1=pl.col("scaled_points").quantile(0.25),
    Q2=pl.col("scaled_points").quantile(0.5),
    Q3=pl.col("scaled_points").quantile(0.75),
).sort("risk_distribution", "risk_attitude")

riskdist_selection = alt.selection_point(fields=['risk_distribution'], bind='legend')

wealth_mean_riskdist = alt.Chart(wealth_by_risk_dist).mark_line(interpolate="monotone").encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),
    y=alt.Y("mean", title="Wealth (mean)").scale(zero=False),
    color=alt.Color("risk_distribution:N", title="Distribution"),
    opacity=alt.when(riskdist_selection).then(alt.value(1.0)).otherwise(alt.value(0.4))
).add_params(
    riskdist_selection
)

wealth_spread_riskdist = alt.Chart(wealth_by_risk_dist).mark_area(interpolate="monotone").encode(
    x=alt.X("risk_attitude", title="Risk Attitude").scale(domain=[0, 9]),
    y=alt.Y("Q3").scale(zero=False),
    y2="Q1",
    color=alt.Color("risk_distribution:N", title="Distribution"),
    opacity=alt.when(riskdist_selection).then(alt.value(0.3)).otherwise(alt.value(0.1))
).add_params(
    riskdist_selection
)

riskdist_wealth_title = wealthchart_title.copy()
riskdist_wealth_title['text'] += " and Initial Risk Distribution — Mean and Quartiles"

# combine the charts for multiple ways to view
(wealth_mean_riskdist | wealth_spread_riskdist | (wealth_mean_riskdist + wealth_spread_riskdist)
).resolve_legend(color="shared").properties(title=riskdist_wealth_title)

Click on items in the Distribution legend to change the chart opacity to focus on a particular group.

In [163]:
# output values as a nice table so we can reference them if needed
(wealth_by_risk_dist.with_columns(risk_distribution=pl.lit("Initial Risk Distribution: ").add(pl.col("risk_distribution")))
    .rename({"risk_attitude": "Risk Attitude"})
    .style.tab_header(title="Cumulative Wealth by Risk Attitude and Initial Risk Distribution", subtitle="Wealth scaled by simulation length and play neighborhood")
     .tab_stub(rowname_col="Risk Attitude", groupname_col="risk_distribution")
    .fmt_number(decimals=1)
    .fmt_number("Risk Attitude", decimals=0)
)

Cumulative Wealth by Risk Attitude and Initial Risk Distribution
Wealth scaled by simulation length and play neighborhood
Risk Attitude,min,max,mean,Q1,Q2,Q3
Initial Risk Distribution: bimodal
0,0.0,300.0,152.1,124.8,150.0,185.5
1,0.0,300.0,153.1,125.2,150.0,185.5
2,0.0,300.0,153.3,125.4,150.0,185.5
3,0.0,300.0,156.0,126.2,150.0,185.1
4,1.4,300.0,158.7,137.1,150.0,184.7
5,8.9,300.0,157.1,137.9,150.0,167.7
6,74.2,300.0,153.9,138.0,150.0,161.7
7,96.8,300.0,150.4,138.3,150.0,158.9
8,96.8,300.0,150.1,138.3,150.0,158.9
