In [None]:
import networkx as nx
import itertools
import numbers
import numpy as np
import matplotlib.pyplot as plt

# Strategic Network Formation

#### Economic Game Theoretic Models of Network Formation

- Costs and benefits for each agent associated with each network
- Agents choose links
- Contrast incentives and social efficiency

#### Modeling Choices

How should we model incentives to form and sever links?
- is consensus needed (undirected/directed)?
- can they coordinate changes in the network?
- is the process dynamic or static?
- how sophisticated are agents?
- what do they know when making a decision?
- do they make errors?
- what happens on the network?
- can they compensate each other for relationships?
- are links adjustable in intensity?

#### Some Questions

- Which networks are likely to form?
- Are some more stable than others to various perturbations?
- Are the networks that form efficient?
- How inefficient are they if they are not efficient?
- Can intervention help improve efficiency?
- Can such models provide insight into observed characteristics of networks?

## An Economic Analysis: Jackson-Wolinsky

- $u_i(g)$ ‐ payoff to $i$ if the network is $g$
- undirected network formation
- $0 \leq \delta \leq 1$ a benefit parameter for $i$ from a connection between $i$ and $j$
- $0 \leq c_{ij}$ cost to $i$ of a link to $j$
- $l(i,j)$ shortest path length between $i$ and $j$
- $u_i(g)= \sum_{j \neq i} \delta^{l(i,j)} - \sum_{j \in N_i(g)} c_{ij}$
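The payoff above can be sketched in a few lines. This is a minimal illustration, assuming a uniform link cost $c_{ij}=c$ and treating unreachable nodes as contributing zero benefit; the name `connections_utility` is illustrative and not part of the course code.

```python
import networkx as nx

def connections_utility(g, i, delta, c):
    """Payoff u_i(g) in the connections model, assuming a uniform link cost c."""
    # Shortest-path lengths from i to every node it can reach (i itself is at distance 0).
    dist = nx.single_source_shortest_path_length(g, i)
    # Nodes that i cannot reach are absent from dist and so add no benefit.
    benefit = sum(delta ** l for j, l in dist.items() if j != i)
    return benefit - c * g.degree(i)

star = nx.star_graph(3)  # node 0 is the center, nodes 1..3 are the leaves
u_center = connections_utility(star, 0, delta=0.5, c=0.2)  # 3*0.5 - 3*0.2
u_leaf = connections_utility(star, 1, delta=0.5, c=0.2)    # 0.5 + 2*0.25 - 0.2
print(u_center, u_leaf)
```

The center pays for three links and only gets direct benefits, while each leaf pays for one link and reaches the other leaves at distance 2.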

#### Questions

- Which networks are best for society?
- Which networks are formed by the agents?

### Pairwise Stability and Efficiency

- What if we model formation as a game where each agent announces who they wish to link to, and a link forms if and only if both agents name each other?
- Nash equilibrium: no agent can gain from changing his/her action


### Pairwise Stability
- no agent gains from severing a link – relationships must be beneficial to be maintained
$$u_i(g) \geq u_i(g-ij) \text{ for } i \text{ and }ij \in g$$
- no two agents both gain from adding a link (at least one strictly) – beneficial relationships are pursued when available
$$u_i(g+ij) > u_i(g) \text{ implies } u_j(g+ij) < u_j(g) \text{ for } ij \notin g$$
- a weak concept, but often narrows things down
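The two conditions can be checked by brute force on small networks. The sketch below assumes the connections-model payoff with a uniform link cost $c$; `utility` and `is_pairwise_stable` are illustrative names, and the tolerance `tol` is only there to absorb floating-point noise.

```python
import itertools
import networkx as nx

def utility(g, i, delta, c):
    """Connections-model payoff with a uniform link cost c (an assumption of this sketch)."""
    dist = nx.single_source_shortest_path_length(g, i)
    return sum(delta ** l for j, l in dist.items() if j != i) - c * g.degree(i)

def is_pairwise_stable(g, delta, c, tol=1e-12):
    # Condition 1: no agent strictly gains by severing one of their own links.
    for i, j in list(g.edges()):
        h = g.copy()
        h.remove_edge(i, j)
        for k in (i, j):
            if utility(h, k, delta, c) > utility(g, k, delta, c) + tol:
                return False
    # Condition 2: no missing link helps one agent strictly without hurting the other.
    for i, j in itertools.combinations(g.nodes(), 2):
        if g.has_edge(i, j):
            continue
        h = g.copy()
        h.add_edge(i, j)
        gain_i = utility(h, i, delta, c) - utility(g, i, delta, c)
        gain_j = utility(h, j, delta, c) - utility(g, j, delta, c)
        if (gain_i > tol and gain_j >= -tol) or (gain_j > tol and gain_i >= -tol):
            return False
    return True

# With delta = 0.5, delta - delta^2 = 0.25, so c = 0.3 lies between delta - delta^2 and delta.
print(is_pairwise_stable(nx.star_graph(3), delta=0.5, c=0.3))
print(is_pairwise_stable(nx.complete_graph(4), delta=0.5, c=0.3))
```

Only single-link deviations are examined, which is what makes the concept weak: an agent who would like to drop several links at once is not considered.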

### Efficiency

- Pareto efficient $g$: there does not exist $g'$ s.t.
$$u_i(g') \geq u_i(g) \text{ for all }i\text{, strict for some}$$
- Efficient $g$ (Pareto if transfers):
$$ g = \operatorname*{arg\,max}_{g'} \sum_i u_i(g')$$

### Symmetric Connection Model:

- benefit from a friend is $\delta<1$
- benefit from a friend of a friend is $\delta^2$,...
- cost of a link is $c>0$

#### Efficient Networks

- low cost: $c < \delta - \delta^2$ 
    - complete network is uniquely efficient
- medium cost: $\delta - \delta^2 < c < \delta + (n - 2) \delta^2/2$
    - star networks with all agents are uniquely efficient
- high cost: $\delta + (n - 2) \delta^2/2 < c$
    - empty network is uniquely efficient
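For small $n$ the three cost regions can be checked by enumerating every network and picking the one with the highest total payoff. This is a rough sketch, assuming $n=4$ and $\delta=0.5$ (so the thresholds are $0.25$ and $0.75$); `total_welfare`, `efficient_network` and `degree_sequence` are illustrative helper names.

```python
import itertools
import networkx as nx

def total_welfare(g, delta, c):
    """Sum of connections-model payoffs over all agents (uniform link cost c)."""
    total = 0.0
    for i in g.nodes():
        dist = nx.single_source_shortest_path_length(g, i)
        total += sum(delta ** l for j, l in dist.items() if j != i) - c * g.degree(i)
    return total

def efficient_network(n, delta, c):
    """Brute-force search over all 2^(n(n-1)/2) networks; only feasible for small n."""
    pairs = list(itertools.combinations(range(n), 2))
    best, best_w = None, float("-inf")
    for r in range(len(pairs) + 1):
        for edges in itertools.combinations(pairs, r):
            g = nx.empty_graph(n)
            g.add_edges_from(edges)
            w = total_welfare(g, delta, c)
            if w > best_w + 1e-12:  # keep the first maximizer found
                best, best_w = g, w
    return best, best_w

def degree_sequence(g):
    return sorted(d for _, d in g.degree())

for c in (0.1, 0.5, 0.9):  # low, medium, high cost for n = 4, delta = 0.5
    g, w = efficient_network(4, 0.5, c)
    print(c, degree_sequence(g), round(w, 3))
```

The degree sequence identifies the shape: all degrees $n-1$ for the complete network, one node of degree $n-1$ and the rest of degree 1 for a star, and all zeros for the empty network.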

#### Pairwise Stability

- low cost: $c< \delta - \delta^2$
    - complete network is pairwise stable
- medium/low cost: $\delta - \delta^2 < c < \delta$
    - star network is pairwise stable
    - others are also pairwise stable
- medium/high cost: $\delta < c < \delta +(n-2)\delta^2/2$
    - star network is not pairwise stable (no loose ends)
    - nonempty pairwise stable networks are over-connected and may include too few agents
- high cost: $\delta +(n-2)\delta^2/2 < c$
    - empty network is pairwise stable

#### Time for Demo

In [None]:
import net_formation_sym
from jupyter_dash import JupyterDash

app = JupyterDash(__name__, meta_tags=[{"name": "viewport", "content": "width=device-width, initial-scale=1"}])
net_formation_sym.create_net_formation_app(app)
app.run_server(mode="inline")

### Externalities

- Positive:
$$u_k(g+ij) \geq u_k(g) \text{ if }ij \text{ not in }g\text{ for every }k\neq i,j$$
- Negative:
$$u_k(g+ij) \leq u_k(g) \text{ if }ij \text{ not in }g\text{ for every }k\neq i,j$$

#### Coauthor model (example for negative externalities)

Agents get value from research collaboration
- value for each relationship depends on time each puts into it
- plus an interaction term, which is product of the times spent
$$u_i(g) = \sum_{j: ij \in g} \left(\frac{1}{d_i} +\frac{1}{d_j} +\frac{1}{d_i d_j}\right)
= 1+ \sum_{j: ij \in g} \left(\frac{1}{d_j} +\frac{1}{d_i d_j}\right)$$
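A small numerical check makes the negative externality visible. The sketch below assumes the payoff formula above with isolated agents getting zero; `coauthor_utility` is an illustrative name.

```python
import networkx as nx

def coauthor_utility(g, i):
    """Coauthor-model payoff; an agent with no links gets 0 in this sketch."""
    d_i = g.degree(i)
    return sum(
        1 / d_i + 1 / g.degree(j) + 1 / (d_i * g.degree(j))
        for j in g.neighbors(i)
    )

g = nx.empty_graph(3)
g.add_edge(0, 1)
u0_before = coauthor_utility(g, 0)  # 1 + 1 + 1

# Agent 1 takes on a second coauthor, so agent 0 gets less of agent 1's time.
g.add_edge(1, 2)
u0_after = coauthor_utility(g, 0)   # 1 + 1/2 + 1/2
print(u0_before, u0_after)
```

Agent 0 is not part of the new link between 1 and 2, yet its payoff falls, which is exactly the negative-externality condition with $k=0$.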

#### Time for Demo

#### Key takeaways

- Positive externalities – under‐connected
- Negative externalities – over‐connected

### Network Formation and Transfers
- Stable and Efficient only coincide in special cases
- Can transfers help in other cases?
- What can we say about when conflict exists?
- What can we say about when transfers improve efficiency?
- Are transfers in players’ interests?

#### What are Transfers?
- Outside intervention, taxing or subsidizing relationships – e.g., government support of R&D relationships
- Bargaining among the individuals involved in the relationships
- Favors exchanged among friends...

#### Modeling Transfers
- Change utilities from $u_i(g)$ to $u_i(g)+t_i(g)$
- E.g., in the connections model, peripheral players pay the center of the star to maintain its connections

#### Egalitarian Transfers

- Set $t_i(g) = \frac{1}{n}\sum_j u_j(g) - u_i(g)$
- Then $u_i(g) + t_i(g) = \frac{1}{n}\sum_j u_j(g)$
- Now every agent has societal incentives
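As a quick worked example, take the payoffs of a 4-node star with $\delta=0.5$, $c=0.2$ (center $0.9$, leaves $0.8$ each; these numbers are an assumed illustration). The helper name `egalitarian_transfers` is hypothetical.

```python
def egalitarian_transfers(utilities):
    """t_i = (average payoff) - u_i, so every agent ends up with the average."""
    mean_u = sum(utilities) / len(utilities)
    return [mean_u - u for u in utilities]

u = [0.9, 0.8, 0.8, 0.8]           # center first, then the three leaves
t = egalitarian_transfers(u)
after = [ui + ti for ui, ti in zip(u, t)]
print(t)      # the center pays, the leaves receive
print(after)  # everyone gets the average payoff
```

The transfers sum to zero, so they only redistribute value, and each agent's payoff now moves one-for-one with total welfare divided by $n$.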

#### Transfers can Fail
Put in some basic requirements on transfers:
- completely isolated nodes that generate no value get 0
- nodes that are completely interchangeable get the same transfers

Under these requirements, transfers cannot always help

### Summary So Far
- Efficient networks take some simple forms in a variety of models
- Efficient networks and pairwise stable need not coincide
- Transfers may help, but not always without violating some basic conditions

### Heterogeneity in Strategic Models

- Costs depend on geography and characteristics of nodes
    - easier to be friends with neighbors
    - easier to relate to people with similar background
- Benefits depend on characteristics of nodes
    - synergies from working together, trading, sharing risk, exchanging favors..
    - complementarities: benefits from diversity...

#### Can economic models match observables?

Small worlds derived from costs/benefits
- low costs to local links – high clustering
- high value to distant connections – low diameter
- high cost of distant connections – few distant links

#### Islands connections model

- $J$ players live on each island, $K$ islands
- cost $c$ of a link to a player on the same island
- cost $C>c$ of a link to a player on another island

Results:
- High clustering within islands, few links across
- small distances

### SWOT of an economic approach

#### Strength
- Payoffs allow for a welfare analysis
    - Identify tradeoffs – incentives versus efficiency
- Tie the nature of externalities to network formation…
- Put network structures in context
- Account for and explain some observables

#### Challenges
- Stark (overly regular) network structures emerge
    - need some heterogeneity
    - simulations help in fitting
- May over-emphasize choice versus chance for some (especially large) applications
- How to identify payoff structure in applications?
    - relating network structure and outcomes, payoffs

#### Models that marry strategic with random are needed
- Weaknesses of the random approach are strengths of the economic approach, and vice versa.
- Mixed models
    - allow for welfare/efficiency analysis
    - take model to data and fit observed networks
    - do so across applications

## Refining pairwise stability

- Beyond Pairwise Stability ‐ Allowing other deviations
    - multiple links by individuals
    - coordinated deviations
- Existence questions
- Dynamics
- Stochastic Stability
- Forward looking behavior
- Directed Networks

### Nash equilibrium

- Players simultaneously announce their preferred set of neighbors $S_i$
- $g(S)=\{ ij : j \in S_i \text{ and } i \in S_j \}$
- Nash stable: $u_i(g(S)) \geq u_i(g(S'_i,S_{-i})) \text{ for all }S'_i$
- So, $g$ is Nash stable if and only if no player wants to delete some set of his or her links

## Dynamic Strategic Network Formation

### Dynamic Strategic Models
- Explicitly model dynamics and incentives
    - Realism(?)
    - Refine static stable models
    - Incorporate forward looking nature
- Very different approaches:
    - Myopic and error prone
    - Fully forward looking and calculating

### A Dynamic Process
- Natural dynamics: link is picked at random
    - added if it benefits both players (at least one strictly)
    - deleted if it benefits either to delete it
- Will find pairwise stable networks (if they exist)
- Even if efficient networks are pairwise stable, may have low chance of reaching them...
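The process described above is easy to simulate. This is a sketch under assumptions: payoffs come from the connections model with a uniform link cost, the process starts from the empty network, and it simply runs for a fixed number of steps; `myopic_dynamics` is an illustrative name.

```python
import itertools
import random
import networkx as nx

def utility(g, i, delta, c):
    dist = nx.single_source_shortest_path_length(g, i)
    return sum(delta ** l for j, l in dist.items() if j != i) - c * g.degree(i)

def myopic_dynamics(n, delta, c, steps=2000, seed=0, tol=1e-12):
    rng = random.Random(seed)
    g = nx.empty_graph(n)  # assumed starting point
    pairs = list(itertools.combinations(range(n), 2))
    for _ in range(steps):
        i, j = rng.choice(pairs)  # a link is picked uniformly at random
        h = g.copy()
        if g.has_edge(i, j):
            h.remove_edge(i, j)
        else:
            h.add_edge(i, j)
        gains = [utility(h, k, delta, c) - utility(g, k, delta, c) for k in (i, j)]
        if g.has_edge(i, j):
            # Deleted if either endpoint strictly benefits.
            accept = max(gains) > tol
        else:
            # Added if both weakly benefit and at least one strictly.
            accept = min(gains) >= -tol and max(gains) > tol
        if accept:
            g = h
    return g

low_cost = myopic_dynamics(5, delta=0.5, c=0.2)   # c < delta - delta^2
high_cost = myopic_dynamics(5, delta=0.5, c=0.9)  # c > delta
print(low_cost.number_of_edges(), high_cost.number_of_edges())
```

With low cost every missing link is worth adding and none is worth cutting, so the process fills in the complete network; with $c>\delta$ no first link is ever formed from the empty network. Rerunning with $\delta-\delta^2<c<\delta$ and different seeds is one way to see how rarely the process ends at a star.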

### Drawback
Consider connections model where
$\delta-\delta^2 <c< \delta$, so that a star is efficient and
pairwise stable. As $n$ grows, the probability that
the above process stops at a star goes to 0.

## Evolution and Stochastics

### Improving path:
- Sequence of adjacent networks:
    - Link is added if it benefits both agents, at least one strictly
    - Link is deleted if either agent benefits from its deletion

### Stochastic Stability
- Add trembles/errors to improving paths:
- Start at some network and choose a link, with equal probability on all links:
    - Add that link if it is not present and both agents prefer to add it (at least one strictly)
    - Delete that link if it is present and one of the two agents prefers to delete it
    - Reverse the above decision with probability $\varepsilon>0$
- Finite state, irreducible, aperiodic Markov chain

### Benefit
More errors are needed to leave the basin of attraction of the complete network than to leave the basin of attraction of the empty network

## Directed Networks
- Formation game easy:
- Players simultaneously announce their preferred set of neighbors $S_i$
- $g(S)= \{ ij : j \in S_i\}$ keeping track of ordered pairs
- Nash equilibrium

### Flow of Payoffs?
- One way flow – get information but not vice versa
- Two way flow – one player bears the cost, but both benefit from the connection (link on the internet, phone call?)

### Two Way Flow
#### Efficiency 

As in the undirected connections model, except with $c/2$ in place of $c$ and a link in either direction (but not both)
- low cost: $c/2 < \delta-\delta^2$
    - complete networks
- medium cost: $\delta-\delta^2 < c/2 < \delta+(n-2)\delta^2/2$
    - star networks
- high cost: $\delta+(n-2)\delta^2/2 < c/2$
    - empty network

#### Nash Stable:

- low cost: $c< \delta-\delta^2$
    - two‐way complete networks are Nash stable
- medium/low cost: $\delta-\delta^2 < c < \delta$
    - all star networks are Nash stable, plus others
- medium/high cost: $\delta < c < \delta+(n-2)\delta^2/2$
    - peripherally sponsored star networks are Nash stable (no other stars, but sometimes other networks)
- the set of networks that are both efficient and stable can be empty:
    - $\delta - \delta^2 < c < 2(\delta - \delta^2)$: complete is efficient, but not an equilibrium

### One Way Flow

- Keep track of directed flows; in-links are not (always) useful
- An Example
$$u_i(g) = R_i(g) - d_i^{out}(g)c$$
 where $R_i(g)$ is the number of players reached by directed paths from $i$

#### Efficient Networks
- $n$-player "wheels" if $c< n-1$, empty otherwise

#### Stable Networks
- If $c<1$ then $n$-player wheels are the only strictly Nash stable networks
- If $1<c<n-1$ then $n$-player wheels and empty networks are the only strictly Nash stable networks
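To see why wheels do well under one-way flow, the example payoff can be evaluated on a directed cycle. A minimal sketch, assuming a wheel is the directed cycle $0 \to 1 \to \dots \to n-1 \to 0$; `one_way_utility` is an illustrative name.

```python
import networkx as nx

def one_way_utility(g, i, c):
    """u_i = (number of players reached by directed paths from i) - c * out-degree."""
    reached = len(nx.descendants(g, i))
    return reached - c * g.out_degree(i)

n, c = 5, 0.5
wheel = nx.cycle_graph(n, create_using=nx.DiGraph)  # directed cycle on n nodes
payoffs = [one_way_utility(wheel, i, c) for i in wheel.nodes()]
print(payoffs)

# If player 0 drops its single out-link, it reaches nobody.
deviation = wheel.copy()
deviation.remove_edge(0, 1)
print(one_way_utility(deviation, 0, c))
```

Each player pays for exactly one link and still reaches all $n-1$ others, so every payoff is $n-1-c$; dropping the link gives 0, which is worse whenever $c<n-1$.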

# Diffusion in Social and Economic Networks 

## Bass Model
- A benchmark model with no explicit social structure
- Two actions/states/behaviors 0 and 1
- $F(t)$ fraction of the population who have adopted action 1 at time $t$
- $p$ rate of spontaneous innovation/adoption
- $q$ rate of imitation of adoption
- dynamic defined by
$$\frac{dF}{dt}(t) = (p + q F(t))(1-F(t))$$
- solution is given by
$$F(t) = \frac{1-e^{-(p+q)t}}{1+qe^{-(p+q)t}/p}$$
- Gives S‐shape (if $q>p$) and tends to 1 in the limit
- Initially only $p$ matters, then $q$ takes over
- Eventually change slows as F(t) approaches 1

In [None]:
def s_shape_curve(p, q, t):
    """Closed-form Bass adoption curve F(t) for innovation rate p and imitation rate q."""
    e = np.exp(-(p + q) * t)
    return (1 - e) / (1 + q / p * e)

In [None]:
t = np.linspace(0, 30, num=1000)
f = s_shape_curve(0.01, 0.38, t)  # p = 0.01, q = 0.38, so q > p gives the S-shape
plt.plot(t, f, linewidth=2.0)
plt.xlabel("t")
plt.ylabel("F(t)")

## Random Networks and Diffusion

- Idea, disease, computer virus spreads via connections in the network
- Nodes are linked if one would "infect" the other
- Will an infection take hold?
- How many nodes/people will it reach?

### Questions:
- When do we get diffusion?
- What is the extent of diffusion?
- How does it depend on the particulars of the process as well as the network?
- Who is likely to be infected earliest?

### Component Structure
- Reach of contagion is determined by the component structure
- Some players or nodes are immune, some links fail to transmit...
- What do the components look like among susceptible nodes, keeping only the links that transmit?

### Extent of Diffusion
- Get nontrivial diffusion if someone in the giant component is infected/adopts
- Size of the giant component determines likelihood of diffusion and its extent
- Random network models allow for giant component calculations
- How big is the giant component when there is one?
    - Size of the giant component when $1/n< p < \log(n)/n$

### Calculating the Size of the Giant Component

- $q$ is fraction of nodes in largest component
- look at any node: chance it is in the giant component is $q$
- chance that this node is outside of the giant component is the chance that all of its neighbors are outside of the giant component
- Probability that a node is outside of the giant component = $1 - q$
- probability that all of its neighbors are outside = $(1-q)^d$ where $d$ is the node’s degree
- So, probability $1-q$ that a node is outside of the giant component is
$$1-q = \sum_{d} (1-q)^d P(d)$$
- Where $P(d)$ is the chance that the node has $d$ neighbors
$$P(d) = [ (n-1)^d / d! ] p^d e^{-(n-1)p}$$
- So
$$1-q = e^{-(n-1)p}\sum_{d}[(1-q) (n-1)p]^d / d!
= e^{-(n-1)p} e^{(n-1)p(1-q)}
= e^{-q(n-1)p}$$
- Or
$$ -\log(1-q) / q = (n-1) p = E[d]$$
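The equation has no closed-form solution for $q$, but it can be solved numerically by iterating $q \leftarrow 1 - e^{-E[d]\,q}$. The sketch below starts from $q=1$, which is an assumed choice that makes the iteration settle on the positive root when $E[d]>1$; `giant_component_fraction` is an illustrative name.

```python
import math

def giant_component_fraction(mean_degree, tol=1e-12, max_iter=100_000):
    """Solve q = 1 - exp(-mean_degree * q) by fixed-point iteration from q = 1."""
    q = 1.0
    for _ in range(max_iter):
        q_next = 1.0 - math.exp(-mean_degree * q)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q

q = giant_component_fraction(2.0)
print(q)                        # fraction of nodes in the giant component for E[d] = 2
print(-math.log(1 - q) / q)     # recovers E[d]
print(giant_component_fraction(0.5))  # E[d] < 1: the iteration collapses towards 0
```

Convergence is slow when $E[d]$ is close to 1, so the iteration cap matters near the threshold.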

### Who is infected?
- Probability of being in the giant component:
- $1-(1-q)^d$ increasing in $d$
- More connected, more likely to be infected (more likely to be infected at any point in time...)

### Extensions
- Immunity: delete a fraction of nodes and study the giant component on remaining nodes
- Probabilistic infection
    - Random infection: have some links fail, just lower p


### Contagion with Immunity and Link Failure
- Some node is initially exposed to infection
- $\pi$ of the nodes are immune naturally
- only some links result in contagion – fraction $f$
- What is the extent of the infection?

### Homework
Write Python code that:
- Considers a random network on $n$ nodes
- Deletes fraction $\pi$ of the nodes
- Deletes fraction $1-f$ of the links
- If the infection starts at a node in the giant component of the remaining network, then that giant component is the extent of the infection; otherwise the infection is negligible
- Let $q$ be the fraction of nodes of the remaining network in its giant component. Bootstrap its estimate on 100 simulations
- Compare $-\log(1-q)/q$ vs $(n-1)p(1-\pi)f$

## SIS Model
- An extensively studied model in epidemiology
- Allows nodes to change behaviors back and forth over time
- Model of catching some recurring diseases, who to vote for, etc.
- Nodes are infected or susceptible
- Probability of getting infected is proportional to the number of infected neighbors with rate $v>0$, plus a spontaneous rate $\varepsilon$
- Nodes recover randomly in any period at rate $\delta>0$
- Let $\rho$ be the fraction infected
- Start with benchmark where all players mix with even probabilities
- Randomly meet an individual each period
- Large Markov chain
- Steady state mean‐field: $\frac{d\rho}{dt} = 0$

### Mean‐Field calculation
- dynamics
$$\frac{d\rho}{dt} = (1-\rho)(v\rho+\varepsilon) - \rho\delta = 0$$
- steady state solution
$$\rho = [ (v-\delta-\varepsilon)+ ((v-\delta-\varepsilon)^2 +4 \varepsilon v)^\frac{1}{2} ] / (2v)$$
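The steady state is the positive root of the quadratic $v\rho^2 - (v-\delta-\varepsilon)\rho - \varepsilon = 0$, which is what the formula above gives. A short numerical check, with illustrative parameter values and the hypothetical helper names `sis_steady_state` and `drift`:

```python
import math

def sis_steady_state(v, delta, eps):
    """Positive root of (1 - rho)(v*rho + eps) - rho*delta = 0."""
    b = v - delta - eps
    return (b + math.sqrt(b * b + 4.0 * eps * v)) / (2.0 * v)

def drift(rho, v, delta, eps):
    """Right-hand side of the mean-field dynamics d(rho)/dt."""
    return (1.0 - rho) * (v * rho + eps) - rho * delta

rho_star = sis_steady_state(v=0.5, delta=0.2, eps=0.01)
print(rho_star, drift(rho_star, 0.5, 0.2, 0.01))  # drift is zero up to rounding

# With eps = 0 and v > delta the formula reduces to 1 - delta/v.
print(sis_steady_state(v=0.5, delta=0.2, eps=0.0))
```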

### Mean‐Field drop $\varepsilon$

- dynamics
$$\frac{d\rho}{dt} = (1-\rho)v\rho - \rho\delta=0$$
- two solutions
$$\rho = 1 - \delta/v \text{ (if } >0\text{)}$$
$$\rho = 0$$
- If $\delta > v$ then nodes recover faster than they get sick, and no infection persists
- Otherwise, the infection persists at some level; low recovery rates can lead to large infections

### Where’s the network?
- so far uniformly random interaction
- missing heterogeneity in degree
- missing local patterns
- we can at least address the first concern...

### Explore Degree Distribution Influence

- random matching with $d_i$ matches for node $i$
- $\rho(d)$ fraction of nodes of degree $d$ infected
- $\theta$ fraction of randomly chosen neighbors infected

### Chance that meet an infected node
- $P(d)$ fraction of nodes that have $d$ meetings
- More likely to meet someone who has high $d$
- likelihood of meeting node of degree $d$ is
$$P(d) d /E[d]$$
- So likelihood of meeting infected node is:
$$\theta = \sum_d \rho(d) P(d) d / E[d]$$
- Steady state: for each $d$
$$0 = \frac{d\rho}{dt}(d) = ( 1- \rho(d) )v\theta d - \rho(d) \delta$$
$$\rho(d) = \lambda\theta d / (\lambda\theta d + 1) \text{ where }\lambda= v/\delta$$
- Steady state infection rate of people you meet is the solution to
$$\theta = \sum_d \rho(d) P(d) d / E[d]
= \sum_d P(d) \lambda \theta d^2 /[ (\lambda \theta d + 1) E[d]]$$
- What can we say about how this depends on the "network structure"? How does the infection rate of neighbors $\theta$ depend on $P(d)$, $E[d]$?

### Properties of H
$$H(\theta) = \sum_d P(d) \lambda \theta d^2 /[ (\lambda \theta d + 1) E[d]]$$

- H is increasing
$$H'(\theta) = \sum_{d} P(d) \lambda d^2 /[ (\lambda \theta d + 1)^2 E[d]] > 0$$
- H is strictly concave
$$H''(\theta) = - 2 \sum_{d} P(d) \lambda^2 d^3/[ (\lambda\theta d + 1)^3 E[d]] < 0$$
- A nonzero steady state exists iff $H'(0)>1$, i.e.
$$H'(0) = \sum_{d} P(d) \lambda d^2 / E[d] = \lambda E[d^2]/E[d] > 1$$
$$\lambda > E[d]/E[d^2]$$
- So the infection/recovery ratio $\lambda$ must be high enough relative to the average degree divided by the second moment of the degree distribution (roughly, the variance)
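The fixed point $\theta = H(\theta)$ can be found numerically for any degree distribution. The sketch below represents $P(d)$ as a dictionary mapping degree to probability (an assumed encoding) and iterates $H$ from $\theta=1$; since $H$ is increasing and concave with $H(1)<1$, the iterates decrease to the largest fixed point. The names `epidemic_threshold`, `H` and `steady_state_theta` are illustrative.

```python
def epidemic_threshold(P):
    """E[d] / E[d^2] for a degree distribution given as {degree: probability}."""
    mean_d = sum(p * d for d, p in P.items())
    mean_d2 = sum(p * d * d for d, p in P.items())
    return mean_d / mean_d2

def H(theta, P, lam):
    mean_d = sum(p * d for d, p in P.items())
    return sum(
        p * lam * theta * d * d / ((lam * theta * d + 1.0) * mean_d)
        for d, p in P.items()
    )

def steady_state_theta(P, lam, tol=1e-12, max_iter=100_000):
    """Iterate theta <- H(theta) starting from theta = 1."""
    theta = 1.0
    for _ in range(max_iter):
        theta_next = H(theta, P, lam)
        if abs(theta_next - theta) < tol:
            return theta_next
        theta = theta_next
    return theta

regular = {4: 1.0}  # every node has degree 4
print(epidemic_threshold(regular))        # 1 / E[d]
print(steady_state_theta(regular, 0.5))   # lambda above the threshold
print(steady_state_theta(regular, 0.2))   # lambda below the threshold: goes to 0
```

For a regular network the fixed point can be solved by hand, $\theta = 1 - 1/(\lambda d)$, which gives a check on the iteration; swapping in a more dispersed $P(d)$ with the same mean lowers the threshold.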

### Conditions for Steady State
- Iff $\lambda > E[d]/E[d^2]$ have a nonzero steady state
- In a regular network, need $\lambda > 1/E[d]$
- In an E-R network, need $\lambda > 1/(1+E[d])$
- In a power-law network, $E[d^2]$ diverges – always have a nonzero steady state

### Ideas:
- High degree nodes are more prone to infection
- Serve as conduits
- Higher variance, more such nodes to enable infection

## Fitting a Diffusion Model to Data
- Map network structure via surveys, observe behavior
- Model diffusion and fit the model from observed networks and behaviors
- Know the set of initially informed nodes
- Informed nodes (repeatedly) pass information randomly to their neighbors over discrete times
- Once informed (just once), nodes choose to participate depending on their characteristics and their neighbors’ choices

### Questions
- What determines behavior:
    - Pure access to information (no strategic effects)?
    - Complementarities (strategic effects)?
- Are non‐participants important in diffusion?
    - Model information passing by participants (usual contagion)
    - Information passing by non‐participants too
- Estimate structural models of diffusion and behavior