*Note: This script is an effort to replicate the results from the paper "Talk of the Network: A Complex Systems Look at the Underlying Process of Word-of-Mouth", Goldenberg, Libai and Muller (2001). This is a self-didactic attempt.*

In [7]:
using Distributions
using LightGraphs

In [8]:
srand(20130810)

MersenneTwister(UInt32[0x01332bfa], Base.dSFMT.DSFMT_state(Int32[-1772545288, 1073534108, 1077066014, 1072915095, -2146195133, 1072843413, 301764553, 1073404181, 750472136, 1073628106  …  -1491411563, 1073194977, 716119449, 1072893711, 1632331784, 758890923, 1433693833, -13012230, 382, 0]), [1.48191, 1.49338, 1.59916, 1.51374, 1.88858, 1.44925, 1.64431, 1.09988, 1.93543, 1.448  …  1.95298, 1.7008, 1.49435, 1.89406, 1.0931, 1.97609, 1.78404, 1.03578, 1.37303, 1.37408], 382)

# Introduction 

This paper explores the pattern of personal communication between an individual's core group of friends (strong ties) and a wider set of acquaintances (weak ties). This remarkable study is one of the first in marketing to explore the influence of social networks on the diffusion of marketing messages. The key questions investigated in the context of information dissemination are:

- What matters more - strong ties or weak ties?
- What effect does the size of an average individual's network have?
- How does advertising interact with diffusion through weak ties and diffusion through strong ties?

# Model

## Assumptions

Each individual in the substrate network (referred to as a node) is connected to the same number of strong ties (varied from 5 to 29) and weak ties (varied from 5 to 29). A node is activated, i.e., an uninformed individual becomes informed, in one of three ways: through a strong tie with probability $\beta_s$, through a weak tie with probability $\beta_w$, or through external marketing efforts with probability $\alpha$. In line with conventional wisdom, we assume $\alpha < \beta_w < \beta_s$.

At time step $t$, if an uninformed individual is connected to $m$ informed strong ties and $j$ informed weak ties, the probability that the individual becomes informed in this time step is:

$$
p(t) = 1 - (1- \alpha)(1 - \beta_w)^j(1 - \beta_s)^m
$$
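As a quick sanity check on the formula, here is a minimal worked example. The parameter values are picked from inside the ranges used in this notebook, and the tie counts `m = 3` and `j = 2` are purely illustrative assumptions (they are read as the numbers of informed strong and weak ties, as in the counting functions later in this notebook).

```julia
# Illustrative values only; they are not results from the paper.
alpha  = 0.005   # effect of advertising
beta_w = 0.01    # effect of one informed weak tie
beta_s = 0.04    # effect of one informed strong tie
m = 3            # assumed number of informed strong ties
j = 2            # assumed number of informed weak ties

# Probability that at least one of the independent influences succeeds
p = 1 - (1 - alpha) * (1 - beta_w)^j * (1 - beta_s)^m
println("Activation probability: ", p)
```

With these values the probability comes out at roughly 0.137. Setting `m = 0` and `j = 0` reduces the expression to `alpha`, i.e., advertising is the only route to activation when no neighbor is informed.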

We are interested in two outcome variables:
1. The number of time steps elapsed till 15% of the network engages 
2. The number of time steps elapsed till 95% of the network engages

## Parameter ranges

In [9]:
println("Number of strong ties per node: ", floor.(Int, linspace(5, 29, 7)))
println("Number of weak ties per node: ", floor.(Int, linspace(5, 29, 7)))
println("Effect of advertising (α): ", collect(linspace(0.0005, 0.01, 7)))
println("Effect of weak ties (β_w): ", collect(linspace(0.005, 0.015, 7)))
println("Effect of strong ties (β_s): ", collect(linspace(0.01, 0.07, 7)))

Number of strong ties per node: [5, 9, 13, 17, 21, 25, 29]
Number of weak ties per node: [5, 9, 13, 17, 21, 25, 29]
Effect of advertising (α): [0.0005, 0.00208333, 0.00366667, 0.00525, 0.00683333, 0.00841667, 0.01]
Effect of weak ties (β_w): [0.005, 0.00666667, 0.00833333, 0.01, 0.0116667, 0.0133333, 0.015]
Effect of strong ties (β_s): [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07]
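The cell above was run on an older Julia version. `linspace` was removed in Julia 1.0, so on newer versions the same grids can be built with `range`. The sketch below assumes Julia 1.0 or later; the variable names are only for illustration.

```julia
# Same parameter grids, written with `range` instead of `linspace`
tie_counts = floor.(Int, range(5, stop=29, length=7))   # [5, 9, 13, 17, 21, 25, 29]
alphas     = collect(range(0.0005, stop=0.01, length=7))
betas_w    = collect(range(0.005, stop=0.015, length=7))
betas_s    = collect(range(0.01, stop=0.07, length=7))

println("Number of strong/weak ties per node: ", tie_counts)
println("Effect of advertising (α): ", alphas)
println("Effect of weak ties (β_w): ", betas_w)
println("Effect of strong ties (β_s): ", betas_s)
```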


## Execution

- At $t = 0$, the status of all nodes is set to `false`

- For each uninformed node, the probability of being informed is calculated as per the above equation. A random draw $U$ is made from a standard uniform distribution and compared with the probability. If $U < p(t)$ the status of the node is changed to `true`

- In each successive time step the previous step is repeated till 95% of the total network (of size 3000) engages
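
The steps above can be sketched as a small self-contained loop. This is not the notebook's implementation: it assumes plain adjacency lists instead of LightGraphs graphs, a synchronous update (all nodes are evaluated against the status from the previous step), and a toy ring network of 20 nodes. The names `count_informed`, `step_once` and `steps_until` are hypothetical.

```julia
# Hypothetical helper: number of informed neighbors of `node` in an adjacency list
count_informed(adj, status, node) = count(nbr -> status[nbr], adj[node])

# One synchronous time step; returns the updated status vector
function step_once(strong, weak, status, alpha, beta_w, beta_s)
    new_status = copy(status)
    for node in 1:length(status)
        status[node] && continue                  # informed nodes stay informed
        m = count_informed(strong, status, node)  # informed strong ties
        j = count_informed(weak, status, node)    # informed weak ties
        p = 1 - (1 - alpha) * (1 - beta_w)^j * (1 - beta_s)^m
        if rand() < p                             # U < p(t): node becomes informed
            new_status[node] = true
        end
    end
    return new_status
end

# Repeat the step until `target` fraction of nodes is informed; return the step count
function steps_until(strong, weak, alpha, beta_w, beta_s; target = 0.95)
    status = falses(length(strong))               # t = 0: nobody is informed
    t = 0
    while sum(status) < target * length(status)
        status = step_once(strong, weak, status, alpha, beta_w, beta_s)
        t += 1
    end
    return t
end

# Toy ring: strong ties to adjacent nodes, weak ties to nodes two positions away
n = 20
strong = [[mod1(i - 1, n), mod1(i + 1, n)] for i in 1:n]
weak   = [[mod1(i - 2, n), mod1(i + 2, n)] for i in 1:n]
t95 = steps_until(strong, weak, 0.005, 0.01, 0.04)
println("Steps until 95% informed: ", t95)
```

Calling `steps_until` with `target = 0.15` gives the first outcome variable in the same way. The loop only terminates with certainty when `alpha > 0`, since advertising is what seeds the first informed nodes.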

## Code walkthrough

### Reset node status

At the beginning of each simulation, we call the following function to set the status of all the nodes to `false`

In [10]:
function reset_node_status(G::LightGraphs.SimpleGraphs.SimpleGraph)
    return falses(nv(G))
end

reset_node_status (generic function with 1 method)

### Count activated strong and weak ties

The following functions count, for each uninformed node, the number of activated strong and weak ties in its neighborhood.

In [12]:
function n_active_strong_ties(G::LightGraphs.SimpleGraphs.SimpleGraph, node_status::BitVector)

    # m[node] = number of informed strong ties of `node`; stays 0 for informed nodes
    m = zeros(Int, nv(G))

    for node in vertices(G)
        if !node_status[node]
            n_active = 0   # reset the counter for every node
            for nbr in neighbors(G, node)
                if node_status[nbr]
                    n_active += 1
                end
            end
            m[node] = n_active
        end
    end

    return m
end

n_active_strong_ties (generic function with 1 method)

In [None]:
function n_active_weak_ties(G::LightGraphs.SimpleGraphs.SimpleGraph, node_status::BitVector)

    # j[node] = number of informed weak ties of `node`; stays 0 for informed nodes
    j = zeros(Int, nv(G))

    for node in vertices(G)
        if !node_status[node]
            n_active = 0   # reset the counter for every node
            for nbr in neighbors(G, node)
                if node_status[nbr]
                    n_active += 1
                end
            end
            j[node] = n_active
        end
    end

    return j
end

### Activation probability

At each time step, the vector holding the probability of activation for each node is calculated using the following function. Since the network is small (3000 nodes), we use an array comprehension. For larger networks, preallocated arrays and explicit looping might be better.

In [11]:
function calc_activation_prob(G::LightGraphs.SimpleGraphs.SimpleGraph,
                              j::Vector{Int}, m::Vector{Int},
                              alpha::Float64, beta_w::Float64, beta_s::Float64)
    
    probs = [1 - ((1 - alpha) * (1 - beta_w)^j[node] * (1 - beta_s)^m[node]) for node in vertices(G)]
    
    return probs
end

calc_activation_prob (generic function with 2 methods)
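
For the larger-network case mentioned above, one possible shape for a preallocated variant is sketched below. The function name `calc_activation_prob!` and the example vectors are assumptions for illustration; the notebook itself only uses the comprehension version.

```julia
# Hypothetical in-place variant: fills a preallocated `probs` vector instead of
# allocating a new one at every time step.
function calc_activation_prob!(probs::Vector{Float64},
                               j::Vector{Int}, m::Vector{Int},
                               alpha::Float64, beta_w::Float64, beta_s::Float64)
    for node in eachindex(probs)
        probs[node] = 1 - (1 - alpha) * (1 - beta_w)^j[node] * (1 - beta_s)^m[node]
    end
    return probs
end

# Illustrative counts for three nodes (not taken from a real simulation run)
j = [0, 2, 1]                         # informed weak ties per node
m = [0, 3, 0]                         # informed strong ties per node
probs = zeros(Float64, length(j))     # allocated once, reused across time steps
calc_activation_prob!(probs, j, m, 0.005, 0.01, 0.04)
println(probs)
```

The first node has no informed neighbors, so its probability equals `alpha`. Whether this variant is measurably faster than the comprehension would need to be benchmarked on the actual network size.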