# Applying Complex Systems Modeling

Let's consider a practical scenario that can exhibit complex-system dynamics and that we often need to address in business contexts (as well as in social, political, and other spheres): __product introduction and viral adoption__.

Naturally, many elements can influence whether a product succeeds virally. Viral success is notoriously hard to craft or predict, notwithstanding the promises made by all of those influencer courses.

## General approach: network analysis

We can imagine the various users, influencers, and potential users of our product as nodes in a graph or network which captures their relationships.

If we understand the characteristics of the network, we can learn about how our product might spread.

For example, if the network space has attractor states representing little connection vs. massive connection, we would want to understand:
* whether our product is likely to enter the system in the "massive connection" basin of attraction (which we'd like)
* whether the system exhibits tipping point behavior at the "edge" between the low and high connection attractor regions
    * which parameters the system is more or less sensitive to, and which of them we might be able to manipulate, or at least plan around, for better odds

<img src='images/Energy_landscape.png' width=700 />

In this notebook, we'll focus first on investigating a flavor of network that is closer to real-world social connections than the E-R graphs we looked at earlier.

We'll aim to learn the answers to the above questions through experiments.

## Small world graphs

Small world graphs, or small world networks, are a type of graph in which most nodes can be reached from every other node in a small number of steps. This type of network was first described in the 1960s by social psychologist __Stanley Milgram__ in his "small world experiment". The experiment popularized the concept of "six degrees of separation," suggesting that any two people on Earth could be connected through a chain of six acquaintances or fewer.

In Milgram's experiment, he sent packages to 160 people living in Omaha, Nebraska, asking each of them to forward the package to a friend or acquaintance who they thought would bring it closer to a designated final individual, a stockbroker in Boston. Many packages never arrived, but those that did reached the stockbroker in an average of about six steps, hence the term "six degrees of separation".

https://en.wikipedia.org/wiki/Small-world_experiment

This discovery has had far-reaching implications, influencing several fields from sociology to computer science. 

In the late 1990s, mathematicians __Duncan Watts__ and __Steven Strogatz__ formalized the concept of small world networks in a mathematical context. 

They proposed a simple model for generating such networks, starting with a regular lattice and then rewiring some of its edges at random. This model revealed that even a small amount of rewiring could give rise to a network with both high clustering (like a regular lattice) and short average path lengths (like a random graph), a hallmark of small-world networks.

In [None]:
import networkx as nx
import matplotlib.pyplot as plt

# Create a Watts-Strogatz small world graph
# n = number of nodes
# k = each node is connected to k nearest neighbors in ring topology (k - 1 if k is odd)
# p = the probability of rewiring each edge
n, k, p = 20, 4, 0.5
G = nx.watts_strogatz_graph(n, k, p)

nx.draw(G, with_labels=True)
plt.show()

Read more about the Watts-Strogatz model at https://en.wikipedia.org/wiki/Watts%E2%80%93Strogatz_model
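To see the two hallmark properties side by side, here is a minimal sketch that measures average clustering and average shortest path length as the rewiring probability grows. The sizes (`n_demo`, `k_demo`), the list of probabilities, and the seed are illustrative assumptions, not values from this notebook. It uses `nx.connected_watts_strogatz_graph` so that the path length is always defined.

```python
import networkx as nx

# Illustrative sizes, chosen only to keep the run fast
n_demo, k_demo = 200, 6

stats = {}
for p_demo in [0.0, 0.01, 0.1, 1.0]:
    # This variant retries until the graph is connected, which
    # average_shortest_path_length requires
    H = nx.connected_watts_strogatz_graph(n_demo, k_demo, p_demo, tries=100, seed=42)
    stats[p_demo] = (nx.average_clustering(H), nx.average_shortest_path_length(H))
    print(f"p={p_demo:<5} clustering={stats[p_demo][0]:.3f}  "
          f"avg path length={stats[p_demo][1]:.2f}")
```

The interesting rows are the small nonzero values of `p`: path length should already be well below the pure lattice (`p = 0`) while clustering stays comparatively high.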

> Note that although this model shares some statistical and topological characteristics with organic social networks, it also differs from them in significant ways. Simple generative processes for realistic networks are an ongoing area of research, and we've chosen to use the simplest model from this family for this introductory topic.

In [None]:
nx.is_connected(G)

It doesn't seem surprising that the network is connected, given the process that generated it.

Let's try another, bigger graph with different parameters.

In [None]:
n, k, p = 100, 3, 0.01
G = nx.watts_strogatz_graph(n, k, p)

nx.draw(G, with_labels=True)
plt.show()

In [None]:
nx.is_connected(G)

Maybe all of these graphs -- or nearly all -- are connected...

Let's try one with a larger "population".

In [None]:
n, k, p = 10000, 3, 0.01
G = nx.watts_strogatz_graph(n, k, p)
nx.is_connected(G)

We could experiment for a few minutes with different combinations of parameters but it's not obvious what's going on. 

We can be more systematic by running a large number of simulations and counting the outcomes.

With `n, k, p = 10000, 3, 0.01`, run 100 simulations and look at the proportion that are connected.

In [None]:
# Proportion of 100 simulated graphs that are connected
sum(nx.is_connected(nx.watts_strogatz_graph(n, k, p)) for _ in range(100)) / 100

Now try `n, k, p = 10000, 3, 0.5` and repeat the experiment.

In [None]:
n, k, p = 10000, 3, 0.5
sum(nx.is_connected(nx.watts_strogatz_graph(n, k, p)) for _ in range(100)) / 100

This is better than one-off sampling, but it's not very systematic.
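One way to make the repeated experiment easier to reuse is to wrap it in a small helper. The sketch below is an assumption about how that might look: the function name `connected_fraction`, its parameters, and the binomial standard-error estimate are all illustrative additions, not part of the original notebook. The example call uses a smaller graph than the text so that it runs quickly.

```python
import math
import random

import networkx as nx


def connected_fraction(n_nodes, k_neighbors, p_rewire, trials=100, seed=None):
    """Estimate the probability that a Watts-Strogatz graph is connected.

    Returns (fraction, standard_error), where the standard error is the
    usual binomial estimate sqrt(f * (1 - f) / trials).
    """
    # A single seeded generator makes the whole batch reproducible
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        H = nx.watts_strogatz_graph(n_nodes, k_neighbors, p_rewire, seed=rng)
        hits += nx.is_connected(H)
    frac = hits / trials
    stderr = math.sqrt(frac * (1 - frac) / trials)
    return frac, stderr


# Hypothetical usage with small, fast settings
frac, stderr = connected_fraction(1000, 4, 0.5, trials=50, seed=0)
print(f"connected fraction: {frac:.2f} +/- {stderr:.2f}")
```

Reporting a standard error alongside the fraction gives a rough sense of whether 100 trials are enough to distinguish two parameter settings.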

Let's fix the graph size at 10,000 and experiment with `k` and `p`.

To keep it simple, we'll experiment with `k` first. Leave `p` at 0.5 and calculate the connectedness proportion for values of `k` from 2 up through 6.

Plot the results.

In [None]:
x = range(2,7)

In [None]:
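# For each k, estimate the proportion of connected graphs over 100 runs (n and p carry over from above)
conn = [0.01 * sum(nx.is_connected(nx.watts_strogatz_graph(n,k,p)) for _ in range(100)) for k in x]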

In [None]:
plt.plot(x, conn)

What do you notice about the results?
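One detail that may help when interpreting the plot: according to the NetworkX documentation, `watts_strogatz_graph` joins each node to its `k` nearest ring neighbors, or `k - 1` if `k` is odd. A quick check is to build small lattices with no rewiring (`p = 0`) and look at the node degrees. The graph size of 12 and the names `lattice_degrees`, `k_test`, and `H` are arbitrary choices for this sketch.

```python
import networkx as nx

# With p = 0 nothing is rewired, so node degrees reveal the starting lattice
lattice_degrees = {}
for k_test in range(2, 7):
    H = nx.watts_strogatz_graph(12, k_test, 0)
    lattice_degrees[k_test] = {degree for _, degree in H.degree()}
    print(k_test, lattice_degrees[k_test])
```

If odd values are rounded down as described, `k = 2` and `k = 3` should show the same degree, and likewise `k = 4` and `k = 5`, which would explain any flat steps in the plot.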

Since we have 2 parameters we're interested in ($k$ and $p$), if we had more time it would make sense to plot a 3-D surface (connectedness probability as a function of $k$ and $p$).

But we can take a shortcut that will save some time (both coding and running).

If there is an interesting boundary in your previous plot, pick the integer value of $k$ on either side of it (since the parameter $k$ represents a whole number of neighbor nodes).

For each of those two values, calculate the connected proportion when varying the $p$ parameter (probability of rewiring) across this set of possible values:

In [None]:
probs = [0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6]

Plot the results.

In [None]:
# Same estimate as before, now sweeping p for k=3 and k=4 (n carries over from above)
conn_k3 = [0.01 * sum(nx.is_connected(nx.watts_strogatz_graph(n,3,p)) for _ in range(100)) for p in probs]
conn_k4 = [0.01 * sum(nx.is_connected(nx.watts_strogatz_graph(n,4,p)) for _ in range(100)) for p in probs]

In [None]:
plt.plot(probs, conn_k3, label='k=3')
plt.plot(probs, conn_k4, label='k=4')
plt.xlabel('p (rewiring probability)')
plt.ylabel('proportion connected')
plt.legend()

* What does this tell us about the sensitivity of this graph family to the two parameters?

* Can you think of realistic scenarios where $k$ (the number of neighbor connections) might take on values between 2 and 6?

* If this graph were sufficiently similar to our customer and influencer graph, would this be "good news" or "bad news"?

* What could we do to increase our chances of success?
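For readers who want the fuller picture of connectedness as a function of both $k$ and $p$, a coarse grid can be tabulated cheaply on a much smaller graph. This is only a sketch under stated assumptions: the function name `connectivity_grid`, the graph size of 300, the grid values, and the 20 trials per cell are all illustrative, and results on a small graph may not carry over to 10,000 nodes.

```python
import random

import networkx as nx


def connectivity_grid(n_nodes, k_values, p_values, trials, seed):
    """Return {(k, p): fraction of connected graphs} over a small parameter grid."""
    rng = random.Random(seed)  # one seeded generator for reproducibility
    grid = {}
    for k_val in k_values:
        for p_val in p_values:
            hits = sum(
                nx.is_connected(nx.watts_strogatz_graph(n_nodes, k_val, p_val, seed=rng))
                for _ in range(trials)
            )
            grid[(k_val, p_val)] = hits / trials
    return grid


# Even k values only, since odd values are documented to behave like k - 1
grid = connectivity_grid(300, [2, 4, 6], [0.05, 0.2, 0.5], trials=20, seed=0)
for (k_val, p_val), frac in grid.items():
    print(f"k={k_val}  p={p_val:<4}  connected fraction={frac:.2f}")
```

A dictionary keyed by `(k, p)` keeps the sketch short; the same values could be reshaped into an array and shown as a heatmap.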

### Going further

Next steps could include simulating
* the spread of the product through the network, measuring adoption as a function of time
* the entry of a competing product, spreading elsewhere in the network, to see
    * how the relative timing of product launch affects final market share in a "first-in wins" scenario
    * long-term dynamics of a multiproduct market with low or moderate switching costs
* different types of people (nodes) and relationships (edges) with different and probabilistic behaviors
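As a starting point for the first of these, here is a minimal sketch of product spread over time. The adoption rule is an assumption made for illustration: at each step, every adopter independently converts each non-adopting neighbor with probability `adopt_prob`, and adopters never drop out (a simple susceptible-infected style process). The function name `simulate_spread` and all parameter values are hypothetical.

```python
import random

import networkx as nx


def simulate_spread(graph, seeds, adopt_prob, steps, rng):
    """Return the number of adopters after each step, starting from `seeds`."""
    adopted = set(seeds)
    history = [len(adopted)]
    for _ in range(steps):
        newly_adopted = set()
        for user in adopted:
            for neighbor in graph.neighbors(user):
                # Assumed rule: each exposure converts with a fixed probability
                if neighbor not in adopted and rng.random() < adopt_prob:
                    newly_adopted.add(neighbor)
        adopted |= newly_adopted
        history.append(len(adopted))
    return history


# Hypothetical run: one initial adopter on a small-world graph
network = nx.watts_strogatz_graph(500, 4, 0.1, seed=1)
history = simulate_spread(network, seeds=[0], adopt_prob=0.2, steps=30,
                          rng=random.Random(1))
print(history)
```

Plotting `history` for different values of `p` or different seed nodes would be one way to look for tipping-point behavior in adoption.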

> In some ways, modeling this product spread may remind you of SEIR models used in epidemiology and other population-spread problems. Those are great tools too -- what are the pros and cons of the SEIR approach vs. a network simulation approach?

And of course we could try other graph-building algorithms that might have better similarity to our target population.
