# Marketing Network Project

Data: 'friendships.gml' | University of Michigan School of Information

This project analyzes a network of friendships from a social media site. The data has been given to a business so that it can tailor its marketing to the network. The company wants to offer discount vouchers to select customers, who can then share the vouchers within their network. The following assumptions apply:
- The voucher can be shared only by the original node (not by any node that receives it from them)
- The voucher should reach as many nodes as possible

In [1]:
import networkx as nx

G1 = nx.read_gml('friendships.gml')
n = len(G1.nodes())
print('Total Network has {} people'.format(n))

Total Network has 1133 people


Find the most central node, that is, the one directly connected to the largest number of other nodes.

In [2]:
def most_edges():
    # Degree centrality: fraction of the other nodes each node is directly connected to
    degree = nx.degree_centrality(G1)
    return max(degree, key=degree.get)
n = most_edges()
print('Node number {} is the best candidate'.format(n))

Node number 105 is the best candidate
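
To illustrate why degree is the relevant measure under the first assumption, here is a minimal sketch on a small made-up graph. The names, edges, and helper functions are hypothetical and are not taken from `friendships.gml`. When only the original node can share, the voucher reaches exactly that node's direct friends, so the best candidate is the node with the most neighbors. `nx.degree_centrality` divides each degree by n - 1, which does not change the ranking.

```python
# Hypothetical toy network (not from friendships.gml):
# each key maps a person to the set of their direct friends.
toy_friends = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann", "cat"},
    "cat": {"ann", "bob"},
    "dan": {"ann", "eve"},
    "eve": {"dan"},
}

def degree_centrality(friends):
    """Fraction of the other people each person is directly connected to."""
    others = len(friends) - 1
    return {person: len(nbrs) / others for person, nbrs in friends.items()}

def best_seed_one_hop(friends):
    """Person whose direct friends cover the largest share of the network."""
    scores = degree_centrality(friends)
    return max(scores, key=scores.get)

seed = best_seed_one_hop(toy_friends)
print(seed, degree_centrality(toy_friends)[seed])
```

In this toy graph the voucher given to `ann` reaches three of the four other people, which is the most any single node can manage in one hop.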


If we relax the first assumption so that every node, not just the original one, can share the voucher, which node is our best candidate? Because the network is connected, every node will eventually receive the voucher regardless of whom we pick. However, the probability that the voucher is passed on declines with each hop, so we now want the node from which the voucher reaches all other nodes in the lowest average number of hops.

In [3]:
def most_edges2():
    # Closeness centrality: higher means a lower average shortest-path distance to all other nodes
    closeness = nx.closeness_centrality(G1)
    return max(closeness, key=closeness.get)
N = most_edges2()
print('Node number {} is the best candidate'.format(N))

Node number 23 is the best candidate
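
For a connected graph, closeness centrality is the reciprocal of a node's average shortest-path distance to every other node, so the node with the highest closeness is the one with the lowest average hop count. The sketch below shows that relationship with a plain breadth-first search on a made-up path graph. The graph and the helper names `hop_counts` and `average_hops` are hypothetical and are not part of the original analysis.

```python
from collections import deque

# Hypothetical toy network (not from friendships.gml): a simple path a-b-c-d-e.
toy_path = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}

def hop_counts(friends, source):
    """Breadth-first search: fewest hops from source to every reachable person."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        person = queue.popleft()
        for friend in friends[person]:
            if friend not in hops:
                hops[friend] = hops[person] + 1
                queue.append(friend)
    return hops

def average_hops(friends, source):
    """Mean hop count from source to everyone else (assumes a connected graph)."""
    hops = hop_counts(friends, source)
    return sum(hops.values()) / (len(friends) - 1)

# The best seed minimizes the average number of hops to everyone else.
best = min(toy_path, key=lambda person: average_hops(toy_path, person))
print(best, average_hops(toy_path, best))
```

The same search also gives a quick connectivity check: if `hop_counts` returns fewer entries than there are people, some of them can never receive the voucher from that source.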


Assume the restriction on the voucher’s travel distance is still removed, but now a competitor has developed a strategy to remove a person from the network in order to disrupt the distribution of the company’s voucher. The competitor is specifically targeting people who are often bridges of information flow between other pairs of people. 

The next step is to find the single person whose removal poses the greatest risk to our company's voucher distribution model.

In [4]:
def risk():
    # Betweenness centrality: share of shortest paths between other pairs that pass through a node.
    # The riskiest person is the node through which the most information flows.
    btw = nx.betweenness_centrality(G1)
    return max(btw, key=btw.get)
r = risk()
print('Node number {} is the most disruptive person in the model if removed'.format(r))

Node number 333 is the most disruptive person in the model if removed
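
Betweenness centrality scores a node by how many shortest paths between other pairs pass through it. The sketch below does not compute betweenness. It uses a simpler, different check to illustrate what "disruptive" means here: remove one person and count how many people a voucher can still reach. The toy graph and the helper `reachable_without` are hypothetical and are not taken from `friendships.gml`.

```python
from collections import deque

# Hypothetical toy network: two friend triangles joined only through "hub".
toy_bridge = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "hub"},
    "hub": {"c", "x"},
    "x": {"hub", "y", "z"},
    "y": {"x", "z"},
    "z": {"x", "y"},
}

def reachable_without(friends, source, removed):
    """Count people reachable from source once `removed` has left the network.

    Assumes source != removed.
    """
    seen = {source}
    queue = deque([source])
    while queue:
        person = queue.popleft()
        for friend in friends[person]:
            if friend != removed and friend not in seen:
                seen.add(friend)
                queue.append(friend)
    return len(seen) - 1  # do not count the source itself

# Removing the bridge cuts the voucher off from the other triangle.
print(reachable_without(toy_bridge, "a", removed="hub"))
# Removing a non-bridge person costs only that one person.
print(reachable_without(toy_bridge, "a", removed="b"))
```

In this toy graph, every path between the two triangles runs through `hub`, which is the kind of node that betweenness centrality is designed to highlight.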
