A primer on network analysis for business
A network is a dataset made of entities and their relations.
Scientists use the term "graph" to discuss networks.
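To make the definition concrete, here is a minimal sketch of a network stored as data. The names and relations are invented for illustration, and the dictionary-of-neighbors layout is only one possible representation among several.

```python
from collections import defaultdict

# Hypothetical relations (edges) between invented entities (nodes).
edges = [
    ("Alice", "Bob"),
    ("Bob", "Carol"),
    ("Alice", "Carol"),
    ("Carol", "Dave"),
]

def build_adjacency(edges):
    """Map each entity to the set of entities it is related to."""
    adjacency = defaultdict(set)
    for a, b in edges:
        # Relations are treated as undirected: record both directions.
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)

adjacency = build_adjacency(edges)

# The degree of a node is its number of relations.
degree = {node: len(neighbors) for node, neighbors in adjacency.items()}
print(degree)
```

The list of pairs is the "relations" part of the definition, and the keys of the dictionary are the "entities".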
a. Social networks
As users, we are very familiar with one type of network, the social network:
b. Other networks
It is important to realize that networks cover more than relations between humans. For example, we can imagine a network made of cooking recipes, where two ingredients are connected if they frequently appear in the same recipes.
Scanning all the recipes and their ingredients on a cooking website gives:
Semantic networks are another broad category of networks. The method is the same: once we find a way to "relate" the words in a text, we get a network. As with cooking recipes, two terms are connected in the network if they frequently appear in the same paragraphs.
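The co-occurrence rule described above, for ingredients in recipes or for terms in paragraphs, can be sketched in a few lines. The recipes below are invented, and the `min_count` threshold is an assumption standing in for "frequently", which the text does not quantify.

```python
from collections import Counter
from itertools import combinations

# Hypothetical recipes, each reduced to its set of ingredients.
# For a semantic network, each set would hold the terms of one paragraph.
recipes = [
    {"tomato", "basil", "garlic"},
    {"tomato", "basil", "mozzarella"},
    {"garlic", "butter"},
    {"tomato", "garlic", "basil"},
]

def cooccurrence_edges(groups, min_count=2):
    """Connect two items if they appear together in at least min_count groups."""
    counts = Counter()
    for items in groups:
        # Sorting gives each pair a single canonical order.
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}

edges = cooccurrence_edges(recipes)
for (a, b), weight in sorted(edges.items()):
    print(a, "-", b, "appear together in", weight, "recipes")
```

Raising `min_count` keeps only the strongest associations, which is one simple way to thin out a network before visualizing it.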
c. How big can networks be?
For example, “The Anatomy of the Facebook Social Graph” (2011)
→ study of 721 million active Facebook users and the 69 billion (!) friendship links connecting them.
Visualization quickly reaches a limit: it is hard to fit millions of nodes on a screen. The next visualization shows a network of 90,000 Swedish speakers and their relations on Twitter. The view is very cluttered.
(open the source for an interactive version)
2. Networks: what use for business?
If a network is made of entities and their relations, then a segment is a subgroup of entities in the network that has some cohesion or shares something in common.
This subgroup of nodes in the network is often called a "community".
Detecting communities in a network, also called "clustering", consists of finding groups of nodes that share many connections.
This is a mathematical and algorithmic procedure, but it is very simple to understand visually:
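The same intuition can be checked numerically. The sketch below is an illustration under stated assumptions, not the procedure used for the visuals here: it scores a grouping with modularity (the share of links that fall inside groups, minus the share expected by chance given each node's number of links) and tries every two-way split of a tiny invented network. Trying every split only works for a handful of nodes; real tools rely on faster heuristics.

```python
from itertools import combinations

# Hypothetical network: two triangles (A-B-C and D-E-F) joined by one link C-D.
edges = [
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("C", "D"),
    ("D", "E"), ("D", "F"), ("E", "F"),
]

def modularity(edges, groups):
    """Modularity of a partition of the nodes into groups."""
    m = len(edges)
    group_of = {node: i for i, group in enumerate(groups) for node in group}
    score = 0.0
    for i in range(len(groups)):
        # Links with both ends inside group i.
        internal = sum(1 for u, v in edges if group_of[u] == i and group_of[v] == i)
        # Sum of the degrees of the nodes in group i.
        degree_sum = sum(1 for u, v in edges for n in (u, v) if group_of[n] == i)
        score += internal / m - (degree_sum / (2 * m)) ** 2
    return score

def best_two_way_split(edges):
    """Brute force: return the two-group split with the highest modularity."""
    nodes = sorted({n for edge in edges for n in edge})
    first, rest = nodes[0], nodes[1:]
    best_groups, best_score = None, float("-inf")
    # Fix the first node in group A to avoid counting mirror-image splits twice,
    # and leave at least one node for group B.
    for size in range(len(rest)):
        for companions in combinations(rest, size):
            group_a = {first, *companions}
            group_b = set(nodes) - group_a
            score = modularity(edges, [group_a, group_b])
            if score > best_score:
                best_groups, best_score = (group_a, group_b), score
    return best_groups, best_score

groups, score = best_two_way_split(edges)
print(groups, round(score, 3))
```

On this toy network the best split separates the two triangles, which is also what the eye picks out at a glance.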
c. Understanding how information spreads
A data science company created "Where does my tweet go", which traced how a given tweet spreads through retweets. The service is now discontinued (Twitter data was too expensive to buy), but the mechanism can be explained:
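One way to picture such a trace is as a tree: the author sits at the root, and each retweeter is attached to the account they retweeted from. The sketch below is a guess at the general idea, not the discontinued service's actual method. The accounts and retweet pairs are invented, and the function name `cascade_depths` is hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical data: (retweeter, account they retweeted from).
retweets = [
    ("bob", "alice"),
    ("carol", "alice"),
    ("dave", "bob"),
    ("erin", "dave"),
]

def cascade_depths(author, retweets):
    """Number of retweet hops between the author and each account reached."""
    children = defaultdict(list)
    for retweeter, source in retweets:
        children[source].append(retweeter)

    depths = {author: 0}
    queue = deque([author])
    # Breadth-first traversal: accounts closest to the author are visited first.
    while queue:
        user = queue.popleft()
        for retweeter in children[user]:
            if retweeter not in depths:
                depths[retweeter] = depths[user] + 1
                queue.append(retweeter)
    return depths

depths = cascade_depths("alice", retweets)
print(depths)
```

The largest depth gives the length of the longest retweet chain, and the number of accounts at each depth shows where the tweet gained or lost momentum.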
d. Identifying patterns - for fraud detection, control or intelligence.
In the following video, we see the participants in the European money market (short-term loans between banks). Two banks are connected if one lends to the other. The pattern of exchanges shifts over the years as banks withdraw from the market.
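A shift of this kind can also be measured directly from dated lending links. The sketch below counts how many banks take part in the market each year. The bank names, years and loans are invented purely for illustration and do not come from the video's data.

```python
from collections import defaultdict

# Hypothetical lending links: (year, lender, borrower).
loans = [
    (2006, "Bank A", "Bank B"),
    (2006, "Bank B", "Bank C"),
    (2006, "Bank C", "Bank D"),
    (2009, "Bank A", "Bank B"),
    (2009, "Bank B", "Bank A"),
]

def active_banks_by_year(loans):
    """Count the banks that lend or borrow at least once in each year."""
    active = defaultdict(set)
    for year, lender, borrower in loans:
        active[year].update((lender, borrower))
    return {year: len(banks) for year, banks in sorted(active.items())}

activity = active_banks_by_year(loans)
print(activity)
```

A falling count from one year to the next is the numerical counterpart of banks disappearing from the picture.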
3. To go further
You can also visit my tutorials on Gephi, the leading software to visualize large graphs: