# What is Social Network Analysis? 

- A short history, motivating examples, and terminology
- Nodes
- Edges
- Types of Graphs
- Attributes and Weights

### Let's start with some definitions

__Network__: a pattern of interconnections among a set of things [[Source](http://www.cs.cornell.edu/home/kleinber/networks-book/networks-book-ch01.pdf)]

__Social Network__: a network where the *things* are people and the *interconnections* are social interactions

__Social Network Analysis__ (SNA): the application of _graph and network theory_ to investigate social structures.

__Graph Theory:__ the study of graphs, which are mathematical structures used to model pairwise relations between objects.

__Network Theory:__ a part of graph theory in which a network is defined as a graph whose nodes and/or edges have attributes (e.g. names).

---

### Parts of Graphs

__Node / Vertex__: The entity of analysis that takes part in relationships. *Node* is used in the network context and *vertex* in the graph theory context, but the two words are often used interchangeably.

__Link / Edge / Relationship__: The connections between the nodes. *Link* is used in the network context and *edge* in the graph theory context, and both are used interchangeably with *relationship*.

__Attributes__: Both nodes and edges can store attributes, which contain additional data about that object.

__Weight__: A common *attribute* of edges, used to indicate *strength* or *value* of a relationship.
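
To make these terms concrete, here is a minimal sketch of a graph stored in plain Python dictionaries. The people, the `age` and `kind` attributes, and the `neighbors` helper are all made up for illustration; a real project would more likely use a graph library.

```python
# Nodes with attributes: a dict keyed by node name (all names are made up).
nodes = {
    "alice": {"age": 34},
    "bob": {"age": 29},
    "carol": {"age": 41},
}

# Edges with attributes: a dict keyed by (node, node) pairs.
# "weight" is used here as the attribute holding relationship strength.
edges = {
    ("alice", "bob"): {"weight": 3, "kind": "coworker"},
    ("bob", "carol"): {"weight": 1, "kind": "neighbor"},
}


def neighbors(node):
    """Return the nodes that share an edge with `node`, ignoring direction."""
    found = set()
    for u, v in edges:
        if u == node:
            found.add(v)
        elif v == node:
            found.add(u)
    return found


print(sorted(neighbors("bob")))           # nodes linked to bob
print(edges[("alice", "bob")]["weight"])  # strength of the alice-bob edge
```

Keeping attributes in a dictionary per node and per edge means a weight is just one more attribute, which matches the definition above.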

### Types of Graphs

Graphs are typically classified based on the presence of weights and direction attached to the edges in a graph. The table below covers what we call each type of graph:

|                | Absent     | Present  |
|----------------|------------|----------|
| __Weights__ | Unweighted | Weighted |
| __Directionality__ | Undirected | Directed |

<center>__Additional flavors__: parallel edges, self loops, bi- and tri-graphs</center>

In context:
> We are talking about a(n) __\[unweighted/weighted\]__ __\[undirected/directed\]__ graph (with __\[parallel edges | self loops\]__).
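
The classification above can be sketched as a small function that fills in the sentence template for a given edge list. The function name `describe_graph`, the edge format, and the rule that a graph counts as weighted when any edge carries a `weight` attribute are assumptions made for this sketch. Directionality cannot be read off an edge list, so it is passed in explicitly.

```python
from collections import Counter


def describe_graph(edges, directed):
    """Build the 'In context' phrase for a list of edges.

    edges    -- list of (u, v, attrs) tuples, where attrs is a dict
    directed -- True if (u, v) and (v, u) are different edges
    """
    # Assumption: one weighted edge is enough to call the graph weighted.
    weighted = any("weight" in attrs for _, _, attrs in edges)
    self_loops = any(u == v for u, v, _ in edges)

    # In an undirected graph (u, v) and (v, u) are the same pair,
    # so normalize the order before counting duplicates.
    def key(u, v):
        return (u, v) if directed else tuple(sorted((u, v)))

    counts = Counter(key(u, v) for u, v, _ in edges)
    parallel = any(n > 1 for n in counts.values())

    text = "{} {} graph".format(
        "weighted" if weighted else "unweighted",
        "directed" if directed else "undirected",
    )
    extras = [
        name
        for name, present in (("parallel edges", parallel), ("self loops", self_loops))
        if present
    ]
    if extras:
        text += " with " + " and ".join(extras)
    return text


# Made-up edges: two opposite links between a and b, plus a loop on c.
example = [("a", "b", {"weight": 2}), ("b", "a", {"weight": 1}), ("c", "c", {})]
print(describe_graph(example, directed=True))
print(describe_graph(example, directed=False))
```

The same edge list reads differently depending on direction: as a directed graph the two `a`/`b` links are distinct edges, while as an undirected graph they are parallel edges.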

---

## Examples
### Example: Zachary's Karate Club Network
> The *Iris* dataset of social network analysis

![karate club](https://upload.wikimedia.org/wikipedia/commons/2/2b/Karate_Cuneyt_Akcora.png)

Wayne W. Zachary studied the social network of a karate club for three years, from 1970 to 1972. The network captures 34 members of the club and documents 78 pairwise links between members who interacted **outside** the club. During the study, a conflict arose that split the club in two. Based on the collected data, Zachary correctly assigned all but one member to the group they actually joined after the split.

There is even a [Zachary's Karate Club CLUB](http://networkkarate.tumblr.com/), which awards a trophy to the first person at a network conference to use Zachary's Karate Club Network as an example.

### Example: _15th Century Florentine Marriages_

![15th century florentine marriages](https://upload.wikimedia.org/wikipedia/commons/thumb/2/20/15th_Century_Florentine_Marriges_Data_from_Padgett_and_Ansell.pdf/page1-577px-15th_Century_Florentine_Marriges_Data_from_Padgett_and_Ansell.pdf.jpg)

[[Padgett and Ansell, 1993](http://home.uchicago.edu/~jpadgett/papers/published/robust.pdf)]

The graph above is a marriage network of 16 influential Florentine families in the 1430s. At this time in Renaissance Italy, the major families were essentially an oligarchy, controlling politics and money in the region.

Based on this network, can you surmise which family ascended to power in the following decades?

By examining the right networks, we can understand which actors are the most central. In this case, the network forecasts the rise of the Medici, even though they were neither the wealthiest nor the most politically connected family at the time.
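
One simple way to quantify "most central" is degree centrality: the share of other nodes a node is directly tied to. The sketch below uses a made-up toy network with placeholder family names, not the Padgett and Ansell data, and degree is only one of several centrality measures; it is used here because it is the easiest to compute by hand.

```python
from collections import defaultdict

# Hypothetical marriage ties between placeholder families (NOT the real data).
ties = [
    ("A", "B"), ("A", "C"), ("A", "D"),
    ("B", "C"), ("D", "E"),
]


def degree_centrality(edge_list):
    """Map each node to degree / (n - 1) for an undirected simple graph.

    Assumes at least two nodes; nodes without any tie are not represented.
    """
    adjacent = defaultdict(set)
    for u, v in edge_list:
        adjacent[u].add(v)
        adjacent[v].add(u)
    n = len(adjacent)
    return {node: len(adj) / (n - 1) for node, adj in adjacent.items()}


centrality = degree_centrality(ties)
most_central = max(centrality, key=centrality.get)
print(most_central, centrality[most_central])  # A 0.75
```

In this toy network family `A` is tied to three of the four other families, so it scores highest. Ranking the families in a real marriage network the same way is a first step toward the kind of reading described above.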

### Example: The Human Disease Network

![the human disease network](http://www.pnas.org/content/104/21/8685/F2.large.jpg)

> In the “human disease network” (HDN) nodes represent disorders, and two disorders are connected to each other if they share at least one gene in which mutations are associated with both disorders (Figs. 1 and 2 a). In the “disease gene network” (DGN) nodes represent disease genes, and two genes are connected if they are associated with the same disorder (Figs. 1 and 2 b).

[Goh, Kwang-Il, Michael E. Cusick, David Valle, Barton Childs, Marc Vidal, and Albert-László Barabási. 2007. “The Human Disease Network.” Proceedings of the National Academy of Sciences of the United States of America 104 (21): 8685–90.](http://www.pnas.org/content/104/21/8685.full)
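
The two constructions in the quote can be sketched from a single disorder-to-gene mapping. The mapping below uses placeholder disorder and gene names and is not taken from the paper; the function names are also hypothetical.

```python
from itertools import combinations

# Hypothetical disorder -> associated genes mapping (placeholder names).
disorder_genes = {
    "disorder_1": {"gene_a", "gene_b"},
    "disorder_2": {"gene_b", "gene_c"},
    "disorder_3": {"gene_d"},
}


def human_disease_network(mapping):
    """Link two disorders if they share at least one gene."""
    return {
        frozenset((d1, d2))
        for d1, d2 in combinations(mapping, 2)
        if mapping[d1] & mapping[d2]
    }


def disease_gene_network(mapping):
    """Link two genes if they are associated with the same disorder."""
    links = set()
    for genes in mapping.values():
        for g1, g2 in combinations(sorted(genes), 2):
            links.add(frozenset((g1, g2)))
    return links


hdn = human_disease_network(disorder_genes)
dgn = disease_gene_network(disorder_genes)
print(sorted(sorted(edge) for edge in hdn))
print(sorted(sorted(edge) for edge in dgn))
```

Each edge is stored as a `frozenset` because both networks are undirected: the pair has no order. Here `disorder_1` and `disorder_2` are linked through the shared `gene_b`, while `disorder_3` stays unconnected.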