# Graphs

We've seen both Linked Lists and Binary Search Trees, which are linked data structures.

Graphs are our most general linked data structure. They are also perhaps our most versatile data structure.

A **graph** is a set of vertices connected by edges.




<img src = "figures/graphs.jpeg" width = "60%">

In a **directed graph**, the edges have direction. $A$ can get directly to $B$, but $B$ can't go back to $A$ through that edge.

In an **undirected graph**, edges can be traversed in both directions.

## Applications

Graphs are one of the most applicable data structures. This comes from their generality. At their core, graphs represent objects and the relationships between them. This isn't a philosophy course, but all of knowledge falls under that category.

This is not to say that we should apply graphs to every problem in existence, but it is a valuable skill to know how to represent problems in terms of graphs and to know the types of questions you can ask and answer using graphs. Familiarity with graphs and graph algorithms pays dividends in problem-solving aptitude.

**Some Applications of Graphs:**

- Networking
- Social Networks
- Cellular Processes
- Biomolecular Interactions
- Advertising
- Disease Spread
- Logistics
- Lexical Semantics
- Scheduling
- Circuit Design
- etc...


# Questions we can ask about Graphs

Since we are concerned with the relationships between objects, it is important to be able to get the direct connections between vertices.



- We often want to find **neighbors** of any given vertex $v$. These are vertices that are directly connected to $v$ by an edge.

    For example, which proteins directly interact with each other? 
    
    Or, which products are directly related to each other via shared interests?



- A second related common question is: what is the **degree** of a vertex? That is, how many neighbors does it have?

    How many cities can someone get a direct flight to from a given airport?



- We are also often interested in **paths** through a graph. A **path** is a sequence of vertices where each consecutive pair is connected by an edge.
    
    What route should one take to drive from one city to the next?



- Related to paths is the question of **reachability**. Is vertex $a$ reachable from $b$? That is, is there a path between them?

    Given a graph of biomolecule interactions, can an antibody indirectly affect an unintended biological pathway? That is, can an antibody cause an unintended side effect?



- Generalizing the idea of reachability is **connectedness**. A graph is **connected** if there is a path between every pair of vertices.

    - Not all graphs are connected. A **connected component** is a maximal subset of vertices where every pair in that subset is connected.

    Social distancing was a practice in partitioning the massively connected graph of human interactions into many very small connected components.







<img src = "figures/connected-components.jpeg" width = "60%">

With your newfound excitement to use graphs to solve your problems, let's dive into how to use them!

Our plan is to discuss how we can represent graphs computationally and then to see a few graph algorithms. We'll focus on graph exploration, which is useful for identifying connected components. Then we'll touch on a few other graph algorithms. We can't possibly cover everything, but it is a huge advantage to know what is possible.

# Representing Graphs


There are multiple ways to represent a graph. We will consider one.

When deciding how to implement a graph (or any data structure), we should ask what operations we want to perform efficiently.

In other data structures, the key operations are:

- insert
- remove
- contains

As you can tell from our discussion above, with a graph, we are usually concerned with exploring relationships between the elements by analyzing paths, partitions, or substructures within the graph.

Rather than being concerned with efficient modification (e.g., efficient insertion and removal), graphs are normally built upfront, and we want efficient access into them.

Moreover, if we want to analyze larger structures, the fundamental operation is to be able to access the neighbors of individual vertices.


## Map of Neighbors

A dictionary is a perfect structure for our needs. Dictionaries provide average-case $O(1)$ access based on a key. If each vertex is a key and its value is a list of neighbors, then given a vertex, we can efficiently retrieve a list of its neighbors.

Here are our original graph examples again:

<img src = "figures/graphs.jpeg" width = "60%">

We can build dictionaries to represent them:

```python
undirected_graph = {
    'A': ['B', 'C'],
    'B': ['A', 'C', 'D'],
    'C': ['A', 'B', 'D'],
    'D': ['B', 'C']
}

directed_graph = {
    'A': ['B'],
    'B': ['D'],
    'C': ['A', 'B'],
    'D': ['C']
}
```
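
The dictionaries above are written out by hand. Since graphs are normally built upfront, it may help to see one way such a dictionary could be assembled from a list of edges. This is only a minimal sketch: the `build_graph` helper and its `directed` parameter are hypothetical names introduced for illustration, and the edge lists are read off the dictionaries above.

```python
def build_graph(edges, directed=False):
    """Build a map of neighbors from an iterable of (u, v) edge pairs.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    graph = {}
    for u, v in edges:
        # Record the edge u -> v.
        graph.setdefault(u, []).append(v)
        # Make sure v is a key even if it has no outgoing edges.
        graph.setdefault(v, [])
        if not directed:
            # In an undirected graph, the edge can be traversed both ways.
            graph[v].append(u)
    return graph


undirected_edges = [('A', 'B'), ('A', 'C'), ('B', 'C'), ('B', 'D'), ('C', 'D')]
directed_edges = [('A', 'B'), ('B', 'D'), ('C', 'A'), ('C', 'B'), ('D', 'C')]

built_undirected = build_graph(undirected_edges)
built_directed = build_graph(directed_edges, directed=True)
```

The only difference between the two cases is whether each edge is also recorded in the reverse direction, which mirrors the definitions of directed and undirected graphs.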

Getting a list of the neighbors for any vertex is trivially easy. 

```python
print("Undirected Graph 'B' has neighbors {}".format(undirected_graph['B']))

print("Directed Graph 'C' has neighbors {}".format(directed_graph['C']))
```

```
Undirected Graph 'B' has neighbors ['A', 'C', 'D']
Directed Graph 'C' has neighbors ['A', 'B']
```
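
Some of the earlier questions, such as degree and whether a sequence of vertices forms a path, can be answered directly from this representation. The following is a minimal sketch under that assumption; `degree`, `in_degree`, and `is_path` are hypothetical helper names introduced here for illustration.

```python
undirected_graph = {
    'A': ['B', 'C'],
    'B': ['A', 'C', 'D'],
    'C': ['A', 'B', 'D'],
    'D': ['B', 'C']
}

directed_graph = {
    'A': ['B'],
    'B': ['D'],
    'C': ['A', 'B'],
    'D': ['C']
}


def degree(graph, v):
    """Number of neighbors listed for v (the out-degree in a directed graph)."""
    return len(graph[v])


def in_degree(graph, v):
    """Number of vertices that list v as a neighbor (assumes no duplicate edges)."""
    return sum(v in neighbors for neighbors in graph.values())


def is_path(graph, vertices):
    """True if every consecutive pair of vertices is joined by an edge."""
    return all(b in graph[a] for a, b in zip(vertices, vertices[1:]))


print(degree(undirected_graph, 'B'))                # 3
print(in_degree(directed_graph, 'C'))               # 1
print(is_path(directed_graph, ['A', 'B', 'D', 'C']))  # True
print(is_path(directed_graph, ['B', 'A']))          # False
```

Note that `b in graph[a]` scans a list, so each check takes time proportional to the number of neighbors of `a`. Storing neighbors in sets instead of lists would be one way to speed up that check, at the cost of losing the neighbor order.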


# Coming Up

In this notebook, we introduced graphs, potential applications, key questions and ideas with respect to them, and how to represent them in a program.

In our next notebook, we will explore a couple of related graph algorithms, namely, how to explore graphs!