# Introduction to Graphs and Python
TODO - Do I need to do a quick introduction of Jupyter?

Graphs are very useful because many problems can be transformed into a graph representation such that a known algorithm can give you a solution. One common problem, given a graph, is finding the best path between two nodes. There are a variety of algorithms for this problem depending on your definition of 'best' path and the sort of graph you have. Graphs come in a variety of flavors. Here we'll deal with _directed_ graphs (A->B does not imply B->A) with _unweighted_ edges (the length of a path is just the number of edges in it; equivalently, these could be weighted graphs where all edges have the same edge weight).

Breadth-first search (BFS) prioritizes search near the source, spreading out slowly but searching all paths as it goes. It is _breadth_-first in the sense that it searches all nodes at a given depth from the source (e.g. two edges away) first before moving deeper (e.g. to nodes three edges away). Depth-first search (DFS) differs in that it searches deeper nodes first until there are no deeper nodes, then it moves up as little as possible to find a new path to move deeper on.
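To make the difference in search order concrete, here is a minimal sketch. It assumes the graph is stored as a dictionary mapping each node to the list of nodes it points to; the example graph and the function names `bfs_order` and `dfs_order` are made up for illustration and are not part of any code provided with this notebook.

```python
from collections import deque

# A small made-up directed graph: node -> list of nodes it points to.
graph = {
    'A': ['B', 'C'],
    'B': ['D'],
    'C': ['D', 'E'],
    'D': ['F'],
    'E': ['F'],
    'F': [],
}

def bfs_order(graph, source):
    """Return the nodes in the order BFS first reaches them."""
    visited = [source]
    queue = deque([source])
    while queue:
        node = queue.popleft()      # take the node that has waited longest
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.append(neighbor)
                queue.append(neighbor)
    return visited

def dfs_order(graph, source):
    """Return the nodes in the order DFS first reaches them."""
    visited = []
    stack = [source]
    while stack:
        node = stack.pop()          # take the node added most recently
        if node in visited:
            continue
        visited.append(node)
        # reversed so that the first-listed neighbor is explored first
        for neighbor in reversed(graph[node]):
            if neighbor not in visited:
                stack.append(neighbor)
    return visited

print(bfs_order(graph, 'A'))  # ['A', 'B', 'C', 'D', 'E', 'F']
print(dfs_order(graph, 'A'))  # ['A', 'B', 'D', 'F', 'C', 'E']
```

BFS visits both nodes one edge away (`B`, `C`) before anything two edges away, while DFS follows `A`, `B`, `D`, `F` all the way down before backing up to `C`. The only real difference between the two functions is whether waiting nodes are taken from the front (a queue) or from the back (a stack).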

Watch a visualization of BFS/DFS [here][visual]. Skip through the intro explanation of the tool unless you're interested. You can draw any graph you like by clicking "Draw Graph" in the orange menu on the left side (or have it generate a random graph for you). Once you have a nice graph, click either BFS or DFS to watch the algorithm go. You only have to enter the start node: the tool doesn't take a goal node as input, but keeps searching until it has explored every node it can reach from the source. As you watch, you can still see how long it would have taken to find a given node. If it moves too fast, you can pause the visualization (the controls are at the bottom of the screen) and use the forward/backward arrow keys to advance. You can also use +/- to change the speed.

BFS is guaranteed to find the shortest path (the one with the fewest edges) between the source and any other reachable node, because it explores shorter paths before longer ones. DFS does not have this guarantee; because it follows one path as deep as it can before trying alternatives, it may find a sub-optimal path. However, both are guaranteed to find some path if a path exists. You might expect that DFS would be quicker then, but the worst-case runtime for both is $O(V + E)$, where $V$ is the number of vertices in the graph and $E$ is the number of edges. DFS _is_ useful for some things, though; it is used, for example, when doing topological sorts.
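The sketch below shows one small case where the two searches return different paths. It is an illustration under assumptions, not the Node-based code this notebook refers to: the graph is a plain dictionary, and `bfs_path` and `dfs_path` are hypothetical helper names.

```python
from collections import deque

# Made-up graph with two routes from A to D: a direct edge and a detour.
graph = {
    'A': ['B', 'D'],
    'B': ['C'],
    'C': ['D'],
    'D': [],
}

def bfs_path(graph, source, goal):
    """Return a path with the fewest edges from source to goal, or None."""
    parent = {source: None}         # remembers how each node was reached
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None: # walk back to the source
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None

def dfs_path(graph, source, goal, visited=None):
    """Return some path from source to goal (maybe not the shortest), or None."""
    if visited is None:
        visited = set()
    visited.add(source)
    if source == goal:
        return [source]
    for neighbor in graph[source]:
        if neighbor not in visited:
            rest = dfs_path(graph, neighbor, goal, visited)
            if rest is not None:
                return [source] + rest
    return None

print(bfs_path(graph, 'A', 'D'))  # ['A', 'D']
print(dfs_path(graph, 'A', 'D'))  # ['A', 'B', 'C', 'D']
```

Both searches find a path, but DFS dives into `B` first and returns the three-edge detour, while BFS returns the one-edge path. Which path DFS finds depends on the order in which neighbors are listed.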

---------------------

I've written some code for a Node class below; you can use it if you want, adapt it, or write your own. There is also a Graph class (though you could just use a list) and a function that should help in specifying graphs. If you use the Node class below, all you need is the starting node (and to have the goal specified) in order to do your search. I wrote a little code that shows how to make a graph, set the goal node, and find a path using this code.

---------------------

Other useful things about graphs:
* One generally stores a graph as either an adjacency matrix or an edge list.
    * An adjacency matrix $A$ is a square $n \times n$ matrix, where $n$ is the number of nodes, which has $a_{ij}=1$ if there is an edge from node $i$ to node $j$ and $a_{ij}=0$ otherwise. One can do a large number of useful things with adjacency matrices. For example, from the definition you already know that $A^1$ gives the number of paths of length exactly 1 between pairs of nodes; this property holds for higher powers as well. That is, entry $(i, j)$ of $A^k$ is the number of paths of length exactly $k$ from node $i$ to node $j$, where a path is allowed to revisit nodes (strictly speaking, these are _walks_).
    * An edge list has elements that are tuples of nodes. If there is an edge from node $i$ to node $j$, then there will be an element $(i, j)$ in the edge list.
* There are a number of different _centralities_ that one can compute on graphs. These tell you which nodes are most important, for varying definitions of importance. The simplest is degree centrality: each node's degree centrality is just the same as its degree (the degree of a node is the number of edges it has; for directed graphs, one also distinguishes in-degree and out-degree for the number of in- and out-edges of a node).
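As a rough illustration of these ideas, the sketch below builds an adjacency matrix from an edge list, squares it, and reads off in- and out-degrees. The three-node graph is made up, and `mat_mult` is a hypothetical helper written in plain Python so that no extra library is needed.

```python
# Made-up directed graph on nodes 0, 1, 2, stored as an edge list.
edges = [(0, 1), (0, 2), (1, 2)]
n = 3

# Adjacency matrix: A[i][j] is 1 if there is an edge from i to j, else 0.
A = [[0] * n for _ in range(n)]
for i, j in edges:
    A[i][j] = 1

def mat_mult(X, Y):
    """Multiply two square matrices stored as lists of lists."""
    size = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

# Entry [i][j] of A2 counts the walks from i to j that use exactly two edges.
A2 = mat_mult(A, A)

# Degree centrality for a directed graph: row sums and column sums of A.
out_degree = [sum(row) for row in A]
in_degree = [sum(A[i][j] for i in range(n)) for j in range(n)]

print(A)           # [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
print(A2)          # [[0, 0, 1], [0, 0, 0], [0, 0, 0]]
print(out_degree)  # [2, 1, 0]
print(in_degree)   # [0, 1, 2]
```

The single 1 in `A2` corresponds to the one two-edge walk in this graph, 0 -> 1 -> 2.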

[visual]: https://visualgo.net/dfsbfs

## Graphs

Edges in a graph can also have **weights**. If the edges are roads, the weights might be the lengths of the roads. It might be better to drive along several short roads instead of just one long road, if the total distance is shorter!
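Here is a small sketch of that idea. The place names and road lengths are invented for illustration, and `route_length` is a hypothetical helper, not something defined elsewhere in this notebook.

```python
# Made-up road lengths in miles, keyed by (start, end) pairs.
road_length = {
    ('Home', 'School'): 10,
    ('Home', 'Park'): 3,
    ('Park', 'School'): 4,
}

def route_length(route):
    """Add up the weights of the edges along a route (a list of nodes)."""
    total = 0
    # zip pairs each node with the one after it: (Home, Park), (Park, School), ...
    for start, end in zip(route, route[1:]):
        total += road_length[(start, end)]
    return total

print(route_length(['Home', 'School']))          # 10
print(route_length(['Home', 'Park', 'School']))  # 7
```

The route through the park uses two edges instead of one, but its total weight is smaller.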

Directed, undirected

How to represent grids as graphs

### Applications
| Nodes | Edges | Problem |
|---|---|---|
| Cities | Roads | Find the shortest path between any two cities |

## In Python

A program is just a list of directions to your computer. There are many different languages that computers 'speak'; these each have their own strengths and weaknesses. Python is the name of the language we'll be using. I find it rather beginner-friendly, and a lot of great applications have been written in and for Python.

There are a few key parts of the language that we'll need to learn in order to tell our computers to do what we want, like solve problems involving graphs. We're going to go over this quickly now. If it feels a little overwhelming, don't worry; you'll get more practice later in the class.

### Numbers
Numbers in Python work just like you would expect! So do most arithmetic operators: `+`, `-`, `*`, and `/`. Just don't use commas in your numbers (e.g. write `10000` *not* `10,000`).

In [1]:
5 + 3

8

In [2]:
2.5 * 4

10.0

In [3]:
9 / 3

3.0

In [4]:
(1 + 2) * 3 + 4 / 2

11.0

### Comments
It's often useful to say something about what your code is doing, especially if there's a tricky part. You can do this using comments. Just put a `#` in front, and then you can write anything you want after. Python knows it's a comment, so it doesn't try to read it like code.

In [16]:
# I'm a comment, so nothing happens here!

In [17]:
5 + 3 # this is just some addition; everything before the comment is still code!

8

### Strings
A string is just text. It can contain any characters (letters, numbers, punctuation, etc.) and be of any length. You tell Python that something is a string by surrounding it with quotation marks (single or double quotes).

In [5]:
"I'm a string!"

"I'm a string!"

In [6]:
"String part 1" + ' part 2' # adding strings makes them into one

'String part 1 part 2'

In [7]:
"Single quotes in me are fine, but double quotes aren't!"

"Single quotes in me are fine, but double quotes aren't!"

In [8]:
'Double quotes in me "are" fine, but single quotes are not'

'Double quotes in me "are" fine, but single quotes are not'

In [15]:
print('If you must use single quotes in me, "escape" them like this:  \' ')
# use print to output strings

If you must use single quotes in me, "escape" them like this:  ' 


### Variables

### Functions

### Lists
A list can contain any number of elements, like numbers or strings. Use square brackets to let Python know it's a list, and use commas between the items in the list.

In [18]:
[1, 2, 3]

[1, 2, 3]

In [19]:
[5, 'apples', 4, 'peaches'] # numbers and strings in the same list!

[5, 'apples', 4, 'peaches']

In [20]:
# a list can even have other lists inside of it
[[1, 2, 3], [4, 5, 6]]

[[1, 2, 3], [4, 5, 6]]

In [21]:
# we can pull things out of a list by telling Python the position of the thing we want
# Python starts counting at 0, not at 1!
fruits = ['apples', 'pears', 'grapes']
print('The first fruit is', fruits[0]) # use a comma between multiple things to print them all at once
print('The last fruit is', fruits[2])

The first fruit is apples
The last fruit is grapes


In [22]:
# the len function tells us how long a list is (how many things are inside of it)
print("There are", len(fruits), "fruits.")

There are 3 fruits.


In [23]:
# we can even change what's inside of a list!
print('The second fruit used to be', fruits[1])
fruits[1] = 'mango'
print('Now the second fruit is', fruits[1])

The second fruit used to be pears
Now the second fruit is mango


In [None]:
# the append function adds a new thing to the end of a list
fruits.append('kiwi')