In [1]:
# setup
from IPython.display import display,HTML
display(HTML('<style>.prompt{width: 0px; min-width: 0px; visibility: collapse}</style>'))
display(HTML(open('../rise.css').read()))

# imports
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
sns.set(style="whitegrid", font_scale=1.5, rc={'figure.figsize':(12, 6)})

<h1>A-Star Algorithm</h1>

We continue to develop our maze program by investigating how some search methods can be used to solve the maze. The most important algorithm that we will see is the [A-star algorithm](https://cs.stanford.edu/people/eroberts/courses/soco/projects/2003-04/intelligent-search/astar.html#:~:text=A*'s%20optimality%20is,an%20optimal%20path%20to%20g.), which is a minor modification of Dijkstra's algorithm.

In our Labyrinth program, suppose that we have already implemented maze generation by creating a planar graph and selecting a set of edges to be 'opened.' For the final, the students will implement depth-first search and the A-star algorithm. We have already seen depth-first search in detail, so let us turn to the A-star algorithm.

<h3>Dijkstra's Algorithm Recap</h3>

Dijkstra's algorithm can be viewed as both a greedy algorithm and a dynamic program: we build up a tree of shortest distances from a given start vertex $s$ by greedily selecting the vertex on the frontier that minimizes the distance to $s$. At any step of the algorithm, when the explored tree is $t$, let $g_t(n)$ be the shortest distance from the start node $s$ to the node $n$ found so far, that is, along paths through $t$.
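
As a reminder of how the greedy selection works in practice, here is a minimal sketch of Dijkstra's algorithm. It is not the Labyrinth implementation: the adjacency-list format (`node -> list of (neighbor, edge_length)`), the function name `dijkstra`, and the toy graph are all assumptions made for illustration.

```python
import heapq

def dijkstra(graph, s):
    """Return a dict g mapping each reachable node to its shortest distance from s.

    Assumes `graph` maps node -> list of (neighbor, edge_length) pairs
    with non-negative edge lengths (a hypothetical format).
    """
    g = {s: 0.0}
    frontier = [(0.0, s)]   # priority queue keyed by tentative distance
    explored = set()        # nodes already added to the tree
    while frontier:
        dist, n = heapq.heappop(frontier)   # greedy step: closest frontier node
        if n in explored:
            continue                        # stale queue entry, skip it
        explored.add(n)
        for m, length in graph[n]:
            new_dist = dist + length
            if new_dist < g.get(m, float("inf")):
                g[m] = new_dist
                heapq.heappush(frontier, (new_dist, m))
    return g

# Hypothetical toy graph, for illustration only.
toy_graph = {
    "s": [("a", 1.0), ("b", 4.0)],
    "a": [("b", 2.0), ("r", 6.0)],
    "b": [("r", 1.0)],
    "r": [],
}
distances = dijkstra(toy_graph, "s")
```

In this toy graph the shortest route to `"r"` goes through `"a"` and then `"b"`, not along the direct edge from `"a"`.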

<h3>A-Star Algorithm</h3>

The A-star algorithm is very similar. If we know the location of the final node $r$, then we can estimate the length of the shortest path from a node $n$ to $r$ using the Euclidean distance between them, which we denote $h(n)$. Instead of greedily choosing the frontier node that minimizes $g_t(n)$, we greedily choose the one that minimizes $f_t(n) = g_t(n)+h(n)$. This defines the A-star algorithm.
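
The change from Dijkstra's algorithm is small enough to show in a short sketch. The code below assumes a hypothetical representation in which `graph` maps each node to a list of neighbors, `pos` maps each node to its $(x, y)$ coordinates, and edge lengths are Euclidean distances. The names `a_star`, `maze`, and `pos` are illustrative and are not taken from the Labyrinth program. Because the Euclidean heuristic also satisfies the triangle inequality, this sketch never needs to revisit an explored node; that shortcut may not be valid for other choices of $h$.

```python
import heapq
import math

def a_star(graph, pos, s, r):
    """Return (length, path) for a shortest route from s to r, or (inf, []) if none.

    Assumes graph: node -> list of neighbors, pos: node -> (x, y),
    and that each edge's length is the Euclidean distance between its endpoints.
    """
    def dist(a, b):
        return math.hypot(pos[a][0] - pos[b][0], pos[a][1] - pos[b][1])

    def h(n):
        return dist(n, r)            # heuristic: straight-line distance to r

    g = {s: 0.0}                     # best known distance from s
    parent = {s: None}               # tree edges, used to recover the path
    frontier = [(h(s), s)]           # priority queue keyed by f = g + h
    explored = set()
    while frontier:
        _, n = heapq.heappop(frontier)   # greedy step: smallest f on the frontier
        if n == r:
            path = []
            while n is not None:
                path.append(n)
                n = parent[n]
            return g[r], path[::-1]
        if n in explored:
            continue                     # stale queue entry, skip it
        explored.add(n)
        for m in graph[n]:
            new_g = g[n] + dist(n, m)
            if new_g < g.get(m, math.inf):
                g[m] = new_g
                parent[m] = n
                heapq.heappush(frontier, (new_g + h(m), m))
    return math.inf, []

# Hypothetical 3x3 maze: nodes are grid cells, listed edges are 'opened'.
maze = {
    (0, 0): [(1, 0), (0, 1)],
    (1, 0): [(0, 0), (2, 0)],
    (2, 0): [(1, 0), (2, 1)],
    (0, 1): [(0, 0), (0, 2)],
    (0, 2): [(0, 1), (1, 2)],
    (1, 2): [(0, 2), (2, 2)],
    (2, 2): [(1, 2), (2, 1)],
    (2, 1): [(2, 0), (2, 2)],
}
pos = {cell: cell for cell in maze}   # each cell sits at its own coordinates
length, path = a_star(maze, pos, (0, 0), (2, 1))
```

Replacing `h` with a function that always returns zero turns this sketch back into Dijkstra's algorithm.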

More generally, we can use any function $h(n)$ in place of the Euclidean distance. For example, in solving the 2x2x2 Pocket cube, we set $h(n)$ to be the minimum of the number of moves needed to correctly position all of the cubies and the number of moves needed to correctly orient all of the cubies.

We will show that the A-star algorithm finds the shortest path from $s$ to $r$ as long as $h(n)$ is always an underestimate, that is, $h(n)$ never exceeds the length of the shortest path from $n$ to $r$. When the graph is embedded in $\mathbb{R}^2$, $h(n)$ is the Euclidean distance, and the edges are measured by their Euclidean lengths, $h(n)$ is always an underestimate, because it measures the length of the most direct route. When the graph has the states of the pocket cube as its vertices and pocket cube moves as its edges, and $h(n)$ is the heuristic function for the pocket cube, $h(n)$ is still an underestimate because, in order to solve the cube, we at least need to put all of the cubies in the right places and correctly orient them.
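
For a small graph we can check the underestimate condition directly by comparing $h(n)$ with the true remaining distance at every node. The sketch below assumes an undirected graph stored as `node -> list of neighbors` with coordinates in `pos`; the helper names `true_distances_to` and `is_underestimate` and the example maze are hypothetical.

```python
import heapq
import math

def edge_length(pos, a, b):
    return math.hypot(pos[a][0] - pos[b][0], pos[a][1] - pos[b][1])

def true_distances_to(graph, pos, r):
    """Shortest distance from every node to r, found by running Dijkstra from r.

    Assumes the graph is undirected, so distances to r equal distances from r.
    """
    d = {r: 0.0}
    frontier = [(0.0, r)]
    done = set()
    while frontier:
        dist, n = heapq.heappop(frontier)
        if n in done:
            continue
        done.add(n)
        for m in graph[n]:
            new_dist = dist + edge_length(pos, n, m)
            if new_dist < d.get(m, math.inf):
                d[m] = new_dist
                heapq.heappush(frontier, (new_dist, m))
    return d

def is_underestimate(h, graph, pos, r, tol=1e-9):
    """True if h(n) never exceeds the true shortest distance from n to r."""
    d = true_distances_to(graph, pos, r)
    return all(h(n) <= d[n] + tol for n in d)

# Hypothetical 3x3 maze: nodes are grid cells, listed edges are 'opened'.
maze = {
    (0, 0): [(1, 0), (0, 1)],
    (1, 0): [(0, 0), (2, 0)],
    (2, 0): [(1, 0), (2, 1)],
    (0, 1): [(0, 0), (0, 2)],
    (0, 2): [(0, 1), (1, 2)],
    (1, 2): [(0, 2), (2, 2)],
    (2, 2): [(1, 2), (2, 1)],
    (2, 1): [(2, 0), (2, 2)],
}
pos = {cell: cell for cell in maze}
goal = (2, 1)

def euclid(n):
    return edge_length(pos, n, goal)

def inflated(n):
    return 3 * euclid(n)     # deliberately too large, for contrast

euclid_ok = is_underestimate(euclid, maze, pos, goal)
inflated_ok = is_underestimate(inflated, maze, pos, goal)
```

Here the Euclidean heuristic passes the check, while the inflated one fails: at cell `(2, 0)` it reports 3 although the goal is one step away.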

<h3>Why does A-star find the shortest path?</h3>

We have claimed that the A-star algorithm always returns the shortest path from $s$ to $r$ when $h(n)$ is an underestimate of the length of the shortest path from $n$ to $r$. Let's see how to prove this, following [this resource](https://cs.stanford.edu/people/eroberts/courses/soco/projects/2003-04/intelligent-search/astar.html#:~:text=A*'s%20optimality%20is,an%20optimal%20path%20to%20g.).

Suppose, for contradiction, that the algorithm first finds $r$ via an indirect route in a tree $t$, colored green, when a shorter route, in a tree $t^\prime$ colored pink, is available. Let $n$ be the first node along this shorter route from $s$ to $r$ that is not in the tree $t$.


<img src="figures/astar.jpg" width="30%">

Note: the node labeled $t$ in the figure above is the final node, which we call $r$ in the text.

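Since $r$ is visited before $n$, we must have $f_t(r)\leq f_t(n)$. Recall that $t^\prime$ contains the shortest route from $s$ to $r$ and that $n$ lies on this route, so $g_{t^\prime}(r)$ equals $g_{t^\prime}(n)$ plus the length of the rest of the route from $n$ to $r$. Because $h(n)$ is an underestimate of that remaining length and $h(r)=0$, we get $f_{t^\prime}(r) \geq f_{t^\prime}(n)$. Putting this together we find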

\begin{align*}
f_{t^\prime}(r) \geq f_{t^\prime}(n) = f_{t}(n) \geq f_t(r).
\end{align*}

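The equality in the middle holds because of the choice of $n$: every node before $n$ on the shorter route already belongs to $t$, so $n$ sits on the frontier of $t$ at the same distance from $s$, that is, $g_t(n)=g_{t^\prime}(n)$.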

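Note that $h$ does not depend on the tree, and $h(r)=0$ because $r$ is the final node. Hence $f_t(r)=g_t(r)$ and $f_{t^\prime}(r)=g_{t^\prime}(r)$, and the chain of inequalities gives $g_{t^\prime}(r)\geq g_t(r)$.

This contradicts the assumption that $t^\prime$ contains a shorter route than $t$ from $s$ to $r$.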