# A* algorithm - step by step
In the following, we develop the necessary code to implement the A* algorithm in C++.

## Store a grid in your program
To write the A* search algorithm, we first need a grid to search through.

Let's start with a hard-coded grid.

In [1]:
#include <iostream>
#include <vector>
using std::cout;
using std::vector;

vector<vector<int>> board = {
    {0, 1, 0, 0, 0, 0},
    {0, 1, 0, 0, 0, 0},
    {0, 1, 0, 0, 0, 0},
    {0, 1, 0, 0, 0, 0},
    {0, 0, 0, 0, 1, 0}
};

for (const vector<int> &row : board) {
    for (int value : row) {
        cout << value << " ";
    }
    cout << "\n";
}

0 1 0 0 0 0 
0 1 0 0 0 0 
0 1 0 0 0 0 
0 1 0 0 0 0 
0 0 0 0 1 0 


## Create a function to print the board: PrintBoard
Let's put the printing of the board into a function.

In [2]:
#include <iostream>
#include <vector>
using std::cout;
using std::vector;

In [3]:
// When I try using vector instead of std::vector, the
// compiler complains about that.
// TODO: Look into this in more detail and potentially raise an issue
// at: https://github.com/jupyter-xeus/xeus-cling/issues
void PrintBoard(const std::vector<std::vector<int>> &board) {
    for (const std::vector<int> &row : board) {
        for (int value : row) {
            cout << value << " ";
        }
        cout << "\n";
    }
}

In [4]:
PrintBoard(board)

0 1 0 0 0 0 
0 1 0 0 0 0 
0 1 0 0 0 0 
0 1 0 0 0 0 
0 0 0 0 1 0 


## Create a function to read the board from a file: ReadBoardFile

In [5]:
#include <iostream>
#include <fstream>
#include <string>

In [6]:
void ReadBoardFile(const std::string &path) {
    std::ifstream board_file(path);
    if (board_file) {
        std::string line;
        while (getline(board_file, line)) {
            std::cout << line << "\n";
        }
    }
}

In [7]:
ReadBoardFile("files/1.board")

0,1,0,0,0,0,
0,1,0,0,0,0,
0,1,0,0,0,0,
0,1,0,0,0,0,
0,0,0,0,1,0,


## Create a function to parse lines into vectors: ParseLine

In [8]:
#include <sstream>

In [9]:
std::vector<int> ParseLine(const std::string &line) {
    // Assumption: Each line of the board looks like this: 1, 0, 0, 0,
    std::vector<int> v;
    std::istringstream str_stream(line);
    
    char comma;
    int number;
    
    while (str_stream >> number >> comma && comma == ',') {
        v.push_back(number);
    }
    return v;
}
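
Because the loop in `ParseLine` only stores a number once a comma has been read after it, the last number of a line without a trailing comma is dropped. The following is a minimal sketch of a more tolerant variant, assuming that such lines should still be parsed completely. The name `ParseLineLenient` is hypothetical and not part of the notebook.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical variant of ParseLine: also keeps the last number when the
// line does not end with a comma.
std::vector<int> ParseLineLenient(const std::string &line) {
    std::vector<int> v;
    std::istringstream str_stream(line);

    int number;
    while (str_stream >> number) {
        v.push_back(number);
        char separator;
        // Stop when nothing follows the number or the separator is not a comma.
        if (!(str_stream >> separator) || separator != ',') {
            break;
        }
    }
    return v;
}
```

With this variant, `"0,1,0,"` and `"0,1,0"` both yield the same three values.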

Let's update `ReadBoardFile` to use the `ParseLine` function.

In [10]:
using std::string;
using std::vector;
using std::ifstream;

In the following, we use `auto` as the return type instead of `vector<vector<int>>` because xeus-cling throws an error in the latter case ([source](https://github.com/jupyter-xeus/xeus-cling/issues/40)).

In [11]:
auto ReadBoardFile(const string &path) {
    std::ifstream board_file(path);
    std::vector<std::vector<int>> board;
    if (board_file) {
        std::string line;
        while (getline(board_file, line)) {
            board.push_back(ParseLine(line));
        }
    }
    return board;
}

In [12]:
std::vector<std::vector<int>> board = ReadBoardFile("files/1.board");
PrintBoard(board);

0 1 0 0 0 0 
0 1 0 0 0 0 
0 1 0 0 0 0 
0 1 0 0 0 0 
0 0 0 0 1 0 


## Formatting the Printed Board
The board will eventually have more than two cell states as the program becomes more complicated. Let's add formatting to improve the readability. We'll use formatting moving forward.

Let's print the board this way:
```
0   ⛰️   0   0   0   0
0   ⛰️   0   0   0   0
0   ⛰️   0   0   0   0
0   ⛰️   0   0   0   0
0   0    0   0  ⛰️   0
```

In [13]:
enum class State {kEmpty, kObstacle};

In [14]:
string CellString(const State &state) {
    switch (state) {
        case State::kEmpty:
            return "0 ";
        case State::kObstacle:
            return "⛰️ ";
        default:
            throw std::runtime_error("Unknown State");
    }
}

In [15]:
void PrintBoard(const std::vector<std::vector<int>> &board) {
    for (const auto &row : board) {
        for (int value : row) {
            cout << CellString(State(value));
        }
        cout << "\n";
    }
}

In [16]:
std::vector<std::vector<int>> board = ReadBoardFile("files/1.board");
PrintBoard(board);

0 ⛰️ 0 0 0 0 
0 ⛰️ 0 0 0 0 
0 ⛰️ 0 0 0 0 
0 ⛰️ 0 0 0 0 
0 0 0 0 ⛰️ 0 


# Putting it all together
Next, we're going to refactor all the functions to use `State` exclusively. Here is an overview of the API:

```Cpp
enum class State {kEmpty, kObstacle};

// Transform a state into a string
std::string CellString(const State &state) {...}

// Print the board to the standard output stream
void PrintBoard(const std::vector<std::vector<State>> &board) {...}

// Transforms a line into a State vector
std::vector<State> ParseLine(const std::string &line) {...}

// Open the file containing the board, read the file line by line,
// transform 0 & 1 values to State values, return a 2D vector containing
// the board.
std::vector<std::vector<State>> ReadBoardFile(const std::string &path) {...}
```

In [17]:
#include <fstream>
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>
using std::cout;
using std::ifstream;
using std::istringstream;
using std::string;
using std::vector;

// Defined above, can't define it again else it throws an error
// enum class State {kEmpty, kObstacle};

In [18]:
std::string CellString(const State &state) {
    switch (state) {
        case State::kEmpty:
            return "0 ";
        case State::kObstacle:
            return "⛰️";
        default:
            throw std::runtime_error("Unknown State");
    }
}

In [19]:
void PrintBoard(const std::vector<std::vector<State>> &board) {
    for (const std::vector<State> &row : board) {
        for (State state : row) {
            std::cout << CellString(state) << " ";
        }
        std::cout << "\n";
    }
}

In [20]:
std::vector<State> ParseLine(const std::string &line) {
    std::vector<State> v;
    std::istringstream str_stream(line);
    
    int num;
    char comma;
    while(str_stream >> num >> comma && comma == ',') {
        v.push_back(State(num));
    }
    return v;
}
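
`State(num)` converts any integer, so a value other than 0 or 1 is only caught later, when `CellString` throws. A minimal sketch of a checked conversion follows, assuming that bad input should be rejected while parsing. The helper name `ToState` is hypothetical. The enum is repeated only so that the block compiles on its own and should be omitted inside the notebook.

```cpp
#include <stdexcept>
#include <string>

// Repeated from above so the sketch is self-contained; omit in the notebook.
enum class State {kEmpty, kObstacle};

// Hypothetical helper: maps 0 and 1 to a State and rejects everything else.
State ToState(int value) {
    switch (value) {
        case 0:
            return State::kEmpty;
        case 1:
            return State::kObstacle;
        default:
            throw std::invalid_argument("Unknown cell value: " +
                                        std::to_string(value));
    }
}
```

`ParseLine` could then call `v.push_back(ToState(num));` instead of casting directly.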

In [21]:
// auto refers to std::vector<std::vector<State>>
// have to use auto, else an error is thrown due to xeus-cling
auto ReadBoardFile(const std::string &path) {
    std::ifstream board_file(path);
    std::vector<std::vector<State>> board;
    if (board_file) {
        std::string line;
        while (getline(board_file, line)) {
            board.push_back(ParseLine(line));
        }
        return board;
    } else {
        throw std::runtime_error("File stream failed.");
    }
}

In [22]:
PrintBoard(ReadBoardFile("files/1.board"));

0  ⛰️ 0  0  0  0  
0  ⛰️ 0  0  0  0  
0  ⛰️ 0  0  0  0  
0  ⛰️ 0  0  0  0  
0  0  0  0  ⛰️ 0  


## Summary
Content:
* Sending output to the terminal
* Variables & containers
  * Variable types
  * Vectors
  * auto
  * Define own types with enums
* Functions and control structures
  * Conditionals
  * Loops
  * Functions
* Data Input
  * Read data from a file
  * Parse data and process strings

Wrote a program that:
* Reads an input text file (board data)
* Parses, formats, and stores the data locally
  * Defined an enum type to add character formatting to the output
* Prints the output

# A* algorithm

A* efficiently finds a path between two points in a grid.

The A* algorithm is frequently used for path finding when working with graphs.

So far, we've developed code that reads board data from an input text file, parses, formats, and stores the data locally, and prints the board to the output.

In the following, we'll add a step between storing the data and printing the board:
* Reads an input text file (board data)
* Parses, formats, and stores the data locally
  * Defined an enum type to add character formatting to the output
* Find a path using A* search
* Prints the output

The A* algorithm is a discrete planning method.

### General Discrete Path Search Algorithm
**Given:**
* Map
* Starting location
* Goal location
* Cost function

**Goal:**
* Find minimum cost path

**Method:**
* Keep a list of nodes to investigate further (expand)
* Keep track of which nodes were visited
* Keep track of the g-value (the number of steps taken so far)
* We always expand the node with the smallest g-value
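
The method above can be sketched as follows. This is a minimal illustration, not the notebook's implementation, and it makes several assumptions: the grid uses `0` for free cells and `1` for obstacles, moves go up, down, left, or right at a cost of one step each, `x` is the row index and `y` the column index, and the function name `MinSteps` is hypothetical.

```cpp
#include <array>
#include <functional>
#include <queue>
#include <tuple>
#include <vector>

// Hypothetical helper: returns the minimum number of steps from start to
// goal, or -1 if the goal cannot be reached. Assumes a rectangular grid
// and a start cell that lies on the grid.
int MinSteps(const std::vector<std::vector<int>> &grid,
             std::array<int, 2> start, std::array<int, 2> goal) {
    const int rows = static_cast<int>(grid.size());
    const int cols = rows > 0 ? static_cast<int>(grid[0].size()) : 0;

    // Tracks which nodes were already visited.
    std::vector<std::vector<bool>> visited(rows, std::vector<bool>(cols, false));

    // Min-heap of (g, x, y): the node with the smallest g-value comes first.
    using Node = std::tuple<int, int, int>;
    std::priority_queue<Node, std::vector<Node>, std::greater<Node>> open;
    open.emplace(0, start[0], start[1]);
    visited[start[0]][start[1]] = true;

    const int delta[4][2] = {{-1, 0}, {0, -1}, {1, 0}, {0, 1}};
    while (!open.empty()) {
        const Node current = open.top();
        open.pop();
        const int g = std::get<0>(current);
        const int x = std::get<1>(current);
        const int y = std::get<2>(current);
        if (x == goal[0] && y == goal[1]) {
            return g;
        }
        // Expand the node: add every free, unvisited neighbor.
        for (const auto &d : delta) {
            const int nx = x + d[0];
            const int ny = y + d[1];
            const bool on_grid = nx >= 0 && nx < rows && ny >= 0 && ny < cols;
            if (on_grid && grid[nx][ny] == 0 && !visited[nx][ny]) {
                visited[nx][ny] = true;
                open.emplace(g + 1, nx, ny);
            }
        }
    }
    return -1;  // The open list is empty: no path exists.
}
```

On the board used throughout this notebook, going from the top-left to the bottom-right corner takes 11 steps under these assumptions.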

### A* Algorithm
The difference between the A* algorithm and the general discrete path search algorithm is that A* uses a heuristic function that returns the number of steps it would take to get to the goal if there were no obstacles.

The heuristic function is therefore an optimistic guess of how far we are from the goal (because it assumes that there are no obstacles):

$$h(x,y) \leq \text{distance to goal from } x, y$$

The heuristic function is an underestimate of, or at best equal to, the true distance from the current cell to the goal.

Many functions are valid heuristics, including one that returns $0$ everywhere, but that's not really useful here. The _Euclidean distance_ can be used to calculate $h(x, y)$.
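
As a minimal sketch, here are two heuristics that satisfy the inequality above, assuming that moves go up, down, left, or right and that each move costs one step. The function names are illustrative and not part of the notebook.

```cpp
#include <cmath>
#include <cstdlib>

// Number of steps between two cells on an obstacle-free grid when only
// horizontal and vertical moves are allowed.
int ManhattanDistance(int x1, int y1, int x2, int y2) {
    return std::abs(x2 - x1) + std::abs(y2 - y1);
}

// Straight-line distance between two cells. It is never larger than the
// Manhattan distance, so it is also an optimistic guess.
double EuclideanDistance(int x1, int y1, int x2, int y2) {
    return std::hypot(static_cast<double>(x2 - x1),
                      static_cast<double>(y2 - y1));
}
```

Under the stated assumption, the Manhattan distance is the tighter of the two estimates, which usually means fewer cells have to be expanded.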

#### Modifying the general discrete path search to the A* search
**Given:**
* Map
* Starting location
* Goal location
* Cost function
* Heuristic function $h(x, y)$

**Goal:**
* Find minimum cost path

**Method:**
* Keep a list of nodes to investigate further (expand)
* Keep track of which nodes were visited
* Keep track of the g-value (the number of steps taken so far)
* Keep track of the f-value: $f = g + h(x, y)$
* We always expand the node with the smallest f-value
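
The bookkeeping for the f-value can be sketched as follows. The `Node` struct and the functions `SortByF` and `PopBest` are hypothetical names chosen for this illustration. The sketch sorts the open list in descending order of $f$ so that the best node can be removed cheaply from the back.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical node layout: coordinates, cost so far (g), and heuristic (h).
struct Node {
    int x;
    int y;
    int g;
    int h;
    int f() const { return g + h; }  // f = g + h(x, y)
};

// Sorts the open list in descending order of f, so the node with the
// smallest f-value ends up at the back.
void SortByF(std::vector<Node> &open) {
    std::sort(open.begin(), open.end(),
              [](const Node &a, const Node &b) { return a.f() > b.f(); });
}

// Removes and returns the node with the smallest f-value.
// Precondition: open is not empty.
Node PopBest(std::vector<Node> &open) {
    SortByF(open);
    Node best = open.back();
    open.pop_back();
    return best;
}
```

A `std::priority_queue` would avoid re-sorting on every iteration. The sort-and-pop approach is shown here only because it mirrors the description above most directly.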

#### A* Pseudocode
```
Search(grid, start_point, goal_point):
  1. Initialize an empty list of open nodes
  2. Initialize a starting node with the following:
    * x, y values given by start_point
    * g = 0 (cost of the path so far)
    * h (heuristic function, a function of the current coordinates and the goal)
  3. Add the new node to the list of open nodes.
  4. WHILE the list of open nodes is nonempty:
    1. Sort the open list by f-value
    2. Pop the optimal cell (called the current cell)
    3. Mark the cell's coordinates in the grid as part of the path
    4. IF the current cell is the goal cell:
      * return the grid
    5. ELSE expand the search to the current node's neighbors:
      * Check each neighbor cell in the grid to ensure that the cell is empty: It has not been closed and is not an obstacle.
      * If the cell is empty, compute the cost (g value) and the heuristic, and add to the list of open nodes.
      * Mark the cell as closed.
  5. If the while loop exits because the list of open nodes is empty, then there are no new nodes to explore and there doesn't exist a path.
```
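
The pseudocode above could be turned into C++ roughly as follows. This is a sketch under stated assumptions rather than a definitive implementation: the `Cell` enum (with the extra `kClosed` and `kPath` states), the `Node` struct, and the `Heuristic` function are hypothetical, moves go up, down, left, or right at a cost of one step each, the Manhattan distance serves as the heuristic, `x` is the row index and `y` the column index, and an empty grid is returned when no path exists.

```cpp
#include <algorithm>
#include <array>
#include <cstdlib>
#include <vector>

// Hypothetical cell states; kClosed and kPath are assumptions of this sketch.
enum class Cell { kEmpty, kObstacle, kClosed, kPath };

struct Node {
    int x, y, g, h;
};

// Assumed heuristic: Manhattan distance to the goal.
int Heuristic(int x1, int y1, int x2, int y2) {
    return std::abs(x2 - x1) + std::abs(y2 - y1);
}

// Assumes a rectangular grid and a start cell that lies on the grid.
std::vector<std::vector<Cell>> Search(std::vector<std::vector<Cell>> grid,
                                      std::array<int, 2> start,
                                      std::array<int, 2> goal) {
    const int rows = static_cast<int>(grid.size());
    const int cols = rows > 0 ? static_cast<int>(grid[0].size()) : 0;

    // Steps 1-3: create the open list and add the starting node.
    std::vector<Node> open;
    open.push_back({start[0], start[1], 0,
                    Heuristic(start[0], start[1], goal[0], goal[1])});
    grid[start[0]][start[1]] = Cell::kClosed;

    const int delta[4][2] = {{-1, 0}, {0, -1}, {1, 0}, {0, 1}};
    while (!open.empty()) {
        // Steps 4.1-4.2: sort descending by f = g + h and pop the best node.
        std::sort(open.begin(), open.end(), [](const Node &a, const Node &b) {
            return a.g + a.h > b.g + b.h;
        });
        const Node current = open.back();
        open.pop_back();

        // Step 4.3: mark the current cell as part of the path.
        grid[current.x][current.y] = Cell::kPath;

        // Step 4.4: stop when the goal is reached.
        if (current.x == goal[0] && current.y == goal[1]) {
            return grid;
        }

        // Step 4.5: add every empty neighbor to the open list and close it.
        for (const auto &d : delta) {
            const int nx = current.x + d[0];
            const int ny = current.y + d[1];
            const bool on_grid = nx >= 0 && nx < rows && ny >= 0 && ny < cols;
            if (on_grid && grid[nx][ny] == Cell::kEmpty) {
                open.push_back({nx, ny, current.g + 1,
                                Heuristic(nx, ny, goal[0], goal[1])});
                grid[nx][ny] = Cell::kClosed;
            }
        }
    }
    // Step 5: the open list is empty, so there is no path.
    return {};
}
```

Because the pseudocode marks every expanded cell, the cells set to `Cell::kPath` can include cells that were explored but do not lie on the final route.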

#### Summary
The A* algorithm finds a path from start node to end node by: 
* checking for open neighbors of the current node
* computing a heuristic for each of the neighbors
* adding those neighbors to the list of open nodes to explore next
* choosing the next node to explore with `min(