### Learning Objectives
* To use `<random>` to generate random numbers 
* To use `<chrono>` to measure running times
* To implement sequential and binary search on arrays

### Instructions
Read and study the following sections, run their code examples and solve their challenges. This worksheet has the following challenges:
* [CHALLENGE 01](#ch01)
* [CHALLENGE 02](#ch02)
* [CHALLENGE 03](#ch03)

Run your coding challenges and fix any errors they might have before downloading and submitting your completed worksheet for grading. When done, open the menu **File >> Download as >> HTML (.html)** to download your worksheet in HTML format. **Submit the downloaded *.html* file via Canvas**.

# Important C++ utilities: random numbers and measuring running times
## Generating random numbers
Before C++11, we relied on the C function `rand()` from `<cstdlib>` to generate random numbers. C++11 changes that by providing a dedicated `<random>` header file which hugely improves our ability to generate all sorts of random numbers. Using the `<random>` header file involves working with two fundamental notions:

- **Engines**: An engine is an object generating a uniformly distributed sequence of integers.
- **Distributions**: A distribution is a function object that takes the sequence of integers produced by the engine and generates a sequence of values according to the mathematical formula of that distribution.
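To illustrate how one engine can feed different distributions, here is a minimal sketch. The helper names `sample_reals` and `sample_ints` are made up for this example, and `std::mt19937` is just one of the engines that `<random>` provides; it is not the engine used elsewhere in this worksheet.

```cpp
#include <random>
#include <vector>

// Hypothetical helper: draw `count` real numbers uniformly from [low, high).
std::vector<double> sample_reals(std::mt19937& engine, double low, double high, int count) {
    std::uniform_real_distribution<double> dist{low, high};
    std::vector<double> samples;
    samples.reserve(count);
    for (int i = 0; i < count; i++) {
        samples.push_back(dist(engine)); // the distribution pulls raw integers from the engine
    }
    return samples;
}

// Hypothetical helper: draw `count` integers uniformly from the closed range [low, high].
std::vector<int> sample_ints(std::mt19937& engine, int low, int high, int count) {
    std::uniform_int_distribution<int> dist{low, high};
    std::vector<int> samples;
    samples.reserve(count);
    for (int i = 0; i < count; i++) {
        samples.push_back(dist(engine)); // same engine, different distribution
    }
    return samples;
}
```

The engine is passed by reference so that both helpers advance the same underlying sequence instead of working on copies.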

Say, for instance, we want to generate 30 random numbers between 75 and 100. We start by including the `<random>` header file. Then we define an engine object and a distribution object:

In [None]:
#include <random>

std::default_random_engine en;
std::uniform_int_distribution<> dist{75,100};

The `dist` (the random distribution) is a function object. That means it overloads the `()` operator which allows it to be called like a function. Using `en` and `dist`, we can now generate the required random numbers.

In [None]:
#include <iostream>

for(int i = 0; i < 30; i++){
  std::cout << dist(en);
  if(i < 29) std::cout << ",";
}

Notice how `en` is passed as an argument to the function object `dist`. We can even put this into a function like this:

In [None]:
int random_int(int min, int max){
  static std::default_random_engine en; // created once, keeps its state between calls
  std::uniform_int_distribution<int> dist{min, max}; // rebuilt each call so min and max are honored
    
  return dist(en);
}

Notice the use of `static` in front of the local variable `en` (the random engine). This is so that the engine is created only once (during the first call of this function) and retains its state across subsequent calls. The distribution `dist` is deliberately not `static`: a `static` distribution would be initialized with the `min` and `max` of the first call only and would ignore the arguments of every later call. Here is an example using this function to simulate rolling a die 100 times. The results are displayed as a histogram.

In [None]:
#include <string>
int results[] = {0, 0, 0, 0, 0, 0};
for(int i = 0; i < 100; i++){
  results[random_int(0, 5)]++;
}

// Display the results histogram
for(int i = 0; i < 6; i++){
  std::cout << i + 1 << ": "
            << std::string(results[i], '*') //repeats char '*' results[i] times
            << std::endl;   
}

The only issue with the above program is that it produces the same results every time it is run. This is because a default-constructed random engine always starts from the same default seed, so it always generates the same sequence. To make sure new random numbers are generated every time you run the program, provide a different seed value (an integer) on every run and pass it to the random engine. One common way to do that is to use the system time as the seed value, which involves using the `<chrono>` header file.

In [None]:
#include <chrono>
int seeded_random_int(int min, int max){
  // Seed the engine once, on the first call, with the current system time
  static std::default_random_engine en(
      std::chrono::system_clock::now().time_since_epoch().count());
  std::uniform_int_distribution<int> dist{min, max}; // rebuilt each call so min and max are honored
    
  return dist(en);
}

The expression `std::chrono::system_clock::now().time_since_epoch().count()` returns the number of clock ticks that have passed since the clock's epoch, which is typically Thu Jan 1, 1970 00:00:00 UTC.
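To see what a seed does, the sketch below rolls a die with an engine built from an explicit seed: the same seed reproduces the same sequence. The helper names `roll_dice` and `fresh_seed` are made up for this example. `std::random_device` is another seed source provided by `<random>`; it asks the system for a non-deterministic value where one is available.

```cpp
#include <random>
#include <vector>

// Hypothetical helper: roll a six-sided die `count` times
// using an engine that starts from the given seed.
std::vector<int> roll_dice(unsigned int seed, int count) {
    std::default_random_engine engine(seed);
    std::uniform_int_distribution<int> die{1, 6};
    std::vector<int> rolls;
    rolls.reserve(count);
    for (int i = 0; i < count; i++) {
        rolls.push_back(die(engine));
    }
    return rolls;
}

// Hypothetical helper: obtain a seed from the system instead of the clock.
unsigned int fresh_seed() {
    std::random_device device;
    return device();
}
```

Calling `roll_dice(42, 10)` twice returns the same ten rolls both times, which is handy when debugging. Calling `roll_dice(fresh_seed(), 10)` will usually give a different sequence on every run.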

## Measuring running time using `<chrono>`
We can use the facilities of the `<chrono>` header file to measure the time it takes to run one or more operations. The idea is simple: we take a high-resolution snapshot of the system time before the operations start and another snapshot after they finish. Then we compute the duration between the two snapshots and display it in the appropriate time unit (nanoseconds, microseconds, etc.).

Let's repeat the above experiment of throwing a die 100 times but this time we'll record the time it takes to do the experiment from generating the random numbers to displaying the output histogram. 


In [None]:
#include <iostream>
#include <string>
#include <chrono>
using namespace std;

auto start = chrono::high_resolution_clock::now(); // Take this first time snapshot

int data[] = {0, 0, 0, 0, 0, 0};
for(int i = 0; i < 100; i++){
  data[seeded_random_int(0, 5)]++;
}

// Display the results histogram
for(int i = 0; i < 6; i++){
  cout << i + 1 << ": "
       << string(data[i], '*') //repeats char '*' data[i] times
       << endl;   
}

auto end = chrono::high_resolution_clock::now(); // Take the second time snapshot

chrono::duration<double, nano> running_time = end - start;

cout << "Experiment took " << running_time.count() << " nanoseconds to run." << endl;


Here, we are using the `high_resolution_clock` and `duration<>` facilities defined by the `<chrono>` header file. These facilities are defined within a namespace called `chrono` nested within the `std` namespace. This is why we put `chrono::` in front of these facilities.
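The snapshot-before and snapshot-after pattern can be wrapped in a small reusable helper. The following is a minimal sketch, not part of the worksheet's own code: the name `time_in_microseconds` is made up, and it uses `std::chrono::steady_clock` rather than `high_resolution_clock` because `steady_clock` is guaranteed never to go backwards, which suits interval measurements.

```cpp
#include <chrono>

// Hypothetical helper: run `work` once and return the elapsed
// wall-clock time in microseconds.
template <typename Callable>
double time_in_microseconds(Callable work) {
    auto start = std::chrono::steady_clock::now(); // first snapshot
    work();                                        // the operations being timed
    auto end = std::chrono::steady_clock::now();   // second snapshot

    std::chrono::duration<double, std::micro> elapsed = end - start;
    return elapsed.count();
}
```

Any callable that takes no arguments can be passed in, for example a lambda such as `[]{ /* code to time */ }`. A single measurement can be noisy, so timing the same work several times and comparing the results is usually more informative.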

At the core of the `chrono` library are the fundamental notions of:
- **Clocks**: A clock has a start point (an epoch) and a tick rate. For example, a clock may start on January 1, 1970 and tick every second. The above `chrono::high_resolution_clock` is the clock with the shortest tick period available.
- **Time points**: A time point represents a moment in time, stored as the duration since the clock's epoch. The function `now()` returns a time point, which makes `start` and `end` time points.
- **Durations**: A duration represents a span of time, such as an hour or 30 minutes. A duration can also represent the difference between two time points. The above `running_time` is a duration. Durations can be expressed in different time units using ratios such as `std::nano` (for nanoseconds), `std::micro` (for microseconds), `std::milli` (for milliseconds), etc.
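Durations in different units can be converted into one another. Here is a minimal sketch using the predefined types `std::chrono::seconds` and `std::chrono::milliseconds` together with `std::chrono::duration_cast`; the two function names are made up for this example.

```cpp
#include <chrono>

// Converting to a finer unit never loses information,
// so the conversion happens implicitly.
long long seconds_to_milliseconds(long long s) {
    std::chrono::milliseconds ms = std::chrono::seconds{s};
    return ms.count();
}

// Converting to a coarser unit may drop a fractional part, so it must be
// requested explicitly with duration_cast, which truncates toward zero.
long long milliseconds_to_whole_seconds(long long ms) {
    auto whole = std::chrono::duration_cast<std::chrono::seconds>(
        std::chrono::milliseconds{ms});
    return whole.count();
}
```

For instance, 2 seconds become 2000 milliseconds, while 2500 milliseconds become 2 whole seconds. A floating-point duration such as `duration<double, micro>`, like the one used for `running_time` above, keeps the fractional part instead.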

### <a id="ch01">CHALLENGE 01</a>
In the code cell below, write a program that simulates the experiment of flipping a coin 1000 times and counting how many heads and tails are encountered. Display the results of your experiment and the time it took in microseconds to run it.

In [None]:
//TODO

# Searching
Before looking into searching, we need to talk about **iterators**, which are essential to the C++ Standard Template Library (STL).

## Iterators
An iterator is a generalization of the pointer concept. It points to an item within a container and can be used to get to the next item. A pointer is the simplest form of an iterator.
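As a minimal sketch of a pointer acting as an iterator, the function below walks a built-in array using only dereference (`*`), increment (`++`), and comparison (`!=`). The names `sum_range` and `sum_array` are made up for this example.

```cpp
#include <cstddef>
#include <iterator>

// Sum the items in the half-open range [first, last).
// `last` points one position past the final item and is never dereferenced.
int sum_range(const int* first, const int* last) {
    int total = 0;
    for (const int* it = first; it != last; ++it) {
        total += *it; // dereference to read the current item
    }
    return total;
}

// Convenience wrapper: std::begin and std::end return pointers
// when they are given a built-in array.
template <std::size_t N>
int sum_array(const int (&values)[N]) {
    return sum_range(std::begin(values), std::end(values));
}
```

For example, with `int values[] = {2, 4, 6};`, the call `sum_array(values)` returns 12.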

To implement an iterator, we define a class and overload the `*` (dereference), `++` (increment), `==` (equals), and `!=` (not equals) operators to make it look and behave like a pointer. Here is a class named `Array` defining a container with a fixed size.

```c++
#include <initializer_list>

template<typename T>
class Array {
private: 
    T *items; 
    int sz = 0;
    
public: 
    Array(int s): sz(s), items(new T[s]){} 
    
    T operator[](int i) const { return items[i]; } // read-only access
    T& operator[](int i) { return items[i]; }      // read/write access
    int size() const { return sz; }
    
    ~Array(){
        delete[] items;
    }
}; 
```

We can define an iterator class named `ArrayIterator` and use it to augment the `Array` class with a `search` function that returns an iterator pointing to the found item if any. We can also use it to support the range-based for loop (or for-each loop) by implementing the `begin()` and `end()` functions, both of which return `ArrayIterator` objects. Implementing these two functions allows us to write something like this:

```c++
Array<double> arr(3);
arr[0] = 2.5;
arr[1] = 5.0;
arr[2] = 7.5;
for(double d : arr) {
    cout << d << endl;
}
```

Here is the `ArrayIterator` class.

In [None]:
template<typename T>
class ArrayIterator{
private:
    T* item = nullptr;
public:
  ArrayIterator(T* ptr): item(ptr){}
  ArrayIterator<T>& operator++(){ // an operator to move to the next item
    ++item;
    return *this;
  }
    
  T operator*(){ return *item; } // dereference operator
  bool operator==(const ArrayIterator<T>& rhs) const { return item == rhs.item; }
  bool operator!=(const ArrayIterator<T>& rhs) const { return item != rhs.item; }
};

We now can refactor the above `Array` class so as to use iterators.

In [None]:
#include <initializer_list>

template<typename T>
class Array {
private: 
    T *items; 
    int sz = 0; 
public: 
    Array(int s): sz(s), items(new T[s]){} 
    
    T operator[](int i) const { return items[i]; }
    T& operator[](int i) { return items[i]; }
    int size() const { return sz; }
    
    ArrayIterator<T> search(T e) const {
        for(int i = 0; i < sz; i++){
            if(items[i] == e) return ArrayIterator<T>(items + i);
        }
        
        return ArrayIterator<T>(items + sz);
    }
    
    ArrayIterator<T> binarySearch(T e ) const{
        int first = 0;
        int last = sz - 1;
        int mid;

        while (first <= last) {
            mid = (first + last) / 2;
            if(items[mid] == e) return ArrayIterator<T>(items + mid);
            else if (items[mid] > e) last = mid - 1;
            else first = mid + 1;
        }

        return ArrayIterator<T>(items + sz);
    }
    
    ArrayIterator<T> begin() { return ArrayIterator<T>(items); }
    ArrayIterator<T> end() { return ArrayIterator<T>(items + sz); }
    
    ~Array(){
        delete[] items;
    }
}; 

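Notice how the iterator returned by `end()` does not point to the last item in the array; rather, it points to the position just past it. This is also why `search` and `binarySearch` return that same past-the-end iterator to signal that nothing was found.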

Having done that, we can now test this class and its iterators.

In [None]:
Array<double> arr(3);
arr[0] = 2.5;
arr[1] = 5.0;
arr[2] = 7.5;
for(double d : arr) {
    cout << d << endl;
}

auto it = arr.search(5.0);
if(it != arr.end()) {
    cout << "\nSearching: " << *it << " was found." << endl;
} else {
    cout << "Not found." << endl;
}

## Sequential search
Searching an array (fixed-sized or dynamic) or a linked list for a given value involves comparing the values of some or all of its items to that value. If the value is found, the search stops and the result is returned.

This search can be done either sequentially or using binary search. In sequential search, we start by comparing the value of the first item in the array to the searched-for value and, if it is not the same, moving to the next item. We continue to do this until the value is found or we have checked the last item of the array. For an array or linked list with $n$ items, sequential search takes $O(n)$ time.

Here is the sequential search algorithm from the above `Array` class.

```c++
ArrayIterator<T> search(T e) const {
    for(int i = 0; i < sz; i++){
        if(items[i] == e) return ArrayIterator<T>(items + i);
    }

    return ArrayIterator<T>(items + sz);
}
```
Searching a built-in C++ array is similar.
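As a minimal sketch, here is one way the same search could look for a built-in array of `int` values. The function name `sequential_search` and the choice to return a pointer are assumptions made for this example; they mirror the iterator convention used by `Array::search`.

```cpp
// Return a pointer to the first item equal to `value`, or the
// past-the-end pointer (items + size) when there is no match.
const int* sequential_search(const int* items, int size, int value) {
    for (int i = 0; i < size; i++) {
        if (items[i] == value) return items + i; // found: stop immediately
    }
    return items + size; // every item was checked without a match
}
```

A caller compares the result against `items + size` to tell whether the value was found, just as the earlier test compares the result of `arr.search(...)` against `arr.end()`.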

For a linked list, the sequential search looks like the following:
```c++
bool find(T info) {
    auto current = front;
    while(current){
        if(current->info == info){
            return true;
        }

        current = current->next;
    }

    return false;      
}
```
It also takes $O(n)$ time.

The best-case scenario happens when the searched-for value matches the first item. The worst case happens when the searched-for value matches the last item or is not found at all.

Analyzing the average-case requires:
- Considering all possible cases.
- Finding the number of comparisons for each case.
- Adding the number of comparisons together
- Dividing total number of comparisons by the number of cases

That is:

$\text{Avg. Time} = \frac{1 + 2 + 3 + \cdots + n}{n} = \frac{1}{n} \cdot \frac{n(n+1)}{2} = \frac{n+1}{2}$
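The formula can be checked empirically. The sketch below assumes, as the derivation does, that the value is always present and that every position is equally likely. It searches for each of the $n$ items once and averages the number of comparisons. The function names are made up for this example.

```cpp
#include <vector>

// Count how many comparisons sequential search makes
// before it finds `value` or runs out of items.
int comparisons_to_find(const std::vector<int>& items, int value) {
    int comparisons = 0;
    for (int item : items) {
        comparisons++;
        if (item == value) break;
    }
    return comparisons;
}

// Search for every item of an n-item array once and
// return the average number of comparisons.
double average_comparisons(int n) {
    std::vector<int> items(n);
    for (int i = 0; i < n; i++) items[i] = i;

    long long total = 0;
    for (int i = 0; i < n; i++) {
        total += comparisons_to_find(items, i);
    }
    return static_cast<double>(total) / n;
}
```

For $n = 9$ the total is $1 + 2 + \cdots + 9 = 45$ comparisons, so the average is $45 / 9 = 5$, which agrees with $(n + 1) / 2$.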


### <a id="ch02">CHALLENGE 02</a>
**Q1**. What is the running time, in Big-O notation, of sequentially searching a linked list in reverse order (back to front)?

**Q2**. What is the running time in Big-O notation of the average-case sequential search?

## Binary search
For large arrays or linked lists, sequential search is expensive; we need a better algorithm. Binary search is a much better algorithm for searching arrays. The array, however, must be sorted first. Here is the binary search function from the above `Array` class.

```c++
ArrayIterator<T> binarySearch(T e ) const{
    int first = 0;
    int last = sz - 1;
    int mid;

    while (first <= last) {
        mid = (first + last) / 2;
        if(items[mid] == e) return ArrayIterator<T>(items + mid);
        else if (items[mid] > e) last = mid - 1;
        else first = mid + 1;
    }

    return ArrayIterator<T>(items + sz);
}
```

This algorithm searches a sorted array by repeatedly dividing the search interval in half. If the searched-for value equals the item in the middle, the search ends. If it is less than the middle item, narrow the search to the lower half; otherwise, narrow it to the upper half. Keep doing this until the value is found or the interval is empty.
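The standard library also ships binary search algorithms in `<algorithm>`. The sketch below is an alternative to the hand-written function above, not a replacement for it: it sorts a copy of the data with `std::sort` and then uses `std::lower_bound`, which performs a binary search on a sorted range. The function name `sorted_index_of` is made up for this example.

```cpp
#include <algorithm>
#include <vector>

// Sort a copy of `items`, then binary-search it for `value`.
// Returns the index of `value` in the sorted copy, or -1 if it is absent.
int sorted_index_of(std::vector<int> items, int value) {
    std::sort(items.begin(), items.end()); // binary search requires sorted data

    // lower_bound returns an iterator to the first item that is not less than `value`
    auto it = std::lower_bound(items.begin(), items.end(), value);
    if (it != items.end() && *it == value) {
        return static_cast<int>(it - items.begin());
    }
    return -1;
}
```

Sorting costs more than a single sequential search, so sorting first pays off mainly when the same data is searched many times.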

How much time does binary search take?

Assume we have an array of 128 items. Let's consider the worst-case scenario of looking for an item that is not in the list and is greater than any item in it. This means the search will always narrow to the right half. The running time depends on how many iterations the `while` loop takes. As shown below, the range is halved 7 times ($\lg(128) = 7$) before a single item remains; one final iteration checks that item, so the loop runs about $\lg(n)$ times. In the first iteration, the list `[0-127]` is divided into two halves: `[0-63]` and `[64-127]`. In the second iteration, the range `[64-127]` is divided into two halves: `[64-95]` and `[96-127]`.

```bash
ARR: [0----------------------------------------------------127]
 01: [0-63][64---------------------------------------------127]
 02:       [64-95][96--------------------------------------127]
 03:              [96-111][112-----------------------------127]
 04:                      [112-119][120--------------------127]
 05:                               [120-123][124-----------127]
 06:                                        [124-125][126--127]
 07:                                                 [126][127]
```

The running time of this algorithm is proportional to the number of iterations. For an array with $n$ items, the binary search algorithm takes $O(\lg(n))$ time to run, which is much faster than the $O(n)$ time we saw earlier with sequential search.
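To make the iteration count concrete, here is a minimal sketch that counts how many times the loop body runs. It assumes a sorted array holding the values `0` to `n - 1` and follows the same logic as `binarySearch`; the function name `binary_search_iterations` is made up for this example.

```cpp
#include <vector>

// Count the loop iterations binary search needs when looking for
// `target` in a sorted array holding the values 0, 1, ..., n - 1.
int binary_search_iterations(int n, int target) {
    std::vector<int> items(n);
    for (int i = 0; i < n; i++) items[i] = i;

    int first = 0;
    int last = n - 1;
    int iterations = 0;

    while (first <= last) {
        iterations++;
        int mid = first + (last - first) / 2; // same midpoint as (first + last) / 2
        if (items[mid] == target) break;
        else if (items[mid] > target) last = mid - 1;
        else first = mid + 1;
    }
    return iterations;
}
```

For a missing value larger than every item, this returns 8 for `n = 128` and 11 for `n = 1024`: $\lg(n)$ halvings plus one last check of the single remaining item. Multiplying the array size by 8 adds only 3 iterations, which is the logarithmic growth that $O(\lg(n))$ describes.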

Let's conclude this section by comparing the running times of the sequential and binary search algorithms side by side.

In [None]:
#include<iostream>
#include<chrono>
#include<random>
#include<iomanip>

using namespace std;

In [None]:
template<typename T>
void print_running_time(Array<T>& arr){
    // Search for the same random target with both algorithms so the timings are comparable
    int target = random_int(1, arr.size());

    auto ss_start = chrono::high_resolution_clock::now();
    arr.search(target);
    auto ss_end = chrono::high_resolution_clock::now(); 
    chrono::duration<double, nano> ss_rt = ss_end - ss_start;

    auto bs_start = chrono::high_resolution_clock::now();
    arr.binarySearch(target);
    auto bs_end = chrono::high_resolution_clock::now(); 
    chrono::duration<double, nano> bs_rt = bs_end - bs_start;

    cout << setw(20) << ss_rt.count() << setw(20) << bs_rt.count()  << endl;
}

In [None]:
cout << setw(20) << "Sequential" << setw(20) << "Binary"  << endl;

int size = 1024;
while(size > 127){
    Array<int> arr(size);
    for(int i = 0; i < size; i++){ arr[i] = i; }
    print_running_time(arr);
    size /= 2;
}


### <a id="ch03">CHALLENGE 03</a>
**Q1**. The binary search algorithm does not work on sorted linked lists. Why is that?