# The Gradient Descent algorithm

We will use our knowledge of Python programming to build a set of _metaheuristic_ solvers. We will start with very simple examples and work our way up to the Gradient Descent algorithm, which is widely used in data science, including in the field of deep learning.

A very gentle and readable introduction to this and other interesting algorithms can be found in the book "Essentials of Metaheuristics," available for free at: https://cs.gmu.edu/~sean/book/metaheuristics/

## Random Guessing

In an earlier lecture, we wrote some code to solve the equation `250 = 5 * x`. Although it wasn't made explicit in that example, the variable `x` can only take on integer values. We found that setting `x` equal to 50 gets us the answer we need. In that scenario, we knew that there _was_ an exact answer.

Let's change the problem slightly: `251 = 5 * x`. Now there is no exact integer answer, so we have to estimate an answer with as low an error as possible. The basic idea is this: we will try random integers within some range, remember the value that gives the best result so far, and ignore any value that does worse than what we already have.

_Side note_: You may be wondering why we are doing a random search instead of starting with a low value and incrementing until we reach the correct one. You are right that, short of solving the equation with high school algebra, that method would be much better than random guessing here. However, this problem is meant to simulate a scenario where we can't assume such simple solutions exist. Besides, we need to practice certain aspects of this algorithm before we can progress to more complex algorithms.
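For comparison, the incrementing approach mentioned in the side note could look like the sketch below. The function name `linear_scan` and its parameters are hypothetical, chosen only for this illustration. The early exit assumes the error shrinks steadily and then grows again, which holds for this equation but not for the harder problems we are building toward.

```python
def linear_scan(target=251, factor=5, min_guess=0, max_guess=1000):
    """Try every integer in increasing order and keep the lowest-error one."""
    best_guess, best_error = None, None
    for guess in range(min_guess, max_guess + 1):
        error = abs(target - factor * guess)
        if best_error is None or error < best_error:
            best_guess, best_error = guess, error
        elif error > best_error:
            # Assumption: once the error starts growing it never shrinks again,
            # so we can stop. This is true for target = factor * x only.
            break
    return best_guess, best_error


print(linear_scan())  # (50, 1)
```

This finds the answer in 52 steps, but only because we know how the error behaves as `x` grows. Random search makes no such assumption.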

Very informal pseudocode:
```text
try a guess for x in the equation 251 = 5 * x
if the guess is better than the previously recorded guess, store this guess
keep going until the error is small enough or the guessing game has run long enough
```

**Exercise** Just by looking at the equation, what is the best integer value for `x`? What is the smallest possible error, that is, the difference between what we want (251) and what we can actually get?

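```python
import random

num_of_guesses = 1000
min_guess = 0
max_guess = 1000

# Best candidate seen so far; None means nothing has been tried yet.
best_guess = None
best_error = None

# Draw all the random guesses up front, then test them one by one.
for guess in [random.randint(min_guess, max_guess) for _ in range(num_of_guesses)]:
    result = 5 * guess
    error = abs(251 - result)

    # Keep this guess only if it is the first one or beats the current best.
    if best_guess is None or error < best_error:
        best_guess = guess
        best_error = error

best_guess, best_error
```

```text
(50, 1)
```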
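The result `(50, 1)` is not guaranteed on every run, because each guess is drawn independently and 50 may never come up. As a rough check, the sketch below computes the chance of that happening with the settings used above. The variable names `num_candidates`, `p_miss` and `p_hit` are introduced here only for illustration.

```python
num_of_guesses = 1000
# random.randint includes both endpoints, so 0..1000 gives 1001 candidates.
num_candidates = 1001

# Probability that a single guess is not 50, raised to the number of guesses.
p_miss = (1 - 1 / num_candidates) ** num_of_guesses
p_hit = 1 - p_miss

print(f"P(never guessing 50)         = {p_miss:.3f}")  # about 0.368
print(f"P(guessing 50 at least once) = {p_hit:.3f}")   # about 0.632
```

So roughly one run in three will settle for a neighbor such as 51 (error 4) or 49 (error 6). Raising `num_of_guesses` or narrowing the range makes a miss less likely.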

**Exercise** Do you understand what the expression `[random.randint(min_guess, max_guess) for _ in range(num_of_guesses)]` is doing? If not, please ask. How would the code change if we had to use an explicit loop instead of this list comprehension?
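One way to check your answer is to build the same list both ways and compare them. The sketch below reseeds the generator before each version so that both draw identical numbers. The seed value 42 and the names `from_comprehension` and `from_loop` are arbitrary choices for this illustration.

```python
import random

num_of_guesses, min_guess, max_guess = 5, 0, 1000

random.seed(42)  # arbitrary seed, so both versions see the same random stream
from_comprehension = [random.randint(min_guess, max_guess) for _ in range(num_of_guesses)]

random.seed(42)  # reset the stream before building the list again
from_loop = []
for _ in range(num_of_guesses):
    from_loop.append(random.randint(min_guess, max_guess))

print(from_comprehension == from_loop)  # True
```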

**Exercise** Do you understand what the expression `best_guess is None or error < best_error` is doing? If `best_guess` is `None`, is the remaining expression `error < best_error` still evaluated?
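A small experiment can help here. The sketch below recreates the state on the very first pass through the loop, when nothing has been stored yet, and then tries the comparison on its own. The names `is_better` and `comparison_failed` are introduced only for this illustration.

```python
best_guess, best_error, error = None, None, 1

# Same condition as in the search loop, on the very first guess.
is_better = best_guess is None or error < best_error
print(is_better)  # True

# The right-hand comparison on its own, with best_error still None.
comparison_failed = False
try:
    error < best_error
except TypeError as exc:
    comparison_failed = True
    print("TypeError:", exc)
```

In Python 3, comparing an `int` with `None` using `<` raises a `TypeError`, yet the combined condition ran without complaint. What does that tell you about how `or` evaluates its operands?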

**Exercise** Turn this into a function `random_search(num_of_guesses, min_guess, max_guess)`. This function should return a single number: our best guess for `x` in the equation `251 = 5 * x`.