  Now that we understand the steps of gradient descent, let us work through an example.
  A new amusement park is about to open in Jeddah. The engineers would like to build a fun, wiggly ride that shows you all the cool places in the park. The ride they came up with follows this route:

  $$g(x,t)=t \sin{x}+3t \cos (2x).$$
  They also know that the cool places are located at the points $$(x,y)=\{(1,-3),(3,6),(5,-7),(7,2),(9,5),(11,-8)\}.$$
  Unfortunately, they do not know which value of $t$ makes the ride pass as close as possible to the cool places. Can you help them?
  Play with this graph to better understand the problem:
  https://www.desmos.com/calculator/jyp7hitvh2

In [None]:
import math

We want to find the best $t$ for the engineers. Can you think of a loss function, depending only on $t$, that we would like to minimize?

In these cases, we usually use the *squared loss function*, which comes in the following form:
$$loss(t)=(g(x_1,t)-y_1)^2+(g(x_2,t)-y_2)^2+(g(x_3,t)-y_3)^2+(g(x_4,t)-y_4)^2+(g(x_5,t)-y_5)^2+(g(x_6,t)-y_6)^2$$
where $(x_1,y_1)$ is the first location $(1,-3)$, $(x_2,y_2)=(3,6)$, and so on.

To minimize this function we need to calculate its derivative. Can you do that?
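
One way to approach it, sketched here with the chain rule applied to a single term and then summed over the six locations (the notation $loss'(t)$ for the derivative is our own shorthand):

$$\frac{d}{dt}\,(g(x_i,t)-y_i)^2 = 2\,(g(x_i,t)-y_i)\,\frac{\partial g}{\partial t}(x_i,t), \qquad \frac{\partial g}{\partial t}(x,t)=\sin x+3\cos(2x),$$

$$loss'(t)=\sum_{i=1}^{6} 2\,(g(x_i,t)-y_i)\,(\sin x_i+3\cos(2x_i)).$$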

In [None]:
points=[(1,-3),(3,6),(5,-7),(7,2),(9,5),(11,-8)]

def Loss(t,points):
  # Sum of squared differences between the route g(x,t) and each target y.
  loss=0
  for x,y in points:
    loss+=(t*math.sin(x)+3*t*math.cos(2*x)-y)**2
  return loss

def Loss_derivative(t,points):
  # Chain rule: d/dt (g(x,t)-y)^2 = 2*(g(x,t)-y)*(sin(x)+3*cos(2x)).
  loss_derivative=0
  for x,y in points:
    loss_derivative+=2*(t*math.sin(x)+3*t*math.cos(2*x)-y)*(math.sin(x)+3*math.cos(2*x))
  return loss_derivative

The code above defines the loss function and the derivative of the loss function. What are the next steps of gradient descent?
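
Before using a hand-derived derivative, it can help to compare it against a numerical estimate. The sketch below redefines `Loss` and `Loss_derivative` so that it runs on its own; the helper `numerical_derivative` and the step size `h` are our own illustrative choices, not part of the original notebook.

```python
import math

points = [(1, -3), (3, 6), (5, -7), (7, 2), (9, 5), (11, -8)]

def Loss(t, points):
    # Squared loss of the route g(x,t) = t*sin(x) + 3*t*cos(2x) over all points.
    return sum((t * math.sin(x) + 3 * t * math.cos(2 * x) - y) ** 2 for x, y in points)

def Loss_derivative(t, points):
    # Analytic derivative of Loss with respect to t (chain rule).
    return sum(
        2 * (t * math.sin(x) + 3 * t * math.cos(2 * x) - y)
        * (math.sin(x) + 3 * math.cos(2 * x))
        for x, y in points
    )

def numerical_derivative(f, t, h=1e-5):
    # Hypothetical helper: central-difference estimate of f'(t).
    return (f(t + h) - f(t - h)) / (2 * h)

for t in (0.0, 2.0, 7.0):
    analytic = Loss_derivative(t, points)
    numeric = numerical_derivative(lambda s: Loss(s, points), t)
    print('t=', t, 'analytic=', analytic, 'numeric=', numeric)
```

If the two columns agree to several decimal places, the formula is very likely correct.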


In [None]:
learning_rate=0.001
t=7  # initial guess
for i in range(150):
  # Step against the derivative, then report the derivative at the new t.
  t=t-learning_rate*Loss_derivative(t,points)
  print('t=',t,'Loss_derivative=',Loss_derivative(t,points))

t= 6.560147518338105 Loss_derivative= 400.9307363815552
t= 6.159216781956549 Loss_derivative= 365.4531054777983
t= 5.793763676478751 Loss_derivative= 333.11482554000315
t= 5.460648850938748 Loss_derivative= 303.63810111688343
t= 5.157010749821865 Loss_derivative= 276.7697183708656
t= 4.880241031450999 Loss_derivative= 252.27886989584692
t= 4.627962161555152 Loss_derivative= 229.9551720128688
t= 4.398006989542282 Loss_derivative= 209.60685751168648
t= 4.188400132030596 Loss_derivative= 191.0591283133469
t= 3.997341003717249 Loss_derivative= 174.1526539026554
t= 3.8231883498145938 Loss_derivative= 158.74220263162044
t= 3.6644461471829732 Loss_derivative= 144.69539413636355
t= 3.5197507530466097 Loss_derivative= 131.8915621503864
t= 3.3878591908962234 Loss_derivative= 120.22071794542066
t= 3.267638472950803 Loss_derivative= 109.58260549551044
t= 3.1580558674552925 Loss_derivative= 99.8858402479045
t= 3.058170027207388 Loss_derivative= 91.04712410254434
t= 2.9671229031048436 Loss_derivativ
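
Because $g(x,t)$ is linear in $t$, this particular loss is a quadratic in $t$, so setting the derivative to zero gives the minimizer directly: $t^*=\sum_i h_i y_i / \sum_i h_i^2$ with $h_i=\sin x_i+3\cos(2x_i)$. The sketch below compares that value with the result of the same gradient descent loop. The names `basis` and `t_closed_form` are our own; under these assumptions both values should land near $t\approx 2.03$.

```python
import math

points = [(1, -3), (3, 6), (5, -7), (7, 2), (9, 5), (11, -8)]

def basis(x):
    # Hypothetical helper: the factor multiplying t in g(x,t).
    return math.sin(x) + 3 * math.cos(2 * x)

# Closed-form minimizer of the quadratic loss.
t_closed_form = sum(basis(x) * y for x, y in points) / sum(basis(x) ** 2 for x, y in points)

# Gradient descent with the same settings as the notebook cell.
learning_rate = 0.001
t = 7
for i in range(150):
    derivative = sum(2 * (t * basis(x) - y) * basis(x) for x, y in points)
    t = t - learning_rate * derivative

print('closed form t=', t_closed_form)
print('gradient descent t=', t)
```

A closed form like this exists only because the model is linear in its parameter; gradient descent remains the general tool when it is not.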

Now, what if the engineers had a different route function that looked like this:
$$g(x,t_1,t_2)=t_1 \sin x +3t_2 \cos (2x)$$

1. Can you write the loss function?
2. Can you think of a way to write its derivative?
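
One possible approach, offered as a sketch rather than the intended solution: with two parameters the single derivative becomes a pair of partial derivatives, one per parameter, and each parameter is updated with its own partial derivative. The names `Loss2` and `Loss2_gradient`, the starting values, and the number of iterations are our own assumptions.

```python
import math

points = [(1, -3), (3, 6), (5, -7), (7, 2), (9, 5), (11, -8)]

def Loss2(t1, t2, points):
    # Squared loss for g(x,t1,t2) = t1*sin(x) + 3*t2*cos(2x).
    return sum((t1 * math.sin(x) + 3 * t2 * math.cos(2 * x) - y) ** 2 for x, y in points)

def Loss2_gradient(t1, t2, points):
    # Partial derivatives of Loss2 with respect to t1 and t2 (chain rule).
    d_t1 = 0.0
    d_t2 = 0.0
    for x, y in points:
        residual = t1 * math.sin(x) + 3 * t2 * math.cos(2 * x) - y
        d_t1 += 2 * residual * math.sin(x)
        d_t2 += 2 * residual * 3 * math.cos(2 * x)
    return d_t1, d_t2

learning_rate = 0.001
t1, t2 = 7.0, 7.0  # assumed starting point, mirroring t=7 above
for i in range(10000):
    d_t1, d_t2 = Loss2_gradient(t1, t2, points)
    # Update both parameters using the gradient computed at the same point.
    t1 = t1 - learning_rate * d_t1
    t2 = t2 - learning_rate * d_t2

print('t1=', t1, 't2=', t2, 'loss=', Loss2(t1, t2, points))
```

Setting `t1 == t2` recovers the original one-parameter route, so the best two-parameter loss can never be worse than the best one-parameter loss.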