
**Continuity and Differentiation:** The **limit** of a function at a point a describes the value the function approaches as its input x tends toward a. A function is **continuous** at a when the limit there exists and equals the function's actual value, f(a). Sums, differences, products, and quotients of continuous functions are continuous, except at points where a denominator is zero.

Extreme Value Theorem: A continuous function on a **closed and bounded** interval (more generally, a compact set) will **always have a maximum and minimum value** that it achieves on that set.
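
As a small illustration of the Extreme Value Theorem, the sketch below finds the extrema of a continuous function on a closed interval by comparing its values at the endpoints and at the interior critical points. The function `g` and the interval `[-2, 3]` are assumptions chosen for illustration, not part of the original notebook.

```python
from sympy import symbols, diff, solve

x = symbols('x')
g = x**3 - 3*x      # illustrative continuous function (an assumption)
lo, hi = -2, 3      # closed, bounded interval [lo, hi]

# Candidates for the extrema: both endpoints plus interior critical points
critical = [c for c in solve(diff(g, x), x) if lo < c < hi]
candidates = [lo, hi] + critical
values = {c: g.subs(x, c) for c in candidates}

g_max = max(values.values())
g_min = min(values.values())
print("max:", g_max, "min:", g_min)
```

On an open or unbounded interval the same function need not attain a maximum, which is why the closed, bounded domain matters.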

In [85]:
from sympy import *

# sympy is a useful Python library for symbolic (analytical) calculus,
# such as limits and derivatives

x = symbols('x')

f = sin(x)/x
limit_f = limit(f, x, 0) # limit of f as x->0

print(f)
print(limit_f)

sin(x)/x
1


A **derivative** is often defined as the instantaneous rate of change of a function at a point x. Derivatives can be computed from tables of common derivatives together with rules such as the Product and Quotient Rules, or directly from the limit definition. For functions of several independent variables, the derivative generalizes to **partial derivatives**, one with respect to each independent variable. The partial derivatives can be grouped into a vector called the **gradient** vector, which points in the direction of steepest ascent at a point. The **directional derivative**, on the other hand, is the rate of change in a chosen direction, calculated as the dot product of the gradient vector and the unit vector in that direction.
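
The limit definition mentioned above can be checked symbolically. The following is a minimal sketch, assuming an illustrative function `g = x**3` and a step symbol `h` (neither appears in the original notebook); it compares the limit of the difference quotient with sympy's built-in `diff`.

```python
from sympy import symbols, limit, diff

x, h = symbols('x h')
g = x**3  # illustrative function (an assumption)

# Limit definition: g'(x) = lim_{h->0} (g(x + h) - g(x)) / h
difference_quotient = (g.subs(x, x + h) - g) / h
from_limit = limit(difference_quotient, h, 0)

print(from_limit)   # derivative obtained from the limit definition
print(diff(g, x))   # derivative obtained from sympy's diff
```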

In [86]:
# using sympy to calculate derivative
y = symbols('y')

a = 1.5

f = sin(x)
diff_f = diff(f,x) # derivative of f with respect to x

print("f: sin(x)")
print("Derivative of f: " + str(diff_f))
print("At a: " + str(diff_f.subs(x, a)) + "\n")

f = sin(x)*sin(y)
diff_fx = diff(f,x) # partial derivative of f with respect to x
diff_fy = diff(f,y) # partial derivative of f with respect to y

print("f: sin(x)*sin(y)")
print("Partial Derivative of f with respect to x: " + str(diff_fx))
print("At (a,a): " + str(diff_fx.subs([(x, a),(y, a)])) + "\n")
print("Gradient Vector of f: " + "<" + str(diff_fx) + ", " + str(diff_fy) + ">" + "\n")


f: sin(x)
Derivative of f: cos(x)
At a: 0.0707372016677029

f: sin(x)*sin(y)
Partial Derivative of f with respect to x: sin(y)*cos(x)
At (a,a): 0.0705600040299336

Gradient Vector of f: <sin(y)*cos(x), sin(x)*cos(y)>
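
The directional derivative described earlier can be sketched the same way, as the dot product of the gradient with a unit vector. The direction `(1, 1)` below is a hypothetical choice made only for illustration.

```python
from sympy import symbols, sin, diff, Matrix

x, y = symbols('x y')
f = sin(x)*sin(y)
a = 1.5

grad = Matrix([diff(f, x), diff(f, y)])   # gradient vector of f
direction = Matrix([1, 1])                # hypothetical direction (an assumption)
unit = direction / direction.norm()       # normalize to a unit vector

directional = grad.dot(unit)              # symbolic directional derivative
value = directional.subs([(x, a), (y, a)]).evalf()
print("Directional derivative at (a,a):", value)
```

Normalizing the direction first matters: without it, the result would scale with the length of the direction vector rather than measure a rate of change per unit distance.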



**Taylor's Theorem:** Best known as the basis of the **Taylor series**, Taylor's theorem gives a polynomial approximation of a sufficiently differentiable function near a point. The important thing to note is that it applies both to functions of one variable and to functions of multiple variables. The **Lagrange Error Bound** gives an upper bound on the error of a Taylor polynomial of a given degree expanded about a given point.
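
For reference, one common single-variable statement of the theorem is written out below. The notation (expansion point $a$, degree $n$, intermediate point $\xi$, bound $M$) is chosen here for illustration and is not taken from the original notebook. It assumes $f$ is $n+1$ times differentiable between $a$ and $x$.

$$
f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}\,(x-a)^k + R_n(x),
\qquad
R_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!}\,(x-a)^{n+1}
$$

for some $\xi$ between $a$ and $x$. If $|f^{(n+1)}(t)| \le M$ for all $t$ between $a$ and $x$, the Lagrange error bound follows:

$$
|R_n(x)| \le \frac{M\,|x-a|^{n+1}}{(n+1)!}
$$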

In [101]:
def f(x): # Taylor polynomial of cos(x) about 0, up to the x**4 term (fourth degree)
  return 1 - x**2/2 + x**4/24

print("Using sympy: \n" + str(cos(0.1)))
print("Using fourth degree taylor series: \n" + str(f(0.1)))

Using sympy: 
0.995004165278026
Using fourth degree taylor series: 
0.9950041666666667
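
To connect this with the Lagrange error bound, the sketch below compares the actual error of the polynomial approximation of cos(x) at 0.1 with the bound. It assumes `M = 1`, which is valid here because every derivative of cos is plus or minus sin or cos. The names `n`, `point`, and `M` are illustrative.

```python
from math import factorial
from sympy import symbols, cos, series

x = symbols('x')
n = 4          # degree of the Taylor polynomial
point = 0.1    # where the approximation is evaluated

# Taylor polynomial of cos(x) about 0 up to degree n
taylor_poly = series(cos(x), x, 0, n + 1).removeO()

# |f^(n+1)| <= 1 for cos, so M = 1 is a valid bound on the next derivative
M = 1
error_bound = M * abs(point)**(n + 1) / factorial(n + 1)

actual_error = abs(cos(point) - taylor_poly.subs(x, point))
print("Taylor polynomial:", taylor_poly)
print("Actual error:", actual_error)
print("Lagrange bound:", error_bound)
```

The actual error should come out well below the bound, since the bound only assumes the worst case for the next derivative.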


**Optimization with Gradient Descent/Ascent:** We know that the gradient vector points in the direction of steepest ascent, and that the direction opposite the gradient is the direction of steepest descent. From single variable calculus, a point where the derivative is zero is a local minimum, a local maximum, or an inflection point. With more than one variable, the analogous condition is that the magnitude of the gradient vector is zero (such a point can also be a saddle point). Using this, we can create an algorithm that finds a local minimum or maximum of a function whose gradient is given or can be calculated (perhaps even numerically).

This algorithm (implemented below) picks a starting point in the domain of the function and calculates the gradient vector there. The new point is the current point plus the gradient (for ascent) or minus the gradient (for descent). This is repeated until the magnitude of the gradient is within a tolerance of zero. Variations of this algorithm multiply the gradient by a scalar called the learning rate, or stop the algorithm after a certain number of iterations.

In [131]:
def gradMagnitude(f, px, py): # returns magnitude of the gradient of f at (px, py)
  point = [(x, px), (y, py)]
  return sqrt(diff(f, x).subs(point)**2 + diff(f, y).subs(point)**2)

def findMax(f, tol): # gradient ascent: find a local max of f within tolerance
  diff_fx = diff(f,x)
  diff_fy = diff(f,y)
  px = 0.1 # starting point
  py = 0.1
  while gradMagnitude(f, px, py) > tol:
    point = [(x, px), (y, py)]
    # evaluate both partial derivatives at the same point before moving
    gx = diff_fx.subs(point)
    gy = diff_fy.subs(point)
    px += gx
    py += gy
  return (px, py)

f = sin(x)*sin(y) # function to be maximized
(px, py) = findMax(f, 0.000001)

print(px, py) # point where local maximum was achieved
print(str(f.subs([(x, px),(y, py)]))) # value of local maximum
# note: this trig function has infinitely many local maxima

1.57079632679490 1.57079632679490
1.00000000000000
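
The variations mentioned above (a learning rate, an iteration cap, and a numerically computed gradient) can be combined in one sketch. The helper names `numerical_gradient`, `gradient_ascent`, and `surface`, and the defaults `learning_rate=0.5`, `h=1e-6`, and `max_iter=1000`, are all hypothetical choices for illustration, not part of the original notebook.

```python
import math

def numerical_gradient(func, px, py, h=1e-6):
    """Central-difference estimate of the gradient of func at (px, py)."""
    gx = (func(px + h, py) - func(px - h, py)) / (2 * h)
    gy = (func(px, py + h) - func(px, py - h)) / (2 * h)
    return gx, gy

def gradient_ascent(func, start, learning_rate=0.5, tol=1e-6, max_iter=1000):
    """Hypothetical helper: step along the gradient until it is nearly zero."""
    px, py = start
    for _ in range(max_iter):          # iteration cap guards against non-convergence
        gx, gy = numerical_gradient(func, px, py)
        if math.hypot(gx, gy) <= tol:  # gradient magnitude within tolerance
            break
        px += learning_rate * gx       # scaled step in the ascent direction
        py += learning_rate * gy
    return px, py

def surface(px, py):
    # same function as above, written with the math module
    return math.sin(px) * math.sin(py)

px, py = gradient_ascent(surface, (0.1, 0.1))
print(px, py, surface(px, py))
```

Flipping the sign of the update (subtracting the scaled gradient) would turn this into gradient descent. A smaller learning rate generally trades speed for stability.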
