# Reverse Mode Automatic Differentiation (AD)

Dynamic reverse-mode AD can be implemented by declaring a class `Var` that represents a value together with its children, i.e. the expressions that depend on that value. The implementation shown in the lecture slides is provided below.
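
The recursion performed by `grad` is the chain rule summed over children. As a sketch in notation introduced here for illustration (the symbols $w_i$ for the children of $x$ and the bar for a cached gradient do not appear in the original):

$$\bar{x} = \frac{\partial z}{\partial x} = \sum_i \frac{\partial w_i}{\partial x}\,\bar{w}_i, \qquad \bar{z} = \frac{\partial z}{\partial z} = 1$$

Each tuple `(weight, var)` stored in `children` corresponds to one term of the sum, with `weight` playing the role of $\frac{\partial w_i}{\partial x}$ and `var` the role of $w_i$.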

Tasks:

1. Complete the addition (`__add__`) method below.
2. Complete the division (`__truediv__`), subtraction (`__sub__`), and power (`__pow__`) methods.

In [44]:
import math

class Var:
    def __init__(self, value):
        self.value = value
        self.children = []
        self.grad_value = None #Initialize to None, which means it's not yet evaluated

    def grad(self):
        #recurse only if the value is not yet cached
        if self.grad_value is None:
            #calculate derivative using chain rule
            self.grad_value = sum(weight * var.grad() for weight, var in self.children)
        return self.grad_value
    
    def __str__(self):
        return str(self.value)

    def __mul__(self, other): #(x=self, y=other) if z=x*y then dz/dx=y and dz/dy=x

        z = Var(self.value * other.value) #z=x*y
      
        #weight = dz/dself=other.value
        self.children.append((other.value, z)) #append [dz/dx=y, z] as children of x
      
        #weight = dz/dother=self.value
        other.children.append((self.value, z)) #append [dz/dy=x, z] as children of y
        return z

# For a=x*y a is a new Var that is a child of both x and y

    def __add__(self, other): #z=x+y, dz/dx = 1, dz/dy = 1
        z = Var(self.value + other.value)
        
        self.children.append((1.0, z))
        
        other.children.append((1.0, z))
        
        return z
    
    def __sub__(self, other): #z=x-y, dz/dx = 1, dz/dy = -1
        z = Var(self.value - other.value)
        
        self.children.append((1, z))
        
        other.children.append((-1, z))
        
        return z
    
    def __truediv__(self, other): #(x=self, y=other) z=x/y, dz/dx=1/y, dz/dy=-x/y^2

        z = Var(self.value / other.value) 
      
        self.children.append((1/other.value, z)) 
      
        other.children.append((-self.value / other.value**2, z)) 
        
        return z

    def __pow__(self, other): #z=x^y, dz/dx = y*x^(y-1), dz/dy = ln(x)*x^y
        z = Var(self.value ** other.value)
        
        self.children.append((other.value * self.value**(other.value-1), z))
        
        other.children.append((math.log(self.value)*self.value ** other.value, z))
        
        return z




def sin(x):
    z = Var(math.sin(x.value))
    x.children.append((math.cos(x.value), z))
    return z

def cos(x):
    z = Var(math.cos(x.value))
    x.children.append((-math.sin(x.value), z))
    return z

def tan(x):
    z = Var(math.tan(x.value))
    x.children.append((1.0 / math.cos(x.value)**2, z)) #dz/dx = sec^2(x); the math module has no sec()
    return z
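
One way to gain confidence in the weights used above is to compare each analytic derivative against a central finite difference. The following is a minimal sketch under stated assumptions: the helper `central_difference`, the step size `h`, and the `checks` table are introduced here for illustration and are not part of the lecture code. It uses plain floats rather than `Var`, so it only checks the formulas, not the graph bookkeeping.

```python
import math

def central_difference(f, t, h=1e-6):
    """Approximate df/dt at t with a symmetric finite difference."""
    return (f(t + h) - f(t - h)) / (2 * h)

x, y = 0.5, 4.2  # same test point as the example later in this notebook

# (name, analytic weight, one-argument function, point at which to differentiate)
checks = [
    ("d(x-y)/dy",   -1.0,                 lambda t: x - t,       y),
    ("d(x/y)/dx",    1 / y,               lambda t: t / y,       x),
    ("d(x/y)/dy",   -x / y**2,            lambda t: x / t,       y),
    ("d(x**y)/dx",   y * x**(y - 1),      lambda t: t**y,        x),
    ("d(x**y)/dy",   math.log(x) * x**y,  lambda t: x**t,        y),
    ("d(tan x)/dx",  1 / math.cos(x)**2,  lambda t: math.tan(t), x),
]

results = {}
for name, analytic, f, point in checks:
    numeric = central_difference(f, point)
    results[name] = (analytic, numeric)
    print(f"{name:12s} analytic={analytic:.6f} numeric={numeric:.6f}")
```

If an analytic and numeric column disagree beyond rounding, the corresponding weight in the class is the first place to look.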

In [45]:
# Tests
v = Var(1)
v.__pow__((v.__truediv__(v.__mul__(v.__sub__(v.__add__(v)))))).value


1.0

# Forward computation first
1. Run the following code to compute the value of the function $z = a + b$, where $a = x \cdot y$ and $b = \sin(x)$, given $x = 0.5$ and $y = 4.2$.
2. Print out the children of x, y, a, b together with their derivatives $\frac{\partial a}{\partial x}, \frac{\partial b}{\partial x}, \frac{\partial a}{\partial y}, \frac{\partial z}{\partial a}, \frac{\partial z}{\partial b}$.
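
For reference, the local derivatives for this example can be worked out by hand from the rules in the `Var` class:

$$\frac{\partial a}{\partial x} = y = 4.2, \qquad \frac{\partial a}{\partial y} = x = 0.5, \qquad \frac{\partial b}{\partial x} = \cos(0.5) \approx 0.8776, \qquad \frac{\partial z}{\partial a} = \frac{\partial z}{\partial b} = 1$$

These are the weights stored in the `children` lists. Note that `x` has two children (`a` and `b`), while `print_values` prints only the first entry of each list.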


In [46]:
x=Var(0.5)
y=Var(4.2)

a=x.__mul__(y)
b=sin(x)
z=a.__add__(b)

#output
def print_values():
        out_str = f'''
        a = {a.value}  
        dz/da = {a.grad()}
        children:
                {a.children[0][0]}
                {a.children[0][1]}
        
        b = {b.value}  
        dz/db = {b.grad()}
        children:
                {b.children[0][0]}
                {b.children[0][1]}
        
        x = {x.value}  
        dz/dx = {x.grad()}
        children:
                {x.children[0][0]}
                {x.children[0][1]}
        
        y = {y.value}  
        dz/dy = {y.grad()}
        children:
                {y.children[0][0]}
                {y.children[0][1]}
        
        z = {z.value}  
        dz/dz = {z.grad()}
        children:
                {z.children}
                '''
        print(out_str)

print('Before seeding:')
print_values()

Before seeding:

        a = 2.1  
        dz/da = 0.0
        children:
                1.0
                2.579425538604203
        
        b = 0.479425538604203  
        dz/db = 0.0
        children:
                1.0
                2.579425538604203
        
        x = 0.5  
        dz/dx = 0.0
        children:
                4.2
                2.1
        
        y = 4.2  
        dz/dy = 0.0
        children:
                0.5
                2.1
        
        z = 2.579425538604203  
        dz/dz = 0
        children:
                []
                


# Reverse mode computation

So far we have only computed values in the forward direction, recording the graph as we go. We have not yet computed $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$, which are the quantities we actually want.

1. Run the code below
2. Print out the gradient of each variable and complete the code

In [48]:
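z.grad_value = 1.0  # 'Seed' the gradient of z with 1 (i.e. ∂z/∂z = 1) before computing grads
# Caution: grad() caches its result, and print_values() above already called grad()
# on every variable before z was seeded, so the cached zeros are returned below.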

print('z:', z)
print("dz/dx: ",x.grad())

#Complete the code here
print("dz/dy: ",y.grad())

print('After seeding: ')
print_values()

z: 2.579425538604203
dz/dx:  0.0
dz/dy:  0.0
After seeding: 

        a = 2.1  
        dz/da = 0.0
        children:
                1.0
                2.579425538604203
        
        b = 0.479425538604203  
        dz/db = 0.0
        children:
                1.0
                2.579425538604203
        
        x = 0.5  
        dz/dx = 0.0
        children:
                4.2
                2.1
        
        y = 4.2  
        dz/dy = 0.0
        children:
                0.5
                2.1
        
        z = 2.579425538604203  
        dz/dz = 1.0
        children:
                []
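
The zeros reported above appear to come from caching rather than from the math: `print_values()` in the "Before seeding" cell already called `grad()` on every variable, and each call stored 0 in `grad_value`, so seeding `z` afterwards does not change them. By the chain rule the expected values are $\frac{\partial z}{\partial x} = y + \cos(x) \approx 5.0776$ and $\frac{\partial z}{\partial y} = x = 0.5$. A minimal sketch, assuming a fresh graph is built and `z` is seeded before any call to `grad()` (the trimmed `Var` below repeats only the methods this example needs):

```python
import math

class Var:
    def __init__(self, value):
        self.value = value
        self.children = []      # (weight, child) pairs, as in the class above
        self.grad_value = None  # None means "not yet evaluated"

    def grad(self):
        # Recurse only if the gradient is not cached yet
        if self.grad_value is None:
            self.grad_value = sum(w * child.grad() for w, child in self.children)
        return self.grad_value

    def __mul__(self, other):
        z = Var(self.value * other.value)
        self.children.append((other.value, z))
        other.children.append((self.value, z))
        return z

    def __add__(self, other):
        z = Var(self.value + other.value)
        self.children.append((1.0, z))
        other.children.append((1.0, z))
        return z

def sin(x):
    z = Var(math.sin(x.value))
    x.children.append((math.cos(x.value), z))
    return z

# Rebuild the graph so that no stale gradients are cached
x, y = Var(0.5), Var(4.2)
z = x * y + sin(x)

z.grad_value = 1.0  # seed first, before any grad() call
dz_dx, dz_dy = x.grad(), y.grad()
print("dz/dx:", dz_dx)  # should match y + cos(x)
print("dz/dy:", dz_dy)  # should match x
```

An alternative, not shown here, would be to reset `grad_value` to `None` on every node before seeding; rebuilding the graph is simply the shortest way to illustrate the ordering issue.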
                
