
### Idea of the implementation
The key idea is that dual numbers extend to several variables if we keep multiple epsilons, one for each variable, and adopt the rule that the product of any two epsilons (including an epsilon with itself) is zero.

Example 0:  
If we have done some computation so far and are storing $\{3+4\epsilon_x+5\epsilon_y\}$, we'd say that node of the computation has value 3, $\frac{d}{dx}=4$ and $\frac{d}{dy}=5$. Implicitly, $\frac{d}{dz}=0$; z's value does not affect this node of the computation.

Example 1: Adding extended duals
$$\{3+4\epsilon_x+5\epsilon_y\} + \{2 + 7\epsilon_y+10\epsilon_z\} = 5+4\epsilon_x+12\epsilon_y+10\epsilon_z$$

Example 2: Multiplying extended duals
$$\{3+4\epsilon_x+5\epsilon_y\} \cdot \{2 + 7\epsilon_y+10\epsilon_z\}\\
= (6+ 21\epsilon_y + 30 \epsilon_z) + (8\epsilon_x + 28\epsilon_y\epsilon_x+40\epsilon_z\epsilon_x) + (10\epsilon_y + 35\epsilon_y^2+50\epsilon_z\epsilon_y)\\
= (6+ 21\epsilon_y + 30 \epsilon_z)+(8\epsilon_x + 0+0) + (10\epsilon_y+0+0)\\
= 6+ 8\epsilon_x+ (21+10)\epsilon_y + 30 \epsilon_z$$

If we were to do the above with symbols instead of numbers, we'd see the rule clearly: the real value of the first number distributes over the derivatives of the second, the real value of the second number distributes over the derivatives of the first, and we collect all the terms. More symbolically $\nabla \left(f(x,y)\cdot g(y,z)\right) = f(x,y)\cdot\left(\text{derivatives of g}\right) + g(y,z)\cdot\left(\text{derivatives of f}\right)$
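As a sketch of that symbolic version: write the two extended duals as $a + \sum_i a_i\epsilon_i$ and $b + \sum_i b_i\epsilon_i$, where $i$ runs over every variable and a missing epsilon simply has coefficient 0. The names $a$, $b$, $a_i$, $b_i$ are my own notation for this illustration.

$$\left(a + \sum_i a_i\epsilon_i\right)\cdot\left(b + \sum_j b_j\epsilon_j\right)\\
= ab + a\sum_j b_j\epsilon_j + b\sum_i a_i\epsilon_i + \sum_{i,j} a_i b_j\,\epsilon_i\epsilon_j\\
= ab + \sum_i \left(a\,b_i + b\,a_i\right)\epsilon_i$$

The double sum vanishes because every product $\epsilon_i\epsilon_j$ is zero. With $a=3$, $b=2$ and the coefficients from Example 2, the $\epsilon_y$ term is $3\cdot 7 + 2\cdot 5 = 31$, matching the result there.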

Off the cuff, $\frac{f(x,y)}{g(y,z)}$ should be $\frac{(\text{value of g})\left(\text{derivatives of f}\right) -\text{(value of f)}\left(\text{derivatives of g}\right)}{(\text{value of g})^2}$

So the implementation should distribute g's value to all the derivatives we're storing for f, distribute minus f's value to all the derivatives we're storing for g, add the two, and divide every resulting derivative by g's value squared.
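Here is a minimal sketch of the quotient formula above, working directly on a value and a dictionary of derivatives. `divide_duals` and its argument names are hypothetical helpers for illustration, not part of the class defined below, and the sketch assumes g's value is nonzero.

```python
from collections import defaultdict

def divide_duals(f_value, f_derivs, g_value, g_derivs):
    """Quotient rule on (value, derivatives) pairs. Hypothetical helper."""
    out_value = f_value / g_value
    out_derivs = defaultdict(float)
    g_squared = g_value * g_value

    # g's value distributes over the derivatives stored for f
    for name, d in f_derivs.items():
        out_derivs[name] += g_value * d / g_squared

    # minus f's value distributes over the derivatives stored for g
    for name, d in g_derivs.items():
        out_derivs[name] -= f_value * d / g_squared

    return out_value, out_derivs

# f = 3 + 4 eps_x + 5 eps_y and g = 2 + 7 eps_y + 10 eps_z, as in the examples above
value, derivs = divide_duals(3, {'x': 4, 'y': 5}, 2, {'y': 7, 'z': 10})
print(value, dict(derivs))  # 1.5 {'x': 2.0, 'y': -2.75, 'z': -7.5}
```

The y entry picks up a contribution from both loops, $(2\cdot 5 - 3\cdot 7)/4 = -2.75$, which is why the derivatives are accumulated with `+=` and `-=` rather than assigned.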

### Code

Note that I use a `defaultdict` to store the derivatives. That way if we ask for one that isn't there, we get back 0 instead of an error.
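A quick illustration of that behaviour, using only the standard library. The variable name `derivatives` is just for this sketch. One detail worth knowing is that looking up a missing key with `[]` also inserts it, whereas `.get` does not.

```python
from collections import defaultdict

derivatives = defaultdict(float)
derivatives['x'] = 4.0

print(derivatives['x'])     # 4.0, the stored entry
print(derivatives['z'])     # 0.0, float() supplies the default
print('z' in derivatives)   # True, the lookup above inserted 'z'
print(derivatives.get('w', 0.0), 'w' in derivatives)  # 0.0 False
```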

I haven't put in any pretty printing for the outputs, so there's a little clutter in the results.

In [3]:
from collections import defaultdict

class DualNumber:

    def __init__(self, name, value, derivatives=None):
        # ideally, we should block users from using the derivatives interface. May require separate classes
        self.value = value
        if name is not None:
            self.derivatives = defaultdict(float)
            self.derivatives[name] = 1
        else:
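            # intermediate result: fall back to an empty store if none was passed
            self.derivatives = derivatives if derivatives is not None else defaultdict(float)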
            
    @classmethod
    def emptyDual(cls):
        return cls(None,0,defaultdict(float))
    
    def __mul__(self, other):
        output=self.emptyDual()
        
        output.value = self.value*other.value
        
        # real part of first parent distributes
        for k2 in other.derivatives:
            output.derivatives[k2] += self.value*other.derivatives[k2]
        
        # real part of the second parent distributes
        for k1 in self.derivatives:
            output.derivatives[k1] += other.value*self.derivatives[k1]
            
        return output
    
    def __add__(self, other):
        output=self.emptyDual()
        
        output.value = self.value + other.value
        for k1 in self.derivatives:
            output.derivatives[k1] += self.derivatives[k1]
        for k2 in other.derivatives:
            output.derivatives[k2] += other.derivatives[k2]
        
        return output
                
x = DualNumber('x', 3)
y = DualNumber('y', 10)
z = DualNumber('z', .5)

print((x*y*z).derivatives) # derivative wrt z is xy = 30, wrt y is xz = 1.5, wrt x is yz = 5
print((x*x).derivatives)   # 2x = 6
print((y*y*y).derivatives) # 3y^2 = 300
print()
print(((x+y*y)*z).derivatives) #wrt z is x+y^2 = 103, wrt y is 2zy = 10, wrt x is z = .5

defaultdict(<class 'float'>, {'z': 30.0, 'y': 1.5, 'x': 5.0})
defaultdict(<class 'float'>, {'x': 6.0})
defaultdict(<class 'float'>, {'y': 300.0})

defaultdict(<class 'float'>, {'z': 103.0, 'x': 0.5, 'y': 10.0})


### Usage Example

In [4]:
def my_fun(x,y,z):
    return (x+y*z+z)*x

print(my_fun(DualNumber('x',7), DualNumber('y',8), DualNumber('z',10)).derivatives) #gradient at 7,8,10
print(my_fun(DualNumber('x',0), DualNumber('y',3), DualNumber('z',5)).derivatives) #gradient at 0,3,5

defaultdict(<class 'float'>, {'x': 104.0, 'z': 63.0, 'y': 70.0})
defaultdict(<class 'float'>, {'x': 20.0, 'z': 0.0, 'y': 0.0})
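One way to sanity-check gradients like these is to compare them against central finite differences on plain floats. This is a rough sketch: `central_difference_gradient` and the step size `h` are my own choices for illustration, and the result is only approximate up to floating-point rounding.

```python
def my_fun(x, y, z):
    return (x + y*z + z) * x

def central_difference_gradient(f, point, h=1e-5):
    """Approximate each partial derivative of f at point (hypothetical helper)."""
    grad = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += h      # nudge one coordinate up
        down[i] -= h    # and the same coordinate down
        grad.append((f(*up) - f(*down)) / (2 * h))
    return grad

grad = central_difference_gradient(my_fun, [7.0, 8.0, 10.0])
print(grad)  # should be close to [104, 70, 63] for x, y, z
```

The entries are ordered x, y, z, so they should line up with the 'x', 'y' and 'z' entries of the first gradient printed above.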
