In [None]:
%matplotlib inline

Here is an example we might discuss in class (based on https://towardsdatascience.com/can-neural-networks-really-learn-any-function-65e106617fc6).

The point is: when we look at something like the universal function approximation (UFA) theorem, what does it actually mean? :) Many students study the theorem but never really develop any intuition for it. It is an existence theorem: it tells us a good approximation exists, not how to construct one.

Below, we have a target function, and we combine a few simpler functions (our 'basic functions', each a ReLU of the form max(a*x + b, 0)) to try to approximate it.

In [None]:
import numpy as np 
import matplotlib.pyplot as plt 

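x = np.linspace(-2, 2, 1000)  # 1000 evenly spaced sample points on [-2, 2]

# target function: f(x) = x^3 + x^2 - x - 1
f = np.power(x, 3) + np.power(x, 2) - x - 1

# basic functions: each one is a ReLU, max(a*x + b, 0)
n1 = np.maximum(-5 * x - 7.7, 0)
n2 = np.maximum(-1.2 * x - 1.3, 0)
n3 = np.maximum(1.2 * x + 1, 0)
# the approximation is a weighted sum of them (every weight is -1 here)
z = -n1 - n2 - n3

# clip both axes to [-2, 2] and use equal scaling on x and y
plt.xlim(-2, 2)
plt.ylim(-2, 2)
plt.gca().set_aspect('equal', adjustable='box')
plt.plot(x, f, 'b')   # target function in blue
plt.plot(x, n1, 'r')  # the three basic functions in red, green, magenta
plt.plot(x, n2, 'g')
plt.plot(x, n3, 'm')
plt.plot(x, z, 'k')   # their combination in black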
plt.show()

Now, the more functions we use, the better the approximation can get. Here we go from three basic functions to six:

In [None]:
import numpy as np # include the numpy library for ops on arrays as the name "np"
import matplotlib.pyplot as plt # include the plotting library as the name "plt"

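x = np.linspace(-2, 2, 1000)  # 1000 evenly spaced sample points on [-2, 2]

# target function: f(x) = x^3 + x^2 - x - 1
f = np.power(x, 3) + np.power(x, 2) - x - 1

# basic functions: the same three ReLUs as before, plus three new ones
n1 = np.maximum(-5 * x - 7.7, 0)
n2 = np.maximum(-1.2 * x - 1.3, 0)
n3 = np.maximum(1.2 * x + 1, 0)
n4 = np.maximum(1.2 * x - 0.2, 0)
n5 = np.maximum(2 * x - 1.1, 0)
n6 = np.maximum(5 * x - 2, 0)
# the first three are subtracted, the new three are added
z = -n1 - n2 - n3 + n4 + n5 + n6

# clip both axes to [-2, 2] and use equal scaling on x and y
plt.xlim(-2, 2)
plt.ylim(-2, 2)
plt.gca().set_aspect('equal', adjustable='box')
plt.plot(x, f, 'b')   # target function in blue
plt.plot(x, n1, 'r')  # all six basic functions in red
plt.plot(x, n2, 'r')
plt.plot(x, n3, 'r')
plt.plot(x, n4, 'r')
plt.plot(x, n5, 'r')
plt.plot(x, n6, 'r')
plt.plot(x, z, 'k')   # their combination in black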
plt.show()
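
To put a number on "better", one option is to compare the worst-case gap between the target and each combination over the interval. This is only a sketch: the `relu` helper and the names `z3`, `z6`, `err3`, and `err6` are made up here for illustration, and maximum absolute error is just one of several reasonable ways to measure the fit.

```python
import numpy as np

x = np.linspace(-2, 2, 1000)
f = x**3 + x**2 - x - 1  # same target function as in the cells above

def relu(t):
    # hypothetical helper: elementwise max(t, 0)
    return np.maximum(t, 0)

# three-function combination from the first plot
z3 = -relu(-5 * x - 7.7) - relu(-1.2 * x - 1.3) - relu(1.2 * x + 1)
# six-function combination from the second plot (adds three more ReLUs)
z6 = z3 + relu(1.2 * x - 0.2) + relu(2 * x - 1.1) + relu(5 * x - 2)

# largest absolute gap between target and approximation on the sampled grid
err3 = np.max(np.abs(f - z3))
err6 = np.max(np.abs(f - z6))
print("max error with 3 functions:", err3)
print("max error with 6 functions:", err6)
```

Note that this measures the gap over the whole interval, including the parts where the target leaves the plotted window.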

And this process can be repeated and expanded to get an even tighter approximation.

But which set of functions do we pick? Are they always guaranteed to reconstruct our target function? Under what conditions? And so on. Ohhhh, the joy and pure torture of the UFA.
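
One way to try this "repeat and expand" idea without hand-tuning every slope and offset is to fix where each ReLU bends and let least squares choose the output weights. The sketch below is an illustration under stated assumptions, not how a neural network is actually trained: the evenly spaced bend points, the helper name `fit_relu_sum`, and the choice of `np.linalg.lstsq` are all made up for this example.

```python
import numpy as np

x = np.linspace(-2, 2, 1000)
f = x**3 + x**2 - x - 1  # same target function as in the cells above

def fit_relu_sum(x, f, n_units):
    """Least-squares fit of f by a bias plus n_units ReLUs with fixed bend points."""
    # Assumption: bend points are evenly spaced, starting at the left edge.
    knots = np.linspace(x.min(), x.max(), n_units + 1)[:-1]
    # One column per ReLU max(x - knot, 0), plus a constant column for the bias.
    columns = [np.ones_like(x)] + [np.maximum(x - k, 0) for k in knots]
    A = np.column_stack(columns)
    # Solve for the output weights that minimize the squared error on the grid.
    weights, *_ = np.linalg.lstsq(A, f, rcond=None)
    return A @ weights

rms_errors = {}
for n_units in (4, 8, 16):
    z = fit_relu_sum(x, f, n_units)
    rms_errors[n_units] = np.sqrt(np.mean((f - z) ** 2))
    print(n_units, "ReLUs -> RMS error", rms_errors[n_units])
```

With 4, 8, and 16 units, each larger set of bend points contains the smaller one, so the least-squares fit cannot get worse as units are added. How quickly it improves depends on the target function.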

Some good additional links (with the math!!! :) include:

 * https://link.springer.com/article/10.1007/BF02551274
 * https://mcneela.github.io/machine_learning/2017/03/21/Universal-Approximation-Theorem.html