Detail about Legendre Polynomial Smoothing
========

For the curious with a good mathematical background, the basic details are given here; they are not needed for the other sections.  This is a variation on the previous linear regression, using the same linear functional form
$$f(x) ~=~ \sum_{i=0}^{I} a_i L_i(x)$$
where $L_i(x)$ is the $i$-th Legendre polynomial (a polynomial of degree $i$).
Legendre polynomials are used instead of simple powers of $x$ because they form a better basis set: simple powers can take extreme values and are nearly collinear with one another, which makes the fit numerically unstable.
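
As a rough illustration of why the choice of basis matters, the sketch below compares the condition number of the design matrix built from simple powers with the one built from Legendre polynomials. The grid of 200 evenly spaced points on [-1,1] and the degree of 20 are assumptions chosen for illustration; the sketch uses only `numpy` and is not part of `regressiondemo`.

```python
import numpy as np
from numpy.polynomial import legendre

# Assumed setup: 200 evenly spaced points on [-1, 1] and degree 20
x = np.linspace(-1.0, 1.0, 200)
degree = 20

# Design matrix of simple powers: columns 1, x, x^2, ..., x^degree
V_power = np.vander(x, degree + 1, increasing=True)
# Design matrix of Legendre polynomials: columns L_0(x), ..., L_degree(x)
V_legendre = legendre.legvander(x, degree)

# A large condition number means the columns are nearly collinear,
# so small changes in the data can cause large changes in the coefficients
cond_power = np.linalg.cond(V_power)
cond_legendre = np.linalg.cond(V_legendre)
print(f"condition number, simple powers: {cond_power:.3g}")
print(f"condition number, Legendre:      {cond_legendre:.3g}")
```

On this grid the Legendre design matrix should come out far better conditioned than the matrix of simple powers.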

Moreover, to fit $f(x)$ we minimise the augmented squared error
$$\frac{1}{\sigma^2}\sum_{i=1}^N (y_i-f(x_i))^2 + \lambda {\large\int}_x (f''(x))^2 \mbox{d}x$$
where the second term involves the second derivative of $f()$ and induces *smoothness* in the resulting polynomial because it represents the mean square curvature. 
Bayesian theory is used to set the smoothing hyperparameter $\lambda$.  The second term says, "don't let the function $f()$ change too quickly."   The smoothing integral evaluates to a sparse quadratic form in the parameters $(a_0,...,a_I)$. 
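
To make the quadratic form concrete, here is a minimal sketch that builds the curvature penalty matrix and solves the resulting penalised least squares problem. Several things are assumptions made for illustration: the integral is taken over [-1,1], $\lambda$ is fixed by hand rather than set by Bayesian theory, and the helper names `curvature_penalty` and `fit_smoothed` are hypothetical and are not functions from `regressiondemo`.

```python
import numpy as np
from numpy.polynomial import legendre

def curvature_penalty(degree):
    """Matrix P with P[i, j] = integral over [-1, 1] of L_i''(x) L_j''(x) dx."""
    # Gauss-Legendre quadrature with degree+1 nodes integrates these products exactly
    nodes, weights = legendre.leggauss(degree + 1)
    d2 = np.empty((degree + 1, nodes.size))
    for i in range(degree + 1):
        coeffs = np.zeros(degree + 1)
        coeffs[i] = 1.0                      # selects the single polynomial L_i
        d2[i] = legendre.legval(nodes, legendre.legder(coeffs, 2))
    return (d2 * weights) @ d2.T

def fit_smoothed(x, y, degree, sigma, lam):
    """Minimise sum((y - f(x))^2) / sigma^2 + lam * a^T P a over the coefficients a."""
    V = legendre.legvander(x, degree)
    P = curvature_penalty(degree)
    # Setting the gradient to zero gives (V^T V / sigma^2 + lam P) a = V^T y / sigma^2
    A = V.T @ V / sigma**2 + lam * P
    b = V.T @ y / sigma**2
    return np.linalg.solve(A, b)

# Illustrative data: sin(3x) plus noise at 30 points, fitted with degree 10
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 30)
y = np.sin(3 * x) + rng.normal(0, 0.2, x.size)

P = curvature_penalty(10)
a_rough = fit_smoothed(x, y, 10, 0.2, lam=0.0)    # ordinary least squares
a_smooth = fit_smoothed(x, y, 10, 0.2, lam=1.0)   # with the smoothness penalty
print("curvature penalty, lam=0:", a_rough @ P @ a_rough)
print("curvature penalty, lam=1:", a_smooth @ P @ a_smooth)
```

The rows and columns of `P` for $a_0$ and $a_1$ are zero, since constants and straight lines have no curvature, and entries with $i+j$ odd vanish by symmetry, which is where the sparsity comes from.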

Discussion
------

Now is this a good general purpose fitting routine in 2-D?  Certainly not always.  In fact, it's not even clear that such a thing as a "general purpose routine" exists.  Consider the following scenarios:
 * you're modelling exchange rate data at 5-minute intervals, which can have wild changes;
 * you're modelling a fractal function, which means no matter what scale you fit, it seems somewhat the same;
 * you're modelling an industrial process known to undergo "phase changes" at different inputs, so occasional stark changes are expected.
 
Moreover, the celebrated ["no free lunch theorem"](https://en.wikipedia.org/wiki/No_free_lunch_theorem) (NFLT) says, roughly,
> if an algorithm performs well on a certain class of problems then it necessarily pays for that with degraded performance on the set of all remaining problems
 
Clearly, our "smoothing" is not intrinsically useful in the contexts above, and indeed, by the NFLT, there must be other contexts where it cannot do as well either.  You may see in some examples that it does seem to make things a bit too smooth, for instance by trying to smooth out peaks.

Initialise
---------

First we reinitialise things.

In [None]:
# put the pieces together, sin(x) + noise + basic regression 
import sys
import os
import numpy
sys.path.append(os.getcwd())
import regressiondemo as rd
%matplotlib inline
import matplotlib.pyplot as pl

rd.setSigma(0.2)

#  don't use more than 100 points, as the demo is O(points^3)
points = 30

x = rd.makeX(points)

# xts and yts store the "true" function for the purposes of plotting
# these have to be high frequency to make the resultant plot look
# like a smooth curve
xts = rd.makeX(200,uniform=True)

The Basis Functions
----------

For those who want a glimpse of the gory details, read on.

We plot a few basis functions below.  The first plot shows the Legendre polynomials, which are rather like sine curves but modified to fit into the finite range [-1,1].  The second plot shows the smoothed bases before scaling, and the third shows the actual polynomial bases used by the regression routine.  Note that the higher-order bases have been scaled to be so small that they look flat, but they are actually very curvy, as the second plot shows.  The effect of the scaling is that the final polynomials have no sharp deviations of curvature left.  The last plot shows random functions built from the scaled bases.
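
The effect of the scaling can be sketched without `regressiondemo`. The example below is only an illustration under stated assumptions, not the routine's actual implementation: it takes the smoothing matrix to be the curvature penalty on [-1,1] plus 1 on the diagonal (so the zero-curvature directions get a finite scale), rotates the Legendre basis by the eigenvectors of that matrix, and divides each rotated function by the square root of its eigenvalue.

```python
import numpy as np
from numpy.polynomial import legendre

degree = 20
xs = np.linspace(-1.0, 1.0, 200)
V = legendre.legvander(xs, degree)          # columns L_0(xs), ..., L_degree(xs)

# Curvature penalty P[i, j] = integral over [-1, 1] of L_i''(x) L_j''(x) dx,
# by Gauss-Legendre quadrature (exact for these degrees, up to rounding)
nodes, weights = legendre.leggauss(degree + 1)
identity = np.eye(degree + 1)
d2 = np.array([legendre.legval(nodes, legendre.legder(identity[i], 2))
               for i in range(degree + 1)])
P = (d2 * weights) @ d2.T

# ASSUMPTION: add 1 on the diagonal so the constant and linear directions,
# which have zero curvature, get a finite scale
S = P + np.eye(degree + 1)

evals, evecs = np.linalg.eigh(S)            # ascending eigenvalues, eigenvectors in columns
unscaled = V @ evecs                        # column k: k-th rotated basis function on xs
scaled = unscaled / np.sqrt(evals)          # shrink each function by 1/sqrt(eigenvalue)

amplitude = np.abs(scaled).max(axis=0)
print("largest |value| of the flattest basis function:", amplitude[0])
print("largest |value| of the curviest basis function:", amplitude[-1])
```

The curviest directions have the largest eigenvalues, so after scaling they contribute only tiny wiggles, which is why they look flat when plotted on the same axes as the low-order functions.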

In [None]:
# CHOOSE:  degree of Legendre poly
legdegree = 20
legpoly = rd.LegPoly(legdegree)
legpoly.setX(xts)
vanders = legpoly.vander
legpoly.setX(x)
vander = legpoly.vander

plist = [1,2,5,10,legdegree-1]

# we will plot the first few Legendre polys
for i in plist:
    pl.plot(xts,vanders[:,i-1],label='Legendre_'+str(i))
pl.legend(bbox_to_anchor=(1.4, 1.1))
pl.suptitle('Legendre Polynomials of different order')
pl.show()
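# SVD of the smoothing matrix; numpy.linalg.svd returns (U, s, Vh)
# with the singular values s in descending order
Vu, Vs, Vv = numpy.linalg.svd(legpoly.smooth)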

# we will plot the first few Smoothed Legendre polys before scaling
for i in plist:
    pl.plot(xts,numpy.dot(vanders,Vv[:,legdegree-i+1]),label='UnscSmthLgdre_'+str(i)) 
pl.legend(bbox_to_anchor=(1.6, 1.1))
pl.suptitle('Smoothed Legendre Polynomials before scaling')
pl.show()

# now first few Smoothed Legendre polys with scaling
for i in plist:
    pl.plot(xts,numpy.dot(vanders,Vv[:,legdegree-i+1])/numpy.sqrt(Vs[legdegree-i+1]),label='SmoothLgdre_'+str(i)) 
pl.legend(bbox_to_anchor=(1.5, 1.1))
pl.suptitle('Smoothed Legendre Polynomials')
pl.show()

# now plot random functions
for i in range(10):
    uu = numpy.random.normal(0,1,legdegree+1)
    # uu[legdegree] = 0
    pl.plot(xts,numpy.dot(vanders,numpy.dot(Vv,uu/numpy.sqrt(Vs)))) 
pl.suptitle('Random Smoothed Legendre Polys (centered, mean slope 0)')
pl.show()
