
In [14]:
"""
Algorithm
1. Read the Given data Sample to X and the curve (linear or non linear) to Y
2. Set the value for Smoothening parameter or Free parameter say τ
3. Set the bias /Point of interest set X0 which is a subset of X
4. Determine the weight matrix using :
        w(x,x0)=e**(x-x0)**2/(2tau**2)
      
5. Determine the value of model term parameter β using :
           beta(x0)=(x**T @ w @ x)**-1 x**T @ w @ y
          
6. Prediction = x0*β
"""

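The six steps can be traced on a tiny example. The sketch below is a minimal illustration, not the notebook's own code: it assumes a Gaussian weight with a negative exponent, and `lwr_predict` and the toy data are hypothetical names and values introduced only for this demonstration. On exactly linear data, a local linear fit should recover the line whatever `tau` is.

```python
import numpy as np

def lwr_predict(x0, x, y, tau):
    """Hypothetical helper: one locally weighted linear prediction at x0."""
    # Step 4: Gaussian weights, largest for the samples closest to x0.
    w = np.exp(-(x - x0) ** 2 / (2 * tau ** 2))
    W = np.diag(w)
    # Design matrix with an intercept column.
    A = np.column_stack([np.ones_like(x), x])
    # Step 5: beta(x0) from the weighted normal equations
    # (pinv is used so a near-singular system does not raise).
    beta = np.linalg.pinv(A.T @ W @ A) @ A.T @ W @ y
    # Step 6: prediction at x0, including the matching intercept term.
    return np.array([1.0, x0]) @ beta

x = np.linspace(0.0, 4.0, 9)
y = 2.0 * x + 1.0              # exactly linear toy data (an assumption for the demo)
pred = lwr_predict(1.5, x, y, tau=0.5)
print(pred)                    # should be very close to 2 * 1.5 + 1
```

Building `np.diag(w)` is fine for nine points but costs O(n^2) memory; for larger data, multiplying `A.T` by `w` through broadcasting gives the same product without the dense matrix.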

In [0]:
import numpy as np
from bokeh.plotting import figure, show, output_notebook
from bokeh.layouts import gridplot
from bokeh.io import push_notebook

output_notebook()

In [0]:
import numpy as np
"""
Loess/Lowess Regression: Loess regression is a nonparametric technique that 
uses local weighted regression to fit a smooth curve through points in a scatter plot.
"""

def local_regression(x0, X, Y, tau):
    #here tau is the smoothening parameters
    # add bias term
    x0 = np.r_[1, x0] # Add one to avoid the loss in information 
    X = np.c_[np.ones(len(X)), X]   
       
    # fit model: normal equations with kernel-function(higher dimentionality feature-space)
    xw = X.T * radial_kernel(x0, X, tau)   # XTranspose * W
    
    beta = np.linalg.pinv(xw @ X) @ xw @ Y   # @ Matrix Multiplication or Dot Product  (beta-formula)
        
    # predict value
    return x0 @ beta    #@ Matrix Multiplication or Dot Product for prediction 

def radial_kernel(x0, X, tau):#kernel-function eqn used for assigning weights
    return np.exp(np.sum((X - x0) ** 2, axis=1) / (-2 * tau * tau))   # Weight or Radial Kernal Bias Function
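To see what `tau` does to the weights, the sketch below evaluates a Gaussian kernel of the same form on a few hand-picked sample locations. It is a self-contained illustration: `gaussian_weights` is a hypothetical one-feature stand-in for `radial_kernel`, and the sample locations are made up.

```python
import numpy as np

def gaussian_weights(x0, x, tau):
    """Hypothetical 1-D stand-in for radial_kernel: exp(-(x - x0)^2 / (2 tau^2))."""
    return np.exp(-(x - x0) ** 2 / (2.0 * tau ** 2))

x = np.array([0.0, 0.5, 1.0, 2.0])   # sample locations (made up for the demo)
for tau in (10.0, 1.0, 0.1):
    w = gaussian_weights(0.0, x, tau)
    # sum(w) is a rough count of how many samples effectively take part in the fit
    print(f"tau={tau:<4} weights={np.round(w, 4)} effective n={w.sum():.3f}")

w_wide = gaussian_weights(0.0, x, 10.0)
w_narrow = gaussian_weights(0.0, x, 0.1)
```

With a large `tau` almost every sample counts equally, so the fit approaches ordinary linear regression; with a small `tau` only the nearest samples carry any weight, so the fit follows local noise.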

In [17]:
n = 800   # number of samples
# generate dataset
X = np.linspace(-3, 3, num=n)   # independent variable
print("Samples 1-9 of X:\n", X[1:10])
Y = np.log(np.abs(X ** 2 - 1) + .5)   # dependent variable: a nonlinear target curve
print("Samples 1-9 of the target curve Y:\n", Y[1:10])
# jitter X
X += np.random.normal(scale=.1, size=n)   # add Gaussian noise (std 0.1) to the independent variable
print("Samples 1-9 of X after jitter:\n", X[1:10])

"""
Lowess algorithm: locally weighted regression is a nonparametric model used in statistical learning.
Given a dataset X, y, we look for a model parameter β(x0) at each query point x0 that minimizes the
weighted residual sum of squares. The weights come from a kernel function (k or w); the choice of
kernel is free, and this notebook uses a Gaussian (radial) kernel.
"""

Samples 1-9 of X:
 [-2.99249061 -2.98498123 -2.97747184 -2.96996245 -2.96245307 -2.95494368
 -2.94743429 -2.93992491 -2.93241552]
Samples 1-9 of the target curve Y:
 [2.13475799 2.1294349  2.12409681 2.11874362 2.11337525 2.1079916
 2.10259259 2.09717812 2.0917481 ]
Samples 1-9 of X after jitter:
 [-2.93578407 -2.7793917  -3.08906596 -3.05768561 -2.94757388 -2.91086143
 -3.06285408 -2.98348797 -2.90960371]


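The weighted least-squares statement in the Lowess description can be checked numerically. The sketch below is an illustration under stated assumptions: the toy data, the seed, and the query point `x0 = 0.2` are all made up, and a Gaussian kernel is assumed. It computes β from the weighted normal equations and confirms that random perturbations of β do not lower the weighted loss.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed, chosen only for reproducibility
x = np.linspace(-1.0, 1.0, 50)
y = np.sin(3.0 * x) + rng.normal(scale=0.1, size=x.size)   # toy data, not the notebook's

x0, tau = 0.2, 0.3
w = np.exp(-(x - x0) ** 2 / (2.0 * tau ** 2))               # kernel weights around x0
A = np.column_stack([np.ones_like(x), x])                    # intercept + feature

def weighted_sse(beta):
    """Sum of w_i * (y_i - a_i . beta)^2, the quantity minimised at x0."""
    r = y - A @ beta
    return float(np.sum(w * r ** 2))

# Closed-form minimiser from the weighted normal equations.
beta_hat = np.linalg.solve(A.T @ (w[:, None] * A), A.T @ (w * y))

# Perturbing beta_hat in any direction should not lower the loss.
base = weighted_sse(beta_hat)
perturbed = [weighted_sse(beta_hat + d) for d in rng.normal(scale=0.05, size=(20, 2))]
print(base, min(perturbed))
```

`np.linalg.solve` is used here because the toy system is well conditioned; the notebook's `pinv` is the more forgiving choice when few samples carry weight.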

In [18]:
domain = np.linspace(-3, 3, num=300)   # 300 evenly spaced query points over the range of X
print("Query points 1-9 of the domain (x0):\n", domain[1:10])
# each x0 in domain is a point of interest at which a local model is fitted and evaluated

def plot_lwr(tau):
    # predict at every query point with locally weighted regression
    prediction = [local_regression(x0, X, Y, tau) for x0 in domain]
    plot = figure(width=400, height=400)   # Bokeh 3 removed plot_width/plot_height in favor of width/height
    plot.title.text = 'tau=%g' % tau
    plot.scatter(X, Y, alpha=.3, color='blue')
    plot.line(domain, prediction, line_width=2, color='red')
    return plot

Query points 1-9 of the domain (x0):
 [-2.97993311 -2.95986622 -2.93979933 -2.91973244 -2.89966555 -2.87959866
 -2.85953177 -2.83946488 -2.81939799]


In [19]:
# Plot the fitted curves for four values of tau, from heavy smoothing (10) to very little (0.01)
show(gridplot([
    [plot_lwr(10.), plot_lwr(1.)],
    [plot_lwr(0.1), plot_lwr(0.01)]
]))
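The four plots can also be compared with a number. The sketch below is a rough, self-contained illustration rather than part of the notebook: `lwr` is a compact hypothetical restatement of `local_regression` for one feature, the seed and the smaller sample sizes are assumptions made to keep it quick, and the error is measured against the noise-free curve on a coarse grid.

```python
import numpy as np

def lwr(x0, x, y, tau):
    """Compact 1-D restatement of local_regression for this sketch."""
    w = np.exp(-(x - x0) ** 2 / (2.0 * tau ** 2))
    A = np.column_stack([np.ones_like(x), x])
    Aw = A.T * w                                   # A^T W without building diag(W)
    beta = np.linalg.pinv(Aw @ A) @ Aw @ y
    return beta[0] + beta[1] * x0

def target(x):
    return np.log(np.abs(x ** 2 - 1) + 0.5)        # same curve as the notebook

rng = np.random.default_rng(0)                     # seed is an assumption; the notebook is unseeded
x_clean = np.linspace(-3, 3, 400)
y = target(x_clean)
x = x_clean + rng.normal(scale=0.1, size=x_clean.size)   # jitter X as the notebook does

grid = np.linspace(-3, 3, 100)
mse = {}
for tau in (10.0, 1.0, 0.1, 0.01):
    pred = np.array([lwr(x0, x, y, tau) for x0 in grid])
    mse[tau] = float(np.mean((pred - target(grid)) ** 2))
    print(f"tau={tau:<5} MSE vs noise-free curve = {mse[tau]:.4f}")
```

A very large `tau` behaves like a single global line and misses the shape of the curve, while a very small `tau` fits only a handful of jittered points at a time; the useful range lies in between and depends on the sample density.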