# Common Functions

Linear regression can be used to fit non-linear relationships with the right model or feature engineering. The first step is to recognize what function describes the relationship, and potentially what transformation will create a linear relationship.

In the following examples, `x` is the explanatory variable, `y` is the response variable, and random variation is simulated with normally distributed noise. As the notebook is written, the noise is added to `y` before any transformation. When the response itself is transformed (as in the exponential example), the transformed data therefore no longer have constant variance, which violates the constant variability condition for statistical inference on the model parameters. The transformation is still helpful for identifying which non-linear function models the relationship.
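
The effect of adding noise before transforming can be checked numerically. The following is a minimal sketch, not part of the original notebook: it assumes a hypothetical relationship $y = e^{0.5 x}$, a fixed random seed, and two arbitrary `x` values, then compares the spread of $\ln(y)$ at each one.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (an assumption, for reproducibility)

sigma = 1.0        # constant standard deviation of the additive noise on y
n_draws = 10_000   # repeated noisy observations at each x value

spread_log = {}
for x0 in (4.0, 9.0):  # two illustrative x values (hypothetical choice)
    y_true = np.exp(0.5 * x0)
    y_noisy = y_true + rng.normal(0.0, sigma, size=n_draws)
    # spread of the transformed response at this x
    spread_log[x0] = np.std(np.log(y_noisy))

for x0, s in spread_log.items():
    print(f"x = {x0}: std of log(y) = {s:.4f}")
```

Although the noise on `y` has the same standard deviation at both points, the spread of $\ln(y)$ is much larger where `y` is small, so the variability is not constant in the linear form.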

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

## Linear
$y = m*x + b$

In [None]:
def _linear(x, m=1, b=0):
    return m*x + b
# For plotting convenience, np.vectorize applies the function element-wise to an array argument.
linear = np.vectorize(_linear)

In [None]:
# set parameters
m = -1 # slope
b = 10 # intercept
xmin = 0
xmax = 10
xdel = 0.1 # resolution
mu = 0 # mean of simulated noise
sigma = 1 # standard deviation of simulated noise

# plot relationship with no random noise
x = np.arange(xmin, xmax, xdel)
y = linear(x, m, b)
plt.plot(x, y)
plt.title("Line Plot, no noise")
plt.show()

# same relationship with random noise
noise = np.random.normal(mu, sigma, size=x.shape)
plt.scatter(x, y+noise)
plt.title("Scatter Plot, with noise")
plt.show()
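
To connect the simulation back to regression, the slope and intercept can be estimated from the noisy data. This is a minimal sketch rather than the notebook's own method: it assumes `np.polyfit` as the least-squares fitter and a fixed seed, and reuses the parameter values from the cell above.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (an assumption, for reproducibility)

m_true, b_true = -1.0, 10.0     # same slope and intercept as above
x = np.arange(0, 10, 0.1)
y_noisy = m_true * x + b_true + rng.normal(0.0, 1.0, size=x.shape)

# Degree-1 least-squares fit; coefficients are returned highest power first
m_fit, b_fit = np.polyfit(x, y_noisy, deg=1)
print(f"estimated slope = {m_fit:.3f}, estimated intercept = {b_fit:.3f}")
```

The estimates should land close to the true values, with the gap shrinking as `sigma` decreases or the number of points grows.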

## Quadratic
$y = m*x^2 + b$ 

Define a new feature $x^2$ to create a linear relationship.

In [None]:
def _quadratic(x, m=1, b=0):
    return m * x**2 + b
quadratic = np.vectorize(_quadratic)

In [None]:
# set parameters
m = 1 # coefficient
b = 0 # intercept
xmin = 0
xmax = 10
xdel = 0.1 # resolution
mu = 0 # mean of simulated noise
sigma = 7 # standard deviation of simulated noise


# plot relationship with no random noise
x = np.arange(xmin, xmax, xdel)
y = quadratic(x, m, b)
plt.plot(x, y)
plt.title("Line Plot, no noise")
plt.show()

# same relationship with random noise
noise = np.random.normal(mu, sigma, size=x.shape)
plt.scatter(x, y+noise)
plt.title("Scatter Plot, with noise")
plt.show()
# transform explanatory variable to create linear relationship
newfeature = np.power(x,2)
plt.scatter(newfeature, y+noise)
plt.title("Scatter Plot of Transformed Variable, with noise")
plt.show()
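
Once $x^2$ is treated as the feature, an ordinary straight-line fit recovers the coefficient and intercept. The sketch below is illustrative only: it assumes `np.polyfit` and a fixed seed, and reuses the parameter values from the cell above.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (an assumption, for reproducibility)

m_true, b_true = 1.0, 0.0       # same coefficient and intercept as above
x = np.arange(0, 10, 0.1)
y_noisy = m_true * x**2 + b_true + rng.normal(0.0, 7.0, size=x.shape)

# Engineered feature: the model is linear in x**2, not in x
newfeature = np.power(x, 2)
m_fit, b_fit = np.polyfit(newfeature, y_noisy, deg=1)
print(f"estimated coefficient = {m_fit:.3f}, estimated intercept = {b_fit:.3f}")
```

Because only the explanatory variable is transformed here, the noise on `y` keeps its constant standard deviation in the linear form.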

## Exponential
$y = \alpha * e^{\beta x}$

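Taking the natural log of both sides gives $\ln(y) = \ln(\alpha) + \beta * x$, so in this case we will build a linear model of $\ln(y)$ against $x$ with slope $\beta$ and intercept $\ln(\alpha)$.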

In [None]:
def _exponential(x, a=1, b=1):
    return a*np.exp(b*x)
exponential = np.vectorize(_exponential)

In [None]:
# set parameters
# try a = 10, b=-0.5, sigma =1
a = 1 
b = 0.5 # positive value gives exponential growth, negative results in exponential decay
xmin = 1
xmax = 10
xdel = 0.1 # resolution
mu = 0 # mean of simulated noise
sigma = 10 # standard deviation of simulated noise


# plot relationship with no random noise
x = np.arange(xmin, xmax, xdel)
y = exponential(x, a, b)
plt.plot(x, y)
plt.title("Line Plot, no noise")
plt.show()

# same relationship with random noise
noise = np.random.normal(mu, sigma, size=x.shape)
yn = np.abs(y+noise) # the absolute value biases the noise, but keeps values non-negative so np.log does not return nan
plt.scatter(x, yn)
plt.title("Scatter Plot, with noise")
plt.show()

# transform response variable to create linear relationship
plt.scatter(x, np.log(yn))
plt.title("Scatter Plot of Transformed Variable, with noise")
plt.ylabel("log(y)")
plt.show()
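
The log-linear form can be used to estimate $\alpha$ and $\beta$. The following is a minimal sketch under stated assumptions: `np.polyfit` is used as the fitter, the seed is fixed, and the noise standard deviation (0.2) is deliberately much smaller than in the cell above so that every noisy value stays positive and no absolute value is needed.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (an assumption, for reproducibility)

alpha_true, beta_true = 1.0, 0.5
x = np.arange(1, 10, 0.1)
y = alpha_true * np.exp(beta_true * x)

# Assumption: small additive noise so that y + noise > 0 everywhere
y_noisy = y + rng.normal(0.0, 0.2, size=x.shape)

# Fit ln(y) = ln(alpha) + beta * x with a degree-1 least-squares fit
slope, intercept = np.polyfit(x, np.log(y_noisy), deg=1)
beta_fit = slope
alpha_fit = np.exp(intercept)  # undo the log on the intercept
print(f"estimated alpha = {alpha_fit:.3f}, estimated beta = {beta_fit:.3f}")
```

Note that the intercept of the fitted line estimates $\ln(\alpha)$, so it has to be exponentiated to recover $\alpha$.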

## Logarithmic
$y = \alpha * \log(\beta * x)$

Because $\alpha \log(\beta x) = \alpha \log(x) + \alpha \log(\beta)$, we can model $y = m\zeta + b$ where $m=\alpha$, $b=\alpha \log(\beta)$, and $\zeta=\log(x)$.

In [None]:
def _logarithmic(x, a=1, b=1):
    return a*np.log(b*x)
logarithmic = np.vectorize(_logarithmic)

In [None]:
# set parameters
a = 1 
b = 10
xmin = 0.1
xmax = 10
xdel = 0.1 # resolution
mu = 0 # mean of simulated noise
sigma = 0.25 # standard deviation of simulated noise


# plot relationship with no random noise
x = np.arange(xmin, xmax, xdel)
y = logarithmic(x, a, b)
plt.plot(x, y)
plt.title("Line Plot, no noise")
plt.show()

# same relationship with random noise
noise = np.random.normal(mu, sigma, size=x.shape)
yn = y + noise # no absolute value needed: the response is not log-transformed here
plt.scatter(x, yn)
plt.title("Scatter Plot, with noise")
plt.show()

# transform explanatory variable to create linear relationship
plt.scatter(np.log(x), yn)
plt.title("Scatter Plot of Transformed Variable, with noise")
plt.xlabel("log(x)")
plt.show()
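
With $\zeta=\log(x)$ as the feature, a straight-line fit gives $m$ and $b$, from which $\alpha = m$ and $\beta = e^{b/m}$ follow. This is a minimal sketch, assuming `np.polyfit` as the fitter and a fixed seed, and reusing the parameter values from the cell above.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed (an assumption, for reproducibility)

alpha_true, beta_true = 1.0, 10.0
x = np.arange(0.1, 10, 0.1)
y_noisy = alpha_true * np.log(beta_true * x) + rng.normal(0.0, 0.25, size=x.shape)

# Engineered feature zeta = log(x); fit y = m * zeta + b
zeta = np.log(x)
m_fit, b_fit = np.polyfit(zeta, y_noisy, deg=1)

alpha_fit = m_fit                  # m = alpha
beta_fit = np.exp(b_fit / m_fit)   # b = alpha * log(beta)
print(f"estimated alpha = {alpha_fit:.3f}, estimated beta = {beta_fit:.3f}")
```

The estimate of $\beta$ is more sensitive to noise than the estimate of $\alpha$, because it depends on the exponential of a ratio of two fitted values.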