# ML-Fundamentals - Lineare Regression - Exercise: Multivariate Linear Regression

In [160]:
#Lara Neubauer da Costa Schertel, s0575697
#Ich habe zusammen mit Jasha Springmann (s0575697) gearbeitet
    


## Table of Contents
* [Introduction](#Introduction)
* [Requirements](#Requirements) 
  * [Knowledge](#Knowledge) 
  * [Modules](#Python-Modules)
* [Exercises - Multivariate Linear Regression](#Exercises---Multivariate-Linear-Regression)
  * [Create Features](#Create-Features)
  * [Linear Hypothesis](#Linear-Hypothesis)
  * [Generate Target Values](#Generate-Target-Values)
  * [Plot The Data](#Plot-The-Data)
  * [Cost Function](#Cost-Function)
  * [Gradient Descent](#Gradient-Descent)
  * [Training and Evaluation](#Training-and-Evaluation)
  * [Feature Scaling](#Feature-Scaling)
* [Summary and Outlook](#Summary-and-Outlook)
* [Literature](#Literature) 
* [Licenses](#Licenses)

## Introduction

In this exercise you will implement the _multivariate linear regression_, a model with two or more predictors and one response variable (opposed to one predictor using univariate linear regression). The whole exercise consists of the following steps:

1. Generate values for two predictors/features $(x_1, x_2)$
2. Implement a linear function as hypothesis (model) 
3. Generate values for the response (Y / target values)
4. Plot the $((x_1, x_2), y)$ values in a 3D plot.
5. Write a function to quantify your model (cost function)
6. Implement the gradient descent algorithm to train your model (optimizer) 
7. Visualize your training process and results
8. Apply feature scaling (pen & paper)

## Requirements
### Knowledge

You should have a basic knowledge of:
- Univariate linear regression
- Multivariate linear regression
- Squared error
- Gradient descent
- numpy
- matplotlib

Suitable sources for acquiring this knowledge are:
- [Multivariate Linear Regression Notebook](http://christianherta.de/lehre/dataScience/machineLearning/basics/multivariate_linear_regression.php) by Christian Herta and his [lecture slides](http://christianherta.de/lehre/dataScience/machineLearning/multivariateLinearRegression.pdf) (German)
- Chapter 2 of the open classroom [Machine Learning](http://openclassroom.stanford.edu/MainFolder/CoursePage.php?course=MachineLearning) by Andrew Ng
- Chapter 5.1 of [Deep Learning](http://www.deeplearningbook.org/contents/ml.html) by Ian Goodfellow 
- Some parts of chapter 1 and 3 of [Pattern Recognition and Machine Learning](https://www.microsoft.com/en-us/research/people/cmbishop/#!prml-book) by Christopher M. Bishop
- [numpy quickstart](https://docs.scipy.org/doc/numpy-1.15.1/user/quickstart.html)
- [Matplotlib tutorials](https://matplotlib.org/tutorials/index.html)

### Python Modules

By [deep.TEACHING](https://www.deep-teaching.org/) convention, all python modules needed to run the notebook are loaded centrally at the beginning. 


In [161]:
# External Modules
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d.axes3d import Axes3D
from matplotlib import cm
np.random.seed(42)

%matplotlib notebook

## Exercise - Multivariate Linear Regression

We will only use two features in this notebook, so we are still able to plot them together with the target in a 3D plot. But your implementation should also be capable of handling more (except the plots). 

### Create Features

First we will create some features. The features should be in a 2D numpy array, the rows separating the different feature vectors (training examples), the columns containing the features. Each feature should be **uniformly** distributed in a specifiable range.

**Task:**

Implement the function to generate a feature matrix (numpy array).

In [162]:

def create_feature_matrix(sample_size, n_features, x_min, x_max):
    return np.random.uniform(x_min, x_max, (sample_size, n_features))
 
    '''creates random feature vectors based on a lienar function in a given interval
    
    Args:
        sample_size: number feature vectors
        n_features: number of features for each vector
        x_min: lower bound value ranges
        x_max: upper bound value ranges
    
    Returns:
        x: 2D array containing feature vecotrs with shape (sample_size, n_features)
    '''
    raise NotImplementedError("You should implement this!")

In [163]:
sample_size = 100
n_features = 2
x_min = [1.5, -0.5]
x_max = [11., 5.0]

X = create_feature_matrix(sample_size, n_features, x_min, x_max)
X

array([[ 5.05813113,  4.72892869],
       [ 8.45394245,  2.79262166],
       [ 2.98217708,  0.35796986],
       [ 2.05179432,  4.2639688 ],
       [ 7.21059261,  3.39439918],
       [ 1.6955527 ,  4.83450419],
       [ 9.40820509,  0.66786511],
       [ 3.22733719,  0.5087248 ],
       [ 4.39030131,  2.38616037],
       [ 5.60347768,  1.10176027],
       [ 7.3126025 ,  0.26721623],
       [ 4.27537416,  1.51499014],
       [ 5.83266485,  3.81846779],
       [ 3.39690093,  2.32828941],
       [ 7.1279384 , -0.24452273],
       [ 7.27167609,  0.43788268],
       [ 2.11799013,  4.71887045],
       [10.67350431,  3.94618541],
       [ 4.39383081,  0.03719663],
       [ 8.00021375,  1.92083872],
       [ 2.65936323,  2.22347301],
       [ 1.82669095,  4.50126221],
       [ 3.95840983,  3.14387256],
       [ 4.46125522,  2.36037412],
       [ 6.69374765,  0.51669951],
       [10.71105396,  3.76323053],
       [10.42523994,  4.42155043],
       [ 7.1800498 ,  4.57030829],
       [ 2.34067877,

In [164]:
assert len(X[:,0]) == sample_size
assert len(X[0,:]) == n_features
for i in range(n_features):
    assert np.max(X[:,i]) <= x_max[i]
    assert np.min(X[:,i]) >= x_min[i]

### Linear Hypothesis


A short recap, a hypothesis $h_\theta({\bf x})$ is a certain function that we believe is similar to a target function that we like to model. A hypothesis $h_\theta({\bf x})$ is a function of ${\bf x}$ with fixed parameters $\Theta$. 

Here we have $n$ features ${\bf x} = \{x_1, \ldots, x_n \}$ and $n+1$ parameters $\Theta = \{\theta_0, \theta_1 \ldots, \theta_n \}$:

$$
h_\theta({\bf x}) = \theta_{0} + \theta_{1} x_1 + \ldots \theta_n x_n 
$$

adding an extra element to $\vec x$ for convenience, this could also be rewritten as:

$$
h_\theta({\bf x}) = \theta_{0} x_0 + \theta_{1} x_1 + \ldots \theta_n x_n 
$$

with $x_0 = 1$ for all feature vectors (training examples).

Or treating ${\bf x}$ and $\Theta$ as vectors:

$$
h(\vec x) = \vec x'^T \vec \theta
$$

with:

$$
\vec x = \begin{pmatrix} 
x_1 & x_2 & \ldots & x_n \\
\end{pmatrix}^T
\text{   and   }
\vec x' = \begin{pmatrix} 
1 & x_1 & x_2 & \ldots & x_n \\
\end{pmatrix}^T
$$

and

$$
\vec \theta = \begin{pmatrix} 
\theta_0 & \theta_1 & \ldots & \theta_n \\
\end{pmatrix}^T
$$

Or for the whole data set at once: The rows in $X$ separate the different feature vectors, the columns contain the features. 

$$
\vec h_\Theta(X) = X' \cdot \vec \theta
$$

the vector $\vec h(X) = \left( h(\vec x^{(1)}),h(\vec x^{(2)}), \dots, h(\vec x^{(m)}) )\right)^T$ contains all predictions for the data batch $X$.

with:

$$
\begin{align}
X &= \begin{pmatrix} 
x_1^{(1)} & \ldots & x_n^{(1)} \\
x_1^{(2)} & \ldots & x_n^{(2)} \\
\vdots &\vdots &\vdots \\
x_1^{(m)} & \ldots & x_n^{(m)} \\
\end{pmatrix}
&=
\begin{pmatrix} 
\vec x^{(1)T} \\
\vec x^{(2)T}  \\
\vdots  \\
\vec x^{(m)T}  \\
\end{pmatrix}
\end{align}
$$
respectively
$$
\begin{align}
X' = \begin{pmatrix} 
1 & x_1^{(1)} & \ldots & x_n^{(1)} \\
1 & x_1^{(2)} & \ldots & x_n^{(2)} \\
\vdots &\vdots &\vdots &\vdots \\
1 & x_1^{(m)} & \ldots & x_n^{(m)} \\
\end{pmatrix}
&=
\begin{pmatrix} 
\vec x'^{(1)T} \\
\vec x'^{(2)T}  \\
\vdots  \\
\vec x'^{(m)T}  \\
\end{pmatrix}
\end{align}
$$

**Task:**

Implement hypothesis $\vec h_\Theta(X)$ in the method `linear_hypothesis` and return it as a function. Implement it the computationally efficient (**pythonic**) way by not using any loops and handling all data at once (use $X$ respectively $X'$).

**Hint:**

Of course you are free to implement as many helper functions as you like, e.g. for transforming $X$ to $X'$, though you do not have to. Up to you.

In [165]:
def linear_hypothesis(thetas):
    
    return lambda x: np.concatenate((np.ones((x.shape[0], 1)), x), axis=1).dot(thetas)
    
    '''  Combines given list argument in a linear equation and returns it as a function
    
    Args:
        thetas: list of coefficients
        
    Returns:
        lambda that models a linear function based on thetas and x
    '''
    raise NotImplementedError("You should implement this!")

In [166]:
assert len(linear_hypothesis([.1,.2,.3])(X)) == sample_size

### Generate Target Values

**Task:**

Use your implemented `linear_hypothesis` inside the next function to generate some target values $Y$. Additionally add some Gaussian noise.

In [167]:
def generate_targets(X, theta, sigma):
    original = linear_hypothesis(theta)(X)
    noise = np.random.normal(scale = sigma, size = original.shape)
    return original + noise
    ''' Combines given arguments in a linear equation with X, 
    adds some Gaussian noise and returns the result
    
    Args:
        X: 2D numpy feature matrix
        theta: list of coefficients
        sigma: standard deviation of the gaussian noise
        
    Returns:
        target values for X
    '''
    raise NotImplementedError("You should implement this!")

In [168]:
theta = (2., 3., -4.)
sigma = 3.
y = generate_targets(X, theta, sigma)

In [169]:
assert len(y) == sample_size

### Plot The Data

**Task:**

Plot the data $\mathcal D = \{((x^{(1)}_1,x^{(1)}_2)^T,y^{(1)}), \ldots, ((x^{(n)}_1,x^{(n)}_2)^T,y^{(n)})\}$ in a 3D scatter plot. The plot should look like the following:

<img src="https://gitlab.com/deep.TEACHING/educational-materials/raw/dev/media/klaus/exercise-multivariate-linear-regression-scatter.png" width="512" alt="internet connection needed">

**Sidenote:**

The command `%matplotlib notebook` (instead of `%matplotlib inline`) creates an interactive (e.g. rotatable) plot.

In [170]:
%matplotlib notebook

def plot_data_scatter(features, targets):
    fig = plt.figure()
    ax = fig.add_subplot(projection='3d')
    
    ax.set_xlabel('x1')
    ax.set_ylabel('x2')
    ax.set_zlabel('y')

    ax.scatter(features[:,0], features[:,1], targets, color="red")
    return plt.show
    
    """ Plots the features and the targets in a 3D scatter plot
    
    Args:
        features: 2D numpy-array features
        targets: ltargets
    """
    raise NotImplementedError("You should implement this!")

In [171]:
plot_data_scatter(X, y)

<IPython.core.display.Javascript object>

<function matplotlib.pyplot.show(block=None)>

### Cost Function
A cost function $J$ depends on the given training data $D$ and hypothesis $h_\theta(\vec x)$. In the context of the linear regression, the cost function measures how "wrong" a model is regarding its ability to estimate the relationship between $\vec x$ and $y$ for specific $\Theta$ values. Later we will treat this as an optimization problem and try to minimize the cost function $J_{\mathcal D}(\Theta)$ to find optimal $\theta$ values for our hypothesis $h_\theta(\vec x)$. The cost function we use in this exercise is the [Mean-Squared-Error](https://en.wikipedia.org/wiki/Mean_squared_error) cost function:

\begin{equation}
    J_{\mathcal D}(\Theta)=\frac{1}{2m}\sum_{i=1}^{m}{(h_\Theta(\vec x^{(i)})-y^{(i)})^2}
\end{equation}

Implement the cost function $J_D(\Theta)$ in the method `mse_cost_function`. The method should return a function that takes the values of $\Theta$ as an argument.

As a sidenote, the terms "loss function" or "error function" are often used interchangeably in the field of Machine Learning.

In [172]:
def mse_cost_function(x, y):
   
    m = len(x)
   
    return lambda theta: 1 / (2* float(m)) * sum((linear_hypothesis(theta)(x) - y)**2)
                          
    ''' Implements MSE cost function as a function J(theta) on given traning data 
    
    Args:
        x: vector of x values 
        y: vector of ground truth values y 
        
    Returns:
        lambda J(theta) that models the cost function
    '''
    raise NotImplementedError("You should implement this!")

Review the cell in which you generate the target values and note the theta values, which were used for it (If you haven't edited the default values, it should be `[2, 3, -4]`)

**Optional:**

Try a few different values for theta to pass to the cost function - Which thetas result in a low error and which produce a great error?

In [173]:
J = mse_cost_function(X, y)
print(J(theta))

4.558621127297221


###  Gradient Descent

A short recap, the gradient descent algorithm is a first-order iterative optimization for finding a minimum of a function. From the current position in a (cost) function, the algorithm steps proportional to the negative of the gradient and repeats this until it reaches a local or global minimum and determines. Stepping proportional means that it does not go entirely in the direction of the negative gradient, but scaled by a fixed value $\alpha$ also called the learning rate. Implementing the following formalized update rule is the core of the optimization process:

\begin{equation}
    \theta_{j}^{new} \leftarrow \theta_{j}^{old} - \alpha * \frac{\partial}{\partial\theta_{j}} J(\vec \theta^{old})
\end{equation}

**Task:**

Implement the function to update all theta values.

In [1]:
def update_theta(x, y, theta, learning_rate):
    
    return theta - (learning_rate * (1/len(x)) * sum((linear_hypothesis(theta)(x) - y).dot(x)))
    ''' Updates learnable parameters theta 
    
    The update is done by calculating the partial derivities of 
    the cost function including the linear hypothesis. The 
    gradients scaled by a scalar are subtracted from the given 
    theta values.
    
    Args:
        x: 2D numpy array of x values
        y: array of y values corresponding to x
        theta: current theta values
        learning_rate: value to scale the negative gradient 
        
    Returns:
        theta: Updated theta vector
    '''
    raise NotImplementedError("You should implement this!")

Using the `update_theta` method, you can now implement the gradient descent algorithm. Iterate over the update rule to find the values for $\vec \theta$ that minimize our cost function $J_D(\vec \theta)$. This process is often called training of a machine learning model. 

**Task:**
- Implement the function for the gradient descent.
- Create a history of all theta and cost values and return them.

In [175]:
def gradient_descent(learning_rate, theta, iterations, x, y, cost_function):

    array_t = np.empty((0, 0))
    array_c = np.empty((0))


    def helper(array_theta, array_cost, thetas, i):
        t = update_theta(x, y, thetas, learning_rate)
        c = cost_function(x, y)(thetas)
        arrt= [t]
        arrc = [c]

        if (i == iterations): 
            return array_cost, array_theta.reshape(iterations, len(theta))
        else: 
            return helper(np.append(array_theta, arrt), np.append(array_cost, arrc), t, i+1)


    a, b = helper(array_t, array_c, theta, 0)


    return a, b

    '''
     a = np.append(array_c, result[0])
    b = np.append(array_t, result[1])
    
     np.append(array_theta, t)
        np.append(array_cost, c)
    Minimize theta values of a linear model based on MSE cost function
    
    Args:
        learning_rate: scalar, scales the negative gradient 
        theta: initial theta values
        x: vector, x values from the data set
        y: vector, y values from the data set
        iterations: scalar, number of theta updates
        cost_function: python function for computing the cost
        
    Returns:
        history_cost: cost after each iteration
        history_theta: Updated theta values after each iteration
    '''
    raise NotImplementedError("You should implement this!")

### Training and Evaluation

**Task:**

Choose an appropriate learning rate, number of iterations and initial theta values and start the training

In [176]:
# Your implementation:

alpha = 0.001 # assign an appropriate value
nb_iterations = 100 # assign an appropriate value
start_values_theta = [1, 0, -1] # assign appropriate values
history_cost, history_theta = gradient_descent(alpha, start_values_theta, nb_iterations, X, y,  mse_cost_function)
print(history_cost)
print(history_theta)


[140.23911849 123.42956939 109.41364115  97.72742264  87.9840084
  79.86071904  73.08844226  67.44274265  62.73644669  58.81345829
  55.54360061  52.81831379  50.54706679  48.6543646   47.0772522
  45.76323269  44.66853104  43.75664597  42.99714214  42.36464289
  41.83799013  41.39954372  41.0345971   40.73089002  40.47820206
  40.26801373  40.09322381  39.94791361  39.82715037  39.72682341
  39.64350734  39.57434811  39.51696797  39.46938615  39.42995285
  39.39729407  39.37026569  39.34791518  39.32944961  39.3142091
  39.30164457  39.29129925  39.28279326  39.27581078  39.2700893
  39.26541071  39.26159386  39.25848835  39.2559694   39.25393351
  39.25229487  39.25098242  39.24993733  39.24911094  39.24846303
  39.24796041  39.24757569  39.24728631  39.24707373  39.24692271
  39.24682075  39.24675763  39.246725    39.24671608  39.24672534
  39.24674832  39.24678141  39.24682174  39.24686699  39.24691535
  39.24696537  39.24701594  39.24706621  39.24711552  39.24716339
  39.24720948 

Now that the training has finished we can visualize our results.

**Task:**

Plot the costs over the iterations. If you have used `fig = plt.figure()` and `ax = fig.add_subplot(111)` in the last plot, use it again here, else the plot will be added to the last plot instead of a new one.

Your plot should look similar to this one:

<img src="https://gitlab.com/deep.TEACHING/educational-materials/raw/dev/media/klaus/exercise-multivariate-linear-regression-costs.png" width="512" alt="internet connection needed">

In [177]:
def plot_progress(costs):
    fig = plt.figure()
    ax = fig.add_subplot()
   
    ax.set_xlabel('Iterationen')
    ax.set_ylabel('Kosten')

    return ax.plot(costs)
    

    
    """ Plots the costs over the iterations
    
    Args:
        costs: history of costs
    """
    raise NotImplementedError("You should implement this!")

In [178]:
print("costs before the training:\t ", history_cost[0])
print("costs after the training:\t ", history_cost[-1])
plot_progress(history_cost)

costs before the training:	  140.2391184907217
costs after the training:	  39.24774882241659


<IPython.core.display.Javascript object>

[<matplotlib.lines.Line2D at 0x23f9f404408>]

**Task:**

Finally plot the decision hyperplane (just a plain plane here though) together with the data in a 3D plot.

Your plot should look similar to this one:

<img src="https://gitlab.com/deep.TEACHING/educational-materials/raw/dev/media/klaus/exercise-multivariate-linear-regression-scatter_and_boundary.png" width="512" alt="internet connection needed">

In [179]:
def evaluation_plt(x, y, final_theta):
    
    fig = plt.figure()
    ax = fig.gca(projection='3d')
    
    ax.set_xlabel('x1')
    ax.set_ylabel('x2')
    ax.set_zlabel('y')


    ax.scatter(x[:,0], x[:,1], y, color="red")
    
    px = final_theta[0]/final_theta[1]
    py = final_theta[0]/final_theta[2]
    pz = final_theta[0]

    
    points = [[px, 0.0, 0.0], [0.0, py, 0.0], [0.0, 0.0, pz]]

    p0, p1, p2 = points
    x0, y0, z0 = p0
    x1, y1, z1 = p1
    x2, y2, z2 = p2

    ux, uy, uz = u = [x1-x0, y1-y0, z1-z0]
    vx, vy, vz = v = [x2-x0, y2-y0, z2-z0]

    u_cross_v = [uy*vz-uz*vy, uz*vx-ux*vz, ux*vy-uy*vx]

    point  = np.array(p0)
    normal = np.array(u_cross_v)
    d = -point.dot(normal)

    xx, yy = np.meshgrid(x[:, 0], x[:, 1])

    z = (normal[0] * xx - normal[1] * yy - d) * 1. /normal[2]
  
    ebene =fig.gca(projection='3d')
    ebene.plot_surface(xx, yy, z, color="gray", rstride=1, cstride=1, cmap='viridis')
    
    return fig.show
    
    '''
    
    Plots the data x, y together with the final model
    
    Args:
        cost_hist: vector, history of all cost values from a opitmization
        theta_0: scalar, model parameter for boundary
        theta_1: scalar, model parameter for boundary
        x: vector, x values from the data set
        y: vector, y values from the data set
    '''
    raise NotImplementedError("You should implement this!")

In [180]:
print("thetas before the training:\t", history_theta[0])
print("thetas after the training:\t", history_theta[-1])
evaluation_plt(X, y, history_theta[-1])

thetas before the training:	 [ 1.12622707  0.12622707 -0.87377293]
thetas after the training:	 [2.45542511 1.45542511 0.45542511]


<IPython.core.display.Javascript object>

  after removing the cwd from sys.path.


<bound method Figure.show of <Figure size 640x480 with 1 Axes>>

### Feature Scaling

Now suppose the following features $X$:

In [181]:
X = np.array([[0.0001, 2000],
       [0.0002, 1800],
       [0.0003, 1600]], dtype=np.float32)

sample_size = len(X[:,0])
print(X)

[[1.0e-04 2.0e+03]
 [2.0e-04 1.8e+03]
 [3.0e-04 1.6e+03]]


**Task:**

This task can be done via **pen & paper** or by inserting some code below. Either way, you should be able to solve both tasks below on paper only using a calculator.

1. Apply feature scaling onto `X` using the *mean* and the *standard deviation*. What values do the scaled features have?
    * *Optional:*

        You can even execute the cell above and start running your notebook again from top (all **except** executing the cell to generate your features, which would overwrite these new features).

        When you start training you should notice that your costs do not decrease, maybe even increase, if you have not adjusted your learning rate (training might also throw an overflow warning).

In [182]:
def featureScaling():
    
     return lambda x: (x-np.mean(x))/np.std(x)

print(featureScaling()(X))

[[-0.99187     1.2122852 ]
 [-0.99186987  0.9918697 ]
 [-0.99186975  0.77145416]]


**Task:**

2. After the training with scaled features your new $\\vec \theta'$ values will something like: $\vec \theta'=\left(-7197,  326, -326\right)^T$ (you can try training with but you do not have to). 

    Suppose $\vec \theta'=\left(-7197,  326, -326\right)^T$. What are the corresponding $\theta_j$ values for the unscaled data?

    * (If you did train your model with the scaled features, the resulting $\theta_j$ should really be $\vec \theta'=\left(-7197,  326, -326\right)^T$

In [183]:
def unscaledThetas(scaled_thetas, x):
   
    theta_0 = scaled_thetas[0] - (scaled_thetas[1]*(np.mean(x[1]))/np.std(x[1])) - (scaled_thetas[2]*(np.mean(x[2]))/np.std(x[2]))
    theta_1 = scaled_thetas[1] / np.std(x[1])
    theta_2 = scaled_thetas[2] / np.std(x[2])
    
    return [theta_0, theta_1, theta_2]

print(unscaledThetas([-7197, 326, -326], X))

    

[-7196.9999640740125, 0.36222227135176205, -0.4075000932693695]


## Summary and Outlook

During this exercise, the linear regression was extended to multidimensional feature space and feature scaling was practiced. You should be able to answer the following questions:
- How does the implementation of the multivariate regression differ from the univariate one?
- Why do we apply feature scaling?
- Why does feature scaling help?

## Licenses

### Notebook License (CC-BY-SA 4.0)

*The following license applies to the complete notebook, including code cells. It does however not apply to any referenced external media (e.g., images).*

Exercise: Multivariate Linear Regression <br/>
by Christian Herta, Klaus Strohmenger<br/>
is licensed under a [Creative Commons Attribution-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-sa/4.0/).<br/>
Based on a work at https://gitlab.com/deep.TEACHING.


### Code License (MIT)

*The following license only applies to code cells of the notebook.*

Copyright 2018 Christian Herta, Klaus Strohmenger

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.