# Abstract

**The purpose of this regression and exploratory data analysis is to gain insight into how housing prices relate to the other attributes of a house. The dataset studied here is "Housing Prices" from Kaggle. The main focus is on analyzing which factors affect house prices, using 500,000 houses with their prices and the other columns treated as candidate factors.**

In [None]:
# importing libraries
%matplotlib inline 
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy import stats as st
import seaborn as sns
import re
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import LogisticRegression
import missingno as msno
from math import* 
from reportlab.lib.styles import ParagraphStyle
import statsmodels.api as sm
from statsmodels.sandbox.regression.predstd import wls_prediction_std
from sklearn.preprocessing import MinMaxScaler
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_validate
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.model_selection import StratifiedKFold

**Importing the libraries and reading the dataset from CSV file**

In [None]:
# importing the dataset
df=pd.read_csv("HousePrices_HalfMil.csv", decimal = ',')

In [None]:
df.head()

**Getting a general idea of the dataset using head()**

In [None]:
df.describe()

**Basic summary statistics of the data: count, min, max, quartiles, standard deviation and mean of each column**

In [None]:
top_housing_prices = df.sort_values('Prices',ascending=False)
# Look at top 20
top_housing_prices[['Prices','Area']].head(20)

**Checking the top 20 Houses with their areas and prices**

In [None]:
fig, ax = plt.subplots(figsize=(16,6))
sns.barplot(x='Area', y='Prices', data=top_housing_prices.head(33), palette='Set1')
ax.set_xlabel(ax.get_xlabel(), labelpad=15)
ax.set_ylabel(ax.get_ylabel(), labelpad=30)
ax.xaxis.label.set_fontsize(16)
ax.yaxis.label.set_fontsize(16)
plt.xticks(rotation=90)
plt.show()

**Bar plots of price against area for the 33 most expensive houses (above) and the 33 least expensive houses (below)**

In [None]:
fig, ax = plt.subplots(figsize=(16,6))
sns.barplot(x='Area', y='Prices', data=top_housing_prices.tail(33), palette='Set1')
ax.set_xlabel(ax.get_xlabel(), labelpad=15)
ax.set_ylabel(ax.get_ylabel(), labelpad=30)
ax.xaxis.label.set_fontsize(16)
ax.yaxis.label.set_fontsize(16)
plt.xticks(rotation=90)
plt.show()

In [None]:
total = df.isnull().sum()[df.isnull().sum() != 0].sort_values(ascending = False)
percent = pd.Series(round(total/len(df)*100,2))
pd.concat([total, percent], axis=1, keys=['total_missing', 'percent'])

**Checking which columns have missing entries and how many.
Examining this is important because missing data reduces the expressiveness of the dataset, which can lead to weak and biased analyses.**

In [None]:
# Count the missing values in the Prices column (0 means none are missing)
print(df['Prices'].isna().sum())

**The output above shows that Prices has no null values, so we do not have to drop any rows**

In [None]:
msno.matrix(df)

**The above graph shows that there are no missing values**

In [None]:
df.skew()

**Skewness measures how asymmetrically the data is spread around the mean. A longer right tail in the histogram means positive skew; a longer left tail means negative skew**
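
The sign convention described above can be checked on toy data. The following is a minimal sketch, assuming the simple moment-based definition of skewness; `sample_skewness` and the three toy lists are hypothetical names introduced only for illustration, and the values will differ slightly from `df.skew()`, which applies a bias correction.

```python
import numpy as np

def sample_skewness(values):
    """Third standardized moment: mean((x - mean)^3) / variance^1.5."""
    x = np.asarray(values, dtype=float)
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)  # population variance
    m3 = np.mean(centered ** 3)  # third central moment
    return m3 / m2 ** 1.5

right_tailed = [1, 2, 2, 3, 3, 3, 10]      # one large value stretches the right tail
left_tailed = [-v for v in right_tailed]   # mirror image stretches the left tail
symmetric = [1, 2, 3, 4, 5]

print(sample_skewness(right_tailed))  # positive
print(sample_skewness(left_tailed))   # negative
print(sample_skewness(symmetric))     # zero
```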

In [None]:
y = df['Prices']
plt.figure(2); plt.title('Normal')
sns.distplot(y, kde=False, fit=st.norm)

In [None]:
df.kurt()

**The higher the kurtosis, the heavier the tails of the distribution, which can be checked against the histogram above. Here, however, the kurtosis is not high for any column**
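
To make the link between kurtosis and tails concrete, here is a small sketch using the moment-based excess kurtosis (a normal distribution scores 0). The function name and the toy samples are assumptions for illustration; `df.kurt()` also reports excess kurtosis but with a bias correction, so its numbers will not match exactly.

```python
import numpy as np

def excess_kurtosis(values):
    """Fourth standardized moment minus 3, so a normal distribution gives 0."""
    x = np.asarray(values, dtype=float)
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)
    m4 = np.mean(centered ** 4)
    return m4 / m2 ** 2 - 3.0

flat = [1, 2, 3, 4, 5]                # evenly spread, no tails
heavy_tailed = [0] * 98 + [10, -10]   # almost all mass at 0, two extreme values

print(excess_kurtosis(flat))          # negative: lighter tails than a normal
print(excess_kurtosis(heavy_tailed))  # large and positive: heavy tails
```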

In [None]:
sns.distplot(df['Prices'])

**The dist plot shows how symmetrically the data is spread. Since we are doing regression we need to check this, and we can see that the data is symmetrical**

In [None]:
df.corr()

**The table above shows the correlation of each column with every other column**

In [None]:
plt.figure(figsize=(16,12))
sns.heatmap(data=df.iloc[:,:].corr(),annot=True,fmt='.2f',cmap='coolwarm')
plt.show()

**The heat map represents the correlations in a more readable way**
Here we can see that White Marble, Indian Marble, Fiber and Floors show the strongest correlations.
Indian Marble and White Marble are correlated with each other as well.
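
Each cell of the correlation table is a Pearson coefficient. As a rough sketch of what `df.corr()` computes for a pair of columns, the snippet below evaluates the formula by hand on made-up values; `pearson_r` and the toy lists are hypothetical and not taken from the dataset.

```python
import numpy as np

def pearson_r(a, b):
    """Covariance of a and b divided by the product of their standard deviations."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_c = a - a.mean()
    b_c = b - b.mean()
    return np.sum(a_c * b_c) / np.sqrt(np.sum(a_c ** 2) * np.sum(b_c ** 2))

floors = [1, 2, 3, 4, 5]            # toy predictor
rising = [10, 20, 30, 40, 50]       # increases linearly with floors
falling = [50, 40, 30, 20, 10]      # decreases linearly with floors

print(pearson_r(floors, rising))    # close to +1
print(pearson_r(floors, falling))   # close to -1
```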

# Linear Regression

In [None]:
xs = df[['Area','Garage','FirePlace','Baths','White Marble','Black Marble','Indian Marble','Floors','City','Solar','Electric','Fiber','Glass Doors','Swiming Pool','Garden']]
ys = df['Prices']
len(xs), len(ys)

**Assigning the variables: xs holds the independent variables and ys holds the dependent (target) variable**

In [None]:
x_train, x_test, y_train, y_test = train_test_split(xs,ys,test_size = 0.2, random_state = 4)

**Using train_test_split to split the data: training on 80% of the data and testing on the remaining 20%**
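
Conceptually, the split shuffles the row indices and cuts them at the requested fraction. The sketch below illustrates that idea with NumPy only; `manual_train_test_split` is a hypothetical helper, and its shuffling will not reproduce the exact rows that scikit-learn selects for a given `random_state`.

```python
import numpy as np

def manual_train_test_split(n_rows, test_size=0.2, seed=4):
    """Return (train_indices, test_indices) after shuffling the row positions."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_rows)
    n_test = int(round(n_rows * test_size))
    return shuffled[n_test:], shuffled[:n_test]

train_idx, test_idx = manual_train_test_split(100)
print(len(train_idx), len(test_idx))
```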

In [None]:
x_train.head()

**Checking the first 5 rows of training set before further analysis**

In [None]:
regr = LinearRegression()
regr.fit(x_train, y_train)

**Applying Linear Regression and fitting it to our training dataset**

In [None]:
regr.intercept_

**Checking the intercept after fitting Linear Regression to our model**

In [None]:
print('Coefficients: ', regr.coef_)
print("Mean Squared Error: %.2f"
     % np.mean((regr.predict(x_test) - y_test) **2))
print ('Variance Score: %.2f'% regr.score(x_test,y_test))
print('Training Score: %.2f' % regr.score(x_train,y_train))

**Finding the coefficients, the mean squared error and the variance score (R-squared)**
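
For reference, the two metrics printed above follow directly from the residuals. This is a small worked example on made-up numbers (the arrays are illustrative, not dataset values); `regr.score` returns the same R-squared quantity for a fitted regressor.

```python
import numpy as np

y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.5, 7.0, 8.0])

residuals = y_true - y_pred
mse = np.mean(residuals ** 2)                    # mean squared error
ss_res = np.sum(residuals ** 2)                  # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
r2 = 1.0 - ss_res / ss_tot                       # variance score (R-squared)

print(mse, r2)
```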

In [None]:
names = [i for i in list(xs)]
print(names)

In [None]:
#style.use("bmh")
plt.scatter(regr.predict(x_test),y_test)
plt.show()

In [None]:
print(xs.shape)


In [None]:
import statsmodels.api as sm

# Prepend an intercept column of ones to the predictors
xs = np.append(arr = np.ones((len(xs),1)), values = xs, axis = 1)

def backwardElimination(x, sl):
    # Refit OLS and drop the predictor with the largest p-value
    # until every remaining p-value is at or below the significance level
    numVars = x.shape[1]
    for i in range(0, numVars):
        reg_OLS = sm.OLS(ys, x).fit()
        pvalues = np.asarray(reg_OLS.pvalues, dtype=float)
        if pvalues.max() <= sl:
            break
        x = np.delete(x, pvalues.argmax(), 1)
    print(reg_OLS.summary())
    return x

SL = 0.05
# xs has 16 columns (intercept + 15 predictors); indices 0-14 leave out the last one ('Garden')
X_opt = xs[:, [0, 1, 2, 3, 4, 5,6,7,8,9,10,11,12,13,14]]
X_Mod = backwardElimination(X_opt, SL)
e_df = pd.DataFrame(X_Mod)
e_df

In [None]:
model1=sm.OLS(y_train,x_train)

In [None]:
result = model1.fit()

In [None]:
print(result.summary())

**The p-value of each predictor should be less than 0.05 for it to count as significant, and almost all p-values here are well below that. The candidates for removal are Garden (p-value = 0.02) and Swiming Pool, which has a negative t-value; we can remove Swiming Pool**
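
One detail that affects how such a summary reads: `sm.OLS` does not add an intercept term on its own, and statsmodels provides `sm.add_constant` for that purpose. The sketch below uses NumPy least squares on synthetic data (all names and numbers are made up) to show how leaving out the column of ones shifts the slope estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
area = rng.uniform(50, 250, size=200)
price = 5000.0 + 20.0 * area   # noise-free line with a non-zero intercept

X_no_const = area.reshape(-1, 1)                        # predictor only
X_const = np.column_stack([np.ones_like(area), area])   # column of ones + predictor

coef_no_const, *_ = np.linalg.lstsq(X_no_const, price, rcond=None)
coef_const, *_ = np.linalg.lstsq(X_const, price, rcond=None)

print(coef_const)      # recovers intercept and slope
print(coef_no_const)   # slope is inflated to compensate for the missing intercept
```

With statsmodels, the equivalent step would be passing `sm.add_constant(x_train)` as the design matrix instead of `x_train`.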

### Cross Validation

In [None]:
# cross_val_score reports R-squared for regressors by default; average it across the 10 folds
a = cross_val_score(regr,x_train,y_train,cv=10)
b = a.mean()
print(b)
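
As a rough illustration of what k-fold cross-validation does, the sketch below splits synthetic data into folds, fits ordinary least squares on all but one fold, and scores R-squared on the held-out fold. `kfold_r2` and the synthetic data are assumptions for illustration; this is not scikit-learn's implementation, which by default does not shuffle the rows.

```python
import numpy as np

def kfold_r2(X, y, n_splits=5, seed=0):
    """Return the held-out R-squared of a least-squares fit for each fold."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_splits)
    scores = []
    for k in range(n_splits):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_splits) if j != k])
        A_train = np.column_stack([np.ones(len(train)), X[train]])
        A_test = np.column_stack([np.ones(len(test)), X[test]])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        pred = A_test @ coef
        ss_res = np.sum((y[test] - pred) ** 2)
        ss_tot = np.sum((y[test] - y[test].mean()) ** 2)
        scores.append(1.0 - ss_res / ss_tot)
    return np.array(scores)

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 3.0 + rng.normal(scale=0.1, size=200)

scores = kfold_r2(X, y)
print(scores.mean())
```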

 - 1) Yes, the relationship is significant: the p-values are less than 0.05 and the slope of the regression line differs significantly from 0.
 - 2.a) Linearity and additivity: not violated.
 - 2.b) Multicollinearity: present; it is addressed later in the notebook.
 - 2.c) Homoscedasticity: not violated.
 - 2.d) Normality: not violated (already shown in the distribution plot above).
 - 3) Yes, the model makes sense: Area, Baths and Prices are continuous values, 'Garage' and 'City' are multi-class categorical variables, and the remaining columns are binary categorical variables.
 - 4) Yes, the model was cross-validated and it did well.
 - 5) The AIC and BIC should decrease in a better model, which we can check later; R-squared is perfect.




### Model 2 for Linear Regression

In [None]:
xs = df[['Area','Garage','FirePlace','Baths','Black Marble','Indian Marble','Floors','City']]
ys = df['Prices']
len(xs), len(ys)

In [None]:
x_train, x_test, y_train, y_test = train_test_split(xs,ys,test_size = 0.2, random_state = 4)

In [None]:
regr = LinearRegression()
regr.fit(x_train, y_train)

In [None]:
regr.intercept_

In [None]:
print('Coefficients: ', regr.coef_)
print("Mean Squared Error: %.2f"
     % np.mean((regr.predict(x_test) - y_test) **2))
print ('Variance Score: %.2f'% regr.score(x_test,y_test))
print(regr.score(x_train,y_train))

In [None]:
#style.use("bmh")
plt.scatter(regr.predict(x_test),y_test)
plt.show()

In [None]:
import statsmodels.api as sm

# Prepend an intercept column of ones to the predictors
xs = np.append(arr = np.ones((len(xs),1)), values = xs, axis = 1)

# backwardElimination is the function defined in the Model 1 section above
SL = 0.05
# xs has 9 columns (intercept + 8 predictors); indices 0-7 leave out the last one ('City')
X_opt = xs[:, [0, 1, 2, 3, 4, 5,6,7]]
X_Mod = backwardElimination(X_opt, SL)
e_df = pd.DataFrame(X_Mod)
e_df

In [None]:
model1=sm.OLS(y_train,x_train)

In [None]:
result = model1.fit()

In [None]:
print(result.summary())

**The p-value of Garden is greater than 0.05, so we remove the Garden column to make our model more accurate**

### Cross Validation

In [None]:
# cross_val_score reports R-squared for regressors by default; average it across the 10 folds
print(cross_val_score(regr,x_train,y_train,cv=10).mean())

 - 1) Yes, the relationship is significant: the p-values are less than 0.05 and the slope of the regression line differs significantly from 0.
 - 2.a) Linearity and additivity: not violated.
 - 2.b) Multicollinearity: present; it is addressed later in the notebook.
 - 2.c) Homoscedasticity: not violated.
 - 2.d) Normality: not violated (already shown in the distribution plot above).
 - 3) Yes, the model makes sense: Area and Baths are continuous values, 'Garage' and 'City' are multi-class categorical variables, and the remaining columns are binary categorical variables.
 - 4) Yes, the model was cross-validated and it did well.
 - 5) The AIC and BIC should decrease, but they increased here compared to the previous model; R-squared increased. (Thus model 1 was better.)

### Model 3 of Linear Regression

In [None]:
xs = df[['Area','Garage','FirePlace','Baths','Floors','City','Electric','Fiber','Glass Doors']]
ys = df['Prices']
len(xs), len(ys)

In [None]:
x_train, x_test, y_train, y_test = train_test_split(xs,ys,test_size = 0.2, random_state = 4)

In [None]:
regr = LinearRegression()
regr.fit(x_train, y_train)
y_pred = regr.predict(x_test)

In [None]:
regr.intercept_

In [None]:
print('Coefficients: ', regr.coef_)
print("Mean Squared Error: %.2f"
     % np.mean((regr.predict(x_test) - y_test) **2))
print ('Variance Score: %.2f'% regr.score(x_test,y_test))
print(regr.score(x_train,y_train))

In [None]:
#style.use("bmh")
plt.scatter(regr.predict(x_test),y_test)
plt.show()

In [None]:
import statsmodels.api as sm

# Prepend an intercept column of ones to the predictors
xs = np.append(arr = np.ones((len(xs),1)), values = xs, axis = 1)

# backwardElimination is the function defined in the Model 1 section above
SL = 0.05
# xs has 10 columns (intercept + 9 predictors); indices 0-8 leave out the last one ('Glass Doors')
X_opt = xs[:, [0, 1, 2, 3, 4, 5,6,7,8]]
X_Mod = backwardElimination(X_opt, SL)
e_df = pd.DataFrame(X_Mod)
e_df

In [None]:
model1=sm.OLS(y_train,x_train)

In [None]:
result = model1.fit()

In [None]:
print(result.summary())

**Thus the 1st model is our best model**

### Cross Validation

In [None]:
# cross_val_score reports R-squared for regressors by default; average it across the 10 folds
print(cross_val_score(regr,x_train,y_train,cv=10).mean())

 - 1) Yes, the relationship is significant: the p-values are less than 0.05 and the slope of the regression line differs significantly from 0.
 - 2.a) Linearity and additivity: not violated.
 - 2.b) Multicollinearity: present; it is addressed later in the notebook.
 - 2.c) Homoscedasticity: not violated.
 - 2.d) Normality: not violated (already shown in the distribution plot above).
 - 3) Yes, the model makes sense: Area, Baths and Prices are continuous values, 'Garage' and 'City' are multi-class categorical variables, and the remaining columns are binary categorical variables.
 - 4) Yes, the model was cross-validated and it did well.
 - 5) The AIC and BIC decreased compared to the 2nd model but increased compared to the 1st; R-squared increased compared to the 2nd but is still lower than the 1st. Thus the 1st is the best model.
 - The best model is Model 1

### Multicolinearity

**White Marble and Indian Marble are correlated with each other, so the predictors are intercorrelated**

In [None]:
xs = df[['Area','Garage','FirePlace','Baths','White Marble','Black Marble','Indian Marble','Floors','City','Solar','Electric','Fiber','Glass Doors','Swiming Pool','Garden']]
ys = df['Prices']
len(xs), len(ys)
x_train, x_test, y_train, y_test = train_test_split(xs,ys,test_size = 0.2, random_state = 4)

In [None]:
from statsmodels.stats import outliers_influence
def variance_IF(x):
    vif = pd.DataFrame()
    vif["VIF Factor"] = [outliers_influence.variance_inflation_factor(x.values, i) for i in range(x.shape[1])]
    vif["features"] = x.columns
    return vif

print(variance_IF(xs))
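
The variance inflation factor of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the others. The sketch below computes it by hand on synthetic columns; `vif` and the toy columns are hypothetical, and this version adds an intercept to each auxiliary regression, which, as far as I know, the statsmodels function leaves to the caller, so the numbers may differ from the table above.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of X."""
    X = np.asarray(X, dtype=float)
    out = []
    for j in range(X.shape[1]):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(target)), others])
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        r2 = 1.0 - resid @ resid / np.sum((target - target.mean()) ** 2)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)              # unrelated to a
c = a + 0.1 * rng.normal(size=500)    # nearly a copy of a

factors = vif(np.column_stack([a, b, c]))
print(factors)   # large for a and c, close to 1 for b
```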

In [None]:
xs = df[['Area','Garage','FirePlace','Baths','Floors','City','Solar','Electric','Fiber','Glass Doors','Swiming Pool','Garden']]
ys = df['Prices']
len(xs), len(ys)
x_train, x_test, y_train, y_test = train_test_split(xs,ys,test_size = 0.2, random_state = 4)

In [None]:
regr = LinearRegression()
regr.fit(x_train, y_train)
y_pred = regr.predict(x_test)

In [None]:
regr.intercept_

In [None]:
print('Coefficients: ', regr.coef_)
print("Mean Squared Error: %.2f"
     % np.mean((regr.predict(x_test) - y_test) **2))
print ('Variance Score: %.2f'% regr.score(x_test,y_test))
print(regr.score(x_train,y_train))

In [None]:
corr_df = x_train.corr(method = 'pearson')
print("------------Create a Correlation plot----------------")
mask = np.zeros_like(corr_df)
mask[np.triu_indices_from(mask)] = True
sns.heatmap(corr_df, cmap='RdYlGn_r', vmax = 1.0, vmin=-1.0, mask = mask, linewidths = 2.5)
plt.yticks(rotation = 0)
plt.xticks(rotation = 90)
plt.show()

In [None]:
# cross_val_score reports R-squared for regressors by default; average it across the 10 folds
print(cross_val_score(regr,x_train,y_train,cv=10).mean())

 - 1) Yes, there was multicollinearity in the model, which has now been removed.
 - 2) Yes, the predictor variables are now independent of each other.
 - 3) The most significant predictor variables are Floors, Fiber, White Marble and City; the insignificant ones are already excluded.
 - 4) Cross-validation was performed.

### Interaction Effect

In [None]:
df['interaction1'] = df['Indian Marble']*df['Black Marble']*df['White Marble']
xs = df[['Area','Garage','FirePlace','Baths','interaction1','Floors','City','Solar','Electric','Fiber','Glass Doors','Swiming Pool','Garden']]
ys = df['Prices']
len(xs), len(ys)

In [None]:
x_train, x_test, y_train, y_test = train_test_split(xs,ys,test_size = 0.2, random_state = 4)

In [None]:
regr = LinearRegression()
regr.fit(x_train, y_train)

In [None]:
regr.intercept_

In [None]:
print('Coefficients: ', regr.coef_)
print("Mean Squared Error: %.2f"
     % np.mean((regr.predict(x_test) - y_test) **2))
print ('Variance Score: %.2f'% regr.score(x_test,y_test))
print(regr.score(x_train,y_train))

In [None]:
model1=sm.OLS(y_train,x_train)

In [None]:
result_interaction = model1.fit()

In [None]:
print(result_interaction.summary())

**After applying the interaction term, the performance is better than models 2 and 3 but not better than model 1**
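
It can help to look at what a product of binary flags actually encodes. In the toy frame below (made-up rows, not dataset values), the three-way product is 1 only on the row where all three flags are 1. If the three marble flags turn out to be mutually exclusive in the real data, which is an assumption worth checking, the interaction column would be zero on every row.

```python
import pandas as pd

toy = pd.DataFrame({
    'White Marble':  [1, 0, 0, 1],
    'Black Marble':  [0, 1, 0, 1],
    'Indian Marble': [0, 0, 1, 1],
})

# Product of the three binary flags: 1 only where all three are 1
toy['interaction'] = toy['White Marble'] * toy['Black Marble'] * toy['Indian Marble']
print(toy)
```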

## Regularization for Linear Regression (Ridge)

In [None]:
scaler = MinMaxScaler()
# Fit the scaler on the training data only, then apply it to both sets
x_train_scaled = scaler.fit_transform(x_train)
x_test_scaled = scaler.transform(x_test)
this_alpha = 1.0  # regularization strength (assumed value; 1.0 is Ridge's default)
linridge = Ridge(alpha= this_alpha).fit(x_train_scaled,y_train)
print('ridge regression linear model intercept : {}'.format(linridge.intercept_))
print('ridge regression linear model coeff : {}'.format(linridge.coef_))
print('R-squared score (training) : {:.3f}'.format(linridge.score(x_train_scaled,y_train)))
print('R-squared score (test) : {:.3f}'.format(linridge.score(x_test_scaled,y_test)))
print('Number of non zero features:{}'.format(np.sum(linridge.coef_!=0)))

In [None]:
scaler = MinMaxScaler()
x_train_scaled = scaler.fit_transform(x_train)
x_test_scaled = scaler.transform(x_test)
print('Effect of Alpha regularisation parameter')
for this_alpha in [0,0.2,0.4,0.6,0.8,1.0] :
    linridge = Ridge(alpha= this_alpha).fit(x_train_scaled,y_train)
    r2_train = linridge.score(x_train_scaled,y_train)
    r2_test =  linridge.score(x_test_scaled,y_test)
    num_coeff_bigger = np.sum(abs(linridge.coef_)>1.0)
    print('Alpha = {:.2f}\nnum abs(coeff)>1.0: {}, r-squared training: {:.2f}, r-squared test: {:.2f}\n'.format(this_alpha, num_coeff_bigger, r2_train, r2_test))

**After performing regularisation, the performance did not improve**
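
For intuition about what alpha does, the sketch below solves ridge regression in closed form on synthetic data and shows the coefficient norm shrinking as alpha grows. `ridge_coefficients` is a hypothetical helper that omits the intercept for brevity; it is a simplified illustration, not what `sklearn.linear_model.Ridge` does internally.

```python
import numpy as np

def ridge_coefficients(X, y, alpha):
    """Closed-form ridge solution without intercept: (X^T X + alpha I)^-1 X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([3.0, -2.0, 1.0]) + rng.normal(scale=0.5, size=100)

alphas = [0.0, 1.0, 10.0, 100.0]
norms = [np.linalg.norm(ridge_coefficients(X, y, a)) for a in alphas]
for a, n in zip(alphas, norms):
    print(a, n)   # the norm decreases as alpha increases
```

When the unregularized fit already generalizes well, shrinking the coefficients has little to gain, which is consistent with the result reported above.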

# Logistic Regression

In [None]:
#Logistic Regression
medium = (df['Prices'].max()-df['Prices'].min())/2
df['Price_Logistic'] = np.where(df['Prices']>=medium, 1, 0)
print(medium)

**Adding another column to our dataset, Price_Logistic, which is the target column: 1 = High Price, 0 = Low Price**
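
The choice of threshold determines how many houses land in each class. Note that `(max - min) / 2` is half the range, which is not the same as the midpoint `(max + min) / 2` unless the minimum is 0. The toy example below (made-up prices) shows how the two thresholds label the same values differently; which one is intended is left to the reader.

```python
import numpy as np

prices = np.array([100, 150, 200, 300])   # hypothetical toy prices

half_range = (prices.max() - prices.min()) / 2   # the threshold used in the cell above
midpoint = (prices.max() + prices.min()) / 2     # centre of the price range

labels_half_range = np.where(prices >= half_range, 1, 0)
labels_midpoint = np.where(prices >= midpoint, 1, 0)

print(half_range, labels_half_range)
print(midpoint, labels_midpoint)
```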

In [None]:
df.tail()

In [None]:
sns.countplot(x="Price_Logistic", data = df)

**Count plot showing the distribution of high and low prices**

In [None]:
df.tail()

In [None]:
xs_logi = df[['Area','Garage','FirePlace','Baths','White Marble','Black Marble','Indian Marble','Floors','City','Solar','Electric','Fiber','Glass Doors','Swiming Pool','Garden','Prices']]

ys_logi = df['Price_Logistic']

In [None]:
x_train1, x_test1, y_train1, y_test1 = train_test_split(xs_logi,ys_logi,test_size=0.33, random_state= 1)

In [None]:
sc = StandardScaler()
x_train1 = sc.fit_transform(x_train1)
x_test1 = sc.transform(x_test1)

In [None]:
logmodel = LogisticRegression()
logmodel.fit(x_train1,y_train1)

In [None]:
prediction1 = logmodel.predict(x_test1)

In [None]:
# how my model is performing
classification_report(y_test1,prediction1)

In [None]:
confusion_matrix(y_test1,prediction1)

In [None]:
accuracy_score(y_test1,prediction1)
# quite good

In [None]:
print(classification_report(y_test1,prediction1))
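
The numbers in the confusion matrix and classification report all derive from four counts. The sketch below computes them by hand on made-up labels (the two arrays are illustrative, not model output) to show how accuracy, precision and recall for the positive class relate to each other.

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

tp = int(np.sum((y_true == 1) & (y_pred == 1)))   # true positives
tn = int(np.sum((y_true == 0) & (y_pred == 0)))   # true negatives
fp = int(np.sum((y_true == 0) & (y_pred == 1)))   # false positives
fn = int(np.sum((y_true == 1) & (y_pred == 0)))   # false negatives

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(tp, tn, fp, fn)
print(accuracy, precision, recall)
```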

 - 1) Yes, it is significant: most of the predicted values match the test labels, and the accuracy is good.
 - 2) No assumption is violated except multicollinearity.
 - 3) Yes, the model makes sense: Area, Baths and Prices are continuous values, 'Garage' and 'City' are multi-class categorical variables, and the remaining columns are binary categorical variables.
 - 4) Yes, the model was cross-validated.
 - 5) The accuracy is 99.99%.

### Cross Validation

In [None]:
# cross_val_score reports accuracy for classifiers by default; average it across the 10 folds
print(cross_val_score(logmodel,xs_logi,ys_logi,cv=10).mean())

### Model 2 of Logistic Regression

In [None]:
xs_logi2 = df[['Area','Garage','FirePlace','Baths','Floors','City','Solar','Electric','Fiber','Glass Doors','Swiming Pool','Garden']]

ys_logi2 = df['Price_Logistic']
x_train2, x_test2, y_train2, y_test2 = train_test_split(xs_logi2,ys_logi2,test_size=0.33, random_state= 1)

In [None]:
sc = StandardScaler()
x_train2 = sc.fit_transform(x_train2)
x_test2 = sc.transform(x_test2)

**We apply StandardScaler so that all features are on a comparable scale, which helps the logistic regression solver converge on a dataset of this size**
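
Standardization subtracts the training mean and divides by the training standard deviation, column by column, and the same training statistics are then reused for the test set. The sketch below reproduces that arithmetic on made-up values; the arrays are illustrative and not taken from the dataset.

```python
import numpy as np

train = np.array([[50.0, 1.0], [100.0, 2.0], [150.0, 3.0], [200.0, 4.0]])
test = np.array([[125.0, 2.0]])

mean = train.mean(axis=0)
std = train.std(axis=0)   # population standard deviation, as StandardScaler uses

train_scaled = (train - mean) / std   # corresponds to fit_transform on the training set
test_scaled = (test - mean) / std     # corresponds to transform on the test set

print(train_scaled.mean(axis=0), train_scaled.std(axis=0))
print(test_scaled)
```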

In [None]:
logmodel2 = LogisticRegression(random_state = 0)
logmodel2.fit(x_train2,y_train2)

In [None]:
y_pred2 = logmodel2.predict(x_test2)

In [None]:
accuracy_score(y_test2,y_pred2)*100

**Clearly the accuracy has decreased**

In [None]:
logit_model2 = sm.Logit(y_train2, x_train2)
result2 = logit_model2.fit()
print(result2.summary())

### Cross Validation

In [None]:
# cross_val_score reports accuracy for classifiers by default; average it across the 10 folds
print(cross_val_score(logmodel2,xs_logi2,ys_logi2,cv=10).mean())

 - 1) Yes, it is significant: most of the predicted values match the test labels, and the accuracy is good.
 - 2) No assumption is violated except multicollinearity.
 - 3) Yes, the model makes sense: Area and Baths are continuous values, 'Garage' and 'City' are multi-class categorical variables, and the remaining columns are binary categorical variables.
 - 4) Yes, the model was cross-validated.
 - 5) The accuracy is 86.34% (Model 1 was better).

### Model 3 of Logistic Regression

In [None]:
xs_logi3 = df[['Area','Garage','FirePlace','Baths','White Marble','Floors','City','Solar','Glass Doors','Swiming Pool','Garden']]

ys_logi3 = df['Price_Logistic']
x_train3, x_test3, y_train3, y_test3 = train_test_split(xs_logi3,ys_logi3,test_size=0.33, random_state= 1)

In [None]:
sc = StandardScaler()
x_train3 = sc.fit_transform(x_train3)
x_test3 = sc.transform(x_test3)

In [None]:
logmodel3 = LogisticRegression(random_state = 0)
logmodel3.fit(x_train3,y_train3)

In [None]:
y_pred3 = logmodel3.predict(x_test3)

In [None]:
accuracy_score(y_test3,y_pred3)*100

In [None]:
logit_model3 = sm.Logit(y_train3, x_train3)
result3 = logit_model3.fit()
print(result3.summary())

### Cross Validation

In [None]:
# cross_val_score reports accuracy for classifiers by default; average it across the 10 folds
print(cross_val_score(logmodel3,xs_logi3,ys_logi3,cv=10).mean())

 - 1) Yes, it is significant: most of the predicted values match the test labels, and the accuracy is good.
 - 2) No assumption is violated except multicollinearity.
 - 3) Yes, the model makes sense: Area and Baths are continuous values, 'Garage' and 'City' are multi-class categorical variables, and the remaining columns are binary categorical variables.
 - 4) Yes, the model was cross-validated.
 - 5) The accuracy is 83.68%.

### Interaction Effect

In [None]:
df['interaction_logi'] = df['White Marble'] * df['Black Marble'] *df['Indian Marble']
xs_logi4 = df[['Area','Garage','FirePlace','Baths','interaction_logi','Floors','City','Solar','Electric','Fiber','Glass Doors','Swiming Pool','Garden','Prices']]
ys_logi4 = df['Price_Logistic']

In [None]:
x_train4, x_test4, y_train4, y_test4 = train_test_split(xs_logi4,ys_logi4,test_size=0.33, random_state= 1)
sc = StandardScaler()
x_train4 = sc.fit_transform(x_train4)
x_test4 = sc.transform(x_test4)

In [None]:
logmodel4 = LogisticRegression(random_state = 0)
logmodel4.fit(x_train4,y_train4)

In [None]:
y_pred4 = logmodel4.predict(x_test4)

In [None]:
accuracy_score(y_test4,y_pred4)*100

**Yes there was synergy in the tested terms**

# Conclusion
**The best linear regression model was model 1, and the best logistic regression model was model 1 as well.**

# Contribution
**My contribution is 60%, while the remaining 40% is material I have taken from the internet**

# Citation
 - https://github.com/ResidentMario/missingno
 - https://towardsdatascience.com/visualizing-data-with-pair-plots-in-python-f228cf529166 
 - https://stackoverflow.com/
 - https://www.kaggle.com/ekami66/detailed-exploratory-data-analysis-with-python 
 - https://www.datacamp.com/
 - https://www.datascience.com/blog/introduction-to-correlation-learn-data-science-tutorials
 - https://www.edureka.com
 - https://www.dummies.com/programming/big-data/data-science/data-science-how-to-create-interactions-between-variables-with-python/

# License
**Copyright
Jyoti Goyal**
**THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.**