## Happiness 2019

This notebook explores the World Happiness 2019 dataset. It gives the happiness rank and happiness score of 156 countries, together with six factors: GDP per capita, Social support, Healthy life expectancy, Freedom to make life choices, Generosity, and Perceptions of corruption. Each factor value estimates the extent to which that factor contributes to a country's happiness score, so a higher value on any of the six factors goes with a higher level of happiness. The six values account for only part of the score, because the published score also includes a baseline and residual term that is not a column in this file. The higher the happiness score, the better (numerically lower) the happiness rank.

In [None]:
# Import all libraries we need
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt # Visualization
# sklearn library
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
# Evaluation Metric
from sklearn.metrics import r2_score

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

In [None]:
df = pd.read_csv("/kaggle/input/world-happiness/2019.csv")
df.head()

In [None]:
df.info()

In [None]:
# plot data
plt.scatter(df["Social support"],df["Score"])
plt.xlabel("Social Support")
plt.ylabel("Score")
plt.show()


## Linear Regression

In [None]:
# linear regression model
linear_reg = LinearRegression()

x = df["Social support"].values.reshape(-1,1)
y = df["Score"].values.reshape(-1,1)

linear_reg.fit(x,y)

### Prediction

In [None]:
Predicted_Score1 = linear_reg.predict([[1.5]])
print("Predicted Score 1: ",Predicted_Score1)

Predicted_Score2 = linear_reg.predict([[0.5]])
print("Predicted Score 2: ",Predicted_Score2)

Predicted_Score3 = linear_reg.predict([[1.8]])
print("Predicted Score 3: ",Predicted_Score3)

intercept = linear_reg.intercept_
print("intercept: ",intercept)   # the point where the line crosses the y axis

slope = linear_reg.coef_
print("slope: ",slope)   # the gradient of the fitted line

# Score = 1.91243024 + 2.89098704*Social Support 
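
The fitted line in the comment above can be evaluated by hand to check what `linear_reg.predict` returns. The sketch below hard-codes the intercept and slope reported in this notebook; `predict_score` is a hypothetical helper name, not part of scikit-learn.

```python
# Intercept and slope copied from the fitted-line comment in this notebook.
INTERCEPT = 1.91243024
SLOPE = 2.89098704


def predict_score(social_support):
    """Evaluate Score = intercept + slope * Social support (hypothetical helper)."""
    return INTERCEPT + SLOPE * social_support


# Same inputs as the three predictions made with linear_reg.predict.
for value in (1.5, 0.5, 1.8):
    print(value, "->", round(predict_score(value), 4))
```

If the model is refitted on different data, the two constants need to be replaced with the newly printed `intercept_` and `coef_` values.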



### Visualization

In [None]:
y_predicted = linear_reg.predict(x)
plt.scatter(x,y)
plt.plot(x, y_predicted,color = "red")
plt.xlabel("Social Support")
plt.ylabel("Score")
plt.title("Linear Regression")
plt.show()
print("r_square score: ", r2_score(y,y_predicted))
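
For readers unfamiliar with the metric, the coefficient of determination reported by `r2_score` is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below reimplements that formula in plain Python on toy values (not taken from the happiness data); `r_squared` is an illustrative name.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only.
y_true = [4.0, 5.0, 6.0, 7.0]
perfect = [4.0, 5.0, 6.0, 7.0]      # every prediction is exact
baseline = [5.5, 5.5, 5.5, 5.5]     # always predict the mean of y_true

print(r_squared(y_true, perfect))
print(r_squared(y_true, baseline))
```

A perfect fit scores 1, and a model that always predicts the mean scores 0, which is why values closer to 1 indicate a better fit.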

## Multiple Linear Regression

In [None]:
# Multiple Linear Regression Model
# columns 3 onward are the six factor columns (rank, country and score come first)
x = df.iloc[:,3:].values
y = df["Score"].values.reshape(-1,1)
multiple_linear_regression = LinearRegression()
multiple_linear_regression.fit(x,y)

### Prediction

In [None]:
print("Intercept: ", multiple_linear_regression.intercept_)
print("b1,b2,b3,b4,b5,b6: ",multiple_linear_regression.coef_)

In [None]:
# prediction
multiple_linear_regression.predict(np.array([[1.340,1.587,0.986,0.596,0.153,0.393]]))
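
The prediction above is the intercept plus a weighted sum of the six factor values. The sketch below spells that out in plain Python. The feature values are the ones used in the cell above, while the intercept and coefficients are placeholders chosen only for illustration and should be replaced with the printed `intercept_` and `coef_[0]` values. `predict_linear` is a hypothetical helper.

```python
def predict_linear(intercept, coefficients, features):
    """Return intercept + sum(b_i * x_i) for a single observation."""
    if len(coefficients) != len(features):
        raise ValueError("need one coefficient per feature")
    return intercept + sum(b * x for b, x in zip(coefficients, features))


# Feature values from the prediction cell, in the dataset's column order.
features = [1.340, 1.587, 0.986, 0.596, 0.153, 0.393]

# Placeholder parameters (NOT the fitted values from this notebook).
example_intercept = 1.0
example_coefficients = [1.0] * 6

print(predict_linear(example_intercept, example_coefficients, features))
```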

## Polynomial Regression

In [None]:
df.head()

In [None]:
# plot data (the feature used for polynomial regression below)
plt.scatter(df["Healthy life expectancy"],df["Score"])
plt.xlabel("Healthy life expectancy")
plt.ylabel("Score")
plt.show()

In [None]:
x = df["Healthy life expectancy"].values.reshape(-1,1)
y = df["Score"].values.reshape(-1,1)

In [None]:
lr = LinearRegression()
lr.fit(x,y)
y_head = lr.predict(x)
plt.scatter(df["Healthy life expectancy"],df["Score"])
plt.xlabel("Healthy life expectancy")
plt.ylabel("Score")
plt.plot(x,y_head,color="red",label ="linear")
plt.legend()   # display the "linear" label set on the line above
plt.show()

In [None]:
polynomial_regression = PolynomialFeatures(degree = 2)
x_polynomial = polynomial_regression.fit_transform(x)
linear_regression2 = LinearRegression()
linear_regression2.fit(x_polynomial,y)
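
With a single input column and default settings, `PolynomialFeatures(degree = 2)` turns each value x into the row [1, x, x²], and the linear model is then fitted on those three columns. The sketch below reproduces that expansion in plain Python to make the transformation visible; `polynomial_features` is an illustrative name and the input values are made up.

```python
def polynomial_features(values, degree=2):
    """Expand each scalar x into [1, x, x**2, ..., x**degree]."""
    return [[x ** power for power in range(degree + 1)] for x in values]


# Made-up inputs, one row per value.
rows = polynomial_features([0.5, 1.0, 2.0])
for row in rows:
    print(row)
```

Because the model stays linear in these expanded columns, polynomial regression is still ordinary least squares, only on a wider feature matrix.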

### Visualization

In [None]:
y_head2 = linear_regression2.predict(x_polynomial)
# sort by x so the fitted curve is drawn left to right instead of zigzagging
order = np.argsort(x.ravel())
plt.scatter(df["Healthy life expectancy"],df["Score"])
plt.xlabel("Healthy life expectancy")
plt.ylabel("Score")
plt.plot(x[order],y_head2[order],color= "green",label = "poly")
plt.title("Polynomial Regression")
plt.legend()
plt.show()
print("r_square score: ", r2_score(y,y_head2))

## Decision Tree

In [None]:
x = df["GDP per capita"].values.reshape(-1,1)
y = df["Score"].values.reshape(-1,1)

In [None]:
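tree_reg = DecisionTreeRegressor()
tree_reg.fit(x,y)
print("Predicted Value: ",tree_reg.predict([[1.2]]))
# evaluate the tree on an evenly spaced grid across the observed range
x_ = np.arange(x.min(),x.max(),0.1).reshape(-1,1)
y_head = tree_reg.predict(x_)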

In [None]:
# visualize
plt.scatter(x,y,color="red")
plt.plot(x_,y_head,color = "green")
plt.xlabel("GDP per capita")
plt.ylabel("Score")
plt.title("Decision Tree")
plt.show()
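
The step-shaped curve in the plot comes from how a regression tree works: it splits the x axis at thresholds and predicts the mean of the training targets on each side. The sketch below fits a single split (a "stump") in plain Python by minimizing the squared error. It is a simplified illustration on toy data, not scikit-learn's implementation, and `fit_stump` and `predict_stump` are hypothetical names.

```python
def mean(values):
    return sum(values) / len(values)


def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def fit_stump(xs, ys):
    """Find the threshold that minimizes the total squared error.

    Returns (threshold, left_mean, right_mean).
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        error = sse(left) + sse(right)
        if best is None or error < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (error, threshold, mean(left), mean(right))
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


# Toy data, not taken from the happiness file.
xs = [0.2, 0.4, 0.6, 1.2, 1.4, 1.6]
ys = [4.0, 4.0, 4.0, 7.0, 7.0, 7.0]
stump = fit_stump(xs, ys)
print(stump)
print(predict_stump(stump, 0.5), predict_stump(stump, 1.3))
```

A full tree repeats this split recursively inside each side. With no depth limit it can memorize the training points, so a very high training score from a tree should be read with caution.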

## Random Forest

In [None]:
x = df["Freedom to make life choices"].values.reshape(-1,1)
y = df["Score"].values.reshape(-1,1)

In [None]:
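rf = RandomForestRegressor(n_estimators = 100, random_state = 42)
rf.fit(x,y.ravel())   # the forest expects a 1-D target; ravel avoids a DataConversionWarning
print("Predicted Value: ",rf.predict([[0.5]]))
# evaluate the forest on an evenly spaced grid across the observed range
x_ = np.arange(x.min(),x.max(),0.01).reshape(-1,1)
y_head = rf.predict(x_)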

In [None]:
# visualize
plt.scatter(x,y,color="red")
plt.plot(x_,y_head,color="green")
plt.xlabel("Freedom to make life choices")
plt.ylabel("Score")
plt.show()
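
A random forest averages the predictions of many trees, each trained on a bootstrap sample (rows drawn with replacement) of the data. The sketch below illustrates that idea in plain Python with one-split trees on toy data. It is a deliberate simplification: real forests grow deeper trees and can also subsample features. All function names here are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """One-split regression tree; a constant if x has a single distinct value."""
    pairs = sorted(zip(xs, ys))
    overall = sum(ys) / len(ys)
    best = (float("inf"), None, overall, overall)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        error = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if error < best[0]:
            best = (error, (pairs[i - 1][0] + pairs[i][0]) / 2, lm, rm)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    if threshold is None:
        return left_mean
    return left_mean if x <= threshold else right_mean


def fit_forest(xs, ys, n_estimators=25, seed=42):
    """Train each stump on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict_forest(forest, x):
    """Average the individual stump predictions."""
    return sum(predict_stump(s, x) for s in forest) / len(forest)


# Toy data, not taken from the happiness file.
xs = [0.2, 0.4, 0.6, 1.2, 1.4, 1.6]
ys = [4.0, 4.0, 4.0, 7.0, 7.0, 7.0]
forest = fit_forest(xs, ys)
print(predict_forest(forest, 0.3), predict_forest(forest, 1.5))
```

Averaging over resampled trees smooths out the sharp steps of a single tree, which is why the forest curve in the plot above usually looks less jagged than the decision tree curve.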