# Neural Networks

In this example we use a neural network to perform regression rather than classification. The dataset contains categorical values, which a neural network cannot accept directly, so they must be converted to numbers first.

## Predicting Medical Insurance Costs

We have a dataset related to medical costs, with the following features:
- age: age of the beneficiary
- sex: gender of the beneficiary (female or male)
- bmi: body mass index, an indication of whether body weight is relatively high or low for a given height
- children: number of children covered by health insurance
- smoker: whether the beneficiary is a smoker
- region: the beneficiary's residential area in the US (northeast, southeast, southwest, or northwest)

The target is:
- charges: Individual medical costs billed by health insurance

### Loading the data

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv(r'C:\Users\ivane\Desktop\ACI-3\data\insurance.csv')
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1338 entries, 0 to 1337
Data columns (total 7 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   age       1338 non-null   int64  
 1   sex       1338 non-null   object 
 2   bmi       1338 non-null   float64
 3   children  1338 non-null   int64  
 4   smoker    1338 non-null   object 
 5   region    1338 non-null   object 
 6   charges   1338 non-null   float64
dtypes: float64(2), int64(2), object(3)
memory usage: 73.3+ KB


### Check data

The next step is to check the state of the data. We start by counting how many beneficiaries are smokers.

In [2]:
df['smoker'].value_counts()

smoker
no     1064
yes     274
Name: count, dtype: int64

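We can also check how the sex column is distributed. A separate null check is not needed here, since df.info() above reports 1338 non-null values in every column.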

In [3]:
df['sex'].value_counts()

sex
male      676
female    662
Name: count, dtype: int64
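
Missing values can also be counted explicitly. The sketch below uses a hypothetical toy frame (not the insurance data) with one deliberately missing value, and relies on the pandas `isnull` method. In the real dataset, df.info() already shows that no column has missing entries.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with one missing value; not the insurance data.
toy = pd.DataFrame({
    "age": [19, 18, np.nan],
    "smoker": ["yes", "no", "no"],
})

# Count missing values per column.
null_counts = toy.isnull().sum()
print(null_counts)

# True if any column contains at least one missing value.
has_nulls = bool(toy.isnull().values.any())
print(has_nulls)
```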

### Selecting Labels and Features

Now we can define the features and the target label. First, we identify the numerical and the categorical features so that the categorical ones can be converted. Note that sex appears in neither list below, so it is not used as a feature.

In [4]:
categorical = ['smoker', 'region']
numerical = ['age', 'bmi', 'children']
target = 'charges'

The easiest way to convert categorical values is to assign each one an integer, e.g. male = 1 and female = 0. But this imposes an ordering that does not exist: one cannot say male > female or otherwise compare them numerically.

A more viable solution is One-Hot Encoding (or Dummy Encoding). It creates a column for every unique value, and each observation has a 1 in the column that matches its value and a 0 in the others.
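
To make the idea concrete, here is a minimal sketch on a hypothetical three-row column (not the insurance data). The categories mirror the smoker feature; the variable names `toy` and `encoder` are illustrative only.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy column; the categories mirror the 'smoker' feature.
toy = pd.DataFrame({"smoker": ["yes", "no", "no"]})

encoder = OneHotEncoder()
# fit_transform returns a sparse matrix; toarray() makes it dense.
encoded = encoder.fit_transform(toy).toarray()

# Categories are sorted, so column 0 stands for 'no' and column 1 for 'yes'.
print(encoder.categories_)
print(encoded)
```

Each row contains exactly one 1: the first row (a smoker) is encoded as `[0, 1]` and the other two as `[1, 0]`.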

In [5]:
from sklearn.preprocessing import OneHotEncoder

ohe = OneHotEncoder()

# Encode the categorical columns; toarray() converts the sparse result to a dense array
data_transformed = pd.DataFrame(ohe.fit_transform(df[categorical]).toarray())

So, now we have categorical values converted to a binary matrix. The next step is to join it with the other numerical features.

In [6]:
data_transformed = data_transformed.join(df[numerical])
data_transformed.head()

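|   | 0 | 1 | 2 | 3 | 4 | 5 | age | bmi | children |
|---|---|---|---|---|---|---|-----|-----|----------|
| 0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 19 | 27.9 | 0 |
| 1 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 18 | 33.77 | 1 |
| 2 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 28 | 33.0 | 3 |
| 3 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 33 | 22.705 | 0 |
| 4 | 1.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 32 | 28.88 | 0 |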


Next, we take the names of all the columns that will be used as features.

In [7]:
predictors = data_transformed.columns
predictors

Index([0, 1, 2, 3, 4, 5, 'age', 'bmi', 'children'], dtype='object')

### Data Scaling

Neural networks generally train better when all features lie in the same range. This can be done with MinMaxScaler from sklearn, which rescales each feature to the range 0 to 1.

In [9]:
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
scaled = scaler.fit_transform(data_transformed)
print(scaled)

[[0.         1.         0.         ... 0.02173913 0.3212268  0.        ]
 [1.         0.         0.         ... 0.         0.47914985 0.2       ]
 [1.         0.         0.         ... 0.2173913  0.45843422 0.6       ]
 ...
 [1.         0.         0.         ... 0.         0.56201238 0.        ]
 [1.         0.         0.         ... 0.06521739 0.26472962 0.        ]
 [0.         1.         0.         ... 0.93478261 0.35270379 0.        ]]
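
With its default settings, MinMaxScaler rescales each column as (x - min) / (max - min). The sketch below checks this by hand on hypothetical ages, chosen so that the column minimum is 18 and the maximum is 64. An age of 19 then maps to 1/46, roughly 0.0217, which is consistent with the scaled age in the first row of the output above.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical ages, chosen so the column minimum is 18 and the maximum is 64.
ages = np.array([[18.0], [19.0], [64.0]])

# Manual min-max scaling, column by column.
manual = (ages - ages.min(axis=0)) / (ages.max(axis=0) - ages.min(axis=0))

# The same transformation via scikit-learn.
scaled_toy = MinMaxScaler().fit_transform(ages)

print(manual.ravel())
print(np.allclose(manual, scaled_toy))
```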


### Data Splitting

Now, we can split the data making sure to use the scaled data for the features.

In [11]:
from sklearn.model_selection import train_test_split

X = scaled
y = df[target].values

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=123)

print('Training: ', X_train.shape)
print('Testing: ', X_test.shape)

Training:  (936, 9)
Testing:  (402, 9)
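
The split above uses features that were scaled before splitting, so the scaler has already seen the test rows. A common alternative is to split first and fit the scaler on the training rows only. The sketch below shows it on synthetic data; it is an assumption about good practice, not the approach taken in this example, and the names `X_raw`, `X_train_raw` and `X_test_raw` are hypothetical.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the feature matrix and target; not the insurance data.
rng = np.random.default_rng(123)
X_raw = rng.uniform(0, 100, size=(100, 3))
y_toy = rng.uniform(1000, 50000, size=100)

# Split first, so the test rows play no part in choosing the scaling parameters.
X_train_raw, X_test_raw, y_train_toy, y_test_toy = train_test_split(
    X_raw, y_toy, test_size=0.3, random_state=123
)

scaler = MinMaxScaler()
X_train_toy = scaler.fit_transform(X_train_raw)  # learn min/max from training rows only
X_test_toy = scaler.transform(X_test_raw)        # reuse the training min/max

print(X_train_toy.shape, X_test_toy.shape)
print(X_train_toy.min(), X_train_toy.max())
```

With this ordering, scaled test values can fall slightly outside the 0 to 1 range, which is expected.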


### Training the Model

In this case we will use a neural network regressor named MLPRegressor. It accepts a number of parameters; in this case:
- hidden_layer_sizes is set to (11,11,11), which means 3 hidden layers with 11 neurons in each layer
- solver is set to 'lbfgs', the algorithm used for weight optimization; it was chosen because the others did not converge
- max_iter is set to 5000, the maximum number of iterations the solver may run; it stops earlier if it converges
- activation is set to 'relu', the rectified linear unit function, which is also the default

For more information: **[MLPRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPRegressor.html)**

After the model is trained, we can predict using the test data set.

In [12]:
from sklearn.neural_network import MLPRegressor

mlp = MLPRegressor(random_state=123, max_iter=5000, solver='lbfgs', hidden_layer_sizes=(11,11,11), activation='relu')
mlp.fit(X_train, y_train)

predictions = mlp.predict(X_test)
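
To see how hidden_layer_sizes translates into the network structure, one can inspect the shapes of the fitted weight matrices in the `coefs_` attribute. This is a minimal sketch on synthetic data with 9 features (the same width as the feature matrix above); `X_toy`, `y_toy` and `mlp_toy` are hypothetical names, and the small max_iter is only there to keep the run short.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic data with 9 features; not the insurance data.
rng = np.random.default_rng(123)
X_toy = rng.uniform(0, 1, size=(50, 9))
y_toy = X_toy.sum(axis=1)

# A small iteration budget is enough here, since only the layer shapes matter.
# It may emit a ConvergenceWarning, which is harmless for this purpose.
mlp_toy = MLPRegressor(hidden_layer_sizes=(11, 11, 11), solver="lbfgs",
                       max_iter=50, random_state=123)
mlp_toy.fit(X_toy, y_toy)

# One weight matrix per connection between consecutive layers.
shapes = [w.shape for w in mlp_toy.coefs_]
print(shapes)
```

The four matrices connect the 9 inputs to the first hidden layer, the hidden layers to each other, and the last hidden layer to the single output.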

### Evaluating the Model

After we predict the values, we can use any suitable metric to measure the performance of the model. Since this is a regression task, we compute the mean squared error (MSE) and the R² score.

In [15]:
from sklearn.metrics import mean_squared_error, r2_score

# Calculate MSE and R2 score
mse = mean_squared_error(y_test, predictions)
r2 = r2_score(y_test, predictions)

print(f'Mean Squared Error: {mse:.2f}')
print(f'R² Score: {r2:.3f}')

Mean Squared Error: 16711831.62
R² Score: 0.883
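
Because MSE is expressed in squared units, its square root (RMSE) is often easier to interpret; for the MSE reported above it is roughly 4,088 in the units of charges. The sketch below computes MSE, RMSE and R² by hand on hypothetical values (not results from the model above) and compares them with the scikit-learn functions.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical true and predicted values; not results from the model above.
y_true = np.array([100.0, 200.0, 300.0, 400.0])
y_pred = np.array([110.0, 190.0, 310.0, 390.0])

# MSE: mean of the squared residuals.
mse_manual = np.mean((y_true - y_pred) ** 2)
# RMSE: square root of MSE, in the same units as the target.
rmse_manual = np.sqrt(mse_manual)
# R²: 1 minus the residual sum of squares over the total sum of squares.
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot

print(mse_manual, rmse_manual, r2_manual)
print(np.isclose(mse_manual, mean_squared_error(y_true, y_pred)))
print(np.isclose(r2_manual, r2_score(y_true, y_pred)))
```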
