# Random Forest Loan Approver

In this activity, you will build a Random Forest classifier that can be used to predict the loan status (approve or deny) given a set of input features.

## Instructions

1. Read the data into a Pandas DataFrame.

2. Separate the features `X` from the target `y`. In this case, the loan status is the target.

3. Separate the data into training and testing subsets.

4. Scale the data using `StandardScaler`.

5. Import and instantiate an Random Forest classifier using sklearn.

6. Fit the model to the data.

7. Calculate the accuracy score using both the training and the testing data.

8. Make predictions using the testing data.

9. Generate the confusion matrix for the test data predictions.

10. Generate the classification report for the test data.

## Load Data
### 1. Read the data into a Pandas DataFrame.

In [2]:
# Import modules
from pathlib import Path
import pandas as pd

In [7]:
# Read in the data
df=pd.read_csv(Path('../Resources/loans.csv'))
df

Unnamed: 0,assets,liabilities,income,credit_score,mortgage,status
0,0.210859,0.452865,0.281367,0.628039,0.302682,deny
1,0.395018,0.661153,0.330622,0.638439,0.502831,approve
2,0.291186,0.593432,0.438436,0.434863,0.315574,approve
3,0.458640,0.576156,0.744167,0.291324,0.394891,approve
4,0.463470,0.292414,0.489887,0.811384,0.566605,approve
...,...,...,...,...,...,...
95,0.360945,0.823295,0.542451,0.224285,0.328504,approve
96,0.114420,0.107174,0.619564,0.370300,0.047719,deny
97,0.309276,0.692433,0.483730,0.328953,0.304493,approve
98,0.549153,0.301588,0.651869,0.717826,0.602004,approve


### 2. Separate the Features `X` from the Target `y`

In [8]:
# Segment the features from the target
y=df['status']
X=df.drop(columns='status')

### 3. Split the data into training and testing sets

In [9]:
from sklearn.model_selection import train_test_split

# Use the train_test_split function to create training and testing subsets
X_train,X_test, y_train, y_test = train_test_split(X,y, random_state=1)

### 4. Scale the data using `StandardScaler`

In [10]:
from sklearn.preprocessing import StandardScaler

# Scale the data
scaler=StandardScaler()
X_scaler=scaler.fit(X_train)
X_train_scaled=X_scaler.transform(X_train)
X_test_scaled=X_scaler.transform(X_test)

## Model

### 5. Import and instantiate the `RandomForestClassifier` using sklearn. 

In [11]:
from sklearn.ensemble import RandomForestClassifier

# Create a random forest classifier
rf_model = RandomForestClassifier(n_estimators=100, random_state=1)

## Fit

### 6. Train the model using the training data

In [22]:
# Fit the data
rf_model=rf_model.fit(X_train_scaled, y_train)

### 7. Score the model using the test data

In [23]:
# Score the accuracy
scores=rf_model.score(X_test_scaled, y_test)
print(scores)

0.84


## Predict

### 8. Make predictions

In [24]:
# Make predictions using the test data
predictions = rf_model.predict(X_test_scaled)

## Evaluate

### 9. Generate Confusion Matrix

In [25]:
from sklearn.metrics import confusion_matrix

# Create a confusion matrix
confusion.matrix(y_train, predictions)

NameError: name 'confusion' is not defined

### 10. Generate Classification Report

In [11]:
from sklearn.metrics import classification_report

# Print the classification report
classification_report.()

              precision    recall  f1-score   support

     approve       0.75      0.75      0.75        12
        deny       0.77      0.77      0.77        13

    accuracy                           0.76        25
   macro avg       0.76      0.76      0.76        25
weighted avg       0.76      0.76      0.76        25

