# SVM Loan Approver

In this activity, you will build a Support Vector Machine (SVM) classifier that can be used to predict the loan status (approve or deny) given a set of input features.

## Instructions

1. Read the data into a Pandas DataFrame.

2. Separate the features `X` from the target `y`. In this case, the loan status is the target.

3. Separate the data into training and testing subsets.

4. Scale the data using `StandardScaler`.

5. Import and instantiate an SVM classifier using sklearn.

6. Fit the model to the data.

7. Calculate the accuracy score using both the training and the testing data.

8. Make predictions using the testing data.

9. Generate the confusion matrix for the test data predictions.

10. Generate the classification report for the test data.


**Bonus**: Fit a logistic regression model to the same scaled data and compare its performance against the SVM model. Decide which model performed better, and be prepared to discuss the results with the class.

## Load Data
### Import modules

In [1]:
# Import modules
from pathlib import Path
import pandas as pd

### 1. Read the data into a Pandas DataFrame.

In [2]:
# Read in the data
data = Path('../Resources/loans.csv')
df = pd.read_csv(data)
df.head()

|   | assets | liabilities | income | credit_score | mortgage | status |
|---|--------|-------------|--------|--------------|----------|--------|
| 0 | 0.210859 | 0.452865 | 0.281367 | 0.628039 | 0.302682 | deny |
| 1 | 0.395018 | 0.661153 | 0.330622 | 0.638439 | 0.502831 | approve |
| 2 | 0.291186 | 0.593432 | 0.438436 | 0.434863 | 0.315574 | approve |
| 3 | 0.45864 | 0.576156 | 0.744167 | 0.291324 | 0.394891 | approve |
| 4 | 0.46347 | 0.292414 | 0.489887 | 0.811384 | 0.566605 | approve |


### 2. Separate the Features `X` from the Target `y`

In [3]:
# Segment the features from the target
y = df["status"]
X = df.drop(columns="status")

### 3. Split the data into training and testing sets

In [4]:
from sklearn.model_selection import train_test_split

# Use the train_test_split function to create training and testing subsets
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1, stratify=y)
X_train.shape

(75, 5)

### 4. Scale the data using `StandardScaler`

In [5]:
from sklearn.preprocessing import StandardScaler

# Scale the data
scaler = StandardScaler()
X_scaler = scaler.fit(X_train)
X_train_scaled = X_scaler.transform(X_train)
X_test_scaled = X_scaler.transform(X_test)
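To make the scaling step concrete, here is a minimal pure-Python sketch of the arithmetic `StandardScaler` performs on one feature: each value is centered on the training mean and divided by the training (population) standard deviation, and the same training statistics are reused for the test data. The `train_income` and `test_income` values are made up for illustration and are not taken from `loans.csv`.

```python
from statistics import fmean, pstdev

# Hypothetical values for a single feature (not from loans.csv)
train_income = [0.28, 0.33, 0.44, 0.74, 0.49]
test_income = [0.52, 0.31]

# "fit": learn the mean and population standard deviation from the training data only
mean = fmean(train_income)
std = pstdev(train_income)

# "transform": apply the training statistics to both subsets
train_scaled = [(x - mean) / std for x in train_income]
test_scaled = [(x - mean) / std for x in test_income]

print(f"Scaled training mean: {fmean(train_scaled):.3f}")
print(f"Scaled training std:  {pstdev(train_scaled):.3f}")
print(f"Scaled test values:   {[round(x, 3) for x in test_scaled]}")
```

Fitting on the training subset alone keeps information about the test rows out of the preprocessing step, which is why the cell above calls `fit` with `X_train` only.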

## Model

### 5. Import and instantiate an SVM classifier using sklearn.

In [6]:
from sklearn.svm import SVC

# Instantiate a linear SVM model
svm_model = SVC(kernel='linear')
svm_model

SVC(kernel='linear')

## Fit

### 6. Fit the model to the training data

In [7]:
# Fit the model to the scaled training data
svm_model.fit(X_train_scaled, y_train)

SVC(kernel='linear')

### 7. Calculate the accuracy score using both the training and the testing data.

In [8]:
# Score the accuracy
print(f"Training Data Score: {svm_model.score(X_train_scaled, y_train)}")
print(f"Testing Data Score: {svm_model.score(X_test_scaled, y_test)}")

Training Data Score: 0.6133333333333333
Testing Data Score: 0.64
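For a classifier, `score` returns the mean accuracy on the data it is given, so it should agree with `accuracy_score` applied to the model's predictions. The sketch below checks this on synthetic data from `make_classification`, which stands in for `loans.csv` here. That substitution is an assumption for illustration: the row and feature counts mirror the shapes shown above, but the values are unrelated to the loan data.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the loan data (assumption): 100 rows, 5 features
X, y = make_classification(
    n_samples=100, n_features=5, n_informative=3, n_redundant=0, random_state=1
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=1, stratify=y
)

# Scale with statistics learned from the training subset only
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

model = SVC(kernel="linear").fit(X_train_scaled, y_train)

# Two routes to the same number: the estimator's score method and accuracy_score
score_from_method = model.score(X_test_scaled, y_test)
score_from_metric = accuracy_score(y_test, model.predict(X_test_scaled))
print(f"model.score:    {score_from_method}")
print(f"accuracy_score: {score_from_metric}")
```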


## Predict

### 8. Make predictions

In [9]:
# Make predictions using the test data
y_pred = svm_model.predict(X_test_scaled)

results = pd.DataFrame({
    "Prediction": y_pred, 
    "Actual": y_test
}).reset_index(drop=True)

results.head()

|   | Prediction | Actual |
|---|------------|--------|
| 0 | approve | deny |
| 1 | approve | approve |
| 2 | deny | deny |
| 3 | approve | deny |
| 4 | deny | deny |


## Evaluate

### 9. Generate Confusion Matrix

In [10]:
from sklearn.metrics import confusion_matrix

# Create a confusion matrix
confusion_matrix(y_test, y_pred)

array([[8, 4],
       [5, 8]], dtype=int64)
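The matrix above can also be read by hand. As a rough sketch, assuming scikit-learn's convention that rows are actual classes and columns are predicted classes, both in sorted label order (`approve`, then `deny`), the per-class precision and recall follow directly from the four counts:

```python
# Counts copied from the confusion matrix output above.
# Rows: actual class, columns: predicted class, in sorted label order.
labels = ["approve", "deny"]
matrix = [[8, 4],
          [5, 8]]

total = sum(sum(row) for row in matrix)
correct = sum(matrix[i][i] for i in range(len(labels)))
accuracy = correct / total

metrics = {}
for i, label in enumerate(labels):
    true_positives = matrix[i][i]
    actual_count = sum(matrix[i])                    # row sum: rows that truly have this label
    predicted_count = sum(row[i] for row in matrix)  # column sum: rows predicted as this label
    metrics[label] = {
        "precision": true_positives / predicted_count,
        "recall": true_positives / actual_count,
    }
    print(
        f"{label}: precision={metrics[label]['precision']:.2f}, "
        f"recall={metrics[label]['recall']:.2f}"
    )

print(f"accuracy={accuracy:.2f}")
```

The rounded values agree with the classification report in the next step: 0.62 and 0.67 for `approve`, 0.67 and 0.62 for `deny`, and an overall accuracy of 0.64.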

### 10. Generate Classification Report

In [11]:
from sklearn.metrics import classification_report

# Print the classification report
print(classification_report(y_test, y_pred))

              precision    recall  f1-score   support

     approve       0.62      0.67      0.64        12
        deny       0.67      0.62      0.64        13

    accuracy                           0.64        25
   macro avg       0.64      0.64      0.64        25
weighted avg       0.64      0.64      0.64        25



### Bonus: Logistic Regression

In [12]:
# Instantiate a logistic regression model
from sklearn.linear_model import LogisticRegression
lr = LogisticRegression()
lr

LogisticRegression()

In [13]:
# Fit the model to the scaled training data
lr.fit(X_train_scaled, y_train)

LogisticRegression()

In [14]:
# Make predictions using the test data
y_pred_lr = lr.predict(X_test_scaled)

In [15]:
# Print classification report
print(classification_report(y_test, y_pred_lr))

              precision    recall  f1-score   support

     approve       0.62      0.67      0.64        12
        deny       0.67      0.62      0.64        13

    accuracy                           0.64        25
   macro avg       0.64      0.64      0.64        25
weighted avg       0.64      0.64      0.64        25
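The two classification reports are identical, which means both models made the same number of errors of each kind on the 25 test rows. It does not by itself show that they made the same prediction on every row. One way to check is to compare the predictions directly. The sketch below does this on synthetic data from `make_classification`, used here as a stand-in for `loans.csv` (an assumption for illustration), and wraps scaling and model in a `make_pipeline` for brevity rather than scaling in a separate step as the notebook does.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the loan data (assumption): 100 rows, 5 features
X, y = make_classification(
    n_samples=100, n_features=5, n_informative=3, n_redundant=0, random_state=1
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=1, stratify=y
)

# Each pipeline scales the features, then fits its classifier
models = {
    "svm": make_pipeline(StandardScaler(), SVC(kernel="linear")),
    "logistic_regression": make_pipeline(StandardScaler(), LogisticRegression()),
}

predictions = {}
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions[name] = model.predict(X_test)
    accuracies[name] = model.score(X_test, y_test)
    print(f"{name}: test accuracy = {accuracies[name]:.2f}")

# Fraction of test rows on which the two models give the same label
agreement = (predictions["svm"] == predictions["logistic_regression"]).mean()
print(f"Share of test rows where both models agree: {agreement:.2f}")
```

With the real notebook variables, the equivalent check would be `(y_pred == y_pred_lr).mean()`. A linear SVM and logistic regression both learn a linear decision boundary, so close agreement on a small dataset is plausible, but the result depends on the data.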

