# SML Lab 3

In [1]:
# Imports:
import turicreate as tc
from sklearn.model_selection import train_test_split
import numpy as np

In [2]:
# 1) In a markdown cell, discuss whether a false positive or false negative is 
# worse for this use case. State a value of β that is suitable for an Fβ score.

**Explanation:**

1. A false negative (FN) is worse than a false positive (FP) for this use case. An FP alerts maintenance staff to a failure that doesn’t exist and costs one unnecessary inspection. An FN means a real failure goes undetected, which can lead to safety hazards, equipment damage, or even catastrophic failure.

2. A suitable value for β in the Fβ score is 2. Any β greater than 1 weights recall more heavily than precision, so the score penalizes false negatives more than false positives. Because false negatives are the more costly error in this context, β = 2 favors models that detect actual failures.
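
To make the effect of β concrete, here is a minimal sketch of the standard Fβ formula applied to two hypothetical classifiers with mirrored precision and recall. The function name `f_beta` and the 0.9/0.6 values are assumptions chosen for illustration, not results from this lab.

```python
def f_beta(precision, recall, beta):
    """Weighted harmonic mean of precision and recall."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Two hypothetical classifiers with mirrored precision and recall.
high_precision = f_beta(precision=0.9, recall=0.6, beta=2)
high_recall = f_beta(precision=0.6, recall=0.9, beta=2)

# With beta = 2 the high-recall classifier scores better.
print(high_precision < high_recall)  # True
```

With β = 1 the two classifiers would score the same, so the preference for recall comes entirely from choosing β > 1.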

In [3]:
# 2) Load the CSV file into an SFrame named data. Print the SFrame. 
# Split the data into training/validation/testing sets using 80%/10%/10% respectively.

data = tc.SFrame('0399259_data.csv')
print(data)

# Pull each column out as a NumPy array so scikit-learn can split the rows.
voltage = np.array(data['Voltage'])
current = np.array(data['Current'])
temp = np.array(data['Temperature'])
condition = np.array(data['Condition'])
features = np.column_stack((voltage, current, temp))

# Hold out 20% of the rows, then halve that holdout: 80% train, 10% validation, 10% test.
# No random_state is set, so the rows in each set change on every run.
x_train, x_valtest, y_train, y_valtest = train_test_split(features, condition, train_size=0.8)
x_val, x_test, y_val, y_test = train_test_split(x_valtest, y_valtest, train_size=0.5)

# Rebuild SFrames so Turi Create can train and evaluate on each split.
train_data = tc.SFrame({'Voltage': x_train[:, 0], 'Current': x_train[:, 1], 'Temperature': x_train[:, 2], 'Condition': y_train})
val_data = tc.SFrame({'Voltage': x_val[:, 0], 'Current': x_val[:, 1], 'Temperature': x_val[:, 2], 'Condition': y_val})
test_data = tc.SFrame({'Voltage': x_test[:, 0], 'Current': x_test[:, 1], 'Temperature': x_test[:, 2], 'Condition': y_test})

------------------------------------------------------
Inferred types from first 100 line(s) of file as 
column_type_hints=[int,float,float,float]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------


+-----------+--------------------+--------------------+--------------------+
| Condition |      Voltage       |      Current       |    Temperature     |
+-----------+--------------------+--------------------+--------------------+
|     0     | 12.105402229028574 | 1.2010158804571862 | 27.021383826347545 |
|     0     | 18.47696101766036  | 9.228798007868006  | 19.223667320653277 |
|     0     | 5.699595016939316  | 31.970087490684595 | 26.006400351846032 |
|     0     | 32.26151306174472  | 13.278921807347974 | 83.45379653646954  |
|     0     | 17.214851542698558 | 10.747552975112113 | 29.968236276473505 |
|     0     | 11.509978758782651 | 13.364409126220089 | 29.776983373817707 |
|     0     | 25.21192122391736  | 16.613095269745283 | 34.362140439076796 |
|     0     | 27.07633338214532  | 10.639192625540735 | 48.82253332157811  |
|     0     | 26.55653370361028  | 5.011417360921026  | 29.52311270140705  |
|     0     | 27.626069846362416 | 14.52428111164478  | 47.07836152351818  |
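
Since the split above is random, the reported metrics change between runs. The sketch below shows one way to make an 80/10/10 split repeatable by fixing a seed. It uses only the standard library; the helper name `split_indices` and the seed value are assumptions for illustration. With scikit-learn, passing `random_state` to `train_test_split` serves the same purpose.

```python
import random


def split_indices(n_rows, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle row indices with a fixed seed and cut them into three groups."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # same seed, same order
    n_train = round(n_rows * train_frac)
    n_val = round(n_rows * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains
    return train, val, test


# 1000 rows, matching the 800/100/100 set sizes used in this lab.
train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 800 100 100
```

Calling `split_indices(1000)` again returns exactly the same three lists, so any metrics computed on them can be reproduced.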

In [4]:
# 3) Note in markdown whether or not feature rescaling is turned on by default 
# for the function turicreate.logistic_classifier.create, and state which 
# scale (original or rescaled) the coefficients are given in.

**Explanation:**

1. By default, feature rescaling is turned on for this function.
2. The coefficients are reported on the original scale of the features, not the rescaled one.
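
To make the relationship between the two scales concrete, here is a minimal sketch. It assumes rescaling divides a feature by its column L2 norm; the voltage readings and the coefficient value are made up for illustration and are not taken from the model in this lab.

```python
import math

# Made-up voltage readings, for illustration only.
voltage = [12.0, 18.0, 6.0, 32.0]

# Assumed rescaling: divide the feature by its column L2 norm.
norm = math.sqrt(sum(v * v for v in voltage))
rescaled = [v / norm for v in voltage]

# Hypothetical coefficient learned on the rescaled feature.
w_rescaled = 40.0

# Equivalent coefficient expressed on the original scale.
w_original = w_rescaled / norm

# Both parameterizations give the same linear score for every row.
scores_match = all(
    math.isclose(w_original * v, w_rescaled * r)
    for v, r in zip(voltage, rescaled)
)
print(scores_match)  # True
```

A coefficient on the original scale can be read directly in the feature's own units (for example, per volt), which is not true of a coefficient on the rescaled feature.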

In [5]:
# 4) Create a perceptron named ‘model1’ using Turi Create to classify data with 
# ‘Condition’ as the target, with both l1_penalty and l2_penalty set to zero. 
# Then, for both training and validation data sets:
#  · Find the predictions using model.predict() and setting output_type='class'.
#  · Find and display the confusion matrix using tc.evaluation.confusion_matrix().
#  · Find and display the accuracy, precision, recall, and 𝐹𝛽 score using the value 
#    of β you chose above, and the functions under tc.evaluation.

model1 = tc.logistic_classifier.create(train_data, target='Condition', l1_penalty=0, l2_penalty=0)

train_predictions = model1.predict(train_data, output_type='class')
val_predictions = model1.predict(val_data, output_type='class')

# Training Dataset
print('Training Dataset:')
print(tc.evaluation.confusion_matrix(train_data['Condition'], train_predictions))
print('Accuracy:', tc.evaluation.accuracy(train_data['Condition'], train_predictions))
print('Precision:', tc.evaluation.precision(train_data['Condition'], train_predictions))
print('Recall:', tc.evaluation.recall(train_data['Condition'], train_predictions))
print('F[2] Score:', tc.evaluation.fbeta_score(train_data['Condition'], train_predictions, beta=2))

# Validation Dataset
print('Validation Dataset:')
print(tc.evaluation.confusion_matrix(val_data['Condition'], val_predictions))
print('Accuracy:', tc.evaluation.accuracy(val_data['Condition'], val_predictions))
print('Precision:', tc.evaluation.precision(val_data['Condition'], val_predictions))
print('Recall:', tc.evaluation.recall(val_data['Condition'], val_predictions))
print('F[2] Score:', tc.evaluation.fbeta_score(val_data['Condition'], val_predictions, beta=2))

PROGRESS: Creating a validation set from 5 percent of training data. This may take a while.
          You can set ``validation_set=None`` to disable validation tracking.



Training Dataset:
+--------------+-----------------+-------+
| target_label | predicted_label | count |
+--------------+-----------------+-------+
|      0       |        0        |  382  |
|      1       |        1        |  363  |
|      0       |        1        |   26  |
|      1       |        0        |   29  |
+--------------+-----------------+-------+
[4 rows x 3 columns]

Accuracy: 0.93125
Precision: 0.9331619537275064
Recall: 0.9260204081632653
F[2] Score: 0.9274399591211037
Validation Dataset:
+--------------+-----------------+-------+
| target_label | predicted_label | count |
+--------------+-----------------+-------+
|      1       |        1        |   56  |
|      0       |        0        |   40  |
|      1       |        0        |   3   |
|      0       |        1        |   1   |
+--------------+-----------------+-------+
[4 rows x 3 columns]

Accuracy: 0.96
Precision: 0.9824561403508771
Recall: 0.9491525423728814
F[2] Score: 0.9556313993174061
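
As a cross-check, the metrics printed above can be recomputed by hand from the confusion-matrix counts, treating class 1 as the positive class. The helper `metrics_from_counts` is a hypothetical name introduced here; it is a sketch of the standard definitions, not Turi Create's implementation.

```python
def metrics_from_counts(tp, tn, fp, fn, beta=2):
    """Accuracy, precision, recall and F-beta from confusion-matrix counts."""
    b2 = beta ** 2
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        # Equivalent to (1 + b2) * P * R / (b2 * P + R), written in counts.
        "f_beta": (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp),
    }


# Counts read from the model1 confusion matrices above.
train = metrics_from_counts(tp=363, tn=382, fp=26, fn=29)
val = metrics_from_counts(tp=56, tn=40, fp=1, fn=3)

for name, value in train.items():
    print("train", name, value)
for name, value in val.items():
    print("val", name, value)
```

Written in counts, F2 weights each false negative four times as heavily as a false positive, which is the behavior wanted for this use case.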


In [6]:
# 5) Create a perceptron model named ‘model2’. This time experiment with the l1_penalty 
# and l2_penalty hyperparameters to see if you can get a better model using regularization. 
# Then, for both training and validation data sets:
#  · Find the predictions.
#  · Find and display the confusion matrix.
#  · Find and display the accuracy, precision, recall, and 𝐹𝛽 score.

model2 = tc.logistic_classifier.create(train_data, target='Condition', l1_penalty=20.05, l2_penalty=55.5)

train_predictions2 = model2.predict(train_data, output_type='class')
val_predictions2 = model2.predict(val_data, output_type='class')

# Training Dataset
print('Training Dataset:')
print(tc.evaluation.confusion_matrix(train_data['Condition'], train_predictions2))
print('Accuracy:', tc.evaluation.accuracy(train_data['Condition'], train_predictions2))
print('Precision:', tc.evaluation.precision(train_data['Condition'], train_predictions2))
print('Recall:', tc.evaluation.recall(train_data['Condition'], train_predictions2))
print('F[2] Score:', tc.evaluation.fbeta_score(train_data['Condition'], train_predictions2, beta=2))

# Validation Dataset
print('Validation Dataset:')
print(tc.evaluation.confusion_matrix(val_data['Condition'], val_predictions2))
print('Accuracy:', tc.evaluation.accuracy(val_data['Condition'], val_predictions2))
print('Precision:', tc.evaluation.precision(val_data['Condition'], val_predictions2))
print('Recall:', tc.evaluation.recall(val_data['Condition'], val_predictions2))
print('F[2] Score:', tc.evaluation.fbeta_score(val_data['Condition'], val_predictions2, beta=2))

PROGRESS: Creating a validation set from 5 percent of training data. This may take a while.
          You can set ``validation_set=None`` to disable validation tracking.



Training Dataset:
+--------------+-----------------+-------+
| target_label | predicted_label | count |
+--------------+-----------------+-------+
|      0       |        0        |  378  |
|      0       |        1        |   30  |
|      1       |        1        |  357  |
|      1       |        0        |   35  |
+--------------+-----------------+-------+
[4 rows x 3 columns]

Accuracy: 0.91875
Precision: 0.9224806201550387
Recall: 0.9107142857142857
F[2] Score: 0.9130434782608694
Validation Dataset:
+--------------+-----------------+-------+
| target_label | predicted_label | count |
+--------------+-----------------+-------+
|      1       |        1        |   56  |
|      0       |        0        |   40  |
|      1       |        0        |   3   |
|      0       |        1        |   1   |
+--------------+-----------------+-------+
[4 rows x 3 columns]

Accuracy: 0.96
Precision: 0.9824561403508771
Recall: 0.9491525423728814
F[2] Score: 0.9556313993174061
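
Trying penalty pairs one at a time can be replaced by a small grid search. The sketch below shows only the search loop; `grid_search` and `toy_score` are hypothetical names, and `toy_score` is a stand-in with no connection to the real data. In this lab the scoring function would instead train a model with `tc.logistic_classifier.create(train_data, target='Condition', l1_penalty=l1, l2_penalty=l2)` and return its validation F2 from `tc.evaluation.fbeta_score`.

```python
from itertools import product


def grid_search(l1_values, l2_values, score_fn):
    """Return (best_score, l1, l2) over every penalty pair."""
    best = None
    for l1, l2 in product(l1_values, l2_values):
        score = score_fn(l1, l2)
        if best is None or score > best[0]:
            best = (score, l1, l2)
    return best


def toy_score(l1, l2):
    """Stand-in scorer for illustration only: prefers smaller penalties."""
    return 1.0 / (1.0 + l1 + l2)


best_score, best_l1, best_l2 = grid_search([0, 1.05, 20], [0, 2.5, 55], toy_score)
print(best_score, best_l1, best_l2)
```

Scoring on the validation set rather than the training set keeps the test set untouched until a final model has been chosen.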


In [7]:
# 6) Using markdown, select which of your two models is the best (or declare a tie) and 
# justify your choice by commenting on metrics and the confusion matrix.

**Explanation:**
<pre>
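The model I chose is model1, based on the runs recorded in the tables below. Because the split is random, the counts differ from run to run. I chose this model for 2 reasons:
1. Evaluation metrics: On the validation set, model1 has the highest accuracy (0.96), precision (0.949), recall (0.982), and F2 score (0.976) of the four candidates.
   On the training set it has the highest recall and F2 score and ties for the highest accuracy; only its precision is slightly lower than model2's. None of the l1 and l2 penalties tried for model2 raised its validation F2 score above 0.962.
2. Confusion matrix: Model1 misses only 1 failure (false negative) on the validation set, compared with 2 or 3 for the model2 variants, while every model produces 3 false positives. Since false negatives are the more costly error, this favors model1.

To summarize, adjusting the penalties doesn't improve model2's performance enough to surpass the unregularized model. Therefore, I chose model1 as the best model.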
</pre>
    
**Metrics for model1:**
<pre>
Training Set:                                        Validation Set:
+--------------+-----------------+-------+           +--------------+-----------------+-------+
| target_label | predicted_label | count |           | target_label | predicted_label | count |
+--------------+-----------------+-------+           +--------------+-----------------+-------+
|      1       |        1        |  354  |           |      1       |        1        |   56  |
|      0       |        0        |  393  |           |      0       |        0        |   40  |
|      1       |        0        |   30  |           |      0       |        1        |   3   |
|      0       |        1        |   23  |           |      1       |        0        |   1   |
+--------------+-----------------+-------+           +--------------+-----------------+-------+

- Accuracy: 0.93375                                  - Accuracy: 0.96
- Precision: 0.9389920424403183                      - Precision: 0.9491525423728814
- Recall: 0.921875                                   - Recall: 0.9824561403508771
- F2 Score: 0.9252483010977524                       - F2 Score: 0.975609756097561
</pre>
    
**Metrics for model2 (L1=1.05, L2=2.5):**
<pre>
Training Set:                                        Validation Set:
+--------------+-----------------+-------+           +--------------+-----------------+-------+
| target_label | predicted_label | count |           | target_label | predicted_label | count |
+--------------+-----------------+-------+           +--------------+-----------------+-------+
|      1       |        1        |  351  |           |      1       |        1        |   54  |
|      0       |        0        |  396  |           |      0       |        0        |   40  |
|      1       |        0        |   33  |           |      0       |        1        |   3   |
|      0       |        1        |   20  |           |      1       |        0        |   3   |
+--------------+-----------------+-------+           +--------------+-----------------+-------+

- Accuracy: 0.93375                                  - Accuracy: 0.94
- Precision: 0.9460916442048517                      - Precision: 0.9473684210526315
- Recall: 0.9140625                                  - Recall: 0.9473684210526315
- F2 Score: 0.9202936549554274                       - F2 Score: 0.9473684210526314
</pre>
    
**Metrics for model2 (L1=100, L2=20):**
<pre>
Training Set:                                        Validation Set:
+--------------+-----------------+-------+           +--------------+-----------------+-------+
| target_label | predicted_label | count |           | target_label | predicted_label | count |
+--------------+-----------------+-------+           +--------------+-----------------+-------+
|      1       |        1        |  341  |           |      1       |        1        |   55  |
|      0       |        0        |  396  |           |      0       |        0        |   40  |
|      1       |        0        |   43  |           |      0       |        1        |   3   |
|      0       |        1        |   20  |           |      1       |        0        |   2   |
+--------------+-----------------+-------+           +--------------+-----------------+-------+

- Accuracy: 0.92125                                  - Accuracy: 0.95
- Precision: 0.9445983379501385                      - Precision: 0.9482758620689655
- Recall: 0.8880208333333334                         - Recall: 0.9649122807017544
- F2 Score: 0.8987875593041645                       - F2 Score: 0.9615384615384617
</pre>
    
**Metrics for model2 (L1=20, L2=55):**
<pre>
Training Set:                                        Validation Set:
+--------------+-----------------+-------+           +--------------+-----------------+-------+
| target_label | predicted_label | count |           | target_label | predicted_label | count |
+--------------+-----------------+-------+           +--------------+-----------------+-------+
|      1       |        1        |  343  |           |      1       |        1        |   55  |
|      0       |        0        |  395  |           |      0       |        0        |   40  |
|      1       |        0        |   41  |           |      0       |        1        |   3   |
|      0       |        1        |   21  |           |      1       |        0        |   2   |
+--------------+-----------------+-------+           +--------------+-----------------+-------+

- Accuracy: 0.9225                                   - Accuracy: 0.95
- Precision: 0.9423076923076923                      - Precision: 0.9482758620689655
- Recall: 0.8932291666666666                         - Recall: 0.9649122807017544
- F2 Score: 0.9026315789473682                       - F2 Score: 0.9615384615384617
</pre>
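
The comparison can also be done programmatically. The sketch below ranks the four candidates by validation F2, using the true positive, false positive, and false negative counts from the validation tables recorded above. The dictionary layout and the helper name `f2` are assumptions for illustration.

```python
# Validation counts copied from the tables above (class 1 is positive).
candidates = {
    "model1": {"tp": 56, "fp": 3, "fn": 1},
    "model2 (L1=1.05, L2=2.5)": {"tp": 54, "fp": 3, "fn": 3},
    "model2 (L1=100, L2=20)": {"tp": 55, "fp": 3, "fn": 2},
    "model2 (L1=20, L2=55)": {"tp": 55, "fp": 3, "fn": 2},
}


def f2(tp, fp, fn):
    """F-beta with beta = 2, written in confusion-matrix counts."""
    return 5 * tp / (5 * tp + 4 * fn + fp)


# Highest validation F2 first.
ranked = sorted(candidates, key=lambda name: f2(**candidates[name]), reverse=True)
best = ranked[0]

for name in ranked:
    print(f"{name}: F2 = {f2(**candidates[name]):.4f}")
```

Because every candidate has the same number of false positives, the ranking here is driven entirely by the false negative count.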

In [8]:
# Using the test set and your choice of best model:
#  · Find the predictions.
#  · Find and display the confusion matrix.
#  · Find and display the accuracy, precision, recall, and 𝐹𝛽 score.

test_predictions = model1.predict(test_data, output_type='class')

# Testing Dataset
print('Testing Dataset:')
print(tc.evaluation.confusion_matrix(test_data['Condition'], test_predictions))
print('Accuracy:', tc.evaluation.accuracy(test_data['Condition'], test_predictions))
print('Precision:', tc.evaluation.precision(test_data['Condition'], test_predictions))
print('Recall:', tc.evaluation.recall(test_data['Condition'], test_predictions))
print('F[2] Score:', tc.evaluation.fbeta_score(test_data['Condition'], test_predictions, beta=2))

Testing Dataset:
+--------------+-----------------+-------+
| target_label | predicted_label | count |
+--------------+-----------------+-------+
|      1       |        1        |   46  |
|      0       |        0        |   49  |
|      0       |        1        |   2   |
|      1       |        0        |   3   |
+--------------+-----------------+-------+
[4 rows x 3 columns]

Accuracy: 0.95
Precision: 0.9583333333333334
Recall: 0.9387755102040817
F[2] Score: 0.9426229508196722
