# Assignment 2: Linear Models and Validation Metrics (30 marks total)
### Due: October 10 at 11:59pm

### Name: 

### In this assignment, you will need to write code that uses linear models to perform classification and regression tasks. You will also be asked to describe the process by which you came up with the code. More details can be found below. Please cite any websites or AI tools that you used to help you with this assignment.

## Part 1: Classification (14.5 marks total)

You have been asked to develop code that can help the user determine if the email they have received is spam or not. Following the machine learning workflow described in class, write the relevant code in each of the steps below:

### Step 0: Import Libraries

In [2]:
import numpy as np
import pandas as pd

### Step 1: Data Input (1 mark)

The data used for this task can be downloaded using the yellowbrick library: 
https://www.scikit-yb.org/en/latest/api/datasets/spam.html

Use the yellowbrick function `load_spam()` to load the spam dataset into the feature matrix `X` and target vector `y`.

Print the size and type of `X` and `y`.

In [3]:
# TO DO: Import spam dataset from yellowbrick library
# TO DO: Print size and type of X and y
from yellowbrick.datasets import load_spam
X, y = load_spam()
print("Size of X: ", X.shape)
print("Size of y: ", y.shape)
print("Type of X: ", X.dtypes)
print("Type of y: ", y.dtypes)

Size of X:  (4600, 57)
Size of y:  (4600,)
Type of X:  word_freq_make                float64
word_freq_address             float64
word_freq_all                 float64
word_freq_3d                  float64
word_freq_our                 float64
word_freq_over                float64
word_freq_remove              float64
word_freq_internet            float64
word_freq_order               float64
word_freq_mail                float64
word_freq_receive             float64
word_freq_will                float64
word_freq_people              float64
word_freq_report              float64
word_freq_addresses           float64
word_freq_free                float64
word_freq_business            float64
word_freq_email               float64
word_freq_you                 float64
word_freq_credit              float64
word_freq_your                float64
word_freq_font                float64
word_freq_000                 float64
word_freq_money               float64
word_freq_hp                  flo

### Step 2: Data Processing (1.5 marks)

Check to see if there are any missing values in the dataset. If necessary, select an appropriate method to fill-in the missing values.

In [4]:
# TO DO: Check if there are any missing values and fill them in if necessary
print("X NaN: ", X.isnull().sum().sum())
print("Y NaN: ", y.isnull().sum().sum())

X NaN:  0
Y NaN:  0


For this task, we want to test if the linear model would still work if we used less data. Use the `train_test_split` function from sklearn to create a new feature matrix named `X_small` and a new target vector named `y_small` that contain **5%** of the data.

In [6]:
# TO DO: Create X_small and y_small 
from sklearn.model_selection import train_test_split

# Hold out 5% of the data as X_small and y_small; the remaining 95% goes to X_ex and y_ex
X_ex, X_small, y_ex, y_small = train_test_split(X, y, 
                                                    test_size=0.05,
                                                    random_state=0)
print(X_ex.shape)
print(X_small.shape)

(4370, 57)
(230, 57)


### Step 3: Implement Machine Learning Model

1. Import `LogisticRegression` from sklearn
2. Instantiate model `LogisticRegression(max_iter=2000)`.
3. Implement the machine learning model with three different datasets: 
    - `X` and `y`
    - Only first two columns of `X` and `y`
    - `X_small` and `y_small`

### Step 4: Validate Model

Calculate the training and validation accuracy for the three different tests implemented in Step 3

### Step 5: Visualize Results (4 marks)

1. Create a pandas DataFrame `results` with columns: Data size, training accuracy, validation accuracy
2. Add the data size, training and validation accuracy for each dataset to the `results` DataFrame
3. Print `results`

In [30]:
# TO DO: ADD YOUR CODE HERE FOR STEPS 3-5
# Note: for any random state parameters, you can use random_state = 0
# HINT: USING A LOOP TO STORE THE DATA IN YOUR RESULTS DATAFRAME WILL BE MORE EFFICIENT

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

# Now we do a train test split with a test size of 20% and train of 80% for 3 different datasets:

# X and y
X_train, X_test, y_train, y_test = train_test_split(X, y, 
                                                    test_size=0.2,
                                                    random_state=0)
# first two columns of X and y
X_col2_train, X_col2_test, y_col2_train, y_col2_test = train_test_split(X.iloc[:,:2], y, 
                                                    test_size=0.2,
                                                    random_state=0)
# X_small and y_small
X_small_train, X_small_test, y_small_train, y_small_test = train_test_split(X_small, y_small, 
                                                    test_size=0.2,
                                                    random_state=0)

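# Instantiate a separate model for each dataset. Calling fit() repeatedly on one
# shared instance would leave model1, model2 and model3 all pointing at the same
# object, fitted only on the last dataset.
model1 = LogisticRegression(max_iter=2000).fit(X_train, y_train) # X and y
model2 = LogisticRegression(max_iter=2000).fit(X_col2_train, y_col2_train) # first two columns of X and y
model3 = LogisticRegression(max_iter=2000).fit(X_small_train, y_small_train) # X_small and y_small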

# Training and validation scores for each model (5-fold cross-validation).
# cross_validate stores the validation-fold scores under the key 'test_score'.
scores1 = cross_validate(model1, X_train, y_train, cv=5, 
                        scoring='accuracy',
                       return_train_score=True)
train1 = scores1['train_score'].mean()
val1 = scores1['test_score'].mean()
    
scores2 = cross_validate(model2, X_col2_train, y_col2_train, cv=5, 
                        scoring='accuracy',
                       return_train_score=True)
train2 = scores2['train_score'].mean()
val2 = scores2['test_score'].mean()
    
scores3 = cross_validate(model3, X_small_train, y_small_train, cv=5, 
                        scoring='accuracy',
                       return_train_score=True)
train3 = scores3['train_score'].mean()
val3 = scores3['test_score'].mean()
    
# Creating the Dataframe
data = {'Data Size': [X.shape, X.iloc[:,:2].shape, X_small.shape],
            'Training Accuracy': [train1, train2, train3],
            'Validation Accuracy': [val1, val2, val3]}
results = pd.DataFrame(data)
print(results)

    Data Size  Training Accuracy  Validation Accuracy
0  (4600, 57)           0.927717             0.921739
1   (4600, 2)           0.615693             0.614402
2   (230, 57)           0.956527             0.902102
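
One way to read the table above is to look at the gap between training and validation accuracy for each dataset. The snippet below is a minimal sketch of that comparison; the accuracies are copied from the printed table, and `accuracy_gap` is a hypothetical helper introduced only for illustration.

```python
# Accuracies copied from the results table above: (training, validation)
reported = {
    "X (4600 x 57)": (0.927717, 0.921739),
    "first two columns (4600 x 2)": (0.615693, 0.614402),
    "X_small (230 x 57)": (0.956527, 0.902102),
}


def accuracy_gap(train_acc, val_acc):
    """Return training accuracy minus validation accuracy (hypothetical helper)."""
    return train_acc - val_acc


gaps = {name: accuracy_gap(tr, va) for name, (tr, va) in reported.items()}

for name, gap in gaps.items():
    print(f"{name}: gap = {gap:.4f}")

# The dataset with the largest gap is the most likely to be overfitting
largest_gap = max(gaps, key=gaps.get)
print("Largest gap:", largest_gap)
```

A small gap with low accuracy on both sides (the two-column case) points towards underfitting, while a large gap with high training accuracy (the 5% case) points towards overfitting.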


### Questions (4 marks)
1. How do the training and validation accuracy change depending on the amount of data used? Explain with values.
2. In this case, what do a false positive and a false negative represent? Which one is worse?

#### Answers
1. Training and validation accuracy are closest for the entire dataset, which has the most data: 92.8% and 92.2%. When only the first two columns are used, both training and validation accuracy are poor at 61.6% and 61.4%; the model underfits and has high bias. This might be because the remaining columns carry information the model needs. With just 5% of the data, training accuracy is very high at 95.7% but validation accuracy drops to 90.2%, which indicates overfitting and high variance.

2. In this case, a false positive is a legitimate email that is labelled as spam, and a false negative is a spam email that is not flagged. A false positive is far worse, since important emails should not be marked as spam, while it is acceptable for a spam email to occasionally reach the inbox.
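
To make the two error types concrete, the sketch below counts them on a small set of made-up labels. It assumes that `1` means spam and `0` means not spam; the labels and the `count_errors` helper are illustrative and are not taken from the assignment data.

```python
def count_errors(y_true, y_pred, positive=1):
    """Count false positives and false negatives (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    # False positive: predicted spam, but the email was legitimate
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    # False negative: predicted legitimate, but the email was spam
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    # True positive: predicted spam and the email really was spam
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    return fp, fn, tp


# Made-up labels for illustration only: 1 = spam, 0 = not spam
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_pred = [0, 1, 1, 0, 0, 1, 0, 1]

fp, fn, tp = count_errors(y_true, y_pred)

# Precision: of the emails flagged as spam, the fraction that really were spam
precision = tp / (tp + fp)

print("False positives:", fp)
print("False negatives:", fn)
print("Precision:", precision)
```

If false positives are the costlier error, precision is a more relevant metric than overall accuracy, because it drops whenever legitimate emails are flagged.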

### Process Description (4 marks)
Please describe the process you used to create your code. Cite any websites or generative AI tools used. You can use the following questions as guidance:
1. Where did you source your code?
1. In what order did you complete the steps?
1. If you used generative AI, what prompts did you use? Did you need to modify the code at all? Why or why not?
1. Did you have any challenges? If yes, what were they? If not, what helped you to be successful?

*DESCRIBE YOUR PROCESS HERE*

1. The source code was derived from the lecture example: Linear Example. All information originated from here.
2. The steps were completed in the order they were asked.
3. No generative AI was necessary since the lecture code was simple enough to easily modify without external resources.
4. The only challenge faced was dividing the data into 5%, which was overcome with the help of the professor.

## Part 2: Regression (10.5 marks total)

For this section, we will be evaluating concrete compressive strength of different concrete samples, based on age and ingredients. You will need to repeat the steps 1-4 from Part 1 for this analysis.

### Step 1: Data Input (1 mark)

The data used for this task can be downloaded using the yellowbrick library: 
https://www.scikit-yb.org/en/latest/api/datasets/concrete.html

Use the yellowbrick function `load_concrete()` to load the concrete dataset into the feature matrix `X` and target vector `y`.

Print the size and type of `X` and `y`.

In [34]:
# TO DO: Import concrete dataset from yellowbrick library
# TO DO: Print size and type of X and y

from yellowbrick.datasets import load_concrete
X, y = load_concrete()
print("Size of X: ", X.shape)
print("Size of y: ", y.shape)
print("Type of X: ", X.dtypes)
print("Type of y: ", y.dtypes)

Size of X:  (1030, 8)
Size of y:  (1030,)
Type of X:  cement    float64
slag      float64
ash       float64
water     float64
splast    float64
coarse    float64
fine      float64
age         int64
dtype: object
Type of y:  float64


### Step 2: Data Processing (0.5 marks)

Check to see if there are any missing values in the dataset. If necessary, select an appropriate method to fill-in the missing values.

In [36]:
# TO DO: Check if there are any missing values and fill them in if necessary
print("X NaN: ", X.isnull().sum().sum())
print("Y NaN: ", y.isnull().sum().sum())

X NaN:  0
Y NaN:  0


### Step 3: Implement Machine Learning Model (1 mark)

1. Import `LinearRegression` from sklearn
2. Instantiate model `LinearRegression()`.
3. Implement the machine learning model with `X` and `y`

In [53]:
# TO DO: ADD YOUR CODE HERE
# Note: for any random state parameters, you can use random_state = 0

from sklearn.linear_model import LinearRegression

# Now we do a train test split with a test size of 20% and train of 80% for the dataset:

# X and y
X_train, X_test, y_train, y_test = train_test_split(X, y, 
                                                    test_size=0.2,
                                                    random_state=0)

# Instantiate the model
model = LinearRegression()

# Apply the model
model = model.fit(X_train, y_train) # X and y

### Step 4: Validate Model (1 mark)

Calculate the training and validation accuracy using mean squared error and R2 score.

In [54]:
# TO DO: ADD YOUR CODE HERE
from sklearn.metrics import mean_squared_error, r2_score

# Predictions for the training and testing sets for y
y_train_pred = model.predict(X_train)
y_test_pred = model.predict(X_test)

# Calculate MSE for the training and testing sets
mse_train = mean_squared_error(y_train, y_train_pred)
mse_test = mean_squared_error(y_test, y_test_pred)

# Calculate R2 score for the training and testing sets
r2_train = r2_score(y_train, y_train_pred)
r2_test = r2_score(y_test, y_test_pred)
    
print("Training Mean squared error:", mse_train)
print("Validation MSE:", mse_test)
print("Training R2 Score:", r2_train)
print("Validation R2 Score:", r2_test)

Training Mean squared error: 110.34550122934108
Validation MSE: 95.63533482690428
Training R2 Score: 0.6090710418548884
Validation R2 Score: 0.6368981103411242


### Step 5: Visualize Results (1 mark)
1. Create a pandas DataFrame `results` with columns: Training accuracy and Validation accuracy, and index: MSE and R2 score
2. Add the accuracy results to the `results` DataFrame
3. Print `results`

In [55]:
# TO DO: ADD YOUR CODE HERE

# Creating the Dataframe
data = {'Training Accuracy': [mse_train, r2_train],
            'Validation Accuracy':[mse_test, r2_test]}

results = pd.DataFrame(data, index=['MSE', 'R2 Score'])

print(results)

          Training Accuracy  Validation Accuracy
MSE              110.345501            95.635335
R2 Score           0.609071             0.636898


### Questions (2 marks)
1. Did using a linear model produce good results for this dataset? Why or why not?

Using the linear model did not produce good results. The mean squared error was very high (110.3 for training and 95.6 for validation), when it should be close to 0. The R2 score was also low (0.609 for training and 0.637 for validation), when it should be closer to 1.
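
Because MSE is expressed in squared units of the target, whether a value is "high" depends on the scale of `y`; taking the square root gives an error in the same units as the target. The sketch below shows how both metrics are computed by hand and converts the reported MSE values to RMSE. The `mse` and `r2` functions and the four-point example are illustrative assumptions, not the scikit-learn implementations.

```python
import math


def mse(y_true, y_pred):
    """Mean of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """1 minus (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up values for illustration only
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 37.0]

toy_mse = mse(y_true, y_pred)
toy_r2 = r2(y_true, y_pred)
print("Toy MSE:", toy_mse)
print("Toy R2:", toy_r2)

# MSE values reported in Step 4 above, converted to RMSE
rmse_train = math.sqrt(110.34550122934108)
rmse_val = math.sqrt(95.63533482690428)
print(f"Training RMSE: {rmse_train:.2f}")
print(f"Validation RMSE: {rmse_val:.2f}")
```

The R2 score is scale-free, which makes it the easier of the two metrics to compare across datasets.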

### Process Description (4 marks)
Please describe the process you used to create your code. Cite any websites or generative AI tools used. You can use the following questions as guidance:
1. Where did you source your code?
1. In what order did you complete the steps?
1. If you used generative AI, what prompts did you use? Did you need to modify the code at all? Why or why not?
1. Did you have any challenges? If yes, what were they? If not, what helped you to be successful?

*DESCRIBE YOUR PROCESS HERE*

1. The source code was derived from the lecture example: Regression Metrics and Linear Regression. All information originated from here.
2. The steps were completed in the order they were asked.
3. Generative AI was used to get the mean squared error and R2 score. ChatGPT was asked "Calculate the training and validation accuracy using mean squared error and R2 score". The result was very close and only the variable names needed to be modified to fit the current code. Only a snippet of the code generated by ChatGPT was used.
4. The only challenge faced was how exactly to get MSE and R2 score since there was no lecture example which went over these steps. ChatGPT was useful in helping generate this code.

## Part 3: Observations/Interpretation (3 marks)

Describe any pattern you see in the results. Relate your findings to what we discussed during lectures. Include data to justify your findings.


*ADD YOUR FINDINGS HERE*

For the R2 score, the validation score was higher than the training score (0.637 vs. 0.609), which according to the class discussion should not happen: the two scores should be close to each other, but the validation score should not exceed the training score. The mean squared error was very high for both training and validation (110.3 and 95.6). For a good fit this error should be as low as possible, which it was not, meaning the model underfit the data. In other words, the model had high bias and was too simple; a more complex model is needed to describe the data.

## Part 4: Reflection (2 marks)
Include a sentence or two about:
- what you liked or disliked,
- found interesting, confusing, challenging, motivating
while working on this assignment.


*ADD YOUR THOUGHTS HERE*

I liked how multiple datasets derived from the same data were used to compare results in Part 1. This made it easier to grasp how the same data can be used in various ways to attain different results. The lab itself was simple to accomplish, since the code was given in lecture examples, making it very efficient to follow. The only challenge was dividing the data in the first part, since it required some clarification.

## Part 5: Bonus Question (4 marks)

Repeat Part 2 with Ridge and Lasso regression to see if you can improve the accuracy results. Which method and what value of alpha gave you the best R^2 score? Is this score "good enough"? Explain why or why not.

**Remember**: Only test values of alpha from 0.001 to 100 along the logarithmic scale.
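
As a minimal sketch of what "along the logarithmic scale" means here, the snippet below builds one alpha per power of ten between 0.001 and 100. The one-value-per-decade spacing is an assumption; NumPy's `np.logspace(-3, 2, 6)` would produce the same grid.

```python
# One alpha per power of ten, from 10^-3 up to 10^2 (spacing is an assumption)
alphas = [10.0 ** exponent for exponent in range(-3, 3)]

for alpha in alphas:
    print(alpha)
```

Each value is ten times the previous one, so the grid covers five orders of magnitude with only six candidates.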

In [73]:
# TO DO: ADD YOUR CODE HERE
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error, r2_score

# Ridge Alpha = 0.001
ridge0001 = Ridge(alpha=0.001).fit(X_train, y_train)
# Predictions for the training and testing sets for y
y_train_pred = ridge0001.predict(X_train)
y_test_pred = ridge0001.predict(X_test)
# Calculate MSE for the training and testing sets
mse_train0001 = mean_squared_error(y_train, y_train_pred)
mse_test0001 = mean_squared_error(y_test, y_test_pred)
# Calculate R2 score for the training and testing sets
r2_train0001 = r2_score(y_train, y_train_pred)
r2_test0001 = r2_score(y_test, y_test_pred)

# Ridge Alpha = 0.1
ridge01 = Ridge(alpha=0.1).fit(X_train, y_train)
# Predictions for the training and testing sets for y
y_train_pred = ridge01.predict(X_train)
y_test_pred = ridge01.predict(X_test)
# Calculate MSE for the training and testing sets
mse_train01 = mean_squared_error(y_train, y_train_pred)
mse_test01 = mean_squared_error(y_test, y_test_pred)
# Calculate R2 score for the training and testing sets
r2_train01 = r2_score(y_train, y_train_pred)
r2_test01 = r2_score(y_test, y_test_pred)

# Ridge Alpha = 1
ridge1 = Ridge().fit(X_train, y_train)
# Predictions for the training and testing sets for y
y_train_pred = ridge1.predict(X_train)
y_test_pred = ridge1.predict(X_test)
# Calculate MSE for the training and testing sets
mse_train1 = mean_squared_error(y_train, y_train_pred)
mse_test1 = mean_squared_error(y_test, y_test_pred)
# Calculate R2 score for the training and testing sets
r2_train1 = r2_score(y_train, y_train_pred)
r2_test1 = r2_score(y_test, y_test_pred)

# Ridge Alpha = 10
ridge10 = Ridge(alpha=10).fit(X_train, y_train)
# Predictions for the training and testing sets for y
y_train_pred = ridge10.predict(X_train)
y_test_pred = ridge10.predict(X_test)
# Calculate MSE for the training and testing sets
mse_train10 = mean_squared_error(y_train, y_train_pred)
mse_test10 = mean_squared_error(y_test, y_test_pred)
# Calculate R2 score for the training and testing sets
r2_train10 = r2_score(y_train, y_train_pred)
r2_test10 = r2_score(y_test, y_test_pred)

# Ridge Alpha = 100
ridge100 = Ridge(alpha=100).fit(X_train, y_train)
# Predictions for the training and testing sets for y
y_train_pred = ridge100.predict(X_train)
y_test_pred = ridge100.predict(X_test)
# Calculate MSE for the training and testing sets
mse_train100 = mean_squared_error(y_train, y_train_pred)
mse_test100 = mean_squared_error(y_test, y_test_pred)
# Calculate R2 score for the training and testing sets
r2_train100 = r2_score(y_train, y_train_pred)
r2_test100 = r2_score(y_test, y_test_pred)

# Ridge Alpha = 1000
ridge1000 = Ridge(alpha=1000).fit(X_train, y_train)
# Predictions for the training and testing sets for y
y_train_pred = ridge1000.predict(X_train)
y_test_pred = ridge1000.predict(X_test)
# Calculate MSE for the training and testing sets
mse_train1000 = mean_squared_error(y_train, y_train_pred)
mse_test1000 = mean_squared_error(y_test, y_test_pred)
# Calculate R2 score for the training and testing sets
r2_train1000 = r2_score(y_train, y_train_pred)
r2_test1000 = r2_score(y_test, y_test_pred)

data = {'Training Accuracy: 0.001': [mse_train0001, r2_train0001],
        'Validation Accuracy: 0.001':[mse_test0001, r2_test0001],
        'Training Accuracy: 0.1': [mse_train01, r2_train01],
        'Validation Accuracy: 0.1':[mse_test01, r2_test01],
        'Training Accuracy: 1': [mse_train1, r2_train1],
        'Validation Accuracy: 1':[mse_test1, r2_test1],
        'Training Accuracy: 10': [mse_train10, r2_train10],
        'Validation Accuracy: 10':[mse_test10, r2_test10],
        'Training Accuracy: 100': [mse_train100, r2_train100],
        'Validation Accuracy: 100':[mse_test100, r2_test100],
        'Training Accuracy: 1000': [mse_train1000, r2_train1000],
        'Validation Accuracy: 1000':[mse_test1000, r2_test1000]}

results = pd.DataFrame(data, index=['MSE', 'R2 Score']).transpose().head(30)

print(results)

                                   MSE  R2 Score
Training Accuracy: 0.001    110.345501  0.609071
Validation Accuracy: 0.001   95.635335  0.636898
Training Accuracy: 0.1      110.345501  0.609071
Validation Accuracy: 0.1     95.635324  0.636898
Training Accuracy: 1        110.345501  0.609071
Validation Accuracy: 1       95.635231  0.636899
Training Accuracy: 10       110.345502  0.609071
Validation Accuracy: 10      95.634301  0.636902
Training Accuracy: 100      110.345597  0.609071
Validation Accuracy: 100     95.625173  0.636937
Training Accuracy: 1000     110.353529  0.609043
Validation Accuracy: 1000    95.548714  0.637227


In [75]:
from sklearn.linear_model import Lasso

# lasso Alpha = 0.001
lasso0001 = Lasso(alpha=0.001, max_iter=100000).fit(X_train, y_train)
# Predictions for the training and testing sets for y
y_train_pred = lasso0001.predict(X_train)
y_test_pred = lasso0001.predict(X_test)
# Calculate MSE for the training and testing sets
mse_train0001 = mean_squared_error(y_train, y_train_pred)
mse_test0001 = mean_squared_error(y_test, y_test_pred)
# Calculate R2 score for the training and testing sets
r2_train0001 = r2_score(y_train, y_train_pred)
r2_test0001 = r2_score(y_test, y_test_pred)

# lasso Alpha = 0.1
lasso01 = Lasso(alpha=0.1, max_iter=100000).fit(X_train, y_train)
# Predictions for the training and testing sets for y
y_train_pred = lasso01.predict(X_train)
y_test_pred = lasso01.predict(X_test)
# Calculate MSE for the training and testing sets
mse_train01 = mean_squared_error(y_train, y_train_pred)
mse_test01 = mean_squared_error(y_test, y_test_pred)
# Calculate R2 score for the training and testing sets
r2_train01 = r2_score(y_train, y_train_pred)
r2_test01 = r2_score(y_test, y_test_pred)

# Lasso Alpha = 1
lasso1 = Lasso(max_iter=100000).fit(X_train, y_train)
# Predictions for the training and testing sets for y
y_train_pred = lasso1.predict(X_train)
y_test_pred = lasso1.predict(X_test)
# Calculate MSE for the training and testing sets
mse_train1 = mean_squared_error(y_train, y_train_pred)
mse_test1 = mean_squared_error(y_test, y_test_pred)
# Calculate R2 score for the training and testing sets
r2_train1 = r2_score(y_train, y_train_pred)
r2_test1 = r2_score(y_test, y_test_pred)

# Lasso Alpha = 10
lasso10 = Lasso(alpha=10, max_iter=100000).fit(X_train, y_train)
# Predictions for the training and testing sets for y
y_train_pred = lasso10.predict(X_train)
y_test_pred = lasso10.predict(X_test)
# Calculate MSE for the training and testing sets
mse_train10 = mean_squared_error(y_train, y_train_pred)
mse_test10 = mean_squared_error(y_test, y_test_pred)
# Calculate R2 score for the training and testing sets
r2_train10 = r2_score(y_train, y_train_pred)
r2_test10 = r2_score(y_test, y_test_pred)

# Lasso Alpha = 100
lasso100 = Lasso(alpha=100, max_iter=100000).fit(X_train, y_train)
# Predictions for the training and testing sets for y
y_train_pred = lasso100.predict(X_train)
y_test_pred = lasso100.predict(X_test)
# Calculate MSE for the training and testing sets
mse_train100 = mean_squared_error(y_train, y_train_pred)
mse_test100 = mean_squared_error(y_test, y_test_pred)
# Calculate R2 score for the training and testing sets
r2_train100 = r2_score(y_train, y_train_pred)
r2_test100 = r2_score(y_test, y_test_pred)

# Lasso Alpha = 1000
lasso1000 = Lasso(alpha=1000, max_iter=100000).fit(X_train, y_train)
# Predictions for the training and testing sets for y
y_train_pred = lasso1000.predict(X_train)
y_test_pred = lasso1000.predict(X_test)
# Calculate MSE for the training and testing sets
mse_train1000 = mean_squared_error(y_train, y_train_pred)
mse_test1000 = mean_squared_error(y_test, y_test_pred)
# Calculate R2 score for the training and testing sets
r2_train1000 = r2_score(y_train, y_train_pred)
r2_test1000 = r2_score(y_test, y_test_pred)

data = {'Training Accuracy: 0.001': [mse_train0001, r2_train0001],
        'Validation Accuracy: 0.001':[mse_test0001, r2_test0001],
        'Training Accuracy: 0.1': [mse_train01, r2_train01],
        'Validation Accuracy: 0.1':[mse_test01, r2_test01],
        'Training Accuracy: 1': [mse_train1, r2_train1],
        'Validation Accuracy: 1':[mse_test1, r2_test1],
        'Training Accuracy: 10': [mse_train10, r2_train10],
        'Validation Accuracy: 10':[mse_test10, r2_test10],
        'Training Accuracy: 100': [mse_train100, r2_train100],
        'Validation Accuracy: 100':[mse_test100, r2_test100],
        'Training Accuracy: 1000': [mse_train1000, r2_train1000],
        'Validation Accuracy: 1000':[mse_test1000, r2_test1000]}

results = pd.DataFrame(data, index=['MSE', 'R2 Score']).transpose().head(30)

print(results)

                                   MSE  R2 Score
Training Accuracy: 0.001    110.345501  0.609071
Validation Accuracy: 0.001   95.634971  0.636899
Training Accuracy: 0.1      110.346120  0.609069
Validation Accuracy: 0.1     95.599545  0.637034
Training Accuracy: 1        110.407340  0.608852
Validation Accuracy: 1       95.335850  0.638035
Training Accuracy: 10       112.093055  0.602880
Validation Accuracy: 10      95.114791  0.638874
Training Accuracy: 100      151.368492  0.463736
Validation Accuracy: 100    126.142568  0.521070
Training Accuracy: 1000     282.264844  0.000000
Validation Accuracy: 1000   265.384493 -0.007594


*ANSWER HERE*

For Ridge regression, the R2 scores were nearly identical across all values of alpha and differed only slightly at alpha = 100 and alpha = 1000. The best was alpha = 1000, with a training score of 0.609043 and a validation score of 0.637227; this value lies outside the 0.001 to 100 range specified above, and within that range the best was alpha = 100, with a validation score of 0.636937. Either way, the result was only minutely better than plain linear regression (0.636898). The results were still very poor and did not describe the data properly: the model still had high bias and was too simple.

For Lasso regression, the best R2 score belonged to alpha = 10, with a training score of 0.602880 and a validation score of 0.638874. The worst was alpha = 1000, with a training score of 0.000000 and a validation score of -0.007594. Lasso had the best validation score of all the models, but the score was still poor overall, since the model still had high bias and was still too simple for the dataset. An R2 score should be close to 1, and none of the models came close to that.
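
The comparison above can also be done programmatically. The sketch below copies the validation R2 scores from the two printed tables and picks the best alpha for each method, restricted to the 0.001 to 100 range from the question; `best_alpha` is a hypothetical helper introduced only for illustration.

```python
# Validation R2 scores copied from the printed Ridge and Lasso tables
ridge_val_r2 = {0.001: 0.636898, 0.1: 0.636898, 1: 0.636899,
                10: 0.636902, 100: 0.636937, 1000: 0.637227}
lasso_val_r2 = {0.001: 0.636899, 0.1: 0.637034, 1: 0.638035,
                10: 0.638874, 100: 0.521070, 1000: -0.007594}


def best_alpha(scores, low=0.001, high=100):
    """Return (alpha, score) with the highest score for low <= alpha <= high."""
    in_range = {a: s for a, s in scores.items() if low <= a <= high}
    alpha = max(in_range, key=in_range.get)
    return alpha, in_range[alpha]


ridge_best = best_alpha(ridge_val_r2)
lasso_best = best_alpha(lasso_val_r2)

print("Ridge best (alpha, validation R2):", ridge_best)
print("Lasso best (alpha, validation R2):", lasso_best)
```

Restricting the search to the allowed range changes the Ridge result, since the highest Ridge score in the table belongs to alpha = 1000.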
