## Instructions

Download the dataset here: https://archive.ics.uci.edu/ml/datasets/Abalone

The data and variable names are in different files; you will likely need both. The goal is to predict the age of the abalone from the other variables in the dataset, because the traditional method for aging these organisms is boring and tedious.

There are two challenges (in my opinion):

1. You should try to build the best, stacking-based model(s) to predict age.

2. The UC Irvine Machine Learning Repository classifies this dataset as a "classification" dataset, but age is stored as a numeric (albeit discrete-valued) variable. So I think it is reasonable to treat this as a regression problem. It's up to you!

How does your work here compare to your results with bagging?!
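As a rough illustration of the first challenge under the regression framing, here is a minimal sketch of a stacked regressor built with scikit-learn's `StackingRegressor`. Everything specific in it is an assumption made for illustration: the data are synthetic (not the real abalone file), `demo_df` is a hypothetical name, and the choice of base learners, the `Ridge` meta-learner, and all hyperparameters are arbitrary starting points rather than tuned or recommended values.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import StackingRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in with a few abalone-like columns (assumption: NOT real data).
rng = np.random.default_rng(0)
n = 400
demo_df = pd.DataFrame({
    "Sex": rng.choice(["M", "F", "I"], size=n),
    "Length": rng.uniform(0.1, 0.8, size=n),
    "Shell_weight": rng.uniform(0.01, 1.0, size=n),
})
demo_df["Age"] = (
    5 + 10 * demo_df["Length"] + 8 * demo_df["Shell_weight"]
    + rng.normal(0, 1, size=n)
)

X = demo_df.drop(columns=["Age"])
y = demo_df["Age"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# One-hot encode the categorical column; pass numeric columns through unchanged.
preprocess = ColumnTransformer(
    [("sex", OneHotEncoder(handle_unknown="ignore"), ["Sex"])],
    remainder="passthrough",
)

# Base learners produce cross-validated predictions; the final estimator
# is then fit on those predictions.
stack = StackingRegressor(
    estimators=[
        ("tree", DecisionTreeRegressor(max_depth=4, random_state=0)),
        ("knn", KNeighborsRegressor(n_neighbors=10)),
        ("linear", LinearRegression()),
    ],
    final_estimator=Ridge(alpha=1.0),
    cv=5,
)

model = Pipeline([("preprocess", preprocess), ("stack", stack)])
model.fit(X_train, y_train)

rmse = np.sqrt(mean_squared_error(y_test, model.predict(X_test)))
print(f"Test RMSE on synthetic data: {rmse:.3f}")
```

If you prefer the classification framing, `StackingClassifier` from the same module follows the same pattern with classifiers as base learners.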

## Data Import

In [3]:
# Packages
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeRegressor
import numpy as np
import matplotlib.pyplot as plt

In [4]:
# Data
abalone_df = pd.read_csv('Data/abalone.data', header=None)
abalone_df.columns = [
    'Sex',
    'Length',
    'Diameter',
    'Height',
    'Whole_weight',
    'Shucked_weight',
    'Viscera_weight',
    'Shell_weight',
    'Rings'
]
abalone_df.head()

| | Sex | Length | Diameter | Height | Whole_weight | Shucked_weight | Viscera_weight | Shell_weight | Rings |
|---|---|---|---|---|---|---|---|---|---|
| 0 | M | 0.455 | 0.365 | 0.095 | 0.514 | 0.2245 | 0.101 | 0.15 | 15 |
| 1 | M | 0.35 | 0.265 | 0.09 | 0.2255 | 0.0995 | 0.0485 | 0.07 | 7 |
| 2 | F | 0.53 | 0.42 | 0.135 | 0.677 | 0.2565 | 0.1415 | 0.21 | 9 |
| 3 | M | 0.44 | 0.365 | 0.125 | 0.516 | 0.2155 | 0.114 | 0.155 | 10 |
| 4 | I | 0.33 | 0.255 | 0.08 | 0.205 | 0.0895 | 0.0395 | 0.055 | 7 |
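Since the data file has no header row, an alternative to assigning `abalone_df.columns` afterwards is to pass the names at read time through the `names` parameter of `pd.read_csv`. The sketch below does this on an in-memory copy of the five rows shown above, so it runs without the data file; `sample_rows` and `sample_df` are hypothetical names used only for this illustration.

```python
import io

import pandas as pd

column_names = [
    "Sex", "Length", "Diameter", "Height", "Whole_weight",
    "Shucked_weight", "Viscera_weight", "Shell_weight", "Rings",
]

# The first five rows of the dataset, in the same header-less CSV layout.
sample_rows = """\
M,0.455,0.365,0.095,0.514,0.2245,0.101,0.15,15
M,0.35,0.265,0.09,0.2255,0.0995,0.0485,0.07,7
F,0.53,0.42,0.135,0.677,0.2565,0.1415,0.21,9
M,0.44,0.365,0.125,0.516,0.2155,0.114,0.155,10
I,0.33,0.255,0.08,0.205,0.0895,0.0395,0.055,7
"""

# header=None tells pandas the first line is data; names supplies the labels.
sample_df = pd.read_csv(io.StringIO(sample_rows), header=None, names=column_names)
print(sample_df.dtypes)
```

With the real file, the same call would take the path `'Data/abalone.data'` in place of the `StringIO` object.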


## Data Prep

In [5]:
# Age in years is the ring count plus 1.5
abalone_df["Age"] = abalone_df["Rings"] + 1.5
abalone_df.drop(columns=["Rings"], inplace=True)
abalone_df.head()

| | Sex | Length | Diameter | Height | Whole_weight | Shucked_weight | Viscera_weight | Shell_weight | Age |
|---|---|---|---|---|---|---|---|---|---|
| 0 | M | 0.455 | 0.365 | 0.095 | 0.514 | 0.2245 | 0.101 | 0.15 | 16.5 |
| 1 | M | 0.35 | 0.265 | 0.09 | 0.2255 | 0.0995 | 0.0485 | 0.07 | 8.5 |
| 2 | F | 0.53 | 0.42 | 0.135 | 0.677 | 0.2565 | 0.1415 | 0.21 | 10.5 |
| 3 | M | 0.44 | 0.365 | 0.125 | 0.516 | 0.2155 | 0.114 | 0.155 | 11.5 |
| 4 | I | 0.33 | 0.255 | 0.08 | 0.205 | 0.0895 | 0.0395 | 0.055 | 8.5 |
