# Abalone Age Prediction

The age of an abalone is determined by cutting the shell through the cone, staining it, and counting the number of rings under a microscope, a tedious and time-consuming task. Other measurements, which are easier to obtain, are used to predict the age. Further information, such as weather patterns and location (and hence food availability), may be required to solve the problem. Examples with missing values were removed from the original data (most of them were missing the predicted value), and the ranges of the continuous values were scaled for use with an artificial neural network (ANN) by dividing by 200.
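As a minimal sketch of what the scaling implies, the snippet below multiplies a scaled column by 200 to recover the original measurement and derives an age from the ring count. The values in `scaled` are illustrative only, not rows taken from the file. The `rings + 1.5` rule for age in years and the millimetre unit are taken from the UCI dataset documentation, not from this notebook, so treat both as assumptions.

```python
import pandas as pd

SCALE = 200  # divisor stated in the dataset description above

# Illustrative scaled values only; these are NOT rows from the real file.
scaled = pd.DataFrame({
    "Length": [0.455, 0.350],  # scaled length
    "Rings": [15, 7],          # ring counts are integers and are not scaled
})

# Undo the scaling to get the measurement back in its original unit
# (assumed to be millimetres, per the UCI documentation).
original_length = scaled["Length"] * SCALE

# Assumption from the UCI documentation: age in years = rings + 1.5.
scaled["Age"] = scaled["Rings"] + 1.5

print(original_length.round(1).tolist())
print(scaled["Age"].tolist())
```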

**Source**: https://archive.ics.uci.edu/ml/datasets/Abalone

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/7/70/Haliotis_iris_01.JPG/1920px-Haliotis_iris_01.JPG">

## Goals:

### Understand the dataset
- How are the features related to each other?
- Are there redundant features?
- Are there outliers?
- Is there missing data?
- Are data types adequate for analysis?

### Understand the problem
- Which features are correlated with the target feature?
- Is it possible to create new features that are correlated with the target feature?
- Answer [questions](https://en.wikipedia.org/wiki/Data_analysis#Analytical_activities_of_data_users) using the data

In [1]:
import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt

%matplotlib inline

In [2]:
pd.set_option('display.max_columns', None)

In [3]:
names = [
    "Sex", "Length", "Diameter",
    "Height", "Whole weight", "Shucked weight",
    "Viscera weight", "Shell weight", "Rings"
]
# The raw file has no header row, so the column names are supplied explicitly.
df = pd.read_csv("data/abalone.data", names=names)
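Several of the questions listed under Goals (missing data, data types, outliers, correlation with the target) can be sketched with a few pandas calls. The helper name `dataset_overview` and the `toy` frame below are hypothetical and exist only to keep the example self-contained. The 1.5 × IQR rule is one common outlier heuristic, chosen here as an assumption rather than as this notebook's method.

```python
import pandas as pd


def dataset_overview(frame: pd.DataFrame, target: str = "Rings") -> dict:
    """Collect basic checks for a DataFrame (hypothetical helper)."""
    numeric = frame.select_dtypes(include="number")

    # Flag values outside 1.5 * IQR from the quartiles (a common heuristic).
    q1 = numeric.quantile(0.25)
    q3 = numeric.quantile(0.75)
    iqr = q3 - q1
    outside = (numeric < q1 - 1.5 * iqr) | (numeric > q3 + 1.5 * iqr)

    return {
        "missing": frame.isna().sum(),   # missing values per column
        "dtypes": frame.dtypes,          # data type of each column
        "outliers": outside.sum(),       # flagged values per numeric column
        # Pearson correlation of each numeric feature with the target.
        "target_corr": numeric.corr()[target]
        .drop(target)
        .sort_values(ascending=False),
    }


# Toy data for illustration only; these are NOT rows from abalone.data.
toy = pd.DataFrame({
    "Sex": ["M", "F", "I", "M"],
    "Length": [0.30, 0.40, 0.50, 0.60],
    "Height": [0.10, 0.12, None, 0.20],
    "Rings": [5, 8, 10, 14],
})

overview = dataset_overview(toy)
print(overview["missing"])
print(overview["target_corr"])
```

With the notebook's own data, the same helper would be called as `dataset_overview(df)`. The `Sex` column is categorical, so it is excluded from the correlation and outlier checks.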