## Abalone Age Prediction

Rahul Reddy Yasam

Jogendra Aniruddh Ayatha

Anvitha Swamy

Rishabh Singh

### Abstract:
Abalones, commonly known as ear shells or sea ears, are marine gastropod mollusks found around the globe. Determining an abalone's age normally requires cutting a sample from the shell, staining it, and counting the rings in the shell layers under a microscope. Because this procedure is tedious, this work applies machine learning algorithms to a dataset of body measurements. The aim is to predict abalone age from morphological measurements, treating the problem as a regression task. We explore regression models as a cheaper and less invasive alternative to the traditional method. The expected outcome is a dependable predictive model, based on shell geometry, that significantly reduces the cost and effort of age estimation.

### Introduction:


Abalone is highly sought after as food around the world and is a valuable source of iron and pantothenic acid; it is rich in minerals and vitamins. The market price of an abalone depends on its age, which makes age an important attribute for farmers and buyers alike. However, the conventional technique for determining age, cutting the shell and counting rings under a microscope, is laborious. Age is also hard to estimate because growth is influenced by factors other than age, such as food availability and differences in growth rates between populations. Because the conventional method is complex and expensive, there is a need for an easier approach.

Machine learning algorithms identify trends and generate predictions from observational data. In a regression task each sample is paired with a numeric target rather than a class label, and fitting such a model becomes harder as the dataset grows or the number of features increases.

The goal of this report is to identify a set of measurements that accurately predict the ring count and, consequently, the age of an abalone. Traditional age determination requires preparing a shell sample and counting its rings under a microscope. As a case study we use a regression problem based on a dataset from the University of California Irvine Machine Learning Repository, which contains physical characteristics and age data for abalones. We aim to determine the relationship between morphological characteristics and age. Several regression approaches will be compared, and the one with the best predictive performance will be selected.

### Literature Review:

This research examines non-invasive, less costly, and potentially more accurate methods of determining the age of abalones. The intended result is a reliable predictive approach based on shell geometry that reduces cost and simplifies age estimation (1).

Abalones are valued both as a delicacy and for their nutritional content, including iron and pantothenic acid. The market value of an abalone correlates strongly with its age, which matters to both the abalone farmer and the buyer. However, the traditional ways of determining age, such as microscopically counting the rings that form in cut shells along with other shell indicators, are limited and not very reliable (2).

For a variety of reasons, including variation in feeding success and different growth patterns among populations, estimating abalone age still presents challenges. Because traditional approaches are complex and expensive, there is a strong incentive to look for more practical alternatives (3).

This report seeks parameters that predict the ring count precisely and thereby allow the age of abalones to be inferred. Conventionally, abalone age determination requires careful shell sample preparation and ring counting under a microscope (4). This paper intends to show that an abalone's physical traits, such as weight and shell height, are linked to its age, by treating the problem as a regression on data from the UC Irvine Machine Learning Repository. Several regression techniques will be evaluated in this study, and the one with the best predictive performance will be identified (5).

#### Research Question
1. How can the age of abalones be accurately predicted using machine learning algorithms based on their physical characteristics?

#### References:

[1] Li Chen and John Ryan, "Abalone in Diasporic Chinese Culture: The Transformation of Biocultural Traditions through Engagement with the Western Australian Environment", Heritage, vol. 1, no. 1, pp. 122-41, 2018.

[2] Hiran. Mayukh, M. Hossain and N. M. Chowdhurry, "Econometrics Ways to Estimate the Age and Price of Abalone", Age of Abalones Using Physical Characteristics: A Classification Problem, no. Ml, pp. 1-4, 2010.

[3] Mobarak Hossain and Niaz Murshed Chowdhury, Econometric Ways to Estimate the Age and Price of Abalone, no. 7068, 2008.

[4] Jiaxu Dong et al., "Abalone Muscle Texture Evaluation and Prediction Based on TPA Experiment", Journal of Food Quality, vol. 2017, 2017.

[5] Dheeru Dua and Casey Graff, "UCI Machine Learning Repository", Archive.Ics.Uci.Edu, 2017, [online] Available: https://archive.ics.uci.edu/ml/citation_policy.html.



In [None]:
# Import necessary libraries
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
%matplotlib inline

In [None]:
# Load the dataset (expects 'abalone.csv' in the working directory)
df = pd.read_csv('abalone.csv')

### Data Collection/Cleaning/Processing/Description:
We obtained the data from Kaggle. The dataset is available at https://www.kaggle.com/code/ashokkumarpalivela/abalone-age-prediction/input

The dataset records several body measurements for each abalone together with its ring count, from which the age in years is derived. These attributes are used in a regression task to predict an abalone's age from its physical characteristics.

In [None]:
df.head(10)

In [None]:
df.columns

In [None]:
df.info()

In [None]:
# Count missing values in each column
df.isnull().sum()

In [None]:
# Derive 'Age' in years from the ring count: age = rings + 1.5
df['Age'] = df['Class_number_of_rings'] + 1.5
df.head(10)
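
The conversion used in the cell above is a fixed offset: age in years is taken to be the ring count plus 1.5. As a minimal illustration, the helper below (`rings_to_age` is a hypothetical name, not part of the notebook) applies the same rule to a few example ring counts.

```python
def rings_to_age(rings: float) -> float:
    """Approximate age in years from a ring count, using age = rings + 1.5."""
    return rings + 1.5


# Example ring counts chosen only for illustration.
for rings in (4, 9, 15):
    print(rings, "rings ->", rings_to_age(rings), "years")
```

Because the offset is constant, predicting `Age` and predicting the ring count are equivalent regression problems; error metrics such as RMSE are identical for both targets.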

In [None]:
df.drop('Class_number_of_rings', axis=1, inplace=True)
df.describe()

In [None]:
# Mean of each numeric column, grouped by 'Sex'
df.pivot_table(index='Sex')

### Data Visualization:

In [None]:
df['Age'].hist(bins=10)

In [None]:
# Generate correlation matrix
import seaborn as sns
df_num = df.drop('Sex', axis=1)
corr = df_num.corr()

# Plot heatmap for visualization
sns.heatmap(corr, annot=True)
plt.title("Abalone Correlation")

### Correlation Analysis

In [None]:
# Print the correlation matrix of the numeric columns ('Sex' excluded) as a formatted table
from tabulate import tabulate
df_num = df.drop('Sex', axis=1)
corr = df_num.corr()
print(tabulate(corr, headers='keys', tablefmt='fancy_grid'))


In [None]:
df_num.corr()  # correlation analysis

In [None]:
df_num.cov() #covariance
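
A full correlation matrix can be hard to scan when the question is simply which measurements track age most closely. The sketch below ranks features by the absolute value of their correlation with `Age`. It runs on a small made-up frame (`toy`, with invented values) so that it is self-contained; on the real data the same expression could be applied to `df_num`.

```python
import pandas as pd

# Toy stand-in for the abalone frame; the values are made up for illustration.
toy = pd.DataFrame({
    "Length": [0.30, 0.45, 0.50, 0.60, 0.65],
    "Height": [0.10, 0.09, 0.15, 0.14, 0.20],
    "Age":    [6.5, 8.5, 9.5, 11.5, 12.5],
})

# Pearson correlation of every feature with the target, strongest first.
ranking = (
    toy.corr()["Age"]
       .drop("Age")
       .abs()
       .sort_values(ascending=False)
)
print(ranking)
```

A high correlation with the target only indicates a linear association; strongly inter-correlated features may still carry largely redundant information.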

In [None]:
# Plot pairwise relationships among the numeric columns, coloured by 'Sex'
import warnings
warnings.filterwarnings('ignore')  # Ignore all warnings (use cautiously)

sns.pairplot(df, hue='Sex')

In [None]:
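# Feature matrix: every column except the last one ('Age')
X = df.iloc[:, :-1].values
X

In [None]:
# Target vector: the last column ('Age')
Y = df.iloc[:, -1].values
Y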

In [None]:
# Encode the categorical 'Sex' values in column 0 of X as integers
from sklearn.preprocessing import LabelEncoder
lencode = LabelEncoder()
X[:,0] = lencode.fit_transform(X[:,0])
X

In [None]:
X=X.astype(float)
X
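
`LabelEncoder` maps the categories to integers in sorted order, which a distance-based model such as k-nearest neighbours would treat as an ordered numeric scale. One common alternative is one-hot encoding, sketched below with `pd.get_dummies`. The toy rows and the category codes `M`, `F`, and `I` are assumptions used only for illustration; the notebook itself does not use this encoding.

```python
import pandas as pd

# Toy rows; the 'M', 'F', 'I' codes and the lengths are illustrative assumptions.
toy = pd.DataFrame({
    "Sex": ["M", "F", "I", "M"],
    "Length": [0.45, 0.53, 0.33, 0.44],
})

# One indicator column per category instead of a single integer code.
encoded = pd.get_dummies(toy, columns=["Sex"], dtype=float)
print(encoded)
```

Each category becomes its own 0/1 column, so no artificial ordering between the categories is introduced.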

In [None]:
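# Drop 'Sex' so that only the numeric measurements are used as features;
# the label-encoded array X built above is not used from here on
df = df.drop('Sex', axis=1)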

In [None]:
X = df.drop(columns=['Age'])
y = df['Age']

In [None]:
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

### Machine Learning Model

In [None]:
regressor = KNeighborsRegressor(n_neighbors=29, weights='distance')
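
The notebook imports `StandardScaler` but fits the k-nearest-neighbours model on unscaled features. Because the model relies on distances, features with larger numeric ranges can dominate the neighbour search. The sketch below shows one way to combine scaling and the regressor in a single pipeline. It uses synthetic stand-in data (`X_demo`, `y_demo`), not the abalone measurements, so whether scaling changes the results on the real data would need to be checked.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: two features on very different scales.
rng = np.random.default_rng(0)
X_demo = np.column_stack([rng.uniform(0, 1, 200), rng.uniform(0, 1000, 200)])
y_demo = 10 * X_demo[:, 0] + rng.normal(0, 0.1, 200)

# The scaler is fitted only on the rows passed to fit(), inside the pipeline.
scaled_knn = make_pipeline(
    StandardScaler(),
    KNeighborsRegressor(n_neighbors=29, weights="distance"),
)
scaled_knn.fit(X_demo[:150], y_demo[:150])

preds = scaled_knn.predict(X_demo[150:])
print("R^2 on held-out rows:", scaled_knn.score(X_demo[150:], y_demo[150:]))
```

Wrapping the scaler in the pipeline keeps the test rows out of the scaling statistics, which avoids leaking information from the test set into training.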

In [None]:
# Train the regressor model
regressor.fit(X_train, y_train)
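
The value `n_neighbors=29` is fixed above without a stated selection procedure. One common way to choose it is a cross-validated grid search, sketched below on synthetic stand-in data (`X_demo`, `y_demo`). The candidate values in `param_grid` are arbitrary assumptions for illustration, not values taken from the notebook.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in data, not the abalone measurements.
rng = np.random.default_rng(42)
X_demo = rng.uniform(0, 1, size=(300, 3))
y_demo = 5 * X_demo[:, 0] + 2 * X_demo[:, 1] + rng.normal(0, 0.3, 300)

# Try several neighbourhood sizes and score each with 5-fold cross-validated RMSE.
search = GridSearchCV(
    KNeighborsRegressor(weights="distance"),
    param_grid={"n_neighbors": [1, 5, 15, 29, 45]},
    scoring="neg_root_mean_squared_error",
    cv=5,
)
search.fit(X_demo, y_demo)

print("Best k:", search.best_params_["n_neighbors"])
print("Cross-validated RMSE:", -search.best_score_)
```

On the real data the search would be run on `X_train` and `y_train` only, so that the test set stays untouched until the final evaluation.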

In [None]:
y_pred = regressor.predict(X_test)

In [None]:
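# Calculate evaluation metrics: root mean squared error (RMSE) and R-squared.
# RMSE is taken as the square root of the MSE, because the `squared=False`
# argument is deprecated in newer scikit-learn versions.
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
r2 = r2_score(y_test, y_pred)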
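
To make the metrics concrete, the sketch below computes RMSE, mean absolute error (MAE), and R-squared directly from their definitions on four made-up age values. The arrays `y_true` and `y_hat` are illustrative assumptions, not model output.

```python
import numpy as np

# Made-up true and predicted ages, for illustration only.
y_true = np.array([8.5, 10.5, 12.5, 9.5])
y_hat = np.array([9.0, 10.0, 11.5, 9.5])

errors = y_true - y_hat

# RMSE: square root of the mean squared error.
rmse_demo = np.sqrt(np.mean(errors ** 2))

# MAE: mean of the absolute errors.
mae_demo = np.mean(np.abs(errors))

# R-squared: 1 minus residual sum of squares over total sum of squares.
r2_demo = 1 - np.sum(errors ** 2) / np.sum((y_true - y_true.mean()) ** 2)

print(rmse_demo, mae_demo, r2_demo)
```

RMSE and MAE are expressed in the same unit as the target (years), while R-squared is unitless and compares the model against always predicting the mean.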


In [None]:
print('Root Mean Squared Error:', rmse)
print('R-squared Score:', r2)
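
The introduction states that several regression approaches are to be compared, while the cells above fit a single k-nearest-neighbours model. The sketch below shows one possible way to set up such a comparison with cross-validation. The candidate models and the synthetic data (`X_demo`, `y_demo`) are assumptions chosen for illustration; the printed numbers say nothing about performance on the abalone data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in data, not the abalone measurements.
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 1, size=(300, 4))
y_demo = 8 * X_demo[:, 0] + 3 * X_demo[:, 1] ** 2 + rng.normal(0, 0.5, 300)

# Hypothetical set of candidate regressors to compare.
candidates = {
    "linear": LinearRegression(),
    "knn": KNeighborsRegressor(n_neighbors=29, weights="distance"),
    "forest": RandomForestRegressor(n_estimators=100, random_state=0),
}

# 5-fold cross-validated RMSE for each candidate (lower is better).
cv_rmse = {}
for name, model in candidates.items():
    scores = cross_val_score(
        model, X_demo, y_demo, cv=5, scoring="neg_root_mean_squared_error"
    )
    cv_rmse[name] = -scores.mean()
    print(f"{name}: RMSE = {cv_rmse[name]:.3f}")
```

Using the same folds and the same metric for every candidate keeps the comparison fair; the final test-set evaluation would then be reserved for the selected model.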

### Results and Conclusion:

In conclusion, this project examined the prediction of abalone age by regression on physical measurements. Simple measurements such as body size and weight appear adequate for coarse age estimates, but finer-grained age information may require more sophisticated techniques. The investigation suggests that some measurements can be combined to distinguish abalones of different ages, in relation to sex. To produce more accurate age estimates, further studies with improved regression models will likely be needed. Overall, the work suggests that measurement-based regression has practical potential as a non-invasive alternative to ring counting, and that further study is needed for more accurate results.