# Description

This dataset contains synthetic data for predicting age from health and lifestyle factors. It has 3,000 rows and 24 features, each describing a different aspect of physical health or lifestyle, plus the target variable, age.

## Features:

* Height (cm): The height of the individual in centimeters.
* Weight (kg): The weight of the individual in kilograms.
* Blood Pressure (s/d): Blood pressure (systolic/diastolic) in mmHg.
* Cholesterol Level (mg/dL): Cholesterol level in milligrams per deciliter.
* BMI: Body Mass Index, calculated from height and weight.
* Blood Glucose Level (mg/dL): Blood glucose level in milligrams per deciliter.
* Bone Density (g/cm²): Bone density in grams per square centimeter.
* Vision Sharpness: Vision sharpness on a scale from 0 (blurry) to 100 (perfect).
* Hearing Ability (dB): Hearing ability in decibels.
* Physical Activity Level: Categorized as 'Low', 'Moderate', or 'High'.
* Smoking Status: Categorical values including 'Never', 'Former', and 'Current'.
* Alcohol Consumption: Frequency of alcohol consumption.
* Diet: Type of diet, categorized as 'Balanced', 'High Protein', 'Low Carb', etc.
* Chronic Diseases: Presence of chronic diseases (e.g., diabetes, hypertension).
* Medication Use: Usage of medication.
* Family History: Presence of family history of age-related conditions.
* Cognitive Function: Self-reported cognitive function on a scale from 0 (poor) to 100 (excellent).
* Mental Health Status: Self-reported mental health status on a scale from 0 (poor) to 100 (excellent).
* Sleep Patterns: Average number of sleep hours per night.
* Stress Levels: Self-reported stress levels on a scale from 0 (low) to 100 (high).
* Pollution Exposure: Exposure to pollution measured in arbitrary units.
* Sun Exposure: Average sun exposure in hours per week.
* Education Level: Highest level of education attained.
* Income Level: Annual income in USD.
* Age (years): The target variable representing the age of the individual.
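
Two of the features above are derived or compound: BMI follows from height and weight, and blood pressure packs two readings into one field. The following is a minimal sketch of both. It assumes the column names match the feature list and that blood pressure is stored as a string such as `120/80`; neither is confirmed by the description, and the sample rows are made up for illustration.

```python
import pandas as pd

# Hypothetical sample rows; column names are assumed from the feature list.
sample = pd.DataFrame({
    "Height (cm)": [170.0, 160.0],
    "Weight (kg)": [72.25, 64.0],
    "Blood Pressure (s/d)": ["120/80", "135/90"],
})

# BMI = weight in kg divided by height in metres squared.
height_m = sample["Height (cm)"] / 100
sample["BMI (recomputed)"] = sample["Weight (kg)"] / height_m ** 2

# Split the assumed "systolic/diastolic" string into two numeric columns.
bp = sample["Blood Pressure (s/d)"].str.split("/", expand=True).astype(int)
sample["Systolic"] = bp[0]
sample["Diastolic"] = bp[1]

print(sample[["BMI (recomputed)", "Systolic", "Diastolic"]])
```

Comparing a recomputed BMI against the provided `BMI` column is one way to check whether the synthetic values are internally consistent.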

# Import Necessary Libraries

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

from sklearn.impute import SimpleImputer
from sklearn.ensemble import RandomForestClassifier

from sklearn.metrics import mean_squared_error, accuracy_score
from sklearn.model_selection import (
    GridSearchCV, KFold, cross_val_score, train_test_split,
)
from sklearn.preprocessing import StandardScaler, LabelEncoder, MinMaxScaler

# Render plots inline in the notebook
%matplotlib inline

# Suppress warnings
import warnings
warnings.filterwarnings('ignore')


# Load Datasets

In [2]:
# Load the training and testing datasets
train_df = pd.read_csv('Train.csv') 
test_df = pd.read_csv('Test.csv')  
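
Once both files are loaded, a quick sanity check helps confirm they match the description. The helper below is a hypothetical sketch, not part of the original notebook. It assumes the target column is named `Age (years)` and is demonstrated on a small made-up frame so it runs without the CSV files.

```python
import pandas as pd

def summarize_frame(df, target="Age (years)"):
    """Return basic facts about a loaded frame (hypothetical helper)."""
    return {
        "rows": df.shape[0],
        "columns": df.shape[1],
        "has_target": target in df.columns,
        "missing_cells": int(df.isna().sum().sum()),
    }

# Made-up stand-in for train_df; the real column names may differ.
demo = pd.DataFrame({
    "Height (cm)": [170.0, None, 182.5],
    "Smoking Status": ["Never", "Former", None],
    "Age (years)": [34, 58, 41],
})

summary = summarize_frame(demo)
print(summary)
```

In the notebook this would be called as `summarize_frame(train_df)` and `summarize_frame(test_df)`. A test file that omits the target column would report `has_target` as `False`.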