#### Customer Height Prediction

#### Overview
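This Python script predicts a customer's height from their age using seven regression algorithms (linear, Ridge, Lasso, decision tree, random forest, gradient boosting, and support vector regression) and keeps the one with the lowest root mean square error (RMSE) on a held-out test set. Height prediction comes up in healthcare, human resource management, and other fields where understanding human growth patterns matters.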

#### Problem Description
The task is to predict a customer's height from their age. Understanding the relationship between the two is useful for applications such as assessing child growth patterns or choosing suitable clothing sizes for customers.

In [1]:
import pandas as pd
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

In [2]:
# Load dataset
dataset = pd.read_csv('dataset.csv')

In [3]:
print(dataset.shape)
print(dataset.head(5))

(71, 2)
   Age  Height
0   10     138
1   11     138
2   12     138
3   13     139
4   14     139


In [4]:
# Extract features (X) and target variable (Y)
X = dataset.iloc[:, :-1].values
Y = dataset.iloc[:, -1].values

In [5]:
# Hold out 25% of the rows for testing; the fixed seed makes the split reproducible
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.25, random_state=0)
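
To make the effect of `test_size=0.25` concrete, here is a minimal sketch on a synthetic stand-in with the same shape as the dataset above (71 rows, one feature). The `ages` and `heights` arrays are invented for illustration and are not the contents of `dataset.csv`.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 71 rows and one feature, like dataset.shape above.
ages = np.arange(10, 81).reshape(-1, 1)      # 71 ages, shape (71, 1)
heights = np.linspace(138.0, 175.0, num=71)  # invented target values

a_train, a_test, h_train, h_test = train_test_split(
    ages, heights, test_size=0.25, random_state=0
)

# scikit-learn rounds the test share up: ceil(0.25 * 71) = 18 rows,
# which leaves 53 rows for training.
print(a_train.shape, a_test.shape)  # (53, 1) (18, 1)
```

With only 18 test rows, each RMSE value rests on a small sample, so small differences between models should be read with caution.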

In [6]:
regressors = {
    'Linear Regression': LinearRegression(),
    'Ridge Regression': Ridge(alpha=10),
    'Lasso Regression': Lasso(alpha=10),
    'Decision Tree Regressor': DecisionTreeRegressor(max_depth=None, min_samples_leaf=1, min_samples_split=2),
    'Random Forest Regressor': RandomForestRegressor(max_depth=None, min_samples_leaf=1, min_samples_split=2, n_estimators=50),
    'Gradient Boosting Regressor': GradientBoostingRegressor(learning_rate=0.1, max_depth=7, n_estimators=100),
    'Support Vector Regression': SVR(C=10, kernel='rbf')
}

In [9]:
# Track the model with the lowest test-set RMSE seen so far
best_rmse = float('inf')
best_regressor = None

for regressor_name, regressor in regressors.items():
    regressor.fit(x_train, y_train)
    y_pred_regression = regressor.predict(x_test)

    mse_regression = mean_squared_error(y_test, y_pred_regression)
    rmse_regression = np.sqrt(mse_regression)

    print(f"{regressor_name} Results:")
    print("Root Mean Square Error:", rmse_regression)
    print("--"*15)

    if rmse_regression < best_rmse:
        best_rmse = rmse_regression
        best_regressor = regressor

Linear Regression Results:
Root Mean Square Error: 7.2238470885301735
------------------------------
Ridge Regression Results:
Root Mean Square Error: 7.226532505076517
------------------------------
Lasso Regression Results:
Root Mean Square Error: 7.3892964435465
------------------------------
Decision Tree Regressor Results:
Root Mean Square Error: 1.4719601443879744
------------------------------
Random Forest Regressor Results:
Root Mean Square Error: 1.2006664815842878
------------------------------
Gradient Boosting Regressor Results:
Root Mean Square Error: 1.4722044028569488
------------------------------
Support Vector Regression Results:
Root Mean Square Error: 2.8037047301779263
------------------------------
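
The RMSE reported above is the square root of the mean squared difference between predictions and true values, so it is expressed in the same units as `Height`. A minimal sketch with made-up numbers (the `y_true` and `y_hat` arrays are illustrative, not values from this notebook) shows that the manual formula agrees with `np.sqrt(mean_squared_error(...))`:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Invented example values, not taken from the dataset.
y_true = np.array([138.0, 140.0, 145.0, 150.0])
y_hat = np.array([139.0, 140.0, 143.0, 154.0])

# Errors are -1, 0, 2, -4; their squares are 1, 0, 4, 16; the mean is 5.25.
rmse_manual = np.sqrt(np.mean((y_true - y_hat) ** 2))
rmse_sklearn = np.sqrt(mean_squared_error(y_true, y_hat))

# Both values equal sqrt(5.25), roughly 2.29.
print(rmse_manual, rmse_sklearn)
```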


In [10]:
# Train the best model on the entire dataset
best_regressor.fit(X, Y)

In [13]:
# Input from the user for prediction
input_data = []
for feature_name in dataset.columns[:-1]:
    user_input = float(input(f"Enter {feature_name}: "))
    input_data.append(user_input)

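# Predict using the best model; scikit-learn expects shape (1, n_features)
input_array = np.array(input_data).reshape(1, -1)
predicted_output = best_regressor.predict(input_array)

# Print every entered value, not only the last one read in the loop
print("Input data:", *input_data)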
print("Predicted Output:", predicted_output[0])

Input data: 20.0
Predicted Output: 141.0
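
For non-interactive use, the prompt loop above can be wrapped in a small function. This is only a sketch: `predict_height` is a hypothetical helper, and the tiny training set below is invented so the block runs on its own. In the notebook you would pass `best_regressor` instead of `toy_model`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def predict_height(model, age):
    """Return the model's height prediction for a single age value."""
    # scikit-learn expects a 2-D array of shape (n_samples, n_features).
    features = np.array([[float(age)]])
    return float(model.predict(features)[0])


# Invented toy data that follows height = 130 + 0.5 * age exactly.
toy_ages = np.array([[10.0], [20.0], [30.0], [40.0]])
toy_heights = np.array([135.0, 140.0, 145.0, 150.0])
toy_model = LinearRegression().fit(toy_ages, toy_heights)

# For this toy line, age 25 gives approximately 142.5.
print(predict_height(toy_model, 25))
```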
