### Create a regression model with an Artificial Neural Network (ANN), from data preparation through model training, to predict continuous outcomes

A regression model is used when the goal is to predict a continuous target variable based on one or more input features.

In [None]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense


In [None]:
# Load the dataset
file_path = '/content/Titaanic Datasetforlogisticregression .csv'
data = pd.read_csv(file_path)

In [None]:
data

Unnamed: 0,Survived,Pclass,Sex,Age,Siblings/Spouses Aboard,Parents/Children Aboard,Fare
0,0,3,male,22.0,1,0,7.2500
1,1,1,female,38.0,1,0,71.2833
2,1,3,female,26.0,0,0,7.9250
3,1,1,female,35.0,1,0,53.1000
4,0,3,male,35.0,0,0,8.0500
...,...,...,...,...,...,...,...
882,0,2,male,27.0,0,0,13.0000
883,1,1,female,19.0,0,0,30.0000
884,0,3,female,7.0,1,2,23.4500
885,1,1,male,26.0,0,0,30.0000


In [None]:
# Handling missing values
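# Assign the result back; calling fillna(..., inplace=True) on a selected
# column is deprecated in recent pandas versions and may not modify data.
data['Age'] = data['Age'].fillna(data['Age'].mean())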
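
To make the imputation step concrete, here is a minimal sketch on a toy column. The values and the names `ages`, `mean_age`, and `ages_filled` are illustrative assumptions, not taken from the dataset.

```python
import pandas as pd

# Toy column with one missing value (illustrative data, not from the dataset)
ages = pd.Series([22.0, 38.0, None, 35.0])

# Series.mean() skips NaN by default, so this is (22 + 38 + 35) / 3
mean_age = ages.mean()

# fillna returns a new Series in which every NaN is replaced by the mean
ages_filled = ages.fillna(mean_age)

print(ages_filled.tolist())
```

Mean imputation keeps every row but shrinks the variance of the column, so it is usually a reasonable default only when few values are missing.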

In [None]:
# Encoding categorical variables
label_encoder = LabelEncoder()
data['Sex'] = label_encoder.fit_transform(data['Sex'])

* **Encoding categorical variables** converts categorical labels into numerical labels.
* label_encoder = LabelEncoder(): This initializes a LabelEncoder object. The LabelEncoder maps each distinct label (string or integer) to an integer code (0, 1, 2, ...).
* fit_transform learns the distinct values of the 'Sex' column and replaces each value with its code.
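
The mapping that label encoding produces can be sketched with NumPy alone. LabelEncoder assigns codes in sorted order of the distinct labels, so 'female' becomes 0 and 'male' becomes 1. The array `sex` below is toy data, used here as an assumption for illustration.

```python
import numpy as np

# Toy labels (illustrative, not the real column)
sex = np.array(["male", "female", "female", "male"])

# np.unique returns the sorted distinct labels; with return_inverse=True it
# also returns, for each original value, its position in that sorted array.
classes, codes = np.unique(sex, return_inverse=True)

print(classes.tolist())  # ['female', 'male']
print(codes.tolist())    # [1, 0, 0, 1]
```

For a two-valued column such as 'Sex' the integer codes are harmless. For columns with three or more unordered categories, integer codes imply an ordering that the data does not have, and one-hot encoding is often preferred.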

In [None]:
# Defining features and target variable
X = data.drop(['Fare'], axis=1)
y = data['Fare']

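* The target variable is dropped from the feature set, and all remaining columns are used as features.
* X = data.drop(['Fare'], axis=1): This line creates a new DataFrame X that contains every column of data except the column named 'Fare'.
* axis=1 specifies that a column, rather than a row, is dropped.
* y = data['Fare']: The target is the fare, a continuous value, which is what makes this a regression problem.
* Note that X still includes 'Unnamed: 0', which appears to be the row index saved with the CSV. It likely carries no predictive information and could be dropped as well.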

In [None]:
# Splitting the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
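
As a rough picture of what a shuffled 80/20 split does, the sketch below shuffles row positions once and cuts them into two disjoint groups. The helper `shuffled_split` is hypothetical, and it will not reproduce the exact rows that train_test_split selects with random_state=42, because the two use different random number generators.

```python
import math
import numpy as np

def shuffled_split(n_samples, test_size=0.2, seed=42):
    """Return (train_idx, test_idx): disjoint, shuffled row positions."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)         # every row position exactly once
    n_test = math.ceil(test_size * n_samples)  # size of the held-out part
    return order[n_test:], order[:n_test]

train_idx, test_idx = shuffled_split(10)
print(len(train_idx), len(test_idx))  # 8 2
```

Fixing the seed makes the split reproducible, which is the role random_state plays in the cell above.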

In [None]:
# Standardizing the data
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
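
The reason for calling fit_transform on the training set but only transform on the test set can be shown with plain NumPy: the mean and standard deviation come from the training data alone and are then reused for the test data. The toy arrays below are assumptions for illustration.

```python
import numpy as np

# Toy data: two features on very different scales (illustrative values)
X_train_toy = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_test_toy = np.array([[4.0, 40.0]])

# "fit": learn per-column statistics from the training set only
mean = X_train_toy.mean(axis=0)
std = X_train_toy.std(axis=0)  # population standard deviation (ddof=0)

# "transform": apply the same statistics to both sets
X_train_z = (X_train_toy - mean) / std
X_test_z = (X_test_toy - mean) / std

print(X_train_z.mean(axis=0))  # approximately zero in each column
print(X_test_z)                # test rows are not centred on zero
```

Computing the statistics on the full dataset would leak information about the test rows into training, which is why the scaler is fitted before it ever sees X_test.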


In [None]:
# Define the ANN model
model = Sequential()
model.add(Dense(units=64, activation='relu', input_shape=(X_train_scaled.shape[1],)))
model.add(Dense(units=32, activation='relu'))
model.add(Dense(units=16, activation='relu'))
model.add(Dense(units=1))

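* model = Sequential(): This initializes a sequential model in Keras. A sequential model is appropriate for a plain stack of layers, where each layer has exactly one input and one output.
* Dense(units=64): Adds a fully connected (dense) layer with 64 neurons. The input_shape argument tells the first layer how many features each sample has.
* ReLU is commonly used in hidden layers of neural networks because it introduces non-linearity.
* Dense(units=1): The output layer has a single neuron and no activation function, so it can produce any real number, as a regression target requires.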
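
To get a feel for the size of this network, the number of trainable parameters can be worked out by hand: a dense layer has one weight per input-unit pair plus one bias per unit. The sketch assumes 7 input features (the 8 columns shown in the table minus 'Fare'); `dense_params` is a hypothetical helper.

```python
def dense_params(n_inputs, units):
    """Weights (n_inputs * units) plus one bias per unit."""
    return n_inputs * units + units

n_features = 7                 # assumption: 8 displayed columns minus 'Fare'
layer_units = [64, 32, 16, 1]  # the four Dense layers defined above

total = 0
n_in = n_features
for units in layer_units:
    count = dense_params(n_in, units)
    print(f"Dense({units}): {count} parameters")
    total += count
    n_in = units               # this layer's output feeds the next layer

print(f"Total: {total}")       # Total: 3137
```

If the assumption about the feature count holds, calling model.summary() should report the same total.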


In [None]:
# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')

* Adam stands for Adaptive Moment Estimation, an optimization algorithm that adapts the learning rate of each parameter.
* Mean Squared Error (MSE) is a common loss function for regression tasks. It averages the squared differences between the predicted values and the actual target values.
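
The Adam update rule can be sketched on a one-parameter squared-error problem. This is a minimal illustration of the published update equations, not Keras's internal implementation; `adam_minimize` and its default values are assumptions chosen for the example.

```python
import math

def adam_minimize(grad, w=0.0, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=1000):
    """Minimize a one-parameter function given its gradient, using Adam."""
    m = 0.0  # running mean of gradients (first moment)
    v = 0.0  # running mean of squared gradients (second moment)
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction for the zero start
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

# Squared error for a single target of 3: loss = (w - 3)^2, gradient = 2(w - 3)
w_final = adam_minimize(lambda w: 2.0 * (w - 3.0))
print(w_final)  # close to 3.0
```

Dividing by the square root of the second moment is what makes the step size adaptive: parameters with consistently large gradients take proportionally smaller steps.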


* Setting epochs=100 means that the model will make 100 complete passes over the training data.
* A batch size of 32 means that the model will update its weights after processing every 32 samples.
* validation_split=0.2 means that 20% of the training data will be set aside as validation data, while the remaining 80% will be used for training. Keras takes the validation samples from the end of the arrays passed to fit, before any shuffling.
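
The three settings above combine into a simple count of weight updates. The sketch below uses a hypothetical 1000 training rows, since the exact row count is not stated here, and assumes the split point is the floor of n * (1 - validation_split). `training_schedule` is an illustrative helper, not a Keras function.

```python
import math

def training_schedule(n_samples, epochs=100, batch_size=32,
                      validation_split=0.2):
    """Return (n_train, n_val, steps_per_epoch, total_updates)."""
    n_train = int(n_samples * (1 - validation_split))  # rows used for fitting
    n_val = n_samples - n_train                        # rows held out
    steps_per_epoch = math.ceil(n_train / batch_size)  # last batch may be short
    return n_train, n_val, steps_per_epoch, steps_per_epoch * epochs

print(training_schedule(1000))  # (800, 200, 25, 2500)
```

The steps_per_epoch value is the number shown next to the progress bar for each epoch in the Keras training log.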

In [None]:
# Train the model
model.fit(X_train_scaled, y_train, epochs=100, batch_size=32, validation_split=0.2)

Epoch 1/100
Epoch 2/100
...
Epoch 77/100
Epoch 78

<keras.src.callbacks.History at 0x7af639feba30>

In [None]:
# Evaluate the model
loss = model.evaluate(X_test_scaled, y_test)
print(f"Test Loss: {loss}")

Test Loss: 680.3341064453125


* loss: This variable stores the value returned by model.evaluate(). Because the model was compiled with a loss function and no additional metrics, this is a single number: the MSE on the test set.
* The print statement displays that test loss.

In [None]:
# Make predictions
y_pred = model.predict(X_test_scaled)




In [None]:
# Display predictions
print(y_pred[:5])

[[79.644516 ]
 [54.434372 ]
 [ 7.9626894]
 [68.72749  ]
 [11.083474 ]]
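
The output above shows that model.predict returns a 2-D array with one column, while the targets are 1-D. When both are NumPy arrays, subtracting them directly broadcasts to a square matrix instead of an element-wise difference. A small sketch with toy values (the names `y_true` and `y_hat` are assumptions):

```python
import numpy as np

y_true = np.array([10.0, 20.0, 30.0])       # shape (3,), like the targets
y_hat = np.array([[12.0], [18.0], [33.0]])  # shape (3, 1), like predict output

wrong = y_true - y_hat          # broadcasts to shape (3, 3)
right = y_true - y_hat.ravel()  # flatten first: shape (3,)

print(wrong.shape)     # (3, 3)
print(right.tolist())  # [-2.0, 2.0, -3.0]
```

The scikit-learn metric functions used in the next cell accept the (n, 1) shape, so flattening only matters when computing residuals by hand.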


In [None]:
# Calculate evaluation metrics
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
mse = mean_squared_error(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

In [None]:
# Display evaluation metrics
print(f"Mean Squared Error (MSE): {mse}")
print(f"Mean Absolute Error (MAE): {mae}")
print(f"R-squared (R²): {r2}")

Mean Squared Error (MSE): 680.33414111885
Mean Absolute Error (MAE): 14.313338755961217
R-squared (R²): 0.4179079163390901


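* Mean Squared Error (MSE) measures the average squared difference between the predicted values and the actual values. It is expressed in squared units of the target, and here it matches the test loss reported by model.evaluate() up to floating-point precision.
* Mean Absolute Error (MAE) measures the average absolute difference between the predicted values and the actual values, in the same units as the target.
* R-squared (R²) is a statistical measure that indicates the proportion of the variance in the dependent variable (target) that is predictable from the independent variables (features). An R² of about 0.42 means the model explains roughly 42% of the variance in fare on the test set.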
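
The three metrics follow directly from their definitions. The sketch below computes them with NumPy on toy values; `regression_metrics` is a hypothetical helper meant to mirror the definitions above, not a replacement for the scikit-learn functions.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return (mse, mae, r2) computed from the textbook definitions."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    residuals = y_true - y_pred
    mse = np.mean(residuals ** 2)
    mae = np.mean(np.abs(residuals))
    ss_res = np.sum(residuals ** 2)                 # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation
    r2 = 1.0 - ss_res / ss_tot
    return mse, mae, r2

# Toy values: residuals are 1, 0, -2
mse_toy, mae_toy, r2_toy = regression_metrics([3, 5, 7], [2, 5, 9])
print(mse_toy, mae_toy, r2_toy)

# Root of the test MSE reported above, in the same units as the target
print(np.sqrt(680.33414111885))
```

Taking the square root of the reported MSE gives an RMSE of roughly 26. That is noticeably larger than the MAE of about 14, which suggests that a few large errors dominate the squared loss.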