# Travel Personality Classification Project

## Overview
This project aims to build a machine learning model that can classify users into different travel personality types, such as **Adventure Seeker**, **Cultural Explorer**, **Luxury Traveler**, and **Nature Lover**. The classification is based on user preferences and demographic information, helping us understand and categorize users’ travel inclinations. To demonstrate proficiency in popular machine learning frameworks, the project is implemented using both **TensorFlow** and **PyTorch**.

## Objectives
The primary goals of this project are:
1. To create an accurate travel personality classification model.
2. To showcase implementations using **TensorFlow** and **PyTorch** for comparison.
3. To highlight the strengths of each framework in handling multi-class classification tasks.

## Dataset
The dataset consists of synthetic records representing user demographics and travel preferences, such as:
- **Age**
- **Location** (encoded as categorical variables)
- **Budget Preference**
- **Adventure Score**
- **Luxury Score**
- **Cultural Interest**
- **Nature Lover Score**

The target variable is the **Travel Personality** type, with four distinct classes.
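
Since the dataset file itself is not included, the following is a minimal sketch of how comparable synthetic records could be generated. Everything in it is an assumption made for illustration: the column names (apart from `location` and `travel_personality`, which the notebook code uses), the value ranges, the location categories, and in particular the labeling rule, which simply assigns the personality whose score is highest. The default of 1,000 rows is chosen only because it is consistent with an 80/20 split into 800 training and 200 test rows.

```python
import numpy as np

PERSONALITIES = ["Adventure Seeker", "Cultural Explorer", "Luxury Traveler", "Nature Lover"]

def make_synthetic_travel_data(n_rows=1000, seed=42):
    """Generate hypothetical records; columns, ranges, and labeling rule are assumptions."""
    rng = np.random.default_rng(seed)
    # Four interest scores in [0, 10), one column per personality type.
    scores = rng.uniform(0, 10, size=(n_rows, 4))
    records = {
        "age": rng.integers(18, 71, size=n_rows),  # ages 18..70 inclusive
        "location": rng.choice(["urban", "suburban", "rural"], size=n_rows),
        "budget_preference": rng.uniform(500, 10_000, size=n_rows).round(2),
        "adventure_score": scores[:, 0],
        "cultural_interest": scores[:, 1],
        "luxury_score": scores[:, 2],
        "nature_lover_score": scores[:, 3],
    }
    # Assumed rule: the label is the personality with the highest score.
    records["travel_personality"] = np.array(PERSONALITIES)[scores.argmax(axis=1)]
    return records

data = make_synthetic_travel_data()
print({name: len(values) for name, values in data.items()})
```

A rule this simple makes the classes almost perfectly separable, so very high accuracy on such data says little about performance on real users.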

## Data Preprocessing
The following preprocessing steps are applied:
1. **Label Encoding**: Converting categorical labels (travel personalities) into integer classes.
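2. **Feature Scaling**: Standardizing each feature to zero mean and unit variance, with the scaler fitted on the training split only, so that large-valued features such as budget do not dominate the others.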
3. **One-Hot Encoding** (for TensorFlow): Converting target labels into a one-hot encoded format to support multi-class classification.
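
To make the three steps concrete, here is a small NumPy-only sketch on toy values. It is not the project's pipeline (which uses scikit-learn and Keras utilities); the toy arrays are invented, and the intent is only to show what each transformation does to the data.

```python
import numpy as np

labels = np.array(["Nature Lover", "Adventure Seeker", "Luxury Traveler",
                   "Adventure Seeker", "Cultural Explorer"])

# 1. Label encoding: sorted unique class names map to integers 0..K-1.
classes, y = np.unique(labels, return_inverse=True)

# 2. Feature scaling: statistics come from the training rows only and are
#    then reused unchanged on the test rows.
X_train = np.array([[25.0, 1000.0], [35.0, 3000.0], [45.0, 5000.0]])  # toy age, budget
X_test = np.array([[30.0, 2000.0]])
mean, std = X_train.mean(axis=0), X_train.std(axis=0)
X_train_scaled = (X_train - mean) / std
X_test_scaled = (X_test - mean) / std

# 3. One-hot encoding: row i of the identity matrix represents class i.
y_one_hot = np.eye(len(classes))[y]

print(classes)
print(y)
print(y_one_hot)
```

Reusing the training mean and standard deviation on the test rows is what keeps information about the test set from leaking into preprocessing.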

## Model Architectures

### TensorFlow Model
The TensorFlow model is a feed-forward neural network configured with three hidden layers using ReLU activation, followed by a Softmax output layer. This model is optimized using the Adam optimizer and utilizes Categorical Crossentropy as the loss function, making it suitable for multi-class classification.
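
As a worked illustration of the Softmax plus Categorical Crossentropy pairing, the sketch below computes both by hand in NumPy. It is a simplified stand-in rather than the Keras implementation; the clipping constant `eps` and the example logits are assumptions.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum first for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_one_hot, probs, eps=1e-7):
    # Mean over samples of -sum_k y_k * log(p_k); clipping avoids log(0).
    probs = np.clip(probs, eps, 1.0)
    return float(-(y_one_hot * np.log(probs)).sum(axis=1).mean())

logits = np.array([[2.0, 0.5, 0.1, -1.0],   # fairly confident in class 0
                   [0.0, 0.0, 0.0, 0.0]])   # completely undecided
y_one_hot = np.array([[1, 0, 0, 0],
                      [0, 0, 1, 0]])

probs = softmax(logits)
loss = categorical_crossentropy(y_one_hot, probs)
uniform_loss = categorical_crossentropy(y_one_hot[1:], probs[1:])
print(probs.round(3), loss, uniform_loss)
```

The undecided row yields a loss of ln 4 ≈ 1.386, the chance level for four classes. That is close to where the training log starts (a loss of 1.3138 in epoch 1), as expected for a freshly initialized network.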

### PyTorch Model
The PyTorch model mirrors this architecture as a custom `nn.Module` class with four fully connected layers (three hidden layers plus an output layer). The hidden layers apply ReLU activation. Training uses the Adam optimizer with `CrossEntropyLoss`. Because that loss applies log-softmax internally and expects raw logits, an explicit Softmax on the output layer is redundant during training and is better reserved for converting logits into probabilities at inference time.
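
One detail worth checking in this setup is what a logits-based loss such as `nn.CrossEntropyLoss` sees when it is handed Softmax outputs instead of raw scores. The NumPy sketch below is a simplified re-implementation for illustration, not PyTorch's code, and the example logits are made up. It shows that applying Softmax twice puts a floor under the loss even for a perfectly confident, correct prediction.

```python
import math
import numpy as np

def log_softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def cross_entropy_from_logits(z, targets):
    """Mean negative log-likelihood, computed the way a logits-based loss does."""
    return float(-log_softmax(z)[np.arange(len(targets)), targets].mean())

# A confident, correct prediction for class 0 out of 4 classes.
logits = np.array([[20.0, 0.0, 0.0, 0.0]])
targets = np.array([0])

# Raw logits passed to the loss: close to zero, as it should be.
loss_logits = cross_entropy_from_logits(logits, targets)

# Softmax applied first, then the same loss ("double softmax").
probs = np.exp(log_softmax(logits))
loss_double = cross_entropy_from_logits(probs, targets)

# With inputs confined to [0, 1], the loss cannot drop below ln(1 + 3/e).
floor = math.log(1 + 3 / math.e)
print(loss_logits, loss_double, floor)
```

The model can still learn under the double Softmax, since the highest score remains the predicted class, but the gradients are compressed and the reported loss values are not comparable with those of a logits-based setup.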

## Training and Evaluation
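Both models are trained on an 80/20 train/test split (`test_size=0.2`) for 100 epochs with a batch size of 16, and are scored on the held-out test set using **accuracy** and **loss**. The notebook cells shown here do not run k-fold cross-validation; the TensorFlow run instead passes the test set as `validation_data` to monitor progress during training. Cross-validation on the training portion would give a more reliable estimate of each model’s ability to generalize, while accuracy on a test set that is never consulted during training is the better indicator of real-world performance.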
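
The notebook cells in this document use a single train/test split, so the following is only a sketch of how k-fold cross-validation could be organized around either model. The helper `k_fold_indices` is hypothetical and written in plain NumPy; it is not stratified, and the sample count of 800 is an assumed training-set size.

```python
import numpy as np

def k_fold_indices(n_samples, n_splits=5, seed=42):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), n_splits)
    for i in range(n_splits):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(n_splits) if j != i])
        yield train_idx, val_idx

splits = list(k_fold_indices(n_samples=800))
for fold, (train_idx, val_idx) in enumerate(splits):
    # A freshly initialized model would be trained on train_idx and
    # scored on val_idx here; the per-fold scores are then averaged.
    print(f"fold {fold}: {len(train_idx)} train / {len(val_idx)} validation")
```

Re-creating the model inside the loop matters: reusing weights from a previous fold would let the network see the current validation rows during earlier training.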

## Results and Observations
Both the TensorFlow and PyTorch models reach a test accuracy of 0.98 on the held-out 20% split, which suggests the architecture is adequate for this synthetic dataset. Future enhancements could include:
- **Hyperparameter Tuning**: Experimenting with batch sizes, learning rates, and additional layers.
- **Data Augmentation**: Expanding the dataset to improve model robustness and generalization.
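
A hyperparameter search over the settings mentioned above could start from a simple grid. The sketch below only enumerates candidate configurations; the values are assumptions, apart from a batch size of 16, a learning rate of 0.001, and the 64-32-16 layer layout, which match the current notebook. Training and scoring each configuration is left out.

```python
from itertools import product

# Hypothetical search space; extend or shrink as compute allows.
search_space = {
    "batch_size": [16, 32, 64],
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "hidden_layers": [(64, 32, 16), (128, 64, 32, 16)],
}

def iter_configs(space):
    """Yield one dict per combination of values in the search space."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))

configs = list(iter_configs(search_space))
print(len(configs))
print(configs[0])
```

Each configuration should be compared on validation data rather than on the test set, so that the final test accuracy stays an unbiased estimate.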

## Conclusion
This project demonstrates the process of building and evaluating a multi-class classification model using both TensorFlow and PyTorch. Both implementations reach the same test accuracy with equivalent architectures, showing that either framework handles this kind of neural network classification task well.


Step 1: Load and Preprocess the Data

In [5]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder

# Load dataset
data = pd.read_csv('your_travel_data.csv')

# One-hot encode the 'location' column to convert it to numeric format
data = pd.get_dummies(data, columns=['location'], drop_first=True)

# Separate features (X) and target (y)
X = data.drop(columns=['travel_personality'])
y = data['travel_personality']

# Encode target labels
label_encoder = LabelEncoder()
y = label_encoder.fit_transform(y)  # Convert to integer classes

# Split data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Scale features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)


Step 2: Import TensorFlow and Define the Model

In [9]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.utils import to_categorical

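# Convert y_train and y_test to one-hot format; fixing num_classes keeps both
# arrays the same width even if a class is missing from one split
num_classes = len(label_encoder.classes_)
y_train_categorical = to_categorical(y_train, num_classes=num_classes)
y_test_categorical = to_categorical(y_test, num_classes=num_classes)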

# Define the model architecture using an Input layer
model_tf = Sequential([
    Input(shape=(X_train.shape[1],)),  # Define the input shape here
    Dense(64, activation='relu'),
    Dense(32, activation='relu'),
    Dense(16, activation='relu'),
    Dense(y_train_categorical.shape[1], activation='softmax')
])


Step 3: Compile the TensorFlow Model

In [11]:
model_tf.compile(
    optimizer='adam',
    loss='categorical_crossentropy',  # Use categorical crossentropy for multi-class classification
    metrics=['accuracy']
)


Step 4: Train the TensorFlow Model

In [13]:
# Train the model; the held-out test set doubles as validation data here
history_tf = model_tf.fit(
    X_train, y_train_categorical, epochs=100, batch_size=16,
    validation_data=(X_test, y_test_categorical))


Epoch 1/100
50/50 ━━━━━━━━━━━━━━━━━━━━ 2s 8ms/step - accuracy: 0.3886 - loss: 1.3138 - val_accuracy: 0.5350 - val_loss: 1.0662
Epoch 2/100
50/50 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.5875 - loss: 1.0053 - val_accuracy: 0.6850 - val_loss: 0.7888
Epoch 3/100
50/50 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.8067 - loss: 0.6492 - val_accuracy: 0.7650 - val_loss: 0.5870
Epoch 4/100
50/50 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.8637 - loss: 0.4340 - val_accuracy: 0.7900 - val_loss: 0.4733
Epoch 5/100
50/50 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.8733 - loss: 0.3320 - val_accuracy: 0.8100 - val_loss: 0.4239
Epoch 6/100
50/50 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.9051 - loss: 0.2732 - val_accuracy: 0.8450 - val_loss: 0.3941
Epoch 7/100
...

Step 5: Evaluate the TensorFlow Model

In [15]:
# Evaluate the model on test data
test_loss, test_accuracy = model_tf.evaluate(X_test, y_test_categorical)
print(f"TensorFlow Test Accuracy: {test_accuracy:.2f}")


7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - accuracy: 0.9809 - loss: 0.1122
TensorFlow Test Accuracy: 0.98


In [19]:
pip install torch

Defaulting to user installation because normal site-packages is not writeable
Collecting torch
  Downloading torch-2.5.1-cp312-cp312-win_amd64.whl.metadata (28 kB)
Collecting sympy==1.13.1 (from torch)
  Downloading sympy-1.13.1-py3-none-any.whl.metadata (12 kB)
Downloading torch-2.5.1-cp312-cp312-win_amd64.whl (203.0 MB)

Step 2: Import PyTorch and Define the Model

In [27]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader

# Convert data to tensors
X_train_tensor = torch.tensor(X_train, dtype=torch.float32)
y_train_tensor = torch.tensor(y_train, dtype=torch.long)
X_test_tensor = torch.tensor(X_test, dtype=torch.float32)
y_test_tensor = torch.tensor(y_test, dtype=torch.long)

# Define DataLoader for training
train_data = TensorDataset(X_train_tensor, y_train_tensor)
train_loader = DataLoader(train_data, batch_size=16, shuffle=True)

# Define the model architecture in PyTorch
class TravelPersonalityNN(nn.Module):
    def __init__(self, input_size, num_classes):
        super().__init__()
        self.fc1 = nn.Linear(input_size, 64)
        self.fc2 = nn.Linear(64, 32)
        self.fc3 = nn.Linear(32, 16)
        self.fc4 = nn.Linear(16, num_classes)

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = torch.relu(self.fc3(x))
        # Return raw logits: nn.CrossEntropyLoss applies log-softmax internally,
        # so an explicit softmax here would be applied twice during training.
        return self.fc4(x)

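# Instantiate the model; take the class count from the encoder so it stays
# correct even if a class is absent from the training split
model_pt = TravelPersonalityNN(X_train.shape[1], len(label_encoder.classes_))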


Step 3: Define the Loss Function and Optimizer in PyTorch

In [29]:
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model_pt.parameters(), lr=0.001)


Step 4: Train the PyTorch Model

In [31]:
# Training loop
num_epochs = 100
for epoch in range(num_epochs):
    for inputs, labels in train_loader:
        optimizer.zero_grad()  # Reset gradients
        outputs = model_pt(inputs)  # Forward pass
        loss = criterion(outputs, labels)  # Compute loss
        loss.backward()  # Backpropagate
        optimizer.step()  # Update weights


Step 5: Evaluate the PyTorch Model

In [32]:
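# Evaluate on test set
model_pt.eval()  # Inference mode (matters once dropout or batch norm is added)
with torch.no_grad():
    outputs = model_pt(X_test_tensor)
    predicted = outputs.argmax(dim=1)  # Predicted class = index of the highest score
    test_accuracy = (predicted == y_test_tensor).float().mean().item()
print(f"PyTorch Test Accuracy: {test_accuracy:.2f}")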


PyTorch Test Accuracy: 0.98
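
A single accuracy figure does not show which personality types are confused with one another. As a possible follow-up, the sketch below builds a confusion matrix and per-class recall in NumPy. The `y_true` and `y_pred` arrays are made-up placeholders, not results from either model; in the notebook they would correspond to the integer test labels and the predicted class indices.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(matrix, (y_true, y_pred), 1)  # count each (true, predicted) pair
    return matrix

# Placeholder labels for four classes (illustrative only).
y_true = np.array([0, 0, 1, 1, 2, 2, 3, 3])
y_pred = np.array([0, 0, 1, 2, 2, 2, 3, 0])

cm = confusion_matrix(y_true, y_pred, num_classes=4)
accuracy = np.trace(cm) / cm.sum()
recall_per_class = cm.diagonal() / cm.sum(axis=1)

print(cm)
print(f"accuracy: {accuracy:.2f}")
print("recall per class:", recall_per_class)
```

The recall division assumes every class appears at least once in `y_true`; a class with no true samples would need special handling.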
