# TTML Basic Usage Example

This notebook demonstrates the basic usage of the Tabular Transformer (TTML) model using the Titanic dataset. We'll cover:

1. Loading and preprocessing data
2. Configuring and initializing the model
3. Training the model
4. Making predictions
5. Basic evaluation

In [None]:
import sys
import os
import numpy as np
import pandas as pd
import torch
from sklearn.metrics import accuracy_score, classification_report

# Import TTML modules
from tabular_transformer.models import TransformerEncoder
from tabular_transformer.models.task_heads import ClassificationHead
from tabular_transformer.training import Trainer
from tabular_transformer.inference import predict

# Import data utilities
from data_utils import download_titanic_dataset, prepare_dataset

## 1. Load and Preprocess Data

First, we'll download the Titanic dataset and prepare it for training.

In [None]:
# Download Titanic dataset
df = download_titanic_dataset(save_csv=False)
print("Dataset shape:", df.shape)
print("\nFeature types:")
print(df.dtypes)

In [None]:
# Prepare dataset for training
target_column = 'survived'
data = prepare_dataset(
    df=df,
    target_column=target_column,
    test_size=0.2,
    random_state=42
)

X_train = data['X_train']
X_test = data['X_test']
y_train = data['y_train']
y_test = data['y_test']
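
The `prepare_dataset` helper comes from the local `data_utils` module, and its implementation is not shown in this notebook. As a rough sketch of what such a helper might do, the hypothetical `prepare_dataset_sketch` below imputes missing numeric values, one-hot encodes categorical columns, and makes a stratified train/test split. The function name, the imputation strategy, and the toy data are assumptions for illustration, not the real helper.

```python
import pandas as pd
from sklearn.model_selection import train_test_split


def prepare_dataset_sketch(df, target_column, test_size=0.2, random_state=42):
    """Hypothetical stand-in for data_utils.prepare_dataset (not the real helper)."""
    y = df[target_column]
    X = df.drop(columns=[target_column])

    # Fill missing numeric values with the column median.
    numeric_cols = X.select_dtypes(include="number").columns
    X[numeric_cols] = X[numeric_cols].fillna(X[numeric_cols].median())

    # One-hot encode the categorical columns; missing values get their own column.
    X = pd.get_dummies(X, dummy_na=True, dtype=float)

    # Stratify so both splits keep the same class balance.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state, stratify=y
    )
    return {"X_train": X_train, "X_test": X_test, "y_train": y_train, "y_test": y_test}


# Toy data shaped loosely like the Titanic table (values are made up).
toy = pd.DataFrame({
    "survived": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
    "age": [22.0, 38.0, None, 35.0, 54.0, 2.0, 27.0, None, 14.0, 4.0],
    "sex": ["male", "female", "female", "female", "male",
            "male", "female", "female", "female", "female"],
})
sketch = prepare_dataset_sketch(toy, "survived")
print(sketch["X_train"].shape, sketch["X_test"].shape)
```

A stricter pipeline would compute the imputation statistics on the training split only, so that no information from the test rows reaches the model.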

## 2. Configure and Initialize Model

Now we'll set up the TTML model with a classification head for the survival prediction task.

In [None]:
# Model configuration
input_dim = X_train.shape[1]  # Number of features
num_classes = len(np.unique(y_train))
d_model = 64  # Embedding width shared by the encoder and the task head

# Initialize transformer encoder
encoder = TransformerEncoder(
    input_dim=input_dim,
    d_model=d_model,
    nhead=4,
    num_layers=2,
    dim_feedforward=128,
    dropout=0.1
)

# Initialize classification head
task_head = ClassificationHead(
    input_dim=d_model,  # Must match the encoder's d_model
    num_classes=num_classes
)

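# Convert data to PyTorch tensors with explicit dtypes
# (float32 features, int64 class labels)
X_train_tensor = torch.tensor(X_train.to_numpy(dtype="float32"))
y_train_tensor = torch.tensor(y_train.to_numpy(dtype="int64"))
X_test_tensor = torch.tensor(X_test.to_numpy(dtype="float32"))
y_test_tensor = torch.tensor(y_test.to_numpy(dtype="int64"))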
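
The classification head's `input_dim` has to equal the encoder's `d_model`, because the head consumes the encoder's output vectors. The sketch below traces the tensor shapes through a comparable stack built from plain `torch.nn` modules. It is not TTML's actual architecture: treating each feature as one token, the `nn.Linear(1, d_model)` embedding, and the mean pooling are all assumptions made for illustration.

```python
import torch
from torch import nn

d_model, nhead, num_features, num_classes = 64, 4, 7, 2

# Assumption: each scalar feature becomes one token of width d_model.
embed = nn.Linear(1, d_model)
layer = nn.TransformerEncoderLayer(
    d_model=d_model, nhead=nhead, dim_feedforward=128, dropout=0.1, batch_first=True
)
encoder_sketch = nn.TransformerEncoder(layer, num_layers=2)
head_sketch = nn.Linear(d_model, num_classes)  # input width must equal d_model

x = torch.randn(5, num_features)             # (batch, features)
tokens = embed(x.unsqueeze(-1))              # (batch, features, d_model)
pooled = encoder_sketch(tokens).mean(dim=1)  # (batch, d_model)
logits = head_sketch(pooled)                 # (batch, num_classes)
print(logits.shape)
```

If the head were built with a different input width, the last line would fail with a shape mismatch, which is the error a wrong `input_dim` would produce.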

## 3. Train the Model

We'll use the TTML Trainer to train our model.

In [None]:
# Initialize trainer
trainer = Trainer(
    encoder=encoder,
    task_head=task_head,
    learning_rate=0.001,
    batch_size=32,
    num_epochs=10
)

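# Train the model
# Note: for brevity the test split doubles as the validation set here, so the
# validation metrics are not an unbiased estimate. For model selection, hold
# out a separate validation split from the training data.
history = trainer.fit(
    X_train=X_train_tensor,
    y_train=y_train_tensor,
    X_val=X_test_tensor,
    y_val=y_test_tensor
)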
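
The internals of `Trainer.fit` are not shown in this notebook. For orientation, a mini-batch training loop using the same hyperparameters (learning rate 0.001, batch size 32, 10 epochs) might look like the sketch below. The Adam optimizer, the cross-entropy loss, the small MLP, and the synthetic data are assumptions; the real `Trainer` may differ on every one of them.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in data: 64 rows, 7 features, binary labels.
X = torch.randn(64, 7)
y = (X[:, 0] > 0).long()

# Hold out a validation split that is separate from any test set.
X_tr, y_tr, X_val, y_val = X[:48], y[:48], X[48:], y[48:]

# Any nn.Module works here; a small MLP keeps the sketch short.
model = nn.Sequential(nn.Linear(7, 16), nn.ReLU(), nn.Linear(16, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
loss_fn = nn.CrossEntropyLoss()
loader = DataLoader(TensorDataset(X_tr, y_tr), batch_size=32, shuffle=True)

num_epochs = 10
history_sketch = {"train_loss": [], "val_loss": []}
for epoch in range(num_epochs):
    model.train()
    running = 0.0
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()
        running += loss.item() * len(xb)  # weight by batch size
    history_sketch["train_loss"].append(running / len(X_tr))

    # Evaluate on the validation split without tracking gradients.
    model.eval()
    with torch.no_grad():
        history_sketch["val_loss"].append(loss_fn(model(X_val), y_val).item())

print(history_sketch["train_loss"][-1], history_sketch["val_loss"][-1])
```

Recording one training and one validation loss per epoch makes it easy to plot both curves and spot overfitting.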

## 4. Make Predictions

Let's use our trained model to make predictions on the test set.

In [None]:
# Make predictions
predictions = predict.predict(
    encoder=encoder,
    task_head=task_head,
    X=X_test_tensor
)

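# Convert predictions to numpy for evaluation.
# detach() and cpu() keep this working if the tensor tracks gradients or
# lives on a GPU. This assumes predict() returns class indices.
y_pred = predictions.detach().cpu().numpy()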
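
The evaluation step expects one class index per row. If a prediction function returns raw logits or class probabilities instead (whether TTML's `predict` does is not stated here), they need to be reduced to labels first. A minimal sketch with made-up logits:

```python
import torch

# Hypothetical raw outputs: one row of logits per passenger, one column per class.
logits = torch.tensor([[2.0, -1.0], [0.1, 0.3], [-0.5, -0.2]])

# argmax over the class dimension gives the predicted class index per row.
labels = logits.argmax(dim=1)

# detach() and cpu() make the conversion safe for tensors that track
# gradients or live on a GPU; both are no-ops otherwise.
y_pred_sketch = labels.detach().cpu().numpy()
print(y_pred_sketch)  # [0 1 1]
```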

## 5. Evaluate Results

Finally, let's evaluate our model's performance.

In [None]:
# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.4f}")

# Display detailed classification report
print("\nClassification Report:")
print(classification_report(y_test, y_pred))
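
Accuracy alone is hard to judge on an imbalanced target such as survival. As a sketch, the toy example below compares accuracy against a majority-class baseline and prints the confusion matrix. The label arrays are made up for illustration and are not results from this notebook.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Made-up labels for illustration only.
y_true_toy = np.array([0, 0, 0, 1, 1, 0, 1, 0])
y_pred_toy = np.array([0, 0, 1, 1, 0, 0, 1, 0])

acc = accuracy_score(y_true_toy, y_pred_toy)

# Baseline: always predict the most frequent class in y_true_toy.
baseline = np.bincount(y_true_toy).max() / len(y_true_toy)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true_toy, y_pred_toy)

print(f"Accuracy: {acc:.4f}")                # Accuracy: 0.7500
print(f"Majority baseline: {baseline:.4f}")  # Majority baseline: 0.6250
print(cm)
```

A model is only useful if its accuracy clearly exceeds the baseline; the confusion matrix then shows which class accounts for the remaining errors.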

## Conclusion

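This notebook demonstrated the basic usage of the TTML model for a binary classification task on the Titanic dataset: preparing the data, configuring the encoder and classification head, training, predicting, and evaluating. The accuracy and classification report printed in Section 5 show how well the model predicts survival on the held-out test set.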

For more advanced usage and different tasks, check out the other example notebooks:
- classification_examples.ipynb
- regression_examples.ipynb
- clustering_examples.ipynb
- survival_analysis.ipynb
- multi_task_examples.ipynb