# FT-Transformer Training for Airbnb Regression

This notebook demonstrates how to train an FT-Transformer model for Airbnb price prediction using the modular training functions in `improved_ft_transformer_training`.

## Overview

The FT-Transformer (Feature Tokenizer + Transformer) is a Transformer-based architecture for tabular data that:
- Converts features into embeddings using feature tokenization
- Uses multi-head attention to capture feature interactions
- Applies layer normalization and residual connections
- Is often competitive with gradient-boosted trees on tabular regression tasks
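
The feature tokenization step can be made concrete with a small sketch. For a numeric feature `x[j]`, the tokenizer described in the FT-Transformer paper produces the embedding `x[j] * W[j] + b[j]`, where `W[j]` and `b[j]` are learned vectors of the token dimension. The pure-Python version below avoids PyTorch so the arithmetic stays visible. The function name `tokenize_numeric_features` and the random weights are illustrative only and are not part of `improved_ft_transformer_training`.

```python
import random


def tokenize_numeric_features(x, weights, biases):
    """Map each scalar feature x[j] to a token: x[j] * weights[j] + biases[j]."""
    return [
        [x_j * w + b for w, b in zip(w_j, b_j)]
        for x_j, w_j, b_j in zip(x, weights, biases)
    ]


# Hypothetical sizes and randomly initialised parameters (learned in a real model).
rng = random.Random(0)
n_features, d_token = 3, 4
weights = [[rng.uniform(-1, 1) for _ in range(d_token)] for _ in range(n_features)]
biases = [[rng.uniform(-1, 1) for _ in range(d_token)] for _ in range(n_features)]

# One scaled input row with three numeric features.
tokens = tokenize_numeric_features([0.5, -1.2, 2.0], weights, biases)
print(len(tokens), len(tokens[0]))  # one token per feature, each of length d_token
```

The resulting sequence of per-feature tokens is what the attention layers operate on, which is how the model can relate one feature to another.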

## Dataset
- **Source**: Dgomonov's New York City Airbnb Open Data
- **Task**: Regression (predicting Airbnb listing prices)
- **Features**: Various features including location, property type, reviews, etc.
- **Target**: Price per night in USD

## 1. Import Required Libraries and Functions

In [None]:
# Import torch explicitly rather than relying on the wildcard import to expose it
import torch

# Import all training functions
from improved_ft_transformer_training import *

# Set device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

print("🏠 FT-Transformer Training for Airbnb Regression")
print("Dataset: Airbnb NYC Dataset")

## 2. Load Preprocessed Data

Load the preprocessed Airbnb data from Section 1.

In [None]:
# Load preprocessed data
data_path = './Section1_Data_PreProcessing/airbnb_preprocessed_data.pkl'
(X_train_scaled, X_val_scaled, X_test_scaled,
 y_train, y_val, y_test, feature_names, data_summary) = load_and_analyze_data(data_path)

print(f"\n📊 Data Summary:")
print(f"   Training samples: {len(X_train_scaled):,}")
print(f"   Validation samples: {len(X_val_scaled):,}")
print(f"   Test samples: {len(X_test_scaled):,}")
print(f"   Features: {len(feature_names)}")
print(f"   Target range: [{y_train.min():.2f}, {y_train.max():.2f}]")

## 3. Apply Target Scaling

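Scale the target with a RobustScaler, which centers on the median and divides by the interquartile range. This limits the influence of extreme prices on the loss and makes training more stable.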

In [None]:
# Apply target scaling
y_train_scaled, y_val_scaled, y_test_scaled, target_scaler = apply_target_scaling(
    y_train, y_val, y_test)

print(f"\n✅ Target scaling completed!")
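
For intuition, robust scaling of a target can be sketched with the standard library alone. The internals of `apply_target_scaling` are not shown in this notebook, so the helpers below (`fit_robust_scaler`, `scale`, `unscale`) and the prices are hypothetical. The sketch assumes the usual practice of fitting the statistics on the training targets only and reusing them for validation and test.

```python
import statistics


def fit_robust_scaler(values):
    """Return (center, iqr): the median and interquartile range of values."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return statistics.median(values), q3 - q1


def scale(values, center, iqr):
    """Apply (v - median) / IQR to every value."""
    return [(v - center) / iqr for v in values]


def unscale(values, center, iqr):
    """Invert scale() to recover values in the original units."""
    return [v * iqr + center for v in values]


# Made-up nightly prices in USD with one extreme outlier.
train_prices = [50.0, 80.0, 100.0, 120.0, 150.0, 10000.0]
center, iqr = fit_robust_scaler(train_prices)

scaled = scale(train_prices, center, iqr)
restored = unscale(scaled, center, iqr)
print(center, iqr)
```

Because the median and interquartile range barely move when a single listing costs $10,000, typical prices stay in a compact range after scaling, whereas mean and standard deviation scaling would compress them toward zero.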

## 4. Prepare Data for Training

Convert data to PyTorch tensors and create data loaders.

In [None]:
# Prepare data for training
batch_size = 128

(train_loader, val_loader, test_loader, feature_info,
 X_train_tensor, X_val_tensor, X_test_tensor,
 y_train_tensor, y_val_tensor, y_test_tensor) = prepare_improved_data(
    X_train_scaled, X_val_scaled, X_test_scaled, 
    y_train_scaled, y_val_scaled, y_test_scaled, 
    feature_names, device, batch_size)

print(f"\n✅ Data preparation completed!")

## 5. Create FT-Transformer Model

Create the improved FT-Transformer model for regression.

In [None]:
# Create FT-Transformer model
model, total_params = create_improved_ft_transformer(feature_info, device)

print(f"\n🤖 Model created with {total_params:,} parameters")

## 6. Set Up Training Components

Set up the loss function, optimizer, and learning-rate scheduler.

In [None]:
# Set up training components
learning_rate = 5e-4
weight_decay = 1e-4

criterion, optimizer, scheduler, training_config = setup_improved_training(
    model, learning_rate, weight_decay)

print(f"\n🔧 Training setup completed!")
print(f"   Learning rate: {learning_rate}")
print(f"   Weight decay: {weight_decay}")
print(f"   Max epochs: {training_config['n_epochs']}")
print(f"   Early stopping patience: {training_config['patience']}")
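
The Key Features list at the end of this notebook mentions learning rate warmup, but the scheduler returned by `setup_improved_training` is not shown here. As an illustration of one common shape, the sketch below uses linear warmup followed by cosine decay. The function `lr_at_step`, `warmup_steps`, and `total_steps` are assumptions for the example, not values taken from the training module.

```python
import math


def lr_at_step(step, base_lr=5e-4, warmup_steps=100, total_steps=1000):
    """Linear warmup to base_lr, then cosine decay to zero (illustrative schedule)."""
    if step < warmup_steps:
        # Ramp up linearly so early updates with noisy gradients stay small.
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * min(1.0, progress)))


for step in (0, 50, 99, 100, 550, 1000):
    print(step, f"{lr_at_step(step):.2e}")
```

Warmup is typically used with Transformers because large early steps can destabilize the attention layers before the optimizer statistics have settled.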

## 7. Train the Model

Train the FT-Transformer model with the improved training loop and early stopping.

In [None]:
# Train the model
model, history, best_epoch, training_time = train_improved_ft_transformer(
    model, train_loader, val_loader, criterion, optimizer, scheduler, 
    training_config, device, target_scaler)

print(f"\n🏁 Training completed in {training_time:.2f} seconds")
print(f"   Best epoch: {best_epoch + 1}")
print(f"   Best validation R²: {history['val_r2'][best_epoch]:.4f}")
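
Early stopping with a patience counter can be sketched as follows. This is a generic version, not the logic inside `train_improved_ft_transformer`, which is not shown here. The `EarlyStopping` class and the validation losses are made up for illustration. The zero-based `best_epoch` mirrors the `best_epoch + 1` convention used in the print statements above.

```python
class EarlyStopping:
    """Track the best validation loss and signal when training should stop."""

    def __init__(self, patience, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.best_epoch = -1
        self.bad_epochs = 0

    def update(self, epoch, val_loss):
        """Record one epoch; return True once patience is exhausted."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.bad_epochs = 0  # a real loop would also save a checkpoint here
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation losses: improvement stalls after the third epoch.
val_losses = [1.00, 0.80, 0.70, 0.75, 0.72, 0.71, 0.90]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.update(epoch, loss):
        stopped_at = epoch
        break

print(f"Best epoch: {stopper.best_epoch + 1}, stopped after epoch {stopped_at + 1}")
```

With a patience of 3, training halts three epochs after the last improvement, and the weights from the best epoch are the ones worth keeping.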

## 8. Evaluate the Model

Evaluate the trained model on the test set, using the fitted target scaler to convert predictions back to the original price scale before computing metrics.

In [None]:
# Evaluate the model
predictions, metrics, y_test_unscaled = evaluate_improved_model(
    model, X_test_tensor, y_test_tensor, device, target_scaler)

print(f"\n📊 Test Set Performance:")
print(f"   R² Score: {metrics['r2_score']:.4f}")
print(f"   RMSE: {metrics['rmse']:.4f}")
print(f"   MAE: {metrics['mae']:.4f}")
print(f"   MAPE: {metrics['mape']:.2f}%")
print(f"   Explained Variance: {metrics['explained_variance']:.4f}")
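
The metrics reported above have simple definitions, sketched below in plain Python. The dictionary keys mirror those used in this notebook, but `regression_metrics` itself is a hypothetical helper. The actual computation inside `evaluate_improved_model` is not shown and may differ, for example in how it handles zero prices for MAPE.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute common regression metrics on unscaled targets and predictions."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)              # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mean_res = sum(residuals) / n
    var_res = sum((r - mean_res) ** 2 for r in residuals) / n
    return {
        "r2_score": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
        # MAPE is undefined when a true value is zero; this sketch assumes none are.
        "mape": 100.0 * sum(abs(r / t) for r, t in zip(residuals, y_true)) / n,
        "explained_variance": 1.0 - var_res / (ss_tot / n),
    }


# Made-up prices in USD.
example = regression_metrics([100.0, 200.0, 300.0], [110.0, 190.0, 310.0])
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

R² and explained variance differ only when the residuals have a nonzero mean, so a gap between the two indicates systematic over- or under-prediction.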

## 9. Create Visualizations

Create training and evaluation plots.

In [None]:
# Create improved plots
create_improved_plots(history, best_epoch, predictions, y_test_unscaled, metrics, './Section2_Model_Training')

print("\n📈 Training and evaluation plots created!")

## 10. Save Results

Save the model checkpoint, training history, evaluation metrics, and predictions.

In [None]:
# Save results
save_improved_results(model, history, metrics, predictions, y_test_unscaled, 
                     feature_names, training_time, total_params, target_scaler, './Section2_Model_Training')

print("\n💾 All results saved successfully!")

## 11. Alternative: Run Complete Pipeline

Alternatively, you can run the complete pipeline with a single function call.

In [None]:
# Run complete pipeline (alternative approach)
# Uncomment the following lines to run the complete pipeline in one go:

# model, history, metrics, predictions, feature_names, target_scaler = run_improved_ft_transformer_training(
#     data_path='./Section1_Data_PreProcessing/airbnb_preprocessed_data.pkl',
#     device=device,
#     batch_size=128,
#     learning_rate=5e-4,
#     weight_decay=1e-4,
#     save_dir='./Section2_Model_Training'
# )

print("\n🚀 Complete pipeline function available for one-step execution!")

## 12. Model Analysis and Insights

Analyze the trained model performance and provide insights.

In [None]:
# Model performance analysis
print("\n" + "="*60)
print("FT-TRANSFORMER PERFORMANCE ANALYSIS")
print("="*60)

print(f"\n🎯 Model Performance:")
# Named test_r2 so it cannot shadow sklearn's r2_score if the wildcard import exposes it
test_r2 = metrics['r2_score']
if test_r2 > 0.9:
    performance_level = "Excellent"
elif test_r2 > 0.8:
    performance_level = "Good"
elif test_r2 > 0.7:
    performance_level = "Moderate"
else:
    performance_level = "Needs Improvement"

print(f"   Performance Level: {performance_level} (R² = {test_r2:.4f})")
print(f"   RMSE: ${metrics['rmse']:.2f}")
print(f"   MAE: ${metrics['mae']:.2f}")
print(f"   MAPE: {metrics['mape']:.2f}%")

print(f"\n📊 Model Characteristics:")
print(f"   Total Parameters: {total_params:,}")
print(f"   Training Time: {training_time:.2f} seconds")
print(f"   Best Epoch: {best_epoch + 1}")

print(f"\n💡 Business Insights:")
avg_actual = y_test_unscaled.mean()
avg_error = metrics['mae']
error_percentage = (avg_error / avg_actual) * 100

print(f"   Average Airbnb price: ${avg_actual:.2f}")
print(f"   Average prediction error: ${avg_error:.2f} ({error_percentage:.1f}%)")

if error_percentage < 10:
    print(f"   ✅ Excellent accuracy for pricing decisions")
elif error_percentage < 20:
    print(f"   ✅ Good accuracy for market analysis")
else:
    print(f"   ⚠️ Consider model improvements for better accuracy")

print(f"\n📁 Generated Files:")
print(f"   - Training History: ./Section2_Model_Training/improved_ft_transformer_training_history.csv")
print(f"   - Evaluation Metrics: ./Section2_Model_Training/improved_ft_transformer_evaluation_metrics.csv")
print(f"   - Predictions: ./Section2_Model_Training/improved_ft_transformer_predictions.csv")
print(f"   - Model Checkpoint: ./Section2_Model_Training/improved_ft_transformer_model.pth")
print(f"   - Training Plots: ./Section2_Model_Training/Improved_FT_Transformer_results.png")

print(f"\n🚀 FT-Transformer training completed successfully!")
print(f"   Model ready for deployment and comparison with other models!")

## Summary

This notebook demonstrated how to:

1. **Load preprocessed data** from the Airbnb dataset
2. **Apply target scaling** for improved training stability
3. **Prepare data** for FT-Transformer training with PyTorch tensors
4. **Create an improved FT-Transformer model** specifically for regression
5. **Train the model** with an enhanced training loop and early stopping
6. **Evaluate performance** using comprehensive regression metrics
7. **Generate visualizations** for training progress and model performance
8. **Save all results** for future analysis and comparison

The improved FT-Transformer uses attention to capture interactions between features, which makes it a strong candidate for Airbnb price prediction. The test-set metrics reported in Section 8 show how well it performs on this dataset.

### Key Features:
- **Improved architecture**: Enhanced with regularization and better hyperparameters
- **Target scaling**: RobustScaler for better handling of outliers
- **Enhanced training**: Gradient clipping, learning rate warmup, and better monitoring
- **Comprehensive evaluation**: Multiple regression metrics and visualizations
- **Reproducible results**: Fixed random seeds and saved model checkpoints
- **GPU support**: Automatic device detection and memory management
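
Gradient clipping, listed above, is usually done by global norm: if the joint L2 norm of all gradients exceeds a threshold, every gradient is rescaled by the same factor. The sketch below shows the arithmetic in plain Python. `clip_by_global_norm` is a hypothetical helper, and whether the training module clips this way, or with which threshold, is an assumption. In PyTorch the same idea is provided by `torch.nn.utils.clip_grad_norm_`.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale gradient vectors so their joint L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    if total_norm <= max_norm:
        return [list(vec) for vec in grads], total_norm
    factor = max_norm / total_norm
    return [[g * factor for g in vec] for vec in grads], total_norm


# Two made-up parameter gradients whose joint norm is 5.
clipped, norm_before = clip_by_global_norm([[3.0], [4.0]], max_norm=1.0)
print(norm_before, clipped)
```

Scaling all gradients by one factor preserves the update direction while bounding its size, which helps when an outlier batch produces an unusually large loss.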

### Next Steps:
- Compare with other models (XGBoost, SAINT, etc.)
- Perform hyperparameter tuning
- Analyze feature importance and model interpretability
- Deploy the model for real-time price predictions