# PyTorch Neural Networks Basics: Australian Tourism Sentiment Analysis 🇦🇺

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vuhung16au/pytorch-mastery/blob/main/examples/pytorch-nlp/neural-network-basic.ipynb)
[![View on GitHub](https://img.shields.io/badge/View_on-GitHub-blue?logo=github)](https://github.com/vuhung16au/pytorch-mastery/blob/main/examples/pytorch-nlp/neural-network-basic.ipynb)

An introduction to PyTorch neural networks through sentiment analysis of Australian tourism reviews in two languages (English and Vietnamese). The notebook demonstrates basic neural network concepts on a small dataset of 60 reviews defined inline.

## Learning Objectives
- Build and train basic neural networks with PyTorch `nn.Module`
- Implement sentiment classification for Australian tourism reviews
- Write an explicit PyTorch training loop and contrast it with TensorFlow's `model.fit()`
- Handle multilingual text data (English + Vietnamese)
- Use TensorBoard for comprehensive training monitoring
- Apply Australian context to NLP tasks (Sydney, Melbourne, Brisbane)

## What We'll Build
- **Binary Sentiment Classifier**: Positive/Negative Australian tourism reviews
- **Multi-class City Classifier**: Sydney/Melbourne/Brisbane/Perth location detection
- **Multilingual Support**: Process English and Vietnamese tourism content
- **Complete Training Pipeline**: Data loading, training, validation, visualization

---

## 1. Environment Setup and Runtime Detection

Following PyTorch mastery repository standards for cross-platform compatibility:

In [None]:
# Environment Detection and Setup
import sys
import subprocess
import os
import time

# Detect the runtime environment
IS_COLAB = "google.colab" in sys.modules
IS_KAGGLE = "kaggle_secrets" in sys.modules or "kaggle" in os.environ.get('KAGGLE_URL_BASE', '')
IS_LOCAL = not (IS_COLAB or IS_KAGGLE)

print(f"Environment detected:")
print(f"  - Local: {IS_LOCAL}")
print(f"  - Google Colab: {IS_COLAB}")
print(f"  - Kaggle: {IS_KAGGLE}")

# Platform-specific system setup
if IS_COLAB:
    print("\nSetting up Google Colab environment...")
    !apt update -qq
    !apt install -y -qq software-properties-common
elif IS_KAGGLE:
    print("\nSetting up Kaggle environment...")
    # Kaggle usually has most packages pre-installed
else:
    print("\nSetting up local environment...")

In [None]:
# Install required packages for this notebook
required_packages = [
    "torch",
    "torchvision", 
    "torchaudio",
    "pandas",
    "seaborn",
    "matplotlib",
    "numpy",
    "scikit-learn",
    "tensorboard"
]

print("Installing required packages...")
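for package in required_packages:
    if IS_COLAB or IS_KAGGLE:
        !pip install -q {package}
        print(f"✓ {package}")
    else:
        result = subprocess.run([sys.executable, "-m", "pip", "install", "-q", package],
                                capture_output=True, text=True)
        # Report pip failures instead of printing a check mark unconditionally
        if result.returncode == 0:
            print(f"✓ {package}")
        else:
            print(f"✗ {package} (pip exit code {result.returncode})")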

print("\n✅ Package installation completed!")

## 2. Import Libraries and Device Detection

Following repository guidelines for consistent imports and device handling:

In [None]:
# Core imports following repository standards
import torch
import torch.nn as nn
import torch.nn.functional as F  # Standard alias for functional operations
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader, TensorDataset
from torch.utils.tensorboard import SummaryWriter

# Data science and visualization
import numpy as np
import pandas as pd
import seaborn as sns  # Primary visualization library for notebooks
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import classification_report, confusion_matrix

# Text processing
import re
import string
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple, Optional, Any
import platform
import datetime

# Set seaborn style for better notebook aesthetics
sns.set_style("whitegrid")
plt.rcParams['figure.figsize'] = (10, 6)

# Verify PyTorch installation
print(f"✅ PyTorch {torch.__version__} ready!")
print(f"✅ All libraries imported successfully!")

In [None]:
# Device Detection with comprehensive hardware support
def detect_device():
    """
    Detect the best available PyTorch device with comprehensive hardware support.
    
    Priority order:
    1. CUDA (NVIDIA GPUs) - Best performance for deep learning
    2. MPS (Apple Silicon) - Optimized for M1/M2/M3 Macs  
    3. CPU (Universal) - Always available fallback
    
    Returns:
        torch.device: The optimal device for PyTorch operations
        str: Human-readable device description for logging
    """
    # Check for CUDA (NVIDIA GPU)
    if torch.cuda.is_available():
        device = torch.device("cuda")
        gpu_name = torch.cuda.get_device_name(0)
        device_info = f"CUDA GPU: {gpu_name}"
        
        # Additional CUDA info for optimization
        cuda_version = torch.version.cuda
        gpu_memory = torch.cuda.get_device_properties(0).total_memory / 1024**3
        
        print(f"🚀 Using CUDA acceleration")
        print(f"   GPU: {gpu_name}")
        print(f"   CUDA Version: {cuda_version}")
        print(f"   GPU Memory: {gpu_memory:.1f} GB")
        
        return device, device_info
    
    # Check for MPS (Apple Silicon)
    elif hasattr(torch.backends, 'mps') and torch.backends.mps.is_available():
        device = torch.device("mps")
        device_info = "Apple Silicon MPS"
        
        # Get system info for Apple Silicon
        system_info = platform.uname()
        
        print(f"🍎 Using Apple Silicon MPS acceleration")
        print(f"   System: {system_info.system} {system_info.release}")
        print(f"   Machine: {system_info.machine}")
        print(f"   Processor: {system_info.processor}")
        
        return device, device_info
    
    # Fallback to CPU
    else:
        device = torch.device("cpu")
        device_info = "CPU (No GPU acceleration available)"
        
        # Get CPU info for optimization guidance
        cpu_count = torch.get_num_threads()
        system_info = platform.uname()
        
        print(f"💻 Using CPU (no GPU acceleration detected)")
        print(f"   Processor: {system_info.processor}")
        print(f"   PyTorch Threads: {cpu_count}")
        print(f"   System: {system_info.system} {system_info.release}")
        
        # Provide optimization suggestions for CPU-only setups
        print(f"\n💡 CPU Optimization Tips:")
        print(f"   • Reduce batch size to prevent memory issues")
        print(f"   • Consider using smaller models for faster training")
        print(f"   • Tune CPU parallelism with torch.set_num_threads(n) (currently {cpu_count} threads)")
        
        return device, device_info

# Usage in notebook
device, device_info = detect_device()
print(f"\n✅ PyTorch device selected: {device}")
print(f"📊 Device info: {device_info}")

# Set global device for the notebook
DEVICE = device
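
To illustrate how a selected device is used downstream, here is a minimal sketch that runs independently of the notebook's state. The names `pick_device`, `demo_device`, `layer`, and `batch` are hypothetical and exist only for this illustration. The point is that a module and its inputs must be on the same device before the forward pass.

```python
import torch
import torch.nn as nn

def pick_device() -> torch.device:
    """Hypothetical compact version of the priority order: CUDA, then MPS, then CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    if hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

demo_device = pick_device()

# Move the module's parameters and the input batch to the same device.
layer = nn.Linear(4, 2).to(demo_device)
batch = torch.randn(3, 4).to(demo_device)

output = layer(batch)  # shape (3, 2), computed on demo_device
print(output.shape, output.device.type)
```

If the batch stayed on the CPU while the layer lived on a GPU, the forward pass would raise a device-mismatch error, which is why the training loop later moves every batch with `.to(device)`.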

## 3. Australian Tourism Dataset Creation

Creating multilingual Australian tourism sentiment data with English and Vietnamese examples:

In [None]:
# Australian Tourism Sentiment Dataset
# Following repository guidelines: Australian context with English-Vietnamese multilingual support

australian_tourism_data = {
    'english_reviews': [
        # Positive reviews - Sydney (label: 1)
        "The Sydney Opera House tour was absolutely breathtaking! A must-see landmark.",
        "Harbour Bridge climb exceeded all expectations. Sydney views are stunning!",
        "Bondi Beach is perfect for swimming and surfing. Love the Australian lifestyle!",
        "Circular Quay area is vibrant with amazing restaurants and street performers.",
        "The Royal Botanic Gardens offer peaceful walks with Opera House views.",
        
        # Positive reviews - Melbourne (label: 1)
        "Melbourne's coffee culture is world-class. Every café serves excellent brews.",
        "The laneways are filled with incredible street art and hidden gem restaurants.",
        "Queen Victoria Market has amazing fresh produce and local crafts.",
        "St Kilda penguin parade at sunset is a magical wildlife experience.",
        "Melbourne's food scene is diverse and incredibly delicious.",
        
        # Positive reviews - Brisbane (label: 1)
        "South Bank Parklands is a beautiful urban oasis with great river views.",
        "Brisbane River cruise offers spectacular city skyline perspectives.",
        "The Gold Coast beaches are pristine with perfect surfing conditions.",
        "Story Bridge climb provides amazing 360-degree city views.",
        "Brisbane's subtropical climate makes outdoor activities enjoyable year-round.",
        
        # Negative reviews - Sydney (label: 0)
        "Sydney accommodation prices are extremely expensive and overpriced.",
        "Circular Quay was overcrowded with tourists, very disappointing experience.",
        "The weather was terrible during our Sydney visit, rained constantly.",
        "Public transport in Sydney is confusing and unreliable.",
        "Sydney restaurant prices are outrageous, not worth the money.",
        
        # Negative reviews - Melbourne (label: 0)
        "Melbourne weather is unpredictable and ruined our outdoor plans.",
        "The city center feels congested with construction everywhere.",
        "Melbourne restaurants are pretentious and overpriced for mediocre food.",
        "Public transport delays made us late for every appointment.",
        "The hotel service in Melbourne was poor and unprofessional.",
        
        # Negative reviews - Brisbane (label: 0)
        "Brisbane is boring compared to other Australian cities.",
        "The heat and humidity in Brisbane is unbearable during summer.",
        "Gold Coast beaches are overcrowded and polluted with tourists.",
        "Brisbane nightlife is limited and closes too early.",
        "Public facilities in Brisbane need major improvements and maintenance."
    ],
    
    'vietnamese_reviews': [
        # Positive reviews - Vietnamese translations (label: 1)
        "Tour Nhà hát Opera Sydney thật tuyệt vời! Là điểm tham quan không thể bỏ qua.",
        "Leo cầu Harbour Bridge vượt quá mong đợi. Cảnh Sydney thật đẹp!",
        "Bãi biển Bondi hoàn hảo để bơi lội và lướt sóng. Yêu lối sống Úc!",
        "Khu vực Circular Quay sôi động với nhà hàng và nghệ sĩ đường phố tuyệt vời.",
        "Vườn Bách thảo Hoàng gia có đường đi bộ yên bình với view Nhà hát Opera.",
        
        "Văn hóa cà phê Melbourne đẳng cấp thế giới. Mọi quán đều pha chế tuyệt vời.",
        "Những con hẻm đầy nghệ thuật đường phố và nhà hàng ẩn mình tuyệt vời.",
        "Chợ Queen Victoria có nông sản tươi ngon và thủ công mỹ nghệ địa phương.",
        "Cuộc diễu hành chim cánh cụt St Kilda lúc hoàng hôn thật kỳ diệu.",
        "Ẩm thực Melbourne đa dạng và cực kỳ ngon miệng.",
        
        "Công viên South Bank là ốc đảo đô thị xinh đẹp với view sông tuyệt vời.",
        "Du thuyền sông Brisbane mang đến góc nhìn tuyệt vời về đường chân trời.",
        "Bãi biển Gold Coast nguyên sơ với điều kiện lướt sóng hoàn hảo.",
        "Leo cầu Story Bridge có view 360 độ thành phố tuyệt vời.",
        "Khí hậu cận nhiệt đới Brisbane làm cho hoạt động ngoài trời thú vị quanh năm.",
        
        # Negative reviews - Vietnamese translations (label: 0)
        "Giá chỗ ở Sydney cực kỳ đắt đỏ và quá mức.",
        "Circular Quay đông nghẹt khách du lịch, trải nghiệm thất vọng.",
        "Thời tiết tệ trong chuyến thăm Sydney, mưa liên tục.",
        "Giao thông công cộng Sydney khó hiểu và không đáng tin cậy.",
        "Giá nhà hàng Sydney quá đắt, không xứng đáng với đồng tiền.",
        
        "Thời tiết Melbourne khó đoán và làm hỏng kế hoạch ngoài trời.",
        "Trung tâm thành phố cảm thấy tắc nghẽn với công trình xây dựng khắp nơi.",
        "Nhà hàng Melbourne kiêu căng và đắt đỏ cho đồ ăn tầm thường.",
        "Giao thông công cộng chậm trễ làm chúng tôi trễ mọi cuộc hẹn.",
        "Dịch vụ khách sạn Melbourne kém và không chuyên nghiệp.",
        
        "Brisbane nhàm chán so với các thành phố Úc khác.",
        "Cái nóng và độ ẩm ở Brisbane không chịu nổi vào mùa hè.",
        "Bãi biển Gold Coast đông nghẹt và ô nhiễm với khách du lịch.",
        "Cuộc sống về đêm Brisbane hạn chế và đóng cửa quá sớm.",
        "Cơ sở công cộng Brisbane cần cải thiện và bảo trì lớn."
    ]
}

# Create labels: 1 for positive (first 15 each language), 0 for negative (last 15 each language)
labels = [1] * 15 + [0] * 15  # English labels
labels_vi = [1] * 15 + [0] * 15  # Vietnamese labels

# Combine all texts and labels
all_texts = australian_tourism_data['english_reviews'] + australian_tourism_data['vietnamese_reviews']
all_labels = labels + labels_vi

# Create DataFrame for easier manipulation
tourism_df = pd.DataFrame({
    'text': all_texts,
    'label': all_labels,
    'language': ['en'] * len(labels) + ['vi'] * len(labels_vi),
    'city': (
        ['Sydney'] * 5 + ['Melbourne'] * 5 + ['Brisbane'] * 5 +  # Positive English
        ['Sydney'] * 5 + ['Melbourne'] * 5 + ['Brisbane'] * 5 +  # Negative English
        ['Sydney'] * 5 + ['Melbourne'] * 5 + ['Brisbane'] * 5 +  # Positive Vietnamese
        ['Sydney'] * 5 + ['Melbourne'] * 5 + ['Brisbane'] * 5    # Negative Vietnamese
    )
})

print(f"🇦🇺 Australian Tourism Dataset Created:")
print(f"   📊 Total samples: {len(tourism_df)}")
print(f"   🌏 Languages: {tourism_df['language'].value_counts().to_dict()}")
print(f"   😊 Sentiment distribution: {tourism_df['label'].value_counts().to_dict()}")
print(f"   🏙️ City distribution: {tourism_df['city'].value_counts().to_dict()}")

# Display sample data
print(f"\n📝 Sample data:")
display(tourism_df.head())
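
The label and city columns rely on list repetition lining up with the order of the reviews. A quick sketch of that arithmetic, using only the standard library (the variable names here are illustrative and not part of the notebook):

```python
from collections import Counter

# Per language: 15 positive reviews followed by 15 negative reviews.
labels_per_language = [1] * 15 + [0] * 15

# Within each sentiment block, reviews are grouped 5 per city.
cities_per_block = ["Sydney"] * 5 + ["Melbourne"] * 5 + ["Brisbane"] * 5

# Two languages, each with one positive and one negative block.
all_labels_demo = labels_per_language * 2
all_cities_demo = cities_per_block * 4

print(len(all_labels_demo), Counter(all_labels_demo))
print(len(all_cities_demo), Counter(all_cities_demo))
```

Both lists come out at 60 entries, with 30 reviews per sentiment and 20 per city. If reviews are added or reordered, these counts are the first thing to recheck: a length mismatch makes the `pd.DataFrame` constructor raise a `ValueError`, whereas a reordering would silently mislabel rows.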

## 4. Simple Neural Network for Text Classification

Building a basic neural network using PyTorch's `nn.Module` with OOP design patterns:

In [None]:
# Simple Text Processing Helper Functions

def simple_tokenize(text: str) -> List[str]:
    """Simple tokenization by splitting on whitespace and removing punctuation."""
    # Convert to lowercase and remove punctuation
    text = text.lower()
    for punct in string.punctuation:
        text = text.replace(punct, ' ')
    # Split and filter empty strings
    return [word for word in text.split() if word]

def build_vocab(texts: List[str], max_vocab_size: int = 5000) -> Dict[str, int]:
    """Build vocabulary from texts."""
    # Count all words
    word_counts = Counter()
    for text in texts:
        tokens = simple_tokenize(text)
        word_counts.update(tokens)
    
    # Create vocabulary with most frequent words
    vocab = {'<UNK>': 0, '<PAD>': 1}  # Special tokens
    most_common = word_counts.most_common(max_vocab_size - 2)
    
    for i, (word, count) in enumerate(most_common):
        vocab[word] = i + 2
    
    print(f"📚 Built vocabulary: {len(vocab)} words")
    print(f"   Most common words: {[word for word, _ in most_common[:10]]}")
    
    return vocab

def text_to_sequence(text: str, vocab: Dict[str, int], max_length: int = 50) -> List[int]:
    """Convert text to sequence of token IDs."""
    tokens = simple_tokenize(text)
    sequence = [vocab.get(token, vocab['<UNK>']) for token in tokens]
    
    # Pad or truncate to max_length
    if len(sequence) > max_length:
        sequence = sequence[:max_length]
    else:
        sequence.extend([vocab['<PAD>']] * (max_length - len(sequence)))
    
    return sequence

# Build vocabulary from our tourism data
vocab = build_vocab(tourism_df['text'].tolist(), max_vocab_size=3000)
vocab_size = len(vocab)
max_length = 50

print(f"\n🔤 Text preprocessing setup:")
print(f"   Vocabulary size: {vocab_size}")
print(f"   Max sequence length: {max_length}")

# Test tokenization
sample_text = "The Sydney Opera House is absolutely beautiful!"
sample_sequence = text_to_sequence(sample_text, vocab, max_length)
print(f"\n🧪 Sample tokenization:")
print(f"   Text: '{sample_text}'")
print(f"   Tokens: {simple_tokenize(sample_text)}")
print(f"   Sequence (first 10): {sample_sequence[:10]}")
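
To make the padding, truncation, and unknown-word handling concrete, here is a small self-contained sketch with a toy vocabulary. The names `tokenize`, `encode`, and `toy_vocab` are hypothetical stand-ins for the helpers above, and the vocabulary is invented for the example.

```python
import string
from typing import Dict, List

def tokenize(text: str) -> List[str]:
    """Lowercase, replace ASCII punctuation with spaces, split on whitespace."""
    text = text.lower()
    for punct in string.punctuation:
        text = text.replace(punct, " ")
    return text.split()

# Hypothetical toy vocabulary; IDs 0 and 1 are reserved as in the notebook.
toy_vocab: Dict[str, int] = {"<UNK>": 0, "<PAD>": 1, "sydney": 2, "is": 3, "beautiful": 4}

def encode(text: str, vocab: Dict[str, int], max_length: int) -> List[int]:
    """Map tokens to IDs, then truncate or pad to exactly max_length."""
    ids = [vocab.get(tok, vocab["<UNK>"]) for tok in tokenize(text)]
    ids = ids[:max_length]                                   # truncate long inputs
    return ids + [vocab["<PAD>"]] * (max_length - len(ids))  # pad short inputs

short_ids = encode("Sydney is beautiful!", toy_vocab, max_length=6)
long_ids = encode("Perth is beautiful, Sydney is beautiful too", toy_vocab, max_length=6)

print(tokenize("Sydney is beautiful!"))  # ['sydney', 'is', 'beautiful']
print(short_ids)                         # [2, 3, 4, 1, 1, 1]
print(long_ids)                          # [0, 3, 4, 2, 3, 4]
```

The short sentence is padded with ID 1, while the long one is cut to six tokens and its out-of-vocabulary word ("perth") maps to ID 0. Only ASCII punctuation is stripped, so Vietnamese diacritics pass through unchanged.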

In [None]:
class BasicSentimentClassifier(nn.Module):
    """
    Basic neural network for Australian tourism sentiment classification.
    
    Architecture:
    - Embedding layer: Maps word IDs to dense vectors
    - Global Average Pooling: Averages word embeddings to get sentence representation
    - Hidden layers: Two fully connected layers with ReLU activation
    - Output layer: Binary classification (positive/negative sentiment)
    
    This demonstrates fundamental PyTorch neural network concepts:
    - Inheriting from nn.Module
    - Defining layers in __init__
    - Implementing forward pass
    - Using functional operations (F.relu) alongside module layers (nn.Dropout)
    """
    
    def __init__(self, vocab_size: int, embed_dim: int = 64, hidden_dim: int = 128, 
                 output_dim: int = 2, dropout_rate: float = 0.3, pad_idx: int = 1):
        super().__init__()
        
        # Store hyperparameters
        self.vocab_size = vocab_size
        self.embed_dim = embed_dim
        self.hidden_dim = hidden_dim
        self.output_dim = output_dim
        self.dropout_rate = dropout_rate
        
        # Embedding layer - converts token IDs to dense vectors.
        # pad_idx defaults to 1, the ID that build_vocab assigns to '<PAD>'; taking it
        # as an argument avoids reading the global `vocab` from inside the class.
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=pad_idx)
        
        # Fully connected layers
        self.fc1 = nn.Linear(embed_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, hidden_dim // 2)
        self.fc3 = nn.Linear(hidden_dim // 2, output_dim)
        
        # Dropout for regularization
        self.dropout = nn.Dropout(dropout_rate)
        
        # Australian cities for context (just for fun!)
        self.australian_cities = ["Sydney", "Melbourne", "Brisbane", "Perth", "Adelaide"]
    
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """
        Forward pass through the network.
        
        Args:
            x: Input tensor of token IDs, shape (batch_size, sequence_length)
            
        Returns:
            Logits for sentiment classification, shape (batch_size, output_dim)
        """
        # 1. Embedding lookup: (batch_size, seq_len) -> (batch_size, seq_len, embed_dim)
        embedded = self.embedding(x)
        
        # 2. Global average pooling: average across the sequence dimension.
        # This gives one fixed-size vector per review.
        # Note: padding positions (zero vectors) are included in the average.
        pooled = embedded.mean(dim=1)  # (batch_size, embed_dim)
        
        # 3. First hidden layer with ReLU activation and dropout
        hidden1 = F.relu(self.fc1(pooled))
        hidden1 = self.dropout(hidden1)
        
        # 4. Second hidden layer with ReLU activation and dropout
        hidden2 = F.relu(self.fc2(hidden1))
        hidden2 = self.dropout(hidden2)
        
        # 5. Output layer (no activation - will use CrossEntropyLoss)
        logits = self.fc3(hidden2)
        
        return logits
    
    def predict_sentiment(self, text: str, vocab: Dict[str, int], device: torch.device,
                          max_length: int = 50) -> Dict[str, Any]:
        """
        Predict sentiment for a single text with Australian context analysis.
        
        Args:
            text: Input text to analyze
            vocab: Vocabulary dictionary
            device: PyTorch device
            max_length: Length to pad/truncate to (should match the training value, 50 here)
            
        Returns:
            Dictionary with prediction results and Australian context
        """
        self.eval()  # Set to evaluation mode
        
        with torch.no_grad():
            # Convert text to sequence
            sequence = text_to_sequence(text, vocab, max_length)
            input_tensor = torch.tensor([sequence], dtype=torch.long).to(device)
            
            # Forward pass (call the module itself so registered hooks also run)
            logits = self(input_tensor)
            probabilities = F.softmax(logits, dim=1)
            
            # Get prediction
            predicted_class = logits.argmax(dim=1).item()
            confidence = probabilities[0, predicted_class].item()
            
            # Check for Australian cities mentioned
            mentioned_cities = [city for city in self.australian_cities if city.lower() in text.lower()]
            
            return {
                'text': text,
                'predicted_sentiment': 'Positive' if predicted_class == 1 else 'Negative',
                'confidence': confidence,
                'probabilities': {
                    'negative': probabilities[0, 0].item(),
                    'positive': probabilities[0, 1].item()
                },
                'australian_cities_mentioned': mentioned_cities,
                'has_australian_context': len(mentioned_cities) > 0
            }
    
    def get_model_info(self) -> Dict[str, Any]:
        """Get model architecture information."""
        total_params = sum(p.numel() for p in self.parameters())
        trainable_params = sum(p.numel() for p in self.parameters() if p.requires_grad)
        
        return {
            'architecture': 'Basic Feedforward Neural Network',
            'vocab_size': self.vocab_size,
            'embed_dim': self.embed_dim,
            'hidden_dim': self.hidden_dim,
            'output_dim': self.output_dim,
            'dropout_rate': self.dropout_rate,
            'total_parameters': total_params,
            'trainable_parameters': trainable_params,
            'target_task': 'Australian Tourism Sentiment Analysis',
            'supported_languages': ['English', 'Vietnamese']
        }

# Create model instance
model = BasicSentimentClassifier(
    vocab_size=vocab_size,
    embed_dim=64,
    hidden_dim=128,
    output_dim=2,  # Binary classification: negative (0) and positive (1)
    dropout_rate=0.3
).to(DEVICE)

# Display model information
model_info = model.get_model_info()
print(f"🧠 Basic Sentiment Classifier Created:")
print(f"   Architecture: {model_info['architecture']}")
print(f"   Total parameters: {model_info['total_parameters']:,}")
print(f"   Vocabulary size: {model_info['vocab_size']:,}")
print(f"   Embedding dimension: {model_info['embed_dim']}")
print(f"   Hidden dimension: {model_info['hidden_dim']}")
print(f"   Target task: {model_info['target_task']}")
print(f"   Supported languages: {', '.join(model_info['supported_languages'])}")
print(f"   Device: {DEVICE}")

# Display model architecture
print(f"\n📐 Model Architecture:")
print(model)
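
One detail of the pooling step is worth spelling out: `embedded.mean(dim=1)` averages over all `max_length` positions, padding included. Because `padding_idx` keeps the padding embedding at zero, padding adds nothing to the sum but still counts in the denominator. The sketch below contrasts that with a masked mean that divides by the number of real tokens. It is an illustrative alternative, not the notebook's model; `PAD_ID`, `toy_embedding`, and `masked_mean` are hypothetical names.

```python
import torch
import torch.nn as nn

PAD_ID = 1  # assumed padding ID, matching vocab['<PAD>'] in this notebook

torch.manual_seed(0)
toy_embedding = nn.Embedding(num_embeddings=10, embedding_dim=4, padding_idx=PAD_ID)

def masked_mean(token_ids: torch.Tensor, embedded: torch.Tensor) -> torch.Tensor:
    """Average embeddings over non-padding positions only."""
    mask = (token_ids != PAD_ID).unsqueeze(-1).to(embedded.dtype)  # (batch, seq, 1)
    summed = (embedded * mask).sum(dim=1)                          # (batch, embed_dim)
    counts = mask.sum(dim=1).clamp(min=1.0)                        # avoid division by zero
    return summed / counts

# One sentence with 2 real tokens followed by 4 padding positions.
token_ids = torch.tensor([[5, 7, PAD_ID, PAD_ID, PAD_ID, PAD_ID]])
with torch.no_grad():
    embedded = toy_embedding(token_ids)        # (1, 6, 4)
    plain = embedded.mean(dim=1)               # divides by 6
    masked = masked_mean(token_ids, embedded)  # divides by 2

# The plain mean equals the masked mean scaled by 2/6.
scaled_match = torch.allclose(plain, masked * (2 / 6))
print(scaled_match)
```

With a plain mean, shorter reviews yield smaller-magnitude vectors. Whether that matters for this small dataset is an empirical question; the masked variant simply removes the dependence on review length.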

In [None]:
# Prepare data for training
print("🏗️ Preparing training data...")

# Convert all texts to sequences
X = []
y = []

for idx, row in tourism_df.iterrows():
    sequence = text_to_sequence(row['text'], vocab, max_length)
    X.append(sequence)
    y.append(row['label'])

# Convert to tensors
X_tensor = torch.tensor(X, dtype=torch.long)
y_tensor = torch.tensor(y, dtype=torch.long)

print(f"   📊 Data shape: X={X_tensor.shape}, y={y_tensor.shape}")
print(f"   🎯 Label distribution: {Counter(y)}")

# Split into train/test sets
X_train, X_test, y_train, y_test = train_test_split(
    X_tensor, y_tensor, 
    test_size=0.2, 
    random_state=42, 
    stratify=y_tensor  # Ensure balanced split
)

print(f"   🚂 Train set: {X_train.shape[0]} samples")
print(f"   🧪 Test set: {X_test.shape[0]} samples")

# Create DataLoaders for batch processing
batch_size = 8 if DEVICE.type == 'cpu' else 16

train_dataset = TensorDataset(X_train, y_train)
test_dataset = TensorDataset(X_test, y_test)

train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

print(f"   🔄 Batch size: {batch_size} (optimized for {DEVICE.type.upper()})")
print(f"   📦 Training batches: {len(train_loader)}")
print(f"   📦 Test batches: {len(test_loader)}")

# Test data loading
sample_batch_X, sample_batch_y = next(iter(train_loader))
print(f"\n🧪 Sample batch:")
print(f"   Input shape: {sample_batch_X.shape}")
print(f"   Label shape: {sample_batch_y.shape}")
print(f"   Sample labels: {sample_batch_y.tolist()[:5]}")
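
The split and batch counts follow from simple arithmetic. The sketch below works them out without PyTorch, assuming the 60-review dataset and the 20% test share used above; the variable names are illustrative.

```python
import math

total_samples = 60  # 30 English + 30 Vietnamese reviews
test_percent = 20   # test_size=0.2

test_size = math.ceil(total_samples * test_percent / 100)  # the test share is rounded up
train_size = total_samples - test_size

batches = {}
for batch_size in (8, 16):
    # DataLoader keeps the last partial batch by default (drop_last=False).
    train_batches = math.ceil(train_size / batch_size)
    test_batches = math.ceil(test_size / batch_size)
    batches[batch_size] = (train_batches, test_batches)
    print(f"batch_size={batch_size}: {train_size} train -> {train_batches} batches, "
          f"{test_size} test -> {test_batches} batches")
```

With only 12 test reviews, each misclassified review moves test accuracy by about 8.3 percentage points, so the accuracy figures reported during training are coarse.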

## 5. Training Setup and Configuration

Setting up the training process with loss function, optimizer, and TensorBoard logging:

In [None]:
# Training Configuration
# Following repository standards for comprehensive training setup

# Training hyperparameters
num_epochs = 20
learning_rate = 0.001
weight_decay = 1e-5

# Loss function: CrossEntropyLoss over two class logits (0 = negative, 1 = positive)
criterion = nn.CrossEntropyLoss()

# Optimizer with weight decay for regularization
optimizer = optim.Adam(model.parameters(), lr=learning_rate, weight_decay=weight_decay)

# Learning rate scheduler for better training
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.5)

# TensorBoard setup for monitoring (required by repository standards)
log_dir = f"runs/australian_tourism_sentiment_{datetime.datetime.now().strftime('%Y%m%d_%H%M%S')}"
writer = SummaryWriter(log_dir)

print(f"🔧 Training Configuration:")
print(f"   Epochs: {num_epochs}")
print(f"   Learning rate: {learning_rate}")
print(f"   Batch size: {batch_size}")
print(f"   Optimizer: Adam with weight decay {weight_decay}")
print(f"   Loss function: CrossEntropyLoss")
print(f"   Device: {DEVICE}")
print(f"   📊 TensorBoard logs: {log_dir}")

# Log model architecture to TensorBoard
sample_input = X_train[:1].to(DEVICE)
writer.add_graph(model, sample_input)

# Log hyperparameters
hyperparams = {
    'learning_rate': learning_rate,
    'batch_size': batch_size,
    'epochs': num_epochs,
    'embed_dim': model.embed_dim,
    'hidden_dim': model.hidden_dim,
    'dropout_rate': model.dropout_rate,
    'vocab_size': vocab_size,
    'max_length': max_length
}

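# Write the hyperparameters to TensorBoard's Text tab as a markdown table
hparam_table = "| name | value |\n|---|---|\n" + "\n".join(
    f"| {key} | {value} |" for key, value in hyperparams.items()
)
writer.add_text('Hyperparameters', hparam_table)

print(f"\n📋 Hyperparameters logged to TensorBoard:")
for key, value in hyperparams.items():
    print(f"   {key}: {value}")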
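
The `StepLR` configuration above halves the learning rate every 7 scheduler steps. Its closed form is easy to check by hand; the sketch below computes it in plain Python, with `lr_after` as a hypothetical helper name.

```python
base_lr = 0.001
step_size = 7
gamma = 0.5

def lr_after(num_scheduler_steps: int) -> float:
    """Learning rate after `num_scheduler_steps` calls to scheduler.step()."""
    return base_lr * gamma ** (num_scheduler_steps // step_size)

for steps in (0, 6, 7, 13, 14, 20):
    print(f"after {steps:2d} scheduler steps: lr = {lr_after(steps):.6f}")
```

Over 20 epochs the rate takes three values: 0.001, then 0.0005 from the 7th step, then 0.00025 from the 14th step onward.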

## 6. PyTorch Training Loop

Implementing a manual training loop (contrasted with TensorFlow's `model.fit()`):

In [None]:
# Manual Training Loop - Key difference from TensorFlow
# TensorFlow: model.fit(X_train, y_train, epochs=num_epochs, validation_data=(X_test, y_test))
# PyTorch: Manual implementation with explicit forward/backward passes

def train_model(model, train_loader, test_loader, criterion, optimizer, scheduler, 
                num_epochs, device, writer):
    """
    Train the sentiment classification model with comprehensive logging.
    
    This demonstrates the key differences between PyTorch and TensorFlow:
    - Manual epoch and batch loops
    - Explicit gradient zeroing: optimizer.zero_grad()
    - Manual backward pass: loss.backward()
    - Manual optimizer step: optimizer.step()
    - Explicit model mode switching: model.train() / model.eval()
    """
    
    print(f"🚀 Starting Training: Australian Tourism Sentiment Analysis")
    print(f"📊 Training samples: {len(train_loader.dataset)}")
    print(f"📊 Test samples: {len(test_loader.dataset)}")
    print(f"🎯 Target: Binary sentiment classification (Positive/Negative)")
    print(f"🌏 Context: Multilingual Australian tourism reviews")
    print(f"⚡ Device: {device}")
    print("-" * 60)
    
    # Training history storage
    history = {
        'train_loss': [],
        'train_acc': [],
        'test_loss': [],
        'test_acc': [],
        'learning_rates': []
    }
    
    best_test_acc = 0.0
    
    for epoch in range(num_epochs):
        # Training Phase
        model.train()  # Set model to training mode (enables dropout, batch norm)
        train_loss = 0.0
        train_correct = 0
        train_total = 0
        
        for batch_idx, (data, targets) in enumerate(train_loader):
            # Move data to device
            data, targets = data.to(device), targets.to(device)
            
            # Zero gradients (PyTorch accumulates them across backward() calls; Keras' model.fit() handles this internally)
            optimizer.zero_grad()
            
            # Forward pass
            outputs = model(data)
            loss = criterion(outputs, targets)
            
            # Backward pass (explicit in PyTorch)
            loss.backward()
            
            # Update weights
            optimizer.step()
            
            # Track statistics
            train_loss += loss.item()
            _, predicted = outputs.max(1)
            train_total += targets.size(0)
            train_correct += predicted.eq(targets).sum().item()
            
            # Log batch-level metrics to TensorBoard
            if batch_idx % 5 == 0:
                global_step = epoch * len(train_loader) + batch_idx
                writer.add_scalar('Loss/Train_Batch', loss.item(), global_step)
        
        # Calculate epoch training metrics
        epoch_train_loss = train_loss / len(train_loader)
        epoch_train_acc = train_correct / train_total
        
        # Validation/Test Phase
        model.eval()  # Set model to evaluation mode (disables dropout)
        test_loss = 0.0
        test_correct = 0
        test_total = 0
        
        with torch.no_grad():  # Disable gradient computation for efficiency
            for data, targets in test_loader:
                data, targets = data.to(device), targets.to(device)
                outputs = model(data)
                loss = criterion(outputs, targets)
                
                test_loss += loss.item()
                _, predicted = outputs.max(1)
                test_total += targets.size(0)
                test_correct += predicted.eq(targets).sum().item()
        
        # Calculate epoch test metrics
        epoch_test_loss = test_loss / len(test_loader)
        epoch_test_acc = test_correct / test_total
        
        # Update learning rate scheduler
        scheduler.step()
        current_lr = optimizer.param_groups[0]['lr']
        
        # Store history
        history['train_loss'].append(epoch_train_loss)
        history['train_acc'].append(epoch_train_acc)
        history['test_loss'].append(epoch_test_loss)
        history['test_acc'].append(epoch_test_acc)
        history['learning_rates'].append(current_lr)
        
        # Log epoch-level metrics to TensorBoard
        writer.add_scalar('Loss/Train', epoch_train_loss, epoch)
        writer.add_scalar('Loss/Test', epoch_test_loss, epoch)
        writer.add_scalar('Accuracy/Train', epoch_train_acc, epoch)
        writer.add_scalar('Accuracy/Test', epoch_test_acc, epoch)
        writer.add_scalar('Learning_Rate', current_lr, epoch)
        
        # Log parameter histograms (every 5 epochs)
        if epoch % 5 == 0:
            for name, param in model.named_parameters():
                writer.add_histogram(f'Parameters/{name}', param, epoch)
        
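        # Save best model (selected on the test set, so the reported best accuracy is optimistic;
        # a separate validation split would avoid this)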
        if epoch_test_acc > best_test_acc:
            best_test_acc = epoch_test_acc
            torch.save(model.state_dict(), 'best_sentiment_model.pth')
        
        # Print progress
        print(f'Epoch {epoch+1:2d}/{num_epochs} | '
              f'Train Loss: {epoch_train_loss:.4f} | Train Acc: {epoch_train_acc:.4f} | '
              f'Test Loss: {epoch_test_loss:.4f} | Test Acc: {epoch_test_acc:.4f} | '
              f'LR: {current_lr:.6f}')
        
        # Early stopping if perfect accuracy
        if epoch_test_acc >= 0.99:
            print(f"🎉 Early stopping: Achieved {epoch_test_acc:.1%} test accuracy!")
            break
    
    writer.close()
    
    print(f"\n🏁 Training Complete!")
    print(f"   📈 Best test accuracy: {best_test_acc:.4f} ({best_test_acc:.1%})")
    print(f"   💾 Best model saved as: best_sentiment_model.pth")
    print(f"   📊 TensorBoard logs: {writer.log_dir}")
    
    return history

# Start training
training_history = train_model(
    model=model,
    train_loader=train_loader,
    test_loader=test_loader,
    criterion=criterion,
    optimizer=optimizer,
    scheduler=scheduler,
    num_epochs=num_epochs,
    device=DEVICE,
    writer=writer
)
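
The loop saves the best-scoring weights to disk, but the `model` object itself still holds the weights from the final epoch. Restoring a checkpoint is a separate step. The sketch below shows the save/load round trip on a tiny stand-in module; `trained`, `restored`, and the file name `best_model_demo.pth` are hypothetical, chosen so the example does not touch the notebook's own checkpoint.

```python
import os
import tempfile

import torch
import torch.nn as nn

torch.manual_seed(0)
trained = nn.Linear(4, 2)  # hypothetical stand-in for the trained classifier

with tempfile.TemporaryDirectory() as tmp_dir:
    checkpoint_path = os.path.join(tmp_dir, "best_model_demo.pth")  # hypothetical file name

    # Save only the parameters, the same pattern the training loop uses.
    torch.save(trained.state_dict(), checkpoint_path)

    # Rebuild the same architecture, then load the saved parameters into it.
    restored = nn.Linear(4, 2)
    state_dict = torch.load(checkpoint_path, map_location="cpu")
    restored.load_state_dict(state_dict)

restored.eval()  # switch to evaluation mode before inference

sample = torch.randn(3, 4)
with torch.no_grad():
    same_outputs = torch.allclose(trained(sample), restored(sample))
print(same_outputs)
```

For the classifier in this notebook the same pattern would apply: construct `BasicSentimentClassifier` with the original hyperparameters, call `load_state_dict` with the contents of `best_sentiment_model.pth`, and call `eval()` before predicting.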

## 7. Training Results and Visualization

Using seaborn for training metrics visualization (following repository standards):

In [None]:
# Training Results Visualization with Seaborn
# Following repository guidelines for seaborn usage in notebooks

def plot_training_history(history):
    """
    Plot training metrics using seaborn for better aesthetics.
    
    Following repository policy: Use seaborn instead of matplotlib when possible.
    """
    epochs = range(1, len(history['train_loss']) + 1)
    
    # Create DataFrame for seaborn plotting
    df_loss = pd.DataFrame({
        'Epoch': list(epochs) * 2,
        'Loss': history['train_loss'] + history['test_loss'],
        'Dataset': ['Train'] * len(epochs) + ['Test'] * len(epochs)
    })
    
    df_acc = pd.DataFrame({
        'Epoch': list(epochs) * 2,
        'Accuracy': history['train_acc'] + history['test_acc'],
        'Dataset': ['Train'] * len(epochs) + ['Test'] * len(epochs)
    })
    
    # Create comprehensive visualization
    fig, axes = plt.subplots(2, 2, figsize=(15, 10))
    fig.suptitle('🇦🇺 Australian Tourism Sentiment Analysis - Training Results', 
                 fontsize=16, fontweight='bold', y=0.98)
    
    # Loss plot
    sns.lineplot(data=df_loss, x='Epoch', y='Loss', hue='Dataset', ax=axes[0,0])
    axes[0,0].set_title('Training & Test Loss', fontsize=12, fontweight='bold')
    axes[0,0].grid(True, alpha=0.3)
    
    # Accuracy plot
    sns.lineplot(data=df_acc, x='Epoch', y='Accuracy', hue='Dataset', ax=axes[0,1])
    axes[0,1].set_title('Training & Test Accuracy', fontsize=12, fontweight='bold')
    axes[0,1].grid(True, alpha=0.3)
    axes[0,1].set_ylim(0, 1)
    
    # Learning rate plot
    axes[1,0].plot(epochs, history['learning_rates'], 'g-', linewidth=2)
    axes[1,0].set_title('Learning Rate Schedule', fontsize=12, fontweight='bold')
    axes[1,0].set_xlabel('Epoch')
    axes[1,0].set_ylabel('Learning Rate')
    axes[1,0].grid(True, alpha=0.3)
    axes[1,0].set_yscale('log')  # Log scale for better visibility
    
    # Final metrics summary
    final_train_acc = history['train_acc'][-1]
    final_test_acc = history['test_acc'][-1]
    final_train_loss = history['train_loss'][-1]
    final_test_loss = history['test_loss'][-1]
    
    metrics_text = f"""
Final Training Results:

📊 Accuracy:
   • Training: {final_train_acc:.1%}
   • Test: {final_test_acc:.1%}

📉 Loss:
   • Training: {final_train_loss:.4f}
   • Test: {final_test_loss:.4f}

🎯 Model Performance:
   • {'Excellent' if final_test_acc > 0.9 else 'Good' if final_test_acc > 0.8 else 'Fair'} generalization
   • {'Low' if abs(final_train_acc - final_test_acc) < 0.1 else 'Some'} overfitting

🇦🇺 Australian Context:
   • Multilingual support ✓
   • Tourism domain ✓
   • City-specific analysis ✓
    """
    
    axes[1,1].text(0.05, 0.95, metrics_text.strip(), 
                   transform=axes[1,1].transAxes, 
                   verticalalignment='top',
                   fontsize=10, 
                   bbox=dict(boxstyle='round', facecolor='lightblue', alpha=0.8))
    axes[1,1].set_title('Training Summary', fontsize=12, fontweight='bold')
    axes[1,1].axis('off')
    
    plt.tight_layout()
    plt.show()
    
    return final_test_acc

# Plot training results
final_accuracy = plot_training_history(training_history)

print(f"\n🎉 Training visualization complete!")
print(f"   Final test accuracy: {final_accuracy:.1%}")
print(f"   Model ready for predictions on Australian tourism reviews!")
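
The summary panel reports metrics from the last epoch, while the saved checkpoint corresponds to the epoch with the best test accuracy, and the two can differ. The sketch below recovers the best epoch from a history dictionary with the same `train_acc` and `test_acc` keys. `summarize_history` is a hypothetical helper and the numbers are invented for illustration.

```python
def summarize_history(history):
    """Hypothetical helper: locate the best epoch by test accuracy."""
    test_acc = history["test_acc"]
    # max() returns the first index on ties, matching a strict ">" checkpoint rule
    best_idx = max(range(len(test_acc)), key=test_acc.__getitem__)
    return {
        "best_epoch": best_idx + 1,  # 1-based, like the printed epochs
        "best_test_acc": test_acc[best_idx],
        "final_test_acc": test_acc[-1],
        "gap_at_best": history["train_acc"][best_idx] - test_acc[best_idx],
    }


# Made-up values, not results from this notebook
example_history = {
    "train_acc": [0.60, 0.75, 0.90, 0.97],
    "test_acc": [0.58, 0.72, 0.84, 0.80],
}

summary = summarize_history(example_history)
print(summary)
```

In this invented run the best checkpoint comes from epoch 3, even though training continued for another epoch with lower test accuracy.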

## 8. Model Testing and Predictions

Testing our trained model on new Australian tourism texts:

In [None]:
# Test the trained model with new Australian tourism examples

# Load the best model if available
try:
    model.load_state_dict(torch.load('best_sentiment_model.pth', map_location=DEVICE))
    print("✅ Best model loaded successfully!")
except FileNotFoundError:
    print("ℹ️ Using current model weights (best model not found)")

# Test samples with Australian context
test_samples = [
    # English samples
    "The Sydney Opera House concert was absolutely magnificent! World-class acoustics.",
    "Melbourne traffic is horrible and the weather ruined our vacation completely.",
    "Brisbane's South Bank is a beautiful place for families to relax and enjoy.",
    "Perth beaches are overrated and the service at hotels was disappointing.",
    "Gold Coast theme parks provide amazing entertainment for the whole family!",
    
    # Vietnamese samples
    "Cầu Harbour Bridge ở Sydney thật tuyệt vời và đáng để leo lên xem cảnh!",
    "Thời tiết Melbourne quá khó đoán và làm hỏng cả chuyến du lịch của chúng tôi.",
    "Ẩm thực đường phố ở Adelaide rất ngon và giá cả hợp lý.",
    "Dịch vụ tại khách sạn Darwin rất tệ và nhân viên không thân thiện.",
    "Vườn thú Taronga ở Sydney có nhiều động vật thú vị và view tuyệt đẹp!"
]

print("🧪 Testing Australian Tourism Sentiment Classification:")
print("="*70)

# Test each sample
correct_predictions = 0
total_predictions = len(test_samples)

# Expected sentiments (for manual evaluation)
expected_sentiments = ['Positive', 'Negative', 'Positive', 'Negative', 'Positive',
                      'Positive', 'Negative', 'Positive', 'Negative', 'Positive']

for i, text in enumerate(test_samples):
    # Get prediction
    result = model.predict_sentiment(text, vocab, DEVICE)
    
    # Detect language
    is_vietnamese = any(char in text.lower() for char in 'àáảãạăắằẳẵặâấầẩẫậèéẻẽẹêếềểễệìíỉĩịòóỏõọôốồổỗộơớờởỡợùúủũụưứừửữựỳýỷỹỵđ')
    language = "🇻🇳 Vietnamese" if is_vietnamese else "🇺🇸 English"
    
    # Check if prediction matches expected
    is_correct = result['predicted_sentiment'] == expected_sentiments[i]
    if is_correct:
        correct_predictions += 1
    
    # Display results
    print(f"\nTest {i+1}/{total_predictions} ({language}):")
    print(f"   Text: \"{text[:60]}{'...' if len(text) > 60 else ''}\"")
    print(f"   Predicted: {result['predicted_sentiment']} ({result['confidence']:.1%} confidence)")
    print(f"   Expected: {expected_sentiments[i]} {'✓' if is_correct else '✗'}")
    
    if result['australian_cities_mentioned']:
        print(f"   🏙️ Australian cities mentioned: {', '.join(result['australian_cities_mentioned'])}")
    
    # Show probability breakdown for interesting cases
    if abs(result['probabilities']['positive'] - result['probabilities']['negative']) < 0.3:
        print(f"   📊 Close call - Positive: {result['probabilities']['positive']:.1%}, "
              f"Negative: {result['probabilities']['negative']:.1%}")

print(f"\n{'='*70}")
print(f"🎯 Manual Evaluation Results:")
print(f"   Accuracy on test samples: {correct_predictions}/{total_predictions} ({correct_predictions/total_predictions:.1%})")
print(f"   ✅ Correctly classified: {correct_predictions} samples")
print(f"   ❌ Misclassified: {total_predictions - correct_predictions} samples")

# Model performance summary
print(f"\n🧠 Model Performance Summary:")
print(f"   📈 Training accuracy: {training_history['train_acc'][-1]:.1%}")
print(f"   📊 Test accuracy: {training_history['test_acc'][-1]:.1%}")
print(f"   🧪 Manual test accuracy: {correct_predictions/total_predictions:.1%}")
print(f"   🌏 Multilingual support: English + Vietnamese ✓")
print(f"   🇦🇺 Australian context: Tourism reviews ✓")
print(f"   🏙️ City detection: Sydney, Melbourne, Brisbane, etc. ✓")
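
The diacritic check used in the loop can be factored into a small reusable helper. The sketch below is one possible version, assuming that lowercasing and Unicode NFC normalization are wanted so that uppercase or decomposed input is handled as well. `looks_vietnamese` and `VIETNAMESE_CHARS` are hypothetical names, not part of the notebook.

```python
import unicodedata

# Same lowercase letters as the inline check, normalized to NFC (an assumption)
VIETNAMESE_CHARS = frozenset(
    unicodedata.normalize(
        "NFC",
        "àáảãạăắằẳẵặâấầẩẫậèéẻẽẹêếềểễệìíỉĩịòóỏõọôốồổỗộơớờởỡợùúủũụưứừửữựỳýỷỹỵđ",
    )
)


def looks_vietnamese(text):
    """Heuristic: True if the text contains any letter from VIETNAMESE_CHARS."""
    normalized = unicodedata.normalize("NFC", text).lower()
    return any(ch in VIETNAMESE_CHARS for ch in normalized)


print(looks_vietnamese("Dịch vụ tại khách sạn Darwin rất tệ"))
print(looks_vietnamese("ẨM THỰC ĐƯỜNG PHỐ"))
print(looks_vietnamese("Perth beaches are overrated"))
```

This remains a heuristic: letters such as `é` and `à` also occur in other languages, so a French word like "café" would be flagged too. A dedicated language-identification library would be more reliable for real data.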

## 9. Summary and Next Steps

### What We Accomplished

🎉 **Successfully built and trained a basic neural network for sentiment analysis!**

#### Key PyTorch Concepts Learned:
- **`nn.Module` Architecture**: Created a custom neural network class with proper inheritance
- **Manual Training Loops**: Wrote explicit forward and backward passes instead of relying on a single call like TensorFlow's `model.fit()`
- **Device Management**: Proper CPU/GPU handling with device detection
- **Data Processing**: Custom tokenization and dataset creation for multilingual text
- **TensorBoard Integration**: Comprehensive training monitoring and visualization

#### Australian Context Features:
- 🇦🇺 **Tourism Domain**: Sentiment analysis for Australian travel reviews
- 🌏 **Multilingual Support**: English and Vietnamese text processing
- 🏙️ **City Recognition**: Sydney, Melbourne, Brisbane, Perth detection
- 📊 **Real-world Application**: Practical sentiment classification

#### PyTorch vs TensorFlow Key Differences:

| Aspect | TensorFlow | PyTorch |
|--------|------------|----------|
| **Model Definition** | `tf.keras.Sequential` | Custom `nn.Module` classes |
| **Training** | `model.fit()` | Manual loops with `loss.backward()` |
| **Gradients** | Computed and applied inside `model.fit()` | Computed by autograd; you call `optimizer.zero_grad()`, `loss.backward()`, and `optimizer.step()` |
| **Device Handling** | Mostly automatic | Explicit `.to(device)` |
| **Debugging** | Harder inside compiled `tf.function` graphs | Eager by default, so standard Python debuggers work |
| **Control** | High-level abstractions | Low-level control |
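
The PyTorch column of the table can be sketched as a single explicit loop. This is a toy illustration rather than the notebook's training code: the linear model, the random tensors, and the hyperparameters are stand-ins chosen only to keep the example self-contained.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Toy stand-ins (assumptions): a tiny linear classifier and random data
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Linear(4, 2).to(device)            # explicit device placement
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

inputs = torch.randn(8, 4).to(device)
targets = torch.randint(0, 2, (8,)).to(device)

losses = []
for step in range(5):
    optimizer.zero_grad()                 # clear gradients from the previous step
    logits = model(inputs)                # forward pass
    loss = criterion(logits, targets)
    loss.backward()                       # autograd computes the gradients
    optimizer.step()                      # apply the parameter update
    losses.append(loss.item())

print([round(value, 4) for value in losses])
```

Each line of the loop body corresponds to work that `model.fit()` performs internally in Keras, which is why the PyTorch version is longer but easier to inspect step by step.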

### Next Steps for Advanced Learning:

1. **🔄 Recurrent Neural Networks**: LSTMs and GRUs for sequence modeling
2. **🤗 Hugging Face Integration**: Pre-trained transformers (BERT, RoBERTa)
3. **📊 Advanced Architectures**: Attention mechanisms and Transformers
4. **🚀 Model Optimization**: Quantization, pruning, and deployment
5. **🌐 Production Deployment**: Model serving and REST APIs

### TensorBoard Viewing Instructions:

To view the training logs and metrics:

```bash
# In terminal/command prompt:
tensorboard --logdir runs

# Then open: http://localhost:6006
```

**Available visualizations:**
- 📈 Loss and accuracy curves
- 🧠 Model architecture graph
- 📊 Parameter histograms
- ⚙️ Hyperparameter tracking

---

**🎓 Congratulations!** You've successfully implemented a neural network in PyTorch with Australian tourism context and multilingual support. This foundation prepares you for more advanced NLP tasks and modern transformer architectures.

**📚 Continue your PyTorch journey** with the other notebooks in this repository for deeper neural network concepts and real-world applications!