# PyTorch Tensor Fundamentals: From TensorFlow to PyTorch

This notebook provides a comprehensive introduction to PyTorch tensors, designed for learners transitioning from TensorFlow. We'll explore tensor creation, operations, and the NumPy bridge using Australian-themed examples.

## Learning Objectives
- Understand PyTorch tensor basics and how they compare to TensorFlow tensors
- Master tensor creation, properties, and data types
- Learn essential tensor operations for NLP and deep learning
- Explore tensor indexing, slicing, and reshaping
- Bridge between PyTorch tensors and NumPy arrays
- Apply tensor operations to Australian tourism and multilingual text data

## Key Differences from TensorFlow
| Aspect | TensorFlow | PyTorch |
|--------|------------|---------|
| **Execution** | Graph-based (TF 1.x) or eager by default (TF 2.x) | Eager by default (dynamic graphs); optional compilation with `torch.compile` |
| **Tensor Creation** | `tf.constant([1, 2, 3])` | `torch.tensor([1, 2, 3])` |
| **Random Tensors** | `tf.random.normal([2, 3])` | `torch.randn(2, 3)` |
| **Reshaping** | `tf.reshape(x, [2, -1])` | `x.view(2, -1)` or `x.reshape(2, -1)` |
| **Device Management** | Automatic placement, with `tf.device` and distribution strategies for manual control | Explicit with `.to(device)` |

---

## 1. Environment Setup and Runtime Detection

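The cells below detect where the notebook is running (Colab, Kaggle, or local), install any missing packages, and select the best available device (CUDA, MPS, or CPU):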

In [None]:
# Environment Detection and Setup
import sys
import subprocess
import os
import time

# Detect the runtime environment
IS_COLAB = "google.colab" in sys.modules
IS_KAGGLE = "kaggle_secrets" in sys.modules or "kaggle" in os.environ.get('KAGGLE_URL_BASE', '')
IS_LOCAL = not (IS_COLAB or IS_KAGGLE)

print(f"🌐 Environment detected:")
print(f"  - Local: {IS_LOCAL}")
print(f"  - Google Colab: {IS_COLAB}")
print(f"  - Kaggle: {IS_KAGGLE}")

# Platform-specific system setup
if IS_COLAB:
    print("\n🔧 Setting up Google Colab environment...")
    # Colab usually has PyTorch pre-installed
elif IS_KAGGLE:
    print("\n🔧 Setting up Kaggle environment...")
    # Kaggle usually has most packages pre-installed
else:
    print("\n🔧 Setting up local environment...")

In [None]:
# Install required packages (the same pip call works in Colab, Kaggle, and locally)
required_packages = [
    "torch",
    "numpy",
    "matplotlib",
    "pandas"
]

print("📦 Installing required packages...")
for package in required_packages:
    # `!pip` is IPython-only syntax and is a SyntaxError inside exec(), so pip is
    # invoked through the current interpreter on every platform instead.
    try:
        subprocess.run([sys.executable, "-m", "pip", "install", "-q", package],
                       capture_output=True, check=True)
        print(f"✅ {package}")
    except subprocess.CalledProcessError as err:
        print(f"⚠️ {package} could not be installed (pip exit code {err.returncode})")

In [None]:
# Verify PyTorch installation and detect optimal device
import torch
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import platform

def detect_device():
    """
    Detect the best available PyTorch device with comprehensive hardware support.
    
    Priority order:
    1. CUDA (NVIDIA GPUs) - Best performance for deep learning
    2. MPS (Apple Silicon) - Optimized for M1/M2/M3 Macs  
    3. CPU (Universal) - Always available fallback
    
    Returns:
        torch.device: The optimal device for PyTorch operations
        str: Human-readable device description for logging
    """
    # Check for CUDA (NVIDIA GPU)
    if torch.cuda.is_available():
        device = torch.device("cuda")
        gpu_name = torch.cuda.get_device_name(0)
        device_info = f"CUDA GPU: {gpu_name}"
        
        print(f"🚀 Using CUDA acceleration")
        print(f"   GPU: {gpu_name}")
        print(f"   CUDA Version: {torch.version.cuda}")
        
        return device, device_info
    
    # Check for MPS (Apple Silicon)
    elif hasattr(torch.backends, 'mps') and torch.backends.mps.is_available():
        device = torch.device("mps")
        device_info = "Apple Silicon MPS"
        
        system_info = platform.uname()
        
        print(f"🍎 Using Apple Silicon MPS acceleration")
        print(f"   System: {system_info.system} {system_info.release}")
        print(f"   Machine: {system_info.machine}")
        
        return device, device_info
    
    # Fallback to CPU
    else:
        device = torch.device("cpu")
        device_info = "CPU (No GPU acceleration available)"
        
        num_threads = torch.get_num_threads()  # threads PyTorch will use, not the core count
        system_info = platform.uname()
        
        print(f"💻 Using CPU (no GPU acceleration detected)")
        print(f"   Processor: {system_info.processor}")
        print(f"   PyTorch Threads: {num_threads}")
        print(f"   System: {system_info.system} {system_info.release}")
        
        return device, device_info

# Detect and set up device
device, device_info = detect_device()

print(f"\n✅ PyTorch {torch.__version__} ready!")
print(f"📱 Device selected: {device}")
print(f"📊 Device info: {device_info}")

# Set global device for the notebook
DEVICE = device
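
Once a device is selected, tensors can be allocated on it directly or moved to it afterwards. The following is a minimal sketch rather than part of the setup itself; `demo_device` is a name introduced here for illustration so the snippet runs on its own.

```python
import torch

# Pick a device with a CPU fallback (illustrative; mirrors the CUDA check above).
demo_device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Option 1: allocate directly on the target device.
a = torch.ones(2, 3, device=demo_device)

# Option 2: create on CPU, then move with .to().
b = torch.zeros(2, 3).to(demo_device)

# Operands must live on the same device; the result stays on that device.
c = a + b
print(c.device.type, c.shape)
```

Allocating directly on the device avoids an extra host-to-device copy, which matters more as tensors get larger.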

## 2. Basic Tensor Creation

PyTorch tensors are the fundamental building blocks for deep learning. Let's explore different ways to create tensors using Australian-themed examples.

In [None]:
# Creating tensors from Python lists - Australian cities example
print("🇦🇺 Creating tensors with Australian city data\n")

# Australian cities population data (in millions, approximate)
cities = ["Sydney", "Melbourne", "Brisbane", "Perth", "Adelaide", "Gold Coast", "Newcastle", "Canberra"]
populations = [5.3, 5.1, 2.6, 2.1, 1.4, 0.7, 0.5, 0.5]  # millions

# Create tensors from lists
population_tensor = torch.tensor(populations, dtype=torch.float32)
print(f"Population tensor: {population_tensor}")
print(f"Shape: {population_tensor.shape}")
print(f"Data type: {population_tensor.dtype}")
print(f"Device: {population_tensor.device}")

# TensorFlow comparison
print("\n📊 TensorFlow vs PyTorch Comparison:")
print("   TensorFlow: tf.constant([5.3, 5.1, 2.6, 2.1, 1.4, 0.7, 0.5, 0.5])")
print(f"   PyTorch:    torch.tensor([5.3, 5.1, 2.6, 2.1, 1.4, 0.7, 0.5, 0.5])")

# Different data types
print("\n🔢 Different tensor data types:")
int_tensor = torch.tensor([1, 2, 3, 4, 5], dtype=torch.int32)
float_tensor = torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0], dtype=torch.float32)
bool_tensor = torch.tensor([True, False, True, False], dtype=torch.bool)

print(f"Integer tensor: {int_tensor} (dtype: {int_tensor.dtype})")
print(f"Float tensor: {float_tensor} (dtype: {float_tensor.dtype})")
print(f"Boolean tensor: {bool_tensor} (dtype: {bool_tensor.dtype})")
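
The cell above sets every dtype explicitly. When `dtype` is omitted, PyTorch infers it from the Python values, which is a common surprise for TensorFlow users because `tf.constant([1, 2, 3])` defaults to `int32`. The sketch below assumes the default float dtype has not been changed with `torch.set_default_dtype`.

```python
import torch

# Without an explicit dtype, PyTorch infers one from the Python values.
ints = torch.tensor([1, 2, 3])          # Python ints   -> torch.int64
floats = torch.tensor([1.0, 2.0, 3.0])  # Python floats -> torch.float32 by default

# Casting creates a new tensor; the original keeps its dtype.
as_float = ints.to(torch.float32)
as_int = torch.tensor([1.9, -1.9]).to(torch.int64)  # float -> int truncates toward zero

# Mixing an integer tensor with a float tensor promotes the result to float.
mixed = ints + floats

print(ints.dtype, floats.dtype, mixed.dtype)
print(as_int)
```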

In [None]:
# Creating 2D tensors - Australian tourism ratings
print("🏖️ 2D Tensors: Australian Tourism Attraction Ratings\n")

# Tourism ratings matrix: rows=attractions, columns=rating categories
# Categories: [Overall, Scenery, Accessibility, Family-Friendly, Cost-Effective]
attractions = ["Sydney Opera House", "Great Barrier Reef", "Uluru", "Bondi Beach", "Melbourne Laneways"]
rating_categories = ["Overall", "Scenery", "Accessibility", "Family-Friendly", "Cost-Effective"]

# Ratings out of 10
tourism_ratings = [
    [9.5, 9.8, 8.5, 8.0, 6.0],  # Sydney Opera House
    [9.8, 10.0, 6.0, 8.5, 4.0], # Great Barrier Reef
    [9.0, 10.0, 7.0, 7.5, 5.0], # Uluru
    [8.5, 9.0, 9.5, 9.0, 9.5],  # Bondi Beach
    [8.0, 7.5, 9.0, 7.0, 8.5]   # Melbourne Laneways
]

tourism_tensor = torch.tensor(tourism_ratings, dtype=torch.float32)
print(f"Tourism ratings tensor shape: {tourism_tensor.shape}")
print(f"Tensor:\n{tourism_tensor}")

# Display with labels for clarity
print("\n📋 Tourism Ratings Matrix:")
print(f"{'Attraction':<20} {'Overall':<8} {'Scenery':<8} {'Access.':<8} {'Family':<8} {'Cost':<8}")
print("-" * 70)
for i, attraction in enumerate(attractions):
    ratings = tourism_tensor[i]
    print(f"{attraction:<20} {ratings[0]:<8.1f} {ratings[1]:<8.1f} {ratings[2]:<8.1f} {ratings[3]:<8.1f} {ratings[4]:<8.1f}")

# Tensor properties
print(f"\n🔍 Tensor Properties:")
print(f"   Dimensions: {tourism_tensor.ndim}")
print(f"   Shape: {tourism_tensor.shape}")
print(f"   Size: {tourism_tensor.numel()} elements")
print(f"   Data type: {tourism_tensor.dtype}")
print(f"   Device: {tourism_tensor.device}")
print(f"   Memory usage: {tourism_tensor.element_size() * tourism_tensor.numel()} bytes")

In [None]:
# Special tensor creation functions
print("🎯 Special Tensor Creation Functions\n")

# Zeros tensor - useful for initialization
zeros_tensor = torch.zeros(3, 4)
print(f"Zeros tensor (3x4):\n{zeros_tensor}")

# Ones tensor - useful for masks and weights
ones_tensor = torch.ones(2, 3)
print(f"\nOnes tensor (2x3):\n{ones_tensor}")

# Identity matrix - essential for linear algebra
identity_tensor = torch.eye(4)
print(f"\nIdentity tensor (4x4):\n{identity_tensor}")

# Random tensors - crucial for neural network initialization
print("\n🎲 Random Tensor Creation:")

# Random normal distribution (mean=0, std=1)
random_normal = torch.randn(3, 3)
print(f"Random normal (3x3):\n{random_normal}")

# Random uniform distribution [0, 1)
random_uniform = torch.rand(2, 4)
print(f"\nRandom uniform (2x4):\n{random_uniform}")

# Random integers in a range
random_int = torch.randint(0, 10, (3, 3))
print(f"\nRandom integers 0-9 (3x3):\n{random_int}")

# Australian-specific example: Random tourist group sizes
print("\n🚌 Australian Tourism Example - Random Group Sizes:")
# Generate random tourist group sizes for different attractions (5-50 people)
group_sizes = torch.randint(5, 51, (len(attractions),))
for i, attraction in enumerate(attractions):
    print(f"   {attraction}: {group_sizes[i].item()} visitors")

# TensorFlow comparison
print("\n📊 TensorFlow vs PyTorch Random Tensors:")
print("   TensorFlow: tf.random.normal([3, 3])")
print("   PyTorch:    torch.randn(3, 3)")
print("   TensorFlow: tf.random.uniform([2, 4])")
print("   PyTorch:    torch.rand(2, 4)")
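
The random tensors above change on every run. A minimal sketch of making them repeatable, plus two related constructors, follows; `torch.manual_seed` plays the role that `tf.random.set_seed` plays in TensorFlow. The variable names are illustrative.

```python
import torch

# Seeding makes random tensors repeatable on the same setup.
torch.manual_seed(42)
first = torch.rand(2, 3)
torch.manual_seed(42)
second = torch.rand(2, 3)
print(torch.equal(first, second))  # same seed, same values

# *_like constructors copy shape, dtype, and device from an existing tensor.
template = torch.ones(2, 3, dtype=torch.float64)
zeros = torch.zeros_like(template)

# Evenly spaced values: arange excludes the end point, linspace includes it.
steps = torch.arange(0, 10, 2)
points = torch.linspace(0.0, 1.0, 5)
print(zeros.dtype, steps.tolist(), points.tolist())
```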

## 3. Tensor Operations

PyTorch provides a rich set of operations for tensor manipulation. Let's explore mathematical operations, matrix operations, and more using Australian tourism data.

In [None]:
# Mathematical operations with Australian weather data
print("🌡️ Mathematical Operations: Australian Weather Data\n")

# Australian cities average temperatures (Celsius) for different seasons
# Cities: Sydney, Melbourne, Brisbane, Perth, Adelaide
summer_temps = torch.tensor([26.5, 25.5, 28.0, 30.5, 28.5], dtype=torch.float32)
winter_temps = torch.tensor([17.0, 14.0, 21.0, 18.5, 15.5], dtype=torch.float32)

print(f"Summer temperatures: {summer_temps}")
print(f"Winter temperatures: {winter_temps}")

# Basic arithmetic operations
temp_difference = summer_temps - winter_temps
average_temps = (summer_temps + winter_temps) / 2
temp_ratio = summer_temps / winter_temps

print(f"\nTemperature differences: {temp_difference}")
print(f"Average temperatures: {average_temps}")
print(f"Summer/Winter ratio: {temp_ratio}")

# Element-wise operations
print("\n🔢 Element-wise Operations:")
squared_temps = torch.pow(summer_temps, 2)
sqrt_temps = torch.sqrt(summer_temps)
rounded_temps = torch.round(average_temps)

print(f"Squared summer temps: {squared_temps}")
print(f"Square root of summer temps: {sqrt_temps}")
print(f"Rounded average temps: {rounded_temps}")

# TensorFlow comparison
print("\n📊 TensorFlow vs PyTorch Operations:")
print("   TensorFlow: tf.add(a, b) or a + b")
print("   PyTorch:    torch.add(a, b) or a + b")
print("   TensorFlow: tf.square(a)")
print("   PyTorch:    torch.pow(a, 2) or a.pow(2)")
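
All operations in the cell above return new tensors. PyTorch also offers in-place variants, marked by a trailing underscore, which have no direct counterpart on immutable `tf.Tensor` objects. A small sketch with illustrative values:

```python
import torch

temps = torch.tensor([26.5, 25.5, 28.0])

# Out-of-place: returns a new tensor and leaves `temps` untouched.
warmer = temps.add(1.0)
unchanged = temps.clone()

# In-place: the trailing underscore means `temps` itself is modified.
temps.add_(1.0)

print(torch.equal(warmer, temps))

# Scalars broadcast across every element: Celsius to Fahrenheit.
fahrenheit = torch.tensor([0.0, 100.0]) * 9 / 5 + 32
print(fahrenheit)
```

In-place operations save memory, but they can interfere with autograd when the overwritten values are needed for the backward pass, so the out-of-place form is the safer default.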

In [None]:
# Matrix operations with Australian tourism data
print("🏖️ Matrix Operations: Australian Tourism Analysis\n")

# Tourist arrivals matrix (millions): rows=years, columns=cities
# Years: 2019, 2020, 2021, 2022
# Cities: Sydney, Melbourne, Brisbane, Perth
tourist_arrivals = torch.tensor([
    [4.5, 3.2, 2.8, 1.9],  # 2019
    [2.1, 1.5, 1.2, 0.8],  # 2020 (COVID impact)
    [1.8, 1.2, 1.0, 0.6],  # 2021
    [3.8, 2.7, 2.3, 1.5]   # 2022 (recovery)
], dtype=torch.float32)

print(f"Tourist arrivals matrix (millions):\n{tourist_arrivals}")
print(f"Shape: {tourist_arrivals.shape}")

# Matrix transpose
arrivals_transposed = tourist_arrivals.transpose(0, 1)
print(f"\nTransposed matrix (cities x years):\n{arrivals_transposed}")

# Matrix-vector operations
# Weight vector for different tourism spending per visitor (thousands AUD)
spending_per_visitor = torch.tensor([8.5, 7.2, 6.8, 9.1], dtype=torch.float32)
print(f"\nSpending per visitor (thousands AUD): {spending_per_visitor}")

# Calculate total tourism revenue for each year
# Units: millions of visitors × thousands of AUD per visitor = billions of AUD
total_revenue = torch.matmul(tourist_arrivals, spending_per_visitor)
print(f"Total tourism revenue by year (billions AUD): {total_revenue}")

# Element-wise multiplication (Hadamard product)
print("\n💰 Revenue calculation (arrivals × avg spending):")
# Create spending matrix (same shape as arrivals)
spending_matrix = spending_per_visitor.unsqueeze(0).repeat(tourist_arrivals.shape[0], 1)
revenue_matrix = tourist_arrivals * spending_matrix
print(f"Revenue by city and year (billions AUD):\n{revenue_matrix}")

# Summary statistics over the whole matrix
print("\n📊 Matrix Statistics:")
print(f"Total arrivals across all years/cities: {tourist_arrivals.sum().item():.1f} million")
print(f"Average arrivals per city per year: {tourist_arrivals.mean().item():.1f} million")
print(f"Max arrivals (single city/year): {tourist_arrivals.max().item():.1f} million")
print(f"Min arrivals (single city/year): {tourist_arrivals.min().item():.1f} million")

# TensorFlow comparison
print("\n📊 TensorFlow vs PyTorch Matrix Operations:")
print("   TensorFlow: tf.matmul(a, b) or tf.linalg.matmul(a, b)")
print("   PyTorch:    torch.matmul(a, b) or torch.mm(a, b)")
print("   TensorFlow: tf.transpose(a)")
print("   PyTorch:    a.transpose(0, 1) or a.T")
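
The revenue calculation builds an explicit spending matrix with `repeat`. Broadcasting gives the same result without the extra copy. The sketch below uses a two-year subset of the arrivals data and checks that the row sums of the element-wise product match the matrix-vector product.

```python
import torch

arrivals = torch.tensor([[4.5, 3.2, 2.8, 1.9],
                         [2.1, 1.5, 1.2, 0.8]])   # shape (2, 4): years x cities
spending = torch.tensor([8.5, 7.2, 6.8, 9.1])     # shape (4,): one value per city

# Broadcasting aligns shapes from the right, so (2, 4) * (4,) works without
# building an explicit (2, 4) copy of `spending`.
revenue = arrivals * spending

# Summing each row of the element-wise product gives the matrix-vector product.
by_year = revenue.sum(dim=1)
print(torch.allclose(by_year, arrivals @ spending))

# The explicit repeat produces the same values, with an extra allocation.
repeated = spending.unsqueeze(0).repeat(arrivals.shape[0], 1)
print(torch.equal(revenue, arrivals * repeated))
```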

In [None]:
# Reduction operations and aggregations
print("📈 Reduction Operations: Australian Economic Analysis\n")

# Australian state GDP data (billions AUD, simplified)
states = ["NSW", "VIC", "QLD", "WA", "SA", "TAS"]
gdp_sectors = ["Mining", "Manufacturing", "Services", "Agriculture"]

# GDP by state and sector (billions AUD)
gdp_data = torch.tensor([
    [45.2, 78.5, 385.2, 12.1],  # NSW
    [12.8, 95.2, 325.5, 15.3],  # VIC
    [85.6, 45.8, 185.4, 22.7],  # QLD
    [165.2, 28.4, 125.8, 18.9], # WA
    [8.5, 18.2, 65.4, 8.2],     # SA
    [2.1, 4.5, 18.9, 3.8]       # TAS
], dtype=torch.float32)

print(f"GDP data shape: {gdp_data.shape} (states × sectors)")
print(f"GDP data:\n{gdp_data}")

# Reduction along different dimensions
print("\n🎯 Reduction Operations:")

# Sum along states (dim=0) - total GDP by sector across Australia
gdp_by_sector = torch.sum(gdp_data, dim=0)
print(f"\nTotal GDP by sector (billions AUD):")
for i, sector in enumerate(gdp_sectors):
    print(f"   {sector}: ${gdp_by_sector[i]:.1f}B")

# Sum along sectors (dim=1) - total GDP by state
gdp_by_state = torch.sum(gdp_data, dim=1)
print(f"\nTotal GDP by state (billions AUD):")
for i, state in enumerate(states):
    print(f"   {state}: ${gdp_by_state[i]:.1f}B")

# Other reduction operations
print(f"\nOther Statistics:")
print(f"   Total GDP (all listed states and sectors): ${torch.sum(gdp_data).item():.1f}B")
print(f"   Average sector GDP per state: ${torch.mean(gdp_data).item():.1f}B")
print(f"   Largest sector in any state: ${torch.max(gdp_data).item():.1f}B")
print(f"   Smallest sector in any state: ${torch.min(gdp_data).item():.1f}B")

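# Find the position of the maximum value.
# argmax without a dim argument returns one index into the flattened tensor;
# divmod converts it back to (row, column) = (state, sector).
max_indices = torch.argmax(gdp_data)
max_state, max_sector = divmod(max_indices.item(), gdp_data.shape[1])
print(f"   Largest sector: {gdp_sectors[max_sector]} in {states[max_state]}")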

# Standard deviation and variance (unbiased N-1 estimators by default)
print(f"\n📊 Variability:")
print(f"   GDP standard deviation: ${torch.std(gdp_data).item():.1f}B")
print(f"   GDP variance: ${torch.var(gdp_data).item():.1f}B²")

# TensorFlow comparison
print("\n📊 TensorFlow vs PyTorch Reductions:")
print("   TensorFlow: tf.reduce_sum(x, axis=0)")
print("   PyTorch:    torch.sum(x, dim=0)")
print("   TensorFlow: tf.reduce_mean(x)")
print("   PyTorch:    torch.mean(x)")
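
Two reduction details are worth a short sketch: `keepdim=True` (the counterpart of `keepdims=True` in `tf.reduce_sum`) keeps the reduced axis so the result broadcasts against the input, and `max` with a `dim` argument returns indices as well as values. The GDP values are a small illustrative subset.

```python
import torch

gdp = torch.tensor([[45.2, 78.5, 385.2],
                    [165.2, 28.4, 125.8]])   # 2 states x 3 sectors (illustrative)

# keepdim=True keeps the reduced axis with size 1, which makes the result
# broadcastable against the original tensor.
state_totals = gdp.sum(dim=1, keepdim=True)   # shape (2, 1)
sector_share = gdp / state_totals             # each row now sums to 1

# With a dim argument, max returns both the values and their indices.
largest, which_sector = gdp.max(dim=1)

print(state_totals.shape, sector_share.sum(dim=1))
print(largest, which_sector)
```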

## 4. Tensor Indexing and Slicing

Tensor indexing and slicing are essential for data manipulation in deep learning. Let's explore these concepts using Australian text and linguistic data.

In [None]:
# Basic indexing with Australian text data
print("📝 Basic Indexing: Australian Text Analysis\n")

# Create a tensor representing word frequencies in Australian tourism reviews
# Words: ["beautiful", "expensive", "crowded", "peaceful", "accessible"]
# Reviews for: Sydney Opera House, Great Barrier Reef, Uluru, Bondi Beach
word_frequencies = torch.tensor([
    [85, 45, 72, 28, 35],  # Sydney Opera House
    [95, 78, 25, 88, 15],  # Great Barrier Reef
    [92, 32, 18, 95, 22],  # Uluru
    [78, 12, 65, 45, 85]   # Bondi Beach
], dtype=torch.float32)

attractions = ["Sydney Opera House", "Great Barrier Reef", "Uluru", "Bondi Beach"]
words = ["beautiful", "expensive", "crowded", "peaceful", "accessible"]

print(f"Word frequency tensor shape: {word_frequencies.shape}")
print(f"Tensor:\n{word_frequencies}")

# Basic indexing
print("\n🎯 Basic Indexing Examples:")

# Access single element
opera_house_beautiful = word_frequencies[0, 0]
print(f"'Beautiful' mentions for Sydney Opera House: {opera_house_beautiful.item()}")

# Access entire row (all words for one attraction)
uluru_words = word_frequencies[2]
print(f"\nAll word frequencies for Uluru: {uluru_words}")

# Access entire column (one word for all attractions)
expensive_mentions = word_frequencies[:, 1]
print(f"'Expensive' mentions across attractions: {expensive_mentions}")

# Negative indexing (last elements)
last_attraction = word_frequencies[-1]
last_word_all_attractions = word_frequencies[:, -1]
print(f"\nLast attraction frequencies: {last_attraction}")
print(f"Last word ('accessible') across all: {last_word_all_attractions}")

# TensorFlow comparison
print("\n📊 TensorFlow vs PyTorch Indexing:")
print("   TensorFlow: tensor[0, 1] (tf.gather(tensor, [0]) selects whole rows by index)")
print("   PyTorch:    tensor[0, 1] (same syntax!)")
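
Although the indexing syntax matches TensorFlow, the semantics differ in one respect: basic indexing in PyTorch returns a view of the original storage, so writing through it changes the source tensor. A minimal sketch with illustrative values:

```python
import torch

freq = torch.tensor([[85.0, 45.0, 72.0],
                     [95.0, 78.0, 25.0]])

# Basic indexing and slicing return views: no data is copied.
row_view = freq[0]
row_view[0] = 0.0            # also changes freq[0, 0]

# clone() makes an independent copy that is safe to modify.
row_copy = freq[1].clone()
row_copy[0] = -1.0           # freq[1, 0] is unaffected

print(freq)
```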

In [None]:
# Advanced slicing operations
print("✂️ Advanced Slicing: Australian Language Analysis\n")

# Create a more complex tensor: sentence sentiment scores
# Dimensions: [languages, sentences, sentiment_aspects]
# Languages: English, Vietnamese
# Sentences: 6 tourism review sentences per language
# Sentiment aspects: [positive, negative, neutral]

# Sample sentences:
english_sentences = [
    "Sydney Opera House is absolutely stunning!",
    "The prices in Melbourne are quite high.",
    "Bondi Beach has perfect weather today.",
    "The crowds at Uluru were overwhelming.",
    "Brisbane offers good value for money.",
    "The Great Barrier Reef is worth the trip."
]

vietnamese_sentences = [
    "Nhà hát Opera Sydney thật tuyệt vời!",
    "Giá cả ở Melbourne khá đắt.",
    "Bãi biển Bondi có thời tiết hoàn hảo hôm nay.",
    "Đám đông ở Uluru thật áp đảo.",
    "Brisbane cung cấp giá trị tốt cho tiền.",
    "Rạn san hô Great Barrier Reef đáng để đi."
]

# Sentiment scores (0-1 scale)
sentiment_data = torch.tensor([
    # English sentences [positive, negative, neutral]
    [[0.95, 0.02, 0.03],  # "absolutely stunning"
     [0.15, 0.65, 0.20],  # "quite high prices"
     [0.88, 0.05, 0.07],  # "perfect weather"
     [0.10, 0.78, 0.12],  # "overwhelming crowds"
     [0.75, 0.15, 0.10],  # "good value"
     [0.85, 0.08, 0.07]], # "worth the trip"
    
    # Vietnamese sentences [positive, negative, neutral]
    [[0.92, 0.03, 0.05],  # "thật tuyệt vời"
     [0.18, 0.62, 0.20],  # "khá đắt"
     [0.90, 0.04, 0.06],  # "hoàn hảo"
     [0.12, 0.75, 0.13],  # "áp đảo"
     [0.72, 0.18, 0.10],  # "giá trị tốt"
     [0.83, 0.09, 0.08]]  # "đáng để đi"
], dtype=torch.float32)

print(f"Sentiment tensor shape: {sentiment_data.shape} (languages × sentences × aspects)")

# Advanced slicing examples
print("\n🔍 Advanced Slicing Examples:")

# Extract only English data
english_sentiment = sentiment_data[0]
print(f"English sentiment shape: {english_sentiment.shape}")

# Extract only positive sentiment scores for both languages
positive_scores = sentiment_data[:, :, 0]
print(f"\nPositive sentiment scores:")
print(f"English: {positive_scores[0]}")
print(f"Vietnamese: {positive_scores[1]}")

# Extract first 3 sentences for both languages
first_three = sentiment_data[:, :3, :]
print(f"\nFirst 3 sentences shape: {first_three.shape}")

# Skip every other sentence
every_other = sentiment_data[:, ::2, :]
print(f"Every other sentence shape: {every_other.shape}")

# Complex slicing: negative sentiment for Vietnamese sentences at indices 2-4
# (the 3rd to 5th sentences; the slice end 5 is exclusive)
vietnamese_negative_subset = sentiment_data[1, 2:5, 1]
print(f"\nVietnamese negative sentiment (sentence indices 2-4): {vietnamese_negative_subset}")

print("\n📊 TensorFlow vs PyTorch Slicing:")
print("   TensorFlow: tensor[0:3, :, 1] (tf.slice(tensor, [0, 0, 1], [3, -1, 1]) selects the same values but keeps the last axis)")
print("   PyTorch:    tensor[0:3, :, 1] (same syntax!)")
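
Beyond plain slices, three more indexing forms come up often with multi-dimensional data like the sentiment tensor. This sketch uses a reduced (2 × 2 × 3) version of it, which is an illustrative subset rather than the full data.

```python
import torch

# (languages, sentences, aspects) with aspects = [positive, negative, neutral]
scores = torch.tensor([[[0.95, 0.02, 0.03],
                        [0.15, 0.65, 0.20]],
                       [[0.92, 0.03, 0.05],
                        [0.18, 0.62, 0.20]]])

# `...` stands for all remaining leading axes, so this equals scores[:, :, 0].
positive = scores[..., 0]

# Indexing with a list picks arbitrary positions and returns a copy.
pos_and_neutral = scores[:, :, [0, 2]]      # shape (2, 2, 2)

# argmax over the last axis turns scores into one label index per sentence.
labels = scores.argmax(dim=-1)              # 0 = positive, 1 = negative

print(positive.shape, pos_and_neutral.shape)
print(labels)
```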

In [None]:
# Boolean indexing and advanced selection
print("🎯 Boolean Indexing: Australian Tourism Filtering\n")

# Australian tourism data: attractions with various metrics
attraction_names = ["Sydney Opera House", "Great Barrier Reef", "Uluru", "Bondi Beach", 
                   "Melbourne Laneways", "Blue Mountains", "Gold Coast Theme Parks", "Kangaroo Island"]

# Metrics: [visitor_satisfaction, price_rating, accessibility, family_friendly]
# Scale: 1-10 for all metrics
attraction_metrics = torch.tensor([
    [9.5, 6.0, 7.5, 8.0],  # Sydney Opera House
    [9.8, 4.0, 5.5, 7.5],  # Great Barrier Reef
    [9.2, 5.5, 6.0, 6.5],  # Uluru
    [8.8, 9.0, 9.5, 9.2],  # Bondi Beach
    [8.5, 8.5, 9.0, 7.0],  # Melbourne Laneways
    [8.0, 8.0, 7.0, 8.5],  # Blue Mountains
    [7.5, 6.5, 8.0, 9.5],  # Gold Coast Theme Parks
    [8.2, 7.0, 6.5, 8.0]   # Kangaroo Island
], dtype=torch.float32)

print(f"Attraction metrics shape: {attraction_metrics.shape}")
print("Metrics: [satisfaction, price_rating, accessibility, family_friendly]\n")

# Boolean indexing examples
print("🔍 Boolean Indexing Examples:")

# Find highly satisfying attractions (satisfaction > 9.0)
high_satisfaction = attraction_metrics[:, 0] > 9.0
high_satisfaction_attractions = attraction_metrics[high_satisfaction]
print(f"High satisfaction mask: {high_satisfaction}")
print(f"High satisfaction attractions count: {high_satisfaction.sum().item()}")
print("High satisfaction attractions:")
for i, is_high in enumerate(high_satisfaction):
    if is_high:
        print(f"   {attraction_names[i]}: {attraction_metrics[i].tolist()}")

# Find budget-friendly and family-friendly attractions
budget_friendly = attraction_metrics[:, 1] >= 7.0  # price_rating >= 7
family_friendly = attraction_metrics[:, 3] >= 8.0  # family_friendly >= 8
budget_and_family = budget_friendly & family_friendly

print(f"\nBudget & family-friendly attractions:")
for i, is_suitable in enumerate(budget_and_family):
    if is_suitable:
        print(f"   {attraction_names[i]}: {attraction_metrics[i].tolist()}")

# Advanced filtering: find attractions with balanced scores (all metrics > 7.0)
balanced_attractions = torch.all(attraction_metrics > 7.0, dim=1)
print(f"\nBalanced attractions (all metrics > 7.0):")
for i, is_balanced in enumerate(balanced_attractions):
    if is_balanced:
        print(f"   {attraction_names[i]}: {attraction_metrics[i].tolist()}")

# Using torch.where for conditional selection
print(f"\n💡 Using torch.where for conditional operations:")
# Flag low accessibility scores (< 7) as "Needs Improvement" by encoding them as 0
improved_accessibility = torch.where(attraction_metrics[:, 2] < 7.0, 
                                   torch.tensor(0.0), 
                                   attraction_metrics[:, 2])
print(f"Original accessibility: {attraction_metrics[:, 2]}")
print(f"Flagged accessibility (0 = needs improvement): {improved_accessibility}")

print("\n📊 TensorFlow vs PyTorch Boolean Indexing:")
print("   TensorFlow: tf.boolean_mask(tensor, condition)")
print("   PyTorch:    tensor[condition]")
print("   TensorFlow: tf.where(condition, x, y)")
print("   PyTorch:    torch.where(condition, x, y)")
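
The loops above walk over each mask element in Python. A sketch of two alternatives follows: `torch.nonzero` to get the matching positions in one call, and `masked_fill` to replace masked values. The names and values are shortened, illustrative versions of the attraction data.

```python
import torch

names = ["Opera House", "Reef", "Uluru", "Bondi"]
satisfaction = torch.tensor([9.5, 9.8, 9.2, 8.8])
price_rating = torch.tensor([6.0, 4.0, 5.5, 9.0])

# Combine conditions with & (and), | (or), ~ (not). The parentheses are required
# because these operators bind more tightly than comparisons.
mask = (satisfaction > 9.0) & ~(price_rating < 5.0)

# nonzero gives the positions where the mask is True.
idx = torch.nonzero(mask).flatten().tolist()
selected = [names[i] for i in idx]

# masked_fill returns a copy with the masked positions replaced.
capped = satisfaction.masked_fill(satisfaction > 9.5, 9.5)

print(selected)
print(capped)
```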

In [None]:
# Reshaping and view operations
print("🔄 Reshaping and View Operations: Text Processing\n")

# Create a tensor representing tokenized text
# Simulate tokenized Australian tourism reviews
print("Example: Processing tokenized Australian tourism reviews")
print("Original text: 'Sydney Opera House offers stunning harbor views and excellent performances'")
print("Vietnamese: 'Nhà hát Opera Sydney cung cấp tầm nhìn cảng tuyệt đẹp và các buổi biểu diễn xuất sắc'\n")

# Token IDs for the sentence (simplified vocabulary)
original_tokens = torch.tensor([
    15, 67, 89, 23, 156, 78, 234, 45, 167, 98, 134, 56
], dtype=torch.long)

print(f"Original token sequence: {original_tokens}")
print(f"Shape: {original_tokens.shape}")

# Reshape into matrix (e.g., for batch processing)
print("\n🔄 Reshaping Examples:")

# Reshape to 3x4 matrix
reshaped_3x4 = original_tokens.reshape(3, 4)
print(f"Reshaped to 3x4:\n{reshaped_3x4}")

# Reshape to 2x6 matrix
reshaped_2x6 = original_tokens.reshape(2, 6)
print(f"\nReshaped to 2x6:\n{reshaped_2x6}")

# Use -1 for automatic dimension calculation
reshaped_auto = original_tokens.reshape(4, -1)
print(f"\nReshaped to 4x? (auto-calculated):\n{reshaped_auto}")
print(f"Auto shape: {reshaped_auto.shape}")

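# View vs Reshape
# view() never copies and requires a compatible (typically contiguous) memory layout;
# reshape() returns a view when it can and copies otherwise.
print("\n👁️ View vs Reshape:")
viewed_tensor = original_tokens.view(3, 4)
print(f"View (shares memory): {viewed_tensor.shape}")
print(f"Original data pointer same as view: {original_tokens.data_ptr() == viewed_tensor.data_ptr()}")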

# Adding and removing dimensions
print("\n📐 Adding/Removing Dimensions:")

# Add dimension (unsqueeze)
with_batch_dim = original_tokens.unsqueeze(0)  # Add batch dimension
print(f"With batch dimension: {with_batch_dim.shape}")

with_channel_dim = original_tokens.unsqueeze(1)  # Add a trailing size-1 dimension: (12,) -> (12, 1)
print(f"With trailing dimension: {with_channel_dim.shape}")

# Remove dimension (squeeze)
squeezed = with_batch_dim.squeeze(0)  # Remove batch dimension
print(f"After squeezing batch dim: {squeezed.shape}")

# Flatten tensor
flattened = reshaped_3x4.flatten()
print(f"\nFlattened tensor: {flattened}")
print(f"Flattened shape: {flattened.shape}")

# Practical NLP example: preparing for embedding layer
print("\n💡 Practical NLP Example: Preparing for Embedding Layer")
# Simulate batch of sentences with different lengths (padded)
batch_sentences = torch.tensor([
    [15, 67, 89, 23, 0, 0],    # Sentence 1 (4 real tokens + 2 padding)
    [156, 78, 234, 45, 167, 98], # Sentence 2 (6 real tokens)
    [134, 56, 12, 0, 0, 0]      # Sentence 3 (3 real tokens + 3 padding)
], dtype=torch.long)

print(f"Batch of sentences: {batch_sentences.shape} (batch_size × seq_length)")
print(f"Batch:\n{batch_sentences}")

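# An embedding layer accepts the (batch_size × seq_length) index tensor directly.
# Flattening is shown here for operations that expect a 1-D list of token IDs.
flattened_batch = batch_sentences.flatten()
print(f"\nFlattened token IDs: {flattened_batch.shape}")
print(f"Flattened: {flattened_batch}")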

# TensorFlow comparison
print("\n📊 TensorFlow vs PyTorch Reshaping:")
print("   TensorFlow: tf.reshape(x, [3, 4])")
print("   PyTorch:    x.reshape(3, 4) or x.view(3, 4)")
print("   TensorFlow: tf.expand_dims(x, axis=0)")
print("   PyTorch:    x.unsqueeze(0)")
print("   TensorFlow: tf.squeeze(x, axis=0)")
print("   PyTorch:    x.squeeze(0)")
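
The difference between `view` and `reshape` only shows up on non-contiguous tensors, which the cell above does not exercise. The sketch below creates one by transposing and shows `view` failing where `reshape` falls back to a copy. The flag names are illustrative.

```python
import torch

tokens = torch.arange(12).reshape(3, 4)

# Transposing returns a view with swapped strides; the result is not contiguous.
transposed = tokens.t()
print(transposed.is_contiguous())

# view() cannot reinterpret this layout, so it raises a RuntimeError.
try:
    transposed.view(12)
    view_failed = False
except RuntimeError:
    view_failed = True

# reshape() falls back to copying when a view is impossible.
flat = transposed.reshape(12)
shares_memory = flat.data_ptr() == tokens.data_ptr()

print(view_failed, shares_memory)
```

When a copy is acceptable, `reshape` is the simpler choice; `view` is useful when code should fail loudly rather than copy silently.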

## 5. Bridge with NumPy

One of PyTorch's strengths is its seamless integration with NumPy. Let's explore converting between PyTorch tensors and NumPy arrays using multilingual Australian text data.

In [None]:
# Converting between PyTorch tensors and NumPy arrays
print("🔄 PyTorch ↔ NumPy Conversion: Multilingual Text Analysis\n")

# Start with NumPy array - character frequencies in Australian text
print("📝 Character Frequency Analysis: English vs Vietnamese")
print("English: 'Sydney beaches are amazing for surfing and swimming'")
print("Vietnamese: 'Bãi biển Sydney thật tuyệt vời cho lướt sóng và bơi lội'\n")

# Character frequency data (simplified)
# Characters: ['a', 'e', 'i', 'o', 'u', 'n', 's', 't']
english_char_freq = np.array([6, 7, 4, 2, 2, 8, 5, 3], dtype=np.float32)
vietnamese_char_freq = np.array([4, 5, 3, 4, 1, 6, 4, 7], dtype=np.float32)

print(f"English char frequencies (NumPy): {english_char_freq}")
print(f"Vietnamese char frequencies (NumPy): {vietnamese_char_freq}")
print(f"NumPy array type: {type(english_char_freq)}")
print(f"NumPy dtype: {english_char_freq.dtype}")

# Convert NumPy to PyTorch
print("\n🔄 NumPy → PyTorch Conversion:")
english_tensor = torch.from_numpy(english_char_freq)
vietnamese_tensor = torch.from_numpy(vietnamese_char_freq)

print(f"English tensor: {english_tensor}")
print(f"Vietnamese tensor: {vietnamese_tensor}")
print(f"Tensor type: {type(english_tensor)}")
print(f"Tensor dtype: {english_tensor.dtype}")

# Check memory sharing
print(f"\n🧠 Memory Sharing Check:")
print(f"Shares memory: {english_tensor.data_ptr() == english_char_freq.__array_interface__['data'][0]}")
print("Note: from_numpy() creates a tensor that shares memory with the NumPy array")

# Demonstrate shared memory
original_value = english_char_freq[0]
print(f"\nBefore modification - NumPy[0]: {english_char_freq[0]}, Tensor[0]: {english_tensor[0]}")
english_char_freq[0] = 999  # Modify NumPy array
print(f"After modifying NumPy - NumPy[0]: {english_char_freq[0]}, Tensor[0]: {english_tensor[0]}")
english_char_freq[0] = original_value  # Restore original value
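
Memory sharing is specific to `torch.from_numpy`. Passing the same array to `torch.tensor` copies the data instead, which is the safer option when the NumPy array will be modified later. A minimal sketch with illustrative values:

```python
import numpy as np
import torch

counts = np.array([6, 7, 4], dtype=np.float32)

shared = torch.from_numpy(counts)   # shares the NumPy buffer
copied = torch.tensor(counts)       # always copies the data

counts[0] = 999                     # modify the NumPy array in place

print(shared[0].item(), copied[0].item())

# NumPy's default float is float64 and from_numpy preserves it; cast
# explicitly if the rest of the pipeline expects float32.
doubles = torch.from_numpy(np.array([1.0, 2.0]))
print(doubles.dtype, doubles.float().dtype)
```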

In [None]:
# Convert PyTorch to NumPy
print("🔄 PyTorch → NumPy Conversion: Tourism Data Analysis\n")

# Create PyTorch tensor with Australian tourism spending data
# Categories: Accommodation, Food, Transport, Activities, Shopping
spending_categories = ["Accommodation", "Food", "Transport", "Activities", "Shopping"]
daily_spending_aud = torch.tensor([180.50, 85.25, 45.75, 120.00, 95.30], dtype=torch.float32)

print(f"Daily spending (PyTorch): {daily_spending_aud}")
print(f"Tensor type: {type(daily_spending_aud)}")

# Convert to NumPy using .numpy() (CPU tensors only; the array shares memory with the tensor)
spending_numpy = daily_spending_aud.numpy()
print(f"\nDaily spending (NumPy): {spending_numpy}")
print(f"NumPy type: {type(spending_numpy)}")

# .detach().numpy() is required when a tensor has requires_grad=True; it is harmless here
spending_detached = daily_spending_aud.detach().numpy()
print(f"Detached NumPy: {spending_detached}")

# Create a detailed analysis using NumPy
print("\n📊 Analysis using NumPy operations:")
total_daily = np.sum(spending_numpy)
average_category = np.mean(spending_numpy)
max_category_idx = np.argmax(spending_numpy)
min_category_idx = np.argmin(spending_numpy)

print(f"Total daily spending: ${total_daily:.2f} AUD")
print(f"Average per category: ${average_category:.2f} AUD")
print(f"Highest spending: {spending_categories[max_category_idx]} (${spending_numpy[max_category_idx]:.2f})")
print(f"Lowest spending: {spending_categories[min_category_idx]} (${spending_numpy[min_category_idx]:.2f})")

# Convert back to PyTorch for further processing
processed_tensor = torch.from_numpy(spending_numpy * 1.1)  # 10% increase
print(f"\nAfter 10% increase (back to PyTorch): {processed_tensor}")
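
The tensor above does not track gradients, so `.numpy()` and `.detach().numpy()` behave identically. The sketch below shows the case where they differ, using an illustrative `weights` tensor with `requires_grad=True`.

```python
import torch

weights = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# A tensor that tracks gradients refuses a direct conversion.
try:
    weights.numpy()
    direct_ok = True
except RuntimeError:
    direct_ok = False

# detach() returns a tensor without gradient tracking that shares the same data.
as_numpy = weights.detach().numpy()

print(direct_ok, as_numpy)
```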

In [None]:
# Working with different devices and NumPy
print("🔧 Device Considerations: CPU vs GPU Tensors\n")

# Create tensor on CPU
cpu_tensor = torch.tensor([1.0, 2.0, 3.0, 4.0], device='cpu')
print(f"CPU tensor: {cpu_tensor}")
print(f"Device: {cpu_tensor.device}")

# Convert CPU tensor to NumPy (works directly)
cpu_numpy = cpu_tensor.numpy()
print(f"CPU → NumPy: {cpu_numpy}")

# Device handling demonstration
if torch.cuda.is_available():
    print("\n🚀 CUDA available - GPU tensor demonstration:")
    gpu_tensor = cpu_tensor.to('cuda')
    print(f"GPU tensor: {gpu_tensor}")
    print(f"Device: {gpu_tensor.device}")
    
    # Must move to CPU before converting to NumPy
    gpu_to_numpy = gpu_tensor.cpu().numpy()
    print(f"GPU → CPU → NumPy: {gpu_to_numpy}")
    
elif hasattr(torch.backends, 'mps') and torch.backends.mps.is_available():
    print("\n🍎 MPS available - Apple Silicon demonstration:")
    mps_tensor = cpu_tensor.to('mps')
    print(f"MPS tensor: {mps_tensor}")
    print(f"Device: {mps_tensor.device}")
    
    # Must move to CPU before converting to NumPy
    mps_to_numpy = mps_tensor.cpu().numpy()
    print(f"MPS → CPU → NumPy: {mps_to_numpy}")
    
else:
    print("\n💻 No GPU/MPS acceleration available - using CPU only")

print("\n⚠️ Important Notes:")
print("- NumPy arrays are always on CPU")
print("- GPU/MPS tensors must be moved to CPU before .numpy()")
print("- from_numpy() always creates CPU tensors")
print("- Use .to(device) to move tensors between devices")
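
Since the rules above apply regardless of where a tensor lives, it can be convenient to wrap them in one helper. The following is a minimal sketch; `to_numpy` is a hypothetical helper name introduced here for illustration, not a PyTorch function.

```python
import numpy as np
import torch


def to_numpy(tensor: torch.Tensor) -> np.ndarray:
    """Hypothetical helper: convert any tensor to a NumPy array.

    .detach() drops gradient tracking, .cpu() is a no-op for CPU tensors
    and copies GPU/MPS tensors to host memory, then .numpy() converts.
    """
    return tensor.detach().cpu().numpy()


# Use CUDA when present, otherwise stay on CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.tensor([1.0, 2.0, 3.0], device=device, requires_grad=True)

arr = to_numpy(x * 2)
print(type(arr), arr)
```

The order `.detach().cpu().numpy()` works for CPU, CUDA and MPS tensors alike, so calling code does not need to branch on the device.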

In [None]:
# Practical example: Multilingual text processing pipeline
print("🌏 Practical Example: Multilingual Australian Tourism Processing\n")

# Simulate processing pipeline for English-Vietnamese tourism reviews
print("Processing tourism reviews for sentiment analysis...")
print("English: 'The Sydney harbour cruise was absolutely magnificent!'")
print("Vietnamese: 'Chuyến du thuyền cảng Sydney thật tuyệt vời!'\n")

# Step 1: Start with NumPy arrays (common in data preprocessing)
# Simulated word embeddings (300-dimensional)
np.random.seed(16)  # For reproducible results
english_embeddings = np.random.randn(8, 300).astype(np.float32)  # 8 words
vietnamese_embeddings = np.random.randn(7, 300).astype(np.float32)  # 7 words

print(f"English embeddings shape (NumPy): {english_embeddings.shape}")
print(f"Vietnamese embeddings shape (NumPy): {vietnamese_embeddings.shape}")

# Step 2: Convert to PyTorch for deep learning processing
english_tensor = torch.from_numpy(english_embeddings)
vietnamese_tensor = torch.from_numpy(vietnamese_embeddings)

print(f"\nConverted to PyTorch tensors:")
print(f"English tensor shape: {english_tensor.shape}")
print(f"Vietnamese tensor shape: {vietnamese_tensor.shape}")

# Step 3: Move to appropriate device for processing
device, _ = detect_device()
english_tensor = english_tensor.to(device)
vietnamese_tensor = vietnamese_tensor.to(device)

print(f"\nMoved to device: {device}")

# Step 4: Perform PyTorch operations (simulated sentiment analysis)
# Calculate average embeddings (sentence representations)
english_sentence_repr = torch.mean(english_tensor, dim=0)
vietnamese_sentence_repr = torch.mean(vietnamese_tensor, dim=0)

# Calculate similarity (dot product)
similarity = torch.dot(english_sentence_repr, vietnamese_sentence_repr)
print(f"\nCross-lingual similarity score: {similarity.item():.4f}")

# Step 5: Convert back to NumPy for visualization/further analysis
english_final = english_sentence_repr.cpu().numpy()
vietnamese_final = vietnamese_sentence_repr.cpu().numpy()

print(f"\nFinal sentence representations (NumPy):")
print(f"English sentence vector: {english_final[:5]}... (showing first 5 dims)")
print(f"Vietnamese sentence vector: {vietnamese_final[:5]}... (showing first 5 dims)")

# Step 6: Use NumPy for analysis and visualization
cosine_similarity = np.dot(english_final, vietnamese_final) / (
    np.linalg.norm(english_final) * np.linalg.norm(vietnamese_final)
)
print(f"\nCosine similarity (NumPy calculation): {cosine_similarity:.4f}")
print(f"Interpretation: {'Very similar' if cosine_similarity > 0.8 else 'Moderately similar' if cosine_similarity > 0.5 else 'Different'} semantic content")

# TensorFlow comparison
print("\n📊 TensorFlow vs PyTorch NumPy Integration:")
print("   TensorFlow: tf.convert_to_tensor(numpy_array)")
print("   PyTorch:    torch.from_numpy(numpy_array)")
print("   TensorFlow: tensor.numpy()")
print("   PyTorch:    tensor.numpy() or tensor.detach().numpy()")
print("\n💡 Note: on CPU, torch.from_numpy() and tensor.numpy() share memory with the NumPy array (no copy is made)")
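
The cosine similarity in Step 6 can also be computed without leaving PyTorch. Here is a small sketch with random stand-in vectors (the seed and shapes are chosen for illustration) that compares the manual formula with `torch.nn.functional.cosine_similarity`.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # arbitrary seed, for repeatability only

# Stand-ins for averaged word embeddings (8 and 7 "words", 300 dims each)
a = torch.randn(8, 300).mean(dim=0)
b = torch.randn(7, 300).mean(dim=0)

# Manual formula: dot product divided by the product of the L2 norms
manual = torch.dot(a, b) / (torch.linalg.norm(a) * torch.linalg.norm(b))

# Built-in equivalent; dim=0 because both inputs are 1-D vectors
builtin = F.cosine_similarity(a, b, dim=0)

print(f"Manual:   {manual.item():.6f}")
print(f"Built-in: {builtin.item():.6f}")
```

Staying in PyTorch avoids the device-to-host transfer when the vectors live on a GPU, which can matter inside a training loop.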

## 6. NLP-Focused Tensor Applications

Let's apply our tensor knowledge to common NLP tasks, preparing for neural network implementation and Hugging Face integration.

In [None]:
# Text tokenization and vocabulary mapping
print("📝 Text Tokenization with Tensors: Australian Tourism Reviews\n")

# Sample Australian tourism reviews (English and Vietnamese)
reviews = {
    'english': [
        "Sydney Opera House is stunning and iconic",
        "Bondi Beach has perfect waves for surfing",
        "Melbourne coffee culture is world famous",
        "Great Barrier Reef offers amazing snorkeling",
        "Uluru sunset views are absolutely breathtaking"
    ],
    'vietnamese': [
        "Nhà hát Opera Sydney tuyệt đẹp và mang tính biểu tượng",
        "Bãi biển Bondi có sóng hoàn hảo để lướt sóng",
        "Văn hóa cà phê Melbourne nổi tiếng thế giới",
        "Rạn san hô Great Barrier Reef cung cấp lặn tuyệt vời",
        "Cảnh hoàng hôn Uluru thật ngoạn mục"
    ]
}

# Simple tokenization (split by spaces)
def simple_tokenize(text):
    return text.lower().split()

# Build vocabulary from all reviews
all_tokens = []
for lang_reviews in reviews.values():
    for review in lang_reviews:
        all_tokens.extend(simple_tokenize(review))

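# Reserve indices 0-2 for special tokens so that padding (0), [CLS] (1) and [SEP] (2)
# never collide with the indices of real words
special_tokens = ['<pad>', '<cls>', '<sep>']
vocab = special_tokens + sorted(set(all_tokens))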
vocab_to_idx = {word: idx for idx, word in enumerate(vocab)}
idx_to_vocab = {idx: word for word, idx in vocab_to_idx.items()}

print(f"Vocabulary size: {len(vocab)}")
print(f"Sample vocabulary: {vocab[:10]}")
print(f"Sample Vietnamese words: {[w for w in vocab if any(c in 'àáảãạăắằẳẵặâấầẩẫậđèéẻẽẹêếềểễệìíỉĩịòóỏõọôốồổỗộơớờởỡợùúủũụưứừửữựỳýỷỹỵ' for c in w)][:5]}")

# Convert text to tensor of token indices
def text_to_tensor(text, vocab_to_idx, max_length=None):
    tokens = simple_tokenize(text)
    indices = [vocab_to_idx.get(token, 0) for token in tokens]  # 0 for unknown
    
    if max_length:
        if len(indices) < max_length:
            indices.extend([0] * (max_length - len(indices)))  # Pad with 0
        else:
            indices = indices[:max_length]  # Truncate
    
    return torch.tensor(indices, dtype=torch.long)

# Convert first review to tensor
first_english = reviews['english'][0]
first_vietnamese = reviews['vietnamese'][0]

english_tensor = text_to_tensor(first_english, vocab_to_idx, max_length=10)
vietnamese_tensor = text_to_tensor(first_vietnamese, vocab_to_idx, max_length=10)

print(f"\nFirst English review: '{first_english}'")
print(f"Tokenized tensor: {english_tensor}")
print(f"First Vietnamese review: '{first_vietnamese}'")
print(f"Tokenized tensor: {vietnamese_tensor}")

# Reconstruct text from tensor
def tensor_to_text(tensor, idx_to_vocab):
    tokens = [idx_to_vocab.get(idx.item(), '<UNK>') for idx in tensor if idx.item() != 0]
    return ' '.join(tokens)

reconstructed_english = tensor_to_text(english_tensor, idx_to_vocab)
print(f"\nReconstructed English: '{reconstructed_english}'")
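
One detail worth making explicit is how padding and unknown words are represented. A common convention, sketched below under the assumption that we reserve two special entries (`<pad>` and `<unk>`, names chosen here for illustration), is to give them their own indices so that they can never be confused with a real word.

```python
import torch

PAD, UNK = "<pad>", "<unk>"  # illustrative special tokens


def build_vocab(sentences):
    """Map each token to an index, with PAD=0 and UNK=1 reserved."""
    words = sorted({w for s in sentences for w in s.lower().split()})
    return {tok: i for i, tok in enumerate([PAD, UNK] + words)}


def encode(text, vocab, max_length):
    """Turn text into a fixed-length tensor of token indices."""
    ids = [vocab.get(w, vocab[UNK]) for w in text.lower().split()][:max_length]
    ids += [vocab[PAD]] * (max_length - len(ids))  # right-pad
    return torch.tensor(ids, dtype=torch.long)


def decode(ids, vocab):
    """Turn indices back into text, skipping padding."""
    inv = {i: t for t, i in vocab.items()}
    return " ".join(inv[i] for i in ids.tolist() if i != vocab[PAD])


vocab = build_vocab(["Bondi Beach has perfect waves"])
ids = encode("Bondi Beach has giant waves", vocab, max_length=7)
text = decode(ids, vocab)

print(ids)
print(text)
```

Because "giant" is not in the vocabulary it maps to `<unk>` rather than silently disappearing, which makes out-of-vocabulary words visible when debugging.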

In [None]:
# Batch processing and padding for NLP
print("📦 Batch Processing: Preparing for Neural Networks\n")

# Convert all reviews to tensors
all_reviews_text = reviews['english'] + reviews['vietnamese']
all_labels = [0] * len(reviews['english']) + [1] * len(reviews['vietnamese'])  # 0=English, 1=Vietnamese

print(f"Total reviews: {len(all_reviews_text)}")
print(f"Labels (0=English, 1=Vietnamese): {all_labels}")

# Find maximum sequence length
max_seq_length = max(len(simple_tokenize(review)) for review in all_reviews_text)
print(f"Maximum sequence length: {max_seq_length}")

# Convert all to padded tensors
review_tensors = []
for review in all_reviews_text:
    tensor = text_to_tensor(review, vocab_to_idx, max_length=max_seq_length)
    review_tensors.append(tensor)

# Stack into batch tensor
batch_reviews = torch.stack(review_tensors)
batch_labels = torch.tensor(all_labels, dtype=torch.long)

print(f"\nBatch tensor shape: {batch_reviews.shape} (batch_size × seq_length)")
print(f"Labels shape: {batch_labels.shape}")
print(f"Batch tensor:\n{batch_reviews}")
print(f"Labels: {batch_labels}")

# Create attention masks (for transformer models)
attention_masks = (batch_reviews != 0).float()  # 1 for real tokens, 0 for padding
print(f"\nAttention masks shape: {attention_masks.shape}")
print(f"Attention masks:\n{attention_masks}")

# Simulate embedding lookup (preparing for neural networks)
print(f"\n🔍 Simulating Embedding Lookup:")
embedding_dim = 50
vocab_size = len(vocab)

# Random embedding matrix (in real use, this would be learned)
torch.manual_seed(16)  # For reproducible results
embedding_matrix = torch.randn(vocab_size, embedding_dim)
print(f"Embedding matrix shape: {embedding_matrix.shape} (vocab_size × embedding_dim)")

# Lookup embeddings for first review
first_review_indices = batch_reviews[0]
first_review_embeddings = embedding_matrix[first_review_indices]
print(f"\nFirst review embeddings shape: {first_review_embeddings.shape} (seq_length × embedding_dim)")
print(f"First few embedding values: {first_review_embeddings[0, :5]}")
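
The manual padding and embedding lookup above have built-in counterparts. As a rough sketch (the toy indices and sizes are made up), `pad_sequence` pads variable-length sequences, `nn.Embedding` with `padding_idx=0` keeps the padding vector at zero, and the mask lets us average over real tokens only.

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Toy token-index sequences of different lengths (0 is reserved for padding)
seqs = [torch.tensor([5, 3, 8]), torch.tensor([2, 7]), torch.tensor([4])]

# Pad to the longest sequence: shape (batch_size, seq_length)
batch = pad_sequence(seqs, batch_first=True, padding_value=0)
mask = batch != 0  # True for real tokens, False for padding

torch.manual_seed(0)  # arbitrary seed, for repeatability only
embedding = torch.nn.Embedding(num_embeddings=10, embedding_dim=4, padding_idx=0)
embedded = embedding(batch)  # (batch_size, seq_length, embedding_dim)

# Masked mean pooling: average only over the real tokens in each sequence
mask_f = mask.unsqueeze(-1).float()
sentence_vectors = (embedded * mask_f).sum(dim=1) / mask_f.sum(dim=1)

print(f"Padded batch:\n{batch}")
print(f"Embedded shape: {tuple(embedded.shape)}")
print(f"Sentence vectors shape: {tuple(sentence_vectors.shape)}")
```

Without the mask, a plain `embedded.mean(dim=1)` would dilute short sentences with padding positions.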

In [None]:
# Preparing for Hugging Face integration
print("🤗 Preparing for Hugging Face Integration\n")

# Simulate data format expected by Hugging Face transformers
print("Creating data structures compatible with Hugging Face tokenizers...")

# Prepare data in the format expected by transformers
def prepare_for_transformers(texts, labels, max_length=64):
    """
    Prepare text data in format compatible with Hugging Face transformers.
    In practice, you'd use a proper tokenizer like AutoTokenizer.
    """
    input_ids = []
    attention_masks = []
    
    for text in texts:
        # Simulate tokenizer output (normally done by transformers tokenizer)
        tokens = simple_tokenize(text)
        
        # Convert to indices and add special tokens: ids 1 and 2 stand in for [CLS] and [SEP]
        # (a real tokenizer reserves dedicated ids for them in its vocabulary)
        token_ids = [1] + [vocab_to_idx.get(token, 0) for token in tokens] + [2]
        
        # Pad or truncate
        if len(token_ids) < max_length:
            attention_mask = [1] * len(token_ids) + [0] * (max_length - len(token_ids))
            token_ids.extend([0] * (max_length - len(token_ids)))
        else:
            token_ids = token_ids[:max_length - 1] + [2]  # Keep [SEP] as the final token
            attention_mask = [1] * max_length
        
        input_ids.append(token_ids)
        attention_masks.append(attention_mask)
    
    return {
        'input_ids': torch.tensor(input_ids, dtype=torch.long),
        'attention_mask': torch.tensor(attention_masks, dtype=torch.long),
        'labels': torch.tensor(labels, dtype=torch.long)
    }

# Prepare sample data
sample_texts = [
    "Sydney Opera House is magnificent",
    "Nhà hát Opera Sydney tuyệt vời"
]
sample_labels = [0, 1]  # English, Vietnamese

transformer_data = prepare_for_transformers(sample_texts, sample_labels)

print(f"Transformer-ready data structure:")
for key, value in transformer_data.items():
    print(f"  {key}: {value.shape}")
    print(f"    {value}")

# Show what this would look like with real Hugging Face usage
print(f"\n💡 Real Hugging Face Usage:")
print(f"""# With actual Hugging Face tokenizer:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('bert-base-multilingual-cased')
encoded = tokenizer(
    sample_texts,
    padding=True,
    truncation=True,
    max_length=64,
    return_tensors='pt'  # Return PyTorch tensors!
)
# Returns: input_ids, attention_mask, token_type_ids (all as tensors)""")

print(f"\n🎯 Key Tensor Concepts for NLP:")
print(f"  • input_ids: Token indices for transformer input")
print(f"  • attention_mask: Padding mask (1=real token, 0=padding)")
print(f"  • labels: Target classes or values for training")
print(f"  • embeddings: Dense vector representations of tokens")
print(f"  • batch_size: Number of examples processed together")
print(f"  • sequence_length: Maximum number of tokens per example")
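
A batch in this dictionary format is usually moved to the training device as a whole before it reaches a model. The sketch below uses a small hand-written batch (the token ids are invented for illustration) and shows one way to do that, plus how the attention mask recovers the true length of each example.

```python
import torch

# Hand-written toy batch in the input_ids / attention_mask / labels layout
batch = {
    "input_ids": torch.tensor([[1, 7, 9, 2, 0, 0], [1, 4, 2, 0, 0, 0]]),
    "attention_mask": torch.tensor([[1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 0]]),
    "labels": torch.tensor([0, 1]),
}

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Move every tensor in the dictionary to the same device
batch = {name: tensor.to(device) for name, tensor in batch.items()}

# Summing the mask along the sequence axis counts the real tokens per example
token_counts = batch["attention_mask"].sum(dim=1)

print(f"Device: {device}")
print(f"Real tokens per example: {token_counts.tolist()}")
```

Keeping all tensors of a batch on one device avoids the "expected all tensors to be on the same device" errors that appear when only some of them are moved.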

## 7. Summary and Next Steps

Congratulations! You've mastered the fundamentals of PyTorch tensors. Let's summarize key concepts and prepare for the next steps in your PyTorch journey.

In [None]:
# Summary of key tensor operations
print("🎓 PyTorch Tensor Mastery Summary\n")

print("✅ What You've Learned:")
print("  1. 🔧 Environment Setup: Cross-platform PyTorch installation")
print("  2. 📦 Tensor Creation: From lists, NumPy, and special functions")
print("  3. ➕ Tensor Operations: Math, matrix ops, and reductions")
print("  4. 🎯 Indexing & Slicing: Data selection and manipulation")
print("  5. 🔄 NumPy Bridge: Seamless interoperability")
print("  6. 📝 NLP Applications: Text processing and tokenization")

print("\n📊 Key Differences: TensorFlow → PyTorch")
comparison_table = [
    ["Execution", "Graph (1.x)/Eager (2.x)", "Always Eager"],
    ["Tensor Creation", "tf.constant()", "torch.tensor()"],
    ["Device Management", "Automatic", "Explicit .to(device)"],
    ["Reshaping", "tf.reshape()", "tensor.view() or .reshape()"],
    ["Random Tensors", "tf.random.normal()", "torch.randn()"],
    ["Matrix Multiply", "tf.matmul()", "torch.matmul()"],
    ["NumPy Conversion", "tf.convert_to_tensor()", "torch.from_numpy()"]
]

print(f"{'Operation':<20} {'TensorFlow':<25} {'PyTorch':<25}")
print("-" * 70)
for row in comparison_table:
    print(f"{row[0]:<20} {row[1]:<25} {row[2]:<25}")

print("\n🌟 PyTorch Advantages for NLP:")
print("  • Dynamic graphs: Perfect for variable-length sequences")
print("  • Pythonic syntax: Easy debugging and experimentation")
print("  • Hugging Face ecosystem: State-of-the-art NLP models")
print("  • Research-friendly: Widely used in academic research code")
print("  • Memory sharing with NumPy: Efficient data processing")

print("\n🎯 Next Steps in Your PyTorch Journey:")
print("  1. 🧠 Neural Networks: Learn nn.Module and autograd")
print("  2. 📚 Data Loading: Master DataLoader and Dataset")
print("  3. 🏋️ Training Loops: Implement optimization and backprop")
print("  4. 🤗 Hugging Face: Use pre-trained transformers")
print("  5. 🚀 Advanced Topics: Custom layers, mixed precision")

print("\n📝 Australian Context Examples You've Mastered:")
australian_examples = [
    "Population data for major Australian cities",
    "Tourism ratings for iconic attractions",
    "Weather analysis across different states",
    "Economic data processing (GDP by sector)",
    "Multilingual text processing (English-Vietnamese)",
    "Character frequency analysis for language detection",
    "Tourism spending categorization and analysis"
]

for i, example in enumerate(australian_examples, 1):
    print(f"  {i}. {example}")

print("\n🏆 You're now ready to build neural networks with PyTorch!")
print("Next recommended notebook: Neural Network fundamentals with nn.Module")
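
The reshaping row in the table lists both `view()` and `reshape()`, and the difference is easy to miss. A short sketch of it: `view()` never copies and therefore needs a compatible memory layout, while `reshape()` falls back to a copy when a view is not possible.

```python
import torch

x = torch.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]

# A view shares memory with the original tensor
v = x.view(3, 2)
v[0, 0] = 100
view_shares_memory = x[0, 0].item() == 100
x[0, 0] = 0  # restore the original value

# Transposing changes the strides, so the result is not contiguous
xt = x.t()
try:
    xt.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True

# reshape() handles the non-contiguous case by copying when it has to
flat = xt.reshape(6)

print(f"view shares memory: {view_shares_memory}")
print(f"view on transposed tensor failed: {view_failed}")
print(f"reshape result: {flat.tolist()}")
```

A reasonable default is to use `reshape()` unless you specifically want the guarantee that no copy is made.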

In [None]:
# Final practical exercise: Create your own tensor operations
print("🛠️ Final Exercise: Australian Wine Rating Analysis\n")

# Create a practical tensor exercise for students
print("🍷 Exercise: Analyze Australian wine ratings across regions")
print("Your task: Use tensor operations to find insights from wine data\n")

# Wine regions and their ratings (out of 100)
regions = ["Barossa Valley", "Hunter Valley", "Margaret River", "Yarra Valley", "Clare Valley"]
wine_types = ["Shiraz", "Chardonnay", "Cabernet Sauvignon", "Pinot Noir"]

# Wine ratings matrix: regions × wine_types
wine_ratings = torch.tensor([
    [95, 88, 92, 85],  # Barossa Valley
    [89, 94, 87, 91],  # Hunter Valley
    [91, 90, 96, 88],  # Margaret River
    [87, 92, 89, 94],  # Yarra Valley
    [93, 86, 90, 87]   # Clare Valley
], dtype=torch.float32)

print(f"Wine ratings tensor: {wine_ratings.shape} (regions × wine_types)")
print(f"Regions: {regions}")
print(f"Wine types: {wine_types}")
print(f"Ratings:\n{wine_ratings}")

# Exercise solutions
print("\n📊 Analysis Results:")

# 1. Best wine type overall
avg_by_wine = torch.mean(wine_ratings, dim=0)
best_wine_idx = torch.argmax(avg_by_wine)
print(f"1. Best wine type overall: {wine_types[best_wine_idx]} (avg: {avg_by_wine[best_wine_idx]:.1f})")

# 2. Best region overall
avg_by_region = torch.mean(wine_ratings, dim=1)
best_region_idx = torch.argmax(avg_by_region)
print(f"2. Best region overall: {regions[best_region_idx]} (avg: {avg_by_region[best_region_idx]:.1f})")

# 3. Highest single rating
max_rating = torch.max(wine_ratings)
max_indices = torch.where(wine_ratings == max_rating)
max_region = regions[max_indices[0][0]]
max_wine = wine_types[max_indices[1][0]]
print(f"3. Highest rating: {max_rating.item()} ({max_wine} from {max_region})")

# 4. Most consistent region (lowest std deviation)
region_stds = torch.std(wine_ratings, dim=1)
most_consistent_idx = torch.argmin(region_stds)
print(f"4. Most consistent region: {regions[most_consistent_idx]} (std: {region_stds[most_consistent_idx]:.2f})")

# 5. Regions with all wines rated 90 or above
excellent_regions = torch.all(wine_ratings >= 90, dim=1)
print(f"5. Regions with all wines ≥90:")
if excellent_regions.any():
    for i, is_excellent in enumerate(excellent_regions):
        if is_excellent:
            print(f"   • {regions[i]}")
else:
    print("   • None (every region has at least one wine rated below 90)")

print("\n🎉 Congratulations! You've completed your PyTorch tensor journey!")
print("You now have the foundation to build neural networks and work with Hugging Face transformers.")
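
As an optional extension of the exercise, the same ratings can be explored with two more reductions. This sketch reuses the ratings matrix from the exercise and assumes nothing beyond it: `torch.max` with a `dim` argument returns both values and indices, and `torch.topk` ranks the regions by average rating.

```python
import torch

regions = ["Barossa Valley", "Hunter Valley", "Margaret River", "Yarra Valley", "Clare Valley"]
wine_types = ["Shiraz", "Chardonnay", "Cabernet Sauvignon", "Pinot Noir"]

wine_ratings = torch.tensor([
    [95, 88, 92, 85],  # Barossa Valley
    [89, 94, 87, 91],  # Hunter Valley
    [91, 90, 96, 88],  # Margaret River
    [87, 92, 89, 94],  # Yarra Valley
    [93, 86, 90, 87],  # Clare Valley
], dtype=torch.float32)

# Best wine per region: max along dim=1 gives (values, indices)
best_scores, best_idx = torch.max(wine_ratings, dim=1)
for region, score, idx in zip(regions, best_scores.tolist(), best_idx.tolist()):
    print(f"{region}: {wine_types[idx]} ({score:.0f})")

# Top two regions by average rating
avg_by_region = wine_ratings.mean(dim=1)
top_values, top_idx = torch.topk(avg_by_region, k=2)
top_regions = [regions[i] for i in top_idx.tolist()]
print(f"Top regions by average: {top_regions}")
```

Calling `.tolist()` before indexing Python lists keeps the tensor-to-Python boundary explicit.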