# 🚀 Kaggle-Ready Phishing Detection LSTM Model

This notebook preprocesses behavioral event data and builds an LSTM model for phishing URL detection.

**⚡ Optimized for Dual T4 GPU Training on Kaggle**

**Dataset Structure:**
- **Features**: 25 behavioral features including SSL validity, redirects, forms, scripts, page load times, etc.
- **Target**: Binary classification (0 = legitimate, 1 = phishing)  
- **Format**: Sequential behavioral events from URL visits
- **Total samples**: 12,800+ URLs with comprehensive behavioral analysis
- **Feature groups**: Basic behavioral features (13) + Count-based features (12)

**Model Architecture:**
- **Basic LSTM**: Simple architecture for baseline comparison with robust error handling
- **Enhanced Bidirectional LSTM**: Advanced architecture with dual GPU support and comprehensive monitoring
- **Full epoch training**: 150 epochs with no early stopping, best model auto-selected
- **Deployment ready**: Complete prediction functions with 25-feature validation

**Training Configuration:**
- 🔥 **150 epochs** for the improved model (no early stopping; the best checkpoint is kept)
- ⚡ **Dual T4 GPU** support with MirroredStrategy on Kaggle
- 🎯 **Balanced training** with computed class weights
- 📊 **Comprehensive monitoring** with memory tracking and detailed visualizations
- 🛡️ **Robust error handling** throughout all training phases
- 💾 **Memory management** with usage monitoring and optimization

**Production Ready Features:**
- ✅ **Flexible data loading** with multiple path fallbacks for Kaggle/local environments
- ✅ **Enhanced validation** at every step with comprehensive error handling
- ✅ **Feature count verification** (exactly 25 features required for deployment)
- ✅ **Chrome extension integration** with detailed documentation and examples

## 1. Import Required Libraries

Import all necessary libraries for data preprocessing, model building, and evaluation.

In [None]:
# Data manipulation and analysis
import pandas as pd
import numpy as np
import json
import time
import pickle
import joblib
from collections import defaultdict

# Machine learning libraries
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score, roc_curve, f1_score
from sklearn.utils import shuffle
from sklearn.utils.class_weight import compute_class_weight

# Deep learning libraries
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout, Embedding, Bidirectional
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau, ModelCheckpoint, CSVLogger, LearningRateScheduler

# Visualization libraries
import matplotlib.pyplot as plt
import seaborn as sns

# Utilities
import warnings
warnings.filterwarnings('ignore')

print("✅ All libraries imported successfully!")
print(f"TensorFlow version: {tf.__version__}")
print(f"Pandas version: {pd.__version__}")
print(f"Numpy version: {np.__version__}")

## 2. Load and Explore Dataset

Load the preprocessed behavioral events dataset and explore its structure.

In [None]:
# Load the dataset - optimized for Kaggle
import pandas as pd
import numpy as np

# Load from Kaggle dataset path
print("📊 Loading dataset...")
try:
    df = pd.read_csv('/kaggle/input/phishing-dataset-full-lstm/events_dataset_full.csv')
    print("✅ Dataset loaded successfully from Kaggle!")
    print(f"Shape: {df.shape}")
    print(f"Columns: {list(df.columns)}")
except Exception as e:
    print(f"❌ Error loading dataset: {e}")
    raise

# Display basic info
print(f"\n📈 Dataset Info:")
print(f"Total samples: {len(df)}")
print(f"Total features: {df.shape[1] - 2}")  # -2 for 'url' and 'label' columns
print(f"Missing values: {df.isnull().sum().sum()}")

# Show class distribution
print(f"\n⚖️ Class Distribution:")
if 'label' in df.columns:
    class_counts = df['label'].value_counts()
    print(f"Legitimate (0): {class_counts.get(0, 0)} ({class_counts.get(0, 0)/len(df)*100:.1f}%)")
    print(f"Phishing (1): {class_counts.get(1, 0)} ({class_counts.get(1, 0)/len(df)*100:.1f}%)")
else:
    print("❌ 'label' column not found!")

# Display first few rows
print(f"\n📋 Sample Data:")
print(df.head())
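
The overview mentions multiple path fallbacks, while the cell above reads from one fixed Kaggle path. A minimal sketch of such a fallback follows. Only the first path comes from this notebook; the two local paths and the helper name `first_existing_path` are assumptions made for illustration.

```python
from pathlib import Path

# Only the first path appears in this notebook; the other two are
# hypothetical local locations used for illustration.
CANDIDATE_PATHS = [
    "/kaggle/input/phishing-dataset-full-lstm/events_dataset_full.csv",
    "data/events_dataset_full.csv",
    "events_dataset_full.csv",
]

def first_existing_path(candidates):
    """Return the first candidate that exists on disk, or None."""
    for candidate in candidates:
        if Path(candidate).is_file():
            return candidate
    return None

dataset_path = first_existing_path(CANDIDATE_PATHS)
if dataset_path is None:
    print("Dataset not found in any candidate location")
else:
    print(f"Dataset found at: {dataset_path}")
```

The returned path could then be passed to `pd.read_csv` in place of the hard-coded string.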

## 3. Data Preprocessing and Cleaning

Clean and prepare the data for LSTM training. The dataset contains 25 behavioral features (excluding 'url' and 'label' columns) that characterize URL visiting behavior for phishing detection.

In [None]:
# Simple data preprocessing for LSTM training
print("🧹 Starting data preprocessing...")

# Make a copy for cleaning
df_clean = df.copy()

# Remove URL column (not needed for training)
if 'url' in df_clean.columns:
    df_clean = df_clean.drop('url', axis=1)
    print("🗑️ Removed 'url' column")

# Fill missing values with 0
df_clean = df_clean.fillna(0)

# Convert boolean success column to integer if it exists
if 'success' in df_clean.columns:
    df_clean['success'] = df_clean['success'].astype(int)

# Remove duplicates
print(f"Removing duplicates: {len(df_clean)} → ", end="")
df_clean = df_clean.drop_duplicates()
print(f"{len(df_clean)}")

# Select feature columns (exclude target variable)
feature_columns = [col for col in df_clean.columns if col != 'label']
X = df_clean[feature_columns]
y = df_clean['label']

print(f"\n✅ Data preprocessing completed:")
print(f"   Features: {len(feature_columns)}")
print(f"   Samples: {len(X)}")
print(f"   Classes: {sorted(y.unique())}")

# List the first feature columns
print(f"\nFeature columns:")
for i, col in enumerate(feature_columns[:10]):  # Show first 10
    print(f"  {i+1}. {col}")
if len(feature_columns) > 10:
    print(f"  ... and {len(feature_columns)-10} more features")

# Check class balance
class_counts = y.value_counts().sort_index()
print(f"\n⚖️ Class balance:")
for class_val, count in class_counts.items():
    print(f"   Class {class_val}: {count} samples ({count/len(y)*100:.1f}%)")

## 4. Feature Engineering and Sequence Preparation

Prepare features for LSTM input by creating sequences and normalizing data.

In [None]:
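# Normalize features
# Note: fitting the scaler on the full dataset lets test-set statistics leak
# into the scaling; fitting on the training split only avoids this.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)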

print(f"✅ Features scaled. Shape: {X_scaled.shape}")

# Keras LSTM layers expect 3-D input: (samples, timesteps, features).
# Each row holds the aggregated behavioral features of one URL, so there is no
# real time axis and each sample becomes a sequence of length 1. With a single
# timestep the recurrent connections carry no temporal context.
# Alternatively, artificial sequences can be created by grouping features (Method 2).

# Method 1: Simple approach - reshape for LSTM (samples, timesteps=1, features)
X_lstm = X_scaled.reshape(X_scaled.shape[0], 1, X_scaled.shape[1])

print(f"📊 LSTM input shape: {X_lstm.shape}")
print(f"   - Samples: {X_lstm.shape[0]}")
print(f"   - Time steps: {X_lstm.shape[1]}")
print(f"   - Features: {X_lstm.shape[2]}")

# Alternative Method 2: Create multi-step sequences by grouping similar features
# Group features by type for better sequence modeling
feature_groups = {
    'ssl_features': ['ssl_valid', 'ssl_invalid'],
    'content_features': ['forms', 'password_fields', 'iframes', 'scripts', 'suspicious_keywords'],
    'network_features': ['redirects', 'external_requests', 'page_load_time'],
    'error_features': ['has_errors', 'success'],
    'count_features': [col for col in feature_columns if col.startswith('count_')]
}

print(f"\n🎯 Feature groups:")
for group, features in feature_groups.items():
    available_features = [f for f in features if f in feature_columns]
    print(f"   {group}: {len(available_features)} features")

# Create sequences using feature groups (optional alternative approach)
def create_feature_sequences(X_data, feature_groups, feature_columns):
    """Create sequences by grouping related features"""
    sequences = []
    
    for idx in range(len(X_data)):
        sample = X_data[idx]
        sequence = []
        
        # Create a sequence step for each feature group
        for group_name, group_features in feature_groups.items():
            available_features = [f for f in group_features if f in feature_columns]
            if available_features:
                # Get indices of these features
                feature_indices = [feature_columns.index(f) for f in available_features]
                # Extract values for this group
                group_values = [sample[i] for i in feature_indices]
                sequence.append(group_values)
        
        sequences.append(sequence)
    
    return sequences

# Use simple approach for now (Method 1)
print(f"\n✅ Using simple LSTM sequence format")
print(f"Final X shape for LSTM: {X_lstm.shape}")
print(f"Final y shape: {y.shape}")
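
`create_feature_sequences` returns one list per feature group, and the groups have different widths, so its output cannot be fed to an LSTM directly. The sketch below pads such ragged sequences into a dense array. The helper name `pad_group_sequences` and the toy values are assumptions for illustration, not part of the notebook.

```python
import numpy as np

def pad_group_sequences(sequences, pad_value=0.0):
    """Pad ragged per-group feature lists into a dense 3-D array.

    sequences: list of samples, each a list of steps, each a list of floats.
    Returns an array of shape (samples, steps, widest_step).
    """
    n_steps = max(len(seq) for seq in sequences)
    width = max(len(step) for seq in sequences for step in seq)
    padded = np.full((len(sequences), n_steps, width), pad_value, dtype=np.float32)
    for i, seq in enumerate(sequences):
        for j, step in enumerate(seq):
            padded[i, j, :len(step)] = step
    return padded

# Toy input: two samples, three groups of widths 2, 3 and 1.
toy_sequences = [
    [[1.0, 0.0], [0.5, 0.2, 0.1], [3.0]],
    [[0.0, 1.0], [0.4, 0.0, 0.9], [1.0]],
]
X_grouped = pad_group_sequences(toy_sequences)
print(X_grouped.shape)  # (2, 3, 3)
```

Zero is also the mean of a standardized feature, so the model cannot tell padding from an average value unless it is told which positions are padded.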

## 5. Train-Test Split

Split the data into training and testing sets for model evaluation.

In [None]:
# Split the data
X_train, X_test, y_train, y_test = train_test_split(
    X_lstm, y, 
    test_size=0.2, 
    random_state=42, 
    stratify=y
)

print(f"📊 Data split completed:")
print(f"   Training set: {X_train.shape} | {y_train.shape}")
print(f"   Testing set: {X_test.shape} | {y_test.shape}")

print(f"\n🎯 Class distribution in training set:")
print(y_train.value_counts())
print(f"Training balance: {y_train.value_counts()[0] / y_train.value_counts()[1]:.2f}")

print(f"\n🎯 Class distribution in testing set:")
print(y_test.value_counts())
print(f"Testing balance: {y_test.value_counts()[0] / y_test.value_counts()[1]:.2f}")

# Convert to numpy arrays for TensorFlow
X_train = np.array(X_train)
X_test = np.array(X_test)
y_train = np.array(y_train)
y_test = np.array(y_test)

print(f"\n✅ Data converted to numpy arrays")
print(f"X_train shape: {X_train.shape}")
print(f"X_test shape: {X_test.shape}")
print(f"y_train shape: {y_train.shape}")
print(f"y_test shape: {y_test.shape}")
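
Section 4 fits the scaler on the full dataset before this split, so test-set statistics influence the scaling. The sketch below shows one way to avoid that: split first, then compute the mean and standard deviation on the training rows only. It uses plain NumPy on synthetic data and applies the same (x - mean) / std formula as `StandardScaler`. All variable names are illustrative, and the split here is not stratified.

```python
import numpy as np

rng = np.random.default_rng(42)
X_raw = rng.normal(loc=5.0, scale=2.0, size=(100, 25))  # synthetic stand-in data

# Split first (simple 80/20 split for illustration; no stratification here).
indices = rng.permutation(len(X_raw))
train_idx, test_idx = indices[:80], indices[80:]

# Fit the scaling statistics on the training rows only.
train_mean = X_raw[train_idx].mean(axis=0)
train_std = X_raw[train_idx].std(axis=0)
train_std[train_std == 0] = 1.0  # avoid division by zero for constant features

X_train_scaled = (X_raw[train_idx] - train_mean) / train_std
X_test_scaled = (X_raw[test_idx] - train_mean) / train_std  # reuse training statistics

# Add the single-timestep axis expected by the LSTM: (samples, 1, features).
X_train_seq = X_train_scaled[:, np.newaxis, :]
X_test_seq = X_test_scaled[:, np.newaxis, :]
print(X_train_seq.shape, X_test_seq.shape)
```

With scikit-learn the equivalent is to call `fit_transform` on the training rows and `transform` on the test rows.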

## 6. Build LSTM Model Architecture

Define the LSTM model architecture for binary classification.

In [None]:
# Simple LSTM Model Architecture
def create_lstm_model(input_shape):
    """Create LSTM model for phishing detection"""
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import LSTM, Dense, Dropout
    
    model = Sequential([
        # LSTM layers
        LSTM(128, return_sequences=True, input_shape=input_shape),
        Dropout(0.3),
        
        LSTM(64, return_sequences=False),
        Dropout(0.3),
        
        # Dense layers
        Dense(50, activation='relu'),
        Dropout(0.2),
        
        Dense(25, activation='relu'),
        Dropout(0.2),
        
        # Output layer for binary classification
        Dense(1, activation='sigmoid')
    ])
    
    return model

# Create the model
input_shape = (X_train.shape[1], X_train.shape[2])  # (timesteps, features)
print(f"🏗️ Creating LSTM model with input shape: {input_shape}")

model = create_lstm_model(input_shape)

# Display model architecture
print("📊 LSTM Model Architecture:")
model.summary()

# Count parameters
total_params = model.count_params()
print(f"\n📈 Total parameters: {total_params:,}")

print("✅ LSTM model created successfully!")
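
The parameter count printed by `model.summary()` can be checked by hand. The sketch below applies the standard counts for an LSTM layer (four gates, each with input weights, recurrent weights, and a bias) and for a dense layer. It assumes 25 input features, as stated in the overview. The helper names are illustrative.

```python
def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights and a bias."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    """One weight per input-output pair plus one bias per unit."""
    return input_dim * units + units

n_features = 25  # assumed feature count, as stated in the overview

layer_counts = {
    "lstm_128": lstm_params(n_features, 128),
    "lstm_64": lstm_params(128, 64),
    "dense_50": dense_params(64, 50),
    "dense_25": dense_params(50, 25),
    "output": dense_params(25, 1),
}
total = sum(layer_counts.values())
for name, count in layer_counts.items():
    print(f"{name}: {count:,}")
print(f"total: {total:,}")
```

Dropout layers add no parameters, so they are omitted from the tally.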

## 7. Compile and Train the Model

Compile the model with appropriate optimizer and loss function, then train it.

In [None]:
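# Compile the model
model.compile(
    optimizer=Adam(learning_rate=0.001),
    loss='binary_crossentropy',
    # Metric objects with explicit names keep the history keys stable
    # ('precision', 'val_precision', ...) across Keras versions.
    metrics=[
        'accuracy',
        tf.keras.metrics.Precision(name='precision'),
        tf.keras.metrics.Recall(name='recall')
    ]
)

print("✅ Model compiled successfully!")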

# Define callbacks (NO EARLY STOPPING - for comparison with improved model)
reduce_lr = ReduceLROnPlateau(
    monitor='val_loss',
    factor=0.2,
    patience=5,
    min_lr=0.0001,
    verbose=1
)

# Save best model without early stopping
model_checkpoint = ModelCheckpoint(
    'basic_lstm_model_best.h5',
    monitor='val_accuracy',
    mode='max',
    save_best_only=True,
    verbose=1,
    save_weights_only=False
)

callbacks = [reduce_lr, model_checkpoint]

print("📋 Callbacks configured:")
print("   - Learning Rate Reduction (patience=5)")
print("   - ModelCheckpoint (saves best model)")
print("   - NO EARLY STOPPING - will train for full epochs")

# Train the model for full epochs
print("\n🚀 Starting model training...")

history = model.fit(
    X_train, y_train,
    epochs=150,  # Full training epochs to match improved model
    batch_size=32,
    validation_split=0.2,
    callbacks=callbacks,
    verbose=1
)

print("\n✅ Basic model training completed!")
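
`ReduceLROnPlateau` multiplies the learning rate by `factor` each time the monitored metric stalls, and never goes below `min_lr`. The sketch below works through that arithmetic for the settings used above (initial rate 0.001, factor 0.2, floor 0.0001). The function name is illustrative, and the sketch models only the update rule, not the patience logic.

```python
def lr_after_reductions(initial_lr, factor, min_lr, n_reductions):
    """Learning rate after n plateau-triggered reductions, clamped at min_lr."""
    lr = initial_lr
    for _ in range(n_reductions):
        lr = max(lr * factor, min_lr)
    return lr

for n in range(4):
    print(n, lr_after_reductions(0.001, 0.2, 0.0001, n))
```

With these settings the rate reaches its floor after two reductions, so over 150 epochs it can drop by at most a factor of 10.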

## 8. Model Evaluation and Metrics

Evaluate the trained model and visualize performance metrics.

In [None]:
# Plot training history
def plot_training_history(history):
    """Plot training and validation metrics"""
    
    fig, axes = plt.subplots(2, 2, figsize=(15, 10))
    
    # Plot loss
    axes[0, 0].plot(history.history['loss'], label='Training Loss')
    axes[0, 0].plot(history.history['val_loss'], label='Validation Loss')
    axes[0, 0].set_title('Model Loss')
    axes[0, 0].set_xlabel('Epoch')
    axes[0, 0].set_ylabel('Loss')
    axes[0, 0].legend()
    axes[0, 0].grid(True)
    
    # Plot accuracy
    axes[0, 1].plot(history.history['accuracy'], label='Training Accuracy')
    axes[0, 1].plot(history.history['val_accuracy'], label='Validation Accuracy')
    axes[0, 1].set_title('Model Accuracy')
    axes[0, 1].set_xlabel('Epoch')
    axes[0, 1].set_ylabel('Accuracy')
    axes[0, 1].legend()
    axes[0, 1].grid(True)
    
    # Plot precision
    axes[1, 0].plot(history.history['precision'], label='Training Precision')
    axes[1, 0].plot(history.history['val_precision'], label='Validation Precision')
    axes[1, 0].set_title('Model Precision')
    axes[1, 0].set_xlabel('Epoch')
    axes[1, 0].set_ylabel('Precision')
    axes[1, 0].legend()
    axes[1, 0].grid(True)
    
    # Plot recall
    axes[1, 1].plot(history.history['recall'], label='Training Recall')
    axes[1, 1].plot(history.history['val_recall'], label='Validation Recall')
    axes[1, 1].set_title('Model Recall')
    axes[1, 1].set_xlabel('Epoch')
    axes[1, 1].set_ylabel('Recall')
    axes[1, 1].legend()
    axes[1, 1].grid(True)
    
    plt.tight_layout()
    plt.show()

# Plot training history
plot_training_history(history)

# Evaluate on test set
print("🎯 Evaluating model on test set...")
test_loss, test_accuracy, test_precision, test_recall = model.evaluate(X_test, y_test, verbose=0)

print(f"\n📊 Test Set Performance:")
print(f"   Loss: {test_loss:.4f}")
print(f"   Accuracy: {test_accuracy:.4f}")
print(f"   Precision: {test_precision:.4f}")
print(f"   Recall: {test_recall:.4f}")

# Make predictions
y_pred_proba = model.predict(X_test)
y_pred = (y_pred_proba > 0.5).astype(int).flatten()

# Calculate additional metrics
roc_auc = roc_auc_score(y_test, y_pred_proba)
print(f"   ROC AUC: {roc_auc:.4f}")

# Classification report
print(f"\n📋 Detailed Classification Report:")
print(classification_report(y_test, y_pred, target_names=['Legitimate', 'Phishing']))
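
The cell above converts probabilities to labels with a fixed 0.5 threshold. Precision and recall trade off as that threshold moves, which matters when missed phishing pages and false alarms have different costs. The sketch below computes both metrics at several thresholds on toy data. The helper name and the toy arrays are assumptions for illustration.

```python
import numpy as np

def metrics_at_threshold(y_true, y_prob, threshold):
    """Precision, recall and F1 for the positive (phishing) class."""
    y_hat = (y_prob > threshold).astype(int)
    tp = int(np.sum((y_hat == 1) & (y_true == 1)))
    fp = int(np.sum((y_hat == 1) & (y_true == 0)))
    fn = int(np.sum((y_hat == 0) & (y_true == 1)))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# Toy labels and scores, for illustration only.
y_true_demo = np.array([0, 0, 1, 1, 1, 0])
y_prob_demo = np.array([0.1, 0.6, 0.4, 0.8, 0.9, 0.2])

for threshold in (0.3, 0.5, 0.7):
    p, r, f1 = metrics_at_threshold(y_true_demo, y_prob_demo, threshold)
    print(f"threshold={threshold:.1f} precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

If a threshold other than 0.5 is adopted, it should be chosen on validation data rather than on the test set, so that the test metrics stay unbiased.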

In [None]:
# Plot confusion matrix and ROC curve
fig, axes = plt.subplots(1, 2, figsize=(15, 5))

# Confusion Matrix
cm = confusion_matrix(y_test, y_pred)
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', 
            xticklabels=['Legitimate', 'Phishing'],
            yticklabels=['Legitimate', 'Phishing'],
            ax=axes[0])
axes[0].set_title('Confusion Matrix')
axes[0].set_xlabel('Predicted')
axes[0].set_ylabel('Actual')

# ROC Curve
fpr, tpr, thresholds = roc_curve(y_test, y_pred_proba)
axes[1].plot(fpr, tpr, color='darkorange', lw=2, label=f'ROC curve (AUC = {roc_auc:.4f})')
axes[1].plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--')
axes[1].set_xlim([0.0, 1.0])
axes[1].set_ylim([0.0, 1.05])
axes[1].set_xlabel('False Positive Rate')
axes[1].set_ylabel('True Positive Rate')
axes[1].set_title('ROC Curve')
axes[1].legend(loc="lower right")
axes[1].grid(True)

plt.tight_layout()
plt.show()

# Feature relevance analysis (approximate)
print("\n🔍 Analyzing feature importance...")

# An LSTM exposes no direct feature importance. As a rough proxy, measure the
# absolute Pearson correlation between each input feature and the predicted
# probability. This captures linear association only, not causal importance.
feature_importance = []
for i in range(X_test.shape[2]):  # For each feature
    feature_values = X_test[:, 0, i]  # Extract feature values (timestep 0)
    correlation = np.corrcoef(feature_values, y_pred_proba.flatten())[0, 1]
    # Constant features yield a NaN correlation; treat them as 0
    feature_importance.append(0.0 if np.isnan(correlation) else abs(correlation))

# Sort features by importance
feature_names = feature_columns[:len(feature_importance)]  # Match length
importance_df = pd.DataFrame({
    'feature': feature_names,
    'importance': feature_importance
}).sort_values('importance', ascending=False)

# Plot top 10 most important features
plt.figure(figsize=(10, 6))
top_features = importance_df.head(10)
plt.barh(range(len(top_features)), top_features['importance'])
plt.yticks(range(len(top_features)), top_features['feature'])
plt.xlabel('Absolute Correlation with Predictions')
plt.title('Top 10 Most Important Features')
plt.gca().invert_yaxis()
plt.tight_layout()
plt.show()

print("📊 Top 10 most important features:")
print(importance_df.head(10))
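
The correlation ranking above measures linear association with the model's output. Permutation importance is a different technique, swapped in here as a complement: shuffle one feature at a time and record how much accuracy drops. The sketch below runs it on a toy stand-in model. `permutation_importance`, `toy_predict`, and the synthetic data are assumptions for illustration, not the notebook's method.

```python
import numpy as np

def permutation_importance(predict_fn, X, y, n_repeats=5, seed=0):
    """Mean drop in accuracy when each feature column is shuffled."""
    rng = np.random.default_rng(seed)
    baseline = np.mean((predict_fn(X) > 0.5).astype(int) == y)
    drops = np.zeros(X.shape[1])
    for col in range(X.shape[1]):
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, col] = rng.permutation(X_perm[:, col])
            acc = np.mean((predict_fn(X_perm) > 0.5).astype(int) == y)
            drops[col] += baseline - acc
    return drops / n_repeats

# Toy stand-in for the model: only feature 0 drives the prediction.
def toy_predict(X):
    return 1.0 / (1.0 + np.exp(-4.0 * X[:, 0]))

rng = np.random.default_rng(1)
X_demo = rng.normal(size=(500, 3))
y_demo = (X_demo[:, 0] > 0).astype(int)

importances = permutation_importance(toy_predict, X_demo, y_demo)
print(importances)
```

To apply this to the trained network, `predict_fn` would need to wrap `model.predict`, adding the timestep axis to the 2-D input and flattening the output.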

## 9. Save Model and Predictions

Save the trained model and generate predictions for deployment.

In [None]:
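# Save the trained model (final-epoch weights; the best checkpoint by
# val_accuracy was already written to 'basic_lstm_model_best.h5')
model.save('phishing_lstm_model.h5')
print("✅ Model saved as 'phishing_lstm_model.h5'")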

# Save the scaler for future use
import joblib
joblib.dump(scaler, 'feature_scaler.pkl')
print("✅ Feature scaler saved as 'feature_scaler.pkl'")

# Save model architecture as JSON
model_json = model.to_json()
with open('model_architecture.json', 'w') as json_file:
    json_file.write(model_json)
print("✅ Model architecture saved as 'model_architecture.json'")

# Save training history
import pickle
with open('training_history.pkl', 'wb') as f:
    pickle.dump(history.history, f)
print("✅ Training history saved as 'training_history.pkl'")

# Create a summary report
report = {
    'model_performance': {
        'test_accuracy': float(test_accuracy),
        'test_precision': float(test_precision),
        'test_recall': float(test_recall),
        'test_loss': float(test_loss),
        'roc_auc': float(roc_auc)
    },
    'dataset_info': {
        'total_samples': len(df_clean),
        'training_samples': len(X_train),
        'testing_samples': len(X_test),
        'num_features': X_train.shape[2],
        'class_balance': {
            'legitimate': int(class_counts[0]),
            'phishing': int(class_counts[1])
        }
    },
    'model_info': {
        'total_parameters': int(total_params),
        'input_shape': list(input_shape),
        'architecture': 'LSTM with Dense layers'
    }
}

# Save report as JSON
with open('model_report.json', 'w') as f:
    json.dump(report, f, indent=2)
print("✅ Model report saved as 'model_report.json'")

# Display final summary
print(f"\n🎉 Training Complete! Final Results:")
print(f"   📊 Test Accuracy: {test_accuracy:.4f}")
print(f"   🎯 Test Precision: {test_precision:.4f}")
print(f"   🔍 Test Recall: {test_recall:.4f}")
print(f"   📈 ROC AUC: {roc_auc:.4f}")
print(f"\n📁 Files saved:")
print(f"   - phishing_lstm_model.h5 (trained model)")
print(f"   - feature_scaler.pkl (preprocessing scaler)")
print(f"   - model_architecture.json (model structure)")
print(f"   - training_history.pkl (training metrics)")
print(f"   - model_report.json (performance summary)")

# Example prediction function
def predict_phishing(url_features, model, scaler):
    """
    Predict if a URL is phishing based on behavioral features
    
    Args:
        url_features: List or array of behavioral features
        model: Trained LSTM model
        scaler: Fitted StandardScaler
    
    Returns:
        probability of being phishing (0-1)
    """
    # Ensure features are in the right format
    features = np.array(url_features).reshape(1, -1)
    
    # Scale features
    features_scaled = scaler.transform(features)
    
    # Reshape for LSTM (samples, timesteps, features)
    features_lstm = features_scaled.reshape(1, 1, -1)
    
    # Make prediction
    prediction = model.predict(features_lstm, verbose=0)[0][0]
    
    return prediction

print(f"\n🚀 Model ready for deployment!")
print(f"Use the predict_phishing() function to make predictions on new URLs.")
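
`predict_phishing` reshapes whatever it receives, so an input with the wrong number of features only fails later, inside the scaler. The overview calls for exactly 25 features at deployment. The sketch below shows a check that could run before scaling. `validate_features` and `EXPECTED_FEATURES` are hypothetical names introduced for illustration.

```python
import numpy as np

EXPECTED_FEATURES = 25  # feature count stated in the overview

def validate_features(url_features, expected=EXPECTED_FEATURES):
    """Return a (1, expected) float array or raise ValueError."""
    features = np.asarray(url_features, dtype=np.float64).reshape(1, -1)
    if features.shape[1] != expected:
        raise ValueError(f"Expected {expected} features, got {features.shape[1]}")
    if not np.all(np.isfinite(features)):
        raise ValueError("Features contain NaN or infinite values")
    return features

ok = validate_features([0.0] * 25)
print(ok.shape)  # (1, 25)

try:
    validate_features([0.0] * 24)
except ValueError as err:
    print(f"Rejected: {err}")
```

The validated array has the shape that `scaler.transform` expects, so it could be passed straight into the scaling step of `predict_phishing`.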

## 10. Improved Model with Better Hyperparameters

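This section builds a bidirectional LSTM with class weighting, gradient clipping, and a lower learning rate, and trains it under a multi-GPU strategy when more than one GPU is available.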

In [None]:
# Enhanced GPU Configuration for Kaggle Dual T4 GPUs
print("🔧 Configuring GPU environment for optimal training...")
print(f"TensorFlow version: {tf.__version__}")

# Comprehensive GPU detection and configuration
try:
    # List all available devices
    devices = tf.config.list_physical_devices()
    print(f"\n📱 All available devices:")
    for device in devices:
        print(f"   {device}")
    
    # Focus on GPUs
    gpus = tf.config.list_physical_devices('GPU')
    print(f"\n🎮 GPU Detection Results:")
    print(f"   Available GPUs: {len(gpus)}")
    
    if gpus:
        for i, gpu in enumerate(gpus):
            print(f"   GPU {i}: {gpu}")
            
        # Configure GPU memory growth to prevent OOM errors
        print(f"\n⚙️ Configuring GPU memory management...")
        try:
            for gpu in gpus:
                tf.config.experimental.set_memory_growth(gpu, True)
            print("   ✅ Memory growth enabled for all GPUs")
        except (RuntimeError, ValueError) as e:
            print(f"   ⚠️ Memory growth configuration failed: {e}")
            print("   This may occur if the GPUs are already initialized")

        # Set GPU memory limit if needed (useful for preventing OOM)
        try:
            for i, gpu in enumerate(gpus):
                tf.config.experimental.set_virtual_device_configuration(
                    gpu,
                    [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024*14)]  # limit in MB: 14 GB per GPU
                )
            print("   ✅ GPU memory limits configured (14GB per GPU)")
        except (RuntimeError, ValueError) as e:
            print(f"   ⚠️ Memory limit configuration skipped: {e}")
    else:
        print("   ⚠️ No GPUs detected - will use CPU training")
        print("   This will be significantly slower for LSTM training")
    
except Exception as e:
    print(f"❌ GPU detection failed: {str(e)}")
    gpus = []

# Configure distributed training strategy
print(f"\n🌐 Configuring training strategy...")
try:
    if len(gpus) > 1:
        # Multi-GPU strategy for Kaggle dual T4 setup
        strategy = tf.distribute.MirroredStrategy()
        print(f"   🚀 MirroredStrategy initialized")
        print(f"   🔥 Training will use {strategy.num_replicas_in_sync} GPUs in parallel")
        print(f"   ⚡ Rough speedup estimate (not measured): ~{strategy.num_replicas_in_sync*0.8:.1f}x")

        # Optimize for dual GPU setup
        tf.config.optimizer.set_jit(True)  # Enable XLA compilation
        print(f"   ✅ XLA optimization enabled")
        
    elif len(gpus) == 1:
        strategy = tf.distribute.get_strategy()
        print(f"   🔧 Single GPU strategy selected")
        print(f"   📊 Will use: {gpus[0]}")

        # Single GPU optimizations
        tf.config.optimizer.set_jit(True)
        print(f"   ✅ XLA optimization enabled")
        
    else:
        strategy = tf.distribute.get_strategy()
        print(f"   💻 CPU-only strategy selected")
        print(f"   ⚠️ Training will be significantly slower without GPU")
        
except Exception as e:
    print(f"❌ Strategy configuration failed: {str(e)}")
    strategy = tf.distribute.get_strategy()
    print(f"   🔄 Falling back to default strategy")

print(f"\n📊 Final configuration:")
print(f"   Strategy: {type(strategy).__name__}")
print(f"   Devices: {strategy.num_replicas_in_sync}")
print(f"   GPU Memory: {'Managed' if gpus else 'N/A'}")

# Memory usage estimation
if gpus:
    print(f"\n💾 Memory usage estimation:")
    print(f"   Model parameters: ~{improved_total_params if 'improved_total_params' in globals() else 'TBD'}")
    print(f"   Batch size: 16 per GPU = {16 * strategy.num_replicas_in_sync} total")
    print(f"   Expected GPU memory usage: ~8-12GB per GPU")
    print(f"   Kaggle T4 GPU memory: 16GB per GPU")
    print(f"   ✅ Memory requirements should be satisfied")

# Improved Model Architecture with Enhanced Error Handling
def create_improved_lstm_model(input_shape):
    """
    Create improved LSTM model with comprehensive validation
    
    Args:
        input_shape: Tuple of (timesteps, features)
    
    Returns:
        Uncompiled Keras model (compiled later inside the strategy scope)
    """
    try:
        # Validate input shape
        if len(input_shape) != 2:
            raise ValueError(f"Expected input_shape length 2, got {len(input_shape)}")
        
        timesteps, features = input_shape
        if features != 25:
            print(f"⚠️ Warning: Expected 25 features, got {features}")

        print(f"🏗️ Building model for input shape: {input_shape}")
        
        model = Sequential([
            # Bidirectional LSTM layers for better pattern recognition
            Bidirectional(LSTM(64, return_sequences=True, recurrent_dropout=0.2), 
                         input_shape=input_shape, name='bidirectional_lstm_1'),
            Dropout(0.3, name='dropout_1'),
            
            Bidirectional(LSTM(32, return_sequences=False, recurrent_dropout=0.2),
                         name='bidirectional_lstm_2'),
            Dropout(0.3, name='dropout_2'),
            
            # Dense layers with regularization
            Dense(64, activation='relu', name='dense_1'),
            Dropout(0.4, name='dropout_3'),
            
            Dense(32, activation='relu', name='dense_2'),
            Dropout(0.3, name='dropout_4'),
            
            Dense(16, activation='relu', name='dense_3'),
            Dropout(0.2, name='dropout_5'),
            
            # Output layer
            Dense(1, activation='sigmoid', name='output')
        ])
        
        print(f"✅ Model architecture created successfully")
        return model
        
    except Exception as e:
        print(f"❌ Model creation failed: {str(e)}")
        raise

# Calculate class weights for balanced training with error handling
try:
    if 'y_train' in globals():
        class_weights = compute_class_weight(
            'balanced',
            classes=np.unique(y_train),
            y=y_train
        )
        # Map each class label (not its position) to its weight
        class_weight_dict = {int(cls): float(w) for cls, w in zip(np.unique(y_train), class_weights)}
        print(f"\n🎯 Class weights calculated: {class_weight_dict}")
    else:
        print(f"⚠️ y_train not available yet - will calculate class weights later")
        class_weight_dict = None

except Exception as e:
    print(f"❌ Class weight calculation failed: {str(e)}")
    class_weight_dict = None

# Create and compile the improved model within strategy scope
print(f"\n🔨 Creating improved model within strategy scope...")
try:
    with strategy.scope():
        # Validate input shape availability
        if 'input_shape' not in globals():
            print(f"⚠️ input_shape not defined - using default")
            input_shape = (1, 25)  # Default shape for LSTM
            
        # Create the improved model
        improved_model = create_improved_lstm_model(input_shape)
        
        # Compile with strategy scope
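        improved_model.compile(
            optimizer=tf.keras.optimizers.Adam(learning_rate=0.0005, clipnorm=1.0),
            loss='binary_crossentropy',
            # Metric objects with explicit names keep the history keys stable
            metrics=[
                'accuracy',
                tf.keras.metrics.Precision(name='precision'),
                tf.keras.metrics.Recall(name='recall')
            ]
        )

        print(f"✅ Model compiled successfully within strategy scope")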

    print(f"\n📊 Improved Model Architecture Summary:")
    improved_model.summary()

    # Count parameters (total vs. trainable)
    improved_total_params = improved_model.count_params()
    trainable_params = sum(int(np.prod(list(w.shape))) for w in improved_model.trainable_weights)
    print(f"\nModel Statistics:")
    print(f"   Total parameters: {improved_total_params:,}")
    print(f"   Trainable parameters: {trainable_params:,}")
    print(f"   Model size estimate: ~{improved_total_params * 4 / (1024*1024):.1f} MB")

    print(f"\n✅ Improved model ready for dual GPU training!")
    
except Exception as e:
    print(f"❌ Model creation/compilation failed: {str(e)}")
    print(f"🔄 Will attempt to continue with basic error handling...")
    improved_model = None
    improved_total_params = 0

# Validation checkpoint
print(f"\n🔍 Pre-training validation:")
print(f"   Strategy: {'✅' if strategy else '❌'} {type(strategy).__name__}")
print(f"   Model: {'✅' if 'improved_model' in locals() and improved_model else '❌'}")
print(f"   GPUs: {'✅' if gpus else '⚠️'} ({len(gpus)} detected)")
print(f"   Class weights: {'✅' if class_weight_dict else '⚠️'}")

if not gpus:
    print(f"\n⚠️ IMPORTANT: No GPUs detected!")
    print(f"   Training will be much slower on CPU")
    print(f"   Consider enabling GPU runtime in Kaggle settings")

print(f"\n🚀 GPU configuration complete - ready for enhanced training!")
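
The `'balanced'` mode of `compute_class_weight` used above follows a simple formula: each class weight is `n_samples / (n_classes * count_of_that_class)`. The sketch below reproduces it in NumPy on toy labels so the resulting weights are easy to verify by hand. The function name and the toy data are illustrative.

```python
import numpy as np

def balanced_class_weights(y):
    """weight_c = n_samples / (n_classes * count_c), keyed by class label."""
    classes, counts = np.unique(y, return_counts=True)
    weights = len(y) / (len(classes) * counts)
    return {int(c): float(w) for c, w in zip(classes, weights)}

# Toy labels: 6 legitimate (0) and 2 phishing (1).
y_toy = np.array([0, 0, 0, 0, 0, 0, 1, 1])
toy_weights = balanced_class_weights(y_toy)
print(toy_weights)
```

The minority class receives the larger weight, so each of its samples contributes more to the loss during training.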

In [None]:
# Enhanced Training Configuration with Comprehensive Monitoring and Validation
import psutil
import gc

def get_memory_usage():
    """Get current memory usage statistics"""
    process = psutil.Process()
    memory_info = process.memory_info()
    return {
        'rss_mb': memory_info.rss / (1024 * 1024),
        'vms_mb': memory_info.vms / (1024 * 1024),
        'available_mb': psutil.virtual_memory().available / (1024 * 1024),
        'percent_used': psutil.virtual_memory().percent
    }

def log_gpu_memory():
    """Log GPU memory usage if available"""
    try:
        gpus = tf.config.list_physical_devices('GPU')
        if gpus:
            # Get GPU memory info
            gpu_details = tf.config.experimental.get_memory_info('GPU:0')
            return {
                'current_mb': gpu_details['current'] / (1024 * 1024),
                'peak_mb': gpu_details['peak'] / (1024 * 1024)
            }
    except Exception:
        # Memory info may be unavailable (e.g., CPU-only runtime); fall through to zeros
        pass
    return {'current_mb': 0, 'peak_mb': 0}

# Pre-training validation and setup
print("🔍 Pre-training validation and setup...")

# Validate all required variables
required_vars = ['X_train', 'X_test', 'y_train', 'y_test', 'improved_model', 'strategy']
# Treat variables that exist but are None (e.g., a failed model build) as missing
missing_vars = [var for var in required_vars if globals().get(var) is None]

if missing_vars:
    print(f"❌ Missing required variables: {missing_vars}")
    print(f"   Cannot proceed with training!")
    print(f"   Please run previous cells first.")
else:
    print(f"✅ All required variables available")

# Memory monitoring
initial_memory = get_memory_usage()
gpu_memory = log_gpu_memory()

print(f"\n💾 Initial Memory Status:")
print(f"   RAM Usage: {initial_memory['rss_mb']:.1f} MB ({initial_memory['percent_used']:.1f}%)")
print(f"   Available RAM: {initial_memory['available_mb']:.1f} MB")
if gpu_memory['current_mb'] > 0:
    print(f"   GPU Memory: {gpu_memory['current_mb']:.1f} MB")
else:
    print(f"   GPU Memory: Not available or not accessible")

# Data validation
print(f"\n📊 Training Data Validation:")
print(f"   X_train shape: {X_train.shape if 'X_train' in globals() else 'Not available'}")
print(f"   y_train shape: {y_train.shape if 'y_train' in globals() else 'Not available'}")
print(f"   X_test shape: {X_test.shape if 'X_test' in globals() else 'Not available'}")
print(f"   y_test shape: {y_test.shape if 'y_test' in globals() else 'Not available'}")

if 'X_train' in globals() and 'y_train' in globals():
    # Check for data consistency
    print(f"   Sample consistency: {'✅' if len(X_train) == len(y_train) else '❌'}")
    print(f"   Data types: X_train={X_train.dtype}, y_train={y_train.dtype}")
    print(f"   Value ranges: X_train=[{X_train.min():.3f}, {X_train.max():.3f}]")
    print(f"   Target classes: {np.unique(y_train)}")
    
    # Memory estimation
    data_size_mb = (X_train.nbytes + y_train.nbytes + X_test.nbytes + y_test.nbytes) / (1024 * 1024)
    print(f"   Dataset size: {data_size_mb:.1f} MB")

# Calculate class weights with validation
print(f"\n⚖️ Class Weight Calculation:")
try:
    if 'y_train' in globals():
        unique_classes = np.unique(y_train)
        print(f"   Classes found: {unique_classes}")
        
        if len(unique_classes) == 2:
            class_weights = compute_class_weight(
                'balanced',
                classes=unique_classes,
                y=y_train
            )
            class_weight_dict = {int(cls): weight for cls, weight in zip(unique_classes, class_weights)}
            
            print(f"   ✅ Class weights: {class_weight_dict}")
            print(f"   Balance ratio: {class_weight_dict[0]/class_weight_dict[1]:.2f}")
        else:
            print(f"   ❌ Expected 2 classes, found {len(unique_classes)}")
            class_weight_dict = None
    else:
        print(f"   ⚠️ y_train not available")
        class_weight_dict = None

except Exception as e:
    print(f"   ❌ Class weight calculation failed: {str(e)}")
    class_weight_dict = None

# Enhanced callbacks for comprehensive monitoring
print(f"\n📋 Configuring enhanced callbacks...")

callbacks_list = []

# 1. Model checkpoint (save best model)
model_checkpoint = tf.keras.callbacks.ModelCheckpoint(
    'best_improved_model.h5',
    monitor='val_accuracy',
    mode='max',
    save_best_only=True,
    verbose=1,
    save_weights_only=False
)
callbacks_list.append(model_checkpoint)
print(f"   ✅ ModelCheckpoint: Save best model based on val_accuracy")

# 2. Learning rate reduction
reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
    monitor='val_loss',
    factor=0.5,
    patience=15,
    min_lr=1e-7,
    verbose=1,
    cooldown=5
)
callbacks_list.append(reduce_lr)
print(f"   ✅ ReduceLROnPlateau: Reduce LR when val_loss plateaus")

# 3. CSV Logger for detailed training logs
csv_logger = tf.keras.callbacks.CSVLogger('improved_training_log.csv', append=False)
callbacks_list.append(csv_logger)
print(f"   ✅ CSVLogger: Log training metrics to file")

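# 4. Learning rate scheduler
# NOTE: LearningRateScheduler sets the learning rate at the start of every epoch,
# so it overrides any reduction made by ReduceLROnPlateau in the previous epoch.
# Keep only one of the two callbacks if that interaction is not intended.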
def lr_schedule(epoch):
    """Custom learning rate schedule"""
    initial_lr = 0.0005
    if epoch < 30:
        return initial_lr
    elif epoch < 60:
        return initial_lr * 0.5
    elif epoch < 100:
        return initial_lr * 0.25
    else:
        return initial_lr * 0.1

lr_scheduler = tf.keras.callbacks.LearningRateScheduler(lr_schedule, verbose=0)
callbacks_list.append(lr_scheduler)
print(f"   ✅ LearningRateScheduler: Custom LR schedule over 150 epochs")

# 5. Custom callback for memory monitoring
class MemoryMonitorCallback(tf.keras.callbacks.Callback):
    def __init__(self):
        self.epoch_memory = []
        
    def on_epoch_end(self, epoch, logs=None):
        if epoch % 10 == 0:  # Log every 10 epochs
            memory = get_memory_usage()
            gpu_mem = log_gpu_memory()
            self.epoch_memory.append({
                'epoch': epoch,
                'memory_mb': memory['rss_mb'],
                'gpu_memory_mb': gpu_mem['current_mb']
            })
            
            if epoch % 25 == 0:  # Print every 25 epochs
                print(f"   Epoch {epoch}: RAM {memory['rss_mb']:.1f}MB, GPU {gpu_mem['current_mb']:.1f}MB")

memory_monitor = MemoryMonitorCallback()
callbacks_list.append(memory_monitor)
print(f"   ✅ MemoryMonitor: Track memory usage during training")

# Training configuration validation
batch_size = 16
epochs = 150
validation_split = 0.25

print(f"\nFinal Training Configuration:")
print(f"   Model: {'✅ Ready' if 'improved_model' in globals() else '❌ Not ready'}")
print(f"   Strategy: {'✅' if 'strategy' in globals() else '❌'} {type(strategy).__name__ if 'strategy' in globals() else 'Unknown'}")
print(f"   GPUs: {strategy.num_replicas_in_sync if 'strategy' in globals() else 'Unknown'}")
# With a tf.distribute strategy and NumPy inputs, fit() treats batch_size as the
# global batch and splits it across replicas.
num_replicas = strategy.num_replicas_in_sync if 'strategy' in globals() else 1
print(f"   Batch size: {batch_size} global (about {batch_size // num_replicas} per replica)")
print(f"   Epochs: {epochs} (FULL TRAINING - NO EARLY STOPPING)")
print(f"   Validation split: {validation_split}")
print(f"   Class weights: {'✅ Applied' if class_weight_dict else '❌ Not applied'}")
print(f"   Callbacks: {len(callbacks_list)} configured")

# Memory requirement estimation
if 'improved_model' in globals():
    model_params = improved_model.count_params()
    estimated_memory_mb = (model_params * 4 * 3) / (1024 * 1024)  # Parameters * 4 bytes * 3 (weights, gradients, optimizer)
    print(f"   Estimated training memory: {estimated_memory_mb:.1f} MB")
    
    if initial_memory['available_mb'] < estimated_memory_mb * 1.5:
        print(f"   ⚠️ WARNING: May run out of memory during training!")
    else:
        print(f"   ✅ Sufficient memory available")

# Force garbage collection
gc.collect()

# Start training with comprehensive monitoring
if missing_vars:
    print(f"\n❌ Cannot start training due to missing variables: {missing_vars}")
else:
    print(f"\n🚀 Starting enhanced training with full monitoring...")
    print(f"📝 Training will run for ALL {epochs} epochs (no early stopping)")
    print(f"⚡ Using {strategy.num_replicas_in_sync} GPU(s) with MirroredStrategy")
    print(f"💾 Memory usage will be monitored every 10 epochs")
    print(f"📊 Best model will be saved automatically")
    print(f"📈 Training progress logged to 'improved_training_log.csv'")
    
    # Record training start time
    start_time = time.time()
    
    try:
        # Start training with all enhancements
        improved_history = improved_model.fit(
            X_train, y_train,
            batch_size=batch_size,
            epochs=epochs,
            validation_split=validation_split,
            callbacks=callbacks_list,
            class_weight=class_weight_dict,  # None disables class weighting
            verbose=1,
            shuffle=True
            # workers / use_multiprocessing dropped: they only apply to generator or
            # Sequence inputs, and Keras 3 no longer accepts them in fit()
        )
        
        training_time = time.time() - start_time
        
        print(f"\n🎉 TRAINING COMPLETED SUCCESSFULLY!")
        print(f"⏱️  Total training time: {training_time/3600:.2f} hours ({training_time/60:.1f} minutes)")
        print(f"⚡ Average time per epoch: {training_time/epochs:.1f} seconds")
        print(f"✅ All {epochs} epochs completed without early stopping")
        print(f"💾 Best model automatically saved to 'best_improved_model.h5'")
        
        # Final memory check
        final_memory = get_memory_usage()
        final_gpu_memory = log_gpu_memory()
        
        print(f"\n📊 Final Memory Status:")
        print(f"   RAM: {final_memory['rss_mb']:.1f} MB (change: {final_memory['rss_mb'] - initial_memory['rss_mb']:+.1f} MB)")
        if final_gpu_memory['current_mb'] > 0:
            print(f"   GPU: {final_gpu_memory['current_mb']:.1f} MB (peak: {final_gpu_memory['peak_mb']:.1f} MB)")
        
        # Save memory monitoring data
        if hasattr(memory_monitor, 'epoch_memory') and memory_monitor.epoch_memory:
            memory_df = pd.DataFrame(memory_monitor.epoch_memory)
            memory_df.to_csv('training_memory_log.csv', index=False)
            print(f"   💾 Memory usage log saved to 'training_memory_log.csv'")
        
    except Exception as e:
        training_time = time.time() - start_time
        print(f"\n❌ TRAINING FAILED!")
        print(f"   Error: {str(e)}")
        print(f"   Time before failure: {training_time/60:.1f} minutes")
        print(f"   Check logs and memory usage")
        
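        # Save partial results if available. fit() raised before returning, so any
        # `improved_history` found here comes from an earlier run; the per-epoch
        # metrics of this run are in 'improved_training_log.csv'.
        try:
            if 'improved_history' in locals():
                with open('partial_training_history.pkl', 'wb') as f:
                    pickle.dump(improved_history.history, f)
                print(f"   💾 Partial training history saved")
        except Exception as save_error:
            print(f"   Could not save partial history: {save_error}")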
            
        raise  # Re-raise the exception for debugging
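
For reference, the `'balanced'` option passed to `compute_class_weight` above weights each class by `n_samples / (n_classes * class_count)`. The sketch below reproduces that formula in plain Python on made-up labels. The helper name `balanced_class_weights` and the `toy_labels` list are illustrative assumptions, not part of the notebook's data.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Hypothetical helper: n_samples / (n_classes * per-class count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in sorted(counts.items())}

# Made-up labels: 6 legitimate (0) and 2 phishing (1)
toy_labels = [0, 0, 0, 0, 0, 0, 1, 1]
toy_weights = balanced_class_weights(toy_labels)
print(toy_weights)  # the minority class (1) receives the larger weight
```

The rarer class gets the larger weight, so each class contributes roughly equally to the loss.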

In [None]:
# Load the best model (automatically saved during 150 epochs training)
print("📦 Loading best improved model from 150 epochs...")
best_improved_model = tf.keras.models.load_model('best_improved_model.h5')

# Evaluate on test set
print("🧪 Evaluating best improved model on test set...")
improved_test_loss, improved_test_accuracy, improved_test_precision, improved_test_recall = best_improved_model.evaluate(
    X_test, y_test, verbose=0
)

# Calculate F1-score and other metrics (predict once, then threshold)
y_pred_improved_proba = best_improved_model.predict(X_test).ravel()
y_pred_improved = (y_pred_improved_proba > 0.5).astype(int)

improved_test_f1 = f1_score(y_test, y_pred_improved)
improved_roc_auc = roc_auc_score(y_test, y_pred_improved_proba)

print(f"\n🎯 FINAL IMPROVED MODEL RESULTS (Best from 150 epochs):")
print(f"{'='*60}")
print(f"Test Accuracy:  {improved_test_accuracy:.4f}")
print(f"Test Precision: {improved_test_precision:.4f}")
print(f"Test Recall:    {improved_test_recall:.4f}")
print(f"Test F1-Score:  {improved_test_f1:.4f}")
print(f"ROC AUC:        {improved_roc_auc:.4f}")
print(f"{'='*60}")

# Get model parameters
improved_total_params = best_improved_model.count_params()
print(f"📊 Model Parameters: {improved_total_params:,}")

# Find the best epoch from training history
best_epoch = np.argmax(improved_history.history['val_accuracy']) + 1
best_val_accuracy = max(improved_history.history['val_accuracy'])
print(f"🏆 Best epoch: {best_epoch}/150 (Validation Accuracy: {best_val_accuracy:.4f})")

# Plot comprehensive training history for all 150 epochs
fig, axes = plt.subplots(2, 3, figsize=(18, 10))

# Training & Validation Loss
axes[0, 0].plot(improved_history.history['loss'], label='Training Loss', color='blue')
axes[0, 0].plot(improved_history.history['val_loss'], label='Validation Loss', color='red')
axes[0, 0].axvline(x=best_epoch-1, color='green', linestyle='--', alpha=0.7, label=f'Best Epoch ({best_epoch})')
axes[0, 0].set_title('Model Loss Over 150 Epochs')
axes[0, 0].set_xlabel('Epoch')
axes[0, 0].set_ylabel('Loss')
axes[0, 0].legend()
axes[0, 0].grid(True, alpha=0.3)

# Training & Validation Accuracy
axes[0, 1].plot(improved_history.history['accuracy'], label='Training Accuracy', color='blue')
axes[0, 1].plot(improved_history.history['val_accuracy'], label='Validation Accuracy', color='red')
axes[0, 1].axvline(x=best_epoch-1, color='green', linestyle='--', alpha=0.7, label=f'Best Epoch ({best_epoch})')
axes[0, 1].set_title('Model Accuracy Over 150 Epochs')
axes[0, 1].set_xlabel('Epoch')
axes[0, 1].set_ylabel('Accuracy')
axes[0, 1].legend()
axes[0, 1].grid(True, alpha=0.3)

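# Learning Rate Schedule (logged as 'lr' by older Keras and 'learning_rate' by Keras 3)
lr_key = next((k for k in ('lr', 'learning_rate') if k in improved_history.history), None)
if lr_key is not None:
    axes[0, 2].plot(improved_history.history[lr_key], label='Learning Rate', color='green')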
    axes[0, 2].set_title('Learning Rate Schedule')
    axes[0, 2].set_xlabel('Epoch')
    axes[0, 2].set_ylabel('Learning Rate')
    axes[0, 2].set_yscale('log')
    axes[0, 2].legend()
    axes[0, 2].grid(True, alpha=0.3)
else:
    # If no learning rate history, plot validation F1 approximation
    val_f1_approx = []
    for i in range(len(improved_history.history['val_precision'])):
        p = improved_history.history['val_precision'][i]
        r = improved_history.history['val_recall'][i]
        f1 = 2 * (p * r) / (p + r) if (p + r) > 0 else 0
        val_f1_approx.append(f1)
    
    axes[0, 2].plot(val_f1_approx, label='Validation F1-Score', color='purple')
    axes[0, 2].set_title('Validation F1-Score Over Time')
    axes[0, 2].set_xlabel('Epoch')
    axes[0, 2].set_ylabel('F1-Score')
    axes[0, 2].legend()
    axes[0, 2].grid(True, alpha=0.3)

# Precision & Recall
axes[1, 0].plot(improved_history.history['precision'], label='Training Precision', color='blue')
axes[1, 0].plot(improved_history.history['val_precision'], label='Validation Precision', color='red')
axes[1, 0].plot(improved_history.history['recall'], label='Training Recall', color='blue', linestyle='--')
axes[1, 0].plot(improved_history.history['val_recall'], label='Validation Recall', color='red', linestyle='--')
axes[1, 0].axvline(x=best_epoch-1, color='green', linestyle='--', alpha=0.7, label=f'Best Epoch ({best_epoch})')
axes[1, 0].set_title('Precision & Recall Over 150 Epochs')
axes[1, 0].set_xlabel('Epoch')
axes[1, 0].set_ylabel('Score')
axes[1, 0].legend()
axes[1, 0].grid(True, alpha=0.3)

# Confusion Matrix
cm_improved = confusion_matrix(y_test, y_pred_improved)
sns.heatmap(cm_improved, annot=True, fmt='d', cmap='Blues', 
            xticklabels=['Legitimate', 'Phishing'],
            yticklabels=['Legitimate', 'Phishing'],
            ax=axes[1, 1])
axes[1, 1].set_title(f'Best Model Confusion Matrix\n(Epoch {best_epoch})')

# ROC Curve
fpr_improved, tpr_improved, _ = roc_curve(y_test, y_pred_improved_proba)
axes[1, 2].plot(fpr_improved, tpr_improved, color='blue', lw=2, 
                label=f'Best Model (AUC = {improved_roc_auc:.4f})')
axes[1, 2].plot([0, 1], [0, 1], color='red', lw=1, linestyle='--', alpha=0.5)
axes[1, 2].set_xlim([0.0, 1.0])
axes[1, 2].set_ylim([0.0, 1.05])
axes[1, 2].set_xlabel('False Positive Rate')
axes[1, 2].set_ylabel('True Positive Rate')
axes[1, 2].set_title(f'ROC Curve - Best Model\n(Epoch {best_epoch})')
axes[1, 2].legend(loc="lower right")
axes[1, 2].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Save the final best model for deployment
best_improved_model.save('phishing_lstm_model_final.h5')
print("✅ Final optimized model saved as 'phishing_lstm_model_final.h5'")

# Also save the scaler for deployment
joblib.dump(scaler, 'feature_scaler_final.pkl')
print("✅ Feature scaler saved as 'feature_scaler_final.pkl'")

# Create comprehensive final report
final_report = {
    'training_info': {
        'total_epochs_completed': 150,
        'best_epoch': int(best_epoch),
        'best_validation_accuracy': float(best_val_accuracy),
        'early_stopping_used': False,
        'training_time_minutes': float(training_time/60)
    },
    'model_info': {
        'architecture': 'Bidirectional LSTM',
        'parameters': int(improved_total_params),
        'gpus_used': strategy.num_replicas_in_sync,
        'input_shape': list(input_shape)
    },
    'performance_metrics': {
        'test_accuracy': float(improved_test_accuracy),
        'test_precision': float(improved_test_precision),
        'test_recall': float(improved_test_recall),
        'test_f1_score': float(improved_test_f1),
        'roc_auc': float(improved_roc_auc)
    },
    'training_config': {
        'learning_rate_initial': 0.0005,
        'batch_size': 16,
        'validation_split': 0.25,
        'class_weights': class_weight_dict
    },
    'dataset_info': {
        'total_samples': len(df_clean),
        'training_samples': len(X_train),
        'testing_samples': len(X_test),
        'num_features': X_train.shape[2]
    }
}

# Save final report
with open('final_model_report.json', 'w') as f:
    json.dump(final_report, f, indent=2)
print("✅ Final model report saved")

# Save feature column names for deployment reference
feature_info = {
    'feature_columns': feature_columns,
    'total_features': len(feature_columns),
    'feature_groups': {
        'ssl_features': ['ssl_valid', 'ssl_invalid'],
        'content_features': ['forms', 'password_fields', 'iframes', 'scripts', 'suspicious_keywords'],
        'network_features': ['redirects', 'external_requests', 'page_load_time'],
        'error_features': ['has_errors', 'success'],
        'count_features': [col for col in feature_columns if col.startswith('count_')]
    }
}

with open('feature_info.json', 'w') as f:
    json.dump(feature_info, f, indent=2)
print("✅ Feature information saved")

print(f"\n🎉 DUAL GPU TRAINING COMPLETE - ALL 150 EPOCHS!")
print(f"🏆 Best model selected from epoch {best_epoch} out of 150")
print(f"🚀 Model ready for Chrome extension deployment!")
print(f"⚡ Trained on {strategy.num_replicas_in_sync} GPU(s) for maximum performance!")
print(f"📈 Training improvement: Model trained for full 150 epochs, best automatically selected")

print(f"\n📁 Files ready for download:")
print(f"   - phishing_lstm_model_final.h5 (optimized model)")
print(f"   - feature_scaler_final.pkl (feature scaler)")
print(f"   - final_model_report.json (performance metrics)")
print(f"   - feature_info.json (feature configuration)")
print(f"   - improved_training_log.csv (training logs)")
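
The best-epoch lookup and the per-epoch F1 approximation used for the plots above can be checked by hand on a small history dictionary. The sketch below is a plain-Python version of both steps. The values in `toy_history` are invented for illustration and the helper name `f1_from_precision_recall` is an assumption.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall, 0.0 when both are zero."""
    total = precision + recall
    return 2 * precision * recall / total if total > 0 else 0.0

# Invented metrics for three epochs (same keys as a Keras History.history dict)
toy_history = {
    'val_accuracy': [0.80, 0.91, 0.88],
    'val_precision': [0.75, 0.90, 0.85],
    'val_recall': [0.60, 0.90, 0.95],
}

# 1-based index of the epoch with the highest validation accuracy
accuracies = toy_history['val_accuracy']
best_epoch_toy = max(range(len(accuracies)), key=accuracies.__getitem__) + 1

# F1 per epoch, derived from the logged precision and recall
val_f1_toy = [
    f1_from_precision_recall(p, r)
    for p, r in zip(toy_history['val_precision'], toy_history['val_recall'])
]
print(best_epoch_toy, [round(f, 3) for f in val_f1_toy])
```

Note that F1 computed from epoch-level precision and recall is only an approximation of the plotted quantity when those metrics are themselves batch-averaged.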

## 11. Deployment-Ready Prediction Function

Create a complete prediction function for Chrome extension integration.
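
The prediction function in the next cell turns the model's sigmoid output into a label and a confidence band. A minimal standalone sketch of that mapping follows, assuming the same 0.5 decision threshold and the 0.6 and 0.8 band edges. The helper name `classify_probability` is hypothetical and exists only for this illustration.

```python
def classify_probability(probability):
    """Map a phishing probability to (prediction, confidence band)."""
    # Clamp to [0, 1] in case of numerical noise
    probability = max(0.0, min(1.0, probability))

    # Confidence is the probability of whichever class was chosen
    if probability > 0.5:
        prediction, score = 'phishing', probability
    else:
        prediction, score = 'legitimate', 1 - probability

    if score < 0.6:
        confidence = 'low'
    elif score < 0.8:
        confidence = 'medium'
    else:
        confidence = 'high'
    return prediction, confidence

for p in (0.95, 0.55, 0.3):
    print(p, classify_probability(p))
```

A probability of exactly 0.5 falls on the 'legitimate' side with low confidence, which callers may want to treat as undecided.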

In [None]:
# Create deployment-ready prediction function with robust error handling
def predict_phishing_deployment(url_features, model_path='phishing_lstm_model_final.h5', scaler_path='feature_scaler_final.pkl'):
    """
    Complete deployment-ready function for phishing prediction
    
    Args:
        url_features: List or array of 25 behavioral features (excluding 'url' and 'label' columns)
        model_path: Path to the trained model
        scaler_path: Path to the fitted scaler
    
    Returns:
        dict: {
            'probability': float (0-1),
            'prediction': str ('legitimate' or 'phishing'),
            'confidence': str ('low', 'medium', 'high'),
            'model_version': str,
            'feature_count': int
        }
    """
    import tensorflow as tf
    import joblib
    import numpy as np
    import os
    
    try:
        # Validate file existence
        if not os.path.exists(model_path):
            return {
                'error': f'Model file not found: {model_path}',
                'probability': 0.5,
                'prediction': 'unknown',
                'confidence': 'low'
            }
            
        if not os.path.exists(scaler_path):
            return {
                'error': f'Scaler file not found: {scaler_path}',
                'probability': 0.5,
                'prediction': 'unknown',
                'confidence': 'low'
            }
        
        # Load model and scaler with error handling
        try:
            model = tf.keras.models.load_model(model_path)
        except Exception as e:
            return {
                'error': f'Failed to load model: {str(e)}',
                'probability': 0.5,
                'prediction': 'unknown',
                'confidence': 'low'
            }
            
        try:
            scaler = joblib.load(scaler_path)
        except Exception as e:
            return {
                'error': f'Failed to load scaler: {str(e)}',
                'probability': 0.5,
                'prediction': 'unknown',
                'confidence': 'low'
            }
        
        # Validate input features - expecting 25 features
        expected_features = 25  # All dataset columns except 'url' and 'label'
        if not isinstance(url_features, (list, np.ndarray)):
            return {
                'error': f'url_features must be list or numpy array, got {type(url_features)}',
                'probability': 0.5,
                'prediction': 'unknown',
                'confidence': 'low'
            }
            
        if len(url_features) != expected_features:
            return {
                'error': f'Expected {expected_features} features, got {len(url_features)}',
                'probability': 0.5,
                'prediction': 'unknown',
                'confidence': 'low',
                'expected_features': expected_features,
                'received_features': len(url_features)
            }
        
        # Validate feature values
        features_array = np.array(url_features, dtype=np.float32)
        if np.any(np.isnan(features_array)):
            return {
                'error': 'Input features contain NaN values',
                'probability': 0.5,
                'prediction': 'unknown',
                'confidence': 'low'
            }
            
        if np.any(np.isinf(features_array)):
            return {
                'error': 'Input features contain infinite values',
                'probability': 0.5,
                'prediction': 'unknown',
                'confidence': 'low'
            }
        
        # Prepare features
        features = features_array.reshape(1, -1)
        
        # Scale features
        try:
            features_scaled = scaler.transform(features)
        except Exception as e:
            return {
                'error': f'Feature scaling failed: {str(e)}',
                'probability': 0.5,
                'prediction': 'unknown',
                'confidence': 'low'
            }
        
        # Reshape for LSTM (samples, timesteps, features)
        features_lstm = features_scaled.reshape(1, 1, -1)
        
        # Validate LSTM input shape
        expected_input_shape = model.input_shape
        actual_input_shape = features_lstm.shape
        
        if expected_input_shape[1:] != actual_input_shape[1:]:
            return {
                'error': f'Input shape mismatch. Expected: {expected_input_shape}, Got: {actual_input_shape}',
                'probability': 0.5,
                'prediction': 'unknown',
                'confidence': 'low'
            }
        
        # Make prediction
        try:
            prediction_raw = model.predict(features_lstm, verbose=0)
            probability = float(prediction_raw[0][0])
        except Exception as e:
            return {
                'error': f'Model prediction failed: {str(e)}',
                'probability': 0.5,
                'prediction': 'unknown',
                'confidence': 'low'
            }
        
        # Validate probability
        if not (0 <= probability <= 1):
            probability = max(0, min(1, probability))  # Clamp to [0,1]
        
        # Determine prediction and confidence
        if probability > 0.5:
            prediction = 'phishing'
            confidence_score = probability
        else:
            prediction = 'legitimate'
            confidence_score = 1 - probability
            
        # Enhanced confidence levels
        if confidence_score < 0.6:  # 0.5-0.6 range
            confidence = 'low'
        elif confidence_score < 0.8:  # 0.6-0.8 range
            confidence = 'medium'
        else:  # 0.8-1.0 range
            confidence = 'high'
        
        return {
            'probability': probability,
            'prediction': prediction,
            'confidence': confidence,
            'model_version': 'LSTM_v2.0',
            'feature_count': len(url_features),
            'confidence_score': confidence_score
        }
        
    except Exception as e:
        return {
            'error': f'Unexpected error in prediction: {str(e)}',
            'probability': 0.5,
            'prediction': 'unknown',
            'confidence': 'low'
        }

# Test the deployment function with comprehensive validation
print("🧪 Testing deployment function with comprehensive validation...")

# Test with correct number of features (25)
sample_features_25 = [
    1, 5, 1, 0, 2, 1, 0, 3, 8, 1, 5, 2500, 0,  # Basic behavioral features (13)
    0.0, 0.0, 1.0, 2.0, 5.0, 1.0, 0.0, 3.0, 8.0, 1.0, 5.0, 2500.0  # Count features (12)
]  # Total: 25 features

print(f"✅ Testing with {len(sample_features_25)} features (expected: 25)")
result = predict_phishing_deployment(sample_features_25)
print(f"Sample prediction result: {result}")

# Test with incorrect number of features
print(f"\n🧪 Testing error handling with wrong feature count...")
sample_features_wrong = [1, 2, 3, 4, 5]  # Only 5 features
result_error = predict_phishing_deployment(sample_features_wrong)
print(f"Error handling result: {result_error}")

# Feature mapping documentation
feature_mapping = {
    'basic_features': [
        'success', 'num_events', 'ssl_valid', 'ssl_invalid', 'redirects', 
        'forms', 'password_fields', 'iframes', 'scripts', 'suspicious_keywords',
        'external_requests', 'page_load_time', 'has_errors'
    ],
    'count_features': [
        'count_ssl_invalid', 'count_webdriver_error', 'count_ssl_valid',
        'count_redirects', 'count_external_requests', 'count_forms_detected',
        'count_password_fields', 'count_iframes_detected', 'count_scripts_detected',
        'count_suspicious_keywords', 'count_page_load_time'
    ]
}

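listed_total = len(feature_mapping['basic_features']) + len(feature_mapping['count_features'])
print(f"\n📋 Feature Mapping for Deployment ({listed_total} names listed, model expects 25):")
print(f"Basic features ({len(feature_mapping['basic_features'])}): {feature_mapping['basic_features']}")
print(f"Count features ({len(feature_mapping['count_features'])}): {feature_mapping['count_features']}")
if listed_total != 25:
    # The hand-written mapping must match the training columns one-to-one
    print(f"⚠️ feature_mapping lists {listed_total} names but the model expects 25; use feature_columns as the authoritative order")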

# Save the enhanced deployment function
deployment_code = f'''"""
Phishing Detection Model - Enhanced Deployment Function
Chrome Extension Integration Ready - Version 2.0
Features: 25 behavioral features (excluding 'url' and 'label')
"""

import tensorflow as tf
import joblib
import numpy as np
import os

def predict_phishing(url_features, model_path='phishing_lstm_model_final.h5', scaler_path='feature_scaler_final.pkl'):
    """
    Enhanced phishing prediction with comprehensive error handling
    
    Args:
        url_features: List of exactly 25 behavioral features in this order:
                     {feature_mapping['basic_features'] + feature_mapping['count_features']}
        model_path: Path to trained LSTM model (.h5 file)
        scaler_path: Path to feature scaler (.pkl file)
    
    Returns:
        dict: {{
            'probability': float (0-1, phishing probability),
            'prediction': str ('legitimate' or 'phishing'),
            'confidence': str ('low', 'medium', 'high'),
            'model_version': str,
            'error': str (if any error occurred)
        }}
    """
    try:
        # File existence validation
        if not os.path.exists(model_path):
            return {{'error': f'Model file not found: {{model_path}}'}}
        if not os.path.exists(scaler_path):
            return {{'error': f'Scaler file not found: {{scaler_path}}'}}
        
        # Load model and scaler
        model = tf.keras.models.load_model(model_path)
        scaler = joblib.load(scaler_path)
        
        # Validate input (exactly 25 features expected)
        if len(url_features) != 25:
            return {{
                'error': f'Expected exactly 25 features, got {{len(url_features)}}',
                'expected_features': 25,
                'received_features': len(url_features)
            }}
        
        # Prepare and validate features
        features = np.array(url_features, dtype=np.float32).reshape(1, -1)
        if np.any(np.isnan(features)) or np.any(np.isinf(features)):
            return {{'error': 'Input contains invalid values (NaN or Inf)'}}
        
        # Scale and reshape for LSTM
        features_scaled = scaler.transform(features)
        features_lstm = features_scaled.reshape(1, 1, -1)
        
        # Predict
        probability = float(model.predict(features_lstm, verbose=0)[0][0])
        probability = max(0, min(1, probability))  # Ensure valid range
        
        # Classify and assess confidence
        prediction = 'phishing' if probability > 0.5 else 'legitimate'
        confidence_score = probability if probability > 0.5 else (1 - probability)
        
        if confidence_score < 0.6:
            confidence = 'low'
        elif confidence_score < 0.8:
            confidence = 'medium'
        else:
            confidence = 'high'
        
        return {{
            'probability': probability,
            'prediction': prediction,
            'confidence': confidence,
            'model_version': 'Enhanced_LSTM_v2.0',
            'feature_count': 25
        }}
        
    except Exception as e:
        return {{'error': f'Prediction failed: {{str(e)}}'}}

# Feature order for reference:
FEATURE_ORDER = {feature_mapping['basic_features'] + feature_mapping['count_features']}

# Example usage:
# features = [1, 5, 1, 0, 2, 1, 0, 3, 8, 1, 5, 2500, 0, 0.0, 0.0, 1.0, 2.0, 5.0, 1.0, 0.0, 3.0, 8.0, 1.0, 5.0, 2500.0]
# result = predict_phishing(features)
# print(f"Prediction: {{result['prediction']}} ({{result['confidence']}} confidence)")
'''

# Write deployment function to file
with open('phishing_predictor_v2.py', 'w', encoding='utf-8') as f:
    f.write(deployment_code)

print(f"\n✅ Enhanced deployment function saved as 'phishing_predictor_v2.py'")

# Create feature configuration file
feature_config = {
    'version': '2.0',
    'total_features': 25,
    'feature_order': feature_mapping['basic_features'] + feature_mapping['count_features'],
    'feature_groups': feature_mapping,
    'model_requirements': {
        'tensorflow_version': '>=2.8.0',
        'input_shape': [1, 25],
        'output_shape': [1],
        'activation': 'sigmoid'
    },
    'deployment_notes': [
        'Features must be provided in exact order specified',
        'All 25 features are required for prediction',
        'Feature scaling is applied automatically',
        'Model expects LSTM input format: (samples, timesteps, features)'
    ]
}

import json
with open('feature_config_v2.json', 'w', encoding='utf-8') as f:
    json.dump(feature_config, f, indent=2)

print(f"✅ Feature configuration saved as 'feature_config_v2.json'")

# Create integration guide
integration_guide = """
# Chrome Extension Integration Guide

## Model Files Required:
- phishing_lstm_model_final.h5 (trained model)
- feature_scaler_final.pkl (feature scaler)
- phishing_predictor_v2.py (prediction function)

## Feature Collection:
Your Chrome extension should collect these 25 features in exact order:

### Basic Features (13):
1. success (0/1)
2. num_events (integer)
3. ssl_valid (0/1)
4. ssl_invalid (0/1)
5. redirects (integer)
6. forms (integer)
7. password_fields (integer)
8. iframes (integer)
9. scripts (integer)
10. suspicious_keywords (integer)
11. external_requests (integer)
12. page_load_time (milliseconds)
13. has_errors (0/1)

### Count Features (12 expected; only 11 are named below, see feature_info.json for the full ordered list):
14. count_ssl_invalid (float)
15. count_webdriver_error (float)
16. count_ssl_valid (float)
17. count_redirects (float)
18. count_external_requests (float)
19. count_forms_detected (float)
20. count_password_fields (float)
21. count_iframes_detected (float)
22. count_scripts_detected (float)
23. count_suspicious_keywords (float)
24. count_page_load_time (float)

## Usage Example:
```python
from phishing_predictor_v2 import predict_phishing

# Collect features from URL
features = collect_url_features(url)  # Your implementation
result = predict_phishing(features)

if 'error' in result:
    print(f"Error: {result['error']}")
else:
    print(f"Prediction: {result['prediction']}")
    print(f"Confidence: {result['confidence']}")
    print(f"Probability: {result['probability']:.3f}")
```
"""

with open('chrome_extension_integration.md', 'w', encoding='utf-8') as f:
    f.write(integration_guide)

print(f"✅ Integration guide saved as 'chrome_extension_integration.md'")

print(f"\n🎯 Deployment Function Status:")
print(f"   ✅ Enhanced error handling implemented")
print(f"   ✅ Feature count validation (25 features)")
print(f"   ✅ Input validation and sanitization")
print(f"   ✅ Model compatibility verification")
print(f"   ✅ Comprehensive documentation created")

print(f"\n📁 Files ready for Chrome extension:")
print(f"   - phishing_predictor_v2.py (enhanced prediction function)")
print(f"   - feature_config_v2.json (feature specifications)")
print(f"   - chrome_extension_integration.md (integration guide)")

print(f"\n🚀 Deployment function ready for production use!")
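
As written, `predict_phishing` reloads the model and scaler from disk on every call, which is likely to dominate latency when it is invoked once per URL. One common remedy is to memoize the loaders. The sketch below shows the pattern with `functools.lru_cache` and a stand-in loader. `load_artifact` and `fake_disk_reads` are hypothetical names, and in the notebook the loader body would call `tf.keras.models.load_model` or `joblib.load` instead.

```python
from functools import lru_cache

fake_disk_reads = []  # records each simulated load, for illustration only

@lru_cache(maxsize=None)
def load_artifact(path):
    """Stand-in loader (assumption): replace the body with the real load call."""
    fake_disk_reads.append(path)
    return {'path': path}

# The second call with the same path is served from the cache
first = load_artifact('phishing_lstm_model_final.h5')
second = load_artifact('phishing_lstm_model_final.h5')
print(first is second, len(fake_disk_reads))
```

Because the cache is keyed on the path, replacing a model file on disk requires calling `load_artifact.cache_clear()` before the new file is picked up.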