In [1]:
# Use "Base" Kernel for this notebook

### Advanced CNN Classification with Deep Layers and Alternative Activations

This notebook explores an enhanced CNN architecture for text classification. Compared to the previous experiment, this model includes:
1. **More Dense Layers**: Increased depth to capture more complex patterns.
2. **Alternative Activation Functions**: Using `LeakyReLU` instead of standard `ReLU` in the dense layers to mitigate the dying-ReLU problem (neurons that get stuck at zero output and stop receiving gradient).
3. **Higher Dropout**: To prevent overfitting in deeper layers.
4. **K-Fold and SMOTE**: 5-fold cross-validation for evaluation, with SMOTE oversampling applied to each training fold to balance the classes.

In [2]:
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Embedding, Conv1D, GlobalMaxPooling1D, Dropout, LeakyReLU
from sklearn.model_selection import KFold
from sklearn.metrics import classification_report, accuracy_score
from imblearn.over_sampling import SMOTE
from IPython.display import display

# Load Data
file_path = '(A) Data/(A) PreProcessed_News Content Title_3000 Data.csv'
df = pd.read_csv(file_path, usecols=['Detokenized', 'Labelling'], engine='python')
df = df.dropna()

# Map labels to 0, 1, 2
label_mapping = {-1: 0, 0: 1, 1: 2}
df['label_encoded'] = df['Labelling'].map(label_mapping)

X = df['Detokenized'].values
y = df['label_encoded'].values

print(f"Data Shape: {df.shape}")
print("Class Distribution:\n", df['label_encoded'].value_counts())

Data Shape: (2791, 3)
Class Distribution:
 label_encoded
1    1331
0    1026
2     434
Name: count, dtype: int64


### 1. Sequence Preprocessing

In [3]:
# Hyperparameters
vocab_size = 5000
embedding_dim = 100
max_length = 100
oov_tok = "<OOV>"

# Initialize Tokenizer
tokenizer = Tokenizer(num_words=vocab_size, oov_token=oov_tok)
tokenizer.fit_on_texts(X)

# Convert to Sequences and Pad
sequences = tokenizer.texts_to_sequences(X)
padded_X = pad_sequences(sequences, maxlen=max_length, padding='post', truncating='post')

print(f"Found {len(tokenizer.word_index)} unique tokens.")

Found 6142 unique tokens.
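
Because 6142 unique tokens were found but `vocab_size` is 5000, the rarest tokens are replaced by `<OOV>` when the texts are converted to sequences. The padding step then forces every sequence to exactly `max_length` entries. The sketch below imitates the `padding='post', truncating='post'` behaviour in plain Python so the effect is easy to see; `pad_post` and the toy token IDs are hypothetical and not part of the notebook.

```python
import numpy as np

def pad_post(seq, maxlen, value=0):
    """Keep the first `maxlen` IDs, then right-pad with `value` up to `maxlen`."""
    trimmed = list(seq)[:maxlen]                      # truncating='post' drops the tail
    return trimmed + [value] * (maxlen - len(trimmed))  # padding='post' appends zeros

# Made-up token-ID sequences: one shorter and one longer than maxlen
toy_sequences = [[4, 9, 2], [7, 1, 5, 3, 8, 6]]
padded = np.array([pad_post(s, maxlen=5) for s in toy_sequences])

print(padded.tolist())  # [[4, 9, 2, 0, 0], [7, 1, 5, 3, 8]]
```

With `max_length = 100`, any title longer than 100 tokens loses its tail, and shorter ones are filled with zeros at the end.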


### 2. Advanced Model Definition
We use a deeper architecture with three hidden Dense layers (128, 64, and 32 units), each followed by `LeakyReLU`. The convolutional layer keeps standard `ReLU`.

In [4]:
def create_advanced_model():
    model = Sequential([
        Embedding(input_dim=vocab_size, output_dim=embedding_dim, input_length=max_length),
        Conv1D(filters=128, kernel_size=5, activation='relu'),
        GlobalMaxPooling1D(),
        
        # Deeper Dense Network
        Dense(128),
        LeakyReLU(alpha=0.1),
        Dropout(0.4),
        
        Dense(64),
        LeakyReLU(alpha=0.1),
        Dropout(0.4),
        
        Dense(32),
        LeakyReLU(alpha=0.1),
        
        Dense(3, activation='softmax')
    ])
    
    optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
    model.compile(loss='sparse_categorical_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

model_summary = create_advanced_model()
model_summary.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 100, 100)          500000    
                                                                 
 conv1d (Conv1D)             (None, 96, 128)           64128     
                                                                 
 global_max_pooling1d (Globa  (None, 128)              0         
 lMaxPooling1D)                                                  
                                                                 
 dense (Dense)               (None, 128)               16512     
                                                                 
 leaky_re_lu (LeakyReLU)     (None, 128)               0         
                                                                 
 dropout (Dropout)           (None, 128)               0         
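
The shapes and parameter counts printed in the summary can be checked by hand from the hyperparameters. The arithmetic below is a small sketch of that check; it assumes the Keras defaults of `'valid'` padding and stride 1 for `Conv1D`, which the model definition does not override.

```python
vocab_size, embedding_dim, max_length = 5000, 100, 100
filters, kernel_size = 128, 5

# Embedding: one embedding_dim-sized vector per token ID
embedding_params = vocab_size * embedding_dim

# Conv1D: each filter spans kernel_size positions x embedding_dim channels, plus one bias
conv_params = kernel_size * embedding_dim * filters + filters

# 'valid' padding with stride 1 shortens the sequence by kernel_size - 1
conv_out_len = max_length - kernel_size + 1

# First Dense layer: 128 pooled features -> 128 units, plus biases
dense_params = filters * 128 + 128

print(embedding_params, conv_params, conv_out_len, dense_params)
# 500000 64128 96 16512
```

These values match the `embedding`, `conv1d`, and `dense` rows of the printed summary.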
                                                        

### 3. Experiment: Advanced CNN + SMOTE + K-Fold Validation

In [5]:
smote = SMOTE(random_state=42)
kfold = KFold(n_splits=5, shuffle=True, random_state=42)
advanced_results = []
fold_no = 1

for train, test in kfold.split(padded_X, y):
    print(f'Training fold {fold_no} (Advanced Model + SMOTE)...')
    
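    # Apply SMOTE to the training fold only; the test fold keeps its original class balance.
    # Note: SMOTE interpolates numerically between padded token-ID rows, not between meanings.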
    X_train_fold, y_train_fold = padded_X[train], y[train]
    X_train_res, y_train_res = smote.fit_resample(X_train_fold, y_train_fold)
    
    model = create_advanced_model()
    # Using slightly more epochs (20) for the deeper model to converge
    model.fit(X_train_res, y_train_res, epochs=20, batch_size=32, verbose=0)
    
    y_pred = np.argmax(model.predict(padded_X[test]), axis=1)
    report = classification_report(y[test], y_pred, output_dict=True, labels=[0, 1, 2])
    
    advanced_results.append({
        'Fold': f'Group {fold_no}',
        'Accuracy': accuracy_score(y[test], y_pred),
        'Prec Class 0': report['0']['precision'],
        'Prec Class 1': report['1']['precision'],
        'Prec Class 2': report['2']['precision'],
        'Recall Class 0': report['0']['recall'],
        'Recall Class 1': report['1']['recall'],
        'Recall Class 2': report['2']['recall'],
        'F1 Class 0': report['0']['f1-score'],
        'F1 Class 1': report['1']['f1-score'],
        'F1 Class 2': report['2']['f1-score']
    })
    fold_no += 1

results_df = pd.DataFrame(advanced_results).set_index('Fold')
summary_stats = pd.DataFrame({
    'Max': results_df.max(), 
    'Min': results_df.min(), 
    'Average': results_df.mean(), 
    'Stdev': results_df.std()
}).T
final_table = pd.concat([results_df, summary_stats])

print("\nAdvanced Experiment Results (Deep CNN + LeakyReLU + SMOTE + K-Fold):")
display(final_table.style.format("{:.4f}"))

Training fold 1 (Advanced Model + SMOTE)...
Training fold 2 (Advanced Model + SMOTE)...
Training fold 3 (Advanced Model + SMOTE)...
Training fold 4 (Advanced Model + SMOTE)...
Training fold 5 (Advanced Model + SMOTE)...

Advanced Experiment Results (Deep CNN + LeakyReLU + SMOTE + K-Fold):


| Fold | Accuracy | Prec Class 0 | Prec Class 1 | Prec Class 2 | Recall Class 0 | Recall Class 1 | Recall Class 2 | F1 Class 0 | F1 Class 1 | F1 Class 2 |
|---|---|---|---|---|---|---|---|---|---|---|
| Group 1 | 0.5921 | 0.6683 | 0.7202 | 0.3354 | 0.6814 | 0.5092 | 0.6463 | 0.6748 | 0.5966 | 0.4417 |
| Group 2 | 0.6344 | 0.6481 | 0.7174 | 0.4375 | 0.7330 | 0.6066 | 0.5158 | 0.6880 | 0.6574 | 0.4734 |
| Group 3 | 0.5932 | 0.6919 | 0.6942 | 0.3312 | 0.6493 | 0.5316 | 0.6538 | 0.6699 | 0.6021 | 0.4397 |
| Group 4 | 0.5896 | 0.6376 | 0.7219 | 0.3380 | 0.6919 | 0.5056 | 0.6000 | 0.6636 | 0.5947 | 0.4324 |
| Group 5 | 0.5932 | 0.7041 | 0.7053 | 0.3430 | 0.6603 | 0.5360 | 0.5960 | 0.6815 | 0.6091 | 0.4354 |
| Max | 0.6344 | 0.7041 | 0.7219 | 0.4375 | 0.7330 | 0.6066 | 0.6538 | 0.6880 | 0.6574 | 0.4734 |
| Min | 0.5896 | 0.6376 | 0.6942 | 0.3312 | 0.6493 | 0.5056 | 0.5158 | 0.6636 | 0.5947 | 0.4324 |
| Average | 0.6005 | 0.6700 | 0.7118 | 0.3570 | 0.6832 | 0.5378 | 0.6024 | 0.6756 | 0.6120 | 0.4445 |
| Stdev | 0.0190 | 0.0282 | 0.0118 | 0.0452 | 0.0325 | 0.0407 | 0.0551 | 0.0095 | 0.0260 | 0.0166 |
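
One thing to keep in mind when reading these numbers is what SMOTE does to this kind of input. It creates each synthetic sample by picking a point on the line segment between a minority-class row and one of its neighbours. Here the rows are padded token-ID sequences, so the interpolation happens between ID numbers rather than between word meanings. The sketch below reproduces that single interpolation step with NumPy; the two rows are made-up values and the code is an illustration of the idea, not the `imblearn` implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two made-up padded token-ID rows assumed to belong to the same class
a = np.array([12, 845, 3, 0, 0], dtype=float)
b = np.array([97, 20, 411, 6, 0], dtype=float)

# SMOTE-style step: a random point on the segment from a to b
lam = rng.uniform(0.0, 1.0)
synthetic = a + lam * (b - a)

print(synthetic)
```

Each entry of `synthetic` lies numerically between the two source IDs, so however it is cast afterwards it generally points to a vocabulary entry unrelated to either original word. Oversampling in the embedding space, or using class weights, are alternatives that may be worth testing.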


### 4. Summary and Interpretation

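Adding Dense layers and switching their activation to `LeakyReLU` is intended to let the model learn more complex relationships in the text. Across the five folds, the model reaches an average accuracy of 0.6005 (standard deviation 0.019). Class 2, the smallest class, has the weakest scores: average precision 0.357, recall 0.6024, and F1 0.4445.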

*   **Dense Layers (128, 64, 32)**: Progressively compress the 128 pooled CNN features before the 3-way softmax output.
*   **LeakyReLU**: Keeps a small gradient (slope 0.1) for negative pre-activations, so units cannot become permanently inactive as they can with standard ReLU.
*   **Dropout (0.4)**: Applied after the first two Dense blocks to reduce overfitting. It regularizes training but does not by itself guarantee good generalization to unseen data.
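
The difference between the two activations shows up in their gradients for negative inputs. The following NumPy sketch compares them using the same slope (`alpha=0.1`) as the model; the helper functions `relu_grad` and `leaky_relu_grad` are written here for illustration and are not Keras APIs.

```python
import numpy as np

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return (x > 0).astype(float)

def leaky_relu_grad(x, alpha=0.1):
    """Derivative of LeakyReLU: 1 for positive inputs, alpha otherwise."""
    return np.where(x > 0, 1.0, alpha)

x = np.array([-2.0, -0.5, 0.5, 2.0])

print(relu_grad(x).tolist())        # [0.0, 0.0, 1.0, 1.0]
print(leaky_relu_grad(x).tolist())  # [0.1, 0.1, 1.0, 1.0]
```

A unit whose pre-activation is negative for every training example gets no weight updates under ReLU, whereas under LeakyReLU it still receives a gradient scaled by 0.1.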

To judge whether these changes help, compare the `Average` row for `Accuracy` and `Recall Class 2` with the same figures from the previous experiment.