# Neural Network Model Testing

This notebook tests the neural network model for sentiment prediction on news headlines.

In [1]:
# Import necessary libraries
import sys
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Add project root to path for imports
sys.path.append(os.path.abspath('../..'))

# Import project modules
from src.models.predict_model import ModelPredictor
from src.config import PROCESSED_DATA_PATH, RAW_DATA_PATH, MODEL_DIR

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\PC\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\PC\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\PC\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


## 1. Initialize the Model Predictor

We'll create a ModelPredictor instance, which runs predictions with our trained neural network model.

In [2]:
# Initialize the predictor
predictor = ModelPredictor()

## 2. Test Single Headline Prediction

Let's test the model on a single headline first to check if everything is working.

In [3]:
# Test with a single positive headline
test_headline = "Company profits exceed expectations in Q1 2025"
result = predictor.predict_neural_network(test_headline)

# Display the result
if result:
    r = result[0]  # Get the first result
    print(f"\nHeadline: {r['headline']}")
    print(f"Predicted Sentiment: {r['sentiment']}")
    print(f"Confidence: {r['confidence']:.2f}")
    
    print("\nAll Probabilities:")
    for sentiment, prob in r['probabilities'].items():
        print(f"- {sentiment}: {prob:.2f}")
else:
    print("Prediction failed or no model found.")

Using most recent model: d:\Documents\CODE\HCMUT\Machine Learning Assignment\models\experiments\neural_network_20250303_135452.pkl
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 705ms/step

Headline: Company profits exceed expectations in Q1 2025
Predicted Sentiment: neutral
Confidence: 0.71

All Probabilities:
- negative: 0.09
- neutral: 0.71
- positive: 0.20
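
The output above suggests that the reported confidence is the largest class probability and that the predicted sentiment is the class holding it. A minimal sketch of that relationship follows, assuming this is how `predict_neural_network` derives those two fields. The helper `top_class` is hypothetical and not part of the project.

```python
def top_class(probabilities):
    """Return (label, probability) for the most probable class."""
    label = max(probabilities, key=probabilities.get)
    return label, probabilities[label]


# Probabilities copied from the output above
probs = {"negative": 0.09, "neutral": 0.71, "positive": 0.20}

sentiment, confidence = top_class(probs)
print(f"{sentiment} ({confidence:.2f})")  # prints "neutral (0.71)"
```

Under that assumption, the headline is labelled neutral because the positive class received only 0.20 of the probability mass.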


## 3. Test Multiple Headlines

Now let's test the model on multiple headlines with different expected sentiments.

In [4]:
# Test with multiple headlines
test_headlines = [
    "Stock market reaches all-time high as investor confidence grows",
    "Major company announces significant layoffs due to economic downturn",
    "Global trade continues at steady pace despite mild fluctuations",
    "Tech giant releases new product line with innovative features",
    "Retail sales decline for third consecutive quarter"
]

results = predictor.predict_neural_network(test_headlines)

# Display the results
if results:
    for r in results:
        print(f"Headline: {r['headline']}")
        print(f"Predicted Sentiment: {r['sentiment']} (confidence: {r['confidence']:.2f})")
        print()
else:
    print("Prediction failed or no model found.")

Using most recent model: d:\Documents\CODE\HCMUT\Machine Learning Assignment\models\experiments\neural_network_20250303_135452.pkl
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 554ms/step
Headline: Stock market reaches all-time high as investor confidence grows
Predicted Sentiment: neutral (confidence: 0.53)

Headline: Major company announces significant layoffs due to economic downturn
Predicted Sentiment: positive (confidence: 0.42)

Headline: Global trade continues at steady pace despite mild fluctuations
Predicted Sentiment: neutral (confidence: 0.56)

Headline: Tech giant releases new product line with innovative features
Predicted Sentiment: neutral (confidence: 0.79)

Headline: Retail sales decline for third consecutive quarter
Predicted Sentiment: neutral (confidence: 0.68)
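
Several of these predictions carry modest confidence, so it may help to separate out the ones the model is least sure about. The sketch below filters result dictionaries shaped like the ones printed above. The helper `flag_low_confidence` and the 0.5 threshold are illustrative assumptions, not part of the project.

```python
def flag_low_confidence(results, threshold=0.5):
    """Return the results whose confidence falls below the threshold."""
    return [r for r in results if r["confidence"] < threshold]


# Values copied from the output above
sample_results = [
    {"headline": "Stock market reaches all-time high as investor confidence grows",
     "sentiment": "neutral", "confidence": 0.53},
    {"headline": "Major company announces significant layoffs due to economic downturn",
     "sentiment": "positive", "confidence": 0.42},
    {"headline": "Global trade continues at steady pace despite mild fluctuations",
     "sentiment": "neutral", "confidence": 0.56},
    {"headline": "Tech giant releases new product line with innovative features",
     "sentiment": "neutral", "confidence": 0.79},
    {"headline": "Retail sales decline for third consecutive quarter",
     "sentiment": "neutral", "confidence": 0.68},
]

uncertain = flag_low_confidence(sample_results)
for r in uncertain:
    print(f"{r['confidence']:.2f}  {r['sentiment']:<8}  {r['headline']}")
```

With this threshold, only the layoffs headline is flagged, which is also the one prediction whose label looks clearly wrong.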



## 4. Test on Real Dataset

Let's load a sample of the test dataset and predict sentiments.

In [5]:
# Load test dataset
test_data_path = os.path.join(RAW_DATA_PATH, "test_data.csv")
test_df = pd.read_csv(test_data_path)

# Use only a sample for testing
sample_size = min(100, len(test_df))
test_sample = test_df.sample(sample_size, random_state=42)
print(f"Loaded test data with {len(test_sample)} headlines")

# Show a few examples
test_sample.head(3)

Loaded test data with 3 headlines


| Unnamed: 0 | Sentiment | News Headline |
| --- | --- | --- |
| 0 | neutral | According to Gran , the company has no plans t... |
| 1 | negative | The international electronic industry company ... |
| 2 | positive | According to the company 's updated strategy f... |


In [6]:
# Make predictions
headlines = test_sample['News Headline'].tolist()
results = predictor.predict_neural_network(headlines)

# Create a dataframe with predictions
if results:
    predicted_sentiments = [r['sentiment'] for r in results]
    confidence_scores = [round(r['confidence'], 2) for r in results]
    
    # Add predictions to the dataframe
    results_df = test_sample.copy()
    results_df = results_df.rename(columns={'Sentiment': 'Actual Sentiment'})
    results_df['Predicted Sentiment'] = predicted_sentiments
    results_df['Confidence'] = confidence_scores
    
    print("Predictions completed")
else:
    print("Prediction failed or no model found.")

# Show some results
results_df.head(3)

Using most recent model: d:\Documents\CODE\HCMUT\Machine Learning Assignment\models\experiments\neural_network_20250303_135452.pkl
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 530ms/step
Predictions completed


| Unnamed: 0 | Actual Sentiment | News Headline | Predicted Sentiment | Confidence |
| --- | --- | --- | --- | --- |
| 0 | neutral | According to Gran , the company has no plans t... | neutral | 0.69 |
| 1 | negative | The international electronic industry company ... | positive | 0.51 |
| 2 | positive | According to the company 's updated strategy f... | neutral | 0.61 |
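
Before computing metrics, it can be useful to cross-tabulate actual against predicted labels. The sketch below does this with the standard library only, using the three rows shown above. The variable names are illustrative, and the row/column layout (rows are actual labels, columns are predicted labels) is chosen to match what `sklearn.metrics.confusion_matrix` produces.

```python
from collections import Counter

# Labels copied from the three rows shown above
actual = ["neutral", "negative", "positive"]
predicted = ["neutral", "positive", "neutral"]
labels = ["negative", "neutral", "positive"]

# Count each (actual, predicted) pair; missing pairs count as 0
pair_counts = Counter(zip(actual, predicted))

# Rows are actual labels, columns are predicted labels
matrix = [[pair_counts[(a, p)] for p in labels] for a in labels]

for label, row in zip(labels, matrix):
    print(f"{label:>8}: {row}")
```

Only the neutral row has its count on the diagonal, so one of the three headlines is classified correctly.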


## 5. Model Evaluation

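Let's evaluate the model's performance on the sampled test headlines. The sample here contains only three rows, so the metrics below act as a smoke test rather than a reliable estimate of accuracy.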

In [7]:
from sklearn.metrics import classification_report, confusion_matrix

# Calculate metrics
y_true = results_df['Actual Sentiment']
y_pred = results_df['Predicted Sentiment']

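# Print classification report; zero_division=0 reports 0.00 for classes with
# no predicted samples instead of emitting an UndefinedMetricWarning
print(classification_report(y_true, y_pred, zero_division=0))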

              precision    recall  f1-score   support

    negative       0.00      0.00      0.00         1
     neutral       0.50      1.00      0.67         1
    positive       0.00      0.00      0.00         1

    accuracy                           0.33         3
   macro avg       0.17      0.33      0.22         3
weighted avg       0.17      0.33      0.22         3
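
To see where these numbers come from, the per-class metrics can be recomputed by hand from the three rows in the results table. This is a minimal sketch using only the standard library. The helper `precision_recall_f1` is hypothetical, and it returns 0.0 whenever a denominator is zero, which is an assumption chosen to match the 0.00 entries in the report.

```python
def precision_recall_f1(actual, predicted, label):
    """Compute precision, recall, and F1 for one class, using 0.0 when undefined."""
    tp = sum(a == label and p == label for a, p in zip(actual, predicted))
    predicted_count = sum(p == label for p in predicted)
    actual_count = sum(a == label for a in actual)
    precision = tp / predicted_count if predicted_count else 0.0
    recall = tp / actual_count if actual_count else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Labels copied from the three rows in the results table
actual = ["neutral", "negative", "positive"]
predicted = ["neutral", "positive", "neutral"]

scores = {
    label: precision_recall_f1(actual, predicted, label)
    for label in ["negative", "neutral", "positive"]
}
accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)

for label, (p, r, f) in scores.items():
    print(f"{label:>8}  precision={p:.2f}  recall={r:.2f}  f1={f:.2f}")
print(f"accuracy={accuracy:.2f}")
```

The neutral class gets precision 0.50 because two headlines were predicted neutral and only one of them actually is. No headline was predicted negative, so precision for that class is undefined and is reported as zero.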



