## Credibility Model Accuracy

To assess the accuracy of the hybrid credibility model, we compare its predicted scores against a small set of expert-assigned credibility labels for sample URLs.

This evaluation indicates how closely the model's scores align with human judgments. Because the sample is small, treat the result as an illustrative check of combining rule-based signals with ML predictions, not as a benchmark.


In [None]:
# -----------------------------
# Imports
# -----------------------------
import pandas as pd

# Credibility scoring function from this project
from assess_credibility import assess_url_credibility

# -----------------------------
# Example URLs with expert credibility scores
# -----------------------------
# 'expert_score' is a hypothetical rating from 0 (low credibility) to 1 (high credibility)
sample_urls = [
    {"url": "https://www.nytimes.com/2023/10/01/science/article.html", "expert_score": 0.95},
    {"url": "https://www.exampleblog.com/opinion", "expert_score": 0.40},
    {"url": "https://www.wikipedia.org/wiki/Artificial_intelligence", "expert_score": 0.85},
    {"url": "https://www.unknownsite1234.com/news", "expert_score": 0.20},
]

# Score each URL and pair the prediction with its expert rating
predictions = []
for entry in sample_urls:
    result = assess_url_credibility(entry["url"])
    predictions.append(
        {
            "url": entry["url"],
            "predicted_score": result["score"],
            "expert_score": entry["expert_score"],
        }
    )

# -----------------------------
# Display predicted vs expert scores
# -----------------------------
df_accuracy = pd.DataFrame(predictions)
# Absolute per-URL gap between the predicted and expert scores
df_accuracy["difference"] = (df_accuracy["predicted_score"] - df_accuracy["expert_score"]).abs()
df_accuracy


### Accuracy Metrics

We compute the mean absolute error (MAE) to quantify alignment with expert ratings:

$$
\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| \text{predicted}_i - \text{expert}_i \right|
$$

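Here N is the number of sample URLs. A lower MAE indicates closer agreement with human judgments. On its own, this number measures only the hybrid model; to support the claim that it improves on a rules-only approach, the same MAE must be computed for rules-only scores on the same URLs and the two values compared.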
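
As a rough illustration of the formula and of a rules-only comparison, the sketch below computes the MAE for two sets of scores against the expert ratings listed earlier. The `hybrid_scores` and `rules_only_scores` values are invented placeholders, not outputs of `assess_url_credibility`, and `mean_absolute_error` is a helper defined here for the example only.

```python
def mean_absolute_error(predicted, expert):
    """Return the mean of |predicted_i - expert_i| over paired scores."""
    if len(predicted) != len(expert):
        raise ValueError("predicted and expert must have the same length")
    if not predicted:
        raise ValueError("at least one score pair is required")
    return sum(abs(p - e) for p, e in zip(predicted, expert)) / len(predicted)


# Expert ratings for the four sample URLs, in the order listed earlier.
expert_scores = [0.95, 0.40, 0.85, 0.20]

# HYPOTHETICAL model outputs, made up purely to illustrate the arithmetic.
hybrid_scores = [0.90, 0.50, 0.80, 0.30]
rules_only_scores = [0.80, 0.60, 0.70, 0.45]

mae_hybrid = mean_absolute_error(hybrid_scores, expert_scores)
mae_rules = mean_absolute_error(rules_only_scores, expert_scores)

print(f"Hybrid MAE:     {mae_hybrid:.3f}")
print(f"Rules-only MAE: {mae_rules:.3f}")
```

With these placeholder numbers the hybrid errors are 0.05, 0.10, 0.05, and 0.10, which average to 0.075, while the rules-only errors average to 0.1875. A real comparison would replace both lists with scores produced by the respective models on the same URLs.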


In [None]:
# -----------------------------
# Compute Mean Absolute Error
# -----------------------------
mae = df_accuracy["difference"].mean()
print(f"Mean Absolute Error (MAE) vs expert scores: {mae:.2f}")

# Interpret the result: lower MAE = better alignment.
# The 0.25 cutoff is an illustrative threshold chosen for this notebook.
if mae < 0.25:
    print("The hybrid model shows good alignment with expert credibility ratings.")
else:
    print("The model's scores diverge noticeably from expert ratings; there is room for improvement.")
