In [8]:


"""
1. High-Confidence Fraud (The "Control" Group)
These should trigger "Fraud/Alert" with High Confidence ($>85\%$).
    1. "URGENT: Your account has been compromised. Log in at [link] immediately to prevent permanent suspension."
    2. "Action Required: We detected a suspicious login from a new device. Click here to secure your bank account now."
    3. "Your credit card has been blocked due to unusual activity. Call 1-800-FAKE-NUM to verify your identity."

2. High-Confidence Normal (Legitimate Communication)
These should trigger "Normal" with High Confidence ($>85\%$).
    1. "Hi Mom, I am running late for dinner, see you in 20 minutes."
    2. "The team meeting has been moved to Room 4B at 3:00 PM today. Please bring your notes."
    3. "I have attached the grocery list for this weekend. Let me know if I missed anything!"

3. Ambiguous Sentences (The "Uncertainty" Test)
These contain "trigger words" but in a safe context. The model should show Medium to Low Confidence ($60\% - 75\%$).
    1. "I need to go to the bank later to open a savings account for my daughter." (Uses fraud keywords in a normal way.)
    2. "Please click the link in my bio to see the photos from my vacation." (Uses "click link" in a social context.)
    3. "It is urgent that we finish this project by Friday, or the boss will be upset." (Uses "urgent" without a financial threat.)

4. Adversarial Examples (The "Stress" Test)
These are designed to "trick" the model. A good system should report Low Confidence ($<60\%$).
    1. "Congratulations! You've won a $1000 gift card! No link, just reply YES." (Missing the typical 'link' trigger.)
    2. "B-a-n-k a-c-c-o-u-n-t s-u-s-p-e-n-d-e-d. V-e-r-i-f-y n-o-w." (Spacing intended to bypass simple keyword filters.)
    3. "I am a Nigerian Prince and I have a business proposal for your account." (Classic scam, but uses very polite, non-urgent language.)
"""

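The test plan above lists sentences with an expected label and confidence range, so it can be checked in batch rather than by typing each sentence. The following is a minimal sketch, not the notebook's own code: `Case`, `TEST_PLAN`, `score_text`, `check_case`, and `run_plan` are hypothetical names, only one sentence per group is encoded, the range bounds are treated as inclusive, and the model is assumed to be any object exposing `predict_proba` with class index 1 meaning fraud.

```python
from typing import NamedTuple, Optional


class Case(NamedTuple):
    text: str
    expected_label: Optional[str]  # None where the plan states no label
    min_conf: float                # lower bound of the expected confidence
    max_conf: float                # upper bound of the expected confidence


# One sentence per group of the test plan (hypothetical encoding).
TEST_PLAN = [
    Case("Your credit card has been blocked due to unusual activity. "
         "Call 1-800-FAKE-NUM to verify your identity.", "Fraud/Alert", 0.85, 1.0),
    Case("The team meeting has been moved to Room 4B at 3:00 PM today. "
         "Please bring your notes.", "Normal", 0.85, 1.0),
    Case("I need to go to the bank later to open a savings account "
         "for my daughter.", None, 0.60, 0.75),
    Case("Congratulations! You've won a $1000 gift card! "
         "No link, just reply YES.", None, 0.0, 0.60),
]


def score_text(model, text):
    """Return (label, confidence) for one sentence."""
    proba = list(model.predict_proba([text])[0])
    idx = max(range(len(proba)), key=proba.__getitem__)  # index of the top class
    label = "Fraud/Alert" if idx == 1 else "Normal"
    return label, float(proba[idx])


def check_case(model, case):
    """Compare the model's output for one case with the plan's expectation."""
    label, conf = score_text(model, case.text)
    label_ok = case.expected_label is None or label == case.expected_label
    conf_ok = case.min_conf <= conf <= case.max_conf
    return {
        "text": case.text,
        "label": label,
        "confidence": round(conf, 4),
        "passed": label_ok and conf_ok,
    }


def run_plan(model, plan=TEST_PLAN):
    """Check every case in the plan and return one result dict per case."""
    return [check_case(model, case) for case in plan]
```

With a trained model in a variable such as `fraud_system`, `run_plan(fraud_system)` would return one dictionary per case. A "Fraud/Alert" prediction at 50.96% for the first sentence, for example, would be reported as `"passed": False` because it falls below the 85% floor.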

In [6]:
import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.calibration import CalibratedClassifierCV
from sklearn.pipeline import Pipeline
import json

# ==========================================
# 1. IMPROVED TRAINING (SPAM/FRAUD FOCUSED)
# ==========================================
def train_fraud_text_model():
    """Trains a model specifically for urgency and fraud detection."""
    try:
        data = {
            'text': [
                'Urgent: Your account is locked, click here', 'Action required: verify your password now',
                'Your bank account will be blocked', 'Winner! Claim your lottery prize immediately',
                'Suspicious activity detected on your card', 'Click this link to avoid account suspension',
                'Hello, how are you today?', 'The meeting is scheduled for 3 PM',
                'I sent you the documents via email', 'Let us go for lunch later',
                'Your order has been shipped', 'Looking forward to our catch up'
            ],
            'label': [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
        }
        df = pd.DataFrame(data)

        pipeline = Pipeline([
            ('tfidf', TfidfVectorizer(ngram_range=(1, 2))),
            ('clf', LogisticRegression())
        ])

        calibrated_model = CalibratedClassifierCV(pipeline, method='sigmoid', cv=2)
        calibrated_model.fit(df['text'], df['label'])

        return calibrated_model
    except Exception as e:
        print(f"Training Error: {e}")
        return None

# ==========================================
# 2. ENHANCED UNCERTAINTY EXPLANATION
# ==========================================
def get_fraud_explanation(text, proba):
    """Builds heuristic explanations for the model's confidence from the score, keyword hits, and input length."""
    try:
        explanations = []
        conf_score = max(proba)
        urgency_keywords = ['urgent', 'immediately', 'blocked', 'link', 'click', 'account']

        if conf_score < 0.70:
            explanations.append("High ambiguity: The language pattern is unfamiliar to the training set.")

        found_keywords = [word for word in urgency_keywords if word in text.lower()]
        if found_keywords and conf_score < 0.85:
            explanations.append(f"Security Warning: Contains high-pressure keywords {found_keywords} but lacks typical fraud syntax.")

        if len(text) < 10:
            explanations.append("Input is too brief for a reliable security assessment.")

        if not explanations:
            explanations.append("Pattern matches known historical data with high reliability.")

        return explanations
    except Exception as e:
        return [f"Explanation error: {e}"]

# ==========================================
# 3. ANALYSIS EXECUTION (WITH LOOP SUPPORT)
# ==========================================
def analyze_sentence(model):
    """Scores one user-entered sentence. Returns False when the user asks to exit, True otherwise."""
    try:
        print("\n" + "-"*50)
        user_input = input("Enter sentence to analyze (or type 'exit' to quit): ").strip()

        # Termination check
        if user_input.lower() in ['exit', 'quit', 'q']:
            print("Shutting down the Real-Time Scorer. Goodbye!")
            return False

        if not user_input:
            print("Empty input. Please provide a sentence.")
            return True

        # Get calibrated probability
        probabilities = model.predict_proba([user_input])[0]
        prediction_idx = np.argmax(probabilities)
        conf_score = probabilities[prediction_idx]

        # Classification Level
        if conf_score >= 0.85:
            level = "High"
        elif conf_score >= 0.65:
            level = "Medium"
        else:
            level = "Low"

        # Construct Structured Output
        result = {
            "prediction": "Fraud/Alert" if prediction_idx == 1 else "Normal",
            "confidence_score": f"{round(conf_score * 100, 2)}%",
            "confidence_level": level,
            "uncertainty_explanation": get_fraud_explanation(user_input, probabilities)
        }

        print("\n[SYSTEM OUTPUT]")
        print(json.dumps(result, indent=2))
        return True

    except Exception as e:
        print(f"Analysis Error: {e}")
        return True

# ==========================================
# 4. MAIN PRODUCTION LOOP
# ==========================================
if __name__ == "__main__":
    fraud_system = train_fraud_text_model()

    if fraud_system:
        print("Model Loaded Successfully.")
        print("Instructions: Type a sentence to see confidence scores. Type 'exit' to stop.")

        # This loop keeps the system alive
        running = True
        while running:
            try:
                running = analyze_sentence(fraud_system)
            except KeyboardInterrupt:
                print("\nInterrupted by user. Exiting...")
                break
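The keyword check in `get_fraud_explanation` uses substring matching (`word in text.lower()`), so `'link'` also matches inside "blinking" and `'account'` inside "accountant". One possible alternative, sketched here under the assumption that whole-word matching is what is wanted, tokenizes the text first. `find_keywords` and `URGENCY_KEYWORDS` are hypothetical names; the keyword list is copied from the function above.

```python
import re

# Same keywords as in get_fraud_explanation.
URGENCY_KEYWORDS = ['urgent', 'immediately', 'blocked', 'link', 'click', 'account']


def find_keywords(text, keywords=URGENCY_KEYWORDS):
    """Return the keywords that occur as whole words in text (case-insensitive)."""
    # Split the lowercased text into word tokens, dropping punctuation.
    tokens = set(re.findall(r"[a-z0-9']+", text.lower()))
    # Preserve the order of the keyword list, as the original comprehension does.
    return [word for word in keywords if word in tokens]
```

The trade-off is that whole-word matching no longer catches inflected forms such as "links" or "clicking", which the substring check does match. Whether that is acceptable depends on how the explanation text is meant to be used.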

Model Loaded Successfully.
Instructions: Type a sentence to see confidence scores. Type 'exit' to stop.

--------------------------------------------------
Enter sentence to analyze (or type 'exit' to quit): Your credit card has been blocked due to unusual activity. Call 1-800-FAKE-NUM to verify your identity.

[SYSTEM OUTPUT]
{
  "prediction": "Fraud/Alert",
  "confidence_score": "50.96%",
  "confidence_level": "Low",
  "uncertainty_explanation": [
    "High ambiguity: The language pattern is unfamiliar to the training set.",
    "Security Warning: Contains high-pressure keywords ['blocked'] but lacks typical fraud syntax."
  ]
}

--------------------------------------------------
Enter sentence to analyze (or type 'exit' to quit): The team meeting has been moved to Room 4B at 3:00 PM today. Please bring your notes

[SYSTEM OUTPUT]
{
  "prediction": "Normal",
  "confidence_score": "91.87%",
  "confidence_level": "High",
  "uncertainty_explanation": [
    "Pattern matches known historical data with high reliability."
  ]
}

--------------------------------------------------
Enter sen