<a href="https://colab.research.google.com/github/SamudralaAnuhya/FraudDetection/blob/main/FraudDetection.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [32]:
!pip install -q groq


In [33]:
!pip install -q us




Fraud Detection Chatbot using Groq Language Models
------------------------------------------------
This chatbot analyzes financial transactions for fraud risk using language models served by Groq
(mixtral-8x7b-32768, llama-3.3-70b-versatile, gemma2-9b-it). It supports three prompting
techniques (zero-shot, few-shot, chain-of-thought) to analyze transaction details and flag potential fraud.

Key Features:
- Interactive input for transaction details (amount, transaction location, user location, time)
- Multiple prompting strategies for varied analysis approaches
- Comparative model evaluation, scored by the model selected in the Model dropdown
- Transaction history tracking and CSV export
- On-demand comparison of the latest responses from two models that used the same prompt type
- Auto-fill of the user location for previously seen Customer IDs

Evaluation between models:
- Each response is scored from 1 to 5 on accuracy, clarity, and reasoning

Setup & Usage:
1. Install: ipywidgets, pandas, us, groq
2. Configure Groq API credentials
3. Run cells in sequence
4. Input transaction details and select model/prompt type
5. Use buttons:
   - Submit: Generate analysis
   - Export: Save to CSV
   - Evaluate: Compare model responses (needs at least two stored responses that share a prompt type but come from different models)

Note: For best results, use consistent Customer IDs and try different prompting techniques.

In [34]:
from google.colab import userdata
from groq import Groq

# Set up the Groq client
client = Groq(
    api_key=userdata.get('GROQ_API_KEY')
)
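
Outside Colab, google.colab.userdata is not available. A minimal sketch of a fallback, assuming the key is stored in an environment variable named GROQ_API_KEY. The helper name resolve_api_key is hypothetical and is not part of the groq package:

```python
import os

def resolve_api_key(env=None, name="GROQ_API_KEY"):
    """Return the API key from an environment mapping, or raise if it is missing.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before creating the client.")
    return key

# Demo with an explicit mapping so the sketch runs without a real key.
demo_key = resolve_api_key({"GROQ_API_KEY": "  gsk_example  "})
print(demo_key)
```

The result could then be passed to the client as Groq(api_key=resolve_api_key()).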

In [133]:
# Define Prompt Templates
ZERO_SHOT_PROMPT_TEMPLATE = """
System: You are a highly skilled fraud detection expert. Your task is to analyze financial transactions and determine if they are likely fraudulent.

User: Analyze the following transaction and provide your assessment:
- Amount: {amount}
- Location: {location}
- User Location: {user_location}
- Transaction Time: {time}

Respond with a structured analysis:
- Prediction: [Fraudulent or Legitimate]
- Reason: Provide a concise explanation for your prediction, highlighting any suspicious factors.
"""

FEW_SHOT_PROMPT_TEMPLATE = """
System: You are a fraud detection expert. Utilize the provided examples to guide your analysis of new transactions.

Example 1:
- Amount: $50
- Location: New York, USA
- User Location: New York, USA
- Transaction Time: 2:00 PM EST
- Result: Legitimate
- Reason: The transaction aligns with the user's typical spending patterns and location, indicating normal activity.

Example 2:
- Amount: $2,000
- Location: San Francisco, USA
- User Location: Los Angeles, USA
- Transaction Time: 10:00 AM PST
- Result: Potentially Fraudulent
- Reason: While the transaction location is within the same country, it's outside the user's usual city and involves a larger-than-usual amount, requiring further investigation.

Now analyze the following transaction:
- Amount: {amount}
- Location: {location}
- User Location: {user_location}
- Transaction Time: {time}

Provide your assessment:
- Prediction: [Fraudulent, Legitimate, Potentially Fraudulent]
- Reason: Explain your prediction, citing specific factors that influenced your judgment.
"""

CHAIN_OF_THOUGHT_PROMPT_TEMPLATE = """
System: You are a meticulous fraud detection analyst. Your task is to systematically evaluate transactions for potential fraud.

User: Analyze the following transaction for potential fraud:
- Amount: {amount}
- Location: {location}
- User Location: {user_location}
- Transaction Time: {time}

Think step-by-step:
1. Compare the transaction location to the user's usual location. Are they significantly different? If so, this could be a red flag.
2. Analyze the transaction time. Does it fall outside the user's typical active hours? Unusual transaction times can indicate suspicious activity.
3. Evaluate the transaction amount against the user's usual spending pattern. Is it significantly higher than their average transaction value? Large deviations from typical spending are a potential indicator of fraud.
4. Combine all these factors to decide if the transaction is likely fraudulent.

Provide your assessment:
- Prediction: [Fraudulent or Legitimate]
- Reason: Explain your prediction, clearly outlining the factors that led to your conclusion.
"""
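
The templates above are plain Python strings whose {amount}, {location}, {user_location}, and {time} placeholders are filled with str.format. A small sketch of that substitution, using a shortened stand-in template (TEMPLATE and JSON_HINT are illustrative names, not part of the notebook):

```python
# Shortened stand-in for the notebook's templates; same placeholder names.
TEMPLATE = """
Analyze the following transaction:
- Amount: {amount}
- Location: {location}
- User Location: {user_location}
- Transaction Time: {time}
"""

prompt = TEMPLATE.format(
    amount="$1,200", location="Texas", user_location="Ohio", time="03:00"
)
print(prompt)

# Literal braces must be doubled, otherwise str.format treats them as fields.
JSON_HINT = 'Reply as {{"prediction": "..."}} for amount {amount}.'
json_hint = JSON_HINT.format(amount="$1,200")
print(json_hint)
```

The doubled braces matter only if a template is later extended with literal JSON or set notation; the current templates contain no literal braces.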


In [134]:
# Cell 1: Imports and Constants
import ipywidgets as widgets
from IPython.display import Markdown, display
from us import states
import pandas as pd
import re

MODEL_OPTIONS = ["mixtral-8x7b-32768", "llama-3.3-70b-versatile", "gemma2-9b-it"]
PROMPT_OPTIONS = {
    "Zero-Shot": ZERO_SHOT_PROMPT_TEMPLATE,
    "Few-Shot": FEW_SHOT_PROMPT_TEMPLATE,
    "Chain of Thought": CHAIN_OF_THOUGHT_PROMPT_TEMPLATE,
}


In [135]:
# Cell 2: LLM Query Functions
def query_llm(model_name, chat_history):
    try:
        messages = [{"role": msg["role"], "content": msg["content"]} for msg in chat_history]
        completion = client.chat.completions.create(
            messages=messages,
            model=model_name,
            temperature=0.3,
        )
        return completion.choices[0].message.content if completion.choices else "No response generated."
    except Exception as e:
        return f"Error querying the model: {e}"


In [136]:
# Split the LLM response into the prediction label and the reason
def parse_prediction_and_reason(response_text):
    match_pred = re.search(r'(?i)\bprediction:\s*(.*)', response_text)
    match_reason = re.search(r'(?i)\breason:\s*(.*)', response_text, re.DOTALL)
    prediction = match_pred.group(1).strip() if match_pred else "Unknown"
    reason = match_reason.group(1).strip() if match_reason else "No reason provided."
    return prediction, reason
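
To illustrate how the two patterns behave, the sketch below repeats the parser and runs it on two made-up responses. It assumes that some models wrap the labels in Markdown bold, which leaves stray asterisks in the captured text; strip_markdown_emphasis is a hypothetical cleanup helper, not part of the notebook:

```python
import re

def parse_prediction_and_reason(response_text):
    # Same patterns as the cell above: the prediction stops at the end of its
    # line, while the reason (re.DOTALL) runs to the end of the response.
    match_pred = re.search(r'(?i)\bprediction:\s*(.*)', response_text)
    match_reason = re.search(r'(?i)\breason:\s*(.*)', response_text, re.DOTALL)
    prediction = match_pred.group(1).strip() if match_pred else "Unknown"
    reason = match_reason.group(1).strip() if match_reason else "No reason provided."
    return prediction, reason

def strip_markdown_emphasis(text):
    # Hypothetical helper: trim surrounding spaces, asterisks, and underscores.
    return text.strip(" *_")

plain = "- Prediction: Fraudulent\n- Reason: Amount is far above usual.\nLocation differs."
plain_pred, plain_reason = parse_prediction_and_reason(plain)

bold = "**Prediction:** Legitimate\n**Reason:** Matches the user's home state."
bold_pred, bold_reason = parse_prediction_and_reason(bold)

print(repr(plain_pred))                          # 'Fraudulent'
print(repr(bold_pred))                           # '** Legitimate'
print(repr(strip_markdown_emphasis(bold_pred)))  # 'Legitimate'
```

Applying the cleanup only to the prediction keeps any intentional Markdown in the reason text intact.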

In [137]:
def query_fraud_detection(model_name, chat_history, template, amount, location, user_location, time):
    try:
        prompt = template.format(amount=amount, location=location, user_location=user_location, time=time)
        chat_history.append({"role": "user", "content": prompt})
        response_text = query_llm(model_name, chat_history)
        chat_history.append({"role": "assistant", "content": response_text})
        prediction, reason = parse_prediction_and_reason(response_text)
        return prediction, reason
    except Exception as e:
        return f"Error querying the model: {e}", ""
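
The function above sends each formatted template as a single user message, so the "System:" and "User:" labels travel as ordinary text. One possible alternative, sketched here as an assumption and not as the notebook's method, is to place the system text under the chat API's system role. The helper name split_roles is hypothetical:

```python
def split_roles(prompt):
    """Split a 'System: ... User: ...' prompt into chat messages.

    Falls back to a single user message when the labels are absent,
    as in a template that has no 'User:' marker.
    """
    text = prompt.strip()
    if text.startswith("System:") and "\nUser:" in text:
        system_part, user_part = text.split("\nUser:", 1)
        return [
            {"role": "system", "content": system_part[len("System:"):].strip()},
            {"role": "user", "content": user_part.strip()},
        ]
    return [{"role": "user", "content": text}]

demo = "\nSystem: You are an expert.\n\nUser: Analyze this.\n- Amount: 50\n"
messages = split_roles(demo)
for message in messages:
    print(message["role"], "->", message["content"])
```

The returned list has the same shape as the messages list built in query_llm, so it could be appended to a chat history in place of the single user entry.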

In [140]:
def evaluate_stored_responses(transactions_df, evaluator_model):
    if len(transactions_df) < 2:
        print("Need at least 2 responses. Please make more predictions.")
        return None

    # Group by prompt type and model
    grouped = transactions_df.groupby(['prompt_type', 'model_name'])
    prompt_models = grouped.groups.keys()

    # Find prompt types with multiple models
    comparable_prompts = {}
    for prompt, model in prompt_models:
        if prompt not in comparable_prompts:
            comparable_prompts[prompt] = []
        comparable_prompts[prompt].append(model)

    comparable_prompts = {k:v for k,v in comparable_prompts.items() if len(v) > 1}

    if not comparable_prompts:
        print("No comparable responses found. Need same prompt type with different models.")
        return None

    # Get latest comparison for each prompt type
    results = []
    for prompt_type, models in comparable_prompts.items():
        prompt_data = transactions_df[transactions_df['prompt_type'] == prompt_type]
        responses = []
        for model in models[:2]:
            model_response = prompt_data[prompt_data['model_name'] == model].iloc[-1]
            responses.append({
                'model': model,
                'response': model_response['Response'],
                'customer_id': model_response['customer_id']
            })

        eval_prompt = f"""
        Comparing models {responses[0]['model']} vs {responses[1]['model']}
        For prompt type: {prompt_type}
        Customer ID: {responses[0]['customer_id']}

        Response 1 ({responses[0]['model']}): {responses[0]['response']}
        Response 2 ({responses[1]['model']}): {responses[1]['response']}

        Evaluate on:
        1. Accuracy (1-5)
        2. Clarity (1-5)
        3. Reasoning (1-5)

        Provide scores and brief explanation for each model.
        """

        messages = [{"role": "user", "content": eval_prompt}]
        evaluation = query_llm(evaluator_model, messages)
        results.append(f"Evaluation using {evaluator_model}:\n{evaluation}\n")

    return "\n".join(results)
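
The comparability rule used above (a prompt type qualifies only if at least two different models have answered it) can be sketched without pandas, over the list of record dicts that the interface accumulates. The function name find_comparable_prompts is hypothetical, and the models are sorted here only to make the output deterministic:

```python
from collections import defaultdict

def find_comparable_prompts(records):
    """Map each prompt type to its models, keeping types used by 2+ models."""
    models_by_prompt = defaultdict(set)
    for row in records:
        models_by_prompt[row["prompt_type"]].add(row["model_name"])
    return {
        prompt: sorted(models)
        for prompt, models in models_by_prompt.items()
        if len(models) > 1
    }

records = [
    {"prompt_type": "Zero-Shot", "model_name": "mixtral-8x7b-32768"},
    {"prompt_type": "Zero-Shot", "model_name": "gemma2-9b-it"},
    {"prompt_type": "Few-Shot", "model_name": "gemma2-9b-it"},
    {"prompt_type": "Few-Shot", "model_name": "gemma2-9b-it"},
]
comparable = find_comparable_prompts(records)
print(comparable)  # {'Zero-Shot': ['gemma2-9b-it', 'mixtral-8x7b-32768']}
```

Two Few-Shot rows from the same model do not qualify, which is why submitting two transactions alone is not always enough for the Evaluate button to produce a comparison.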

In [143]:
def fraud_detection_interactive():
    us_states = [state.name for state in states.STATES]
    hour_options = [f"{i:02d}:00" for i in range(24)]

    transactions_data = []
    chat_history = []
    customer_data = {}

    # Create widgets
    customer_id_widget = widgets.Text(description="Customer ID:", placeholder="Enter customer ID")
    model_widget = widgets.Dropdown(description="Model:", options=MODEL_OPTIONS)
    prompt_widget = widgets.Dropdown(description="Prompt:", options=list(PROMPT_OPTIONS.keys()))
    amount_widget = widgets.Text(description="Amount ($):", placeholder="Enter amount")
    location_widget = widgets.Combobox(description="Location:", options=us_states,
                                     placeholder="Enter or select location", ensure_option=False)
    user_location_widget = widgets.Combobox(description="User Location:", options=us_states,
                                          placeholder="Enter or select location", ensure_option=False)
    time_widget = widgets.Combobox(description="Time:", options=hour_options,
                                 placeholder="Enter or select time", ensure_option=False)
    submit_button = widgets.Button(description="Submit")
    export_button = widgets.Button(description="Export to CSV")
    evaluate_button = widgets.Button(description="Evaluate Stored Responses")
    output_area = widgets.Output()
    table_output = widgets.Output()

    def on_location_change(change):
        if change['name'] == 'value' and change['new'] == 'Other':
            location_widget.value = ''
            location_widget.placeholder = 'Enter a custom location...'

    def on_user_location_change(change):
        if change['name'] == 'value' and change['new'] == 'Other':
            user_location_widget.value = ''
            user_location_widget.placeholder = 'Enter a custom location...'

    def on_customer_id_change(change):
        new_customer_id = change['new']
        if new_customer_id in customer_data:
            user_location_widget.value = customer_data[new_customer_id]['user_location']
        else:
            user_location_widget.value = ""

    def on_submit(_):
        with output_area:
            output_area.clear_output()
            data = {
                'customer_id': customer_id_widget.value,
                'model_name': model_widget.value,
                'prompt_type': prompt_widget.value,
                'amount': amount_widget.value,
                'location': location_widget.value,
                'user_location': user_location_widget.value,
                'time': time_widget.value
            }

            if not all(data.values()):
                print("Error: All fields are required.")
                return

            prediction, response = query_fraud_detection(
                data['model_name'], chat_history, PROMPT_OPTIONS[data['prompt_type']],
                data['amount'], data['location'], data['user_location'], data['time']
            )

            print("Transaction Details:")
            for key, value in data.items():
                print(f"{key}: {value}")
            print("\nModel's Response:")
            display(Markdown(response))

            transactions_data.append({**data, 'prediction': prediction, 'Response': response})
            # Remember the user location so it is auto-filled the next time this customer ID is entered
            customer_data[data['customer_id']] = {'user_location': data['user_location']}

            for widget in [customer_id_widget, amount_widget, location_widget,
                         user_location_widget, time_widget]:
                widget.value = ""

            with table_output:
                table_output.clear_output()
                display(pd.DataFrame(transactions_data))

    def on_export(_):
        with output_area:
            output_area.clear_output()
            if not transactions_data:
                print("No transaction data to export!")
                return
            df = pd.DataFrame(transactions_data)
            try:
                df.to_csv("transactions_log.csv", index=False)
                display(Markdown("**Exported to transactions_log.csv!**"))
                # display(df.head())
            except Exception as e:
                print(f"Error exporting: {e}")

    def on_evaluate(_):
        with output_area:
            output_area.clear_output()
            df = pd.DataFrame(transactions_data)
            eval_result = evaluate_stored_responses(df, model_widget.value)
            if eval_result:
                display(Markdown(eval_result))

    # Setup observers
    location_widget.observe(on_location_change, names='value')
    user_location_widget.observe(on_user_location_change, names='value')
    customer_id_widget.observe(on_customer_id_change, names='value')

    # Setup button handlers
    submit_button.on_click(on_submit)
    export_button.on_click(on_export)
    evaluate_button.on_click(on_evaluate)

    # Display widgets
    display(
        customer_id_widget, model_widget, prompt_widget,
        amount_widget, location_widget, user_location_widget,
        time_widget, submit_button, export_button, evaluate_button,
        output_area, table_output
    )

# Run the interface
fraud_detection_interactive()
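
The Amount field is free text and is passed to the prompt exactly as typed. A minimal sketch of optional validation before submission, assuming amounts are positive, finite dollar values. parse_amount is a hypothetical helper and is not called anywhere in the notebook:

```python
import math

def parse_amount(raw):
    """Return the amount as a positive float, or None if the text is not valid."""
    cleaned = raw.strip().replace("$", "").replace(",", "")
    try:
        value = float(cleaned)
    except ValueError:
        return None
    # Reject NaN, infinities, zero, and negative values.
    if not math.isfinite(value) or value <= 0:
        return None
    return value

for text in ["$1,200.50", "75", "abc", "-5", ""]:
    print(repr(text), "->", parse_amount(text))
```

In on_submit, a None result could be reported with the same kind of error message used for missing fields.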
