# 3. Human-AI Interaction

***Your information:***
* Name:     Yugal Jagtap
* UBT ID:   bt727466
* E-Mail:   bt727466@uni-bayreuth.de
<br>
I confirm the solution in this notebook is my own work.

**Background:**<br>
Problems in the COMPAS case arise not only from the software but also from how judges use it. With the analysis of tasks 1 and 2 in mind, you want to consult Northpointe and judges on how to effectively deploy the COMPAS risk scores for productive use. For this purpose you apply what you learned about human-AI interaction and cognitive biases.

**Objective:**<br>
As a showcase, you deploy your surrogate model and build a prototypical interface that steers the human-AI interaction. The human-AI interaction should explicitly account for biases and limitations of your surrogate model as well as biases and limitations of judges.

**Deliverables:**<br>
1. Briefly comment on general concerns you have with using recidivism prediction software in a legal system. Given COMPAS software is still in use in the US, what safeguards should there be to reduce ethical and societal risks?
2. Create a web (`localhost`) interface that deploys your surrogate model and predicts a risk score for submitted defendant profiles.
3. In bullet-points, sketch a human-AI interaction framework (not in code!) that explicitly accounts for the anticipated biases and limitations of both your surrogate model and the judges interacting with it. It must include a model prediction and a final decision, but you are free about the roles, sequences, explanations, visualizations etc.
4. Discuss how your design choices reduce the risks.
5. Integrate **at least one** of your design choices in your deployed interface.

## 1. Concerns with Recidivism Prediction Software
* **Bias Perpetuation:** Algorithms trained on historical arrest data may learn and reinforce existing racial and socioeconomic biases. As seen with COMPAS, this can lead to higher False Positive Rates for certain demographic groups (e.g., African-Americans).
* **Lack of Transparency:** Proprietary algorithms ("black boxes") make it difficult for defendants and defense attorneys to challenge the validity of the risk score.
* **Automation Bias:** Judges may over-rely on the algorithm's output, treating it as objective truth rather than a probabilistic estimate, effectively delegating judicial discretion to a machine.
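
The group-wise False Positive Rate mentioned above can be checked with a few lines of Python. This is a minimal sketch, assuming each record is a hypothetical `(group, reoffended, predicted_high_risk)` tuple; the function name and the toy rows are illustrative and are not taken from the COMPAS data.

```python
from collections import defaultdict

def false_positive_rate_by_group(records):
    """Return {group: FPR}, where FPR = false positives / all true negatives.

    records: iterable of (group, reoffended, predicted_high_risk) tuples.
    Groups without any non-reoffending defendants are omitted.
    """
    false_pos = defaultdict(int)
    negatives = defaultdict(int)
    for group, reoffended, predicted_high_risk in records:
        if not reoffended:                 # only true negatives enter the FPR
            negatives[group] += 1
            if predicted_high_risk:        # flagged high risk, but did not reoffend
                false_pos[group] += 1
    return {g: false_pos[g] / n for g, n in negatives.items()}

# Synthetic toy rows for illustration only (NOT real COMPAS records).
toy_records = [
    ("A", 0, 1), ("A", 0, 1), ("A", 1, 1), ("A", 0, 0),
    ("B", 0, 0), ("B", 0, 1), ("B", 1, 1), ("B", 0, 0),
]
fpr = false_positive_rate_by_group(toy_records)
print(fpr)
```

A large gap between the per-group values is the kind of disparity the first concern describes.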

**Safeguards for Ethical Use:**
* **Explainability:** Systems must provide interpretable reasons for their scores (e.g., "Risk high due to X prior convictions") rather than just a number.
* **Mandatory Human-in-the-Loop:** Risk scores should never automatically determine sentencing; they should serve only as one of many factors considered by a human judge.
* **Training:** Stakeholders (judges, lawyers) must be trained on the limitations and statistical meaning of these scores.

## 2. Human-AI Interaction Framework
**Framework Sketch:**

* **Role Definition:** 
    * **AI:** Decision Support / Second Opinion. NOT the decision maker.
    * **Judge:** Final Decision Maker / Arbiter of Context.

* **Interaction Sequence:**
    1. **Case Review:** Judge reviews the case details *before* seeing the AI score (to form an independent initial impression).
    2. **AI Input:** Judge requests AI assessment.
    3. **Presentation:** AI presents Risk Score, **Confidence Interval**, and **Top 3 Contributing Factors** (Why is this score high?).
    4. **Bias Alert:** If the defendant belongs to a group with known high FPR, display a specific "Bias Caution" warning.
    5. **Decision:** Judge enters final decision. If the decision aligns with a high-risk AI score but the defendant has few priors, the system prompts for a brief typed justification (forcing cognitive engagement).
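
The gating logic of steps 1, 2, and 5 could be sketched as follows. This is only an illustration under stated assumptions: the `Review` class, both function names, and the `FEW_PRIORS_THRESHOLD` cut-off are hypothetical, and the high-risk threshold of 8 simply mirrors the "High" band used in the interface code.

```python
from dataclasses import dataclass
from typing import Optional

HIGH_RISK_THRESHOLD = 8    # assumption: scores of 8 and above count as "High"
FEW_PRIORS_THRESHOLD = 2   # hypothetical cut-off for "few priors"

@dataclass
class Review:
    initial_assessment: Optional[str] = None  # judge's view before seeing the score
    score_revealed: bool = False

def reveal_score(review: Review, score: int) -> int:
    """Steps 1-2: release the score only after an initial assessment is recorded."""
    if not review.initial_assessment:
        raise PermissionError(
            "Record an initial assessment before requesting the AI score."
        )
    review.score_revealed = True
    return score

def needs_justification(score: int, priors_count: int, decision_follows_ai: bool) -> bool:
    """Step 5: require a typed justification when the judge follows a
    high-risk score although the defendant has few priors."""
    return (
        decision_follows_ai
        and score >= HIGH_RISK_THRESHOLD
        and priors_count <= FEW_PRIORS_THRESHOLD
    )

# Usage: the score stays hidden until the judge has written something down.
review = Review()
review.initial_assessment = "Low flight risk, first adult offence."
shown = reveal_score(review, 9)
print(shown, needs_justification(shown, priors_count=1, decision_follows_ai=True))
```

Keeping both rules as small pure functions makes them easy to call from a form handler and to test on their own.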

## 3. Design Choices
* **Addressing Automation Bias and Anchoring:** Requiring the judge to review the case *before* seeing the score reduces anchoring, where the judge builds a mental narrative to fit the AI's score. The "justification prompt" for contentious decisions adds constructive friction, so the judge reasons critically instead of passively accepting the AI's suggestion.
* **Addressing Transparency:** Showing "Contributing Factors" moves the system from a black box to a glass box, allowing the judge to spot if the AI is relying on proxies for race (e.g., zip code) or irrelevant data.
* **Addressing Fairness:** The "Bias Alert" explicitly reminds the user of the model's known limitations regarding specific demographics, encouraging a more cautious interpretation of high-risk scores for those groups.
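
One simple way to produce the "Top 3 Contributing Factors" is to rank features by their contribution to a linear score. The sketch below assumes a linear surrogate with known coefficients and a reference ("baseline") profile; the stacked model deployed in this notebook is not necessarily linear, and all coefficients and baseline values here are made up for illustration.

```python
def top_contributing_factors(coefficients, feature_values, baseline, k=3):
    """Rank features by |coef * (value - baseline)| for a linear score.

    Returns the k (feature, contribution) pairs with the largest magnitude.
    """
    contributions = {
        name: coefficients[name] * (feature_values[name] - baseline[name])
        for name in coefficients
    }
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:k]

# Made-up coefficients and baseline profile, for illustration only.
coefficients = {"priors_count": 0.5, "age": -0.25, "juv_fel_count": 1.0, "juv_misd_count": 0.5}
baseline = {"priors_count": 2, "age": 32, "juv_fel_count": 0, "juv_misd_count": 0}
defendant = {"priors_count": 8, "age": 24, "juv_fel_count": 1, "juv_misd_count": 0}

for name, contribution in top_contributing_factors(coefficients, defendant, baseline):
    print(f"{name}: {contribution:+.2f}")
```

A positive contribution pushes the score up relative to the baseline profile. For a non-linear model, a model-agnostic attribution method would be needed instead.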

## 4. Deployed Interface
The following cell contains the code to run the Flask application directly from this notebook.

In [1]:

from flask import Flask, render_template, request
import pandas as pd
import numpy as np
import joblib
import os

# Initializing Flask App
app = Flask(__name__)

# Loading Models
MODEL_DIR = 'models'
try:
    print("Loading models...")
    tabular_model = joblib.load(os.path.join(MODEL_DIR, 'tabular_model.pkl'))
    text_model = joblib.load(os.path.join(MODEL_DIR, 'text_model.pkl'))
    meta_model = joblib.load(os.path.join(MODEL_DIR, 'meta_model.pkl'))
    print("Models loaded successfully.")
except Exception as e:
    # Stop here: without the models, every /predict request would fail with a NameError.
    print(f"Error loading models: {e}")
    raise

@app.route('/')
def index():
    return render_template('index.html')

@app.route('/predict', methods=['POST'])
def predict():
    try:
        # Extracting features from form
        data = {
            'age': [int(request.form['age'])],
            'sex': [request.form['sex']],
            'race': [request.form['race']],
            'priors_count': [int(request.form['priors_count'])],
            'juv_fel_count': [int(request.form['juv_fel_count'])],
            'juv_misd_count': [int(request.form['juv_misd_count'])],
            'juv_other_count': [int(request.form['juv_other_count'])],
            'c_charge_degree': [request.form['c_charge_degree']],
            'c_charge_desc': [request.form['c_charge_desc']],
            'days_b_screening_arrest': [float(request.form['days_b_screening_arrest'])],
            'c_days_from_compas': [float(request.form['c_days_from_compas'])]
        }
        
        # Create DataFrame
        X_input = pd.DataFrame(data)
        
        # Generate Predictions
        # 1. Tabular Model Probs
        tab_probs = tabular_model.predict_proba(X_input)
        
        # 2. Text Model Probs
        text_probs = text_model.predict_proba(X_input['c_charge_desc'].astype(str))
        
        # 3. Meta Features
        meta_features = np.hstack((tab_probs, text_probs))
        
        # 4. Final Prediction
        prediction = meta_model.predict(meta_features)[0]
        
        # Determine visual feedback (bands: <= 4 Low, 5-7 Medium, >= 8 High)
        score = int(prediction)
        risk_level = "Low"
        risk_class = "low-risk"
        
        if 4 < score <= 7:
            risk_level = "Medium"
            risk_class = "medium-risk"
        elif score > 7:
            risk_level = "High"
            risk_class = "high-risk"

        return render_template('result.html', score=score, risk_level=risk_level, risk_class=risk_class)

    except Exception as e:
        # Return an error status so that failures are not reported as HTTP 200
        return f"An error occurred during prediction: {e}", 500

if __name__ == '__main__':
    print("Starting Flask app...")
    print("Go to http://127.0.0.1:5000 to view the app.")
    print("Interrupt the kernel to stop the server.")
    app.run(port=5000, debug=False)

Loading models...
Models loaded successfully.
Starting Flask app...
Go to http://127.0.0.1:5000 to view the app.
Interrupt the kernel to stop the server.
 * Serving Flask app '__main__'
 * Debug mode: off


 * Running on http://127.0.0.1:5000
Press CTRL+C to quit
127.0.0.1 - - [17/Feb/2026 10:32:25] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [17/Feb/2026 10:32:26] "GET /favicon.ico HTTP/1.1" 404 -
127.0.0.1 - - [17/Feb/2026 10:32:55] "POST /predict HTTP/1.1" 200 -
127.0.0.1 - - [17/Feb/2026 10:33:00] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [17/Feb/2026 10:33:18] "POST /predict HTTP/1.1" 200 -
127.0.0.1 - - [17/Feb/2026 10:33:22] "GET / HTTP/1.1" 200 -
127.0.0.1 - - [17/Feb/2026 10:33:42] "POST /predict HTTP/1.1" 200 -
