# COMPAS Recidivism Prediction Analysis - Exercise
## Building and Evaluating Fair Machine Learning Models

### Introduction
In this exercise, you will analyze the COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) dataset to:
1. Build a recidivism prediction model
2. Evaluate fairness across different demographic groups
3. Compare different ML algorithms

**What is recidivism?** A relapse into criminal behavior. In this dataset, it is recorded as whether a person reoffended within 2 years.

**Your goal:** Build a model that predicts recidivism while considering fairness across racial groups

### Dataset Information
The COMPAS dataset contains criminal history data and risk assessments. Key columns include:
- `two_year_recid`: Our target variable (0 = no recidivism, 1 = recidivated within 2 years)
- `race`: The person's recorded race, used as the sensitive attribute in the fairness analysis
- Criminal history features: `juv_fel_count`, `juv_misd_count`, `juv_other_count`, `priors_count`
- `decile_score`: COMPAS risk score (1-10, higher = higher risk)

## Part 1: Building the Model

### Step 1: Import Required Libraries
Import all the necessary packages for this analysis.

In [None]:
# TODO: Import required packages
# You'll need: pandas, sklearn modules for decision trees, train_test_split, accuracy_score
# Also import matplotlib.pyplot and seaborn for visualizations



print("Libraries imported successfully!")

### Step 2: Load and Explore the Data
Load the COMPAS dataset and explore its basic properties.

In [None]:
# TODO: Load the dataset 'compas-scores-two-years.csv' into a DataFrame named df
# (later cells refer to it as df)


# TODO: Print the shape of the dataset and view the first few rows



### Step 3: Clean the Data
Clean the dataset according to the following criteria:
1. Keep only cases where `days_b_screening_arrest` is between -30 and 30 (inclusive)
2. Remove cases where `is_recid` equals -1
3. Remove traffic offenses (where `c_charge_degree` equals "O")
4. Remove rows where `score_text` equals "N/A"

In [None]:
print(f"Starting with {len(df)} records")

# TODO: Apply the four cleaning steps described above
# Print the number of records after each step




print(f"Final dataset: {len(df)} records")

### Step 4: Explore Key Variables
Analyze the distribution of the target variable and demographic information.

In [None]:
# TODO: Calculate and print the distribution of two_year_recid
# Show both counts and percentages


# TODO: Show the distribution of races in the dataset



### Step 5: Prepare Features and Target
Select your features and target variable for the model.

In [None]:
# TODO: Create X with these features: "juv_fel_count", "juv_misd_count", "juv_other_count", "priors_count"
# Create y with the target: "two_year_recid"



# TODO: Print the shapes and show basic statistics of X



### Step 6: Split Data for Training and Testing
Create train and test sets with 70% for training and 30% for testing.

In [None]:
# TODO: Split the data using train_test_split
# Use test_size=0.3 and random_state=3



# TODO: Print the sizes of train and test sets
# Also print the recidivism rates in each set



### Step 7: Train a Decision Tree Model
Train a decision tree classifier with max_depth=3.

In [None]:
# TODO: Create and train a DecisionTreeClassifier
# Use max_depth=3



print("Model trained successfully!")

### Step 8: Make Predictions and Evaluate
Make predictions on the test set and evaluate the model's performance.
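
Before reaching for `sklearn.metrics.confusion_matrix`, it may help to see what the matrix contains. The sketch below counts the four outcomes by hand on made-up labels (not COMPAS data) and arranges them in the layout sklearn uses for binary 0/1 labels: rows are actual classes and columns are predicted classes.

```python
from collections import Counter

# Made-up labels for illustration only (1 = recidivated, 0 = did not).
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_pred = [0, 1, 1, 0, 0, 1, 0, 1]

# Count each (actual, predicted) pair.
pairs = Counter(zip(y_true, y_pred))
tn, fp = pairs[(0, 0)], pairs[(0, 1)]  # actual 0: true negatives, false positives
fn, tp = pairs[(1, 0)], pairs[(1, 1)]  # actual 1: false negatives, true positives

# Rows = actual class, columns = predicted class.
matrix = [[tn, fp], [fn, tp]]

# Accuracy is the share of predictions on the diagonal.
accuracy = (tn + tp) / len(y_true)

print(matrix)
print(f"accuracy = {accuracy:.2f}")
```

The off-diagonal cells are the two kinds of error. They matter in Part 2, where false positives are compared across groups.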

In [None]:
# TODO: Make predictions on the test set


# TODO: Calculate and print the accuracy


# TODO: Create and display a confusion matrix
# Hint: Use sklearn.metrics.confusion_matrix


# TODO: Print the decision tree structure using export_text



### Step 9: Compare with COMPAS
Compare your model's predictions with the original COMPAS scores.
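
The comparison comes down to two ideas: turning a 1-10 score into a binary prediction with a threshold, and measuring how often two sets of predictions match. The sketch below shows both on made-up lists. All values and the helper `fraction_equal` are hypothetical, and the real exercise should use the DataFrame columns and test-set indices instead.

```python
# Made-up values for illustration only.
decile_scores = [2, 7, 5, 9, 1, 6]  # hypothetical COMPAS decile scores
model_preds = [0, 1, 1, 1, 0, 0]    # hypothetical predictions from your model
y_true = [0, 1, 0, 0, 0, 1]         # hypothetical actual outcomes

# A score above 5 counts as a high-risk prediction (1), otherwise 0.
compas_preds = [int(score > 5) for score in decile_scores]


def fraction_equal(a, b):
    """Share of positions where two equal-length sequences match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Agreement compares the two sets of predictions with each other;
# accuracy compares each set of predictions with the actual outcomes.
agreement = fraction_equal(model_preds, compas_preds)
model_accuracy = fraction_equal(model_preds, y_true)
compas_accuracy = fraction_equal(compas_preds, y_true)

print(f"agreement       = {agreement:.2f}")
print(f"model accuracy  = {model_accuracy:.2f}")
print(f"COMPAS accuracy = {compas_accuracy:.2f}")
```

Note that agreement and accuracy answer different questions: two models can agree with each other often and still both be wrong.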

In [None]:
# TODO: Convert COMPAS decile_score to binary (>5 means high risk = 1)


# TODO: Get COMPAS predictions for the test set indices


# TODO: Calculate the agreement between your model and COMPAS
# Also compare the accuracy of both models



## Part 2: Fairness Analysis

### Understanding Fairness Metrics

You will implement three fairness metrics:

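1. **Demographic Parity Difference**: The gap between the highest and lowest positive prediction rates across groups (0 means every group is predicted to recidivate at the same rate)
2. **False Positive Rate**: The share of people who did not reoffend but were predicted to recidivate
3. **Equalized Odds Difference**: The larger of the gaps in true positive rate and false positive rate across groups (0 means both rates are the same for every group)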
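
To make these metrics concrete, here is a minimal hand-computed sketch on made-up data with two hypothetical groups, "A" and "B". It is not the fairlearn implementation. It assumes that demographic parity difference is the gap in positive prediction rates, and that equalized odds difference is the larger of the true positive rate gap and the false positive rate gap, which is how fairlearn documents its default behavior.

```python
# Made-up labels, predictions, and group names for illustration only.
y_true = [0, 0, 1, 1, 0, 0, 1, 1]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
group = ["A", "A", "A", "A", "B", "B", "B", "B"]


def rates(g):
    """Return (selection rate, false positive rate, true positive rate) for group g."""
    rows = [(t, p) for t, p, grp in zip(y_true, y_pred, group) if grp == g]
    selection = sum(p for _, p in rows) / len(rows)  # share predicted positive
    negatives = [p for t, p in rows if t == 0]       # predictions for actual 0s
    positives = [p for t, p in rows if t == 1]       # predictions for actual 1s
    fpr = sum(negatives) / len(negatives)
    tpr = sum(positives) / len(positives)
    return selection, fpr, tpr


sel_a, fpr_a, tpr_a = rates("A")
sel_b, fpr_b, tpr_b = rates("B")

# Gap in positive prediction rates between the two groups.
dp_diff = abs(sel_a - sel_b)

# Larger of the FPR gap and the TPR gap.
eo_diff = max(abs(fpr_a - fpr_b), abs(tpr_a - tpr_b))

print(f"selection rates: A={sel_a}, B={sel_b}, difference={dp_diff}")
print(f"FPR: A={fpr_a}, B={fpr_b}; TPR: A={tpr_a}, B={tpr_b}")
print(f"equalized odds difference={eo_diff}")
```

With more than two groups, the same idea applies using the maximum and minimum rate across all groups rather than a single pairwise gap.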

### Step 1: Calculate Demographic Parity Difference

In [None]:
# TODO: Import demographic_parity_difference from fairlearn.metrics


# TODO: Get race information for the test set


# TODO: Calculate and print the demographic parity difference


# TODO: Interpret the result. What does this value mean?

### Step 2: Analyze False Positive Rates
Calculate the false positive rate for Black individuals and compare with the overall rate.

In [None]:
# TODO: Import false_positive_rate from fairlearn.metrics


# TODO: Get race data for test set


# TODO: Filter for Black individuals (recorded as "African-American" in the race column)
# Calculate FPR for this group


# TODO: Calculate overall FPR


# TODO: Print and compare the results



### Step 3: Implement Third Fairness Metric
Choose and implement one additional fairness metric from fairlearn.

Options include:
- `equalized_odds_difference`
- `true_positive_rate_difference`
- `false_negative_rate_difference`

In [None]:
# TODO: Choose and implement a third fairness metric
# Import the metric, calculate it, and interpret the results




### Experiment with Model Parameters
Try different max_depth values and see how they affect accuracy and fairness.

In [None]:
# TODO: Test max_depth values of 2, 3, and 5
# For each, calculate accuracy and demographic parity difference
# Which provides the best balance?

for depth in [2, 3, 5]:
    # TODO: Train model with this depth
    
    
    # TODO: Calculate metrics
    
    
    print(f"\nMax Depth = {depth}:")
    # TODO: Print results
    

## Part 3: Try Different Models

Implement at least two different models from the following options:
- Random Forest
- Support Vector Machine (SVM)
- K-Nearest Neighbors (KNN)
- Logistic Regression

### Model 1: [Choose Your Model]

In [None]:
# TODO: Import and implement your first alternative model
# Train it, make predictions, and calculate accuracy and fairness metrics



# TODO: Print results



### Model 2: [Choose Your Model]

In [None]:
# TODO: Import and implement your second alternative model
# Train it, make predictions, and calculate accuracy and fairness metrics



# TODO: Print results



### Compare All Models
Create a summary comparing all the models you've trained.

In [None]:
# TODO: Create a comparison table or visualization
# Include at least: model name, accuracy, and one fairness metric



# TODO: Which model would you recommend and why?



## Reflection Questions

Answer the following questions based on your analysis:

1. **Accuracy vs. Fairness**: Did you notice any trade-offs between model accuracy and fairness? Explain.

2. **Feature Selection**: The model uses only criminal history features. What are the ethical implications of this choice?

3. **Real-world Impact**: How might false positives and false negatives affect real people in the criminal justice system?

4. **Improvements**: What changes would you suggest to make the model more fair while maintaining reasonable accuracy?

### Your Answers:

1. [Your answer here]

2. [Your answer here]

3. [Your answer here]

4. [Your answer here]

## Bonus Challenges (Optional)

If you finish early, try these additional challenges:

1. **Feature Engineering**: Create new features (e.g., total juvenile offenses) and see if they improve the model

2. **Visualization**: Create visualizations showing fairness metrics across different demographic groups

3. **Additional Sensitive Features**: Analyze fairness with respect to age or sex

4. **Cross-validation**: Implement cross-validation to get more robust performance estimates

In [None]:
# Space for bonus challenges

