# Chapter 36: Why Accuracy Misleads

⚠️ **DO NOT SKIP THIS CELL**

## Run the Next Cell
### Before executing any other cell, run the next cell to set up the project folder environment.

In [None]:
from pathlib import Path

try:
    from google.colab import drive
    IN_COLAB = True
except ImportError:
    IN_COLAB = False

if IN_COLAB:
    drive.mount("/content/drive")
    PROJECT_ROOT = Path("/content/drive/MyDrive/DataScience/census-education-analysis")
else:
    PROJECT_ROOT = Path.cwd().parent

DATA_DIR = PROJECT_ROOT / "data"
RAW_DIR = DATA_DIR / "raw"
STAGING_DIR = DATA_DIR / "staging"
PROCESSED_DIR = DATA_DIR / "processed"
OUTPUTS_DIR = PROJECT_ROOT / "outputs"

PROJECT_ROOT


## Problem 1: What Dataset Are We Evaluating?

In [None]:
import pandas as pd

input_path = OUTPUTS_DIR / "india_model_scored.csv"
df = pd.read_csv(input_path)

df.head()

## Problem 2: What Does Accuracy Actually Measure?

## Problem 3: Why Can a Model Be Accurate and Still Useless?

In [None]:
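# Accuracy of a trivial rule that never flags anything:
# it equals the share of rows whose priority_flag is False.
baseline_accuracy = (df["priority_flag"] == False).mean()
baseline_accuracy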
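
As a rough illustration of why this baseline matters, the sketch below uses a small made-up frame (the values are hypothetical and are not taken from `india_model_scored.csv`). A rule that never flags anything scores the same accuracy as the share of `False` labels, yet it finds none of the priority rows. The names `toy` and `always_false` are assumptions introduced only for this example.

```python
import pandas as pd

# Hypothetical toy labels: 2 priority rows out of 10 (illustrative only).
toy = pd.DataFrame({"priority_flag": [True, True] + [False] * 8})

# A "model" that never flags anything.
toy["always_false"] = False

# Accuracy of the do-nothing rule equals the share of non-priority rows.
toy_accuracy = (toy["always_false"] == toy["priority_flag"]).mean()

# Recall: share of true priority rows that the rule actually flags.
toy_recall = toy.loc[toy["priority_flag"], "always_false"].mean()

print(f"accuracy={toy_accuracy:.2f}, recall={toy_recall:.2f}")
```

On this toy frame the accuracy looks respectable while recall is zero, which is the sense in which a model can be accurate and still useless.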

## Problem 4: What Are False Positives and False Negatives in Decision Terms?

## Problem 5: Why Unequal Error Costs Break Accuracy

## Problem 6: How Does Accuracy Hide Structural Bias?

## Problem 7: What Should Analysts Evaluate Instead?

In [None]:
df.sort_values("risk_score", ascending=False).head(10)
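
One way to turn the ranked view above into a number is to ask how many of the top-ranked rows are actually priority rows, sometimes called precision at k. The sketch below is a minimal example on made-up values; the frame `toy`, the scores, and the choice `k = 3` are all assumptions for illustration.

```python
import pandas as pd

# Hypothetical scores and reference flags (illustrative only).
toy = pd.DataFrame({
    "risk_score":    [0.95, 0.80, 0.65, 0.40, 0.30, 0.10],
    "priority_flag": [True, False, True, True, False, False],
})

k = 3  # assumed review capacity: how many rows an analyst can inspect

# Take the k highest-scoring rows and measure the share that are priority rows.
top_k = toy.sort_values("risk_score", ascending=False).head(k)
precision_at_k = top_k["priority_flag"].mean()

print(f"precision at {k}: {precision_at_k:.2f}")
```

A measure like this ties evaluation to how the ranking would be used, rather than to a single overall hit rate.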

## Problem 8: Comparing Model Behavior to the Reference Decision

In [None]:
# Flag every row whose risk score meets or exceeds the cutoff.
threshold = 0.5
df["predicted_flag"] = df["risk_score"] >= threshold
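
Once scores are thresholded, the four outcome counts can be tallied directly. The sketch below does this on a small hypothetical frame, treating `priority_flag` as the reference decision; the values and the names `toy`, `tp`, `fp`, `fn`, and `tn` are assumptions for illustration, not results from the chapter's data.

```python
import pandas as pd

# Hypothetical scores and reference flags (illustrative only).
toy = pd.DataFrame({
    "risk_score":    [0.9, 0.7, 0.4, 0.6, 0.2, 0.1],
    "priority_flag": [True, True, True, False, False, False],
})

threshold = 0.5
toy["predicted_flag"] = toy["risk_score"] >= threshold

pred = toy["predicted_flag"]
ref = toy["priority_flag"]

# Count each cell of the 2x2 confusion table.
tp = int((pred & ref).sum())    # flagged, and priority in the reference
fp = int((pred & ~ref).sum())   # flagged, but not priority
fn = int((~pred & ref).sum())   # missed priority row
tn = int((~pred & ~ref).sum())  # correctly left unflagged

accuracy = (tp + tn) / len(toy)
precision = tp / (tp + fp) if (tp + fp) else float("nan")
recall = tp / (tp + fn) if (tp + fn) else float("nan")

print(tp, fp, fn, tn)
```

Accuracy folds all four counts into one ratio, while precision and recall keep the two kinds of error visible separately.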

In [None]:
# Rows where the thresholded prediction differs from the reference flag.
disagreements = df[df["predicted_flag"] != df["priority_flag"]]
disagreements.head()
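
Disagreements come in two kinds, and it can help to label them before reading individual rows. The following is a minimal sketch on hypothetical data, again treating `priority_flag` as the reference; the column name `error_type` and its labels are assumptions introduced for this example.

```python
import pandas as pd

# Hypothetical predictions and reference flags (illustrative only).
toy = pd.DataFrame({
    "predicted_flag": [True, False, True, False],
    "priority_flag":  [True, True, False, False],
})

# Keep only the rows where model and reference disagree.
toy_disagreements = toy[toy["predicted_flag"] != toy["priority_flag"]].copy()

# Among disagreements, a predicted True is a false positive
# and a predicted False is a false negative.
toy_disagreements["error_type"] = toy_disagreements["predicted_flag"].map(
    {True: "false_positive", False: "false_negative"}
)

error_counts = toy_disagreements["error_type"].value_counts().to_dict()
print(error_counts)
```

Splitting the disagreements this way makes it easier to ask which kind of error is more costly for the decision at hand.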

## Problem 9: Saving Evaluation-Ready Data for Interpretation

In [None]:
output_path = OUTPUTS_DIR / "india_model_evaluation_ready.csv"
df.to_csv(output_path, index=False)

output_path

## End-of-Chapter Direction