# Chapter 27: Ranking States

⚠️ **DO NOT SKIP THIS CELL**

## Run the next cell.
### Before executing any other cell, you must run the next cell to set up the project folder environment.

In [None]:
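from pathlib import Path

# Detect whether the notebook is running in Google Colab.
try:
    from google.colab import drive
    IN_COLAB = True
except ImportError:
    IN_COLAB = False

if IN_COLAB:
    # In Colab, the project lives on the mounted Google Drive.
    drive.mount("/content/drive")
    PROJECT_ROOT = Path("/content/drive/MyDrive/DataScience/census-education-analysis")
else:
    # Locally, the project root is the parent of the current working directory.
    PROJECT_ROOT = Path.cwd().parent

# Project folders used throughout the chapter.
DATA_DIR = PROJECT_ROOT / "data"
RAW_DIR = DATA_DIR / "raw"
STAGING_DIR = DATA_DIR / "staging"
PROCESSED_DIR = DATA_DIR / "processed"
OUTPUTS_DIR = PROJECT_ROOT / "outputs"

# Display the resolved project root to confirm the setup.
PROJECT_ROOT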


## Problem 1: What Data Are We Ranking?

In [None]:
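import pandas as pd

# Load the processed state-level metrics.
state_path = PROCESSED_DIR / "india_gender_metrics.csv"
state_df = pd.read_csv(state_path)

# Preview the first rows to see which columns are available for ranking.
state_df.head()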

## Problem 2: Why Must We Rank Derived Metrics, Not Raw Totals?
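
A minimal sketch of the idea behind this problem, using made-up numbers rather than the census data: a populous state can have the largest count of literate people and still have the lowest literacy rate. The state names, the values, and the columns `total_population` and `literate_population` are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical toy data; these are NOT values from the census file.
toy = pd.DataFrame({
    "state": ["State A", "State B", "State C"],
    "total_population": [1_000_000, 200_000, 50_000],
    "literate_population": [600_000, 170_000, 45_000],
})

# Derived metric: share of the population that is literate, in percent.
toy["literacy_rate"] = toy["literate_population"] / toy["total_population"] * 100

# Ranking by the raw total rewards population size.
order_by_total = (
    toy.sort_values("literate_population", ascending=False)["state"].tolist()
)

# Ranking by the derived rate compares states on equal footing.
order_by_rate = (
    toy.sort_values("literacy_rate", ascending=False)["state"].tolist()
)

print("By raw total:", order_by_total)
print("By rate:     ", order_by_rate)
```

In this toy example the two orderings are exact opposites, which is why the rest of the chapter ranks on rates and gaps rather than on counts.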

## Problem 3: How Do We Rank States by Literacy Rate?

In [None]:
# Sort from highest to lowest literacy rate, then renumber the rows from 0.
ranked_by_literacy = (
    state_df
    .sort_values("literacy_rate", ascending=False)
    .reset_index(drop=True)
)

In [None]:
ranked_by_literacy.head()

## Problem 4: How Do We Clearly Identify Top and Bottom Performers?

In [None]:
top_states = ranked_by_literacy.head(5)
top_states

In [None]:
bottom_states = ranked_by_literacy.tail(5)
bottom_states

## Problem 5: How Do Rankings Change When the Metric Changes?

In [None]:
# Sort from smallest to largest gender literacy gap (ascending is the default).
ranked_by_gender_gap = (
    state_df
    .sort_values("gender_literacy_gap")
    .reset_index(drop=True)
)

In [None]:
ranked_by_gender_gap.head()

In [None]:
ranked_by_gender_gap.tail()
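
One way to see how much a state moves when the metric changes is to place both ranks side by side. The sketch below uses hypothetical values and assumes that a smaller gender literacy gap counts as better, which matches the ascending sort used above; the helper columns `literacy_rank`, `gap_rank`, and `rank_shift` are names introduced here for illustration.

```python
import pandas as pd

# Hypothetical toy data; these are NOT values from the census file.
toy = pd.DataFrame({
    "state": ["State A", "State B", "State C"],
    "literacy_rate": [90.0, 80.0, 70.0],
    "gender_literacy_gap": [12.0, 5.0, 8.0],
})

# Rank 1 = highest literacy rate.
toy["literacy_rank"] = (
    toy["literacy_rate"].rank(ascending=False, method="min").astype(int)
)

# Rank 1 = smallest gap (assumption: a smaller gap is better).
toy["gap_rank"] = (
    toy["gender_literacy_gap"].rank(ascending=True, method="min").astype(int)
)

# Positive shift: the state ranks worse on the gap than on literacy.
toy["rank_shift"] = toy["gap_rank"] - toy["literacy_rank"]

print(toy[["state", "literacy_rank", "gap_rank", "rank_shift"]])
```

Here State A leads on literacy rate but comes last on the gap, so a single ranking would hide part of the picture.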

## Problem 6: How Do We Avoid Misleading Rankings?
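
Two common sources of misleading rankings are missing values and ties. By default `sort_values` places missing values last, so a state with no data can appear among the "bottom" rows, and row position alone hides the fact that two states share the same value. The sketch below, on hypothetical data, drops rows without a literacy rate and adds an explicit `rank` column (a name introduced here) in which tied states share a rank.

```python
import pandas as pd

# Hypothetical toy data; State C has no recorded literacy rate.
toy = pd.DataFrame({
    "state": ["State A", "State B", "State C", "State D"],
    "literacy_rate": [90.0, 85.0, None, 85.0],
})

# Without cleaning, the missing value sorts to the end and looks like a "worst" state.
naive_bottom = toy.sort_values("literacy_rate", ascending=False).tail(1)

# Drop rows with no value before ranking, then sort from highest to lowest.
ranked = (
    toy.dropna(subset=["literacy_rate"])
    .sort_values("literacy_rate", ascending=False)
    .reset_index(drop=True)
)

# method="min" gives tied states the same (best available) rank.
ranked["rank"] = (
    ranked["literacy_rate"].rank(method="min", ascending=False).astype(int)
)

print(naive_bottom)
print(ranked)
```

Whether to drop, flag, or separately report states with missing values is a judgment call; the point of the sketch is only that the choice should be made explicitly.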

## Problem 7: What New Questions Do Rankings Reveal?

## Problem 8: How Do We Save Ranked Results for the Next Chapter?

In [None]:
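# Save the literacy ranking; index=False keeps the row index out of the file.
ranked_output_path = PROCESSED_DIR / "state_ranked_by_literacy.csv"
ranked_by_literacy.to_csv(ranked_output_path, index=False)

# Display the output path to confirm where the file was written.
ranked_output_path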
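
A quick way to gain confidence in a saved ranking is to read it back and check that the columns and the row order survived. The sketch below does this on hypothetical data in a temporary folder, so it does not touch the project's `processed` directory; the file name `toy_ranked.csv` is made up for the example.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical ranked data, already sorted from highest to lowest rate.
toy = pd.DataFrame({
    "state": ["State B", "State A"],
    "literacy_rate": [90.0, 85.0],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = Path(tmp_dir) / "toy_ranked.csv"

    # index=False avoids writing the row index as an extra unnamed column.
    toy.to_csv(out_path, index=False)

    # Read the file back while the temporary folder still exists.
    reloaded = pd.read_csv(out_path)

# The reloaded frame should have the same columns and the same row order.
same_columns = list(reloaded.columns) == list(toy.columns)
same_order = reloaded["state"].tolist() == toy["state"].tolist()

print(same_columns, same_order)
```

Because a CSV file preserves row order, the next chapter can rely on the saved order without sorting again, provided the file is not modified in between.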

## End-of-Chapter Direction