# Degree Requirements Analysis

This notebook analyzes degree requirement data from Vanderbilt University.

## Purpose
- Load and examine degree requirement data
- Analyze requirement group distributions
- Compare different degree programs
- Generate visualizations for requirement completion status

## Data Sources
- Degree requirements CSV files from data/processed/
- Student transcript data

## Outputs
- Summary statistics of requirements
- Completion status reports
- Visualizations of degree progress

## Setup and Data Loading

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Set display options
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)

In [None]:
# Load degree requirements data
try:
    df = pd.read_csv('../data/processed/degree_requirements.csv')
    print(f"Loaded {len(df)} requirement records")
    print(f"Columns: {list(df.columns)}")
except FileNotFoundError:
    print("Degree requirements file not found. Run data processing scripts first.")
    df = None
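
The later cells index the frame by `group_name`, `units_required`, `units_used`, and `units_needed`, so a quick schema check right after loading can surface a mismatched file early. The following is a minimal sketch, not part of the original notebook: the helper name `missing_columns` is hypothetical, and the sample frame is made up purely to show the behavior.

```python
import pandas as pd

# Columns referenced by the later cells of this notebook.
EXPECTED_COLUMNS = ['group_name', 'units_required', 'units_used', 'units_needed']

def missing_columns(frame, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from the frame."""
    return [col for col in expected if col not in frame.columns]

# Made-up sample frame for illustration only; not real data.
sample = pd.DataFrame({'group_name': ['Core'], 'units_required': [3]})
missing = missing_columns(sample)
print("Missing columns:", missing)
```

In the notebook itself, the same helper could be called on `df` whenever `df is not None`.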

## Data Overview

In [None]:
if df is not None:
    # Basic data information
    print("Dataset Shape:", df.shape)
    print("\nData Types:")
    print(df.dtypes)
    print("\nFirst few rows:")
    # A bare expression inside an if-block is not auto-displayed, so print it
    print(df.head())

## Requirement Group Analysis

In [None]:
if df is not None:
    # Analyze requirement groups
    group_counts = df['group_name'].value_counts()
    print("Requirement Groups:")
    print(group_counts)

In [None]:
if df is not None:
    # Visualize requirement distribution
    plt.figure(figsize=(12, 6))
    group_counts.plot(kind='bar')
    plt.title('Distribution of Requirement Groups')
    plt.xlabel('Requirement Group')
    plt.ylabel('Number of Requirements')
    plt.xticks(rotation=45)
    plt.tight_layout()
    plt.show()

## Units Analysis

In [None]:
if df is not None:
    # Sum required, used, and still-needed units per requirement group
    units_summary = df.groupby('group_name')[['units_required', 'units_used', 'units_needed']].sum()
    print("Units Summary by Group:")
    print(units_summary)
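
A completion percentage per group follows directly from the summed units. The sketch below is an illustration under stated assumptions: it treats `units_used` as the units already applied toward a requirement, the helper name `completion_by_group` is hypothetical, and the sample rows are made up rather than taken from the real CSV.

```python
import pandas as pd

def completion_by_group(frame: pd.DataFrame) -> pd.DataFrame:
    """Sum units per group and add a percent-complete column."""
    summary = frame.groupby('group_name')[['units_required', 'units_used']].sum()
    # Groups with zero required units get NaN instead of a division error.
    required = summary['units_required'].where(summary['units_required'] > 0)
    summary['pct_complete'] = (summary['units_used'] / required * 100).round(1)
    return summary

# Made-up sample rows for illustration only; not real data.
sample = pd.DataFrame({
    'group_name': ['Core', 'Core', 'Electives'],
    'units_required': [6, 3, 12],
    'units_used': [6, 0, 3],
})
result = completion_by_group(sample)
print(result)
```

Masking zero-unit groups to NaN keeps them visible in the report without producing infinite percentages.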