# Data Mining Project: Predicting Student Behavioral Disruptions



## Table of Contents
1. [Introduction](#introduction)
2. [Data Loading and Initial Exploration](#data-loading)
3. [Exploratory Data Analysis (EDA)](#eda)
4. [Hypothesis Testing](#hypothesis-testing)
5. [Feature Engineering](#feature-engineering)
6. [Model Development](#model-development)
7. [Model Evaluation and Interpretation](#model-evaluation)
8. [Result Cleanup](#result-cleanup)


## 1. Introduction
This notebook documents our team's work to predict and analyze student behavioral disruptions, with the goal of reducing in-class interruptions.

**Team Members:**  
**Customer:** Adam West

**Objectives:**
- Predict behavioral disruptions
- Identify anomalous patterns
- Provide clear interpretations


## 2. Data Loading and Initial Exploration

In [None]:

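import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# display() is built into Jupyter; the explicit import keeps the cell runnable elsewhere
from IPython.display import display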

bus_conduct = pd.read_csv('TTU Data - Bus Conduct.csv')
family_engagement = pd.read_csv('TTU Data - Family Engagement.csv')
disciplinary_referral = pd.read_csv('TTU Data - Disciplinary Referral.csv')

print("Bus Conduct Dataset:")
display(bus_conduct.head())

print("Family Engagement Dataset:")
display(family_engagement.head())

print("Disciplinary Referral Dataset:")
display(disciplinary_referral.head())


## 3. Exploratory Data Analysis (EDA)

In [None]:

# Parse incident dates as datetimes so the .dt accessor can be used in later cells
bus_conduct['Date of Incident'] = pd.to_datetime(bus_conduct['Date of Incident'])
disciplinary_referral['Date of Incident'] = pd.to_datetime(disciplinary_referral['Date of Incident'])

print("Missing values in Bus Conduct Data:")
print(bus_conduct.isnull().sum())

print("\nMissing values in Disciplinary Referral Data:")
print(disciplinary_referral.isnull().sum())
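
Raw missing-value counts are easier to compare across datasets when shown next to a percentage. The sketch below is one possible way to do that; `missing_summary` is a hypothetical helper, and the toy frame contains invented values that only stand in for the real data.

```python
import pandas as pd

def missing_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Return the count and percentage of missing values per column, worst first."""
    counts = df.isnull().sum()
    summary = pd.DataFrame({
        "missing": counts,
        "percent": (counts / len(df) * 100).round(1),
    })
    return summary.sort_values("missing", ascending=False)

# Toy frame with invented values, used only to illustrate the helper
toy = pd.DataFrame({
    "Student Identifier": [1, 2, 3, 4],
    "Date of Incident": pd.to_datetime(["2023-09-01", None, "2023-10-15", None]),
})
summary = missing_summary(toy)
print(summary)
```

The same helper could be applied to `bus_conduct` and `disciplinary_referral` in place of the toy frame.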


In [None]:

# Count referrals per calendar month (1-12); the same month from different years is pooled
disciplinary_referral['Month'] = disciplinary_referral['Date of Incident'].dt.month
monthly_referrals = disciplinary_referral.groupby('Month').size()

plt.figure(figsize=(10,5))
sns.barplot(x=monthly_referrals.index, y=monthly_referrals.values)
plt.title('Monthly Disciplinary Referrals')
plt.xlabel('Month')
plt.ylabel('Number of Referrals')
plt.show()
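
Grouping by month only returns months that have at least one referral, so a month with no incidents would be missing from the chart rather than shown as zero. A minimal sketch of one way to make such gaps explicit, assuming a fixed 1 to 12 month axis is wanted; the toy dates are invented for illustration.

```python
import pandas as pd

# Toy referral dates with invented values; no incidents fall in most months
toy = pd.DataFrame({
    "Date of Incident": pd.to_datetime(
        ["2023-09-05", "2023-09-20", "2023-11-02", "2024-01-15"]
    )
})

# Count referrals per calendar month, then reindex so absent months appear as 0
monthly = (
    toy["Date of Incident"].dt.month
    .value_counts()
    .reindex(range(1, 13), fill_value=0)
)
print(monthly)
```

The resulting series can be passed to `sns.barplot` in the same way as `monthly_referrals`.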


In [None]:

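# Ten students with the most referrals, in descending order of count
frequent_students = disciplinary_referral['Student Identifier'].value_counts().head(10)

plt.figure(figsize=(10,5))
# Cast identifiers to str so numeric IDs are not re-sorted by value on the categorical axis
sns.barplot(y=frequent_students.index.astype(str), x=frequent_students.values, orient='h')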
plt.title('Top 10 Students by Number of Referrals')
plt.xlabel('Number of Referrals')
plt.ylabel('Student Identifier')
plt.show()


## 4. Hypothesis Testing
*(To be completed by team)*

## 5. Feature Engineering
*(To be completed by team)*

## 6. Model Development
*(To be completed by team)*

## 7. Model Evaluation and Interpretation
*(To be completed by team)*

## 8. Result Cleanup
*(To be completed by team)*