# Naive Bayes for Predicting Game Outcomes

This notebook demonstrates how to build a simple Gaussian Naive Bayes model to classify NFL game outcomes using synthetic statistics. The goal is to highlight the conditional independence assumption and how Bayes' theorem is applied in a machine-learning context.

## Bayes' Theorem
Given features $X$ and a binary outcome $Y$, Bayes' theorem states:
\[ P(Y|X) = \frac{P(X|Y)P(Y)}{P(X)} \]
Naive Bayes assumes the features are conditionally independent given the class, so for $X = (x_1, \dots, x_n)$ the likelihood factors into a product of individual feature likelihoods: $P(X|Y) = \prod_{i=1}^{n} P(x_i|Y)$.
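
To make the factorization concrete, here is a minimal sketch that computes the posterior by hand using only the standard library. It assumes each feature follows a per-class normal distribution, which is the assumption a Gaussian Naive Bayes model makes. The helper names (`gaussian_pdf`, `naive_bayes_posterior`) and the priors, means, and standard deviations are made up for illustration; they are not fitted to any data.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def naive_bayes_posterior(x, priors, params):
    """Return P(Y=c | x) for each class c under the naive Bayes factorization.

    x      : list of feature values
    priors : dict mapping class -> P(Y=c)
    params : dict mapping class -> list of (mean, std), one pair per feature
    """
    joint = {}
    for c, prior in priors.items():
        # P(X | Y=c) as a product of per-feature likelihoods
        likelihood = 1.0
        for xi, (mean, std) in zip(x, params[c]):
            likelihood *= gaussian_pdf(xi, mean, std)
        joint[c] = likelihood * prior  # numerator of Bayes' theorem
    evidence = sum(joint.values())     # P(X), summed over the classes
    return {c: j / evidence for c, j in joint.items()}

# Illustrative (made-up) parameters for two features:
# home_field and turnover_margin, with class 1 = win and class 0 = loss.
priors = {0: 0.4, 1: 0.6}
params = {
    0: [(0.4, 0.5), (-0.3, 1.0)],
    1: [(0.6, 0.5), (0.3, 1.0)],
}
posterior = naive_bayes_posterior([1.0, 0.5], priors, params)
print(posterior)
```

The evidence term $P(X)$ never needs to be modeled directly: it is the sum of the numerators over the classes, which is why the returned probabilities sum to one.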

In [None]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# Synthetic dataset: home-field advantage and turnover margin
np.random.seed(0)  # fix the seed so the run is reproducible
N = 200
home_field = np.random.binomial(1, 0.5, N)    # binary indicator
turnover_margin = np.random.normal(0, 1, N)   # drawn from N(0, 1)
# True model: the logit of the win probability depends on both features
logit = 0.8 * home_field + 0.5 * turnover_margin
prob = 1 / (1 + np.exp(-logit))
result = np.random.binomial(1, prob)          # 1 = win, 0 = loss
X = pd.DataFrame({'home_field': home_field, 'turnover_margin': turnover_margin})
y = result

# Hold out 30% of the games for evaluation
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Fit Gaussian Naive Bayes and score it on the held-out games
model = GaussianNB()
model.fit(X_train, y_train)
pred = model.predict(X_test)
print('Accuracy:', accuracy_score(y_test, pred))

## Discussion
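The naive Bayes classifier performs reasonably on this synthetic dataset, but no model can reach perfect accuracy here: each outcome is drawn at random from its win probability, so some games are unpredictable by construction. Note also that `home_field` is binary, so the Gaussian likelihood is only an approximation for that feature. In practice, the independence assumption is often violated in football data because game statistics tend to be correlated, but the model can still offer a quick baseline for comparison.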
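
One way to put the accuracy in context is to score the model next to a trivial majority-class predictor and a logistic regression. The sketch below is an assumption about how such a comparison could be set up, not part of the original analysis. It regenerates data with the same coefficients but uses `np.random.default_rng`, so the numbers will not match the cell above exactly. The variable names (`models`, `scores`) are illustrative.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Recreate the same kind of synthetic data (different random stream)
rng = np.random.default_rng(0)
n_games = 200
home_field = rng.binomial(1, 0.5, n_games)
turnover_margin = rng.normal(0, 1, n_games)
logit = 0.8 * home_field + 0.5 * turnover_margin
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))
X = np.column_stack([home_field, turnover_margin])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Three classifiers scored on the same held-out split
models = {
    "majority class": DummyClassifier(strategy="most_frequent"),
    "gaussian naive bayes": GaussianNB(),
    "logistic regression": LogisticRegression(),
}
scores = {}
for name, clf in models.items():
    clf.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, clf.predict(X_test))
    print(f"{name}: {scores[name]:.3f}")
```

If naive Bayes barely beats the majority-class score, the features are adding little. With a test set this small, differences of a few percentage points between models are within sampling noise.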