# 📊 Session 9: Reproducible Research with Jupyter Notebooks

**Objective:** Document and share research workflows using Jupyter Notebooks.

This notebook shows how to combine code, text, and visualizations to produce reproducible research documents.
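
A document is only reproducible if readers can recreate the software environment it ran in, so it can help to record package versions near the top of the notebook. The following is a minimal sketch using only the standard library; the helper name `report_versions` and the package list are assumptions for illustration, not part of the original session.

```python
import platform
from importlib import metadata


def report_versions(packages):
    """Map each distribution name to its installed version, or 'not installed'."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


# Packages assumed to be the ones this notebook relies on
env = report_versions(["scikit-learn", "pandas", "seaborn", "matplotlib"])
for name, version in env.items():
    print(f"{name}: {version}")
```

Keeping this output in the saved notebook gives readers a concrete reference point if their results differ.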

## 📥 Step 1: Load Dataset

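We’ll use the **Iris dataset**, which ships with scikit-learn: 150 flower samples, four measurements each, labeled with one of three species. Passing `as_frame=True` returns the data as a pandas DataFrame.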

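```python
from sklearn.datasets import load_iris
import pandas as pd  # as_frame=True requires pandas to be installed

# Load the features and the integer class label as a single DataFrame
iris = load_iris(as_frame=True)
df = iris.frame

# Preview the first five rows
df.head()
```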
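
The `target` column stores species as the integers 0, 1, and 2, which makes plot legends and tables harder to read. As a sketch, the species names can be attached from `iris.target_names`; the names `labeled` and `species` are illustrative assumptions. A separate DataFrame is used so that `df` keeps only numeric columns for the later steps.

```python
from sklearn.datasets import load_iris

iris = load_iris(as_frame=True)
df = iris.frame

# Build an integer-to-name lookup from the dataset's own label names
name_by_code = dict(enumerate(iris.target_names))

# Add a readable species column on a copy; df itself is left unchanged
labeled = df.assign(species=df["target"].map(name_by_code))

# Count the samples per species
print(labeled["species"].value_counts())
```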

## 📈 Step 2: Visualize the Data

```python
import seaborn as sns
import matplotlib.pyplot as plt

# Histogram of sepal length with a kernel density estimate overlaid
plt.figure(figsize=(8, 5))
sns.histplot(df['sepal length (cm)'], kde=True, bins=20)
plt.title("Distribution of Sepal Length")
plt.show()
```

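```python
# Pairwise scatter plots of the four features, colored by class label,
# with per-feature histograms on the diagonal
sns.pairplot(df, hue='target', diag_kind='hist')
plt.show()
```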

## 🧠 Bonus Challenge: Build a Predictive Model

**Task:** Train a simple classification model on the Iris dataset and evaluate its accuracy.

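```python
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Separate the four feature columns from the class label
X = df.drop(columns='target')
y = df['target']

# Hold out 20% of the rows for testing; random_state fixes the split so reruns match
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# max_iter is raised above the default of 100 to give the solver room to converge
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)

# Score the model on the held-out rows only
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Model Accuracy: {accuracy:.2f}")
```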
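
A single 80/20 split leaves only 30 test rows, so the reported accuracy can shift noticeably with the choice of split. One common complement, shown here as a sketch rather than as part of the original task, is k-fold cross-validation with `cross_val_score`; the choice of five folds is an assumption.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Load features and labels directly as a DataFrame and a Series
X, y = load_iris(return_X_y=True, as_frame=True)

# With an integer cv and a classifier, scikit-learn uses stratified folds
scores = cross_val_score(LogisticRegression(max_iter=200), X, y, cv=5)

print(f"Accuracy per fold: {scores.round(2)}")
print(f"Mean accuracy: {scores.mean():.2f} (std {scores.std():.2f})")
```

Reporting the mean together with the spread across folds gives readers a better sense of how stable the result is.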

## 📤 Step 3: Export This Notebook

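- In the classic Notebook interface, go to `File > Download as > HTML (.html)` or `PDF via LaTeX (.pdf)`. JupyterLab and Notebook 7 offer the same formats under `File > Save and Export Notebook As`. PDF export requires a LaTeX installation.
- Or use the command line: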
```bash
jupyter nbconvert --to html session9_reproducible_research.ipynb
jupyter nbconvert --to pdf session9_reproducible_research.ipynb
```
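
For a reproducible export, it can be useful to re-run the notebook from top to bottom as part of the conversion, so the saved outputs are guaranteed to match the code. The sketch below wraps the command above in Python and adds nbconvert's `--execute` flag; the helper name `build_export_command` is hypothetical, and the conversion only runs if the notebook file is present.

```python
import subprocess
from pathlib import Path


def build_export_command(notebook, fmt="html"):
    """Return the nbconvert command that executes the notebook, then converts it."""
    return ["jupyter", "nbconvert", "--to", fmt, "--execute", str(notebook)]


notebook = Path("session9_reproducible_research.ipynb")
command = build_export_command(notebook)

if notebook.exists():
    # check=True raises CalledProcessError if any cell fails or the export errors out
    subprocess.run(command, check=True)
else:
    print("Notebook not found; would run:", " ".join(command))
```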