# Statistical Analysis – HealthCare Plus

**Course Assignment | Submission-ready Jupyter Notebook**

This notebook provides step-by-step statistical analysis with explanations to support data-driven decision-making for hospital management.

## Section A – Q1: Patient Admissions (Central Tendency)

In [None]:
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

admissions = np.array([32,28,35,30,29,27,31,34,33,30])

mean = np.mean(admissions)      # arithmetic average
median = np.median(admissions)  # middle value of the sorted data
mode = stats.mode(admissions, keepdims=True)  # most frequent value and its count

mean, median, mode

**Interpretation:**
- Mean (30.9) is the average number of admissions
- Median (30.5) is the middle value and is stable against extreme values
- Mode (30) is the most frequent admission count

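**Best Measure:** The data contain no outliers, so the mean (30.9) and median (30.5) nearly coincide and both describe typical admissions well; the median remains the safer choice if occasional extreme values occur.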
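
The claim that the median is stable against extreme values can be checked with a small sketch. It appends one hypothetical surge day of 60 admissions (an invented value, not part of the assignment data) and compares how far each measure moves.

```python
import numpy as np

admissions = np.array([32, 28, 35, 30, 29, 27, 31, 34, 33, 30])

# Hypothetical outlier: a single surge day with 60 admissions (assumed for illustration only)
with_outlier = np.append(admissions, 60)

# How much each measure changes once the outlier is included
mean_shift = np.mean(with_outlier) - np.mean(admissions)
median_shift = np.median(with_outlier) - np.median(admissions)

print(f"Mean shift:   {mean_shift:.2f}")
print(f"Median shift: {median_shift:.2f}")
```

Under this assumption the mean moves several times further than the median, which is the sense in which the median is robust.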

## Section A – Q2: Recovery Duration (Dispersion Measures)

In [None]:
recovery = np.array([5,7,6,8,9,5,6,7,8,6])

range_val = recovery.max() - recovery.min()
variance = np.var(recovery, ddof=1)
std_dev = np.std(recovery, ddof=1)

range_val, variance, std_dev

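The sample standard deviation (about 1.34 days around a mean of 6.7 days) indicates moderate variability. Adding two extreme values (4 and 10 days) widens the range from 4 to 6 days and increases both the variance and the standard deviation.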
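
The effect of the two extra values can be verified directly. The sketch below recomputes the same three measures before and after appending 4 and 10; the helper name `dispersion` is introduced here for illustration only.

```python
import numpy as np

recovery = np.array([5, 7, 6, 8, 9, 5, 6, 7, 8, 6])
extended = np.append(recovery, [4, 10])  # the two extreme values mentioned above


def dispersion(values):
    """Return (range, sample variance, sample standard deviation)."""
    return (
        values.max() - values.min(),
        np.var(values, ddof=1),
        np.std(values, ddof=1),
    )


before = dispersion(recovery)
after = dispersion(extended)

print("before:", before)
print("after: ", after)
```

All three measures grow, because both new points lie further from the mean than any existing observation.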

## Section A – Q3: Patient Satisfaction Distribution

In [None]:
scores = np.array([8,9,7,8,10,7,9,6,10,8,7,9])

stats.skew(scores), stats.kurtosis(scores)

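Skewness is close to zero (about -0.04), so the distribution is roughly symmetric. The excess kurtosis is negative (about -1.01), meaning the scores are flatter and lighter-tailed than a normal distribution rather than strictly normal.
Improved service would concentrate scores near the top of the scale and leave a tail of lower scores, producing a left-skewed (negatively skewed) distribution.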
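
To make the skewness argument concrete, the sketch below compares the observed scores with a hypothetical set of post-improvement scores. The `improved` array is invented for illustration and is not assignment data. Note that `stats.kurtosis` reports excess kurtosis by default, so a normal distribution scores 0.

```python
import numpy as np
from scipy import stats

scores = np.array([8, 9, 7, 8, 10, 7, 9, 6, 10, 8, 7, 9])

# Hypothetical scores after a service improvement: most ratings at 9-10, few low ones
improved = np.array([10, 10, 9, 10, 9, 10, 8, 10, 9, 10, 7, 10])

skew_current = stats.skew(scores)
skew_improved = stats.skew(improved)
excess_kurtosis = stats.kurtosis(scores)  # Fisher definition: normal distribution = 0

print(f"Current skewness:  {skew_current:.3f}")
print(f"Improved skewness: {skew_improved:.3f}")
print(f"Current excess kurtosis: {excess_kurtosis:.3f}")
```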

## Section A – Q4: Nurse Staffing vs Recovery Time

In [None]:
nurses = np.array([10,12,15,18,20,22])
recovery_time = np.array([8,7,6,5,4,3])

correlation = np.corrcoef(nurses, recovery_time)[0,1]
correlation

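The correlation is strongly negative (r ≈ -0.997): higher nurse staffing is associated with shorter recovery time. Correlation alone does not establish that staffing causes the reduction.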
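
As a possible extension, `scipy.stats.linregress` reports the same correlation together with a p-value and a fitted slope, which quantifies the association. This is a sketch of one way to do it, not a required part of the assignment.

```python
import numpy as np
from scipy import stats

nurses = np.array([10, 12, 15, 18, 20, 22])
recovery_time = np.array([8, 7, 6, 5, 4, 3])

# Simple linear regression of recovery time on nurse count
fit = stats.linregress(nurses, recovery_time)

print(f"r = {fit.rvalue:.3f}, p = {fit.pvalue:.5f}")
print(f"slope = {fit.slope:.3f} recovery-time units per additional nurse")
```

With only six observations, the slope is best treated as a rough description of this sample.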

## Section B – Q5: Hypothesis Testing (t-test)

In [None]:
wait_times = np.array([32,29,31,34,33,27,30,28,35,26])

stats.ttest_1samp(wait_times, 30)

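**Conclusion:** t ≈ 0.52 and p ≈ 0.61. Because the p-value exceeds 0.05, we fail to reject the null hypothesis that the mean wait time is 30 minutes. The sample (mean 30.5 minutes) is consistent with the hospital's claim, although this does not prove the claim is true.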
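
The decision rule can be written out explicitly. The sketch below unpacks the test result, applies an assumed significance level of 0.05 (the threshold used in the conclusion), and adds a 95% confidence interval for the mean as a complementary check.

```python
import numpy as np
from scipy import stats

wait_times = np.array([32, 29, 31, 34, 33, 27, 30, 28, 35, 26])
claimed_mean = 30
alpha = 0.05  # assumed significance level

t_stat, p_value = stats.ttest_1samp(wait_times, claimed_mean)

# 95% confidence interval for the mean, based on the t distribution
n = wait_times.size
margin = stats.t.ppf(1 - alpha / 2, df=n - 1) * stats.sem(wait_times)
ci = (wait_times.mean() - margin, wait_times.mean() + margin)

decision = "reject H0" if p_value < alpha else "fail to reject H0"

print(f"t = {t_stat:.3f}, p = {p_value:.3f} -> {decision}")
print(f"95% CI for mean wait: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The interval contains 30, which agrees with the test outcome.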

## Section B – Q6: Cleanliness vs Satisfaction (Chi-Square Test)

In [None]:
from scipy.stats import chi2_contingency

contingency = np.array([[90,10],[60,40],[30,70]])
chi2_contingency(contingency)

The test gives χ² = 75.0 with 2 degrees of freedom and a p-value far below 0.05, so we reject independence: cleanliness and patient satisfaction are associated.
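
Unpacking the result makes each component visible. The notebook does not label the table, so the sketch assumes rows are cleanliness levels and columns are satisfied / not satisfied counts.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Assumed layout: rows = cleanliness levels, columns = (satisfied, not satisfied)
contingency = np.array([[90, 10], [60, 40], [30, 70]])

chi2, p_value, dof, expected = chi2_contingency(contingency)

print(f"chi2 = {chi2:.1f}, dof = {dof}, p = {p_value:.2e}")
print("Expected counts under independence:")
print(expected)
```

Every row has the same expected counts (60 and 40) because the row totals are equal, so the gap between the observed first and last rows drives the statistic.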

## Section B – Q7: Treatment Effectiveness (ANOVA)

In [None]:
A = [5,6,7,5,6]
B = [8,9,7,8,10]
C = [4,5,6,5,4]

stats.f_oneway(A,B,C)

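With F ≈ 19.19 and p < 0.001, ANOVA indicates that at least one treatment mean differs significantly from the others (group means: A = 5.8, B = 8.4, C = 4.8). ANOVA alone does not identify which pairs differ; that requires a post hoc comparison.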
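
To show where the F statistic comes from, the sketch below rebuilds it from the between-group and within-group sums of squares and compares the result with `stats.f_oneway`. Variable names such as `ss_between` are illustrative.

```python
import numpy as np
from scipy import stats

groups = [
    np.array([5, 6, 7, 5, 6]),   # treatment A
    np.array([8, 9, 7, 8, 10]),  # treatment B
    np.array([4, 5, 6, 5, 4]),   # treatment C
]
all_values = np.concatenate(groups)
grand_mean = all_values.mean()

# Variation of group means around the grand mean, weighted by group size
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
# Variation of observations around their own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

df_between = len(groups) - 1
df_within = all_values.size - len(groups)

f_manual = (ss_between / df_between) / (ss_within / df_within)
p_manual = stats.f.sf(f_manual, df_between, df_within)

f_scipy, p_scipy = stats.f_oneway(*groups)

print(f"manual: F = {f_manual:.3f}, p = {p_manual:.5f}")
print(f"scipy:  F = {f_scipy:.3f}, p = {p_scipy:.5f}")
```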

## Section B – Q8: Normality Test (Administration Time)

In [None]:
admin_time = np.array([12,15,14,16,18,13,14,17,15,19,16,14])

stats.shapiro(admin_time)

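If the Shapiro-Wilk p-value is above 0.05, normality is not rejected, and parametric tests (such as the t-test and ANOVA used above) are reasonable for this variable.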
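
The decision can be made explicit in code. The sketch below unpacks the Shapiro-Wilk result and compares the p-value with an assumed significance level of 0.05.

```python
import numpy as np
from scipy import stats

admin_time = np.array([12, 15, 14, 16, 18, 13, 14, 17, 15, 19, 16, 14])
alpha = 0.05  # assumed significance level

w_stat, p_value = stats.shapiro(admin_time)

# H0: the sample comes from a normal distribution
normality_rejected = p_value < alpha

print(f"W = {w_stat:.3f}, p = {p_value:.3f}")
print("Normality rejected" if normality_rejected else "Normality not rejected")
```

With only 12 observations the test has limited power, so a non-significant result is weak evidence of normality rather than confirmation.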

## Section B – Q9: Emergency Arrivals (Poisson Distribution)

In [None]:
from scipy.stats import poisson

# P(X = 3) for a Poisson variable with a mean of 5 arrivals per interval
poisson.pmf(3, 5)

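The Poisson distribution is appropriate for counting independent arrivals that occur at a constant average rate over a fixed interval. With a mean of 5 arrivals, the probability of exactly 3 arrivals is about 0.140.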
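
For reference, the same probability follows from the Poisson formula P(X = k) = e^(-μ) μ^k / k!. The sketch below checks the formula against `poisson.pmf` and, as an extra illustration, computes the probability of at most 3 arrivals.

```python
from math import exp, factorial

from scipy.stats import poisson

k, mu = 3, 5  # exactly 3 arrivals, mean of 5 arrivals per interval

pmf_manual = exp(-mu) * mu**k / factorial(k)
pmf_scipy = poisson.pmf(k, mu)

# Cumulative probability: 0, 1, 2, or 3 arrivals
p_at_most_3 = poisson.cdf(k, mu)

print(f"P(X = 3)  manual: {pmf_manual:.4f}, scipy: {pmf_scipy:.4f}")
print(f"P(X <= 3): {p_at_most_3:.4f}")
```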

## Section B – Q10: Expected Number of Surgeries

In [None]:
surgeries = np.array([0,1,2,3,4,5])
freq = np.array([5,12,18,22,15,8])

expected_value = np.sum(surgeries * freq) / np.sum(freq)
expected_value

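The expected value is 214 / 80 = 2.675 surgeries per day. Hiring additional staff would raise this figure only if it shifts the frequency distribution toward higher daily surgery counts.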
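
The same result can be read as a probability-weighted sum, E[X] = Σ x · P(x). The sketch below converts the frequencies to relative frequencies, confirms the expected value with `np.average`, and adds the variance as an optional extra.

```python
import numpy as np

surgeries = np.array([0, 1, 2, 3, 4, 5])
freq = np.array([5, 12, 18, 22, 15, 8])

# Relative frequencies act as empirical probabilities
probs = freq / freq.sum()

expected_value = np.sum(surgeries * probs)
weighted = np.average(surgeries, weights=freq)  # equivalent shortcut

# Variance of the daily surgery count around its expected value
variance = np.sum((surgeries - expected_value) ** 2 * probs)

print(f"E[X] = {expected_value:.3f} (np.average: {weighted:.3f})")
print(f"Var[X] = {variance:.3f}")
```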

---
**End of Assignment – Ready for GitHub Submission**