# Descriptive Statistics for Lighting Fixture Sampling

**CMVP Capstone — Statistics Foundations**

This notebook replicates the *Descriptive Stats Step 1* spreadsheet.
You will:
1. Enter a sample of measured fixture wattages
2. Compute mean, variance, standard deviation, and CV step by step
3. Scale up to a building-level energy estimate with uncertainty

---

## 1. Import the functions

The functions live in the companion script `descriptive_stats.py`, so the same calculations can also be run from the command line.

In [None]:
import os
import sys

# Make ../scripts (relative to the current working directory) importable
sys.path.insert(0, os.path.join(os.getcwd(), '..', 'scripts'))

from descriptive_stats import descriptive_stats, building_energy

## 2. Enter your sample data

These are measured wattages from 12 sampled fixtures (same as the spreadsheet default).

In [None]:
# Measured fixture wattages (W), one value per sampled fixture
data = [120, 100, 130, 122, 120, 78, 100, 100, 130, 80, 100, 120]

# Building parameters
total_fixtures = 1000   # number of fixtures in the building
hours_per_year = 4000   # annual operating hours

## 3. Compute descriptive statistics (step by step)

In [None]:
stats = descriptive_stats(data)

print(f"{'Fixture':<10} {'Watts':<10} {'Deviation':<12} {'Dev²':<12}")
print("-" * 44)
for i, (w, d, d2) in enumerate(zip(data, stats['deviations'], stats['sq_deviations']), 1):
    print(f"{i:<10} {w:<10.1f} {d:<12.2f} {d2:<12.2f}")
print("-" * 44)
print(f"{'Sum':<10} {sum(data):<10.1f} {'':12} {sum(stats['sq_deviations']):<12.2f}")

In [None]:
print(f"Sample size (n):    {stats['n']}")
print(f"Mean:               {stats['mean']:.2f} W")
print(f"Sample Variance:    {stats['variance']:.2f} W²")
print(f"Sample Std Dev:     {stats['std_dev']:.2f} W")
print(f"CV:                 {stats['cv']:.4f} ({stats['cv']*100:.2f}%)")

### Key formulas

| Statistic | Formula |
|---|---|
| Mean | $\bar{x} = \frac{1}{n} \sum x_i$ |
| Sample Variance | $s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1}$ |
| Std Dev | $s = \sqrt{s^2}$ |
| CV | $CV = s / \bar{x}$ |
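As a cross-check on the formulas above, the sketch below applies each one directly to the 12-fixture sample and compares the results with Python's standard `statistics` module. It does not use `descriptive_stats.py`, and the variable names are illustrative only.

```python
import math
import statistics

# Same 12-fixture sample as in section 2 (watts)
data = [120, 100, 130, 122, 120, 78, 100, 100, 130, 80, 100, 120]

n = len(data)
mean = sum(data) / n                              # mean: (1/n) * sum of x_i
sq_deviations = [(x - mean) ** 2 for x in data]   # squared deviations from the mean
variance = sum(sq_deviations) / (n - 1)           # sample variance (n - 1 denominator)
std_dev = math.sqrt(variance)                     # sample standard deviation
cv = std_dev / mean                               # coefficient of variation

# The standard library uses the same sample (n - 1) definitions
assert math.isclose(mean, statistics.mean(data))
assert math.isclose(variance, statistics.variance(data))
assert math.isclose(std_dev, statistics.stdev(data))

print(f"mean={mean:.2f} W, s²={variance:.2f} W², s={std_dev:.2f} W, CV={cv:.4f}")
```

For this sample the mean is about 108.33 W, the sample variance about 321.33 W², the standard deviation about 17.93 W, and the CV about 0.1655.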

## 4. Scale to building-level energy estimate

In [None]:
energy = building_energy(stats['mean'], total_fixtures, hours_per_year, stats['cv'])

print(f"Total fixtures:       {total_fixtures:,}")
print(f"Hours/year:           {hours_per_year:,}")
print(f"Mean wattage:         {stats['mean']:.2f} W")
print()
print(f"Total connected load: {energy['total_kw']:.1f} kW")
print(f"Annual energy:        {energy['total_kwh']:,.0f} kWh")
print(f"Uncertainty (±1 CV):  ±{energy['uncertainty_kwh']:,.0f} kWh ({stats['cv']*100:.1f}%)")
print(f"Range:                {energy['total_kwh'] - energy['uncertainty_kwh']:,.0f} – {energy['total_kwh'] + energy['uncertainty_kwh']:,.0f} kWh")
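The arithmetic behind this step can be sketched as follows. This is an assumption about what `building_energy` does, inferred from the output labels above (connected load in kW, annual energy in kWh, and an uncertainty equal to the CV applied to the annual energy). The function `building_energy_sketch` is hypothetical, and the companion script may differ in detail.

```python
def building_energy_sketch(mean_watts, total_fixtures, hours_per_year, cv):
    """Hypothetical stand-in for building_energy; not the companion script's code."""
    total_kw = mean_watts * total_fixtures / 1000.0   # W -> kW
    total_kwh = total_kw * hours_per_year             # kW x hours = kWh
    uncertainty_kwh = cv * total_kwh                  # assumed: CV taken as a fraction of the total
    return {
        "total_kw": total_kw,
        "total_kwh": total_kwh,
        "uncertainty_kwh": uncertainty_kwh,
    }

# Round-number example: 100 W mean, 1,000 fixtures, 4,000 h/yr, CV = 0.15
result = building_energy_sketch(100.0, 1000, 4000, 0.15)
print(result)
```

The round-number example works out to 100 kW, 400,000 kWh, and ±60,000 kWh. With the sample values from section 3 (mean of about 108.33 W, CV of about 0.1655), the same arithmetic gives roughly 108.3 kW, 433,333 kWh, and about ±71,700 kWh.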

## 5. Exercises

**Try these:**
1. Change the sample data to your own measurements. How does the CV change?
2. What happens to the uncertainty range if CV doubles?
3. How many fixtures would you need to sample to achieve ±10% precision at 90% confidence? (Hint: use the sampling notebook.)

---
*CMVP Capstone · Counterfactual Designs*