---
title: "Random Variables"
subtitle: "The symbols by which we track what is known and unknown"
format: 
  revealjs:
    theme: default
    transition: slide
    transition-speed: fast
    controls: true
    controls-layout: bottom-right
    controls-tutorial: true
    slide-number: true
    show-slide-number: all
    hash: true
    keyboard: true
    overview: true
    center: true
    touch: true
    loop: false
    rtl: false
    navigation-mode: linear
    shuffle: false
    fragments: true
    fragment-in-url: false
    embedded: false
    help: true
    show-notes: false
    auto-play-media: false
    preload-iframes: false
    auto-animate: true
    auto-animate-matching: true
    auto-animate-transition: false
    auto-animate-easing: ease
    auto-animate-duration: 1.0
    auto-animate-unmatched: true
    auto-animate-restart: false
    backdrop-transition: fade
    title-slide-attributes: 
      data-background-image: "graphics/udairy.jpeg"
      data-background-size: "cover"
      data-background-position: "center"
      data-background-opacity: "0.3"
    logo: ""
    # css: styles/custom-dark.css
execute:
  echo: false
  eval: true
  warning: false
  error: false
---

## Random Variables {.center .title-slide}

### The symbols by which we track what is known and unknown

**BUAD 442 - Interpretable Data Models**  
**Fall 2025**

<style>
/* Default font color for entire presentation */
.reveal {
  color: #435a7f;
}

/* Headings and titles - coordinating color */
.reveal h1, .reveal h2, .reveal h3, .reveal h4, .reveal h5, .reveal h6 {
  color: #2c3e50 !important;
  font-weight: bold;
}

.title-slide h1, .title-slide h2, .title-slide p {
  color: white !important;
  text-shadow: 2px 2px 4px rgba(0,0,0,0.8);
  background: rgba(0,0,0,0.3);
  padding: 10px;
  border-radius: 5px;
}
</style>

::: {.notes}
Introduce the concept of random variables as the foundation of statistical modeling.
:::

---

## What is a Random Variable? {.center background-color="white" background-image="graphics/sunnyDay.png" background-size="cover" background-position="center" background-opacity="0.3"}

**Random Variable (informally):** A measurement whose value is subject to uncertainty or randomness.

```{dot}
//| label: fig-rv
//| fig-cap: Example random variables (informally).
//| fig-width: 7

digraph G {
  A [label="sunny day", shape=ellipse, style=filled, fillcolor=aliceblue]
  B [label="ice cream sales\nat UDairy", shape=ellipse, style=filled, fillcolor=aliceblue]
}
```

::: {.notes}
The ellipse represents a random variable - it's a container for uncertain outcomes.
:::

---

## Types of Random Variables {.center}

### Discrete vs. Continuous

**Discrete Random Variables:**

- Countable outcomes (1, 2, 3, ...)
- Examples: Number of customers, coin flips

**Continuous Random Variables:**

- Uncountable outcomes (any value in a range)
- Examples: Temperature, height, time

```{dot}
//| label: fig-rv-types
//| fig-cap: Example random variables (informally).
//| fig-width: 7
//| echo: false

digraph G {
  A [label="sunny day", shape=ellipse, style=filled, fillcolor=aliceblue]
  B [label="ice cream sales\nat UDairy", shape=ellipse, style=filled, fillcolor=aliceblue]
}
```

::: {.notes}
Explain the fundamental distinction between discrete and continuous random variables.
:::

---

## Random Variables (Mathematically) {.center}

### Mapping Real-World Outcomes to Numbers

<br>

| Real-World Outcome | Numerical Value |
|-------------------|-----------------|
| Sunny day         | 1               |
| Not sunny day     | 0               |

<br>

**Random Variable:** A function that maps each possible outcome to a real number

::: {.notes}
Show how we convert qualitative outcomes into quantitative measurements for analysis.
:::

---

## Mathematical Formalism {.center}

### Random Variable Definition

A **random variable** $X$ is a function that maps the sample space $\Omega$ to the real numbers:

$$X: \Omega \rightarrow \mathbb{R}$$

**Example:**

- $\Omega = \{\text{sunny}, \text{not sunny}\}$
- $X(\text{sunny}) = 1$
- $X(\text{not sunny}) = 0$
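
The mapping in this example can be sketched in Python as a plain dictionary. This is only an illustration, not course code; the names `sample_space` and `X` are chosen here for the sketch.

```python
# Sample space: every outcome the weather can take in this example.
sample_space = {"sunny", "not sunny"}

# The random variable X assigns one real number to each outcome.
X = {"sunny": 1, "not sunny": 0}

# X must be defined on the whole sample space.
assert set(X) == sample_space

for outcome in sorted(sample_space):
    print(f"X({outcome}) = {X[outcome]}")
```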

::: {.notes}
Introduce the formal mathematical definition and notation for random variables.
:::

---

## Mathematical Formalism (Continued) {.center}

### Key Properties

A **random variable** $X$ is a function that maps the sample space $\Omega$ to the real numbers:

$$X: \Omega \rightarrow \mathbb{R}$$

**Key Properties:**

- Each outcome gets exactly one number
- Different outcomes can map to the same number
- We can now do mathematical analysis on the numbers
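
A small sketch of these properties, assuming a hypothetical finer-grained sample space (the outcomes `cloudy` and `rainy` and the sample week are invented for illustration). Two different outcomes share the value 0, and once outcomes are numbers we can do arithmetic on them.

```python
def X(outcome: str) -> int:
    """Indicator random variable: 1 if the day is sunny, 0 otherwise."""
    return 1 if outcome == "sunny" else 0

# Hypothetical week of observed outcomes (assumed for illustration).
week = ["sunny", "cloudy", "rainy", "sunny", "sunny", "cloudy", "sunny"]
values = [X(day) for day in week]

# Different outcomes ("cloudy", "rainy") map to the same number.
same_value = X("cloudy") == X("rainy")

# Numbers support arithmetic: count and share of sunny days.
sunny_days = sum(values)
sunny_share = sunny_days / len(values)
print(values, sunny_days, round(sunny_share, 2))
```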

::: {.notes}
Explain the key properties of random variables and their mathematical implications.
:::

---

## Probability Distributions {.center}

### Spreading Plausibility Over Outcomes

**Key Insight:** Random variables are most useful when we assign a **probability** to each outcome.

**Example - Weather Random Variable:**

- $X(\text{sunny}) = 1$ with probability $P(X = 1) = 0.7$
- $X(\text{not sunny}) = 0$ with probability $P(X = 0) = 0.3$

**Probabilities must sum to 1:** $0.7 + 0.3 = 1.0$
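
One way to see what these probabilities mean is to simulate many days and check the long-run share of sunny ones. The sketch below uses Python's standard `random` module; the seed and the number of draws are arbitrary choices made for illustration.

```python
import random

# Distribution from the slide: P(X = 1) = 0.7, P(X = 0) = 0.3.
pmf = {1: 0.7, 0: 0.3}

rng = random.Random(442)  # fixed seed so the run is repeatable
draws = rng.choices(list(pmf), weights=list(pmf.values()), k=10_000)

# The long-run share of sunny days should land near 0.7.
share_sunny = sum(draws) / len(draws)
print(f"Simulated share of sunny days: {share_sunny:.3f}")
```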

::: {.notes}
Introduce the concept that random variables need probability assignments to be useful.
:::

---

## Discrete Probability Distribution {.center}

### Example: Ice Cream Sales

<br>

| Outcome | Value | Probability |
|---------|-------|-------------|
| High sales (>100) | 1 | 0.6 |
| Low sales (≤100) | 0 | 0.4 |

<br>

**Properties:**

- Each probability ≥ 0
- Probabilities sum to 1: $0.6 + 0.4 = 1.0$
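
Both properties are easy to check in code. The helper below is a hypothetical sketch (`is_valid_pmf` is a name invented here), and the `broken` distribution is a made-up counterexample.

```python
import math

def is_valid_pmf(probabilities):
    """Return True if every probability is >= 0 and they sum to 1."""
    probs = list(probabilities)
    non_negative = all(p >= 0 for p in probs)
    sums_to_one = math.isclose(sum(probs), 1.0)
    return non_negative and sums_to_one

ice_cream_sales = {"high": 0.6, "low": 0.4}  # values from the table
broken = {"high": 0.6, "low": 0.5}           # made-up example, sums to 1.1

print(is_valid_pmf(ice_cream_sales.values()))
print(is_valid_pmf(broken.values()))
```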

::: {.notes}
Show a concrete example of a discrete probability distribution with clear properties.
:::

---

## Continuous Probability Distribution {.center}

### Example: Temperature

For continuous random variables, we use **probability density functions (PDFs)** to show how we allocate plausibility:

- Temperature $T$ can take any of the infinitely many values in a range (e.g., 60°F to 90°F)
- $P(T = 75^\circ\text{F}) = 0$ (any single exact value has zero probability)
- Instead: $P(70^\circ\text{F} < T < 80^\circ\text{F}) = 0.3$ (ranges have positive probability)

**Key:** Area under the PDF curve represents probability
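
To make "area under the curve" concrete, the sketch below assumes, purely for illustration, that temperature is uniformly distributed between 60°F and 90°F; the slide does not specify a distribution. With a flat PDF the area is just width times height, so the 70°F to 80°F range gets probability 1/3, and shrinking the interval around 75°F drives the probability toward zero.

```python
LOW, HIGH = 60.0, 90.0        # assumed temperature range in °F
DENSITY = 1.0 / (HIGH - LOW)  # constant height of the uniform PDF

def prob_between(a, b):
    """P(a < T < b) = area under the flat PDF between a and b."""
    a, b = max(a, LOW), min(b, HIGH)  # clip to the supported range
    return max(b - a, 0.0) * DENSITY

print(prob_between(70, 80))  # one third of the range
print(prob_between(60, 90))  # the whole range

# Shrinking the interval around 75 drives the probability toward 0.
for half_width in (1.0, 0.1, 0.001):
    print(half_width, prob_between(75 - half_width, 75 + half_width))
```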

::: {.notes}
Introduce continuous distributions and the concept of probability density.
:::

---

## Discrete Example: Coin Flips {.center}

```{python}
import matplotlib.pyplot as plt
import numpy as np

# Coin flip probabilities
outcomes = ['Heads', 'Tails']
probabilities = [0.5, 0.5]
colors = ['aliceblue', 'cadetblue']

plt.figure(figsize=(8, 5))
bars = plt.bar(outcomes, probabilities, color=colors, edgecolor='navy', linewidth=2)
plt.title('Probability Mass Function: Fair Coin', fontsize=20, fontweight='bold', color='#2c3e50')
plt.ylabel('Probability', fontsize=16, fontweight='bold')
plt.xlabel('Outcome', fontsize=16, fontweight='bold')
plt.ylim(0, 0.6)
plt.xticks(fontsize=14)
plt.yticks(fontsize=14)

# Add probability values on bars
for bar, prob in zip(bars, probabilities):
    plt.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.01,
             f'{prob:.1f}', ha='center', va='bottom', fontsize=16, fontweight='bold')

plt.tight_layout()
plt.show()
```

::: {.notes}
Show a simple discrete probability mass function for coin flips.
:::

---

## Discrete Example: Customer Count {.center}

```{python}
# Imports repeated so this cell also runs on its own
import matplotlib.pyplot as plt
import numpy as np

# Customer count probabilities (Poisson-like)
customers = [0, 1, 2, 3, 4, 5, 6, 7, 8]
probs = [0.05, 0.15, 0.25, 0.25, 0.15, 0.08, 0.04, 0.02, 0.01]
colors = ['aliceblue', 'cadetblue', '#9370db', 'aliceblue', 'cadetblue', '#9370db', 'aliceblue', 'cadetblue', '#9370db']

plt.figure(figsize=(8, 5))
bars = plt.bar(customers, probs, color=colors, edgecolor='navy', linewidth=1.5)
plt.title('Probability Mass Function: Daily Customer Count', fontsize=20, fontweight='bold', color='#2c3e50')
plt.ylabel('Probability', fontsize=16, fontweight='bold')
plt.xlabel('Number of Customers', fontsize=16, fontweight='bold')
plt.xticks(customers, fontsize=14)
plt.yticks(fontsize=14)

# Add probability values on bars
for bar, prob in zip(bars, probs):
    if prob > 0.05:  # Label only bars with probability above 0.05
        plt.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.005,
                 f'{prob:.2f}', ha='center', va='bottom', fontsize=12, fontweight='bold')

plt.tight_layout()
plt.show()
```
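
With a probability mass function in hand, the probability of any event is the sum of the masses of the outcomes it contains. The sketch below reuses the customer-count probabilities from the plot; the helper `prob_of` and the two events are illustrative choices, not part of the original material.

```python
# Same probability mass function as in the plot above.
customers = [0, 1, 2, 3, 4, 5, 6, 7, 8]
probs = [0.05, 0.15, 0.25, 0.25, 0.15, 0.08, 0.04, 0.02, 0.01]
pmf = dict(zip(customers, probs))

def prob_of(event):
    """P(event) = sum of the masses of the outcomes where event is true."""
    return sum(p for k, p in pmf.items() if event(k))

p_busy = prob_of(lambda k: k >= 5)               # five or more customers
p_two_or_three = prob_of(lambda k: k in (2, 3))  # exactly two or three
print(f"P(X >= 5) = {p_busy:.2f}")
print(f"P(X in {{2, 3}}) = {p_two_or_three:.2f}")
```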

::: {.notes}
Show a more complex discrete distribution for customer counts.
:::

---

## Continuous Example: Temperature {.center}

```{python}
# Imports repeated so this cell also runs on its own
import matplotlib.pyplot as plt
import numpy as np

# Temperature distribution (normal-like)
x = np.linspace(60, 90, 1000)
mean_temp = 75
std_temp = 5
pdf = (1/(std_temp * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((x - mean_temp) / std_temp)**2)

plt.figure(figsize=(8, 5))
plt.plot(x, pdf, color='cadetblue', linewidth=3, label='PDF')
plt.fill_between(x, pdf, alpha=0.3, color='aliceblue')
plt.title('Probability Density Function: Daily Temperature', fontsize=20, fontweight='bold', color='#2c3e50')
plt.ylabel('Probability Density', fontsize=16, fontweight='bold')
plt.xlabel('Temperature (°F)', fontsize=16, fontweight='bold')
plt.xticks(fontsize=14)
plt.yticks(fontsize=14)
plt.grid(True, alpha=0.3)

# Add mean line
plt.axvline(mean_temp, color='#9370db', linestyle='--', linewidth=2, label=f'Mean = {mean_temp}°F')
plt.legend(fontsize=14)

plt.tight_layout()
plt.show()
```
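
For the normal curve plotted here (mean 75°F, standard deviation 5°F), the area between two temperatures can be computed from the cumulative distribution function. The sketch below uses `math.erf` from the standard library; the helper names are invented for illustration, and the resulting numbers apply only to this particular curve.

```python
import math

MEAN_TEMP, STD_TEMP = 75.0, 5.0  # same parameters as the plot above

def normal_cdf(x, mean=MEAN_TEMP, std=STD_TEMP):
    """P(T <= x) for a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def prob_between(a, b):
    """P(a < T < b): area under the PDF between a and b."""
    return normal_cdf(b) - normal_cdf(a)

print(f"P(70 < T < 80) = {prob_between(70, 80):.3f}")
print(f"P(60 < T < 90) = {prob_between(60, 90):.3f}")
```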

::: {.notes}
Show a continuous probability density function for temperature.
:::

---

## Continuous Example: Sales Revenue {.center}

```{python}
# Imports repeated so this cell also runs on its own
import matplotlib.pyplot as plt
import numpy as np

# Sales revenue distribution (lognormal-like)
x = np.linspace(0, 2000, 1000)
mean_log = 6.5
std_log = 0.5

# The lognormal density is defined only for x > 0. Leave it at 0 for x = 0
# so that log(0) and a division by zero are never evaluated.
pdf = np.zeros_like(x)
positive = x > 0
pdf[positive] = (1/(x[positive] * std_log * np.sqrt(2 * np.pi))) * np.exp(-0.5 * ((np.log(x[positive]) - mean_log) / std_log)**2)

plt.figure(figsize=(8, 5))
plt.plot(x, pdf, color='#9370db', linewidth=3, label='PDF')
plt.fill_between(x, pdf, alpha=0.3, color='aliceblue')
plt.title('Probability Density Function: Daily Sales Revenue', fontsize=20, fontweight='bold', color='#2c3e50')
plt.ylabel('Probability Density', fontsize=16, fontweight='bold')
plt.xlabel('Revenue ($)', fontsize=16, fontweight='bold')
plt.xticks(fontsize=14)
plt.yticks(fontsize=14)
plt.grid(True, alpha=0.3)

# Add mean line (mean of a lognormal: exp(mu + sigma^2 / 2))
mean_revenue = np.exp(mean_log + std_log**2/2)
plt.axvline(mean_revenue, color='cadetblue', linestyle='--', linewidth=2, label=f'Mean ≈ ${mean_revenue:.0f}')
plt.legend(fontsize=14)

plt.tight_layout()
plt.show()
```

::: {.notes}
Show a continuous probability density function for sales revenue with a skewed distribution.
:::

---