---
title: "Selection Bias & Missing Data Challenge"
subtitle: "Creating a Statistics Meme: Write Your Own Functions"
format:
  html: default
execute:
  echo: false
  eval: true
---

# 🎨 Selection Bias & Missing Data Challenge

::: {.callout-important}
## 📊 Challenge Overview

**Your Task:** Create a four-panel statistics meme demonstrating selection bias. You'll write three Python functions yourself to complete the workflow, then assemble them into a professional meme.
:::

## Step 1: Prepare Image

Load an image, convert it to grayscale, and resize it to fit within `max_size` pixels while preserving its aspect ratio.
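
`prepare_image` lives in `step1_prepare_image.py` and is not shown here. As a rough idea of what such a function might do, the sketch below converts an already-loaded RGB array to grayscale in [0, 1] and shrinks it with nearest-neighbor sampling. The name `to_gray_resized`, the luma weights, and the reading of `max_size` as a cap on the longer side are assumptions for illustration, not the provided implementation.

```python
import numpy as np


def to_gray_resized(rgb, max_size=512):
    """Hypothetical helper: RGB array (H, W, 3) with values 0-255 -> grayscale in [0, 1].

    The result is shrunk so that its longer side is at most max_size pixels.
    """
    rgb = np.asarray(rgb, dtype=float) / 255.0
    # Weighted sum of the channels (Rec. 601 luma weights, an assumption)
    gray = rgb @ np.array([0.299, 0.587, 0.114])

    h, w = gray.shape
    # Never upscale: the scale factor is capped at 1
    scale = min(1.0, max_size / max(h, w))
    new_h = max(1, round(h * scale))
    new_w = max(1, round(w * scale))

    # Nearest-neighbor sampling: pick one source row/column per output row/column
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    return gray[np.ix_(rows, cols)]
```

Because both sides are multiplied by the same `scale`, the aspect ratio is preserved up to rounding. A real implementation would more likely load the file and resize with an image library, which gives smoother results than nearest-neighbor sampling.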

In [None]:
#| label: step1-prepare
#| echo: false
#| fig-cap: Original image prepared for processing

import numpy as np
import matplotlib.pyplot as plt
from step1_prepare_image import prepare_image



# Load and prepare the image
# CHANGE THIS to use your own image!
img_path = 'sobina.jpg'  # Example image - replace with your own image
gray_image = prepare_image(img_path, max_size=512)

# Display the prepared image
fig, ax = plt.subplots(figsize=(6.5, 5))
ax.imshow(gray_image, cmap='gray', vmin=0, vmax=1)
ax.axis('off')
ax.set_title('Step 1: Prepared Image', fontsize=14, fontweight='bold', pad=10)
plt.tight_layout()
plt.show()

## Step 2: Create Stippled Image

Generate a blue noise stippling pattern from the prepared image. The dots track the image's tones while staying evenly spaced, so the picture stays recognizable without clumps or gaps.
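
To make the idea concrete, here is a heavily simplified sketch of content-weighted dot placement with Gaussian repulsion. It is not the code in `step2_create_stipple.py`: the function name `stipple_sketch` is hypothetical, it assumes darker pixels should attract more dots, and it ignores `content_bias`, `noise_scale_factor`, and the `extreme_*` parameters. The names `percentage` and `sigma` are borrowed only by analogy.

```python
import numpy as np


def stipple_sketch(gray, percentage=0.08, sigma=0.9, seed=0):
    """Hypothetical sketch: place dots on dark regions while pushing neighbors apart."""
    rng = np.random.default_rng(seed)
    h, w = gray.shape

    # Assumption: darker pixels are more important, so they attract more dots
    importance = 1.0 - gray
    # A little noise breaks ties in flat regions
    importance = importance + 1e-3 * rng.random((h, w))

    n_dots = int(percentage * gray.size)
    yy, xx = np.mgrid[0:h, 0:w]
    pattern = np.ones((h, w))  # white canvas, dots are drawn as 0.0
    samples = []

    for _ in range(n_dots):
        # Put the next dot where the remaining importance is highest
        y, x = np.unravel_index(np.argmax(importance), importance.shape)
        pattern[y, x] = 0.0
        samples.append((int(y), int(x)))

        # Gaussian repulsion: suppress importance around the new dot
        bump = np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * sigma ** 2))
        importance = importance * (1.0 - bump)

    return pattern, samples
```

The suppression step is what spreads the dots out: a larger `sigma` clears a wider neighborhood around each dot, so the pattern becomes sparser and more regular. The loop recomputes a full-image Gaussian per dot, so this version is only practical on small images.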

In [None]:
#| label: step2-stipple
#| echo: false
#| fig-cap: Blue noise stippling pattern
#| warning: false

from step2_create_stipple import create_stipple

# Create stippled image
stipple_pattern, samples = create_stipple(
    gray_image,
    percentage=0.08,  # 8% of pixels will be stippled
    sigma=0.9,  # Repulsion radius
    content_bias=0.9,  # Strongly follow importance map
    noise_scale_factor=0.1,  # Moderate exploration
    extreme_downweight=0.5,  # Moderate downweighting of extremes
    extreme_threshold_low=0.2,  # Downweight tones below 0.2
    extreme_threshold_high=0.8,  # Downweight tones above 0.8
    extreme_sigma=0.1  # Smooth transition width
)

# Display the stippled image
fig, ax = plt.subplots(figsize=(6.5, 5))
ax.imshow(stipple_pattern, cmap='gray', vmin=0, vmax=1)
ax.axis('off')
ax.set_title('Step 2: Stippled Image', fontsize=14, fontweight='bold', pad=10)
plt.tight_layout()
plt.show()

## Step 3: Create Tonal Analysis (Optional Refinement Step)

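::: {.callout-note}
## 🔧 Optional Refinement Step

This step is optional. It box-averages the prepared image over a grid and prints tonal statistics that can help you tune the stippling parameters from Step 2.
:::
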
In [None]:
#| label: step3-tonal
#| echo: false
#| fig-cap: Box-averaged tonal analysis showing brightness distribution

from step3_create_tonal import create_tonal
import matplotlib.pyplot as plt

# Create tonal analysis with a 16×12 grid
grid_rows = 16
grid_cols = 12
tonal_image, average_tones, tonal_stats = create_tonal(
    gray_image,
    grid_rows=grid_rows,
    grid_cols=grid_cols,
    return_full_image=True
)

# Display the box-averaged tonal image with text annotations
fig, ax = plt.subplots(figsize=(6.5, 5))

# Show box-averaged tonal image
ax.imshow(tonal_image, cmap='gray', vmin=0, vmax=1)
ax.axis('off')
ax.set_title('Step 3: Box-Averaged Tonal Analysis', fontsize=14, fontweight='bold', pad=10)

# Calculate grid cell dimensions for text placement
h, w = gray_image.shape
section_h = h / grid_rows
section_w = w / grid_cols

# Add text annotations showing tone values (2 decimals) at the center of each grid cell
for i in range(grid_rows):
    for j in range(grid_cols):
        tone = average_tones[i, j]
        # Calculate center position of the grid cell
        y_center = (i + 0.5) * section_h
        x_center = (j + 0.5) * section_w
        # Use white text for dark sections, black text for light sections
        text_color = 'white' if tone < 0.5 else 'black'
        ax.text(x_center, y_center, f'{tone:.2f}', 
                ha='center', va='center', 
                color=text_color, fontsize=6, fontweight='bold')

plt.tight_layout()
plt.show()

# Print key statistics for parameter tuning
print(f"\n📊 Tonal Statistics for Parameter Tuning:")
print(f"  Mean brightness: {tonal_stats['mean']:.3f}")
print(f"  Standard deviation: {tonal_stats['std']:.3f}")
print(f"  Brightness range: [{tonal_stats['min']:.3f}, {tonal_stats['max']:.3f}]")
print(f"\n💡 Tuning Tips:")
print(f"  - If mean < 0.4: Image is dark, consider lowering extreme_threshold_low")
print(f"  - If mean > 0.6: Image is light, consider raising extreme_threshold_high")
print(f"  - If std > 0.2: High contrast, may need stronger extreme_downweight")
print(f"  - Use mid_tone_center around {tonal_stats['mean']:.2f} to emphasize average tones")
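
The box averaging behind this step can be sketched in a few lines. This is an illustration under assumptions, not the contents of `step3_create_tonal.py`: the name `box_average` is hypothetical, the statistics are computed over the cell averages (the provided function may compute them differently), and the image is assumed to have at least `grid_rows` rows and `grid_cols` columns.

```python
import numpy as np


def box_average(gray, grid_rows=16, grid_cols=12):
    """Hypothetical sketch: average a grayscale image over a grid of boxes."""
    h, w = gray.shape
    # Integer pixel boundaries of the grid cells
    row_edges = np.linspace(0, h, grid_rows + 1).astype(int)
    col_edges = np.linspace(0, w, grid_cols + 1).astype(int)

    tones = np.zeros((grid_rows, grid_cols))
    tonal = np.zeros_like(gray, dtype=float)

    for i in range(grid_rows):
        for j in range(grid_cols):
            r0, r1 = row_edges[i], row_edges[i + 1]
            c0, c1 = col_edges[j], col_edges[j + 1]
            # Mean brightness of the cell, painted back over the whole cell
            tones[i, j] = gray[r0:r1, c0:c1].mean()
            tonal[r0:r1, c0:c1] = tones[i, j]

    # Assumption: summary statistics are taken over the cell averages
    stats = {
        "mean": float(tones.mean()),
        "std": float(tones.std()),
        "min": float(tones.min()),
        "max": float(tones.max()),
    }
    return tonal, tones, stats
```

The three return values mirror the `tonal_image, average_tones, tonal_stats` unpacking used in the cell above.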

## Step 4: Create Block Letter "S" ⚠️ **YOUR TASK**

::: {.callout-warning}
## 🎯 Your Challenge: Write `step4_create_block_letter.py`

**Task:** Create a function `create_block_letter_s()` that generates a block letter "S" matching your image dimensions.
:::

In [None]:
#| label: step4-block-letter
#| echo: false
#| fig-cap: Block letter S representing selection bias
#| eval: false

# Change `#| eval: false` to `#| eval: true` above once you've written step4_create_block_letter.py:
from step4_create_block_letter import create_block_letter_s

# Get image dimensions
h, w = gray_image.shape

# Create block letter S
block_letter = create_block_letter_s(h, w, letter="S", font_size_ratio=0.9)

# Display the block letter
fig, ax = plt.subplots(figsize=(6.5, 5))
ax.imshow(block_letter, cmap='gray', vmin=0, vmax=1)
ax.axis('off')
ax.set_title('Step 4: Selection Bias (Block Letter S)', fontsize=14, fontweight='bold', pad=10)
plt.tight_layout()
plt.show()
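
If you are unsure where to start, one font-free approach is to build the "S" from five rectangles. The sketch below is only a starting point under assumptions: the name `block_s_sketch` and the `stroke_ratio` parameter are hypothetical, it ignores the `letter` argument, and it assumes the letter is drawn as 0.0 on a 1.0 background so that it works with the `threshold=0.5` convention used in Step 5.

```python
import numpy as np


def block_s_sketch(height, width, size_ratio=0.9, stroke_ratio=0.2):
    """Hypothetical sketch: draw a blocky "S" (0.0) on a white canvas (1.0)."""
    canvas = np.ones((height, width))

    # Centered bounding box for the letter, taller than it is wide
    box_h = int(height * size_ratio)
    box_w = int(min(width, height * 0.6) * size_ratio)
    top = (height - box_h) // 2
    left = (width - box_w) // 2
    bottom = top + box_h
    right = left + box_w

    t = max(1, int(box_h * stroke_ratio))  # stroke thickness in pixels
    mid = top + (box_h - t) // 2           # top edge of the middle bar

    canvas[top:top + t, left:right] = 0.0        # top bar
    canvas[mid:mid + t, left:right] = 0.0        # middle bar
    canvas[bottom - t:bottom, left:right] = 0.0  # bottom bar
    canvas[top:mid + t, left:left + t] = 0.0     # upper-left stem
    canvas[mid:bottom, right - t:right] = 0.0    # lower-right stem
    return canvas
```

The gaps at the upper right and lower left are what turn the three bars into an "S". Very narrow images can make the stems overlap, so a full solution should handle that case or render the letter with a font instead.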

## Step 5: Create Masked Image ⚠️ **YOUR TASK**

::: {.callout-warning}
## 🎯 Your Challenge: Write `step5_create_masked.py`

**Task:** Create a function `create_masked_stipple()` that applies the block letter mask to the stippled image.


:::
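
The core of the masking step is a single `np.where`. The sketch below assumes that mask pixels below `threshold` belong to the letter and that stipples inside the letter are replaced with white (1.0), which is how the final cell of this document describes the effect. The name `masked_stipple_sketch` is hypothetical and your own function may make different choices.

```python
import numpy as np


def masked_stipple_sketch(stipple, mask, threshold=0.5):
    """Hypothetical sketch: blank out stipples wherever the mask is dark."""
    if stipple.shape != mask.shape:
        raise ValueError("stipple and mask must have the same shape")

    # Pixels below the threshold are treated as part of the letter
    in_letter = mask < threshold
    # Inside the letter the data is "missing" (white); elsewhere keep the stipple
    return np.where(in_letter, 1.0, stipple)
```

The shape check matters because the block letter is generated from the image dimensions; a mismatch usually means the two arrays came from different images.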

In [None]:
#| label: step5-masked
#| echo: false
#| fig-cap: Masked stippled image showing selection bias effect
#| eval: false

# Change `#| eval: false` to `#| eval: true` above once you've written step5_create_masked.py:
from step5_create_masked import create_masked_stipple

# Create masked stippled image
masked_stipple = create_masked_stipple(
    stipple_pattern,
    block_letter,
    threshold=0.5  # Pixels below 0.5 are considered part of the mask
)

# Display the masked image
fig, ax = plt.subplots(figsize=(6.5, 5))
ax.imshow(masked_stipple, cmap='gray', vmin=0, vmax=1)
ax.axis('off')
ax.set_title('Step 5: Masked Stippled Image (Estimate)', fontsize=14, fontweight='bold', pad=10)
plt.tight_layout()
plt.show()

## Create the Final Statistics Meme ⚠️ **YOUR TASK**

::: {.callout-warning}
## 🎯 Your Challenge: Write `create_meme.py`

**Task:** Create a function `create_statistics_meme()` that assembles all four panels into a professional-looking meme.


:::
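
One simple layout is a single row of four Matplotlib axes saved to disk. The sketch below is a minimal starting point, not a finished design: the name `meme_sketch`, the figure size, and the panel titles (taken from the step titles used earlier in this document) are assumptions, and you will probably want your own captions and styling.

```python
import matplotlib.pyplot as plt


def meme_sketch(original_img, stipple_img, block_letter_img, masked_stipple_img,
                output_path="my_statistics_meme.png", dpi=150,
                background_color="pink"):
    """Hypothetical sketch: arrange four grayscale panels in one row and save them."""
    panels = [
        (original_img, "Prepared Image"),
        (stipple_img, "Stippled Image"),
        (block_letter_img, "Selection Bias"),
        (masked_stipple_img, "Estimate"),
    ]

    fig, axes = plt.subplots(1, 4, figsize=(16, 4.5), facecolor=background_color)
    for ax, (img, title) in zip(axes, panels):
        ax.imshow(img, cmap="gray", vmin=0, vmax=1)
        ax.set_title(title, fontsize=14, fontweight="bold")
        ax.axis("off")

    fig.tight_layout()
    # Pass the face color explicitly so the background survives in the saved file
    fig.savefig(output_path, dpi=dpi, facecolor=fig.get_facecolor())
    plt.close(fig)
    return output_path
```

Closing the figure after saving keeps it from also being rendered inline, which is useful here because the final cell displays the saved PNG instead.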

In [None]:
#| label: create-final-meme
#| echo: false
#| eval: false

# Change `#| eval: false` to `#| eval: true` above once you've written create_meme.py:
from create_meme import create_statistics_meme

# Create the final meme
create_statistics_meme(
    original_img=gray_image,
    stipple_img=stipple_pattern,
    block_letter_img=block_letter,
    masked_stipple_img=masked_stipple,
    output_path="my_statistics_meme.png",
    dpi=150,
    background_color="pink"  # e.g. "pink" or "lightgray"
)

In [None]:
#| label: final-statistics-meme
#| echo: false
#| fig-cap: Statistics meme demonstrating selection bias

import numpy as np
import matplotlib.pyplot as plt
from step1_prepare_image import prepare_image
from step2_create_stipple import create_stipple
from step4_create_block_letter import create_block_letter_s
from step5_create_masked import create_masked_stipple
from create_meme import create_statistics_meme

# Step 1: Prepare the image (use the same image path from earlier steps)
img_path = 'sobina.jpg'  # Change to your own image file
gray_img = prepare_image(img_path, max_size=512)

# Step 2: Generate stippled version using blue noise stippling
stipple_pattern, samples = create_stipple(
    gray_img,
    percentage=0.08,  # 8% of pixels will be stippled
    sigma=0.9,  # Repulsion radius
    content_bias=0.9,  # Strongly follow importance map
    noise_scale_factor=0.1,  # Moderate exploration
    extreme_downweight=0.5,  # Moderate downweighting of extremes
    extreme_threshold_low=0.2,  # Downweight tones below 0.2
    extreme_threshold_high=0.8,  # Downweight tones above 0.8
    extreme_sigma=0.1  # Smooth transition width
)

# Step 4: Generate a block-style "S" letter mask with the same image size
h, w = gray_img.shape
block_letter_img = create_block_letter_s(
    height=h,
    width=w,
    letter="S",
    font_size_ratio=0.9
)

# Step 5: Remove stipples inside the block letter (simulate selection bias)
masked_stipple_img = create_masked_stipple(
    stipple_pattern,
    block_letter_img,
    threshold=0.5  # Pixels below 0.5 are considered part of the mask
)

# Final step: Assemble the four-panel meme and save
create_statistics_meme(
    original_img=gray_img,
    stipple_img=stipple_pattern,
    block_letter_img=block_letter_img,
    masked_stipple_img=masked_stipple_img,
    output_path="my_statistics_meme.png",
    dpi=150,
    background_color="pink"
)

# Display the meme
from IPython.display import Image as IPyImage, display
display(IPyImage("my_statistics_meme.png"))