# Web Accessibility Compliance

**Author:** Luke Steuber  
**Created:** 2026-02-13

This notebook analyzes web accessibility trends using five datasets covering automated testing, screen reader user preferences, legal enforcement, government compliance, and HTTP Archive metrics. Together the datasets span 2017 to 2025.

## Data Sources

1. **WebAIM Million Report 2025** - WAVE automated analysis of top 1M website homepages
2. **WebAIM Screen Reader Survey 2024** - 1,539 respondents on screen reader preferences and problematic elements
3. **ADA Digital Lawsuits 2017-2024** - Tracking of web accessibility litigation
4. **Section 508 Compliance 2024** - Federal government website compliance assessment
5. **Web Accessibility Metrics 2019-2024** - HTTP Archive trends in ARIA, alt text, contrast, and more

## Key Findings

- **94.8%** of top 1M sites have detectable WCAG failures
- **51 errors per page** on average
- **79.1%** of sites have low contrast text (most common failure)
- **55.5%** of sites have images missing alt text
- Only **23%** of federal websites conform to Section 508

In [None]:
# Import libraries
import json
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
from pathlib import Path

%matplotlib inline

# Set style
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)
plt.rcParams['font.size'] = 10

## 1. Load Datasets

In [None]:
# Load all datasets
data_dir = Path('../')

# WebAIM Million Report
with open(data_dir / 'webaim_million_2025.json') as f:
    webaim = json.load(f)

# WebAIM Screen Reader Survey
with open(data_dir / 'webaim_screen_reader_survey_2024.json') as f:
    screen_reader = json.load(f)

# ADA Digital Lawsuits
with open(data_dir / 'ada_digital_lawsuits.json') as f:
    lawsuits = json.load(f)

# Section 508 Compliance
with open(data_dir / 'section_508_compliance_2024.json') as f:
    section_508 = json.load(f)

# Web Accessibility Metrics
with open(data_dir / 'web_accessibility_metrics.json') as f:
    http_archive = json.load(f)

print("✓ All datasets loaded successfully")
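
The five `open` / `json.load` pairs above can also be folded into one helper. The sketch below is one possible approach, not the notebook's own loader: `load_json_datasets` is a hypothetical name, and it assumes every file is UTF-8 JSON sitting directly in `data_dir`. It reports all missing files in a single `FileNotFoundError` instead of stopping at the first one.

```python
import json
from pathlib import Path


def load_json_datasets(data_dir, filenames):
    """Load several JSON files from data_dir into a dict keyed by file stem.

    Hypothetical helper: checks for every missing file up front so the
    error message names all of them at once.
    """
    data_dir = Path(data_dir)
    missing = [name for name in filenames if not (data_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"Missing in {data_dir}: {', '.join(missing)}")

    datasets = {}
    for name in filenames:
        # Explicit encoding avoids platform-dependent defaults
        with open(data_dir / name, encoding="utf-8") as f:
            datasets[Path(name).stem] = json.load(f)
    return datasets
```

With this helper, `load_json_datasets('../', ['webaim_million_2025.json'])['webaim_million_2025']` would hold the same object that the cell above binds to `webaim`.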

## 2. WebAIM Million Report - Dataset Overview

In [None]:
# Display summary statistics
summary = webaim['summary']
summary_df = pd.DataFrame([summary])
print("WebAIM Million 2025 Summary:")
summary_df

In [None]:
# Convert common failures to DataFrame
failures_df = pd.DataFrame(webaim['common_failures'])
failures_df.columns = ['Failure Type', 'Pages Affected (%)', 'Avg Instances per Page']
print("\nMost Common WCAG Failures:")
failures_df

## 3. Visualization: Common Failure Types

Low contrast text affects nearly 4 out of 5 websites, making it the most pervasive accessibility barrier.

In [None]:
# Horizontal bar chart of common failures
fig, ax = plt.subplots(figsize=(12, 6))

# Prepare data
failures = pd.DataFrame(webaim['common_failures'])
failures = failures.sort_values('pages_affected_pct')

# Create horizontal bar chart
colors = sns.color_palette('RdYlGn_r', len(failures))
bars = ax.barh(failures['type'], failures['pages_affected_pct'], color=colors)

# Formatting
ax.set_xlabel('Pages Affected (%)', fontsize=12, fontweight='bold')
ax.set_title('Most Common Web Accessibility Failures (WebAIM Million 2025)', 
             fontsize=14, fontweight='bold', pad=20)
ax.set_xlim(0, 100)

# Add value labels
for bar, val in zip(bars, failures['pages_affected_pct']):
    ax.text(val + 1, bar.get_y() + bar.get_height()/2, 
            f'{val:.1f}%', va='center', fontsize=10, fontweight='bold')

# Clean up labels
labels = {
    'low_contrast_text': 'Low Contrast Text',
    'missing_alt_text': 'Missing Alt Text',
    'missing_form_labels': 'Missing Form Labels',
    'empty_links': 'Empty Links',
    'empty_buttons': 'Empty Buttons',
    'missing_language': 'Missing Language Attribute'
}
ax.set_yticks(range(len(failures)))  # pin tick positions before relabeling
ax.set_yticklabels([labels.get(t, t) for t in failures['type']])

plt.tight_layout()
plt.show()

## 4. Visualization: Yearly Trends (2019-2025)

Overall failure rates are declining slowly but steadily from 97.8% in 2019 to 94.8% in 2025.
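
To put "slowly but steadily" in numbers, the two endpoints quoted above give an average change per year. This is a minimal sketch using only those two figures; `avg_annual_change` is a hypothetical helper, and a straight-line average says nothing about how the rate moved in the years between.

```python
def avg_annual_change(start_value, end_value, start_year, end_year):
    """Average change per year between two endpoints, in the values' own units."""
    return (end_value - start_value) / (end_year - start_year)


# Endpoints quoted in the text: 97.8% (2019) and 94.8% (2025)
change = avg_annual_change(97.8, 94.8, 2019, 2025)
print(f"{change:+.2f} percentage points per year")
```

That works out to roughly half a percentage point of improvement per year.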

In [None]:
# Line chart showing yearly trends
fig, ax = plt.subplots(figsize=(14, 7))

trends = webaim['yearly_trends']
years = trends['years']

# Plot multiple metrics
ax.plot(years, trends['pages_with_failures_pct'], marker='o', linewidth=2.5, 
        label='Pages with Failures', color='#d62728', markersize=8)
ax.plot(years, trends['low_contrast_pct'], marker='s', linewidth=2, 
        label='Low Contrast Text', color='#ff7f0e', markersize=7)
ax.plot(years, trends['missing_alt_text_pct'], marker='^', linewidth=2, 
        label='Missing Alt Text', color='#2ca02c', markersize=7)
ax.plot(years, trends['empty_links_pct'], marker='D', linewidth=2, 
        label='Empty Links', color='#1f77b4', markersize=6)

# Formatting
ax.set_xlabel('Year', fontsize=12, fontweight='bold')
ax.set_ylabel('Percentage of Pages (%)', fontsize=12, fontweight='bold')
ax.set_title('Web Accessibility Failure Trends (2019-2025)', 
             fontsize=14, fontweight='bold', pad=20)
ax.legend(loc='upper right', frameon=True, shadow=True, fontsize=11)
ax.grid(True, alpha=0.3)
ax.set_ylim(0, 100)

plt.tight_layout()
plt.show()

## 5. Screen Reader Survey - Dataset Overview

In [None]:
# Screen reader survey overview
print(f"Survey: {screen_reader['source']}")
print(f"Respondents: {screen_reader['respondents']:,}")
print(f"Period: {screen_reader['survey_period']}")
print(f"\nDisability Status:")
print(f"  - Uses due to disability: {screen_reader['demographics']['disability_status']['uses_due_to_disability_pct']}%")
print(f"  - No disability: {screen_reader['demographics']['disability_status']['no_disability_pct']}%")

In [None]:
# Primary screen readers
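sr_df = pd.DataFrame(screen_reader['primary_screen_reader'])
# Rename by key rather than by position so extra fields in the data cannot break the cell
sr_df = sr_df.rename(columns={'name': 'Screen Reader', 'pct': 'Percentage'})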
print("\nPrimary Screen Reader Usage:")
sr_df

## 6. Visualization: Screen Reader Market Share

JAWS and NVDA dominate the market with nearly equal share, together accounting for 78% of primary screen reader users.
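
The 78% figure can be checked from the two shares this notebook quotes in its Key Insights (JAWS 40.5%, NVDA 37.7%). The sketch below uses a two-row stand-in for `screen_reader['primary_screen_reader']`, reusing the `name` and `pct` keys that the plotting cell relies on; the real dataset has more rows.

```python
# Stand-in records: only the two shares quoted in this notebook's text
sample = [
    {"name": "JAWS", "pct": 40.5},
    {"name": "NVDA", "pct": 37.7},
]

# Take the two largest shares, then compare them
top_two = sorted(sample, key=lambda row: row["pct"], reverse=True)[:2]
combined = sum(row["pct"] for row in top_two)
gap = top_two[0]["pct"] - top_two[1]["pct"]

print(f"Combined: {combined:.1f}%, gap: {gap:.1f} points")
```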

In [None]:
# Pie chart of screen reader market share
fig, ax = plt.subplots(figsize=(10, 8))

sr_data = pd.DataFrame(screen_reader['primary_screen_reader'])
sr_data = sr_data.sort_values('pct', ascending=False)

# Combine small segments into "Other"
threshold = 3.0
large = sr_data[sr_data['pct'] >= threshold]
small = sr_data[sr_data['pct'] < threshold]
if len(small) > 0:
    other_row = pd.DataFrame([{'name': 'Other', 'pct': small['pct'].sum()}])
    plot_data = pd.concat([large, other_row], ignore_index=True)
else:
    plot_data = large

# Create pie chart
colors = sns.color_palette('Set2', len(plot_data))
wedges, texts, autotexts = ax.pie(plot_data['pct'], 
                                    labels=plot_data['name'],
                                    autopct='%1.1f%%',
                                    startangle=90,
                                    colors=colors,
                                    textprops={'fontsize': 11, 'fontweight': 'bold'})

# Keep percentage labels dark: white text has poor contrast on the pastel Set2 palette
for autotext in autotexts:
    autotext.set_color('black')
    autotext.set_fontweight('bold')

ax.set_title('Primary Screen Reader Usage (WebAIM Survey 2024)', 
             fontsize=14, fontweight='bold', pad=20)

plt.tight_layout()
plt.show()

## 7. Most Problematic Elements for Screen Reader Users

In [None]:
# Display top problematic items
problems_df = pd.DataFrame(screen_reader['problematic_items_ranked'][:8])
problems_df.columns = ['Item', 'Rank', 'Points']
print("Most Problematic Elements (Ranked by Points):")
problems_df

## 8. Visualization: ADA Digital Lawsuits Growth

Digital accessibility lawsuits grew nearly 400% from 2017 to 2024, reflecting increased awareness and enforcement.
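
A percent increase is measured relative to the starting value, so a 391% increase means the 2024 count is about 4.9 times the 2017 count, not 3.9 times. The sketch below works backward from the two figures this notebook quotes (about 4,000 lawsuits in 2024 and a 391% increase). The 2017 count it prints is derived, not read from the dataset; the actual values live in `lawsuits['total_lawsuits']`. Both function names are hypothetical.

```python
def pct_increase(old, new):
    """Percent increase from old to new (100 means the value doubled)."""
    return (new - old) / old * 100


def implied_baseline(new, pct):
    """Starting value implied by an end value and a percent increase."""
    return new / (1 + pct / 100)


# Figures quoted in this notebook: about 4,000 lawsuits in 2024, a 391% increase
baseline_2017 = implied_baseline(4000, 391)
print(f"Implied 2017 count: about {baseline_2017:.0f}")
print(f"Round trip: {pct_increase(baseline_2017, 4000):.0f}% increase")
```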

In [None]:
# Bar chart of ADA lawsuits over time
fig, ax = plt.subplots(figsize=(12, 6))

years = lawsuits['years']
cases = lawsuits['total_lawsuits']

# Create gradient color based on values
colors = plt.cm.OrRd(np.linspace(0.3, 0.9, len(years)))
bars = ax.bar(years, cases, color=colors, edgecolor='black', linewidth=1.2)

# Add value labels on bars
for bar, val in zip(bars, cases):
    height = bar.get_height()
    ax.text(bar.get_x() + bar.get_width()/2, height + 50,
            f'{val:,}', ha='center', va='bottom', fontsize=10, fontweight='bold')

# Formatting
ax.set_xlabel('Year', fontsize=12, fontweight='bold')
ax.set_ylabel('Number of Lawsuits', fontsize=12, fontweight='bold')
ax.set_title('ADA Digital Accessibility Lawsuits (2017-2024)', 
             fontsize=14, fontweight='bold', pad=20)
ax.grid(axis='y', alpha=0.3)
ax.set_ylim(0, max(cases) * 1.15)

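# Add annotation (growth computed from the first and last year in the data)
growth_pct = (cases[-1] - cases[0]) / cases[0] * 100
ax.annotate(f'{growth_pct:.0f}% increase\nfrom {years[0]} to {years[-1]}',
            xy=(years[-1], cases[-1]), xytext=(2021, cases[-1] * 0.7),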
            arrowprops=dict(arrowstyle='->', lw=2, color='red'),
            fontsize=11, fontweight='bold', color='red',
            bbox=dict(boxstyle='round,pad=0.5', facecolor='white', edgecolor='red', linewidth=2))

plt.tight_layout()
plt.show()

## 9. Section 508 Federal Compliance

In [None]:
# Display Section 508 findings
print(f"Source: {section_508['source']}")
print(f"Year: {section_508['year']}")
print(f"Reporting Entities: {section_508['reporting_entities']}")
print(f"Assessment Criteria: {section_508['assessment_criteria']}")
print(f"\nKey Findings:")
print(f"  - Public websites conforming: {section_508['key_findings']['public_websites_conforming_pct']}%")
print(f"  - Intranet pages conforming: {section_508['key_findings']['intranet_pages_conforming_pct']}%")
print(f"  - Trend: {section_508['key_findings']['conformance_trend']}")

## 10. HTTP Archive Trends (2019-2024)

Long-term trends show slow but steady improvement across most accessibility metrics.

In [None]:
# Extract trend data
trends = http_archive['trend_analysis']['metric_trends']

# Convert to DataFrame for easier plotting
metrics = {
    'ARIA Adoption': trends['aria_adoption']['values_by_year'],
    'Alt Text Coverage': trends['alt_text_coverage']['values_by_year'],
    'Low Contrast Pages': trends['low_contrast_pages']['values_by_year'],
    'Form Labels': trends['form_label_coverage']['values_by_year']
}

trends_df = pd.DataFrame(metrics)
print("HTTP Archive Accessibility Trends (2019-2024):")
trends_df

In [None]:
# Plot HTTP Archive trends
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 6))

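# Take the years from the data so x and y always have matching lengths
# (JSON object keys load as strings, hence the int conversion; this assumes
# every metric covers the same years as aria_adoption)
years = [int(y) for y in trends['aria_adoption']['values_by_year']]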

# Left plot: Positive trends (higher is better)
ax1.plot(years, trends['aria_adoption']['values_by_year'].values(), 
         marker='o', linewidth=2, label='ARIA Adoption', markersize=8)
ax1.plot(years, trends['alt_text_coverage']['values_by_year'].values(), 
         marker='s', linewidth=2, label='Alt Text Coverage', markersize=8)
ax1.plot(years, trends['form_label_coverage']['values_by_year'].values(), 
         marker='^', linewidth=2, label='Form Labels', markersize=8)

ax1.set_xlabel('Year', fontsize=11, fontweight='bold')
ax1.set_ylabel('Percentage (%)', fontsize=11, fontweight='bold')
ax1.set_title('Improving Metrics (Higher is Better)', fontsize=12, fontweight='bold')
ax1.legend(loc='lower right', frameon=True, shadow=True)
ax1.grid(True, alpha=0.3)
ax1.set_ylim(0, 100)

# Right plot: Negative trends (lower is better)
ax2.plot(years, trends['low_contrast_pages']['values_by_year'].values(), 
         marker='o', linewidth=2, label='Low Contrast Pages', color='#d62728', markersize=8)
ax2.plot(years, trends['heading_skip_rate']['values_by_year'].values(), 
         marker='s', linewidth=2, label='Heading Skip Rate', color='#ff7f0e', markersize=8)

ax2.set_xlabel('Year', fontsize=11, fontweight='bold')
ax2.set_ylabel('Percentage (%)', fontsize=11, fontweight='bold')
ax2.set_title('Declining Failures (Lower is Better)', fontsize=12, fontweight='bold')
ax2.legend(loc='upper right', frameon=True, shadow=True)
ax2.grid(True, alpha=0.3)
ax2.set_ylim(0, 100)

plt.tight_layout()
plt.show()

## 11. Lighthouse Accessibility Score Trends

In [None]:
# Plot Lighthouse scores
fig, ax = plt.subplots(figsize=(12, 6))

lighthouse = trends['lighthouse_median_desktop']['values_by_year']
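# JSON object keys load as strings; convert to int so the x-axis is numeric
# and the annotation at x=2020 below lands on the right point
years = [int(y) for y in lighthouse]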
scores = list(lighthouse.values())

ax.plot(years, scores, marker='o', linewidth=3, markersize=10, 
        color='#2ca02c', label='Median Desktop Score')

# Add value labels
for year, score in zip(years, scores):
    ax.text(year, score + 1, f'{score}', ha='center', fontsize=10, fontweight='bold')

ax.set_xlabel('Year', fontsize=12, fontweight='bold')
ax.set_ylabel('Lighthouse Score (0-100)', fontsize=12, fontweight='bold')
ax.set_title('Lighthouse Accessibility Score Trends (2019-2024)', 
             fontsize=14, fontweight='bold', pad=20)
ax.grid(True, alpha=0.3)
ax.set_ylim(50, 100)
ax.legend(loc='lower right', frameon=True, shadow=True, fontsize=11)

# Add annotation for the big jump
ax.annotate('+18 point jump\n(likely scoring change)', 
            xy=(2020, 80), xytext=(2020.5, 73),
            arrowprops=dict(arrowstyle='->', lw=2, color='blue'),
            fontsize=10, fontweight='bold', color='blue',
            bbox=dict(boxstyle='round,pad=0.5', facecolor='white', edgecolor='blue'))

plt.tight_layout()
plt.show()

## 12. Summary Statistics

In [None]:
# Create summary table
summary_data = {
    'Metric': [
        'Pages with WCAG Failures',
        'Average Errors per Page',
        'Low Contrast Text',
        'Missing Alt Text',
        'Federal 508 Compliance',
        'ADA Lawsuits (2024)',
        'Screen Reader Survey Respondents',
        'Primary Screen Reader (JAWS)',
        'Lighthouse Median Score (2024)'
    ],
    'Value': [
        '94.8%',
        '51',
        '79.1%',
        '55.5%',
        '23%',
        '4,000',
        '1,539',
        '40.5%',
        '84'
    ],
    'Source': [
        'WebAIM Million 2025',
        'WebAIM Million 2025',
        'WebAIM Million 2025',
        'WebAIM Million 2025',
        'GSA Section 508 Assessment',
        'Multiple Lawsuit Trackers',
        'WebAIM SR Survey #10',
        'WebAIM SR Survey #10',
        'HTTP Archive 2024'
    ]
}

summary_table = pd.DataFrame(summary_data)
print("\n=== Web Accessibility Summary Statistics ===\n")
summary_table

## 13. Key Insights

### Crisis-Level Failure Rates
Nearly 95% of the top one million homepages have detectable WCAG failures, with an average of 51 errors per page. This is not a niche problem: it affects the vast majority of the web.
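
Scaled to the size of the sample, those two headline figures imply the following rough totals. This is a back-of-the-envelope sketch: it treats the sample as exactly one million homepages and assumes the 51-error average is taken over all sampled pages, neither of which is stated in this notebook.

```python
SAMPLE_SIZE = 1_000_000      # nominal size of the "top 1M" sample
FAILURE_RATE_PCT = 94.8      # pages with detectable WCAG failures
AVG_ERRORS_PER_PAGE = 51     # assumed to be averaged over all sampled pages

pages_with_failures = round(SAMPLE_SIZE * FAILURE_RATE_PCT / 100)
total_errors = SAMPLE_SIZE * AVG_ERRORS_PER_PAGE

print(f"Pages with failures: about {pages_with_failures:,}")
print(f"Detected errors:     about {total_errors:,}")
```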

### Low Contrast Dominates
79% of sites have insufficient color contrast, making it by far the most common accessibility barrier. This affects users with low vision, color blindness, and anyone using screens in bright conditions.

### Slow but Steady Improvement
From 2019 to 2024, the HTTP Archive data shows:
- ARIA adoption grew from 60% to 74%
- Alt text coverage improved from 46% to 59%
- Lighthouse scores rose from 62 to 84
- Low contrast failures declined from 85% to 81%
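
The endpoints in the list above translate into total and per-year changes as follows. This is a minimal sketch using only the quoted start and end values; it assumes a simple straight-line average over the five-year span and ignores the shape of the curve in between.

```python
# (2019 value, 2024 value) as quoted in the list above
endpoints = {
    "ARIA adoption (%)": (60, 74),
    "Alt text coverage (%)": (46, 59),
    "Lighthouse score": (62, 84),
    "Low contrast pages (%)": (85, 81),
}

span_years = 2024 - 2019

# Map each metric to (total change, average change per year)
changes = {
    metric: (end - start, (end - start) / span_years)
    for metric, (start, end) in endpoints.items()
}

for metric, (total, per_year) in changes.items():
    print(f"{metric:<24} {total:+3d} total, {per_year:+.1f} per year")
```

Most of the Lighthouse gain (18 of 22 points) comes from the single jump that the chart in section 11 annotates as a likely scoring change, so its per-year figure probably overstates real progress.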

### Legal Pressure Increasing
ADA digital lawsuits grew nearly 400% from 2017 to 2024. E-commerce sites account for 77% of cases. In January 2025, the FTC ordered accessiBe to pay $1 million over deceptive claims about its accessibility overlay product.

### Government Compliance Lags
Only 23% of federal websites conform to Section 508, and compliance declined from the prior year. Even entities with legal obligations struggle to meet standards.

### Screen Reader Landscape
JAWS (40.5%) and NVDA (37.7%) together account for 78% of primary screen reader usage. CAPTCHA remains the #1 barrier, followed by interactive elements like menus, tabs, and dialogs.

### Mobile Parity
Median mobile and desktop Lighthouse accessibility scores have converged (83 vs. 84 in 2024), which suggests that responsive sites largely carry the same accessibility characteristics across devices.

## Conclusion

Web accessibility is improving, but at a glacial pace. The web remains largely inaccessible to people with disabilities, with basic issues like color contrast and missing alt text affecting the majority of sites. Legal enforcement is growing, but even government agencies struggle to meet compliance standards. The good news: semantic HTML5 adoption is becoming standard practice, ARIA usage is widespread, and Lighthouse scores are trending upward. The challenge: translating these technical improvements into genuinely accessible user experiences.