
stephaniewashburn/AviationPythonProject

 
 


Aviation Safety Analysis — Proposal, KPIs, Stakeholders, Datasets

Overview

This project analyzes ASRS aviation incident reports to (1) describe historical trends, (2) compare recent years (2024–2025) to prior years, and (3) explore factors associated with change (e.g., FAR part, airspace, time of day). We combine exploratory analysis, data cleaning, and modeling (OLS and SARIMA) to inform potential safety interventions.


Datasets


Stakeholders

  • Regulators: FAA, NTSB
  • Operators: Airlines, Part 135 operators, General Aviation community
  • Air Traffic Services: ATC facilities and management
  • Industry & Unions: OEMs, pilot associations, safety teams
  • Public & Insurers: Travelers, risk analysts

KPIs

  • Trend indicators: Year-over-year and month-over-month incident counts by FAR Part (91/121/135) and airspace class (A–E).
  • Category-specific trends: “Human Factors,” “Aircraft,” “Company Policy,” “Weather,” etc.
  • Recent-year comparison: Are 2024–2025 incident patterns different from prior years?
  • Seasonality: Magnitude and timing of monthly seasonal effects.
  • Forecast stability: SARIMA projections for 121 and 91 (short- and 5-year horizons).
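The trend and recent-year KPIs above can be sketched with the standard library alone. This is a minimal illustration, not the notebook's actual code: the `records` tuples, the function names, and the choice of which years count as "recent" are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical records: (year, FAR part) pairs standing in for cleaned ASRS rows.
records = [
    (2022, "121"), (2022, "121"), (2022, "91"),
    (2023, "121"), (2023, "91"), (2023, "91"),
    (2024, "121"), (2024, "91"), (2024, "91"), (2024, "91"),
]

def yearly_counts(rows, far_part):
    """Count incidents per year for one FAR part."""
    return Counter(year for year, part in rows if part == far_part)

def year_over_year(counts):
    """Return {year: fractional change vs. the previous year} where both years exist."""
    changes = {}
    for year in sorted(counts):
        prev = counts.get(year - 1)
        if prev:  # skip missing or zero-count previous years
            changes[year] = (counts[year] - prev) / prev
    return changes

def recent_vs_prior(counts, recent_years):
    """Return (mean yearly count in recent_years, mean yearly count in all other years)."""
    recent = [counts[y] for y in counts if y in recent_years]
    prior = [counts[y] for y in counts if y not in recent_years]
    return sum(recent) / len(recent), sum(prior) / len(prior)

counts_91 = yearly_counts(records, "91")
yoy_91 = year_over_year(counts_91)
recent_mean, prior_mean = recent_vs_prior(counts_91, {2024})
```

The same functions apply to month-over-month counts if the first tuple element is replaced by a (year, month) index with a matching "previous period" rule.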

Data Cleaning (high level)

  • Header fix: First row contained true headers; promoted to column names.
  • De-duplication: Where multiple reports referenced the same incident, retained a single record per incident/field to avoid double counting.
  • Date hygiene: Removed obviously invalid dates (e.g., “0 BC”).
  • Normalization:
    • FAR Parts: collapsed repeated values (e.g., "121; 121" → "121").
    • Light/Day labels: standardized variants to a single label (e.g., "Day", "day", "Daylight" → "Day").
  • Airspace anonymization: Standardized airport labels to anonymized zzz to ensure consistency across years after ASRS policy changes.
  • FAR ambiguity: Excluded the small number of records with conflicting FAR parts to keep groupings clean.
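The normalization, exclusion, and de-duplication steps above might look roughly like the following. It is a library-free sketch under stated assumptions: the column names (`incident_id`, `far_part`, `light`), the label map, and the "keep the first report per incident" rule are hypothetical and are not taken from the notebook.

```python
def normalize_far_part(raw):
    """Collapse repeated values ("121; 121" -> "121"); return None if parts conflict."""
    parts = {p.strip() for p in raw.split(";") if p.strip()}
    return parts.pop() if len(parts) == 1 else None

# Assumed label map; the real set of variants in the data may differ.
LIGHT_LABELS = {"day": "Day", "daylight": "Day", "night": "Night"}

def normalize_light(raw):
    """Map case/spelling variants to one label; leave unknown labels unchanged."""
    return LIGHT_LABELS.get(raw.strip().lower(), raw.strip())

def clean(rows):
    """Normalize labels, drop conflicting FAR parts, keep one row per incident id."""
    seen, out = set(), []
    for row in rows:
        far = normalize_far_part(row["far_part"])
        if far is None or row["incident_id"] in seen:
            continue
        seen.add(row["incident_id"])
        out.append({**row, "far_part": far, "light": normalize_light(row["light"])})
    return out

rows = [
    {"incident_id": "A1", "far_part": "121; 121", "light": "day"},
    {"incident_id": "A1", "far_part": "121", "light": "Day"},        # duplicate report
    {"incident_id": "B2", "far_part": "91; 135", "light": "Night"},  # conflicting parts
    {"incident_id": "C3", "far_part": "91", "light": "Daylight"},
]
cleaned = clean(rows)
```

Returning `None` for conflicting parts keeps the exclusion rule explicit, so the number of dropped records can be counted before they are filtered out.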

Exploratory Data Analysis (figures are in the repo)

  • Accidents over time: Overall decrease over time, though not entirely monotonic.
    [Figure: Accidents per Year]

  • Accidents per state: Top three are California, Texas, Florida.
    [Figure: Accidents per State]

  • Top 3 states over time: Declines since ~2015 are most apparent.
    [Figure: Accidents per top 3 states]

  • Accidents per country: U.S. highest (dataset focuses on U.S.-origin flights), followed by the Philippines.
    [Figure: Accidents per Country]

  • FAR 121 × Airspace class (A–E):
    [Figures: Accidents FAR 121; Airspace A; Airspace B; Airspace C; Airspace D; Airspace E]

  • Time of day: Distribution across daylight vs night operations.
    [Figure: Accidents by Time of Day]

  • By FAR Part:
    [Figures: FAR 91; FAR 121; FAR 135]

  • Top 10 aircraft models: Commercial fixed-wing dominates counts (usage effects likely).
    [Figure: Top 10 Aircraft Models]


Modeling & Results

OLS (trends)

  • All incidents (2006–2024): Yearly incidents decrease over time (p = 0.023).
    R² = 0.337, adj. R² = 0.286.
    [Figure: Trend in incidents]

  • FAR 121 (commercial): Clearer decreasing trend (p < 0.001).
    R² = 0.696.
    [Figure: Trend in incidents 121]

  • Within FAR 121 by Primary Problem:

    • Human Factors: decreasing (p = 0.024; ~1,000/yr; coef ≈ −22).
    • Aircraft: decreasing (p = 0.014; ~1,000/yr; coef ≈ −24.78).
    • Company Policy: decreasing (p < 0.001; coef ≈ −17.23); counts approach zero by 2023; R² ≈ 0.710.
    • Procedure, Weather, Environment (non-weather), ATC Equip/Nav, Charts, Airspace Structure, Software/Automation: no significant trend (given current data coverage).
  • Airspace subset:

    • Clear decreases in Classes A, B, and E and in unspecified airspace.
    • Class C: no significant trend.
    • Class D: marginal increase (p = 0.054; ~50/yr). Restricting to Human Factors yields a significant increase (p = 0.001).

  • FAR 91 (general aviation): Increasing over time (p = 0.016).

  • FAR 135 (charter/air taxi): No significant trend (less data overall).

  • Seasonality (FAR 121, monthly):

    • One-hot months (January as baseline): February is lower than January (likely a short-month effect); other months are mixed.
    • Fourier (cyclical) encoding reveals seasonal structure (γ₂ ≠ 0, p = 0.009).
    • Fit quality modest: adj. R² ~ 0.315–0.317.
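Both the yearly trend fits and the Fourier seasonal fit above are ordinary least squares on different design matrices. The sketch below shows that idea with a small normal-equations solver and synthetic, noise-free data. It is illustrative only: the data, the helper names, and the single-harmonic design `[1, sin, cos]` are assumptions, and the mapping of the README's γ₂ to either term is not stated in the source. It returns coefficients and R² but not p-values, which a statistics package would normally supply.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(X, y):
    """Return (coefficients, R^2) from least squares via the normal equations."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1 - ss_res / ss_tot

# Yearly trend: count ~ b0 + b1 * t, with t = years since the first year.
# Centering on the first year keeps the normal equations well conditioned.
years = list(range(2006, 2025))
yearly = [1000 - 20 * (yr - 2006) for yr in years]  # synthetic, not ASRS data
trend_beta, trend_r2 = ols([[1.0, yr - 2006] for yr in years], yearly)

# Monthly seasonality: count ~ b0 + g1 * sin(2*pi*m/12) + g2 * cos(2*pi*m/12).
months = [m % 12 + 1 for m in range(36)]  # three synthetic years
monthly = [100 + 5 * math.sin(2 * math.pi * m / 12) - 8 * math.cos(2 * math.pi * m / 12)
           for m in months]
X_season = [[1.0, math.sin(2 * math.pi * m / 12), math.cos(2 * math.pi * m / 12)]
            for m in months]
season_beta, season_r2 = ols(X_season, monthly)
```

Compared with one-hot months, the Fourier encoding spends two parameters instead of eleven and treats December and January as neighbors, which is one reason it can expose a seasonal cycle that individual month dummies leave unclear.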

SARIMA (forecasts)

  • FAR 121: Early years (2005–2009) are higher and more volatile; forecasts suggest lower, more stable counts going forward.
    [Figures: SARIMA 121; SARIMA 121 Zoomed; 5-year Forecast]

  • FAR 91: Similar story: historically higher and more volatile; forecasts remain relatively low and stable.
    [Figures: SARIMA 91; 5-year Forecast]
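For reference, the general form of the seasonal ARIMA model behind these forecasts can be written as below. The README does not report the fitted orders, so (p, d, q) and (P, D, Q) are left symbolic, and the seasonal period s = 12 is an assumption based on the monthly series used in the seasonality analysis.

```latex
\phi_p(B)\,\Phi_P(B^{s})\,(1-B)^{d}\,(1-B^{s})^{D}\,y_t
  = \theta_q(B)\,\Theta_Q(B^{s})\,\varepsilon_t, \qquad s = 12
```

Here $y_t$ is the monthly incident count, $B$ is the backshift operator ($B y_t = y_{t-1}$), $\phi_p$ and $\theta_q$ are the non-seasonal autoregressive and moving-average polynomials, $\Phi_P$ and $\Theta_Q$ are their seasonal counterparts, and $\varepsilon_t$ is white noise.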


Key Findings

  • Overall incidents show a downward trend; strongest for FAR 121.
  • Human Factors, Aircraft, Company Policy categories under FAR 121 are decreasing; company-policy incidents approach zero by 2023.
  • Class D airspace shows a marginal increase, significant when focusing on Human Factors—but counts are small; interpret with caution.
  • FAR 91 incidents increase, potentially reflecting reporting or exposure changes.
  • Seasonality exists (notably February lows); month effects explain limited variance.
  • Forecasts (SARIMA) indicate stable/lower incident levels in near term for 121 and 91.

Limitations

  • ASRS is voluntary; reporting practices vary by part/category and over time.
  • Airspace labels and airport anonymization shift across years.
  • The Part 135 sample is smaller, so its tests have less statistical power.

Next Steps

  • Extend category-specific analyses beyond 121 (where feasible).

About

Forecasting project in Python

Languages

  • Jupyter Notebook 100.0%