# Descriptive patterns of early CN/HK → EUROCONTROL flights

Maximilian Elixhauser

In [None]:
source(here::here("R", "00_load_libs.R"))


> Among continental EUROCONTROL member states, is a higher volume of direct inbound flights from China / Hong Kong / Macao (Dec 2019 & Mar 2020) associated with higher cumulative excess mortality on 5 May 2020?

Snapshots for 2021-2023 are analysed as robustness checks.

### Objectives

**1. Measure exposure:** Create a country-level index of inbound IFR flights (Dec 2019, Mar 2020, sum) from EUROCONTROL data.

**2. Rank & correlate:** Compute Spearman ρ between exposure ranks and excess-mortality ranks on 5 May 2020; repeat for 2021-2023.

**3. Appraise validity:** Discuss the strengths and limits of flight counts as a proxy for epidemic seeding.
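The rank correlation in objective 2 can be sketched as follows. This is a minimal illustration on made-up numbers, not the study data: the `toy` data frame, its column names and all values are hypothetical. It only shows that Spearman's ρ is Pearson's r computed on ranks, which is why it is less sensitive to the extreme flight counts of a few hubs.

```r
# Toy data: every value below is invented for illustration only
toy <- data.frame(
  country  = c("A", "B", "C", "D", "E", "F"),
  exposure = c(900, 700, 400, 300, 150, 10),       # hypothetical flight counts
  excess   = c(5.1, 48.3, 30.2, 25.7, 60.4, 2.9)   # hypothetical excess mortality
)

# Spearman's rho by hand: Pearson correlation of the two rank vectors
rho_manual <- cor(rank(toy$exposure), rank(toy$excess))

# Same quantity via the built-in test (exact p-value is feasible: small n, no ties)
rho_test <- cor.test(toy$exposure, toy$excess, method = "spearman")

print(rho_manual)
print(rho_test$estimate)
```

Because only ranks enter the calculation, replacing 900 with 9,000 in `exposure` would leave ρ unchanged.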

# Results: Flight Exposure Patterns

## Top 10 Destination Bar

In [None]:
top10 <- flights_country |>
  slice_max(total_inbound_flights_combined, n = 10) |>
  pivot_longer(
    starts_with("total_"),
    names_to  = "window",
    values_to = "flights"
  ) |>
  mutate(
    window = recode(window,
      total_inbound_flights_dec19    = "Dec 2019",
      total_inbound_flights_mar20    = "Mar 2020",
      total_inbound_flights_combined = "Combined"
    ),
    iso_country = fct_reorder(iso_country, flights, .fun = max)
  )

pal <- c(
  "Dec 2019" = "#E69F00",
  "Mar 2020" = "#56B4E9",
  "Combined" = "#009E73"
)

p_top10 <- ggplot(
  top10,
  aes(iso_country, flights, fill = window)
) +
  geom_col(position = position_dodge(width = .75), width = .65) +
  coord_flip() +
  scale_y_continuous(labels = comma) +
  scale_fill_manual(values = pal, name = NULL) +
  labs(
    title = "Top-10 EUROCONTROL destinations",
    # counts are filed IFR flights and include cargo rotations
    subtitle = "Direct IFR flights from China & Hong Kong*  (Dec 2019 and Mar 2020)",
    x = NULL, y = "Number of flights"
  ) +
  theme_minimal(base_size = 10) +
  theme(
    legend.position = "top",
    legend.justification = "left",
    plot.title.position = "plot",
    axis.text.y = element_text(size = 9)
  )

p_top10


\* Macao is part of the study definition, but the EUROCONTROL records contain zero Macao–Europe flights in the study windows; the label CN/HK and all analyses therefore reflect traffic from mainland China and Hong Kong only.

In December 2019, direct CN/HK traffic into Europe was highly concentrated: Germany (DE), the United Kingdom (GB) and the Netherlands (NL) alone handled about half of all arrivals, with France at a similar level to the Netherlands and Belgium as a secondary hub. Every other EUROCONTROL state received fewer than 300 flights that month. By March 2020 volumes had collapsed everywhere (blue bars), yet the ranking hardly changed: the same three hubs still dominated, each retaining roughly 35–50 % of its December traffic, while Italy, Türkiye and Finland, already mid-tier in December, fell below 25 flights. Combining both months (green bars) therefore produces a strongly right-skewed distribution: the top two countries (DE, GB) each saw \> 900 direct flights, about twice as many as rank #4.

Interpretation: epidemic “import pressure” from direct flights was not evenly spread across Europe; instead, a small group of Western European gateways bore the brunt of exposure in both the initial seeding phase (December 2019) and the onset of pandemic restrictions (March 2020), when coordinated lockdowns and border closures rapidly reduced international flight volumes.
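The concentration described above can be checked with a few lines of base R. This sketch uses the December 2019 counts reported in the collapse table further down (states with at least five December flights only, so shares are relative to that subset); the two-letter codes used as vector names are labels added here for readability.

```r
# December 2019 inbound flights per country, copied from the collapse table
dec19 <- c(
  GE = 14, LU = 8, IT = 257, HU = 27, PL = 28, DK = 77, TR = 192,
  AT = 69, FI = 175, SE = 36, FR = 396, GR = 14, CZ = 44, ES = 126,
  CH = 124, UA = 6, GB = 693, PT = 11, DE = 691, NL = 381, BE = 151
)

# Share of each country, largest first, and the running (cumulative) share
share <- sort(dec19 / sum(dec19), decreasing = TRUE)
print(round(100 * cumsum(share)[1:5], 1))

# Joint share of the three hubs named in the text
top3_hubs <- sum(dec19[c("DE", "GB", "NL")]) / sum(dec19)
print(round(100 * top3_hubs, 1))
```

A cumulative-share vector like this is a quick way to see how few destinations account for most of the traffic before moving on to rank-based statistics.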

## Route diversity

### Panel Plot: Distinct origins per destination (combined & monthly)

In [None]:
top15 <- flows_pairwise |>
  distinct(ADEP, ADES) |>
  count(ADES, name = "n_origins") |>
  slice_max(n_origins, n = 15) |>
  pull(ADES)

combined_df <- flows_pairwise |>
  filter(ADES %in% top15) |>
  distinct(ADEP, ADES) |>
  count(ADES, name = "n_origins") |>
  left_join(airports, by = c(ADES = "icao_code")) |>
  mutate(
    panel = "Combined",
    month = "Combined"
  )

monthly_df <- flights_filtered |>
  filter(ADES %in% top15) |>
  distinct(month, ADEP, ADES) |>
  count(month, ADES, name = "n_origins") |>
  left_join(airports, by = c(ADES = "icao_code")) |>
  mutate(
    month = recode(month, dec19 = "Dec 2019", mar20 = "Mar 2020"),
    panel = "Monthly"
  )

plot_df <- bind_rows(combined_df, monthly_df) |>
  mutate(
    name  = fct_reorder(name, n_origins),
    month = factor(month, levels = c("Combined", "Dec 2019", "Mar 2020")),
    panel = factor(panel, levels = c("Combined", "Monthly"))
  )

pal_fill <- c(
  "Combined" = "#009E73",
  "Dec 2019" = "#E69F00",
  "Mar 2020" = "#56B4E9"
)

panel_plot <- ggplot(plot_df, aes(n_origins, name, fill = month)) +
  geom_col(width = .6, position = position_dodge(width = .6)) +
  facet_grid(. ~ panel,
    scales = "fixed",
    space  = "free_x"
  ) +
  scale_fill_manual(values = pal_fill, name = NULL) +
  scale_x_continuous(expand = c(0, 0)) +
  labs(
    title = "How many different CN/HK airports feed each EUROCONTROL destination?",
    x = "Unique origin airports",
    y = NULL
  ) +
  theme_minimal(base_size = 9) +
  theme(
    strip.text      = element_text(face = "bold"),
    axis.text.y     = element_text(size = 7),
    legend.position = "top",
    panel.spacing.x = unit(1.1, "cm")
  )

panel_plot


Direct links were not only concentrated by volume (see @bar_top10), but also by route diversity. In total, 15 CN/HK airports fed London-Heathrow, 13 fed Paris-CDG and 12 fed Frankfurt between December 2019 and March 2020 (left panel, green bars). Every other EUROCONTROL destination received routes from ≤ 10 origins, and most from fewer than 6. The month-specific panel (right) reveals that this breadth of origins existed already in December (orange) and, despite the March collapse (blue), the major hubs still drew from 4-6 different airports while almost all secondary gateways fell to one or two. **Implication:** the first-wave seeding risk was amplified not just by sheer flight counts but by the variety of source cities funnelling into the same Western European hubs, increasing the chance of multiple independent introductions.

Because both exposure volume and diversity are right-skewed toward a handful of destinations, correlations with excess mortality might be driven by these outliers as much as by a monotone trend across all countries, an issue the correlation analysis addresses with rank-based statistics and robustness checks.
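One simple way to probe whether a few dominant destinations drive a rank correlation is a leave-one-out check. The sketch below is an assumption about how such a check could look, not the analysis used in this document; `loo_spearman()` is a hypothetical helper and both vectors are invented.

```r
# Leave-one-out Spearman: recompute rho with each observation dropped in turn
loo_spearman <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 4)
  vapply(
    seq_along(x),
    function(i) cor(x[-i], y[-i], method = "spearman"),
    numeric(1)
  )
}

# Made-up example: two dominant "hubs" plus eight small destinations
exposure <- c(1000, 900, 40, 35, 30, 22, 18, 12, 9, 5)
outcome  <- c(60, 55, 12, 30, 8, 25, 10, 28, 6, 20)

rho_all   <- cor(exposure, outcome, method = "spearman")
rho_loo   <- loo_spearman(exposure, outcome)
rho_range <- range(rho_loo)

print(rho_all)
print(rho_range)
```

A wide gap between `rho_all` and the lower end of `rho_range` would suggest that single observations carry much of the association.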

### Scatterplot: log volume vs. route diversity (airport level)

In [None]:
dest_stats <- flows_pairwise |>
  group_by(ADES) |>
  summarise(
    total_flights = sum(n_flights),
    n_origins = n_distinct(ADEP),
    .groups = "drop"
  ) |>
  filter(total_flights > 0) |>
  left_join(airports, by = c("ADES" = "icao_code")) |>
  mutate(short_name = str_remove(name, " (International )?Airport$"))

# one-liner Spearman calculation
air_cor <- cor.test(dest_stats$total_flights,
  dest_stats$n_origins,
  method = "spearman", exact = FALSE
)

# round for reporting
rho_air <- round(unname(air_cor$estimate), 2) # 0.86
# eps makes very small p-values print as "<0.001" rather than in scientific notation
p_air <- format.pval(air_cor$p.value, digits = 2, eps = 0.001) # "<0.001"


p_scatter <- ggplot(
  dest_stats,
  aes(x = total_flights, y = n_origins)
) +
  geom_point(size = 3, alpha = .75, colour = "#56B4E9") +
  geom_text_repel(
    data = dest_stats |> filter(total_flights > 200 | n_origins > 8),
    aes(label = short_name), size = 3, max.overlaps = 10
  ) +
  scale_x_log10(labels = comma, expand = c(0.02, 0)) +
  labs(
    title = "Volume vs Route Diversity",
    subtitle = "Each dot = one EUROCONTROL destination airport",
    x = "Total direct flights (Dec 2019 + Mar 2020, log-scale)",
    y = "Distinct CN/HK origin airports",
    caption = glue::glue("Spearman ρ = {rho_air} (p {p_air}); n = {nrow(dest_stats)} airports")
  ) +
  theme_minimal(base_size = 11)

p_scatter


[1] 47

-   **Clear positive association.** Among the 47 EUROCONTROL airports that received at least one direct flight from China/Hong Kong in Dec 2019 or Mar 2020, both flight volume and route diversity increased together (Spearman ρ ≈ 0.86, p \< 0.001). On average, every additional ~100 flights corresponded to **1–2 extra origin airports**.
-   **Right-skewed hierarchy.** Four major hubs, **London Heathrow** (748 flights, 15 origins), **Frankfurt** (638/12), **Amsterdam Schiphol** (577/9) and **Paris-CDG** (469/13), dominated, while most other airports handled \<200 flights and ≤5 origins. Early “import pressure” from China was thus highly concentrated in a small number of Western European gateways.
-   **Cargo outlier – Liège (LGG).** **Liège** stands out with high route diversity (7 origins) for its moderate volume (205 flights), reflecting its status as a **dedicated cargo hub** (notably for FedEx/TNT (\[REF!\])). As the underlying dataset treats cargo and passenger flights equally, this position may **overstate the true risk of passenger-borne seeding at cargo airports**, especially during the pandemic, when cargo volumes were resilient or even increased despite a collapse in passenger traffic \[@fraport2021\].
-   **Limitation.** These results are based on **filed IFR flight plans**, not actual passenger numbers. Each origin is weighted equally, so the plot reflects the *opportunity* for multiple introductions, not the real number of travelers.
-   **Cargo vs. passenger airports.** As revealed by 2020 statistics, passenger traffic at most hubs (e.g., Frankfurt) fell \>70%, but **cargo airports like Liège saw a 23.5% increase in volume**, highlighting a key limitation of using flight counts as a proxy for person-to-person exposure \[@fraport2021\].
-   **Country-level implications.** This skew is echoed at the country level (ρ ≈ 0.53 for “combined” exposure): national rankings are driven largely by these few dominant gateways and are sensitive to the distinction between cargo and passenger operations.

> **Take-away:** Volume and route diversity rose in lock-step only for the largest hubs. Most European destinations were both low-volume **and** low-diversity, limiting their early exposure, while major Western gateways—and key cargo airports—acted as the main “funnels” for potential epidemic seeding events.

## Collapse Bar: Percentage change (countries with ≥ 5 Dec flights)

In [None]:
pct_df <- flight_exposure |>
  transmute(
    country = country_name,
    dec     = total_inbound_flights_dec19,
    mar     = total_inbound_flights_mar20,
    pct     = 100 * (mar - dec) / dec
  ) |>
  filter(dec >= 5) |>
  arrange(pct) |>
  mutate(country = fct_reorder(country, pct))

p_collapse <- ggplot(
  pct_df,
  aes(x = pct, y = country, fill = pct > 0)
) +
  geom_col(width = .72, show.legend = FALSE) +
  scale_fill_manual(values = c(`TRUE` = "#56B4E9", `FALSE` = "#E69F00")) +
  scale_x_continuous(
    limits = c(-105, 5), breaks = seq(-100, 0, 25),
    labels = \(x) paste0(x, "%")
  ) +
  labs(
    title = "March collapse in direct CN/HK → EUROCONTROL flights",
    subtitle = "Countries with ≥ 5 flights in Dec 2019",
    x = "% change  (Mar 2020 vs Dec 2019)", y = NULL
  ) +
  theme_minimal(base_size = 11)

p_collapse


-   **Near-universal contraction.** In March 2020, every EUROCONTROL member with ≥ 5 direct CN/HK flights in December operated fewer of them than in December; **18 of 21 cut at least half, and 8 cut about 90 % or more** (Figure @pct_change_filtered).
-   **Import shut-off confirmed externally.** Independent industry data show that by **December 2020 China–Europe long-haul capacity was ≈ 7 % of 2019, while China’s domestic network had already rebounded to ≈ 93 %** of pre-crisis levels \[@warnocksmith2021, Fig 6-7\]. This echoes the post-April collapse in our combined-flights exposure index and illustrates how effectively Europe was sealed off from fresh seeding risk.
-   **Policy timing.** The cliff-edge aligns with two near-simultaneous border measures: (i) the EU/Schengen external-border stop of **17 March 2020** (@EU2020) and (ii) China’s **“Five-One” rule of 26 March 2020**, limiting each airline to one weekly passenger flight per foreign country (@CAAC2020).
-   **Subsidies couldn’t break the wall.** Even Beijing’s cash incentive of **≈ US\$ 0.003–0.008 per ASK** for international sectors failed to coax carriers back onto Europe routes \[@warnocksmith2021\]. Fiscal sweeteners are powerless when cross-border health rules and quarantine mandates crush demand.
-   **Hubs vs medium markets.** Major transfer states (UK, Germany, Netherlands) still operated roughly 200–350 flights after a 49–66 % cut, whereas medium markets such as Italy, Austria, Denmark and Sweden dropped to single digits. Absolute residual volume, not percentage change, drives any continuing exposure.
-   **Interpretation.** Percentage change captures the policy shock; the residual March 2020 counts show where seeding potential persisted. Downstream mortality correlations therefore need to weight absolute exposure, and discount cargo-heavy hubs, when assessing import pressure.

Note: ASK stands for available seat-kilometre, i.e. one seat flown one kilometre. The incentive therefore paid airlines roughly 0.3–0.8 US cents for every seat-kilometre they offered on international sectors, regardless of whether those seats were filled.
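To give a sense of scale, the per-ASK rate can be converted into a per-flight amount. The seat count and sector length below are hypothetical round numbers chosen for illustration, not figures from the cited source; only the rate range of US\$ 0.003–0.008 per ASK comes from the text.

```r
# Hypothetical aircraft and route (assumptions for illustration only)
seats       <- 300    # seats offered on one flight
distance_km <- 8000   # one-way sector length in kilometres

# Available seat-kilometres generated by a single one-way flight
ask <- seats * distance_km

# Subsidy per flight at the low and high end of the quoted rate range
subsidy_usd <- ask * c(low = 0.003, high = 0.008)
print(subsidy_usd)
```

Under these assumptions a single flight would attract a subsidy in the range of a few thousand to a few tens of thousands of US dollars, independent of its load factor.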

## Full numeric table (all ≥ 5-flight countries)

In [None]:
collapse_tbl <- pct_df |>
  mutate(
    `Dec 2019` = scales::comma(dec),
    `Mar 2020` = scales::comma(mar),
    `% change` = sprintf("%+.0f %%", pct),
    `Δ flights` = scales::comma(mar - dec)
  ) |>
  select(Country = country, `Dec 2019`, `Mar 2020`, `% change`, `Δ flights`)

knitr::kable(
  collapse_tbl,
  caption = "Direct IFR flights (passenger and cargo) from China and Hong Kong to EUROCONTROL countries by destination, Dec 2019 vs. Mar 2020, for states with at least five December flights. Table shows absolute and percentage changes; note that residual flight volumes varied widely despite near-universal contraction.",
  align = c("l", "r", "r", "r", "r")
)


  Country             Dec 2019   Mar 2020   \% change   Δ flights
  ----------------- ---------- ---------- ----------- -----------
  Georgia                   14          0      -100 %         -14
  Luxembourg                 8          0      -100 %          -8
  Italy                    257          9       -96 %        -248
  Hungary                   27          1       -96 %         -26
  Poland                    28          2       -93 %         -26
  Denmark                   77          6       -92 %         -71
  Türkiye                  192         19       -90 %        -173
  Austria                   69          7       -90 %         -62
  Finland                  175         22       -87 %        -153
  Sweden                    36          5       -86 %         -31
  France                   396         74       -81 %        -322
  Greece                    14          3       -79 %         -11
  Czech Republic            44         10       -77 %         -34
  Spain                    126         30       -76 %         -96
  Switzerland              124         33       -73 %         -91
  Ukraine                    6          2       -67 %          -4
  United Kingdom           693        239       -66 %        -454
  Portugal                  11          4       -64 %          -7
  Germany                  691        353       -49 %        -338
  The Netherlands          381        196       -49 %        -185
  Belgium                  151        120       -21 %         -31

  : Direct IFR flights (passenger and cargo) from China and Hong Kong to
  EUROCONTROL countries by destination, Dec 2019 vs. Mar 2020, for
  states with at least five December flights. Table shows absolute and
  percentage changes; note that residual flight volumes varied widely
  despite near-universal contraction.


-   **Every country contracted.** All 21 EUROCONTROL states with ≥ 5 December flights reduced inbound CN/HK service by March 2020; 18 of them cut ≥ 50 % and 8 cut **about 90 % or more**. Germany and the Netherlands (both –49 %) and Belgium (–21 %) are the exceptions.
-   **Absolute exposure still diverged.** Germany, the UK and the Netherlands each retained **roughly 200 – 350 flights** despite large percentage drops, whereas Italy, Austria, Denmark and Sweden fell to **single-digit counts** (see the “Mar 2020” column). Epidemiologically, residual *volume* matters more than the percentage.
-   **Read % with caution.** Extreme figures at tiny bases—e.g. Luxembourg –100 % (8→0), Georgia –100 % (14→0)—signal “service discontinued”, not a large change in risk.
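The contrast between relative and absolute change can be reproduced from a handful of rows of the table above. This is a small base-R sketch using the reported counts for five countries; the column names are chosen here for illustration.

```r
# Selected rows from the table above (Dec 2019 and Mar 2020 counts)
tbl <- data.frame(
  country = c("Luxembourg", "Georgia", "Italy", "United Kingdom", "Germany"),
  dec     = c(8, 14, 257, 693, 691),
  mar     = c(0, 0, 9, 239, 353)
)

tbl$pct   <- 100 * (tbl$mar - tbl$dec) / tbl$dec  # relative change in percent
tbl$delta <- tbl$mar - tbl$dec                    # absolute change in flights

# Sort by residual March volume: the largest percentage drops sit at the bottom
print(tbl[order(tbl$mar, decreasing = TRUE), ])
```

Sorting by the March column rather than by `pct` puts Germany and the United Kingdom first, even though their percentage drops are the smallest of the five.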

## Top-10 drops and increases table

In [None]:
top_tbl <- pct_df |>
  mutate(
    direction = if_else(pct < 0, "Largest drop", "Largest increase"),
    `Dec 2019` = dec,
    `Mar 2020` = mar,
    `% change` = sprintf("%+.0f %%", pct),
    `Δ flights` = mar - dec
  ) |>
  arrange(direction, pct) |>
  group_by(direction) |>
  slice_head(n = 10) |>
  mutate(Rank = row_number()) |>
  ungroup() |>
  select(direction, Rank,
    Country = country,
    `Dec 2019`, `Mar 2020`, `% change`, `Δ flights`
  )
knitr::kable(
  top_tbl,
  caption = "Largest proportional drops and increases in direct CN/HK → EUROCONTROL flights, Mar 2020 vs. Dec 2019, by country (≥ 5 Dec flights). Note: Some 'increases' are due to low baseline volumes.",
  align = c("l", "r", "l", "r", "r", "r", "r"),
  format.args = list(big.mark = ",")
)


  --------------------------------------------------------------------------
  direction       Rank Country       Dec 2019  Mar 2020 \% change  Δ flights
  ------------- ------ ------------ --------- --------- --------- ----------
  Largest drop       1 Georgia             14         0    -100 %        -14

  Largest drop       2 Luxembourg           8         0    -100 %         -8

  Largest drop       3 Italy              257         9     -96 %       -248

  Largest drop       4 Hungary             27         1     -96 %        -26

  Largest drop       5 Poland              28         2     -93 %        -26

  Largest drop       6 Denmark             77         6     -92 %        -71

  Largest drop       7 Türkiye            192        19     -90 %       -173

  Largest drop       8 Austria             69         7     -90 %        -62

  Largest drop       9 Finland            175        22     -87 %       -153

  Largest drop      10 Sweden              36         5     -86 %        -31
  --------------------------------------------------------------------------

  : Largest proportional drops and increases in direct CN/HK →
  EUROCONTROL flights, Mar 2020 vs. Dec 2019, by country (≥ 5 Dec
  flights). Note: Some 'increases' are due to low baseline volumes.


-   **Largest drops**: Hub-and-spoke carriers in Italy, Türkiye, Finland and Denmark virtually shut direct CN/HK traffic (–87 % to –96 %).
-   **Largest “increases”** are statistical artefacts and are excluded here by the ≥ 5-flight filter. Latvia (+400 %) and Lithuania (+200 %) rose from *one* December charter to 3–5 rotations, which is epidemiologically negligible.
-   **Hubs absorb the floor effect.** The UK, Germany and the Netherlands fall outside the ten largest proportional drops because they still ran roughly 200–350 flights; their percentage fall was “only” –49 % to –66 %, but the absolute cut was 185 – 454 flights.

> **Take-away.** Percentage change captures the **policy shock**; the “Mar 2020” column captures the **residual seeding potential**, and “Δ flights” shows the size of the absolute cut. Subsequent excess-mortality analysis should therefore weight *absolute* exposure, not just percentage cuts.

# Optional / Appendix

## Origin-airport bar (top 25)

In [None]:
origin_df <- flights_filtered |>
  count(month, ADEP, name = "flights") |>
  left_join(airports, by = c("ADEP" = "icao_code")) |>
  group_by(ADEP) |>
  mutate(total = sum(flights)) |>
  ungroup() |>
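  # keep the 25 busiest origins with both monthly rows each (not the first 25 rows);
  # airports tied on total share a rank, so ties can add a few extra origins
  filter(dense_rank(desc(total)) <= 25) |>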
  mutate(
    short_name = str_remove(name, " (International )?Airport$"),
    month      = recode(month, dec19 = "Dec 2019", mar20 = "Mar 2020"),
    short_name = fct_reorder(short_name, total)
  )

pal_window <- c(
  "Dec 2019" = "#E69F00",
  "Mar 2020" = "#56B4E9"
)

p_orig <- ggplot(
  origin_df,
  aes(short_name, flights, fill = month)
) +
  geom_col(position = "stack", width = .75) +
  coord_flip() +
  scale_y_continuous(labels = scales::comma) +
  scale_fill_manual(values = pal_window, name = NULL) +
  labs(
    title = "Inbound flight counts by CN/HK origin airport",
    subtitle = "Top 25 origins: December 2019 (orange) vs March 2020 (blue)",
    x = NULL, y = "Number of direct flights"
  ) +
  theme_minimal(base_size = 10) +
  theme(
    legend.position      = "top",
    legend.justification = "left",
    axis.text.y          = element_text(size = 8)
  )

p_orig


The next section quantifies how these residual volumes relate to first-wave excess mortality.