# Philippine Higher Education Research Outlook (2025–2035)

**A Strategic Forecasting Report for HEI Research Productivity**

---

| Metadata | Value |
|----------|-------|
| **Generated Date** | January 2026 |
| **Author** | IRAP System (Integrated Research Analytics Platform) |
| **Classification** | Executive Briefing Document |

## Abstract

This report provides a comprehensive analysis of research productivity trends across Philippine Higher Education Institutions (HEIs) from 2015 to 2025, with strategic projections extending to 2035. The study employs a **Pandemic-Aware Forecasting Framework** that explicitly accounts for the structural disruption caused by COVID-19 (2020–2022), ensuring that long-term projections are not contaminated by pandemic-induced volatility. Using Holt's Linear Trend method with Simple Moving Average fallback, we project continued growth in research output across most regions, with notable variations in publication quality as measured by Field-Weighted Citation Impact (FWCI).

## Methodology

### 1. The COVID-19 Structural Break

The COVID-19 pandemic (2020–2022) introduced a **structural break** in research productivity data. This period was characterized by:

- Laboratory closures and field research delays
- Transition to remote work affecting collaborative projects
- Publication pipeline disruptions and journal backlogs
- Reallocation of institutional resources to pandemic response

We treat 2020–2022 as a distinct analytical period to isolate pandemic effects from underlying growth trends.

---

### 2. Temporal Segmentation Framework

| Period | Years | Duration | Characterization |
|--------|-------|----------|------------------|
| **Pre-Pandemic** | 2015–2019 | 5 years | Established baseline trends |
| **During Pandemic** | 2020–2022 | 3 years | High volatility / Disruption |
| **Post-Pandemic** | 2023–2025 | 3 years | "New Normal" recovery |
| **Forecast Phase 1** | 2026–2030 | 5 years | Short-term projection |
| **Forecast Phase 2** | 2031–2035 | 5 years | Long-term projection |
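
The segmentation above can be expressed as a small lookup. The following is a minimal sketch: `assign_period_sketch` and `PERIODS` are hypothetical names, not the report's own `assign_period` helper (whose implementation is not shown), and the label strings are assumed to match the period labels in the summary statistics output.

```python
# Hypothetical stand-in for the report's period-labeling helper.
# Each entry is (label, first year, last year), both bounds inclusive.
PERIODS = [
    ("Pre-Pandemic (2015-2019)", 2015, 2019),
    ("During Pandemic (2020-2022)", 2020, 2022),
    ("Post-Pandemic (2023-2025)", 2023, 2025),
    ("Forecast Phase 1 (2026-2030)", 2026, 2030),
    ("Forecast Phase 2 (2031-2035)", 2031, 2035),
]


def assign_period_sketch(year: int) -> str:
    """Return the label of the period that contains `year`."""
    for label, start, end in PERIODS:
        if start <= year <= end:
            return label
    raise ValueError(f"Year {year} is outside the 2015-2035 study window")


print(assign_period_sketch(2021))  # During Pandemic (2020-2022)
print(assign_period_sketch(2026))  # Forecast Phase 1 (2026-2030)
```

Raising on out-of-range years is a design choice of this sketch; it surfaces stray records instead of silently dropping them.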

---

### 3. Algorithm Selection Logic

The forecasting engine dynamically selects the appropriate model based on data density:

```
For each (School, Metric):
    Count non-zero observations in training period (2015–2025)
    
    IF n ≥ 3 non-zero points:
        → Apply Holt's Linear Trend (captures momentum)
    ELSE:
        → Apply Simple Moving Average (conservative estimate)
```
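
The selection rule in the pseudocode can be sketched in Python as follows. The function name `select_model` and the return labels are illustrative assumptions, and the sketch assumes that years with no output are recorded as `0` rather than as missing values.

```python
from typing import Sequence

MIN_NONZERO_POINTS = 3  # threshold stated in the selection rule


def select_model(values: Sequence[float]) -> str:
    """Pick the model for one (school, metric) series.

    Returns "holt" when the series has at least three non-zero
    observations, otherwise "sma".
    """
    nonzero = sum(1 for v in values if v != 0)
    return "holt" if nonzero >= MIN_NONZERO_POINTS else "sma"


print(select_model([0, 0, 12, 15, 19]))  # holt
print(select_model([0, 0, 0, 4, 0]))     # sma
```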

---

### 4. Holt's Linear Trend Method

For institutions with sufficient historical data (≥ 3 non-zero observations), we apply **Holt's Exponential Smoothing** (double exponential smoothing), which decomposes the time series into level and trend components:

**Level Equation:**
$$L_t = \alpha Y_t + (1 - \alpha)(L_{t-1} + T_{t-1})$$

**Trend Equation:**
$$T_t = \beta(L_t - L_{t-1}) + (1 - \beta)T_{t-1}$$

**Forecast Equation:**
$$\hat{Y}_{t+h} = L_t + h \cdot T_t$$

Where:
- $Y_t$ = Observed value at time $t$
- $L_t$ = Estimated level at time $t$
- $T_t$ = Estimated trend at time $t$
- $\alpha$ = Smoothing parameter for level ($0 < \alpha < 1$, optimized via MLE)
- $\beta$ = Smoothing parameter for trend ($0 < \beta < 1$, optimized via MLE)
- $h$ = Forecast horizon (years ahead)

**Implementation:** `statsmodels.tsa.holtwinters.Holt`
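
To make the three equations concrete, here is a minimal pure-Python sketch of the recursion. It is not the `statsmodels` implementation: $\alpha$ and $\beta$ are fixed by hand instead of being estimated, the initialization (level = first observation, trend = first difference) is an assumption, and the input series is a toy example, not report data.

```python
from typing import List, Sequence, Tuple


def holt_fit(y: Sequence[float], alpha: float, beta: float) -> Tuple[float, float]:
    """Run the level and trend recursions over y; return the final (L_t, T_t)."""
    if len(y) < 2:
        raise ValueError("need at least two observations")
    # Assumed initialization: level = first value, trend = first difference.
    level, trend = float(y[0]), float(y[1] - y[0])
    for obs in y[1:]:
        prev_level = level
        # Level equation: blend the observation with the one-step projection.
        level = alpha * obs + (1 - alpha) * (prev_level + trend)
        # Trend equation: blend the latest level change with the old trend.
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level, trend


def holt_forecast(level: float, trend: float, horizon: int) -> List[float]:
    """Forecast equation: L_t + h * T_t for h = 1..horizon."""
    return [level + h * trend for h in range(1, horizon + 1)]


# Toy series with a constant step of 2 (hypothetical, for illustration only).
level, trend = holt_fit([10, 12, 14, 16], alpha=0.5, beta=0.5)
print(level, trend)                     # 16.0 2.0
print(holt_forecast(level, trend, 3))   # [18.0, 20.0, 22.0]
```

On a perfectly linear series the recursion recovers the slope exactly, which is a convenient sanity check before moving to estimated parameters.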

---

### 5. Simple Moving Average (Fallback)

For institutions with sparse data (< 3 non-zero observations), Holt's method risks overfitting or producing unstable forecasts. We apply a **3-period Simple Moving Average**:

$$\hat{Y}_{t+h} = \frac{1}{k}\sum_{i=t-k+1}^{t} Y_i$$

Where:
- $k = \min(3, n)$, and $n$ is the number of available observations

Because the right-hand side does not depend on $h$, the projection is flat across the whole forecast horizon. This keeps the estimate conservative and avoids explosive or negative forecasts.
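
A minimal sketch of the fallback, assuming the series is passed as a plain list in chronological order; the function name `sma_forecast` and the sample values are illustrative only.

```python
from typing import List, Sequence


def sma_forecast(y: Sequence[float], horizon: int, window: int = 3) -> List[float]:
    """Flat forecast equal to the mean of the last k = min(window, n) observations."""
    if not y:
        raise ValueError("need at least one observation")
    k = min(window, len(y))
    mean = sum(y[-k:]) / k
    # The same value is repeated for every forecast year.
    return [mean] * horizon


print(sma_forecast([0, 0, 4, 0, 2], horizon=3))  # [2.0, 2.0, 2.0]
print(sma_forecast([3, 5], horizon=2))           # [4.0, 4.0]
```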

---

### 6. Post-Processing Constraints

| Constraint | Implementation | Rationale |
|------------|----------------|-----------|
| Non-negativity | `max(0, forecast)` | Publication and citation counts cannot be negative |
| Discrete rounding | `round()` for count metrics | Publications and Citations are whole numbers |
| Continuous FWCI | No rounding | Field-Weighted Citation Impact is a calculated ratio |
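
The constraints in the table can be sketched as a single function. One caveat: Python's built-in `round()` rounds halves to the nearest even integer (`round(64.5)` is `64`), whereas the worked example in this report rounds 64.5 up to 65. The sketch therefore assumes half-up rounding is intended; the function name `postprocess` and the `COUNT_METRICS` set are hypothetical.

```python
import math

# Metrics treated as whole-number counts (assumed from the metric names in this report).
COUNT_METRICS = {"Publication Quantity", "Citation Quantity"}


def postprocess(value: float, metric: str) -> float:
    """Clip a raw forecast at zero and round count metrics half-up."""
    clipped = max(0.0, value)
    if metric in COUNT_METRICS:
        # floor(x + 0.5) rounds halves upward, unlike built-in round().
        return float(math.floor(clipped + 0.5))
    return clipped  # FWCI stays continuous


print(postprocess(64.5, "Publication Quantity"))              # 65.0
print(postprocess(-3.2, "Citation Quantity"))                 # 0.0
print(postprocess(-0.0287, "Field-Weighted Citation Impact"))  # 0.0
print(postprocess(1.4589, "Field-Weighted Citation Impact"))   # 1.4589
```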

---

### 7. Concrete Example: Benguet State University

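Consider **Benguet State University (BSU)** in the Cordillera Administrative Region (CAR). The values below illustrate the arithmetic only; the model's fitted BSU forecasts are listed in the Complete Forecast Data table and differ from this example (for instance, 50 publications for 2026).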

| Year | Publications |
|------|--------------|
| 2023 | 45 |
| 2024 | 52 |
| 2025 | 58 |

Since BSU has ≥ 3 non-zero observations, **Holt's Linear Trend** is applied.

After model fitting:
- $L_{2025} = 58$ (estimated level at the end of the training period)
- $T_{2025} = 6.5$ (estimated trend, in publications added per year)

**2026 Forecast:**
$$\hat{Y}_{2026} = L_{2025} + 1 \cdot T_{2025} = 58 + 6.5 = 64.5 \approx 65 \text{ publications}$$

**2030 Forecast** ($h = 5$):
$$\hat{Y}_{2030} = 58 + 5 \times 6.5 = 90.5 \approx 91 \text{ publications}$$
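
The same arithmetic can be tabulated for every year of Forecast Phase 1. This short sketch takes $L_{2025}$ and $T_{2025}$ from the example as given and prints the unrounded values; the variable names are illustrative.

```python
level_2025 = 58.0  # L_2025 from the worked example
trend_2025 = 6.5   # T_2025 from the worked example

# Forecast equation applied for h = 1..5 (years 2026-2030), before rounding.
forecasts = {2025 + h: level_2025 + h * trend_2025 for h in range(1, 6)}

for year, value in forecasts.items():
    print(year, value)
# 2026 64.5
# 2027 71.0
# 2028 77.5
# 2029 84.0
# 2030 90.5
```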

---

> **Disclaimer:** These forecasts are statistical projections based on historical trends. They do not account for policy changes, institutional initiatives, or external economic factors. Treat as indicative scenarios.

In [1]:
# Setup and Data Loading
import sys
import os
import warnings

warnings.filterwarnings('ignore')
sys.path.insert(0, os.path.abspath('..'))

import pandas as pd
import numpy as np

from src.viz_utils import plot_period_geospatial_comparison, PERIOD_ORDER, assign_period

# Load forecast data
df = pd.read_parquet('../data/processed/forecasts.parquet')

# Add Period and Type columns
df['Period'] = df['Year'].apply(assign_period)
df['Type'] = df['Year'].apply(lambda y: 'History' if y <= 2025 else 'Forecast')

print(f"Dataset loaded: {len(df):,} records | {df['School'].nunique()} schools | {df['Region'].nunique()} regions")

Dataset loaded: 3,276 records | 52 schools | 17 regions


## Summary Statistics by Period

In [2]:
# Summary by Period
summary = df.pivot_table(
    index='Period',
    columns='Metric',
    values='Value',
    aggfunc='sum'
).reindex(PERIOD_ORDER)

styled_summary = (
    summary.style
    .format('{:,.0f}')
    .background_gradient(cmap='YlGnBu', axis=0)
    .set_caption('Total Research Output by Period')
)
styled_summary

| Period | Citation Quantity | Field-Weighted Citation Impact | Publication Quantity |
|--------|-------------------|--------------------------------|----------------------|
| Pre-Pandemic (2015-2019) | 25,162 | 125 | 2,278 |
| During Pandemic (2020-2022) | 33,922 | 128 | 3,649 |
| Post-Pandemic (2023-2025) | 16,078 | 138 | 6,766 |
| Forecast Phase 1 (2026-2030) | 32,322 | 376 | 19,652 |
| Forecast Phase 2 (2031-2035) | 37,854 | 547 | 30,390 |
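
The periods span either three or five years, so the totals above are not directly comparable across rows. One way to put them on a common footing is to divide each total by the period length. The sketch below does this for Publication Quantity, using the totals from the summary output and the durations from the segmentation table; the variable names are illustrative. The Field-Weighted Citation Impact column is a sum of ratios over schools and years, so a mean is likely the more natural aggregate there.

```python
# (period total, period length in years) for Publication Quantity,
# copied from the summary output and the segmentation table.
pub_totals = {
    "Pre-Pandemic (2015-2019)": (2278, 5),
    "During Pandemic (2020-2022)": (3649, 3),
    "Post-Pandemic (2023-2025)": (6766, 3),
    "Forecast Phase 1 (2026-2030)": (19652, 5),
    "Forecast Phase 2 (2031-2035)": (30390, 5),
}

# Average annual output per period.
per_year = {period: total / years for period, (total, years) in pub_totals.items()}

for period, value in per_year.items():
    print(f"{period}: {value:,.1f} publications per year")
# Pre-Pandemic (2015-2019): 455.6 publications per year
# During Pandemic (2020-2022): 1,216.3 publications per year
# Post-Pandemic (2023-2025): 2,255.3 publications per year
# Forecast Phase 1 (2026-2030): 3,930.4 publications per year
# Forecast Phase 2 (2031-2035): 6,078.0 publications per year
```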


## Geospatial Analysis

The following animated maps display the regional evolution of research productivity across the five strategic periods. Color intensity and bubble size represent **average annual values**.

### Publication Quantity

In [3]:
fig_pub = plot_period_geospatial_comparison(df, 'Publication Quantity')
fig_pub.show()

### Citation Quantity

In [4]:
fig_cit = plot_period_geospatial_comparison(df, 'Citation Quantity')
fig_cit.show()

### Field-Weighted Citation Impact (FWCI)

In [5]:
fig_fwci = plot_period_geospatial_comparison(df, 'Field-Weighted Citation Impact')
fig_fwci.show()

## Complete Forecast Data (2026–2035)

The table below presents forecasted values for all schools, organized in wide format by metric and year.

In [6]:
# Filter to forecast period only
forecast_df = df[df['Type'] == 'Forecast'].copy()

# Pivot to wide format: columns = Metric_Year
forecast_wide = forecast_df.pivot_table(
    index=['Region Code', 'Region', 'School'],
    columns=['Metric', 'Year'],
    values='Value',
    aggfunc='first'
)

# Flatten MultiIndex columns, e.g. "Publication Quantity 2026"
forecast_wide.columns = [f"{metric} {year}" for metric, year in forecast_wide.columns]
forecast_wide = forecast_wide.reset_index()

# Reorder columns: Region Code, Region, School, then by Metric groups
base_cols = ['Region Code', 'Region', 'School']
pub_cols = [c for c in forecast_wide.columns if 'Publication Quantity' in c]
cit_cols = [c for c in forecast_wide.columns if 'Citation Quantity' in c]
fwci_cols = [c for c in forecast_wide.columns if 'Field-Weighted' in c]

# Sort year columns
pub_cols = sorted(pub_cols, key=lambda x: int(x.split()[-1]))
cit_cols = sorted(cit_cols, key=lambda x: int(x.split()[-1]))
fwci_cols = sorted(fwci_cols, key=lambda x: int(x.split()[-1]))

forecast_wide = forecast_wide[base_cols + pub_cols + cit_cols + fwci_cols]

# Sort by Region Code, then School
forecast_wide = forecast_wide.sort_values(['Region Code', 'School']).reset_index(drop=True)

print(f"Forecast table: {len(forecast_wide)} schools × {len(forecast_wide.columns)} columns")
forecast_wide

Forecast table: 52 schools × 33 columns


|  | Region Code | Region | School | Publication Quantity 2026 | Publication Quantity 2027 | Publication Quantity 2028 | Publication Quantity 2029 | Publication Quantity 2030 | Publication Quantity 2031 | Publication Quantity 2032 | ... | Field-Weighted Citation Impact 2026 | Field-Weighted Citation Impact 2027 | Field-Weighted Citation Impact 2028 | Field-Weighted Citation Impact 2029 | Field-Weighted Citation Impact 2030 | Field-Weighted Citation Impact 2031 | Field-Weighted Citation Impact 2032 | Field-Weighted Citation Impact 2033 | Field-Weighted Citation Impact 2034 | Field-Weighted Citation Impact 2035 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | BARMM | BARMM | Mindanao State University – Tawi-Tawi College ... | 141.0 | 178.0 | 215.0 | 252.0 | 289.0 | 326.0 | 363.0 | ... | 1.458908 | 1.587363 | 1.715817 | 1.844271 | 1.972726 | 2.10118 | 2.229635 | 2.358089 | 2.486544 | 2.614998 |
| 1 | CAR | CAR | Benguet State University | 50.0 | 55.0 | 59.0 | 64.0 | 68.0 | 73.0 | 78.0 | ... | 0.727637 | 0.750273 | 0.772909 | 0.795546 | 0.818182 | 0.840818 | 0.863455 | 0.886091 | 0.908728 | 0.931364 |
| 2 | CAR | CAR | Ifugao State University | 31.0 | 34.0 | 37.0 | 40.0 | 44.0 | 47.0 | 50.0 | ... | 1.049273 | 1.130818 | 1.212363 | 1.293909 | 1.375454 | 1.457 | 1.538545 | 1.620091 | 1.701636 | 1.783181 |
| 3 | CAR | CAR | Mountain Province State Polytechnic College | 13.0 | 15.0 | 16.0 | 18.0 | 19.0 | 20.0 | 22.0 | ... | 0.599818 | 0.656909 | 0.714 | 0.771091 | 0.828182 | 0.885272 | 0.942363 | 0.999454 | 1.056545 | 1.113636 |
| 4 | MIMAROPA | MIMAROPA | Occidental Mindoro State College | 11.0 | 12.0 | 14.0 | 15.0 | 16.0 | 17.0 | 18.0 | ... | 0.482182 | 0.523454 | 0.564727 | 0.606 | 0.647272 | 0.688545 | 0.729818 | 0.771091 | 0.812363 | 0.853636 |
| 5 | MIMAROPA | MIMAROPA | Palawan State University | 14.0 | 15.0 | 16.0 | 17.0 | 18.0 | 19.0 | 19.0 | ... | 0.485636 | 0.399909 | 0.314182 | 0.228454 | 0.142727 | 0.057 | 0.0 | 0.0 | 0.0 | 0.0 |
| 6 | MIMAROPA | MIMAROPA | Romblon State University | 24.0 | 27.0 | 29.0 | 32.0 | 35.0 | 37.0 | 40.0 | ... | 1.246 | 1.359273 | 1.472545 | 1.585818 | 1.699091 | 1.812363 | 1.925636 | 2.038909 | 2.152181 | 2.265454 |
| 7 | MIMAROPA | MIMAROPA | Western Philippines University | 40.0 | 43.0 | 47.0 | 50.0 | 54.0 | 58.0 | 61.0 | ... | 1.176726 | 1.260271 | 1.343816 | 1.427362 | 1.510907 | 1.594452 | 1.677997 | 1.761542 | 1.845088 | 1.928633 |
| 8 | NCR | NCR | Eulogio "Amang" Rodriguez Institute of Science... | 9.0 | 10.0 | 11.0 | 12.0 | 13.0 | 14.0 | 15.0 | ... | 2.925273 | 3.245546 | 3.565818 | 3.886091 | 4.206364 | 4.526637 | 4.846909 | 5.167182 | 5.487455 | 5.807727 |
| 9 | NCR | NCR | Philippine Normal University | 50.0 | 55.0 | 59.0 | 64.0 | 68.0 | 73.0 | 77.0 | ... | 1.054727 | 1.101273 | 1.147818 | 1.194363 | 1.240909 | 1.287454 | 1.334 | 1.380545 | 1.427091 | 1.473636 |


## Strategic Outlook

### Key Insights

1. **Regional Concentration Risk:** NCR continues to dominate national research output. Strategic investments in regional research centers can diversify the national research portfolio.

2. **Quantity vs. Quality Trade-off:** Regions with high publication growth but low FWCI should prioritize quality assurance mechanisms.

3. **Post-Pandemic Recovery:** Total publication output rose from 3,649 in 2020–2022 to 6,766 in 2023–2025, two periods of equal length, which points to resilient institutional capacity.

4. **Emerging Research Hubs:** Regions showing steep growth trajectories represent opportunities for targeted capacity-building.
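
The quantity-versus-quality insight can be turned into a simple screening rule. The sketch below works at school level, using a few rows from the forecast table (publications for 2026 and 2032, FWCI for 2032). Both thresholds are assumptions chosen for illustration: a minimum publication growth of 50% over the window, and an FWCI below 1.0, the value that corresponds to the world average by definition of the metric.

```python
# (school, publications 2026, publications 2032, FWCI 2032), from the forecast table.
rows = [
    ("Benguet State University", 50, 78, 0.863455),
    ("Ifugao State University", 31, 50, 1.538545),
    ("Mountain Province State Polytechnic College", 13, 22, 0.942363),
    ("Palawan State University", 14, 19, 0.0),
    ("Philippine Normal University", 50, 77, 1.334),
]

MIN_GROWTH = 0.50  # assumed threshold: at least 50% publication growth, 2026 to 2032
MAX_FWCI = 1.0     # assumed threshold: FWCI below the world average

# Flag schools that grow quickly in volume while staying below average impact.
flagged = [
    school
    for school, pubs_start, pubs_end, fwci in rows
    if (pubs_end - pubs_start) / pubs_start >= MIN_GROWTH and fwci < MAX_FWCI
]

print(flagged)
# ['Benguet State University', 'Mountain Province State Polytechnic College']
```

Palawan State University is not flagged despite its low FWCI because its projected publication growth falls under the assumed 50% threshold; a separate low-impact screen would be needed to catch that case.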

---

### Recommended Actions

| Priority | Action Item |
|----------|-------------|
| **High** | Establish regional research consortia |
| **High** | Implement FWCI-based incentive structures |
| **Medium** | Develop pandemic-resilient research continuity plans |
| **Low** | Monitor forecast accuracy and recalibrate annually |

---

*This report was generated by the IRAP System. Data sourced from CHED and Scopus databases.*