## **Project: Panel Data Methodologies With Application To Macroeconometrics (Inflation Forecasting)**

> ### **Title**: Model Training, Evaluation, and Comparison.


#### **Table of Contents:**
<ul>
<li><a href="#0">Dataset Description and Variable Overview.</a></li>
<li><a href="#1">1. Split dataset.</a></li>
<li><a href="#2.A">2.A. Training the Pooled OLS model.</a></li>
<li><a href="#2.B">2.B. Training the Fixed Effects model.</a></li>
<li><a href="#2.C">2.C. Training the Random Effects model.</a></li>
<li><a href="#3.A">3.A. Wald test: Pooled OLS vs Fixed Effects.</a></li>
<li><a href="#3.B">3.B. Hausman test: Fixed vs Random Effects.</a></li>
<li><a href="#4.A">4.A. Heteroskedasticity tests.</a></li>
<li><a href="#4.B">4.B. Serial correlation test.</a></li>
<li><a href="#4.C">4.C. Pesaran CD test.</a></li>
<li><a href="#4.D">4.D. Levin, Lin & Chu test.</a></li>
<li><a href="#4.E">4.E. Correcting standard errors.</a></li>
<li><a href="#5">5. The Dynamic Panel (GMM) model.</a></li>
<li><a href="#5.A">5.A. Build the formula string.</a></li>
<li><a href="#5.B">5.B. Estimate the GMM model.</a></li>
<li><a href="#5.C">5.C. Diagnostics for the GMM model.</a></li>
<li><a href="#6.">6. Model residuals.</a></li>

</ul>




<a id='0'></a>

#### Dataset Description and Variable Overview:

The dataset contains annual macroeconomic data for **83 countries** over the period **1980–2024**, matching the balanced panel reported by the models below (n = 83, T = 45, N = 3735). Most variables are sourced from the **IMF’s World Economic Outlook (WEO)**, except **TRWMA**, which is derived from the **World Bank**. The target variable is **PCPIPCH** (inflation, average consumer prices). The variables used are listed below:

---

> #### **Inflation & Price Stability**

| **Variable Code** | **Description**                                    | **Units**                                         |
| ----------------- | -------------------------------------------------- | ------------------------------------------------- |
| **PCPIPCH**       | Inflation, average consumer prices `(Target)`      | Percent change                                    |

---

> #### **Public Finance**
| **Variable Code** | **Description**                                    | **Units**                                         |
| ----------------- | -------------------------------------------------- | ------------------------------------------------- |
| GGSB_NPGDP        | General government structural balance              | Percent of potential GDP                          |
| GGXWDG_NGDP       | General government gross debt                      | Percent of GDP                                    |

---


> #### **Economic Output & Productivity**
| **Variable Code** | **Description**                                    | **Units**                                         |
| ----------------- | -------------------------------------------------- | ------------------------------------------------- |
| PPPPC             | Gross domestic product per capita, current prices  | Purchasing power parity; international dollars    |

---

> #### **International Trade & Balance**
| **Variable Code** | **Description**                                    | **Units**                                         |
| ----------------- | -------------------------------------------------- | ------------------------------------------------- |
| TM_RPCH           | Volume of imports of goods and services            | Percent change                                    |
| BCA_NGDPD         | Current account balance                            | Percent of GDP                                    |

---

> #### **Savings & Investment**
| **Variable Code** | **Description**                                    | **Units**                                         |
| ----------------- | -------------------------------------------------- | ------------------------------------------------- |
| NID_NGDP          | Total investment                                   | Percent of GDP                                    |

---

> #### **Country Metadata**

| **Variable Code** | **Description**                                    | **Units**                                         |
| ----------------- | -------------------------------------------------- | ------------------------------------------------- |
| WEO_Country_Code  | IMF WEO numeric ID for each country                | ID                                                |
| Country           | Country name (83 countries)                        | String                                            |
| Advanced_Country  | Is the country developed (1) or developing (0)?    | Boolean                                           |
| Year              | Year of observation, 1980 to 2024                  | Year                                              |

---


#### **1. Inflation & Price Stability**

| Variable Code | Term | Explanation | Effect on Inflation |
| ---------------- | ---------------- | --------------------------------- | --------------------------------------------------- |
| **PCPIPCH** | Inflation (CPI) | Inflation rate based on average consumer prices; a key indicator of price stability. | The target variable; it directly measures how fast prices rise. |

---

#### **2. Public Finance**

| Variable Code | Term | Explanation | Effect on Inflation |
| ---------------- | ---------------- | --------------------------------- | --------------------------------------------------- |
| **GGSB_NPGDP** | Structural Budget Balance | The budget balance net of business-cycle effects. | A structural surplus signals contractionary fiscal policy, which tends to reduce inflation. |
| **GGXWDG_NGDP** | Gross Government Debt (% of GDP) | Public debt as a share of GDP; reflects the government's fiscal burden. | Higher debt may force the government into future monetary expansion, which raises inflation. |


---


#### **3. Economic Output & Productivity**

| Variable Code | Term | Explanation | Effect on Inflation |
| ---------------- | ---------------- | --------------------------------- | --------------------------------------------------- |
| **PPPPC** | GDP per Capita (PPP) | Output per person measured at purchasing power parity. | A higher value indicates greater purchasing power, which may push prices up. |

---


#### **4. International Trade & Balance**

| Variable Code | Term | Explanation | Effect on Inflation |
| ---------------- | ---------------- | --------------------------------- | --------------------------------------------------- |
| **TM_RPCH** | Import Volume Growth | Growth in the volume of imports. | More imports provide cheaper alternatives and tend to reduce inflation. |
| **BCA_NGDPD** | Current Account Balance (% GDP) | Current account balance as a share of GDP. | A current account surplus reflects foreign-currency inflows, which supports price stability. |


---

#### **5. Savings & Investment**

| Variable Code | Term | Explanation | Effect on Inflation |
| ---------------- | ---------------- | --------------------------------- | --------------------------------------------------- |
| **NID_NGDP** | Gross Capital Formation | Total investment as a share of GDP. | Greater investment may raise output in the long run, which reduces inflation. |

---

#### **6. Country Metadata**

| Variable Code | Term | Explanation | Effect on Inflation |
| ---------------- | ---------------- | --------------------------------- | --------------------------------------------------- |
| **WEO_Country_Code** | Country ID | Unique numeric identifier for each country. | - |
| **Country** | Country Name | Name of the country. | - |
| **Advanced_Country** | Development Status | Advanced (1) or developing (0). | - |
| **Year** | Year | Year between 1980 and 2024. | - |

---



In [1180]:
#install.packages("modelsummary")

**Import Library**

In [1181]:
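# Load required libraries
library(dplyr)
library(readxl)
library(car)
library(gplots)
library(plm)       # panel models, phtest(), pbgtest(), pcdtest(), purtest(), pgmm()

library(tidyverse)
library(corrplot)

library(Metrics)   # rmse()
library(caret)     # R2() (squared correlation by default)
library(lmtest)    # bptest(), waldtest(), coeftest()

library(sandwich)

library(urca)      # unit-root tests (the Levin, Lin & Chu test below uses plm::purtest)

library(AER)       # applied econometrics utilities (the Hausman test below uses plm::phtest)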


**Load Dataset**

In [1182]:

# Load the dataset.
df <- read.csv("../02-Dataset/01.3.1-Data_Clean.csv")

# Show the dimensions and the first 6 rows (the head() default).
dim(df)
head(df)

| | WEO_Country_Code | Country | Advanced_Country | Year | PCPIPCH | GGSB_NPGDP | GGXWDG_NGDP | NGDP_RPCH | PPPPC | TM_RPCH | BCA_NGDPD | NID_NGDP |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | `<int>` | `<chr>` | `<int>` | `<int>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` |
| 1 | 614 | Angola | 0 | 1980 | 46.708 | 1.371915 | 41.49508 | 2.406 | 1510.407 | -39.0 | 0.8 | 12.122 |
| 2 | 614 | Angola | 0 | 1981 | 1.391 | 1.35092 | 21.43177 | -4.4 | 1538.971 | 17.3 | -2.241 | 13.148 |
| 3 | 614 | Angola | 0 | 1982 | 1.833 | 1.318452 | 17.74538 | 5.232924 | 1591.954 | 1.7 | -6.053 | 14.916 |
| 4 | 614 | Angola | 0 | 1983 | 1.833 | 1.26541 | 17.70532 | 4.2 | 1679.796 | -2.7 | -4.22 | 12.381 |
| 5 | 614 | Angola | 0 | 1984 | 1.833 | 1.193839 | 13.12216 | 6.0 | 1797.332 | 2.2 | -1.892 | 12.347 |
| 6 | 614 | Angola | 0 | 1985 | 1.833 | 1.251438 | 15.59482 | 3.5 | 1703.032 | -9.4 | 1.754 | 9.45 |


In [1183]:
# Drop "WEO_Country_Code"
df$WEO_Country_Code <- NULL

str(df)

'data.frame':	3735 obs. of  11 variables:
 $ Country         : chr  "Angola" "Angola" "Angola" "Angola" ...
 $ Advanced_Country: int  0 0 0 0 0 0 0 0 0 0 ...
 $ Year            : int  1980 1981 1982 1983 1984 1985 1986 1987 1988 1989 ...
 $ PCPIPCH         : num  46.71 1.39 1.83 1.83 1.83 ...
 $ GGSB_NPGDP      : num  1.37 1.35 1.32 1.27 1.19 ...
 $ GGXWDG_NGDP     : num  41.5 21.4 17.7 17.7 13.1 ...
 $ NGDP_RPCH       : num  2.41 -4.4 5.23 4.2 6 ...
 $ PPPPC           : num  1510 1539 1592 1680 1797 ...
 $ TM_RPCH         : num  -39 17.3 1.7 -2.7 2.2 -9.4 -6.5 -1.5 26.9 -22 ...
 $ BCA_NGDPD       : num  0.8 -2.24 -6.05 -4.22 -1.89 ...
 $ NID_NGDP        : num  12.1 13.1 14.9 12.4 12.3 ...


In [1184]:
### Display Descriptive Statistics
summary(df)


   Country          Advanced_Country      Year         PCPIPCH        
 Length:3735        Min.   :0.0000   Min.   :1980   Min.   : -66.471  
 Class :character   1st Qu.:0.0000   1st Qu.:1991   1st Qu.:   2.070  
 Mode  :character   Median :0.0000   Median :2002   Median :   4.185  
                    Mean   :0.4337   Mean   :2002   Mean   :  25.881  
                    3rd Qu.:1.0000   3rd Qu.:2013   3rd Qu.:   9.293  
                    Max.   :1.0000   Max.   :2024   Max.   :7481.691  
   GGSB_NPGDP         GGXWDG_NGDP         NGDP_RPCH           PPPPC         
 Min.   :-413.7348   Min.   :-3149.28   Min.   :-28.759   Min.   :   274.6  
 1st Qu.:  -4.2297   1st Qu.:   29.35   1st Qu.:  1.472   1st Qu.:  7667.6  
 Median :  -2.2078   Median :   47.23   Median :  3.202   Median : 14834.4  
 Mean   :  -2.7760   Mean   :   58.69   Mean   :  3.163   Mean   : 20691.6  
 3rd Qu.:  -0.2389   3rd Qu.:   69.66   3rd Qu.:  5.221   3rd Qu.: 28010.8  
 Max.   : 125.1350   Max.   : 8287.82   M

In [1185]:
# panel data
panel_df <- pdata.frame(df, index = c("Country", "Year"))
head(panel_df)


| | Country | Advanced_Country | Year | PCPIPCH | GGSB_NPGDP | GGXWDG_NGDP | NGDP_RPCH | PPPPC | TM_RPCH | BCA_NGDPD | NID_NGDP |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | `<fct>` | `<int>` | `<fct>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` |
| Angola-1980 | Angola | 0 | 1980 | 46.708 | 1.371915 | 41.49508 | 2.406 | 1510.407 | -39.0 | 0.8 | 12.122 |
| Angola-1981 | Angola | 0 | 1981 | 1.391 | 1.35092 | 21.43177 | -4.4 | 1538.971 | 17.3 | -2.241 | 13.148 |
| Angola-1982 | Angola | 0 | 1982 | 1.833 | 1.318452 | 17.74538 | 5.232924 | 1591.954 | 1.7 | -6.053 | 14.916 |
| Angola-1983 | Angola | 0 | 1983 | 1.833 | 1.26541 | 17.70532 | 4.2 | 1679.796 | -2.7 | -4.22 | 12.381 |
| Angola-1984 | Angola | 0 | 1984 | 1.833 | 1.193839 | 13.12216 | 6.0 | 1797.332 | 2.2 | -1.892 | 12.347 |
| Angola-1985 | Angola | 0 | 1985 | 1.833 | 1.251438 | 15.59482 | 3.5 | 1703.032 | -9.4 | 1.754 | 9.45 |


In [1186]:
# # Apply log transformation: log((x + 1) - min(x)) within each country
# panel_df <- panel_df %>%
#   group_by(Country) %>%
#   mutate(
#     PCPIPCH = log((PCPIPCH + 1) - min(PCPIPCH))
#   ) %>%
#   ungroup()

# # Summary statistics
# summary(panel_df$PCPIPCH)

# # Optional: first difference of PCPIPCH within each country
# panel_df <- panel_df %>%
#   group_by(Country) %>%
#   mutate(PCPIPCH_diff = PCPIPCH - dplyr::lag(PCPIPCH)) %>%
#   ungroup()

# # View result
# summary(panel_df$PCPIPCH_diff)

<a id='1'></a>

### **1. Split dataset:**

In [1187]:
# ========== Split data for out-of-sample forecasting ==========
# Define dependent and independent variables
y <- panel_df[, "PCPIPCH"]  # Inflation rate (average consumer prices, % change)
X_vars <- c(
    # 2. Public Finance
    "GGSB_NPGDP",        # General government structural balance (% of potential GDP)
    "GGXWDG_NGDP",       # General government gross debt (% of GDP)

    # 3. Economic Output & Productivity
    "PPPPC",             # GDP per capita, PPP (international dollars)

    # 4. International Trade & Balance
    "TM_RPCH",           # Import volume growth (% change)
    "BCA_NGDPD",         # Current account balance (% of GDP)

    # 5. Savings & Investment
    "NID_NGDP",          # Total investment (% of GDP)

    # 6. Metadata
    "Advanced_Country"   # 1 = advanced economy, 0 = developing
)

X <- panel_df[, X_vars]

# Define evaluation function (RMSE and R-squared)
evaluate <- function(y_true, y_pred) {
  rmse <- sqrt(mean((y_true - y_pred)^2))                  # Root Mean Squared Error
  ss_total <- sum((y_true - mean(y_true))^2)               # Total Sum of Squares
  ss_res <- sum((y_true - y_pred)^2)                       # Residual Sum of Squares
  r2 <- 1 - (ss_res / ss_total)                            # R-squared
  return(list(rmse = rmse, r2 = r2))
}
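
A short usage sketch for `evaluate()`, on made-up numbers (the vectors below are illustrative, not taken from the dataset). It also shows why the `1 - SSE/SST` definition used here can differ from a squared-correlation R², which is what `caret::R2()` reports by default: predictions that are perfectly correlated with the truth but biased get a squared correlation of 1 and a negative `1 - SSE/SST`.

```r
# Same definition as evaluate() above, repeated so the sketch runs on its own.
evaluate <- function(y_true, y_pred) {
  rmse     <- sqrt(mean((y_true - y_pred)^2))
  ss_total <- sum((y_true - mean(y_true))^2)
  ss_res   <- sum((y_true - y_pred)^2)
  list(rmse = rmse, r2 = 1 - ss_res / ss_total)
}

# Illustrative values only (not taken from the dataset).
y_true <- c(2, 4, 6, 8, 10)
y_pred <- y_true + 5               # perfectly correlated, but biased upward by 5

res     <- evaluate(y_true, y_pred)
r2_corr <- cor(y_true, y_pred)^2   # squared-correlation version of R-squared

print(res$rmse)   # 5
print(res$r2)     # 1 - 125/40 = -2.125
print(r2_corr)    # 1
```

The two measures answer different questions, so it helps to state which one a reported R² refers to.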


In [1188]:
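# Convert Year (a factor after pdata.frame()) to numeric so it can be compared
panel_df$Year <- as.numeric(as.character(panel_df$Year))

# Split data into train and test.
# NOTE: the training filter keeps every year (1980-2024), so the 2015-2024 test
# window below is also part of the training sample. Metrics computed on `test`
# are therefore not strictly out-of-sample; filtering train with Year < 2015
# would give a non-overlapping split.
train <- panel_df %>% 
  as.data.frame() %>%
  filter(Year <= 2024) %>%
  pdata.frame(index = c("Country", "Year"))

test <- panel_df %>% 
  as.data.frame() %>%
  filter(Year >= 2015) %>%
  pdata.frame(index = c("Country", "Year"))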

# Define dependent and independent variables
y_train <- train$PCPIPCH
X_train <- train[, X_vars]

y_test <- test$PCPIPCH
X_test <- test[, X_vars]

# Combine y and X for training model
train_model_df <- train[, c("PCPIPCH", X_vars)]

test_model_df <- test[, c("PCPIPCH", X_vars)]
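
The training filter in this cell keeps every year up to 2024, so the 2015–2024 test window is also part of the training sample. A minimal sketch of a non-overlapping split follows, on a small made-up data frame. The 2015 cutoff mirrors the test window used here, while `toy` and `split_by_year()` are hypothetical names introduced for illustration.

```r
# Hypothetical helper: split a long-format panel at a cutoff year.
# Rows with Year <  cutoff go to train, rows with Year >= cutoff go to test.
split_by_year <- function(data, cutoff, year_col = "Year") {
  list(
    train = data[data[[year_col]] <  cutoff, , drop = FALSE],
    test  = data[data[[year_col]] >= cutoff, , drop = FALSE]
  )
}

# Toy panel: 2 countries x 6 years (values are illustrative only).
toy <- expand.grid(Year = 2012:2017, Country = c("A", "B"))
toy$PCPIPCH <- seq_len(nrow(toy))

parts <- split_by_year(toy, cutoff = 2015)

# No year appears in both parts, and no rows are lost.
stopifnot(length(intersect(parts$train$Year, parts$test$Year)) == 0)
stopifnot(nrow(parts$train) + nrow(parts$test) == nrow(toy))

print(range(parts$train$Year))  # 2012 2014
print(range(parts$test$Year))   # 2015 2017
```

Applied to this notebook, the equivalent change would be `filter(Year < 2015)` for `train`; all estimates and metrics would then need to be re-run.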


<a id='2.A'></a>

### **2.A. Training the Pooled OLS model:**


In [1189]:
# ===============================
# A. Pooled OLS model

# Build the model formula; "-1" removes the intercept
formula_str <- as.formula(paste("PCPIPCH ~ -1 +", paste(X_vars, collapse = " + ")))

# Fit the Pooled OLS model
pooled_model <- plm(formula_str, data = train_model_df, model = "pooling", index = c("Country", "Year"))

# Display model summary
summary(pooled_model)

# Predict on the test window
pooled_preds <- predict(pooled_model, newdata = X_test)

# Evaluate model performance.
# rmse() is from Metrics; R2() is from caret and by default returns the squared
# correlation between predictions and observations, not 1 - SSE/SST.
pooled_rmse <- rmse(y_test, pooled_preds)
pooled_r2   <- R2(pooled_preds, y_test)

# Print results
cat(sprintf("Pooled OLS RMSE (out-of-sample): %.4f\n", pooled_rmse))
cat(sprintf("Pooled OLS R² (out-of-sample): %.4f\n", pooled_r2))


Pooling Model

Call:
plm(formula = formula_str, data = train_model_df, model = "pooling", 
    index = c("Country", "Year"))

Balanced Panel: n = 83, T = 45, N = 3735

Residuals:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 -617.7   -38.1   -12.1    -2.5    12.4  4173.7 

Coefficients:
                    Estimate  Std. Error  t-value  Pr(>|t|)    
GGSB_NPGDP       -6.1224e+00  3.0602e-01 -20.0067 < 2.2e-16 ***
GGXWDG_NGDP       6.1289e-01  1.8060e-02  33.9368 < 2.2e-16 ***
PPPPC            -7.7096e-04  1.6314e-04  -4.7259 2.376e-06 ***
TM_RPCH           1.3920e+00  1.9627e-01   7.0924 1.569e-12 ***
BCA_NGDPD         2.7755e+00  3.6278e-01   7.6506 2.531e-14 ***
NID_NGDP         -6.9960e-02  1.5799e-01  -0.4428 0.6579326    
Advanced_Country -2.4517e+01  6.3491e+00  -3.8615 0.0001146 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Total Sum of Squares:    170760000
Residual Sum of Squares: 87388000
R-Squared:      0.48839
Adj. R-Squared: 0.48757
F-statistic

Pooled OLS RMSE (out-of-sample): 43.1934
Pooled OLS R² (out-of-sample): 0.0126


<a id='2.B'></a>

### **2.B. Training the Fixed Effects model:**

In [1190]:
# Advanced_Country is constant within each country, so the within (fixed effects)
# transformation would absorb it. Drop it from the regressors before fitting.
X_vars <- setdiff(X_vars, "Advanced_Country")

X_train <- X_train[, X_vars]
X_test  <- X_test[, X_vars]

train_model_df <- train_model_df[, c("PCPIPCH", X_vars)]
test_model_df  <- test_model_df[, c("PCPIPCH", X_vars)]


# Create formula for model
formula_str <- as.formula(paste("PCPIPCH ~ -1 +", paste(X_vars, collapse = " + " )))
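
`Advanced_Country` is left out of the fixed effects specification because it does not vary over time within a country. A minimal base-R sketch of the reason, on made-up data: the within transformation subtracts each country's mean, which turns any time-invariant column into zeros, so its coefficient cannot be estimated. The data frame `toy` and the helper `demean_by()` are hypothetical.

```r
# Hypothetical helper: subtract group means (the "within" transformation).
demean_by <- function(x, group) x - ave(x, group)

# Toy panel: 3 countries x 4 years (illustrative values only).
toy <- data.frame(
  Country          = rep(c("A", "B", "C"), each = 4),
  Advanced_Country = rep(c(1, 0, 1), each = 4),   # constant within country
  NID_NGDP         = c(20, 21, 23, 22, 15, 14, 16, 18, 30, 29, 31, 33)
)

adv_within <- demean_by(toy$Advanced_Country, toy$Country)
nid_within <- demean_by(toy$NID_NGDP, toy$Country)

print(adv_within)  # all zeros: no within-country variation left
print(nid_within)  # non-zero: time-varying regressors survive
```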


In [1191]:
# ===============================
# B. Fixed Effects model

# Fit Fixed Effects (within)
fe_model <- plm(formula_str, data = train_model_df, model = "within", index = c("Country", "Year"))

# Display model summary
summary(fe_model)

# Predict using the model
fe_preds <- predict(fe_model, newdata = X_test)

# Evaluate model performance (R2() is caret's squared correlation)
fe_rmse <- rmse(y_test, fe_preds)
fe_r2   <- R2(fe_preds, y_test)

# Print results
cat(sprintf("Fixed Effects RMSE (out-of-sample): %.4f\n", fe_rmse))
cat(sprintf("Fixed Effects R² (out-of-sample): %.4f\n", fe_r2))


Oneway (individual) effect Within Model

Call:
plm(formula = formula_str, data = train_model_df, model = "within", 
    index = c("Country", "Year"))

Balanced Panel: n = 83, T = 45, N = 3735

Residuals:
       Min.     1st Qu.      Median     3rd Qu.        Max. 
-5.9478e+02 -1.9878e+01 -8.9609e-15  1.5585e+01  3.9129e+03 

Coefficients:
               Estimate  Std. Error  t-value  Pr(>|t|)    
GGSB_NPGDP  -6.70888686  0.31256964 -21.4637 < 2.2e-16 ***
GGXWDG_NGDP  0.61794044  0.01806620  34.2042 < 2.2e-16 ***
PPPPC       -0.00078662  0.00018422  -4.2699 2.005e-05 ***
TM_RPCH      1.32716446  0.19447849   6.8242 1.031e-11 ***
BCA_NGDPD    3.32534046  0.45149230   7.3652 2.176e-13 ***
NID_NGDP     1.85277605  0.42650408   4.3441 1.437e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Total Sum of Squares:    156630000
Residual Sum of Squares: 75850000
R-Squared:      0.51575
Adj. R-Squared: 0.50406
F-statistic: 647.198 on 6 and 3646 DF, p-value: < 2.22e-16

Fixed Effects RMSE (out-of-sample): 61.3283
Fixed Effects R² (out-of-sample): 0.0209


<a id='2.C'></a>

### **2.C. Training the Random Effects model:**

In [1192]:
# ===============================
# C. Random Effects model

# Fit the Random Effects model (plm's default Swamy-Arora variance components)
re_model <- plm(formula_str, data = train_model_df, model = "random", index = c("Country", "Year"))

# Alternative variance-component estimators:
# re_model <- plm(formula_str, data = train_model_df, model = "random", 
#                 random.method = "amemiya", index = c("Country", "Year"))

# re_model <- plm(formula_str, data = train_model_df, model = "random", 
#                 random.method = "walhus", index = c("Country", "Year"))

# Display model summary
summary(re_model)

# Predict on the test window
re_preds <- predict(re_model, newdata = X_test)

# Evaluate model performance (R2() is caret's squared correlation)
re_rmse <- rmse(y_test, re_preds)
re_r2   <- R2(re_preds, y_test)

# Print results
cat(sprintf("Random Effects RMSE (out-of-sample): %.4f\n", re_rmse))
cat(sprintf("Random Effects R² (out-of-sample): %.4f\n", re_r2))


Oneway (individual) effect Random Effect Model 
   (Swamy-Arora's transformation)

Call:
plm(formula = formula_str, data = train_model_df, model = "random", 
    index = c("Country", "Year"))

Balanced Panel: n = 83, T = 45, N = 3735

Effects:
                   var  std.dev share
idiosyncratic 20803.68   144.23 0.889
individual     2608.19    51.07 0.111
theta: 0.612

Residuals:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 -631.3   -30.9   -10.9    -5.2     8.0  4021.2 

Coefficients:
               Estimate  Std. Error  z-value  Pr(>|z|)    
GGSB_NPGDP  -6.43677041  0.30699763 -20.9668 < 2.2e-16 ***
GGXWDG_NGDP  0.61929603  0.01792697  34.5455 < 2.2e-16 ***
PPPPC       -0.00109229  0.00016231  -6.7298 1.699e-11 ***
TM_RPCH      1.32960621  0.19342557   6.8740 6.243e-12 ***
BCA_NGDPD    2.94257875  0.42500816   6.9236 4.404e-12 ***
NID_NGDP     0.19640225  0.24296244   0.8084    0.4189    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Total Sum of Squares:  

Random Effects RMSE (out-of-sample): 45.8641
Random Effects R² (out-of-sample): 0.0060


<a id='3.A'></a>

### **3.A. Wald test: Pooled OLS vs Fixed Effects:**

In [1193]:
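# ===============================
# Wald test: Pooled OLS vs Fixed Effects
# ===============================
# NOTE: lmtest::waldtest() compares the coefficient sets of the two fitted models.
# pooled_model contains Advanced_Country and fe_model does not, so the single
# restriction tested here (Df = -1) is the Advanced_Country coefficient, not the
# joint significance of the country effects. plm::pFtest() on a within model and a
# pooling model with the same regressors is the standard F test for those effects.

comparison <- waldtest(pooled_model, fe_model, test = "F")
cat("Wald Test (F-test) for joint significance of entity effects:\n")
comparison

# Extract the p-value from the comparison table (second row, "Pr(>F)" column)
wald_pval <- comparison[2, "Pr(>F)"]
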
# Decision rule
if (wald_pval < 0.05) {
  cat("Wald test suggests Fixed Effects preferred.\n")
} else {
  cat("Wald test suggests Pooled OLS preferred.\n")
}


Wald Test (F-test) for joint significance of entity effects:


| | Res.Df | Df | F | Pr(>F) |
| --- | --- | --- | --- | --- |
| | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` |
| 1 | 3728 | | | |
| 2 | 3646 | -1.0 | 14.91109 | 0.0001146103 |


Wald test suggests Fixed Effects preferred.
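
In the table above, `F = 14.91` with `Df = -1` equals the square of the `Advanced_Country` t-value (−3.8615) in the pooled model, so this comparison reflects one dropped regressor rather than the country effects. The usual F test for individual effects compares pooled OLS against the same regressors plus one intercept per country; in `plm` that test is available as `pFtest()`. The sketch below shows the equivalent computation in base R on simulated data. The data-generating values and the names `sim` and `alpha` are assumptions for illustration.

```r
set.seed(123)

# Simulated panel: 10 countries x 20 years with country-specific intercepts.
n_country <- 10
n_year    <- 20
sim <- data.frame(
  Country = factor(rep(seq_len(n_country), each = n_year)),
  x       = rnorm(n_country * n_year)
)
alpha <- rnorm(n_country, sd = 3)                 # assumed country effects
sim$y <- alpha[as.integer(sim$Country)] + 0.5 * sim$x + rnorm(nrow(sim))

pooled <- lm(y ~ x, data = sim)                   # common intercept
lsdv   <- lm(y ~ x + Country, data = sim)         # one dummy per country

# F test of H0: all country effects are equal (pooled OLS is adequate).
f_test <- anova(pooled, lsdv)
print(f_test)

p_val <- f_test[2, "Pr(>F)"]
if (p_val < 0.05) {
  cat("Country effects are jointly significant: fixed effects preferred.\n")
} else {
  cat("No evidence of country effects: pooled OLS is adequate.\n")
}
```

The number of restrictions in this test is the number of countries minus one, which is a quick way to confirm that the comparison is testing the entity effects.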


<a id='3.B'></a>

### **3.B. Hausman test: fixed vs random effects:**

In [1194]:
# ===============================
# Hausman test: fixed vs random effects
hausman_test <- phtest(fe_model, re_model)
print(hausman_test)

haus_pval <- hausman_test$p.value

if (haus_pval < 0.05) {
  cat("Hausman test suggests Fixed Effects preferred.\n")
} else {
  cat("Hausman test suggests Random Effects preferred.\n")
}


	Hausman Test

data:  formula_str
chisq = 33.515, df = 6, p-value = 8.346e-06
alternative hypothesis: one model is inconsistent

Hausman test suggests Fixed Effects preferred.
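
For reference, the statistic reported by `phtest()` is the standard Hausman contrast between the two coefficient vectors. Writing $\hat\beta_{FE}$ and $\hat\beta_{RE}$ for the fixed and random effects estimates of the $k$ common coefficients (notation introduced here, not in the original notebook):

$$
H = (\hat\beta_{FE} - \hat\beta_{RE})^{\top}
\bigl[\widehat{\mathrm{Var}}(\hat\beta_{FE}) - \widehat{\mathrm{Var}}(\hat\beta_{RE})\bigr]^{-1}
(\hat\beta_{FE} - \hat\beta_{RE}) \;\sim\; \chi^2_k \quad \text{under } H_0 .
$$

Here $k = 6$, matching `df = 6` in the output. With $H = 33.515$ the p-value is about $8.3 \times 10^{-6}$, so the null hypothesis that the random effects estimator is consistent is rejected at the 5% level.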


<a id='4.A'></a>

### **4.A. Heteroskedasticity tests:**

In [1195]:
# ===============================
# A. Heteroskedasticity tests on the fixed effects model

print("Breusch-Pagan and White Test for Heteroskedasticity:")

# Breusch-Pagan test
bp_test <- bptest(fe_model)
cat(sprintf("Breusch-Pagan test: stat=%.4f, p-value=%.4f\n", bp_test$statistic, bp_test$p.value))

# Test labelled "White" in the output.
# NOTE: this regresses the squared FE residuals on the regressors in levels and
# applies bptest() to that auxiliary regression. It does not include the squares
# and cross-products that a White test uses, so read it as an additional check.
fe_residuals <- residuals(fe_model)

white_test <- bptest(lm(fe_residuals^2 ~ ., data = train_model_df[, X_vars]))
cat(sprintf("White test: stat=%.4f, p-value=%.4f\n", white_test$statistic, white_test$p.value))

# Decision rule: flag heteroskedasticity if either test rejects at the 5% level
if (bp_test$p.value < 0.05 || white_test$p.value < 0.05) {
  cat("Heteroskedasticity detected: consider robust standard errors.\n")
} else {
  cat("No significant heteroskedasticity detected.\n")
}

[1] "Breusch-Pagan and White Test for Heteroskedasticity:"
Breusch-Pagan test: stat=73.5886, p-value=0.0000
White test: stat=93.3689, p-value=0.0000
Heteroskedasticity detected: consider robust standard errors.
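
A White-type test adds squares and cross-products of the regressors to the auxiliary regression. The sketch below computes its LM form (n·R² from regressing squared residuals on levels, squares, and the interaction) in base R on simulated data. The variable names and the data-generating process are assumptions for illustration, not the notebook's model.

```r
set.seed(42)

# Simulated regression whose error variance grows with x1 (heteroskedastic).
n  <- 500
x1 <- runif(n, 1, 5)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - x2 + rnorm(n, sd = x1)

fit <- lm(y ~ x1 + x2)
u2  <- residuals(fit)^2

# Auxiliary regression: levels, squares, and the cross-product.
aux <- lm(u2 ~ x1 + x2 + I(x1^2) + I(x2^2) + I(x1 * x2))

lm_stat <- n * summary(aux)$r.squared     # LM statistic = n * R^2
df_aux  <- length(coef(aux)) - 1L         # number of auxiliary regressors
p_value <- pchisq(lm_stat, df = df_aux, lower.tail = FALSE)

cat(sprintf("White test: stat=%.4f, df=%d, p-value=%.4f\n",
            lm_stat, df_aux, p_value))
```

With many regressors the number of squares and cross-products grows quickly, which is one reason the simpler Breusch-Pagan form is often reported alongside it.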


<a id='4.B'></a>

### **4.B. Serial Correlation test:**

In [1196]:
# ===============================
# B. Serial Correlation test (Breusch-Godfrey) residuals

bg_test <- pbgtest(fe_model)
print("Breusch-Godfrey/Wooldridge test for Serial Correlation:")
bg_test


[1] "Breusch-Godfrey/Wooldridge test for Serial Correlation:"



	Breusch-Godfrey/Wooldridge test for serial correlation in panel models

data:  formula_str
chisq = 1324.8, df = 45, p-value < 2.2e-16
alternative hypothesis: serial correlation in idiosyncratic errors


<a id='4.C'></a>

### **4.C. Pesaran CD test:**

In [1197]:
# Pesaran CD test for cross-sectional dependence
pcdtest(fe_model, test = "cd")


	Pesaran CD test for cross-sectional dependence in panels

data:  PCPIPCH ~ -1 + GGSB_NPGDP + GGXWDG_NGDP + PPPPC + TM_RPCH + BCA_NGDPD +     NID_NGDP
z = 21.476, p-value < 2.2e-16
alternative hypothesis: cross-sectional dependence


<a id='4.D'></a>

### **4.D. Levin, Lin & Chu test:**

In [1198]:
# Levin, Lin & Chu panel unit-root test on inflation (plm::purtest)
llc_test <- purtest(panel_df$PCPIPCH, test = "levinlin")
summary(llc_test)


Levin-Lin-Chu Unit-Root Test 
Exogenous variables: None 
Automatic selection of lags using SIC: 0 - 10 lags (max: 10)
statistic: -20.055 
p-value: 0 

                         lags obs         rho       trho       p.trho
Angola                      0  44 -0.40021247 -3.3178554 8.878553e-04
Argentina                   1  43  0.11123942  1.6985028 9.788170e-01
Australia                   0  44 -0.11195443 -2.0808989 3.596249e-02
Austria                     2  42 -0.10255813 -1.3407712 1.671187e-01
Barbados                    0  44 -0.27536168 -3.7722649 1.616201e-04
Belarus                     0  44 -0.31332765 -2.8692679 4.004598e-03
Belgium                     2  42 -0.19687123 -2.1741055 2.859874e-02
Bosnia and Herzegovina      0  44 -0.43536623 -3.5024793 4.539723e-04
Botswana                    0  44 -0.06018470 -1.3985521 1.509864e-01
Brazil                      6  38 -0.31107228 -2.0689221 3.701596e-02
Bulgaria                    0  44 -0.83398222 -5.6196792 3.416808e-08
Canada   

<a id='4.E'></a>

### **4.E. Correcting standard errors:**

In [1199]:
# Correct the standard errors with the Driscoll-Kraay (SCC) estimator.
# It is robust to heteroskedasticity, serial correlation, and cross-sectional
# dependence, all of which were detected in the diagnostic tests above.
robust_se_scc <- vcovSCC(fe_model, type = "HC1")

# Test coefficient significance using the corrected standard errors
coeftest(fe_model, vcov = robust_se_scc)



t test of coefficients:

               Estimate  Std. Error t value  Pr(>|t|)    
GGSB_NPGDP  -6.70888686  2.71866273 -2.4677  0.013643 *  
GGXWDG_NGDP  0.61794044  0.12616294  4.8980 1.010e-06 ***
PPPPC       -0.00078662  0.00014536 -5.4114 6.657e-08 ***
TM_RPCH      1.32716446  0.50852019  2.6099  0.009095 ** 
BCA_NGDPD    3.32534046  1.30062591  2.5567  0.010606 *  
NID_NGDP     1.85277605  1.03607819  1.7883  0.073817 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


<a id='5'></a>

## **5. The Dynamic Panel (GMM) model:**

<a id='5.A'></a>

### **5.A. Build the formula string:**

In [1200]:
# Build the dynamic panel formula for pgmm().
# Before the "|": the regressors
# - lagged dependent variable: lag(PCPIPCH, 1)
# - contemporaneous fiscal, trade, income, and investment predictors
# After the "|": the GMM instruments
# - lags 2 to 5 of inflation: lag(PCPIPCH, 2:5)

dynamic_formula <- as.formula(
  paste0("PCPIPCH ~ lag(PCPIPCH, 1) + ",
         paste(X_vars, collapse = " + "), 
         " | lag(PCPIPCH, 2:5)")
)
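
To see what this string-building step produces, the sketch below rebuilds the formula with the six regressors that remain in `X_vars` at this point (after `Advanced_Country` was dropped in 2.B) and prints it. In the formula passed to `pgmm()`, the part before `|` holds the regressors and the part after it the GMM instruments. `formula_text` is a name introduced here for illustration.

```r
# Regressors remaining in X_vars after section 2.B.
X_vars <- c("GGSB_NPGDP", "GGXWDG_NGDP", "PPPPC",
            "TM_RPCH", "BCA_NGDPD", "NID_NGDP")

formula_text <- paste0(
  "PCPIPCH ~ lag(PCPIPCH, 1) + ",
  paste(X_vars, collapse = " + "),
  " | lag(PCPIPCH, 2:5)"
)
dynamic_formula <- as.formula(formula_text)

cat(formula_text, "\n")
print(all.vars(dynamic_formula))   # variables referenced by the formula
```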

<a id='5.B'></a>

### **5.B. Estimate the GMM model:**

In [1201]:
# ──────────────────────────────────────────────────────────────────────────────
# Estimate the GMM model.
# transformation = "ld" uses both the level and the first-differenced equations,
# i.e. system GMM (Blundell-Bond); transformation = "d" would give
# difference GMM (Arellano–Bond).
# ──────────────────────────────────────────────────────────────────────────────
gmm_model <- pgmm(
  formula         = dynamic_formula,
  data            = panel_df,
  effect          = "individual",      # individual (country) effects
  model           = "twosteps",        # two-step GMM estimation
  transformation  = "ld",              # levels + differences (system GMM)
  collapse        = TRUE               # collapse instruments to limit their number
)

# Summarize model output
summary(gmm_model)

"a general inverse is used"


Oneway (individual) effect Two-steps model System GMM 

Call:
pgmm(formula = dynamic_formula, data = panel_df, effect = "individual", 
    model = "twosteps", collapse = TRUE, transformation = "ld")

Balanced Panel: n = 83, T = 45, N = 3735

Number of Observations Used: 7044
Residuals:
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
-3735.088   -13.298     0.000     2.016    11.671  3286.639 

Coefficients:
                   Estimate  Std. Error z-value  Pr(>|z|)    
lag(PCPIPCH, 1)  0.30914568  0.15708952  1.9680 0.0490728 *  
GGSB_NPGDP      -3.46085225  2.61904565 -1.3214 0.1863623    
GGXWDG_NGDP      0.56848392  0.16169286  3.5158 0.0004384 ***
PPPPC           -0.00035788  0.00021766 -1.6442 0.1001325    
TM_RPCH          0.98372319  0.31983461  3.0757 0.0020999 ** 
BCA_NGDPD        1.34167848  0.53487454  2.5084 0.0121280 *  
NID_NGDP        -1.08453586  0.34703521 -3.1251 0.0017772 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Sargan test:

<a id='5.C'></a>

### **5.C. Diagnostics for the GMM model:**

In [1153]:
# ──────────────────────────────────────────────────────────────────────────────
# Diagnostics for the GMM model
# ──────────────────────────────────────────────────────────────────────────────

# (i) Sargan test of overidentifying restrictions (instrument validity);
#     p > 0.05 means the restrictions are not rejected
cat("Sargan test (overid):\n")
print(sargan(gmm_model))

# (ii) Arellano–Bond tests for autocorrelation in residuals

# AR(1) test: rejection (p < 0.05) is expected
cat("\nArellano–Bond AR(1) test:\n")
print(mtest(gmm_model, order = 1))

# AR(2) test: no rejection (p > 0.05) is expected
cat("\nArellano–Bond AR(2) test:\n")
print(mtest(gmm_model, order = 2))


Sargan test (overid):

	Sargan test

data:  dynamic_formula
chisq = 16.065, df = 10, p-value = 0.09778
alternative hypothesis: overidentifying restrictions not valid


Arellano–Bond AR(1) test:



	Arellano-Bond autocorrelation test of degree 1

data:  dynamic_formula
normal = -1.3781, p-value = 0.1682
alternative hypothesis: autocorrelation present


Arellano–Bond AR(2) test:

	Arellano-Bond autocorrelation test of degree 2

data:  dynamic_formula
normal = -1.2691, p-value = 0.2044
alternative hypothesis: autocorrelation present



<a id='6.'></a>

### **6. Model residuals:**

# **END**