## Analyse stationarity of different macroeconomic series

**The purpose of this notebook is to show complexities related to the stationarity analysis.**  
**This analysis is not sufficient to assess whether delta GDP or unemployment of an individual country is stationary.**

Deep Research / OpenAI helped to summarise the main ideas behind this testing:
- The intuition behind stationarity testing—what it examines and how it works.
- A review of individual statistical tests for stationarity, their assumptions, and how to interpret them.
- Discussion on the proper frequency (quarterly vs. annual) for these macroeconomic variables.
- Guidance on what to do if different tests yield contradictory results.
- Recommendations on the correct sample size for reliable results.
- Python-based approaches for implementing stationarity tests.

### Stationarity Analysis in IFRS 9 PD Modeling (Europe 25-Year Macroeconomic Data)

#### 1. Intuition Behind Stationarity Testing

**What is Stationarity?** In time series analysis, *stationarity* means the statistical properties of a series (mean, variance, and autocorrelation structure) do not change over time ([](https://mpra.ub.uni-muenchen.de/27926/1/Stationarity_of_time_series_and_the_problem_of_spurious_regression.pdf#:~:text=In%20other%20words%2C%20%EF%81%BB%20%EF%81%BDT,first%20requirement%20simply%20says%20that)). In other words, a stationary series fluctuates around a constant mean and variance, and covariances depend only on the lag between observations, not on the specific time ([](https://mpra.ub.uni-muenchen.de/27926/1/Stationarity_of_time_series_and_the_problem_of_spurious_regression.pdf#:~:text=In%20other%20words%2C%20%EF%81%BB%20%EF%81%BDT,first%20requirement%20simply%20says%20that)). For example, a stationary series might oscillate around a long-run average, whereas a non-stationary series could have trends or unit roots causing its level to wander or its variance to grow over time.

**Why Stationarity Matters for IFRS 9 PD Models:** IFRS 9 Probability of Default (PD) models often incorporate macroeconomic time series (GDP, unemployment, etc.) to produce *point-in-time* (PIT) PD estimates that vary with economic conditions ([Finalyse: A practical approach to predicting the IFRS9 Macroeconomic Forward-Looking PD](https://www.finalyse.com/blog/a-practical-approach-to-predicting-the-ifrs9-macroeconomic-forward-looking-pd#:~:text=The%20estimated%20probability%20of%20default,PD%20for%20different%20economic%20scenarios)) ([Finalyse: A practical approach to predicting the IFRS9 Macroeconomic Forward-Looking PD](https://www.finalyse.com/blog/a-practical-approach-to-predicting-the-ifrs9-macroeconomic-forward-looking-pd#:~:text=Employment)). Using non-stationary variables (e.g. trending GDP levels or steadily drifting interest rates) in regression models can lead to **spurious relationships** – high correlations that are not truly predictive but arise simply because both variables trend over time ([](https://mpra.ub.uni-muenchen.de/27926/1/Stationarity_of_time_series_and_the_problem_of_spurious_regression.pdf#:~:text=However%2C%20from%20the%20practical%20point,When)). This can mislead the model: *“When all (dependent and independent) time series are non-stationary, the regression results are simply misleading”* ([](https://mpra.ub.uni-muenchen.de/27926/1/Stationarity_of_time_series_and_the_problem_of_spurious_regression.pdf#:~:text=However%2C%20from%20the%20practical%20point,When)). In a credit risk context, a spurious regression might incorrectly suggest a strong link between PD and a macro variable just because both rose or fell over the sample period, even if no causal economic relationship exists. Such models would not be reliable for forecasting or stress scenarios.

**Implications of Non-Stationary Inputs:** If non-stationary macro variables are used without correction, PD models may violate regression assumptions and produce unstable coefficients. The model might fit historically but perform poorly in forecasting, especially if the future behavior of the macro variable changes (e.g. a trend ends or reverses). Non-stationarity also complicates statistical inference (standard errors, p-values) and can inflate the likelihood of Type I errors (finding significance where there is none). In IFRS 9, which requires forward-looking PD under different economic scenarios, an unstable model can lead to inaccurate expected credit loss estimates. **Regulatory/industry guidance** therefore emphasizes transforming variables to stationary forms (e.g. using growth rates or differences) before modeling ([[PDF] A Review on the Probability of Default for IFRS 9 - GARP](https://www.garp.org/hubfs/Whitepapers/a2r5d000003s7K0AAI_RiskIntel.WP.IFRS9.PD.Jan22.pdf#:~:text=,of%20the%20probability%20of)). For example, rather than using the level of GDP (which grows over time), a PD model might use the GDP growth rate or output gap, which is more likely to be stationary around 0.
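The spurious-regression effect described above can be illustrated with a small simulation. This is a minimal sketch, not part of the original analysis: it assumes Gaussian random walks, 100 observations, and a naive |t| > 1.96 significance rule, and the helper names `slope_t_stat` and `false_positive_rate` are hypothetical.

```python
import numpy as np

def slope_t_stat(y, x):
    """OLS t-statistic of the slope b in y = a + b*x + e."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - 2)
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

def false_positive_rate(n_obs=100, n_sims=300, difference=False, seed=0):
    """Share of simulations where two INDEPENDENT random walks look related."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sims):
        y = np.cumsum(rng.standard_normal(n_obs))  # random walk 1
        x = np.cumsum(rng.standard_normal(n_obs))  # random walk 2, unrelated to y
        if difference:
            y, x = np.diff(y), np.diff(x)          # first differences are stationary
        hits += abs(slope_t_stat(y, x)) > 1.96
    return hits / n_sims

levels_rate = false_positive_rate(difference=False)
diffs_rate = false_positive_rate(difference=True)
print(f"'significant' in levels: {levels_rate:.0%}, in differences: {diffs_rate:.0%}")
```

In levels, the naive t-test flags a relationship in most simulations even though none exists; after differencing, the rejection rate should fall back to roughly the nominal 5%.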

**What Stationarity Tests Do (Intuitive):** Stationarity tests help determine whether a time series has a unit root (a characteristic of many non-stationary series) or is mean-reverting. Intuitively, they assess whether shocks to the series have permanent effects (indicative of a unit root) or dissipate over time (stationary). For instance, if unemployment has a unit root, a jump in unemployment would have a lasting effect on its level; if it's stationary, unemployment would eventually revert toward a “natural” rate. Stationarity tests like the Augmented Dickey-Fuller or KPSS examine patterns in the series’ autocorrelation and trend:
- They set up a hypothesis test whose **null hypothesis** is either non-stationarity or stationarity, depending on the test. For example, the ADF test’s null is that the series has a unit root (non-stationary), while the KPSS test’s null is that the series is stationary around a constant level or a deterministic trend ([Unit root tests (ADF, KPSS) | Intro to Time Series Class Notes](https://library.fiveable.me/intro-time-series/unit-3/unit-root-tests-adf-kpss/study-guide/psGQzOPIpDJwW3lx#:~:text=Notes%20library,stationarity%20%28less%20common)).
- Unit root tests such as ADF and PP work by fitting an autoregressive model to see if the series essentially “needs” a unit root to explain its behavior. If the estimated autoregressive coefficient is statistically indistinguishable from 1 (meaning the series looks like a random walk), the unit root null cannot be rejected and the series is treated as non-stationary. If the coefficient is significantly less than 1 (mean reversion), the null is rejected in favor of stationarity ([arch.unitroot.PhillipsPerron - arch 7.2.0](https://arch.readthedocs.io/en/stable/unitroot/generated/arch.unitroot.PhillipsPerron.html#:~:text=The%20null%20hypothesis%20of%20the,to%20be%20a%20unit%20root)).
- In simpler terms, stationarity tests look at whether the series *wanders* too much (failing to revert to a mean) or whether its fluctuations are bounded. They check if past values have a persistent influence on future values. For IFRS 9 modelers, this is crucial: it tells us if we can rely on historical correlations. If a macro variable is stationary, its relationship with PD can be stable; if not, we may need to difference or transform it to find a stable relationship.
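The idea that shocks are permanent under a unit root but die out in a stationary series can be sketched with an AR(1) process $y_t = \phi y_{t-1} + \epsilon_t$. The function name `shock_response` and the parameter values below are illustrative assumptions, not taken from the notebook's data.

```python
import numpy as np

def shock_response(phi, horizon=40, shock=1.0, seed=0):
    """Gap between a shocked and an unshocked AR(1) path that share the same noise."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(horizon + 1)
    base = np.zeros(horizon + 1)
    shocked = np.zeros(horizon + 1)
    base[0], shocked[0] = eps[0], eps[0] + shock   # one-off shock at t = 0
    for t in range(1, horizon + 1):
        base[t] = phi * base[t - 1] + eps[t]
        shocked[t] = phi * shocked[t - 1] + eps[t]
    return shocked - base                           # analytically: shock * phi**t

stationary = shock_response(phi=0.8)   # mean-reverting: the gap shrinks geometrically
unit_root = shock_response(phi=1.0)    # random walk: the gap never closes
print(f"gap after 20 periods: phi=0.8 -> {stationary[20]:.4f}, phi=1.0 -> {unit_root[20]:.4f}")
```

With $\phi = 0.8$ only about 1% of the shock remains after 20 periods, whereas with $\phi = 1$ the full shock is still present.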

**Key Point:** Before using macroeconomic variables in IFRS 9 PD models, one should test for stationarity. It’s *“important to ensure time-series data is stationary with tests such as KPSS or ADF”* ([Finalyse: A practical approach to predicting the IFRS9 Macroeconomic Forward-Looking PD](https://www.finalyse.com/blog/a-practical-approach-to-predicting-the-ifrs9-macroeconomic-forward-looking-pd#:~:text=characteristics%20must%20be%20investigated%20beforehand%2C,such%20as%20KPSS%20or%20ADF)). This ensures the model captures genuine, consistent economic relationships rather than artifacts of trending data.

#### 2. Statistical Tests for Stationarity

Several statistical tests are commonly used to evaluate stationarity (or the presence of unit roots) in time series. Each test has its methodology, assumptions, and pros/cons. The three key tests are:

- **Augmented Dickey-Fuller (ADF) Test:** The ADF is an extension of the Dickey-Fuller test to more complex, higher-order autoregressive processes. Its null hypothesis is that the time series has a unit root (i.e. is non-stationary), against the alternative that the series is stationary (technically, $I(1)$ vs $I(0)$) ([Unit root tests (ADF, KPSS) | Intro to Time Series Class Notes](https://library.fiveable.me/intro-time-series/unit-3/unit-root-tests-adf-kpss/study-guide/psGQzOPIpDJwW3lx#:~:text=Notes%20library,stationarity%20%28less%20common)). The test works by regressing the first difference of the series against its lagged level and lagged differences. It includes lagged difference terms to account for autocorrelation (hence “augmented”). The core idea is to test if the coefficient $\phi$ in $\Delta y_t = \alpha + \beta t + \phi y_{t-1} + \sum_{i=1}^{k}\psi_i \Delta y_{t-i} + \epsilon_t$ is zero or not. $\phi=0$ (equivalently, unit root coefficient = 1 in level form) implies a unit root. If the ADF test statistic is more negative than the critical value, we reject the null and conclude the series is stationary. **Assumptions/Methodology:** One must specify whether to include a constant ($\alpha$) or trend ($\beta t$) term depending on whether the series has a deterministic drift. Lag length *k* must be chosen (using AIC/BIC or trial-and-error) to whiten the residuals. **Strengths:** Widely used and straightforward; critical values account for the unusual distribution under a unit root null. **Weaknesses:** The ADF has relatively low power, especially in small samples or if the true process is near-stationary (root close to 1). It may fail to reject the null even if the series is stationary but only slowly mean-reverting. It’s also sensitive to lag selection and will misidentify stationarity if a deterministic trend isn’t accounted for (e.g., a trend-stationary series might appear non-stationary unless a trend term is included in the test regression).

- **Phillips-Perron (PP) Test:** The Phillips-Perron test is another unit root test with the same null hypothesis as ADF (series has a unit root). **Methodology:** It uses a non-parametric approach to correct for autocorrelation and heteroskedasticity in the series ([regression - Phillips–Perron unit root test instead of ADF test? - Cross Validated](https://stats.stackexchange.com/questions/14076/phillips-perron-unit-root-test-instead-of-adf-test#:~:text=A%20great%20advantage%20of%20Philips,HAC%20type%20corrections)). PP effectively runs a similar regression as the Dickey-Fuller but without adding lagged difference terms; instead, it adjusts the test statistics (and p-values) using estimates of the long-run variance of the residuals (using a Newey-West correction) ([arch.unitroot.PhillipsPerron - arch 7.2.0](https://arch.readthedocs.io/en/stable/unitroot/generated/arch.unitroot.PhillipsPerron.html#:~:text=Unlike%20the%20ADF%20test%2C%20the,West)). **Underlying assumptions:** It relies on large-sample (asymptotic) theory for those corrections to be valid. **Strengths:** The big advantage is it doesn’t require choosing a lag order for augmentation ([regression - Phillips–Perron unit root test instead of ADF test? - Cross Validated](https://stats.stackexchange.com/questions/14076/phillips-perron-unit-root-test-instead-of-adf-test#:~:text=A%20great%20advantage%20of%20Philips,HAC%20type%20corrections)). This can simplify analysis and avoid the risk of mis-specifying the lag length. It’s also robust to general forms of autocorrelation and time-varying volatility in the data. **Weaknesses:** PP shares the same null and alternative as ADF and, in fact, is asymptotically equivalent to ADF under the null ([[PDF] Unit Root Tests](https://faculty.washington.edu/ezivot/econ584/notes/unitroot.pdf#:~:text=,they%20correct%20for%20serial)). In finite samples, however, they can differ. 
The PP test’s heavy reliance on asymptotic approximations means it *“works well only in large samples”* and has **poor small-sample power**, similar to ADF ([regression - Phillips–Perron unit root test instead of ADF test? - Cross Validated](https://stats.stackexchange.com/questions/14076/phillips-perron-unit-root-test-instead-of-adf-test#:~:text=The%20main%20disadvantage%20of%20the,resulting%20in%20unit%20root%20conclusions)). In practice, with limited observations (like 25–100 points), PP might also frequently fail to reject a false null (labeling a series as unit root when it’s actually stationary). PP, like ADF, is **sensitive to structural breaks** – a one-time shock in the series could lead to misleading results if not accounted for ([regression - Phillips–Perron unit root test instead of ADF test? - Cross Validated](https://stats.stackexchange.com/questions/14076/phillips-perron-unit-root-test-instead-of-adf-test#:~:text=The%20main%20disadvantage%20of%20the,resulting%20in%20unit%20root%20conclusions)). In summary, PP is useful as a cross-check to ADF; if both ADF and PP agree, confidence in the result is higher. If they diverge, it may be due to how they handled autocorrelation or sample size limitations.

- **Kwiatkowski-Phillips-Schmidt-Shin (KPSS) Test:** The KPSS test takes the opposite perspective: its null hypothesis is that the series is stationary (stationary around a mean or deterministic trend, depending on test version), and the alternative is that the series has a unit root (is non-stationary) ([KPSS test - Wikipedia](https://en.wikipedia.org/wiki/KPSS_test#:~:text=for%20testing%20a%20null%20hypothesis,1)). Essentially, KPSS assumes the time series can be decomposed into a deterministic trend, a random walk, and a stationary error; it tests whether the variance of the random walk component is zero (stationary) or positive (unit root present). **Methodology:** The test involves estimating the series’ long-run variance and assessing how much the series wanders relative to what would be expected if it were truly stationary. It computes a statistic based on the partial sum of residuals from a regression of the series on a constant (and trend, if testing trend-stationarity) ([KPSS test - Wikipedia](https://en.wikipedia.org/wiki/KPSS_test#:~:text=Yongcheol%20Shin%20,is%20the%20Lagrange%20multiplier%20test)). If this statistic is too large, the null of stationarity is rejected. **Assumptions:** Like PP, KPSS uses lag truncation for the Newey-West estimator of variance, so one must choose a bandwidth (often automatically or via rules). It assumes the series under the null is trend-stationary (for the trend version of the test) or level-stationary (for the level version). **Strengths:** KPSS complements the ADF/PP by testing the reverse hypothesis. It’s useful because *“neither the ADF test nor the KPSS test will confirm or disconfirm stationarity in isolation”* ([COMPARISION STUDY OF ADF vs KPSS TEST - Medium](https://medium.com/@tannyasharma21/comparision-study-of-adf-vs-kpss-test-c9d8dec4f62a#:~:text=COMPARISION%20STUDY%20OF%20ADF%20vs,decide%20if%20you%20should%20difference)) – by using both, an analyst can get a more complete picture. 
KPSS often has higher power to reject *stationarity* for series that are highly persistent, catching cases where ADF might not reject the unit root null. **Weaknesses:** KPSS can be **oversensitive**, sometimes rejecting stationarity for series that are stationary but with strong short-term autocorrelation. Also, if a series is stationary but with a structural break or a slow moving trend, KPSS might signal non-stationarity (because the stationary null is fairly strict). In practice, KPSS tends to **reject the null frequently** if the series length is large or if any long-term pattern exists, so it’s common to get a KPSS p-value low (indicating non-stationary) even when ADF says the opposite. This doesn’t always mean the tests outright contradict – KPSS might be flagging that the series is not *strictly stationary* but possibly **trend-stationary** (stationary after removing a deterministic trend).
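To make the mechanics of the tests above concrete, here is a hand-rolled sketch of the ADF regression (constant, no trend) and the level-stationarity KPSS statistic. It is for illustration only and assumes simulated data; the function names `adf_t_stat` and `kpss_stat` and the bandwidth rule are assumptions. In practice one would typically rely on a library implementation such as `adfuller` and `kpss` in `statsmodels.tsa.stattools`, or `PhillipsPerron` in `arch.unitroot`, which also supply p-values.

```python
import numpy as np

def adf_t_stat(y, lags=1):
    """t-stat on phi in: dy_t = alpha + phi*y_{t-1} + sum_i psi_i*dy_{t-i} + e_t."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    rows = len(dy) - lags
    cols = [np.ones(rows), y[lags:-1]]                               # constant, y_{t-1}
    cols += [dy[lags - i:len(dy) - i] for i in range(1, lags + 1)]   # lagged differences
    X = np.column_stack(cols)
    target = dy[lags:]
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    sigma2 = resid @ resid / (rows - X.shape[1])
    se_phi = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1] / se_phi

def kpss_stat(y, bandwidth=None):
    """KPSS level-stationarity statistic with a Bartlett-kernel long-run variance."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    e = y - y.mean()                 # residuals from a regression on a constant
    s = np.cumsum(e)                 # partial sums of the residuals
    if bandwidth is None:
        bandwidth = int(4 * (n / 100) ** 0.25)   # rule of thumb, an assumption here
    lrv = e @ e / n
    for k in range(1, bandwidth + 1):
        lrv += 2 * (1 - k / (bandwidth + 1)) * (e[k:] @ e[:-k]) / n
    return (s @ s) / (n ** 2 * lrv)

rng = np.random.default_rng(42)
eps = rng.standard_normal(400)
random_walk = np.cumsum(eps)                 # unit root by construction
ar1 = np.zeros(400)
for t in range(1, 400):
    ar1[t] = 0.5 * ar1[t - 1] + eps[t]       # stationary by construction

for name, series in [("random walk", random_walk), ("AR(1), phi=0.5", ar1)]:
    print(f"{name:15s} ADF t = {adf_t_stat(series):6.2f}   KPSS = {kpss_stat(series):5.2f}")
```

For reference, the asymptotic 5% critical values are approximately -2.86 for the ADF statistic with a constant (reject the unit root if the statistic is more negative) and 0.463 for the level KPSS statistic (reject stationarity if the statistic is larger).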

**Summary – Strengths/Weaknesses:** All these tests have limitations. Unit root tests (ADF, PP) often suffer low power, especially with short data spans or with series that have near-unit-root behavior. The KPSS test can falsely signal non-stationarity if there are structural changes or if the series is stationary but highly autocorrelated. No single test is definitive. It’s common to use multiple tests in tandem to make a judgment. In practice, **contradictory results can occur**, so analysts must use judgment (see Section 4 below). Additionally, advanced tests exist for special situations (e.g. the **DF-GLS** test is an augmented Dickey-Fuller with GLS detrending for better power; the **Zivot-Andrews** test allows one endogenous structural break in the trend) ([regression - Phillips–Perron unit root test instead of ADF test? - Cross Validated](https://stats.stackexchange.com/questions/14076/phillips-perron-unit-root-test-instead-of-adf-test#:~:text=It%20is%20advisable%20to%20make,the%20data%20has%20structural%20breaks)). These can be employed if there’s reason to suspect subtle issues (like a one-time shock in GDP or interest rates regime change).
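The low-power problem can be made tangible with a Monte Carlo sketch. It assumes a stationary AR(1) with coefficient 0.8 at both sample sizes, a simple Dickey-Fuller regression with a constant and no lagged differences, and approximate tabulated 5% critical values (-3.00 for 25 observations, -2.89 for 100). The names `df_t_stat` and `rejection_rate` are hypothetical.

```python
import numpy as np

def df_t_stat(y):
    """Dickey-Fuller t-statistic (constant, no lagged differences)."""
    dy, lag = np.diff(y), y[:-1]
    X = np.column_stack([np.ones_like(lag), lag])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - 2)
    return beta[1] / np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])

def rejection_rate(n_obs, phi, crit, n_sims=500, burn_in=50, seed=0):
    """Share of simulated AR(1) paths for which the unit root null is rejected."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        eps = rng.standard_normal(n_obs + burn_in)
        y = np.zeros(n_obs + burn_in)
        for t in range(1, n_obs + burn_in):
            y[t] = phi * y[t - 1] + eps[t]
        rejections += df_t_stat(y[burn_in:]) < crit   # drop the burn-in period
    return rejections / n_sims

# The series is truly stationary (phi = 0.8), so every rejection is a correct decision.
power_25 = rejection_rate(n_obs=25, phi=0.8, crit=-3.00)
power_100 = rejection_rate(n_obs=100, phi=0.8, crit=-2.89)
print(f"power with 25 obs: {power_25:.0%}, with 100 obs: {power_100:.0%}")
```

Under these assumptions the test should detect stationarity far less often with 25 observations than with 100. The comparison holds the per-period persistence fixed, so it speaks to the number of observations rather than to the choice between annual and quarterly sampling of the same span.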

Finally, it’s worth noting that IFRS 9 modelers are not necessarily trying to do formal cointegration analysis, but they do want to avoid spurious regressions. Often, **simple transformations (like differencing)** are applied if any doubt exists about stationarity. This is implicitly recommended in industry best practices: *“Consider transforms of variables to ensure stationarity”* when building credit risk models ([[PDF] A Review on the Probability of Default for IFRS 9 - GARP](https://www.garp.org/hubfs/Whitepapers/a2r5d000003s7K0AAI_RiskIntel.WP.IFRS9.PD.Jan22.pdf#:~:text=,of%20the%20probability%20of)). By using these tests and transformations, one can justify that model inputs are stationary, supporting reliable PD estimations over time.
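The transformations mentioned above are short to express in code. The sketch below uses made-up numbers (a GDP index growing 0.5% per quarter and a short unemployment series) purely to show the arithmetic; none of the values come from the notebook's data.

```python
import numpy as np

# Hypothetical quarterly GDP index growing 0.5% per quarter
gdp_level = 100.0 * 1.005 ** np.arange(12)

# Quarter-on-quarter log growth: log(GDP_t) - log(GDP_{t-1})
growth_qoq = np.diff(np.log(gdp_level))

# Year-on-year log growth: log(GDP_t) - log(GDP_{t-4})
growth_yoy = np.log(gdp_level[4:]) - np.log(gdp_level[:-4])

# Hypothetical unemployment rate in percent; first difference in percentage points
unemployment = np.array([7.9, 8.1, 8.4, 8.2, 7.8])
d_unemployment = np.diff(unemployment)

print(growth_qoq[:3], growth_yoy[:3], d_unemployment)
```

Each transformation shortens the series (by one observation for first differences, four for year-on-year changes), which matters when the sample is already small.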

#### 3. Proper Frequency of Macroeconomic Data (Quarterly vs. Annual)

Choosing the data frequency (quarterly, annual, etc.) for macroeconomic inputs in IFRS 9 PD modeling involves a trade-off between granularity and sample size. Key considerations include the length of economic cycles, the amount of data available, and the nature of credit default patterns.

- **Quarterly Data:** In most cases, **quarterly frequency is preferable** for IFRS 9 PD models. Much of the relevant economic information arrives at a quarterly pace (e.g. companies report earnings quarterly and GDP is published quarterly), so quarterly data can capture turning points more sensitively than annual averages. Importantly, quarterly data provides **more observations** over a given time span, which is crucial for model estimation and statistical tests. For example, 25 years of data gives only 25 annual points, but about 100 quarterly points. One thesis on the topic further notes that quarterly default rates are *“in practice also often used”* ([](https://thesis.eur.nl/pub/62139/Master_Thesis_Report_LMVerspeek.pdf#:~:text=are%20required,series%20has%20too%20many%20zeros)) in credit risk modeling. The larger sample generally improves the robustness of statistical analyses (although the power of unit root tests also depends on the time span covered, not only on the number of observations) and allows the model to **“see” multiple up-and-down variations** within economic cycles, rather than smoothing them into one data point per year. Another practical reason is that IFRS 9 requires provisions to be updated at each reporting date, which for many institutions is quarterly; having models built on quarterly data aligns well with the need to produce updated PD forecasts every quarter.

- **Annual Data:** Annual frequency has the advantage of simplicity and often cleaner data (less seasonal noise). However, with only ~25 data points in 25 years, an annual series provides very limited information for econometric modeling. It’s harder to detect statistically significant relationships or structural changes with such a small sample. Annual data may also **miss intra-year dynamics** – for instance, a recession that begins in Q2 and ends by Q4 might be barely reflected in the annual GDP growth rate, whereas quarterly data would show a sharp drop and recovery. Best practices generally caution that a **lower frequency can run into issues**: one, insufficient data points to reliably estimate coefficients; two, difficulty capturing short-term stress periods ([](https://thesis.eur.nl/pub/62139/Master_Thesis_Report_LMVerspeek.pdf#:~:text=are%20required,series%20has%20too%20many%20zeros)). Indeed, using annual data might smooth over important variability that IFRS 9’s forward-looking approach is supposed to capture.

- **Other Frequencies:** A very high frequency (monthly) for *default* series can be problematic because default events are relatively infrequent. Many months could have zero defaults, especially for smaller portfolios or high-quality segments, leading to a series with many zeros and volatility that’s hard to model ([](https://thesis.eur.nl/pub/62139/Master_Thesis_Report_LMVerspeek.pdf#:~:text=and%20forecasting%20purposes%2C%20quarterly%20default,the%20zeros%20in%20those%20time)). One source notes that too high a frequency can yield “many data points without any defaults” and disrupt model estimation ([](https://thesis.eur.nl/pub/62139/Master_Thesis_Report_LMVerspeek.pdf#:~:text=preferable%20frequency%20would%20be%20quarterly,reason%2C%20we%20exclude%20the%20ratings)). For macroeconomic **variables**, monthly data is often available (e.g. unemployment, CPI) and could be used, but if the PD or default data is quarterly, it’s common to aggregate or average macro data to quarterly to maintain consistency. One analysis even found that using monthly vs. annual inputs did not drastically change PD outcomes: *“the choice of annual or monthly time-step has a minimal impact on PD”* ([IFRS 9: Probably Weighted and Biased?](https://www.crc.business-school.ed.ac.uk/sites/crc/files/2020-11/71-Alexander_Marianski.pdf#:~:text=Our%20analysis%20suggest%20that%20the,ECL%20impact%20should%20not%20be)). However, this finding may depend on the specific model and portfolio; it assumes the model can adequately adjust for cycle length. 
In general, if one also considers loan amortization, credit migration, and discounting (factors in IFRS 9 ECL calculations), using more granular data could matter for timing of defaults and losses ([IFRS 9: Probably Weighted and Biased?](https://www.crc.business-school.ed.ac.uk/sites/crc/files/2020-11/71-Alexander_Marianski.pdf#:~:text=Our%20analysis%20suggest%20that%20the,ECL%20impact%20should%20not%20be)).
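Aggregating a monthly macro series to quarterly frequency, as mentioned above, can be sketched as a simple three-month average. The helper `monthly_to_quarterly` is hypothetical and assumes the first observation is the first month of a calendar quarter; the unemployment figures are made up.

```python
import numpy as np

def monthly_to_quarterly(monthly):
    """Average consecutive blocks of three months; drop an incomplete last quarter."""
    monthly = np.asarray(monthly, dtype=float)
    n_full = len(monthly) // 3 * 3
    return monthly[:n_full].reshape(-1, 3).mean(axis=1)

# Hypothetical monthly unemployment rate (percent), starting in January
unemployment_monthly = [7.0, 7.2, 7.4, 7.5, 7.5, 7.8, 8.0]
unemployment_quarterly = monthly_to_quarterly(unemployment_monthly)
print(unemployment_quarterly)   # two full quarters; the seventh month is dropped
```

Averaging suits rates and indices; flow variables would usually be summed instead, and end-of-quarter values are another common convention.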

**Recommendation:** For European macroeconomic data over 25 years, **quarterly data is typically recommended** for IFRS 9 PD modeling. It offers a good balance by capturing economic cycles (usually 5–10 years per cycle) with enough detail, while providing a sufficient number of observations for analysis. Quarterly data will reflect business cycle phases (expansions, recessions) more precisely – important since IFRS 9 requires scenario-conditioned PDs in upturn and downturn scenarios. Annual data may be used if quarterly data quality is poor or unavailable, but one should be mindful that statistical power will be low. In any case, the data should be **seasonally adjusted** (especially for quarterly GDP, unemployment, etc.) so that seasonal patterns don’t masquerade as trends or affect stationarity. Many institutions also align model frequency with scenario frequency provided by regulators or economists (often annual or quarterly scenarios). Overall, quarterly is a common choice in practice for PIT PD models with macro factors ([](https://thesis.eur.nl/pub/62139/Master_Thesis_Report_LMVerspeek.pdf#:~:text=are%20required,series%20has%20too%20many%20zeros)).

*(Note: Always ensure consistency – if PDs are computed on an annual horizon, one might still use quarterly macro variables in the regression but aggregated to predict annual default rates. Some implementations use **hybrid approaches** like modeling quarterly default rates and then compounding to annual PD. But the key is capturing the cycle granularity while meeting the modeling objectives.)*
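The compounding step in the hybrid approach can be written as $PD_{annual} = 1 - \prod_{q=1}^{4} (1 - PD_q)$. The sketch below assumes that each quarterly default rate is conditional on survival to the start of that quarter; the function name and the rates are illustrative only.

```python
import math

def annual_pd_from_quarterly(quarterly_pds):
    """Compound conditional quarterly default rates into a cumulative annual PD."""
    survival = math.prod(1.0 - q for q in quarterly_pds)   # probability of surviving all quarters
    return 1.0 - survival

# Hypothetical conditional quarterly default rates
quarterly_pds = [0.010, 0.012, 0.008, 0.011]
annual_pd = annual_pd_from_quarterly(quarterly_pds)
print(f"annual PD: {annual_pd:.4%}")
```

The compounded PD is slightly below the simple sum of the quarterly rates (4.10% here), because an obligor that has already defaulted cannot default again in a later quarter.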

#### 4. Handling Contradictory Stationarity Test Results

It is not uncommon for different stationarity tests to give conflicting conclusions for the same variable. For example, you might find that the ADF test fails to reject the null of a unit root (suggesting non-stationarity), but the KPSS test also fails to reject its null (suggesting stationarity). Or ADF might indicate stationarity while KPSS rejects it. These situations require careful interpretation:

- **Understand the Nulls:** Remember that ADF/PP have *null = “has unit root”* and KPSS has *null = “stationary (no unit root)”*. So if ADF says “non-stationary” and KPSS says “non-stationary,” they are actually in agreement (both pointing to non-stationarity). True contradictions are when one test points to stationarity and the other to non-stationarity. Often, what seems like a conflict is just one test not having enough evidence to reject its null. For instance, if both ADF and KPSS *fail to reject* their nulls, it means ADF didn’t find stationarity and KPSS didn’t find non-stationarity – effectively an inconclusive result in both directions. As one statistician put it: *“A failure to reject the null doesn’t mean the null is true… This [situation] would be very common if the series are short”* ([[Q] What does it mean when adf and kpss shows contradictory results? : r/statistics](https://www.reddit.com/r/statistics/comments/jlmd34/q_what_does_it_mean_when_adf_and_kpss_shows/#:~:text=If%20I%20understand%20your%20post,failed%20to%20reject%20the%20null)). In other words, with limited data, tests might lack power and yield ambiguous outcomes.

- **Check for Deterministic Trends:** A frequent cause of conflicting results is the presence of a deterministic trend. For example, **trend-stationary** series (stationary around a linear trend) can fool the tests. An ADF test that doesn’t include a trend term might fail to reject the unit root (because the series is trending), whereas a KPSS test run against a trend-stationary null might not reject stationarity; a KPSS test run against a level-stationary null, by contrast, would reject because of that trend. Conversely, if ADF includes a trend and finds no unit root (stationary around trend), but KPSS (also checking stationarity around trend) rejects, the cause may be a structural break or strong persistence rather than the trend itself. One approach is to **detrend the series** (e.g. regress out the trend or take year-over-year changes if a growth rate is more appropriate) and then re-test. In an illustrative example, a series showed: *“KPSS indicates non-stationarity and ADF indicates stationarity – the series is difference stationary… this can be inferred that the series is trend stationary and not strict stationary”* ([Stationarity and detrending (ADF/KPSS) - statsmodels 0.14.4](https://www.statsmodels.org/stable/examples/notebooks/generated/stationarity_detrending_adf_kpss.html#:~:text=Case%204%3A%20KPSS%20indicates%20non,series%20is%20checked%20for%20stationarity)). By differencing (which removed the deterministic trend), they converted it to a strictly stationary series that then passed both tests ([Stationarity and detrending (ADF/KPSS) - statsmodels 0.14.4](https://www.statsmodels.org/stable/examples/notebooks/generated/stationarity_detrending_adf_kpss.html#:~:text=Here%2C%20due%20to%20the%20difference,differencing%20or%20by%20model%20fitting)) ([Stationarity and detrending (ADF/KPSS) - statsmodels 0.14.4](https://www.statsmodels.org/stable/examples/notebooks/generated/stationarity_detrending_adf_kpss.html#:~:text=Based%20upon%20the%20p,series%20is%20strict%20stationary%20now)). 
**Key tip:** If ADF and KPSS disagree, consider the possibility that the series has a deterministic trend (or intercept) that needs to be modeled. Run ADF with a trend term, or do a regression on time and inspect residuals.

- **Use Additional Tests or Visuals:** When in doubt, augment your analysis:
  - **Visual inspection:** Plot the time series. If it shows an obvious upward/downward trend or a structural break, you have clues. A strong visual trend suggests non-stationarity, even if one test claims stationary. If the series meanders without a clear direction, it might be stationary.
  - **Autocorrelation Function (ACF):** Check the ACF and PACF plots. A very slow decay in the ACF (or not decaying at all) is indicative of a unit root. ACF cutting off or decaying quickly suggests stationarity.
  - **Alternate unit root tests:** There are variants like the **DF-GLS (ERS)** test, which often has better power than ADF by detrending data first, or the **Phillips-Perron** if not already used. If ADF and PP both tell the same story, that adds confidence. If one of them rejects unit root and the other doesn’t, lean towards the result of the test known to have more power in that context (e.g. PP is more robust to serial correlation issues, ADF might have missed it due to lag selection).
  - **Structural break tests:** If you suspect a one-time shift (e.g. a crisis in 2008 or a regime change in interest rates), consider a test like **Zivot–Andrews**, which checks for unit roots with an endogenous break. This can resolve ambiguity when a structural break is present (the series might be stationary around different means pre- and post-break, which confuses standard tests). One expert recommendation: *“If results don’t match, check the properties of the time series… consider Zivot-Andrews test if you believe the data has structural breaks”* ([regression - Phillips–Perron unit root test instead of ADF test? - Cross Validated](https://stats.stackexchange.com/questions/14076/phillips-perron-unit-root-test-instead-of-adf-test#:~:text=It%20is%20advisable%20to%20make,the%20data%20has%20structural%20breaks)).

- **Domain Knowledge and Economic Rationale:** Don’t rely solely on p-values. Use your understanding of the macroeconomic variable:
  - **GDP Level:** Likely non-stationary (economies grow over time). Mixed test results usually arise because GDP can look trend-stationary over some stretches, with growth slowdowns and speedups in between. It’s safer to treat GDP as non-stationary (take log-differences to use GDP growth) unless you explicitly model a deterministic trend.
  - **Unemployment Rate:** Many consider it mean-reverting around a “natural rate,” but that natural rate can change over decades (due to structural shifts). Over 25 years in Europe, unemployment might have a break (e.g. a higher natural rate in the early 2000s than later). If tests conflict, consider that it might be stationary around a shifting mean. You might include a structural break or treat it as non-stationary and use the change in unemployment or the unemployment gap.
  - **Interest Rates:** Short-term interest rates are steered by monetary policy and are often assumed to revert toward some long-run level, but in practice interest rate series can exhibit long cycles (such as the decline in rates from the 2000s to the 2010s). If ADF says unit root but KPSS says stationary, it could be because rates revert in the long run but your sample is dominated by a single long swing (down, then up). Here, look at sub-periods or economic reasoning. Often, it’s prudent to assume non-stationarity for rates over a long horizon and difference them (or use the spread from a target rate).
  - **Real Estate Prices:** These typically trend upward (nominally) over long periods, so the index is likely non-stationary (the case for a unit root is weaker in real terms, but the series is still generally not mean-reverting). Contradictory test outcomes for a house price index usually mean it’s at best trend-stationary (e.g. steadily rising with cycles). Most would take first differences (house price inflation) to model credit risk.

- **Best Practices for Decision:** **If the balance of evidence leans toward non-stationarity, treat the variable as non-stationary.** It’s usually safer to difference or transform a series than to risk a spurious regression by assuming stationarity. For example, if ADF has p-value 0.08 (slightly above 0.05, so formally not rejecting the unit root) and KPSS has p-value 0.04 (rejecting stationarity), both tests point the same way and the series should be treated as non-stationary: you would take differences. If instead both tests reject their nulls (ADF p=0.01, KPSS p=0.01; ADF says stationary, KPSS says not), the results genuinely conflict; it could be a trend-stationary case or a structural break, so you might try adding a trend to both tests or just difference anyway. In practice, modelers often go with **“difference if in doubt.”** Differencing (or taking year-over-year percentage change) will usually resolve the conflict because a true unit root series will become stationary after first differencing, and a trend-stationary series will also become stationary (losing its deterministic trend), at the cost of some over-differencing (an added moving-average component in the errors). After transforming, you can re-run the tests to confirm the variable is now stationary before using it in modeling.

- **Documentation:** Given IFRS 9 models are subject to validation and audit, document the steps taken. If tests were contradictory, explain how you resolved it (e.g. “KPSS suggested non-stationarity whereas ADF did not; visual inspection showed an upward trend, so we took first differences. The differenced series then passed both ADF and KPSS at 5%, confirming stationarity”). This shows a sound reasoning process aligning with both statistical evidence and economic intuition ([Stationarity and detrending (ADF/KPSS) - statsmodels 0.14.4](https://www.statsmodels.org/stable/examples/notebooks/generated/stationarity_detrending_adf_kpss.html#:~:text=Based%20upon%20the%20p,series%20is%20strict%20stationary%20now)) ([Stationarity and detrending (ADF/KPSS) - statsmodels 0.14.4](https://www.statsmodels.org/stable/examples/notebooks/generated/stationarity_detrending_adf_kpss.html#:~:text=Based%20upon%20the%20p,Hence%2C%20the%20series%20is%20stationary)).

In summary, contradictory test results are a cue to dig deeper. Use a combination of statistical retests (with transformations or break adjustments) and economic logic to decide on the stationarity treatment. The end goal is a model with reliable relationships, so it’s better to err on the side of caution (differencing) than to force a possibly non-stationary variable into the model untransformed.
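
The decision logic described in this section can be condensed into a small helper. This is only an illustrative sketch: the function name `classify_stationarity`, the 5% default threshold, and the verdict labels are assumptions made for this example, and the result is meant as a starting point for judgment rather than a final answer.

```python
def classify_stationarity(adf_p: float, kpss_p: float, alpha: float = 0.05) -> str:
    """Combine ADF (H0: unit root) and KPSS (H0: stationarity) p-values."""
    adf_rejects_unit_root = adf_p < alpha
    kpss_rejects_stationarity = kpss_p < alpha

    if adf_rejects_unit_root and not kpss_rejects_stationarity:
        return "stationary"
    if not adf_rejects_unit_root and kpss_rejects_stationarity:
        return "non-stationary: difference"
    if adf_rejects_unit_root and kpss_rejects_stationarity:
        # Both nulls rejected: often a deterministic trend or a structural break
        return "conflict: retest with trend or breaks, else difference"
    # Neither null rejected: typical for short samples with low power
    return "inconclusive: difference if in doubt"


# The two numeric cases discussed in the text
print(classify_stationarity(adf_p=0.08, kpss_p=0.04))
print(classify_stationarity(adf_p=0.01, kpss_p=0.01))
```

Keeping the four outcomes explicit makes it easier to document, per variable, which branch was hit and what follow-up was performed.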

#### 5. Appropriate Sample Size for Stationarity Analysis

The power and reliability of stationarity tests depend heavily on the sample size. With a 25-year history of European macro data, the number of observations will vary by frequency: ~25 points for annual data, ~100 points for quarterly. Some guidance on sample size and best practices:

- **Minimum Observations:** As a general rule, **the more data, the better** when it comes to unit root tests. These tests have non-standard distributions and need a decent sample to distinguish a unit root process from a stationary one with high persistence. There’s no hard cutoff, but many practitioners feel that **much less than 50 observations makes unit root test results dubious**. With very short series (e.g. 10 or 20 points), it’s often noted that formal tests are not reliable at all ([Are unit root tests necessary or useful on small samples of time ...](https://economics.stackexchange.com/questions/27585/are-unit-root-tests-necessary-or-useful-on-small-samples-of-time-series-data#:~:text=Are%20unit%20root%20tests%20necessary,unit%20root%20alternatives)). In fact, in a simulation study, researchers found you *“need lots of data… to distinguish”* non-stationarity from stationarity ([Are unit root tests necessary or useful on small samples of time ...](https://economics.stackexchange.com/questions/27585/are-unit-root-tests-necessary-or-useful-on-small-samples-of-time-series-data#:~:text=Are%20unit%20root%20tests%20necessary,unit%20root%20alternatives)). If you only have ~25 annual points, the ADF or PP test will have very low power to reject a false null (you might fail to reject the unit root even when the series is actually stationary) ([[Q] What does it mean when adf and kpss shows contradictory results? : r/statistics](https://www.reddit.com/r/statistics/comments/jlmd34/q_what_does_it_mean_when_adf_and_kpss_shows/#:~:text=If%20I%20understand%20your%20post,failed%20to%20reject%20the%20null)). One commenter on this issue remarked that with 60 weekly points, *“it’s not that long… failing to reject [both ADF and KPSS] is very common if the series are short”* ([[Q] What does it mean when adf and kpss shows contradictory results? : r/statistics](https://www.reddit.com/r/statistics/comments/jlmd34/q_what_does_it_mean_when_adf_and_kpss_shows/#:~:text=If%20I%20understand%20your%20post,failed%20to%20reject%20the%20null)).

- **Regulatory/Industry Best Practice:** IFRS 9 itself doesn’t mandate a specific historical length, but it emphasizes using *reasonable and supportable information* including historical data. A common benchmark comes from the Basel regulatory framework (for capital models), which *“recommends a minimum of five years of data”* for PD modeling ([](https://lup.lub.lu.se/student-papers/record/9160767/file/9160773.pdf#:~:text=time%20series%20depth%2C%20it%20remains,BCBS)). Five years (if quarterly, ~20 obs; if annual, 5 obs) is truly a bare minimum and generally insufficient for robust time series analysis – it’s more about having enough default observations for calibration in IRB models. Experts often recommend **covering multiple economic cycles**: *“Bellini (2019) suggests considering at least two economic cycles”* in the data ([](https://lup.lub.lu.se/student-papers/record/9160767/file/9160773.pdf#:~:text=environment,BCBS)). For Europe, a 25-year span likely includes at least two major downturns (e.g. the early 2000s dot-com/Telecom bust in some regions, the 2008 Global Financial Crisis, and the 2020 COVID shock) and expansions in between. Thus, 25 years is a reasonable span to capture cyclical behavior, which is crucial for PIT PD models. If possible, including even more history (e.g. early 90s if data allow) to cover an additional cycle can be beneficial for model development, though data from far back might be less relevant due to structural changes (e.g. pre-Euro era).

- **Stationarity Test Considerations:** With 25 years of **quarterly** data (~100 points), standard unit root tests can be meaningfully applied. ~100 observations is often cited as a decent sample for ADF/PP to have moderate power. With **annual** data (25 points), one must be very cautious: the test critical values are valid, but the power to reject a unit root is so low that you might almost always “fail to reject” (especially for borderline cases). In such cases, you might rely more on economic reasoning or simply assume non-stationarity for variables like GDP and house prices, rather than letting the test alone decide. If working with annual data, consider using a higher significance level (e.g. 10%) as a threshold or look for consistency across multiple tests to mitigate the low power issue. Alternatively, **augment the sample** by incorporating related series or proxies if possible (though mixing series has its own issues).

- **Variables Specifics (25-year horizon):**  
  - *GDP (Real)*: Typically provided quarterly. With 100 data points, you can usually detect the unit root in GDP (most tests will indicate non-stationarity in the level, stationarity in growth rate). Ensure the data is real GDP (adjusted for inflation) if you care about real growth; nominal GDP would be even more strongly trended due to inflation.  
  - *Real Estate Prices*: Often available quarterly as an index. House price indices over 25 years often have strong upward trends with cycles. Here, even 100 points might sometimes fool the ADF if the boom-bust cycle is mild relative to the trend. It’s wise to take log differences (quarterly % change or year-over-year % change), which yields ~99 or ~96 observations respectively – still fine for tests.  
  - *Interest Rates*: If using a policy rate or government bond yield, you might have monthly data (300 points) or quarterly averages (100 points). Monthly data could be advantageous to test stationarity (more power). However, interest rates often have structural regimes (e.g. high-inflation era vs low-inflation era). Over 25 years in Europe, you have the pre-EMU period, the Great Moderation, zero lower bound period, etc. The sample size is okay, but consider splitting the sample to test stationarity in sub-periods if regime changes are suspected.  
  - *Unemployment*: Likely monthly or quarterly data. Unemployment in many European countries from 2000–2025 might show a structural break around the Global Financial Crisis or Euro debt crisis. With monthly data (300 obs), tests are more trustworthy and you might find, for instance, that ADF rejects the unit root (since unemployment often mean-reverts after spikes). With quarterly data (100 obs), results might be borderline. Ensure data is seasonally adjusted. If only annual unemployment is considered, 25 obs is probably too few to conclude much statistically; better to assume it’s mean-reverting but allow for a changing mean over long periods (or just difference it if unsure).

- **If Sample is Small:** If you only have around 5–10 years of data for a variable (say a newer economic indicator), formal stationarity testing is of limited use. In IFRS 9, one might then rely on **expert judgment** or literature (e.g. “inflation tends to be stationary around a target, we’ll assume stationarity after removing outliers”). But given the question context is 25 years of European data, we assume a reasonably rich sample for major macro variables.

**Recommendation:** Use as long a history as feasible (up to 25 years in this case, which is pretty good) to perform stationarity tests, and prefer quarterly data to increase observation count. According to the EBA/Basel guidance, at least 5 years is needed, but 25 years provides a much more robust basis ([](https://lup.lub.lu.se/student-papers/record/9160767/file/9160773.pdf#:~:text=time%20series%20depth%2C%20it%20remains,BCBS)). By covering multiple cycles, you improve the chance that your stationarity tests correctly identify the nature of the series (and your PD model will be calibrated to a full cycle range of conditions). If your series has fewer than ~30 points, interpret test outcomes with skepticism and lean on qualitative reasoning or external studies about the variable’s behavior.

Lastly, always ensure **data quality** – no matter the sample size, check for and adjust any obvious anomalies (outliers, changes in definition, etc.) before testing. And if combining data sources to extend history, ensure consistency (e.g. unemployment definitions consistent over 25 years, or splice with care) ([](https://lup.lub.lu.se/student-papers/record/9160767/file/9160773.pdf#:~:text=sources%20can%20significantly%20influence%20this,not%20prescribe%20specific%20rules%20for)) ([](https://lup.lub.lu.se/student-papers/record/9160767/file/9160773.pdf#:~:text=Finally%2C%20it%20is%20also%20important,their%20impact%20on%20model%20performance)). This will make your stationarity analysis more valid.

#### 6. Python-Based Implementation of Stationarity Tests

Implementing stationarity tests in Python is straightforward with libraries like `statsmodels` and `arch`. Below is a guide with code snippets for running ADF, PP, and KPSS tests, as well as handling transformations if needed:

**Setup:** First, ensure you have the necessary libraries installed. You’ll need `statsmodels` (for ADF and KPSS) and the `arch` package (for PP test, as `statsmodels` doesn’t have PP by default). Install via pip if necessary: `!pip install statsmodels arch`.

```python
import pandas as pd
from statsmodels.tsa.stattools import adfuller, kpss
from arch.unitroot import PhillipsPerron

# Example: suppose we have a Pandas Series for each macro variable
# e.g., gdp_series (quarterly GDP), unemp_series (unemployment rate), etc.
```

**Augmented Dickey-Fuller Test (ADF):** Use `statsmodels.tsa.stattools.adfuller`. It returns a tuple with test statistic, p-value, and other info.

```python
result = adfuller(gdp_series, autolag='AIC')  # autolag automatically chooses lag
adf_stat, adf_p = result[0], result[1]
print(f"ADF Statistic: {adf_stat:.3f}")
print(f"ADF p-value: {adf_p:.3f}")
```

By default, this ADF includes a constant but no trend. You can specify `regression='ct'` for including a trend term if you suspect a deterministic trend. For example: `adfuller(gdp_series, maxlag=4, regression='ct')`. Check the documentation or `help(adfuller)` for details.

Interpretation: If `p-value < 0.05`, we reject the null of unit root and conclude the series is likely stationary. If `p-value` is larger, we fail to reject – the series may have a unit root (non-stationary).

**Phillips-Perron Test (PP):** In the `arch` library, the `PhillipsPerron` class provides this test.

```python
pp_test = PhillipsPerron(gdp_series)
print(f"PP Statistic: {pp_test.stat:.3f}")
print(f"PP p-value: {pp_test.pvalue:.3f}")
```

This will automatically choose a lag truncation for the Newey-West estimator (you can override via parameters if needed). The null hypothesis is also that the series has a unit root ([arch.unitroot.PhillipsPerron - arch 7.2.0](https://arch.readthedocs.io/en/stable/unitroot/generated/arch.unitroot.PhillipsPerron.html#:~:text=The%20null%20hypothesis%20of%20the,to%20be%20a%20unit%20root)). Interpretation is similar: a small p-value means reject null (stationary), large p-value means cannot reject unit root (likely non-stationary).

**KPSS Test:** Use `statsmodels.tsa.stattools.kpss`. It returns the test statistic and p-value among other things.

```python
kpss_stat, kpss_p, n_lags, critical_values = kpss(gdp_series, regression='c', nlags="auto")
# regression='c' for stationarity around a constant (level stationarity)
# use 'ct' for stationarity around a trend if you want to allow a trend.
print(f"KPSS Statistic: {kpss_stat:.3f}")
print(f"KPSS p-value: {kpss_p:.3f}")
```

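If `p-value < 0.05`, we reject the null of *stationarity* (so series is non-stationary). If p is high, we fail to reject stationarity. Note that statsmodels interpolates the KPSS p-value from a table of critical values, so reported p-values are bounded to the range 0.01 to 0.10; an `InterpolationWarning` signals that the true value lies outside that range. The `nlags="auto"` asks statsmodels to choose a lag truncation for the Newey-West estimator; you can also specify an integer (e.g. 4 for quarterly data).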

After running these tests, you can form a judgment as discussed in section 4. For example:

```python
if adf_p < 0.05:
    print("ADF suggests series is stationary")
else:
    print("ADF suggests series is non-stationary (unit root not rejected)")

if kpss_p < 0.05:
    print("KPSS suggests series is non-stationary (stationarity rejected)")
else:
    print("KPSS suggests series is stationary (failed to reject null)")
```

It’s often helpful to run all three tests for each variable and summarize the results in a small table or printout.

**Differencing and Transformations:** If a variable is found to be non-stationary (e.g. ADF p-value high, KPSS p-value low), the typical remedy is to transform it to a stationary form:

- **Differencing:** Take first differences (or year-over-year differences for quarterly data to remove seasonal patterns). In pandas: `gdp_diff = gdp_series.diff().dropna()`. This will create a series of quarter-over-quarter changes (if the original was quarterly). You can then rerun the tests on `gdp_diff` to confirm stationarity. Often one difference is enough for economic series (they’re usually I(1)). If the series still isn’t stationary, you might need a second difference (I(2)), but that’s rare for macro variables like GDP or unemployment. Remember to drop the `NaN` created by differencing at the start.

- **Log Transformation:** If a series has exponential growth or heteroscedastic variance (variance growing with level), taking logs can stabilize variance. For example, use `np.log(real_estate_index)` (after `import numpy as np`) before differencing, so you interpret differences as percentage changes. For GDP, using log is common before differencing, so that differencing yields approximate **growth rate** (Δlog ≈ % change).

- **Other transforms:** Unemployment rates are bounded between 0 and 100%, so a logit transform is possible, but simply using them in percentage points is usually fine; interest rates are likewise typically used in percentage points (they can be negative, so a logit transform does not apply). Differencing interest rates will give changes in rates, which often are stationary if the level had a unit root. For unemployment, some prefer to model the **change in unemployment** or the **“unemployment gap”** (difference between unemployment rate and some estimated natural rate). These approaches usually yield a stationary series, which should still be confirmed by re-testing.
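
The log and differencing steps listed above can be sketched as follows. The house price index here is simulated purely for illustration (the variable `hpi` and its parameters are hypothetical); the point is the mechanics of the transformations and how many observations each one leaves.

```python
import numpy as np
import pandas as pd

# Hypothetical quarterly house price index: 100 quarters starting in 2000Q1
quarters = pd.period_range("2000Q1", periods=100, freq="Q")
rng = np.random.default_rng(1)
log_returns = 0.01 + 0.02 * rng.standard_normal(100)
hpi = pd.Series(100 * np.exp(np.cumsum(log_returns)), index=quarters)

log_hpi = np.log(hpi)
qoq_growth = log_hpi.diff().dropna()   # quarter-over-quarter log change
yoy_growth = log_hpi.diff(4).dropna()  # year-over-year log change

print(f"Levels: {len(hpi)} obs, QoQ: {len(qoq_growth)} obs, YoY: {len(yoy_growth)} obs")

# Log differences approximate simple percentage changes for small moves
max_gap = (qoq_growth - hpi.pct_change().dropna()).abs().max()
print(f"Largest gap between log change and % change: {max_gap:.4f}")
```

One design consideration: year-over-year changes computed every quarter overlap by three quarters, which builds serial correlation into the transformed series, so the lag selection in the subsequent tests matters more than for quarter-over-quarter changes.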

After transforming, **re-run the stationarity tests** to verify the issue is resolved. E.g.:

```python
unemp_diff = unemp_series.diff().dropna()
# First value: ADF p-value (want < 0.05); second: KPSS p-value (want > 0.05)
print(adfuller(unemp_diff)[1], kpss(unemp_diff, regression='c', nlags="auto")[1])
```

If both tests now indicate stationarity (ADF p < 0.05 and KPSS p > 0.05), you can be confident in using `unemp_diff` (or whatever transformed version) in your PD model.

**Example Code Snippet Putting It Together:**

```python
variables = {"Real GDP": gdp_series, "Unemployment": unemp_series,
             "House Price Index": hpi_series, "Interest Rate": rate_series}

for name, ser in variables.items():
    print(f"\n{name}:")
    adf_p = adfuller(ser, autolag='AIC')[1]
    pp_p = PhillipsPerron(ser).pvalue
    kpss_p = kpss(ser, regression='c', nlags="auto")[1]
    print(f"  ADF p-value = {adf_p:.3f}")
    print(f"  PP p-value  = {pp_p:.3f}")
    print(f"  KPSS p-value = {kpss_p:.3f}")
```

This will output the p-values for each test, which you can analyze as discussed.

**Practical Considerations:**

- Always remove or adjust for **seasonality** before these tests if your data has a seasonal component (e.g. unemployment is often seasonal, whereas GDP is usually reported seasonally adjusted by statistical agencies). The tests assume stationarity in a wide sense without deterministic seasonal patterns.
- Ensure there are no obvious structural breaks or regime changes that you haven’t accounted for. For instance, if a macro series had a one-time jump, you might want to include a dummy variable or test the two sub-periods separately.
- Keep in mind that failing to find a unit root in a macroeconomic series is somewhat rare unless you’ve transformed it – many macro series in levels are trending. So you should expect that GDP, nominal house prices, etc., will need to be differenced. Variables like inflation, interest rate spreads, or output gaps are often already stationary by definition.
- If you difference a variable, remember to **interpret the model accordingly** – e.g. if your PD model uses $\Delta \text{Unemployment}$ as a predictor, a coefficient on that represents the impact of a change in unemployment on default rates, which might be less intuitive than level. Sometimes an equilibrium plus change formulation is used (both the level and change, or a short-term change vs long-term mean).
- **Validation:** When you document the model, note which variables were differenced or transformed due to stationarity issues. This also helps when updating the model with new data – you’ll continue applying the same transformations.

By using these Python tools and following best practices, you can rigorously assess stationarity and ensure your IFRS 9 PD models are built on solid statistical ground. This aligns with both academic best practices and regulatory expectations that models be statistically sound and not rely on fragile or spurious correlations ([](https://mpra.ub.uni-muenchen.de/27926/1/Stationarity_of_time_series_and_the_problem_of_spurious_regression.pdf#:~:text=However%2C%20from%20the%20practical%20point,When)) ([Finalyse: A practical approach to predicting the IFRS9 Macroeconomic Forward-Looking PD](https://www.finalyse.com/blog/a-practical-approach-to-predicting-the-ifrs9-macroeconomic-forward-looking-pd#:~:text=characteristics%20must%20be%20investigated%20beforehand%2C,such%20as%20KPSS%20or%20ADF)).

Below are some practical considerations on using different sample periods or frequencies for stationarity testing versus the final modeling dataset (or “target”) in IFRS 9 PD modeling. In short, *there is no strict prohibition* against using slightly different frequencies or longer time samples for stationarity tests, but you should do so with care, document your approach, and ensure consistency in how the final variables are actually fed into the model.

---

## 1. Stationarity Testing at Quarterly Frequency vs. Annual Target

### Typical Scenario
- You have a PD model whose *target* (default rates or PD estimates) is observed/reported annually.
- You have quarterly macroeconomic variables (e.g., GDP growth, unemployment, etc.) spanning a longer period.  
- You test for stationarity on those quarterly macro series.

### Is It Allowed?
- **Yes, it is permissible** to test stationarity on quarterly data, even if your ultimate PD model uses annual data.  
- **Why do it?** Stationarity tests require a decent sample size to have enough statistical power to detect a unit root (or lack thereof). Using quarterly macro data over a given span (say 25 years = ~100 points) is more robust than annual data for the same span (only 25 points).  
- **What to watch out for**: 
  1. **Aggregation/Alignment**: If you plan to feed *annualized* macro data into your final model, you must ensure the form of the variable used in the model is consistent with how you tested stationarity. Typically, if you conclude a quarterly series is non-stationary in levels but stationary in *quarterly differences*, you would also expect the *annual* (or year-over-year) differences to be stationary. But you should confirm by testing that *annualized version* of the variable as well—especially if your model input is truly annual.  
  2. **Seasonality**: Quarterlies might have seasonal patterns. Make sure you are using seasonally adjusted data or properly controlling for seasonal effects if relevant.  
  3. **Documentation**: Regulators or validators often want to see that your final variable used in the PD model has also been checked for stationarity. If you tested stationarity on the high-frequency data but the final variable is an annual average, explain that carefully (e.g., “We tested stationarity on the more granular quarterly series for better power; we confirmed that the annualized transformation also appears to be non-stationary and requires differencing,” etc.).  

In practice, many modeling teams do test stationarity at the “native” macro frequency (quarterly or monthly). The better-powered test results guide whether a variable is stable or drifting. Then if you eventually *convert that variable to annual frequency*, you usually replicate or spot-check stationarity on the aggregated series as well.  

---

## 2. Using a Longer Time Span for Macros (20 Years) than for the Target (10 Years)

### Typical Scenario
- You only have 10 years of reliable default data (the *target*), but you have 20 years of macroeconomic history.  
- You want to test stationarity on the full 20-year macro series, even though the final PD model might only use 10 years of overlap with the default data.

### Is It Allowed?
- **Yes, in general, it is allowed**—and often *useful*—to leverage all available historical data for stationarity testing. There is no rule saying you must only test stationarity on precisely the same window used in the final regression.  
- **Rationale**:  
  1. **Improved Statistical Power**: Stationarity tests (ADF, PP, KPSS) work better with more observations. Testing over a longer 20-year window gives you a stronger basis for concluding whether a variable is truly stationary or not.  
  2. **Stable Economic Behavior**: If the macro variable has remained structurally the same (no major definition changes, etc.), then the additional 10 years (beyond your default data window) can still inform you about its inherent time-series properties.  
  3. **Possibility of More Cycles**: IFRS 9 models should ideally consider multiple economic cycles. Even if your defaults are only reliably measured for 10 years, there may be an earlier cycle in the macro series that helps confirm whether the variable’s behavior is cyclical/stationary.  

### Caveats and Recommendations
1. **Structural Breaks**: If the earlier 10 years of macro data come from a fundamentally different regime (e.g., pre-Euro introduction, or a different regulatory environment) and you do *not* believe it reflects the same economic relationship relevant to your current PD model, including it could *muddy* the stationarity analysis (e.g., you might pick up a break or a trend that no longer applies).  
2. **Check the Subsample**: Even if you test stationarity using 20 years, it is still good practice to *re-test* on the final 10-year period to see if the conclusion is consistent. Sometimes, a variable might be stationary over a 20-year horizon but exhibit near-non-stationary behavior over the last 10 years (or vice versa).  
3. **Modeling Implication**: Stationarity in the long sample is *helpful to know*, but your PD model ultimately uses the 10-year overlap. If the 10-year portion is borderline or suggests a different property (e.g., near-unit root), you might want to treat it more conservatively. Real estate prices, for instance, may look trend-stationary over 20–30 years but appear almost monotonic in a short 10-year sample—leading to contradictory test results.  
4. **Regulatory/Validation Perspective**: You should be transparent about your approach. Document that you used a broader historical window “to enhance the reliability of stationarity tests,” then mention any additional checks on the restricted sample.  

---

## 3. Practical Tips and Best Practice

1. **Test at the Frequency You *Intend to Model***  
   - In many cases, teams do parallel tests: they test stationarity on the highest available frequency (for better power), but also test the exact series that goes into the model (e.g., annual transformations). This way, you have a thorough understanding and a consistent story for validators.  
2. **When in Doubt, Transform**  
   - If there is any ambiguity—especially with limited data (10 years) or contradictory test outcomes—many modelers opt for a safe approach like differencing or using growth rates. This reduces the risk of spurious regressions.  
3. **Document Everything**  
   - IFRS 9 models are often under scrutiny by auditors, regulators, and internal validators. If you tested stationarity on 20 years but used only 10 years for PD modeling, clearly explain why you did it (e.g., more data for robust testing, consistent macro definition over 20 years, only 10 years of default data, etc.).  
4. **Structural Breaks and Economic Logic**  
   - Even with 20 years, check for changes in macro regimes (especially in Europe—pre-/post-Euro, financial crisis, etc.). If you find a break, consider running stationarity tests on each sub-regime or use break-adjusted tests (e.g., Zivot-Andrews).  
5. **No Hard Regulatory Prohibition**  
   - There is no explicit IFRS 9 rule that forbids using different sample spans for stationarity vs. PD modeling. IFRS 9 guidance focuses on *sound methodology* and *forward-looking models* that are not spurious. As long as you can demonstrate statistical rigor and logical consistency, it is typically acceptable.  

---

### Bottom Line

- **Testing quarterly macro data even if the target is annual** is common and acceptable, especially if you then appropriately **aggregate or transform** the variables for annual modeling *and* confirm they still behave as expected (stationary vs. non-stationary) post-aggregation.  
- **Testing stationarity on a longer macro series** than the final PD time window is also acceptable. It provides more robust statistical evidence—*provided* you check for consistency over the shorter horizon.  
- Always **explain and justify** any difference in frequencies or sample periods to stakeholders, including how you ensured the final model inputs are consistent with your stationarity findings.

In [1]:
import pandas as pd
import pandasdmx as sdmx
import numpy as np
from statsmodels.tsa.stattools import adfuller, kpss, zivot_andrews
import requests
import warnings
from statsmodels.tools.sm_exceptions import InterpolationWarning

# Optionally suppress the warning:
warnings.filterwarnings("ignore", category=InterpolationWarning)
warnings.filterwarnings("ignore", category=RuntimeWarning)

# Ensure no limitations in Jupyter Notebook output
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
pd.set_option('display.max_colwidth', None)


### A. Retrieve macroeconomic data

In [2]:
# Parametrisation
countries = [
    "AUT",  # Austria
    "BEL",  # Belgium
    "BGR",  # Bulgaria
    "CYP",  # Cyprus
    "CZE",  # Czechia
    "DEU",  # Germany
    "DNK",  # Denmark
    "EA20", # Euro area (20 countries)
    "ESP",  # Spain
    "EST",  # Estonia
    "EU27", # European Union (27 countries)
    "FIN",  # Finland
    "FRA",  # France
    "GBR",  # United Kingdom
    "GRC",  # Greece
    "HRV",  # Croatia
    "HUN",  # Hungary
    "IRL",  # Ireland
    "ITA",  # Italy
    "LTU",  # Lithuania
    "LUX",  # Luxembourg
    "LVA",  # Latvia
    "MLT",  # Malta
    "NLD",  # Netherlands
    "POL",  # Poland
    "PRT",  # Portugal
    "ROM",  # Romania
    "SVK",  # Slovakia
    "SVN",  # Slovenia
    "SWE",  # Sweden
]

macroeconomic_variable = 'OVGD'

In [3]:
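def retrieve_macro_series(countries, macroeconomic_variable):
    """Download annual AMECO series from the ECB SDMX service as a year-indexed DataFrame."""
    # Build the series key:
    series = f"AME/A.{'+'.join(countries)}.1.0.0.0.{macroeconomic_variable}"
    url = 'https://sdw-wsrest.ecb.europa.eu/service/data/'

    # Headers used as content negotiation to return data in json format
    headers = {'Accept': 'application/json'}
    response = requests.get(f'{url}{series}', headers=headers, timeout=60)
    response.raise_for_status()  # fail early on HTTP errors instead of parsing an error page
    r = response.json()

    # Map observation positions to years and series positions to area names
    date_list = r['structure']['dimensions']['observation'][0]['values']
    dates = {i: v['id'] for i, v in enumerate(date_list)}
    areas = [v['name'] for v in r['structure']['dimensions']['series'][1]['values']]

    # Collect one Series per area and combine them at the end, so that years
    # missing from the first area are not dropped for the remaining areas
    columns = {}
    for i, area in enumerate(areas):
        s_key = f'0:{i}:0:0:0:0:0'
        s_list = r['dataSets'][0]['series'][s_key]['observations']
        columns[area] = pd.Series({dates[int(pos)]: obs[0] for pos, obs in s_list.items()})

    df = pd.DataFrame(columns)
    df.index = df.index.astype(int)
    df = df.sort_index()

    return df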

In [4]:
# GDP converted to annual growth rates; unemployment rate kept in levels
gdp_df = retrieve_macro_series(countries, 'OVGD')
gdp_df = gdp_df.pct_change(fill_method=None)

unemployment_df = retrieve_macro_series(countries, 'ZUTN')

In [5]:
gdp_df.loc[2000: 2024]

Year,Austria,Belgium,Bulgaria,Cyprus,Czech Republic,FR. Germany,Denmark,Euro area (20 countries),Spain,Estonia,European Union (27 countries),Finland,France,United Kingdom,Greece,Croatia,Hungary,Ireland,Italy,Lithuania,Luxembourg,Latvia,Malta,Netherlands,Poland,Portugal,Romania,Slovakia,Slovenia,Sweden
2000,0.031895,0.037167,0.045872,0.059653,0.040107,0.028773,0.03724,0.038944,0.052006,0.100877,0.038939,0.057537,0.04141,0.043417,0.041378,0.029453,0.044094,0.094035,0.038821,0.034186,0.069381,0.05841,0.064134,0.042198,0.046563,0.038162,0.024614,0.007893,0.034979,0.046311
2001,0.01317,0.010996,0.038238,0.039526,0.029172,0.016365,0.0095,0.021853,0.039191,0.058807,0.021758,0.026403,0.018994,0.025727,0.046504,0.031112,0.04062,0.053058,0.020065,0.064841,0.030744,0.064606,-0.007572,0.023236,0.012337,0.019437,0.052182,0.029251,0.028287,0.013608
2002,0.014844,0.017069,0.058719,0.037229,0.015135,-0.002283,0.004565,0.009468,0.027555,0.069331,0.011106,0.016873,0.010678,0.017957,0.046832,0.058033,0.047301,0.058994,0.002699,0.067199,0.032254,0.076655,0.027411,0.002458,0.019013,0.007709,0.05703,0.044172,0.032826,0.022777
2003,0.011416,0.010379,0.052371,0.026233,0.033008,-0.005299,0.004411,0.007532,0.029393,0.075962,0.009081,0.020118,0.009678,0.031523,0.057968,0.055684,0.039388,0.030139,0.000665,0.105519,0.026194,0.084291,0.036933,0.000978,0.035234,-0.009305,0.023412,0.048568,0.03195,0.018809
2004,0.025653,0.035712,0.065106,0.050263,0.047363,0.011624,0.027764,0.023121,0.031145,0.068007,0.025924,0.040054,0.028683,0.024579,0.053778,0.041705,0.049631,0.067881,0.01474,0.06502,0.042319,0.087251,0.004073,0.020163,0.050908,0.017887,0.10428,0.053893,0.04546,0.041795
2005,0.023204,0.023217,0.070564,0.048531,0.06375,0.008857,0.023596,0.017718,0.035512,0.095231,0.019247,0.027773,0.018886,0.027327,0.011835,0.043271,0.043003,0.057398,0.007624,0.077315,0.024829,0.116153,0.028813,0.02034,0.032608,0.007819,0.046681,0.064851,0.038541,0.027932
2006,0.032693,0.025523,0.068026,0.047138,0.066232,0.038557,0.038165,0.033215,0.040434,0.097626,0.034944,0.040196,0.02714,0.023807,0.064434,0.050815,0.039339,0.049878,0.017995,0.073954,0.060167,0.12825,0.023361,0.035374,0.062021,0.01625,0.080288,0.089256,0.059088,0.04676
2007,0.037752,0.036769,0.066543,0.05098,0.054888,0.028901,0.009871,0.030114,0.035335,0.075709,0.03156,0.053128,0.025305,0.026249,0.035068,0.050489,0.003324,0.053102,0.014623,0.11078,0.080987,0.104146,0.050364,0.038853,0.067605,0.025066,0.072339,0.108187,0.071392,0.032249
2008,0.014533,0.004469,0.061295,0.036468,0.026123,0.009104,-0.004172,0.004164,0.007671,-0.051253,0.006411,0.007844,0.003802,-0.002488,0.000575,0.019686,0.009934,-0.044841,-0.010231,0.025994,-0.003002,-0.033887,0.044107,0.021168,0.043837,0.003192,0.093074,0.053634,0.033722,-0.009231
2009,-0.035863,-0.019065,-0.033471,-0.020153,-0.047983,-0.055452,-0.049745,-0.044827,-0.037681,-0.146302,-0.043487,-0.08076,-0.028246,-0.046205,-0.041193,-0.068139,-0.067402,-0.050958,-0.053051,-0.148386,-0.032389,-0.1604,-0.013953,-0.036653,0.026151,-0.031221,-0.055167,-0.055054,-0.075906,-0.042556


In [6]:
unemployment_df.loc[2000: 2024]

Year,Austria,Belgium,Bulgaria,Cyprus,Czech Republic,FR. Germany,Denmark,Euro area (20 countries),Spain,Estonia,European Union (27 countries),Finland,France,United Kingdom,Greece,Croatia,Hungary,Ireland,Italy,Lithuania,Luxembourg,Latvia,Malta,Netherlands,Poland,Portugal,Romania,Slovakia,Slovenia,Sweden
2000,3.8,7.1,19.6,4.9,8.8,7.4,4.6,,13.9,14.6,9.9,9.9,8.6,5.459822,11.6,15.6,6.2,4.5,10.7,16.4,2.4,14.5,6.6,3.6,16.8,4.8,8.9,18.8,6.8,6.7
2001,3.9,6.7,23.6,3.9,8.2,7.5,4.6,,10.6,13.0,9.5,9.2,7.8,5.099012,11.0,16.0,5.5,4.2,9.7,17.3,2.3,13.9,6.9,2.8,19.0,4.8,8.3,19.3,6.2,5.0
2002,4.3,7.6,21.1,3.6,7.3,8.2,4.6,,11.5,11.2,9.8,9.2,7.9,5.188219,10.6,15.0,5.6,4.7,9.1,13.7,2.9,12.6,6.9,3.4,20.7,6.0,10.5,18.7,6.3,5.2
2003,4.6,8.3,15.9,4.3,7.8,9.3,5.4,,11.5,10.3,9.9,9.1,8.5,5.012749,10.0,14.2,5.7,4.8,8.8,12.5,3.7,11.7,7.6,4.5,20.4,7.5,8.5,17.6,6.7,5.8
2004,5.9,8.5,14.1,4.7,8.3,10.2,5.5,,11.0,10.1,10.1,8.9,8.9,4.753824,10.8,13.7,5.9,4.7,8.1,10.9,5.1,11.8,7.2,5.6,19.7,7.8,9.9,18.2,6.3,6.6
2005,6.0,8.6,11.7,5.3,7.9,10.5,4.8,,9.2,8.0,9.8,8.5,8.9,4.830368,10.2,12.8,7.0,4.6,7.8,8.3,4.5,10.1,6.9,7.2,18.5,9.0,8.8,16.3,6.5,7.6
2006,5.7,8.4,10.5,4.6,7.2,9.6,3.9,,8.5,5.9,8.8,7.8,8.8,5.422897,9.2,11.3,7.3,4.8,6.9,5.8,4.7,7.1,6.8,6.1,14.4,9.1,8.9,13.4,6.0,7.2
2007,5.3,7.6,8.0,3.9,5.3,8.1,3.8,,8.2,4.6,7.7,7.0,8.0,5.332777,8.6,9.9,7.2,5.0,6.2,4.3,4.1,6.2,6.5,5.2,10.0,9.5,7.8,11.1,4.9,6.3
2008,4.4,7.1,6.5,3.7,4.4,7.0,3.7,,11.3,5.5,7.4,6.5,7.4,5.685386,8.0,8.6,7.6,6.8,6.8,5.8,5.1,7.8,6.0,4.5,7.4,9.0,7.1,9.5,4.4,6.3
2009,5.7,8.0,7.9,5.4,6.7,7.3,6.4,9.7,17.9,13.5,9.3,8.3,9.1,7.613899,9.8,9.2,9.7,12.6,7.9,13.8,5.1,17.7,6.9,5.4,8.5,11.2,8.4,12.0,5.9,8.5


### B. Analyse stationarity

In [7]:
import pandas as pd
from statsmodels.tsa.stattools import adfuller, kpss, zivot_andrews

# Supported tests. Each function returns a tuple whose second element is the p-value.
STATIONARITY_TESTS = {
    'adf': adfuller,
    'kpss': kpss,
    'zivot': zivot_andrews,
}

def stationarity_testing(
    df, 
    test='adf', 
    start_year_range=range(2001, 2015), 
    end_year=2024, 
    **kwargs
):
    """
    Apply a stationarity test to each time series (DataFrame column) over sliding windows.
    
    Parameters:
    -----------
    df : pd.DataFrame
        Time series data with time as the index and series (e.g., countries) as columns.
    test : str, optional (default='adf')
        Stationarity test to use. Options are:
          - 'adf': Augmented Dickey–Fuller test
          - 'kpss': KPSS test
          - 'zivot': Zivot–Andrews test
    start_year_range : iterable, optional (default=range(2001, 2015))
        Iterable of starting years for the sliding window.
    end_year : int, optional (default=2024)
        The end year for the sliding window.
    **kwargs :
        Additional keyword arguments to pass to the chosen test function.
        
    Returns:
    --------
    pd.DataFrame
        A DataFrame with series as rows and sliding window labels as columns, 
        containing the p-values from the stationarity tests. Windows where the
        test fails to run hold the placeholder value 0.5.
    """
    # Validate the test name up front so that a typo raises instead of being
    # silently swallowed by the error handling inside the loop.
    test_name = test.lower()
    if test_name not in STATIONARITY_TESTS:
        raise ValueError(f"Unknown test: {test}. Choose 'adf', 'kpss', or 'zivot'.")
    test_func = STATIONARITY_TESTS[test_name]

    results = {}
    
    for country in df.columns:
        p_values_by_window = {}
        for start_year in start_year_range:
            # Select the sub-series for the current window.
            ts_sub = df.loc[start_year:end_year, country].dropna()
            if len(ts_sub) < 2:
                continue

            try:
                p_value = test_func(ts_sub, **kwargs)[1]
            except Exception:
                # The test could not be computed (e.g. too few observations).
                # 0.5 is a placeholder, not a real p-value; it appears as 50.00% in the output.
                p_value = 0.5
            
            window_label = f"{start_year}-{end_year}"
            p_values_by_window[window_label] = p_value
        
        results[country] = p_values_by_window
    
    return pd.DataFrame(results).T

def highlight_pvalue(val, test='adf'):
    """
    Return a CSS style string that colours a cell by p-value and test type.

      - ADF or Zivot–Andrews (null: non-stationary):
        p < 0.05 green, 0.05 <= p < 0.10 orange, p >= 0.10 red.
      - KPSS (null: stationary): colours are reversed, because a small
        p-value rejects stationarity.

    Green therefore always marks a result consistent with stationarity.
    Text is black; backgrounds use 40% opacity.
    """
    if pd.isna(val):
        return ""

    green = "rgba(0, 128, 0, 0.4)"
    orange = "rgba(255, 165, 0, 0.4)"
    red = "rgba(255, 0, 0, 0.4)"

    test_name = test.lower()
    if test_name in ('adf', 'zivot'):
        # Null: non-stationary, so a small p-value rejects the unit root.
        low, high = green, red
    elif test_name == 'kpss':
        # Null: stationarity, so a small p-value points to non-stationarity.
        low, high = red, green
    else:
        return ""  # Unknown test: leave the cell unstyled.

    if val < 0.05:
        bg_color = low
    elif val < 0.10:
        bg_color = orange
    else:
        bg_color = high

    return f"background-color: {bg_color}; color: black;"

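Because ADF and KPSS test opposite null hypotheses, their p-values can be read jointly. The sketch below shows one common way to do that; it is not part of the notebook's pipeline. The helper name `combine_verdicts` and the 5% threshold `alpha` are assumptions made for illustration.

```python
def combine_verdicts(p_adf, p_kpss, alpha=0.05):
    """Classify a series from an ADF and a KPSS p-value (hypothetical helper)."""
    adf_rejects_unit_root = p_adf < alpha        # evidence for stationarity
    kpss_rejects_stationarity = p_kpss < alpha   # evidence against stationarity

    if adf_rejects_unit_root and not kpss_rejects_stationarity:
        return "stationary"
    if not adf_rejects_unit_root and kpss_rejects_stationarity:
        return "non-stationary"
    # Both reject or neither rejects: the tests disagree or lack power.
    return "inconclusive"


print(combine_verdicts(0.01, 0.10))  # stationary
print(combine_verdicts(0.60, 0.01))  # non-stationary
print(combine_verdicts(0.01, 0.01))  # inconclusive
```

With short annual samples, the "inconclusive" outcome is likely to be frequent, which is itself useful information.
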
In [8]:
# Test GDP (options: adf, kpss, zivot)
test = 'adf'

df_results = stationarity_testing(gdp_df.loc['2000':'2024'], test=test)
df_results = df_results.sort_index()

df_results_styled = (
    df_results
    .style
    .format("{:.2%}")
    .map(lambda x: highlight_pvalue(x, test=test))
)

df_results_styled

Series,2001-2024,2002-2024,2003-2024,2004-2024,2005-2024,2006-2024,2007-2024,2008-2024,2009-2024,2010-2024,2011-2024,2012-2024,2013-2024,2014-2024
Austria,0.21%,0.08%,0.00%,0.00%,0.00%,0.00%,14.57%,18.41%,15.32%,0.05%,2.35%,22.66%,45.38%,0.20%
Belgium,0.01%,0.01%,0.00%,0.00%,0.00%,0.00%,0.00%,0.00%,0.00%,4.52%,0.00%,62.52%,29.30%,26.92%
Bulgaria,0.50%,0.50%,0.56%,0.37%,0.15%,0.04%,0.00%,0.00%,67.53%,0.02%,0.04%,0.06%,0.01%,0.00%
Croatia,0.12%,0.12%,0.12%,0.17%,0.20%,0.21%,82.34%,0.40%,0.20%,0.32%,0.42%,0.30%,52.76%,28.70%
Cyprus,2.08%,2.58%,3.26%,3.89%,4.54%,5.33%,5.99%,7.77%,0.00%,35.34%,30.02%,6.32%,13.22%,9.68%
Czech Republic,0.19%,0.97%,0.06%,0.42%,0.20%,0.05%,16.10%,23.98%,2.76%,0.30%,2.50%,7.75%,0.42%,1.06%
Denmark,0.02%,0.02%,0.03%,0.05%,0.08%,0.06%,0.25%,0.21%,0.00%,0.00%,23.59%,9.76%,17.64%,0.60%
Estonia,0.94%,3.47%,3.26%,3.42%,2.14%,1.07%,0.82%,2.93%,0.00%,1.50%,0.83%,1.69%,2.99%,4.69%
Euro area (20 countries),0.00%,0.00%,0.00%,0.00%,0.00%,0.00%,35.01%,44.69%,21.49%,8.80%,8.41%,0.06%,48.84%,0.18%
European Union (27 countries),0.02%,0.00%,0.00%,0.00%,0.01%,0.01%,37.21%,43.80%,14.40%,4.86%,5.43%,0.07%,49.34%,0.19%

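All windows share the same end year, so each later start year leaves fewer annual observations for the test. The sketch below counts them, assuming one observation per year with no gaps; the helper name `window_lengths` is illustrative and not part of the notebook.

```python
def window_lengths(start_years, end_year):
    """Return {window label: number of annual observations}, assuming no missing years."""
    return {f"{start}-{end_year}": end_year - start + 1 for start in start_years}


lengths = window_lengths(range(2001, 2015), 2024)
print(lengths["2001-2024"])  # 24
print(lengths["2014-2024"])  # 11
```

Missing values shorten a window further, because `stationarity_testing` drops them before running the test.
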

In [9]:
# Test unemployment (options: adf, kpss, zivot)
test = 'adf'

df_results = stationarity_testing(unemployment_df.loc['2000':'2024'], test=test)
df_results = df_results.sort_index()

df_results_styled = (
    df_results
    .style
    .format("{:.2%}")
    .map(lambda x: highlight_pvalue(x, test=test))
)

df_results_styled

Series,2001-2024,2002-2024,2003-2024,2004-2024,2005-2024,2006-2024,2007-2024,2008-2024,2009-2024,2010-2024,2011-2024,2012-2024,2013-2024,2014-2024
Austria,0.28%,0.04%,0.88%,1.40%,1.97%,3.36%,20.75%,8.23%,2.13%,2.07%,2.94%,6.71%,11.51%,13.16%
Belgium,75.93%,81.54%,60.21%,57.37%,0.02%,69.92%,81.11%,83.22%,97.69%,59.04%,100.00%,87.51%,72.18%,48.20%
Bulgaria,99.80%,87.39%,51.19%,82.18%,81.91%,85.38%,89.87%,89.40%,92.10%,93.01%,86.63%,58.24%,0.47%,0.05%
Croatia,96.23%,85.85%,99.51%,99.72%,100.00%,99.91%,99.87%,99.66%,99.65%,95.93%,95.65%,91.26%,0.00%,55.07%
Cyprus,29.45%,5.16%,16.20%,16.64%,9.82%,87.54%,88.49%,17.27%,100.00%,98.43%,97.91%,94.43%,0.00%,2.89%
Czech Republic,99.32%,99.90%,99.90%,99.38%,0.00%,0.35%,0.00%,87.52%,85.69%,58.71%,65.05%,27.74%,6.84%,2.73%
Denmark,31.84%,23.78%,43.43%,44.80%,0.04%,0.39%,0.00%,0.12%,0.01%,0.67%,0.00%,0.01%,18.56%,0.00%
Estonia,20.08%,26.76%,5.63%,0.02%,2.33%,40.23%,36.52%,40.50%,40.51%,0.00%,0.26%,4.55%,5.77%,13.46%
Euro area (20 countries),99.36%,99.36%,99.36%,99.36%,99.36%,99.36%,99.36%,99.36%,99.36%,96.25%,95.80%,90.48%,39.42%,22.15%
European Union (27 countries),96.90%,99.73%,99.87%,66.17%,98.80%,67.96%,90.73%,90.30%,98.53%,94.83%,42.52%,39.04%,45.23%,5.62%

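One way to condense a table like the one above is to count, per series, how many windows reject the unit-root null at 5%. The sketch below does this for the Austria and Belgium rows, retyped by hand from the unemployment ADF output; the helper name `rejection_counts` and the threshold `alpha` are assumptions for illustration.

```python
import pandas as pd

windows = [f"{start}-2024" for start in range(2001, 2015)]

# ADF p-values for unemployment, copied from the output table (as fractions).
p_values = pd.DataFrame(
    [
        [0.0028, 0.0004, 0.0088, 0.0140, 0.0197, 0.0336, 0.2075,
         0.0823, 0.0213, 0.0207, 0.0294, 0.0671, 0.1151, 0.1316],
        [0.7593, 0.8154, 0.6021, 0.5737, 0.0002, 0.6992, 0.8111,
         0.8322, 0.9769, 0.5904, 1.0000, 0.8751, 0.7218, 0.4820],
    ],
    index=["Austria", "Belgium"],
    columns=windows,
)


def rejection_counts(p_values, alpha=0.05):
    """Count the windows per series whose p-value falls below alpha."""
    return (p_values < alpha).sum(axis=1)


counts = rejection_counts(p_values)
print(int(counts["Austria"]))  # 9
print(int(counts["Belgium"]))  # 1
```

A series whose verdict flips with the start year, as Austria's does here, should not be labelled stationary or non-stationary on the basis of a single window.
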

In [10]:
# Test unemployment (options: adf, kpss, zivot)
test = 'zivot'

df_results = stationarity_testing(unemployment_df.loc['2000':'2024'], test=test)
df_results = df_results.sort_index()

df_results_styled = (
    df_results
    .style
    .format("{:.2%}")
    .map(lambda x: highlight_pvalue(x, test=test))
)

df_results_styled

Series,2001-2024,2002-2024,2003-2024,2004-2024,2005-2024,2006-2024,2007-2024,2008-2024,2009-2024,2010-2024,2011-2024,2012-2024,2013-2024,2014-2024
Austria,25.77%,0.30%,5.42%,9.82%,11.07%,19.40%,14.50%,44.41%,21.22%,38.01%,50.00%,50.00%,0.00%,50.00%
Belgium,3.31%,6.73%,50.00%,9.78%,50.00%,nan%,nan%,1.15%,50.00%,50.00%,50.00%,51.04%,99.09%,99.59%
Bulgaria,0.07%,9.11%,0.07%,50.00%,nan%,nan%,50.00%,50.00%,0.00%,50.00%,50.00%,98.31%,99.39%,96.21%
Croatia,50.00%,nan%,39.66%,50.00%,98.18%,50.00%,0.08%,nan%,5.08%,50.00%,50.00%,50.00%,50.00%,99.31%
Cyprus,50.00%,50.00%,94.89%,50.00%,nan%,50.00%,50.00%,nan%,nan%,nan%,0.00%,50.00%,50.00%,50.00%
Czech Republic,0.00%,0.03%,50.00%,0.01%,50.00%,50.00%,50.00%,nan%,50.00%,50.00%,50.00%,50.00%,99.39%,99.12%
Denmark,50.00%,50.00%,nan%,50.00%,0.12%,50.00%,nan%,0.00%,nan%,50.00%,50.00%,0.00%,50.00%,50.00%
Estonia,56.85%,63.57%,65.95%,59.86%,50.00%,nan%,50.00%,nan%,nan%,75.79%,0.00%,50.00%,0.00%,94.91%
Euro area (20 countries),nan%,nan%,nan%,nan%,nan%,nan%,nan%,nan%,nan%,nan%,90.54%,50.00%,50.00%,50.00%
European Union (27 countries),50.00%,50.00%,50.00%,50.00%,50.00%,50.00%,50.00%,3.50%,nan%,50.00%,50.00%,50.00%,50.00%,99.44%

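The many entries of exactly 50.00% in the table above most likely come from the `except` branch of `stationarity_testing`, which substitutes 0.5 whenever a test raises an error. The sketch below shows an alternative that keeps failures visible by returning NaN together with the error message. The helper `safe_p_value` and the two stand-in test functions are hypothetical and only illustrate the pattern.

```python
import math


def safe_p_value(test_func, series, **kwargs):
    """Run test_func and return (p_value, error_message).

    On failure the p-value is NaN and the error message is kept,
    so a failed test cannot be mistaken for a genuine result.
    """
    try:
        return test_func(series, **kwargs)[1], None
    except Exception as exc:
        return math.nan, f"{type(exc).__name__}: {exc}"


# Stand-in test functions (hypothetical) that exercise both paths without real data.
def always_fails(series):
    raise ValueError("too few observations")


def fixed_result(series):
    return (-3.2, 0.02)  # (test statistic, p-value)


print(safe_p_value(fixed_result, [1, 2, 3]))  # (0.02, None)
print(safe_p_value(always_fails, [1, 2, 3]))  # (nan, 'ValueError: too few observations')
```

Any function that returns the p-value as the second element of a tuple, such as `adfuller`, `kpss`, or `zivot_andrews`, could be passed as `test_func`.
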