# Appendix: Ljung-Box Test

The Ljung-Box test is a statistical test used primarily in time series analysis to check whether a group of autocorrelations, taken over several lags at once, is jointly different from zero. In simpler terms, it's used to determine whether past values in the time series are linearly related to future values.

## Understanding the Ljung-Box Test:

- Purpose: It's commonly applied to residuals from a model to confirm that the model captures all the patterns in the data, and the residuals are random (white noise). If the residuals show significant autocorrelation, that indicates the model is not adequately capturing some aspect of the data's structure.

- How It Works: The test examines a group of autocorrelations simultaneously and computes a single test statistic. It tests the null hypothesis that the data are independently distributed (i.e., the correlations for lagged terms are zero).
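For reference, the statistic is usually written as follows, where $n$ is the sample size, $h$ is the number of lags tested, and $\hat{\rho}_k$ is the sample autocorrelation at lag $k$. These symbols are the conventional ones and are an assumption here, since this appendix does not define its own notation.

$$
Q = n(n+2) \sum_{k=1}^{h} \frac{\hat{\rho}_k^{2}}{n-k}
$$

Under the null hypothesis, $Q$ is approximately chi-square distributed with $h$ degrees of freedom (fewer when the series consists of residuals from a fitted model).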

## Components of the Test:

1. X-squared (χ²) Statistic:

- This is the test statistic calculated by the Ljung-Box test. It's essentially a sum of the squared autocorrelations of the series, adjusted for the sample size and the number of lags being tested.

- A higher χ² value indicates more evidence against the null hypothesis. It means there's a stronger suggestion of autocorrelation at one or more of the tested lags.

2. p-value:

- The p-value tells you the probability of seeing the test results under the null hypothesis. In the context of the Ljung-Box test, it indicates the probability of observing the computed χ² statistic (or one more extreme) if the null hypothesis of no autocorrelation is true.

- A low p-value (commonly < 0.05) suggests that you can reject the null hypothesis, i.e., there is evidence of autocorrelation at one or more of the tested lags.

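- Conversely, a high p-value means the test fails to reject the null hypothesis: the series is consistent with white noise at the tested lags (i.e., no significant autocorrelation), although this does not prove that it is random.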
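To make the statistic concrete, here is a minimal pure-Python sketch that computes it by hand using the usual textbook form of the Ljung-Box statistic. The function names `autocorrelation` and `ljung_box_q` and the alternating test series are illustrative assumptions, not part of any library; in practice a library routine such as `acorr_ljungbox` would normally be used instead.

```python
def autocorrelation(x, k):
    """Sample autocorrelation of the sequence x at lag k."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return num / denom


def ljung_box_q(x, h):
    """Ljung-Box statistic: n(n+2) * sum over k=1..h of r_k^2 / (n-k)."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorrelation(x, k) ** 2 / (n - k) for k in range(1, h + 1)
    )


# Illustrative data (an assumption): a strictly alternating series,
# which has a lag-1 autocorrelation of -0.99 for n = 100.
alternating = [1.0, -1.0] * 50
q_alt = ljung_box_q(alternating, h=1)
print(f"Q (alternating series, 1 lag): {q_alt:.2f}")  # 100.98
```

With one lag, the 5% critical value of the chi-square distribution with 1 degree of freedom is about 3.84, so a statistic near 101 rejects the null hypothesis of no autocorrelation decisively.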

## When to Use the Test:

- **Model Diagnostics**: After fitting a time series model, you might use the Ljung-Box test on the residuals to check if they appear to be white noise.

- **Number of Lags**: The number of lags used in the test affects its power. Too few lags might miss autocorrelation, while too many might dilute the test with unnecessary comparisons.
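As a rough illustration of the lag choice, the sketch below encodes one commonly cited rule of thumb: about 10 lags for non-seasonal data, or twice the seasonal period for seasonal data, capped at one fifth of the sample size. It also shows the usual adjustment of the chi-square degrees of freedom when the series consists of residuals from a fitted ARMA model. Both helpers (`default_lags`, `chi2_dof`) are hypothetical names introduced here, and the rule is a heuristic, not a requirement of the test.

```python
def default_lags(n_obs, seasonal_period=None):
    """Rule-of-thumb lag count: min(10, n/5), or min(2m, n/5) for seasonal data."""
    cap = n_obs // 5
    base = 10 if seasonal_period is None else 2 * seasonal_period
    return max(1, min(base, cap))


def chi2_dof(lags, n_fitted_params=0):
    """Degrees of freedom of the reference chi-square distribution.

    For residuals of an ARMA(p, q) fit, n_fitted_params would be p + q.
    """
    dof = lags - n_fitted_params
    if dof <= 0:
        raise ValueError("need more lags than fitted ARMA parameters")
    return dof


print(default_lags(252))                      # 10 (about one year of daily data)
print(default_lags(30))                       # 6  (short sample: capped at n/5)
print(default_lags(120, seasonal_period=12))  # 24 (monthly data, 2 * 12)
print(chi2_dof(10, n_fitted_params=2))        # 8  (e.g. residuals of an ARMA(1, 1))
```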

In practice, the Ljung-Box test is a tool among many that analysts use to understand time series data and assess model fit. It's often used in combination with other diagnostic tools and plots to build a comprehensive understanding of the data and the model's performance.

```python
import yfinance as yf
import numpy as np
import pandas as pd
from statsmodels.stats.diagnostic import acorr_ljungbox

# AAPL stock data
aapl = yf.download('AAPL', start='2020-01-01', end='2020-12-31')

# Compute daily returns from the closing price
aapl['Return'] = aapl['Close'].pct_change()

# Drop the first row, whose return is NaN
aapl = aapl.dropna()

# Run the Ljung-Box test at lag 10; return_df=True requests a DataFrame
# with lb_stat and lb_pvalue columns, matching the output shown
result = acorr_ljungbox(aapl['Return'], lags=[10], return_df=True)

# Display the result
result
# print(f"Ljung-Box Test Statistic: {result['lb_stat'].iloc[0]}")
# print(f"P-value: {result['lb_pvalue'].iloc[0]}")
```


| lag | lb_stat | lb_pvalue |
|-----|-----------|--------------|
| 10 | 69.541689 | 5.435371e-11 |


The Ljung-Box test on the AAPL stock's daily returns gives a lb_stat (Ljung-Box statistic) of 69.541689 and a lb_pvalue (p-value) of approximately 5.435371e-11. Here's how to interpret these results:

## 1. Ljung-Box Statistic (lb_stat = 69.541689):

- This is a relatively high value. The Ljung-Box statistic is a measure of the overall significance of the autocorrelations up to the specified lag (here, 10 lags). A higher value indicates stronger evidence against the null hypothesis, which in this context is that there is no autocorrelation in the time series data.

## 2. P-value (lb_pvalue ≈ 5.435371e-11):

- The p-value is a very small number (close to zero). In statistical testing, a small p-value (commonly < 0.05) is usually interpreted as strong evidence against the null hypothesis. Here, this small p-value means the null hypothesis of no autocorrelation can be rejected.

- Essentially, this means that there is strong statistical evidence that the time series data (AAPL stock's daily returns) exhibit autocorrelation at one or more of the lags tested.
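As a sanity check on how the p-value follows from the statistic, the sketch below recomputes the upper-tail chi-square probability for the reported statistic with 10 degrees of freedom. It relies on a closed-form expression that is valid only for an even number of degrees of freedom, which happens to fit this case; `chi2_sf_even` is a hypothetical helper written for this illustration, and a general-purpose routine from a statistics library would be the usual choice.

```python
import math


def chi2_sf_even(x, dof):
    """Upper-tail probability P(X > x) for a chi-square variable with even dof.

    Uses exp(-x/2) * sum_{j < dof/2} (x/2)^j / j!, which holds only for even dof.
    """
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("closed form requires a positive, even dof")
    half = x / 2.0
    return math.exp(-half) * sum(
        half ** j / math.factorial(j) for j in range(dof // 2)
    )


lb_stat = 69.541689  # Ljung-Box statistic reported in the output
lags = 10            # lags tested, used as the degrees of freedom

p_value = chi2_sf_even(lb_stat, lags)
print(f"p-value: {p_value:.3e}")  # close to the reported 5.435e-11

# For comparison: 18.307 is the 5% critical value for 10 degrees of freedom.
print(f"tail beyond 18.307: {chi2_sf_even(18.307, lags):.3f}")  # 0.050
```

The statistic of about 69.5 lies far beyond the 5% critical value of about 18.3, which is why the p-value is so small.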

## Interpretation:

Given the high Ljung-Box statistic and the very small p-value, the test strongly rejects the hypothesis that the AAPL stock's daily returns in this sample are serially uncorrelated up to lag 10. This implies that past values have some linear influence on future values in this time series.

## Implications:

- For model building: If these returns were residuals from a model, the presence of autocorrelation would suggest that the model has not fully captured all the dynamics in the data, and further refinement of the model might be necessary.

- For investment or trading strategies: The presence of autocorrelation might be exploited or accounted for in the development of strategies.
