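# Overview

AIC (Akaike information criterion) and BIC (Bayesian information criterion) are two common statistical criteria for model selection. Given several candidate models fitted to the same dataset, they help decide which one is preferable by weighing goodness of fit against model complexity.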

## AIC

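$AIC = -2\ln(L) + 2k$, where $L$ is the maximized value of the model's likelihood function and $k$ is the number of estimated parameters.

A lower AIC is better. When several candidate models lead to similar conclusions, choose the one with the smallest AIC.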
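
The formula can be evaluated directly once a model's maximized log-likelihood is known. The following is a minimal sketch: the helper `aic` and the two log-likelihood values are hypothetical, chosen only to illustrate the trade-off between fit and parameter count.

```python
def aic(log_likelihood: float, k: int) -> float:
    """Return AIC = -2 ln(L) + 2k, taking ln(L) directly as input."""
    return -2.0 * log_likelihood + 2.0 * k


# Hypothetical values: the larger model fits slightly better
# (higher log-likelihood) but uses three more parameters.
aic_small = aic(log_likelihood=-120.0, k=2)
aic_large = aic(log_likelihood=-118.5, k=5)

print("AIC, small model:", aic_small)
print("AIC, large model:", aic_large)
```

In this made-up case the gain in log-likelihood does not cover the cost of the extra parameters, so the smaller model ends up with the lower AIC.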

## BIC

$BIC = -2\ln(L) + k\ln(n)$, where $n$ is the sample size.

As with AIC, a lower BIC is better.
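
The only difference from AIC is the per-parameter penalty: $\ln(n)$ instead of $2$. The sketch below uses a hypothetical helper `bic` and made-up log-likelihood values, and also searches for the smallest sample size at which the BIC penalty exceeds the AIC penalty.

```python
import math


def bic(log_likelihood: float, k: int, n: int) -> float:
    """Return BIC = -2 ln(L) + k ln(n), taking ln(L) directly as input."""
    return -2.0 * log_likelihood + k * math.log(n)


# Hypothetical values, with n = 100 observations.
bic_small = bic(log_likelihood=-120.0, k=2, n=100)
bic_large = bic(log_likelihood=-118.5, k=5, n=100)
print("BIC, small model:", bic_small)
print("BIC, large model:", bic_large)

# Smallest n for which ln(n) > 2, i.e. BIC penalizes each parameter more than AIC.
threshold_n = next(n for n in range(1, 1000) if math.log(n) > 2.0)
print("BIC penalty exceeds AIC penalty from n =", threshold_n)
```

For any dataset with at least 8 observations, BIC therefore charges more per parameter than AIC and tends to favor smaller models.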

In [1]:
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Create an example dataset (predictors and noise are uniform on [0, 1))
np.random.seed(0)
X1 = np.random.rand(100)
X2 = np.random.rand(100)
Y = 2 * X1 + 3 * X2 + np.random.rand(100)

# Convert the data to a DataFrame
data = pd.DataFrame({'X1': X1, 'X2': X2, 'Y': Y})

# Model 1: Y = β0 + β1 * X1
X1_model = sm.add_constant(data['X1'])
model1 = sm.OLS(data['Y'], X1_model).fit()
AIC1 = model1.aic
BIC1 = model1.bic

# Model 2: Y = β0 + β1 * X1 + β2 * X2
X2_model = sm.add_constant(data[['X1', 'X2']])
model2 = sm.OLS(data['Y'], X2_model).fit()
AIC2 = model2.aic
BIC2 = model2.bic

print("AIC for Model 1:", AIC1)
print("BIC for Model 1:", BIC1)
print("AIC for Model 2:", AIC2)
print("BIC for Model 2:", BIC2)

# Prefer the model with the smaller AIC or BIC
if AIC1 < AIC2:
    print("Model 1 is preferred based on AIC.")
else:
    print("Model 2 is preferred based on AIC.")
    
if BIC1 < BIC2:
    print("Model 1 is preferred based on BIC.")
else:
    print("Model 2 is preferred based on BIC.")


AIC for Model 1: 254.0513159314317
BIC for Model 1: 259.2616563034079
AIC for Model 2: 48.355646053144284
BIC for Model 2: 56.17115661110856
Model 2 is preferred based on AIC.
Model 2 is preferred based on BIC.
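
Because both criteria share the $-2\ln(L)$ term, their difference is $BIC - AIC = k(\ln(n) - 2)$, which allows the parameter count behind the printed values to be recovered. The following sketch copies the four numbers from the output above and uses $n = 100$; the variable names are illustrative only.

```python
import math

n = 100  # sample size used in the example above

# (AIC, BIC) pairs copied from the printed output
reported = {
    "Model 1": (254.0513159314317, 259.2616563034079),
    "Model 2": (48.355646053144284, 56.17115661110856),
}

# BIC - AIC = k * (ln(n) - 2), so k = (BIC - AIC) / (ln(n) - 2)
implied_k = {
    name: (bic_value - aic_value) / (math.log(n) - 2.0)
    for name, (aic_value, bic_value) in reported.items()
}

for name, k in implied_k.items():
    print(name, "implied k:", round(k, 6))
```

The implied counts come out at approximately 2 and 3, which matches the number of regression coefficients in each model including the intercept. This suggests that the error variance is not counted as a parameter in these reported values.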


In [2]:
import numpy as np
import scipy.stats as stats

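# Suppose we have arrays of observed values and model predictions
observed_values = np.array([2.5, 3.1, 4.0, 5.2, 6.8])
predicted_values = np.array([2.3, 3.0, 3.8, 5.0, 6.5])

# Likelihood of each observation, assuming normally distributed errors
# centered on the prediction with standard deviation 1.0
likelihoods = stats.norm.pdf(observed_values, loc=predicted_values, scale=1.0)

# Likelihood of the whole dataset: product over independent observations
total_likelihood = np.prod(likelihoods)

print("Likelihood:", total_likelihood)


Likelihood: 0.009052695991472468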
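
Both AIC and BIC use $\ln(L)$ rather than $L$ itself, and for larger datasets a product of many small densities can underflow to zero. A common alternative is to sum the log-densities instead. The sketch below does this with the standard library only, under the same assumption as above (normal errors with standard deviation 1.0); the helper `normal_log_pdf` is written out here for illustration.

```python
import math

observed_values = [2.5, 3.1, 4.0, 5.2, 6.8]
predicted_values = [2.3, 3.0, 3.8, 5.0, 6.5]
sigma = 1.0  # assumed error standard deviation, as in the cell above


def normal_log_pdf(x: float, mean: float, sigma: float) -> float:
    """Log-density of a normal distribution N(mean, sigma^2) at x."""
    return -0.5 * math.log(2.0 * math.pi * sigma**2) - (x - mean) ** 2 / (2.0 * sigma**2)


# Sum of per-observation log-densities = log of the product of densities
log_likelihood = sum(
    normal_log_pdf(y, mu, sigma) for y, mu in zip(observed_values, predicted_values)
)

print("Log-likelihood:", log_likelihood)
print("Likelihood recovered via exp:", math.exp(log_likelihood))
```

Exponentiating the summed log-likelihood recovers the same likelihood as the product, and the log-likelihood can be plugged into the AIC and BIC formulas as $\ln(L)$ without further transformation.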
