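# Elastic Net Regularization (L1 + L2)

## Core Idea

Elastic Net combines the L1 and L2 penalties: the L1 term pushes some weights to zero (sparsity), while the L2 term keeps the remaining weights small.

## Mathematical Formulation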

```
Loss_total = Loss_original + λ₁ × Σ|wᵢ| + λ₂ × Σwᵢ²
```
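
To make the formula concrete, the penalty term can be evaluated by hand on a small weight vector. The sketch below uses plain NumPy; the function name `elastic_net_penalty` and the example weights are illustrative assumptions, not part of any library. It follows the formula above literally (no 1/2 factor on the L2 term), which is also how `keras.regularizers.l1_l2` is understood to compute its penalty.

```python
import numpy as np

def elastic_net_penalty(weights, l1=0.01, l2=0.01):
    """Return l1 * sum(|w|) + l2 * sum(w^2) for an array of weights.

    Hypothetical helper that mirrors the formula above term by term.
    """
    w = np.asarray(weights, dtype=float)
    return l1 * np.sum(np.abs(w)) + l2 * np.sum(w ** 2)

# Illustrative weights: sum(|w|) = 3.5, sum(w^2) = 5.25
w = np.array([0.5, -1.0, 0.0, 2.0])

# 0.01 * 3.5 + 0.01 * 5.25 = 0.0875 (up to floating-point rounding)
penalty = elastic_net_penalty(w, l1=0.01, l2=0.01)
print(penalty)
```

This value is what gets added to `Loss_original` for a single weight tensor; with several regularized layers, the per-layer penalties are summed.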

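## Advantages

| Property | L1 | L2 | Elastic Net |
|----------|----|----|-------------|
| Sparsity | Strong | None | Moderate |
| Weight constraint | Weak | Strong | Strong |
| Correlated features | Picks one arbitrarily | Spreads weight evenly | Selects them as a group |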
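
The sparsity and weight-constraint rows can be illustrated with a one-dimensional toy problem: minimize `0.5 * (w - z)**2 + l1 * |w| + l2 * w**2` for a given value `z`. Its closed-form solution is soft-thresholding by `l1` followed by division by `1 + 2 * l2`. The sketch below is a simplified stand-in for what happens inside a network, not a description of how Keras trains; `soft_threshold`, `shrink`, and the sample values in `z` are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    """Move each entry toward zero by t; entries with |z| <= t become 0."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def shrink(z, l1=0.0, l2=0.0):
    """Elementwise minimizer of 0.5*(w - z)**2 + l1*|w| + l2*w**2."""
    z = np.asarray(z, dtype=float)
    return soft_threshold(z, l1) / (1.0 + 2.0 * l2)

# Illustrative unregularized solutions
z = np.array([-2.0, -0.3, 0.1, 0.8, 3.0])

w_l1 = shrink(z, l1=0.5)           # L1 only: small entries become exactly 0
w_l2 = shrink(z, l2=0.5)           # L2 only: everything is halved, nothing is 0
w_en = shrink(z, l1=0.5, l2=0.5)   # Elastic Net: thresholded, then scaled down

for name, w in [("L1", w_l1), ("L2", w_l2), ("Elastic Net", w_en)]:
    print(f"{name:12s} zeros={int(np.sum(w == 0))}  max|w|={np.max(np.abs(w)):.2f}")
```

With these inputs, the two entries whose magnitude is below `l1 = 0.5` become exactly zero under both L1 and Elastic Net, while L2 alone leaves every entry nonzero. In this toy setting, which weights are zeroed depends only on `l1`; the L2 term then scales down the survivors.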

In [None]:
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt

# Set random seeds for reproducibility
tf.random.set_seed(42)
np.random.seed(42)

print(f"TensorFlow version: {tf.__version__}")

## 1. Basic Usage

In [None]:
# L1 + L2 regularization (Elastic Net)
# l1_l2(l1=0.01, l2=0.01) applies both the L1 and the L2 penalty to the kernel
layer_elastic = keras.layers.Dense(
    units=64,
    activation='relu',
    kernel_regularizer=keras.regularizers.l1_l2(l1=0.01, l2=0.01),
    kernel_initializer='he_normal'
)

print("Elastic Net regularized layer created")
print("L1 coefficient: 0.01, L2 coefficient: 0.01")
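
The `l1` and `l2` arguments are independent, which can make tuning awkward: changing one alters both the overall strength and the L1/L2 balance. One possible convenience, sketched below, is to tune an overall strength and a mixing ratio instead and convert them to the two coefficients. The helper `elastic_net_coefficients` and its parameters `alpha` and `l1_ratio` are hypothetical names introduced here; this simple linear split is an assumption for illustration and not part of the Keras API.

```python
def elastic_net_coefficients(alpha, l1_ratio):
    """Split an overall strength `alpha` into (l1, l2) coefficients.

    Hypothetical helper: l1_ratio=1.0 gives pure L1, l1_ratio=0.0 gives pure L2.
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    if not 0.0 <= l1_ratio <= 1.0:
        raise ValueError("l1_ratio must be between 0 and 1")
    return alpha * l1_ratio, alpha * (1.0 - l1_ratio)

# An even split of alpha=0.02 gives l1 and l2 of about 0.01 each,
# matching the layer defined above.
l1, l2 = elastic_net_coefficients(alpha=0.02, l1_ratio=0.5)
print(l1, l2)
```

The resulting pair can be passed straight to the regularizer, for example `keras.regularizers.l1_l2(l1=l1, l2=l2)`.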

## 2. Full Model Example

In [None]:
# Build a model with Elastic Net regularization on the hidden layers
model_elastic = keras.models.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(
        256, 
        activation='relu',
        kernel_regularizer=keras.regularizers.l1_l2(l1=0.001, l2=0.001),
        kernel_initializer='he_normal'
    ),
    keras.layers.Dense(
        128, 
        activation='relu',
        kernel_regularizer=keras.regularizers.l1_l2(l1=0.001, l2=0.001),
        kernel_initializer='he_normal'
    ),
    keras.layers.Dense(10, activation='softmax')
])

model_elastic.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

model_elastic.summary()

## 3. Comparing the Three Regularizers

In [None]:
# Load Fashion MNIST and scale pixel values to [0, 1]
(X_train_full, y_train_full), (X_test, y_test) = keras.datasets.fashion_mnist.load_data()
X_valid, X_train = X_train_full[:5000] / 255.0, X_train_full[5000:] / 255.0
y_valid, y_train = y_train_full[:5000], y_train_full[5000:]
X_test = X_test / 255.0

print(f"Training set: {X_train.shape}")

In [None]:
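def create_model(regularizer_type='none', lambda_val=0.001):
    """
    Create a model with the given type of kernel regularization.

    Parameters
    ----------
    regularizer_type : str
        One of 'none', 'l1', 'l2', 'elastic'.
    lambda_val : float
        Regularization coefficient. For 'elastic', the same value is used
        for both the L1 and the L2 term.
    """
    if regularizer_type == 'l1':
        reg = keras.regularizers.l1(lambda_val)
    elif regularizer_type == 'l2':
        reg = keras.regularizers.l2(lambda_val)
    elif regularizer_type == 'elastic':
        reg = keras.regularizers.l1_l2(l1=lambda_val, l2=lambda_val)
    elif regularizer_type == 'none':
        reg = None
    else:
        # Fail loudly on typos instead of silently training without regularization
        raise ValueError(f"Unknown regularizer_type: {regularizer_type!r}")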
    
    model = keras.models.Sequential([
        keras.layers.Flatten(input_shape=(28, 28)),
        keras.layers.Dense(128, activation='relu', kernel_regularizer=reg),
        keras.layers.Dense(64, activation='relu', kernel_regularizer=reg),
        keras.layers.Dense(10, activation='softmax')
    ])
    
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model

# Train one model per regularization type
results = {}
for reg_type in ['none', 'l1', 'l2', 'elastic']:
    print(f"Training model with {reg_type} regularization...")
    model = create_model(reg_type)
    history = model.fit(X_train, y_train, epochs=15, batch_size=64,
                       validation_data=(X_valid, y_valid), verbose=0)
    results[reg_type] = history.history
    print(f"  Validation accuracy (final epoch): {history.history['val_accuracy'][-1]:.4f}")
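
Accuracy alone does not show whether a regularizer actually produced sparse weights. A rough check is to count how many kernel entries are close to zero. The sketch below is one way to do that with NumPy; `near_zero_fraction` and the tolerance `tol=1e-3` are assumptions chosen for illustration. A tolerance is used because gradient-based optimizers such as Adam tend to leave L1-penalized weights very small rather than exactly zero.

```python
import numpy as np

def near_zero_fraction(kernels, tol=1e-3):
    """Fraction of entries with |w| < tol across a list of weight arrays.

    Hypothetical helper; `tol` is an arbitrary illustrative threshold.
    """
    kernels = [np.asarray(k, dtype=float) for k in kernels]
    total = sum(k.size for k in kernels)
    if total == 0:
        raise ValueError("no weights given")
    small = sum(int(np.sum(np.abs(k) < tol)) for k in kernels)
    return small / total

# Stand-in arrays for demonstration only (not trained weights):
# 3 of the 6 entries are below the tolerance.
demo_kernels = [np.array([[0.0, 0.5], [1e-5, -2.0]]), np.array([0.0002, 0.3])]
print(near_zero_fraction(demo_kernels))
```

To apply it to a trained Keras model, one option is to collect `layer.get_weights()[0]` for each `keras.layers.Dense` layer and pass that list in. The training loop above stores only the histories, so the models would also need to be kept (for example in a second dictionary) to compare sparsity across regularizers.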

In [None]:
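# Plot the comparison.
# Note: Keras includes the regularization penalty in the reported loss, so the
# val_loss curves are not directly comparable across regularizers; val_accuracy is.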
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
colors = {'none': 'blue', 'l1': 'green', 'l2': 'red', 'elastic': 'purple'}

for reg_type, history in results.items():
    axes[0].plot(history['val_accuracy'], color=colors[reg_type], label=reg_type)
    axes[1].plot(history['val_loss'], color=colors[reg_type], label=reg_type)

axes[0].set_title('Validation accuracy')
axes[0].set_xlabel('Epoch')
axes[0].set_ylabel('Accuracy')
axes[0].legend()
axes[0].grid(True, alpha=0.3)

axes[1].set_title('Validation loss')
axes[1].set_xlabel('Epoch')
axes[1].set_ylabel('Loss')
axes[1].legend()
axes[1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()
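
The curves can be complemented with a short numeric summary, since the final epoch is not necessarily the best one. The sketch below finds the best value and its epoch for each run. It assumes a dictionary shaped like `results` above (name mapped to a Keras-style history dict); `best_epochs` is a hypothetical helper, and the numbers in `toy_results` are made up purely to show the output format, not measured results.

```python
def best_epochs(results, metric="val_accuracy"):
    """Return {name: (best_epoch, best_value)} with 1-based epoch numbers.

    Hypothetical helper; assumes a higher metric value is better.
    """
    summary = {}
    for name, history in results.items():
        values = history[metric]
        best_index = max(range(len(values)), key=values.__getitem__)
        summary[name] = (best_index + 1, values[best_index])
    return summary

# Made-up numbers for illustration only
toy_results = {
    "l2": {"val_accuracy": [0.80, 0.86, 0.85]},
    "elastic": {"val_accuracy": [0.78, 0.83, 0.84]},
}
toy_summary = best_epochs(toy_results)
for name, (epoch, value) in toy_summary.items():
    print(f"{name}: best val_accuracy {value:.4f} at epoch {epoch}")
```

Calling `best_epochs(results)` after the training loop would give the same summary for the actual runs.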

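## 4. Choosing a Regularizer

| Scenario | Recommended regularizer |
|----------|-------------------------|
| Feature selection is needed | L1 |
| General overfitting control, no sparsity needed | L2 |
| Highly correlated features | Elastic Net |
| Unsure | Try L2 first |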

In [None]:
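# Recap of the key points
print("Elastic Net regularization walkthrough complete")
print("\nKey takeaways:")
print("1. Elastic Net = L1 + L2, combining the strengths of both")
print("2. Well suited to highly correlated features")
print("3. Adjust the l1 and l2 coefficients to shift the balance between sparsity and shrinkage")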