# Input Features

5-day autoregression
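
As a minimal sketch of what a 5-day autoregressive setup could look like, the helper below slices a chronological list of per-day feature vectors into 5-day input windows, each paired with the following day as the target. The name `make_windows` and the choice of the next day's vector as the target are assumptions for illustration; the source does not show how `get_dataset` builds its samples.

```python
from typing import List, Sequence, Tuple

Row = Sequence[float]


def make_windows(rows: Sequence[Row], window: int = 5) -> List[Tuple[List[Row], Row]]:
    """Pair each run of `window` consecutive days with the day that follows it."""
    samples = []
    for start in range(len(rows) - window):
        inputs = list(rows[start:start + window])  # days start .. start+window-1
        target = rows[start + window]              # the day right after the window
        samples.append((inputs, target))
    return samples


# Seven toy days with one feature each yield two (window, target) samples.
days = [[float(i)] for i in range(7)]
samples = make_windows(days)
```

A series with `window` days or fewer produces no samples, since no following day is left to serve as the target.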

# Feature Modeling

Each trading day is encoded as five features:

[Open, High - Open, Low - Open, Close - Open, Volume]
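
The encoding above can be sketched as a small function. The name `ohlcv_features` is hypothetical, and the input values are taken from the AAPL record for 2015-01-05 shown later in this document; whether the original preprocessing applies any further scaling is not stated.

```python
from typing import List


def ohlcv_features(open_: float, high: float, low: float,
                   close: float, volume: float) -> List[float]:
    """Return [Open, High - Open, Low - Open, Close - Open, Volume]."""
    return [open_, high - open_, low - open_, close - open_, volume]


# AAPL, 2015-01-05 (values from the record in this document).
aapl = ohlcv_features(
    open_=27.07250022888184,
    high=27.162500381469727,
    low=26.352500915527344,
    close=26.5625,
    volume=257142000,
)
```

Expressing High, Low, and Close as offsets from Open keeps the intraday moves small and signed, while Open and Volume stay on their raw scales.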

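## NASDAQ Composite Index Data Fields
- Date: 2015-01-02, the trading date.
- Ticker: IXIC, the NASDAQ Composite Index.
- Open: 4760.240234375, the opening price.
- High: 4777.009765625, the day's highest price.
- Low: 4698.10986328125, the day's lowest price.
- Close: 4726.81005859375, the closing price.
- Adjusted: 4726.81005859375, the adjusted closing price (accounts for dividends, stock splits, and similar events).
- Returns: nan, the return (a missing value in this row).
- Volume: 1435150000, the trading volume.

## Apple Inc. Stock Data Fields
- Date: 2015-01-05, the trading date.
- Ticker: AAPL, Apple Inc.
- Open: 27.07250022888184, the opening price.
- High: 27.162500381469727, the day's highest price.
- Low: 26.352500915527344, the day's lowest price.
- Close: 26.5625, the closing price.
- Adjusted: 23.63528251647949, the adjusted closing price.
- Returns: -0.0281718672358495, the return (the fractional change relative to the previous trading day).
- Volume: 257142000, the trading volume.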
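
The `Returns` field is described as the change relative to the previous day, which would also explain a missing value in the first row of a series: there is no earlier day to compare against. A minimal sketch, assuming simple (not logarithmic) returns on a single price column; `simple_returns` is a hypothetical helper and the prices are made up for illustration.

```python
import math
from typing import List, Sequence


def simple_returns(prices: Sequence[float]) -> List[float]:
    """Return price[t] / price[t-1] - 1 for each day; the first day is NaN."""
    if not prices:
        return []
    returns = [math.nan]  # no previous day for the first observation
    for prev, curr in zip(prices, prices[1:]):
        returns.append(curr / prev - 1.0)
    return returns


# Illustrative prices only, not taken from the dataset.
prices = [100.0, 102.0, 99.96]
returns = simple_returns(prices)
```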

# Create the Model

In [2]:
from deeplotx import AutoRegression
model = AutoRegression(feature_dim=5, hidden_dim=128, recursive_layers=2)

# Select the Dataset

In [3]:
from data_preprocess import get_dataset
batch_size = 16
train_loader, valid_loader = get_dataset('AAPL', batch_size=batch_size)

Tickers:  ['DJI', 'IXIC', 'GSPC', 'AAPL', 'ABBV', 'AXP', 'BA', 'BOOT', 'CALM', 'CAT', 'CL', 'CSCO', 'CVX', 'DD', 'DENN', 'DIS', 'F', 'GE', 'GM', 'GS', 'HON', 'IBM', 'INTC', 'IP', 'JNJ', 'JPM', 'KO', 'LMT', 'MA', 'MCD', 'MG', 'MMM', 'MS', 'MSFT', 'NKE', 'PEP', 'PFE', 'PG', 'RTX', 'SO', 'T', 'TDW', 'V', 'VZ', 'WFC', 'WMT', 'XELB', 'XOM']


# Train the Model

In [4]:
import torch
from torch import nn, optim

num_epochs = 10
elastic_net_param = {'alpha': 1e-4, 'rho': 0.2}
learning_rate = 2e-6
# Early-stopping thresholds; at 0.0 they only trigger if the summed loss reaches zero.
train_loss_threshold = 0.
valid_loss_threshold = 0.
criterion = nn.MSELoss()
# Named `optimizer` so the instance does not shadow the imported `optim` module.
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

for epoch in range(num_epochs):
    model.train()
    total_loss = 0.0  # sum of per-batch losses, not a mean
    for batch_inputs, batch_labels in train_loader:
        # Skip incomplete batches: the initial state is built for a fixed batch size.
        if batch_inputs.shape[0] != batch_size:
            continue
        outputs = model.forward(batch_inputs, model.initial_state(batch_size=batch_size))[0]
        loss = criterion(outputs, batch_labels) + model.elastic_net(alpha=elastic_net_param['alpha'], rho=elastic_net_param['rho'])
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    # Validate every third epoch (epochs 1, 4, 7, 10).
    if epoch % 3 == 0:
        total_valid_loss = 0.0
        model.eval()
        with torch.no_grad():
            for batch_inputs, batch_labels in valid_loader:
                if batch_inputs.shape[0] != batch_size:
                    continue
                outputs = model.forward(batch_inputs, model.initial_state(batch_size=batch_size))[0]
                loss = criterion(outputs, batch_labels) + model.elastic_net(alpha=elastic_net_param['alpha'], rho=elastic_net_param['rho'])
                total_valid_loss += loss.item()
        model.train()
        print(f"Epoch {epoch + 1}/{num_epochs} | "
              f"Train Loss: {total_loss:.4f} | "
              f"Valid Loss: {total_valid_loss:.4f}")
        if total_valid_loss <= valid_loss_threshold:
            break
    print(f"Epoch {epoch + 1}/{num_epochs} | Train Loss: {total_loss:.4f}")
    if total_loss <= train_loss_threshold:
        break

Epoch 1/10 | Train Loss: 570609739428265984.0000 | Valid Loss: 5498673488199680.0000
Epoch 1/10 | Train Loss: 570609739428265984.0000
Epoch 2/10 | Train Loss: 570859180794052608.0000
Epoch 3/10 | Train Loss: 570735739675869184.0000
Epoch 4/10 | Train Loss: 570059619481681920.0000 | Valid Loss: 5247907393437696.0000
Epoch 4/10 | Train Loss: 570059619481681920.0000
Epoch 5/10 | Train Loss: 570450153610149888.0000
Epoch 6/10 | Train Loss: 570235738507968512.0000
Epoch 7/10 | Train Loss: 570840475943043072.0000 | Valid Loss: 5498975176097792.0000
Epoch 7/10 | Train Loss: 570840475943043072.0000
Epoch 8/10 | Train Loss: 570490088216068096.0000
Epoch 9/10 | Train Loss: 570809418766090240.0000
Epoch 10/10 | Train Loss: 569850391492034560.0000 | Valid Loss: 5500010363879424.0000
Epoch 10/10 | Train Loss: 569850391492034560.0000
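
The training loop adds `model.elastic_net(alpha, rho)` to the MSE loss as a regularization term. How DeepLotX computes it is not shown here. As a sketch of one common formulation (the one scikit-learn documents for `ElasticNet`, with `rho` playing the role of `l1_ratio`), the penalty mixes an L1 term and a squared-L2 term over the weights. The function name and the flat list of weights are illustrative assumptions.

```python
from typing import Iterable


def elastic_net_penalty(weights: Iterable[float], alpha: float, rho: float) -> float:
    """Compute alpha * (rho * ||w||_1 + (1 - rho) / 2 * ||w||_2^2)."""
    ws = list(weights)
    l1 = sum(abs(w) for w in ws)          # L1 norm
    l2_squared = sum(w * w for w in ws)   # squared L2 norm
    return alpha * (rho * l1 + 0.5 * (1.0 - rho) * l2_squared)


# Same alpha and rho as `elastic_net_param` in the training cell; toy weights.
penalty = elastic_net_penalty([1.0, -2.0, 0.5], alpha=1e-4, rho=0.2)
```

Under this formulation, `rho=0.2` puts most of the mixing weight on the squared-L2 term, and `alpha=1e-4` keeps the whole penalty small relative to the data loss.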


# Test the Model

In [6]:
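# `get_dataset` returns two loaders; the first one is used here to evaluate on a different ticker (GSPC).
test_loader_large, test_loader_small = get_dataset('GSPC', batch_size=batch_size)
total_eval_loss = 0.0
model.eval()
with torch.no_grad():
    for batch_inputs, batch_labels in test_loader_large:
        # Skip incomplete batches: the initial state is built for a fixed batch size.
        if batch_inputs.shape[0] != batch_size:
            continue
        outputs = model.forward(batch_inputs, model.initial_state(batch_size=batch_size))[0]
        # Plain MSE here, without the elastic-net penalty.
        loss = criterion(outputs, batch_labels)
        total_eval_loss += loss.item()
model.train()
print(f"Eval Loss: {total_eval_loss:.4f}")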

Eval Loss: 516106744310857203712.0000
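
The summed losses reported above are on the order of 1e17 to 1e20. One plausible contributor, assuming the labels contain raw, unscaled `Volume` as the feature list suggests (the source does not state what the labels are), is that squared errors on a quantity near 1e8 to 1e9 dominate the MSE. The sketch below computes the MSE of an all-zero prediction against the AAPL 2015-01-05 feature vector to show the scale involved; it illustrates magnitude only and does not reproduce the logged values.

```python
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error over the elements of one feature vector."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


# [Open, High - Open, Low - Open, Close - Open, Volume] for AAPL on 2015-01-05 (rounded).
target = [27.0725, 0.09, -0.72, -0.51, 257142000.0]
zero_pred = [0.0] * len(target)

total = mse(zero_pred, target)
# Fraction of the MSE that comes from the Volume element alone.
volume_share = (target[-1] ** 2 / len(target)) / total
print(f"MSE: {total:.3e}, share from Volume: {volume_share:.12f}")
```

If this assumption holds, the price-related features contribute almost nothing to the loss, so the reported values mostly reflect error on `Volume`.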
