# Deep Learning Project (PyTorch)
## Application of Mamba Model on Time Series Forecasting
### Author: ZHANG Xiaopeng

### Project Syllabus
1. **Learn and understand the RNN, S3, S4, Mamba, and Mamba2 models**
   - **Goal**: Master the basic concepts and working principles of the models.
   - **Tasks**:
     - Read relevant literature, tutorials, and technical blogs.
     - Watch online courses or lectures.
     - Write study notes, documenting key concepts and questions.
2. **Implement the models in PyTorch in a notebook**
   - **Goal**: Proficiently use PyTorch to implement and train models.
   - **Tasks**:
     - Install and configure the PyTorch environment.
     - Write code in a Jupyter Notebook to implement the basic models.
     - Test and debug the models to ensure they run correctly.
3. **Test models on time series datasets**
   - **Goal**: Validate the performance of the models on real-world datasets and evaluate their predictive capabilities.
   - **Tasks**:
     - Select and prepare suitable time series datasets.
     - Preprocess and split the data (training set, validation set, test set).
     - Train the models, evaluate them, and record the performance metrics.
4. **Optimize models and implement long-range prediction using Patch and other techniques**
   - **Goal**: Improve the predictive capabilities of the models, especially for long-range predictions.
   - **Tasks**:
     - Learn and understand Patch techniques and other optimization methods.
     - Apply these techniques to optimize and improve the models.
     - Conduct comparative experiments on the same dataset, documenting performance changes before and after optimization.
5. **(Optional) Integrate the models into U-Net and analyze the comparative results**
   - **Goal**: Integrate the optimized models into the U-Net architecture, conduct comprehensive experiments, and analyze the results.
   - **Tasks**:
     - Learn the architecture and implementation methods of the U-Net model.
     - Embed the optimized models into the U-Net architecture.
     - Conduct experiments on the same dataset, comparing the performance of the U-Net model under different conditions.
     - Analyze the experimental results and write a summary report.
### Final Goal
Through the above steps, you will be able to:
- Gain an in-depth understanding of the RNN, S3, S4, Mamba, and Mamba2 models and their principles.
- Proficiently implement and train time series prediction models using PyTorch.
- Enhance the predictive capabilities of the models, especially for long-range predictions.
- Integrate the optimized models into the U-Net architecture and conduct comprehensive experimental analysis.
Wishing you success in completing this summer project!

## 1. Time Series Data Preprocessing

In [1]:
import torch
from torch.utils.data import Dataset, DataLoader
import numpy as np

# Define the custom dataset
class TimeSeriesDataset(Dataset):
    def __init__(self, data, seq_length):
        self.data = data
        self.seq_length = seq_length

    # __len__ method to return the number of sequences
    def __len__(self):
        return len(self.data) - self.seq_length

    # __getitem__ method to get a specific sequence and its target value
    def __getitem__(self, idx):
        return (
            self.data[idx:idx + self.seq_length],   # Input sequence
            self.data[idx + self.seq_length]        # Target value
        )


In [2]:
# Generate example data (sine wave)
data = np.sin(np.linspace(0, 100, 1000))
seq_length = 10

# Instantiate the dataset
dataset = TimeSeriesDataset(torch.tensor(data, dtype=torch.float32), seq_length)

In [3]:
# Dataset is subscriptable
dataset[0]

(tensor([0.0000, 0.0999, 0.1989, 0.2958, 0.3898, 0.4799, 0.5651, 0.6448, 0.7179,
         0.7839]),
 tensor(0.8420))

In [4]:
# Create a DataLoader
dataloader = DataLoader(dataset, batch_size=16, shuffle=True)

# Iterate through the DataLoader
for batch_idx, (inputs, targets) in enumerate(dataloader):
    print(f"Batch {batch_idx + 1}")
    print("Inputs:", inputs)
    print("Targets:", targets)
    # break  # Break after one batch for demonstration

Batch 1
Inputs: tensor([[ 0.9882,  0.9986,  0.9989,  0.9893,  0.9698,  0.9405,  0.9019,  0.8542,
          0.7979,  0.7337],
        [-0.8421, -0.8918, -0.9325, -0.9639, -0.9857, -0.9976, -0.9995, -0.9914,
         -0.9734, -0.9456],
        [ 0.1232,  0.2217,  0.3181,  0.4112,  0.5003,  0.5843,  0.6625,  0.7340,
          0.7982,  0.8544],
        [ 0.3900,  0.4801,  0.5654,  0.6450,  0.7181,  0.7841,  0.8422,  0.8918,
          0.9326,  0.9640],
        [ 0.3178,  0.4110,  0.5000,  0.5841,  0.6623,  0.7338,  0.7980,  0.8543,
          0.9019,  0.9406],
        [-0.6088, -0.6851, -0.7544, -0.8163, -0.8699, -0.9148, -0.9506, -0.9769,
         -0.9933, -0.9999],
        [ 0.8620,  0.8070,  0.7440,  0.6735,  0.5962,  0.5130,  0.4247,  0.3321,
          0.2362,  0.1379],
        [ 0.5839,  0.4999,  0.4108,  0.3177,  0.2213,  0.1228,  0.0230, -0.0770,
         -0.1763, -0.2738],
        [ 0.9881,  0.9678,  0.9378,  0.8984,  0.8501,  0.7932,  0.7283,  0.6562,
          0.5775,  0.4931],
        ... (remaining output truncated)

In [5]:
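# An example of how to access the first batch of data
for batch_idx, (inputs, targets) in enumerate(dataloader):
    if batch_idx == 0:
        print(f"Batch {batch_idx}")
        print("Inputs:", inputs[0])    # First sequence in the batch
        print("Targets:", targets[0])  # Target value for that sequence
        break  # Stop here; the remaining batches are not needed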

Batch 0
Inputs: tensor([-0.6328, -0.5523, -0.4662, -0.3755, -0.2810, -0.1836, -0.0845,  0.0155,
         0.1153,  0.2140])
Targets: tensor(0.3106)
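The syllabus calls for splitting the data into training, validation, and test sets. A minimal sketch of a chronological split is shown below, reusing the `TimeSeriesDataset` class from above. The 70/15/15 ratios and the names `train_set`, `val_set`, and `test_set` are assumptions made for illustration, not part of the original notebook.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader, Subset


class TimeSeriesDataset(Dataset):
    """Same sliding-window dataset as defined above."""

    def __init__(self, data, seq_length):
        self.data = data
        self.seq_length = seq_length

    def __len__(self):
        return len(self.data) - self.seq_length

    def __getitem__(self, idx):
        return (
            self.data[idx:idx + self.seq_length],   # Input sequence
            self.data[idx + self.seq_length],       # Target value
        )


data = torch.tensor(np.sin(np.linspace(0, 100, 1000)), dtype=torch.float32)
dataset = TimeSeriesDataset(data, seq_length=10)

# Assumed split ratios (70/15/15); integer arithmetic avoids rounding surprises.
n = len(dataset)
n_train = n * 70 // 100
n_val = n * 15 // 100

# Split by index order so that validation and test windows come later in time.
train_set = Subset(dataset, range(0, n_train))
val_set = Subset(dataset, range(n_train, n_train + n_val))
test_set = Subset(dataset, range(n_train + n_val, n))

# Shuffle only the training windows; keep evaluation order fixed.
train_loader = DataLoader(train_set, batch_size=16, shuffle=True)
val_loader = DataLoader(val_set, batch_size=16, shuffle=False)
test_loader = DataLoader(test_set, batch_size=16, shuffle=False)

inputs, targets = next(iter(train_loader))
print(len(train_set), len(val_set), len(test_set))
print(inputs.shape, targets.shape)
```

Splitting by position rather than at random keeps future values out of the training set. Windows on either side of a boundary still share up to `seq_length` data points, so a stricter setup would leave a gap of that size between the splits.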


## 2. Fully Connected Neural Network

In [2]:
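import torch.nn as nn  # Required for nn.Module, nn.Linear, nn.ReLU, nn.Sigmoid, and nn.BCELoss

# Create a random input dataset: 1000 samples with 10 features each
X = np.random.random((1000, 10))

# Create random binary labels (0 or 1), independent of the inputs
y = np.random.randint(2, size=(1000, 1))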

# Define the fully connected neural network model
class FullyConnectedNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, output_size)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        out = self.sigmoid(out)
        return out

# Create an instance of the fully connected neural network model
input_size = 10
hidden_size = 32
output_size = 1
model = FullyConnectedNN(input_size, hidden_size, output_size)

# Convert the input data to a PyTorch tensor
X_tensor = torch.tensor(X, dtype=torch.float32)

# Define the loss function and optimizer
criterion = nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Train the model
num_epochs = 10
batch_size = 32
total_samples = X.shape[0]
total_batches = total_samples // batch_size

for epoch in range(num_epochs):
    for batch in range(total_batches):
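        # Sample a batch of random indices (no repeats within the batch).
        # Batches are drawn independently, so an epoch may skip or repeat samples.
        indices = np.random.choice(total_samples, size=batch_size, replace=False)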
        
        # Get the batch input and target data
        batch_X = X_tensor[indices]
        batch_y = torch.tensor(y[indices], dtype=torch.float32)
        
        # Forward pass
        outputs = model(batch_X)
        loss = criterion(outputs, batch_y)
        
        # Backward pass and optimization
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
    # Print the loss of the last batch in each epoch
    print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item()}")

# Generate a test dataset (created here but not evaluated in this cell)
X_test = np.random.random((100, 10))
y_test = np.random.randint(2, size=(100, 1))
X_test_tensor = torch.tensor(X_test, dtype=torch.float32)
y_test_tensor = torch.tensor(y_test, dtype=torch.float32)

Epoch [1/10], Loss: 0.7065831422805786
Epoch [2/10], Loss: 0.7102320194244385
Epoch [3/10], Loss: 0.6973278522491455
Epoch [4/10], Loss: 0.7019574046134949
Epoch [5/10], Loss: 0.678600013256073
Epoch [6/10], Loss: 0.7113694548606873
Epoch [7/10], Loss: 0.6928130388259888
Epoch [8/10], Loss: 0.6777318120002747
Epoch [9/10], Loss: 0.6613604426383972
Epoch [10/10], Loss: 0.6494036912918091
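The cell above creates a test set but never evaluates the model on it. The following sketch shows one way the evaluation step could look. It is a self-contained approximation rather than the notebook's own code: the `nn.Sequential` model mirrors `FullyConnectedNN`, the fixed seeds are added for repeatability, and the 0.5 decision threshold is an assumption.

```python
import math

import numpy as np
import torch
import torch.nn as nn

torch.manual_seed(0)
rng = np.random.default_rng(0)

# Same architecture as FullyConnectedNN above, written with nn.Sequential.
model = nn.Sequential(
    nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid()
)
criterion = nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Random features and random 0/1 labels, as in the cell above.
X_train = torch.tensor(rng.random((1000, 10)), dtype=torch.float32)
y_train = torch.tensor(rng.integers(2, size=(1000, 1)), dtype=torch.float32)
X_test = torch.tensor(rng.random((100, 10)), dtype=torch.float32)
y_test = torch.tensor(rng.integers(2, size=(100, 1)), dtype=torch.float32)

# Short training run over shuffled mini-batches.
for epoch in range(10):
    perm = torch.randperm(X_train.shape[0])
    for start in range(0, len(perm), 32):
        idx = perm[start:start + 32]
        loss = criterion(model(X_train[idx]), y_train[idx])
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

# Evaluation: switch to eval mode and disable gradient tracking.
model.eval()
with torch.no_grad():
    probs = model(X_test)
    test_loss = criterion(probs, y_test).item()
    # Assumed decision threshold of 0.5 for turning probabilities into labels.
    accuracy = ((probs >= 0.5).float() == y_test).float().mean().item()

print(f"Test loss: {test_loss:.4f} (chance level ln 2 = {math.log(2):.4f})")
print(f"Test accuracy: {accuracy:.2%}")
```

Because the labels are random and unrelated to the inputs, test accuracy should be expected to hover around 50%, whatever the training loss does.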


Batched (mini-batch) training is a common way to train machine learning models efficiently. Instead of updating the model's parameters after each individual sample, or only once after a full pass over the dataset, it updates them after processing a small batch of samples.

There are several reasons why batched training is beneficial:

1. **Efficiency**: Processing samples in batches allows for more efficient computation. Modern hardware, such as GPUs, can perform parallel computations on batches of data, which can significantly speed up the training process. By processing multiple samples simultaneously, the model can take advantage of the hardware's parallel processing capabilities.

2. **Stability**: Batched training provides more stable updates to the model's parameters. Averaging the gradients over the samples in a batch makes each update less noisy than a single-sample update and more representative of the overall trend in the data, which usually leads to smoother convergence. Larger batches are not automatically better, though: some gradient noise is often reported to act as a mild regularizer, so the batch size is normally treated as a hyperparameter to tune.

3. **Memory efficiency**: Compared with computing the gradient over the entire dataset at once, a mini-batch only requires the activations and intermediate results of the current batch to be held in memory. This is particularly important when the dataset does not fit in memory (or on the GPU) and has to be processed batch by batch.

4. **Generalization**: Batched training allows the model to learn from a more diverse set of samples within each batch. By randomly shuffling the data and selecting different batches for each iteration, the model gets exposed to different patterns and variations in the data. This can improve the model's ability to generalize and make accurate predictions on unseen data.
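The stability point can be checked numerically. The sketch below uses a hypothetical toy problem (a one-parameter linear model, not anything from the notebook) and measures how much the mini-batch gradient estimate fluctuates for different batch sizes. The expectation is that the spread shrinks as the batch grows.

```python
import torch

torch.manual_seed(0)

# Hypothetical toy problem: y = 3x + noise, model y_hat = w * x with w = 0.
x = torch.randn(10_000)
y = 3.0 * x + 0.5 * torch.randn(10_000)
w = torch.tensor(0.0)


def batch_gradient(indices):
    """Gradient of the mean squared error w.r.t. w on the selected samples."""
    xb, yb = x[indices], y[indices]
    return (2.0 * (w * xb - yb) * xb).mean()


grad_std = {}
for batch_size in (1, 16, 256):
    # Estimate the gradient from 500 independently sampled batches.
    grads = torch.stack([
        batch_gradient(torch.randint(0, len(x), (batch_size,)))
        for _ in range(500)
    ])
    grad_std[batch_size] = grads.std().item()
    print(f"batch_size={batch_size:>3}  gradient std={grad_std[batch_size]:.3f}")
```

For independent samples the standard deviation of the averaged gradient falls roughly with the square root of the batch size, which is why doubling the batch gives diminishing returns in stability.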

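In the code above, each training step draws a random batch of 32 indices, extracts the corresponding inputs and targets, runs a forward pass, computes the loss, and updates the parameters from that batch's gradients. Two details are worth noting. First, the indices are drawn independently for every batch, so an "epoch" here means 31 random batches rather than one pass in which every sample is seen exactly once. Second, the labels are random and unrelated to the inputs, so the example demonstrates the mechanics of the training loop rather than a learnable task; the loss stays close to ln 2 ≈ 0.693, the binary cross-entropy of always predicting 0.5.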
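An alternative to sampling indices by hand is to let `DataLoader` do the batching, as in Section 1. The sketch below is an assumed rewrite, not the notebook's original loop. It wraps the tensors in a `TensorDataset` so that each epoch visits every sample exactly once, including a smaller final batch.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Random inputs and random binary labels, matching the shapes used above.
X_tensor = torch.rand(1000, 10)
y_tensor = torch.randint(0, 2, (1000, 1)).float()

# shuffle=True reshuffles the data at the start of every epoch.
loader = DataLoader(TensorDataset(X_tensor, y_tensor), batch_size=32, shuffle=True)

model = nn.Sequential(
    nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid()
)
criterion = nn.BCELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

for epoch in range(2):
    seen = 0
    for batch_X, batch_y in loader:
        loss = criterion(model(batch_X), batch_y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        seen += batch_X.shape[0]
    print(f"Epoch {epoch + 1}: {len(loader)} batches, {seen} samples, "
          f"last batch loss {loss.item():.4f}")
```

With 1000 samples and a batch size of 32, the loader yields 32 batches per epoch (31 full batches plus one of 8 samples), whereas the manual loop above runs `1000 // 32 = 31` batches.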

## 3. Recurrent Neural Network

In [5]:
# Define the RNN model
class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        out, _ = self.rnn(x)
        out = self.fc(out[:, -1, :])
        return out

# Define the hyperparameters
input_size = 1
hidden_size = 32
output_size = 1
num_epochs = 10
learning_rate = 0.001

# Create the RNN model
model = RNN(input_size, hidden_size, output_size)

# Define the loss function and optimizer
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

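# Convert the input and target data to tensors
# x has shape (batch=5, seq_len=1, input_size=1): five independent one-step sequences
x = torch.tensor([[0.1], [0.2], [0.3], [0.4], [0.5]]).view(-1, 1, 1)  # Example input data
# y has shape (batch=5, output_size=1)
y = torch.tensor([[0.2], [0.3], [0.4], [0.5], [0.6]]).view(-1, 1)  # Example target data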

# Training loop
for epoch in range(num_epochs):
    # Forward pass
    outputs = model(x)  # Calls model.forward(x) through nn.Module.__call__
    loss = criterion(outputs, y)

    # Backward and optimize
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Print the loss for each epoch
    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')

# Test the trained model
test_input = torch.tensor([[0.6], [0.7], [0.8], [0.9], [1.0]]).view(-1, 1, 1)  # Example test input data
predicted_output = model(test_input)  # Calls model.forward(test_input)
print(f'Predicted Output: {predicted_output.detach().numpy()}')

Epoch [1/10], Loss: 0.1077
Epoch [2/10], Loss: 0.1013
Epoch [3/10], Loss: 0.0951
Epoch [4/10], Loss: 0.0892
Epoch [5/10], Loss: 0.0835
Epoch [6/10], Loss: 0.0780
Epoch [7/10], Loss: 0.0729
Epoch [8/10], Loss: 0.0679
Epoch [9/10], Loss: 0.0633
Epoch [10/10], Loss: 0.0589
Predicted Output: [[0.21518499]
 [0.21590489]
 [0.21660629]
 [0.21728857]
 [0.21795109]]
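The example above feeds the RNN sequences of length 1, so the recurrence is never exercised. A more representative setup, sketched below, trains the same `RNN` class on the sliding windows produced by `TimeSeriesDataset` from Section 1. The epoch count, batch size, and fixed seed are assumptions chosen to keep the run short; this is an illustrative sketch, not a tuned forecasting model.

```python
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

torch.manual_seed(0)  # Fixed seed so the run is repeatable


class TimeSeriesDataset(Dataset):
    """Sliding-window dataset from Section 1."""

    def __init__(self, data, seq_length):
        self.data = data
        self.seq_length = seq_length

    def __len__(self):
        return len(self.data) - self.seq_length

    def __getitem__(self, idx):
        return (
            self.data[idx:idx + self.seq_length],
            self.data[idx + self.seq_length],
        )


class RNN(nn.Module):
    """Same model as above: one nn.RNN layer followed by a linear head."""

    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        out, _ = self.rnn(x)
        return self.fc(out[:, -1, :])  # Use the last time step only


data = torch.tensor(np.sin(np.linspace(0, 100, 1000)), dtype=torch.float32)
loader = DataLoader(TimeSeriesDataset(data, seq_length=10), batch_size=16, shuffle=True)

model = RNN(input_size=1, hidden_size=32, output_size=1)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

epoch_losses = []
for epoch in range(5):  # Assumed small epoch count, enough for a quick check
    running = 0.0
    for inputs, targets in loader:
        inputs = inputs.unsqueeze(-1)    # (batch, 10) -> (batch, 10, 1)
        targets = targets.unsqueeze(-1)  # (batch,)    -> (batch, 1)
        loss = criterion(model(inputs), targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        running += loss.item() * inputs.shape[0]
    epoch_losses.append(running / len(loader.dataset))
    print(f"Epoch [{epoch + 1}/5], Mean loss: {epoch_losses[-1]:.4f}")
```

The `unsqueeze(-1)` calls add the feature dimension that `nn.RNN` with `batch_first=True` expects, turning each window of 10 scalars into a sequence of 10 one-dimensional inputs.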
