Switching from **Ridge Regression** to a **Time-Series LSTM** requires some changes to how the data is processed and how the model is built. Below is a guide and an updated version of the class that uses **LSTM**:

---

### **Conversion steps:**

1. **Preprocess the data for LSTM:**
   - LSTM expects **3D** input: `(samples, timesteps, features)`.
   - Convert `X_train` and `X_test` into **sequences** (sliding time windows).

2. **Build the LSTM model:**
   - Use Keras (available as `tensorflow.keras`) to build an LSTM network.
   - The input shape must match the number of timesteps and features.

3. **Train the LSTM model:**
   - Split the data into `train` and `test` chronologically, without shuffling (`shuffle=False`).
   - Use a callback (for example `EarlyStopping`) to limit overfitting.

4. **Evaluate and visualize the results:**
   - Compute metrics such as `R2`, `MSE`, and `MAPE`.
   - Draw the same plots as before.
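
To make step 1 concrete, here is a minimal sketch of how a 1-D price series can be turned into the 3D `(samples, timesteps, features)` shape. The helper name `make_windows` and the toy prices are hypothetical and only serve the illustration; the class below builds its windows with an explicit loop instead.

```python
import numpy as np

def make_windows(series, look_back):
    """Turn a 1-D series into 3-D LSTM inputs and next-step targets.

    `make_windows` is a hypothetical helper, not part of the class below.
    """
    series = np.asarray(series, dtype=float)
    # Each row of `windows` holds `look_back` consecutive values
    windows = np.lib.stride_tricks.sliding_window_view(series, look_back)
    X = windows[:-1]          # drop the last window: no value follows it
    y = series[look_back:]    # the value right after each window
    return X[..., np.newaxis], y  # add the feature axis -> (samples, timesteps, 1)

prices = [10, 11, 12, 13, 14, 15]  # toy data
X, y = make_windows(prices, look_back=3)
print(X.shape, y.shape)  # (3, 3, 1) (3,)
print(X[0, :, 0], y[0])  # [10. 11. 12.] 13.0
```

Each sample is one window of `look_back` past values, and its target is the value that immediately follows the window.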

---

### **Updated class using LSTM:**

```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error, mean_absolute_percentage_error
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.callbacks import EarlyStopping
from sklearn.preprocessing import MinMaxScaler
from matplotlib.ticker import FuncFormatter

class TimeSeriesLSTMModel:
    def __init__(self, file_path, look_back=10):
        """
        Initialize the class with the file path of the dataset and look-back window.
        """
        self.file_path = file_path
        self.look_back = look_back
        self.data = None
        self.model = None
        self.scaler = MinMaxScaler(feature_range=(0, 1))

    def load_and_preprocess_data(self):
        """
        Load the dataset and preprocess the data for LSTM.
        """
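        # Load data
        self.data = pd.read_csv(self.file_path)
        # Next-day close; the last row has no "tomorrow" value, so drop it
        self.data["close_tomor"] = self.data["close"].shift(-1)
        self.data = self.data.iloc[:-1]

        # Scale data. Fit the scaler on the training rows only (same 75% split
        # as below) so that test-set statistics do not leak into training.
        values = self.data[['close_tomor']].values
        train_rows = self.look_back + int((len(values) - self.look_back) * 0.75)
        self.scaler.fit(values[:train_rows])
        self.data_scaled = self.scaler.transform(values)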
        
        # Create sequences
        X, y = [], []
        for i in range(self.look_back, len(self.data_scaled)):
            X.append(self.data_scaled[i - self.look_back:i, 0])  # Sequence of look_back days
            y.append(self.data_scaled[i, 0])  # Target value

        X, y = np.array(X), np.array(y)
        X = X.reshape((X.shape[0], X.shape[1], 1))  # Reshape to (samples, timesteps, features)
        
        # Split into training and testing sets
        train_size = int(len(X) * 0.75)
        X_train, X_test = X[:train_size], X[train_size:]
        y_train, y_test = y[:train_size], y[train_size:]

        return X_train, X_test, y_train, y_test

    def build_model(self):
        """
        Build the LSTM model.
        """
        self.model = Sequential([
            LSTM(50, activation='relu', return_sequences=True, input_shape=(self.look_back, 1)),
            LSTM(50, activation='relu'),
            Dense(1)
        ])
        self.model.compile(optimizer='adam', loss='mse')

    def train_model(self, X_train, y_train, epochs=50, batch_size=32):
        """
        Train the LSTM model.
        """
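        # Monitor a held-out validation loss; monitoring the training loss
        # would not detect overfitting.
        early_stop = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)
        self.model.fit(
            X_train, y_train,
            epochs=epochs, batch_size=batch_size,
            validation_split=0.1,  # Keras takes the last 10% of the training windows
            callbacks=[early_stop],
        )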

    def evaluate_model(self, y_true, y_pred):
        """
        Evaluate the model performance.
        """
        y_true = self.scaler.inverse_transform(y_true.reshape(-1, 1))
        y_pred = self.scaler.inverse_transform(y_pred.reshape(-1, 1))

        print("R2:", r2_score(y_true, y_pred))
        print("MSE:", mean_squared_error(y_true, y_pred))
        print("RMSE:", np.sqrt(mean_squared_error(y_true, y_pred)))
        print("MAE:", mean_absolute_error(y_true, y_pred))
        print("MAPE%:", f"{mean_absolute_percentage_error(y_true, y_pred) * 100:.2f}%")

    def plot_predictions(self, y_test, y_pred):
        """
        Plot predictions vs actual values.
        """
        y_test = self.scaler.inverse_transform(y_test.reshape(-1, 1))
        y_pred = self.scaler.inverse_transform(y_pred.reshape(-1, 1))

        plt.figure(figsize=(10, 5))
        plt.plot(y_test, label="Actual")
        plt.plot(y_pred, label="Predicted", linestyle="--")
        plt.title("Test Predictions")
        plt.xlabel("Time")
        plt.ylabel("Value")
        plt.legend()
        plt.show()

    def process(self):
        """
        Execute the full pipeline: load, preprocess, train, evaluate, and plot.
        """
        X_train, X_test, y_train, y_test = self.load_and_preprocess_data()
        self.build_model()
        self.train_model(X_train, y_train)
        y_pred_train = self.model.predict(X_train)
        y_pred_test = self.model.predict(X_test)
        self.evaluate_model(y_test, y_pred_test)
        self.plot_predictions(y_test, y_pred_test)

# Example usage for one file
file_path = "D:/OneDrive - Hanoi University of Science and Technology/GIT/MiniProj_StockPrediction_ML_SpManhGraduationProj_2024/data/raw20192024/FPT_stock_data.csv"
model = TimeSeriesLSTMModel(file_path)
model.process()
```

---

### **Main methods in the class:**
1. **`load_and_preprocess_data`:**
   - Creates the time windows (`look_back`) for the LSTM.
   - Normalizes the data with `MinMaxScaler`.

2. **`build_model`:**
   - Builds a network with 2 LSTM layers and 1 Dense layer.

3. **`train_model`:**
   - Trains the model with `EarlyStopping` to limit overfitting.

4. **`evaluate_model`:**
   - Computes the evaluation metrics and prints them.

5. **`plot_predictions`:**
   - Plots actual values against predicted values.
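
To show what `evaluate_model` reports, the sketch below recomputes the same metrics by hand with NumPy. The function name `regression_metrics` and the three toy values are hypothetical; the class itself relies on `sklearn.metrics`, whose MAPE additionally guards against division by zero.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Hand-written versions of the metrics printed by `evaluate_model` (illustrative only)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    ss_res = np.sum(err ** 2)                       # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAE": np.mean(np.abs(err)),
        "MAPE%": np.mean(np.abs(err / y_true)) * 100,  # assumes no zero in y_true
    }

# Toy prices, not real data
metrics = regression_metrics([100, 110, 120], [102, 108, 123])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because the errors are computed on rescaled prices, MSE, RMSE and MAE are in price units, while R2 and MAPE are unit-free.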

---

### **Notes:**
- Make sure the CSV file has a `"close"` column, which is used to build the time sequences.
- Check that `look_back` suits your data (for example, the 10 most recent days).

You can run this LSTM model and inspect the results directly! 😊

### **Update: adding `R2` and `DA` (Directional Accuracy)**
To add **R2** and **DA (Directional Accuracy)**, we only need to extend the model's evaluation and plotting functions.

### **Improving the class:**

#### **1. Computing `R2` and `DA`:**
- **R2:** Available in `sklearn.metrics` as `r2_score`.
- **DA:** Computed by hand, by comparing the direction of change between consecutive actual values (`y_true`) with the direction of change between consecutive predicted values (`y_pred`).
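
As a small worked example of DA, the sketch below applies that definition to two toy series. The function name `directional_accuracy` and the numbers are hypothetical, chosen so the result is easy to check by hand.

```python
import numpy as np

def directional_accuracy(y_true, y_pred):
    """Share of steps where actual and predicted values move in the same direction."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    true_dir = np.sign(np.diff(y_true))  # +1 up, -1 down, 0 flat
    pred_dir = np.sign(np.diff(y_pred))
    return float(np.mean(true_dir == pred_dir))

actual    = [100, 102, 101, 105, 104]  # moves: up, down, up, down
predicted = [100, 101, 103, 106, 105]  # moves: up, up,   up, down
da = directional_accuracy(actual, predicted)
print(f"DA: {da * 100:.2f}%")  # DA: 75.00%
```

Three of the four moves agree, so DA is 75%. A series of `n` values only has `n - 1` moves, which is why the comparison runs over differences.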

#### **2. Combining the plots:**
- Instead of two separate figures (scatter and line), show both in one window as two subplots.

#### **3. Updated code:**
```python
class TimeSeriesLSTMModel:
    def __init__(self, file_path, look_back=10):
        """
        Initialize the class with the file path of the dataset and look-back window.
        """
        self.file_path = file_path
        self.look_back = look_back
        self.data = None
        self.model = None
        self.scaler = MinMaxScaler(feature_range=(0, 1))

    def load_and_preprocess_data(self):
        """
        Load the dataset and preprocess the data for LSTM.
        """
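        # Load data
        self.data = pd.read_csv(self.file_path)
        # Next-day close; the last row has no "tomorrow" value, so drop it
        self.data["close_tomor"] = self.data["close"].shift(-1)
        self.data = self.data.iloc[:-1]

        # Scale data. Fit the scaler on the training rows only (same 75% split
        # as below) so that test-set statistics do not leak into training.
        values = self.data[['close_tomor']].values
        train_rows = self.look_back + int((len(values) - self.look_back) * 0.75)
        self.scaler.fit(values[:train_rows])
        self.data_scaled = self.scaler.transform(values)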
        
        # Create sequences
        X, y = [], []
        for i in range(self.look_back, len(self.data_scaled)):
            X.append(self.data_scaled[i - self.look_back:i, 0])  # Sequence of look_back days
            y.append(self.data_scaled[i, 0])  # Target value

        X, y = np.array(X), np.array(y)
        X = X.reshape((X.shape[0], X.shape[1], 1))  # Reshape to (samples, timesteps, features)
        
        # Split into training and testing sets
        train_size = int(len(X) * 0.75)
        X_train, X_test = X[:train_size], X[train_size:]
        y_train, y_test = y[:train_size], y[train_size:]

        return X_train, X_test, y_train, y_test

    def build_model(self):
        """
        Build the LSTM model.
        """
        self.model = Sequential([
            LSTM(50, activation='relu', return_sequences=True, input_shape=(self.look_back, 1)),
            LSTM(50, activation='relu'),
            Dense(1)
        ])
        self.model.compile(optimizer='adam', loss='mse')

    def train_model(self, X_train, y_train, epochs=50, batch_size=32):
        """
        Train the LSTM model.
        """
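        # Monitor a held-out validation loss; monitoring the training loss
        # would not detect overfitting.
        early_stop = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)
        self.model.fit(
            X_train, y_train,
            epochs=epochs, batch_size=batch_size,
            validation_split=0.1,  # Keras takes the last 10% of the training windows
            callbacks=[early_stop],
        )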

    def evaluate_model(self, y_true, y_pred):
        """
        Evaluate the model performance.
        """
        y_true_rescaled = self.scaler.inverse_transform(y_true.reshape(-1, 1))
        y_pred_rescaled = self.scaler.inverse_transform(y_pred.reshape(-1, 1))

        # Calculate R2
        r2 = r2_score(y_true_rescaled, y_pred_rescaled)

        # Calculate Directional Accuracy
        da = np.mean(
            np.sign(y_true_rescaled[1:] - y_true_rescaled[:-1]) ==
            np.sign(y_pred_rescaled[1:] - y_pred_rescaled[:-1])
        )

        print("R2:", r2)
        print("Directional Accuracy (DA):", f"{da * 100:.2f}%")
        print("MSE:", mean_squared_error(y_true_rescaled, y_pred_rescaled))
        print("RMSE:", np.sqrt(mean_squared_error(y_true_rescaled, y_pred_rescaled)))
        print("MAE:", mean_absolute_error(y_true_rescaled, y_pred_rescaled))
        print("MAPE%:", f"{mean_absolute_percentage_error(y_true_rescaled, y_pred_rescaled) * 100:.2f}%")

        return r2, da

    def plot_predictions(self, y_test, y_pred, num_samples=50):
        """
        Plot predictions vs actual values with both scatter and line plots.
        """
        y_test_rescaled = self.scaler.inverse_transform(y_test.reshape(-1, 1))
        y_pred_rescaled = self.scaler.inverse_transform(y_pred.reshape(-1, 1))

        fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 7))

        # Scatter plot
        ax1.scatter(y_test_rescaled, y_pred_rescaled, alpha=0.5)
        ax1.set_title("Predictions vs Actual")
        ax1.set_xlabel("Actual Values")
        ax1.set_ylabel("Predicted Values")

        # Add y=x line
        min_val = min(y_test_rescaled.min(), y_pred_rescaled.min())
        max_val = max(y_test_rescaled.max(), y_pred_rescaled.max())
        ax1.plot([min_val, max_val], [min_val, max_val], 'r--')

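        # Line plot (never request more points than the test set holds)
        num_samples = min(num_samples, len(y_test_rescaled))
        x_points = range(num_samples)
        ax2.plot(x_points, y_test_rescaled[:num_samples], 'b-', label='Actual')
        ax2.plot(x_points, y_pred_rescaled[:num_samples], 'orange', linestyle='--', label='Predicted')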
        
        ax2.set_title("Test Predictions")
        ax2.set_xlabel("Point")
        ax2.set_ylabel("Value")
        ax2.legend()

        plt.tight_layout()
        plt.show()

    def process(self):
        """
        Execute the full pipeline: load, preprocess, train, evaluate, and plot.
        """
        X_train, X_test, y_train, y_test = self.load_and_preprocess_data()
        self.build_model()
        self.train_model(X_train, y_train)
        y_pred_train = self.model.predict(X_train)
        y_pred_test = self.model.predict(X_test)
        self.evaluate_model(y_test, y_pred_test)
        self.plot_predictions(y_test, y_pred_test)

# Example usage for one file
file_path = "D:/OneDrive - Hanoi University of Science and Technology/GIT/MiniProj_StockPrediction_ML_SpManhGraduationProj_2024/data/raw20192024/FPT_stock_data.csv"
model = TimeSeriesLSTMModel(file_path)
model.process()
```

---

### **Main changes:**
1. **Added `R2` and `Directional Accuracy`:**
   - **R2**: Measures how well the model fits the data.
   - **DA**: Measures how often the model predicts the correct direction of change.

2. **Combined figure:**
   - A scatter plot (`y_true` against `y_pred`) and a line plot (a sample of points) in the same window.

3. **Tidier code:**
   - Uses `min()` and `max()` to draw the y=x line on the scatter plot.
   - The number of points shown on the line plot is configurable (`num_samples`).

---

### **Expected results:**
- **Scatter plot:** Compares predictions with actual values; the y=x line shows how close they are.
- **Line plot:** Shows the actual and predicted series together, which makes deviations easy to spot.
- **R2 and DA:** Printed during evaluation.

Give it a try and check the results! 😊

### **LSTM (Long Short-Term Memory): a simple explanation**

---

#### **Simple definition**:
- LSTM is a type of neural network used in Artificial Intelligence (AI) to **understand and predict sequences of events over time**.
- It works like a magic notebook that can **remember what mattered in the past** and **forget what did not**, so that it can make better predictions.

---

#### **Simple examples**:
1. **Your memory of an exam:**
   - You are studying for an exam. At first you remember a lot, but after a while you forget the unimportant details (such as the color of the pen you used) and keep what you need (the math formulas).
   - LSTM works in a similar way: it decides whether to keep or forget old information based on how important it is.

2. **Predicting stock prices:**
   - To predict tomorrow's stock price, we rely not only on today's price but also on the **long-term trend** (for example, the price has been rising steadily for a month) and on **recent events** (for example, yesterday's company announcement). An LSTM can keep track of both kinds of information, provided they are present in its input.

---

#### **How LSTM works:**
1. **Three magic doors:**
   - LSTM has three "gates" to manage information:
     - **Forget gate**: Decides which information is no longer important and can be forgotten.
     - **Input gate**: Decides which new information to add to the "long-term memory".
     - **Output gate**: Decides which part of the memory to use for the prediction.

2. **How they work together:**
   - At each step, the LSTM looks at the input and decides "what to keep", "what to forget", and "what to predict next".

---

#### **Comparison with Linear Regression, Lasso and Ridge**:
| **Algorithm**           | **Suited to**                                      | **Strengths**                                                  | **Weaknesses**                                  |
|-------------------------|----------------------------------------------------|----------------------------------------------------------------|-------------------------------------------------|
| **Linear Regression**   | Non-temporal data with linear relationships        | Easy to understand, fast                                       | Does not model the order of a time series       |
| **Lasso & Ridge**       | Data with many features, where overfitting is a risk | Reduce model complexity; Lasso can also select important features | Do not model relationships over time            |
| **LSTM**                | Time series, data with complex relationships       | Captures both long-term and short-term relationships           | Complex, needs substantial compute              |

---

#### **Simple example code with a time series:**
```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Sample data: stock price over consecutive days (made up)
data = np.array([10, 20, 30, 40, 50, 60, 70]).reshape(-1, 1)

# Build the time steps: X is the price on one day, y the price on the next day
X = np.array([[10], [20], [30], [40], [50], [60]])
y = np.array([20, 30, 40, 50, 60, 70])

# Reshape to the 3D format (samples, time_steps, features) that LSTM expects
X = X.reshape((X.shape[0], 1, X.shape[1]))

# Build the LSTM model
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(X.shape[1], X.shape[2])))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')

# Train the model
model.fit(X, y, epochs=200, verbose=0)

# Predict the next value
X_test = np.array([[70]]).reshape((1, 1, 1))
y_pred = model.predict(X_test)
print("Predicted next value:", y_pred[0][0])
```

---

#### **Summary:**
- LSTM is well suited to **time series** and other **sequential data**.
- It is especially useful for problems such as **weather forecasting**, **stock prices**, or **understanding long text** (as in chatbots).

If you want a model that "remembers and forgets" in a way loosely similar to the human brain, give LSTM a try! 🌟

In [2]:
```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Sample data: stock price over consecutive days (made up)
# Each value in the array is the stock price on one day.
data = np.array([10, 20, 30, 40, 50, 60, 70]).reshape(-1, 1)
print("Original data (stock price by day):")
print(data)

# Build the time steps
# X holds the input values (stock price on the previous day),
# y holds the values to predict (stock price on the next day).
X = np.array([[10], [20], [30], [40], [50], [60]])
y = np.array([20, 30, 40, 50, 60, 70])
print("\nInput data (X - previous day's price):")
print(X)
print("\nOutput data (y - next day's price):")
print(y)

# Reshape the input to the 3D format used by LSTM
# LSTM expects data shaped (samples, time_steps, features).
# In this example:
# - samples: number of sequences (6 sequences from X)
# - time_steps: number of time steps (1 day)
# - features: number of features (1, since each day only has a price).
X = X.reshape((X.shape[0], 1, X.shape[1]))
print("\nData after reshaping to 3D (for LSTM):")
print(X)

# Build the LSTM model
# Sequential: builds the model layer by layer.
# LSTM: Long Short-Term Memory layer with 50 hidden units and ReLU activation.
# Dense: output layer with 1 neuron (predicts the next value).
model = Sequential()
model.add(LSTM(50, activation='relu', input_shape=(X.shape[1], X.shape[2])))
model.add(Dense(1))  # Dense layer with a single output (next day's price).
model.compile(optimizer='adam', loss='mse')  # Adam optimizer, MSE loss.

# Train the model
# epochs=200: repeat training 200 times over the whole dataset.
# verbose=0: do not print training progress.
model.fit(X, y, epochs=200, verbose=0)

# Predict the next value
# Predict the stock price for the day after the price 70.
# The test input must be reshaped to the same 3D format as the training data.
X_test = np.array([[70]]).reshape((1, 1, 1))
y_pred = model.predict(X_test)

# Show the predicted value
print("\nPredicted value for the day after the price 70:")
print(y_pred[0][0])
```

```text
Original data (stock price by day):
[[10]
 [20]
 [30]
 [40]
 [50]
 [60]
 [70]]

Input data (X - previous day's price):
[[10]
 [20]
 [30]
 [40]
 [50]
 [60]]

Output data (y - next day's price):
[20 30 40 50 60 70]

Data after reshaping to 3D (for LSTM):
[[[10]]

 [[20]]

 [[30]]

 [[40]]

 [[50]]

 [[60]]]

Predicted value for the day after the price 70:
89.75144
```
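
In this run the network receives the raw prices (10 to 70), and its prediction of about 89.75 is some way from the 80 that continuing the +10 pattern would give. The class earlier in this conversation scales prices to the range 0 to 1 with `MinMaxScaler` before training. As a hedged sketch of what that scaling step does, here is the same min-max transform written by hand for the toy data; the names `lo`, `hi` and `inverse` are hypothetical, and whether scaling improves this particular toy model is something to verify by running it.

```python
import numpy as np

data = np.array([10, 20, 30, 40, 50, 60, 70], dtype=float)

# Min-max scaling to [0, 1], written out by hand
lo, hi = data.min(), data.max()
scaled = (data - lo) / (hi - lo)

def inverse(v):
    """Map scaled values back to price units."""
    return v * (hi - lo) + lo

# Same one-step setup as the cell above, now on scaled values
X = scaled[:-1].reshape(-1, 1, 1)  # (samples, time_steps, features)
y = scaled[1:]

print(scaled.round(3))      # values between 0 and 1
print(inverse(scaled[-1]))  # 70.0
```

A prediction made on scaled inputs is itself on the 0 to 1 scale, so it has to go through `inverse` before it can be read as a price.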


### **A detailed explanation of LSTM in Deep Learning**

#### **1. Overview of LSTM in Deep Learning**
LSTM (Long Short-Term Memory) is a special kind of **Recurrent Neural Network (RNN)** designed to process sequence data. Its strength is the ability to **retain important information over long spans** and **discard information it no longer needs**. This makes LSTM effective for problems with temporal dependencies, such as:

- Time-series forecasting (stock prices, weather, rainfall, and so on).
- Natural language processing (machine translation, chatbots, sentiment analysis).
- Speech or audio recognition.

---

#### **2. Inside the LSTM architecture**
LSTM has a more complex architecture than a plain RNN: three components called **gates** control the flow of information into and out of a **cell state**. In the formulas below, \( \sigma \) is the sigmoid function, \( [h_{t-1}, x_t] \) is the concatenation of the previous hidden state and the current input, and \( \odot \) is element-wise multiplication.

1. **Forget Gate:**
   - Decides which information from the previous state to forget.
   - Uses a sigmoid function to produce values between 0 and 1 that say how much to keep (0 forgets completely, 1 keeps everything).
   - Formula:
     \[
     f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)
     \]

2. **Input Gate:**
   - Decides which information to add to the current state.
   - Combines a sigmoid function (how much to store) with a tanh function (the new candidate information).
   - Formulas:
     \[
     i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)
     \]
     \[
     \tilde{C}_t = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c)
     \]

3. **Output Gate:**
   - Decides which part of the cell state is exposed as the hidden state \( h_t \).
   - Combines a sigmoid (how much to output) with a tanh of the cell state.
   - Formulas:
     \[
     o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)
     \]
     \[
     h_t = o_t \odot \tanh(C_t)
     \]

4. **Cell State:**
   - This is where long-term information is stored. The cell state is updated through the forget gate and the input gate.
   - Formula:
     \[
     C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t
     \]

---

#### **3. Data flow through an LSTM**
As sequence data passes through an LSTM:
1. The current input \( x_t \) and the previous hidden state \( h_{t-1} \) are fed into the gates.
2. The forget gate decides which information from the previous cell state to forget.
3. The input gate adds new information to the cell state.
4. The cell state \( C_t \) is updated and passed on.
5. The output gate produces the hidden state \( h_t \), which is used for prediction or passed to the next step.
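
The five steps above can be sketched as a single NumPy function. This is a minimal illustration with randomly initialized, untrained weights; the dictionary layout of `W` and `b`, the hidden size of 4 and the toy inputs are assumptions made for the example, and Keras stores its weights in a different layout.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step. W[k]: (hidden, hidden + inputs), b[k]: (hidden,)."""
    concat = np.concatenate([h_prev, x_t])       # step 1: [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ concat + b["f"])      # step 2: forget gate
    i_t = sigmoid(W["i"] @ concat + b["i"])      # step 3: input gate
    c_tilde = np.tanh(W["c"] @ concat + b["c"])  #         candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde           # step 4: cell state update
    o_t = sigmoid(W["o"] @ concat + b["o"])      # step 5: output gate
    h_t = o_t * np.tanh(c_t)                     #         new hidden state
    return h_t, c_t

# Hypothetical sizes and random (untrained) weights
rng = np.random.default_rng(0)
hidden, inputs = 4, 1
W = {k: rng.normal(scale=0.5, size=(hidden, hidden + inputs)) for k in "fico"}
b = {k: np.zeros(hidden) for k in "fico"}

# Run three toy (already scaled) prices through the cell
h, c = np.zeros(hidden), np.zeros(hidden)
for price in [0.1, 0.2, 0.3]:
    h, c = lstm_step(np.array([price]), h, c, W, b)
print(h.shape, c.shape)  # (4,) (4,)
```

Note that `h` and `c` are carried from one iteration to the next: that carry-over is what lets information from earlier time steps influence later ones.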

---

#### **4. Advantages of LSTM over a plain RNN**
- **Handles "long-term memory":** LSTM can carry important information across many time steps thanks to the cell state.
- **Reduces the vanishing gradient problem:** Plain RNNs struggle with long sequences because gradients shrink toward zero as they are propagated back through many steps. LSTM mitigates this through its gates and the additive cell-state update.
- **Wide range of applications:** From time series to language and audio processing.
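
A rough way to see the vanishing gradient effect is to multiply a gradient by the same factor at every time step. The sketch below is a scalar simplification: real gradients involve matrices and gate-dependent terms, and the factors 0.9 and 0.99 are hypothetical values picked for illustration.

```python
def gradient_after(steps, factor):
    """Scalar illustration: gradient size after `steps` multiplications by `factor`."""
    g = 1.0
    for _ in range(steps):
        g *= factor
    return g

# A per-step factor well below 1 shrinks the gradient quickly
rnn_like = gradient_after(100, 0.9)
# A forget gate close to 1 lets much more of the gradient through the cell state
lstm_like = gradient_after(100, 0.99)

print(f"{rnn_like:.2e}  {lstm_like:.2f}")  # 2.66e-05  0.37
```

After 100 steps the first gradient is effectively gone, while the second still carries a usable signal. This is the intuition behind keeping the forget gate near 1 for information that should persist.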

---

#### **5. The LSTM model in the code**
In the code you provided:

1. **Creating the time-series data:**
   - Sequence \( X \): the stock price on the previous day.
   - Sequence \( y \): the stock price on the following day.
   - 3D format `(samples, time_steps, features)`: the shape an LSTM layer expects.

2. **Building the model:**
   - LSTM with 50 hidden units: learns the relationship within the price sequence.
   - Dense: an output layer that predicts the next day's value.

3. **Training:**
   - The model is trained for 200 epochs to learn the relationship between \( X \) and \( y \).

4. **Prediction:**
   - The last value \( 70 \) is reshaped to the same 3D format and fed to the model to predict the next value.
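
To get a feel for the size of this model, the sketch below counts its trainable parameters from the gate formulas: each of the four weight sets (forget, input, candidate, output) has one weight matrix over `[h_{t-1}, x_t]` plus a bias. The helper names `lstm_params` and `dense_params` are hypothetical; the totals should agree with what `model.summary()` reports for this architecture.

```python
def lstm_params(units, features):
    # Four weight sets (forget, input, candidate, output), each with a
    # (units x (units + features)) matrix plus a bias vector of length `units`
    return 4 * (units * (units + features) + units)

def dense_params(inputs, outputs):
    # One weight per input-output pair plus one bias per output
    return inputs * outputs + outputs

lstm_count = lstm_params(units=50, features=1)
dense_count = dense_params(inputs=50, outputs=1)
print(lstm_count, dense_count, lstm_count + dense_count)  # 10400 51 10451
```

With only 6 training samples and more than ten thousand parameters, this toy model has far more capacity than data, so its predictions should be read as a demonstration of the API rather than a forecast.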

---

#### **6. Real-world applications**
- **Time-series forecasting:**
  - Stock prices, weather, traffic volume.
- **Natural language processing:**
  - Machine translation, text generation, next-word prediction, chatbots.
- **Audio recognition:**
  - Speech recognition.

---

#### **7. Limitations of LSTM**
- **Resource-intensive:** Needs more memory and training time than simpler models.
- **Hard to optimize:** On very long sequences LSTM does better than a plain RNN, but it can still lose information (gradient saturation).

---

#### **8. Going further:**
- Try **GRU (Gated Recurrent Unit)**, which is simpler than LSTM and often performs comparably.
- Use a **Seq2Seq model** if you need to predict several future time steps.
- Add Dropout to reduce overfitting.

# explain it in a really simple way --- then link it to the step-by-step formulas of the most basic network architecture

Then extend from there to LSTM

### **A simple explanation of Neural Networks and LSTM**

---

#### **1. What is a Neural Network?**
- Imagine you want to predict a value, for example: **"How much will my score go up if I study 1 more hour per day?"**.
- A **neural network** is like a group of "assistants" that help you make the prediction.
  - Each "assistant" learns a small piece of information (for example: "what effect 1 extra hour of study has").
  - The assistants work together to produce the final answer.

---

#### **2. How the most basic Neural Network works**
A basic neural network works in **3 main steps**:
1. **Input:**
   - This is the data you feed in, for example the number of hours studied.
2. **Hidden Layers:**
   - Each hidden layer has many "neurons" (the assistants). Each neuron computes a value from the inputs it receives.
   - The neurons use the formula:
     \[
     z = W \cdot X + b
     \]
     Where:
     - \( X \): Input value (hours studied).
     - \( W \): Weights (how important each input is).
     - \( b \): Bias (a constant offset).
   - The result \( z \) is then passed through an activation function to produce the neuron's output. A common choice is **ReLU**:
     \[
     a = \text{ReLU}(z) = \max(0, z)
     \]
3. **Output:**
   - Finally, the output layer takes the information from the hidden layer and makes the prediction, for example: "The score will go up by 5 points with 1 more hour of study."

---

#### **3. A basic Neural Network architecture**
For example, for a simple prediction problem:
- **Input:** Hours studied.
- **Hidden Layer:** 3 neurons.
- **Output:** Predicted score.

**Formula for the neurons in the hidden layer** (with 1 input and 3 neurons, \( W^{(1)} \) is a 3×1 matrix and \( b^{(1)} \) has 3 entries):
\[
z^{(1)} = W^{(1)} \cdot X + b^{(1)}
\]
\[
a^{(1)} = \text{ReLU}(z^{(1)})
\]

**Output formula** (\( W^{(2)} \) is a 1×3 matrix):
\[
y = W^{(2)} \cdot a^{(1)} + b^{(2)}
\]
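
The two formulas can be traced by hand on this 1-input, 3-neuron, 1-output network. In the sketch below the weights are hypothetical numbers picked so the arithmetic is easy to follow; a real network would learn them from data.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

# Hypothetical hand-picked weights (not learned)
W1 = np.array([[0.5], [-1.0], [2.0]])  # shape (3, 1): 3 hidden neurons, 1 input
b1 = np.array([0.0, 1.0, -1.0])
W2 = np.array([[1.0, 2.0, 0.5]])       # shape (1, 3): 1 output, 3 hidden neurons
b2 = np.array([0.5])

def forward(hours):
    x = np.array([hours], dtype=float)
    z1 = W1 @ x + b1            # z^(1) = W^(1) . X + b^(1)
    a1 = relu(z1)               # a^(1) = ReLU(z^(1))
    return (W2 @ a1 + b2)[0]    # y = W^(2) . a^(1) + b^(2)

# For hours = 2: z1 = [1, -1, 3], a1 = [1, 0, 3], y = 1 + 0 + 1.5 + 0.5
print(forward(2.0))  # 3.0
```

The second neuron outputs 0 for this input because ReLU clips its negative pre-activation, which shows how the activation function switches neurons on and off.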

---

#### **4. From a basic Neural Network to LSTM**

- **The problem with time-series data:**  
  If you want to predict from several consecutive days (time-ordered data), a basic neural network does not "remember" information from the previous days.

- **The solution: Recurrent Neural Network (RNN):**  
  An RNN adds a "short-term memory" by feeding the state from the previous step into the next step. But RNNs have a major limitation: when the sequence gets too long, old information is "forgotten" (vanishing gradient).

- **LSTM (Long Short-Term Memory):**  
  LSTM is an improvement on the RNN that adds a smarter way to "forget" and "remember", using three **main gates**:
  1. **Forget gate:** Decides which information to erase.
  2. **Input gate:** Decides which information to add.
  3. **Output gate:** Decides which information to emit as the result.
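
Before moving to the LSTM formulas, it may help to see the plain RNN recurrence in code. The sketch below uses the standard update \( h_t = \tanh(W_x x_t + W_h h_{t-1} + b) \), which is not written out in the text above; the weight names, the hidden size of 3 and the toy inputs are assumptions, and the weights are random rather than trained.

```python
import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    """Plain RNN update: the new state mixes the current input with the previous state."""
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

# Hypothetical sizes and random (untrained) weights
hidden = 3
rng = np.random.default_rng(0)
W_x = rng.normal(scale=0.5, size=(hidden, 1))
W_h = rng.normal(scale=0.5, size=(hidden, hidden))
b = np.zeros(hidden)

h = np.zeros(hidden)  # the "short-term memory" starts empty
states = []
for price in [0.1, 0.2, 0.3, 0.4]:  # toy, already scaled prices
    h = rnn_step(np.array([price]), h, W_x, W_h, b)
    states.append(h)

print(len(states), h.shape)  # 4 (3,)
```

The only memory here is `h`, which is overwritten at every step. LSTM adds a separate cell state and gates on top of this recurrence so that information can survive many steps.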

**LSTM formulas** (\( \sigma \) is the sigmoid function and \( \odot \) is element-wise multiplication):
1. **Forget gate:**
   \[
   f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)
   \]
   - \( f_t \): Fraction of the previous cell state to keep (0 forgets everything, 1 keeps everything).

2. **Input gate:**
   \[
   i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)
   \]
   - \( i_t \): Fraction of the new candidate information to add.
   \[
   \tilde{C}_t = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c)
   \]
   - \( \tilde{C}_t \): Candidate values for the cell state.

3. **State update:**
   \[
   C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t
   \]

4. **Output gate:**
   \[
   o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)
   \]
   \[
   h_t = o_t \odot \tanh(C_t)
   \]

---

#### **5. In short**
- **Basic Neural Network:** Learns simple relationships from the input.
- **RNN:** Adds the ability to "remember" information from the recent past.
- **LSTM:** Improves on this to "remember longer" and "forget" what is not needed, which helps with complex sequence problems.

---

You can use the stock price prediction example to get a deeper feel for how LSTM "remembers" and "forgets" important data over time! 🌟