# Task


## Project Task: Multivariate Stock Price Prediction

**Goal:**
The objective of this project is to build and train a Long Short-Term Memory (LSTM) network to predict the closing price of a stock. Unlike simple models that only use past prices, you will build a **multivariate** model that leverages multiple features for each time step, including technical indicators and overall market performance. This approach mimics a more realistic trading analysis.

---

### Part 1: Data Collection and Feature Engineering

Your first step is to gather and prepare the data. You will use the `yfinance` library in Python to download historical stock data.

**1.  Data Acquisition:**
    * Choose a major technology stock, for example, **Apple Inc. (AAPL)**.
    * Download its daily data from **January 1, 2018, to December 31, 2023,** for the training set.
    * Download its data from **January 1, 2024, to the present** for the testing set.
    * Download the data for the **S&P 500 index (`^GSPC`)** for the same date ranges. This will serve as a feature representing the overall market condition.
    * Optionally, download data for another index or stock.

**2.  Feature Engineering:**
    Create a single DataFrame for your stock and add the following features:
    * **Technical Indicators:** Calculate the following for your chosen stock:
        * A **20-day Simple Moving Average (SMA)** of the 'Close' price.
        * A **50-day Simple Moving Average (SMA)** of the 'Close' price.
        * The **14-day Relative Strength Index (RSI)**.
    * **Market Data:** Add the 'Close' price of the S&P 500 (`^GSPC`) as a feature for each corresponding day.
    * **Final Features:** Your model will use the following features to predict the next day's 'Close' price: `['Close', 'Volume', 'SMA_20', 'SMA_50', 'RSI', 'GSPC_Close']`.
    * Consider which other features could be added.
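
The feature list above could be assembled roughly as follows. This is a minimal sketch, not a reference solution: the helper name `add_features` is hypothetical, it assumes `stock` has `Close` and `Volume` columns on a date index, and it computes RSI from simple rolling averages of gains and losses (Wilder's exponential smoothing is a common alternative that gives slightly different values).

```python
import numpy as np
import pandas as pd


def add_features(stock: pd.DataFrame, market_close: pd.Series) -> pd.DataFrame:
    """Return the six model features built from stock data and an index close."""
    df = stock[["Close", "Volume"]].copy()

    # Simple moving averages of the closing price.
    df["SMA_20"] = df["Close"].rolling(window=20).mean()
    df["SMA_50"] = df["Close"].rolling(window=50).mean()

    # 14-day RSI from simple rolling averages of gains and losses
    # (assumption: Wilder's smoothing is not used here).
    delta = df["Close"].diff()
    avg_gain = delta.clip(lower=0).rolling(window=14).mean()
    avg_loss = (-delta).clip(lower=0).rolling(window=14).mean()
    rs = avg_gain / avg_loss
    df["RSI"] = 100 - 100 / (1 + rs)

    # Align the index close with the stock's trading dates.
    df["GSPC_Close"] = market_close.reindex(df.index)

    # The first 49 rows have no 50-day SMA, so drop incomplete rows.
    return df.dropna()
```

Dropping the warm-up rows keeps `NaN` values out of the scaler and the network; the cost is losing the first 49 trading days of each downloaded range.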

---

### Part 2: Data Preprocessing and Model Building

With your features ready, you need to prepare the data for the LSTM network.

**1.  Scaling:**
    * Scale (normalize) all your features. Fit the scaler **only on the training data**, then use it to transform both the training and testing data, so that no information from the test period leaks into training.
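
One way to apply this rule is sketched below. It assumes scikit-learn's `MinMaxScaler` (the task does not prescribe a particular scaler) and a hypothetical helper name `scale_features`.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler


def scale_features(train: np.ndarray, test: np.ndarray):
    """Fit a MinMaxScaler on `train` only, then transform both arrays."""
    scaler = MinMaxScaler(feature_range=(0, 1))
    train_scaled = scaler.fit_transform(train)  # min/max come from train only
    test_scaled = scaler.transform(test)        # values may fall outside [0, 1]
    return scaler, train_scaled, test_scaled
```

Because the test period can contain prices above the training maximum, scaled test values greater than 1 are expected and are not an error.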

**2.  Sequence Creation:**
    * Write a function that converts your time-series data into supervised learning sequences.
    * Use a sequence length of **30 time steps**. This means your model will look at the features from the past 30 days (`X`) to predict the 'Close' price on the 31st day (`y`).
    * For each sequence, `X` should have a shape of `(30, 6)` (30 days, 6 features), and `y` will be the single 'Close' price to be predicted.
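
A sliding-window function along these lines would produce the shapes described above. It is a sketch under the assumption that `data` is a 2-D array of scaled features with 'Close' in column 0; the name `create_sequences` and the `target_col` parameter are illustrative.

```python
import numpy as np


def create_sequences(data: np.ndarray, seq_len: int = 30, target_col: int = 0):
    """Build (X, y) pairs: seq_len rows of features, then the next row's target."""
    X, y = [], []
    for end in range(seq_len, len(data)):
        X.append(data[end - seq_len:end])   # rows end-seq_len .. end-1
        y.append(data[end, target_col])     # 'Close' on the following day
    return np.array(X), np.array(y)
```

For `n` rows of input this yields `n - 30` samples, so `X` has shape `(n - 30, 30, 6)`. To predict the first days of the test period, one option is to prepend the last 30 rows of the scaled training data to the test array before calling the function.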

**3.  Build the LSTM Model:**
    * Construct a neural network using Keras/TensorFlow. Example architecture:
        * An `LSTM` layer with 50 units.
        * A `Dropout` layer with a rate of 0.2 to prevent overfitting.
        * A second `LSTM` layer with 50 units.
        * A second `Dropout` layer with a rate of 0.2.
        * A `Dense` output layer with **1 unit** (for the single predicted price).
    * Compile the model using the `adam` optimizer and `mean_squared_error` as the loss function.
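
The example architecture might be written as below. This is a sketch assuming TensorFlow 2.x with its bundled Keras; `build_model` is a hypothetical helper name. The first `LSTM` layer sets `return_sequences=True` so that the second `LSTM` layer receives a full sequence rather than only the last hidden state.

```python
from tensorflow import keras
from tensorflow.keras import layers


def build_model(seq_len: int = 30, n_features: int = 6) -> keras.Model:
    """Two stacked LSTM layers with dropout and a single-unit regression output."""
    model = keras.Sequential([
        keras.Input(shape=(seq_len, n_features)),
        layers.LSTM(50, return_sequences=True),  # pass the full sequence onward
        layers.Dropout(0.2),
        layers.LSTM(50),                         # return only the final state
        layers.Dropout(0.2),
        layers.Dense(1),                         # predicted scaled 'Close'
    ])
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model
```

Calling `build_model().summary()` prints the layer shapes, which is a quick check that the input matches the `(30, 6)` sequences.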

---

### Part 3: Training, Evaluation, and Visualization

Now you will train your model and evaluate its performance.

**1.  Training:**
    * Train the model on your prepared training sequences for at least 25 epochs.

**2.  Evaluation:**
    * Make predictions on the test set.
    * Remember that your predictions will be scaled. You must use the scaler's `inverse_transform` method to convert them back to their original dollar values.
    * Calculate the **Root Mean Squared Error (RMSE)** between the actual prices and your model's predictions.
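
Since the scaler was fitted on all six features, `inverse_transform` expects six columns, while the model outputs only one. The sketch below shows one workaround: pad the predictions into a zero array, invert, and keep the 'Close' column. It assumes a fitted scikit-learn scaler with 'Close' in column 0; `invert_close` and `rmse` are hypothetical helper names, and the toy two-feature scaler at the end is only for illustration.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler


def invert_close(scaler, scaled_close, close_col: int = 0) -> np.ndarray:
    """Map scaled 'Close' values back to dollars using a multi-feature scaler."""
    scaled_close = np.asarray(scaled_close, dtype=float).reshape(-1)
    # inverse_transform needs every feature column, so pad with zeros
    # and read back only the 'Close' column.
    padded = np.zeros((len(scaled_close), scaler.n_features_in_))
    padded[:, close_col] = scaled_close
    return scaler.inverse_transform(padded)[:, close_col]


def rmse(actual, predicted) -> float:
    """Root mean squared error between two equally sized arrays."""
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


# Toy example: a scaler fitted on two features, 'Close' in column 0.
toy_scaler = MinMaxScaler().fit(np.array([[100.0, 1.0], [200.0, 3.0]]))
prices = invert_close(toy_scaler, [0.0, 0.5, 1.0])
```

Both the predictions and the true targets should be inverted the same way before computing RMSE, so that the error is reported in dollars rather than in scaled units.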

**3.  Visualization:**
    * Create a plot that clearly shows:
        * The actual stock prices from the test set.
        * The predicted stock prices from your model.
    * Ensure your plot is properly labeled with a title, x-axis label ('Time'), and y-axis label ('Stock Price USD').
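
A plotting helper for this step could look like the following. It is a sketch using Matplotlib; the function name `plot_predictions` and the title wording are illustrative choices, and `dates`, `actual`, and `predicted` are assumed to be equal-length sequences.

```python
import matplotlib.pyplot as plt


def plot_predictions(dates, actual, predicted, ticker: str = "AAPL"):
    """Plot actual and predicted closing prices on one labeled axis."""
    fig, ax = plt.subplots(figsize=(10, 5))
    ax.plot(dates, actual, label="Actual price")
    ax.plot(dates, predicted, label="Predicted price")
    ax.set_title(f"{ticker}: actual vs. predicted closing price")
    ax.set_xlabel("Time")
    ax.set_ylabel("Stock Price USD")
    ax.legend()
    return fig, ax
```

After calling the function, `plt.show()` displays the figure in an interactive session, and `fig.savefig("predictions.png")` writes it to disk.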

---

### Part 4: Reframe the Task as Classification

1. Now we want to provide a recommendation for the stock: BUY, SELL, or HOLD.
2. The thresholds are up to you, but you could use something like:
    * if the closing price is at least 3% lower than it was t days ago - BUY
    * if the closing price is at least 3% higher than it was t days ago - SELL
    * otherwise - HOLD
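
The example thresholds can be turned into labels as sketched below. The helper name `label_signals` and the default `t = 5` are assumptions made for illustration; the 3% threshold follows the example rule.

```python
import pandas as pd


def label_signals(close: pd.Series, t: int = 5, threshold: float = 0.03) -> pd.Series:
    """Label each day BUY, SELL, or HOLD from the t-day change in closing price."""
    change = close.pct_change(periods=t)  # (close[i] - close[i - t]) / close[i - t]
    labels = pd.Series("HOLD", index=close.index, name="Signal", dtype=object)
    labels.loc[change <= -threshold] = "BUY"    # fell by 3% or more
    labels.loc[change >= threshold] = "SELL"    # rose by 3% or more
    # The first t rows have no reference price, so leave them unlabeled.
    labels.loc[change.isna()] = None
    return labels
```

To train a classifier on these labels, they would typically be encoded as integers and the unlabeled rows dropped; checking `labels.value_counts()` first shows how imbalanced the three classes are for a given `t` and threshold.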


---


### 🌟 Bonus Challenge (Optional)

* Experiment with different sequence lengths (e.g., 60 or 90 days) and report whether they improve your RMSE.
* Add another technical indicator, such as the **MACD**, as a feature.
* Try changing the model architecture.
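
For the MACD suggestion, a minimal sketch is shown below. It assumes the conventional 12/26/9 spans and a DataFrame with a `Close` column; the helper name `add_macd` and the output column names are illustrative.

```python
import pandas as pd


def add_macd(df: pd.DataFrame, fast: int = 12, slow: int = 26, signal: int = 9) -> pd.DataFrame:
    """Return a copy of `df` with MACD and MACD_Signal columns added."""
    out = df.copy()
    ema_fast = out["Close"].ewm(span=fast, adjust=False).mean()
    ema_slow = out["Close"].ewm(span=slow, adjust=False).mean()
    out["MACD"] = ema_fast - ema_slow  # fast EMA minus slow EMA
    out["MACD_Signal"] = out["MACD"].ewm(span=signal, adjust=False).mean()
    return out
```

If these columns are added to the feature list, the number of features per time step changes from 6, so the model's input shape needs to be updated to match.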