# 📊 AI Financial Analyst - Predicting Stock Market Prices 📈

Welcome, AI Financial Analysts! Your mission is to **predict stock market closing prices** based on historical financial data.

You will use **Decision Trees, Random Forests, and XGBoost** to analyze stock trends and make predictions.

**Let's get started! 🚀**


In [1]:
# 📌 Step 1: Import necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from xgboost import XGBRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV

print('Libraries imported successfully! ✅')

Libraries imported successfully! ✅


## 📂 Step 2: Load the Stock Market Dataset
Let's load the dataset and inspect the first few rows.
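
A minimal sketch of the loading and inspection step. The real `stock_data.csv` is not shown here, so the inline CSV text, its column names (`Date`, `Open`, `High`, `Low`, `Close`, `Volume`), and its values are made-up stand-ins; adjust them to whatever the actual file contains.

```python
import io

import pandas as pd

# Made-up rows standing in for stock_data.csv; column names are assumptions.
csv_text = """Date,Open,High,Low,Close,Volume
2024-01-02,100.0,102.0,99.5,101.5,120000
2024-01-03,101.5,103.0,101.0,102.0,98000
2024-01-04,102.0,102.5,100.0,100.5,134000
2024-01-05,100.5,101.0,99.0,99.5,110000
2024-01-08,99.5,101.5,99.0,101.0,125000
"""

# With the real file this would be pd.read_csv('stock_data.csv', ...).
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

# Keep rows in chronological order; later steps rely on it.
df = df.sort_values("Date").reset_index(drop=True)

print(df.head())        # first rows
print(df.dtypes)        # column types
print(df.isna().sum())  # missing values per column
```

Checking dtypes and missing values at this point makes it easier to decide which columns can be used as numerical features in Step 3.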

In [1]:
# df = pd.read_csv('stock_data.csv')
# df.head()

## 🏗 Step 3: Feature Engineering & Preprocessing
We need to extract relevant features from the dataset and prepare it for model training.
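
One possible way to approach this step, sketched on synthetic data. The random-walk prices, the column names, and the chosen features (`close_lag_1`, `return_1d`, `ma_5`) are illustrative assumptions, not the features the dataset necessarily calls for.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Synthetic random-walk prices stand in for the real dataset (assumption).
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "Close": 100 + np.cumsum(rng.normal(0, 1, size=300)),
    "Volume": rng.integers(50_000, 150_000, size=300),
})

# Features built only from information available up to the current day.
df["close_lag_1"] = df["Close"].shift(1)      # previous day's close
df["return_1d"] = df["Close"].pct_change()    # one-day percentage change
df["ma_5"] = df["Close"].rolling(5).mean()    # 5-day moving average

# Target: the next day's closing price.
df["target"] = df["Close"].shift(-1)

# Shifting and rolling create NaNs at both ends; drop those rows.
df = df.dropna().reset_index(drop=True)

feature_cols = ["Close", "Volume", "close_lag_1", "return_1d", "ma_5"]
X = df[feature_cols]
y = df["target"]

# Fit the scaler on the earliest 80% of rows only, so statistics from
# later (test-period) rows do not leak into the training features.
n_train = int(len(X) * 0.8)
scaler = StandardScaler().fit(X.iloc[:n_train])
X_scaled = pd.DataFrame(scaler.transform(X), columns=feature_cols)

print(X_scaled.describe().loc[["mean", "std"]])
```

Tree-based models are insensitive to feature scaling, so the scaling here matters mainly if a scale-sensitive model is added later.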

In [2]:
# Extract numerical features from the dataset


# Scale numerical features


## 🏗 Step 4: Train-Test Split
We need to split the data into training and testing sets.
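
Because stock data is ordered in time, a shuffled split would let the model train on days that come after the ones it is tested on. The sketch below keeps the order by passing `shuffle=False`; the arrays are placeholders, and the 80/20 ratio is an assumption.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays; each row represents one trading day in date order.
X = np.arange(100).reshape(-1, 1)
y = np.arange(100) * 2.0

# shuffle=False keeps the first 80% of rows for training and the
# last 20% for testing, so the test period follows the training period.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

print(X_train.shape, X_test.shape)
```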

## 🌳 Step 5: Train a Decision Tree Regressor
Let's train a **Decision Tree** model to predict stock prices.
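
A minimal sketch of fitting the tree and computing RMSE. The synthetic random-walk prices, the three lagged-price features, and `max_depth=5` are assumptions chosen only to make the example runnable.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic data (assumption): predict a price from the three previous prices.
rng = np.random.default_rng(0)
prices = 100 + np.cumsum(rng.normal(0, 1, size=500))
X = np.column_stack([prices[2:-1], prices[1:-2], prices[:-3]])
y = prices[3:]

# Chronological split: the test rows come after the training rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

# Limiting the depth reduces overfitting to noise in the training data.
dt_model = DecisionTreeRegressor(max_depth=5, random_state=42)
dt_model.fit(X_train, y_train)

# RMSE is the square root of the mean squared error (same units as price).
dt_pred = dt_model.predict(X_test)
dt_rmse = np.sqrt(mean_squared_error(y_test, dt_pred))
print(f"Decision Tree RMSE: {dt_rmse:.3f}")
```

A tree predicts the mean of the training targets in each leaf, so its predictions never fall outside the range of prices seen during training. This is worth keeping in mind when the test period trends beyond that range.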

In [3]:

print('Decision Tree RMSE:')

Decision Tree RMSE:


## 🌲 Step 6: Train a Random Forest Regressor
Let's see whether a **Random Forest**, which averages the predictions of many decision trees, improves on the single tree.
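
The same workflow can be sketched with a forest. The synthetic data and the hyperparameter values (`n_estimators=100`, `max_depth=8`) are assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data (assumption): predict a price from the three previous prices.
rng = np.random.default_rng(0)
prices = 100 + np.cumsum(rng.normal(0, 1, size=500))
X = np.column_stack([prices[2:-1], prices[1:-2], prices[:-3]])
y = prices[3:]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

# Each tree is trained on a bootstrap sample of the training rows;
# the forest's prediction is the average over all trees.
rf_model = RandomForestRegressor(
    n_estimators=100, max_depth=8, random_state=42, n_jobs=-1
)
rf_model.fit(X_train, y_train)

rf_rmse = np.sqrt(mean_squared_error(y_test, rf_model.predict(X_test)))
print(f"Random Forest RMSE: {rf_rmse:.3f}")
```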

In [4]:

print('Random Forest RMSE:')

Random Forest RMSE:


## ⚡ Step 7: Train an XGBoost Regressor
Let's use **XGBoost**, a gradient-boosted tree ensemble, and compare its RMSE with the previous models.
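
A hedged sketch of the boosting step. The hyperparameter values and synthetic data are assumptions. The `try`/`except` fallback to scikit-learn's `GradientBoostingRegressor` is only there so the sketch still runs where `xgboost` is not installed; it is a different implementation of gradient boosting, not XGBoost itself.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

try:
    from xgboost import XGBRegressor as BoostedRegressor
except ImportError:
    # Fallback (assumption): a different gradient-boosting implementation.
    from sklearn.ensemble import GradientBoostingRegressor as BoostedRegressor

# Synthetic data (assumption): predict a price from the three previous prices.
rng = np.random.default_rng(0)
prices = 100 + np.cumsum(rng.normal(0, 1, size=500))
X = np.column_stack([prices[2:-1], prices[1:-2], prices[:-3]])
y = prices[3:]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

# Trees are added one at a time, each fitted to the errors of the
# ensemble so far; learning_rate shrinks each tree's contribution.
xgb_model = BoostedRegressor(
    n_estimators=200, learning_rate=0.05, max_depth=3, random_state=42
)
xgb_model.fit(X_train, y_train)

xgb_rmse = np.sqrt(mean_squared_error(y_test, xgb_model.predict(X_test)))
print(f"XGBoost RMSE: {xgb_rmse:.3f}")
```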

In [5]:

print('XGBoost RMSE:')

XGBoost RMSE:


## ⚙️ Step 8: Hyperparameter Tuning for Random Forest & XGBoost
Now let's **optimize** our models using GridSearchCV to find the best hyperparameters.
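
A small sketch of a grid search for the Random Forest. The grid values are deliberately tiny assumptions to keep the run short, and `TimeSeriesSplit` is one reasonable choice of cross-validation for time-ordered data rather than a requirement of the exercise.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

# Synthetic training data (assumption): three lagged prices as features.
rng = np.random.default_rng(0)
prices = 100 + np.cumsum(rng.normal(0, 1, size=400))
X_train = np.column_stack([prices[2:-1], prices[1:-2], prices[:-3]])
y_train = prices[3:]

# Hypothetical search grid; widen it for a real experiment.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, 6]}

search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    param_grid,
    # scikit-learn maximizes scores, so RMSE is exposed in negated form.
    scoring="neg_root_mean_squared_error",
    # Each fold validates on rows that come after its training rows.
    cv=TimeSeriesSplit(n_splits=3),
)
search.fit(X_train, y_train)

print("Best Random Forest Params:", search.best_params_)
print("Best CV RMSE:", -search.best_score_)
```

After fitting, `search.best_estimator_` holds a model refitted on all training rows with the best parameters, which can then be evaluated on the held-out test set. The same pattern applies to `XGBRegressor` with its own parameter names.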

In [6]:
# Hyperparameter tuning for Random Forest

# Use GridSearchCV

print('Best Random Forest Params:')

Best Random Forest Params:


In [7]:
# Hyperparameter tuning for XGBoost

# Use GridSearchCV
print('Best XGBoost Params:')

Best XGBoost Params:


## 🏆 Step 9: Build a Stacked Model
Now that we have optimized our models, let's combine them into an **ensemble model**.
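
A sketch of one way to combine models, using the `VotingRegressor` imported in Step 1. The base models, their settings, and the synthetic data are assumptions. Note that a voting ensemble averages the base models' predictions; true stacking, available as `sklearn.ensemble.StackingRegressor`, instead trains a final model on those predictions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic data (assumption): predict a price from the three previous prices.
rng = np.random.default_rng(0)
prices = 100 + np.cumsum(rng.normal(0, 1, size=500))
X = np.column_stack([prices[2:-1], prices[1:-2], prices[:-3]])
y = prices[3:]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

# In the exercise, the tuned models from Step 8 would go here.
ensemble = VotingRegressor([
    ("dt", DecisionTreeRegressor(max_depth=5, random_state=42)),
    ("rf", RandomForestRegressor(n_estimators=100, random_state=42)),
])
ensemble.fit(X_train, y_train)

# With default (equal) weights, the prediction is the plain average
# of the base models' predictions.
ensemble_pred = ensemble.predict(X_test)
ensemble_rmse = np.sqrt(mean_squared_error(y_test, ensemble_pred))
print(f"Stacked Model RMSE: {ensemble_rmse:.3f}")
```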

In [8]:
# Build an ensemble using the best-tuned regressors
# stacked_model = VotingRegressor()

# Train stacked model
print('Stacked Model RMSE:')

Stacked Model RMSE:


## 📊 Step 10: Final Model Comparison
Let's compare all models, including the stacked model.
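
The comparison table can be sketched as follows. The model names and prediction values are toy numbers for illustration only, not real results, and `rmse` is a hypothetical helper.

```python
import numpy as np
import pandas as pd


def rmse(y_true, y_pred):
    """Root mean squared error; lower is better."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Toy values for illustration only, not real model output.
y_test = [10.0, 12.0, 14.0]
predictions = {
    "Model A": [10.0, 12.0, 14.0],  # matches the targets exactly
    "Model B": [11.0, 13.0, 15.0],  # off by 1 on every row
    "Model C": [12.0, 10.0, 14.0],  # off by 2 on two rows
}

model_results = {name: rmse(y_test, pred) for name, pred in predictions.items()}

# Sort ascending so the best (lowest-RMSE) model is listed first.
results_df = (
    pd.DataFrame(list(model_results.items()), columns=["Model", "RMSE"])
    .sort_values("RMSE")
    .reset_index(drop=True)
)
print(results_df)
```

In the exercise, `predictions` would hold each fitted model's predictions on the same test set, so that all RMSE values are directly comparable.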

In [9]:
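# RMSE is a regression metric; accuracy_score applies only to classification.
# Fill in y_test and each model's predictions inside mean_squared_error().
# model_results = {
#     'Decision Tree': np.sqrt(mean_squared_error()),
#     'Random Forest': np.sqrt(mean_squared_error()),
#     'Tuned RF': np.sqrt(mean_squared_error()),
#     'XGBoost': np.sqrt(mean_squared_error()),
#     'Tuned XGBoost': np.sqrt(mean_squared_error()),
#     'Stacked Model': np.sqrt(mean_squared_error())
# }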

# results_df = pd.DataFrame(list(model_results.items()), columns=['Model', 'RMSE'])
# print(results_df)

# plt.figure(figsize=(10,5))
# plt.bar(results_df['Model'], results_df['RMSE'], color=['blue', 'green', 'red', 'orange', 'purple', 'black'])
# plt.xlabel('Models')
# plt.ylabel('RMSE (lower is better)')
# plt.title('Final Model Performance Comparison')
# plt.xticks(rotation=45)
# plt.show()

## 📝 Step 11: Final Questions
Please answer the following questions in the markdown cell below:

1. **Model Comparison:** Which model had the lowest (best) RMSE? Why?
2. **Hyperparameter Tuning:** How much did tuning improve the performance of Random Forest and XGBoost?
3. **Stacked Model:** Did the stacked model outperform individual models? Why or why not?
4. **Feature Importance:** Which features were most important in predicting stock prices? Use the feature importance of XGBoost to analyze this.
5. **Real-World Application:** How can this approach be applied in real-world stock trading and investment decisions?
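
For the feature importance question, a fitted boosted model exposes a `feature_importances_` attribute. The sketch below shows how it might be read; the synthetic data, feature names, and hyperparameters are assumptions, and the fallback to scikit-learn's `GradientBoostingRegressor` (a different implementation) is only there so the sketch runs without `xgboost` installed.

```python
import numpy as np
import pandas as pd

try:
    from xgboost import XGBRegressor as BoostedRegressor
except ImportError:
    # Fallback (assumption): a different gradient-boosting implementation.
    from sklearn.ensemble import GradientBoostingRegressor as BoostedRegressor

# Synthetic data (assumption) with hypothetical feature names.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Close": 100 + np.cumsum(rng.normal(0, 1, size=400)),
    "Volume": rng.integers(50_000, 150_000, size=400),
})
df["close_lag_1"] = df["Close"].shift(1)
df["close_lag_2"] = df["Close"].shift(2)
df["volume_lag_1"] = df["Volume"].shift(1)
df = df.dropna().reset_index(drop=True)

feature_cols = ["close_lag_1", "close_lag_2", "volume_lag_1"]
model = BoostedRegressor(n_estimators=100, max_depth=3, random_state=42)
model.fit(df[feature_cols], df["Close"])

# Pair each importance score with its feature name, largest first.
importances = pd.Series(
    model.feature_importances_, index=feature_cols
).sort_values(ascending=False)
print(importances)
```

Importance scores describe how much the fitted model relied on each feature, not whether the feature causes price movements.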

📌 Write your answers in the markdown cell below.

### ✍️ Your Answers Here
(Provide detailed responses to each question above.)