## Predictive Analysis on Stock Prices using Machine Learning and LightningChart Python

### 1. Introduction

#### 1.1 What is the stock market and how does it operate?
The stock market is a network of buyers and sellers trading shares, and it is one of the pivotal components of a free-market economy. Share prices fluctuate with supply and demand, investor sentiment, and economic indicators.

#### 1.2 Why is it important for stock traders to attempt stock price prediction analyses?
For stock traders, anticipating price movements can mean the difference between significant gains and losses, which is why stock price prediction with machine learning in Python attracts so much interest.

#### 1.3 How can machine learning help in stock price prediction?
Machine learning, capable of analyzing vast datasets and identifying patterns, has emerged as a powerful tool for stock price prediction. This project explores various models and methodologies used to forecast market trends.

#### 1.4 LSTM Model for Predicting Stock Prices
The Long Short-Term Memory (LSTM) network, a type of recurrent neural network (RNN), is well-suited for sequence prediction problems like stock price forecasting due to its ability to capture temporal dependencies.
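
To make the idea of "capturing temporal dependencies" more concrete, here is a minimal NumPy sketch of a single LSTM cell stepping through a short price sequence. It is an illustration only, not how Keras implements its `LSTM` layer. The gate order inside the stacked weight matrices, the random weights, and the toy prices are all assumptions of this sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """Run one LSTM time step.

    W: (4*hidden, input_dim), U: (4*hidden, hidden), b: (4*hidden,)
    Assumed gate order in the stacked weights: input, forget, candidate, output.
    """
    hidden = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0:hidden])                  # input gate: how much new information to store
    f = sigmoid(z[hidden:2 * hidden])         # forget gate: how much old cell state to keep
    g = np.tanh(z[2 * hidden:3 * hidden])     # candidate values for the cell state
    o = sigmoid(z[3 * hidden:4 * hidden])     # output gate: how much cell state to expose
    c_t = f * c_prev + i * g                  # cell state carries long-term memory
    h_t = o * np.tanh(c_t)                    # hidden state is the step's output
    return h_t, c_t

# Random, untrained weights purely for illustration
rng = np.random.default_rng(0)
hidden, input_dim = 4, 1
W = rng.normal(scale=0.1, size=(4 * hidden, input_dim))
U = rng.normal(scale=0.1, size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)

h = np.zeros(hidden)
c = np.zeros(hidden)
toy_prices = np.array([0.10, 0.12, 0.11, 0.15])  # made-up scaled closes
for price in toy_prices:
    h, c = lstm_step(np.array([price]), h, c, W, U, b)

print(h.shape, c.shape)
```

The cell state `c` is what lets information from early time steps survive to later ones, since the forget gate can keep it close to unchanged across many steps.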

### 2. LightningChart Python

#### 2.1 Overview of LightningChart Python
LightningChart Python is a high-performance data visualization library designed for creating complex, interactive, and real-time charts, particularly useful in financial applications.

#### 2.2 Features and Chart Types to be Used in the Project
- **XY Chart**: For visualizing data in two dimensions with series types such as Line Series, Point Line Series, and Area Series.
- **3D Chart**: For a more immersive and detailed view of data trends.
- **Line Chart**: Used for visualizing changes in stock prices over time.
- **Stacked Bar Chart and Grouped Bar Chart**: For comparing different components of stock data.

#### 2.3 Performance Characteristics
Key performance characteristics include real-time data updating, high refresh rates, and efficient data handling, essential for financial applications where data needs to be processed and visualized in real-time.

### 3. Setting Up Python Environment

#### 3.1 Installing Python and Necessary Libraries
Install Python from the official website and use pip to install necessary libraries including LightningChart Python from PyPI.

In [1]:
# pip install lightningchart numpy pandas scikit-learn tensorflow
# (random, math, and datetime ship with Python and are not installed with pip)
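
Before running the rest of the notebook, it can help to confirm that the required modules are importable. The sketch below uses only the standard library. The helper name `missing_modules` is hypothetical, and the list uses import names rather than PyPI names (scikit-learn is imported as `sklearn`).

```python
import importlib.util

# Import names used in this project (not PyPI package names)
REQUIRED = ["lightningchart", "numpy", "pandas", "sklearn", "tensorflow"]

def missing_modules(names):
    """Return the import names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_modules(REQUIRED)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```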

In [2]:
import lightningchart as lc
import random
lc.set_license('my-license-key')

import math
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Dropout
from datetime import datetime
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

#### 3.2 Overview of Libraries Used
- **LightningChart**: Advanced data visualization.
- **NumPy**: Numerical computation.
- **Pandas**: Data manipulation and analysis.
- **Scikit-learn**: Feature scaling (`MinMaxScaler`) and evaluation metrics.
- **TensorFlow**: Building and training the LSTM model through the Keras API.

#### 3.3 Setting Up the Development Environment
Recommended IDEs include Jupyter Notebook, PyCharm, or Visual Studio Code.

### 4. Loading and Processing Data

#### 4.1 How to Load the Data Files
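Historical price data can be sourced from financial data providers such as Yahoo Finance, Alpha Vantage, and Quandl (now Nasdaq Data Link). The cell below reads a locally saved CSV export of Alphabet Inc. Class A (GOOGL) prices and keeps the rows from 2020-01-01 through 2024-05-14.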

In [3]:
# Load and preprocess the dataset
df_googl = pd.read_csv('./Alphabet Inc - Class A (GOOGL).csv')
df_googl.rename(columns={"Date":"date","Open":"open","High":"high","Low":"low","Close":"close"}, inplace=True)
df_googl['date'] = pd.to_datetime(df_googl.date)
df_googl.sort_values(by='date', inplace=True)
specified_start_date = pd.to_datetime('2020-01-01')
specified_end_date = pd.to_datetime('2024-05-14')
filtered_df = df_googl[(df_googl['date'] >= specified_start_date) & (df_googl['date'] <= specified_end_date)]

#### 4.2 Handling and preprocessing the data
Preprocessing involves cleaning the data, handling missing values, and transforming it for machine learning models.
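
The notebook cell that follows covers scaling and windowing but does not show a missing-value step. As a hedged sketch of what that cleaning could look like, the example below forward-fills gaps in a toy `close` column and drops any rows that still have no price. The toy values are made up, and forward-filling is only one possible policy.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the price data; values are made up for illustration
toy = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04",
                            "2024-01-05", "2024-01-08"]),
    "close": [np.nan, 140.0, np.nan, 141.5, 142.0],
})

cleaned = toy.sort_values("date").set_index("date")
cleaned["close"] = cleaned["close"].ffill()      # carry the last known close forward
cleaned = cleaned.dropna(subset=["close"])       # a leading gap has nothing to carry forward

print(cleaned)
```

Forward-filling avoids using future prices to fill past gaps, which matters when the data later feeds a forecasting model.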

In [4]:
# Normalize/scale the close values between 0 and 1
close_stock_values = filtered_df['close'].values.reshape(-1, 1)
scaler = MinMaxScaler(feature_range=(0, 1))
normalized_close_values = scaler.fit_transform(close_stock_values)

# Split the data chronologically: first 65% for training, the rest for testing
training_size = int(len(normalized_close_values) * 0.65)
test_size = len(normalized_close_values) - training_size
train_data = normalized_close_values[:training_size, :]
test_data = normalized_close_values[training_size:, :]

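# Build (samples, time_step) input windows and the next-day value for each window
def create_dataset(dataset, time_step=1):
    dataX, dataY = [], []
    # The last valid window ends at len(dataset) - 2, so its target is the final row
    for i in range(len(dataset) - time_step):
        dataX.append(dataset[i:(i + time_step), 0])   # time_step consecutive closes
        dataY.append(dataset[i + time_step, 0])       # the close that follows them
    return np.array(dataX), np.array(dataY)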

time_step = 15
X_train, y_train = create_dataset(train_data, time_step)
X_test, y_test = create_dataset(test_data, time_step)

# Reshape the data for LSTM model
X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)
X_test = X_test.reshape(X_test.shape[0], X_test.shape[1], 1)
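
To see what the windowing and reshaping steps produce, the sketch below builds the same kind of supervised dataset on a tiny toy series. It uses NumPy's `sliding_window_view` as a vectorized alternative to an explicit loop. The helper name `make_windows` and the toy series are assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_windows(series, time_step):
    """Return inputs shaped (samples, time_step, 1) and next-step targets."""
    series = np.asarray(series, dtype=float)
    # Each row holds time_step inputs followed by the value to predict
    windows = sliding_window_view(series, time_step + 1)
    X = windows[:, :-1].copy()
    y = windows[:, -1].copy()
    return X[..., np.newaxis], y   # trailing axis is the single feature (close)

toy_series = np.arange(10, dtype=float)   # 0.0, 1.0, ..., 9.0
X_toy, y_toy = make_windows(toy_series, time_step=3)

print(X_toy.shape, y_toy.shape)
print(X_toy[0].ravel(), "->", y_toy[0])
```

With ten values and a window of three, there are seven samples. The first window `[0, 1, 2]` is paired with the target `3`, and the last window `[6, 7, 8]` with the target `9`.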

#### 4.3 Validation of the Study
- **Training Data Metrics**: Include R² Score, RMSE, MSE, and MAE to showcase model accuracy.
- **Testing Data Metrics**: Evaluate model generalizability and accuracy.
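
The metrics listed above can be written out directly from their definitions. The sketch below computes them with NumPy on made-up numbers. The helper name `regression_metrics` and the toy values are assumptions, and scikit-learn's `mean_squared_error`, `mean_absolute_error`, and `r2_score` should give the same values for the same inputs.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE, and R² from their definitions."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    errors = y_true - y_pred
    mse = float(np.mean(errors ** 2))
    ss_res = float(np.sum(errors ** 2))                       # residual sum of squares
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))     # total sum of squares
    return {
        "mse": mse,
        "rmse": mse ** 0.5,
        "mae": float(np.mean(np.abs(errors))),
        "r2": 1.0 - ss_res / ss_tot,
    }

toy_true = [100.0, 102.0, 104.0, 106.0]   # made-up actual closes
toy_pred = [101.0, 101.0, 105.0, 105.0]   # made-up predictions
metrics = regression_metrics(toy_true, toy_pred)
print(metrics)
```

RMSE and MAE are in the same unit as the price, which makes them easier to interpret than MSE. R² compares the model against always predicting the mean of the actual values.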

### 5. Visualizing Data with LightningChart

#### 5.1 Introduction to LightningChart for Python
LightningChart Python is a tool for creating highly interactive and customizable charts, which makes it suitable for financial data visualization.

#### 5.2 Creating the charts
Create various charts using LightningChart Python to visualize stock data effectively.

#### 5.3 Customizing visualizations
LightningChart offers extensive customization options, including adjusting colors, adding markers, or integrating real-time data updates.

In [6]:
# Build and train the LSTM model
model = Sequential()
model.add(LSTM(50, return_sequences=True, input_shape=(time_step, 1)))
model.add(LSTM(50, return_sequences=True))
model.add(LSTM(50))
model.add(Dropout(0.2))
model.add(Dense(25))
model.add(Dense(1))

model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(X_train, y_train, batch_size=64, epochs=100, validation_data=(X_test, y_test), verbose=2)

# Make predictions
train_predict = model.predict(X_train)
test_predict = model.predict(X_test)

# Invert predictions back to original scale
train_predict = scaler.inverse_transform(train_predict)
y_train_inv = scaler.inverse_transform(y_train.reshape(-1, 1))
test_predict = scaler.inverse_transform(test_predict)
y_test_inv = scaler.inverse_transform(y_test.reshape(-1, 1))

# Calculate metrics
train_rmse = np.sqrt(mean_squared_error(y_train_inv, train_predict))
test_rmse = np.sqrt(mean_squared_error(y_test_inv, test_predict))
train_mse = mean_squared_error(y_train_inv, train_predict)
test_mse = mean_squared_error(y_test_inv, test_predict)
train_mae = mean_absolute_error(y_train_inv, train_predict)
test_mae = mean_absolute_error(y_test_inv, test_predict)
train_r2 = r2_score(y_train_inv, train_predict)
test_r2 = r2_score(y_test_inv, test_predict)

# Print training and testing metrics
print("----Training Data Metrics----")
print("Train RMSE: ", train_rmse)
print("Train MSE: ", train_mse)
print("Train MAE: ", train_mae)
print("-------------------------------")

print("----Testing Data Metrics----")
print("Test RMSE: ", test_rmse)
print("Test MSE: ", test_mse)
print("Test MAE: ", test_mae)
print("-------------------------------")

print("----R2 score for regression----")
print("Train R2 Score: ", train_r2)
print("Test R2 Score: ", test_r2)


# Predict the next 10 days by feeding each prediction back into the input window
pred_days = 10
window = test_data[-time_step:, 0].tolist()   # last time_step scaled closes
lst_output = []

for _ in range(pred_days):
    # Use the most recent time_step values, shaped (batch, time_step, features)
    x_input = np.array(window[-time_step:]).reshape(1, time_step, 1)
    yhat = model.predict(x_input, verbose=0)
    window.append(float(yhat[0, 0]))
    lst_output.append(float(yhat[0, 0]))

predicted_values = scaler.inverse_transform(np.array(lst_output).reshape(-1, 1))

# Create a DataFrame to display the results
future_dates = pd.date_range(start=filtered_df['date'].iloc[-1], periods=pred_days + 1, inclusive='right')
prediction_df = pd.DataFrame({'date': future_dates, 'predicted_close': predicted_values.flatten()})

# Prepare data for LC chart
actual_dates = filtered_df['date'].tolist()
actual_close = filtered_df['close'].tolist()
predicted_dates = prediction_df['date'].tolist()
predicted_close = prediction_df['predicted_close'].tolist()

# Create the XY chart (the license key was already set after importing lightningchart)
chart = lc.ChartXY(title='Actual vs Predicted Close Prices')

# Dispose the default x-axis and create a high precision datetime axis
chart.get_default_x_axis().dispose()
axis_x = chart.add_x_axis(axis_type='linear-highPrecision')
axis_x.set_tick_strategy('DateTime')

# Convert datetime to timestamps for plotting
actual_date_timestamps = [x.timestamp() * 1000 for x in actual_dates]
predicted_date_timestamps = [x.timestamp() * 1000 for x in predicted_dates]

# Plot actual prices
series_actual = chart.add_line_series()
series_actual.add(x=actual_date_timestamps, y=actual_close)
series_actual.set_name('Actual Prices')

# Plot train predicted prices (each prediction belongs to the day after its input window)
train_start = time_step
series_train_predicted = chart.add_line_series()
series_train_predicted.add(x=actual_date_timestamps[train_start:train_start + len(train_predict)], y=train_predict.flatten())
series_train_predicted.set_name('Train Predictions')

# Plot test predicted prices (offset by the training split plus one input window)
test_start = training_size + time_step
series_test_predicted = chart.add_line_series()
series_test_predicted.add(x=actual_date_timestamps[test_start:test_start + len(test_predict)], y=test_predict.flatten())
series_test_predicted.set_name('Test Predictions')

# Plot future predicted prices
series_future_predicted = chart.add_line_series()
series_future_predicted.add(x=predicted_date_timestamps, y=predicted_close)
series_future_predicted.set_name('Future Predictions')

# Add a legend to the chart
legend = chart.add_legend()
legend.add(series_actual)
legend.add(series_train_predicted)
legend.add(series_test_predicted)
legend.add(series_future_predicted)

# Open the chart
chart.open()


Epoch 1/100
11/11 - 5s - 484ms/step - loss: 0.0847 - val_loss: 0.0087
Epoch 2/100
11/11 - 1s - 77ms/step - loss: 0.0198 - val_loss: 0.0195
Epoch 3/100
11/11 - 0s - 26ms/step - loss: 0.0119 - val_loss: 0.0067
Epoch 4/100
11/11 - 0s - 27ms/step - loss: 0.0066 - val_loss: 0.0022
Epoch 5/100
11/11 - 0s - 23ms/step - loss: 0.0046 - val_loss: 0.0030
Epoch 6/100
11/11 - 0s - 25ms/step - loss: 0.0034 - val_loss: 0.0028
Epoch 7/100
11/11 - 0s - 24ms/step - loss: 0.0038 - val_loss: 0.0021
Epoch 8/100
11/11 - 0s - 31ms/step - loss: 0.0039 - val_loss: 0.0021
Epoch 9/100
11/11 - 0s - 24ms/step - loss: 0.0034 - val_loss: 0.0022
Epoch 10/100
11/11 - 0s - 24ms/step - loss: 0.0030 - val_loss: 0.0027
Epoch 11/100
11/11 - 0s - 26ms/step - loss: 0.0029 - val_loss: 0.0024
Epoch 12/100
11/11 - 0s - 26ms/step - loss: 0.0027 - val_loss: 0.0023
Epoch 13/100
11/11 - 0s - 25ms/step - loss: 0.0029 - val_loss: 0.0023
Epoch 14/100
11/11 - 0s - 24ms/step - loss: 0.0026 - val_loss: 0.0031
Epoch 15/100
11/11 - 0s - 24
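
The multi-day forecast in the cell above is recursive: each predicted value is appended to the input window and used to predict the following day. The sketch below isolates that mechanism so it can be run without TensorFlow. The function `recursive_forecast` and the stand-in predictor `mean_of_window` are hypothetical names used only for illustration. The stand-in replaces the trained LSTM and carries no predictive value.

```python
import numpy as np

def recursive_forecast(predict_next, history, time_step, n_days):
    """Roll a one-step predictor forward n_days, feeding predictions back in."""
    window = list(np.asarray(history, dtype=float)[-time_step:])
    forecasts = []
    for _ in range(n_days):
        # Shape (batch, time_step, features), matching the LSTM input layout
        x_input = np.array(window[-time_step:]).reshape(1, time_step, 1)
        y_hat = float(predict_next(x_input))
        forecasts.append(y_hat)
        window.append(y_hat)   # the prediction becomes part of the next input
    return np.array(forecasts)

def mean_of_window(x_input):
    """Hypothetical stand-in for a trained model: returns the window mean."""
    return x_input.mean()

toy_history = [0.2, 0.4, 0.6]   # made-up scaled closes
toy_forecast = recursive_forecast(mean_of_window, toy_history, time_step=3, n_days=3)
print(toy_forecast)
```

With a Keras model, the predictor could be passed as `lambda x: model.predict(x, verbose=0)[0, 0]`. Because every step builds on earlier predictions, errors compound, so forecasts further into the future are generally less reliable than the first one.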

### Result images

![Stacked Bar Chart](./images/Stacked%20Bar%20Chart.png)
![Grouped Bar Chart](./images/Grouped%20Bar%20Chart.png)
![Line Chart](./images/Line%20Chart.png)
![Point Line Chart](./images/Point%20Line%20Chart.png)
![Simple Line Chart](./images/Simple%20Line%20Chart.png)
![Area Chart](./images/Area%20Chart.png)
![3D Line Chart](./images/3D%20Line%20Chart.png)
![Line Chart Comparison](./images/Line%20Chart%20Comparison.png)
![Line Chart Prediction](./images/Line%20Chart%20Prediction.png)
![Line Chart Prediction 2](./images/Line%20Chart%20Prediction%202.png)
![Final Comparison and Prediction](./images/Final%20Comparison%20and%20Prediction.png)
![Final Comparison and Prediction 2](./images/Final%20Comparison%20and%20Prediction%202.png)

### 6. Conclusion

#### 6.1 Recap of creating the application and its usefulness
This application demonstrates how an LSTM model and a high-performance visualization library can be combined to forecast stock prices and compare the forecasts against actual prices, giving stock traders a visual basis for further analysis.

#### 6.2 Benefits of using LightningChart Python for visualizing data
The library's performance and feature set make it a strong choice for visualizing stock market data, and its real-time update capability suits workflows where traders need timely information to make informed decisions.