## **FOUNDATION PROJECT - GROUP ASSIGNMENT** ##

> **Use Case ::** Predicting Stock Movement - **"Stock Market Copilot"**

> **Dataset Source ::** Yahoo Finance - https://finance.yahoo.com/quote/TSLA/history/?filter=history

> **Group No. ::** 6

## **USER INTERFACE - STREAMLIT**

**Overall Steps -**

> 1. **Installation:** Install necessary libraries like **`streamlit, altair, uvicorn, pyngrok, scikit-learn`**.
> 2. **Import Libraries:** Import required libraries for data manipulation, visualization, API interaction, and Streamlit app development.
> 3. **Define Streamlit App:**
>> - Set up the Streamlit app layout, title, and styling.
>> - Create user input elements like select boxes, date input, and buttons.
>> - Define logic for handling user inputs and making predictions using an external API.
>> - Display results, including plots and data tables.
> 4. **Run Streamlit App:**
>> - Start the Streamlit server in a separate thread.
>> - Use ngrok to expose the app publicly.

**Functions Used -**
> **Inbuilt Functions ::**
- **`pip install`**: Installs Python packages.
- **`os.system`**: Executes a system command.
- **`st.set_page_config`**: Configures the Streamlit page layout.
- **`st.title`**: Sets the title of the Streamlit app.
- **`st.markdown`**: Displays Markdown text in Streamlit.
- **`st.radio`**: Creates a radio button group for user selection.
- **`st.selectbox`**: Creates a dropdown menu for user selection.
- **`st.date_input`**: Creates a date picker for user input.
- **`st.button`**: Creates a button that triggers an action when clicked.
- **`st.spinner`**: Displays a spinner while an action is being performed.
- **`requests.post`**: Sends a POST request to an API endpoint.
- **`pd.DataFrame`**: Creates a Pandas DataFrame.
- **`pd.to_datetime`**: Converts a column to datetime objects.
- **`go.Figure`**: Creates a Plotly figure.
- **`go.Scatter`**: Creates a scatter plot trace in Plotly.
- **`fig.update_layout`**: Updates the layout of a Plotly figure.
- **`st.plotly_chart`**: Displays a Plotly chart in Streamlit.
- **`st.expander`**: Creates an expandable section in Streamlit.
- **`st.dataframe`**: Displays a Pandas DataFrame in Streamlit.
- **`st.download_button`**: Creates a button for downloading data.
- **`plt.subplots`**: Creates a Matplotlib figure and axes.
- **`sn.kdeplot`**: Creates a kernel density estimation plot using Seaborn.
- **`st.pyplot`**: Displays a Matplotlib plot in Streamlit.
- **`ngrok.set_auth_token`**: Sets the authentication token for ngrok.
- **`ngrok.connect`**: Creates a public URL for a local port using ngrok.
- **`print`**: Prints output to the console.

> **User-Defined Function ::**
>
> - **`create_streamlit_app`**: This function contains the entire logic for building and running the Streamlit app. It defines the layout, user inputs, prediction logic, and display of results.

In [1]:
pip install streamlit altair uvicorn pyngrok --no-warn-script-location

Collecting streamlit
  Downloading streamlit-1.44.1-py3-none-any.whl.metadata (8.9 kB)
Collecting uvicorn
  Downloading uvicorn-0.34.2-py3-none-any.whl.metadata (6.5 kB)
Collecting pyngrok
  Downloading pyngrok-7.2.4-py3-none-any.whl.metadata (8.7 kB)
Collecting watchdog<7,>=2.1.5 (from streamlit)
  Downloading watchdog-6.0.0-py3-none-manylinux2014_x86_64.whl.metadata (44 kB)
Collecting pydeck<1,>=0.8.0b4 (from streamlit)
  Downloading pydeck-0.9.1-py2.py3-none-any.whl.metadata (4.1 kB)
Downloading streamlit-1.44.1-py3-none-any.whl (9.8 MB)
Downloading uvicorn-0.34.2-py3-none-any.whl (62 kB)
Downloading pyngrok-7.2.4-py3-none

In [2]:
# Importing necessary libraries
from multiprocessing import Process
from threading import Thread
from pyngrok import ngrok
import streamlit as st
import pandas as pd
import numpy as np
from joblib import load
import os
import plotly.graph_objects as go

import seaborn as sn
import matplotlib.pyplot as plt

# Google Colab Notebook related imports
from google.colab import userdata

### **User Interface - Streamlit**
> In this stage we create a user interface in which the user selects the input values and sees the value predicted by the best-performing deployed model. The Streamlit app, served locally on port 8501, posts each request to the FastAPI service and displays the value it returns.

> **Streamlit -**
>> Streamlit is a free and open-source framework to rapidly build and share web apps without extensive web development knowledge.

#### **Creating the APP**
> We define the app layout: the fields displayed on screen, the type of each field (e.g., a dropdown for categorical data or a text field), and the set of values each field accepts (e.g., a range for free text or a fixed list of options for a dropdown).

**Steps Performed -**

**`create_streamlit_app:`**
>
> 1. **Setup:** Sets up the page configuration, title, and custom styling for the Streamlit app.
> 2. **User Inputs:** Defines input elements for stock selection, prediction date, and prediction period using **`st.selectbox`** and **`st.date_input`**.
> 3. **Profile Selection:** Allows the user to choose between "Stock Market Analyst" and "Developer" profiles using **`st.radio`**.
> 4. **Prediction Logic (Analyst Profile):**
>> - Sends a POST request to an API endpoint with user inputs to generate predictions.
>> - Processes the API response and creates visualizations using Plotly.
>> - Displays the predictions and historical data using **`st.plotly_chart`** and **`st.dataframe`**.
>> - Provides options to export predictions as a CSV file.
> 5. **Drift Check (Developer Profile):**
>> - Sends a POST request to a different API endpoint for drift checking.
>> - Processes the response and displays the drift status and feature-wise summary.
>> - Generates KDE plots for numeric features using Seaborn and Matplotlib.
> 6. **Footer:** Displays a disclaimer and data source information.
>
> The web application runs on a local server at 0.0.0.0:8501, started with the `streamlit run` command. Streamlit renders the layout defined in app.py as a web page with a user-friendly interface.
>
> When the user enters the details and clicks **Generate Predictions**, the app receives the request and posts it to the API endpoint exposed over the ngrok public URL.
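
The request body described above can be sketched as a small pure function. This is only an illustration, assuming the same field names and period-to-days mapping that app.py uses; `PERIOD_TO_DAYS` and `build_payload` are hypothetical names that do not appear in the notebook, and unlike app.py (which falls back to 60 days) this sketch rejects an unknown period.

```python
from datetime import date

# Trading days per prediction period, mirroring the mapping used in app.py
PERIOD_TO_DAYS = {"2 weeks": 10, "1 month": 20, "3 months": 60}


def build_payload(stock_name: str, start: date, period: str) -> dict:
    """Build the JSON body posted to the prediction and drift endpoints."""
    if period not in PERIOD_TO_DAYS:
        # Assumption: fail loudly instead of silently defaulting to 60 days
        raise ValueError(f"Unsupported prediction period: {period!r}")
    return {
        "Date": start.strftime("%Y-%m-%d"),
        "n_days": PERIOD_TO_DAYS[period],
        "range": period,
        "stock": stock_name,
    }


payload = build_payload("Tesla, Inc. (TSLA)", date(2025, 4, 21), "1 month")
print(payload)
```

Keeping the payload construction separate from the Streamlit widgets would make it possible to check the body without starting the UI.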

**Note - Please make sure that "ngrokPublicURL.txt" (from the Deployment Code Notebook) and "ngrokPublicURL2.txt" (from the Monitoring Code Notebook) are downloaded and then uploaded to the local storage of this notebook.**
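
If either file is missing, app.py fails at import time with a bare `FileNotFoundError`. A minimal sketch of a more defensive reader is shown below; `read_endpoint_url` is a hypothetical helper that is not part of the notebook, and the URL in the demo is a placeholder rather than a real tunnel address.

```python
import tempfile
from pathlib import Path


def read_endpoint_url(path: str) -> str:
    """Return the URL stored in `path`, stripped of whitespace and any trailing slash."""
    file = Path(path)
    if not file.is_file():
        raise FileNotFoundError(
            f"{path} not found; upload it from the notebook that created the tunnel."
        )
    url = file.read_text(encoding="utf-8").strip().rstrip("/")
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"{path} does not contain an http(s) URL: {url!r}")
    return url


# Demo with a temporary file standing in for ngrokPublicURL.txt
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "ngrokPublicURL.txt"
    demo.write_text("https://example.ngrok-free.app/\n", encoding="utf-8")
    print(read_endpoint_url(str(demo)))
```

Removing the trailing slash matters because app.py builds the request URL by appending `"/predict"` or `"/checkDrift"` to the stored value.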

**Using NGROK::**

- We use ngrok to create a public endpoint through which an external user can access the hosted web application. Ngrok opens a secure tunnel between that public endpoint and the local server endpoint (0.0.0.0:8501) on which the service is running.
- Once the user enters the details on the page and clicks the button, the request hits the public endpoint and ngrok forwards it over the secure tunnel, hiding the internal endpoint from the outside world.


In [3]:
pip install scikit-learn==1.3.2

Collecting scikit-learn==1.3.2
  Downloading scikit_learn-1.3.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (11 kB)
Collecting numpy<2.0,>=1.17.3 (from scikit-learn==1.3.2)
  Downloading numpy-1.26.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)
Downloading scikit_learn-1.3.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (10.9 MB)
Downloading numpy-1.26.4-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.3 MB)
Installing collected packages: numpy, scikit-learn
  Attempting uninstall: numpy
    Found existing installation: numpy 2.0.2
    Uninst

In [15]:
%%writefile app.py

from pyngrok import ngrok
import streamlit as st
import pandas as pd
import numpy as np
import requests
from datetime import datetime, timedelta
import matplotlib.pyplot as plt
import plotly.graph_objects as go
import seaborn as sn

# Load FastAPI public URLs
with open("ngrokPublicURL.txt", "r") as f:
    endPointUrl = f.read().strip()

with open("ngrokPublicURL2.txt", "r") as g:
    endPointUrl2 = g.read().strip()

def create_streamlit_app():
    st.set_page_config(page_title="Stock Market Copilot", layout="wide")
    st.title("Stock Market Copilot")

    # Custom style
    st.markdown("""
    <style>
    .stAlert { padding: 20px; }
    .st-b7 { color: black !important; }
    .st-cg, .stSelectbox div[data-baseweb="select"],
    div[data-baseweb="select"] > div,
    div[data-baseweb="menu"],
    div[data-baseweb="menu"] > div {
        background-color: white !important;
        color: black !important;
        border-radius: 5px;
        padding: 1px;
        font-size: 16px;
    }
    div[data-baseweb="option"]:hover {
        background-color: #f0f0f0 !important;
        color: black !important;
    }
    </style>
    """, unsafe_allow_html=True)

    # Profile toggle
    profile = st.radio("Profile", ["Stock Market Analyst", "Developer"], horizontal=True)

    # Common Inputs
    stock_name = st.selectbox(
        "Select Stock",
        ["APPLE (AAPL)", "Tesla, Inc. (TSLA)", "Berkshire Hathaway Inc. (BRK-B)"],
        index=0
    )
    prediction_date = st.date_input("Prediction Start Date", value=datetime.now())
    prediction_period = st.selectbox("Select Prediction Period", ["2 weeks", "1 month", "3 months"], index=0)
    n_days = 10 if prediction_period == "2 weeks" else 20 if prediction_period == "1 month" else 60

    # Stock Analyst Mode – Predictions
    if profile == "Stock Market Analyst":
        if st.button("Generate Predictions"):
            with st.spinner("Generating predictions..."):
                try:
                    response = requests.post(
                        endPointUrl + "/predict",
                        json={
                            "Date": prediction_date.strftime('%Y-%m-%d'),
                            "n_days": n_days,
                            "range": prediction_period,
                            "stock": stock_name
                        },
                        headers={"Content-Type": "application/json"},
                        timeout=30
                    )

                    if response.status_code != 200:
                        raise Exception(f"API Error: {response.text}")

                    results = response.json()
                    predictions = pd.DataFrame(results['predictions'])
                    validation = pd.DataFrame(results['validation'])

                    predictions['Date'] = pd.to_datetime(predictions['Date'])
                    predictions['Day'] = predictions['Date'].dt.day_name()
                    predictions['Week'] = predictions['Date'].dt.isocalendar().week

                    if not validation.empty:
                        validation['Date'] = pd.to_datetime(validation['Date'])

                    # Plotting logic
                    fig = go.Figure()

                    if not validation.empty:
                        fig.add_trace(go.Scatter(
                            x=validation['Date'],
                            y=validation['Close'],
                            name='Historical Data',
                            line=dict(color='blue', width=2),
                            hovertemplate='Date: %{x|%Y-%m-%d}<br>Value: %{y:.2f}<extra></extra>'
                        ))

                    predictions['Change'] = predictions['Predicted'].diff()
                    predictions['Direction'] = predictions['Change'].apply(lambda x: 'green' if x > 0 else ('red' if x < 0 else 'gray'))
                    predictions['Prev_Date'] = predictions['Date'].shift(1).dt.strftime('%Y-%m-%d')
                    predictions['Prev_Day'] = predictions['Day'].shift(1)
                    predictions['Prev_Week'] = predictions['Week'].shift(1)
                    predictions['Prev_Value'] = predictions['Predicted'].shift(1)

                    # Add only one legend entry for predicted
                    fig.add_trace(go.Scatter(
                        x=[predictions['Date'].iloc[1]],
                        y=[predictions['Predicted'].iloc[1]],
                        name="Predicted Values",
                        line=dict(color='green'),
                        mode='lines',
                        showlegend=True,
                        visible='legendonly'
                    ))

                    for i in range(1, len(predictions)):
                        fig.add_trace(go.Scatter(
                            x=predictions['Date'].iloc[i-1:i+1],
                            y=predictions['Predicted'].iloc[i-1:i+1],
                            line=dict(
                                color=predictions['Direction'].iloc[i],
                                width=2
                            ),
                            mode='lines+markers',
                            marker=dict(size=8, color=predictions['Direction'].iloc[i-1]),
                            showlegend=False,
                            customdata=np.array([[  # tooltip
                                predictions['Day'].iloc[i],
                                int(predictions['Week'].iloc[i]),
                                predictions['Prev_Date'].iloc[i-1],
                                predictions['Prev_Day'].iloc[i],
                                int(predictions['Prev_Week'].iloc[i]),
                                float(predictions['Prev_Value'].iloc[i-1])
                            ]]),
                            hovertemplate=(
                                '<b>Current Day</b><br>'
                                'Date: %{x|%Y-%m-%d}<br>'
                                'Day: %{customdata[0]}<br>'
                                'Week: %{customdata[1]}<br>'
                                'Value: %{y:.2f}<br><br>'
                                '<b>Previous Day</b><br>'
                                'Date: %{customdata[2]}<br>'
                                'Day: %{customdata[3]}<br>'
                                'Week: %{customdata[4]}<br>'
                                'Value: %{customdata[5]:.2f}'
                                '<extra></extra>'
                            )
                        ))

                    fig.update_layout(
                        title='Stock Price Prediction with Directional Trends',
                        xaxis_title='Date',
                        yaxis_title='Price ($)',
                        hovermode='closest',
                        xaxis=dict(
                            range=[predictions['Date'].min() - timedelta(days=3),
                                   predictions['Date'].max() + timedelta(days=3)]
                        ),
                        legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="right", x=1)
                    )
                    st.plotly_chart(fig, use_container_width=True)

                    with st.expander("View Detailed Prediction Data"):
                        def style_predictions(row):
                            if row.name == 0:
                                return [''] * len(row)
                            return ['background-color: lightgreen' if row['Predicted'] > predictions.at[row.name-1, 'Predicted']
                                    else 'background-color: lightcoral' for _ in row]
                        st.dataframe(predictions.style.apply(style_predictions, axis=1))

                        if not predictions.empty:
                            ticker = stock_name.split('(')[-1].replace(')', '')
                            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
                            filename = f"{ticker}_{timestamp}.csv"

                            predictions_export = predictions[['Date', 'Predicted']].copy()
                            predictions_export['Day'] = predictions_export['Date'].dt.day_name()

                            csv_data = predictions_export.to_csv(index=False).encode('utf-8')
                            st.download_button(
                                label="📥 Export Predictions to CSV",
                                data=csv_data,
                                file_name=filename,
                                mime='text/csv'
                            )

                    if not validation.empty:
                        with st.expander("View Historical Validation Data"):
                            st.dataframe(validation)

                except Exception as e:
                    st.error(f"Prediction failed: {str(e)}")

    # Developer Mode – Drift Check
    if profile == "Developer":
        if st.button("Drift Check"):
            with st.spinner("Checking data drift..."):
                try:
                    response = requests.post(
                        endPointUrl2 + "/checkDrift",
                        json={
                            "Date": prediction_date.strftime('%Y-%m-%d'),
                            "n_days": n_days,
                            "range": prediction_period,
                            "stock": stock_name
                        },
                        headers={"Content-Type": "application/json"},
                        timeout=30
                    )

                    if response.status_code != 200:
                        raise Exception(f"Drift API Error: {response.text}")

                    result = response.json()

                    st.markdown(f"### Overall Drift Status: **{'Drift Detected' if result['overall_drift_status'] else 'No Significant Drift'}**")
                    st.markdown("### Feature-wise Drift Summary:")
                    summary_df = pd.DataFrame(result['feature_summary'])
                    st.dataframe(summary_df)

                    st.markdown("### KDE Plots of Numeric Features (Train vs Prediction)")

                    train_df = pd.DataFrame(result['numeric_features_train'])
                    prod_df = pd.DataFrame(result['numeric_feature_predict'])
                    num_vars = result['num_features']

                    for num_feature in num_vars:
                        fig, ax = plt.subplots(figsize=(8, 4))
                        sn.kdeplot(train_df[num_feature], alpha=0.3, color='purple', fill=True, label='Train Dataset', ax=ax)
                        sn.kdeplot(prod_df[num_feature], alpha=0.3, color='yellow', fill=True, label='Prediction Dataset', ax=ax)
                        ax.set_title(f"Distribution of Feature :: {num_feature}")
                        ax.set_xlabel(num_feature)
                        ax.set_ylabel("Density")
                        ax.legend(loc='upper right')
                        st.pyplot(fig)

                except Exception as e:
                    st.error(f"Drift check failed: {str(e)}")

    # Footer
    st.markdown(f"""
    ---
    **Disclaimer:** This application is for informational purposes only and should not be considered as financial advice.
    The predictions are based on historical data and statistical models, and past performance is not indicative of future results.
    Always consult a qualified financial advisor before making investment decisions.

    Data Source: Yahoo Finance | Model Time: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}
    """)

if __name__ == "__main__":
    create_streamlit_app()


Overwriting app.py
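
In app.py the colour of each plotted segment comes from the sign of the day-over-day change in `Predicted`. The sketch below reproduces that rule on a plain list so it can be checked in isolation; `classify_directions` is a hypothetical name, and the prices are made-up sample values, not model output.

```python
def classify_directions(values):
    """Return one colour per value: 'green' for a rise, 'red' for a fall,
    'gray' for no change or for the first point (which has no predecessor)."""
    colours = []
    previous = None
    for value in values:
        if previous is None or value == previous:
            colours.append("gray")
        elif value > previous:
            colours.append("green")
        else:
            colours.append("red")
        previous = value
    return colours


# Made-up sample prices, for illustration only
print(classify_directions([250.0, 252.5, 251.0, 251.0]))
# → ['gray', 'green', 'red', 'gray']
```

The first entry is gray for the same reason as in app.py, where `diff()` yields no value for the first row and neither comparison in the lambda holds.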


#### **Running the UI**
> The UI service is hosted on port 8501 on localhost.

In [6]:
# Register the ngrok auth token so that pyngrok can authenticate this notebook's ngrok agent before opening a tunnel.
# The token is read from Colab secrets rather than hard-coded; the secret name "NGROK_AUTH_TOKEN" is an assumption.
ngrok.set_auth_token(userdata.get("NGROK_AUTH_TOKEN"))



In [7]:
# Start the Streamlit server in a separate thread so that the execution of main programme running in this notebook is not interrupted
streamlit_thread = Thread(
    target=lambda: os.system("streamlit run app.py --server.port 8501"), daemon=True
)
streamlit_thread.start()

# Expose the Streamlit app through ngrok
# ngrokPublicURL.txt file to be fetched and uploaded from Notebook-2
streaming_url = ngrok.connect(8501)
print(f"Streamlit public URL: {streaming_url}")

Streamlit public URL: NgrokTunnel: "https://b750-35-237-53-30.ngrok-free.app" -> "http://localhost:8501"
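
The cell above opens the tunnel right after starting the Streamlit thread, so the public URL may be reachable a moment before the server is. One possible way to close that gap is to poll the local port first. The sketch below uses only the standard library; `wait_for_port` is a hypothetical helper, and the 30-second default is an arbitrary assumption.

```python
import socket
import time


def wait_for_port(port: int, host: str = "127.0.0.1", timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds,
    or False if none succeeds within `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)  # not listening yet; retry shortly
    return False


# Possible use in the cell above (commented out because it needs a running server):
# if wait_for_port(8501):
#     streaming_url = ngrok.connect(8501)
```

This only confirms that the port accepts connections, not that the app has rendered without errors.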
