## Creating a Financial Data Analysis Application Using ChatGPT-o1

In this project, we will build interactive dashboards and visualizations that help us understand the stock market better. We will build a system that:

*   Shows the price history of a selected stock

## Step 1: Build a stock price visualization tool

_Prompt Used: I would like to build a stock price visualization tool using Python. The outcome of this application needs to be an interactive visualization that is embedded in a Google Colab, where there are two drop downs, one for a given stock, and one for different time horizons (1M, 3M, 6M, etc.). Use matplotlib and seaborn for the visualizations, and Yahoo Finance to access the data. Please provide me detailed instructions on how I can paste the code into Google Colab._

In [1]:
# Install yfinance
!pip install yfinance



In [2]:
# Import libraries
import yfinance as yf
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
from ipywidgets import interact, widgets
import datetime

In [3]:
# List of stock tickers
stock_list = ['AAPL', 'GOOGL', 'MSFT', 'AMZN', 'TSLA', 'META', 'NVDA', 'JPM', 'V', 'JNJ']

# Time horizons with corresponding days
time_horizons = {
    '1 Month': 30,
    '3 Months': 90,
    '6 Months': 180,
    '1 Year': 365,
    '2 Years': 730,
    '5 Years': 1825
}
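
The day counts above approximate calendar periods (30 days for a month, 365 for a year). A minimal sketch of a calendar-aware alternative, assuming pandas `DateOffset` is acceptable here; `horizon_offsets` and `start_for` are hypothetical names that do not appear elsewhere in the notebook:

```python
import pandas as pd

# Hypothetical alternative to the day counts: calendar-aware offsets
horizon_offsets = {
    '1 Month': pd.DateOffset(months=1),
    '3 Months': pd.DateOffset(months=3),
    '6 Months': pd.DateOffset(months=6),
    '1 Year': pd.DateOffset(years=1),
    '2 Years': pd.DateOffset(years=2),
    '5 Years': pd.DateOffset(years=5),
}

def start_for(end_date, horizon):
    """Return the date that lies one horizon before end_date."""
    return pd.Timestamp(end_date) - horizon_offsets[horizon]

# One month before March 31 is clipped to the last day of February
print(start_for('2025-03-31', '1 Month'))
```

The fixed day counts are simpler and work fine for a chart; the offsets only matter if the window should line up with calendar months or years.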

In [4]:
# Define the dropdowns
stock_dropdown = widgets.Dropdown(
    options=stock_list,
    value='AAPL',
    description='Stock:',
)

time_dropdown = widgets.Dropdown(
    options=list(time_horizons.keys()),
    value='1 Month',
    description='Time Horizon:',
)


In [5]:
def plot_stock_price(stock_symbol, time_period):
    # Calculate start and end dates
    end_date = datetime.datetime.today()
    start_date = end_date - datetime.timedelta(days=time_horizons[time_period])

    # Fetch data from Yahoo Finance
    stock_data = yf.download(stock_symbol, start=start_date, end=end_date)

    # Flatten column names if they are MultiIndex
    if isinstance(stock_data.columns, pd.MultiIndex):
        stock_data.columns = stock_data.columns.get_level_values(0)

    # Check if data is returned
    if stock_data.empty:
        print(f"No data found for {stock_symbol}")
        return

    # Reset index to use 'Date' column
    stock_data.reset_index(inplace=True)

    # Ensure 'Close' is a Series
    if isinstance(stock_data['Close'], pd.DataFrame):
        stock_data['Close'] = stock_data['Close'].squeeze()

    # Set plot style
    sns.set_style('whitegrid')

    # Create the plot
    plt.figure(figsize=(14, 7))
    sns.lineplot(x='Date', y='Close', data=stock_data, marker='o')

    # Customize the plot
    plt.title(f"{stock_symbol} Closing Prices - Last {time_period}", fontsize=16)
    plt.xlabel('Date', fontsize=14)
    plt.ylabel('Closing Price ($)', fontsize=14)
    plt.xticks(rotation=45)
    plt.tight_layout()
    plt.show()
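
The flattening step in the function above matters when the downloaded frame carries two-level column labels. A small sketch on a synthetic frame shows the effect; the values and the level names `Price` and `Ticker` are made up for illustration, and no network call is made:

```python
import pandas as pd

# Synthetic frame with (field, ticker) column labels
columns = pd.MultiIndex.from_tuples(
    [('Close', 'AAPL'), ('Open', 'AAPL')], names=['Price', 'Ticker']
)
frame = pd.DataFrame([[101.0, 100.0], [102.5, 101.5]], columns=columns)

# Keep only the first level (the field names), as the function above does
if isinstance(frame.columns, pd.MultiIndex):
    frame.columns = frame.columns.get_level_values(0)

flat_columns = list(frame.columns)
close_is_series = isinstance(frame['Close'], pd.Series)
print(flat_columns, close_is_series)
```

This approach suits a single ticker. With several tickers in one frame, dropping the second level would leave duplicate column names, so a different reshaping step would be needed.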

In [6]:
# Create the interactive plot
interact(
    plot_stock_price,
    stock_symbol=stock_dropdown,
    time_period=time_dropdown
)


## Step 2: Refine the stock price visualization tool, add custom date pickers, and more indicators.

_Prompt Used: Can you now help me update this script, by adding date pickers for custom date ranges, and a drop down for different indicators to visualize? Please provide a simple moving average, exponential moving average, and Bollinger Bands as the additional indicators._

1. **Simple Moving Average (SMA)**
    *   Definition: The Simple Moving Average sums a selected range of prices, usually closing prices, and divides the total by the number of periods in that range.
    *   Purpose: It smooths out price data to identify the trend direction over a period.
    *   Usage: Traders use SMA to judge whether the asset price is likely to continue or reverse a bull or bear trend.
2. **Exponential Moving Average (EMA)**
    *   Definition: The Exponential Moving Average gives more weight to recent prices, making it more responsive to new information.
    *   Purpose: EMA reacts faster to recent price changes than SMA does.
    *   Usage: Traders use EMA to spot price trends, as it can signal a change in the market earlier than SMA.
3. **Bollinger Bands**
    *   Definition: Bollinger Bands consist of a middle band (usually a 20-day SMA) and two outer bands set two standard deviations above and below the middle band.
    *   Purpose: They measure market volatility and provide a high and low range within which a security typically trades.
    *   Usage: Traders use Bollinger Bands to identify overbought or oversold conditions and to anticipate possible price reversals.
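
The three definitions can be sketched with pandas on a toy series. The prices below are made up, and the window is shortened to 3 (an assumption for illustration; the definitions above use 20) so that the short series produces values:

```python
import pandas as pd

# Made-up closing prices, used only to illustrate the calculations
close = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 15.0])
window = 3  # shortened from 20 so the toy series produces values

# SMA: plain average of the last `window` prices
sma = close.rolling(window=window).mean()

# EMA: recursive average that weights recent prices more heavily
ema = close.ewm(span=window, adjust=False).mean()

# Bollinger Bands: SMA plus and minus two rolling standard deviations
std = close.rolling(window=window).std()
upper_band = sma + 2 * std
lower_band = sma - 2 * std

summary = pd.DataFrame({
    'Close': close, 'SMA': sma, 'EMA': ema,
    'Upper': upper_band, 'Lower': lower_band,
})
print(summary)
```

The first `window - 1` rows of the SMA and the bands are `NaN` because a full window is not yet available, which is why those lines start later than the price line on a chart.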

In [7]:
# Enable Colab's custom widget manager and import libraries
from google.colab import output
output.enable_custom_widget_manager()

import yfinance as yf
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
from ipywidgets import interact, interactive, widgets
import datetime
from IPython.display import display

In [8]:
# Create dropdown widget for stock selection
stock_dropdown = widgets.Dropdown(
    options=stock_list,
    value='AAPL',
    description='Stock:',
)

# Date pickers for start and end dates
start_date_picker = widgets.DatePicker(
    description='Start Date',
    value=datetime.date.today() - datetime.timedelta(days=365),
)

end_date_picker = widgets.DatePicker(
    description='End Date',
    value=datetime.date.today(),
)

# Indicator dropdown widget
indicator_options = [
    'None',
    'Simple Moving Average (SMA)',
    'Exponential Moving Average (EMA)',
    'Bollinger Bands'
]

indicator_dropdown = widgets.Dropdown(
    options=indicator_options,
    value='None',
    description='Indicator:',
)

In [9]:
def plot_stock_price(stock_symbol, start_date, end_date, indicator):
    # Ensure start_date and end_date are valid
    if start_date is None or end_date is None:
        print("Please select both start and end dates.")
        return

    if start_date >= end_date:
        print("Error: Start date must be before end date.")
        return

    # Fetch data from Yahoo Finance
    stock_data = yf.download(stock_symbol, start=start_date, end=end_date, progress=False)

    # Check if data is returned
    if stock_data.empty:
        print(f"No data found for {stock_symbol} between {start_date} and {end_date}")
        return

    # Flatten column names if they are MultiIndex
    if isinstance(stock_data.columns, pd.MultiIndex):
        stock_data.columns = stock_data.columns.get_level_values(0)

    # Reset index to use 'Date' column
    stock_data.reset_index(inplace=True)

    # Ensure 'Date' column is in datetime format
    if not pd.api.types.is_datetime64_any_dtype(stock_data['Date']):
        stock_data['Date'] = pd.to_datetime(stock_data['Date'])

    # Ensure 'Close' is a Series
    if isinstance(stock_data['Close'], pd.DataFrame):
        stock_data['Close'] = stock_data['Close'].squeeze()

    # Set plot style
    sns.set_style('whitegrid')

    # Create the plot
    plt.figure(figsize=(14, 7))
    plt.plot(stock_data['Date'], stock_data['Close'], label='Closing Price', marker='o')

    # Compute and plot the selected indicator
    if indicator == 'Simple Moving Average (SMA)':
        stock_data['SMA'] = stock_data['Close'].rolling(window=20).mean()
        plt.plot(stock_data['Date'], stock_data['SMA'], label='SMA (20 days)', linestyle='--')
    elif indicator == 'Exponential Moving Average (EMA)':
        stock_data['EMA'] = stock_data['Close'].ewm(span=20, adjust=False).mean()
        plt.plot(stock_data['Date'], stock_data['EMA'], label='EMA (20 days)', linestyle='--')
    elif indicator == 'Bollinger Bands':
        stock_data['SMA'] = stock_data['Close'].rolling(window=20).mean()
        stock_data['STD'] = stock_data['Close'].rolling(window=20).std()
        stock_data['Upper Band'] = stock_data['SMA'] + (stock_data['STD'] * 2)
        stock_data['Lower Band'] = stock_data['SMA'] - (stock_data['STD'] * 2)
        plt.plot(stock_data['Date'], stock_data['Upper Band'], label='Upper Bollinger Band', linestyle='-.')
        plt.plot(stock_data['Date'], stock_data['Lower Band'], label='Lower Bollinger Band', linestyle='-.')
        plt.plot(stock_data['Date'], stock_data['SMA'], label='SMA (20 days)', linestyle='--')
    # else, if indicator is 'None', do not plot any indicator

    # Customize the plot
    plt.title(f"{stock_symbol} Price with {indicator if indicator != 'None' else 'No Indicator'}", fontsize=16)
    plt.xlabel('Date', fontsize=14)
    plt.ylabel('Price ($)', fontsize=14)
    plt.xticks(rotation=45)
    plt.legend()
    plt.tight_layout()
    plt.show()

In [12]:
# Create the interactive plot
interactive_plot = interactive(
    plot_stock_price,
    stock_symbol=stock_dropdown,
    start_date=start_date_picker,
    end_date=end_date_picker,
    indicator=indicator_dropdown,
)

display(interactive_plot)


## Step 3: Build a financial news sentiment analysis bot.

_Prompt Used: Now can you build a simple financial news sentiment analysis bot that analyzes headlines from financial news articles using the [Finnhub API](https://finnhub.io/docs/api). The outcome needs to be a DataFrame with 4 columns, the stock, the date, title of the article, and the sentiment._

**NOTE: Finnhub's free news API lets you analyze only one stock and 5 news stories in a given time period. You need premium access to get all the stocks and news stories.**

Create a free API key by:

1. In a new browser tab, go to https://finnhub.io/
2. Click on "Get a free API key"
3. Enter your name, email, and password, and click register.
4. Copy the API key.
5. Return to your Colab notebook and use the left-hand navigation bar to open the Secrets pane.
6. Name your API key "finnhubAPI".
7. Paste the API key into the value textbox.
8. Change the Notebook Access slider to "on".

In [13]:
# Get the API key from Colab Secrets
from google.colab import userdata
finnhub_API = userdata.get('finnhubAPI')
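
Outside Colab, `google.colab` cannot be imported, so the cell above fails. A minimal sketch of a fallback, assuming the key is also available in an environment variable; the helper `load_finnhub_key` and the variable name `FINNHUB_API_KEY` are hypothetical choices for this example:

```python
import os

def load_finnhub_key(secret_name='finnhubAPI', env_name='FINNHUB_API_KEY'):
    """Read the key from Colab Secrets, else from an environment variable."""
    try:
        from google.colab import userdata  # only importable inside Colab
        return userdata.get(secret_name)
    except ImportError:
        # env_name is a hypothetical variable name chosen for this sketch
        return os.environ.get(env_name)

finnhub_key = load_finnhub_key()
print('Key found' if finnhub_key else 'No key configured')
```

The sketch only handles the missing-module case. Errors raised inside Colab, such as a missing secret or disabled notebook access, still surface as they would in the original cell.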

In [14]:
!pip install requests pandas nltk



In [15]:
# Import packages
import nltk
nltk.download('vader_lexicon')
import requests
import pandas as pd
from nltk.sentiment.vader import SentimentIntensityAnalyzer
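# Note: this rebinds the name `datetime` from the module (imported in earlier
# cells) to the class, so re-run the earlier import cell before re-running Steps 1-2
from datetime import datetime, timedelta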

[nltk_data] Downloading package vader_lexicon to /root/nltk_data...


In [16]:
# Define your API key
API_KEY = finnhub_API  # Key loaded from Colab Secrets above
stocks = ['AAPL', 'MSFT', 'GOOGL']  # List of stock symbols
from_date = (datetime.now() - timedelta(days=7)).strftime('%Y-%m-%d')
to_date = datetime.now().strftime('%Y-%m-%d')

In [17]:
# Initialize your variables
sia = SentimentIntensityAnalyzer()
data = []

In [18]:
# Run sentiment analysis
for stock in stocks:
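    # Pass query parameters separately so requests handles the URL encoding
    url = 'https://finnhub.io/api/v1/company-news'
    params = {'symbol': stock, 'from': from_date, 'to': to_date, 'token': API_KEY}
    response = requests.get(url, params=params, timeout=10)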
    if response.status_code == 200:
        news_items = response.json()
        for item in news_items:
            title = item.get('headline', '')
            date = datetime.fromtimestamp(item.get('datetime', 0)).strftime('%Y-%m-%d %H:%M:%S')
            sentiment_score = sia.polarity_scores(title)['compound']
            sentiment = 'Positive' if sentiment_score > 0 else 'Negative' if sentiment_score < 0 else 'Neutral'
            data.append({
                'Stock': stock,
                'Date': date,
                'Title': title,
                'Sentiment': sentiment
            })
    else:
        print(f"Failed to fetch news for {stock}. Status code: {response.status_code}")
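
The loop above labels any non-zero compound score as positive or negative, so a headline scoring 0.01 counts as positive. A possible refinement, sketched here as an assumption rather than as part of the original notebook, is a neutral band around zero. The helper name `label_sentiment` is hypothetical, and the 0.05 cutoff is the value commonly suggested for VADER compound scores:

```python
def label_sentiment(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a label."""
    if compound >= threshold:
        return 'Positive'
    if compound <= -threshold:
        return 'Negative'
    return 'Neutral'

# Scores here are made up to show the three branches
labels = [label_sentiment(score) for score in (0.42, 0.01, -0.30)]
print(labels)  # ['Positive', 'Neutral', 'Negative']
```

Inside the loop, `label_sentiment(sentiment_score)` could replace the inline conditional expression. Widening the band moves more headlines into the neutral group, so the counts further down would change.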

In [19]:
# Convert sentiments into a table
my_data = pd.DataFrame(data, columns=['Stock', 'Date', 'Title', 'Sentiment'])
my_data

| | Stock | Date | Title | Sentiment |
|---|---|---|---|---|
| 0 | AAPL | 2025-03-28 18:47:42 | Magnificent Seven Stocks Dive As Amazon, Apple... | Positive |
| 1 | AAPL | 2025-03-28 15:59:53 | TBLD: Profit From The Market's Shift Towards E... | Positive |
| 2 | AAPL | 2025-03-28 15:35:48 | Why Mag 7 is a 'boy band' that needs to 'breakup' | Neutral |
| 3 | AAPL | 2025-03-28 15:00:00 | Feeling Ripped Off by $1,000 Phones? The Secon... | Positive |
| 4 | AAPL | 2025-03-28 14:00:04 | Powerbeats Pro 2 Review: Best Workout Earbuds ... | Positive |
| ... | ... | ... | ... | ... |
| 558 | GOOGL | 2025-03-22 12:15:03 | Alphabet: Too Early To Buy The Dip | Neutral |
| 559 | GOOGL | 2025-03-22 08:15:33 | VUG: Mag 7 Now The Lag 7, Better Valuation, We... | Negative |
| 560 | GOOGL | 2025-03-22 08:05:00 | 4 Stocks & 3 ETFs I'm Buying As Economic Uncer... | Negative |
| 561 | GOOGL | 2025-03-22 06:46:08 | What Moved Markets This Week | Neutral |






In [20]:
# Calculate the number of articles per stock and sentiment
stock_sentiment_counts = my_data.groupby(['Stock', 'Sentiment']).size().reset_index(name='Count')

# Display the result
stock_sentiment_counts

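| | Stock | Sentiment | Count |
|---|---|---|---|
| 0 | AAPL | Negative | 38 |
| 1 | AAPL | Neutral | 69 |
| 2 | AAPL | Positive | 73 |
| 3 | GOOGL | Negative | 26 |
| 4 | GOOGL | Neutral | 81 |
| 5 | GOOGL | Positive | 76 |
| 6 | MSFT | Negative | 36 |
| 7 | MSFT | Neutral | 70 |
| 8 | MSFT | Positive | 94 |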
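
Raw counts are hard to compare across stocks with different numbers of articles. As a sketch of one possible follow-up step, the counts from the table above can be pivoted into one row per stock and converted to shares; the names `counts`, `wide`, and `shares` are hypothetical and not part of the original notebook:

```python
import pandas as pd

# Counts copied from the table above
counts = pd.DataFrame({
    'Stock': ['AAPL'] * 3 + ['GOOGL'] * 3 + ['MSFT'] * 3,
    'Sentiment': ['Negative', 'Neutral', 'Positive'] * 3,
    'Count': [38, 69, 73, 26, 81, 76, 36, 70, 94],
})

# One row per stock, one column per sentiment label
wide = counts.pivot(index='Stock', columns='Sentiment', values='Count')

# Divide each row by its total to get the share of each label
shares = wide.div(wide.sum(axis=1), axis=0).round(3)
print(shares)
```

In the notebook itself, `stock_sentiment_counts` could be used in place of the hand-typed `counts` frame.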
