<a href="https://colab.research.google.com/github/DarthCoder501/InvestorAI/blob/main/InvestorAI_Jupyter_Notebook.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!pip install -U langgraph langgraph-checkpoint-sqlite yfinance langchain-groq langchain-community

Collecting langgraph
  Downloading langgraph-0.5.2-py3-none-any.whl.metadata (6.7 kB)
Collecting langgraph-checkpoint-sqlite
  Downloading langgraph_checkpoint_sqlite-2.0.10-py3-none-any.whl.metadata (2.7 kB)
Collecting langchain-groq
  Downloading langchain_groq-0.3.6-py3-none-any.whl.metadata (2.6 kB)
Collecting langchain-community
  Downloading langchain_community-0.3.27-py3-none-any.whl.metadata (2.9 kB)
Collecting langgraph-checkpoint<3.0.0,>=2.1.0 (from langgraph)
  Downloading langgraph_checkpoint-2.1.0-py3-none-any.whl.metadata (4.2 kB)
Collecting langgraph-prebuilt<0.6.0,>=0.5.0 (from langgraph)
  Downloading langgraph_prebuilt-0.5.2-py3-none-any.whl.metadata (4.5 kB)
Collecting langgraph-sdk<0.2.0,>=0.1.42 (from langgraph)
  Downloading langgraph_sdk-0.1.72-py3-none-any.whl.metadata (1.5 kB)
Collecting aiosqlite>=0.20 (from langgraph-checkpoint-sqlite)
  Downloading aiosqlite-0.21.0-py3-none-any.whl.metadata (4.3 kB)
Collecting sqlite-vec>=0.1.6 (from langgraph-checkpoint-sql

In [2]:
from langchain.chat_models import init_chat_model
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import create_react_agent
from langchain_community.tools.yahoo_finance_news import YahooFinanceNewsTool
from langchain_core.tools import tool

from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_absolute_percentage_error, mean_squared_error

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense
from tensorflow.keras.optimizers import Adam

import yfinance as yf
import numpy as np
import pandas as pd
import getpass
import os
from datetime import datetime
import random



In [None]:
os.environ["LANGSMITH_TRACING"] = "true"

In [None]:
os.environ["LANGSMITH_API_KEY"] = getpass.getpass("LangSmith API key: ")

In [4]:
os.environ["GROQ_API_KEY"] = getpass.getpass("Groq API key: ")

In [5]:
memory = MemorySaver()
model = init_chat_model("llama-3.1-8b-instant", model_provider="groq")

In [6]:
# Tool the LLM can call to get the latest stock price
@tool
def real_time_stock_price(ticker: str) -> float:
  """Fetches the real-time stock price for a given ticker symbol."""
  stock = yf.Ticker(ticker) # Create a Ticker object for the requested symbol
  # Take the close of the most recent daily bar; .iloc avoids the positional-indexing FutureWarning
  price = stock.history(period="1d")['Close'].iloc[-1]
  return float(price)

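The price lookup in the tool above assumes the returned history is non-empty, and it passes the raw float straight to the model. An unknown ticker may yield no rows at all. Below is a minimal sketch of a guard, assuming the closing prices have already been pulled into a plain sequence; `latest_close` is a hypothetical helper, not part of yfinance or of this notebook.

```python
from typing import Sequence


def latest_close(closes: Sequence[float], ndigits: int = 2) -> float:
    """Return the most recent close, rounded, or raise if there is no data.

    `latest_close` is an illustrative helper name, not a yfinance function.
    """
    if len(closes) == 0:
        # Fail with a readable message instead of an IndexError
        raise ValueError("no price data returned for this ticker")
    return round(float(closes[-1]), ndigits)


print(latest_close([210.50, 211.16000366210938]))  # 211.16
```

Rounding inside the tool keeps the agent from echoing values such as `211.16000366210938` back to the user.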
In [7]:
class StockPredictor():
  # Load price history and engineer additional features
  def load_data(self, ticker):
      df = yf.download(ticker, start="2020-01-01", end=datetime.today().strftime('%Y-%m-%d'))
      # Recent yfinance releases may return MultiIndex columns (field, ticker); flatten them to plain field names
      if isinstance(df.columns, pd.MultiIndex):
          df.columns = df.columns.get_level_values(0)
      df = df.reset_index() # Reset the index so 'Date' becomes a column
      df['Year'] = df['Date'].dt.year
      df['Month'] = df['Date'].dt.month
      df['TimeStamp'] = (df['Date'] - pd.Timestamp("2020-01-01")) // pd.Timedelta('1s') # Seconds since 2020-01-01
      df['Daily Variation'] = (df['High'] - df['Low']) / df['Open'] # Intraday range relative to the open (daily volatility)
      df['7-Day SMA'] = df['Close'].rolling(window=7).mean() # 7-day simple moving average of the close
      df['7-Day STD'] = df['Close'].rolling(window=7).std() # 7-day rolling standard deviation of the close
      df['High - Close'] = (df['High'] - df['Close']) / df['Open'] # How far the close fell from the day's high (downward pressure)
      df['Low - Open'] = (df['Low'] - df['Open']) / df['Open'] # How far the low dipped below the open, relative to the open
      df['Cumulative Return'] = df['Close'].pct_change().cumsum() # Running sum of daily returns (long-term performance)
      df['14-Day EMA'] = df['Close'].ewm(span=14).mean() # 14-day exponential moving average (smoother and more responsive than the SMA)
      df['Close Change'] = df['Close'].diff() # Absolute change in the close from the previous day
      df['MACD'] = df['Close'].ewm(span=12).mean() - df['Close'].ewm(span=26).mean() # 12-span EMA minus 26-span EMA; tracks trend and momentum
      df['Stochastic Oscillator'] = (df['Close'] - df['Low'].rolling(14).min()) / (df['High'].rolling(14).max() - df['Low'].rolling(14).min()) # Position of the close within its 14-day high-low range
      df['ATR'] = df['Daily Variation'].rolling(14).mean() # 14-day mean of the daily variation (a simplified stand-in for average true range)
      # NOTE: the next two columns are placeholders. Both are the 14-day rolling mean of the close, not a true ADX or DMI calculation
      df['ADX'] = df['Close'].rolling(14).mean()
      df['DMI'] = df['Close'].rolling(14).mean()

      df.dropna(inplace=True) # Drop the warm-up rows where rolling features are undefined
      return df

  # Prepare training data
  def prepare_data(self, df, target_col='Close'):
      # Use separate scalers so fitting the target does not overwrite the feature scaling
      feature_scaler = MinMaxScaler()
      target_scaler = MinMaxScaler()
      # Drop the date and the target column before scaling the features
      scaled_data = feature_scaler.fit_transform(df.drop(columns=['Date', target_col]))
      X, y = [], []
      # Read the unscaled target values from the original dataframe
      target_data = df[target_col].to_numpy().reshape(-1)
      for i in range(30, len(scaled_data)):
          X.append(scaled_data[i-30:i]) # Previous 30 days of scaled features
          y.append(target_data[i]) # Close on the day that follows the window
      # Scale the target separately; target_scaler is returned so predictions can be inverse transformed later
      y = target_scaler.fit_transform(np.array(y).reshape(-1, 1))
      return np.array(X), np.array(y), target_scaler, scaled_data

  # Build LSTM model
  def build_lstm(self, input_shape):
      model = Sequential([
          LSTM(50, return_sequences=True, input_shape=input_shape),
          Dropout(0.2),
          LSTM(50, return_sequences=False),
          Dropout(0.2),
          Dense(1)
      ])
      model.compile(optimizer='adam', loss='mse', metrics=['mae'])
      return model

  # Train and evaluate model
  def train_and_evaluate_model(self, X_train, X_test, y_train, y_test):
      model = self.build_lstm((X_train.shape[1], X_train.shape[2]))
      history = model.fit(X_train, y_train, epochs=50, batch_size=32, validation_data=(X_test, y_test))
      y_pred = model.predict(X_test)
      metrics = {
          "MAE": mean_absolute_error(y_test, y_pred),
          "MAPE": mean_absolute_percentage_error(y_test, y_pred),
          "MSE": mean_squared_error(y_test, y_pred),
          "RMSE": np.sqrt(mean_squared_error(y_test, y_pred))
      }
      return model, history, metrics, y_test, y_pred

  # Main function: train the model and forecast the next 30 days
  def thirty_day_forecast(self, ticker):
      df = self.load_data(ticker)
      X, y, scaler, scaled_data = self.prepare_data(df)
      # Keep chronological order so the test windows come after the training windows
      X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)

      # Train and evaluate the LSTM
      model_lstm, history_lstm, metrics_lstm, _, y_pred_lstm = self.train_and_evaluate_model(X_train, X_test, y_train, y_test)

      # Get the last 30 days of scaled data to seed the forecast
      last_30_days = scaled_data[-30:]
      # Reshape to fit the model's input shape (1 sample, 30 time steps, num_features)
      x_future = last_30_days.reshape(1, last_30_days.shape[0], -1)

      # Generate the forecast
      future_predictions = []
      num_features = x_future.shape[2] # Get the number of features

      for _ in range(30):  # Predict the next 30 days
          pred = model_lstm.predict(x_future)
          future_predictions.append(pred[0, 0]) # Add prediction to the list

          # Build the next day's input row. Only the first feature slot is filled with the
          # prediction; the remaining features stay at zero, which is a rough approximation
          new_day_input = np.zeros((1, num_features))
          new_day_input[0, 0] = pred[0, 0]

          # Slide the window inside the loop: drop the oldest day and append the new row
          x_future = np.append(x_future[:, 1:, :], [[new_day_input[0]]], axis=1)

      # Convert predictions back to the original price scale
      future_predictions = scaler.inverse_transform(np.array(future_predictions).reshape(-1,1))

      return f"The model predicts the following prices for {ticker} over the next 30 days: {future_predictions}"

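The `ATR` column in `load_data` averages the intraday range relative to the open, which ignores overnight gaps between one day's close and the next day's range. For comparison, here is a sketch of the conventional true-range calculation in plain Python. The names `true_range` and `average_true_range` are illustrative, and a simple moving average is assumed; Wilder's original ATR uses a different smoothing.

```python
from typing import List, Optional, Sequence


def true_range(highs: Sequence[float], lows: Sequence[float],
               closes: Sequence[float]) -> List[float]:
    """True range per day: the largest of high-low, |high-prev close|, |low-prev close|."""
    if len(highs) == 0:
        return []
    # The first day has no previous close, so fall back to high - low
    ranges = [highs[0] - lows[0]]
    for i in range(1, len(highs)):
        prev_close = closes[i - 1]
        ranges.append(max(highs[i] - lows[i],
                          abs(highs[i] - prev_close),
                          abs(lows[i] - prev_close)))
    return ranges


def average_true_range(highs: Sequence[float], lows: Sequence[float],
                       closes: Sequence[float], window: int = 14) -> List[Optional[float]]:
    """Simple moving average of the true range; None until the window is full."""
    ranges = true_range(highs, lows, closes)
    atr: List[Optional[float]] = []
    for i in range(len(ranges)):
        if i + 1 < window:
            atr.append(None)
        else:
            atr.append(sum(ranges[i + 1 - window:i + 1]) / window)
    return atr


print(average_true_range([10, 11, 12], [9, 9.5, 11], [9.5, 10.5, 11.5], window=2))
# [None, 1.25, 1.5]
```

The same logic could be expressed with pandas rolling windows inside `load_data`; the plain-Python form is used here only to make each step explicit.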
In [8]:
# Tool the LLM can call to forecast stock prices
@tool
def lstm(ticker: str) -> str:
  """Predicts stock prices for the next 30 days for the given ticker."""
  forecast = StockPredictor()
  return forecast.thirty_day_forecast(ticker)

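The 30-day forecast behind the `lstm` tool is recursive: each prediction is appended to the input window, the oldest step is dropped, and the model is called again on the updated window. Below is a minimal sketch of that loop in plain Python. `recursive_forecast` and `drift_predictor` are hypothetical names, and the drift rule merely stands in for the trained LSTM.

```python
from typing import Callable, List, Sequence


def recursive_forecast(window: Sequence[float],
                       predict_next: Callable[[List[float]], float],
                       steps: int) -> List[float]:
    """Forecast `steps` values by feeding each prediction back into the window."""
    current = list(window)
    forecasts: List[float] = []
    for _ in range(steps):
        pred = predict_next(current)
        forecasts.append(pred)
        # The window must slide inside the loop; otherwise every step
        # would see the same input and return the same prediction
        current = current[1:] + [pred]
    return forecasts


def drift_predictor(values: List[float]) -> float:
    """Stand-in model: last value plus the average step across the window."""
    return values[-1] + (values[-1] - values[0]) / (len(values) - 1)


print(recursive_forecast([1.0, 2.0, 4.0], drift_predictor, steps=3))
# [5.5, 7.25, 8.875]
```

Because each step consumes earlier predictions, errors compound over the horizon, so the later days of a 30-day forecast are typically the least reliable.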
In [9]:
# Create agent & define tools
tools = [YahooFinanceNewsTool(), real_time_stock_price, lstm]
agent_executor = create_react_agent(model, tools, checkpointer=memory)

In [10]:
# Create a random thread id
thread_id = random.randint(1, 100000)

In [11]:
# Run configuration: the thread id tells the checkpointer which conversation to resume
config = {"configurable": {"thread_id": thread_id}}

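`random.randint(1, 100000)` can hand two conversations the same thread id, in which case they would share checkpointed history within the same process. Here is a sketch of an alternative that uses a UUID string instead. `new_thread_config` is a hypothetical helper name, and the dictionary layout mirrors the `config` defined above.

```python
import uuid


def new_thread_config() -> dict:
    """Build a run config with a practically collision-free thread id."""
    # uuid4 draws 122 random bits, so repeated ids are not a realistic concern
    return {"configurable": {"thread_id": str(uuid.uuid4())}}


config_a = new_thread_config()
config_b = new_thread_config()
print(config_a["configurable"]["thread_id"] != config_b["configurable"]["thread_id"])  # True
```

Reusing one config across calls continues the same conversation; calling the helper again starts a fresh one.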
In [12]:
# Get input
message = input("Begin typing here: ")

Begin typing here: What is the real time price of AAPL?


In [13]:
user_input = {
    "role": "user",
    "content": message,
}

In [14]:
for step in agent_executor.stream(
    {"messages": [user_input]}, config, stream_mode="values"
):
    step["messages"][-1].pretty_print()


What is the real time price of AAPL?
Tool Calls:
  real_time_stock_price (5m7dae5ze)
 Call ID: 5m7dae5ze
  Args:
    ticker: AAPL




Name: real_time_stock_price

211.16000366210938

The real-time price of AAPL is 211.16000366210938.


In [15]:
user_input = {
    "role": "user",
    "content": "What happened today with Microsoft stocks?",
}

for step in agent_executor.stream(
    {"messages": [user_input]}, config, stream_mode="values"
):
    step["messages"][-1].pretty_print()


What happened today with Microsoft stocks?
Tool Calls:
  yahoo_finance_news (5eyjdryr7)
 Call ID: 5eyjdryr7
  Args:
    query: MSFT
Name: yahoo_finance_news

Microsoft (MSFT): A Bull Case Theory
We came across a bullish thesis on Microsoft on Max Dividends’s Substack. As of 2ⁿᵈ July, Microsoft’s share was trading at $491.09. MSFT’s trailing and forward P/E were 38.01 and 32.85 respectively according to Yahoo Finance. Microsoft (MSFT) is a dividend stock that stands out with a strong business model, reliable dividends, and massive growth […]

It appears that there is a bullish thesis on Microsoft stocks, with a strong business model, reliable dividends, and massive growth potential.
