<a href="https://colab.research.google.com/github/bongjoonsiong/Generative-AI-and-ML-in-Finance/blob/main/AI_based_stock_analyzer_using_OpenAI_LLM_and_Langchain.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


## Using an LLM & LangChain to Build a Stock Analyzer
You have probably come across the recent buzzwords: AI, LLMs, GPT, LangChain. In this project, I set out to build a practical application of large language models and LangChain in the finance domain.

*An artificial intelligence bot designed to assist with stock investments by leveraging advanced language models to analyze both real-time and historical stock-related information.*

## Rationale of the AI Stock Analyzer
A LangChain and LLM-based bot that takes real-time as well as historical data to produce an investment analysis.

The concept is to fetch real-time and historical data, mainly from Yahoo Finance, including the following:

1. Historical stock price data

2. The company's financial statements

3. The latest company-related news (scraped from Google search results in the code below)

The LLM then performs a fundamental analysis of the stock named in the user's prompt to the AI Stock Analyzer, based on the information ingested from these sources.

## Main Code

In [None]:
# Import all required libraries
import re    # used by google_query to build the search URL

from bs4 import BeautifulSoup
import requests
import yfinance as yf

## Create Helpers Functions

In [None]:
# Fetch stock data from Yahoo Finance
def get_stock_price(ticker,history=5):
    # time.sleep(4) #To avoid rate limit error
    if "." in ticker:
        ticker=ticker.split(".")[0]
    ticker=ticker+".NS"
    stock = yf.Ticker(ticker)
    df = stock.history(period="1y")
    df=df[["Close","Volume"]]
    df.index=[str(x).split()[0] for x in list(df.index)]
    df.index.rename("Date",inplace=True)
    df=df[-history:]
    # print(df.columns)

    return df.to_string()

# Scrape the top Google news headlines for a given company name
def google_query(search_term):
    if "news" not in search_term:
        search_term=search_term+" stock news"
    url=f"https://www.google.com/search?q={search_term}&cr=countryIN"
    url=re.sub(r"\s","+",url)
    return url

def get_recent_stock_news(company_name):
    # time.sleep(4) #To avoid rate limit error
    headers={'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.0.0 Safari/537.36'}

    g_query=google_query(company_name)
    res=requests.get(g_query,headers=headers).text
    soup=BeautifulSoup(res,"html.parser")
    news=[]
    for n in soup.find_all("div","n0jPhd ynAwRc tNxQIb nDgy9d"):
        news.append(n.text)
    for n in soup.find_all("div","IJl0Z"):
        news.append(n.text)

    news=news[:5]    # keep at most five headlines, matching the "top 5" naming
    news_string=""
    for i,n in enumerate(news,1):
        news_string+=f"{i}. {n}\n"
    top5_news="Recent News:\n\n"+news_string

    return top5_news


# Fetch financial statements from Yahoo Finance
def get_financial_statements(ticker):
    # time.sleep(4) #To avoid rate limit error
    if "." in ticker:
        ticker=ticker.split(".")[0]
    else:
        ticker=ticker
    ticker=ticker+".NS"
    company = yf.Ticker(ticker)
    balance_sheet = company.balance_sheet
    if balance_sheet.shape[1]>=3:
        balance_sheet=balance_sheet.iloc[:,:3]    # Keep only the first three yearly columns
    balance_sheet=balance_sheet.dropna(how="any")
    balance_sheet = balance_sheet.to_string()
    return balance_sheet
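Both `get_stock_price` and `get_financial_statements` repeat the same ticker clean-up, and `google_query` only replaces whitespace when it builds the URL. The sketch below shows one way these two steps could be factored out. The helper names `normalize_nse_ticker` and `build_news_query_url` are hypothetical and not part of the original notebook, and the code assumes NSE-listed stocks, as the helpers above do.

```python
from urllib.parse import urlencode


def normalize_nse_ticker(ticker: str) -> str:
    """Drop any exchange suffix and append the Yahoo Finance NSE suffix '.NS'."""
    base = ticker.strip().split(".")[0]
    return base + ".NS"


def build_news_query_url(search_term: str) -> str:
    """Build a Google search URL, percent-encoding the query instead of only swapping spaces."""
    if "news" not in search_term:
        search_term = search_term + " stock news"
    # urlencode escapes characters such as '&' that would otherwise break the query string
    return "https://www.google.com/search?" + urlencode({"q": search_term, "cr": "countryIN"})


print(normalize_nse_ticker("YESBANK.BO"))   # YESBANK.NS
print(build_news_query_url("Yes Bank"))
```

With helpers like these, a company name containing `&` (for example "M&M") would be encoded as `M%26M` rather than being split into a separate URL parameter.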

## Approach 1:
Agents in LangChain are the components responsible for decision making. I used a zero-shot ReAct agent. ReAct stands for reasoning and acting: the agent repeatedly produces a thought and then takes an action based on it. The problem with this approach is that the agent can get stuck in an endless loop of thoughts and actions. The end goal of stock analysis seems too complicated for it, so it cannot confidently decide on the next action, which results either in an endless loop or in poor results that have little to do with the input query.

Let's look at the code:

In [None]:
!pip install langchain -q
!pip install duckduckgo-search
print("done")

In [None]:
from langchain.tools import Tool, DuckDuckGoSearchRun
search=DuckDuckGoSearchRun()

# Making tool list

tools=[
    Tool(
        name="get stock data",
        func=get_stock_price,
        description="Use when you are asked to evaluate or analyze a stock. This will output historic share price data. You should input the stock ticker to it"
    ),
    Tool(
        name="DuckDuckGo Search",
        func=search.run,
        description="Use only when you need to get the NSE/BSE stock ticker from the internet; you can also get recent stock-related news. Do not use it for any other analysis or task"
    ),
    Tool(
        name="get recent news",
        func=get_recent_stock_news,
        description="Use this to fetch recent news about stocks"
    ),

    Tool(
        name="get financial statements",
        func=get_financial_statements,
        description="Use this to get the financial statements of the company. With the help of this data, the company's historic performance can be evaluated. You should input the stock ticker to it"
    )
]

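from langchain.agents import initialize_agent

# NOTE: `llm` must already be defined as a LangChain LLM; its definition is not shown in this notebook.

# new_prompt="<Please refer to the GitHub repo>"
# zero_shot_agent.agent.llm_chain.prompt.template=new_prompt

zero_shot_agent=initialize_agent(
    llm=llm,
    agent="zero-shot-react-description",
    tools=tools,
    verbose=True,
    max_iterations=4,    # cap the thought/action loop (the keyword is max_iterations, plural)
    return_intermediate_steps=True,
    handle_parsing_errors=True
)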

zero_shot_agent("Is Bajaj Finance a good investment choice right now?")


Note that this code continues from the helper functions defined earlier. Here we simply wrap the data-scraping functions as LangChain tools and collect them in a list so that the agent can access them. The agent is then created with the initialize_agent function, which takes the LLM, the tool list, and the agent type as arguments. The output of this approach is passable, but it may or may not work for a given query. Modifying the prompt can improve it further.
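To make the thought/action loop and its failure mode concrete, here is a toy sketch in plain Python. It does not use LangChain or a real LLM: a hypothetical `policy` function stands in for the model, and `run_react_loop` is an illustrative name, not a LangChain API. The point is only to show why an iteration cap matters when the "model" never decides to finish.

```python
from typing import Callable, Dict, List, Tuple

# A policy returns ("act", tool_name, tool_input) or ("finish", answer, "")
Decision = Tuple[str, str, str]
Policy = Callable[[str, List[str]], Decision]


def run_react_loop(question: str, policy: Policy,
                   tools: Dict[str, Callable[[str], str]],
                   max_iterations: int = 4) -> str:
    """Alternate between deciding (thought) and calling a tool (action) until done or capped."""
    observations: List[str] = []
    for _ in range(max_iterations):
        kind, name_or_answer, tool_input = policy(question, observations)
        if kind == "finish":
            return name_or_answer
        tool = tools.get(name_or_answer)
        if tool is None:
            observations.append(f"Unknown tool: {name_or_answer}")
        else:
            observations.append(tool(tool_input))
    return "Stopped: iteration limit reached"


def scripted_policy(question: str, observations: List[str]) -> Decision:
    # Fetch data once, then answer
    if not observations:
        return ("act", "get stock data", "YESBANK")
    return ("finish", f"Answer based on {len(observations)} observation(s)", "")


def indecisive_policy(question: str, observations: List[str]) -> Decision:
    # Never commits to an answer, like an agent stuck in a loop
    return ("act", "get stock data", "YESBANK")


toy_tools = {"get stock data": lambda ticker: f"prices for {ticker}"}

print(run_react_loop("Is Yes Bank a good buy?", scripted_policy, toy_tools))
print(run_react_loop("Is Yes Bank a good buy?", indecisive_policy, toy_tools))
```

The second call stops only because of `max_iterations`; without a cap, an indecisive policy would keep calling the same tool forever.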

## Approach 2:
Because stock analysis is a complicated task, the ReAct agent was not able to decide on the proper steps. So in this approach I defined the steps before the analysis itself: first all the data is fetched, and then it is fed into the LLM for a comprehensive analysis.

# Using OpenAI Function Calling

In [None]:
!pip install --upgrade openai


In [None]:
# OpenAI function calling

import json

# openai>=1.0 exposes the API through a client object instead of module-level calls
from openai import OpenAI

function=[
        {
        "name": "get_company_Stock_ticker",
        "description": "This will get the indian NSE/BSE stock ticker of the company",
        "parameters": {
            "type": "object",
            "properties": {
                "ticker_symbol": {
                    "type": "string",
                    "description": "This is the stock symbol of the company.",
                },

                "company_name": {
                    "type": "string",
                    "description": "This is the name of the company given in query",
                }
            },
            "required": ["company_name","ticker_symbol"],
        },
    }
]

print(f"Will start main section next")

# Main Code

In [None]:
!openai migrate

In [None]:

# Assumes the OPENAI_API_KEY environment variable is set
client = OpenAI()

def get_stock_ticker(query):
    # Force the model to call get_company_Stock_ticker so the reply is structured JSON
    response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            temperature=0,
            messages=[{
                "role":"user",
                "content":f"Given the user request, what is the company name and the company stock ticker?: {query}"
            }],
            functions=function,
            function_call={"name": "get_company_Stock_ticker"},
    )
    # openai>=1.0 returns objects, so use attribute access instead of dict indexing
    message = response.choices[0].message
    arguments = json.loads(message.function_call.arguments)
    company_name = arguments["company_name"]
    company_ticker = arguments["ticker_symbol"]
    return company_name,company_ticker

def Anazlyze_stock(query):
    #agent.run(query) Outputs Company name, Ticker
    Company_name,ticker=get_stock_ticker(query)
    print({"Query":query,"Company_name":Company_name,"Ticker":ticker})
    stock_data=get_stock_price(ticker,history=10)
    stock_financials=get_financial_statements(ticker)
    stock_news=get_recent_stock_news(Company_name)

    # available_information=f"Stock Price: {stock_data}\n\nStock Financials: {stock_financials}\n\nStock News: {stock_news}"
    available_information=f"Stock Financials: {stock_financials}\n\nStock News: {stock_news}"

    print("\n\nAnalyzing.....\n")
    analysis=llm(f"Give a detailed stock analysis. Use the available data and provide an investment recommendation. \
             The user is fully aware of the investment risk, so do not include any kind of warning like 'It is recommended to conduct further research and analysis or consult with a financial advisor before making an investment decision' in the answer. \
             User question: {query} \
             You have the following information available about {Company_name}. Write a (5-8) pointwise investment analysis to answer the user query. At the end, conclude with a proper explanation. Try to give positives and negatives: \
              {available_information} "
             )
    print(analysis)

    return analysis

OpenAI recently introduced function calling, which makes it possible to get structured output from the LLM in JSON format. This approach uses it. First, the stock ticker is extracted with a function call, because most of the later code depends on this single argument. The ReAct agent in Approach 1 failed at exactly this step, which caused all subsequent steps to deviate. Once the stock ticker is extracted correctly, the stock data and financial statements are fetched simply by passing in the ticker symbol, and the news is fetched by company name. Once all the stock-related information is available, the LLM uses it for the comprehensive stock analysis.
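Because everything downstream depends on the extracted ticker, it can help to validate the function-call arguments before using them. The following is a minimal sketch that assumes the arguments arrive as a JSON string with the `company_name` and `ticker_symbol` fields declared in the function schema above. The helper name `parse_ticker_arguments` is hypothetical and not part of the original notebook.

```python
import json
from typing import Tuple

REQUIRED_FIELDS = ("company_name", "ticker_symbol")


def parse_ticker_arguments(raw_arguments: str) -> Tuple[str, str]:
    """Parse the function-call arguments and fail loudly if a required field is missing."""
    try:
        arguments = json.loads(raw_arguments)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Model returned invalid JSON: {exc}") from exc
    if not isinstance(arguments, dict):
        raise ValueError("Expected a JSON object with function arguments")
    # A field counts as missing if it is absent, not a string, or blank
    missing = [
        field for field in REQUIRED_FIELDS
        if not isinstance(arguments.get(field), str) or not arguments[field].strip()
    ]
    if missing:
        raise ValueError(f"Missing required fields: {', '.join(missing)}")
    return arguments["company_name"].strip(), arguments["ticker_symbol"].strip()


print(parse_ticker_arguments('{"company_name": "Yes Bank", "ticker_symbol": "YESBANK"}'))
# ('Yes Bank', 'YESBANK')
```

Raising an error early is a design choice: it stops the pipeline before it fetches prices, statements, and news for an empty or malformed ticker.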

# Example: sample input and output of the bot

Input


In [None]:
Anazlyze_stock("Is it a good time to invest in Yes Bank?")



Output

{'Query': 'Is it a good time to invest in Yes Bank?', 'Company_name': 'Yes Bank', 'Ticker': 'YESBANK'}

Analyzing.....

**Investment Thesis for Yes Bank:**
1. Financial Performance: Yes Bank has shown improvement in its financials over the past three years. The net debt has increased, indicating higher borrowing, but the tangible book value and common stock equity have also increased, suggesting a stronger financial position.
2. Total Capitalization: The total capitalization of Yes Bank has been consistently increasing, indicating a growing investor base and potential for future growth. This can be seen as a positive sign for investors considering investing in the bank.
3. Total Assets: Yes Bank's total assets have also been increasing steadily, indicating the bank's ability to attract and manage a larger pool of assets. This growth in assets can contribute to the bank's profitability and potential for future expansion.
4. Stock News: Recent news about Yes Bank suggests that the stock has seen a marginal increase in price and has been holding steady. This stability in the stock price can be seen as a positive sign for investors, indicating a potential for future growth.
5. Weak Underlying Business: However, it is important to note that there are concerns about the bank's weak underlying business, as indicated by the soft quarter expected in Q1. This may lead to a decline in profitability, which could impact the stock price in the short term.
6. Overall Market Conditions: It is also important to consider the overall market conditions and the banking sector as a whole before making an investment decision. Factors such as economic conditions, regulatory changes, and competition can significantly impact the performance of Yes Bank and its stock price.
Based on the available data and information, it can be concluded that investing in Yes Bank at this time carries

## Further improvements that can be made

a) More tools can be added, for example a math tool to perform complex technical analysis

b) More robust prompting for stable output

c) Support for other open-source LLMs

Note: This is a fun hobby project and I am not a finance expert. Feel free to suggest additions or modifications.
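As an illustration of what a simple math tool for improvement (a) might compute, here is a minimal sketch of a simple moving average over closing prices. The function name `simple_moving_average` and the sample prices are made up for illustration. The notebook does not include such a tool, and a real one would read prices from the data returned by `get_stock_price`.

```python
from typing import List, Sequence


def simple_moving_average(closes: Sequence[float], window: int) -> List[float]:
    """Return the average of each consecutive `window` closing prices."""
    if window <= 0:
        raise ValueError("window must be a positive integer")
    if len(closes) < window:
        return []
    running_total = sum(closes[:window])
    averages = [running_total / window]
    # Slide the window one step at a time: add the new price, drop the oldest
    for i in range(window, len(closes)):
        running_total += closes[i] - closes[i - window]
        averages.append(running_total / window)
    return averages


sample_closes = [1, 2, 3, 4, 5]   # made-up prices, not real market data
print(simple_moving_average(sample_closes, 3))   # [2.0, 3.0, 4.0]
```

A function like this could be wrapped in a LangChain `Tool` in the same way as the helpers in Approach 1.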