<a href="https://colab.research.google.com/github/Kedar154/AI-Driven-Stock-Market-Intelligence/blob/main/RAG_2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Loading the Libraries

In [1]:
! pip install -q yfinance langchain-community langchain-huggingface chromadb duckduckgo-search langchain_groq ddgs


In [29]:
import os
import json
import yfinance as yf
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.tools import DuckDuckGoSearchRun
from langchain_community.utilities import DuckDuckGoSearchAPIWrapper
from datetime import datetime, timedelta
import matplotlib.pyplot as plt
import plotly.express as px


In [3]:
embedding = HuggingFaceEmbeddings(model_name = 'sentence-transformers/all-MiniLM-L6-v2')


## Setting up API

In [4]:
from google.colab import userdata
from langchain_groq import ChatGroq
GAPi = userdata.get('groqAPi')

In [5]:
llm = ChatGroq(
    model_name = 'llama-3.3-70b-versatile',
    groq_api_key = GAPi,
    verbose = True, #tells what model is doing behind the scenes
    temperature = 0
)

# Extracting date and ticker

In [6]:
def extract(query):
  ''' using llm to find ticker and time frame '''
  today = datetime.now().strftime('%Y-%m-%d')
  extraction_prompt = f'''System: You are a precision data extraction assistant.
Today's Date: {today}

Your goal is to extract a stock ticker and a specific date range from the user's query.

Instructions:
1. "ticker": The official stock symbol (e.g., AAPL). Convert company names to tickers.
2. "target_date": The exact date the user is interested in (YYYY-MM-DD).
   - If a festival is mentioned (e.g., Diwali, Christmas), identify its date for the current year.
   - If a relative time is mentioned (e.g., "last Friday"), calculate the date based on {today}.
3. "start_date": Calculate the date exactly 5 days BEFORE the "target_date" (YYYY-MM-DD).

Return ONLY a JSON object. No prose.

Example:
Query: "Tesla news during Diwali"
Output: {{
  "ticker": "TSLA",
  "target_date": "2024-10-31",
  "start_date": "2024-10-26"
}}

User Query: "{query}"
JSON Output:'''

  response = llm.invoke(extraction_prompt).content
  # Strip markdown code fences in case the LLM wraps its JSON in them
  data = json.loads(response.replace('```json', '').replace('```', '').strip())
  name = data.get("ticker", 'NONE')
  start_date = data.get("start_date", 'NONE')
  end_date = data.get("target_date", 'NONE')

  return name, start_date, end_date

In [7]:
# --- TESTING ---
ticker, start_date, end_date = extract("How did Tesla do since last Makarsankranti?")
print(f"Extracted Ticker: {ticker}")
print(f"Extracted Period: {start_date}, {end_date}")

Extracted Ticker: TSLA
Extracted Period: 2026-01-09, 2026-01-14
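
The `extract` cell above relies on `json.loads` succeeding after the backticks are stripped, so a reply with extra prose or a malformed date would raise or pass through unchecked. A minimal sketch of a more defensive parser follows. The name `parse_extraction` is hypothetical (it is not part of the notebook), and falling back to `'NONE'` for every bad field is an assumption about what the downstream code wants.

```python
import json
import re
from datetime import datetime

FALLBACK = ("NONE", "NONE", "NONE")

def parse_extraction(response: str):
    """Hypothetical helper: pull the first JSON object out of an LLM reply
    and return (ticker, start_date, target_date), using 'NONE' for bad fields."""
    # Grab everything from the first '{' to the last '}' so that code fences
    # or surrounding prose do not break json.loads.
    match = re.search(r"\{.*\}", response, flags=re.DOTALL)
    if match is None:
        return FALLBACK
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return FALLBACK

    ticker = str(data.get("ticker", "NONE")).upper()

    dates = []
    for key in ("start_date", "target_date"):
        value = data.get(key, "NONE")
        try:
            # Accept only values that parse as YYYY-MM-DD.
            datetime.strptime(value, "%Y-%m-%d")
        except (TypeError, ValueError):
            value = "NONE"
        dates.append(value)

    return ticker, dates[0], dates[1]
```

Inside `extract`, the `json.loads(...)` line and the three `data.get(...)` lines could then be replaced by a single call such as `return parse_extraction(response)`.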


# Functions

## Ask Price

In [8]:
def ask_price(ticker):
  # Fetch price history (OHLCV) for the past five trading days
  stock = yf.Ticker(ticker)
  data = stock.history(period = '5d')
  return data.to_string()

## Ask News

In [9]:
def ask_news(ticker: str, target_date: str, start_date: str):
  # Search the web for news about the ticker between start_date and target_date
  query = f"{ticker} stock news after:{start_date} before:{target_date}"
  wrapper = DuckDuckGoSearchAPIWrapper(max_results=5)
  search = DuckDuckGoSearchRun(api_wrapper=wrapper)
  return search.invoke(query)

## to_DB

In [10]:
def to_DB(news_text):
  chunk = RecursiveCharacterTextSplitter(
      chunk_size = 600,
      chunk_overlap = 100,
  )
  docs = chunk.create_documents([news_text])
  VDB = Chroma.from_documents(
      documents=docs,
      embedding=embedding,
      collection_name = "news"
  )
  return VDB
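
To make the `chunk_size = 600` and `chunk_overlap = 100` settings concrete, here is a simplified sketch of overlapping chunking. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraphs, lines and spaces); `sliding_chunks` is a hypothetical helper that cuts at fixed character offsets, purely to show what the two numbers control.

```python
def sliding_chunks(text: str, chunk_size: int = 600, chunk_overlap: int = 100):
    """Hypothetical helper: cut text into fixed-size windows where each
    window repeats the last `chunk_overlap` characters of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

With the notebook's settings, each window starts 500 characters after the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk as long as it is shorter than the overlap.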

## get_context

In [11]:
def get_context(VDB, query):
  recovered = VDB.similarity_search(query, k=3)
  context = "\n---\n".join([doc.page_content for doc in recovered])
  return context
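
`similarity_search(query, k=3)` embeds the query and returns the three stored chunks whose vectors are closest to it. The sketch below illustrates that idea with cosine similarity in plain Python. It is a stand-in, not Chroma's implementation: the distance metric Chroma actually uses depends on how the collection is configured, and `cosine` and `top_k` are hypothetical names.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=3):
    """Return the indices of the k document vectors most similar to the query."""
    ranked = sorted(
        enumerate(doc_vecs),
        key=lambda pair: cosine(query_vec, pair[1]),
        reverse=True,
    )
    return [index for index, _ in ranked[:k]]
```

For example, `top_k([1, 0], [[0, 1], [1, 0], [1, 1]], k=2)` ranks the second vector first (same direction as the query) and the third vector next.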

## plot_stock

In [22]:

def plot_stock_plt(ticker: str, target_date: str):
    """Fetches and plots up to 15 of the most recent trading days before target_date."""
    try:
        # 1. Calculate a wide 25-day calendar window to ensure we find 15 trading days
        end_dt = datetime.strptime(target_date, "%Y-%m-%d")
        buffer_start = end_dt - timedelta(days=25)

        # 2. Download the data
        df = yf.download(ticker, start=buffer_start.strftime("%Y-%m-%d"), end=target_date)
        # 3. Keep at most the last 15 trading days
        if df.empty:
            return "No data found for this period."
        df = df.tail(15)

        # 4. Create the plot
        plt.figure(figsize=(12, 6))
        plt.plot(df.index, df['Close'], marker='o', color='#2ca02c', linewidth=2, label='Close Price')

        # 5. Styling
        plt.title(f"{ticker}: 15-Day Trading History (ending {target_date})", fontsize=14)
        plt.xlabel("Date", fontsize=10)
        plt.ylabel("Price (USD)", fontsize=10)
        plt.grid(True, alpha=0.3)
        plt.legend()
        plt.xticks(rotation=45)
        plt.tight_layout()

        # 6. Save and Return
        filename = f"{ticker}_15day_analysis.png"
        plt.savefig(filename)
        plt.show()
        plt.close()

        return f"Chart generated with {len(df)} points: {filename}"

    except Exception as e:
        return f"Error: {str(e)}"
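
One detail worth knowing about the download call above: yfinance documents the `end` argument as exclusive, so passing `end=target_date` returns data up to the day before `target_date`. If the chart should include `target_date` itself, one option is to shift the end by a day. The sketch below is a hypothetical helper (`download_window` is not part of the notebook) that computes both bounds, assuming the same 25-calendar-day buffer.

```python
from datetime import datetime, timedelta

def download_window(target_date: str, calendar_days: int = 25):
    """Hypothetical helper: return (start, end) strings for yf.download so that
    target_date itself falls inside the window, given an exclusive `end`."""
    end_dt = datetime.strptime(target_date, "%Y-%m-%d")
    start_dt = end_dt - timedelta(days=calendar_days)
    end_exclusive = end_dt + timedelta(days=1)  # push the exclusive bound past target_date
    return start_dt.strftime("%Y-%m-%d"), end_exclusive.strftime("%Y-%m-%d")
```

Usage would look like `start, end = download_window(target_date)` followed by `yf.download(ticker, start=start, end=end)`. Whether the target day has data still depends on it being a trading day.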

## plot_px

In [30]:
def plot_stock_px(ticker: str, target_date: str):
    """Fetches data and plots a neon-themed 15-trading-day chart using Plotly Express."""
    try:
        # 1. Fetch data with 25-day buffer to ensure 15 trading days
        end_dt = datetime.strptime(target_date, "%Y-%m-%d")
        buffer_start = end_dt - timedelta(days=25)

        df = yf.download(ticker, start=buffer_start.strftime("%Y-%m-%d"), end=target_date)

        if df.empty:
            return "No data found for this period."

        # 2. Take exactly the last 15 trading days
        df = df.tail(15).reset_index()

        # 3. Create Plotly Express line chart
        fig = px.line(
            df,
            x='Date',
            y='Close',
            title=f"{ticker}: 15-Day Performance (Ending {target_date})",
            template="plotly_dark", # Base dark theme
            markers=True
        )

        # 4. Custom Neon & White Styling
        fig.update_traces(
            line=dict(color='#00FF00', width=3), # Neon Green Line
            marker=dict(
                color='white',                   # White Markers
                size=8,
                line=dict(color='#00FF00', width=2)
            ),
            hovertemplate="<b>Date:</b> %{x}<br><b>Price:</b> $%{y:.2f}<extra></extra>"
        )

        fig.update_layout(
            paper_bgcolor='#111111', # Deep dark background
            plot_bgcolor='#111111',  # Match plot area background
            font_color='white',      # White text for titles and labels
            title_font_size=20,
            xaxis=dict(
                showgrid=True,
                gridcolor='#333333', # Darker grid lines
                linecolor='white'    # White axis line
            ),
            yaxis=dict(
                showgrid=True,
                gridcolor='#333333',
                linecolor='white',
                tickprefix="$"
            )
        )

        # 5. Display the chart
        fig.show()

        filename = f"{ticker}_plotly_analysis.html"
        # Optional: Save as interactive HTML
        # fig.write_html(filename)

        return f"Interactive Plotly chart generated with {len(df)} points."

    except Exception as e:
        return f"Error: {str(e)}"

# Example Usage
# plot_stock_px("GC=F", "2026-01-22")

# MODEL

In [27]:
def run_analyst_model(query):
    # 1. Extract the ticker and date range from the query
    ticker, start_date, end_date = extract(query)

    # Greeting Check
    greetings = ["hi", "hello", "who are you", "hey", "help"]
    if ticker == "NONE" or any(word == query.lower().strip() for word in greetings):
        general_prompt = "Identity: Harshad Mehta. Explain who you are and how to use you. Mention you need tickers like Reliance or ^NSEI."
        return llm.invoke(general_prompt).content

    # 2. Data Fetching
    # Disabled: map the NIFTY alias to its index symbol
    # search_ticker = "^NSEI" if ticker == "NIFTY" else ticker
    price_data = ask_price(ticker)
    raw_news = ask_news(ticker, start_date, end_date)

    # 3. RAG & Verification Step
    db = to_DB(raw_news)
    raw_context = get_context(db, query)
    chart_file = plot_stock_px(ticker, end_date)
    # --- SOURCE VERIFICATION ---
    verification_prompt = f"""
    You are a Fact Checker.
    Ticker: {ticker}
    News Snippets: {raw_context}
    user_request : {query}

    TASK: Remove any news snippets that are NOT related to {ticker} and the query: {query}.
    You have to check whether each news snippet is actually the cause behind the move the user asked about.
    If a snippet is about a different company (like Nvidia mentioned during a Nifty query), delete it.
    Return ONLY the verified, relevant text. If nothing is relevant, return 'NO RELEVANT NEWS'.
    """
    verified_context = llm.invoke(verification_prompt).content
    # --------------------------------

    # 4. Final Analysis with Verified Context
    harshad_financial_prompt = f"""
    ROLE: Harshad Mehta.
    USER QUERY: {query}
    TICKER: {ticker}

    MARKET DATA: {price_data}
    VERIFIED NEWS: {verified_context}
    STOCK CHART: {chart_file}
    INSTRUCTIONS:
    - If VERIFIED NEWS is 'NO RELEVANT NEWS', say: "The street is quiet on this one, Lala. No direct news, so we look at the charts."
    - Otherwise, use the verified news to explain the move.

    STRUCTURE:
    - **The Big Bull Headline**
    - **The Technical Game**
    - **Market Sentiment**
    - **The Bottom Line**

    CLOSING: "Risk hai toh Ishq hai!"
    """

    return llm.invoke(harshad_financial_prompt).content
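
In `run_analyst_model`, the greeting check runs only after `extract(query)` has already called the LLM, and a bare greeting may not come back as valid JSON. A minimal sketch of a check that could run first is shown below. `is_greeting` and `GREETINGS` are hypothetical names, and stripping trailing punctuation is an added assumption beyond the notebook's exact-match test.

```python
GREETINGS = {"hi", "hello", "who are you", "hey", "help"}

def is_greeting(query: str) -> bool:
    """Hypothetical helper: True when the whole query is just a greeting."""
    # Lower-case, trim whitespace, then drop surrounding punctuation like "!" or "?".
    normalized = query.lower().strip().strip("!?.,")
    return normalized in GREETINGS
```

Calling `is_greeting(query)` at the top of `run_analyst_model` and returning the introduction early would skip the extraction round trip for inputs such as "Hello!", while a query like "hello, how did Tesla do?" still goes through the full pipeline.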

In [31]:
user_input = "tell me about honda prices recently"
print(run_analyst_model(user_input))



**The Big Bull Headline**
Honda's recent stock prices have been making waves, Lala! As of January 23, 2026, the closing stock price for Honda Motor Co., Ltd. (HMC) is $30.42. This is a significant piece of information for anyone looking to invest in the automotive giant.

**The Technical Game**
Looking at the historical closing prices, we can see that the stock has been fluctuating over the past few days. On January 21, 2026, the stock price surged to $31.19, only to drop to $30.42 on January 23, 2026. This volatility suggests that the market is still trying to find its footing. With a projected revenue of 20.3 trillion yen for the fiscal year 2026, Honda is poised for growth, but the stock price will likely be influenced by various market factors.

**Market Sentiment**
The market sentiment around Honda is cautiously optimistic. The company's revenue projection for 2026 is a positive sign, but the stock price has been experiencing some turbulence. As the market adjusts to the new infor