# WebBase Loader
Load the webpage and extract the data using `WebBaseLoader`.

In [7]:
from dotenv import load_dotenv

load_dotenv("./../.env")

True

In [8]:
from langchain_community.document_loaders import WebBaseLoader

urls = [
  "https://www.cnbc.com/2025/06/24/stock-market-today-live-updates.html",
  'https://www.livemint.com/latest-news',
  "https://www.moneycontrol.com/"
]

USER_AGENT environment variable not set, consider setting it to identify your requests.


In [9]:
loader = WebBaseLoader(web_paths=urls)

In [10]:
docs = []
async for doc in loader.alazy_load():
  docs.append(doc)

In [11]:
import re

def join_docs(docs):
  return "\n\n".join([doc.page_content for doc in docs])

def clean_text(text):
  # Remove redundant spaces, enters, tabs
  text = re.sub(r'\n\n+', '\n\n', text)
  text = re.sub(r'\t+', '\t', text)
  text = re.sub(r'\s+', ' ', text)
  return text

In [12]:
context = join_docs(docs)
context = clean_text(context)

In [13]:
print(context)

Stock market today: Live updatesSkip NavigationMarketsPre-MarketsU.S. MarketsEurope MarketsChina MarketsAsia MarketsWorld MarketsCurrenciesCryptocurrencyFutures & CommoditiesBondsFunds & ETFsBusinessEconomyFinanceHealth & ScienceMediaReal EstateEnergyClimateTransportationIndustrialsRetailWealthSportsLifeSmall BusinessInvestingPersonal FinanceFintechFinancial AdvisorsOptions ActionETF StreetBuffett ArchiveEarningsTrader TalkTechCybersecurityAIEnterpriseInternetMediaMobileSocial MediaCNBC Disruptor 50Tech GuidePoliticsWhite HousePolicyDefenseCongressExpanding OpportunityEurope PoliticsChina PoliticsAsia PoliticsWorld PoliticsVideoLatest VideoFull EpisodesLivestreamTop VideoLive AudioEurope TVAsia TVCNBC PodcastsCEO InterviewsDigital OriginalsWatchlistInvesting ClubTrust PortfolioAnalysisTrade AlertsMeeting VideosHomestretchJim's ColumnsEducationSubscribeSign InPROPro NewsLivestreamFull EpisodesStock ScreenerMarket ForecastOptions InvestingChart InvestingStock ListsSubscribeSign InLivestr

In [14]:
import os
import sys

# Get the absolute path to the directory containing the script to import
# Replace 'path/to/your/other_folder' with the actual relative or absolute path
other_folder_path = os.path.abspath('./../scripts') 

# Add the folder to sys.path
sys.path.insert(0, other_folder_path) 

import llm

In [15]:
response = llm.ask_llm(context=context,
                       question="Extract stock market news.")
print(response)

Here are the extracted stock market news:

1. Gold rate today: Check the latest prices in your city on June 25 or June 23, which opened at Rs 97,311 per 10 grams.
2. JP Morgan pre-leases 1.16 lakh sq ft office space in Mumbai's BKC for Rs 6.9 crore/month.
3. Tata Group first Indian brand over the $30 billion value mark.
4. Eternal joins the top 50 list of Indian companies with a market capitalization of over $2 billion.
5. Arisinfra Solutions shares plunge over 21% after a weak market debut, citing a lack of investor interest in its space travel plans.
6. The Sudeep Pharma IPO will consist of fresh issuance of shares worth Rs 95 crore and an offer-for-sale of 1 crore shares by the promoter - the Bhayani family.
7. India's energy back up plan & trade risks in focus | Market Minutes Advisory Alert: It has come to our attention that certain individuals are representing themselves as affiliates of Moneycontrol and soliciting funds on the false promise of assured returns on their investment

In [None]:
def chunk_text(text, chunk_size, overlap=100):
  # Chunk size unit = characters
  chunks = []
  for i in range(0, len(text), chunk_size-overlap):
    chunks.append(text[i:i+chunk_size])
  return chunks

In [21]:
chunks = chunk_text(text=context, chunk_size=10000, overlap=200)

In [22]:
question = "Extract stock market news."

full_response = []
for chunk in chunks:
  response = llm.ask_llm(context=context, question=question)
  full_response.append(response)
  
full_response

["Gainers & Losers: 10 stocks that moved the most on June 25 Tehran to Tel Aviv: Israel-Iran conflict sparks global concern Market snaps two-week losing streak amid RBI booster dose Bumble to lay off 30% of global workforce as online dating industry struggles The move, which will affect 240 roles, or 30% of Bumble's staff, is part of a broader effort to revamp the platform 'Badly damaged': Iran's first admission of damage to nuclear sites in US strikes Delhi HR executive reacts after employee quits on Day 1': 'No job becomes perfect in a day' The viral post has sparked a discussion on social media on professional commitment and transparency in communication at workplaces.",
 "Here are the extracted stock market news:\n\n1. Gold rate today: Check the latest prices in your city on June 25\n2. Gold rate today: The yellow metal's August contracts on the MCX opened at Rs 97,311 per 10 grams on June 25.\n3. JP Morgan pre-leases 1.16 lakh sq ft office space in Mumbai's BKC for Rs 6.9 crore/mo

In [25]:
summary = "\n\n".join(full_response)
print(summary)

Gainers & Losers: 10 stocks that moved the most on June 25 Tehran to Tel Aviv: Israel-Iran conflict sparks global concern Market snaps two-week losing streak amid RBI booster dose Bumble to lay off 30% of global workforce as online dating industry struggles The move, which will affect 240 roles, or 30% of Bumble's staff, is part of a broader effort to revamp the platform 'Badly damaged': Iran's first admission of damage to nuclear sites in US strikes Delhi HR executive reacts after employee quits on Day 1': 'No job becomes perfect in a day' The viral post has sparked a discussion on social media on professional commitment and transparency in communication at workplaces.

Here are the extracted stock market news:

1. Gold rate today: Check the latest prices in your city on June 25
2. Gold rate today: The yellow metal's August contracts on the MCX opened at Rs 97,311 per 10 grams on June 25.
3. JP Morgan pre-leases 1.16 lakh sq ft office space in Mumbai's BKC for Rs 6.9 crore/month
4. De

In [26]:
# Generate final report from all chunk responses
response = llm.ask_llm(context=summary,
                       question="Write a detailed market news report in markdown format. Think carefully then write the report.")
print(response)

# Market News Report

## Overview

The current market scenario is characterized by a rebound after a two-week losing streak, with investors taking advantage of gains on rising stocks and precious metals like gold.

**Gold Rate Today**
-------------------

*   The yellow metal's August contracts on the MCX opened at Rs 97,311 per 10 grams on June 25.
*   Gold rate today: Yellow metal's prices fall; check the latest rates in your city on June 20

## Market Performance
--------------------

### Stocks

| Stock | Current Price |
| --- | --- |
| Gold | Rs 97,311/10g |
| JP Morgan | ₹6.9 crore/month |

**Market Insights**

*   The recent rally in Indian stocks has been fueled by gains in various sectors such as IT, BFSI, and FMCG.
*   However, the market's resilience comes from a strong support base provided by gold prices, which have remained above Rs 97,000 per kilogram for several weeks.

**Precious Metals**

*   The price of gold has fallen after rising sharply earlier in the week due to