In [1]:
from dotenv import load_dotenv
import os
load_dotenv()

True

In [2]:
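GOOGLE_API_KEY = os.environ.get('GOOGLE_API_KEY')
# Fail early with a clear message instead of a TypeError when the key is missing.
if not GOOGLE_API_KEY:
    raise RuntimeError("GOOGLE_API_KEY is not set; add it to your .env file.")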

In [3]:
from langchain_google_genai import ChatGoogleGenerativeAI

model = ChatGoogleGenerativeAI(model='models/gemini-1.5-flash')

In [4]:
model.invoke("Tell me about some current trends in GenAI.").content

'The Generative AI (GenAI) landscape is rapidly evolving, making it hard to definitively say what\'s "trending" at any given moment.  However, several key trends are clearly emerging:\n\n**1. Multimodal Models:**  The biggest trend is the move beyond text-only generation.  Models are increasingly capable of handling multiple modalities simultaneously, such as text, images, audio, and video. This leads to more creative and complex applications, like generating videos from text prompts, creating realistic images from audio descriptions, or translating languages across different media types.\n\n**2. Agent-Based AI:**  We\'re seeing a rise in AI agents that can autonomously perform tasks and interact with their environment. These agents can plan, reason, and learn, going beyond simple prompt-response interactions.  Think of AI that can browse the web, gather information, and synthesize it into a report, or an AI that manages your calendar and proactively solves scheduling conflicts.\n\n**3

In [5]:
from langchain_core.messages import HumanMessage, SystemMessage

messages = [
    SystemMessage(
        content="You are a helpful assistant!"
    ),
    HumanMessage(
        content="Tell me about the K-Means clustering algorithm."
    )
]

In [6]:
messages

[SystemMessage(content='You are a helpful assistant!', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='Tell me about the K-Means clustering algorithm.', additional_kwargs={}, response_metadata={})]

In [7]:
print(model.invoke(messages).content)

The K-Means clustering algorithm is a popular unsupervised machine learning technique used to partition data points into distinct groups (clusters) based on their similarity.  The goal is to find groups where data points within the same cluster are as similar as possible, while data points in different clusters are as dissimilar as possible.  Here's a breakdown:

**Core Idea:**

K-Means aims to find *k* cluster centers (centroids) that best represent the data. Each data point is then assigned to the cluster whose centroid is closest to it.  The algorithm iteratively refines the positions of the centroids to minimize the overall distance between data points and their assigned centroids.

**Algorithm Steps:**

1. **Initialization:**
   - Choose the number of clusters, *k*. This is often a hyperparameter that needs to be determined beforehand (e.g., through experimentation or domain knowledge).
   - Randomly initialize *k* centroids.  These can be randomly selected data points or randomly

In [8]:
# Document loaders live in langchain_community; the langchain.document_loaders path is deprecated.
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader(r'D:\innovative project\data\efinance (1).pdf')

In [9]:
documents = loader.load()

In [10]:
documents

[Document(metadata={'producer': 'iLovePDF', 'creator': 'PyPDF', 'creationdate': '', 'moddate': '2025-03-20T14:14:04+00:00', 'source': 'D:\\innovative project\\data\\efinance (1).pdf', 'total_pages': 892, 'page': 0, 'page_label': '1'}, page_content=''),
 Document(metadata={'producer': 'iLovePDF', 'creator': 'PyPDF', 'creationdate': '', 'moddate': '2025-03-20T14:14:04+00:00', 'source': 'D:\\innovative project\\data\\efinance (1).pdf', 'total_pages': 892, 'page': 1, 'page_label': '2'}, page_content='THE \nINTELLIGENT\nINVESTOR\nA BOOK OF PRACTICAL COUNSEL\nREVISED EDITION\nBENJAMIN GRAHAM\nUpdated with New Commentary by Jason Zweig'),
 Document(metadata={'producer': 'iLovePDF', 'creator': 'PyPDF', 'creationdate': '', 'moddate': '2025-03-20T14:14:04+00:00', 'source': 'D:\\innovative project\\data\\efinance (1).pdf', 'total_pages': 892, 'page': 2, 'page_label': '3'}, page_content='Contents\nEpigraph iii\nPreface to the Fourth Edition, by Warren E. Buffett\nANote About Benjamin Graham, by Ja
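
The first page in the output above has an empty page_content, so it adds nothing to retrieval. A minimal sketch of filtering such pages before splitting, assuming each page exposes page_content and metadata as shown; the Page class and drop_empty_pages helper are hypothetical stand-ins, not part of LangChain:

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Hypothetical stand-in for a loaded page; mirrors only the two fields used here."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def drop_empty_pages(pages):
    """Keep only pages whose text is non-empty after stripping whitespace."""
    return [p for p in pages if p.page_content.strip()]


# Toy input: one blank page, one title page, one whitespace-only page.
pages = [
    Page("", {"page": 0}),
    Page("THE \nINTELLIGENT\nINVESTOR", {"page": 1}),
    Page("   \n", {"page": 2}),
]
kept = drop_empty_pages(pages)
print([p.metadata["page"] for p in kept])
```

The same list comprehension should work on the loaded documents list, since it only reads the page_content attribute.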

In [11]:
# Text splitters are packaged separately as langchain_text_splitters in current LangChain releases.
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=20)

In [12]:
chunks = text_splitter.split_documents(documents=documents)

In [13]:
len(chunks)

3904
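
To build intuition for what chunk_size and chunk_overlap control, here is a simplified fixed-width splitter. It is only a sketch: RecursiveCharacterTextSplitter first tries to break on separators such as paragraph and line boundaries, which this version does not attempt, and the split_fixed name is hypothetical.

```python
def split_fixed(text: str, chunk_size: int = 500, chunk_overlap: int = 20) -> list:
    """Split text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


demo = split_fixed("abcdefghij", chunk_size=4, chunk_overlap=1)
print(demo)
```

Each chunk starts with the last character of the previous one, which is the role the 20-character overlap plays at the larger scale used in this notebook.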

In [14]:
chunks[3903:]


[Document(metadata={'producer': 'iLovePDF', 'creator': 'PyPDF', 'creationdate': '', 'moddate': '2025-03-20T14:14:04+00:00', 'source': 'D:\\innovative project\\data\\efinance (1).pdf', 'total_pages': 892, 'page': 891, 'page_label': '892'}, page_content='My Environment…\nThe six adults I spend the most time with are:')]

In [15]:
# FAISS is provided by langchain_community (see the class path printed in the next cell).
from langchain_community.vectorstores import FAISS
from langchain_google_genai import GoogleGenerativeAIEmbeddings

embeddings = GoogleGenerativeAIEmbeddings(model='models/embedding-001')
vectorstore = FAISS.from_documents(chunks, embeddings)

In [16]:
print(type(vectorstore))

<class 'langchain_community.vectorstores.faiss.FAISS'>


In [17]:
retriever = vectorstore.as_retriever(search_kwargs={'k':3})
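
The k in search_kwargs is the number of nearest chunks returned per query. A toy sketch of that idea with hand-made 2-D vectors, assuming Euclidean distance as the metric; real embeddings have far more dimensions, and FAISS uses an index rather than sorting every distance as done here:

```python
import math


def top_k(query, vectors, k=3):
    """Return the indices of the k vectors closest to query by Euclidean distance."""
    distances = [(math.dist(query, v), i) for i, v in enumerate(vectors)]
    distances.sort()  # smallest distance first
    return [i for _, i in distances[:k]]


# Made-up vectors standing in for chunk embeddings.
vectors = [(0.0, 0.0), (1.0, 1.0), (0.9, 1.1), (5.0, 5.0), (1.2, 0.8)]
nearest = top_k((1.0, 1.0), vectors, k=3)
print(nearest)
```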

In [18]:
from langchain_core.prompts import ChatPromptTemplate
template = """
Use the following context to answer the user's question. You may also draw on your own knowledge, but keep the response helpful. If you don't know the answer, say that you don't know; don't try to make up an answer.
Context: {context}
Question: {question}

Return only the helpful answer below and nothing else.
Do not include the phrase "based on the given context" in the answer.
Helpful answer:
"""
PROMPT = ChatPromptTemplate.from_template(template)

In [19]:
retriever.invoke("what is a stock")[0].page_content

'stock: alternatives to, 15; “delisting”\nof, 385n; direct purchase of,\n128–29; good and bad, 521n;\nmental value of, 474; and\nportfolio for defensive\ninvestors, 103, 104, 105; public\nattitude about, 19–20, 19–20n;\nturnover rate of, 37, 38, 247,\n266–67; “watered,” 312n. See\nalso common stock; preferred\nstock; specific stock or sector of\nstock\nstock/equity ratio, 285\nStock Guide(Standard & Poor’s), \n144, 169, 354, 383–87, 388, \n389, 391, 403, 433, 463, 575–76,\n578\nstock market: and “beating the'

In [20]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough, RunnableParallel

parser = StrOutputParser()
chain = (
    RunnableParallel({"context": retriever, "question": RunnablePassthrough()})
    | PROMPT
    | model
    | parser
)
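
In the chain above, the retriever's output (a list of Document objects) is placed into {context} as-is, so the prompt receives their printed form, metadata included. A minimal sketch of a helper that joins only the chunk text; Doc is a hypothetical stand-in for the retrieved documents and format_docs is an illustrative name:

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Hypothetical stand-in exposing only page_content, like the retrieved documents."""
    page_content: str


def format_docs(docs) -> str:
    """Join retrieved chunks into one context string, separated by blank lines."""
    return "\n\n".join(d.page_content for d in docs)


context = format_docs([Doc("first chunk"), Doc("second chunk")])
print(context)
```

One plausible way to use it is to replace retriever with retriever | format_docs in the "context" slot, since LangChain's pipe operator accepts plain functions; that wiring is a suggestion, not something the notebook ran.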

In [21]:
response = chain.invoke("what is a stock?")

In [22]:
from IPython.display import Markdown
Markdown(response)

Based on the provided text, a stock represents a share of ownership in a company.  The text discusses various aspects of stocks, including their purchase, valuation,  types (common and preferred),  market performance, and role in investment portfolios.  The text also mentions the concept of "delisting" of stocks and the turnover rate of stocks.

In [23]:
import gradio as gr

def ask_langchain(question):
    return chain.invoke(question)

# Create Gradio interface
iface = gr.Interface(
    fn=ask_langchain,
    inputs=gr.Textbox(lines=2, placeholder="Ask a question..."),
    outputs=gr.Textbox(label="Answer"),
    title="Efinance",
    description="Ask questions about finance: the management of money, including investing, borrowing, lending, budgeting, saving, and forecasting."
)

iface.launch(share=True)

* Running on local URL:  http://127.0.0.1:7860
* Running on public URL: https://959c705bf39c3ddcec.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
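
Because the app is exposed on a public URL, it may be worth guarding the handler against blank input and API failures. A sketch under stated assumptions: make_handler is a hypothetical wrapper, and fake_chain merely stands in for chain.invoke so the example runs without an API key.

```python
def make_handler(answer_fn):
    """Wrap answer_fn so blank input and runtime errors produce readable messages."""
    def handler(question: str) -> str:
        if not question or not question.strip():
            return "Please enter a question."
        try:
            return answer_fn(question.strip())
        except Exception as exc:  # show the failure in the UI instead of a stack trace
            return f"Sorry, something went wrong: {exc}"
    return handler


def fake_chain(question: str) -> str:
    """Stand-in for chain.invoke; raises on demand to exercise the error path."""
    if "fail" in question:
        raise RuntimeError("quota exceeded")
    return f"Answer to: {question}"


ask = make_handler(fake_chain)
print(ask("   "))
print(ask(" what is a stock? "))
print(ask("fail"))
```

In the notebook, passing make_handler(chain.invoke) as fn to gr.Interface would apply the same guards to the real chain.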


