## Simple GenAI App using LangChain

In [1]:
import os
from dotenv import load_dotenv
load_dotenv()

os.environ['OPENAI_API_KEY'] = os.getenv("OPENAI_API_KEY")
## LangSmith tracing
os.environ['LANGCHAIN_API_KEY'] = os.getenv("LANGCHAIN_API_KEY")
os.environ['LANGCHAIN_TRACING_V2'] = "true"
os.environ['LANGCHAIN_PROJECT'] = os.getenv("LANGCHAIN_PROJECT")
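
Assigning `os.getenv(...)` straight into `os.environ` raises a `TypeError` when the variable is missing from the `.env` file, because environment values must be strings. A minimal sketch of a pre-flight check is shown below; the helper name `missing_vars` and the `REQUIRED` tuple are hypothetical and not part of the notebook.

```python
import os

# Names the setup cell expects to find after load_dotenv() has run.
REQUIRED = ("OPENAI_API_KEY", "LANGCHAIN_API_KEY", "LANGCHAIN_PROJECT")

def missing_vars(names, env=None):
    """Return the names that are absent or empty in the given environment mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]

missing = missing_vars(REQUIRED)
if missing:
    # Report the problem up front instead of failing later with a TypeError.
    print("Missing environment variables:", ", ".join(missing))
```

Running a check like this before the assignments makes a misconfigured `.env` file easier to diagnose.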

In [2]:
## Data Ingestion -- scrape the data from the website

from langchain_community.document_loaders import WebBaseLoader

docs = WebBaseLoader("https://python.langchain.com/v0.2/docs/integrations/llms/").load()

USER_AGENT environment variable not set, consider setting it to identify your requests.
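
The warning above is informational and the load still succeeds. One way to silence it, sketched here under the assumption that the variable is read when the loader module is imported, is to define `USER_AGENT` before the import cell runs. The value below is a placeholder, not something the notebook specifies.

```python
import os

# Hypothetical identifier; replace it with a string that describes your application.
# setdefault() leaves any value already provided by the shell or the .env file untouched.
os.environ.setdefault("USER_AGENT", "simple-genai-app/0.1")
```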


In [3]:
## Data Chunking 

from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size = 1000, chunk_overlap = 200)
final_document = text_splitter.split_documents(docs)
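
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a simplified fixed-window splitter in plain Python. It is not the algorithm `RecursiveCharacterTextSplitter` uses (that one prefers to break on separators such as paragraph and line breaks); it only illustrates how consecutive chunks share `chunk_overlap` characters. The function name `sliding_chunks` is hypothetical.

```python
def sliding_chunks(text, chunk_size=1000, chunk_overlap=200):
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks

print(sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2))
# → ['abcd', 'cdef', 'efgh', 'ghij']
```

Each chunk repeats the last two characters of the previous one, which is the same idea as the 200-character overlap configured above: text that straddles a boundary still appears whole in at least one chunk.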


In [4]:
## Embeddings and vector store

from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS

embeddings = OllamaEmbeddings(model="gemma2:2b")
db = FAISS.from_documents(final_document,embeddings)


In [5]:
## Query the vector store

query = "Streaming support defaults"

results = db.similarity_search(query)

results[0].page_content

"If you'd like to contribute an integration, see Contributing integrations.Features (natively supported)\u200bAll LLMs implement the Runnable interface, which comes with default implementations of all methods, ie. ainvoke, batch, abatch, stream, astream. This gives all LLMs basic support for async, streaming and batch, which by default is implemented as below:Async support defaults to calling the respective sync method in asyncio's default thread pool executor. This lets other async functions in your application make progress while the LLM is being executed, by moving this call to a background thread.Streaming support defaults to returning an Iterator (or AsyncIterator in the case of async streaming) of a single value, the final result returned by the underlying LLM provider. This obviously doesn't give you token-by-token streaming, which requires native support from the LLM provider, but ensures your code that expects an iterator of tokens can work for any of our LLM integrations.Batc
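
Conceptually, `similarity_search` embeds the query and returns the stored chunks whose vectors lie closest to it. The toy sketch below shows that idea with hand-written 2-D vectors and Euclidean distance. Real embeddings have many more dimensions and FAISS uses an optimized index, so the function `top_k` and the vectors are illustrative assumptions only.

```python
import math

def top_k(query_vec, indexed, k=2):
    """Return the k texts whose vectors are closest to query_vec (Euclidean distance)."""
    ranked = sorted(indexed, key=lambda item: math.dist(query_vec, item[1]))
    return [text for text, _ in ranked[:k]]

# Toy "embeddings": (text, vector) pairs invented for this example.
indexed = [
    ("streaming defaults", (0.9, 0.1)),
    ("async defaults", (0.7, 0.4)),
    ("batch defaults", (0.1, 0.9)),
]

print(top_k((1.0, 0.0), indexed, k=2))
# → ['streaming defaults', 'async defaults']
```

In the notebook itself, the number of returned chunks can be limited with the `k` argument, for example `db.similarity_search(query, k=2)`.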

In [6]:
## Retrieval Chain, Document Chain

from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate
from langchain_community.llms import Ollama
from langchain_core.documents import Document

llm = Ollama(model="gemma2:2b")

prompt = ChatPromptTemplate.from_template(
    """
Answer the following question based only on the provided context:
<context>
{context}
</context>
"""
) 

document_chain = create_stuff_documents_chain(llm,prompt)

document_chain.invoke({
    "input":"Streaming support defaults",
    "context": [Document(page_content="to returning an Iterator (or AsyncIterator in the case of async streaming) of a single value, the final result returned by the underlying LLM provider. This obviously doesn't give you token-by-token streaming, which requires native support from the LLM provider, but ensures your code that expects an iterator of tokens can work for any of our LLM integrations.")]
})



"The context explains how to get a single value from an LLM provider using iterators. \n\n\n**Here's a breakdown:**\n\n* **Iterator:**  A way to fetch values one at a time, similar to reading data line-by-line from a file.\n* **LLM Provider:** The software you use (like ChatGPT) that actually processes the language input and generates text or data. \n\n\n**Key points:**\n\n* **Single Value Output:** The final output of the LLM is usually a single value, not individual tokens like in real-time streaming.  \n* **Iterator Support:**  This context emphasizes that using an iterator will work with any LLM provider that offers this kind of integration. \n\n\nLet me know if you'd like more information about iterators or how they relate to the process of working with LLMs!"
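
Note that the template above contains only a `{context}` placeholder, so the `input` value passed to `invoke` is never rendered into the prompt. That may explain why the model summarizes the context instead of answering a question. The sketch below uses plain `str.format` as a stand-in for the prompt template (an analogy, not LangChain's implementation) to show the difference. The `Question: {input}` line is a suggested addition, not part of the original notebook.

```python
# Template equivalent to the one in the cell above: context only.
CONTEXT_ONLY = (
    "Answer the following question based only on the provided context:\n"
    "<context>\n{context}\n</context>\n"
)
# Suggested variant (assumption): also render the user's question.
WITH_QUESTION = CONTEXT_ONLY + "Question: {input}\n"

values = {
    "context": "Streaming defaults to a single-value iterator.",
    "input": "Streaming support defaults",
}

# str.format silently ignores keyword arguments the template does not reference.
rendered_without = CONTEXT_ONLY.format(**values)
rendered_with = WITH_QUESTION.format(**values)

print(values["input"] in rendered_without)  # → False
print(values["input"] in rendered_with)     # → True
```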

However, we want the documents to come from the retriever we just set up. That way, the retriever can dynamically select the most relevant documents for a given question and pass them in as context.

In [8]:
## Input --> Retriever --> vectorstoredb

from langchain.chains import create_retrieval_chain

retriver = db.as_retriever()
retriver_chain = create_retrieval_chain(retriver,document_chain)

retriver_chain

RunnableBinding(bound=RunnableAssign(mapper={
  context: RunnableBinding(bound=RunnableLambda(lambda x: x['input'])
           | VectorStoreRetriever(tags=['FAISS', 'OllamaEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x7fe268893980>), config={'run_name': 'retrieve_documents'})
})
| RunnableAssign(mapper={
    answer: RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
              context: RunnableLambda(format_docs)
            }), config={'run_name': 'format_inputs'})
            | ChatPromptTemplate(input_variables=['context'], messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], template='\nAnswe the following question based only on the provided context:\n<context>\n{context}\n</context>\n'))])
            | Ollama(model='gemma2:2b')
            | StrOutputParser(), config={'run_name': 'stuff_documents_chain'})
  }), config={'run_name': 'retrieval_chain'})
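
The printed structure can be read as two assignment steps: `context` is filled by passing `input` to the retriever, and `answer` is then produced by formatting those documents into the prompt and calling the model. A plain-Python sketch of that data flow follows, with stub functions standing in for the retriever and the LLM. All names here (`fake_retriever`, `fake_llm`, `retrieval_chain`) are hypothetical.

```python
def fake_retriever(query):
    """Stub retriever: keyword match over a tiny in-memory corpus."""
    corpus = [
        "Streaming support defaults to returning an Iterator of a single value.",
        "Async support defaults to calling the sync method in a thread pool.",
    ]
    words = query.lower().split()
    return [doc for doc in corpus if any(word in doc.lower() for word in words)]

def fake_llm(prompt):
    """Stub model: reports the prompt size instead of generating text."""
    return f"(model reply to {len(prompt)} prompt characters)"

def retrieval_chain(inputs, retriever, llm):
    state = dict(inputs)
    # Step 1: assign 'context' from the retriever, keyed on the user's input.
    state["context"] = retriever(state["input"])
    # Step 2: stuff the documents into one prompt and assign the model's 'answer'.
    joined = "\n\n".join(state["context"])
    state["answer"] = llm(f"<context>\n{joined}\n</context>")
    return state

result = retrieval_chain({"input": "streaming"}, fake_retriever, fake_llm)
print(sorted(result))
# → ['answer', 'context', 'input']
```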

In [11]:
## Get the response from the LLM

response = retriver_chain.invoke({"input":"Streaming support defaults"})

response

{'input': 'Streaming support defaults',
 'context': [Document(metadata={'source': 'https://python.langchain.com/v0.2/docs/integrations/llms/', 'title': 'LLMs | 🦜️🔗 LangChain', 'description': "If you'd like to write your own LLM, see this how-to.", 'language': 'en'}, page_content="If you'd like to contribute an integration, see Contributing integrations.Features (natively supported)\u200bAll LLMs implement the Runnable interface, which comes with default implementations of all methods, ie. ainvoke, batch, abatch, stream, astream. This gives all LLMs basic support for async, streaming and batch, which by default is implemented as below:Async support defaults to calling the respective sync method in asyncio's default thread pool executor. This lets other async functions in your application make progress while the LLM is being executed, by moving this call to a background thread.Streaming support defaults to returning an Iterator (or AsyncIterator in the case of async streaming) of a singl
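
The returned dictionary carries the original `input`, the retrieved `context` documents, and the generated text under the `answer` key. A minimal sketch of pulling out the answer together with the source URLs is shown below. `Doc` is a stand-in for LangChain's `Document`, `summarize_response` is a hypothetical helper, and the answer string is a placeholder rather than real model output.

```python
from dataclasses import dataclass, field

@dataclass
class Doc:
    """Minimal stand-in for a LangChain Document (page_content plus metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def summarize_response(response):
    """Return the answer text and the de-duplicated, sorted source URLs."""
    sources = sorted({doc.metadata.get("source", "unknown") for doc in response["context"]})
    return response["answer"], sources

# Placeholder data shaped like the chain's output; not an actual model response.
demo_response = {
    "input": "Streaming support defaults",
    "context": [
        Doc("chunk one", {"source": "https://python.langchain.com/v0.2/docs/integrations/llms/"}),
        Doc("chunk two", {"source": "https://python.langchain.com/v0.2/docs/integrations/llms/"}),
    ],
    "answer": "placeholder answer text",
}

answer, sources = summarize_response(demo_response)
print(answer)
print(sources)
```

With the real chain, the same access pattern would be `response["answer"]` and `doc.metadata["source"]` for each document in `response["context"]`.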