# Simple Gen AI Application



In [1]:
import os
from dotenv import load_dotenv

load_dotenv()

# Re-export the keys loaded from .env. Skip any that are missing, because
# assigning None to os.environ raises a TypeError.
for key in ("OPENAI_API_KEY", "LANGCHAIN_API_KEY", "LANGCHAIN_PROJECT"):
    value = os.getenv(key)
    if value is not None:
        os.environ[key] = value

# LangSmith tracking
os.environ["LANGCHAIN_TRACING_V2"] = "true"

## Data Ingestion


First, we scrape the data from the website. The ingestion pipeline is:

`Load data --> Documents --> Split into text chunks --> Vector embeddings --> Vector store DB`

In [2]:
from langchain_community.document_loaders import WebBaseLoader

url = "https://docs.langchain.com/langsmith/observability-quickstart"

loader = WebBaseLoader(url)
print(loader)

USER_AGENT environment variable not set, consider setting it to identify your requests.


<langchain_community.document_loaders.web_base.WebBaseLoader object at 0x730615b29850>


In [3]:
docs = loader.load()
docs

[Document(metadata={'source': 'https://docs.langchain.com/langsmith/observability-quickstart', 'title': 'Tracing quickstart - Docs by LangChain', 'language': 'en'}, page_content='Tracing quickstart - Docs by LangChainOur new LangChain Academy course on Deep Agents is now live! Enroll for free.Docs by LangChain home pagePythonSearch...⌘KLangSmithPlatform for LLM observability and evaluationOverviewQuickstartsTrace an applicationEvaluate an applicationTest promptsAPI & SDKsAPI referencePython SDKJS/TS SDKPricingPlansPricing FAQOur new LangChain Academy course on Deep Agents is now live! Enroll for free.Docs by LangChain home pagePythonSearch...⌘KGitHubForumForumSearch...NavigationQuickstartsTracing quickstartGet startedObservabilityEvaluationPrompt engineeringSelf-hostingAdministrationGet startedObservabilityEvaluationPrompt engineeringSelf-hostingAdministrationGitHubForumOn this pagePrerequisites1. Create a directory and install dependencies2. Set up environment variables3. Define your 

## Divide our text into chunks

In [4]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
documents = text_splitter.split_documents(docs)
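
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of overlapping fixed-size windows in plain Python. It is a simplification for illustration only: `chunk_text` is a hypothetical helper, and `RecursiveCharacterTextSplitter` itself splits on separators rather than at fixed offsets, so its chunk boundaries will differ.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size windows that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_text(sample, chunk_size=10, chunk_overlap=4)
print(chunks)
```

Each chunk repeats the last four characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.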

In [5]:
documents[:5]

[Document(metadata={'source': 'https://docs.langchain.com/langsmith/observability-quickstart', 'title': 'Tracing quickstart - Docs by LangChain', 'language': 'en'}, page_content='Tracing quickstart - Docs by LangChainOur new LangChain Academy course on Deep Agents is now live! Enroll for free.Docs by LangChain home pagePythonSearch...⌘KLangSmithPlatform for LLM observability and evaluationOverviewQuickstartsTrace an applicationEvaluate an applicationTest promptsAPI & SDKsAPI referencePython SDKJS/TS SDKPricingPlansPricing FAQOur new LangChain Academy course on Deep Agents is now live! Enroll for free.Docs by LangChain home pagePythonSearch...⌘KGitHubForumForumSearch...NavigationQuickstartsTracing quickstartGet startedObservabilityEvaluationPrompt engineeringSelf-hostingAdministrationGet startedObservabilityEvaluationPrompt engineeringSelf-hostingAdministrationGitHubForumOn this pagePrerequisites1. Create a directory and install dependencies2. Set up environment variables3. Define your 

## Convert chunks into vectors using embeddings and store them in a vector database

In [6]:
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()

embeddings

OpenAIEmbeddings(client=<openai.resources.embeddings.Embeddings object at 0x7304de043710>, async_client=<openai.resources.embeddings.AsyncEmbeddings object at 0x7304dd743c50>, model='text-embedding-ada-002', dimensions=None, deployment='text-embedding-ada-002', openai_api_version=None, openai_api_base=None, openai_api_type=None, openai_proxy=None, embedding_ctx_length=8191, openai_api_key=SecretStr('**********'), openai_organization=None, allowed_special=None, disallowed_special=None, chunk_size=1000, max_retries=2, request_timeout=None, headers=None, tiktoken_enabled=True, tiktoken_model_name=None, show_progress_bar=False, model_kwargs={}, skip_empty=False, default_headers=None, default_query=None, retry_min_seconds=4, retry_max_seconds=20, http_client=None, http_async_client=None, check_embedding_ctx_length=True)

In [7]:
from langchain_community.vectorstores import FAISS

vectorstore_db = FAISS.from_documents(documents, embeddings)

In [8]:
vectorstore_db

<langchain_community.vectorstores.faiss.FAISS at 0x7304dc4abf20>

In [None]:
# Query the vector store for the chunks most similar to the text below
query = " Each request generates a trace, which captures the full record of what happened. "
result = vectorstore_db.similarity_search(query)
result[0].page_content

'LangSmith addresses this by providing end-to-end visibility into how your application handles a request. Each request generates a trace, which captures the full record of what happened. Within a trace are individual runs, the specific operations your application performed, such as an LLM call or a retrieval step. Tracing runs allows you to inspect, debug, and validate your application’s behavior.\nIn this quickstart, you will set up a minimal Retrieval Augmented Generation (RAG) application and add tracing with LangSmith. You will:'
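
Conceptually, a similarity search embeds the query and ranks the stored chunks by how close their vectors are to it. The sketch below shows that ranking with hand-written three-dimensional vectors. Everything in it is an assumption for illustration: the vectors are not real embeddings, `top_k` is a hypothetical helper, and cosine similarity is used for simplicity (the FAISS index may be configured with a different metric, such as L2 distance).

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], indexed: list[tuple[str, list[float]]], k: int = 4) -> list[str]:
    """Return the texts of the k entries most similar to query_vec."""
    scored = sorted(
        indexed,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in scored[:k]]


# Toy "index": (chunk text, made-up embedding)
toy_index = [
    ("chunk about tracing", [0.9, 0.1, 0.0]),
    ("chunk about pricing", [0.0, 0.2, 0.9]),
    ("chunk about evaluation", [0.3, 0.9, 0.1]),
]
toy_query = [1.0, 0.2, 0.0]
best = top_k(toy_query, toy_index, k=2)
print(best)
```

In the notebook, `similarity_search` plays the role of `top_k`, and `result[0]` is the best-ranked chunk.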

## Retrieval Chain and Document Chain

`Chain` - A chain is a sequence of steps in which inputs and outputs flow between an LLM, prompt templates, retrievers, and so on.

`Retrieval Chain` - The interface between the user input and the vector DB: it retrieves the documents relevant to the input and passes them to the LLM.

`Document Chain` - A document chain is a special type of chain designed to handle multiple documents at once and combine them into a single response.

`stuff_documents_chain` - It takes a list of documents (retrieved from a vector store or elsewhere) and "stuffs" all of their contents together into a single prompt string. That combined string is then passed to the LLM.
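
The "stuffing" step can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not LangChain's implementation: `stuff_documents`, `toy_template`, and `toy_pages` are hypothetical names, and the blank-line separator is a choice made for this sketch.

```python
def stuff_documents(page_contents: list[str], template: str, separator: str = "\n\n") -> str:
    """Join all document texts and place them in the template's {context} slot."""
    context = separator.join(page_contents)
    return template.format(context=context)


toy_template = "Answer using only this context:\n<context>\n{context}\n</context>"
toy_pages = ["First retrieved chunk.", "Second retrieved chunk."]
stuffed = stuff_documents(toy_pages, toy_template)
print(stuffed)
```

Because every document lands in one prompt, this approach only works while the combined text fits in the model's context window.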

In [13]:
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o")

In [None]:
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template(
    """ Answer the following question based only on the provided context:
    <context>
    {context}
    </context>
     """
)

document_chain = create_stuff_documents_chain(llm, prompt)
document_chain

RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
  context: RunnableLambda(format_docs)
}), kwargs={}, config={'run_name': 'format_inputs'}, config_factories=[])
| ChatPromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, template=' Answer the following question based only on the provided context:\n    <context>\n    {context}\n    </context>\n     '), additional_kwargs={})])
| ChatOpenAI(client=<openai.resources.chat.completions.completions.Completions object at 0x7304d2f765a0>, async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x7304d2fcaab0>, root_client=<openai.OpenAI object at 0x7304dc590980>, root_async_client=<openai.AsyncOpenAI object at 0x7304d2fcae10>, model_name='gpt-4o', model_kwargs={}, openai_api_key=SecretStr('**********'))
| StrOutputParser(), kwargs={}, config={'

In [17]:
from langchain_core.documents import Document
document_chain.invoke({
    "input": "you will set up a minimal Retrieval Augmented Generation",
    "context": [Document(page_content='LangSmith addresses this by providing end-to-end visibility into how your application handles a request. Each request generates a trace, which captures the full record of what happened. Within a trace are individual runs, the specific operations your application performed, such as an LLM call or a retrieval step. Tracing runs allows you to inspect, debug, and validate your application’s behavior.\nIn this quickstart, you will set up a minimal Retrieval Augmented Generation (RAG) application and add tracing with LangSmith. You will:'
    )]
})

'What steps will I take in the quickstart for setting up the RAG application with LangSmith tracing?'

However, we want the documents to come from a retriever built on the vector store we just set up. That way, the retriever dynamically selects the most relevant documents for a given question and passes them in.

In [19]:
from langchain.chains import create_retrieval_chain

retriever = vectorstore_db.as_retriever()
retrieval_chain = create_retrieval_chain(retriever, document_chain)

In LangChain, we combine a retriever with a document chain to form a retrieval chain because each serves a different purpose in the RAG pipeline. The retriever’s job is to search a knowledge base (like a vector store) and return the most relevant documents for a given query, while the document chain’s job is to take those documents along with the user’s question, format them into a prompt, and pass them to the LLM to generate a final answer. By combining them, we get an end-to-end system where the retriever finds the right context and the document chain ensures the LLM uses that context effectively, enabling accurate and context-aware responses.
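
The division of labor described above can be sketched with two plain functions. This is a toy model, not LangChain's code: `make_retrieval_chain`, `toy_retrieve`, and `toy_combine` are hypothetical stand-ins, the retriever does keyword matching instead of vector search, and no LLM is called. The output keys `input`, `context`, and `answer` mirror the response shape the notebook's retrieval chain returns.

```python
from typing import Callable


def make_retrieval_chain(
    retrieve: Callable[[str], list[str]],
    combine_docs: Callable[[dict], str],
) -> Callable[[dict], dict]:
    """Compose a retriever and a document-combining step into one callable."""
    def chain(inputs: dict) -> dict:
        docs = retrieve(inputs["input"])                     # step 1: find context
        answer = combine_docs({**inputs, "context": docs})   # step 2: answer from it
        return {**inputs, "context": docs, "answer": answer}
    return chain


def toy_retrieve(query: str) -> list[str]:
    """Stand-in retriever: keep documents that contain any query term."""
    corpus = [
        "Each request generates a trace.",
        "Pricing plans are listed separately.",
    ]
    terms = query.lower().split()
    return [doc for doc in corpus if any(term in doc.lower() for term in terms)]


def toy_combine(inputs: dict) -> str:
    """Stand-in for the document chain: report what it received, no LLM call."""
    return f"{len(inputs['context'])} document(s) used for: {inputs['input']}"


toy_chain = make_retrieval_chain(toy_retrieve, toy_combine)
toy_response = toy_chain({"input": "trace"})
print(toy_response)
```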

In [26]:
# Get the response from the LLM
response = retrieval_chain.invoke({"input": "Summarize the entire page"})
print(response['answer'])

Was the page described in the context helpful?
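
Both replies so far read like follow-up questions rather than answers. A likely cause is that the prompt template contains only a `{context}` placeholder (the chain repr lists `input_variables=['context']`), so the value passed under `"input"` never reaches the model. The sketch below illustrates the effect with plain `str.format` standing in for the prompt template. That substitution is an assumption made for illustration, and the `Question: {input}` line is a suggested addition, not part of the original notebook.

```python
template_without_question = (
    "Answer the following question based only on the provided context:\n"
    "<context>\n{context}\n</context>"
)
# Suggested variant: give the question its own placeholder.
template_with_question = template_without_question + "\nQuestion: {input}"

values = {
    "context": "Each request generates a trace.",
    "input": "What does each request generate?",
}

# str.format ignores unused keyword arguments, so the question is silently dropped.
rendered_without = template_without_question.format(**values)
rendered_with = template_with_question.format(**values)

print(values["input"] in rendered_without)
print(values["input"] in rendered_with)
```

If that diagnosis holds, adding a `{input}` placeholder to the notebook's prompt and re-running the cells should produce an actual answer.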


In [27]:
response

{'input': 'Summarize the entire page',
 'context': [Document(id='e94566f0-4da7-4562-a304-60d9be4b2faa', metadata={'source': 'https://docs.langchain.com/langsmith/observability-quickstart', 'title': 'Tracing quickstart - Docs by LangChain', 'language': 'en'}, page_content='\u200bVideo guide\nWas this page helpful?YesNoOverviewEvaluate an applicationAssistantResponses are generated using AI and may contain mistakes.Docs by LangChain home pagegithubxlinkedinyoutubeResourcesChangelogLangChain AcademyTrust CenterCompanyAboutCareersBloggithubxlinkedinyoutubePowered by Mintlify'),
  Document(id='8b0117b4-0d35-4c0a-ba2d-71bb28712a2d', metadata={'source': 'https://docs.langchain.com/langsmith/observability-quickstart', 'title': 'Tracing quickstart - Docs by LangChain', 'language': 'en'}, page_content='variables3. Define your application4. Trace LLM calls5. Trace an entire applicationNext stepsVideo guideQuickstartsTracing quickstartCopy pageCopy pageObservability is a critical requirement for a