Simple Gen AI App Using LangChain

In [1]:
import os
from dotenv import load_dotenv
load_dotenv()

# load_dotenv() has already copied the .env entries into os.environ.
# Fail early if a key is missing: assigning None to os.environ raises TypeError.
for key in ("HF_TOKEN", "LANGCHAIN_API_KEY", "LANGCHAIN_PROJECT", "GROQ_API_KEY"):
    if not os.getenv(key):
        raise RuntimeError(f"Missing environment variable: {key}")

## LangSmith tracing
os.environ["LANGCHAIN_TRACING_V2"] = "true"

In [2]:
## Data ingestion: scrape the content of a web page

from langchain_community.document_loaders import WebBaseLoader
url = "https://docs.smith.langchain.com/administration/tutorials/manage_spend"

loader = WebBaseLoader(url)
loader

USER_AGENT environment variable not set, consider setting it to identify your requests.


<langchain_community.document_loaders.web_base.WebBaseLoader at 0x17daeafd6a0>
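The USER_AGENT warning above can probably be silenced by setting the variable before the loader module is imported. A minimal sketch, assuming the warning is raised at import time and that any descriptive string is acceptable; the value `simple-genai-app/0.1` is a made-up identifier, not something from this notebook:

```python
import os

# Hypothetical identifier; replace it with a string that describes your app.
# setdefault keeps any USER_AGENT that is already defined in the environment.
os.environ.setdefault("USER_AGENT", "simple-genai-app/0.1")

print(os.environ["USER_AGENT"])
```

Run this in a cell that executes before `from langchain_community.document_loaders import WebBaseLoader`.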

In [3]:
url_content = loader.load()
url_content

[Document(metadata={'source': 'https://docs.smith.langchain.com/administration/tutorials/manage_spend', 'title': 'Manage billing in your account - Docs by LangChain', 'language': 'en'}, page_content="Manage billing in your account - Docs by LangChainSkip to main content🚀 Share how you're building agents for a chance to win LangChain swag!Docs by LangChain home pageLangSmithSearch...⌘KAsk AIGitHubTry LangSmithTry LangSmithSearch...NavigationAccount administrationManage billing in your accountGet startedObservabilityEvaluationPrompt engineeringDeploymentAgent BuilderPlatform setupOverviewPlansCreate an account and API keyAccount administrationOverviewSet up a workspaceManage organizations using the APIManage billingSet up resource tagsUser managementReferenceLangSmith Python SDKLangSmith JS/TS SDKLangGraph Python SDKLangGraph JS/TS SDKLangSmith APIAPI reference for LangSmith DeploymentAdditional resourcesReleases & changelogsData managementAccess control & AuthenticationScalability 

In [4]:
## Divide the data into chunks for processing

from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
documents = text_splitter.split_documents(url_content)
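To see how `chunk_size=1000` and `chunk_overlap=200` interact, here is a simplified sliding-window sketch in plain Python. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraphs and newlines); the function name `sliding_window_chunks` is hypothetical and only illustrates the size and overlap arithmetic:

```python
def sliding_window_chunks(text, chunk_size=1000, chunk_overlap=200):
    """Cut text into fixed-size windows where each window repeats the
    last chunk_overlap characters of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - chunk_overlap  # how far each new window advances
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]


sample = "".join(str(i % 10) for i in range(2600))
chunks = sliding_window_chunks(sample)
print(len(chunks), [len(c) for c in chunks])
```

Each chunk shares its first 200 characters with the end of the previous chunk, so a sentence that straddles a boundary still appears whole in at least one chunk.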

In [5]:
documents

[Document(metadata={'source': 'https://docs.smith.langchain.com/administration/tutorials/manage_spend', 'title': 'Manage billing in your account - Docs by LangChain', 'language': 'en'}, page_content="Manage billing in your account - Docs by LangChainSkip to main content🚀 Share how you're building agents for a chance to win LangChain swag!Docs by LangChain home pageLangSmithSearch...⌘KAsk AIGitHubTry LangSmithTry LangSmithSearch...NavigationAccount administrationManage billing in your accountGet startedObservabilityEvaluationPrompt engineeringDeploymentAgent BuilderPlatform setupOverviewPlansCreate an account and API keyAccount administrationOverviewSet up a workspaceManage organizations using the APIManage billingSet up resource tagsUser managementReferenceLangSmith Python SDKLangSmith JS/TS SDKLangGraph Python SDKLangGraph JS/TS SDKLangSmith APIAPI reference for LangSmith DeploymentAdditional resourcesReleases & changelogsData managementAccess control & AuthenticationScalability 

In [6]:
## Converting the text into embeddings

from langchain_huggingface import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")

In [7]:
## Storing the embeddings in a vector database

from langchain_community.vectorstores import FAISS
vectorstoredb = FAISS.from_documents(documents, embeddings)

In [8]:
vectorstoredb

<langchain_community.vectorstores.faiss.FAISS at 0x17dc32551f0>

In [9]:
from langchain_groq import ChatGroq

llm = ChatGroq(
    model="llama-3.1-8b-instant",      # cheaper & fast
    # or "llama-3.1-70b-versatile"
    temperature=0.3,
)


In [10]:
## Querying the vector database

query = "LangSmith has two usage limits: total traces and extended"
result = vectorstoredb.similarity_search(query)
result[0].page_content

'For organizations with multiple workspaces only: For simplicity, LangSmith incorporates the free traces into the cost calculation of the first workspace only. In actuality, the free traces can be “consumed” by any workspace. Therefore, although workspace-level spend limits are approximate for multi-workspace organizations, the organization-level spend limit is absolute.\n\u200bConfigure trace tier distrubution\nLangSmith has two trace tiers: base traces and extended traces. Base traces have the base retention and are short-lived (14 days), while extended traces have extended retention and are long-lived (400 days). For more information, refer to the data retention conceptual docs.'
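Conceptually, `similarity_search` embeds the query and returns the chunks whose vectors lie closest to it. A toy sketch of that idea, assuming Euclidean (L2) distance and hand-written two-dimensional vectors; the names `l2_distance`, `top_k`, and `toy_index` are hypothetical, and a real FAISS index uses far more efficient search:

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vec, indexed, k=4):
    """Return the texts of the k entries closest to query_vec."""
    ranked = sorted(indexed, key=lambda item: l2_distance(query_vec, item[1]))
    return [text for text, _ in ranked[:k]]


# Made-up (text, vector) pairs standing in for embedded chunks.
toy_index = [
    ("trace tiers", [0.9, 0.1]),
    ("billing", [0.2, 0.8]),
    ("workspaces", [0.5, 0.5]),
]

nearest = top_k([1.0, 0.0], toy_index, k=2)
print(nearest)  # -> ['trace tiers', 'workspaces']
```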

In [11]:
import sys
sys.executable

'c:\\Users\\ASUS\\OneDrive\\Desktop\\Agentic-AI\\.venv\\Scripts\\python.exe'

In [12]:
## Retrieval Chain

from langchain_classic.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template(
    """
Answer the following question based only on the context provided:
<context>
{context}
</context>
"""
)

# Pass the LLM instance (llm), not embeddings
document_chain = create_stuff_documents_chain(llm=llm, prompt=prompt)
document_chain

RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
  context: RunnableLambda(format_docs)
}), kwargs={}, config={'run_name': 'format_inputs'}, config_factories=[])
| ChatPromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, template='\nAnswer the following question based only on the context provided:\n<context>\n{context}\n</context>\n'), additional_kwargs={})])
| ChatGroq(profile={'max_input_tokens': 131072, 'max_output_tokens': 8192, 'image_inputs': False, 'audio_inputs': False, 'video_inputs': False, 'image_outputs': False, 'audio_outputs': False, 'video_outputs': False, 'reasoning_output': False, 'tool_calling': True}, client=<groq.resources.chat.completions.Completions object at 0x0000017DB018A090>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x0000017DC33EB860>, model_name='llama-3.1

In [None]:
from langchain_core.documents import Document
document_chain.invoke({
    "input": "LangSmith has two usage limits: total traces and extended",
    "context": [Document(page_content="LangSmith has two usage limits: total traces and extended traces. These correspond to the two metrics we've been tracking on our usage graph.")],
})

'It appears that LangSmith has two usage limits: \n\n1. Total traces\n2. Extended traces'
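Note that the prompt template above declares only `{context}` (the chain repr lists `input_variables=['context']`), so the `"input"` value passed to `invoke` never reaches the model. A minimal sketch of a template that also carries the question, written with plain `str.format` so it runs without LangChain; the `Question: {input}` line and the helper `stuff_prompt` are assumptions, and joining documents with a blank line is only an approximation of what a "stuff" chain does:

```python
PROMPT_WITH_QUESTION = """
Answer the following question based only on the context provided:
<context>
{context}
</context>

Question: {input}
"""


def stuff_prompt(question, page_contents):
    """Concatenate all document texts into one context block and fill the template."""
    context = "\n\n".join(page_contents)
    return PROMPT_WITH_QUESTION.format(context=context, input=question)


rendered = stuff_prompt(
    "What are the two usage limits?",
    ["LangSmith has two usage limits: total traces and extended traces."],
)
print(rendered)
```

The same template text could be passed to `ChatPromptTemplate.from_template`, which would then expect both `context` and `input` when the chain is invoked.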

In [36]:
## Input --> Retriever --> Vector DB

from langchain_classic.chains import create_retrieval_chain

retriever = vectorstoredb.as_retriever()
retrieval_chain = create_retrieval_chain(retriever, document_chain)

In [37]:
retrieval_chain

RunnableBinding(bound=RunnableAssign(mapper={
  context: RunnableBinding(bound=RunnableLambda(lambda x: x['input'])
           | VectorStoreRetriever(tags=['FAISS', 'HuggingFaceEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x0000017DC32551F0>, search_kwargs={}), kwargs={}, config={'run_name': 'retrieve_documents'}, config_factories=[])
})
| RunnableAssign(mapper={
    answer: RunnableBinding(bound=RunnableBinding(bound=RunnableAssign(mapper={
              context: RunnableLambda(format_docs)
            }), kwargs={}, config={'run_name': 'format_inputs'}, config_factories=[])
            | ChatPromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context'], input_types={}, partial_variables={}, template='\nAnswer the following question based only on the context provided:\n<context>\n{context}\n</context>\n'), additional_kwargs={})])
            |

In [39]:
## Get the response from the LLM
response = retrieval_chain.invoke({"input": "LangSmith has two usage limits: total traces and extended"})
response['answer']

'Based on the provided context, here are the answers to your questions:\n\n1. How do free traces work in multi-workspace organizations?\n\nFor simplicity, LangSmith incorporates the free traces into the cost calculation of the first workspace only. In actuality, the free traces can be "consumed" by any workspace. Therefore, although workspace-level spend limits are approximate for multi-workspace organizations, the organization-level spend limit is absolute.\n\n2. What are the two trace tiers in LangSmith?\n\nLangSmith has two trace tiers: \n- Base traces: have the base retention and are short-lived (14 days)\n- Extended traces: have extended retention and are long-lived (400 days)\n\n3. How do I set limits on usage in LangSmith?\n\nTo set limits, navigate to Settings -> Billing and Usage -> Usage limits. Input a spend limit for your selected workspace, and LangSmith will determine an appropriate number of base and extended trace limits to match that spend.\n\n4. How do I set the defau

In [40]:
response

{'input': 'LangSmith has two usage limits: total traces and extended',
 'context': [Document(id='45d7b7bd-2e11-4c64-8564-fdf0685d8297', metadata={'source': 'https://docs.smith.langchain.com/administration/tutorials/manage_spend', 'title': 'Manage billing in your account - Docs by LangChain', 'language': 'en'}, page_content='For organizations with multiple workspaces only: For simplicity, LangSmith incorporates the free traces into the cost calculation of the first workspace only. In actuality, the free traces can be “consumed” by any workspace. Therefore, although workspace-level spend limits are approximate for multi-workspace organizations, the organization-level spend limit is absolute.\n\u200bConfigure trace tier distrubution\nLangSmith has two trace tiers: base traces and extended traces. Base traces have the base retention and are short-lived (14 days), while extended traces have extended retention and are long-lived (400 days). For more information, refer to the data retenti
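The response dictionary reflects the flow of the retrieval chain: the `input` is sent to the retriever, the retrieved documents become `context`, and the document chain's output is stored under `answer`. A rough sketch of that data flow with stub functions in place of the vector store and the LLM; `make_retrieval_chain`, `fake_retrieve`, and `fake_answer` are hypothetical names, not LangChain APIs:

```python
def make_retrieval_chain(retrieve, answer_from_docs):
    """Build an invoke(inputs) function that mirrors input -> context -> answer."""
    def invoke(inputs):
        docs = retrieve(inputs["input"])            # step 1: look up documents
        with_context = {**inputs, "context": docs}  # step 2: attach them as context
        answer = answer_from_docs(with_context)     # step 3: generate the answer
        return {**with_context, "answer": answer}
    return invoke


def fake_retrieve(query):
    # Stand-in for the FAISS retriever.
    return ["LangSmith has two trace tiers: base traces and extended traces."]


def fake_answer(inputs):
    # Stand-in for the LLM call made by the document chain.
    return f"Answered '{inputs['input']}' using {len(inputs['context'])} document(s)."


chain = make_retrieval_chain(fake_retrieve, fake_answer)
result = chain({"input": "What are the two trace tiers?"})
print(sorted(result))  # -> ['answer', 'context', 'input']
```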