# **Task 1: Setting up the Environment (Windows)**

**Step 1:** Create a Virtual Environment

Open Command Prompt or PowerShell.

Navigate to the directory where you want to create your project.

Run the following command to create a virtual environment:

**python -m venv langchain_env**


To activate the virtual environment, run:

**langchain_env\Scripts\activate**
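
To confirm that the activated environment is the one actually running Python, a quick check can compare `sys.prefix` with `sys.base_prefix`. This is a minimal sketch; the helper name `in_virtualenv` is illustrative and not part of any library.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Virtual environment active:", in_virtualenv())
    print("Interpreter:", sys.executable)
```

If the first line prints `False`, the activation step most likely did not take effect in the current shell.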

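**Step 2:** Install Required Packages

Run the following command to install the required libraries (`langchain-community` is included because the embedding and vector store integrations used in Task 2 live in that package):

**pip install langchain langchain-community transformers sentence-transformers faiss-cpu**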

These packages include:

*   LangChain: For developing LLM-based applications.
*   Hugging Face Transformers: For loading pre-trained language models (DistilBERT in this case).
*   SentenceTransformers: For creating embeddings and similarity searches.
*   FAISS: For efficient vector-based document retrieval.
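
After installation, a short script can verify that each package is importable. The sketch below uses only the standard library. The helper `missing_modules` and the list `REQUIRED_MODULES` are illustrative names, and the list assumes the import names `sentence_transformers` and `faiss` for the pip packages `sentence-transformers` and `faiss-cpu`.

```python
from importlib.util import find_spec

# Import names can differ from pip names: faiss-cpu provides the module "faiss".
REQUIRED_MODULES = ["langchain", "transformers", "sentence_transformers", "faiss"]


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

Any name reported as missing points to a package that still needs to be installed in the active environment.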

In [4]:
#!pip install langchain transformers sentence-transformers faiss-cpu
!pip install -U langchain-community



# **Task 2: LLM Integration with LangChain**


In [8]:
from pydantic import PrivateAttr
from transformers import pipeline
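# Third-party integrations now live in langchain_community; core types live in langchain_core
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document
from langchain.chains import RetrievalQA
from langchain_core.language_models.llms import LLM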
from typing import List

# Load the DistilBERT QA model on the CPU (device=-1 selects the CPU on any OS, not only Windows)
qa_pipeline_instance = pipeline("question-answering", model="distilbert-base-uncased-distilled-squad", device=-1)

# Custom class to wrap the QA pipeline to be compatible with LangChain
class QuestionAnsweringLLM(LLM):
    _qa_pipeline: object = PrivateAttr()  # Use PrivateAttr to exclude from Pydantic validation

    def __init__(self, qa_pipeline):
        super().__init__()  # Call the parent constructor
        self._qa_pipeline = qa_pipeline  # Assign the QA pipeline

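    def _call(self, prompt: str, context=None, stop=None, run_manager=None, **kwargs) -> str:
        # Direct use: _call(question, context).
        # LangChain chains instead pass one prompt string (plus stop/run_manager),
        # so split it at the last "Question:" marker. This assumes the prompt ends
        # with a "Question: ..." line, as the default RetrievalQA prompt is expected to.
        question = prompt
        if context is None:
            context, _, tail = prompt.rpartition("Question:")
            question = tail.strip().splitlines()[0] if tail.strip() else ""
        if not question or not context.strip():
            return ""  # nothing to extract an answer from
        result = self._qa_pipeline(question=question, context=context)
        return result['answer']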

    @property
    def _identifying_params(self):
        return {"model": "distilbert"}

    @property
    def _llm_type(self):
        return "custom"

# Wrap the QA pipeline with the custom LLM class
llm = QuestionAnsweringLLM(qa_pipeline_instance)

# Define the context directly as a string
context = """
LangChain is an open-source framework for developing applications powered by language models.
It simplifies the process of building applications that need to interact with various components,
such as document loaders, retrievers, and more.
"""

# Create documents directly from the context string
documents = [Document(page_content=context)]

# Use SentenceTransformers for embeddings
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# Create a retriever using FAISS
vector_store = FAISS.from_documents(documents, embeddings)

# Create a QA chain using the custom QA LLM and FAISS retriever
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=vector_store.as_retriever()
)

# Define a function to use the QA model
def ask_question_with_context(context, question):
    # Get the answer using the QA pipeline
    answer = llm._call(question, context)

    # Check if the answer is empty or not relevant
    if not answer or answer not in context:
        return "I can't find a relevant answer in the context."

    return answer

# Main program to take user input
if __name__ == "__main__":
    # Get user input for the question
    user_question = input("Please enter your question: ")

    # Get the answer using the QA system
    answer = ask_question_with_context(context, user_question)

    # Print the answer
    print("Answer:", answer)



Please enter your question: define langchain
Answer: an open-source framework for developing applications powered by language models
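
Because an extractive model returns a span taken from the context, the `answer not in context` check in `ask_question_with_context` is unlikely to ever fail. One alternative is to threshold the confidence score that the Hugging Face question-answering pipeline returns alongside the answer. The sketch below works on hand-written dictionaries shaped like that output. The helper name `answer_or_fallback` and the `0.3` threshold are assumptions for illustration, not tuned values.

```python
FALLBACK = "I can't find a relevant answer in the context."


def answer_or_fallback(result, min_score=0.3, fallback=FALLBACK):
    """Return the extracted answer only when the model's score is high enough.

    `result` is a dict shaped like question-answering pipeline output,
    with at least the keys "answer" and "score".
    """
    answer = result.get("answer", "").strip()
    if not answer or result.get("score", 0.0) < min_score:
        return fallback
    return answer


# Hand-written stand-ins for pipeline output (not real model results)
confident = {"answer": "an open-source framework", "score": 0.92}
uncertain = {"answer": "document loaders", "score": 0.04}

print(answer_or_fallback(confident))
print(answer_or_fallback(uncertain))
```

With a real pipeline, the dictionary returned by `qa_pipeline_instance(question=..., context=...)` could be passed in directly; a suitable threshold would need to be chosen by testing on representative questions.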


**Explanation:**

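*   LLM Integration: The DistilBERT model is loaded through the pipeline function from Hugging Face Transformers. Setting device to -1 runs it on the CPU; this setting is not specific to Windows.
*   Document Retrieval: FAISS indexes the document so that the `qa_chain` retriever can return the passages most similar to a question. `ask_question_with_context` itself passes the full context string directly to the model and does not go through the retriever.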
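
The idea behind vector-based retrieval can be sketched without FAISS or a real embedding model. The example below uses made-up three-dimensional vectors in place of sentence embeddings and a brute-force cosine-similarity search. The function names and the toy vectors are assumptions for illustration, and the FAISS index built by LangChain may use a different distance metric and search strategy.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=1):
    """Return the indices of the k document vectors most similar to the query."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy data: each document has a hand-made vector standing in for an embedding.
docs = [
    "LangChain is a framework for LLM applications.",
    "FAISS performs vector similarity search.",
    "DistilBERT is a distilled version of BERT.",
]
doc_vecs = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.1], [0.0, 0.2, 0.9]]
query_vec = [0.8, 0.2, 0.1]  # pretend embedding of a question about LangChain

best = top_k(query_vec, doc_vecs, k=1)
print(docs[best[0]])
```

A vector store performs the same kind of ranking, but over real embeddings and with index structures that avoid comparing the query against every document.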



# **Task 3: Working with LangChain Prompt Templates**

In [9]:
from langchain.prompts import PromptTemplate

# Define a custom prompt template
template = """
You are an AI assistant. Answer the following question concisely based on the context.

Context: {context}
Question: {question}

Answer:
"""

# Create a prompt template using LangChain's PromptTemplate
prompt_template = PromptTemplate(input_variables=["context", "question"], template=template)

# Define a function to ask questions with the custom prompt
def ask_question_with_custom_prompt(context, question):
    formatted_prompt = prompt_template.format(context=context, question=question)
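    # RetrievalQA takes its input under the "query" key and returns the answer under "result".
    # The chain hands the LLM a single prompt string, so the LLM's _call must accept one.
    return qa_chain.invoke({"query": formatted_prompt})["result"]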


**How Prompt Templates Help:**

*   Structured Input: The template places the context and the question in fixed, labeled slots, so the model receives every request in the same format.
*   Concise Responses: The instruction to answer concisely steers generative models toward short, focused answers. An extractive model such as DistilBERT returns a span copied from the context regardless of the instruction, so the effect there is limited.
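
The structured-input idea can be shown without LangChain by filling the same template with Python's built-in `str.format`. This is a sketch of the concept only: `build_prompt` is a hypothetical helper, and it is assumed here that `PromptTemplate.format` behaves similarly for a simple template like this one.

```python
TEMPLATE = (
    "You are an AI assistant. Answer the following question concisely "
    "based on the context.\n\n"
    "Context: {context}\n"
    "Question: {question}\n\n"
    "Answer:"
)


def build_prompt(context: str, question: str) -> str:
    """Fill the template, rejecting blank inputs so every prompt is complete."""
    if not context.strip() or not question.strip():
        raise ValueError("context and question must both be non-empty")
    return TEMPLATE.format(context=context.strip(), question=question.strip())


prompt = build_prompt(
    "LangChain is an open-source framework for developing applications "
    "powered by language models.",
    "What is LangChain?",
)
print(prompt)
```

Rejecting blank inputs before formatting is a design choice of this sketch; it keeps incomplete prompts from reaching the model.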



**Expected Deliverables:**


*   Functional QA System: The system accepts a question from the user and provides an answer based on the provided context using LangChain and DistilBERT.
*   Feature Implementation: You can implement memory or tool integration to store user interactions or retrieve additional data from external sources.
*   Documentation: The environment setup, LLM integration, and custom prompt template features are all documented and explained clearly.
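
As one possible starting point for the memory feature mentioned above, user interactions can be kept in a bounded in-memory log. The class `ConversationLog` below is a hypothetical helper written for this sketch, not LangChain's memory API, and the second and third entries are placeholder data.

```python
from collections import deque


class ConversationLog:
    """Keeps only the most recent question/answer pairs."""

    def __init__(self, max_turns=5):
        # A deque with maxlen silently drops the oldest entry when full.
        self._turns = deque(maxlen=max_turns)

    def add(self, question, answer):
        self._turns.append((question, answer))

    def __len__(self):
        return len(self._turns)

    def as_text(self):
        """Render the stored turns as plain text, oldest first."""
        return "\n".join(f"Q: {q}\nA: {a}" for q, a in self._turns)


log = ConversationLog(max_turns=2)
log.add(
    "define langchain",
    "an open-source framework for developing applications powered by language models",
)
log.add("placeholder question 2", "placeholder answer 2")
log.add("placeholder question 3", "placeholder answer 3")  # evicts the first turn

print(log.as_text())
```

The rendered text could be prepended to the context of later questions; whether that helps an extractive model would need to be tested.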

