Capstone: Personal FAQ Bot
- Unlike a general-purpose chatbot such as ChatGPT, this bot answers only from your documents, which greatly reduces hallucinated claims about your experience.
- Retrieval-Augmented Generation (RAG): search your docs, retrieve the relevant chunks, and generate answers grounded in them.
- Deploy with Gradio to get a link you can send to recruiters, clients, or colleagues.
- The same pattern scales to company knowledge bases, customer support, and internal wikis.
- Pipeline: Documents → Chunk → Embed → Store → Query → Retrieve → Generate → Answer
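
The retrieve and generate steps of the pipeline above can be sketched without any external service. This is a toy illustration only: shared-word counting stands in for real embeddings, the sample chunks are made up, and the names tokenize, retrieve, and build_prompt are hypothetical helpers, not LangChain functions.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercased word set; a crude stand-in for an embedding."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks that share the most words with the query."""
    q = tokenize(query)
    ranked = sorted(chunks, key=lambda c: len(q & tokenize(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the grounded prompt that would be sent to the LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Use only this context:\n{joined}\n\nQuestion: {query}\nAnswer:"


# Made-up sample chunks, as if already split from a resume.
chunks = [
    "Built a sales dashboard in Python and Plotly.",
    "Led a team of four engineers on a payments initiative.",
    "Enjoys hiking and landscape photography.",
]
question = "What did you build with Python?"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

In the real bot, the word-overlap score is replaced by vector similarity over embeddings, and the prompt is sent to a chat model instead of being printed.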

Libraries & Dependencies: 
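- pip install langchain langchain-openai langchain-community langchain-classic langchain-groq langchain-text-splitters chromadb gradio python-dotenv pypdf
- Put OPENAI_API_KEY in a .env file, or use GROQ_API_KEY as a free alternative for the chat model (the embeddings in this notebook still use OpenAI).
- Gather your resume, portfolio descriptions, and project summaries as PDF or TXT files.
- A single file is enough: faq_bot.py, with a docs/ folder for your documents.
- Core stack: langchain + chromadb + gradio + pypdf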
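
A minimal sketch of choosing a chat provider from whichever key is configured. The function name pick_provider is hypothetical, and the key value in the usage line is a placeholder. In the real script, python-dotenv's load_dotenv() would first copy the .env entries into os.environ, which could then be passed in as the env argument.

```python
def pick_provider(env: dict[str, str]) -> str:
    """Return the chat provider to use, based on which API key is present."""
    if env.get("OPENAI_API_KEY"):
        return "openai"
    if env.get("GROQ_API_KEY"):
        return "groq"
    raise RuntimeError("Set OPENAI_API_KEY or GROQ_API_KEY in the .env file")


# Placeholder key value, for illustration only.
provider = pick_provider({"GROQ_API_KEY": "placeholder-key"})
print(provider)
```

Failing early with a clear message is usually easier to debug than letting the first API call fail deep inside the chain.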



Document Loaders:
- PyPDFLoader reads PDFs page by page. Works with resumes, reports, portfolios.
- TextLoader for .txt and .md files. Good for project descriptions.
- DirectoryLoader scans a folder and loads all matching files automatically.
- Each document carries metadata (source file, page number) for citations.
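
To make the metadata idea concrete, here is a small standard-library sketch of a folder loader that attaches a source to every document and turns metadata into a citation string. It is not the LangChain loader API: Doc, load_text_dir, and cite are hypothetical names, and the sample file is created in a temporary folder just for the demonstration.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Doc:
    text: str
    metadata: dict = field(default_factory=dict)


def load_text_dir(folder: str, patterns=("*.txt", "*.md")) -> list[Doc]:
    """Load every matching file under folder, recording its file name."""
    docs = []
    for pattern in patterns:
        for path in sorted(Path(folder).rglob(pattern)):
            docs.append(Doc(path.read_text(encoding="utf-8"), {"source": path.name}))
    return docs


def cite(doc: Doc) -> str:
    """Format a citation from the metadata; page is optional."""
    page = doc.metadata.get("page")
    return doc.metadata["source"] + (f", page {page}" if page is not None else "")


# Demonstration with a throwaway folder and one made-up file.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "projects.md").write_text("Built a FAQ bot.", encoding="utf-8")
    docs = load_text_dir(tmp)

print(cite(docs[0]))
print(cite(Doc("...", {"source": "resume.pdf", "page": 2})))
```

Keeping the source alongside the text is what later lets the bot say where an answer came from.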


vectorstore = Chroma.from_documents(chunks, OpenAIEmbeddings(), persist_directory="./chroma_db")  # Create the vector store and persist it locally in ./chroma_db

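This line raises the error below. Although the exception class is RateLimitError, the insufficient_quota code means the OpenAI account has no remaining credit, so retrying will not help. Either add billing credit or switch to a different embedding provider.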

RateLimitError: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}
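
The difference between a quota problem and a transient rate limit can be sketched as a small check on the error payload. The field names follow the error shown above. The function name is_retryable is hypothetical, and rate_limit_exceeded is used here only as an illustrative code for the transient case.

```python
def is_retryable(status_code: int, error: dict) -> bool:
    """Decide whether waiting and retrying could help for this error."""
    code = error.get("code") or error.get("type")
    if code == "insufficient_quota":
        return False  # Billing problem: a retry will fail the same way.
    return status_code in (429, 500, 503)


# Payload copied from the error above (message shortened).
quota_error = {
    "message": "You exceeded your current quota, please check your plan and billing details.",
    "type": "insufficient_quota",
    "param": None,
    "code": "insufficient_quota",
}
print(is_retryable(429, quota_error))
print(is_retryable(429, {"code": "rate_limit_exceeded"}))
```

A check like this avoids wasting time on retry loops when the real fix is on the billing page.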


OpenAI embedding models convert text into high-dimensional vectors (arrays of numbers) that capture semantic meaning. Texts with similar meaning map to nearby points in the vector space, which is what powers similarity search, clustering, and Retrieval-Augmented Generation (RAG). In LangChain they are exposed through the OpenAIEmbeddings class in the langchain-openai package (Python, with a JavaScript counterpart), using models such as text-embedding-3-small and text-embedding-3-large. These models embed text, including source code; they do not take images as input.
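
"Close together in a vector space" is usually measured with cosine similarity. The sketch below uses made-up three-dimensional vectors purely for illustration; real embeddings have hundreds or thousands of dimensions and come from the model, not from hand-written numbers.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 for the same direction, near 0 for unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Made-up vectors standing in for real embeddings.
vectors = {
    "python developer": [0.9, 0.1, 0.0],
    "software engineer": [0.8, 0.2, 0.1],
    "landscape photography": [0.0, 0.1, 0.9],
}
query = vectors["python developer"]
ranked = sorted(vectors, key=lambda t: cosine(query, vectors[t]), reverse=True)
print(ranked)
```

A vector store performs essentially this ranking, only over many more vectors and with an index to keep it fast.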


Key Aspects of OpenAIEmbeddings:

Function: Converts text/documents into vectors, which are crucial for finding relationships in data.

Best Use Cases: Semantic search, recommendations, data classification, and RAG applications.

Model Options: Third-generation models (text-embedding-3-small, text-embedding-3-large) offer superior performance and configurable dimensions compared to ada-002.

Limits: Each input text can be at most 8,191 tokens, and a single request can carry a batch of texts.

Integration: Used within the LangChain framework via from langchain_openai import OpenAIEmbeddings to easily embed documents for vector stores. 
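
The batching mentioned under Limits can be sketched as follows. The batch size is an arbitrary choice for the example, and the four-characters-per-token estimate is only a rough heuristic for English text, not the real tokenizer; the names batched and looks_too_long are hypothetical.

```python
MAX_TOKENS_PER_INPUT = 8191  # Limit quoted in the notes above.


def batched(texts: list[str], batch_size: int):
    """Yield consecutive slices of at most batch_size texts."""
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


def looks_too_long(text: str) -> bool:
    """Rough pre-check, assuming about four characters per token."""
    return len(text) / 4 > MAX_TOKENS_PER_INPUT


texts = ["a", "b", "c", "d", "e"]
batches = list(batched(texts, batch_size=2))
print(batches)
print(looks_too_long("word " * 10))
```

Chunking documents to a few hundred characters, as this notebook does, keeps every input far below the per-input limit.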

What is langchain?

LangChain is an open-source framework for orchestrating the interaction between LLMs, vector stores, embedding models, and similar components, which makes it easier to build a RAG pipeline.

https://groq.com/blog/retrieval-augmented-generation-with-groq-api

https://cdn.sanity.io/images/chol0sk5/production/d9fa5305217baf2ec61d2d064519840c9218ac9b-1012x748.png


In [None]:
import gradio as gr
from dotenv import load_dotenv
from langchain.chat_models import init_chat_model
from langchain_classic.chains import RetrievalQA
from langchain_classic.memory import ConversationBufferMemory
from langchain_community.document_loaders import DirectoryLoader, PyPDFLoader
from langchain_community.vectorstores import Chroma
from langchain_core.prompts import PromptTemplate
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_groq import ChatGroq
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Optional imports for the commented-out alternatives further down;
# uncomment only if the matching packages are installed.
# from langchain_aws import BedrockEmbeddings
# from langchain_community.vectorstores import FAISS
# FAISS_INDEX_PATH = "faiss_index"

load_dotenv()  # Read OPENAI_API_KEY / GROQ_API_KEY from the .env file

# Embeddings come from OpenAI, so this call needs OpenAI API credit.
embeddings = OpenAIEmbeddings(model="text-embedding-3-large")
# embeddings = GroqEmbeddings(model="text-embedding-3-large")  # Did not work: the embeddings object was missing required methods.
# llm = OpenAI(temperature=0)
# llm = ChatGroq(model="deepseek-r1-distill-llama-70b",temperature=0.7,max_tokens=500)
llm = init_chat_model("qwen-2.5-32b", model_provider="groq")
model = ChatOpenAI(model="gpt-4o-mini")

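def chat(msg, chat_history=None):
    # Gradio passes (message, history); history is unused here because the chain keeps its own memory.
    return qa_chain.invoke({"query": msg})["result"]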

prompt_template = """
Use only the following pieces of context to answer the question at the end.
If you don't know the answer, say that you don't know; don't make one up.
Answer in the same language the question was asked.

Conversation so far:
{history}

Context:
{context}

Question: {question}
Answer:"""


# "history" is filled in by the ConversationBufferMemory attached to the chain.
PROMPT = PromptTemplate(
    template=prompt_template,
    input_variables=["history", "context", "question"],
)

# vectorstore = Chroma.from_documents(chunks, BedrockEmbeddings(model="amazon.titan-embed-text-v1"), persist_directory="./chroma_db")  # Same store, using Bedrock embeddings, persisted locally in ./chroma_db
# vectorstore = InMemoryVectorStore(chunks, embeddings)
# vectorstore = FAISS.from_documents(chunks, embeddings)
# demo = gr.ChatInterface(
#     answer,
#     title="Llama Index RAG Chatbot",
# ).launch()
documents = DirectoryLoader("/Users/vkdvamshi/Resumes/", glob="**/*.pdf", loader_cls=PyPDFLoader).load()  # Load every PDF, one Document per page
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = text_splitter.split_documents(documents)  # Split documents into overlapping chunks
vectorstore = Chroma.from_documents(chunks, embeddings, persist_directory="./chroma_db")  # Create the vector store and persist it locally in ./chroma_db
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
qa_chain = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(temperature=0),
    chain_type="stuff",
    retriever=retriever,
    chain_type_kwargs={
        "verbose": True,
        "prompt": PROMPT,
        "memory": ConversationBufferMemory(memory_key="history", input_key="question"),
    },
    return_source_documents=True,
    verbose=True,
)

gr.ChatInterface(fn=chat, title="Ask About [Your Name]").launch(share=True)

# query = "What is the summary of the document?"
# result = qa_chain({"query": query})
# print (result['result'])
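
What chunk_size=500 and chunk_overlap=50 mean in the code above can be shown with a plain sliding window. This is a simplification: RecursiveCharacterTextSplitter also tries to break on separators such as paragraphs and sentences, which this sketch ignores. The function name chunk_text is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Split text into fixed-size windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # The last window already reaches the end of the text.
        start += step
    return chunks


# 1,200 characters of filler text for the demonstration.
sample = "".join(str(i % 10) for i in range(1200))
parts = chunk_text(sample)
print([len(p) for p in parts])
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.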







PermissionError: [Errno 1] Operation not permitted