# Resume Assistant

_A Streamlit application_ based on the _RAG framework_, which uses custom _LangChain prompts_ to retrieve information from an uploaded _resume_.

RAG: Retrieval-Augmented Generation, an approach that combines document retrieval with LLM text generation, grounding responses in the retrieved context so that they are contextually relevant and more accurate.
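The retrieve-then-generate idea can be illustrated without any external services. The following is a minimal sketch, not the application's actual pipeline: it swaps embedding-based retrieval for a simple word-overlap score, and the helper names (`tokenize`, `overlap_score`, `retrieve`, `build_prompt`) and the resume snippets are hypothetical, invented only for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def overlap_score(query, passage):
    """Count how many query tokens also occur in the passage (multiset overlap)."""
    q, p = Counter(tokenize(query)), Counter(tokenize(passage))
    return sum((q & p).values())

def retrieve(query, passages, k=1):
    """Return the k passages with the highest token overlap with the query."""
    return sorted(passages, key=lambda p: overlap_score(query, p), reverse=True)[:k]

def build_prompt(query, context_passages):
    """Place the retrieved passages ahead of the question, as a RAG prompt would."""
    context = "\n".join(context_passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Hypothetical resume snippets, invented for illustration.
passages = [
    "Education: B.Tech in Computer Science.",
    "Skills: Python, SQL, LangChain.",
    "Experience: three years as a data analyst.",
]
top = retrieve("Which skills are listed?", passages, k=1)
prompt = build_prompt("Which skills are listed?", top)
print(prompt)
```

In a real RAG setup the overlap score would be replaced by vector similarity over embeddings, and the assembled prompt would be sent to an LLM instead of being printed.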

LangChain: The framework used to integrate custom prompts with OpenAI's gpt-3.5-turbo and to retrieve relevant documents in support of RAG.

Defining a function that writes the file uploaded through Streamlit to a temporary file and extracts its text.

In [None]:
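import fitz  # PyMuPDF

def funcExtractPDFText(inPDFSource):

    # Accept either a Streamlit UploadedFile (which exposes getbuffer()) or a plain file path.
    if hasattr(inPDFSource, "getbuffer"):
        # Persist the upload to a temporary file so PyMuPDF can open it by path.
        with open("tempResume.pdf","wb") as f:
            f.write(inPDFSource.getbuffer())
        strPDFPath="tempResume.pdf"
    else:
        strPDFPath=inPDFSource

    strExtractedText=""
    # Open the PDF and concatenate the plain text of every page.
    with fitz.open(strPDFPath) as filePDF:
        for pg in filePDF:
            strExtractedText+=pg.get_text("text")
    return strExtractedText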

Extracting the text and wrapping it in a LangChain Document for Chroma DB.

In [None]:
from langchain.schema import Document

fileUploaded=r"D:\Projects\ragForResume\Vaibhav Thakur.pdf"
strResRawTxt=funcExtractPDFText(fileUploaded)

docFormatted=[Document(page_content=strResRawTxt)]

Defining environment variables for LangSmith and OpenAI

In [None]:
import os

os.environ["LANGSMITH_TRACING_V2"] = "true"
os.environ["LANGSMITH_ENDPOINT"] = "https://api.smith.langchain.com"
os.environ["LANGSMITH_API_KEY"] = ""
os.environ["LANGSMITH_PROJECT"]="ragAppForDocs"
os.environ["OPENAI_API_KEY"] = ""
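Assigning an empty string overwrites any key already present in the environment. As a hedged alternative, the sketch below sets a variable only when it is missing and asks for the value interactively. The helper name `set_env_if_missing` and its `prompt_fn` parameter are hypothetical and not part of LangSmith or OpenAI tooling.

```python
import os
from getpass import getpass

def set_env_if_missing(name, prompt_fn=getpass):
    """Set os.environ[name] from a prompt only when it is not already set."""
    if not os.environ.get(name):
        # getpass hides the typed value, so the key does not end up in the notebook output.
        os.environ[name] = prompt_fn(f"Enter {name}: ")
    return os.environ[name]

# Example usage (commented out because it waits for interactive input):
# set_env_if_missing("LANGSMITH_API_KEY")
# set_env_if_missing("OPENAI_API_KEY")
```

This keeps secrets out of the saved notebook while leaving the variable names used above unchanged.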

Fetching the custom prompt defined in LangSmith and defining the LLM to be used.

In [None]:
from langsmith import Client
from langchain_openai import ChatOpenAI

objResAstPrompt=Client().pull_prompt("rag-for-resume1")

objLLM=ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

Function to define a RAG chain, which involves:
1. Vectorizing the document in Chroma DB using OpenAI embeddings.
2. Constructing a retriever object that uses the vector store to retrieve documents.
3. Chaining the retriever, the prompt template, and gpt-3.5-turbo (the LLM).
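The data flow through these three steps can be sketched with plain-Python stand-ins. This is an illustration under stated assumptions rather than LangChain's implementation: `stub_retriever`, `stub_prompt`, `stub_llm`, and `run_chain` are hypothetical names, and the passages are invented. The returned dictionary mirrors the `input`, `context`, and `answer` keys that a retrieval chain produces.

```python
def stub_retriever(question):
    """Stand-in for the retriever: returns fixed passages regardless of the question."""
    return ["Name: Jane Doe", "Skills: Python, SQL"]  # invented data

def stub_prompt(inputs):
    """Stand-in for the prompt template: fills context and question into text."""
    context = "\n".join(inputs["context"])
    return f"Context:\n{context}\n\nQuestion: {inputs['input']}"

def stub_llm(prompt_text):
    """Stand-in for the LLM: reports the prompt size instead of calling a model."""
    return f"(stub reply to a {len(prompt_text)}-character prompt)"

def run_chain(question, retriever, prompt, llm):
    """Retrieve context, build the prompt, call the model, return all three parts."""
    context = retriever(question)
    answer = llm(prompt({"input": question, "context": context}))
    return {"input": question, "context": context, "answer": answer}

result = run_chain("What is the name on the resume?", stub_retriever, stub_prompt, stub_llm)
print(sorted(result))
```

Keeping the retrieved context in the result alongside the answer makes it possible to inspect which passages the answer was based on.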

In [None]:
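from langchain.chains import create_retrieval_chain
from langchain_chroma import Chroma  # assumption: older installs expose Chroma via langchain_community.vectorstores
from langchain_openai import OpenAIEmbeddings

def funcDefRagChain(objDoc):

    # Embed the document with OpenAI embeddings and index it in a Chroma vector store.
    objVectorStore=Chroma.from_documents(documents=objDoc, embedding=OpenAIEmbeddings())
    # Expose the vector store as a retriever.
    objRetriever=objVectorStore.as_retriever()
    # Chain the retriever with the prompt template and the LLM.
    objRagChain=create_retrieval_chain(
            retriever=objRetriever,
            combine_docs_chain=objResAstPrompt | objLLM
            )
    return objRagChain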

Function to generate a response to the question asked, using the chain constructed above.

In [None]:
strQuestion="What is the name on the resume?"

def funcGenResponse(inStrQuestion, inObjRagChain):
    
    dictResponse=inObjRagChain.invoke({"input":inStrQuestion})
    return dictResponse["answer"].content
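The `.content` access works because the chain ends in the chat model, so `answer` holds a message object rather than a string. If the chain were later changed to end in a string output parser, `answer` would already be plain text. A minimal sketch that tolerates both cases follows. `extract_answer_text` and `StubMessage` are hypothetical names used only for illustration.

```python
def extract_answer_text(response):
    """Return the answer as a string whether it is a message object or plain text."""
    answer = response["answer"]
    # Message objects carry their text in .content; plain strings are returned as-is.
    return getattr(answer, "content", answer)

class StubMessage:
    """Minimal stand-in for a chat message object with a .content attribute."""
    def __init__(self, content):
        self.content = content

print(extract_answer_text({"answer": StubMessage("Jane Doe")}))
print(extract_answer_text({"answer": "Jane Doe"}))
```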

In [None]:
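# Build the chain from the formatted document before querying it.
objRagChain=funcDefRagChain(docFormatted)
strAnswer=funcGenResponse(strQuestion,objRagChain)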
strAnswer