# Personal Assistant Chatbot: Get to Know Me in a Flash!



**Introduction:**

Imagine having a personal assistant who's always ready to answer questions about your professional background. That's the idea behind "Personal Assistant Chatbot"! This project builds a question-answering system over my resume, and the same approach works for any plain-text document.

**Prerequisites:**

Before we start, you'll need one key ingredient: an OpenAI API key.

**Goal:**

The goal is to create an easy and efficient way for anyone to learn about my skills, experience, and other cool stuff on my profile, without digging through the entire document.

**Benefits:**

-   **Save Time:** No more searching through pages - get the info you need quickly and easily.
-   **Easy Access:** Ask questions in plain language, just like you would a friend.

**Under the Hood:**

This project uses some cool tech called LangChain to make it all work:

-   TextLoader Power: We use TextLoader, a LangChain document loader, to read the text of my resume.
-   Chunking it Up: A long document is hard to search precisely and may not fit in a model's context window, so we use LangChain's RecursiveCharacterTextSplitter to break it down into bite-sized, slightly overlapping pieces.
-   Understanding the Jargon: OpenAI's embedding model turns each chunk of my resume into an "embedding": a list of numbers that captures its meaning. Think of it like a secret code in which chunks with similar meaning get similar codes.
-   Search Engine Magic: FAISS, a similarity-search library that LangChain wraps as a vector store, acts like a super-fast search engine that finds the parts of my resume whose embeddings are closest to your question.
-   Answer Time!: Finally, LangChain's pre-built question-answering chain takes your question, retrieves the most relevant resume chunks from FAISS, and hands them to an OpenAI language model, which writes the answer you're looking for.
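
The steps above can be sketched end to end without any external service. This is a toy illustration only: it swaps the OpenAI embeddings for simple word-count vectors and FAISS for a brute-force cosine-similarity scan. The helpers `embed`, `cosine`, and `retrieve`, and the sample chunks, are hypothetical and are not part of LangChain or of this project's code.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': a bag-of-words count vector (NOT an OpenAI embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, chunks, k=1):
    """Return the k chunks most similar to the question (brute-force scan)."""
    q_vec = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q_vec, embed(c)), reverse=True)
    return ranked[:k]


# Hypothetical resume chunks, made up for illustration.
chunks = [
    "Education: Masters in Data Science.",
    "Skills: Python, Java, SQL, data visualization.",
    "Experience: application development and data analysis.",
]
best = retrieve("What data analysis experience do you have?", chunks)
print(best[0])
```

In the real pipeline, the retrieved chunks are not returned directly; they are passed to a language model that writes the final answer.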

**Let's Get Cool!**

So, how does it work in practice? Imagine you want to know about my experience in data analysis. You can simply ask "Aditya's Cool Assistant": "What kind of data analysis experience do you have?" The system then uses FAISS to find the relevant parts of my resume and the question-answering chain to turn them into an answer.

**Room for Improvement:**

This project is a springboard for further development. We can explore using different pre-trained models from OpenAI to potentially improve the understanding of my resume content. Additionally, we could expand the system to handle file formats beyond plain-text documents.

**Conclusion:**

The "Personal Assistant Chatbot Project" demonstrates the power of AI in creating a user-friendly way to access information. By leveraging LangChain libraries, the project successfully builds a functional Q&A system for my professional profile. With further development and customization, this approach can be adapted to various scenarios where easy access to information is key!

## Code

Note: Run the setup cells below once, entering your OpenAI API key when prompted, before you start asking questions.

In [None]:
!pip install langchain openai PyPDF2 faiss-cpu tiktoken langchain-openai langchain_community

Collecting langchain
  Downloading langchain-0.2.0-py3-none-any.whl (973 kB)
Collecting openai
  Downloading openai-1.30.1-py3-none-any.whl (320 kB)
Collecting PyPDF2
  Downloading pypdf2-3.0.1-py3-none-any.whl (232 kB)
Collecting faiss-cpu
  Downloading faiss_cpu-1.8.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (27.0 MB)
Collecting tiktoken
  Downloading tiktoken-0.7.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)

In [None]:
from langchain_openai import OpenAI, OpenAIEmbeddings
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
# FAISS lives in langchain_community; the old langchain.vectorstores path is deprecated
from langchain_community.vectorstores import FAISS
from langchain_core.vectorstores import VectorStoreRetriever
from langchain.chains import RetrievalQA
from getpass import getpass

In [None]:
import os
# Add your OpenAI API key here
os.environ['OPENAI_API_KEY'] = getpass()

··········
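
If you re-run the notebook often, it can be convenient to prompt for the key only when it is not already set. The following is a minimal sketch under that assumption; `ensure_api_key` is a hypothetical helper, and the demo uses a plain dictionary and a placeholder value instead of the real environment.

```python
import os
from getpass import getpass


def ensure_api_key(env=os.environ, prompt=getpass, name="OPENAI_API_KEY"):
    """Prompt for the key only if it is not already set in `env`."""
    if not env.get(name):
        env[name] = prompt(f"{name}: ")
    return env[name]


# Demo with a stand-in environment and a fake prompt (placeholder key, not a real one).
demo_env = {}
ensure_api_key(demo_env, prompt=lambda _msg: "sk-placeholder")
print("OPENAI_API_KEY" in demo_env)
```

In the notebook itself, calling `ensure_api_key()` with no arguments would write to `os.environ` and ask for the key through `getpass` only the first time.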


In [None]:
loader = TextLoader("/content/Profile.txt")

In [None]:
documents = loader.load()

In [None]:
# Split the long text into smaller chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 800,
    chunk_overlap  = 100,
    length_function = len,
)

In [None]:
docs = text_splitter.split_documents(documents)
len(docs)

9
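
To see how `chunk_size` and `chunk_overlap` interact, here is a simplified sliding-window splitter. It is only a sketch: the real RecursiveCharacterTextSplitter prefers to break on separators such as paragraph and line breaks, so its chunks will differ. `sliding_chunks` and the sample text are hypothetical.

```python
def sliding_chunks(text, chunk_size=800, chunk_overlap=100):
    """Simplified fixed-window splitter; ignores separators entirely."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # each window starts this far after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Stand-in for a 2,000-character document.
sample = "".join(chr(97 + i % 26) for i in range(2000))
pieces = sliding_chunks(sample)
print([len(p) for p in pieces])
print(pieces[0][-100:] == pieces[1][:100])
```

The last 100 characters of each chunk are repeated at the start of the next one, so a sentence that straddles a boundary still appears whole in at least one chunk.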

In [None]:
embeddings = OpenAIEmbeddings()

In [None]:
# Create a document search engine using FAISS to store and search the embeddings
library = FAISS.from_documents(docs, embeddings)

## Example Prompts

In [None]:
retriever = library.as_retriever()
qa = RetrievalQA.from_chain_type(llm=OpenAI(), chain_type="stuff", retriever=retriever)
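
With `chain_type="stuff"`, the retrieved chunks are concatenated ("stuffed") into a single prompt together with the question. The sketch below shows the idea only: the template wording is illustrative rather than LangChain's actual prompt, and `build_stuff_prompt` and the sample chunks are hypothetical.

```python
def build_stuff_prompt(question, retrieved_chunks):
    """Join all retrieved chunks into one context block, then append the question."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the following context to answer the question. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_stuff_prompt(
    "What programming languages does the candidate know?",
    ["Skills: Python, Java, SQL.", "Education: Masters in Data Science."],
)
print(prompt)
```

Because everything goes into one prompt, the chunk size multiplied by the number of retrieved chunks has to fit within the model's context window.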

In [None]:
query = "Tell me about Kritika"
results_r = qa.invoke(query)
results_r

{'query': 'Tell me about Kritika',
 'result': ' Kritika is a motivated and inquisitive Masters student in Data Science with a passion for leveraging data and AI for business innovation and efficiency. She has a strong background in programming languages such as Python, Java, SQL, and data structures/algorithms. She is also skilled in various data science tools and libraries, databases and data engineering tools, and data visualization. Kritika has experience in application development and data analysis, where she has implemented solutions that have resulted in performance improvements and increased data-driven decision-making. She is highly interested in working in a team environment and solving ambiguous problems.'}

In [None]:
retriever_query = 'Is Kritika a good fit for a Data Science role, why?'
results_r = qa.invoke(retriever_query)
results_r

{'query': 'Is Kritika a good fit for a Data Science role, why?',
 'result': "\nBased on the context provided, Kritika appears to be a strong candidate for a Data Science role. She is currently pursuing a Masters in Data Science and has completed coursework in areas such as Big Data, NLP, AI, and Machine Learning. She also has a Bachelor's degree in Computer Engineering with relevant coursework in data structures, algorithms, and big data analytics. Kritika has also completed certifications in Python for Data Science and Machine Learning, and has experience with various programming languages, data science tools and libraries, databases, and data visualization. Additionally, her work experience as an Application Development Analyst has allowed her to develop skills in data analysis, database optimization, and developing custom CDS views and OData services. Overall, Kritika's education, skills, and experience make her a strong fit for a Data Science role. "}

## Ask away!

In [None]:
query = "ENTER PROMPT HERE"
results_r = qa.invoke(query)
results_r
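
As the outputs above show, the chain returns a dictionary with `query` and `result` keys, and the answer often carries leading or trailing whitespace. A small helper can pull out just the answer text. `answer_text` is a hypothetical name, and the example dictionary is a placeholder rather than real model output.

```python
def answer_text(response):
    """Extract the answer string from the response dict and trim stray whitespace."""
    return response["result"].strip()


# Placeholder response with the same shape as the outputs above.
example = {"query": "ENTER PROMPT HERE", "result": "\nExample answer text. "}
print(answer_text(example))
```

In the notebook, `print(answer_text(results_r))` would show only the answer instead of the whole dictionary.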