This repo contains a demo showing how vector embeddings can help find similar questions in a FAQ list. The demo is based on Sentence Transformers from HuggingFace.
An embedding is a numerical representation of a piece of information, such as text, documents, images, or audio. The representation captures the semantic meaning of what is being embedded, which makes it useful for many industry applications.
Embeddings are not limited to text! You can also create an embedding of an image (for example, a list of 384 numbers) and compare it with a text embedding to determine whether a sentence describes the image. This concept underpins powerful systems for image search, classification, description, and more!
"[...] once you understand this ML multitool (embedding), you'll be able to build everything from search engines to recommendation systems to chatbots and a whole lot more. You don't have to be a data scientist with ML expertise to use them, nor do you need a huge labeled dataset." - Dale Markowitz, Google Cloud.
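To make the comparison idea concrete, here is a minimal sketch of cosine similarity, the measure this demo uses to compare embeddings. The 3-dimensional vectors below are made up for illustration; real sentence embeddings (for example, from Sentence Transformers) typically have 384 or more dimensions:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|).
    # 1.0 means the vectors point in the same direction (very similar).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings", invented for this example.
question = [0.9, 0.1, 0.0]
similar_faq = [0.8, 0.2, 0.1]
unrelated_faq = [0.0, 0.1, 0.9]

print(cosine_similarity(question, similar_faq))    # close to 1.0
print(cosine_similarity(question, unrelated_faq))  # close to 0.0
```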
The process flow of the demo is as follows:
- Load the FAQ list and the question to be matched.
- Create embeddings for the FAQ list and the question to be matched.
- Calculate the cosine similarity between the question to be matched and the FAQ list.
- Sort the FAQ list by the cosine similarity.
- Return the top 5 questions from the FAQ list.
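The steps above can be sketched as follows. The real demo embeds text with a SentenceTransformer model; here a toy bag-of-words vector stands in for the embedding so the sketch runs without a model download, and the FAQ entries are made-up examples:

```python
import math

FAQ = [
    "How do I reset my password?",
    "How can I change my email address?",
    "What payment methods do you accept?",
    "How do I delete my account?",
    "Where can I download my invoice?",
    "How do I contact support?",
]

def embed(text, vocabulary):
    # Toy stand-in for SentenceTransformer.encode(): a word-count
    # vector over a shared vocabulary.
    words = text.lower().strip("?").split()
    return [words.count(term) for term in vocabulary]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norms if norms else 0.0

def top_matches(question, faq, k=5):
    # Build a vocabulary, embed everything, score, sort, take the top k.
    vocab = sorted({w for t in faq + [question] for w in t.lower().strip("?").split()})
    q_vec = embed(question, vocab)
    scored = [(cosine_similarity(q_vec, embed(t, vocab)), t) for t in faq]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # highest similarity first
    return [t for _, t in scored[:k]]

print(top_matches("How do I reset the password?", FAQ))
```

With real sentence embeddings, the ranking reflects meaning rather than word overlap, so a paraphrased question still surfaces the right FAQ entry.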
To run the demo locally:

- Clone the repo:

```shell
git clone https://github.com/mwanjajoel/vector-embeddings-demo.git
```

- Create and activate a virtual environment, then install the dependencies:

```shell
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

- Run the demo:

```shell
python app.py
```

- Run the LangChain version:

```shell
python chat.py
```