Natural Language Processing (NLP), Large Language Models (LLM), and the Power of Vector Embeddings and Databases


| NLP | LLM | Vector | Embeddings DB Search |

Learning

| Overview

Embeddings, Vector Databases, and Advanced Search

Converting text into embedding vectors is the first step in any text-processing pipeline. As the amount of text grows, these embedding vectors are often saved to a dedicated vector index or library, so that they do not have to be recomputed and retrieval is faster. We can then search for documents relevant to a query and pass them to a language model (LM) as additional context. This context is also described as supplying the LM with "state" or "memory". The LM then generates a response grounded in the additional context it receives.
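As a minimal sketch of this retrieval idea (using NumPy, with made-up 2-D vectors standing in for real model embeddings), similarity search by cosine distance looks like:

```python
import numpy as np

# Toy "embeddings": in practice these would come from an embedding model.
docs = ["cats purr", "dogs bark", "stocks fell today"]
doc_vecs = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.95]])  # made-up vectors
query_vec = np.array([0.85, 0.15])  # pretend embedding of "pet sounds"

def normalize(v):
    # Cosine similarity is the dot product of L2-normalized vectors.
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

scores = normalize(doc_vecs) @ normalize(query_vec)
best = int(np.argmax(scores))
print(docs[best])  # → "cats purr"
```

The best-scoring document is what gets passed to the language model as context.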

In this notebook, we will implement the full workflow of text vectorization, vector search, and question answering. While we use FAISS (a vector library), ChromaDB (a vector database), and a Hugging Face model, know that you can easily swap these out for your preferred tools or models!

Learning Objectives

  1. Implement the workflow of reading text, converting it to embeddings, and saving them to FAISS and ChromaDB.
  2. Query for similar documents using FAISS and ChromaDB.
  3. Apply a Hugging Face language model for question answering.
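Tying the objectives together, the retrieve-then-answer pattern can be sketched independently of any particular vector store. The `embed` function and prompt template here are hypothetical stand-ins; a real pipeline would call an embedding model and then send the prompt to the language model.

```python
import numpy as np

def embed(text):
    # Hypothetical stand-in: deterministic pseudo-random unit vectors.
    # A real pipeline would call an embedding model here.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.random(16)
    return v / np.linalg.norm(v)

documents = [
    "FAISS is a library for efficient similarity search.",
    "ChromaDB is an open-source vector database.",
]
doc_vecs = np.stack([embed(d) for d in documents])

question = "What is FAISS?"
scores = doc_vecs @ embed(question)
context = documents[int(np.argmax(scores))]

# The retrieved context becomes the extra "memory" given to the LM.
prompt = f"Context: {context}\nQuestion: {question}\nAnswer:"
print(prompt)
```

With the stand-in `embed`, which document is retrieved is arbitrary; the point is the shape of the pipeline: embed, search, assemble a context-augmented prompt, then generate.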
