## Vector Stores

### University of Virginia
### DS 7200: Distributed Computing
### Last Updated: September 21, 2023

---

### SOURCES: 

- [Vector Stores](https://js.langchain.com/docs/modules/data_connection/vectorstores/)

- [What is a Vector Database?](https://www.pinecone.io/learn/vector-database/)

- [What are vector embeddings](https://www.pinecone.io/learn/vector-embeddings-for-developers/)

- [Word2Vec](https://en.wikipedia.org/wiki/Word2vec)

- [RAG pattern](https://vitalflux.com/retrieval-augmented-generation-rag-llm-examples/)

### OBJECTIVES

- 

### CONCEPTS

- Vector embedding
- Similarity of embeddings
- Vector store

---

### Background on Vector Embeddings


Early Natural Language Processing (NLP) classifiers used the presence or count of words in documents as features.

A large leap forward came from representing documents as vectors (**embeddings**).

The embeddings are fixed-size vectors of floats, with sizes like 64 or 128.

All kinds of media are now embedded: documents, videos, images, etc.

<img src="./embed_examples.png" width=300>

Sometimes the elements are interpretable. 

In this example, the dog, puppy, and cat embeddings share the same sign in some columns.

---

### Objects as Vectors

In the figure below, each object is projected into 2D space as a vector.  
Similarity between objects can be measured by the distance between their points: the closer the points, the more similar the objects.  
The light blue objects (represented as points) are more similar to each other than to the other objects.

<img src="./vector_space.png" width=300>

### Similarity

Different embeddings can be compared using a similarity score such as *cosine similarity*.

<img src="./cosine_sim.png" width=300>
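
As a minimal sketch of the idea, cosine similarity is the dot product of two vectors divided by the product of their lengths, giving a score between -1 (opposite direction) and 1 (same direction). The vectors below are made-up values for illustration only; they do not come from a trained model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Made-up 4-dimensional embeddings (assumption: illustration only).
dog = [0.8, 0.6, -0.1, 0.2]
puppy = [0.7, 0.7, -0.2, 0.1]
car = [-0.5, 0.1, 0.9, -0.3]

print(cosine_similarity(dog, puppy))  # close to 1: similar direction
print(cosine_similarity(dog, car))    # negative: pointing away from each other
```

Because the score depends only on direction, scaling a vector by a positive constant does not change its similarity to other vectors.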


### Training and Use

Embeddings are typically formed by training a neural network on the data and taking the output of a hidden layer, often the last one.

One of the earliest models was [Word2Vec](https://en.wikipedia.org/wiki/Word2vec).

After objects are represented as vectors, they can be stored and reused later.


The flow looks like this:

```
raw data -> embedding model -> vector embedding
```
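
The flow above can be sketched in code. The function `toy_embed` below is a hypothetical stand-in, not a trained embedding model: it hashes words into buckets, so it only illustrates the interface (raw data in, fixed-size vector of floats out) and captures no meaning. The name and the dimension of 8 are assumptions chosen for this example.

```python
import hashlib
import math

EMBEDDING_DIM = 8  # tiny on purpose; real embeddings are larger (e.g. 64 or 128)

def toy_embed(text, dim=EMBEDDING_DIM):
    """Hypothetical stand-in for an embedding model (NOT a trained network).

    Each word is hashed into one of `dim` buckets, and the bucket counts
    are scaled to unit length.
    """
    vector = [0.0] * dim
    for word in text.lower().split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vector[digest[0] % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else vector

raw_data = "the puppy chased the ball"   # raw data
embedding = toy_embed(raw_data)          # embedding model -> vector embedding
print(len(embedding))  # always EMBEDDING_DIM, regardless of text length
```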
---

### Storage

The vectors can be stored in a traditional database (relational or NoSQL)...  
...but specialized databases have emerged to efficiently store, compare, and search on embeddings.

These are called **vector stores**.  

Examples:

- Pinecone
- OpenSearch

We will look at a Pinecone demo in this module.
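
To make "store, compare, and search" concrete, here is a minimal in-memory sketch. The class `InMemoryVectorStore` and its `upsert`, `query`, and `where` names are hypothetical, chosen for this example; they are not the Pinecone or OpenSearch API. The sketch scans every stored vector, whereas production vector stores typically rely on approximate nearest-neighbor indexes to avoid a full scan.

```python
import math

class InMemoryVectorStore:
    """Hypothetical toy vector store: brute-force cosine search over a list."""

    def __init__(self):
        self._items = []  # list of (item_id, unit_vector, metadata)

    @staticmethod
    def _normalize(vector):
        norm = math.sqrt(sum(v * v for v in vector))
        if norm == 0.0:
            raise ValueError("cannot index or query a zero vector")
        return [v / norm for v in vector]

    def upsert(self, item_id, vector, metadata=None):
        """Insert a vector, replacing any existing entry with the same id."""
        self._items = [it for it in self._items if it[0] != item_id]
        self._items.append((item_id, self._normalize(vector), metadata or {}))

    def query(self, vector, top_k=3, where=None):
        """Return up to top_k (item_id, score) pairs, best match first.

        `where` is an optional predicate on metadata, used as a filter.
        """
        q = self._normalize(vector)
        scored = [
            # Dot product of unit vectors equals cosine similarity.
            (item_id, sum(a * b for a, b in zip(q, vec)))
            for item_id, vec, meta in self._items
            if where is None or where(meta)
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]

# Made-up 3-dimensional embeddings (assumption: illustration only).
store = InMemoryVectorStore()
store.upsert("doc-dog", [0.8, 0.6, -0.1], {"type": "animal"})
store.upsert("doc-puppy", [0.7, 0.7, -0.2], {"type": "animal"})
store.upsert("doc-car", [-0.5, 0.1, 0.9], {"type": "vehicle"})

results = store.query([0.75, 0.65, -0.15], top_k=2)
print([item_id for item_id, _ in results])
```

Normalizing vectors once at insert time means each query needs only a dot product per stored item, which is one reason stores often keep unit-length vectors.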

[fill in how they work]

---

### Vector Embedding Use Cases

Some important use cases are:

- **Search** - Embeddings can represent deeper attributes of an object than keywords can, 
  so they can give much better search results. The process looks like this:  
  - The user query is embedded (using a specific embedding model)
  - Each piece of content was embedded earlier (using that same embedding model)
  - Similarity between the query embedding and each content embedding is calculated
  - The highest-scoring matches are selected
  - Any relevant filters are applied
  - The top results are returned
  
- **Question answering** - better search improves the ability to answer questions  
  
  
- **Recommendation** - better search allows for more relevant recommendations  
   Example: Given attribute information for users and items, recommend items whose vectors are similar to the user's vector
  
- **Generative AI** - GenAI models can produce new content and they can power chatbots  
  One major risk is *hallucination*. If a request concerns something the model was not sufficiently trained on, it may return plausible-sounding but incorrect content.  
  A popular approach to reducing this risk is *RAG* (retrieval-augmented generation)   
  
  
RAG works like this:
  - Embed the query
  - Search for the most similar content embeddings
  - Include the matching content in the user prompt: "Based on the content below, tell me about neural networks."  
  - The large language model (LLM) constructs and returns a result based on the prompt plus context

<img src="./rag.png" width=600>
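
The retrieval and prompt-building steps of RAG can be sketched as follows. This is a simplified illustration under stated assumptions: the passages and their 3-dimensional embeddings are made up, the query vector is hard-coded rather than produced by an embedding model, the helper names `retrieve` and `build_prompt` are hypothetical, and the final call to an LLM is left out.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vector, corpus, top_k=1):
    """Return the top_k passages whose embeddings are most similar to the query."""
    ranked = sorted(
        corpus,
        key=lambda item: cosine_similarity(query_vector, item["embedding"]),
        reverse=True,
    )
    return [item["text"] for item in ranked[:top_k]]

def build_prompt(question, passages):
    """Put the retrieved passages into the prompt as context for the LLM."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Based on the content below, {question}\n\nContent:\n{context}"

# Made-up passages with made-up embeddings (assumption: illustration only).
corpus = [
    {"text": "A neural network is built from layers of connected units with learned weights.",
     "embedding": [0.9, 0.1, 0.0]},
    {"text": "A relational database stores data in tables of rows and columns.",
     "embedding": [0.0, 0.2, 0.9]},
]

# In a real system this vector would come from the same embedding model
# that produced the corpus embeddings.
query_vector = [0.8, 0.2, 0.1]

prompt = build_prompt("tell me about neural networks.", retrieve(query_vector, corpus))
print(prompt)  # this string would then be sent to an LLM
```

Grounding the prompt in retrieved content gives the model relevant material to draw on instead of relying only on what it learned during training.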

---

### Conclusions

[fill this in]