Vector embeddings represent complex data such as words or images as fixed-length lists of numbers. The representation preserves the important relationships between data points, so items with similar meaning end up close together. Each dimension of the vector captures some learned feature of the data, although individual dimensions are rarely interpretable on their own. Embeddings are produced with techniques like Word2Vec or GloVe for text and autoencoders for images.

Once the vector embeddings are created, they can be used for a variety of tasks, such as:

1. **Semantic similarity**: Measuring the similarity between words or phrases based on their vector representations.

2. **Clustering**: Grouping similar data points together based on their vector embeddings.

3. **Classification**: Training a machine learning model to classify data points based on their vector embeddings.

4. **Visualization**: Plotting the vector embeddings in a lower-dimensional space to visualize the relationships between the data points.
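As a rough illustration of the classification use case, the sketch below assigns a new vector to the label whose average embedding (centroid) is nearest. The 2-D vectors and label names are hand-picked assumptions; real embeddings have hundreds of dimensions and would normally feed a trained classifier.

```python
import math

# Hypothetical 2-D "embeddings" grouped by label (values chosen by hand).
labeled = {
    "animal": [[0.9, 0.1], [0.8, 0.2]],
    "weather": [[0.1, 0.9], [0.2, 0.8]],
}

def centroid(vectors):
    """Average the vectors component-wise."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def classify(vector, centroids):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(vector, centroids[label]))

centroids = {label: centroid(vs) for label, vs in labeled.items()}
prediction = classify([0.7, 0.3], centroids)
print(prediction)
```

Nearest-centroid is used here only because it fits in a few lines; any classifier that accepts numeric features could take its place.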

Vector embeddings have become a fundamental building block of many modern machine learning algorithms and have enabled significant progress in natural language processing, computer vision, and other areas of AI.

![vector_embedding](../image/vector%20embedding/vector_embedding.png)

The process of creating vector embeddings from different types of data: audio, text, video.


![vector_embedding2](../image/vector%20embedding/vector_embedding2.png)

When we represent real-world objects and concepts such as images, audio recordings, news articles, user profiles, weather patterns, and political views as vector embeddings, the semantic similarity of these objects and concepts can be quantified by how close they are to each other as points in vector spaces. Vector embedding representations are thus suitable for common machine learning tasks such as clustering, recommendation, and classification.
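To make "closeness as points" concrete, here is a minimal recommendation sketch: given one item, it returns the other item whose vector is nearest. The item names and 3-D vectors are invented for illustration and are not output from any model.

```python
import math

# Hypothetical 3-D embeddings for a few catalog items (values chosen by hand).
items = {
    "jazz album": [0.9, 0.1, 0.0],
    "blues album": [0.8, 0.2, 0.1],
    "cookbook": [0.0, 0.1, 0.9],
}

def nearest(name, catalog):
    """Return the other item whose embedding is closest to `name`."""
    target = catalog[name]
    others = (k for k in catalog if k != name)
    return min(others, key=lambda k: math.dist(target, catalog[k]))

recommended = nearest("jazz album", items)
print(recommended)
```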

![vector_embedding3](../image/vector%20embedding/vector_embedding3.png)

Embeddings are useful in tasks like clustering, recommendation, and classification because they measure similarity between objects more efficiently than traditional methods like one-hot encoding. One-hot encoding creates a column for every category, resulting in a sparse representation that is mostly zeros. As the number of categories grows, this becomes computationally expensive, and every pair of distinct categories looks equally dissimilar. In contrast, embeddings map categorical values to dense vectors in a continuous space, typically with far fewer dimensions than there are categories, allowing for more efficient similarity measurements and better performance in machine learning models.
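The contrast can be sketched in a few lines. The one-hot vectors are built mechanically; the dense vectors are hand-picked assumptions meant only to show that a dense representation can encode "cat is closer to dog than to car", which one-hot vectors cannot.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

categories = ["cat", "dog", "car"]

# One-hot: one column per category, a single 1 and zeros elsewhere.
one_hot = {
    c: [1 if i == j else 0 for j in range(len(categories))]
    for i, c in enumerate(categories)
}

# Hypothetical dense embeddings: cat and dog point in similar directions.
dense = {"cat": [0.9, 0.1], "dog": [0.8, 0.3], "car": [0.1, 0.9]}

# Distinct one-hot vectors are always orthogonal, so their dot product is 0.
print(dot(one_hot["cat"], one_hot["dog"]))

# Dense vectors can rank cat/dog as more similar than cat/car.
print(dot(dense["cat"], dense["dog"]) > dot(dense["cat"], dense["car"]))
```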

![vector_embedding4](../image/vector%20embedding/vector_embedding4.png)

When dealing with a large set of categories, such as every word in the English language, one-hot encoding becomes impractical because it needs one column per word. Vector embeddings provide a fixed-size, dense representation instead. Each vector takes fewer bytes yet carries more information about how items relate to one another, which makes it computationally less expensive to use. Vector embeddings can be used in various tasks like reverse image search, chatbots, Q&A, and recommendation systems.
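A back-of-the-envelope calculation shows the size difference. The vocabulary size below is an assumed round number, and the one-hot vector is assumed to be stored densely as float32; the 384 dimensions match the output size of the all-MiniLM-L6-v2 model used later in this article.

```python
VOCAB_SIZE = 50_000      # assumed vocabulary size, for illustration only
EMBEDDING_DIM = 384      # output size of all-MiniLM-L6-v2
BYTES_PER_FLOAT32 = 4

# One-hot needs one column per word; the embedding size is fixed.
one_hot_bytes = VOCAB_SIZE * BYTES_PER_FLOAT32
dense_bytes = EMBEDDING_DIM * BYTES_PER_FLOAT32

print(one_hot_bytes, dense_bytes, round(one_hot_bytes / dense_bytes, 1))
```

Sparse storage formats shrink the one-hot footprint, but they do not change the fact that one-hot vectors carry no similarity information.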

## Creating Vector Embeddings
Machine learning models require numerical representations of data like text or images in order to process them. Before deep learning, these representations were often created manually through [feature engineering](https://www.kaggle.com/learn/feature-engineering). With deep learning, the model learns non-linear feature interactions automatically. Each layer transforms its input into a new representation, and the output of one of the later layers is typically taken as the vector embedding. For images, models like [convolutional neural networks (CNNs)](https://en.wikipedia.org/wiki/Convolutional_neural_network) can be used. For audio, image embedding models can be applied to a visual representation of the audio's frequencies (such as its [spectrogram](https://en.wikipedia.org/wiki/Spectrogram)).
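To show what "a visual representation of the audio's frequencies" means, the sketch below builds a tiny spectrogram with a naive discrete Fourier transform. The sample rate, frame size, and test tone are assumptions chosen so the arithmetic is easy to follow; production code would use an FFT library and overlapping, windowed frames.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitude of each frequency bin up to Nyquist (naive O(N^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def spectrogram(signal, frame_size):
    """Split the signal into non-overlapping frames and transform each one."""
    frames = [
        signal[i:i + frame_size]
        for i in range(0, len(signal) - frame_size + 1, frame_size)
    ]
    return [dft_magnitudes(f) for f in frames]

SAMPLE_RATE = 64  # assumed samples per second
tone = [math.sin(2 * math.pi * 8 * t / SAMPLE_RATE) for t in range(64)]  # 8 Hz sine
spec = spectrogram(tone, frame_size=16)

# Each row is one time frame; each column is a frequency bin of 64 / 16 = 4 Hz.
peak_bins = [row.index(max(row)) for row in spec]
print(len(spec), len(spec[0]), peak_bins)
```

The resulting grid of magnitudes has the same shape as a small greyscale image, which is why image models can be reused on it.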

## Image Embedding with a Convolutional Neural Network
In this example, images are represented by a matrix of integer values ranging from `0` to `255`, where `0` represents black and `255` represents white. The image is displayed in greyscale and has a corresponding matrix representation.

![vector_embedding5](../image/vector%20embedding/vector_embedding5.png)
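A greyscale image of this kind can be written directly as a nested list. The 4x4 pattern below is a made-up example; scaling to the range 0 to 1 is a common preprocessing convention and not something the original example specifies.

```python
# A hypothetical 4x4 greyscale image: 0 is black, 255 is white.
image = [
    [0,   0,   255, 255],
    [0,   0,   255, 255],
    [255, 255, 0,   0],
    [255, 255, 0,   0],
]

height, width = len(image), len(image[0])

# Scale pixel values to the range [0, 1].
normalized = [[pixel / 255 for pixel in row] for row in image]

# Flattening yields a plain vector of pixels, but it discards the 2-D layout.
flat = [pixel for row in normalized for pixel in row]
print(height, width, len(flat))
```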

In image data, matrix representations capture the semantic information of pixel neighborhoods but are sensitive to transformations such as shifts or rotations. Convolutional Neural Networks (CNNs) process visual data hierarchically through small local sub-inputs called receptive fields. Each neuron in each network layer processes a specific receptive field from the previous layer. Each layer either applies a convolution on the receptive field or reduces the input size, which is called subsampling. The resulting vector embedding is produced by a fully connected layer. CNNs transform images into embeddings, which are more robust and can be used as inputs for further learning.
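The two layer operations described above, convolution over a receptive field and subsampling, can be sketched without any framework. The kernel here is written by hand to detect a vertical edge; in a real CNN the kernel weights are learned, and max pooling is only one of several subsampling choices.

```python
def convolve2d(image, kernel):
    """Valid (no padding) 2-D convolution in the CNN sense, i.e. cross-correlation."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def max_pool(feature_map, size=2):
    """Subsample by keeping the maximum of each non-overlapping size x size block."""
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]

# Hypothetical 5x5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 255, 255, 255] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]  # hand-written kernel that responds to that edge

features = convolve2d(image, kernel)  # 4x4 feature map
pooled = max_pool(features)           # 2x2 after subsampling
print(pooled)
```

The pooled map still marks where the edge is while being a quarter of the size, which is the kind of compact, shift-tolerant signal later layers build on.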



![vector_embedding6](../image/vector%20embedding/vector_embedding6.png)

The process of creating vector embeddings involves using deep learning models, such as Convolutional Neural Networks (CNNs), to transform raw data into a fixed-size representation that can be stored and used for various purposes. CNNs process inputs through small, local sub-inputs called receptive fields, and each neuron in each layer processes a specific receptive field from the previous layer. The weights of the embedding model are optimized using a large set of labeled data, so that images with the same labels are embedded closer together than images with different labels. Once the embedding model is learned, it can be used to transform new, unseen images into vectors that can be stored and compared to retrieve similar images. Vector embeddings can be created for any kind of data using different models/methods. The ResNet model, for example, is a CNN commonly used for image-related tasks and is trained to predict which of 1000 classes an object in an image belongs to.
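The article does not say which training objective is used to pull same-label images together. One well-known option is a triplet loss, sketched below under that assumption; classification models such as ResNet are instead trained with a classification loss and their embeddings are taken from an internal layer. The vectors and the margin value are illustrative.

```python
import math

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero once the positive is closer to the anchor than the negative by `margin`."""
    return max(0.0, math.dist(anchor, positive) - math.dist(anchor, negative) + margin)

# Hypothetical 2-D embeddings: anchor and positive share a label, negative does not.
anchor, positive, negative = [0.0, 0.0], [0.1, 0.0], [3.0, 4.0]

well_separated = triplet_loss(anchor, positive, negative)   # same-label pair is closer
badly_separated = triplet_loss(anchor, negative, positive)  # roles swapped, so penalized
print(well_separated, badly_separated)
```

Minimizing this quantity over many triplets nudges the model weights so that same-label images end up closer together than different-label images.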

![vector_embedding7](../image/vector%20embedding/vector_embedding7.png)

Diagram of a simple Convolutional Neural Network (CNN)
        

A wealth of pre-trained models exists that can easily be used for creating vector embeddings. The [Hugging Face Model Hub](https://huggingface.co/models) contains many models that can create embeddings for different types of data. For example, the [all-MiniLM-L6-v2 model](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) is hosted and runnable online, with no expertise or installation required.

Packages like `sentence_transformers`, also from HuggingFace, provide easy-to-use models for tasks like semantic similarity search, visual search, and many others. To create embeddings with these models, only a few lines of Python are needed:




```python
from sentence_transformers import SentenceTransformer

# Load the pre-trained model (downloaded on first use)
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

sentences = [
  "That is a very happy person",
  "That is a happy dog",
  "Today is a sunny day"
]

# One embedding vector per sentence
embeddings = model.encode(sentences)
```

Inspecting `embeddings` shows one vector per sentence:

```text
array([[-0.00248324,  0.09151708,  0.0483862 , ..., -0.02641124,
        -0.07529837,  0.02803207],
       [ 0.00504994,  0.06316981,  0.01415729, ...,  0.04035437,
         0.07584121,  0.09087349],
       [-0.01629127,  0.10406605,  0.09740778, ...,  0.00676723,
        -0.0878846 ,  0.03404384]], dtype=float32)
```

## Vector Embeddings for Semantic Similarity Search
Semantic Similarity Search is the process by which pieces of text are compared in order to find which contain the most similar meaning. While this might seem easy for an average human being, languages are quite complex. Distilling unstructured text data down into a format that a Machine Learning model can understand has been the subject of study for many Natural Language Processing researchers.

Vector embeddings provide a method for anyone, not just NLP researchers or data scientists, to perform semantic similarity search. They provide a meaningful, computationally efficient, numerical representation that can be created by pre-trained models “out of the box”. The example below walks through semantic similarity search using vector embeddings created with the `sentence_transformers` library shown above.

Let’s take the following sentences:

* "That is a happy dog"

* "That is a very happy person"

* "Today is a sunny day"

Each of these sentences can be transformed into a vector embedding. Below, a simplified representation highlights the position of these example sentences in 2-dimensional vector space relative to one another. This is useful for visually gauging how effectively our embeddings represent the semantic meaning of text. More on that below.

A simplified plot of vector embeddings projected into 2 dimensions

![vector_embedding8](../image/vector%20embedding/vector_embedding8.png)

Assume we want to compare these sentences to “That is a happy person”. First, we create the vector embedding for the query sentence.

```python
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

# create the vector embedding for the query
query_embedding = model.encode("That is a happy person")
```

Inspecting `query_embedding` shows the raw vector (output truncated):

```text
array([-3.38769369e-02,  9.19415876e-02,  4.87012975e-02, -3.48836109e-02,
       -6.48291782e-02, -2.66857557e-02,  1.34293348e-01, -6.91500213e-03,
        6.44351020e-02, -5.82762714e-03,  8.87372196e-02, -1.62496660e-02,
       -2.54945718e-02,  4.83907619e-03,  6.14902377e-03,  1.55436397e-02,
       -5.95202409e-02, -3.20247486e-02,  1.41185373e-02,  2.05600355e-03,
       -1.00310810e-01, -2.04243511e-03, -2.08597239e-02,  9.96055175e-03,
       -1.69836041e-02, -1.64660122e-02,  4.00910266e-02, -2.72038789e-03,
        8.66090879e-02,  6.33227378e-02, -2.68441048e-02, -2.35456694e-02,
        1.09181821e-01,  2.25531738e-02, -3.85773554e-02,  1.94851663e-02,
       -3.15520167e-02,  1.68708675e-02, -9.62975994e-03,  2.02890839e-02,
       -1.82441864e-02,  1.77636892e-02,  1.86448190e-02,  1.22921271e-02,
       -2.05458654e-03, -3.49595062e-02,  6.22536428e-02, -4.34290916e-02,
        7.87903816e-02, -2.45035384e-02, -1.76689737e-02,  2.36276314e-02,
       -5.71010821e-02, ...
```

Next, we need to compare the distance between our query vector embedding and the vector embeddings in our dataset.

There are many ways to calculate the distance between vectors. Each has its own benefits and drawbacks when it comes to semantic search, but we will save that for a separate post. Some of the common distance metrics are shown below.

![vector_embedding9](../image/vector%20embedding/vector_embedding9.png)
                        
Distance metrics used in calculating vector similarity.
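For reference, here is a minimal sketch of four widely used metrics. The figure's exact set is not reproduced in the text, so this selection is an assumption. The example vectors point in the same direction but differ in length, which shows how the metrics disagree.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of absolute differences along each axis."""
    return sum(abs(x - y) for x, y in zip(a, b))

def dot_product(a, b):
    """Unnormalized similarity; grows with vector length."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Dot product normalized by both vector lengths; ignores magnitude."""
    return dot_product(a, b) / (
        math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    )

a, b = [1.0, 0.0], [2.0, 0.0]  # same direction, different length
print(euclidean(a, b), manhattan(a, b), dot_product(a, b), cosine_similarity(a, b))
```

Euclidean and Manhattan distance report the two vectors as different, while cosine similarity treats them as identical in direction.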


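For this example, we will use cosine similarity, which measures the cosine of the angle between two vectors: their dot product divided by the product of their lengths. A value close to 1 means the vectors point in nearly the same direction.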

![vector_embedding10](../image/vector%20embedding/vector_embedding10.png)            
            
Formula for Cosine Similarity
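Written out in standard notation, with a small worked example using two arbitrary 2-D vectors chosen for easy arithmetic:

```latex
\cos(\theta)
  = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
  = \frac{\sum_{i=1}^{n} a_i b_i}{\sqrt{\sum_{i=1}^{n} a_i^2} \, \sqrt{\sum_{i=1}^{n} b_i^2}}

% Worked example with a = (1, 2) and b = (2, 1):
\cos(\theta)
  = \frac{1 \cdot 2 + 2 \cdot 1}{\sqrt{1^2 + 2^2} \, \sqrt{2^2 + 1^2}}
  = \frac{4}{\sqrt{5} \, \sqrt{5}}
  = 0.8
```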


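```python
# In Python, this looks like:
import numpy as np
from numpy.linalg import norm

def cosine_similarity(a, b):
    # dot product divided by the product of the vector lengths
    return np.dot(a, b) / (norm(a) * norm(b))
```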

Running this calculation between our query vector and the other three vectors in the plot above, we can determine how similar the sentences are to one another.

![vector_embedding11](../image/vector%20embedding/vector_embedding11.png)


                    
2D plot showing the cosine similarity between the vector embeddings created from our sentences earlier


As you might have assumed, “That is a very happy person” is the most similar sentence to “That is a happy person”. This example captures only one of many possible use cases for vector embeddings: semantic similarity search.
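In a search setting, the scores are usually sorted so that only the best matches are returned. The sketch below shows that ranking step using hand-picked 2-D stand-ins for the real 384-dimensional embeddings; the `top_k` helper is a hypothetical name, not part of any library.

```python
import math

def cosine_similarity(a, b):
    """Dot product divided by the product of the vector lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def top_k(query, corpus, k=2):
    """Return the k (text, score) pairs most similar to the query, best first."""
    scored = [(text, cosine_similarity(query, vec)) for text, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Hypothetical 2-D stand-ins for real sentence embeddings.
corpus = {
    "That is a very happy person": [0.9, 0.2],
    "That is a happy dog": [0.6, 0.6],
    "Today is a sunny day": [0.1, 0.9],
}

results = top_k([1.0, 0.1], corpus)
print([text for text, _ in results])
```

Scanning every stored vector is fine for a handful of sentences; large collections typically rely on an approximate nearest-neighbor index instead.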

The Python code to run this entire example is listed below.

```python
import numpy as np

from numpy.linalg import norm
from sentence_transformers import SentenceTransformer

# Define the model we want to use (it'll download itself)
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

sentences = [
  "That is a very happy person",
  "That is a happy dog",
  "Today is a sunny day"
]

# vector embeddings created from dataset
embeddings = model.encode(sentences)

# query vector embedding
query_embedding = model.encode("That is a happy person")

# define our distance metric
def cosine_similarity(a, b):
    return np.dot(a, b)/(norm(a)*norm(b))

# run semantic similarity search
print("Query: That is a happy person")
for e, s in zip(embeddings, sentences):
    print(s, " -> similarity score = ",
          cosine_similarity(e, query_embedding))
```


```text
Query: That is a happy person
That is a very happy person  -> similarity score =  0.9429151
That is a happy dog  -> similarity score =  0.69457734
Today is a sunny day  -> similarity score =  0.25687602
```


After installing `NumPy` and `sentence_transformers`, running this script should produce the similarity scores shown above.

The results of this script should line up with the results that you see on the [HuggingFace inference API](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) for the model chosen.

HuggingFace inference API similarity results

![vector_embedding12](../image/vector%20embedding/vector_embedding12.png)