# Getting Started with the NeMo Retriever Embedding Microservice
Author(s): [Ruchika Kharwar](https://github.com/rasalt) (rkharwar@nvidia.com)

NOTE: This notebook has been tested in the following environment:
Python version = 3.10.8

## Overview


## Objective

## Before you begin


### Set up your environment
Refer to page <> for details on how to deploy the service.
You should have a Docker container for the embedding service (for example, "embedding-ms") running on a port of your choice.
This exercise assumes the service is deployed on port 8080.

For example, in my environment the service runs as "embedding-ms-alone" on port 8080, as this `docker ps` output shows:
7f8e5cb76cb2   nvcr.io/ohlfw0olaadg/ea-participants/nemo-retriever-embedding-microservice:24.02   "/opt/nvidia/nvidia_…"   15 minutes ago   Up 15 minutes             0.0.0.0:8080->8080/tcp, :::8080->8080/tcp                                                                                                            embedding-ms-alone
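
Before sending requests, it can help to confirm that something is accepting connections on the chosen port. The following is a minimal sketch and not part of the original notebook: `port_is_open` is a hypothetical helper name, and the host and port are the values assumed in this exercise. It only checks that a TCP connection can be opened; it does not verify that the listener is actually the embedding service.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection raises OSError if nothing is listening
        # or if the attempt times out
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumed values from this exercise: service published on localhost:8080
    print("Service port reachable:", port_is_open("localhost", 8080))
```

If this prints `False`, checking the container status and the port mapping is a reasonable first step.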

### Set up the environment variables

In [7]:
URL = "http://localhost:8080/v1/embeddings"
MODEL_ID = "NV-Embed-QA"
INPUT_TYPE = "passage"

### Initialize a sample text string which we will vectorize

In [8]:
text = "The girl threw the butter out of the window to see 'butter-fly'."

Now let's use the embedding container API to generate an embedding for the string.

In [None]:
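import sys

# Make the helper module in ../utils importable from this notebook
sys.path.insert(0, '../utils/')

from request_utils import post_api

# Base URL and settings reused by the query cell later in this notebook
url = "http://localhost:8080/v1"
model_id = "NV-Embed-QA"
input_type = "passage"

# Request body: a list of input strings, the model name, and the input type
embed = {
  "input": [text],
  "model": MODEL_ID,
  "input_type": INPUT_TYPE
}

response = post_api(URL, embed)
# Collect one embedding vector per input string
passage_embeddings = [embedding['embedding'] for embedding in response['data']]
print(len(passage_embeddings))   # number of embedded passages
print(passage_embeddings[:1])    # the first embedding vector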
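
`post_api` comes from `request_utils`, which is not shown in this notebook. As a rough idea of what such a helper might do, here is a minimal sketch using only the Python standard library. The names `build_request` and `post_json` are hypothetical, and the actual helper may differ, for example in headers, timeouts, or error handling.

```python
import json
import urllib.request


def build_request(url: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(url: str, payload: dict, timeout: float = 30.0) -> dict:
    """Send the payload as JSON and return the decoded JSON response."""
    request = build_request(url, payload)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Same request body shape as the cell above, with a placeholder passage
payload = {"input": ["sample passage"], "model": "NV-Embed-QA", "input_type": "passage"}
request = build_request("http://localhost:8080/v1/embeddings", payload)
print(request.get_method(), request.full_url)
```

With the service running, `post_json(URL, embed)` would play the role of `post_api(URL, embed)` under these assumptions.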


Here we use FAISS as our vector store. We will create the index and add the vectors.

In [None]:
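import faiss
import numpy as np

# Exact (brute-force) L2 index sized to the embedding dimension
index = faiss.IndexFlatL2(len(passage_embeddings[0]))
# FAISS expects a 2-D float32 array of shape (num_vectors, dimension)
index.add(np.array(passage_embeddings).astype('float32'))

print("Number of passages in index:", index.ntotal)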


Now that we have our passages added to our index, let's get the vector embedding for our query.

In [None]:
query = "who is wearing blue pants and a yellow hat?"

embed = {
  "input": query,
  "model": model_id,
  "input_type": "query"
}

response = post_api(url+"/embeddings", embed)
query_embedding = [response['data'][0]['embedding']]
print(query_embedding)

Let's search our index for passages that are related to our query vector.

In [None]:
# Retrieve every passage in the index, ordered from nearest to furthest
topk = index.ntotal

distances, ndxs = index.search(np.array(query_embedding).astype('float32'), topk)

print('Distances:\n', distances)
print('Ordered Indices:\n', ndxs)
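
`IndexFlatL2` performs an exact, brute-force search and reports squared Euclidean (L2) distances, sorted from smallest to largest. To make the returned `distances` and `ndxs` arrays concrete, here is a small pure-Python sketch of the same idea for a single query. `l2_search` and the toy vectors are illustrative assumptions, not FAISS code.

```python
def l2_search(vectors, query, k):
    """Return (distances, indices) of the k vectors closest to query.

    Distances are squared Euclidean distances, in ascending order.
    """
    scored = []
    for i, vec in enumerate(vectors):
        # Sum of squared coordinate differences between the vector and the query
        dist = sum((a - b) ** 2 for a, b in zip(vec, query))
        scored.append((dist, i))
    scored.sort()  # smallest distance first; ties broken by index
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]


# Toy 2-D "embeddings" standing in for real passage vectors
vectors = [[0.0, 0.0], [1.0, 1.0], [3.0, 4.0]]
distances, indices = l2_search(vectors, query=[1.0, 2.0], k=3)
print(distances)  # [1.0, 5.0, 8.0]
print(indices)    # [1, 0, 2]
```

In the notebook, `index.search` returns the same kind of information as 2-D NumPy arrays with one row per query, which is why the results are read from `ndxs[0]`.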


Lastly, let's take a look at the results.

In [None]:
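# The index holds a single passage (the sample text embedded above),
# so the nearest and furthest results are the same passage here
passages = [text]

print("Query:", query)
print("Nearest Passage:", passages[ndxs[0][0]])
print("Furthest Passage:", passages[ndxs[0][-1]])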