# Google Vertex AI Embeddings 

>[Vertex AI Embeddings API](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-text-embeddings) is a service on Google Cloud exposing the embedding models. 

Note: This integration is separate from the Google PaLM integration.

By default, Google Cloud [does not use](https://cloud.google.com/vertex-ai/docs/generative-ai/data-governance#foundation_model_development) Customer Data to train its foundation models as part of Google Cloud's AI/ML Privacy Commitment. More details about how Google processes data can also be found in [Google's Customer Data Processing Addendum (CDPA)](https://cloud.google.com/terms/data-processing-addendum).

To use Vertex AI Embeddings you must have the `langchain-google-vertexai` Python package installed and either:
- Have credentials configured for your environment (gcloud, workload identity, etc.)
- Store the path to a service account JSON file as the `GOOGLE_APPLICATION_CREDENTIALS` environment variable

This codebase uses the `google.auth` library, which first looks for the application credentials variable mentioned above and then falls back to system-level auth.

For more information, see: 
- https://cloud.google.com/docs/authentication/application-default-credentials#GAC
- https://googleapis.dev/python/google-auth/latest/reference/google.auth.html#module-google.auth
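
The lookup order described above can be sketched with a small standard-library helper. This is only an illustration: the function name `describe_credential_source` is hypothetical, and the real resolution is performed by `google.auth`, which handles more cases than this simplified check.

```python
import os
from typing import Mapping, Optional


def describe_credential_source(env: Optional[Mapping[str, str]] = None) -> str:
    """Report which credential source would be tried first (simplified sketch)."""
    env = os.environ if env is None else env
    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if key_path:
        # An explicit service account file is looked up first.
        return f"service account file: {key_path}"
    # Otherwise fall back to system-level auth (gcloud, workload identity, ...).
    return "system-level credentials"


print(describe_credential_source({"GOOGLE_APPLICATION_CREDENTIALS": "/path/to/key.json"}))
print(describe_credential_source({}))
```

Calling the helper with no argument inspects the current process environment, which can be a quick sanity check before initializing the embeddings client.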



## Installation
> langchain-google-vertexai : Install from PyPI

In [None]:
%pip install --upgrade --quiet langchain langchain-google-vertexai

## Text Embeddings
>Check the list of [Supported Models](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-text-embeddings#supported-models)

The Vertex AI text embeddings API uses dense vector representations; `text-embedding-gecko`, for example, uses 768-dimensional vectors. Dense vector embedding models use deep-learning methods similar to the ones used by large language models. Unlike sparse vectors, which tend to directly map words to numbers, dense vectors are designed to better represent the meaning of a piece of text. The benefit of using dense vector embeddings in generative AI is that instead of searching for direct word or syntax matches, you can better search for passages that align to the meaning of the query, even if the passages don't use the same language.

In [1]:
from langchain_google_vertexai import VertexAIEmbeddings

In [25]:
# Initialize a specific embeddings model version
embeddings = VertexAIEmbeddings(model_name="text-embedding-004")

In [4]:
document = "This is a test document."
query = "This is a test query."

In [7]:
# Get embedding for a query
query_emb = embeddings.embed_query(query)

In [8]:
# Get embedding for document(s)
doc_emb = embeddings.embed_documents([document])
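
Once you have a query embedding and a list of document embeddings, searching by meaning usually comes down to comparing vectors. The following is a minimal sketch using cosine similarity and only the standard library. The helper names and the 3-dimensional toy vectors are assumptions for illustration; in practice you would pass in the output of `embed_query` and `embed_documents`.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]
) -> List[int]:
    """Return document indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, doc_vec) for doc_vec in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Toy stand-ins for real embeddings (real vectors have hundreds of dimensions).
toy_query = [1.0, 0.0, 1.0]
toy_docs = [[0.0, 1.0, 0.0], [0.9, 0.1, 1.1], [1.0, 1.0, 0.0]]

print(rank_documents(toy_query, toy_docs))
```

For larger collections a vector store would typically replace this linear scan, but the scoring idea stays the same.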

### [Embeddings Task Types](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/task-types#supported_task_types)
> Check the list of [Supported Models](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/task-types#supported_models) and [Supported task types](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/task-types#supported_task_types)

Vertex AI embeddings models can generate optimized embeddings for various task types, such as document retrieval, question answering, and fact verification. Task types are labels that optimize the embeddings that the model generates based on your intended use case.

<br>**Benefits of task types** : [Read More](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/task-types#benefits_of_task_types)<br/>
<img src="https://cloud.google.com/static/vertex-ai/generative-ai/docs/embeddings/images/task_type_1.png" alt="isolated" width="500"/>
<img src="https://cloud.google.com/static/vertex-ai/generative-ai/docs/embeddings/images/task_type_2.png" alt="isolated" width="500"/>


<p>Embeddings models that use task types support the following task types:</p>

<table>
<thead>
<tr>
<th>Task type</th>
<th>Description</th>
</tr>
</thead>

<tbody>
<tr>
<td><a href="#assess_text_similarity"><code translate="no" dir="ltr">SEMANTIC_SIMILARITY</code></a></td>
<td>Used to generate embeddings that are optimized to assess text similarity</td>
</tr>
<tr>
<td><a href="#classify_texts"><code translate="no" dir="ltr">CLASSIFICATION</code></a></td>
<td>Used to generate embeddings that are optimized to classify texts according to preset labels</td>
</tr>
<tr>
<td><a href="#cluster_texts"><code translate="no" dir="ltr">CLUSTERING</code></a></td>
<td>Used to generate embeddings that are optimized to cluster texts based on their similarities</td>
</tr>
<tr>
<td><a href="#retrieve_information_from_texts"><code translate="no" dir="ltr">RETRIEVAL_DOCUMENT</code>, <code translate="no" dir="ltr">RETRIEVAL_QUERY</code>,  <code translate="no" dir="ltr">QUESTION_ANSWERING</code>, and <code translate="no" dir="ltr">FACT_VERIFICATION</code></a></td>
<td>Used to generate embeddings that are optimized for document search or information retrieval</td>
</tr>
</tbody>
</table>

<p>The best task type for your embeddings job depends on what use case you have for
your embeddings. Before you select a task type, determine your embeddings use
case.</p>
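
One way to make that decision explicit in code is a small lookup from use case to task type. The sketch below is only illustrative: the use-case labels on the left and the `pick_task_type` helper are hypothetical, while the task type strings on the right are the ones listed in the table above.

```python
# Hypothetical use-case labels mapped to the task types from the table above.
TASK_TYPE_BY_USE_CASE = {
    "similarity": "SEMANTIC_SIMILARITY",
    "classification": "CLASSIFICATION",
    "clustering": "CLUSTERING",
    "index_documents": "RETRIEVAL_DOCUMENT",
    "search_query": "RETRIEVAL_QUERY",
    "question_answering": "QUESTION_ANSWERING",
    "fact_verification": "FACT_VERIFICATION",
}


def pick_task_type(use_case: str) -> str:
    """Return the task type string for a known use-case label."""
    try:
        return TASK_TYPE_BY_USE_CASE[use_case]
    except KeyError:
        known = ", ".join(sorted(TASK_TYPE_BY_USE_CASE))
        raise ValueError(
            f"Unknown use case {use_case!r}; expected one of: {known}"
        ) from None


print(pick_task_type("clustering"))
```

The returned string can then be passed as the task type argument when requesting embeddings.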

NOTE : 
- To use a specific task type, you need to use the `embed` method
- The `embed_query` method implicitly uses task type `RETRIEVAL_QUERY` and the `embed_documents` method implicitly uses `RETRIEVAL_DOCUMENT`, so that both work seamlessly with LangChain RAG APIs


In [27]:
# Using Task Type with embed method
text1 = "The cat is sleeping"
text2 = "The feline is napping"

# Prepare inputs for the embed method
texts = [text1, text2]

# get embeddings by passing task type as 'SEMANTIC_SIMILARITY'
emb = embeddings.embed(texts=texts, embeddings_task_type="SEMANTIC_SIMILARITY")

print(len(emb[0]))

768


### Multilingual Embeddings
[Vertex Multilingual Embedding Models](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-text-embeddings#supported-models) can help you get the best performance when working on use cases with languages other than English.

>Check the list of [Supported Models](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-text-embeddings#supported-models) and [Supported Languages](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text-embeddings-api#supported_text_languages)

In [None]:
# Initialize a specific multilingual embeddings model version
embeddings = VertexAIEmbeddings(model_name="text-multilingual-embedding-002")

In [None]:
# Using Task Type with embed method
text1 = "The cat is sleeping"
text2 = "बिल्ली सो रही है"  # Hindi translation for "The cat is sleeping"

# Prepare inputs for the embed method
texts = [text1, text2]

# get embeddings by passing task type as 'SEMANTIC_SIMILARITY'
emb = embeddings.embed(texts=texts, embeddings_task_type="SEMANTIC_SIMILARITY")

### Get [text embeddings predictions in batches](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/batch-prediction-genai-embeddings)
>Check the list of [Supported Models](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/batch-prediction-genai-embeddings#text_embeddings_models_that_support_batch_predictions)

Getting responses in a batch is an efficient way to send large numbers of embedding requests that are not latency sensitive. Unlike online responses, where you are limited to one input request at a time, a single batch request can carry a large number of embedding requests.

In [33]:
# Initialize a specific embeddings model version
embeddings = VertexAIEmbeddings(model_name="text-embedding-004")

In [34]:
# Prepare inputs for the embed method
documents = ["foo bar"] * 8

# get batch embeddings in single request
batch_emb = embeddings.embed_documents(documents)

# check the number of input documents and returned embeddings
print(f"length input documents : {len(documents)}")
print(f"length embedded documents : {len(batch_emb)}")

length input documents : 8
length embedded documents : 8
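
If you want to control how many texts are grouped together before they are sent, you can split the input list yourself. The helper below is a minimal sketch using only the standard library. The name `chunked` and the group size of 3 are arbitrary choices for illustration; actual per-request limits depend on the model and are not stated here.

```python
from typing import Iterator, List


def chunked(items: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start : start + batch_size]


sample_documents = ["foo bar"] * 8
batches = list(chunked(sample_documents, 3))

# Each batch could be passed to `embed_documents` in turn.
print([len(batch) for batch in batches])
```

The last group may be smaller than the others, so downstream code should not assume a fixed size.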


## [MultiModal Embeddings](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-multimodal-embeddings) 
>Check the list of [Supported Models](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-text-embeddings#supported-models) and [Usage Limits](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-multimodal-embeddings)

The multimodal embeddings model generates 1408-dimension vectors based on the input you provide, which can include a combination of image, text, and video data. The embedding vectors can then be used for subsequent tasks like image classification or video content moderation.

The image embedding vector and text embedding vector are in the same semantic space with the same dimensionality. Consequently, these vectors can be used interchangeably for use cases like searching image by text, or searching video by image.

By default, an embedding request returns a 1408-dimension float vector for each data type. You can also specify lower-dimension embeddings (128, 256, or 512 float vectors) for text and image data. This option lets you optimize for latency and storage or quality based on how you plan to use the embeddings. Lower-dimension embeddings provide decreased storage needs and lower latency for subsequent embedding tasks (like search or recommendation), while higher-dimension embeddings offer greater accuracy for the same tasks.

NOTE : For text-only embedding use cases, we recommend using the [Vertex AI text-embeddings API](#text-embeddings) instead

In [17]:
# Initialize a specific multimodal embeddings model version
embeddings = VertexAIEmbeddings(model_name="multimodalembedding@001")

In [19]:
# Using embed_image method
image_path = "https://cloud.google.com/static/vertex-ai/generative-ai/docs/image/images/img-embedding-cat.jpg"

# get the embedding for the image
image_emb = embeddings.embed_image(image_path=image_path)

# view embedding length
len(image_emb)

1408

OPTIONAL : You may also provide `contextual_text` along with the input image, as follows:

In [20]:
# Using embed_image method
image_path = "https://cloud.google.com/static/vertex-ai/generative-ai/docs/image/images/img-embedding-cat.jpg"
contextual_text = "picture of a cat"

# get the embedding for the image together with its contextual text
image_emb = embeddings.embed_image(
    image_path=image_path, contextual_text=contextual_text
)

# view embedding length
len(image_emb)

1408
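
Because the multimodal model only offers the vector sizes named earlier in this section (1408 by default, or 128, 256, and 512), it can be useful to validate a requested size before making a call. The following is a small sketch under that assumption; the constant and the function name `check_multimodal_dimension` are hypothetical and not part of any library.

```python
# Sizes named in the text above: 1408 by default, or 128, 256, 512.
MULTIMODAL_DIMENSIONS = (128, 256, 512, 1408)


def check_multimodal_dimension(dimension: int) -> int:
    """Return `dimension` unchanged if it is one of the listed sizes."""
    if dimension not in MULTIMODAL_DIMENSIONS:
        allowed = ", ".join(str(d) for d in MULTIMODAL_DIMENSIONS)
        raise ValueError(
            f"Unsupported dimension {dimension}; choose one of: {allowed}"
        )
    return dimension


print(check_multimodal_dimension(512))
```

Failing early with a clear message avoids spending a network round trip on a request that uses an unsupported size.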

## [Specify lower-dimension embeddings](https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-multimodal-embeddings#low-dimension)
By default, an embedding request returns a 768-dimension float vector for text embedding models and a 1408-dimension float vector for the multimodal model. You can also specify lower-dimension embeddings (128, 256, or 512 float vectors) for text and image data. This option lets you optimize for latency and storage or quality based on how you plan to use the embeddings. Lower-dimension embeddings provide decreased storage needs and lower latency for subsequent embedding tasks (like search or recommendation), while higher-dimension embeddings offer greater accuracy for the same tasks.

In [None]:
# Initialize a specific embeddings model version
embeddings = VertexAIEmbeddings(model_name="text-embedding-004")

# Embedding with reduced dimensionality
dimension = 256

# Using Task Type with embed method
text1 = "The cat is sleeping"
text2 = "बिल्ली सो रही है"  # Hindi translation for "The cat is sleeping"

# Prepare inputs for the embed method
texts = [text1, text2]

# get embeddings by passing task type as 'SEMANTIC_SIMILARITY'
emb = embeddings.embed(
    texts=texts, embeddings_task_type="SEMANTIC_SIMILARITY", dimensions=dimension
)

# The code below should output an embedding with 256 dimensions
len(emb[0])

256
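
To get a feel for the storage side of this trade-off, you can estimate the raw size of an index at different dimensions. The arithmetic below is a rough sketch that assumes each value is stored as a 4-byte float32 and ignores any index or metadata overhead; the function name and the one million vector count are illustrative choices.

```python
BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as float32 values


def index_size_bytes(num_vectors: int, dimension: int) -> int:
    """Raw storage needed for `num_vectors` embeddings of the given dimension."""
    return num_vectors * dimension * BYTES_PER_FLOAT32


# Compare a reduced size with the two default sizes mentioned above.
for dimension in (256, 768, 1408):
    mib = index_size_bytes(1_000_000, dimension) / (1024**2)
    print(f"{dimension:>4} dims: {mib:,.1f} MiB")
```

Under these assumptions, storage grows linearly with the dimension, so going from 768 to 256 dimensions cuts the raw vector storage to one third.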