## Overview

This notebook covers the essentials of text embeddings. Text embeddings are an NLP technique that converts textual data into numerical vectors that machine learning algorithms, especially large models, can process. `textembedding-gecko@latest` refers to the newest version of the embeddings model in Google's generative AI offerings. Embeddings serve a number of use cases, including but not limited to:

*   Semantic search: Return text ranked by semantic similarity to a query.
*   Classification: Return the class of items whose text attributes are similar to the given text.
*   Outlier detection: Return items whose text attributes are least related to the given text.
*   Conversational interface: Cluster groups of sentences that can lead to similar responses, as in a conversation-level embedding space.
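
As a rough illustration of the semantic search and outlier detection use cases listed above, the sketch below ranks items by cosine similarity to a query vector. The three-dimensional vectors and their labels are hypothetical and made up for readability; they are not output from any embedding model. The helper names `cosine_similarity` and `rank_by_similarity` are also illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, documents):
    """Return document names sorted from most to least similar to the query.

    `documents` maps a name to its embedding vector.
    """
    return sorted(
        documents,
        key=lambda name: cosine_similarity(query_vector, documents[name]),
        reverse=True,
    )


# Hypothetical toy vectors, invented for this example only.
toy_documents = {
    "moon landing": [0.9, 0.1, 0.0],
    "lunar rover": [0.7, 0.3, 0.1],
    "pasta recipe": [0.0, 0.2, 0.9],
}
toy_query = [0.8, 0.2, 0.0]

ranking = rank_by_similarity(toy_query, toy_documents)
print(ranking)      # most similar first (semantic search)
print(ranking[-1])  # least related item (a simple outlier candidate)
```

With real embeddings the same ranking logic would apply to the vectors returned by the model instead of the toy values.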

Learn more about text embeddings in the [official documentation](https://cloud.google.com/vertex-ai/docs/generative-ai/text/text-overview).

### Objective

In this notebook, you learn how to get the embeddings for any text using Google's textembedding-gecko@001.

### Install Vertex AI SDK

In [1]:
!pip install "shapely<2.0.0"
!pip install google-cloud-aiplatform --upgrade --user

Collecting shapely<2.0.0
  Downloading Shapely-1.8.5.post1-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (2.0 MB)
Installing collected packages: shapely
  Attempting uninstall: shapely
    Found existing installation: shapely 2.0.1
    Uninstalling shapely-2.0.1:
      Successfully uninstalled shapely-2.0.1
Successfully installed shapely-1.8.5.post1
Collecting google-cloud-aiplatform
  Downloading google_cloud_aiplatform-1.32.0-py2.py3-none-any.whl (2.9 MB)
Collecting google-cloud-resource-manager<3.0.0dev,>=1.3.3 (from google-cloud-aiplatform)
  Downloading google_cloud_resource_manager-1.10.3-py2.py3-none-any.whl (320 kB)

**Colab only:** Uncomment the following cell to restart the kernel, or use the restart button. For Vertex AI Workbench, you can restart the kernel using the button at the top.

In [None]:
# # Automatically restart kernel after installs so that your environment can access the new packages
# import IPython

# app = IPython.Application.instance()
# app.kernel.do_shutdown(True)
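
Once the kernel is back, it can help to confirm which package versions the environment actually picked up. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and not part of the Vertex AI SDK.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints None for a package that is not installed in the current environment.
print("google-cloud-aiplatform:", installed_version("google-cloud-aiplatform"))
print("shapely:", installed_version("shapely"))
```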

### Authenticating your notebook environment
* If you are using **Colab** to run this notebook, run the cell below and continue.
* If you are using **Vertex AI Workbench**, check out the setup instructions [here](https://github.com/GoogleCloudPlatform/generative-ai/tree/main/setup-env).

In [1]:
from google.colab import auth
auth.authenticate_user()
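
The `google.colab` import above is only available inside Colab and raises an import error elsewhere. One way to keep a single cell usable in both environments is to check whether the module is already loaded before importing it. This is a sketch, and `running_in_colab` is a hypothetical helper name.

```python
import sys


def running_in_colab() -> bool:
    """Return True when the google.colab module is already loaded in this kernel."""
    return "google.colab" in sys.modules


if running_in_colab():
    # Only reached inside Colab, where this import is available.
    from google.colab import auth

    auth.authenticate_user()
```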

### Import libraries

**Colab only:** Run the following cell to initialize the Vertex AI SDK. For Vertex AI Workbench, you don't need to run this.

In [2]:
import vertexai

PROJECT_ID = "acn-lkmaigcp"  # @param {type:"string"}
vertexai.init(project=PROJECT_ID, location="us-central1")
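
The project ID above is hard-coded in the cell. As an alternative sketch, it could be read from an environment variable so the notebook can be shared without editing. The variable name `GOOGLE_CLOUD_PROJECT` and the helper `resolve_project_id` are assumptions for illustration; adjust them to whatever your environment sets.

```python
import os


def resolve_project_id(default: str = "") -> str:
    """Return the project ID from the environment, falling back to `default`.

    Assumes the ID is stored in GOOGLE_CLOUD_PROJECT (an assumption, see above).
    """
    project_id = os.environ.get("GOOGLE_CLOUD_PROJECT", default)
    if not project_id:
        raise ValueError("Set GOOGLE_CLOUD_PROJECT or pass a default project ID.")
    return project_id
```

The result could then be passed to the existing call, for example `vertexai.init(project=resolve_project_id(), location="us-central1")`.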

In [3]:
from vertexai.language_models import TextEmbeddingModel

### Run the model

In [None]:
def text_embedding() -> None:
    """Print the embedding vector of a sample sentence and its length."""
    model = TextEmbeddingModel.from_pretrained("textembedding-gecko@001")
    # get_embeddings takes a list of texts and returns one embedding per text.
    embeddings = model.get_embeddings(["At 6.03 pm IST on August 23, the Chandrayaan-3 lander touched down on the moon’s surface, in the south polar region."])
    for embedding in embeddings:
        # embedding.values holds the vector as a list of floats.
        vector = embedding.values
        print(vector)
        print(f"Length of Embedding Vector: {len(vector)}")

In [None]:
text_embedding()

[-0.028637759387493134, -0.0057718208990991116, -0.013384837657213211, -0.030383607372641563, 0.031182600185275078, 0.00228251819498837, 0.02990310825407505, 0.017882388085126877, 0.00396804790943861, 0.019183436408638954, 0.04699325188994408, 0.02412460558116436, 0.04850311577320099, -0.011903684586286545, -0.007311512250453234, 0.022599129006266594, -0.06142823025584221, -0.011936185881495476, 0.0016060088528320193, 0.0692877247929573, -0.053415920585393906, -0.0139314578846097, 0.013906418345868587, -0.0036567398346960545, 0.007174121681600809, -0.10589801520109177, 0.039909183979034424, -0.054545193910598755, -0.032683197408914566, 0.000433400651672855, -0.03241682052612305, 0.015719514340162277, 0.026544347405433655, -0.045198287814855576, -0.008490915410220623, 0.04002094641327858, -0.053653791546821594, -0.0260968916118145, 0.0015885669272392988, 0.0749494805932045, -0.0013810062082484365, 0.008482629433274269, 0.013697638176381588, -0.004507261794060469, -0.01311511266976595, -
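
The function above prints the vector for one hard-coded sentence. For downstream use it is often more convenient to get the vectors back as a return value and to embed several texts at once. The sketch below does that using only the `get_embeddings` call and the `.values` attribute shown earlier. The helper name `embed_texts` is hypothetical, and the default `batch_size` of 5 is an assumed per-request limit that should be checked against the current quota documentation.

```python
from typing import List, Sequence


def embed_texts(model, texts: Sequence[str], batch_size: int = 5) -> List[List[float]]:
    """Return one embedding vector per input text, in input order.

    `model` is any object exposing get_embeddings(list_of_texts), such as the
    TextEmbeddingModel instance created in this notebook. `batch_size` is an
    assumed limit on texts per request, not a documented guarantee.
    """
    vectors: List[List[float]] = []
    for start in range(0, len(texts), batch_size):
        batch = list(texts[start:start + batch_size])
        # Each returned embedding exposes its vector through `.values`.
        vectors.extend(embedding.values for embedding in model.get_embeddings(batch))
    return vectors
```

In this notebook it could be called as `embed_texts(TextEmbeddingModel.from_pretrained("textembedding-gecko@001"), ["first text", "second text"])`.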