##### Copyright 2023 Google LLC.

In [None]:
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# PaLM API: Embeddings quickstart with Python

<table class="tfo-notebook-buttons" align="left">
  <td>
    <a target="_blank" href="https://ai.google.dev/palm_docs/embeddings_quickstart"><img src="https://developers.generativeai.google/static/site-assets/images/docs/notebook-site-button.png" height="32" width="32" />View on Generative AI</a>
  </td>
  <td>
    <a target="_blank" href="https://colab.research.google.com/github/google/generative-ai-docs/blob/main/site/en/palm_docs/embeddings_quickstart.ipynb"><img src="https://www.tensorflow.org/images/colab_logo_32px.png" />Run in Google Colab</a>
  </td>
  <td>
    <a target="_blank" href="https://github.com/google/generative-ai-docs/blob/main/site/en/palm_docs/embeddings_quickstart.ipynb"><img src="https://www.tensorflow.org/images/GitHub-Mark-32px.png" />View source on GitHub</a>
  </td>
</table>

This notebook shows you how to get started with the PaLM API, which gives you access to Google's latest large language models. You'll use the PaLM API's embedding generation feature and see an example of what you can do with the resulting embeddings.

## Setup

**Note**: At this time, the PaLM API is [only available in certain regions](https://developers.generativeai.google/available_regions).

First, download and install the PaLM API Python library.

In [None]:
!pip install -U google-generativeai

In [None]:
import numpy as np
import google.generativeai as palm

### Grab an API Key

To get started, you'll need to [create an API key](https://developers.generativeai.google/tutorials/setup).

In [None]:
# Replace 'PALM_KEY' with your own API key.
palm.configure(api_key='PALM_KEY')
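If you would rather not paste the key into the notebook, one option is to read it from an environment variable. The sketch below is only an illustration: the helper `load_api_key` and the variable name `PALM_API_KEY` are assumptions made for this example and are not part of the PaLM library.

```python
import os


def load_api_key(env_var="PALM_API_KEY"):
    """Return the API key stored in an environment variable.

    Both this helper and the default variable name are hypothetical;
    use whatever name fits your own setup.
    """
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable to your API key.")
    return key


# In the notebook, this would replace the hard-coded string:
# palm.configure(api_key=load_api_key())
```

Keeping the key out of the notebook source makes it harder to share or commit it by accident.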

## What are embeddings?

[Embeddings](https://developers.google.com/machine-learning/glossary#embedding-vector) represent text (such as words, sentences, or entire paragraphs) as a list of floating point numbers. These numbers aren't random: the key idea is that texts with similar meanings have similar embeddings. You can use the relationships between embeddings for many important tasks.

## Embedding generation

In this section, you will see how to generate embeddings for a piece of text using the PaLM API's `palm.generate_embeddings` function. The following cell lists the models that support this function.

In [None]:
for model in palm.list_models():
  if 'embedText' in model.supported_generation_methods:
    print(model.name)

models/embedding-gecko-001


Use the function `palm.generate_embeddings` and pass in the name of the model as well as some text. You will get a list of floating point values. Start with a query "What do squirrels eat?" and see how related two different strings are to it.

In [None]:
x = 'What do squirrels eat?'

close_to_x = 'nuts and acorns'

different_from_x = 'This morning I woke up in San Francisco, and took a walk to the Bay Bridge. It was a good, sunny morning with no fog.'

model = "models/embedding-gecko-001"

# Create an embedding
embedding_x = palm.generate_embeddings(model=model, text=x)
embedding_close_to_x = palm.generate_embeddings(model=model, text=close_to_x)
embedding_different_from_x = palm.generate_embeddings(model=model, text=different_from_x)

In [None]:
print(embedding_x)

{'embedding': [-0.025894878, -0.02103396, 0.003574992, 0.00822288, 0.03276648, -0.10068223, -0.037702546, 0.01079403, 0.0001406235, -0.029412385, 0.01919925, 0.0048481044, 0.070619866, -0.013349887, 0.028378602, -0.018658886, -0.038629908, 0.056883123, 0.06332366, 0.039849922, -0.085393265, -0.016251814, -0.025535949, 0.0049480307, 0.048581485, -0.11295683, 0.033869933, 0.015498774, -0.07306243, 0.000857902, -0.022031788, -0.005298939, -0.08311722, -0.027091762, 0.042790364, 0.023175264, 0.011238991, -0.02432924, -0.0044626957, 0.05167071, 0.023430848, 0.027325166, -0.01492389, -0.018770715, -0.003783692, 0.040971957, -0.044652887, 0.033220302, -0.05659744, -0.055191413, -0.0023204528, -0.043687623, 0.030044463, -0.015966717, -0.04318426, 0.015735775, -0.038352676, -0.005009736, -0.03289721, 0.016246213, -0.005696393, -0.0010992853, -0.02768714, -0.03534994, -0.045970507, 0.05784305, -0.026696421, -0.013302212, 0.007055761, -0.05885901, 0.03330113, 0.04399591, 0.020755561, 0.0028288597

Now that you have created the embeddings, use the dot product to see how related `close_to_x` and `different_from_x` are to `x`. For vectors of unit length, the dot product equals the cosine similarity: it returns a value between -1 and 1 that represents how closely the two vectors align in direction. This interpretation assumes that the embeddings are normalized. The closer the value is to 0, the less similar the two objects (in this case, two strings) are. The closer the value is to 1, the more similar they are.

In [None]:
similar_measure = np.dot(embedding_x['embedding'], embedding_close_to_x['embedding'])

print(similar_measure)

0.7314063252924405


In [None]:
different_measure = np.dot(embedding_x['embedding'], embedding_different_from_x['embedding'])

print(different_measure)

0.43560702838194704


As shown here, the dot product between the embeddings of `x` and `close_to_x` (about 0.73) is higher than the dot product between the embeddings of `x` and `different_from_x` (about 0.44), which indicates that `close_to_x` is more closely related to `x`.
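A dot product is only guaranteed to fall between -1 and 1 when both vectors have unit length. If you are unsure whether your embeddings are normalized, you can divide by the vector norms to get the cosine similarity directly. The following is a minimal sketch, not part of the PaLM library. It uses only the standard library and made-up toy vectors so that it runs without an API key. `np.dot` and `np.linalg.norm` would work equally well.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors, made up for illustration (not real embeddings).
query = [1.0, 2.0, 0.0]
scaled = [2.0, 4.0, 0.0]      # same direction as query, twice the length
orthogonal = [0.0, 0.0, 3.0]  # perpendicular to query

print(cosine_similarity(query, scaled))      # close to 1: same direction
print(cosine_similarity(query, orthogonal))  # 0: no shared direction
```

With real embeddings, you would pass `embedding_x['embedding']` and `embedding_close_to_x['embedding']` to the same function.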

## What can you do with embeddings?

You've generated your first set of embeddings with the PaLM API! But what can you do with this list of floating point values? Embeddings can be used for a wide variety of natural language processing (NLP) tasks, including:

* Search (documents, web, etc.)
* Recommendation systems
* Clustering
* Sentiment analysis/text classification

For a worked example, see the [document search with embeddings tutorial](https://developers.generativeai.google/examples/doc_search_emb).
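To make the search use case concrete, the sketch below ranks a handful of documents by their dot product with a query embedding. It assumes you have already generated one embedding per document. The function `rank_by_similarity`, the document IDs, and the two-dimensional toy vectors are all hypothetical stand-ins chosen so that the example runs without calling the API.

```python
def dot(a, b):
    """Dot product of two equal-length sequences of numbers."""
    return sum(x * y for x, y in zip(a, b))


def rank_by_similarity(query_embedding, doc_embeddings):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scores = [
        (doc_id, dot(query_embedding, embedding))
        for doc_id, embedding in doc_embeddings.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy 2-D "embeddings", made up for illustration only.
toy_docs = {
    "nuts": [0.9, 0.1],
    "weather": [0.1, 0.9],
    "trees": [0.6, 0.4],
}
toy_query = [1.0, 0.0]

ranking = rank_by_similarity(toy_query, toy_docs)
print([doc_id for doc_id, _ in ranking])  # ['nuts', 'trees', 'weather']
```

In a real application, the values in `toy_docs` would be replaced by the `'embedding'` lists returned by `palm.generate_embeddings`.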