## Disclaimer
This notebook uses content from https://github.com/GoogleCloudPlatform/generative-ai/blob/main/language/orchestration/langchain/intro_langchain_palm_api.ipynb

## Prerequisites
* Sign up for a free Google account
* Enable the Vertex AI API
* Set up your free Google account and the billable gcloud project on your local machine.
  * This lets the `aiplatform` gcloud library read the right credentials and project information

In [1]:
# Utils
import time
from typing import List

# Langchain
import langchain
from pydantic import BaseModel

print(f"LangChain version: {langchain.__version__}")

# Vertex AI
from google.cloud import aiplatform
from langchain.chat_models import ChatVertexAI
from langchain.embeddings import VertexAIEmbeddings
from langchain.llms import VertexAI
from langchain.schema import HumanMessage, SystemMessage

print(f"Vertex AI SDK version: {aiplatform.__version__}")

LangChain version: 0.0.229
Vertex AI SDK version: 1.33.1


In [2]:
# Utility functions for Embeddings API with rate limiting
def rate_limit(max_per_minute):
    period = 60 / max_per_minute
    print("Waiting")
    while True:
        before = time.time()
        yield
        after = time.time()
        elapsed = after - before
        sleep_time = max(0, period - elapsed)
        if sleep_time > 0:
            print(".", end="")
            time.sleep(sleep_time)


class CustomVertexAIEmbeddings(VertexAIEmbeddings, BaseModel):
    requests_per_minute: int
    num_instances_per_batch: int

    # Overriding embed_documents method
    def embed_documents(self, texts: List[str]):
        limiter = rate_limit(self.requests_per_minute)
        results = []
        docs = list(texts)

        while docs:
            # Work in batches because the API accepts at most 5
            # documents per embeddings request
            head, docs = (
                docs[: self.num_instances_per_batch],
                docs[self.num_instances_per_batch :],
            )
            chunk = self.client.get_embeddings(head)
            results.extend(chunk)
            next(limiter)

        return [r.values for r in results]
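
The batching and rate-limiting pattern above can be tried without calling Vertex AI. The following is a minimal sketch, not the notebook's implementation: `fake_get_embeddings`, `batched`, and `embed_all` are hypothetical names introduced here, and the fake client simply returns each text's length as a one-element vector. It assumes the same generator-based limiter idea, where the first `next()` call only starts the clock and each later call sleeps for whatever remains of the period.

```python
import time
from typing import Iterator, List, Sequence


def rate_limit(max_per_minute: float) -> Iterator[None]:
    """Generator that spaces consecutive next() calls at least one period apart."""
    period = 60.0 / max_per_minute
    while True:
        before = time.monotonic()
        yield
        elapsed = time.monotonic() - before
        # Sleep only for the part of the period that has not already passed.
        time.sleep(max(0.0, period - elapsed))


def batched(items: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split items into consecutive batches of at most batch_size elements."""
    return [list(items[i:i + batch_size]) for i in range(0, len(items), batch_size)]


def fake_get_embeddings(batch: List[str]) -> List[List[float]]:
    """Hypothetical stand-in for the real embeddings call (assumption)."""
    return [[float(len(text))] for text in batch]


def embed_all(texts: Sequence[str], batch_size: int, requests_per_minute: float) -> List[List[float]]:
    """Embed texts batch by batch, pausing between requests."""
    limiter = rate_limit(requests_per_minute)
    results: List[List[float]] = []
    for batch in batched(texts, batch_size):
        results.extend(fake_get_embeddings(batch))
        next(limiter)  # first call starts the clock; later calls may sleep
    return results


texts = [f"doc {i}" for i in range(12)]
print([len(b) for b in batched(texts, 5)])
print(len(embed_all(texts, batch_size=5, requests_per_minute=6000)))
```

With 12 texts and a batch size of 5, the sketch issues three requests and returns one vector per input text.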

In [4]:
# LLM model
llm = VertexAI(
    model_name="text-bison@001",
    max_output_tokens=256,
    temperature=0.1,
    top_p=0.8,
    top_k=40,
    verbose=True,
)

# Chat
chat = ChatVertexAI()

# Embedding
EMBEDDING_QPM = 100
EMBEDDING_NUM_BATCH = 5
embeddings = CustomVertexAIEmbeddings(
    requests_per_minute=EMBEDDING_QPM,
    num_instances_per_batch=EMBEDDING_NUM_BATCH,
)

In [5]:
# You'll be working with simple strings (that'll soon grow in complexity!)
my_text = "What day comes after Friday?"

llm(my_text)

'Saturday is the day that comes after Friday.'

# Chat Messages
Chat input is like plain text, but each message carries a type (System, Human, or AI):

* System - Helpful context that tells the AI what to do
* Human - Messages intended to represent the user
* AI - Messages showing what the AI responded with
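
To make the three message types concrete, here is a plain-Python sketch of a conversation as a list of typed messages. It is an illustration only and does not use LangChain: the `Message` class, `VALID_ROLES`, and `render` are hypothetical names introduced here, and the message texts are taken from the cells in this notebook.

```python
from dataclasses import dataclass
from typing import List

# Roles mirroring the three message types described above (hypothetical names).
VALID_ROLES = ("system", "human", "ai")


@dataclass(frozen=True)
class Message:
    role: str
    content: str

    def __post_init__(self) -> None:
        # Reject anything that is not one of the three known message types.
        if self.role not in VALID_ROLES:
            raise ValueError(f"unknown role: {self.role!r}")


def render(messages: List[Message]) -> str:
    """Format a conversation as one line per message, prefixed by its role."""
    return "\n".join(f"[{m.role}] {m.content}" for m in messages)


conversation = [
    Message("system", "You are a nice AI bot that helps a user figure out what to eat in one short sentence"),
    Message("human", "I like tomatoes, what should I eat?"),
    Message("ai", "You could try a tomato and mozzarella salad with a balsamic glaze."),
]
print(render(conversation))
```

The ordering matters: the system message sets the context, and the human and AI messages record the turns of the conversation in sequence.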

In [6]:
chat([HumanMessage(content="Hello")])

AIMessage(content=' Hello! How can I help you today?', additional_kwargs={}, example=False)

In [7]:
res = chat(
    [
        SystemMessage(
            content="You are a nice AI bot that helps a user figure out what to eat in one short sentence"
        ),
        HumanMessage(content="I like tomatoes, what should I eat?"),
    ]
)

print(res.content)

 You could try a tomato and mozzarella salad with a balsamic glaze. 


In [8]:
res = chat(
    [
        HumanMessage(
            content="What are the ingredients required for making a tomato sandwich?"
        )
    ]
)
print(res.content)

 The ingredients required for making a tomato sandwich are:

- 2 slices of bread
- 1-2 tomatoes, sliced
- Mayonnaise
- Salt and pepper, to taste
- Optional: Lettuce, cheese, bacon, avocado, etc.


# Text Embedding Model
**Embeddings** represent data (text, images, videos, users, music, and almost anything else) as points in a space where the locations of those points are semantically meaningful. An embedding model transforms your text into a vector, a series of numbers that holds the semantic 'meaning' of the text. These vectors are often used to compare two pieces of text. More formally, an embedding is a relatively low-dimensional space into which you can translate high-dimensional vectors.

In [9]:
text = "Hi! It's time for the beach"

In [10]:
text_embedding = embeddings.embed_query(text)
print(f"Your embedding is length {len(text_embedding)}")
print(f"Here's a sample: {text_embedding[:5]}...")

Your embedding is length 768
Here's a sample: [0.01571330428123474, -0.02349969744682312, 0.02615266852080822, 0.01741267740726471, 0.053909461945295334]...
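
One common way to compare two embedding vectors is cosine similarity, which measures the angle between them. The sketch below uses only the standard library. The three-dimensional vectors are made up for illustration (the embeddings returned above have 768 values), and `cosine_similarity` is a helper defined here, not part of LangChain or the Vertex AI SDK.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors (assumption): two related texts and one unrelated text.
beach = [0.9, 0.1, 0.0]
ocean = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

print(cosine_similarity(beach, ocean))    # close to 1: similar direction
print(cosine_similarity(beach, invoice))  # close to 0: nearly orthogonal
```

In practice, the same function could be applied to two vectors returned by `embeddings.embed_query`, with higher scores suggesting more closely related texts.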
