# Enhance Text Generation with RAG, LangChain, and Vertex AI

In [1]:
%%capture
!pip -q install langchain_experimental langchain_core
!pip -q install google-generativeai==0.3.1
!pip -q install google-ai-generativelanguage==0.4.0
!pip -q install langchain-google-genai
!pip -q install wikipedia
!pip -q install docarray
!pip -q install --upgrade protobuf google.protobuf

In [2]:
# Before running the next cells, restart the kernel by clicking the refresh icon in the top toolbar.

In [3]:
import os
import google.generativeai as genai

from IPython.display import display
from IPython.display import Markdown
from google.protobuf.empty_pb2 import Empty

key_name = !gcloud services api-keys list --filter="gemini-api-key" --format="value(name)"
key_name = key_name[0]

api_key = !gcloud services api-keys get-key-string $key_name --location="us-central1" --format="value(keyString)"
api_key = api_key[0]

os.environ["GOOGLE_API_KEY"] = api_key

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

In [4]:
models = list(genai.list_models())
models

[Model(name='models/chat-bison-001',
       base_model_id='',
       version='001',
       display_name='PaLM 2 Chat (Legacy)',
       description='A legacy text-only model optimized for chat conversations',
       input_token_limit=4096,
       output_token_limit=1024,
       supported_generation_methods=['generateMessage', 'countMessageTokens'],
       temperature=0.25,
       max_temperature=None,
       top_p=0.95,
       top_k=40),
 Model(name='models/text-bison-001',
       base_model_id='',
       version='001',
       display_name='PaLM 2 (Legacy)',
       description='A legacy model that understands text and generates text as an output',
       input_token_limit=8196,
       output_token_limit=1024,
       supported_generation_methods=['generateText', 'countTextTokens', 'createTunedTextModel'],
       temperature=0.7,
       max_temperature=None,
       top_p=0.95,
       top_k=40),
 Model(name='models/embedding-gecko-001',
       base_model_id='',
       version='001',
      

# Using Gemini directly with Python SDK

In [5]:
# generate text
prompt = 'Who are you and what can you do?'

model = genai.GenerativeModel('gemini-pro')

response = model.generate_content(prompt)

Markdown(response.candidates[0].content.parts[0].text)

**I am Gemini, a multimodal AI language model developed by Google.**

**My capabilities include:**

* **Natural Language Processing (NLP):** I can understand and generate human language, including answering questions, translating languages, writing different kinds of texts, and engaging in conversations.
* **Information Retrieval:** I can access and search large amounts of text data to find specific information or provide comprehensive summaries.
* **Summarization and Text Generation:** I can condense long pieces of text into shorter summaries, extract key points, or generate new text based on given prompts.
* **Translation:** I can translate text between over 100 languages, preserving the meaning and context of the original content.
* **Code Generation:** I can generate code in various programming languages based on natural language instructions or examples.
* **Question Answering:** I can provide answers to factual questions based on my knowledge of the world, which I have acquired through extensive training on a massive dataset of text and code.
* **Dialogue Management:** I can engage in multi-turn conversations, understand the context of the discussion, and respond in a relevant and engaging manner.

I am continuously learning and improving my capabilities through ongoing training and updates.

# Using Gemini with LangChain

## Basic LLM Chain

In [6]:
from langchain_core.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-pro",
                             temperature=0.7)


result = llm.invoke("What is a LLM?")

Markdown(result.content)

LLM stands for "Large Language Model".

LLMs are a type of artificial intelligence (AI) that are trained on massive datasets of text and code. They are designed to understand and generate human-like language, and can perform a wide range of language-related tasks, such as:

* **Text generation:** Creating new text, such as stories, articles, or code.
* **Text translation:** Translating text from one language to another.
* **Question answering:** Answering questions based on a given context.
* **Summarization:** Summarizing long pieces of text into shorter, more concise summaries.
* **Chatbots:** Engaging in natural language conversations with humans.

LLMs are trained using unsupervised learning, which means that they are not explicitly programmed with rules or instructions. Instead, they learn by identifying patterns and relationships in the data they are trained on. This allows them to develop a deep understanding of language and the ability to generate text that is both coherent and informative.

Some of the most well-known LLMs include GPT-3, BLOOM, and Gemini. These models have been used to develop a wide range of applications, including chatbots, language translation tools, and writing assistants.

LLMs are still under development, but they have the potential to revolutionize many industries and applications that rely on language.

In [7]:
for chunk in llm.stream("Write a haiku about LLMs."):
    print(chunk.content)
    print("---")

Words flow effortlessly,
Guided by algorithms,
AI's poetic voice.
---

---
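
Each chunk yielded by `llm.stream` carries a piece of the reply in `.content`. A minimal sketch of joining the pieces back into one string is shown here. To run without an API key it assumes a stand-in generator, `fake_stream`, and a stand-in `Chunk` class; both are hypothetical and not part of LangChain.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Chunk:
    """Hypothetical stand-in for a streamed message chunk."""
    content: str


def fake_stream(text: str, size: int = 12) -> Iterator[Chunk]:
    """Yield `text` in fixed-size pieces, mimicking a streamed reply."""
    for start in range(0, len(text), size):
        yield Chunk(content=text[start:start + size])


def collect(chunks: Iterator[Chunk]) -> str:
    """Concatenate chunk contents in arrival order."""
    return "".join(chunk.content for chunk in chunks)


haiku = "Words flow effortlessly,\nGuided by algorithms,\nAI's poetic voice."
full_text = collect(fake_stream(haiku))
print(full_text)
```

With a real model, the same `collect` idea would apply to the chunks from `llm.stream(...)`, assuming each chunk's `.content` is a string.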


## Basic Multi Chain

In [9]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser


model = ChatGoogleGenerativeAI(model="gemini-pro", temperature=0.7)

In [10]:
prompt = ChatPromptTemplate.from_template(
    "tell me a short joke about {topic}"
)

output_parser = StrOutputParser()

In [11]:
chain = prompt | model | output_parser
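
The `|` operator feeds each step's output into the next step. A toy sketch of that idea in plain Python follows. The `Step` class is hypothetical and is not how LangChain implements its runnables; it only illustrates left-to-right composition.

```python
from typing import Any, Callable


class Step:
    """Hypothetical wrapper that lets plain functions be chained with `|`."""

    def __init__(self, func: Callable[[Any], Any]):
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # Compose left to right: run self first, then pass the result to other.
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value: Any) -> Any:
        return self.func(value)


# Stand-ins for the prompt, the model, and the output parser.
fill_prompt = Step(lambda d: f"tell me a short joke about {d['topic']}")
echo_model = Step(lambda text: {"content": text.upper()})  # fake "model"
parse_output = Step(lambda message: message["content"])

toy_chain = fill_prompt | echo_model | parse_output
print(toy_chain.invoke({"topic": "machine learning"}))
```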

In [12]:
chain.invoke({"topic": "machine learning"})

"Why did the machine learning algorithm get lost?\n\nBecause it didn't have a good map-ping function."

## A more complicated Chain - RAG

In [26]:
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_google_genai import GoogleGenerativeAIEmbeddings
from langchain_community.vectorstores import DocArrayInMemorySearch

In [27]:
model = ChatGoogleGenerativeAI(model="gemini-pro", temperature=0.7)

In [28]:
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

In [29]:
from langchain_community.document_loaders import WikipediaLoader

# Use the Wikipedia loader to fetch some documents to index
docs = WikipediaLoader(query="Machine Learning", load_max_docs=10).load()
docs += WikipediaLoader(query="Deep Learning", load_max_docs=10).load()
docs += WikipediaLoader(query="Neural Networks", load_max_docs=10).load()

# Take a look at a single document
docs[0]

Document(metadata={'title': 'Machine learning', 'summary': 'Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data, and thus perform tasks without explicit instructions. Quick progress in the field of deep learning, beginning in 2010s, allowed neural networks to surpass many previous approaches in performance.\nML finds application in many fields, including natural language processing, computer vision, speech recognition, email filtering, agriculture, and medicine. The application of ML to business problems is known as predictive analytics.\nStatistics and mathematical optimization (mathematical programming) methods comprise the foundations of machine learning. Data mining is a related field of study, focusing on exploratory data analysis (EDA) via unsupervised learning. \nFrom a theoretical viewpoint, probably approximately correct (PAC) learning prov

In [30]:
vectorstore = DocArrayInMemorySearch.from_documents(
    docs,
    embedding=embeddings  # embedding model used to vectorize the documents
)

retriever = vectorstore.as_retriever()
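
A vector store retriever ranks documents by how close their embeddings are to the query embedding. A minimal sketch of that ranking with cosine similarity follows. The three-dimensional vectors are made up for illustration and are not real `embedding-001` output, and `top_k` is a hypothetical helper, not a LangChain function.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec: Sequence[float],
          doc_vecs: Dict[str, Sequence[float]],
          k: int = 2) -> List[str]:
    """Return the k document titles most similar to the query vector."""
    ranked = sorted(
        doc_vecs,
        key=lambda title: cosine_similarity(query_vec, doc_vecs[title]),
        reverse=True,
    )
    return ranked[:k]


# Made-up toy embeddings keyed by document title.
toy_vectors = {
    "Machine learning": [0.9, 0.1, 0.0],
    "Deep learning": [0.7, 0.6, 0.1],
    "Graph neural network": [0.1, 0.3, 0.9],
}
query = [1.0, 0.2, 0.0]
print(top_k(query, toy_vectors))
```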

In [31]:
retriever.get_relevant_documents("what is machine learning?")

[Document(metadata={'title': 'Machine learning', 'summary': 'Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data, and thus perform tasks without explicit instructions. Quick progress in the field of deep learning, beginning in 2010s, allowed neural networks to surpass many previous approaches in performance.\nML finds application in many fields, including natural language processing, computer vision, speech recognition, email filtering, agriculture, and medicine. The application of ML to business problems is known as predictive analytics.\nStatistics and mathematical optimization (mathematical programming) methods comprise the foundations of machine learning. Data mining is a related field of study, focusing on exploratory data analysis (EDA) via unsupervised learning. \nFrom a theoretical viewpoint, probably approximately correct (PAC) learning pro

In [32]:
retriever.get_relevant_documents("what is gemini pro?")

[Document(metadata={'title': 'Graph neural network', 'summary': 'A graph neural network (GNN) belongs to a class of artificial neural networks for processing data that can be represented as graphs.\n\nIn the more general subject of "geometric deep learning", certain existing neural network architectures can be interpreted as GNNs operating on suitably defined graphs. A convolutional neural network layer, in the context of computer vision, can be considered a GNN applied to graphs whose nodes are pixels and only adjacent pixels are connected by edges in the graph. A transformer layer, in natural language processing, can be considered a GNN applied to complete graphs whose nodes are words or tokens in a passage of natural language text.\nThe key design element of GNNs is the use of pairwise message passing, such that graph nodes iteratively update their representations by exchanging information with their neighbors. Several GNN architectures have been proposed, which implement different 
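
The query about Gemini Pro still returns documents, even though the top hit looks unrelated: a nearest-neighbour search returns the closest matches however far away they are. One possible guard is to drop weak matches before they reach the prompt. The sketch below uses made-up similarity scores and a hypothetical `min_score` cutoff; a sensible value would have to be tuned for the real embedding model.

```python
from typing import List, Tuple


def filter_by_score(scored: List[Tuple[str, float]],
                    min_score: float = 0.75) -> List[str]:
    """Keep only titles whose similarity score reaches the cutoff."""
    return [title for title, score in scored if score >= min_score]


# Made-up (title, similarity) pairs for an on-topic and an off-topic query.
relevant = [("Machine learning", 0.91), ("Deep learning", 0.84)]
off_topic = [("Graph neural network", 0.52), ("Neural network", 0.49)]

print(filter_by_score(relevant))
print(filter_by_score(off_topic))
```

An empty result gives the chain a chance to answer that the context does not cover the question instead of guessing.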

In [33]:
template = """Answer the question in a full sentence, based only on the following context:
{context}

Return your answer in three backticks

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
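
`ChatPromptTemplate.from_template` substitutes the `{context}` and `{question}` placeholders at run time. A rough sketch of the resulting text using plain `str.format` follows. The shortened template and the context string are made up for illustration, and the real template wraps the result in a chat message rather than returning a bare string.

```python
# Shortened, illustrative template with the same two placeholders.
toy_template = """Answer the question based only on the following context:
{context}

Question: {question}
"""

rendered = toy_template.format(
    context="The transformer architecture was introduced in 2017.",
    question="When was the transformer invented?",
)
print(rendered)
```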

In [34]:
from langchain.schema.runnable import RunnableMap

In [35]:
retriever.get_relevant_documents("What is a gradient boosted tree?")

[Document(metadata={'title': 'Boosting (machine learning)', 'summary': 'In machine learning, boosting is an ensemble meta-algorithm for primarily reducing bias, variance. It is used in supervised learning and a family of machine learning algorithms that convert weak learners to strong ones.\nThe concept of boosting is based on the question posed by Kearns and Valiant (1988, 1989): "Can a set of weak learners create a single strong learner?" A weak learner is defined as a classifier that is only slightly correlated with the true classification (it can label examples better than random guessing). A strong learner is a classifier that is arbitrarily well-correlated with the true classification. Robert Schapire answered the question in the affirmative in a paper published in 1990.This has had significant ramifications in machine learning and statistics, most notably leading to the development of boosting.\nInitially, the hypothesis boosting problem simply referred to the process of turning

In [36]:
chain = RunnableMap({
    "context": lambda x: retriever.get_relevant_documents(x["question"]),
    "question": lambda x: x["question"]
}) | prompt | model | output_parser
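
In this chain, `context` is a list of `Document` objects, so the prompt most likely receives their string representation, metadata included. One common variation is to join only the page text first. A minimal sketch follows, assuming a hypothetical `format_docs` helper and a stand-in `Doc` class in place of LangChain's `Document`; the toy documents are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in exposing the same two fields as a Document."""
    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


def format_docs(docs: List[Doc]) -> str:
    """Join page text with blank lines, leaving metadata out of the prompt."""
    return "\n\n".join(doc.page_content for doc in docs)


toy_docs = [
    Doc("Boosting combines weak learners.", {"title": "Boosting"}),
    Doc("Gradient boosting fits trees to residuals.", {"title": "Gradient boosting"}),
]
context_text = format_docs(toy_docs)
print(context_text)
```

In the notebook, such a helper could wrap the retriever call inside the `context` lambda, assuming the retrieved objects expose `page_content`.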

In [37]:
chain.invoke({"question": "What is machine learning?"})

'```Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data.```'

In [38]:
chain.invoke({"question": "When was the transformer invented?"})

'```2017```'
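
The prompt asks for the answer wrapped in three backticks, so downstream code usually needs to unwrap it. A small sketch with a hypothetical `strip_backticks` helper follows. It falls back to the trimmed input when the model ignores the formatting instruction.

```python
import re

FENCE = "`" * 3  # three backticks, built programmatically for readability
PATTERN = re.compile(rf"\s*{FENCE}(.*?){FENCE}\s*", flags=re.DOTALL)


def strip_backticks(answer: str) -> str:
    """Return the text between the backtick fences, or the trimmed input."""
    match = PATTERN.fullmatch(answer)
    return match.group(1).strip() if match else answer.strip()


print(strip_backticks(FENCE + "2017" + FENCE))
```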