In this task, you’ll become familiar with the libraries needed to build your LLM application.

Import the following libraries:

google.generativeai: This is Google’s SDK for building AI-powered applications with its generative AI models, such as Gemini Pro.

genai: This is the alias under which google.generativeai is imported. It is used to create the model object using the GenerativeModel() function with gemini-pro as the parameter (model).

textwrap: This library is used to create a function that will show the result in a systematic indented format.

IPython: This library is used in Jupyter Notebooks to display different kinds of outputs that are beyond simple text, e.g., images, audio, video, HTML, etc.

Markdown: This class from IPython.display renders text as formatted Markdown in the notebook output.
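As a rough illustration of the textwrap helper mentioned above, here is a minimal sketch that indents a response as a Markdown block quote. The function name to_blockquote and the sample text are assumptions made for this example, not part of the project's solution.

```python
import textwrap

def to_blockquote(text: str) -> str:
    """Indent every line with '> ' so Markdown renders it as a quote."""
    # Some responses use the '•' character for bullets; swap it for a Markdown bullet.
    text = text.replace("•", "  *")
    # predicate=lambda _: True also prefixes blank lines, so the quote stays unbroken.
    return textwrap.indent(text, "> ", predicate=lambda _: True)

# Hypothetical sample text standing in for a model response.
sample = "Generative AI:\n\n• creates new content\n• learns patterns from data"
formatted = to_blockquote(sample)
print(formatted)
```

In a notebook, the returned string can be passed to Markdown() and shown with display().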

Use the LangChain library to import the following functions for retrieval-augmented generation (RAG):

PyPDFLoader: The langchain_community library has a lot of document loaders. The PyPDFLoader function is used to load text content from PDF files.

RecursiveCharacterTextSplitter: LangChain uses this function to split the documents into smaller chunks for efficient processing.

GoogleGenerativeAIEmbeddings: This function integrates Google’s embedding models with LangChain. Embeddings convert text into numbers that capture semantic meaning, making them suitable for tasks like similarity search in RAG.

ChatGoogleGenerativeAI: This function is an extension of LangChain that helps interact with Google’s generative AI chat models, such as the Gemini Pro model.

Chroma: This is a vector database designed for faster and more efficient similarity search.

RetrievalQA: This function is tailored to build question-answering systems using RAG. It combines retrieval (finding relevant context) with generation (generating an answer based on the retrieved context).

Pass your API key to genai.configure() so that the model can authenticate your requests.

Instantiate a Google Generative AI object using the gemini-pro model.

Note: The model requires your own unique API key. You can generate one on the Google AI Studio website.
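One way to avoid pasting the key directly into the notebook is to read it from an environment variable. The sketch below assumes the key is stored in a variable named GOOGLE_API_KEY; that name and the load_api_key helper are choices made for this example only.

```python
import os

def load_api_key(env_var: str = "GOOGLE_API_KEY") -> str:
    """Return the API key stored in an environment variable, or fail loudly."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key

# Usage in the notebook, once the variable is set (assumed name):
# genai.configure(api_key=load_api_key())
```

Failing early with a clear message is usually easier to debug than an authentication error raised later by the first model call.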


If you’re unsure how to do this, click the “Show Hint” button.

Hide Hint
Use the following structure to import libraries:
import <library> as <alias>
Use the following syntax to import modules from the libraries:
from <library> import <module> as <alias>
Use the following syntax to import methods from the libraries:
from <library>.<module> import <method>
If you’re stuck, click the “Show Solution” button.

Hide Solution
Use the following code to complete this task:
import google.generativeai as genai
from IPython.display import display
from IPython.display import Markdown
import textwrap
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_google_genai import GoogleGenerativeAIEmbeddings, ChatGoogleGenerativeAI
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA

genai.configure(api_key='YOUR API KEY')
model = genai.GenerativeModel('gemini-pro')

In this task, you’ll use a prompt as a question to get a response using the Google Gemini Pro model. The prompt is a question that acts as an input for the text-to-text generation model, which ultimately will give the output.

To complete this task, follow the instructions provided below:

Use the model object created in Task 1 to generate the text.

Pass the prompt to the function so that the model can generate a specific, accurate response. You can experiment with various prompts of your choice.

Display the generated text using the print statement.

Display the generated text using Markdown.

If you’re unsure how to do this, click the “Show Hint” button.

Hide Hint
Use the <model_object>.generate_content() function to generate the response.
Use the response.text attribute to retrieve the generated text.
Use the Markdown() function to display the text as Markdown.
If you’re stuck, click the “Show Solution” button.

Hide Solution
Use the following code to complete this task:
response = model.generate_content("Explain Generative AI with 3 bullet points")
print(response.text)
Markdown(response.text)

In this task, you’ll ask the Gemini model questions, and it will answer. The goal is to observe and retrieve the chat history between you and Gemini.

To complete this task, follow the instructions provided below:

Use the model to start the chat.

Ask a query to the model and generate the response text.

Iterate over the history to print each message of the conversation. This way, you can isolate and find a specific message for observation.

Find and print the number of tokens in the query.
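The history iteration described above can be sketched without calling the API. The objects below are stand-ins that only mimic the shape of a history entry (a role plus a list of parts that carry text); they are not SDK objects, and the message texts are made up for this example.

```python
from types import SimpleNamespace

# Stand-in history: each entry has a role ("user" or "model") and parts with text.
history = [
    SimpleNamespace(role="user", parts=[SimpleNamespace(text="Hi! Suggest a pizza.")]),
    SimpleNamespace(role="model", parts=[SimpleNamespace(text="Try a margherita.")]),
]

def format_history(entries):
    """Return one 'role: text' line per message, in conversation order."""
    return [f"{entry.role}: {entry.parts[0].text}" for entry in entries]

for line in format_history(history):
    print(line)
```

With a real chat session, the same loop would run over the session's history attribute instead of the stand-in list.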

If you’re unsure how to do this, click the “Show Hint” button.

Hide Hint
Use the following steps to proceed with talking to Gemini and retrieving the chat history:
Use the model.start_chat() method to start the chat with the bot. Store the result in a variable hist.
Use the hist.send_message() method to send a query and get the response.
Access the chat history using hist.history.
If you’re stuck, click the “Show Solution” button.

Hide Solution
Use the following code to complete this task:
hist = model.start_chat()
response = hist.send_message("Hi! Give me a recipe to make a margherita pizza from scratch.")
Markdown(response.text)

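# Print every message in the conversation, separated by blank lines
for message in hist.history:
    print(message)
    print('\n\n')
# Text of the most recent message in the history
hist.history[-1].parts[0].text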

model.count_tokens("Now please help me find the nearest supermarket from where I can buy the ingredients.")

You will experiment with the various configuration parameters of the Gemini Pro model and understand the effect of tweaking the parameters. The parameters you’ll focus on are temperature, max_output_tokens, top_k, top_p, and candidate_count. In this task, you'll experiment with the temperature parameter.

The temperature parameter plays an important role in how the Gemini Pro model generates a response. It is used during the sampling phase, mainly when the top_p and top_k parameters are being used. temperature influences the randomness in token selection:

Low temperatures are best for prompts that demand to-the-point, accurate, and concise responses, such as asking for the capital of France or the square root of 16, where there is only one correct answer.

High temperatures lead to diverse and creative outcomes, such as generating a roadmap for learning machine learning or writing a poem about AI. You’ll get varied responses, which you can also customize by asking specific questions about them.
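To see why low temperatures give focused answers and high temperatures give varied ones, consider the common textbook formulation in which token scores are divided by the temperature before being turned into probabilities. The sketch below is an illustration under that assumption; the scores are made up, and it does not claim to reproduce Gemini's internal sampling.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by temperature."""
    if temperature <= 0:
        # Treat 0 as greedy decoding: all probability on the top-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
for t in (0.0, 0.25, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

As the temperature drops, more of the probability concentrates on the top-scoring token, which is why repeated runs at low temperature tend to return the same answer.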

To complete this task, perform the following steps:

Define a get_response() function that takes in the prompt and the configuration parameters, and generates a response.

Use the genai.types.GenerationConfig() function to experiment with the temperature parameter.

If you’re unsure how to do this, click the “Show Hint” button.

Hide Hint
The range of values for the temperature parameter is 0.0–1.0.
Its default value is 0.9 in gemini-pro and 0.4 in gemini-pro-vision.
If you’re stuck, click the “Show Solution” button.

Hide Solution
Use the following code to complete this task:
def get_response(prompt, generation_config={}):
    # Send the prompt with the given generation settings and return the response
    response = model.generate_content(contents=prompt,
                                      generation_config=generation_config)
    return response

# Generate one response per temperature value and compare them
for temp in [0.0, 0.25, 0.5, 0.75, 1.0]:
    config = genai.types.GenerationConfig(temperature=temp)
    result = get_response("Explain the concepts of XGBoost and Random Forest with real-life use cases", generation_config=config)

    print(f"\n\nFor temperature value {temp}, the results are: \n\n")
    display(Markdown(result.text))

The parameter max_output_tokens is the upper limit of tokens generated in a response. Lower output token values lead to shorter responses, while higher token values give longer responses. A token is roughly four characters, so 100 tokens correspond to about 60–80 words. You can adjust the max_output_tokens parameter based on the required response length.
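The four-characters-per-token rule of thumb can be turned into a quick estimate when choosing a value for max_output_tokens. This is only a heuristic sketch; the helper name estimate_tokens is made up for this example, and it is not a real tokenizer.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count (not a real tokenizer)."""
    return max(1, round(len(text) / chars_per_token))

prompt = "Explain the concepts of XGBoost and Random Forest with real-life use cases"
print(estimate_tokens(prompt))

# A 400-character answer is roughly 100 tokens under this rule of thumb.
print(estimate_tokens("x" * 400))
```

For an exact count, the model.count_tokens() method used in the chat task is the better choice.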

In this task, you’ll experiment with the max_output_tokens parameter to get responses of various lengths and compare their quality.

If you’re unsure how to do this, click the “Show Hint” button.

Hide Hint
Use the genai.types.GenerationConfig(max_output_tokens= <parameter_value>) function to define the parameter.
The range of values for max_output_tokens is 1–8192 for gemini-pro model, and the default value is 8192.
Similarly, the range of values for the gemini-pro-vision model is 1–2048, and the default value is 2048.
If you’re stuck, click the “Show Solution” button.

Hide Solution
Use the following code to complete this task:
def get_response(prompt, generation_config={}):
    response = model.generate_content(contents=prompt, generation_config=generation_config)
    return response
for m_o_tok in [1, 50, 100, 150, 200]:
    config = genai.types.GenerationConfig(max_output_tokens=m_o_tok)
    result = get_response("Explain the concepts of XGBoost and Random Forest with real-life use cases", generation_config=config)

    print(f"\n\nFor max output token value {m_o_tok}, the results are: \n\n")
    display(Markdown(result.text))

The top_k parameter is a measure of how many of the most probable tokens are considered at each step. It affects the model’s token selection strategy for generating outputs. A higher top_k value increases the diversity, leading to more creative responses. A top_k of 1 implies a deterministic approach, choosing the most likely token.
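The effect of top_k can be illustrated on a toy next-token distribution. The sketch below keeps the k most probable tokens and renormalizes them; the token names and probabilities are invented for this example, and the model's actual sampling code is not shown in this project.

```python
def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize their probabilities."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in ranked)
    return {tok: p / total for tok, p in ranked}

# Hypothetical next-token distribution.
probs = {"forest": 0.5, "tree": 0.3, "model": 0.15, "banana": 0.05}

print(top_k_filter(probs, 1))  # only the single most likely token remains
print(top_k_filter(probs, 3))  # the three most likely tokens share the probability
```

With k set to 1, there is nothing left to sample from except the most likely token, which matches the deterministic behavior described above.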

In this task, experiment with the top_k parameter for a prompt. Try several values of top_k and compare the responses. You may notice that with top_k set to 1, the response is explained in more detail than with top_k set to 32.

If you’re unsure how to do this, click the “Show Hint” button.

Hide Hint
The range of values for the top_k parameter is 1–40.
Its default value is unspecified in gemini-pro and 32 in gemini-pro-vision model.
If you’re stuck, click the “Show Solution” button.

Hide Solution
Use the following code to complete this task:
def get_response(prompt, generation_config={}):
    response = model.generate_content(contents=prompt, 
    generation_config=generation_config)
    return response

for k in [1, 4, 16, 32, 40]:
    config = genai.types.GenerationConfig(top_k=k)
    result = get_response("Explain the concepts of XGBoost and Random Forest with real-life use cases", generation_config=config)

    print(f"\n\nFor top k value {k}, the results are: \n\n")
    display(Markdown(result.text))

The top_p parameter controls how the AI model chooses words when generating text. It looks at words from most to least likely, adding up their probabilities until reaching the top_p value. Then, it picks the next word from this group, with some influence from the temperature parameter. A lower top_p produces more focused and predictable text, while a higher top_p results in less predictable answers.
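The cumulative-probability selection described above can be sketched on a toy distribution. The token names and probabilities are invented for this example, and the function is an illustration of the idea rather than the model's actual implementation.

```python
def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for tok, prob in ranked:
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= p:
            break  # enough probability mass collected
    total = sum(prob for _, prob in kept)
    return {tok: prob / total for tok, prob in kept}

# Hypothetical next-token distribution.
probs = {"forest": 0.5, "tree": 0.3, "model": 0.15, "banana": 0.05}

print(sorted(top_p_filter(probs, 0.2)))  # a low top_p keeps only the top token
print(sorted(top_p_filter(probs, 0.9)))  # a higher top_p admits more tokens
```

The next token is then sampled from the kept group, which is where the temperature parameter comes in.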

In this task, experiment with the top_p parameter for a prompt. Observe that the response changes drastically when top_p is set to 1 compared to when top_p is set to 0.

If you’re unsure how to do this, click the “Show Hint” button.

Hide Hint
The range of values for the top_p parameter is 0.0–1.0.
The default value is 1.0.
If you’re stuck, click the “Show Solution” button.

Hide Solution
Use the following code to complete this task:
def get_response(prompt, generation_config={}):
    response = model.generate_content(contents=prompt, 
    generation_config=generation_config)
    return response

for p in [0, 0.2, 0.4, 0.8, 1]:
    config = genai.types.GenerationConfig(top_p=p)
    result = get_response("Explain the concepts of XGBoost and Random Forest with real-life use cases", generation_config=config)

    print(f"\n\nFor top p value {p}, the results are: \n\n")
    display(Markdown(result.text))

The candidate_count config parameter sets the number of candidate responses the model generates for a single prompt. A higher value of candidate_count means the model produces more alternatives, giving you more diverse responses to choose from. However, this also increases the computational resources required for generation.

You can only set candidate_count to 1 for the config parameter, since the Gemini Pro model returns a single response for each prompt rather than a variety of options.

If you’re unsure how to do this, click the “Show Hint” button.

Hide Hint
Use the genai.types.GenerationConfig(candidate_count = <parameter_value>) function to configure the settings for text generation.
In this function, set the field candidate_count to a numeric value indicating the number of candidates to generate.
If you’re stuck, click the “Show Solution” button.

Hide Solution
Use the following code to complete this task:
config = genai.types.GenerationConfig(candidate_count=1)
result = get_response("Explain the concepts of XGBoost and Random Forest with real-life use cases", generation_config=config)
Markdown(result.text)

In this task, you’ll load a PDF document and extract its text. To complete this task, perform the following steps:

Load a relevant PDF to answer the questions.

Create a text splitter using the function RecursiveCharacterTextSplitter() to divide the text into chunks of a specific size and with some overlap.

The chunk size sets the maximum length of each text chunk. It is better to keep the chunk size between 500 and 1000 characters for better retrieval accuracy and lower computational cost, and to keep each chunk within the model’s limits.

Chunk overlap is the number of characters that overlap between consecutive chunks. Overlapping chunks help retain context across adjoining chunks when an answer spans a chunk boundary. Here, you’ll keep it between 50 and 100 characters.

Combine the content of all the pages into a single string.

Create the text splitter to break this large string into smaller, manageable pieces.
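To make chunk size and chunk overlap concrete, here is a simplified sliding-window splitter. It is a stand-in for illustration only: RecursiveCharacterTextSplitter also tries to break on separators such as paragraph and line breaks, which this sketch does not do. The function name chunk_text is an assumption for this example.

```python
def chunk_text(text, chunk_size, chunk_overlap):
    """Split text into fixed-size windows that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

chunks = chunk_text("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=3)
for chunk in chunks:
    print(chunk)
```

Each chunk starts with the last three characters of the previous one, so text that straddles a boundary still appears whole in at least one chunk, provided it is no longer than the overlap.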

If you’re unsure how to do this, click the “Show Hint” button.

Hide Hint
Use the PyPDFLoader() function to load the data.
Use the load_and_split() function to split the data.
Use the RecursiveCharacterTextSplitter() to create a text-splitter. This function takes the following parameters:
chunk_size: This defines the maximum length of each text chunk.
chunk_overlap: This defines the number of overlapping characters between consecutive chunks.
If you’re stuck, click the “Show Solution” button.

Hide Solution
Use the following code to complete this task:

CHUNK_SIZE = 700
CHUNK_OVERLAP = 100
pdf_path = "https://www.analytixlabs.co.in/assets/pdfs/Data_Engineering%20&_Other_Job_Roles-AnalytixLabs.pdf"
Use the following code to load and split the PDF document:

pdf_loader = PyPDFLoader(pdf_path)
split_pdf_document = pdf_loader.load_and_split()
Use the following code to split the PDF into custom-sized chunks:

# Splitting text into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=CHUNK_SIZE, chunk_overlap=CHUNK_OVERLAP)
context = "\n\n".join(str(p.page_content) for p in split_pdf_document)
texts = text_splitter.split_text(context)

Continuing from the previous task, you have the text chunks ready. Now, create the Gemini model and then the embeddings.

To complete this task, perform the following steps:

Use the ChatGoogleGenerativeAI() function to instantiate a Gemini Pro model. Pass appropriate values for google_api_key and the temperature parameters.

For creating the embeddings, use the GoogleGenerativeAIEmbeddings() function to instantiate the models/embedding-001 model (also pass the google_api_key).

Use the Chroma() function with the split text (as prepared in Task 10) and the embedding model to generate the embeddings. The texts will be converted into dense vector representations. These vectors capture the semantic meaning of each text chunk and will be used to find similarities between queries and the texts.

Transform the Chroma index into a retriever object to retrieve texts for the questions asked. Pass "search_kwargs={"k": 5}" as the parameter to return the top 5 most similar documents or chunks to your queries.
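The idea of returning the top k most similar chunks can be sketched with toy vectors. Cosine similarity is used here as one common similarity measure; the metric and index that the vector store actually uses are not shown in this project. The chunk names and three-dimensional vectors are invented for this example.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k_chunks(query_vec, chunk_vecs, k):
    """Return the ids of the k chunks most similar to the query."""
    ranked = sorted(
        chunk_vecs,
        key=lambda cid: cosine_similarity(query_vec, chunk_vecs[cid]),
        reverse=True,
    )
    return ranked[:k]

# Toy 3-dimensional "embeddings"; real embedding vectors are much longer.
chunk_vecs = {
    "spark_tools": [0.9, 0.1, 0.0],
    "salary_info": [0.1, 0.9, 0.1],
    "sql_tools": [0.8, 0.2, 0.1],
}
query_vec = [1.0, 0.0, 0.0]  # hypothetical embedding of a question about tools

print(top_k_chunks(query_vec, chunk_vecs, k=2))
```

The retriever plays the same role in the project: it embeds the query, ranks the stored chunks, and hands the best matches to the model.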

If you’re unsure how to do this, click the “Show Hint” button.

Hide Hint
Use the ChatGoogleGenerativeAI() function to create the model.
Use the GoogleGenerativeAIEmbeddings() function to create the embeddings.
Then, use the Chroma.from_texts() function to create the vector database from the chunked texts and the embedding model created earlier. The function takes the following parameters:
texts: The generated text chunks from the previous task.
embedding: The embedding model created earlier.
Finally, call as_retriever() on the vector database and pass search_kwargs to set the top k most similar documents or chunks returned for each query.
If you’re stuck, click the “Show Solution” button.

Hide Solution
Use the following code to generate the model:
gemini_model = ChatGoogleGenerativeAI(model='gemini-pro', google_api_key=<your_api_key>, temperature=0.8)
Use the following code to generate the embeddings:
embeddings = GoogleGenerativeAIEmbeddings(model="models/embedding-001", google_api_key=<your_api_key>)
Use the following code to create the vector database and a retriever:
vector_index = Chroma.from_texts(texts, embeddings)
retriever = vector_index.as_retriever(search_kwargs={"k": 5})

Finally, in this task, you will create the RAG question answering chain.

To complete this task, perform the following steps:

Use the RetrievalQA class to create the RAG question answering chain.

After the question answering chain is created, ask a question by passing a dictionary in which the question is the value of the "query" key.
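Conceptually, a retrieval question answering chain places the retrieved chunks in front of the question before sending everything to the model. The sketch below shows that step in plain Python. The prompt wording, the build_rag_prompt helper, and the sample chunks are assumptions for this example and are not LangChain's actual template.

```python
def build_rag_prompt(question, retrieved_chunks):
    """Place retrieved chunks ahead of the question in a single prompt string."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Use the following context to answer the question. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Made-up chunks standing in for what a retriever might return.
chunks = ["Data engineers build pipelines.", "They often work with SQL and Spark."]
prompt = build_rag_prompt("Which tools do data engineers use?", chunks)
print(prompt)
```

Because the model sees the context and the question together, its answer can be grounded in the document rather than in its training data alone.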

If you’re unsure how to do this, click the “Show Hint” button.

Hide Hint
Use the RetrievalQA.from_chain_type() function to create the RAG question answering chain. This function will take the following parameters:
llm: The language model used for answering questions.
retriever: The retriever used for fetching relevant documents.
return_source_documents: The flag to include source documents in the result.
Use the qa_chain.invoke() method to get the response specific to the document (using a query).
If you’re stuck, click the “Show Solution” button.

Hide Solution
Use the following code to generate the RAG question answering chain:
qa_chain = RetrievalQA.from_chain_type(gemini_model, retriever=retriever, return_source_documents=True)
Use the following code to ask a question and print the answer:
# Example usage 
question = "Which tools do Data Engineers primarily work with?"
result = qa_chain.invoke({"query": question})
print("Answer:", result["result"])