### What are LLMs and PaLM?

This document provides an overview of large language models (LLMs) and Google's next-generation LLM, PaLM 2. LLMs are deep learning models trained on massive text datasets. They are trained with unsupervised learning: the model learns to predict the next word in a sentence given the preceding words, which enables it to generate coherent, fluent text resembling human writing. The model's large size allows it to learn complex patterns and relationships in language and to generate high-quality text for a variety of applications. PaLM 2 excels at tasks like advanced reasoning, translation, and code generation, and it builds on Google's legacy of breakthrough research in machine learning and responsible AI.

### Available Models

Vertex AI PaLM API models
The Vertex AI PaLM API enables you to test, customize, and deploy instances of Google’s large language models (LLMs), called PaLM, so that you can leverage the capabilities of PaLM in your applications.

Model naming scheme
Foundation model names have three components: use case, model size, and version number. The naming convention is in the format:
`<use case>-<model size>@<version number>`

For example, _text-bison@001_ represents the Bison text model, version _001_.

The model sizes are as follows:

* __Bison__: The best value in terms of capability and cost.
* __Gecko__: The smallest and cheapest model for simple tasks.
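
The naming scheme above is regular enough to split mechanically. The following is a minimal sketch, not part of the Vertex AI SDK: `ModelName` and `parse_model_name` are hypothetical helpers introduced here only to illustrate the `<use case>-<model size>@<version number>` format.

```python
from typing import NamedTuple


class ModelName(NamedTuple):
    """Hypothetical container for the three parts of a foundation model name."""
    use_case: str
    size: str
    version: str


def parse_model_name(name: str) -> ModelName:
    """Split '<use case>-<model size>@<version number>' into its three parts."""
    base, sep, version = name.partition("@")
    if not sep or not version:
        raise ValueError(f"missing '@<version number>' in {name!r}")
    # Split on the last hyphen so the size is always the final component.
    use_case, sep, size = base.rpartition("-")
    if not sep or not use_case or not size:
        raise ValueError(f"missing '<use case>-<model size>' in {name!r}")
    return ModelName(use_case, size, version)


print(parse_model_name("text-bison@001"))
```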

Available models
The Vertex AI PaLM API currently supports six models:

* `text-bison@001` : Fine-tuned to follow natural language instructions and is suitable for a variety of language tasks.

* `chat-bison@001` : Fine-tuned for multi-turn conversation use cases like building a chatbot.

* `textembedding-gecko@001` : Returns model embeddings for text inputs.

* `code-bison@001`: A model fine-tuned to generate code based on a natural language description of the desired code. For example, it can generate a unit test for a function.

* `code-gecko@001`: A model fine-tuned to suggest code completion based on the context in code that's written.

* `codechat-bison@001`: A model fine-tuned for chatbot conversations that help with code-related questions.

You can find more information about the properties of these foundational models in the [Generative AI Studio documentation](https://cloud.google.com/vertex-ai/docs/generative-ai/learn/models#foundation_models).

In [1]:
# import libraries
import os
import vertexai
from IPython.display import Markdown, display
from google.oauth2 import service_account
from dotenv import load_dotenv
from vertexai.language_models import (
    ChatModel,
    CodeChatModel,
    CodeGenerationModel,
    InputOutputTextPair,
    TextGenerationModel,
)

In [2]:
# initiate service account (authentication)
json_path = '../../llm-ai.json' # replace with your own service account
credentials = service_account.Credentials.from_service_account_file(json_path)

In [3]:
# start Vertex AI
load_dotenv()
vertexai.init(project=os.environ["PROJECT_ID"], # replace with your own project
              credentials=credentials)

# Text generation with `text-bison@001`

The text generation model from the PaLM API that you will use in this notebook is `text-bison@001`. It is fine-tuned to follow natural language instructions and is suitable for a variety of language tasks, such as:

* Classification
* Sentiment analysis
* Entity extraction
* Extractive question-answering
* Summarization
* Re-writing text in a different style
* Ad copy generation
* Concept ideation
* Concept simplification

In [4]:
# load the model
generation_model = TextGenerationModel.from_pretrained("text-bison@001")

__Hello PaLM__

Create the first prompt and send it to the text generation model.

In [5]:
prompt = "What is PaLM AI?"

response = generation_model.predict(prompt=prompt)

print(response.text)

PaLM AI (Pathways Language Model) is a large language model from Google AI. It was trained on a massive dataset of text and code, and it can understand and generate human language in a variety of ways. PaLM AI is still under development, but it has already been shown to be capable of impressive feats, such as writing different kinds of creative text, translating languages, answering your questions, and writing different kinds of creative text.


### Try Another Prompt

In [8]:
prompt = """Generate a list of AI company name ideas.
A list should be 10 bullet points.
Each company name should be 2 words or less.""" # try your own prompt

response = generation_model.predict(prompt=prompt)

# return as markdown
display(Markdown(response.text))

1. **A.I.**
2. **B.R.A.I.N.**
3. **Cognito**
4. **Intelli**
5. **MindMeld**
6. **Nervana**
7. **OpenAI**
8. **PaLM**
9. **Watson**
10. **X.AI.**

### Prompt Template

Prompt templates are useful if you have found a good way to structure your prompt that you can re-use. They can also be helpful in limiting the open-endedness of freeform prompts. There are many ways to implement prompt templates, and below is just one example using `f-strings`.

In [10]:
my_ingredient = "meat" # try changing this to a different ingredient

response = generation_model.predict(
    prompt=f"""Create a list of trending menus made from {my_ingredient} as main ingredient.
    Each trend should be less than 3 words."""
)

display(Markdown(response.text))

* **Meatless Mondays** - A trend where people eat meatless meals on Mondays to reduce their meat consumption.
* **Meatless burgers** - A plant-based burger that is made to look and taste like a traditional beef burger.
* **Vegan hot dogs** - A plant-based hot dog that is made to look and taste like a traditional pork hot dog.
* **Tofu scramble** - A dish made from scrambled tofu that is seasoned to taste like scrambled eggs.
* **Chickpea salad** - A salad made from chickpeas, tomatoes, cucumbers, and other vegetables.
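
If the same template is reused in several places, it can help to keep it in one function. The sketch below is one possible arrangement, assuming the prompt wording from the cell above; `MENU_PROMPT` and `build_menu_prompt` are hypothetical names, and `str.format` is used instead of an f-string so the template can live in a constant.

```python
# Template text copied from the cell above; {ingredient} is the only placeholder.
MENU_PROMPT = (
    "Create a list of trending menus made from {ingredient} as main ingredient.\n"
    "Each trend should be less than 3 words."
)


def build_menu_prompt(ingredient: str) -> str:
    """Fill the template with one ingredient, rejecting empty input."""
    ingredient = ingredient.strip()
    if not ingredient:
        raise ValueError("ingredient must not be empty")
    return MENU_PROMPT.format(ingredient=ingredient)


print(build_menu_prompt("tofu"))
```

The returned string can then be passed as the `prompt` argument of `generation_model.predict`.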

### Model parameters for `text-bison@001`

You can customize how the PaLM API behaves in response to your prompt by using the following parameters for `text-bison@001`:

* `temperature`: higher means more "creative" responses
* `max_output_tokens`: sets the max number of tokens in the output
* `top_p`: higher means it will pull from more possible next tokens, based on cumulative probability
* `top_k`: higher means it will sample from more possible next tokens

For more detail on model parameters, refer to this [documentation](https://cloud.google.com/vertex-ai/docs/generative-ai/learn/models#text_model_parameters).

##### 1. The `temperature` parameter (range: 0.0 - 1.0, default 0)

The temperature is used for sampling during the response generation, which occurs when top_p and top_k are applied. Temperature controls the degree of randomness in token selection.

Lower temperatures are good for prompts that require a more deterministic and less open-ended response. In comparison, higher temperatures can lead to more "creative" or diverse results. A temperature of 0 is deterministic: the highest probability response is always selected. For most use cases, try starting with a temperature of 0.2.

A higher temperature value will result in a more exploratory output, with a higher likelihood of generating rare or unusual words or phrases. Conversely, a lower temperature value will result in a more conservative output, with a higher likelihood of generating common or expected words or phrases.

For example, the following illustrative completions show the kind of text each setting tends to favor:

`temperature = 0.0:`

* _The cat sat on the couch, watching the birds outside._
* _The cat sat on the windowsill, basking in the sun._


`temperature = 0.9:`

* _The cat sat on the moon, meowing at the stars._
* _The cat sat on the cheeseburger, purring with delight._
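
To see why temperature changes how adventurous the output is, it helps to look at the usual textbook formulation, in which token scores are divided by the temperature before being turned into probabilities. The sketch below assumes that standard softmax scaling; it is not taken from the PaLM API internals, and the scores in `logits` are made up for illustration.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        # Treat 0 as greedy decoding: all probability on the highest-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
for t in (0.0, 0.2, 1.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

Lower temperatures concentrate probability on the top-scoring token, while higher temperatures leave more probability for the alternatives.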

In [15]:
temp_val = 0.0
prompt_temperature = "Complete the sentence: As I walked around in the downtown, I found myself in:"

response = generation_model.predict(
    prompt=prompt_temperature,
    temperature=temp_val,
)

print(f"[temperature = {temp_val}]")
display(Markdown(response.text))

[temperature = 0.0]


a park.

In [16]:
temp_val = 1.0
prompt_temperature = "Complete the sentence: As I walked around in the downtown, I found myself in:"

response = generation_model.predict(
    prompt=prompt_temperature,
    temperature=temp_val,
)

print(f"[temperature = {temp_val}]")
display(Markdown(response.text))

[temperature = 1.0]


the central business district.

##### 2. The `max_output_tokens` parameter (range: 1 - 1024, default 128)

__Tokens__
A token may be smaller than a word: one token is approximately four characters, so 100 tokens correspond to roughly 60-80 words. It is important to keep token counts in mind because models limit both input and output tokens.

__What is max_output_tokens?__
max_output_tokens is the maximum number of tokens that can be generated in the response.

__How does max_output_tokens affect the response?__
Specify a lower value for shorter responses and a higher value for longer responses. If the limit is reached before the model finishes, the response is cut off mid-text, as the output for `max_output_tokens = 5` below shows.
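
The four-characters-per-token rule of thumb can be turned into a quick budget check. This is only a rough sketch under that assumption: `estimate_tokens` and `fits_in_budget` are hypothetical helpers, and the real tokenizer will give different counts.

```python
import math

CHARS_PER_TOKEN = 4  # rule of thumb from the text above, not an exact tokenizer


def estimate_tokens(text: str) -> int:
    """Rough token count: one token per four characters, rounded up."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_budget(text: str, max_tokens: int) -> bool:
    """Return True if the estimated token count stays within max_tokens."""
    return estimate_tokens(text) <= max_tokens


sample = "List ten ways that generative AI can help improve the manufacturing industry"
print(estimate_tokens(sample), fits_in_budget(sample, 128))
```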

In [17]:
max_output_tokens_val = 5

response = generation_model.predict(
    prompt="List ten ways that generative AI can help improve the manufacturing industry",
    max_output_tokens=max_output_tokens_val,
)

print(f"[max_output_tokens = {max_output_tokens_val}]")
display(Markdown(response.text))

[max_output_tokens = 5]


1. **Automat

In [18]:
max_output_tokens_val = 500

response = generation_model.predict(
    prompt="List ten ways that generative AI can help improve the manufacturing industry",
    max_output_tokens=max_output_tokens_val,
)

print(f"[max_output_tokens = {max_output_tokens_val}]")
display(Markdown(response.text))

[max_output_tokens = 500]


1. **Automate product design.** Generative AI can be used to create new product designs, which can save time and money for manufacturers.
2. **Optimize manufacturing processes.** Generative AI can be used to optimize manufacturing processes, such as by identifying the most efficient way to assemble a product or by finding ways to reduce waste.
3. **Improve quality control.** Generative AI can be used to identify defects in products, which can help manufacturers to improve quality control and reduce the number of products that need to be scrapped.
4. **Personalize products.** Generative AI can be used to create personalized products, such as by customizing the design of a product or by suggesting features that would be most appealing to a particular customer.
5. **Reduce costs.** Generative AI can help manufacturers to reduce costs by automating tasks, optimizing processes, and improving quality control.
6. Increase productivity.** Generative AI can help manufacturers to increase productivity by automating tasks, optimizing processes, and improving quality control.
7. **Speed up product development.** Generative AI can speed up product development by automating tasks, such as generating product designs and identifying defects.
8. **Improve customer satisfaction.** Generative AI can help manufacturers to improve customer satisfaction by providing personalized products and services, and by reducing the number of defects in products.
9. **Create new business opportunities.** Generative AI can help manufacturers to create new business opportunities by developing new products and services, and by entering new markets.
10. **Stay ahead of the competition.** Generative AI can help manufacturers to stay ahead of the competition by automating tasks, optimizing processes, and improving quality control.

##### 3. The `top_p` parameter (range: 0.0 - 1.0, default 0.95)

__What is top_p?__
top_p controls how the model selects tokens for output by applying a cumulative probability cutoff to the distribution over the next token. Specifically, it keeps the smallest set of most probable tokens whose cumulative probability reaches the cutoff p, and samples the next token from this set according to the tokens' relative probabilities.

For example, suppose tokens `A`, `B`, and `C` have probabilities of `0.3`, `0.2`, and `0.1`, and the top_p value is __0.5__. In that case, the model will select either A or B as the next token (using temperature) and not consider C, because the cumulative probability of A and B already reaches the 0.5 cutoff. Specify a lower value for less random responses and a higher value for more random responses.

__How does top_p affect the response?__
The top_p parameter is used to control the diversity of the generated text. A _higher_ top_p parameter value results in more "diverse" and "interesting" outputs, with the model being allowed to sample from a larger pool of possibilities. In contrast, a _lower_ top_p parameter value results in more predictable outputs, with the model being constrained to a smaller set of possible tokens.

Example:

`top_p = 0.1:`

* _The cat sat on the mat._
* _The cat sat on the floor._

`top_p = 0.9:`

* _The cat sat on the windowsill, soaking up the sun's rays._
* _The cat sat on the edge of the bed, watching the birds outside._
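
The `A`, `B`, `C` example can be reproduced with a few lines of code. The sketch below is an illustration of the general top_p (nucleus) filtering idea, not the PaLM API's own implementation; `top_p_filter` is a hypothetical helper, and renormalizing the kept probabilities is an assumption made here so they sum to 1.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of most-likely tokens whose cumulative probability reaches top_p."""
    kept = {}
    cumulative = 0.0
    for token, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = p
        cumulative += p
        if cumulative >= top_p - 1e-12:  # small tolerance for float rounding
            break
    # Renormalize so the surviving tokens form a probability distribution.
    total = sum(kept.values())
    return {token: p / total for token, p in kept.items()}


probs = {"A": 0.3, "B": 0.2, "C": 0.1}
print(top_p_filter(probs, 0.5))  # C is dropped; A and B remain
```

Note that at least one token always survives, so a top_p of 0 behaves like picking the single most probable token.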

In [20]:
top_p_val = 0.0
prompt_top_p_example = (
    "Create a slogan for a candidate focusing on climate change."
)

response = generation_model.predict(
    prompt=prompt_top_p_example, temperature=0.9, top_p=top_p_val
)

print(f"[top_p = {top_p_val}]")
display(Markdown(response.text))

[top_p = 0.0]


* "Climate change is real, and it's here to stay. We need to take action now, before it's too late."
* "We can't afford to ignore climate change any longer. It's time for bold action."
* "The future of our planet is at stake. We need to elect a leader who will take action on climate change."
* "Climate change is the most important issue facing our planet today. We need to elect a leader who will make it a top priority."

In [21]:
top_p_val = 1.0

response = generation_model.predict(
    prompt=prompt_top_p_example, temperature=0.9, top_p=top_p_val
)

print(f"[top_p = {top_p_val}]")
display(Markdown(response.text))

[top_p = 1.0]


"We need a leader who will take action on climate change. I'm that leader."

This slogan is effective because it is short, memorable, and to the point. It also highlights the candidate's position on climate change and their commitment to taking action.

##### 4. The `top_k` parameter (range: 1 - 40, default 40)

__What is top_k?__

top_k changes how the model selects tokens for output. A top_k of 1 means the selected token is the most probable among all tokens in the model's vocabulary (also called greedy decoding). In contrast, a top_k of 3 means that the next token is selected from the top 3 most probable tokens (using temperature). For each token selection step, the top_k tokens with the highest probabilities are sampled. Then tokens are further filtered based on top_p with the final token selected using temperature sampling.

__How does top_k affect the response?__

Specify a lower value for less random responses and a higher value for more random responses.
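
As a rough illustration of the first filtering step described above, the sketch below keeps only the `top_k` most probable tokens. `top_k_filter` is a hypothetical helper and the probabilities are made up; renormalizing the survivors is an assumption, not a documented detail of the PaLM API.

```python
def top_k_filter(probs, top_k):
    """Keep only the top_k most probable tokens and renormalize them."""
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    total = sum(p for _, p in ranked)
    return {token: p / total for token, p in ranked}


probs = {"A": 0.3, "B": 0.2, "C": 0.1}
print(top_k_filter(probs, 1))  # greedy: only the single most probable token survives
print(top_k_filter(probs, 3))  # all three candidates stay available for sampling
```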

In [22]:
prompt_top_k_example = "Write a 2-day itinerary for Bali."
top_k_val = 1

response = generation_model.predict(
    prompt=prompt_top_k_example, 
    max_output_tokens=300, 
    temperature=0.9, 
    top_k=top_k_val
)

print(f"[top_k = {top_k_val}]")
display(Markdown(response.text))

[top_k = 1]


Day 1:
* Morning: Start your day with a sunrise yoga session at Uluwatu Temple. This is one of the most popular temples in Bali and offers stunning views of the Indian Ocean.
* Afternoon: After yoga, head to the Ubud Monkey Forest. This is a sacred forest that is home to hundreds of monkeys. You can walk through the forest and see the monkeys up close.
* Evening: Enjoy a traditional Balinese dinner at a local restaurant. There are many great restaurants in Ubud that serve traditional Balinese cuisine.

Day 2:
* Morning: Start your day with a visit to the Tanah Lot Temple. This temple is built on a rock formation that is surrounded by the ocean. It is one of the most iconic temples in Bali.
* Afternoon: After visiting the temple, head to the Tirta Empul Temple. This temple is dedicated to the god of water and is a popular spot for locals to come and bathe in the holy water.
* Evening: Enjoy a sunset cruise on the Balinese Sea. This is a great way to see the sunset and get a different perspective of the island.

In [23]:
top_k_val = 40

response = generation_model.predict(
    prompt=prompt_top_k_example,
    max_output_tokens=300,
    temperature=0.9,
    top_k=top_k_val,
)

print(f"[top_k = {top_k_val}]")
display(Markdown(response.text))

[top_k = 40]


**Day 1:**

* Morning: Start your day with a sunrise hike up to Mount Batur, a 1,717-meter-high volcano. The views from the top are stunning, and you'll be rewarded with a glimpse of the sun rising over the clouds.
* Afternoon: After your hike, head to the Tegalalang Rice Terraces, a UNESCO World Heritage Site. Take a walk through the fields and admire the beautiful scenery. You can also stop for a traditional Balinese lunch at one of the many restaurants in the area.
* Evening: In the evening, enjoy a relaxing sunset cruise on the Ubud River. As you cruise, you'll see the sun sink below the horizon and light up the sky with beautiful colors.

**Day 2:**

* Morning: Start your day with a visit to the Sacred Monkey Forest in Ubud. This is a popular tourist attraction, but it's worth a visit for the chance to see the monkeys up close.
* Afternoon: In the afternoon, head to the Tanah Lot Temple, a Hindu temple that is built on a rock formation in the sea. The temple is a popular spot for sunset, and you can enjoy stunning views of the ocean and the surrounding cliffs.
* Evening: In the evening, enjoy a traditional Balinese dance performance. There are many different dance performances to choose from, so you can find one that suits your interests.

# Chat model with `chat-bison@001`

The `chat-bison@001` model lets you have a freeform conversation across multiple turns. The application tracks what was previously said in the conversation. As such, if you expect to use conversations in your application, use the `chat-bison@001` model because it has been fine-tuned for multi-turn conversation use cases.
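
Conceptually, "tracking what was previously said" means keeping the earlier turns around so each new request can carry them. The sketch below shows that idea in plain Python; `Conversation` is a hypothetical class with made-up messages, and it is not how the SDK's chat session is implemented.

```python
class Conversation:
    """Hypothetical helper that remembers every turn of a chat."""

    def __init__(self):
        self.turns = []  # list of (author, text) pairs, oldest first

    def add(self, author, text):
        """Record one message from the given author."""
        self.turns.append((author, text.strip()))

    def transcript(self):
        """Render the history as text, so a new request can include the earlier turns."""
        return "\n".join(f"{author}: {text}" for author, text in self.turns)


conversation = Conversation()
conversation.add("user", "Can you write a course outline to learn Golang?")
conversation.add("bot", "Sure. The outline covers syntax, data types, and concurrency.")
conversation.add("user", "Could you explain concurrency?")
print(conversation.transcript())
```

With the SDK, the object returned by `start_chat()` plays this role, which is why follow-up questions can refer back to earlier answers.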

In [24]:
chat_model = ChatModel.from_pretrained("chat-bison@001")

In [25]:
chat = chat_model.start_chat()

In [26]:
print(
    chat.send_message(
        """
Hello! Can you write a course outline to learn Golang for Python developer?
"""
    )
)

## Course Outline

This course is designed to teach Python developers the basics of Golang. It covers the following topics:

* Golang syntax
* Data types
* Control flow statements
* Functions
* Packages
* Modules
* Concurrency
* Testing
* Deployment

## Learning Objectives

By the end of this course, you will be able to:

* Write simple Golang programs
* Use Golang data types and control flow statements
* Define and call functions
* Use packages and modules
* Write concurrent programs
* Write unit tests for your Golang programs
* Deploy Golang programs to production

## Prerequisites

This course assumes that you have a basic understanding of programming. You should be familiar with the following concepts:

* Variables
* Data types
* Control flow statements
* Functions
* Modules

## Resources

The following resources are provided to help you learn Golang:

* [The Go Programming Language Book](https://golang.org/doc/book/)
* [The Go Programming Language Tutorial](https://golang.org/doc/

As shown below, the model should respond based on what was previously said in the conversation:

In [29]:
print(
    chat.send_message(
        """
Could you give me short explanation on Concurrency in Go?
"""
    )
)

Concurrency in Go is achieved through the use of goroutines. A goroutine is a lightweight thread that is managed by the Go runtime. Goroutines are created using the `go` keyword.

For example, the following code creates a goroutine that prints the message "Hello, world!" to the console:

```
go func() {
  fmt.Println("Hello, world!")
}()
```

The `go` keyword tells the Go runtime to create a new goroutine and run the function inside it. The function inside the goroutine is called a closure. A closure is a function that has access to the variables of the enclosing scope.

In the example above, the closure has access to the variable `fmt`. This allows the closure to print the message "Hello, world!" to the console.

Goroutines are lightweight and efficient. They are also easy to create and manage. This makes them a powerful tool for writing concurrent programs in Go.

Here are some additional resources on concurrency in Go:

* [The Go Programming Language Book](https://golang.org/doc/boo

### Advanced Chat model with the SDK

We can also provide a context and examples to the model. The model will then respond based on the provided context and examples. We can also set `temperature`, `max_output_tokens`, `top_p`, and `top_k`; pass these parameters when starting the chat with `chat_model.start_chat()`.

For more information on chat models, please refer to the [documentation on chat model parameters](https://cloud.google.com/vertex-ai/docs/generative-ai/learn/models#chat_model_parameters).

In [30]:
chat = chat_model.start_chat(
    context="My name is Tom. You are my personal assistant. My favorite movies are all list in Marvel Cinematic Universe.",
    examples=[
        InputOutputTextPair(
            input_text="Who do you work for?",
            output_text="I work for Tom.",
        ),
        InputOutputTextPair(
            input_text="What do I like?",
            output_text="Tom likes watching all Marvel Cinematic Universe movies.",
        ),
    ],
    temperature=0.3,
    max_output_tokens=200,
    top_p=0.8,
    top_k=40,
)
print(chat.send_message("Are my favorite movies based on a comic?"))

Yes, all of Tom's favorite movies are based on Marvel Comics.


In [31]:
print(chat.send_message("When the first movie came out?"))

The first Marvel Cinematic Universe movie was Iron Man, which came out in 2008.


In [32]:
print(chat.send_message("What is the most popular movie?"))

The most popular Marvel Cinematic Universe movie is Avengers: Endgame, which came out in 2019.


In [33]:
print(chat.send_message("How much is gross revenue for it?"))

Avengers: Endgame grossed over $2.79 billion worldwide, making it the highest-grossing film of all time.


# Code generation with `code-bison@001`

The code generation model (Codey) from the PaLM API that you will use in this notebook is `code-bison@001`. It is fine-tuned to follow natural language instructions to generate the required code and is suitable for a variety of coding tasks, such as:

* writing functions
* writing classes
* web pages
* unit tests
* docstrings
* code translations, and many more use-cases.

Currently it supports the following languages:

* C++
* C#
* Go
* GoogleSQL
* Java
* JavaScript
* Kotlin
* PHP
* Python
* Ruby
* Rust
* Scala
* Swift
* TypeScript

You can find out more details [here](https://cloud.google.com/vertex-ai/docs/generative-ai/code/code-models-overview).

##### Load the model

In [34]:
code_generation_model = CodeGenerationModel.from_pretrained("code-bison@001")

##### Model parameters for `code-bison@001`

We can customize how the PaLM API code generation behaves in response to our prompt by using the following parameters for `code-bison@001`:

* `prefix`: it represents the beginning of a piece of meaningful programming code or a natural language prompt that describes code to be generated.

* `temperature`: higher means more "creative" code responses. range: (0.0 - 1.0, default 0).

* `max_output_tokens`: sets the max number of tokens in the output. range: (1 - 2048, default 2048)

##### Hello Codey, Create some code for me!

In [36]:
prefix = "write a python function to do scrape a website"

response = code_generation_model.predict(prefix=prefix)

display(Markdown(response.text))

```python
def scrape_website(url):
  """Scrape the given website and return the HTML content.

  Args:
    url: The URL of the website to scrape.

  Returns:
    The HTML content of the website.
  """

  # Get the HTML content of the website.
  response = requests.get(url)
  response.raise_for_status()
  html_content = response.content

  # Return the HTML content.
  return html_content
```

### Try it on your own

Some examples:

* write node js model using mongodb
* write python code to validate email address
* write a standard SQL function that concatenates 2 columns

In [38]:
prefix = """write a python function named as "sentence_similairty"\
            where it takes two arguments "sentence1" and "sentence2". \
            It then returns the similarity score in percentage with 2 decimals. \n
          """

response = code_generation_model.predict(prefix=prefix, max_output_tokens=1024)

display(Markdown(response.text))

```python
def sentence_similarity(sentence1, sentence2):

  """
  This function takes two sentences as input and returns the similarity score in percentage with 2 decimals.

  Args:
    sentence1 (str): The first sentence.
    sentence2 (str): The second sentence.

  Returns:
    float: The similarity score in percentage with 2 decimals.
  """

  # Step 1: Remove stop words and punctuation from the sentences.

  sentence1_no_stopwords = remove_stopwords(sentence1)
  sentence2_no_stopwords = remove_stopwords(sentence2)

  # Step 2: Stem the words in the sentences.

  sentence1_stemmed = stem_words(sentence1_no_stopwords)
  sentence2_stemmed = stem_words(sentence2_no_stopwords)

  # Step 3: Calculate the Jaccard similarity score.

  jaccard_similarity_score = jaccard_similarity(sentence1_stemmed, sentence2_stemmed)

  # Step 4: Return the similarity score in percentage with 2 decimals.

  return round(jaccard_similarity_score * 100, 2)

```

#### Using Prompt Template

Prompt templates are useful if we have found a good way to structure our prompt that we can re-use. They can also be helpful in limiting the open-endedness of freeform prompts. There are many ways to implement prompt templates, and below is just one example using `f-strings`. This way we can structure the prompts according to the expected functionality of the code.

In [40]:
language = "C++ function"
file_format = "json"
extract_info = "names"
requirements = """
              - the name should be start with capital letters.
              - There should be no duplicate names in the final list.
              """

prefix = f"""Create a {language} to parse {file_format} and extract {extract_info} 
            with the following requirements: {requirements}.
        """

response = code_generation_model.predict(prefix=prefix, max_output_tokens=1024)

display(Markdown(response.text))

```c++
#include <iostream>
#include <string>
#include <vector>
#include <map>

using namespace std;

// This function parses a JSON object and extracts all of the names that start with capital letters.
// The function also ensures that there are no duplicate names in the final list.
vector<string> getNames(const string& json) {
  // Create a map to store the names.
  map<string, bool> names;

  // Parse the JSON object.
  stringstream ss(json);
  json::value_type root;
  ss >> root;

  // Iterate over the object's properties.
  for (json::object_iterator it = root.begin(); it != root.end(); ++it) {
    // Get the property name.
    string name = it->first;

    // Check if the name starts with a capital letter.
    if (name[0] >= 'A' && name[0] <= 'Z') {
      // Add the name to the map.
      names[name] = true;
    }
  }

  // Create a vector to store the names.
  vector<string> namesList;

  // Iterate over the map and add the names to the vector.
  for (map<string, bool>::iterator it = names.begin(); it != names.end(); ++it) {
    namesList.push_back(it->first);
  }

  // Return the vector of names.
  return namesList;
}
```

# Code completion with `code-gecko@001`

Code completion uses the code-gecko foundation model to generate and complete code based on code being written. `code-gecko` completes code that was recently typed by a user.

The code completion API has a few more parameters than code generation:

* __prefix__: _required_ : For code models, prefix represents the beginning of a piece of meaningful programming code or a natural language prompt that describes code to be generated.

* __suffix__: _optional_ : For code completion, suffix represents the end of a piece of meaningful programming code. The model attempts to fill in the code in between the prefix and suffix.

* __temperature__: _optional_ : Temperature controls the degree of randomness in token selection. Same as for other models. range: (0.0 - 1.0, default 0)

* __maxOutputTokens__: _required_ : Maximum number of tokens that can be generated in the response. __range: (1 - 64, default 64)__

* __stopSequences__: _optional_ : Specifies a list of strings that tells the model to stop generating text if one of the strings is encountered in the response. The strings are case-sensitive.

To learn more about creating prompts for code completion, see [Create prompts for code completion](https://cloud.google.com/vertex-ai/docs/generative-ai/model-reference/code-completion#:~:text=code%20completion%2C%20see-,Create%20prompts%20for%20code%20completion,-.).
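
To make the effect of stop sequences concrete, the sketch below applies the same rule to a string on the client side. It is an illustration only: `truncate_at_stop` is a hypothetical helper, the sample completion is made up, and dropping the stop sequence itself from the result is an assumption about the behavior rather than something stated above.

```python
def truncate_at_stop(text, stop_sequences):
    """Cut text at the earliest occurrence of any stop sequence (case-sensitive)."""
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            continue  # ignore empty strings, which would match at position 0
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


# Hypothetical completion that runs on into a second function.
completion = "    return s[::-1]\n\ndef test_empty_input_string():\n    pass\n"
print(truncate_at_stop(completion, ["\ndef "]))
```

A stop sequence such as `"\ndef "` is one way to keep a completion from continuing into further functions, as happens in the outputs below.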

In [41]:
code_completion_model = CodeGenerationModel.from_pretrained("code-gecko@001")

In [42]:
prefix = """
          def find_x_in_string(string_s, x):
         """

response = code_completion_model.predict(prefix=prefix,
                                         max_output_tokens=64)

display(Markdown(response.text))

     return string_s.find(x)
          
          def find_y_in_string(string_s, y):
              return string_s.find(y)

In [44]:
prefix = """
         def reverse_string(s):
            return s[::-1]
         def test_empty_input_string()
         """

response = code_completion_model.predict(prefix=prefix,
                                         max_output_tokens=64)

print(response.text)

   assert reverse_string("") == ""
         def test_one_character_string()
            assert reverse_string("a") == "a"
         def test_two_character_string()
            assert reverse_string("ab") == "ba"


# Code chat with `codechat-bison@001`

The `codechat-bison@001` model lets us have a freeform conversation across multiple turns from a code context. The application tracks what was previously said in the conversation. As such, if we expect to use conversations in our application for code generation, use the `codechat-bison@001` model because it has been fine-tuned for multi-turn conversation use cases.

In [47]:
code_chat_model = CodeChatModel.from_pretrained("codechat-bison@001")

code_chat = code_chat_model.start_chat()


print(code_chat.send_message(
        "Please help write a function to calculate the min of two numbers",
    )
)

Sure, here is a function that calculates the minimum of two numbers:

```
def min(a, b):
  """
  Calculates the minimum of two numbers.

  Args:
    a: The first number.
    b: The second number.

  Returns:
    The smaller of the two numbers.
  """

  if a < b:
    return a
  else:
    return b
```

This function takes two numbers as input and returns the smaller of the two numbers. For example, if you call the function with the numbers 5 and 10, it will return 5.


As shown below, the model should respond based on what was previously asked in the conversation:

In [48]:
print(code_chat.send_message(
        "can you explain the code line by as I were 10 year old guy?",
    )
)

Sure! The function `min` takes two numbers as input, `a` and `b`. It then compares the two numbers and returns the smaller of the two.

The first line of the function is a comment. It explains what the function does.

The second line of the function is an `if` statement. The `if` statement checks if `a` is less than `b`. If it is, then the function returns `a`. Otherwise, the function returns `b`.

For example, if you call the function with the numbers 5 and 10, the `if` statement will be true because 5 is less than 10. So, the function will return 5.


We can take another example and ask the model to give more general code suggestions for a specific problem that we are working on.

In [49]:
code_chat = code_chat_model.start_chat()

print(code_chat.send_message(
        "what is the most scalable way to sort a list in python?",
    )
)

The most scalable way to sort a list in Python is to use the built-in `sort()` function. This function takes a list as its argument and sorts it in ascending order. The `sort()` function is very efficient and can sort lists of any size.

To use the `sort()` function, simply pass the list you want to sort as its argument. For example, to sort the list `[1, 2, 3, 4, 5]`, you would use the following code:

```
sorted([1, 2, 3, 4, 5])
```

The `sort()` function will return a new list that is sorted in ascending order. The original list will not be changed.

If you need to sort a list in descending order, you can use the `reverse()` function. The `reverse()` function takes a list as its argument and reverses the order of the elements in the list. To sort the list `[1, 2, 3, 4, 5]` in descending order, you would use the following code:

```
sorted([1, 2, 3, 4, 5], reverse=True)
```

The `reverse()` function will return a new list that is sorted in descending order. The original list will not
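
When reading the response above, it helps to keep Python's built-ins apart: `sorted()` returns a new list, `list.sort()` sorts in place and returns `None`, and `list.reverse()` only reverses the current order. A short check using nothing but the standard library:

```python
numbers = [3, 1, 2]

ascending = sorted(numbers)                 # new list; `numbers` is unchanged
descending = sorted(numbers, reverse=True)  # new list, largest first

in_place = list(numbers)
result = in_place.sort()  # sorts `in_place` itself and returns None

flipped = list(numbers)
flipped.reverse()  # reverses the current order in place; it does not sort

print(ascending, descending, numbers, in_place, result, flipped)
```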

You can continue to ask follow-up questions to the original query.

In [50]:
print(code_chat.send_message(
        "how would i measure the iteration per second for the following code?",

    )
)

To measure the iteration per second for the following code:

```
for i in range(1000000):
    pass
```

You can use the following code:

```
import time

start = time.time()
for i in range(1000000):
    pass
end = time.time()

print(end - start)
```

This code will print the time it takes to execute the loop. You can then divide the number of iterations by the time to get the iteration per second.
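
The division step mentioned in the response can be folded into the measurement itself. The following is a minimal sketch using only the standard library; `iterations_per_second` is a hypothetical helper, and `time.perf_counter` is used here instead of `time.time` because it is intended for measuring short durations.

```python
import time


def iterations_per_second(iterations=1_000_000):
    """Time an empty loop and return how many iterations ran per second."""
    start = time.perf_counter()
    for _ in range(iterations):
        pass
    elapsed = time.perf_counter() - start
    return iterations / elapsed


print(f"{iterations_per_second():,.0f} iterations per second")
```

The printed figure varies from run to run and from machine to machine, so it is best treated as a rough estimate.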

