In [4]:


from google.cloud import aiplatform  # requires >=1.25.0
from vertexai.language_models import TextGenerationModel

print(aiplatform.__version__)



1.26.0


In [5]:
def generate(
    prompt: str,
    model_name: str = "text-bison@001",
    temperature: float = 0.2,
    max_output_tokens: int = 256,
    top_p: float = 0.8,
    top_k: int = 40,
):
    """Send `prompt` to a PaLM text model on Vertex AI and return the response."""
    model = TextGenerationModel.from_pretrained(model_name)
    response = model.predict(
        prompt,
        temperature=temperature,
        max_output_tokens=max_output_tokens,
        top_p=top_p,
        top_k=top_k,
    )

    return response
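
The `top_k` and `top_p` arguments above restrict which tokens the model may sample from. As a rough illustration of the general idea (a toy sketch over a hand-made distribution, not how the PaLM service is implemented), the hypothetical helper `filter_top_k_top_p` below keeps the `top_k` most likely tokens, then the smallest leading group whose cumulative probability reaches `top_p`, and renormalizes what is left.

```python
def filter_top_k_top_p(probs, top_k, top_p):
    """Toy top-k then top-p (nucleus) filter over a {token: probability} dict."""
    # Rank tokens from most to least likely and keep at most top_k of them.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

    # Keep adding tokens until their cumulative probability reaches top_p.
    kept, mass = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        mass += p
        if mass >= top_p:
            break

    # Renormalize so the surviving probabilities sum to 1.
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


# Hypothetical next-token distribution, for illustration only.
toy_probs = {"cat": 0.5, "dog": 0.3, "fish": 0.15, "rock": 0.05}
filtered = filter_top_k_top_p(toy_probs, top_k=3, top_p=0.75)
print(filtered)
```

Lower values of either parameter shrink the candidate set, which tends to make outputs less varied.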

In [6]:
generate(
    "What are five important things to understand about large language models?"
)

1. **Large language models are trained on massive datasets of text and code.** This gives them a vast knowledge base that they can draw on to generate text, translate languages, write different kinds of creative content, answer your questions, and more.
2. **Large language models are still under development, and there are some limitations to their capabilities.** For example, they can sometimes be biased or inaccurate, and they may not always understand the nuances of human language.
3. **Large language models are being used in a variety of applications, such as customer service, content creation, and medical research.** As they continue to develop, we can expect to see even more uses for these powerful tools.
4. **There are some ethical concerns associated with large language models, such as the potential for bias and misinformation.** It is important to be aware of these concerns and to use large language models responsibly.
5. **Large language models are a rapidly growing field, and

In [28]:
prompt = """
Classify the following: 
text: "Zero-shot prompting is really easy with Google's PaLM API."
labels: [ technology, politics, sports, 'machine learning' ]
"""

generate(prompt, max_output_tokens=20)

The text is about machine learning. It is about the ease of using Google's PaLM API

In [8]:
prompt = """
What is the topic for a given text? 
- cats 
- dogs 

Text: They always sit on my lap and purr!
The answer is: cats

Text: They love to play fetch
The answer is: dogs

Text: I throw the frisbee in the water and they swim for hours! 
The answer is: dogs 

Text: They always knock things off my counter!
The answer is:
"""

generate(prompt, max_output_tokens=5)

cats
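
The few-shot prompt above follows a fixed pattern: a question, the allowed labels, several labeled examples, and finally the unlabeled query. A minimal sketch of assembling that pattern programmatically is shown here; `build_few_shot_prompt` and its parameters are hypothetical names introduced for illustration, not part of the Vertex AI SDK.

```python
def build_few_shot_prompt(labels, examples, query,
                          question="What is the topic for a given text?"):
    """Assemble a few-shot classification prompt from (text, answer) pairs."""
    lines = [question]
    lines += [f"- {label}" for label in labels]
    lines.append("")
    # One "Text / The answer is" block per labeled example.
    for text, answer in examples:
        lines.append(f"Text: {text}")
        lines.append(f"The answer is: {answer}")
        lines.append("")
    # The query is left unanswered so the model completes the label.
    lines.append(f"Text: {query}")
    lines.append("The answer is:")
    return "\n".join(lines)


few_shot_prompt = build_few_shot_prompt(
    labels=["cats", "dogs"],
    examples=[
        ("They always sit on my lap and purr!", "cats"),
        ("They love to play fetch", "dogs"),
    ],
    query="They always knock things off my counter!",
)
print(few_shot_prompt)
```

The resulting string could then be passed to `generate(few_shot_prompt, max_output_tokens=5)` as in the cell above.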

In [9]:
prompt = """
Provide a very short summary for the following:

A transformer is a deep learning model. It is distinguished by its adoption of self-attention, 
differentially weighting the significance of each part of the input (which includes the 
recursive output) data. Like recurrent neural networks (RNNs), transformers are designed to 
process sequential input data, such as natural language, with applications towards tasks such 
as translation and text summarization. However, unlike RNNs, transformers process the 
entire input all at once. The attention mechanism provides context for any position in the 
input sequence. For example, if the input data is a natural language sentence, the transformer 
does not have to process one word at a time. This allows for more parallelization than RNNs 
and therefore reduces training times.

Summary:

"""
generate(prompt, max_output_tokens=1024)

A transformer is a deep learning model that processes sequential input data, such as natural language, with applications towards tasks such as translation and text summarization. Unlike RNNs, transformers process the entire input all at once, which allows for more parallelization and reduces training times.

In [10]:
prompt = """
Provide four bullet points summarizing the following:

A transformer is a deep learning model. It is distinguished by its adoption of self-attention, 
differentially weighting the significance of each part of the input (which includes the 
recursive output) data. Like recurrent neural networks (RNNs), transformers are designed to 
process sequential input data, such as natural language, with applications towards tasks such 
as translation and text summarization. However, unlike RNNs, transformers process the 
entire input all at once. The attention mechanism provides context for any position in the 
input sequence. For example, if the input data is a natural language sentence, the transformer 
does not have to process one word at a time. This allows for more parallelization than RNNs 
and therefore reduces training times.

Summary:

"""
generate(prompt, max_output_tokens=1024)

- A transformer is a deep learning model that is distinguished by its adoption of self-attention.
- Transformers are designed to process sequential input data, such as natural language.
- Unlike RNNs, transformers process the entire input all at once.
- The attention mechanism provides context for any position in the input sequence.

In [11]:
prompt = """
Generate a summary of the following conversation and at the end, summarize to-do's for the service rep: 

Kyle: Hi! I'm reaching out to customer service because I am having issues.

Service Rep: What seems to be the problem? 

Kyle: I am trying to use the PaLM API but I keep getting an error. 

Service Rep: Can you share the error with me? 

Kyle: Sure. The error says: "ResourceExhausted: 429 Quota exceeded for 
      aiplatform.googleapis.com/online_prediction_requests_per_base_model 
      with base model: text-bison"
      
Service Rep: It looks like you have exceeded the quota for usage. Please refer to 
             https://cloud.google.com/vertex-ai/docs/quotas for information about quotas
             and limits. 
             
Kyle: Can you increase my quota?

Service Rep: I cannot, but let me follow up with somebody who will be able to help.

Summary:
"""

generate(prompt, max_output_tokens=256)

Kyle is having issues using the PaLM API. He keeps getting an error that says "ResourceExhausted: 429 Quota exceeded for 
      aiplatform.googleapis.com/online_prediction_requests_per_base_model 
      with base model: text-bison".

The service rep informs Kyle that he has exceeded the quota for usage and provides a link to information about quotas and limits. Kyle asks if the service rep can increase his quota, but the service rep says that he cannot. The service rep will follow up with somebody who will be able to help.

To-do's for the service rep:
- Follow up with somebody who can increase Kyle's quota.
- Provide Kyle with more information about quotas and limits.
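
The quota error quoted in the conversation above can also occur when running these cells in a loop. One common way to cope is to retry with exponential backoff. The sketch below is generic: `call_with_backoff` is a hypothetical helper, and the exception type to retry on is passed in by the caller. With the Vertex AI client it would likely be `google.api_core.exceptions.ResourceExhausted`, but that is an assumption to verify against your installed version.

```python
import time


def call_with_backoff(fn, retry_on, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on `retry_on` with delays of base_delay * 2**attempt."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            # Give up and re-raise once the final attempt has failed.
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

For example, `call_with_backoff(lambda: generate(prompt), retry_on=SomeQuotaError)` would wrap a single call, where `SomeQuotaError` stands in for whichever exception your client raises.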

In [12]:
prompt = """
Extract the ingredients from the following recipe. 
Return the ingredients in JSON format with keys: ingredient, quantity, type.

Ingredients:
* 1 tablespoon olive oil
* 1 onion, chopped
* 2 carrots, chopped
* 2 celery stalks, chopped
* 1 teaspoon ground cumin
* 1/2 teaspoon ground coriander
* 1/4 teaspoon turmeric powder
* 1/4 teaspoon cayenne pepper (optional)
* Salt and pepper to taste
* 1 (15 ounce) can black beans, rinsed and drained
* 1 (15 ounce) can kidney beans, rinsed and drained
* 1 (14.5 ounce) can diced tomatoes, undrained
* 1 (10 ounce) can diced tomatoes with green chilies, undrained
* 4 cups vegetable broth
* 1 cup chopped fresh cilantro
"""
generate(prompt, max_output_tokens=1024)

```
{
  "ingredient": "olive oil",
  "quantity": "1 tablespoon",
  "type": "oil"
},
{
  "ingredient": "onion",
  "quantity": "1",
  "type": "vegetable"
},
{
  "ingredient": "carrot",
  "quantity": "2",
  "type": "vegetable"
},
{
  "ingredient": "celery",
  "quantity": "2",
  "type": "vegetable"
},
{
  "ingredient": "ground cumin",
  "quantity": "1 teaspoon",
  "type": "spice"
},
{
  "ingredient": "ground coriander",
  "quantity": "1/2 teaspoon",
  "type": "spice"
},
{
  "ingredient": "turmeric powder",
  "quantity": "1/4 teaspoon",
  "type": "spice"
},
{
  "ingredient": "cayenne pepper",
  "quantity": "1/4 teaspoon",
  "type": "spice"
},
{
  "ingredient": "salt",
  "quantity": "to taste",
  "type": "seasoning"
},
{
  "ingredient": "pepper",
  "quantity": "to taste",
  "type": "seasoning"
},
{
  "ingredient": "black beans",
  "quantity": "1 (15 ounce) can",
  "type": "bean"
},
{
  "ingredient": "kidney beans",
  "quantity": "1 (15 ounce) can",
  "type": "bean"
},
{
  "ingredient": "dice

In [13]:


prompt = """
Extract the technical specifications from the text below in JSON format.

Text: Google Nest WiFi, network speed up to 1200Mbps, 2.4GHz and 5GHz frequencies, WPA3 protocol
JSON: {
  "product":"Google Nest WiFi",
  "speed":"1200Mbps",
  "frequencies": ["2.4GHz", "5GHz"],
  "protocol":"WPA3"
}

Text: Google Pixel 7, 5G network, 8GB RAM, Tensor G2 processor, 128GB of storage, Lemongrass
JSON:
"""

generate(prompt, max_output_tokens=1024)



{
  "product":"Google Pixel 7",
  "network":"5G",
  "RAM":"8GB",
  "processor":"Tensor G2",
  "storage":"128GB",
  "color":"Lemongrass"
}
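
Model output that looks like JSON is still just text, and as the recipe example earlier shows, it can be cut off mid-object when `max_output_tokens` is reached. A minimal sketch for recovering whatever complete objects are present uses the standard library's `json.JSONDecoder.raw_decode`. The helper name `extract_json_objects` is hypothetical.

```python
import json


def extract_json_objects(text):
    """Return every complete top-level JSON object found in `text`, in order."""
    decoder = json.JSONDecoder()
    objects = []
    pos = 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            break
        try:
            # raw_decode parses one JSON value starting at `start` and
            # returns it together with the index where parsing stopped.
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            # Incomplete or malformed object: skip this brace and keep looking.
            pos = start + 1
            continue
        objects.append(obj)
        pos = end
    return objects


# Shortened stand-in for a truncated response, for illustration only.
truncated = (
    '```\n'
    '{"ingredient": "olive oil", "quantity": "1 tablespoon", "type": "oil"},\n'
    '{"ingredient": "onion", "quantity": "1", "type": "vegetable"},\n'
    '{"ingredient": "dice'
)
print(extract_json_objects(truncated))
```

The unfinished final object is dropped rather than guessed at. If an outer object is truncated, any complete objects nested inside it are still returned.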

In [34]:
candidate_labels = ["Computer Science", "Physics", "Mathematics", "Statistics",
                    "Quantitative Biology", "Quantitative Finance", "None"]
label_str = ",".join(candidate_labels)

In [44]:
import tqdm


def get_pred(text):
    """Zero-shot classify `text` against the candidate labels."""
    prompt = f"""
Classify the following:
text: "{text}"
labels: [{label_str}]
"""
    return generate(prompt, max_output_tokens=4)


def get_df_with_prediction(data, inp_columns):
    """Return a copy of `data` with a 'pred' column holding the model's answer per row."""
    small_df = data.copy()
    arr_pred = []

    for idx in tqdm.tqdm(small_df.index):
        # Join the selected text columns into a single input string.
        sequence = " ".join(small_df.loc[idx, col] for col in inp_columns)
        arr_pred.append(get_pred(sequence).text)
    small_df["pred"] = arr_pred
    return small_df


In [45]:
get_pred("Multivariate Dependency Measure based on Copula and Gaussian Kernel")

The text is about

In [58]:
prompt = """
Classify the following, with multiple labels if applicable: 
text: Multivariate Dependency Measure based on Copula and Gaussian Kernel
labels: ['Computer Science','Physics','Mathematics', 'Statistics', 'Quantitative Biology','Quantitative Finance']
"""

generate(prompt, max_output_tokens=20)

The text is about a multivariate dependency measure based on copula and Gaussian kernel. It is a mathematical topic

In [59]:
prompt = """
What are the possible choices for the topic of the text:
 - Computer Science
 - Physics 
 - Mathematics 
 - Statistics 
 - Quantitative Biology
 - Quantitative Finance
 
Text: "Multivariate Dependency Measure based on Copula and Gaussian Kernel"
The Answer is : Mathematics, statistics

Text: Deep Neural Network Optimized to Resistive Memory with Nonlinear Current-Voltage Characteristics
The Answer is : Computer Science

Text: On the rotation period and shape of the hyperbolic asteroid 1I/`Oumuamua (2017) U1 from its lightcurve
The Answer is:

"""

generate(prompt, max_output_tokens=20)

Physics

In [62]:
prompt = """
What are the possible choices for the topic of the text:
 - Computer Science
 - Physics 
 - Mathematics 
 - Statistics 
 - Quantitative Biology
 - Quantitative Finance
 
Title: Multivariate Dependency Measure based on Copula and Gaussian Kernel
Abstract:   We propose a new multivariate dependency measure. It is obtained by considering a Gaussian kernel based distance between the copula transform of the given d-dimensional distribution and the uniform copula and then appropriately normalizing it. The resulting measure is shown to satisfy a number of
The Answer is : Mathematics, statistics

Title: Deep Neural Network Optimized to Resistive Memory with Nonlinear Current-Voltage Characteristics
Abstract: Artificial Neural Network computation relies on intensive vector-matrix multiplications. Recently, the emerging nonvolatile memory (NVM) crossbar array showed a feasibility of implementing such operations with high energy efficiency, thus there are many works on efficiently utilizing emerging NVM
The Answer is : Computer Science

Title: A local ensemble transform Kalman particle filter for convective scale data assimilation. 
Abstract: Ensemble data assimilation methods such as the Ensemble Kalman Filter (EnKF) are a key component of probabilistic weather forecasting. They represent the uncertainty in the initial conditions by an ensemble which incorporates information coming from the physical model with the latest observations
The Answer is:

"""

generate(prompt, max_output_tokens=20)

Physics

In [64]:
# Expected labels: Physics, Mathematics

prompt = """
What are the possible choices for the topic of the text:
 - Computer Science
 - Physics 
 - Mathematics 
 - Statistics 
 - Quantitative Biology
 - Quantitative Finance
 
Title: Multivariate Dependency Measure based on Copula and Gaussian Kernel
Abstract:   We propose a new multivariate dependency measure. It is obtained by considering a Gaussian kernel based distance between the copula transform of the given d-dimensional distribution and the uniform copula and then appropriately normalizing it. The resulting measure is shown to satisfy a number of
The Answer is : Mathematics, statistics

Title: Deep Neural Network Optimized to Resistive Memory with Nonlinear Current-Voltage Characteristics
Abstract: Artificial Neural Network computation relies on intensive vector-matrix multiplications. Recently, the emerging nonvolatile memory (NVM) crossbar array showed a feasibility of implementing such operations with high energy efficiency, thus there are many works on efficiently utilizing emerging NVM
The Answer is : Computer Science

Title: Density large deviations for multidimensional stochastic hyperbolic conservation laws
Abstract: We investigate the density large deviation function for a multidimensional conservation law in the vanishing viscosity limit, when the probability concentrates on weak solutions of a hyperbolic conservation law. When the conductivity and diffusivity matrices are proportional, i.e
The Answer is:

"""

generate(prompt, max_output_tokens=20)

Mathematics, statistics
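
The answers above come back as free text with inconsistent capitalization ("Mathematics, statistics") and sometimes without any label at all ("The text is about"). A small sketch of mapping such an answer back onto the known label set follows. `parse_labels` is a hypothetical helper, and it assumes the model separates multiple labels with commas, as in the few-shot examples.

```python
def parse_labels(answer, candidate_labels):
    """Map a comma-separated model answer onto canonical labels, ignoring case."""
    lookup = {label.lower(): label for label in candidate_labels}
    found = []
    for part in answer.split(","):
        key = part.strip().lower()
        # Keep only exact (case-insensitive) matches, without duplicates.
        if key in lookup and lookup[key] not in found:
            found.append(lookup[key])
    return found


topic_labels = ["Computer Science", "Physics", "Mathematics", "Statistics",
                "Quantitative Biology", "Quantitative Finance"]

print(parse_labels("Mathematics, statistics", topic_labels))
print(parse_labels("The text is about", topic_labels))
```

An empty list signals that the response did not contain a usable label, which is easier to handle downstream than an arbitrary sentence fragment.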