# Gemini code for sentiment labelling

## Setup - Install the Python SDK

The Python SDK for the Gemini API is contained in the [`google-generativeai`](https://pypi.org/project/google-generativeai/) package. Install the dependency using pip:

In [None]:
!pip install -q -U google-generativeai
# !pip install genai

In [None]:
import google.generativeai as genai
print(genai.__version__)

0.7.2


### Import packages

In [None]:
import pathlib
import textwrap
import google.generativeai as genai

from google.colab import drive
drive.mount('/content/drive')

import warnings
warnings.simplefilter(action='ignore') # mute warnings

import json
import time


Mounted at /content/drive


### Setup GEMINI API key

Before you can use the Gemini API, you must first obtain an API key. If you don't already have one, create a key with one click in Google AI Studio: [Get an API key](https://makersuite.google.com/app/apikey).

Once you have the API key, pass it to the SDK. You can do this in two ways:

* Put the key in the `GOOGLE_API_KEY` environment variable (the SDK will automatically pick it up from there).
* Pass the key to `genai.configure(api_key=...)`

In [None]:
# Read the key from the `GOOGLE_API_KEY` environment variable.
# Never hardcode a real key in the notebook.
import os

GOOGLE_API_KEY = os.getenv('GOOGLE_API_KEY')

genai.configure(api_key=GOOGLE_API_KEY)

#### API LIMIT
Note: For detailed information about the available models, including their capabilities and rate limits, see [Gemini models](https://ai.google.dev/models/gemini). There are options for requesting [rate limit increases](https://ai.google.dev/docs/increase_quota). The rate limit for Gemini-Pro models is 60 requests per minute (RPM).

The `genai` package also supports the PaLM family of models, but only the Gemini models support the generic, multimodal capabilities of the `generateContent` method.
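To stay under a requests-per-minute quota such as the 60 RPM figure quoted above, calls can be spaced out on the client side. The following is a minimal sketch, not part of the SDK: the class name `MinIntervalThrottle` and its parameters are hypothetical, and it only enforces a minimum gap between consecutive calls made from a single process.

```python
import time


class MinIntervalThrottle:
    """Block so that consecutive calls are at least 60 / rpm seconds apart."""

    def __init__(self, rpm, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / rpm
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._last_call = None

    def wait(self):
        """Sleep just long enough to respect the interval, then record the call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now


# 60 RPM means at most one request per second.
throttle = MinIntervalThrottle(rpm=60)
```

Calling `throttle.wait()` immediately before each API request would play the same role as a fixed `time.sleep(1)`, but it only waits for the part of the second that has not already elapsed.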

In [None]:
model_info = genai.get_model("models/gemini-1.5-flash")
# Returns the "context window" for the model,
# which is the combined input and output token limits.
print(f"model.input_token_limit: {model_info.input_token_limit}")
print(f"model.output_token_limit: {model_info.output_token_limit}")
# model.input_token_limit: 1048576
# model.output_token_limit: 8192

model.input_token_limit: 1048576
model.output_token_limit: 8192


### Basics

- Load model
- Test with Count tokens
- Check context window: max input and output token limit

Large language models have a context window, and the context length is often measured in **number of tokens**.
You can determine the number of tokens in any `genai.protos.Content` object. In the simplest case, you can pass a query string to the `GenerativeModel.count_tokens` method as follows:

Temporarily removed from the input format: "dialog_type": "bnk" (bank), "ins" (insurance), or "banking", "insurance", "telecom", "hotel", "holiday", "restaurant", "moe" (Singapore Ministry of Education), "msf" (Singapore Ministry of Social and Family Development), "hdb" (Singapore Housing and Development Board), etc.  // Type of service inquiry

In [None]:
sentiment_system_prompt ='''
You are an experienced Singaporean call center agent tasked with analyzing the sentiment of text from customer service calls. These calls are inbound inquiries related to banking, insurance, or telecom services, and the customers are speaking in Singlish, using various slang terms.
Your goal is to rate the sentiment of each text chunk on a scale from -1.00 (negative) to 1.00 (positive) with 2 decimal precision. Additionally, provide a short explanation for each rating, explaining how the score is determined and why it is reasonable.
Key Singlish Slang and Their Meanings:
Shiok: good
Sian: boring
Jialat: troublesome
Siao: crazy
Paiseh: sorry
Kao Pei: scold
Bojio: never invite
Suay: unfortunate
Pokkai: bankrupt
Atas: high-class
Kena: suffered
Kan Cheong: anxious

Sentiment Rating Scale with Examples
-1.00 (negative):
"Bad experiences that I have so I just want to clarify everything before I leave."
"Yes, that is very bad, so that's the reason why I want to change provider."

-0.70 (negative):
"Recently I've been experiencing some slow Wi-Fi connection, so I would like to ask what is the problem."
"But then it was too late, so the flight left."
"or something cause I can t really travel and all of those are getting wasted now"

-0.50 (mild negative):
"We won't be able to claim this."
"Due to all this incident, I actually missed my flight."
"by the airlines is it or like people stole it"
"if you are (uh) hosp~ hospitalised you have to provide with the (ppb) (uh) the invoice from the hospital and (ppb) (uh) the doctor s (uh) memo to (uh) upon thirty days before"
"like claiming travel insurance recently I went (ppb) for a trip and I lost my (ppb) baggage I think it was stolen so (ppb) (um) I want"
"twelve ya if you ll do at twelve month there will be some penalty incur"
"okay so let s say for those people who under group A they need to pay a higher premium for their personal accident insurance"
"yes correct rebooking fee and it was because of the last minute flight so it was very expensive"

-0.30 (slight negative):
"Okay, but if I exit that time frame, do I still get any coverage?"
"no (uh) (uh) carry on sorry"
"ya late payment charges"
"there was a rebooking fee too is it"
"how much extra would that cost and and if its like does it apply to any countries"
"(uh) engaging in any high risk activities too"
"[orh] so just now the hundred eighty two hundred twenty that [one] is (ppb) for high risk or for low risk"
"so it takes five to six working days so okay can but I will have to make the down payment before I can apply for the car loan right"

-0.15 (neutral):
"Nope, he doesn't have any line now, ya."
"In case I miss the payment date."
"ya sorry"

0.00 (neutral):
"Okay, it follows if I were to travel overseas, or does it only apply?"
"Okay, so annual plan for just yourself or for three of you?"
"yes the price will differ depending on the storage of the phone"

0.15 (neutral):
"Something like Huawei or Oppo would be fine."
"Yeah, I think it would be best if we buy for both."
"ya okay"
"(mm) fine"
"bye bye"
"[ah] I see okay"
"correct"
"sure may I know I m talking to"

0.30 (slight positive):
"Hi, good morning."
"How can I help you?"
"I see, I see. Okay."
"okay okay"
"yes yes"
"okay alright"
"which is more of a bargain"
"okay is there anything else I could help you"

0.50 (mild positive):
"Storage, I think 256GB is good enough already."
"I think the rewards card will suit you the best."
"okay sure can thank you bye"
"and also you are able to enjoy cashback for your land transport for example S_M_R_T buses and taxis"


0.70 (positive):
"Oh, I see, that's great. Okay, thank you."
"Oh, it's good to hear that, yeah, because I want to be unique."
"yes very good condition"
"okay I think I meet the criteria [lah] the eligibility"

1.00 (positive):
"Yes, sure, I will. Thank you very much."
"No, that's all. You've been wonderful, thank you."
"Thank you so much."
"This is really helpful."

Input Format:
Each input will be a JSON object representing a single text chunk. Each batch contains 200 such text chunks, and each 'id' must correspond exactly between input and output.

{
    "id": int,  // Unique identifier for each sentence (eg. from 1 to 200 in each batch)
    "speaker_type": "client" or "agent",  // Identifies the speaker
    "cleaned_text_for_sentiment": "string"  // The text chunk to be analyzed
}

Example:
json
{
    "id": 26,
    "speaker_type": "client",
    "cleaned_text_for_sentiment": "Wah, the internet speed today damn slow leh."
}

Output Format:
For each input text chunk, provide a corresponding JSON object with the sentiment analysis result. Ensure that each output 'id' matches the input 'id' precisely.
json
{
    "id": int,  // Same 'id' as in the input
    "GEMINI": float,  // Sentiment score between -1.00 and 1.00 with 2 decimal places
    "explanation": "string"  // Short explanation of how and why the score was determined; make sure it is reasonable. If the explanation exceeds 20 tokens, please truncate it to 26 tokens for output.
}

Output Example:
json
{
    "id": 26,
    "GEMINI": -0.70,
    "explanation": "Expresses frustration over slow internet speed."
}
Below is the input text for analysis:
'''


In [None]:
sentiment_system_prompt ='''
You are an experienced Singaporean call center agent tasked with analyzing the sentiment of text from customer service calls. These calls are inbound inquiries related to banking, insurance, or telecom services, and the customers are speaking in Singlish, using various slang terms.
Your goal is to rate the sentiment of each text chunk on a scale from -1.00 (negative) to 1.00 (positive) with 2 decimal precision. Additionally, provide a short explanation for each rating, explaining how the score is determined and why it is reasonable.
Key Singlish Slang and Their Meanings:
Shiok: good
Sian: boring
Jialat: troublesome
Siao: crazy
Paiseh: sorry
Kao Pei: scold
Bojio: never invite
Suay: unfortunate
Pokkai: bankrupt
Atas: high-class
Kena: suffered
Kan Cheong: anxious

Sentiment Rating Scale with Examples
-1.00 (negative):
"Bad experiences that I have so I just want to clarify everything before I leave."
"Yes, that is very bad, so that's the reason why I want to change provider."
"But then it was too late, so the flight left."

-0.50 (mild negative):
"We won't be able to claim this."
"Due to all this incident, I actually missed my flight."


-0.30 (slight negative):
"how much extra would that cost and and if its like does it apply to any countries"
"(uh) engaging in any high risk activities too"
"[orh] so just now the hundred eighty two hundred twenty that [one] is (ppb) for high risk or for low risk"

-0.15 (neutral):
"Nope, he doesn't have any line now, ya."
"In case I miss the payment date."
"ya sorry"

0.00 (neutral):
"Okay, it follows if I were to travel overseas, or does it only apply?"
"Okay, so annual plan for just yourself or for three of you?"
"yes the price will differ depending on the storage of the phone"

0.15 (neutral):
"Something like Huawei or Oppo would be fine."
"Yeah, I think it would be best if we buy for both."

0.30 (slight positive):
"Hi, good morning."
"How can I help you?"
"I see, I see. Okay."

0.50 (mild positive):
"Storage, I think 256GB is good enough already."
"I think the rewards card will suit you the best."

1.00 (positive):
"No, that's all. You've been wonderful, thank you very much."
"This is really helpful."

Input Format:
Each input will be a JSON object representing a single text chunk. Each batch contains 100 such text chunks, and each 'id' must correspond exactly between input and output.

{
    "id": int,  // Unique identifier for each sentence (eg. from 1 to 100 in each batch)
    "cleaned_text_for_sentiment": "string"  // The text chunk to be analyzed
}

Example:
json
{
    "id": 26,
    "cleaned_text_for_sentiment": "Wah, the internet speed today damn slow leh."
}

Output Format:
For each input text chunk, provide a corresponding JSON object with the sentiment analysis result. Ensure that each output 'id' matches the input 'id' precisely.
json
{
    "id": int,  // Same 'id' as in the input
    "GEMINI": float,  // Sentiment score between -1.00 and 1.00 with 2 decimal places
    "explanation": "string"  // Short explanation of how and why the score was determined; make sure it is reasonable. If the explanation exceeds 20 tokens, please truncate it to 26 tokens for output.
}

Output Example:
json
{
    "id": 26,
    "GEMINI": -0.70,
    "explanation": "Expresses frustration over slow internet speed."
}
Below is the input text for analysis:
'''


In [None]:
model = genai.GenerativeModel('gemini-1.5-flash')
model.count_tokens(sentiment_system_prompt)
# Increase the timeout to 120 seconds
# model.count_tokens(sentiment_system_prompt, request_options={"timeout": 120})

## Load data to Encode messages

In [None]:
import pandas as pd

data_df = pd.read_csv('/content/drive/MyDrive/PLP/input/IMDA_original3_RAW_176K.csv')
# remove unnamed column
# data_df.drop(columns=['Unnamed: 0'], inplace=True)
# drop rows with a missing session_id
data_df = data_df[~data_df['session_id'].isna()]
# cast session_id and speaker_id to integer
data_df['session_id'] = data_df['session_id'].astype(int)
data_df['speaker_id'] = data_df['speaker_id'].astype(int)
# replace "'" with a space to avoid potential quotation-mark issues in JSON encoding/decoding
data_df['cleaned_text_for_sentiment'] = data_df['cleaned_text_for_sentiment'].str.replace("'"," ")
data_df.head()

Unnamed: 0,file_name,session_id,speaker_id,speaker_type,dialog_type,x_min,x_max,text,cleaned_text_for_sentiment,word_count,duration,qualified_for_sentiment
0,app_0302_3604_phnd_cc-hol.TextGrid,302,3604,client,hol,0.0,3.0935,call three holiday,call three holiday,3,3.0935,False
1,app_0302_0018_phnd_cc-hol.TextGrid,302,18,agent,hol,3.12927,8.50931,hi good afternoon this is lily from A B C trav...,hi good afternoon this is lily from A B C trav...,17,5.38004,True
2,app_0302_3604_phnd_cc-hol.TextGrid,302,3604,client,hol,8.2211,21.06413,hi (uh) lily (uh) I'm joyce here (ppb) (um) I'...,hi (uh) lily (uh) I m joyce here (ppb) (um) I ...,20,12.84303,True
3,app_0302_3604_phnd_cc-hol.TextGrid,302,3604,client,hol,21.06413,30.21838,(um) I'm looking into (um) a package to either...,(um) I m looking into (um) a package to either...,15,9.15425,True
4,app_0302_0018_phnd_cc-hol.TextGrid,302,18,agent,hol,30.819,42.98125,hi miss joy we do have a package to korea and ...,hi miss joy we do have a package to korea and ...,24,12.16225,True


In [None]:
data_df.dialog_type.unique()

array(['hol', 'hot', 'res', 'bnk', 'ins', 'tel', 'hdb', 'moe', 'msf'],
      dtype=object)

In [None]:
data_df[data_df['dialog_type']=='bnk']['cleaned_text_for_sentiment'].head(10).tolist()

['call one banking hello thank you so much for calling A B C bank this is amy on the line and how can I help you today',
 'call one banking',
 'hi (uh) I would like to ask what is the what is the point of having a credit card (ppl)',
 '(uh) may I get your name please',
 '(uh) alice tan',
 'alice tan okay (uh) good (uh) good evening alice so (um) you re asking about details of getting a credit card is it',
 'yes',
 'okay (uh) credit card wise (uh) what is the point of getting it (uh) credit card basically how it ha~ how it works is that you are able to spend money using this credit card and at the end of the month you will get a bill (um) whereby you pay off your bill at the end of the month so different credit we ha~ basically at A B C bank we have mainly three types of credit cards',
 'so of course these three types of credit cards (um) help (uh) do have different (uh) features so we have the cashback credit card air miles credit card and rewards credit card so these three credit card

## API Call Request
1. schema constraint
2. loop to query
3. export results

In [None]:
def test_json_controlled_generation(self):
        # [START json_controlled_generation]
        import typing_extensions as typing

        class Recipe(typing.TypedDict):
            recipe_name: str
            ingredients: list[str]

        model = genai.GenerativeModel("gemini-1.5-pro-latest")
        result = model.generate_content(
            "List a few popular cookie recipes.",
            generation_config=genai.GenerationConfig(
                response_mime_type="application/json", response_schema=list[Recipe]
            ),
        )
        print(result)
        # [END json_controlled_generation]


In [None]:
# schema constraint
import typing_extensions as typing

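class SentimentScore(typing.TypedDict):
    id: int
    GEMINI: float
    explanation: str


# Keep every generation setting in one config object, so the model defaults
# and the config passed per request to generate_content() cannot diverge.
generation_config = genai.GenerationConfig(
    candidate_count=1,
    temperature=0.1,
    response_mime_type="application/json",
    response_schema=list[SentimentScore],
)

model = genai.GenerativeModel('gemini-1.5-flash',
                              generation_config=generation_config)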

In [None]:
import numpy as np
import time
import json


CHUNK_SIZE = 100  # Reduce chunk size to avoid exceeding response length limit

# Keep only the banking dialogs, then split them into chunks of at most CHUNK_SIZE rows each
data_df = data_df[data_df['dialog_type']=='bnk']
chunks = np.array_split(data_df, int(np.ceil(len(data_df) / CHUNK_SIZE)))
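
The chunk size is bounded by the model's output limit, because every input row produces one JSON object in the response. The sketch below estimates a safe batch size from the `output_token_limit` of 8192 reported earlier. The per-record cost of 60 tokens and the 0.8 safety margin are assumptions for illustration, not measured values, and both helper names are hypothetical.

```python
import math

OUTPUT_TOKEN_LIMIT = 8192  # output_token_limit reported by genai.get_model above
TOKENS_PER_RECORD = 60     # ASSUMPTION: rough cost of one {"id", "GEMINI", "explanation"} object


def max_records_per_request(output_token_limit, tokens_per_record, safety_margin=0.8):
    """Largest batch whose estimated JSON output fits in a fraction of the output limit."""
    usable = output_token_limit * safety_margin
    return max(1, int(usable // tokens_per_record))


def number_of_chunks(n_rows, chunk_size):
    """Number of requests needed to cover n_rows at the given chunk size."""
    return math.ceil(n_rows / chunk_size)


print(max_records_per_request(OUTPUT_TOKEN_LIMIT, TOKENS_PER_RECORD))
```

Under these assumed numbers the estimate is 109 records per request. For comparison, a batch of 200 would leave about 41 output tokens per record (8192 / 200), which is tight once an explanation of up to 26 tokens and the JSON keys are included.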

In [None]:
input_dict = chunks[0][["cleaned_text_for_sentiment"]].reset_index(names='id').to_dict('records')
response = model.generate_content(
          sentiment_system_prompt + str(input_dict),
          generation_config=generation_config)

KeyboardInterrupt: 
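
The request above appends `str(input_dict)` to the prompt, which produces Python literal syntax (single-quoted keys) rather than JSON, even though the prompt describes the input as JSON objects. A small sketch of the alternative with `json.dumps` follows; the two records are made-up examples shaped like the output of `to_dict('records')`.

```python
import json

# Hypothetical records shaped like
# chunk[[...]].reset_index(names='id').to_dict('records').
records = [
    {"id": 607, "cleaned_text_for_sentiment": "hi (uh) I'm joyce here"},
    {"id": 608, "cleaned_text_for_sentiment": 'she said "okay" [lah]'},
]

python_repr = str(records)                               # Python literal syntax, not JSON
json_payload = json.dumps(records, ensure_ascii=False)   # valid JSON, quotes escaped as needed

# The JSON form round-trips exactly, apostrophes included.
assert json.loads(json_payload) == records
```

With this encoding, replacing apostrophes in the text before sending it may no longer be necessary, although that depends on how the model handles the input and has not been verified here.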

In [None]:
sentiment_scores_dict = json.loads(response.text)
print("[DEBUG] http response dict: ", sentiment_scores_dict)

[DEBUG] http response dict:  [{'id': 607}, {'id': 608}, {'id': 609}, {'id': 610}, {'id': 611}, {'id': 612}, {'id': 613}, {'id': 614}, {'id': 615}, {'id': 616}, {'id': 617}, {'id': 618}, {'id': 619}, {'id': 620}, {'id': 621}, {'id': 622}, {'id': 623}, {'id': 624}, {'id': 625}, {'id': 626}, {'id': 627}, {'id': 628}, {'id': 629}, {'id': 630}, {'id': 631}, {'id': 632}, {'id': 633}, {'id': 634}, {'id': 635}, {'id': 636}, {'id': 637}, {'id': 638}, {'id': 639}, {'id': 640}, {'id': 641}, {'id': 642}, {'id': 643}, {'id': 644}, {'id': 645}, {'id': 646}, {'id': 647}, {'id': 648}, {'id': 649}, {'id': 650}, {'id': 651}, {'id': 652}, {'id': 653}]
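
The debug output above contains only `id` fields, with no `GEMINI` score or `explanation`, so a parsed response cannot be assumed complete. A minimal validation sketch is shown below; the function name `validate_scores` and its return shape are hypothetical, and it assumes the response has already been parsed into a list of dicts.

```python
REQUIRED_KEYS = {"id", "GEMINI", "explanation"}


def validate_scores(records, expected_ids):
    """Split parsed model output into valid records and a list of problem descriptions."""
    valid, problems = [], []
    returned_ids = set()
    for rec in records:
        returned_ids.add(rec.get("id"))
        missing = REQUIRED_KEYS - rec.keys()
        if missing:
            problems.append(f"id {rec.get('id')}: missing {sorted(missing)}")
            continue
        score = rec["GEMINI"]
        if not isinstance(score, (int, float)) or not -1.0 <= score <= 1.0:
            problems.append(f"id {rec['id']}: score out of range: {score!r}")
            continue
        valid.append(rec)
    # Input ids that never appear in the response at all.
    for missing_id in sorted(set(expected_ids) - returned_ids):
        problems.append(f"id {missing_id}: not returned")
    return valid, problems
```

Running this on each parsed response before joining it to the input chunk would surface incomplete records as explicit messages rather than as missing values after the join.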


In [None]:


failed_records = []
faild_input_dfs = []
queried_result_dfs = []
merged_result_df = pd.DataFrame()

for i, chunk in enumerate(chunks):
    if i > 20:  # for test
        break

    print(f"Processing chunk {i+1}/{len(chunks)}")

    input_dict = chunk[["speaker_type","cleaned_text_for_sentiment"]].reset_index(names='id').to_dict('records')
    try:
        # get Gemini response
        response = model.generate_content(
          sentiment_system_prompt + str(input_dict),
                                    generation_config=generation_config)
    except Exception as e:
      print(f"Failed at chunk {i+1}: API Request error: {e}")
      faild_input_dfs.append(chunk)
      failed_records.append({"id":i+1, "desc": f"API Request error: {e}"})
      queried_result_dfs.append(pd.DataFrame())
      continue

    try:
        if response:
            sentiment_scores_dict = json.loads(response.text)
            print("[DEBUG] http response dict: ", sentiment_scores_dict)
            queried_result = pd.DataFrame.from_records(sentiment_scores_dict)
            queried_result_dfs.append(queried_result)
        else:
            print(f"Failed at chunk {i+1}: Empty Result!")
            faild_input_dfs.append(chunk)
            failed_records.append({"id":i+1, "desc": f"Empty Result"})
            queried_result_dfs.append(pd.DataFrame())
            continue
    except Exception as e:
      print(f"Failed at chunk {i+1}: JSON decode and to dataframe!")
      faild_input_dfs.append(chunk)
      failed_records.append({"id":i+1, "desc": f"JSON decode and to dataframe: {e}"})
      queried_result_dfs.append(pd.DataFrame())
      continue
    print("queried result: ", queried_result)
    # join result to input chunk
    try:
      temp = chunk.join(queried_result.set_index('id'))
      # Check for missing GEMINI values
      if temp['GEMINI'].isna().any():
        # Version 1: still add the rest matched result into the dataframe
        print(f"ALERT: Missing GEMINI value(s) at chunk {i+1}")
        # # Version 2: Raise an exception and treat as fail
        # raise ValueError(f"ALERT: Missing GEMINI value(s) at chunk {i+1}")
      merged_result_df = pd.concat([merged_result_df, temp])
    except Exception as e:
      print(f"Failed at chunk {i+1}: Could not join result: {e}")
      faild_input_dfs.append(chunk)
      failed_records.append({"id":i+1, "desc": f"Could not join result: {e}"})
      continue

    time.sleep(1)
print(merged_result_df.shape)
print(failed_records)

In [None]:
import logging

# Set up logging to capture detailed information
logging.basicConfig(filename='genai_debug.log', level=logging.INFO)

CHUNK_SIZE = 50  # Further reduce chunk size if necessary
# Split the DataFrame into chunks of at most CHUNK_SIZE rows each
chunks = np.array_split(data_df, int(np.ceil(len(data_df) / CHUNK_SIZE)))
failed_records = []
faild_input_dfs = []
queried_result_dfs = []
merged_result_df = pd.DataFrame()

for i, chunk in enumerate(chunks):
    if i < 0 or i > 200:
        continue

    logging.info(f"Processing chunk {i+1}/{len(chunks)}")

    input_dict = chunk[["speaker_type", "dialog_type", "cleaned_text_for_sentiment"]].reset_index(names='id').to_dict('records')

    try:
        # Get Gemini response
        response = model.generate_content(
            sentiment_system_prompt + str(input_dict),
            generation_config=generation_config
        )
    except Exception as e:
        logging.error(f"Failed at chunk {i+1}: API Request error: {e}")
        faild_input_dfs.append(chunk)
        failed_records.append({"id": i+1, "desc": f"API Request error: {e}"})
        queried_result_dfs.append(pd.DataFrame())
        continue

    try:
        logging.info(f"Response from chunk {i+1}: {response.text}")

        if response:
            sentiment_scores_dict = json.loads(response.text)
            queried_result = pd.DataFrame.from_records(sentiment_scores_dict)
            queried_result_dfs.append(queried_result)
        else:
            logging.warning(f"Failed at chunk {i+1}: Empty Result!")
            faild_input_dfs.append(chunk)
            failed_records.append({"id": i+1, "desc": "Empty Result"})
            queried_result_dfs.append(pd.DataFrame())
            continue
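    except ValueError as e:
        # json.JSONDecodeError is a ValueError subclass; response.text may also
        # raise ValueError when the response contains no usable text, so it is
        # not read again here.
        logging.error(f"Failed at chunk {i+1}: JSON decode error: {e}")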
        faild_input_dfs.append(chunk)
        failed_records.append({"id": i+1, "desc": f"JSON decode error: {e}"})
        queried_result_dfs.append(pd.DataFrame())
        continue

    # Join result to input chunk
    try:
        # Join on the chunk's index (the 'id' sent to the model); chunk has no 'id' column
        temp = chunk.join(queried_result.set_index('id'))
        if temp['GEMINI'].isna().any():
            logging.warning(f"ALERT: Missing GEMINI value(s) at chunk {i+1}")
        merged_result_df = pd.concat([merged_result_df, temp])
    except KeyError as e:
        logging.error(f"Failed at chunk {i+1}: Could not join result, missing GEMINI: {e}")
        faild_input_dfs.append(chunk)
        failed_records.append({"id": i+1, "desc": f"Missing GEMINI: {e}"})
        continue
    except Exception as e:
        logging.error(f"Failed at chunk {i+1}: Could not join result: {e}")
        faild_input_dfs.append(chunk)
        failed_records.append({"id": i+1, "desc": f"Could not join result: {e}"})
        continue

    time.sleep(1)

logging.info(f"Completed processing with {len(failed_records)} failures.")

ERROR:root:Failed at chunk 1: API Request error: HTTPConnectionPool(host='localhost', port=44959): Read timed out. (read timeout=600.0)
ERROR:root:Failed at chunk 2: API Request error: HTTPConnectionPool(host='localhost', port=44959): Read timed out. (read timeout=600.0)
ERROR:root:Failed at chunk 3: API Request error: HTTPConnectionPool(host='localhost', port=44959): Read timed out. (read timeout=600.0)
ERROR:root:Failed at chunk 4: API Request error: HTTPConnectionPool(host='localhost', port=44959): Read timed out. (read timeout=600.0)
ERROR:root:Failed at chunk 5: API Request error: HTTPConnectionPool(host='localhost', port=44959): Read timed out. (read timeout=600.0)
ERROR:root:Failed at chunk 6: API Request error: HTTPConnectionPool(host='localhost', port=44959): Read timed out. (read timeout=600.0)
ERROR:root:Failed at chunk 7: API Request error: HTTPConnectionPool(host='localhost', port=44959): Read timed out. (read timeout=600.0)
ERROR:root:Failed at chunk 8: API Request error:
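
Because every chunk in the log above fails with a read timeout, retrying a request with a growing delay may recover some chunks before they are recorded as failures. The helper below is a generic sketch: `call_with_retries` and its parameters are hypothetical names, and the retry counts and delays are assumptions rather than tuned values.

```python
import time


def call_with_retries(fn, max_attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call fn(); on an exception wait base_delay * 2**attempt, then try again."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller record the failure
            sleep(base_delay * 2 ** attempt)


# Possible use inside the loop (not executed here):
# response = call_with_retries(
#     lambda: model.generate_content(
#         sentiment_system_prompt + str(input_dict),
#         generation_config=generation_config,
#     )
# )
```

The existing `except Exception` branch in the loop would still catch the final failure, so the bookkeeping in `failed_records` stays unchanged.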

In [None]:
merged_result_df


In [None]:
print("number of records Null Value matched: ", merged_result_df[merged_result_df['GEMINI'].isna()].shape[0])
merged_result_df[merged_result_df['GEMINI'].isna()]

In [None]:
merged_result_df.to_excel("/content/drive/MyDrive/PLP/IMDA_Original/IMDA_original3_V3b_Gemini_0903_200_a.xlsx",index=False)
merged_result_df1 = merged_result_df[~merged_result_df['GEMINI'].isna()]
merged_result_df1.to_excel("/content/drive/MyDrive/PLP/IMDA_Original/IMDA_original3_V3b_Gemini_0903_200_ra.xlsx",index=False)
merged_result_df1.shape

In [None]:
print("number of failure: ", len(faild_input_dfs))
pd.DataFrame.from_records(failed_records).to_csv(
    "/content/drive/MyDrive/PLP/IMDA_Original/IMDA_original3_V3b_Gemini_0903_200_failed_reason.csv",index=False)

failed_input_concat_df = pd.DataFrame()
for failed_input_df in faild_input_dfs:
    failed_input_concat_df = pd.concat([failed_input_concat_df,failed_input_df])
failed_input_concat_df.to_excel("/content/drive/MyDrive/PLP/IMDA_Original/IMDA_original3_V3b_Gemini_0903_200_failed_input.xlsx",index=False)
failed_input_concat_df.shape[0], failed_input_concat_df.shape[0]//100

# END reference

-   Prompt design is the process of creating prompts that elicit the desired response from language models. Writing well structured prompts is an essential part of ensuring accurate, high quality responses from a language model. Learn about best practices for [prompt writing](https://ai.google.dev/docs/prompt_best_practices).
-   Gemini offers several model variations to meet the needs of different use cases, such as input types and complexity, implementations for chat or other dialog language tasks, and size constraints. Learn about the available [Gemini models](https://ai.google.dev/models/gemini).
-   Gemini offers options for requesting [rate limit increases](https://ai.google.dev/docs/increase_quota). The rate limit for Gemini-Pro models is 60 requests per minute (RPM).