Due to the current situation (`Updated 09/02/2026`),

- Google Gemini has reduced the rate limits for several models, such as `gemini-2.5-flash` and `gemini-3-flash` (text models used in Colab notebooks), to a **limit of 20 Requests Per Day (RPD)**.

- To continue using these models seamlessly with sufficient rate limits, it is necessary to upgrade to the **pay-as-you-go tier** (link a Billing Account).
  - 👉 You can learn how to do this here: [https://ai.google.dev/gemini-api/docs/billing](https://ai.google.dev/gemini-api/docs/billing)

- Alternatively, you can follow the Groq API approach described below.


# Overview of this notebook

Most of the concepts and code are adapted from:
- https://github.com/dair-ai/Prompt-Engineering-Guide
- https://ai.google.dev/gemini-api/docs/prompting-strategies
- https://myframework.net/icio-ai-prompt-framework/

# Environment and model setup

## Approach 1: Gemini

In [1]:
# %%capture
# !pip install -qU langchain-google-genai

Request a Google API key here: https://aistudio.google.com/app/apikey

In [2]:
# from getpass import getpass
# import os

# if "GOOGLE_API_KEY" not in os.environ:
#     os.environ["GOOGLE_API_KEY"] = getpass("Enter your Google AI API key: ")

In [3]:
# from langchain_google_genai import ChatGoogleGenerativeAI

# llm = ChatGoogleGenerativeAI(
#     model="gemini-2.5-flash",
#     temperature=0,
#     max_tokens=None,
#     timeout=None,
#     max_retries=2,
#     # other params...
# )

## Approach 2: Groq API


A free-tier **alternative** is the **Groq API** (compatible with LangChain). It does not require linking a credit card and offers several models, such as:

- `llama-3.1-8b-instant` (Rate limit: 30 RPM, 14.4K RPD)
- `llama-3.3-70b-versatile` (Rate limit: 30 RPM, 1K RPD)
- Other available models: https://console.groq.com/settings/limits

*(RPM = Requests Per Minute, RPD = Requests Per Day)*
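An RPM limit translates directly into a minimum spacing between requests. The sketch below shows the arithmetic; the helper name `min_seconds_between_requests` is hypothetical, and it assumes requests are spread evenly and that only the request count matters.

```python
def min_seconds_between_requests(rpm: int) -> float:
    """Smallest even spacing (in seconds) that stays within `rpm` requests per minute."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    return 60.0 / rpm


print(min_seconds_between_requests(30))  # 2.0
```

With a 30 RPM limit, pausing about 2 seconds between calls keeps a simple loop within the limit.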

👉 You can sign up and get your API Key here: [https://console.groq.com/keys](https://console.groq.com/keys)


In [4]:
!pip install -qU langchain-groq


In [5]:
import getpass
import os

os.environ["GROQ_API_KEY"] = getpass.getpass("Enter your Groq API key: ")

Enter your Groq API key: ··········


In [6]:
from langchain_groq import ChatGroq

llm = ChatGroq(
    model="llama-3.1-8b-instant", # can change
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
)

# Basic

## System Prompt / User Prompt

`System Prompt`:

The system prompt establishes the overall context, persona, and behavioral guidelines for the LLM. It dictates how the model should generally respond and interact, setting the foundational rules for all subsequent interactions within a session or application.

`User Prompt (Human)`:

The user prompt is the specific query or instruction provided by the user to the LLM. It defines the immediate task or question the user wants the model to address, operating within the framework established by the system prompt.


In [7]:
# Demo 1: System - Health Scientist / User - Explain the importance of exercise
messages = [
    ("system", "You are a health scientist who always provides factual and evidence-based answers."),
    ("human", "Explain the importance of exercise in a short sentence."),
]
ai_msg = llm.invoke(messages)
print(ai_msg.content)

Regular exercise is crucial for maintaining physical and mental health, as it reduces the risk of chronic diseases, improves cardiovascular function, enhances cognitive function, and promotes overall well-being.


In [8]:
# Demo 2: System - Elderly Person Complaining / User - Explain the importance of exercise
messages = [
    ("system", "You are an elderly person who often complains."),
    ("human", "Explain the importance of exercise in a short sentence."),
]
ai_msg = llm.invoke(messages)
print(ai_msg.content)

*sigh* Fine, exercise is supposed to be good for you, keeps the old bones from creaking too much and the heart from giving out, but I swear, it's a chore.


In [9]:
# Demo 3: System - Mother Explaining to a 5-year-old / User - Explain the importance of exercise
messages = [
    ("system", "You are a mother who needs to answer questions from a 5-year-old child, always explaining complex topics in the simplest way possible."),
    ("human", "Explain the importance of exercise in a short sentence."),
]
ai_msg = llm.invoke(messages)
print(ai_msg.content)

Exercise is like giving our bodies a big hug from the inside out, making us strong and healthy so we can run, play, and have lots of fun!


In [10]:
# Demo 4: Professional Assistant (Python factorial)
messages = [
    ("system", "You are a helpful and informative assistant. Your responses should be clear, concise, and professional. Avoid making assumptions and always ask for clarification if a user's request is ambiguous."),
    ("human", "Write a Python function that calculates the factorial of a given number."),
]
ai_msg = llm.invoke(messages)
print(ai_msg.content)

**Calculating the Factorial of a Number in Python**

Here's a simple Python function that calculates the factorial of a given number using recursion and iteration:

### Recursive Implementation

```python
def factorial_recursive(n):
    """
    Calculate the factorial of a number using recursion.

    Args:
        n (int): The number to calculate the factorial for.

    Returns:
        int: The factorial of the given number.

    Raises:
        ValueError: If n is a negative integer.
    """
    if not isinstance(n, int):
        raise TypeError("Input must be an integer.")
    if n < 0:
        raise ValueError("Input must be a non-negative integer.")
    elif n == 0 or n == 1:
        return 1
    else:
        return n * factorial_recursive(n - 1)
```

### Iterative Implementation

```python
def factorial_iterative(n):
    """
    Calculate the factorial of a number using iteration.

    Args:
        n (int): The number to calculate the factorial for.

    Returns:
        int: 

## User Prompt Framework - ICIO

The ICIO framework is a simple and practical method that **helps you structure your prompts** step by step.
- `Instruction (I)` --> What do you want the AI to do?

  - The instruction should be specific and direct. A clear task helps the AI give you the right kind of output.
- `Context (C)` --> Give background information. Why are you doing this task? What’s the situation?

  - Context helps the AI better understand your purpose and tone.
  - ***Optional, but nice to have.***

- `Input (I)` --> What exact text or data should the AI process?
  - Provide the content the AI needs to work with.
  - Without input data, the AI may guess or go off track. Be clear and complete.

- `Output (O)` --> Set the style or format of the output. What should the response look like? What tone or structure do you expect?
  - This helps guide the AI to produce the kind of result you want.
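The four ICIO fields can be assembled into a prompt with a small helper. This is a minimal sketch, not part of any library: the function name `build_icio_prompt` and the section labels are assumptions chosen for illustration.

```python
def build_icio_prompt(instruction: str, input_text: str, output_format: str, context: str = "") -> str:
    """Assemble an ICIO prompt with one labeled section per field."""
    sections = [
        ("Instruction", instruction),
        ("Context", context),
        ("Input", input_text),
        ("Output", output_format),
    ]
    # Skip empty fields so the optional Context section can be left out
    return "\n\n".join(
        f"{label}:\n{value.strip()}" for label, value in sections if value.strip()
    )


prompt = build_icio_prompt(
    instruction="Summarize customer opinions.",
    input_text="The battery lasts 3 days. Bluetooth often drops.",
    output_format="A table with columns: Feature, Status, Notes.",
)
print(prompt)
```

Labeling each section makes it easy to see at a glance which ICIO field is missing from a prompt.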

In [11]:
# Example A: Customer Feedback Summary

# ICIO fields
instruction = "Summarize customer opinions."
context = "For the product development team to consider improvements in the next version."
input_text = (
    "Many customers like the battery lasting up to 3 days, which is much better than the older version. "
    "The AMOLED screen provides vibrant colors and is clearly visible even under bright sunlight. "
    "However, some users reported that Bluetooth connections with certain headphones often drop. "
    "The sleep tracking system is also not very accurate and sometimes fails to record data. "
    "Additionally, customers would like to see a blood pressure monitoring function added."
)
output_format = 'Summarize into a table with 3 columns: "Feature", "Status (Good/Needs Improvement/Requested)", "Notes".'

# Create messages
messages = [
    ("system", "You are a product analyst who summarizes customer feedback into clear, structured tables for business teams."),
    ("human",
     f"{instruction}\n"
     f"{context}\n"
     f"{input_text}\n"
     f"{output_format}"
    ),
]

# Invoke model
ai_msg = llm.invoke(messages)
print(ai_msg.content)

Here's a summary of customer opinions in a table format:

| Feature | Status | Notes |
| --- | --- | --- |
| Battery Life | Good | Lasts up to 3 days, an improvement from the older version |
| AMOLED Screen | Good | Provides vibrant colors and is visible under bright sunlight |
| Bluetooth Connectivity | Needs Improvement | Often drops connection with certain headphones |
| Sleep Tracking System | Needs Improvement | Not very accurate and sometimes fails to record data |
| Blood Pressure Monitoring | Requested | Customers would like to see this feature added |


***To summarize, keeping ICIO in mind when prompting helps ensure that the prompt is complete and clear and that the LLM produces the desired output.***

## Combine

Note:
  - In practice, many modern prompts don't separate system and user instructions. Instead, they combine them into a single, comprehensive prompt.
  - This approach is effective because today's large language models are skilled at understanding and following complex, structured instructions.
  - You can define the model's persona, give it specific instructions, and provide the necessary data all within one prompt.

In [12]:
# Example B: Customer Feedback Summary (Single Prompt)

prompt = '''
You are a product analyst who summarizes customer feedback into clear, structured tables for business teams.
Summarize customer opinions for the product development team to consider improvements in the next version.
Customer Opinions:
    "Many customers like the battery lasting up to 3 days, which is much better than the older version. "
    "The AMOLED screen provides vibrant colors and is clearly visible even under bright sunlight. "
    "However, some users reported that Bluetooth connections with certain headphones often drop. "
    "The sleep tracking system is also not very accurate and sometimes fails to record data. "
    "Additionally, customers would like to see a blood pressure monitoring function added."
Output: Summarize into a table with 3 columns: "Feature", "Status (Good/Needs Improvement/Requested)", "Notes".
'''

# Invoke model
ai_msg = llm.invoke(prompt)
print(ai_msg.content)

**Customer Feedback Summary Table**

| Feature | Status | Notes |
| --- | --- | --- |
| Battery Life | Good | Lasts up to 3 days, an improvement from the older version |
| AMOLED Screen | Good | Provides vibrant colors and is visible under bright sunlight |
| Bluetooth Connectivity | Needs Improvement | Drops connection with certain headphones |
| Sleep Tracking System | Needs Improvement | Not very accurate and sometimes fails to record data |
| Blood Pressure Monitoring | Requested | Customers would like to see this feature added |


## Structured Input

In prompt engineering, **structured input** helps guide the LLM to focus on exactly what we want.  

One common technique is using **delimiters** (special symbols or markers) to clearly separate instructions, context, and input data.


Why use delimiters?
- They **reduce ambiguity** → the model doesn’t “guess” where instructions or content begin/end.  
- They **minimize misinterpretation** → the model treats the content inside delimiters as a defined block.  
- They are especially useful when prompts are **long, multi-part, or contain different types of information**.
---

Examples of delimiters

You can use different symbols such as:
- Triple dashes (---)
- Triple hashtags (###)
- Triple backticks: \`\`\` ... \`\`\`
- Triple quotes: """ ... """
- Angle brackets: < ... >
- Tags: `<instruction> ... </instruction>`
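When the delimited content comes from users or external files, it may already contain the delimiter itself, which would end the block early. The sketch below wraps content in tags and rejects that case; `wrap_in_tag` is a hypothetical helper, and raising an error (rather than escaping the text) is just one possible policy.

```python
def wrap_in_tag(tag: str, content: str) -> str:
    """Wrap content in <tag>...</tag>, refusing content that already contains the closing tag."""
    closing = f"</{tag}>"
    if closing in content:
        # Text containing the closing tag could terminate the block prematurely
        raise ValueError(f"content already contains {closing}")
    return f"<{tag}>\n{content.strip()}\n{closing}"


article = "Sales of the new smartwatch have been disappointing."
prompt = "\n\n".join([
    wrap_in_tag("Instructions", "Summarize the text inside <Article> in one sentence."),
    wrap_in_tag("Article", article),
])
print(prompt)
```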

In [13]:
# The raw text to be summarized
text = """
In the digital age, online marketing has become the cornerstone of businesses of all sizes, offering a broad reach to consumers at a lower cost than traditional marketing.
Popular online marketing tools include SEO (Search Engine Optimization), Social Media Marketing, and high-quality Content Marketing.
Leveraging data analytics also helps businesses analyze customer behavior and refine their strategies effectively.
"""

# The prompt using delimiters (triple backticks ```)
prompt = f"""You are a helpful assistant.
Summarize the text within the triple backticks concisely, in no more than two sentences.

```{text}```
"""

ai_msg = llm.invoke(prompt)
print(ai_msg.content)

Online marketing has become a cornerstone for businesses, offering a broad reach at a lower cost than traditional marketing. Key tools include SEO, social media marketing, content marketing, and data analytics to refine strategies.


In [14]:
prompt = f"""
<Instructions>
You are a marketing expert. Analyze the article within <Article> and provide recommendations based on the topics outlined in <Response_Format>.
</Instructions>

<Article>
Our company recently launched a new smartwatch, but sales have been disappointing. Most customers say the features aren't unique compared to competitors, and the price is too high for the value they receive.
</Article>

<Response_Format>
### Problem Analysis:
- [Summary of main issues]

### Strategic Recommendations:
- [Suggestion for the product]
- [Suggestion for pricing]
- [Suggestion for marketing communications]
</Response_Format>
"""

ai_msg = llm.invoke(prompt)
print(ai_msg.content)

### Problem Analysis:
The main issues with the new smartwatch are:

- Lack of unique features compared to competitors, making it difficult to differentiate the product in the market.
- High price point that does not provide sufficient value to customers, leading to disappointing sales.

### Strategic Recommendations:

#### Suggestion for the product:
To address the issue of lacking unique features, we recommend the following:

- Conduct market research to identify emerging trends and technologies in the smartwatch industry.
- Develop a minimum viable product (MVP) that incorporates a unique feature or functionality that sets the smartwatch apart from competitors.
- Consider partnering with a popular fitness or wellness brand to integrate their technology into the smartwatch, making it more appealing to customers.
- Offer customization options, such as interchangeable straps or watch faces, to give customers a sense of personalization and ownership.

#### Suggestion for pricing:
To addr

Explanation:

- `<Instructions>`: Sets the model's persona and primary objective.

- `<Article>`: Contains the raw data to be analyzed.

- `<Response_Format>`: Clearly outlines the desired structure of the output. This forces the model to organize its response systematically and address all specified points.



## Structured Output

`CSV` is best reserved for situations where the data is exclusively flat and **tabular**, like a basic spreadsheet.

`JSON` is the clear winner for most tasks today because it can handle **hierarchical and nested data**. This is essential for working with APIs, configurations, and any data that isn't a simple table. It also natively supports data types like integers, strings, and booleans, which simplifies processing.
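The difference can be seen with the standard library alone. In this sketch (the record and column names are illustrative), JSON keeps the nesting and the integer type after a round trip, while CSV needs the nested scores flattened into columns and returns every value as a string.

```python
import csv
import io
import json

record = {"name": "Alice", "age": 21, "scores": {"Math": 85, "English": 92}}

# JSON keeps the nesting and the integer type after a round trip
from_json = json.loads(json.dumps(record))
print(type(from_json["age"]).__name__, from_json["scores"]["Math"])

# CSV is flat: nested scores are spread into columns, and values are read back as strings
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["name", "age", "math", "english"])
writer.writeheader()
writer.writerow({"name": "Alice", "age": 21, "math": 85, "english": 92})
from_csv = next(csv.DictReader(io.StringIO(buffer.getvalue())))
print(type(from_csv["age"]).__name__, from_csv["math"])
```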

### Output : CSV

In [15]:
# Example: Structured output (CSV)
prompt = """You are a helpful assistant.
**Task:** Convert the following customer list into a CSV string.
**Output Format:** The first row should contain the headers "Name" and "City".
The subsequent rows should contain the customer data, with values separated by commas.
The whole answer should be inside a ```csv ... ``` block.
Respond with the final answer only.
**Data:**
- John Doe from New York
- Jane Smith from London
- Peter Jones from Tokyo
"""

ai_msg_csv = llm.invoke(prompt)
print("Structured Output:\n", ai_msg_csv.content)

Structured Output:
 ```csv
Name,City
John Doe,New York
Jane Smith,London
Peter Jones,Tokyo
```


#### Parsing CSV Output into a DataFrame

We use a regex pattern to find the content of the ```csv``` block and convert it into a DataFrame.

In [16]:
import re
import pandas as pd
import io

def csv_string_to_df(text: str) -> pd.DataFrame:
    """
    Extracts CSV content from a string and converts it into a pandas DataFrame.

    Args:
        text (str): The input string containing CSV content enclosed in ```csv...```.

    Returns:
        pd.DataFrame: A pandas DataFrame containing the extracted data.
    """
    # Use a regex pattern to find the content between the delimiters
    match = re.search(r'```csv\s(.*?)```', text, re.DOTALL)

    if match:
        # Extract the content from the first capturing group
        csv_content = match.group(1).strip()

        # Use io.StringIO to treat the string as a file
        data = io.StringIO(csv_content)

        # Read the "file" into a pandas DataFrame
        df = pd.read_csv(data)

        return df
    else:
        # Return an empty DataFrame or raise an error if no match is found
        print("No CSV content found within ```csv...``` delimiters.")
        return pd.DataFrame()

In [17]:
csv_string_to_df(ai_msg_csv.content)

          Name      City
0     John Doe  New York
1   Jane Smith    London
2  Peter Jones     Tokyo


### Output : JSON

In [18]:
# Example: Structured output (JSON)
prompt = """
You are a helpful assistant.
For the given student record, return a JSON object with the following fields:
- name (string) → student’s full name
- age (integer) → student’s age
- scores (object) → nested dictionary with subject name as key and integer score as value
- extracurricular (array of strings) → list of activities
The whole answer must be under ```json ... ```.
Student Record:
Alice, 21 years old. Math = 85, English = 92. She joined Basketball and Drama Club.

Answer:
Respond with the final answer only.
"""

ai_msg_json = llm.invoke(prompt)
print("Structured Output:\n", ai_msg_json.content)

Structured Output:
 ```json
{
  "name": "Alice",
  "age": 21,
  "scores": {
    "Math": 85,
    "English": 92
  },
  "extracurricular": ["Basketball", "Drama Club"]
}
```


#### Parsing JSON Output into Dict

We use a regex pattern to find the content of the ```json``` block and parse it into a Python dict.

In [19]:
import re
import json

def json_string_to_dict(text: str):
    """
    Extracts JSON content from a string enclosed in ```json...```
    and parses it into a Python dict or list.

    Args:
        text (str): The input string containing JSON content enclosed in ```json...```.

    Returns:
        dict or list: Parsed JSON object (Python dict or list).
    """
    # Use regex to find JSON block
    match = re.search(r'```json\s(.*?)```', text, re.DOTALL)

    if match:
        # Extract JSON content
        json_content = match.group(1).strip()

        try:
            return json.loads(json_content)
        except json.JSONDecodeError as e:
            print("Invalid JSON:", e)
            return None
    else:
        print("No JSON content found within ```json...``` delimiters.")
        return None
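Models sometimes omit the language tag on the fence. A more tolerant extraction step could look like the sketch below, where `extract_fenced_block` is a hypothetical helper and not part of LangChain. It prefers a tagged block, falls back to an untagged one, and is not a full Markdown parser.

```python
import re
from typing import Optional

FENCE = "`" * 3  # three backticks, built this way to keep the example easy to embed


def extract_fenced_block(text: str, lang: str) -> Optional[str]:
    """Return the body of the first fenced block tagged `lang`, else of an untagged block."""
    for opener in (FENCE + re.escape(lang), FENCE):
        # Opening fence, optional trailing spaces, newline, then a lazy match up to the next fence
        match = re.search(opener + r"[ \t]*\n(.*?)" + FENCE, text, re.DOTALL)
        if match:
            return match.group(1).strip()
    return None


reply = "Here you go:\n" + FENCE + 'json\n{"age": 21}\n' + FENCE
print(extract_fenced_block(reply, "json"))  # {"age": 21}
```

The returned string can then be passed to `json.loads` or `pd.read_csv(io.StringIO(...))` as in the cells above.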

In [20]:
dict_output = json_string_to_dict(ai_msg_json.content)
dict_output

{'name': 'Alice',
 'age': 21,
 'scores': {'Math': 85, 'English': 92},
 'extracurricular': ['Basketball', 'Drama Club']}

In [21]:
dict_output['scores']['Math']

85

### Output : Pydantic Schema

- LangChain supports structured outputs, **allowing us to bind a schema (dict / JSON Schema / Pydantic) to the model** and enforce responses that follow the defined structure instead of relying only on prompt wording.
- ***However, complex output structures may still fail, so prompting and custom parsing functions are still important in some cases.***

Read more: [LangChain Docs – Structured Outputs](https://python.langchain.com/docs/concepts/structured_outputs/)
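When schema binding is unavailable or fails, a parsed dict can still be checked by hand. The sketch below uses only the standard library rather than Pydantic, so it illustrates the idea of schema validation and is not the LangChain mechanism itself. `EXPECTED_FIELDS` and `find_schema_errors` are hypothetical names.

```python
from typing import Any

# Expected top-level fields and their Python types (hypothetical schema for a student record)
EXPECTED_FIELDS = {"name": str, "age": int, "scores": dict, "extracurricular": list}


def find_schema_errors(data: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means the dict passed the check."""
    errors = []
    for field, expected_type in EXPECTED_FIELDS.items():
        if field not in data:
            errors.append(f"missing field: {field}")
        elif not isinstance(data[field], expected_type):
            errors.append(
                f"{field}: expected {expected_type.__name__}, got {type(data[field]).__name__}"
            )
    return errors


print(find_schema_errors({"name": "Alice", "age": "21", "scores": {"Math": 85}}))
# ['age: expected int, got str', 'missing field: extracurricular']
```

A non-empty error list is a natural trigger for retrying the request or falling back to a default.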


In [22]:
# pydantic schema

# Suppose we want the output to look like this:
'''{'name': 'Alice',
 'age': 21,
 'scores': {'Math': 85, 'English': 92},
 'extracurricular': ['Basketball', 'Drama Club']}'''

# We can define a class (data fields) like this:

from typing import Dict, List
from pydantic import BaseModel, Field

class DesiredOutput(BaseModel):
    name: str = Field(description="Student's first name")
    age: int = Field(description="Age in years")
    extracurricular: List[str] = Field(description="List of activities/clubs")

    # Complex data structure: the nested field below causes an error (uncomment it if you want to test).
    # subject_scores: Dict[str, int] = Field(description="Key = subject, Value = score (as a JSON object)")

In [23]:
# Wrap LLM so it returns a DesiredOutput object directly
structured_llm = llm.with_structured_output(DesiredOutput)

In [24]:
prompt = """
You are a helpful assistant.
For the given student record, extract the information.

Student Record:
Alice, 21 years old. Math = 85, English = 92. She joined Basketball and Drama Club.

Answer:
"""


# Generate output
results = structured_llm.invoke(prompt)
results

DesiredOutput(name='Alice', age=21, extracurricular=['Basketball', 'Drama Club'])

In [25]:
results.model_dump_json()

'{"name":"Alice","age":21,"extracurricular":["Basketball","Drama Club"]}'

## Boundary Condition
- **Don't know, don't guess**  
  Instruct the model to answer *"I don’t know"* if the information is unknown or unverifiable.  
  → Helps prevent the model from attempting to answer overly difficult or specific open-ended questions.  
  > Note: This depends on the **use case**, but in scenarios where we *don’t want the model to attempt an uncertain answer*, this condition is very useful.

- **Output Format Reminder**  
  Explicitly remind the model about the required output format.  
  → e.g., *"Don’t give any additional explanation, just output [format] only."*

In [26]:
# Example 1: Without boundary condition
prompt = """
What are the details of the announcement from the Meteorological Department, Announcement No. 2/2025, regarding ‘Measures to Cope with the Early Arrival of Summer Storms’?
"""

ai_msg = llm.invoke(prompt)
print("Without boundary condition:\n", ai_msg.content)


Without boundary condition:
 I'm unable to verify the details of the announcement from the Meteorological Department, Announcement No. 2/2025, regarding 'Measures to Cope with the Early Arrival of Summer Storms'.


**Key Takeaways**:
- Without boundary conditions, an LLM will still attempt to generate an answer, sometimes hallucinating content, and other times making an estimated guess while recognizing its own uncertainty.
- If you want to avoid such cases, define clear boundary conditions that tell the LLM to respond with `“I don’t know”` or another `concise fallback` instead of producing uncertain or fabricated answers.
- This keeps your system consistent and predictable.
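On the application side, the fallback reply still needs to be recognized, and models often add small variations such as a trailing period. A minimal sketch, assuming the fallback phrases listed in `FALLBACK_REPLIES` (both that set and the helper `answer_or_none` are hypothetical and should be matched to your own prompt):

```python
from typing import Optional

# Assumed fallback phrases; adjust them to the wording used in your prompt
FALLBACK_REPLIES = {"none", "i don't know"}


def answer_or_none(reply: str) -> Optional[str]:
    """Return None when the model used a fallback phrase, otherwise the cleaned reply."""
    cleaned = reply.strip()
    # Ignore case plus surrounding backticks, quotes, spaces, and periods (e.g. "None.")
    normalized = cleaned.strip("`'\". ").lower()
    return None if normalized in FALLBACK_REPLIES else cleaned


print(answer_or_none(" None."))           # None
print(answer_or_none("It rains a lot."))  # It rains a lot.
```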

In [27]:
# Example 2: With boundary condition
prompt = """
What are the details of the announcement from the Meteorological Department, Announcement No. 2/2025, regarding ‘Measures to Cope with the Early Arrival of Summer Storms’?

If the answer is not known or cannot be verified, just reply: `None`.
"""

ai_msg = llm.invoke(prompt)
print("With boundary condition:\n", ai_msg.content)


With boundary condition:
 None.


## Prompt Template

Prompt templates offer several benefits:

- **Consistency**: Ensure a consistent structure for your prompts across multiple interactions
- **Efficiency**: Easily swap out variable content without rewriting the entire prompt
- **Testability**: Quickly test different inputs and edge cases by changing only the variable portion
- **Scalability**: Simplify prompt management as your application grows in complexity
- **Version control**: Easily track changes to your prompt structure over time by keeping tabs only on the core part of your prompt, separate from dynamic inputs

### Example: Prompt Template in a Loop (Task: Sentiment Analysis)

Example Task: **Sentiment Analysis**

We use a prompt template with the approach **“run in a loop + change only variables”**.  
This demonstrates how prompt templates deliver several benefits at once:

- **Consistency**: Every iteration uses the same prompt structure.  
- **Efficiency**: Only the variable `{text}` changes in each loop.  
- **Testability**: Multiple inputs can be tested quickly by swapping variable values.  
- **Scalability**: The same template can be applied to a larger dataset without modification.  
- **Version Control**: Easily track prompt versions against results.
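One practical detail when templating with `str.format`: every brace is treated as a placeholder, so a template containing literal JSON must double its braces. The standard library's `string.Template` avoids this by using `$name` placeholders. The sketch below shows both; the template wording is illustrative only.

```python
from string import Template

# str.format treats every brace as a placeholder, so literal JSON braces must be doubled
format_template = 'Return JSON like {{"label": 1}} for the text: {text}'
print(format_template.format(text="Great movie"))

# string.Template uses $name placeholders, so literal braces need no escaping
dollar_template = Template('Return JSON like {"label": 1} for the text: $text')
print(dollar_template.substitute(text="Great movie"))
```

Both lines print the same filled prompt, so the choice mostly depends on how many literal braces the template contains.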




In [28]:
!wget https://github.com/neubig/anlp-code/raw/refs/heads/main/data/sst-sentiment-text-threeclass/dev.txt

--2026-02-09 14:40:48--  https://github.com/neubig/anlp-code/raw/refs/heads/main/data/sst-sentiment-text-threeclass/dev.txt
Resolving github.com (github.com)... 140.82.112.3
Connecting to github.com (github.com)|140.82.112.3|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/neubig/anlp-code/refs/heads/main/data/sst-sentiment-text-threeclass/dev.txt [following]
--2026-02-09 14:40:48--  https://raw.githubusercontent.com/neubig/anlp-code/refs/heads/main/data/sst-sentiment-text-threeclass/dev.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.109.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 122071 (119K) [text/plain]
Saving to: ‘dev.txt’


2026-02-09 14:40:49 (9.79 MB/s) - ‘dev.txt’ saved [122071/122071]



In [29]:
def read_xy_data(filename: str) -> tuple[list[str], list[int]]:
    x_data = []
    y_data = []
    with open(filename, 'r') as f:
        for line in f:
            label, text = line.strip().split(' ||| ')
            x_data.append(text)
            y_data.append(int(label))
    return x_data, y_data

In [30]:
x_test, y_test = read_xy_data('dev.txt')
x_test, y_test = x_test[:10], y_test[:10] # small size for quick testing

For sentiment analysis, we will be using the following prompt:

```
Analyse the sentiment of the following text: ```text```
if the sentiment is positive output '1', '0' for neutral, and '-1' for negative.
**DO NOT OFFER ANY EXPLANATION JUST OUTPUT THE NUMBER**
```
LLMs nowadays often have chain-of-thought behavior baked in, so they tend to output their reasoning before answering.

- It is important to tell the model not to output its explanation by including `**DO NOT OFFER ANY EXPLANATION JUST OUTPUT THE NUMBER**`.
- Otherwise, it will not be easy to use the outputs programmatically.

Alternatively, you can use structured outputs `(see table of contents -> Structured Output)` for ease of parsing.
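Even with that instruction, a reply may carry stray characters (quotes, a trailing period) that make a bare `int()` call fail. A slightly more forgiving parser might look like the sketch below; `parse_sentiment_label` is a hypothetical helper, and returning `None` for ambiguous replies is an assumption about how you want to handle them.

```python
import re
from typing import Optional


def parse_sentiment_label(reply: str) -> Optional[int]:
    """Return -1, 0, or 1 if the reply contains exactly one such label, else None."""
    # Match standalone -1, 0, or 1 (not digits inside longer numbers such as 10)
    labels = re.findall(r"(?<![\d-])-?[01](?!\d)", reply)
    valid = {int(label) for label in labels if label != "-0"}
    return valid.pop() if len(valid) == 1 else None


for reply in ["1", "-1", "'0'.", "positive", "1 or 0"]:
    print(repr(reply), "->", parse_sentiment_label(reply))
```

Replies that parse to `None` can be logged and counted separately instead of being silently mapped to a default label.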

In [31]:
prompt_template = """
Analyse the sentiment of the following text: ```{x_input}```
if the sentiment is positive output '1', '0' for neutral, and '-1' for negative.
**DO NOT OFFER ANY EXPLANATION JUST OUTPUT THE NUMBER**"""

In [32]:
from tqdm.notebook import tqdm

In [33]:
import time
from tqdm.notebook import tqdm

output = []
for sent in tqdm(x_test):
    try:
        prompt_filled = prompt_template.format(x_input=sent)
        print('prompt:', prompt_filled)  # debugging
        output_res = llm.invoke(prompt_filled).content.strip()
        print('response:', output_res)  # debugging
        print('--' * 20)
        output.append(int(output_res))
    except Exception as e:
        # Fall back to neutral (0) if the call fails or the reply is not an integer
        print('error:', e)
        output.append(0)
    time.sleep(2)  # 2-second delay to avoid rate limit errors

  0%|          | 0/10 [00:00<?, ?it/s]

prompt: 
Analyse the sentiment of the following text: ```It 's a lovely film with lovely performances by Buy and Accorsi .```
if the sentiment is positive output '1', '0' for neutral, and '-1' for negative.
**DO NOT OFFER ANY EXPLANATION JUST OUTPUT THE NUMBER**
response: 1
----------------------------------------
prompt: 
Analyse the sentiment of the following text: ```No one goes unindicted here , which is probably for the best .```
if the sentiment is positive output '1', '0' for neutral, and '-1' for negative.
**DO NOT OFFER ANY EXPLANATION JUST OUTPUT THE NUMBER**
response: 1
----------------------------------------
prompt: 
Analyse the sentiment of the following text: ```And if you 're not nearly moved to tears by a couple of scenes , you 've got ice water in your veins .```
if the sentiment is positive output '1', '0' for neutral, and '-1' for negative.
**DO NOT OFFER ANY EXPLANATION JUST OUTPUT THE NUMBER**
response: 1
----------------------------------------
prompt: 
Analyse t

In [34]:
from sklearn.metrics import accuracy_score
accuracy_score(y_test, output)

0.8

## Additional: Temperature Setting

Settings to keep in mind

- Temperature is an important parameter to consider.
  - Keep it low if you are looking for exact or deterministic answers.
  - Keep it high if you are looking for more diverse or creative responses.

> In all the previous examples, we only set up the LLM once, and the parameter was fixed as Temperature = 0

This means every example so far was generated with a near-deterministic setting (minimal randomness), although identical outputs across runs are not strictly guaranteed.
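To see why temperature has this effect, recall the standard formulation: the model's raw scores (logits) are divided by the temperature before the softmax. The sketch below uses made-up logits for three candidate tokens. How Groq or Gemini handle sampling internally, including `temperature=0`, is not shown here and may differ.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert raw scores to probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; T=0 is usually treated as greedy argmax")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
for t in (0.2, 1.0, 2.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

At low temperature almost all probability mass sits on the top-scoring token, while at high temperature the alternatives become much more likely to be sampled.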

### Approach 1 : Gemini

Temperature Range for Gemini-2.5-flash : 0-2 (default 1)

>Ref: https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini/2-5-flash

In [35]:
# from langchain_google_genai import ChatGoogleGenerativeAI

# llm_low_temp = ChatGoogleGenerativeAI(
#     model="gemini-2.5-flash",
#     temperature=0,
#     max_tokens=None,
#     timeout=None,
#     max_retries=2,
#     # other params...
# )

In [36]:
# llm_high_temp = ChatGoogleGenerativeAI(
#     model="gemini-2.5-flash",
#     temperature=2,
#     max_tokens=None,
#     timeout=None,
#     max_retries=2,
#     # other params...
# )

In [37]:
# # llm_low_temp
# prompt = "Write one slogan for a mobile banking application. Just only answer the slogan without additional suggestions"
# for rnd in range(3):
#   try:
#     output_res = llm_low_temp.invoke(prompt).content.strip()
#     print(f"Round {rnd+1} | response:", output_res)
#   except Exception as e:
#     print(f"Round {rnd+1} | Error:", e)

#   print("--"*30)
#   time.sleep(2)

In [38]:
# # llm_high_temp
# prompt = "Write one slogan for a mobile banking application. Just only answer the slogan without additional suggestions"
# for rnd in range(3):
#   try:
#     output_res = llm_high_temp.invoke(prompt).content.strip()
#     print(f"Round {rnd+1} | response:", output_res)
#   except Exception as e:
#     print(f"Round {rnd+1} | Error:", e)

#   print("--"*30)
#   time.sleep(2)

### Approach 2 : Groq

In [39]:
llm_low_temp = ChatGroq(
    model="llama-3.1-8b-instant", # can change
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
)

In [40]:
llm_high_temp = ChatGroq(
    model="llama-3.1-8b-instant", # can change
    temperature=0.9,
    max_tokens=None,
    timeout=None,
    max_retries=2,
)

In [41]:
# low_temp
prompt = "Write one slogan for a mobile banking application. Just only answer the slogan without additional suggestions"
for rnd in range(3):
  try:
    output_res = llm_low_temp.invoke(prompt).content.strip()
    print(f"Round {rnd+1} | response:", output_res)
  except Exception as e:
    print(f"Round {rnd+1} | Error:", e)

  print("--"*30)
  time.sleep(2)

Round 1 | response: "Bank on the go, wherever you grow."
------------------------------------------------------------
Round 2 | response: "Bank on the go, wherever you grow."
------------------------------------------------------------
Round 3 | response: "Bank on the go, wherever you grow."
------------------------------------------------------------


In [42]:
# llm_high_temp
prompt = "Write one slogan for a mobile banking application. Just only answer the slogan without additional suggestions"
for rnd in range(3):
  try:
    output_res = llm_high_temp.invoke(prompt).content.strip()
    print(f"Round {rnd+1} | response:", output_res)
  except Exception as e:
    print(f"Round {rnd+1} | Error:", e)

  print("--"*30)
  time.sleep(2) # Adding a 2-second delay to avoid rate limit error

Round 1 | response: "Bank on the move, not in line."
------------------------------------------------------------
Round 2 | response: "Banking at your fingertips, anywhere in life."
------------------------------------------------------------
Round 3 | response: "Bank on the move, wherever you go."
------------------------------------------------------------


Summary

- Low temp → Reliable, consistent outputs. Useful for classification, extraction, or when you want reproducibility.
- High temp → Diverse, creative slogans. Useful for brainstorming, ideation, or when multiple fresh options are desired.