# 🛠 Pre-trained Models and Fine-Tuning

This guide walks you through two hands-on labs:  
1. Using OpenAI's GPT API for text generation  
2. Fine-tuning a BERT model for sentiment analysis on a real-world dataset  

## Text Generation with OpenAI's GPT API

### Prerequisites

- An OpenAI API key (set as environment variable `OPENAI_API_KEY`)  
- Install the `openai` library:  
  ```bash
  pip install openai
  ```


In [1]:
import warnings
warnings.filterwarnings('ignore')

In [2]:
pip install -q openai

Note: you may need to restart the kernel to use updated packages.


In [3]:
import os
from dotenv import load_dotenv

In [4]:
# Load environment variables from dev.env in the home directory
dotenv_path = os.path.expanduser("~/dev.env")
load_dotenv(dotenv_path, override=True)

# Get API key from env
api_key = os.getenv("OPENAI_API_KEY")


In [5]:
print(dotenv_path)

C:\Users\adminuser/dev.env


In [6]:
import os
assert "OPENAI_API_KEY" in os.environ

### temperature
- The `temperature` parameter controls how random the model's output is. A lower temperature produces more predictable, focused text; a higher temperature produces more varied and unexpected text.

   - Use a low temperature for tasks that need factual, consistent answers, such as summarization, question answering, or code generation.

   - Use a high temperature for creative tasks such as writing stories, brainstorming ideas, or generating diverse marketing copy.
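
To make the effect concrete, here is a minimal sketch of how temperature reshapes a next-token distribution. It assumes the standard formulation in which raw scores (logits) are divided by the temperature before a softmax. The logits and the helper name `softmax_with_temperature` are made up for illustration, and this is not OpenAI's internal implementation.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        # Division by zero is undefined in this sketch, so reject it explicitly.
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for four candidate tokens.
logits = [2.0, 1.0, 0.5, 0.1]

for t in (0.2, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the lowest temperature almost all of the probability lands on the top-scoring token, while at the highest the distribution flattens and less likely tokens get sampled more often.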

### top_p
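- `top_p` (nucleus sampling) limits the choice of the next word to the smallest set of most likely words whose probabilities add up to at least `top_p`. For example, suppose the model ranks its candidates as "the" (0.60), "a" (0.25), and "one" (0.10), with "that" and all less likely words sharing the remaining 0.05.
- If you set top_p = 0.90 (or 90%):
   - The model sums the probabilities in order of likelihood: 0.60 (the) + 0.25 (a) = 0.85. Since 0.85 is still below 0.90, it adds the next word: 0.85 + 0.10 (one) = 0.95. This total now exceeds the 0.90 threshold.

   - As a result, the model chooses its next word only from the pool {"the", "a", "one"} and completely ignores "that" and all less likely options.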
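
The walk-through above can be sketched in a few lines. This is an illustrative version of nucleus filtering, not the API's actual implementation. The function name `nucleus` is hypothetical, the 0.05 assigned to "that" is an assumption (the text does not give its probability), and renormalizing the kept probabilities follows the usual description of nucleus sampling.

```python
def nucleus(probs, top_p):
    """Keep the most likely tokens until their cumulative probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize so the surviving probabilities sum to 1 before sampling.
    return {token: p / cumulative for token, p in kept}

# Probabilities from the example; the value for "that" is assumed.
probs = {"the": 0.60, "a": 0.25, "one": 0.10, "that": 0.05}

pool = nucleus(probs, top_p=0.90)
print(sorted(pool, key=pool.get, reverse=True))
```

With `top_p=1.0`, the default in `generate_text`, no token is cut off and only `temperature` shapes the sampling.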

In [7]:
import os
import openai

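def generate_text(prompt: str,
                  model: str = "gpt-3.5-turbo",
                  max_tokens: int = 100,      # upper bound on tokens generated per response
                  temperature: float = 0.7,   # sampling randomness
                  top_p: float = 1.0,         # nucleus sampling threshold (1.0 = no cutoff)
                  n_samples: int = 1):        # number of completions to request
    """Send a single-turn prompt to the chat endpoint and return each completion's text."""
    # api_key was loaded from dev.env in an earlier cell
    client = openai.OpenAI(api_key=api_key)

    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,
        temperature=temperature,
        top_p=top_p,
        n=n_samples
    )

    # message.content can be None, so fall back to an empty string before stripping
    return [(choice.message.content or "").strip() for choice in response.choices]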


In [8]:
#pip install guardrails-ai

#### Note: first open a CLI with the venv active, then configure Guardrails using the command below
 - `guardrails configure`

In [9]:
#!guardrails hub install hub://guardrails/competitor_check

In [10]:
#!guardrails hub install hub://guardrails/toxic_language

In [23]:
#!guardrails hub install hub://guardrails/guardrails_pii

In [27]:
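from guardrails import Guard, OnFailAction
from guardrails.hub import CompetitorCheck, ToxicLanguage
from guardrails.hub import GuardrailsPII
# Set up a Guard that runs three validators over the text
guard = Guard().use_many(
    # Checks for mentions of the listed competitor names
    CompetitorCheck(["Apple", "Microsoft", "Google"], on_fail="fix"),
    # Checks each sentence for toxic language against a 0.99 threshold
    ToxicLanguage(threshold=0.99, validation_method="sentence", on_fail="fix"),
    # Checks for dates/times, email addresses, and phone numbers
    GuardrailsPII(entities=["DATE_TIME", "EMAIL_ADDRESS", "PHONE_NUMBER"], on_fail="fix")
)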


In [38]:
prompt = "I am a software engineer and do website development. I need to know the latest technnologies in the market. If you are not a fool then please suggest some top languages and email me - atin@atin.com or call at 8596784521"
val_output = guard.validate(prompt)

In [39]:
cleaned_prompt = val_output.to_dict()['validatedOutput']

In [40]:
cleaned_prompt

'I am a software engineer and do website development. I need to know the latest technnologies in the market. If you are not a fool then please suggest some top languages and email me - <EMAIL_ADDRESS> or call at <PHONE_NUMBER>'

In [41]:
outputs = generate_text(cleaned_prompt, max_tokens=50, temperature=0.5, n_samples=2)

In [42]:
for i, text in enumerate(outputs, 1):
    print(f"\n=== Sample {i} ===\n{text}")


=== Sample 1 ===
I'm sorry, but I am unable to provide personal contact information. However, I can suggest some popular languages and technologies that are currently in demand in the market:

1. JavaScript (especially with frameworks like React, Angular, or Vue)
2.

=== Sample 2 ===
I'm sorry, but I am unable to provide personal contact information. However, I can suggest some popular languages and technologies for website development:

1. JavaScript: A versatile language used for front-end and back-end development.
2. Python: Known for



---

### 3. Experimenting with Parameters

- **`temperature`**: Higher values (0.8–1.0) → more creative outputs; lower values (0.2–0.5) → more focused outputs.  
- **`max_tokens`**: Caps the length of the response, measured in tokens.  
- **`top_p`** (nucleus sampling): Restricts sampling to the most likely tokens whose cumulative probability reaches _p_.  
- **`n`**: Number of samples to generate per prompt (the `n_samples` argument of `generate_text`).
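
Both samples printed earlier stop mid-sentence because `max_tokens=50` cut them off. The Chat Completions response reports this through each choice's `finish_reason`, which is `"length"` when the token limit was hit and `"stop"` when the model finished on its own. The sketch below shows one way to flag truncated samples. The helper name `truncated_indices` is hypothetical, and the stand-in objects only mimic the shape of `response.choices` so the example runs without an API call.

```python
from types import SimpleNamespace

def truncated_indices(choices):
    """Return the positions of choices that stopped because of the token limit."""
    return [i for i, choice in enumerate(choices)
            if choice.finish_reason == "length"]

# Stand-ins shaped like response.choices (assumed data, not real API output).
fake_choices = [
    SimpleNamespace(finish_reason="length",
                    message=SimpleNamespace(content="1. JavaScript 2.")),
    SimpleNamespace(finish_reason="stop",
                    message=SimpleNamespace(content="Done.")),
]

print(truncated_indices(fake_choices))
```

With a real response, pass `response.choices` instead; a non-empty result suggests raising `max_tokens` or asking for a shorter answer.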

> 💡 Try generating three variations of a product description by setting `n_samples=3`, then compare their diversity.
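
One rough way to put a number on "diversity" is the average word-level Jaccard distance between every pair of samples: 0 means identical wording and 1 means no words in common. This metric is an assumption chosen for simplicity rather than anything prescribed by the API, and both helper names and the sample strings below are invented placeholders for the list returned by `generate_text`.

```python
from itertools import combinations

def jaccard_distance(a, b):
    """1 minus the share of distinct words the two texts have in common."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    if not union:
        return 0.0
    return 1.0 - len(words_a & words_b) / len(union)

def mean_pairwise_distance(samples):
    """Average Jaccard distance over all pairs of samples."""
    pairs = list(combinations(samples, 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)

# Placeholder texts; in the notebook, pass the list returned by generate_text.
samples = [
    "A lightweight bottle that keeps drinks cold",
    "A lightweight bottle that keeps drinks cold",
    "Stay refreshed all day with this insulated flask",
]

print(round(mean_pairwise_distance(samples), 3))
```

Running the same prompt at a low and a high `temperature` and comparing the two scores gives a quick, if crude, check that higher values produce more varied text.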

---

### 4. Advanced: Chat Completions (GPT-3.5 / GPT-4)

OpenAI's chat endpoint takes a list of messages, each with a `role` and a `content`:

In [10]:
import os
import openai

client = openai.OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain the benefits of electric vehicles."}
    ],
    max_tokens=150,
    temperature=0.6
)

print(response.choices[0].message.content)


Electric vehicles (EVs) offer several benefits compared to traditional gasoline-powered vehicles, such as:

1. Environmental benefits: EVs produce zero tailpipe emissions, which helps reduce air pollution and greenhouse gas emissions. This can lead to improved air quality and reduced carbon footprint.

2. Cost savings: EVs generally have lower operating costs compared to gasoline vehicles. Electricity is often cheaper than gasoline, and EVs require less maintenance due to fewer moving parts.

3. Energy efficiency: EVs are more energy-efficient than gasoline vehicles, as electric motors convert a higher percentage of energy from the grid to power the vehicle compared to internal combustion engines.

4. Performance: Electric motors provide instant torque, resulting in quick acceleration and a smooth driving experience. EVs also


> 🔍 Notice how you can steer tone and style with the “system” message.
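
To try this, keep the user question fixed and vary only the system message. The sketch below builds the message lists without calling the API. The helper name `build_messages` and the persona texts, apart from "You are a helpful assistant.", are hypothetical examples.

```python
def build_messages(system_prompt, user_prompt):
    """Return a chat message list with one system and one user message."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

question = "Explain the benefits of electric vehicles."

# Hypothetical system prompts that ask for different tones.
personas = {
    "default": "You are a helpful assistant.",
    "analyst": "You are a concise analyst. Answer in three short bullet points.",
    "teacher": "You are a friendly teacher explaining things to a ten-year-old.",
}

for name, system_prompt in personas.items():
    messages = build_messages(system_prompt, question)
    # Pass `messages` as messages=... to client.chat.completions.create
    print(name, "->", messages[0]["content"])
```

Each list can be passed as the `messages` argument in the cell above; only the system entry changes between runs, so any difference in tone comes from that message.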

---