# 🤖 Multi-Provider LLM Sandbox – Gemini vs OpenAI vs Hugging Face


This notebook demonstrates how to **run the same prompt across three major LLM providers**:

- **Gemini (Google AI Studio)**
- **OpenAI (Chat Completions API)**
- **Hugging Face (Inference API)**

You'll see how different models:
- Interpret the same prompt
- Return different levels of detail, formatting, or factuality
- Can be tuned to converge on a consistent style or tone

---

## 🧠 Why Do LLM Outputs Differ?

Each model is built with:
- Different datasets
- Different architectures (transformer variants, attention heads)
- Different alignment objectives (instruction-following, safety, speed)

They also vary by:
- Context length
- Temperature settings
- Output formatting capabilities
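
To make the temperature point concrete, here is a minimal sketch of how a sampling temperature reshapes a next-token distribution. It assumes the common formulation in which logits are divided by the temperature before a softmax; the three logit values are invented for illustration, and each provider's actual sampling pipeline may differ.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw scores to probabilities, scaled by a sampling temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens (not from any real model).
toy_logits = [2.0, 1.0, 0.5]
for t in (0.3, 1.0, 1.5):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top-scoring token, which is one reason the same prompt can look more deterministic on one provider's defaults than on another's.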

---

## 🧪 Evaluation Strategy

When comparing LLM outputs:
- ✅ Look for *semantic similarity*: do they express the same core idea?
- 🎯 Check *tone and format*: do they suit your intended use case?
- 🧠 Consider *reasoning depth*: factual density, hallucinations, and logic

You can use:
- Human review (best for policy or qualitative judgments)
- Cosine similarity (between embeddings of the outputs)
- Prompt-level scoring (via another LLM)
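
As a rough illustration of the cosine-similarity option, the sketch below compares two outputs. `toy_embed` is a hypothetical stand-in that counts words; a real comparison would use vectors from an embedding model instead. The two sample strings are invented placeholders, not actual model responses.

```python
import math
import re
from collections import Counter

def toy_embed(text):
    """Hypothetical stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))

def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dict-like counters."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Placeholder outputs written for this example.
outputs = {
    "gemini": "AI can speed up diagnosis but raises privacy risks.",
    "openai": "AI speeds up diagnosis, yet privacy risks remain.",
}
score = cosine_similarity(toy_embed(outputs["gemini"]), toy_embed(outputs["openai"]))
print(f"similarity: {score:.2f}")
```

Word counts miss paraphrases ("speed" vs "speeds" count as different words here), which is why embedding-based vectors are usually the better choice for semantic comparison.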

---


In [None]:

!pip install -q openai google-generativeai requests


## 🔐 Set API Keys and Imports

In [None]:

import openai
import google.generativeai as genai
import requests

# Set your keys (demo only; use environment variables in real use)
openai.api_key = "sk-..."
genai.configure(api_key="your-gemini-api-key")
hf_token = "hf_..."
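
One way to follow the "use env vars" advice is sketched below. The environment variable names (`OPENAI_API_KEY`, `GEMINI_API_KEY`, `HF_TOKEN`) and the `load_key` helper are assumptions made for this example; use whatever names your own setup defines.

```python
import os

def load_key(name, placeholder):
    """Return the environment variable `name`, or a placeholder if it is unset."""
    value = os.environ.get(name)
    if not value:
        print(f"warning: {name} is not set; using a placeholder")
        return placeholder
    return value

# Variable names are assumptions, not something the providers require.
openai_key = load_key("OPENAI_API_KEY", "sk-...")
gemini_key = load_key("GEMINI_API_KEY", "your-gemini-api-key")
hf_token = load_key("HF_TOKEN", "hf_...")
```

The loaded values can then be passed to `openai.api_key` and `genai.configure(api_key=...)` in place of the hard-coded strings.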


## 📥 Shared Prompt

In [None]:

prompt = "Summarize the risks and benefits of AI in healthcare in 3 bullet points."
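
Because every provider receives the same prompt, it can help to wrap each call in a function and loop over them. The sketch below shows that pattern with hypothetical stub functions (`fake_gemini`, `fake_openai`) instead of real API calls; `run_all` is an illustrative helper, not part of any provider SDK.

```python
def run_all(prompt, providers):
    """Call each provider function with the same prompt and collect the results."""
    results = {}
    for name, call in providers.items():
        try:
            results[name] = call(prompt)
        except Exception as exc:  # keep going if one provider fails
            results[name] = f"[error] {exc}"
    return results

# Hypothetical stand-ins; in the notebook each would wrap a real API call.
def fake_gemini(prompt):
    return f"(gemini stub) {prompt}"

def fake_openai(prompt):
    raise RuntimeError("no API key configured")

demo = run_all("Summarize X.", {"gemini": fake_gemini, "openai": fake_openai})
for name, text in demo.items():
    print(f"{name}: {text}")
```

Catching errors per provider keeps one missing key or rate limit from hiding the other results.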


## 🔮 Gemini Output

In [None]:

gemini_model = genai.GenerativeModel("gemini-1.5-pro")
gemini_response = gemini_model.generate_content(prompt)
print("Gemini:\n", gemini_response.text)


## 🧠 OpenAI Output

In [None]:

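# Chat Completions call using the openai>=1.0 interface
# (openai.ChatCompletion was removed in 1.0, which the pip cell above installs)
response = openai.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": prompt},
    ],
)
print("OpenAI:\n", response.choices[0].message.content)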


## 🤖 Hugging Face Output

In [None]:

headers = {"Authorization": f"Bearer {hf_token}"}
url = "https://api-inference.huggingface.co/models/google/flan-t5-large"

response = requests.post(url, headers=headers, json={"inputs": prompt})
print("Hugging Face:\n", response.json())

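The cell above prints the raw JSON, which is harder to compare with the plain text from the other two providers. The helper below is a minimal sketch for pulling out the text. The payload shapes it handles (a list of dicts with `generated_text` or `summary_text`, or a dict with `error`) are assumptions about typical responses; the actual shape depends on the model's task, so check what your model returns.

```python
def extract_hf_text(payload):
    """Pull generated text out of a decoded JSON response, if present."""
    if isinstance(payload, dict) and "error" in payload:
        return f"[error] {payload['error']}"
    if isinstance(payload, list) and payload and isinstance(payload[0], dict):
        first = payload[0]
        for key in ("generated_text", "summary_text"):
            if key in first:
                return first[key]
    return str(payload)  # unknown shape: fall back to the raw payload

# Example payloads written for this sketch, not captured from the API.
print(extract_hf_text([{"generated_text": "AI can help doctors."}]))
print(extract_hf_text({"error": "Model is loading"}))
```

In the notebook this would be called as `extract_hf_text(response.json())`.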

## 🎛️ Tips to Improve Output Across Providers


| Provider    | Best Practices                                                              |
|-------------|------------------------------------------------------------------------------|
| Gemini      | Use clear structure, few-shot if needed, supports image+text                |
| OpenAI      | Use system message to control tone, try temperature 0.3–0.7                 |
| Hugging Face| Choose task-specific model (`flan-t5`, `bart`, `bloom`), tune parameters    |

Try:
- Adding `"Respond in JSON format"` or `"Use short bullet points"`
- Breaking large instructions into steps
- Using examples (`few-shot prompting`) to guide structure
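
The tips above can be combined in a small prompt builder. This is a sketch under assumptions: `build_prompt`, `parse_json_reply`, the `Q:`/`A:` layout, and the example pair are all invented for illustration, and other few-shot layouts may work better for a given model.

```python
import json

def build_prompt(task, examples=(), format_hint="Use short bullet points."):
    """Assemble a prompt from optional few-shot examples, a task, and a format hint."""
    parts = []
    for question, answer in examples:
        parts.append(f"Q: {question}\nA: {answer}")
    parts.append(f"Q: {task}\n{format_hint}\nA:")
    return "\n\n".join(parts)

def parse_json_reply(reply):
    """Return the parsed JSON value, or None if the reply is not valid JSON."""
    try:
        return json.loads(reply)
    except json.JSONDecodeError:
        return None

# Hypothetical few-shot example pair used only to show the structure.
few_shot = [("Name one benefit of telemedicine.", '{"benefit": "wider access"}')]
print(build_prompt(
    "Summarize the risks and benefits of AI in healthcare in 3 bullet points.",
    examples=few_shot,
    format_hint="Respond in JSON format.",
))
```

Sending the same built prompt to all three providers, then checking each reply with `parse_json_reply`, gives a quick signal of which models follow the format instruction.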



---

## ✅ Summary

You've now:
- Run side-by-side completions across Gemini, OpenAI, and HF
- Compared format, fluency, and precision
- Learned how to tune for more consistent or tailored output

📘 For more: see `api_inference_quickstart.md` or `compare_gemini_vs_hf.md`
