# Setup Guide: API Configuration for ADS 525

This notebook covers account creation and API setup for OpenAI and HuggingFace. Complete these steps before working on course notebooks.

---

## 1. OpenAI API Setup

OpenAI provides access to GPT models (GPT-3.5, GPT-4) through its API. You will need this for text generation and classification tasks.

### Step 1: Create an OpenAI Account

1. Go to https://platform.openai.com/signup
2. Sign up with email or Google/Microsoft account
3. Verify your email address
4. Add payment method (required for API access)

**Note:** OpenAI has offered free trial credits (around $5) to some new accounts, but this is not guaranteed. Once any credits are used up or expire, you pay per token used.

**Pricing:** https://openai.com/pricing

### Step 2: Generate API Key

1. Log in to https://platform.openai.com
2. Click your profile icon (top-right) > **API keys**
3. Click **Create new secret key**
4. Name your key (e.g., "ADS525")
5. Copy the key immediately (you cannot view it again)
6. Store it securely

**Direct link:** https://platform.openai.com/api-keys

### Step 3: Install OpenAI Library

Run this cell to install the OpenAI Python library:

In [None]:
!pip install openai

### Step 4: Configure API Key

**Option A: Direct in Code (for testing only)**

Replace `YOUR_API_KEY_HERE` with your actual key, and remove it again before sharing or committing the notebook:

In [None]:
import openai

client = openai.OpenAI(api_key="YOUR_API_KEY_HERE")

**Option B: Environment Variable (recommended)**

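Set your API key as an environment variable so it does not appear in your code. Ideally, set it outside the notebook (in your shell or OS settings). The cell below prompts for the key if it is not already set:

In [None]:
import os
from getpass import getpass
import openai

# Prompt only if the variable is missing; getpass hides what you type
if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass("OpenAI API key: ")

# The client reads OPENAI_API_KEY from the environment automatically
client = openai.OpenAI()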

**Option C: Google Colab Secrets (best for Colab)**

1. In Colab, click the key icon (🔑) in the left sidebar
2. Click **Add new secret**
3. Name: `OPENAI_API_KEY`
4. Value: Your API key
5. Toggle on notebook access

In [None]:
# For Google Colab
from google.colab import userdata
import openai

client = openai.OpenAI(api_key=userdata.get('OPENAI_API_KEY'))
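
The three options above can be combined into one lookup, so the same notebook runs in Colab and locally. The following is a minimal sketch, not part of the course code: `resolve_secret` and `mask` are hypothetical helper names. It tries Colab secrets first, then the environment, then an optional hidden prompt.

```python
import os
from getpass import getpass


def resolve_secret(name: str, allow_prompt: bool = True) -> str:
    """Return a secret from Colab secrets, the environment, or a prompt (hypothetical helper)."""
    try:
        # google.colab is only importable inside Google Colab
        from google.colab import userdata
        value = userdata.get(name)
        if value:
            return value
    except Exception:
        # Not in Colab, or the secret is missing or not shared with this notebook
        pass

    value = os.environ.get(name)
    if value:
        return value

    if allow_prompt:
        # getpass hides the input so the key is not echoed into the notebook output
        return getpass(f"Enter {name}: ")
    raise KeyError(f"{name} is not set")


def mask(secret: str) -> str:
    """Show only the last four characters, for safe printing."""
    if len(secret) <= 4:
        return "*" * len(secret)
    return "*" * (len(secret) - 4) + secret[-4:]
```

With this in place, the client could be created as `openai.OpenAI(api_key=resolve_secret("OPENAI_API_KEY"))`, and `print(mask(key))` confirms which key was loaded without exposing it.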

### Step 5: Test Your Setup

Run this cell to verify your API key works:

In [None]:
try:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Say 'API setup successful'"}],
        max_tokens=10
    )
    print("Success:", response.choices[0].message.content)
except Exception as e:
    print("Error:", e)

### OpenAI Documentation

- **API Reference:** https://platform.openai.com/docs/api-reference
- **Quickstart Guide:** https://platform.openai.com/docs/quickstart
- **Python Library:** https://github.com/openai/openai-python
- **Usage Dashboard:** https://platform.openai.com/usage
- **Rate Limits:** https://platform.openai.com/docs/guides/rate-limits

---

## 2. HuggingFace Setup

HuggingFace hosts thousands of pre-trained models and datasets. Most models are free to use.

### Step 1: Create a HuggingFace Account

1. Go to https://huggingface.co/join
2. Sign up with email or Google/GitHub account
3. Verify your email address

**Note:** Account creation is free. No payment method required.

### Step 2: Generate Access Token

An access token allows you to:
- Download private models
- Access gated models (like Llama)
- Upload models/datasets
- Get higher rate limits than anonymous requests

**Steps:**
1. Log in to https://huggingface.co
2. Click your profile picture > **Settings**
3. Go to **Access Tokens** in the left sidebar
4. Click **New token**
5. Name: "ADS525"
6. Role: **Read** (sufficient for this course)
7. Click **Generate**
8. Copy the token

**Direct link:** https://huggingface.co/settings/tokens

### Step 3: Install HuggingFace Libraries

Run this cell to install required libraries:

In [None]:
!pip install transformers datasets sentence-transformers huggingface_hub

### Step 4: Authenticate with HuggingFace

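**Option A: Interactive Login**

Run this once per environment. The token is saved to the local HuggingFace cache, so on your own machine it persists; in Colab it lasts only until the runtime is reset: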

In [None]:
from huggingface_hub import login

# This will prompt for your token
login()

**Option B: Environment Variable**

In [None]:
import os

os.environ["HF_TOKEN"] = "YOUR_HF_TOKEN_HERE"

**Option C: Google Colab Secrets (best for Colab)**

1. In Colab, click the key icon (🔑)
2. Click **Add new secret**
3. Name: `HF_TOKEN`
4. Value: Your token
5. Toggle on notebook access

In [None]:
# For Google Colab
from google.colab import userdata
from huggingface_hub import login

login(token=userdata.get('HF_TOKEN'))

### Step 5: Test Your Setup

Run this cell to verify access:

In [None]:
from transformers import pipeline

try:
    # Load a small model
    classifier = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")
    result = classifier("HuggingFace setup successful")
    print("Success:", result)
except Exception as e:
    print("Error:", e)

### HuggingFace Documentation

- **Models Hub:** https://huggingface.co/models
- **Datasets Hub:** https://huggingface.co/datasets
- **Transformers Library:** https://huggingface.co/docs/transformers
- **Datasets Library:** https://huggingface.co/docs/datasets
- **Sentence Transformers:** https://www.sbert.net/
- **Hub API:** https://huggingface.co/docs/huggingface_hub

---

## 3. Google Colab Specific Instructions

### Enable GPU

Most course notebooks require a GPU for reasonable runtime.

**Steps:**
1. Go to **Runtime** > **Change runtime type**
2. **Hardware accelerator:** Select **GPU**
3. **GPU type:** Select **T4** (free tier)
4. Click **Save**

Verify GPU is available:

In [None]:
import torch

if torch.cuda.is_available():
    print(f"GPU available: {torch.cuda.get_device_name(0)}")
    print(f"CUDA version: {torch.version.cuda}")
else:
    print("No GPU available. Using CPU.")
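
Later cells often need the device as a value rather than a printed message. As a small sketch (the helper name `pick_device` is an assumption, not course code), the check above can be wrapped so the result is reusable and the cell still runs where PyTorch is not installed:

```python
def pick_device() -> str:
    """Return "cuda" when a GPU is usable, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so nothing can run on a GPU
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print("Using device:", device)
```

The resulting string can then be passed on, for example as `pipeline(..., device=device)`.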

### Persistent Storage (Optional)

Mount Google Drive to save models and data between sessions:

In [None]:
from google.colab import drive

drive.mount('/content/drive')
print("Drive mounted at /content/drive")

---

## 4. Troubleshooting

### OpenAI Errors

**Error:** `AuthenticationError: Incorrect API key`
- Check your API key is correct (no extra spaces)
- Verify key has not been revoked at https://platform.openai.com/api-keys

**Error:** `RateLimitError: Rate limit exceeded`
- You are making too many requests
- Add delays between API calls: `import time; time.sleep(1)`
- Check rate limits: https://platform.openai.com/account/rate-limits
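
A fixed `time.sleep(1)` works for simple loops. A common refinement is to retry with exponentially growing delays. The sketch below is one way to do that under stated assumptions: `call_with_backoff` is a hypothetical helper, and the `sleep` parameter exists only so the waiting can be swapped out when testing.

```python
import random
import time


def call_with_backoff(fn, max_retries=5, base_delay=1.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call fn(), retrying with exponentially growing delays when it raises."""
    for attempt in range(max_retries):
        try:
            return fn()
        except retry_on:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the original error
            # Waits of about 1s, 2s, 4s, ... plus a little random jitter
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            sleep(delay)
```

With the OpenAI client, this could be used as `call_with_backoff(lambda: client.chat.completions.create(...), retry_on=(openai.RateLimitError,))`. Retrying does not help when the error reports an exhausted quota rather than too many requests.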

**Error:** `RateLimitError: You exceeded your current quota` (error code `insufficient_quota`)
- Your credits are exhausted, or billing is not set up
- Add a payment method or credits at https://platform.openai.com/account/billing

**Error:** `BadRequestError: max_tokens is too large`
- Reduce `max_tokens` parameter
- Check model limits: https://platform.openai.com/docs/models

### HuggingFace Errors

**Error:** `OSError: [Errno 403] Forbidden`
- Model requires authentication
- Run `huggingface-cli login`, or call `login()` with your token as in Step 4
- For gated models, also request access on the model's page

**Error:** `OSError: [Errno 404] Model not found`
- Check model name spelling
- Verify model exists: https://huggingface.co/models

**Error:** `OutOfMemoryError: CUDA out of memory`
- Model is too large for available GPU memory
- Use a smaller model
- Reduce batch size
- Use `device="cpu"` instead of GPU
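
To illustrate "reduce batch size": instead of passing all inputs to the model at once, split them into small chunks and process one chunk at a time. This is a minimal sketch with a hypothetical `batched` helper; the model call is left as a comment so the example runs without a GPU.

```python
def batched(items, batch_size):
    """Yield consecutive slices of items, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


texts = ["good", "bad", "fine", "great", "awful"]
for batch in batched(texts, batch_size=2):
    # In a real notebook, this is where classifier(batch) would run
    print(len(batch), batch)
```

If memory errors persist, halve `batch_size` until the model fits.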

**Error:** `RepositoryNotFoundError`
- Check repository name
- Model may be private (requires authentication)

### General Python Errors

**Error:** `ModuleNotFoundError: No module named 'X'`
- Run `!pip install X`
- Restart runtime after installation
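
To see which packages are missing before running a notebook, the import names can be checked up front. The sketch below assumes a hypothetical helper `missing_modules`; the list of names is taken from the libraries installed earlier in this guide.

```python
import importlib.util


def missing_modules(names):
    """Return the names that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# These are import names, which can differ from pip package names
# (the pip package sentence-transformers is imported as sentence_transformers)
required = ["openai", "transformers", "datasets",
            "sentence_transformers", "huggingface_hub"]
print("Missing:", missing_modules(required))
```

An empty list means every module was found.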

**Error:** `RuntimeError: CUDA error: device-side assert triggered`
- Usually an indexing error in model code
- Try running on CPU to get better error message: `device="cpu"`

---

## 5. Cost Management

### OpenAI Cost Tips

- Use `gpt-3.5-turbo` instead of `gpt-4` (about 20x cheaper on input tokens at the prices listed below)
- Set `max_tokens` to limit response length
- Cache results to avoid repeated API calls
- Monitor usage: https://platform.openai.com/usage
- Set usage limits: https://platform.openai.com/account/limits
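
Caching can be as simple as a dictionary keyed by the request. The following is a sketch under stated assumptions, not a definitive implementation: `ResponseCache` is a hypothetical class, and `call` stands for any function you write that sends the request and returns the text.

```python
import hashlib
import json


class ResponseCache:
    """In-memory cache keyed by model and messages (hypothetical helper)."""

    def __init__(self):
        self._store = {}

    def _key(self, model, messages):
        # Serialize deterministically so identical requests share one key
        payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_call(self, model, messages, call):
        key = self._key(model, messages)
        if key not in self._store:
            # Cache miss: make the (paid) call once and remember the result
            self._store[key] = call(model, messages)
        return self._store[key]
```

Here `call` could wrap `client.chat.completions.create(...)` and return `response.choices[0].message.content`. The cache lives only as long as the Python session; writing `_store` to a JSON file would make it persistent.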

**Example pricing (illustrative; prices change often, so check https://openai.com/pricing):**
- GPT-3.5-turbo: $0.0015 per 1K tokens (input)
- GPT-4: $0.03 per 1K tokens (input)

**Estimate tokens:** roughly 750 English words ≈ 1,000 tokens
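
As a worked example of this arithmetic, the sketch below turns the rule of thumb and the example prices above into a rough cost estimate. The function names are hypothetical, and the result is only an approximation; an exact count requires a tokenizer such as `tiktoken`.

```python
def estimate_tokens(text: str) -> int:
    """Rough token count from the ~750 words per 1,000 tokens rule of thumb."""
    words = len(text.split())
    return round(words * 1000 / 750)


def estimate_cost(tokens: int, price_per_1k: float) -> float:
    """Cost in dollars for a token count at a given price per 1K tokens."""
    return tokens / 1000 * price_per_1k


prompt = "word " * 750  # a 750-word placeholder prompt
tokens = estimate_tokens(prompt)
print(tokens, "tokens")
print(f"GPT-3.5-turbo (input): ${estimate_cost(tokens, 0.0015):.4f}")
print(f"GPT-4 (input): ${estimate_cost(tokens, 0.03):.4f}")
```

Output tokens are billed as well, usually at a different rate, so the real cost of a request is higher than this input-only figure.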

### HuggingFace (Free)

- All public models and datasets are free to download (check each model's license for usage terms)
- No API costs
- Only costs are compute (your GPU/CPU)

---

## 6. Additional Resources

### Useful Libraries

- **transformers:** Pre-trained models (BERT, GPT, etc.)
- **datasets:** Access to 10,000+ datasets
- **sentence-transformers:** Sentence embeddings
- **openai:** OpenAI API client
- **langchain:** Framework for LLM applications
- **tiktoken:** OpenAI tokenizer

Install all at once:

In [None]:
!pip install transformers datasets sentence-transformers openai langchain tiktoken accelerate

### Community Resources

- **HuggingFace Forums:** https://discuss.huggingface.co/
- **OpenAI Community:** https://community.openai.com/
- **Stack Overflow:** https://stackoverflow.com/questions/tagged/transformers
- **Course GitHub:** https://github.com/hemekci/ADS525

---

## Setup Complete

If both test cells ran successfully, you are ready to start the course notebooks. Refer back to this guide if you encounter authentication or API issues.