# Fine-tuning the model to make a chatbot

This is the big payoff. We'll fine-tune one of the OpenAI models so that it responds somewhat like ChatGPT. I think `openai.ipynb` has an attempt at this with the foundation model and no fine-tuning, and right now it _sucks_.

In [12]:
import import_ipynb
import openai # type:ignore
import gpt # type:ignore
import torch
import urllib.request  # urlopen lives in the urllib.request submodule
import ssl
import os
import json
from pprint import pprint, PrettyPrinter

def get_device() -> torch.device:
    if torch.cuda.is_available(): # type: ignore[attr-defined]
        return torch.device("cuda")
    elif torch.backends.mps.is_available(): # type: ignore[attr-defined]
        return torch.device("mps:0")
    else:
        return torch.device("cpu")

## Download the instruction training data

The dataset holds 1,100 instruction-response pairs written specifically for the book (the URL below points at its `LLMs-from-scratch` repository). Some entries also include a third field, `input`.

In [15]:
def download_and_load_file(file_path, url):
    # NOTE: certificate verification is switched off here. That is tolerable for a
    # one-off download of public data, but should not be copied into real code.
    ssl_context = ssl.create_default_context()
    ssl_context.check_hostname = False
    ssl_context.verify_mode = ssl.CERT_NONE

    # Download only when there is no cached copy on disk.
    if not os.path.exists(file_path):
        with urllib.request.urlopen(url, context=ssl_context) as response: # type:ignore
            text_data = response.read().decode("utf-8")
        with open(file_path, "w", encoding="utf-8") as file:
            file.write(text_data)

    # Parse the cached (or freshly downloaded) file as JSON.
    with open(file_path, "r", encoding="utf-8") as file:
        data = json.load(file)

    return data

file_path = "instruction-data.json"
url = (
    "https://raw.githubusercontent.com/rasbt/LLMs-from-scratch"
    "/main/ch07/01_main-chapter-code/instruction-data.json"
)

data = download_and_load_file(file_path, url)
print("Number of entries:", len(data))
print("Example:")
pprint(data[1])

Number of entries: 1100
Example:
{'input': 'He go to the park every day.',
 'instruction': 'Edit the following sentence for grammar.',
 'output': 'He goes to the park every day.'}

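To see how many entries actually carry an input, a quick check along these lines may help. This is only a sketch on a small hand-written sample: the `sample` list and the `has_input` helper are hypothetical and not part of the notebook. It also assumes that a missing `input` key and an empty string both mean "no input".

```python
# Hypothetical sample that mirrors the shape of the entries in instruction-data.json.
sample = [
    {"instruction": "Edit the following sentence for grammar.",
     "input": "He go to the park every day.",
     "output": "He goes to the park every day."},
    {"instruction": "Name a synonym for 'happy'.",
     "input": "",
     "output": "Joyful."},
    {"instruction": "What is the capital of France?",
     "output": "Paris."},
]

def has_input(entry: dict) -> bool:
    """Return True if the entry has a non-empty `input` field."""
    # Assumption: a missing key and an empty string both count as "no input".
    return bool(entry.get("input"))

with_input = sum(has_input(entry) for entry in sample)
print(f"{with_input} of {len(sample)} entries have an input")
```

In the notebook, the same expression can be run over `data` instead of `sample`.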

## Convert the examples to Stanford Alpaca format

The [format](https://github.com/tatsu-lab/stanford_alpaca) looks like this:

```
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:
```

Or, if there's no input:

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:
```
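The two templates can be produced by one small function. The sketch below is one possible way to do it, not necessarily how the book does it: `format_input` is a name chosen here for illustration, and it assumes that an entry with a missing or empty `input` should get the shorter template.

```python
def format_input(entry: dict) -> str:
    """Build the Alpaca-style prompt for one entry, ending at '### Response:'."""
    # Assumption: a missing or empty `input` selects the no-input template.
    if entry.get("input"):
        header = (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request."
        )
        body = (
            f"### Instruction:\n{entry['instruction']}\n\n"
            f"### Input:\n{entry['input']}"
        )
    else:
        header = (
            "Below is an instruction that describes a task. Write a response that "
            "appropriately completes the request."
        )
        body = f"### Instruction:\n{entry['instruction']}"
    return f"{header}\n\n{body}\n\n### Response:"


example = {
    "instruction": "Edit the following sentence for grammar.",
    "input": "He go to the park every day.",
    "output": "He goes to the park every day.",
}
print(format_input(example))
```

For training text, the entry's `output` would then be appended on the line after `### Response:`, so the model learns to continue the prompt with the answer.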