In [None]:
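# The code below uses the legacy openai.File / openai.FineTune interface, so pin a pre-1.0 release
!pip install openai==0.28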



In [None]:
import json
import openai

# How to fine-tune a GPT-3 model for specific prompts

I'm constantly looking for ways to automate the handling of support requests. One idea is to fine-tune a GPT-3 model to answer common support-related questions.

**Here's how you can fine-tune a GPT-3 model in Python with your own data.**

In this walkthrough, we'll fine-tune a GPT-3 model on a small set of question-and-answer pairs.

Detailed step-by-step instructions for this repo are in this blog post: https://norahsakal.com/blog/fine-tune-gpt3-model

>### Disclaimer
>This guide walks you through fine-tuning a GPT-3 model in Python, shown in a Jupyter notebook.
>If you're looking for the steps of fine-tuning right in a terminal, [OpenAI has a great guide for fine-tuning in your terminal](https://beta.openai.com/docs/guides/fine-tuning "fine-tuning in terminal").

# Define OpenAI API keys

In [None]:
# Replace this with your own API key; don't use mine 😤
# Don't commit this file while it contains the API key; it will cause an error on GitHub
api_key = "YOUR_API_KEY"
openai.api_key = api_key
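
One way to avoid committing a key by accident is to read it from an environment variable instead of hardcoding it. The following is a minimal sketch, assuming the key is stored in a variable named `OPENAI_API_KEY`; the helper name `load_api_key` is hypothetical and not part of the `openai` package.

```python
import os


def load_api_key(env=None, var_name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise if it is missing.

    Both the helper name and the variable name are illustrative choices.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key


# Usage in the notebook (commented out so the cell is safe to run anywhere):
# openai.api_key = load_api_key()
```

With this in place, the notebook itself never contains the secret, so it can be committed as-is.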

# Create training data

Make sure to end each `prompt` with a suffix. According to the [OpenAI API reference](https://beta.openai.com/docs/guides/fine-tuning "fine-tuning reference"), you can use ` ->`.

Also, make sure to end each `completion` with a suffix as well; I'm using `.\n`.
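
To keep the suffixes consistent across all examples, the formatting can be done in one place. This is a small sketch rather than a required step: `to_training_pair` is a hypothetical helper, and it also adds a leading space to the completion, which the data preparation tool tends to recommend.

```python
PROMPT_SUFFIX = " ->"
COMPLETION_SUFFIX = "\n"


def to_training_pair(question, answer):
    """Format one question/answer pair for prompt-completion fine-tuning."""
    prompt = question.strip() + PROMPT_SUFFIX
    # The completion starts with a space and ends with the completion suffix.
    completion = " " + answer.strip() + COMPLETION_SUFFIX
    return {"prompt": prompt, "completion": completion}


pair = to_training_pair(
    "Apa itu saham?",
    "Saham adalah bukti kepemilikan atas sebuah perusahaan.",
)
print(pair)
```

Building every entry through the same function makes it harder to forget a suffix on a single example.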

In [None]:
data_file = [
    {
        "prompt": "Apa itu inflasi? ->",
        "completion": "Inflasi adalah kenaikan harga barang dan jasa secara umum dan terus-menerus dalam suatu periode waktu tertentu.\n"
    },
    {
        "prompt": "Bagaimana cara mengatur anggaran bulanan? ->",
        "completion": "Untuk mengatur anggaran bulanan, mulailah dengan mencatat semua pendapatan dan pengeluaran, kemudian buatlah kategori pengeluaran dan tetapkan batas pengeluaran untuk setiap kategori.\n"
    },
    {
        "prompt": "Apa itu saham? ->",
        "completion": "Saham adalah bukti kepemilikan seseorang atau badan terhadap sebuah perusahaan. Dengan memiliki saham, investor berhak atas bagian dari keuntungan perusahaan tersebut.\n"
    },
    {
        "prompt": "Apa perbedaan antara saham dan obligasi? ->",
        "completion": "Saham adalah surat berharga yang menunjukkan kepemilikan sebagian dari perusahaan, sedangkan obligasi adalah surat utang yang diterbitkan oleh perusahaan atau pemerintah sebagai bentuk pinjaman.\n"
    },
    {
        "prompt": "Bagaimana cara berinvestasi di pasar saham? ->",
        "completion": "Untuk berinvestasi di pasar saham, Anda perlu membuka akun di perusahaan sekuritas, menyetor dana, dan mulai membeli saham melalui platform trading yang disediakan.\n"
    },
    {
        "prompt": "Apa itu diversifikasi investasi? ->",
        "completion": "Diversifikasi investasi adalah strategi menyebar investasi ke berbagai jenis aset untuk mengurangi risiko. Dengan diversifikasi, kerugian pada satu aset dapat ditutupi oleh keuntungan dari aset lain.\n"
    },
    {
        "prompt": "Apa itu return on investment (ROI)? ->",
        "completion": "Return on Investment (ROI) adalah rasio yang menunjukkan seberapa besar keuntungan yang diperoleh dari investasi dibandingkan dengan biaya investasi tersebut.\n"
    },
    {
        "prompt": "Bagaimana cara menghitung ROI? ->",
        "completion": "ROI dapat dihitung dengan rumus: (Keuntungan dari investasi - Biaya investasi) / Biaya investasi x 100%.\n"
    },
    {
        "prompt": "Apa yang dimaksud dengan aset likuid? ->",
        "completion": "Aset likuid adalah aset yang mudah dan cepat diubah menjadi uang tunai tanpa kehilangan nilai yang signifikan, seperti uang tunai dan rekening tabungan.\n"
    },
    {
        "prompt": "Apa itu risiko investasi? ->",
        "completion": "Risiko investasi adalah kemungkinan bahwa hasil investasi akan berbeda dari yang diharapkan, termasuk kemungkinan kehilangan sebagian atau seluruh investasi.\n"
    }
]

# Save dict as JSONL

Training data needs to be a JSONL document: a newline-delimited JSON file with one JSON object per line.
More info about JSONL: https://jsonlines.org/

In [None]:
file_name = "training_data.jsonl"

with open(file_name, 'w') as outfile:
    for entry in data_file:
        json.dump(entry, outfile)
        outfile.write('\n')
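
Before uploading, it can be worth reading the file back to confirm that every line is valid JSON with exactly the expected keys. The sketch below uses a hypothetical helper, `validate_jsonl`, and runs on a temporary demo file so it is self-contained; in the notebook you would pass `file_name` instead.

```python
import json
import os
import tempfile


def validate_jsonl(path):
    """Return the number of records, raising ValueError on a malformed line."""
    count = 0
    with open(path, encoding="utf-8") as infile:
        for line_number, line in enumerate(infile, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"line {line_number}: invalid JSON") from exc
            if not isinstance(record, dict) or set(record) != {"prompt", "completion"}:
                raise ValueError(f"line {line_number}: expected prompt and completion keys")
            count += 1
    return count


# Demo on a temporary file with one hypothetical training example.
sample = [{"prompt": "Apa itu inflasi? ->", "completion": " Inflasi adalah kenaikan harga.\n"}]
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.jsonl")
    with open(demo_path, "w", encoding="utf-8") as outfile:
        for entry in sample:
            outfile.write(json.dumps(entry) + "\n")
    record_count = validate_jsonl(demo_path)

print(record_count)
```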

# Check JSONL file

In [None]:
!openai tools fine_tunes.prepare_data -f training_data.jsonl

Analyzing...

- Your file contains 10 prompt-completion pairs. In general, we recommend having at least a few hundred examples. We've found that performance tends to linearly increase for every doubling of the number of examples
- All prompts end with suffix `? ->`
- All completions end with suffix `.\n`
- The completion should start with a whitespace character (` `). This tends to produce better results due to the tokenization we use. See https://platform.openai.com/docs/guides/fine-tuning/preparing-your-dataset for more details

Based on the analysis we will perform the following actions:
- [Recommended] Add a whitespace character to the beginning of the completion [Y/n]: Y


Your data will be written to a new JSONL file. Proceed [Y/n]: Y

Wrote modified file to `training_data_prepared (2).jsonl`
Feel free to take a look!

Now use that file when fine-tuning:
> openai api fine_tunes.create -t "training_data_prepared (2).jsonl"

After you’ve fine-tuned a model, remember that your promp

# Upload file to your OpenAI account

In [None]:
# !pip install openai==0.28

In [None]:
upload_response = openai.File.create(
  file=open(file_name, "rb"),
  purpose='fine-tune'
)
upload_response

In [None]:
print(upload_response)

# Save file ID

In [None]:
file_id = upload_response.id
file_id

In [None]:
# !pip install --upgrade openai

# Fine-tune a model

The default model is **Curie**.

If you'd like to use **DaVinci** instead, then add it as a base model to fine-tune:

```openai.FineTune.create(training_file=file_id, model="davinci")```

In [None]:
fine_tune_response = openai.FineTune.create(training_file=file_id)
fine_tune_response

# Check fine-tune progress

Check the progress with `openai.FineTune.list_events(id=fine_tune_response.id)`, which returns a list of all the fine-tuning events:

In [None]:
fine_tune_events = openai.FineTune.list_events(id=fine_tune_response.id)
fine_tune_events

Alternatively, check the progress with `openai.FineTune.retrieve(id=fine_tune_response.id)`, which returns an object with the fine-tuning job data:

In [None]:
retrieve_response = openai.FineTune.retrieve(id=fine_tune_response.id)
retrieve_response
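
The raw event list can be hard to scan. As a rough sketch, the events can be turned into one readable line each. This assumes each event carries `created_at` (a Unix timestamp), `level`, and `message` fields; the helper `format_events` and the sample data are hypothetical.

```python
from datetime import datetime, timezone


def format_events(events):
    """Turn a list of event dicts into readable 'time [level] message' lines."""
    lines = []
    for event in events:
        stamp = datetime.fromtimestamp(event["created_at"], tz=timezone.utc)
        lines.append(f"{stamp:%Y-%m-%d %H:%M:%S} [{event['level']}] {event['message']}")
    return lines


# Hypothetical sample shaped like the entries assumed to be in fine_tune_events["data"].
sample_events = [
    {"created_at": 0, "level": "info", "message": "Created fine-tune"},
    {"created_at": 60, "level": "info", "message": "Fine-tune started"},
]
event_lines = format_events(sample_events)
print("\n".join(event_lines))
```

In the notebook, the call would presumably look like `format_events(fine_tune_events["data"])`.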

# Save fine-tuned model

### Troubleshooting fine_tuned_model as null
During the fine-tuning process, the **fine_tuned_model** key may not be immediately available in the fine_tune_response object returned by `openai.FineTune.create()`.

To check the status of your fine-tuning process, you can call the `openai.FineTune.retrieve()` function and pass in the **fine_tune_response.id**. This function returns a JSON object describing the job, including its `status` and, once training has finished, the `fine_tuned_model` name.

After the fine-tuning process is complete, you can check the status of all your fine-tuned models by calling `openai.FineTune.list()`. This will list all of your fine-tunes and their current status.

Once the fine-tuning process is complete, you can retrieve the fine_tuned_model key by calling the `openai.FineTune.retrieve()` function again and passing in the fine_tune_response.id. This will return a JSON object with the key fine_tuned_model and the ID of the fine-tuned model that you can use for further completions.
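
Rather than re-running the retrieve cell by hand, the check can be wrapped in a polling loop. This is a sketch under assumptions: `wait_for_fine_tune` is a hypothetical helper, the terminal status names (`succeeded`, `failed`, `cancelled`) are assumed rather than taken from this notebook, and the demo uses a stand-in for `openai.FineTune.retrieve` so it runs offline.

```python
import time


def wait_for_fine_tune(retrieve, job_id, poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Poll retrieve(id=job_id) until the job reaches a terminal status.

    The terminal status names below are assumptions about the legacy API.
    """
    terminal = {"succeeded", "failed", "cancelled"}
    for _ in range(max_polls):
        job = retrieve(id=job_id)
        if job["status"] in terminal:
            return job
        sleep(poll_seconds)
    raise TimeoutError(f"{job_id} did not finish after {max_polls} polls")


# Stand-in for openai.FineTune.retrieve that walks through hypothetical job states.
_fake_states = iter([
    {"status": "pending", "fine_tuned_model": None},
    {"status": "running", "fine_tuned_model": None},
    {"status": "succeeded", "fine_tuned_model": "curie:ft-example"},
])


def fake_retrieve(id):
    return next(_fake_states)


finished_job = wait_for_fine_tune(fake_retrieve, "ft-example", sleep=lambda seconds: None)
print(finished_job["fine_tuned_model"])
```

In the notebook, the call would presumably be `wait_for_fine_tune(openai.FineTune.retrieve, fine_tune_response.id)`. The cap on the number of polls keeps a stuck job from blocking the cell forever.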

### Option 1

If `fine_tune_response.fine_tuned_model is not None`, then the key **fine_tuned_model** is available from the fine_tune_response object:

In [None]:
if fine_tune_response.fine_tuned_model is not None:
    fine_tuned_model = fine_tune_response.fine_tuned_model

### Option 2

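If `fine_tune_response.fine_tuned_model is None`, you can get the **fine_tuned_model** by listing all fine-tune jobs and picking out this one:

In [None]:
if fine_tune_response.fine_tuned_model is None:
    fine_tune_list = openai.FineTune.list()
    # Pick this job by ID rather than by position, so the list ordering doesn't matter
    this_job = next(job for job in fine_tune_list['data'] if job.id == fine_tune_response.id)
    fine_tuned_model = this_job.fine_tuned_model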

### Option 3

If `fine_tune_response.fine_tuned_model is None`, you can get the **fine_tuned_model** key by retrieving the fine-tune job:

In [None]:
if fine_tune_response.fine_tuned_model is None:
    fine_tuned_model = openai.FineTune.retrieve(id=fine_tune_response.id).fine_tuned_model

# Test the new model on a new prompt

Remember to end the prompt with the same suffix as we used in the training data; ` ->`:

In [None]:
new_prompt = "NEW PROMPT ->"

In [None]:
answer = openai.Completion.create(
  model=fine_tuned_model,
  prompt=new_prompt,
  max_tokens=10, # Change amount of tokens for longer completion
  temperature=0
)
answer['choices'][0]['text']
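
Because the training completions end with `\n`, the raw text may contain extra output after the first newline. A possible way to tidy this up is sketched below; `build_prompt` and `clean_completion` are hypothetical helpers, and the sample completion text is made up for illustration.

```python
PROMPT_SUFFIX = " ->"
STOP_SEQUENCE = "\n"


def build_prompt(question):
    """Append the training-time suffix unless it is already present."""
    question = question.strip()
    if question.endswith(PROMPT_SUFFIX):
        return question
    return question + PROMPT_SUFFIX


def clean_completion(text):
    """Keep only the text before the first stop sequence and trim whitespace."""
    return text.lstrip().split(STOP_SEQUENCE, 1)[0].strip()


prompt = build_prompt("Apa itu reksa dana?")
# Hypothetical raw completion, including text generated past the end marker.
cleaned = clean_completion(" Reksa dana adalah wadah investasi.\n Apa itu")
print(prompt)
print(cleaned)
```

Passing `stop=["\n"]` to `openai.Completion.create` should have a similar effect on the server side, since it ends generation at the first newline.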