Install the `openai` package and locate its path, as well as the path to the `attrs` package.

In [1]:
!pip install openai
!pip show openai

Name: openai
Version: 0.27.2
Summary: Python client library for the OpenAI API
Home-page: https://github.com/openai/openai-python
Author: OpenAI
Author-email: support@openai.com
License: None
Location: /cloud/lib/lib/python3.8/site-packages
Requires: aiohttp, requests, tqdm
Required-by: 


In [3]:
!pip show attrs

Name: attrs
Version: 22.2.0
Summary: Classes Without Boilerplate
Home-page: https://www.attrs.org/
Author: Hynek Schlawack
Author-email: hs@ox.cx
License: MIT
Location: /cloud/lib/lib/python3.8/site-packages
Requires: 
Required-by: aiohttp, jsonschema


Besides locating the `openai` and `attrs` packages, it helps to print `sys.path`, the list of directories Python searches when importing modules. This is useful for troubleshooting missing-module errors.

In [4]:
import sys
print(sys.path)

['/cloud/project', '/opt/python/3.8.10/lib/python38.zip', '/opt/python/3.8.10/lib/python3.8', '/opt/python/3.8.10/lib/python3.8/lib-dynload', '', '/cloud/lib/lib/python3.8/site-packages', '/opt/python/3.8.10/lib/python3.8/site-packages']


If the `Location` reported for the `openai` and `attrs` packages is missing from `sys.path`, append it to the list.

In [5]:
sys.path.append("/cloud/lib/lib/python3.8/site-packages")
sys.path.append("/usr/local/lib/python3.8/dist-packages")

In [6]:
print(sys.path)

['/cloud/project', '/opt/python/3.8.10/lib/python38.zip', '/opt/python/3.8.10/lib/python3.8', '/opt/python/3.8.10/lib/python3.8/lib-dynload', '', '/cloud/lib/lib/python3.8/site-packages', '/opt/python/3.8.10/lib/python3.8/site-packages', '/cloud/lib/lib/python3.8/site-packages', '/usr/local/lib/python3.8/dist-packages']
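
The second listing above shows the same `site-packages` directory twice, because `append` does not check for duplicates. A minimal sketch of a guard that only appends missing entries follows; the helper name `add_to_sys_path` is hypothetical, and the example runs against a stand-in list rather than the real `sys.path`.

```python
import sys

def add_to_sys_path(path, search_path=None):
    """Append `path` only if it is not already listed. Returns True if it was added."""
    # Default to the interpreter's real search path when no list is supplied
    search_path = sys.path if search_path is None else search_path
    if path in search_path:
        return False
    search_path.append(path)
    return True

# Stand-in list so the example does not modify the real sys.path
paths = ["/cloud/project", "/cloud/lib/lib/python3.8/site-packages"]
added_existing = add_to_sys_path("/cloud/lib/lib/python3.8/site-packages", paths)
added_new = add_to_sys_path("/usr/local/lib/python3.8/dist-packages", paths)
print(added_existing, added_new, len(paths))
```

Calling `add_to_sys_path(some_path)` without the second argument would act on `sys.path` itself.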


Provide your OpenAI API key.

In [5]:
# Replace "your_api_key_here" with your actual API key
api_key = "your_api_key_here"
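
Hard-coding a key in a notebook makes it easy to leak when the notebook is shared. As an alternative, the sketch below reads the key from an environment variable. The helper name `load_api_key` is hypothetical, and it assumes you have exported `OPENAI_API_KEY` before starting the notebook.

```python
import os

def load_api_key(env=None, var_name="OPENAI_API_KEY"):
    """Return the API key stored in `var_name`, or raise if it is missing or blank."""
    # Default to the real process environment when no mapping is supplied
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key

# Demonstration with a stand-in mapping instead of the real environment
demo_key = load_api_key({"OPENAI_API_KEY": " sk-example "})
print(demo_key)

# In the notebook you would then write:
# api_key = load_api_key()
```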

Convert the CSV data file into a JSONL file for training.

In [69]:
import csv
import json

# Open the CSV input file and JSONL output file
# (newline='' is what the csv module recommends for files passed to a reader)
with open('qna_chitchat_friendly.csv', 'r', newline='') as input_file, open('training_data.jsonl', 'w') as output_file:
    # Create a CSV reader object
    csv_reader = csv.DictReader(input_file)
    # Iterate over the rows in the CSV file and convert each row to a JSON object
    for row in csv_reader:
        # Create a dictionary with the "prompt" and "completion" keys.
        # The trailing "\n" on each completion acts as a stop sequence that marks where the answer ends.
        json_obj = {
            "prompt": row["Question"],
            "completion": row["Answer"]+"\n"
        }
        # Write the JSON object to the output file as a single line in JSONL format
        output_file.write(json.dumps(json_obj) + '\n')
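
Before uploading, it can be worth checking that every line of the JSONL file parses and has non-empty `prompt` and `completion` fields. The following is a minimal sketch of such a check; the function name `check_jsonl` is hypothetical, and the example runs on an in-memory sample rather than on `training_data.jsonl`.

```python
import io
import json

def check_jsonl(lines):
    """Return a list of (line_number, problem) tuples for a prompt/completion JSONL stream."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            problems.append((number, "blank line"))
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            problems.append((number, "not valid JSON"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
            continue
        # Both fields must be present and contain non-whitespace text
        for key in ("prompt", "completion"):
            value = record.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append((number, f"missing or empty {key!r}"))
    return problems

# In-memory sample: one good record, one empty prompt, one broken line
sample = io.StringIO(
    '{"prompt": "Hi", "completion": "Hello!\\n"}\n'
    '{"prompt": "", "completion": "Nothing to answer\\n"}\n'
    'not json\n'
)
problems = check_jsonl(sample)
print(problems)
```

To check the real file, pass an open file handle instead, for example `check_jsonl(open("training_data.jsonl"))`.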



Upload the `training_data.jsonl` file to OpenAI and obtain the assigned file ID.

In [70]:
import openai

# Set your OpenAI API key
openai.api_key = api_key

# Upload the file to OpenAI
with open("training_data.jsonl", "rb") as file:
    upload_response = openai.File.create(
        file=file,
        purpose="fine-tune"
    )

# Get the ID of the uploaded file
file_id = upload_response.id

In [71]:
file_id

'file-A5s6c6Kvk9M4yDH3qK8A4yAn'

Choose one of the four pre-trained base models, `ada`, `babbage`, `curie`, or `davinci`, to fine-tune with the uploaded training data file. The `ada` model is the smallest and has the fewest parameters; the `davinci` model is the largest and most powerful. If you don't specify a base model, the default is `curie`.

In [72]:
# Pass the file ID obtained from the upload step
fine_tune_response = openai.FineTune.create(training_file=file_id, model="davinci")

Check the status of the fine-tune job. A `pending` status means the request was accepted and the job is queued, which indicates that all the code above has executed successfully.

In [73]:
fine_tune_response.status

'pending'

Please allow 10 to 30 minutes for OpenAI to build the fine-tuned model, keeping in mind that the time required may vary depending on the size of the model being fine-tuned. For instance, `ada` takes the shortest time while `davinci` takes the longest.
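
Rather than waiting a fixed time, you could poll the job until it finishes. The sketch below is one way to do that. The helper `wait_for_fine_tune` is hypothetical, and the set of terminal statuses (`succeeded`, `failed`, `cancelled`) is an assumption about this version of the fine-tunes API. The example uses a fake status source so that it runs without network access.

```python
import time

# Assumed terminal statuses for a fine-tune job
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}

def wait_for_fine_tune(fetch_job, poll_seconds=60, max_polls=60, sleep=time.sleep):
    """Call `fetch_job()` repeatedly until the job reaches a terminal status."""
    for attempt in range(max_polls):
        job = fetch_job()
        if job["status"] in TERMINAL_STATUSES:
            return job
        # Pause between polls, but not after the final attempt
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    raise TimeoutError(f"Job still not finished after {max_polls} polls")

# Fake job source standing in for the API, with sleeping disabled
fake_jobs = iter([
    {"status": "pending"},
    {"status": "running"},
    {"status": "succeeded", "fine_tuned_model": "example-model-id"},
])
finished = wait_for_fine_tune(lambda: next(fake_jobs), sleep=lambda seconds: None)
print(finished["status"], finished["fine_tuned_model"])

# With the openai 0.27.x client this might be wired up as (assumption):
# job = wait_for_fine_tune(lambda: openai.FineTune.retrieve(id=fine_tune_response.id))
# model_id = job["fine_tuned_model"]
```

If the job object exposes `fine_tuned_model` as assumed here, the model ID can be read from the finished job directly.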

Then log in to your OpenAI account, open the `Usage` page from the left sidebar, find the `Daily usage breakdown` section, and select the current date. Click `Fine-tune training` to find the model ID, as shown below.

![Fine-tuned model id](https://filedn.com/lJpzjOtA91quQEpwdrgCvcy/Business%20Data%20Mining%20and%20Knowledge%20Discovery/chatGPT/fine-tuned%20model%20id.png)

Set the `model_id` and a new prompt to receive a response from the fine-tuned model.

In [16]:
# Set the ID of the fine-tuned GPT-3 model
model_id = "davinci:ft-personal-2023-03-26-22-11-22"

# Set a new prompt for text generation
prompt = "what's your name?"

# Generate text using the fine-tuned GPT-3 model
response = openai.Completion.create(
    engine=model_id,  # ID of the fine-tuned GPT-3 model to query
    prompt=prompt,
    temperature=0.8,  # Controls the randomness of the output. Lower values produce more
                      # conservative output, higher values more varied output.
                      # The API default is 1.0; 0.8 is used in this example.
    max_tokens=10,    # Maximum number of tokens (words or sub-words) to generate.
                      # The API default for completions is 16; 10 is used in this example.
    n=1,              # Number of completions to generate. The default is 1.
    stop=None,        # Stop sequence(s). With None (the default), generation continues until
                      # the model emits its end-of-text token or reaches max_tokens.
    timeout=30,       # Client-side time limit in seconds. In openai 0.27.x this appears to bound
                      # how long the client keeps retrying while the model loads; the separate
                      # request_timeout argument limits a single HTTP request.
)

In [17]:
print(response.choices[0].text)

Oh, I don't have a name.
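
Because the request uses `stop=None`, the model may keep generating after the newline that ended each training completion. A minimal sketch of trimming the text at that newline on the client side follows. The helper name `first_completion_text` is hypothetical, and a plain dictionary stands in for the API response.

```python
def first_completion_text(response, stop="\n"):
    """Return the first choice's text, cut at the first `stop` sequence."""
    text = response["choices"][0]["text"].strip()
    # partition returns the whole string as the head when `stop` is absent
    head, _, _ = text.partition(stop)
    return head.strip()

# Stand-in for an API response that ran past the end of the answer
fake_response = {"choices": [{"text": " Oh, I don't have a name.\nWhat's yours"}]}
answer = first_completion_text(fake_response)
print(answer)
```

Passing `stop="\n"` in the request itself would have a similar effect on the server side.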



You can delete the fine-tuned model once you no longer need it.

In [18]:
openai.Model.delete("davinci:ft-personal-2023-03-26-22-11-22")

<Model model id=davinci:ft-personal-2023-03-26-22-11-22 at 0x7f90f7a4ae00> JSON: {
  "deleted": true,
  "id": "davinci:ft-personal-2023-03-26-22-11-22",
  "object": "model"
}