https://www.geeksforgeeks.org/openai-python-api/
https://platform.openai.com/docs/guides/legacy-fine-tuning

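## 0. Install the `openai` library

If needed, install `openai` with `pip`. The code in this notebook uses the legacy (pre-1.0) interface of the library (`openai.File`, `openai.FineTune`, `openai.Completion`), so the version is pinned below 1.0. `python-dotenv` is installed as well because section 1 imports `dotenv`:

In [1]:
%pip install --upgrade "openai<1.0" python-dotenv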

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.
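
Because the later cells depend on the pre-1.0 interface, it can help to confirm which version ended up installed. The following is a minimal sketch; the helper name `uses_legacy_interface` is hypothetical and not part of the `openai` package.

```python
from importlib import metadata


def uses_legacy_interface(version: str) -> bool:
    """Return True when the major version number is below 1."""
    major = int(version.split(".")[0])
    return major < 1


try:
    installed = metadata.version("openai")
    print(f"openai {installed}, legacy interface: {uses_legacy_interface(installed)}")
except metadata.PackageNotFoundError:
    print("openai is not installed")
```

If the check reports a version of 1.0 or later, calls such as `openai.File.create` in the cells below are expected to fail, and the package should be reinstalled with the pin shown above.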





## 1. Set OPENAI_API_KEY

Using Python code, reading the key from a `.env` file:

In [2]:
import openai

In [3]:
from dotenv import load_dotenv
import os

# Load the variables defined in the .env file into the environment
load_dotenv()

# Read the OpenAI API key (the name must match the variable in the .env file)
API_KEY = os.environ["OPENAI_API"]

openai.api_key = API_KEY
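
A missing or empty variable otherwise surfaces only as a bare `KeyError` or as an authentication failure later on. The sketch below fails early with a clearer message. The helper `read_api_key` is hypothetical, and the default name `OPENAI_API` is taken from the cell above.

```python
import os


def read_api_key(env=os.environ, name="OPENAI_API"):
    """Return the API key stored under `name`, or raise a descriptive error."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Environment variable {name} is not set or is empty")
    return key


# Demonstration with a plain dict standing in for the real environment
print(read_api_key({"OPENAI_API": " sk-example "}))
```

In the notebook, this would be called as `openai.api_key = read_api_key()` after `load_dotenv()`.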

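As an environment variable. In a notebook, `!export` runs in a temporary subshell and the value does not persist, so the `%env` magic is used instead:

In [None]:
%env OPENAI_API_KEY=<OPENAI_API_KEY>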

## 2. Prepare training data

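Convert the CSV file to a JSONL file with the data-preparation tool of the OpenAI CLI. The input file needs a `prompt` column and a `completion` column: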

In [None]:
!openai tools fine_tunes.prepare_data -f <LOCAL_FILE>
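
The same conversion can also be sketched in plain Python, which makes the target format explicit: one JSON object per line with a `prompt` and a `completion` key. This is not what the CLI tool does internally. The function name `csv_to_jsonl` is hypothetical, and the separator `" ->"` and ending `" END"` are assumptions chosen to match the prompt and stop sequence used in section 4.

```python
import csv
import io
import json


def csv_to_jsonl(csv_text: str, separator: str = " ->", stop: str = " END") -> str:
    """Convert CSV text with 'prompt' and 'completion' columns to JSONL text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        record = {
            # every prompt ends with the same separator
            "prompt": row["prompt"].strip() + separator,
            # every completion starts with a space and ends with the stop marker
            "completion": " " + row["completion"].strip() + stop,
        }
        lines.append(json.dumps(record, ensure_ascii=False))
    return "".join(line + "\n" for line in lines)


sample = "prompt,completion\nGenerate a passage about Beowulf,The epic poem Beowulf\n"
print(csv_to_jsonl(sample), end="")
```

Using one fixed separator and one fixed ending for every example lets the same strings be reused at inference time, as the prompt suffix and as the `stop` parameter.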

## 3. Create a fine-tuned model

Start your fine-tuning job using the OpenAI CLI:

In [None]:
!openai api fine_tunes.create -t <TRAIN_FILE_ID_OR_PATH> -m <BASE_MODEL>

After you start a fine-tuning job, it may take some time to complete. The job may be queued behind other jobs on OpenAI's system, and training can take minutes or hours depending on the model and dataset size. If the event stream is interrupted for any reason, resume it by running:

In [None]:
!openai api fine_tunes.follow -i <YOUR_FINE_TUNE_JOB_ID>

Alternatively, use Python code. First, upload the training dataset:

In [19]:
# uploading the train dataset
openai.File.create(
  file=open("data_prepared_train.jsonl", "rb"),
  purpose='fine-tune'
)

<File file id=file-pMIeIcXJKJM7QOjkVu9OrNBc at 0x206a81677d0> JSON: {
  "object": "file",
  "id": "file-pMIeIcXJKJM7QOjkVu9OrNBc",
  "purpose": "fine-tune",
  "filename": "file",
  "bytes": 51701,
  "created_at": 1694987589,
  "status": "uploaded",
  "status_details": null
}

Then upload the validation dataset:

In [5]:
# uploading the validation dataset
openai.File.create(
  file=open("data_prepared_valid.jsonl", "rb"),
  purpose='fine-tune'
)

<File file id=file-ikpX0K7C7YEyMGHG4hBOZNsb at 0x206a8166750> JSON: {
  "object": "file",
  "id": "file-ikpX0K7C7YEyMGHG4hBOZNsb",
  "purpose": "fine-tune",
  "filename": "file",
  "bytes": 12545,
  "created_at": 1694986169,
  "status": "uploaded",
  "status_details": null
}

Now fine-tune the model:

In [20]:
# call the FineTune API endpoint
res = openai.FineTune.create(
    # training file ID
    training_file="file-pMIeIcXJKJM7QOjkVu9OrNBc",
    # validation file ID
    validation_file="file-ikpX0K7C7YEyMGHG4hBOZNsb",
    # model ID (model must be one of ada, babbage, curie, davinci)
    model="curie",
    # number of epochs
    n_epochs=3,
    # batch size to process
    batch_size=4,
    # adjust the learning rate of the model
    learning_rate_multiplier=0.1333,
    # add a suffix to identify the model
    suffix="aiBT"
)

# storing the job_id of the process
jobID = res["id"]
# storing the status of the process
status = res["status"]

# report the job ID, the full response, and the status
print(f'Fine-tuning model with jobID: {jobID}.')
print(f"Training Response: {res}")
print(f"Training Status: {status}")


Fine-tuning model with jobID: ft-8Q4A7D7ZpMZ0xH0nv44eqO6t.
Training Response: {
  "object": "fine-tune",
  "id": "ft-8Q4A7D7ZpMZ0xH0nv44eqO6t",
  "hyperparams": {
    "n_epochs": 3,
    "batch_size": 4,
    "prompt_loss_weight": 0.01,
    "learning_rate_multiplier": 0.01
  },
  "organization_id": "org-nWYnYxLO4AK96AchxEmHy9Bw",
  "model": "curie",
  "training_files": [
    {
      "object": "file",
      "id": "file-pMIeIcXJKJM7QOjkVu9OrNBc",
      "purpose": "fine-tune",
      "filename": "file",
      "bytes": 51701,
      "created_at": 1694987589,
      "status": "uploaded",
      "status_details": null
    }
  ],
  "validation_files": [
    {
      "object": "file",
      "id": "file-ikpX0K7C7YEyMGHG4hBOZNsb",
      "purpose": "fine-tune",
      "filename": "file",
      "bytes": 12545,
      "created_at": 1694986169,
      "status": "processed",
      "status_details": null
    }
  ],
  "result_files": [],
  "created_at": 1694987743,
  "updated_at": 1694987743,
  "status": "pendi

Sometimes the process fails. Check the `status` field of the response; `status_details` gives more information about what went wrong. A common cause is badly formatted strings in the prompts or completions. Printing a suspect string with `print` shows whether the text is displayed correctly.

After correcting any mistakes, upload the file(s) again.
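
Checking every record by eye does not scale, so a small script can flag the lines that would not parse before the file is uploaded. This is a minimal sketch, not an official validator. The function name `find_bad_lines` is hypothetical, and it checks only that each line is a JSON object with string `prompt` and `completion` fields.

```python
import json


def find_bad_lines(jsonl_text: str):
    """Return (line_number, problem) pairs for lines that look malformed."""
    problems = []
    lines = jsonl_text.rstrip("\n").split("\n")
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            problems.append((number, "blank line"))
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
            continue
        for field in ("prompt", "completion"):
            if not isinstance(record.get(field), str):
                problems.append((number, f"missing or non-string '{field}'"))
    return problems


sample = '{"prompt": "a ->", "completion": " b END"}\n{"prompt": "a"}\nnot json\n'
for number, problem in find_bad_lines(sample):
    print(f"line {number}: {problem}")
```

For a real file, the text could be read with `open("data_prepared_train.jsonl", encoding="utf-8").read()` and passed to the function.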

In [18]:
print("Historical Background: paragraph 1: The epic poem Beowulf, written in Old English, is the earliest existing Germanic epic and one of four surviving Anglo-Saxon manuscripts. Although Beowulf was written by an anonymous Englishman in Old English, the tale takes place in that part of Scandinavia from which Germanic tribes emigrated to England. Beowulf comes from Geatland, the southeastern part of what is now Sweden. Hrothgar, king of the Danes, lives near what is now Leire, on Zealand, Denmark's largest island. The Beowulf epic contains three major: tales about Beowulf and several minor tales that reflect a rich Germanic oral tradition of myths, legends, and folklore. paragraph 2: The Beowulf warriors have a foot in both the Bronze and Iron Ages. Their mead-halls reflect the wealthy living of the Bronze Age Northmen, and their wooden shields, wood-shafted spears, and bronze-hilted swords are those of the Bronze Age warrior. However, they carry iron-tipped spears, and their best swords have iron or iron-edged blades. Beowulf also orders an iron shield for his fight with a dragon. Iron replaced bronze because it produced a blade with a cutting edge that was stronger and sharper. The Northmen learned how to forge iron in about 500 s.c. Although they had been superior to the European Celts in bronze work, it was the Celts who taught them how to make and design iron work. Iron was accessible everywhere in Scandinavia, usually in the form of 'bog-iron' found in the layers of peat in peat bogs. paragraph 3: The Beowulf epic also reveals interesting aspects of the lives of the Anglo-Saxons who lived in England at the time of the anonymous Beowulf poet. The Germanic tribes, including the Angles, the Saxons, and the Jutes, invaded England from about A.O. 450 to 600. By the time of the Beowulf poet, Anglo-Saxon society in England was neither primitive nor uncultured. paragraph 4: Although the Beowulf manuscript was written in about A.O. 
1000, it was not discovered until the seventeenth century. Scholars do not know whether Beowulf is the sole surviving epic from a flourishing Anglo-Saxon literary period that produced other great epics or whether it was unique even in its own time. Many scholars think that the epic was probably written sometime between the late seventh century and the early ninth century. If they are correct, the original manuscript was probably lost during the ninth-century Viking invasions of Anglia, in which the Danes destroyed the Anglo-Saxon monasteries and their great libraries. However, other scholars think that the poet's favorable attitude toward the Danes must place the epic's composition after the Viking invasions and at the start of the eleventh century, when this Beowulf manuscript was written. paragraph 5: The identity of the Beowulf poet is also uncertain. He apparently was a Christian who loved the pagan heroic tradition of his ancestors and blended the values of the pagan hero with the Christian values of his own country and time. Because he wrote in the Anglian dialect, he probably was either a monk in a monastery or a poet in an Anglo-Saxon court located north of the Thames River. Appeal and Value paragraph 6: Beowulf interests contemporary readers for many reasons. First, it is an outstanding adventure story. Grendel, Grendel's mother, and the dragon are marvelous characters, and each fight is unique, action-packed, and exciting. Second, Beowulf is a very appealing hero. He is the perfect warrior, combining extraordinary strength, skill, courage, and loyalty. Like Hercules, he devotes his life to making the world a safer place. He chooses to risk death in order to help other people, and he faces his inevitable death with heroism and dignity. Third, the Beowulf poet is interested in the psychological aspects of human behavior. For example, the Danish hero's welcoming speech illustrates his jealousy of Beowulf. 
The behavior of Beowulf's warriors in the dragon fight reveals their cowardice. Beowulf's attitudes toward heroism reflect his maturity and experience, while King Hrothgar's attitudes toward life show the experiences of an aged nobleman. paragraph 7: Finally, the Beowulf poet exhibits a mature appreciation of the transitory nature of human life and achievement. In Beowulf, as in the major epics of other cultures, the hero must create a meaningful life in a world that is often dangerous and uncaring. He must accept the inevitability of death. He chooses to reject despair; instead, he takes pride in himself and in his accomplishments, and he values human relationships.")

Historical Background: paragraph 1: The epic poem Beowulf, written in Old English, is the earliest existing Germanic epic and one of four surviving Anglo-Saxon manuscripts. Although Beowulf was written by an anonymous Englishman in Old English, the tale takes place in that part of Scandinavia from which Germanic tribes emigrated to England. Beowulf comes from Geatland, the southeastern part of what is now Sweden. Hrothgar, king of the Danes, lives near what is now Leire, on Zealand, Denmark's largest island. The Beowulf epic contains three major: tales about Beowulf and several minor tales that reflect a rich Germanic oral tradition of myths, legends, and folklore. paragraph 2: The Beowulf warriors have a foot in both the Bronze and Iron Ages. Their mead-halls reflect the wealthy living of the Bronze Age Northmen, and their wooden shields, wood-shafted spears, and bronze-hilted swords are those of the Bronze Age warrior. However, they carry iron-tipped spears, and their best swords hav

Check the status of the fine-tuning job:

In [34]:
# retrieve the status of the fine-tuning job from OpenAI
status = openai.FineTune.retrieve(id=jobID)["status"]

# a job is finished once its status is "succeeded" or "failed"
if status not in ["succeeded", "failed"]:
    print(f'Job not in terminal status: {status}. Check again later.')
else:
    print(f'Finetune job {jobID} finished with status: {status}')

# also print all the fine-tuning jobs
# print('Checking other finetune jobs in the subscription.')
# result = openai.FineTune.list()
# print(f'Found {len(result.data)} finetune jobs.')

Finetune job ft-8Q4A7D7ZpMZ0xH0nv44eqO6t finished with status: succeeded
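
The cell above checks the status once. To wait until the job finishes, the check can be repeated with a pause between attempts. The sketch below takes the status lookup as a callable so that it runs without network access. The helper `wait_for_job` is hypothetical, and treating `"cancelled"` as a terminal status is an assumption.

```python
import time

# "cancelled" is assumed here to be a terminal status besides the two used above
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(get_status, poll_seconds=30.0, max_checks=120, sleep=time.sleep):
    """Call get_status() until it returns a terminal status, then return it."""
    for _ in range(max_checks):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"job still not finished after {max_checks} checks")


# Demonstration with a fake sequence of statuses and no real waiting
fake_statuses = iter(["pending", "running", "succeeded"])
print(wait_for_job(lambda: next(fake_statuses), poll_seconds=0, sleep=lambda s: None))
```

In the notebook, the callable would be `lambda: openai.FineTune.retrieve(id=jobID)["status"]`.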


## 4. Use a fine-tuned model

Use the model from the CLI:

In [None]:
!openai api completions.create -m <FINE_TUNED_MODEL> -p <YOUR_PROMPT>

The name of the fine-tuned model can be retrieved in Python from the finished job:

In [36]:
result = openai.FineTune.retrieve(id=jobID)
fine_tuned_model = result.fine_tuned_model
print(fine_tuned_model)

curie:ft-personal:aibt-2023-09-17-22-27-34


Now it is time to use the fine-tuned model.

In [15]:
# text prompt
new_prompt = 'Generate a reading passage about XXXXXXX with TOEFL iBT format ->'

# calling the Completion API
answer = openai.Completion.create(
    # model ID collected from previous step is supposed to be passed here
    model=fine_tuned_model,
    # text input is passed here
    prompt=new_prompt,
    # maximum number of tokens to generate in the completion
    max_tokens=1000,
    # In this case, a high temperature is not a good idea
    temperature=0.5,
    # Stop sequence
    stop=["END"]
)
# displaying the result
print(answer['choices'][0]['text'])


 El Curso de Python para Desarrolladores de Platzi es perfecto para ti. Aprenderás desde cero hasta llegar a ser un experto en este lenguaje de programación y desarrollo. 
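
With `max_tokens` set, a completion may be cut off before the model reaches the stop sequence. Each choice in the response carries a `finish_reason` field, which is `"length"` when the token limit was hit. The sketch below reads it from a dict-like response. The helper `completion_text` and the example response are hypothetical.

```python
def completion_text(response, index=0):
    """Return the stripped text of one choice and whether it was cut off."""
    choice = response["choices"][index]
    text = choice["text"].strip()
    truncated = choice.get("finish_reason") == "length"
    return text, truncated


# Hypothetical response with the same shape as the Completion API result
example = {
    "choices": [
        {"text": " Beowulf is an Old English epic. ", "finish_reason": "length"}
    ]
}
text, truncated = completion_text(example)
print(text)
if truncated:
    print("Warning: the completion reached max_tokens before the stop sequence.")
```

In the notebook, `completion_text(answer)` would replace the direct indexing in the `print` call above.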
