# Generating Synthetic Data with OpenAI

This notebook is organized into sections, each of which uses a specific prompt to generate synthetic data for fine-tuning our model.

In [1]:
!pip install openai

Collecting openai
  Downloading openai-1.35.12-py3-none-any.whl (328 kB)
Collecting httpx<1,>=0.23.0 (from openai)
  Downloading httpx-0.27.0-py3-none-any.whl (75 kB)
Collecting httpcore==1.* (from httpx<1,>=0.23.0->openai)
  Downloading httpcore-1.0.5-py3-none-any.whl (77 kB)
Collecting h11<0.15,>=0.13 (from httpcore==1.*->httpx<1,>=0.23.0->openai)
  Downloading h11-0.14.0-py3-none-any.whl (58 kB)
Installing collected packages: h11, httpcore, httpx, openai
Successfully installed h11-0.14.0 httpcore-1.0.5 h

## Loglines and Titles

In [3]:
from openai import OpenAI

client = OpenAI(api_key='API_KEY')

In [4]:
systemPrompt = """
You are responsible for generating synthetic data for a script-writing task. Be creative in your approach and add diversity to the examples by considering \
 various genres, different audience types, settings at different locations around the world and people from all walks of life.
"""

In [15]:
userPrompt = """
I want your help in creating several examples of user-provided log lines and the corresponding movie or story titles that would be appropriate for that log line. \
I have provided some examples below in JSON format:

{ \"logline\": \"A historical drama centered around the life of a young woman in Victorian England who defies societal norms to become a pioneering doctor. Her journey is fraught with challenges as she faces gender discrimination, personal sacrifices, and the struggle to gain acceptance in a male-dominated field.\", \"title\": \"Against All Odds\"}
{ \"logline\": \"A fantasy story set in a magical kingdom where a young farm girl discovers she has the power to communicate with dragons. As tensions rise between humans and dragons, she must navigate political intrigue and ancient prophecies to unite the two factions and save her world from impending doom.\", \"title\": \"Dragon Whisperer\"}
{ \"logline\": \"A modern-day thriller where a brilliant but troubled detective must solve a series of cryptic murders that are linked to an underground network of hackers. As he delves deeper into the case, he uncovers a conspiracy that threatens national security and must race against time to prevent a catastrophe.\", \"title\": \"Code of Silence\"}

I want you to create five examples and provide the output as a JSON array of objects, each in the same format as above. Please give me only the examples and nothing else, as any other sentences will hurt the downstream processing of the examples.
"""

In [22]:
chatOutput = client.chat.completions.create(model="gpt-4o",
                                            messages=[{"role": "system", "content": systemPrompt},
                                                      {"role": "user", "content": userPrompt}
                                                      ]
                                            )

In [32]:
import json

examples = json.loads(chatOutput.choices[0].message.content)
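`json.loads` only succeeds when the reply is a single JSON value, such as an array. Models sometimes return one object per line or wrap the reply in a Markdown code fence, so a more forgiving parser can help. The sketch below is one possible approach under those assumptions; `parse_examples` is a hypothetical helper and not part of the OpenAI SDK.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker


def parse_examples(text: str) -> list[dict]:
    """Parse a reply that is a JSON array, a single object, or one object per line."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line (with its optional language tag)
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else ""
        # Drop the closing fence, if there is one
        cleaned = cleaned.rsplit(FENCE, 1)[0].strip()
    try:
        parsed = json.loads(cleaned)
        return parsed if isinstance(parsed, list) else [parsed]
    except json.JSONDecodeError:
        # Fall back to JSON Lines: one object per non-empty line
        return [json.loads(line) for line in cleaned.splitlines() if line.strip()]
```

With this helper, the cell above could call `parse_examples(chatOutput.choices[0].message.content)` in place of `json.loads(...)`.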

In [36]:
## Some post-processing to adapt the examples to the JSONL format - the question prompt changes for each section/type of examples
dataset = []
for example in examples:
  data = {}
  data["question"] = "You are a scriptwriter assistant with a flair for creativity and imagination. The user will provide you with a log line that typically contain the setting, protagonist, antagonist, a conflict or goal, and sometimes the inciting incident. Based on the log line, please come up with a title suggestion for the story."
  data["context"] = example['logline']
  data["answer"] = example['title']
  dataset.append(data)

In [43]:
for item in dataset:
  print(json.dumps(item))

{"question": "You are a scriptwriter assistant with a flair for creativity and imagination. The user will provide you with a log line that typically contain the setting, protagonist, antagonist, a conflict or goal, and sometimes the inciting incident. Based on the log line, please come up with a title suggestion for the story.", "context": "A sci-fi adventure where a team of astronauts embarks on a mission to terraform a distant planet. As they face unforeseen challenges and mysterious alien phenomena, they must confront their own personal demons and work together to survive and create a new home for humanity.", "answer": "Genesis Frontier"}
{"question": "You are a scriptwriter assistant with a flair for creativity and imagination. The user will provide you with a log line that typically contain the setting, protagonist, antagonist, a conflict or goal, and sometimes the inciting incident. Based on the log line, please come up with a title suggestion for the story.", "context": "A roman
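The cell above only prints the records. If they should be kept for fine-tuning, they can be written to a JSON Lines file, one record per line. The following is a minimal sketch using only the standard library; `write_jsonl`, `read_jsonl`, and the file name in the usage note are illustrative names, not part of the original notebook.

```python
import json
from pathlib import Path


def write_jsonl(records: list[dict], path: str) -> int:
    """Append records to a JSON Lines file and return how many were written."""
    with Path(path).open("a", encoding="utf-8") as handle:
        for record in records:
            # ensure_ascii=False keeps non-ASCII characters readable in the file
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
    return len(records)


def read_jsonl(path: str) -> list[dict]:
    """Load every non-empty line of a JSON Lines file."""
    with Path(path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

For example, `write_jsonl(dataset, "titles.jsonl")` would append the current batch. The file is opened in append mode so that repeated batches accumulate instead of overwriting each other.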

## Type 1 Prompts - Loglines and Characters

In this type of prompt, we generate multiple examples in which a logline is the input and the characters of that story are the output. We can formalize and enforce this data structure with the Instructor library, which can be used more generally for structured extraction with API-based LLMs.
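To make "enforcing the data structure" concrete, the sketch below shows the kind of shape check a schema library performs on a model reply, written with the standard library only. It does not reflect how Instructor or Pydantic are implemented; `StoryCharacters` and `validate_story_characters` are illustrative names.

```python
from dataclasses import dataclass


@dataclass
class StoryCharacters:
    logline: str
    characters: list[str]


def validate_story_characters(raw: dict) -> StoryCharacters:
    """Reject replies that do not match the expected logline/characters shape."""
    logline = raw.get("logline")
    characters = raw.get("characters")
    if not isinstance(logline, str) or not logline.strip():
        raise ValueError("'logline' must be a non-empty string")
    if not isinstance(characters, list) or not all(isinstance(c, str) for c in characters):
        raise ValueError("'characters' must be a list of strings")
    return StoryCharacters(logline=logline, characters=characters)
```

A reply such as `{"logline": "...", "characters": "Medea"}` would be rejected here because `characters` is a string rather than a list. A schema library automates checks of this kind.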

In [1]:
!pip install instructor

Collecting instructor
  Downloading instructor-1.3.4-py3-none-any.whl.metadata (14 kB)
Collecting docstring-parser<0.17,>=0.16 (from instructor)
  Downloading docstring_parser-0.16-py3-none-any.whl.metadata (3.0 kB)
Collecting jiter<0.5.0,>=0.4.1 (from instructor)
  Downloading jiter-0.4.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.6 kB)
Downloading instructor-1.3.4-py3-none-any.whl (53 kB)
Downloading docstring_parser-0.16-py3-none-any.whl (36 kB)
Downloading jiter-0.4.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (327 kB)
Installing collected packages: jiter, docstring-parser, instructor
Successfully installed docstring-parser-0.16 instructor-1.3.4 jiter-0.4.2

In [81]:
## Making use of the Instructor library to generate structured examples

import instructor

from typing import Iterable, List, Tuple
from pydantic import BaseModel, Field, ConfigDict
from openai import OpenAI


# Define the storyCharacters model
class storyCharacters(BaseModel):
    """Class to hold generated values for Characters in the story described by the logline"""
    logline: str
    characters: List[str]
    
    model_config = ConfigDict(
        json_schema_extra={
            "examples": [
                {
                    "logline": "Ancient Greek tragedy based upon the myth of Jason and Medea. Medea, a former princess and the wife of Jason, finds her position in the Greek world threatened as Jason leaves Medea for a Greek princess of Corinth. Medea takes vengeance on Jason by murdering his new wife as well as Medea's own two sons, after which she escapes to Athens.", 
                    "characters": [
                        ("Medea is the protagonist of the play. A sorceress and a princess, she fled her country and family to live with Jason in Corinth, where they established a family of two children and gained a favorable reputation. Jason has divorced Medea and taken up with a new family."),
                        ("Jason is considered the play's villain, though his evil stems more from weakness than strength. A former adventurer, Jason abandons his wife, Medea, in order to marry the beautiful young daughter of Creon, King of Corinth, and fuels Medea to a revenge."),
                        ("The Women of Corinth act as commentators on the action. They fully sympathize with Medea's plight, except for her decision to murder her own children."),
                        ("Creon, the King of Corinth, banishes Medea from the city."),
                        ("The Nurse is the caretaker of the house and of the children and serves as Medea's confidant.")
                 ]
                }
            ]
        }
    )

In [82]:
# Patch the OpenAI client to enable the response_model functionality
client = instructor.from_openai(OpenAI(api_key="API_KEY"))

def generate_character_descriptions(count: int) -> Iterable[storyCharacters]:
    return client.chat.completions.create(
        model="gpt-4o",
        response_model=Iterable[storyCharacters],
        messages=[
            {"role": "user", "content": f"You are responsible for generating synthetic data for a script-writing task. Be creative in your approach and add diversity to the examples by considering \
                                        various genres, different audience types, settings at different locations around the world and people from all walks of life. \
                                        Generate `{count}` synthetic examples"},
        ],
    )

In [None]:
type1_prompts = []
for charDescription in generate_character_descriptions(5):
    print(charDescription)
    type1_prompts.append(charDescription)

In [27]:
## Some post-processing to adapt the examples to the JSONL format - the question prompt changes for each section/type of examples
dataset = []
for example in type1_prompts:
  data = {}
  data["question"] = "You are a scriptwriter assistant with a flair for creativity and imagination. You have to generate a list of characters based on the log line."
  data["context"] = example.logline
  data["answer"] = " ".join(example.characters)
  dataset.append(data)

In [34]:
import json

for item in dataset:
    print(json.dumps(item))

{"question": "You are a scriptwriter assistant with a flair for creativity and imagination. You have to generate a list of characters based on the log line.", "context": "In a dystopian future, a disillusioned scientist teams up with a gifted hacker to expose a corrupt government's surveillance operations. Along the way, they uncover a deeper conspiracy that threatens to destroy humanity.", "answer": "Dr. Eleanor Rigby, a brilliant but jaded scientist who has lost faith in humanity's future. Max 'Cipher' Monroe, a charismatic and highly skilled hacker with a dark past. Director Nathan Crowe, the manipulative head of the government's surveillance agency. Lucy, a mysterious informant with knowledge of the deeper conspiracy. Dr. Isaac Thorn, Eleanor's former mentor who now works for the corrupt government."}
{"question": "You are a scriptwriter assistant with a flair for creativity and imagination. You have to generate a list of characters based on the log line.", "context": "A heartwarmi

In [88]:
import time

for i in range(1,5):
    # Make the OpenAI API Call
    type1_prompts = []
    for charDescription in generate_character_descriptions(5):
        type1_prompts.append(charDescription)
    dataset = []

    # Adapt the response to the dataset format that we need
    for example in type1_prompts:
        data = {}
        data["question"] = "You are a scriptwriter assistant with a flair for creativity and imagination. You have to generate a list of characters based on the log line."
        data["context"] = example.logline
        data["answer"] = " ".join(example.characters)
        dataset.append(data)
    
    # Print/Save the dataset to file
    for item in dataset:
        print(json.dumps(item))
    
    time.sleep(5)

{"question": "You are a scriptwriter assistant with a flair for creativity and imagination. You have to generate a list of characters based on the log line.", "context": "A group of teenagers from a small town in Scotland discover an ancient artifact that transports them to a parallel world filled with mythical creatures. They must navigate this fantastical realm and find a way back home.", "answer": "Lachlan, the adventurous leader of the group, is known for his bravery and quick thinking. Isla, a tech-savvy girl with a keen interest in ancient history, is resourceful and intelligent. Ewan, the skeptical and cautious member, often acts as the voice of reason amidst the chaos. Fiona, an empathetic and kind-hearted girl, has a unique ability to communicate with the mythical creatures. Rowan, a mischievous and fearless boy, thrives in the unpredictability of the new world."}
{"question": "You are a scriptwriter assistant with a flair for creativity and imagination. You have to generate a
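The loop above rebuilds `dataset` on every iteration, so earlier batches survive only in the printed output. If a single combined dataset is wanted, the batches can be accumulated and deduplicated on `context`, since repeated calls may return similar loglines. The sketch below assumes that design; `collect_batches` is a hypothetical helper and `fake_batch` stands in for the API call.

```python
from typing import Callable, Iterable


def collect_batches(fetch_batch: Callable[[], Iterable[dict]], rounds: int) -> list[dict]:
    """Call fetch_batch `rounds` times and keep the first record seen for each context."""
    seen: set[str] = set()
    combined: list[dict] = []
    for _ in range(rounds):
        for record in fetch_batch():
            # Normalize whitespace and case so near-identical loglines collide
            key = record["context"].strip().lower()
            if key in seen:
                continue  # an earlier record already covers this logline
            seen.add(key)
            combined.append(record)
    return combined


def fake_batch() -> list[dict]:
    """Stand-in for the API call: returns the same two records every time."""
    return [
        {"question": "q", "context": "A heist in Lagos.", "answer": "a1"},
        {"question": "q", "context": "A heist in Lagos. ", "answer": "a2"},
    ]


print(len(collect_batches(fake_batch, rounds=3)))  # prints 1
```

Exact-match deduplication only catches repeated loglines that differ in case or surrounding whitespace. Paraphrased duplicates would need a similarity measure, which is outside the scope of this sketch.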

## Type 2 Prompts - Loglines, characters & scenes

In this type of prompt, the inputs are the logline and the characters (for example, those generated in the previous step), and we want the LLM to generate the scenes from them. Each scene contains several elements: the scene name, scene location, plot element, and beat.

In [92]:
## Making use of the Instructor library to generate structured examples

import instructor

from typing import Iterable, List, Tuple
from pydantic import BaseModel, Field, ConfigDict
from openai import OpenAI

# Define the scene model
class scene(BaseModel):
    """Class to hold generated values for a specific scene in the story"""
    sceneName: str
    sceneLocation: str
    plotElement: str
    beat: str

    model_config = ConfigDict(
        json_schema_extra={
            "examples": [
                {"sceneName": "Introduction", "sceneLocation": "Medea's modest home", "plotElement": "Exposition", "beat": "The Nurse recounts the chain of events that have turned Medea's world to enmity. The Nurse laments how Jason has abandoned Medea and his own children in order to remarry with the daughter of Creon."},
                {"sceneName": "Medea's plight", "sceneLocation": "Medea's modest home", "plotElement": "Inciting Incident.", "beat": "The Nurse confides in the Tutor and testifies to the emotional shock Jason's betrayal has sparked in Medea. The Tutor shares the Nurse's sympathy for Medea's plight. Medea's first words are cries of helplessness. Medea wishes for her own death."},
                {"sceneName": "Medea's oppression", "sceneLocation": "Outside the Royal Palace.", "plotElement": "Rising Action.", "beat": "Medea pleads to the Nurse that Jason be made to suffer for the suffering he has inflicted upon her. Creon approaches the house and banishes Medea and her children from Corinth. Medea plans on killing her three antagonists, Creon, his daughter and Jason."},
            ]
        }
    )

# Define the storyScenes model
class storyScenes(BaseModel):
    """Class to hold generated values for Scenes in the story described by the logline and characters"""
    logline: str
    characters: List[str]
    scenes: List[scene]
    
    model_config = ConfigDict(
        json_schema_extra={
            "examples": [
                {
                    "logline": "Ancient Greek tragedy based upon the myth of Jason and Medea. Medea, a former princess and the wife of Jason, finds her position in the Greek world threatened as Jason leaves Medea for a Greek princess of Corinth. Medea takes vengeance on Jason by murdering his new wife as well as Medea's own two sons, after which she escapes to Athens.", 
                    "characters": [
                        ("Medea is the protagonist of the play. A sorceress and a princess, she fled her country and family to live with Jason in Corinth, where they established a family of two children and gained a favorable reputation. Jason has divorced Medea and taken up with a new family."),
                        ("Jason is considered the play's villain, though his evil stems more from weakness than strength. A former adventurer, Jason abandons his wife, Medea, in order to marry the beautiful young daughter of Creon, King of Corinth, and fuels Medea to a revenge."),
                        ("The Women of Corinth act as commentators on the action. They fully sympathize with Medea's plight, except for her decision to murder her own children."),
                        ("Creon, the King of Corinth, banishes Medea from the city."),
                        ("The Nurse is the caretaker of the house and of the children and serves as Medea's confidant.")],
                    "scenes": [
                        {"sceneName": "Introduction", "sceneLocation": "Medea's modest home", "plotElement": "Exposition", "beat": "The Nurse recounts the chain of events that have turned Medea's world to enmity. The Nurse laments how Jason has abandoned Medea and his own children in order to remarry with the daughter of Creon."},
                        {"sceneName": "Medea's plight", "sceneLocation": "Medea's modest home", "plotElement": "Inciting Incident.", "beat": "The Nurse confides in the Tutor and testifies to the emotional shock Jason's betrayal has sparked in Medea. The Tutor shares the Nurse's sympathy for Medea's plight. Medea's first words are cries of helplessness. Medea wishes for her own death."},
                        {"sceneName": "Medea's oppression", "sceneLocation": "Outside the Royal Palace.", "plotElement": "Rising Action.", "beat": "Medea pleads to the Nurse that Jason be made to suffer for the suffering he has inflicted upon her. Creon approaches the house and banishes Medea and her children from Corinth. Medea plans on killing her three antagonists, Creon, his daughter and Jason."},
                    ]
                }
            ]
        }
    )

In [93]:
# Patch the OpenAI client to enable the response_model functionality
client = instructor.from_openai(OpenAI(api_key="API_KEY"))

def generate_story_scenes(count: int) -> Iterable[storyScenes]:
    return client.chat.completions.create(
        model="gpt-4o",
        response_model=Iterable[storyScenes],
        messages=[
            {"role": "user", "content": f"You are responsible for generating synthetic data for a script-writing task. Be creative in your approach and add diversity to the examples by considering \
                                        various genres, different audience types, settings at different locations around the world and people from all walks of life. \
                                        Generate `{count}` synthetic examples"},
        ],
    )

In [94]:
def create_scene_description(inputScene):
    oneScene = ""
    oneScene += "Scene Name: " + inputScene.sceneName + ", "
    oneScene += "Scene Location: " + inputScene.sceneLocation + ", "
    oneScene += "Plot Element: " + inputScene.plotElement + ", "
    oneScene += "Beat: " + inputScene.beat + " "
    return (oneScene)

def combine_scene_descriptions(inputScene):
    output = ""
    for obj in inputScene:
        oneScene = create_scene_description(obj)
        output += oneScene
    return output
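To see what these helpers produce, the sketch below applies the same formatting to a stand-in object. `SimpleNamespace` replaces the Pydantic `scene` model so the example runs without extra dependencies, the beat text is shortened for illustration, and the join-based `format_scene` is an alternative formulation rather than the notebook's own function.

```python
from types import SimpleNamespace


def format_scene(scene) -> str:
    """Flatten one scene into the 'Label: value' text used in the dataset."""
    parts = [
        f"Scene Name: {scene.sceneName}",
        f"Scene Location: {scene.sceneLocation}",
        f"Plot Element: {scene.plotElement}",
        f"Beat: {scene.beat}",
    ]
    # The trailing space separates consecutive scenes when they are concatenated
    return ", ".join(parts) + " "


def format_scenes(scenes) -> str:
    """Concatenate the flattened text of every scene, in order."""
    return "".join(format_scene(s) for s in scenes)


demo = SimpleNamespace(
    sceneName="Introduction",
    sceneLocation="Medea's modest home",
    plotElement="Exposition",
    beat="The Nurse recounts the events.",
)
print(format_scene(demo))
```

Because every field is flattened into one string, a fine-tuned model sees the scene structure only through the `Scene Name:`, `Scene Location:`, `Plot Element:`, and `Beat:` labels, so they should stay consistent across all records.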

In [98]:
for i in range(1, 5):
    # Make calls to OpenAI
    type2_prompts = []
    for scenes in generate_story_scenes(5):
        type2_prompts.append(scenes)

    # Adapt responses to fit instruction dataset needed for fine-tuning
    dataset = []
    for example in type2_prompts:
        data = {}
        data["question"] = "You are a scriptwriter assistant with a flair for creativity and imagination. You have to generate a sequence of scenes based on the log line and characters of the story."
        # Put a space between the logline and the joined character descriptions
        data["context"] = example.logline + " " + " ".join(example.characters)
        data["answer"] = combine_scene_descriptions(example.scenes)
        dataset.append(data)

    # Print/Save dataset
    for item in dataset:
        print(json.dumps(item))

    #Provide a brief pause before making next API call
    time.sleep(5)

{"question": "You are a scriptwriter assistant with a flair for creativity and imagination. You have to generate a sequence of scenes based on the log line and characters of the story.", "context": "In a dystopian future, a lone hacker discovers a government conspiracy to control the minds of its citizens and fights to bring the truth to light.Alex is a reclusive hacker with a mysterious past. Once a prodigy, Alex now lives off the grid, connected only to the underground network of other hackers. Casey is a young journalist who dreams of uncovering the truth and making a name in the field. Casey stumbles upon Alex's findings and joins forces. Director Smith is the head of the government's mind control project, ruthless and intelligent, willing to go to any lengths to protect the secret. Riley is Alex's old mentor who now works for the government. Inside, Riley struggles with conflicting loyalties and guilt. The Resistance is a group of activists and former government agents who have ba
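A fixed `time.sleep(5)` spaces out the calls but does not recover from a request that fails. One common alternative is to retry with exponential backoff. The sketch below is a generic version under that assumption; `call_with_backoff` is a hypothetical helper, and it catches every `Exception` because the specific error types to retry on depend on the client library in use.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying after failures with delays of base_delay * 2**attempt."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == retries:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("retries must be zero or greater")
```

In the loop above, this could wrap the generation step as `call_with_backoff(lambda: list(generate_story_scenes(5)))`. The `sleep` parameter is injectable so that the delays can be checked in a test without actually waiting.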

## Type 3 Prompts - Loglines, characters, scene and dialogues

In this type of prompt, the inputs are the logline, the characters, and a specific scene from the previous list. Based on this information, we want the LLM to generate the screenplay for the scene, including the dialogue between the characters.

In [99]:
## Making use of the Instructor library to generate structured examples

import instructor

from typing import Iterable, List, Tuple
from pydantic import BaseModel, Field, ConfigDict
from openai import OpenAI

class scene(BaseModel):
    """Class to hold values for a specific scene in the story"""
    sceneName: str
    sceneLocation: str
    plotElement: str
    beat: str

    model_config = ConfigDict(
        json_schema_extra={
            "examples": [
                {"sceneName": "Introduction", "sceneLocation": "Medea's modest home", "plotElement": "Exposition", "beat": "The Nurse recounts the chain of events that have turned Medea's world to enmity. The Nurse laments how Jason has abandoned Medea and his own children in order to remarry with the daughter of Creon."},
                {"sceneName": "Medea's plight", "sceneLocation": "Medea's modest home", "plotElement": "Inciting Incident.", "beat": "The Nurse confides in the Tutor and testifies to the emotional shock Jason's betrayal has sparked in Medea. The Tutor shares the Nurse's sympathy for Medea's plight. Medea's first words are cries of helplessness. Medea wishes for her own death."},
                {"sceneName": "Medea's oppression", "sceneLocation": "Outside the Royal Palace.", "plotElement": "Rising Action.", "beat": "Medea pleads to the Nurse that Jason be made to suffer for the suffering he has inflicted upon her. Creon approaches the house and banishes Medea and her children from Corinth. Medea plans on killing her three antagonists, Creon, his daughter and Jason."},
            ]
        }
    )

class sceneDialogue(BaseModel):
    """Class to hold generated values for dialogues in the scene based on the logline and characters"""
    logline: str
    characters: List[str]
    scene: scene
    dialogues: str
    
    model_config = ConfigDict(
        json_schema_extra={
            "examples": [
                {
                    "logline": "Ancient Greek tragedy based upon the myth of Jason and Medea. Medea, a former princess and the wife of Jason, finds her position in the Greek world threatened as Jason leaves Medea for a Greek princess of Corinth. Medea takes vengeance on Jason by murdering his new wife as well as Medea's own two sons, after which she escapes to Athens.", 
                    "characters": [
                        ("Medea is the protagonist of the play. A sorceress and a princess, she fled her country and family to live with Jason in Corinth, where they established a family of two children and gained a favorable reputation. Jason has divorced Medea and taken up with a new family."),
                        ("Jason is considered the play's villain, though his evil stems more from weakness than strength. A former adventurer, Jason abandons his wife, Medea, in order to marry the beautiful young daughter of Creon, King of Corinth, and fuels Medea to a revenge."),
                        ("The Women of Corinth act as commentators on the action. They fully sympathize with Medea's plight, except for her decision to murder her own children."),
                        ("Creon, the King of Corinth, banishes Medea from the city."),
                        ("The Nurse is the caretaker of the house and of the children and serves as Medea's confidant.")],
                    "scene": {"sceneName": "Medea's Revenge", "sceneLocation": "Outside the Royal Palace.", "plotElement": "Resolution", "beat": " The palace opens its doors, revealing Medea and the two dead children seated in a chariot drawn by dragons. Jason curses himself for having wed Medea and mourns his tragic losses . Medea denies Jason the right to a proper burial of his children . Medea flees to Athens and divines an unheroic death for Jason ."},
                    "dialogues": "WOMEN OF CORINTH Throw wide the doors and see thy children 's murdered corpses . JASON Haste , ye slaves , loose the bolts , undo the fastenings , that I may see the sight of twofold woe , my murdered sons and her , whose blood in vengeance I will shed . ( MEDEA appears above the house , on a chariot drawn by dragons ; the children 's corpses are beside her .) MEDEA Why shake those doors and attempt to loose their bolts , in quest of the dead and me their murderess ? From such toil desist . If thou wouldst aught with me , say on , if so thou wilt ; but never shalt thou lay hand on me , so swift the steeds the sun , my father 's sire , to me doth give to save me from the hand of my foes . JASON Accursed woman ! by gods , by me and all mankind abhorred as never woman was , who hadst the heart to stab thy babes , thou their mother , leaving me undone and childless ; this hast thou done and still dost gaze upon the sun and earth after this deed most impious . Curses on thee ! now perceive what then I missed in the day I brought thee , fraught with doom , from thy home in a barbarian land to dwell in Hellas , traitress to thy sire and to the land that nurtured thee . Perish , vile sorceress , murderess of thy babes ! Whilst I must mourn my luckless fate , for I shall ne 'er enjoy my new - found bride , nor shall I have the children , whom I bred and reared , alive to say the last farewell to me ; nay , I have lost them . MEDEA To this thy speech I could have made a long reply , but Father Zeus knows well all I have done for thee , and the treatment thou hast given me . Yet thou wert not ordained to scorn my love and lead a life of joy in mockery of me , nor was thy royal bride nor Creon , who gave thee a second wife , to thrust me from this land and rue it not . Wherefore , if thou wilt , call me e'en a lioness , and Scylla , whose home is in the Tyrrhene land ; for I in turn have wrung thy heart , as well I might . "
                                 "JASON Thou , too , art grieved thyself , and sharest in my sorrow . MEDEA Be well assured I am ; but it relieves my pain to know thou canst not mock at me . JASON O my children , how vile a mother ye have found ! MEDEA My sons , your father 's feeble lust has been your ruin ! JASON 'Twas not my hand , at any rate , that slew them . MEDEA No , but thy foul treatment of me , and thy new marriage . JASON Didst think that marriage cause enough to murder them ? MEDEA Dost think a woman counts this a trifling injury ? JASON So she be self - restrained ; but in thy eyes all is evil . MEDEA Thy sons are dead and gone . That will stab thy heart ."
                }
            ]
        }
    )

In [101]:
# Patch the OpenAI client to enable the response_model functionality
client = instructor.from_openai(OpenAI(api_key="API_KEY"))

def generate_scene_dialogs(count: int) -> Iterable[sceneDialogue]:
    return client.chat.completions.create(
        model="gpt-4o",
        response_model=Iterable[sceneDialogue],
        messages=[
            {"role": "user", "content": f"You are responsible for generating synthetic data for a script-writing task. Be creative in your approach and add diversity to the examples by considering \
                                        various genres, different audience types, settings at different locations around the world and people from all walks of life. \
                                        Generate `{count}` synthetic examples"},
        ],
    )



In [106]:
for i in range(1, 5):
    # Make the OpenAI calls
    type3_prompts = []
    for dialog in generate_scene_dialogs(5):
        type3_prompts.append(dialog)

    # Some post-processing to adapt the examples to the JSONL format - the question prompt changes for each section/type of examples
    dataset = []
    for example in type3_prompts:
        data = {}
        data["question"] = "You are a scriptwriter assistant with a flair for creativity and imagination. You have to generate dialogues for a scene based on the log line and characters of the story."
        # Put a space between the logline, the character descriptions, and the scene description
        data["context"] = example.logline + " " + " ".join(example.characters) + " " + create_scene_description(example.scene)
        data["answer"] = example.dialogues
        dataset.append(data)

    # Dump/Save data to file
    for item in dataset:
        print(json.dumps(item))

    # Wait a bit before making the next call to the API
    time.sleep(5)

{"question": "You are a scriptwriter assistant with a flair for creativity and imagination. You have to generate dialogues for a scene based on the log line and characters of the story.", "context": "In a dystopian future, a group of rebels fights against a tyrannical regime that controls the last remaining city on Earth.Zephyr - The charismatic leader of the rebel group, determined to overthrow the regime. Nia - A tech genius and the strategic brain behind the rebel operations. Orion - A former soldier who now fights for the rebellion. The Overseer - The ruthless leader of the regime. Haven - A double agent whose true loyalties are unknown. Scene Name: The Secret Meeting, Scene Location: Abandoned Warehouse, Plot Element: Inciting Incident, Beat: The rebels gather to plan their next move against the regime. Haven arrives with crucial information. ", "answer": "ZEPHYR We can't keep hiding forever. We need to strike now, while they're vulnerable. NIA I've hacked into their mainframe. Th
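Before records like the ones printed above are used for fine-tuning, it may be worth checking that each one has the three expected fields with non-blank values, since a generated example can occasionally come back incomplete. The sketch below is one way to do that; `is_valid_record` and `split_valid_records` are hypothetical helpers.

```python
REQUIRED_KEYS = ("question", "context", "answer")


def is_valid_record(record: dict) -> bool:
    """True when every required field is present as a non-blank string."""
    return all(
        isinstance(record.get(key), str) and bool(record[key].strip())
        for key in REQUIRED_KEYS
    )


def split_valid_records(records: list[dict]) -> tuple[list[dict], list[dict]]:
    """Return (valid, rejected) so rejected records can be inspected, not silently dropped."""
    valid = [r for r in records if is_valid_record(r)]
    rejected = [r for r in records if not is_valid_record(r)]
    return valid, rejected
```

Calling `valid, rejected = split_valid_records(dataset)` after each batch keeps the rejected records available for inspection, which makes it easier to spot a prompt that systematically produces empty answers.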