<a href="https://colab.research.google.com/github/withpi/cookbook-withpi/blob/main/colabs/Generate_Synthetic_Input.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<a href="https://withpi.ai"><img src="https://play.withpi.ai/logo/logoFullBlack.svg" width="240"></a>

<a href="https://code.withpi.ai"><font size="4">Documentation</font></a>

<a href="https://build.withpi.ai"><font size="4">Copilot</font></a>

# Generate Synthetic Input

Many techniques require input data to drive evaluation and training, but getting high-quality data can be painful and expensive.

Generating this data with AI assistance can give you a higher-quality set with much less effort.

## Install and initialize SDK

You'll need a `WITHPI_API_KEY` from https://build.withpi.ai/account. Add it to your notebook secrets (the key icon in the left sidebar).

Run the cell below to install the packages and load the SDK:

In [1]:
%%capture

%pip install withpi withpi-utils datasets

import os
from google.colab import userdata
from withpi import PiClient

# Load the notebook secret into the environment so the Pi Client can access it.
os.environ["WITHPI_API_KEY"] = userdata.get('WITHPI_API_KEY')

pi = PiClient()
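
The cell above relies on `google.colab`, which is only available inside Colab. If you run this notebook elsewhere, one option is to export `WITHPI_API_KEY` in your shell and check for it before creating the client. The helper below is a minimal sketch under that assumption; `require_api_key` is a hypothetical name and not part of the Pi SDK.

```python
import os


def require_api_key(name: str = "WITHPI_API_KEY") -> str:
    """Return the key stored in the environment variable `name`.

    Hypothetical helper (not part of the Pi SDK): it fails early with a
    readable message instead of letting a later API call fail.
    """
    key = os.environ.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Export it in your environment before creating the client."
        )
    return key
```

Calling `require_api_key()` just before `pi = PiClient()` would surface a missing key immediately.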

## Generate an Input Set

Let's say we want to build an AI that generates stories in the style of Aesop's Fables. We can generate a dataset of prompts, each asking for a story about a different moral lesson, and use it to exercise that AI.

In [4]:
from withpi_utils import stream

data_generation_status = pi.data.generate.start_job(
    application_description="""
Write a children's story in the style of Aesop's Fables teaching a life lesson
specified by the user. Provide just the story with no extra content.
""",
    num_inputs_to_generate=9,
    seeds=[],
    batch_size=3,
    num_shots=3,
)

input_set = []

for data in stream(pi.data.generate, data_generation_status):
  input_set.append(data)
  print(f"[OUTPUT] - {data}")

LAUNCHING
[INFO] Generating 9 seeds as they are not provided.
[INFO] Yielding generated 9 seeds
[INFO] Data Generation Complete => Good Inputs: 9. Bad Inputs: 0. Similar Inputs: 0
DONE
[OUTPUT] - Write a story teaching the lesson 'honesty is the best policy'.
[OUTPUT] - Create a fable illustrating why patience is important.
[OUTPUT] - Tell a story about the dangers of greed in an animal setting.
[OUTPUT] - Write a tale showing the value of teamwork through animals overcoming a challenge together.
[OUTPUT] - Create a story that conveys the moral 'slow and steady wins the race'.
[OUTPUT] - Write a fable that teaches kids about the importance of kindness.
[OUTPUT] - Tell a tale that demonstrates why procrastination can lead to consequences.
[OUTPUT] - Create a story teaching that appearances can be deceiving.
[OUTPUT] - Write a fable explaining why helping others can bring unexpected rewards.


## Next Steps

You can use this input set to drive your evaluation or training workflows. Try changing the parameters passed to this API call (see https://code.withpi.ai/ for details) to see how the generated inputs change.
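
Before handing the inputs to another workflow, it can help to save them to disk. The sketch below writes them to a JSON Lines file and reads them back using only the standard library. It assumes each generated input can be treated as plain text, as the printed output above suggests; `save_inputs`, `load_inputs`, the `"input"` field name, and the file name are illustrative choices, not part of the Pi SDK.

```python
import json
import tempfile
from pathlib import Path


def save_inputs(inputs, path):
    """Write one JSON object per line, skipping blanks and exact duplicates."""
    seen = set()
    written = 0
    with Path(path).open("w", encoding="utf-8") as f:
        for item in inputs:
            text = str(item).strip()  # assumption: each input renders as plain text
            if not text or text in seen:
                continue
            seen.add(text)
            f.write(json.dumps({"input": text}, ensure_ascii=False) + "\n")
            written += 1
    return written


def load_inputs(path):
    """Read the inputs back in the order they were written."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line)["input"] for line in f if line.strip()]


# Sample values copied from the output above; in the notebook, pass `input_set` instead.
sample = [
    "Create a fable illustrating why patience is important.",
    "Create a story teaching that appearances can be deceiving.",
    "Create a fable illustrating why patience is important.",  # duplicate, skipped
]
out_path = Path(tempfile.gettempdir()) / "fable_inputs.jsonl"
count = save_inputs(sample, out_path)
print(f"Saved {count} inputs to {out_path}")
```

JSON Lines keeps one record per line, so the file is easy to append to, diff, and load into other tools later.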