<a href="https://colab.research.google.com/github/Bryan-Az/Advanced-Prompt-Engineering/blob/main/notebooks/prompt_techniques/In_Context_Learning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Applying In-Context-Learning to Prompt Template Generation

## Imports and Installs

In [None]:
%%capture
!pip install datasets openai

In [65]:
import json
import datasets
import openai
from google.colab import userdata
import os

## Loading the Sample Template and ICL Sample Data

In [31]:
# downloading the icl sample json from github
!wget https://raw.githubusercontent.com/Bryan-Az/Advanced-Prompt-Engineering/refs/heads/main/prompts/ICL_sample_structure.json -O ICL_sample_structure.json

--2024-11-30 00:27:36--  https://raw.githubusercontent.com/Bryan-Az/Advanced-Prompt-Engineering/refs/heads/main/prompts/ICL_sample_structure.json
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.111.133, 185.199.109.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 594 [text/plain]
Saving to: ‘ICL_sample_structure.json’


2024-11-30 00:27:37 (17.8 MB/s) - ‘ICL_sample_structure.json’ saved [594/594]



In [45]:
#loading the sample ICL template
with open('ICL_sample_structure.json', 'r') as file:
    sample_structure = json.load(file)

In [90]:
display(sample_structure)

{'domain_1': [{'prompt_1_components': {'prompt': 'prompt text with question, variable_1 is text, variable_2 is 0.',
    'data_context': {'variable_1': 'text', 'variable_2': 0}},
   'test_case': {'eval_prompt_1': 'eval prompt with question, variable_1 is text_1, variable_2 is 1.',
    'data_context': {'variable_1': 'text_1', 'variable_2': 1},
    'expected_output': 'response text with answer, making reference to variable_1 as text_1 and variable_2 as 1'}}]}
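Because the model is later asked to reproduce this layout for other domains, it can help to check that a generated object still has the same shape. The following is a minimal sketch, not part of the original notebook: `find_structure_problems` is a hypothetical helper, and it assumes that every block other than `test_case` is a prompt-components block.

```python
def find_structure_problems(icl_prompt):
    """Return a list of problems; an empty list means the shape matches the template."""
    problems = []
    for domain, entries in icl_prompt.items():
        if not isinstance(entries, list):
            problems.append(f"{domain}: expected a list of entries")
            continue
        for i, entry in enumerate(entries):
            where = f"{domain}[{i}]"
            if not isinstance(entry, dict):
                problems.append(f"{where}: expected a dict")
                continue
            # Assumption: every block except 'test_case' holds prompt components.
            for name, block in entry.items():
                if name == "test_case":
                    required = ("data_context", "expected_output")
                else:
                    required = ("prompt", "data_context")
                for key in required:
                    if not isinstance(block, dict) or key not in block:
                        problems.append(f"{where}.{name}: missing '{key}'")
            if "test_case" not in entry:
                problems.append(f"{where}: missing 'test_case'")
    return problems


# The template shown above, copied here so the sketch is self-contained.
template = {
    "domain_1": [{
        "prompt_1_components": {
            "prompt": "prompt text with question, variable_1 is text, variable_2 is 0.",
            "data_context": {"variable_1": "text", "variable_2": 0},
        },
        "test_case": {
            "eval_prompt_1": "eval prompt with question, variable_1 is text_1, variable_2 is 1.",
            "data_context": {"variable_1": "text_1", "variable_2": 1},
            "expected_output": "response text with answer, making reference to variable_1 as text_1 and variable_2 as 1",
        },
    }]
}

# A deliberately incomplete object, to show what the checker reports.
broken = {"healthcare": [{"prompt_1_components": {"prompt": "x"}}]}

template_problems = find_structure_problems(template)
broken_problems = find_structure_problems(broken)
```

Running the checker on each parsed model response before saving would flag responses that dropped a required key.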

In [42]:
# loading a sample dataset for use as context
roots_dataset = datasets.load_dataset('Alexis-Az/math_datasets', 'roots')

roots/Roots.csv:   0%|          | 0.00/5.31M [00:00<?, ?B/s]

Generating train split:   0%|          | 0/10000 [00:00<?, ? examples/s]
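Slicing a Hugging Face `Dataset` (for example `roots_dataset['train'][:3]`) returns a dict that maps each column name to a list of values, not a list of rows. If row-shaped examples read better in a prompt, the columns can be zipped back together. A minimal sketch; `columns_to_rows` is a hypothetical helper and the column names below are made up, since the real columns of the `roots` config are not shown in this notebook.

```python
def columns_to_rows(batch):
    """Turn {'col': [v1, v2]} into [{'col': v1}, {'col': v2}]."""
    names = list(batch)
    return [dict(zip(names, values)) for values in zip(*batch.values())]


# Hypothetical column names and values, standing in for a sliced dataset.
batch = {"question": ["sqrt(16)", "sqrt(81)"], "answer": [4, 9]}
rows = columns_to_rows(batch)
```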

## Initializing the OpenAI Model API Class

In [66]:
openai_key = userdata.get("openai")
os.environ["OPENAI_API_KEY"] = openai_key

In [70]:
model_name = 'gpt-3.5-turbo'
def get_completion(prompt, model=model_name):
  """Generates text using the OpenAI API.

  Args:
    prompt: The prompt to send to the API.
    model: The name of the model to use.

  Returns:
    The generated text.
  """
  response = openai.chat.completions.create(
      model=model,
      messages=[{"role": "user", "content": prompt}],
      temperature=0.01
  )
  return response.choices[0].message.content
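`get_completion` sends the whole prompt as a single user message. Since the chat API accepts a list of role-tagged messages, in-context examples could alternatively be passed as alternating user/assistant turns. A minimal sketch of building such a list; `build_few_shot_messages` and the example strings are hypothetical and not part of the notebook.

```python
def build_few_shot_messages(examples, question):
    """Interleave (prompt, answer) pairs as user/assistant turns, then add the real question."""
    messages = []
    for prompt, answer in examples:
        messages.append({"role": "user", "content": prompt})
        messages.append({"role": "assistant", "content": answer})
    messages.append({"role": "user", "content": question})
    return messages


# Hypothetical examples for illustration only.
few_shot = [
    ("What is the square root of 16?", "4"),
    ("What is the square root of 81?", "9"),
]
messages = build_few_shot_messages(few_shot, "What is the square root of 49?")
```

The resulting list has the same shape as the `messages` argument used in `get_completion`, so it could be passed in its place.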

## Using the Model API to Generate ICL Prompts for 3 Domains

In [82]:
# calling the OpenAI API to use this template to generate a sample ICL prompt for healthcare
# json.dumps serializes the template as valid JSON (double quotes) rather than a Python dict repr
ICL_prompt_generation_prompt_healthcare = f"Can you help me use this sample JSON template to generate in-context learning JSON prompts for healthcare? Remember to change the key name of the domain to reflect healthcare. Template: {json.dumps(sample_structure)}"
prompt_1 = get_completion(ICL_prompt_generation_prompt_healthcare)

In [83]:
# calling the OpenAI API to use this template to generate a sample ICL prompt for software engineering
# json.dumps serializes the template as valid JSON (double quotes) rather than a Python dict repr
ICL_prompt_generation_prompt_software_engineering = f"Can you help me use this sample JSON template to generate in-context learning JSON prompts for software engineering? Remember to change the key name of the domain to reflect software. Template: {json.dumps(sample_structure)}"
prompt_2 = get_completion(ICL_prompt_generation_prompt_software_engineering)

In [84]:
# calling the OpenAI API to use this template to generate a sample ICL prompt for math
examples = roots_dataset['train'][:3]
# json.dumps serializes the template and the examples as valid JSON rather than Python reprs
ICL_prompt_generation_prompt_math = f"Can you help me use this sample JSON template to generate in-context learning JSON prompts for math? Remember to change the key name of the domain to reflect math. Template: {json.dumps(sample_structure)} Examples: {json.dumps(examples)}"
prompt_3 = get_completion(ICL_prompt_generation_prompt_math)
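The three generation prompts above differ only in the domain name and the optional examples, so they could be produced by one function. A minimal sketch, assuming the same wording as the cells above; `build_generation_prompt` is a hypothetical helper, and the small template here is a stand-in for `sample_structure`.

```python
import json


def build_generation_prompt(domain, template, examples=None):
    """Build the ICL-generation prompt for one domain, embedding the template as JSON."""
    prompt = (
        "Can you help me use this sample JSON template to generate "
        f"in-context learning JSON prompts for {domain}? "
        f"Remember to change the key name of the domain to reflect {domain}. "
        f"Template: {json.dumps(template)}"
    )
    if examples is not None:
        prompt += f" Examples: {json.dumps(examples)}"
    return prompt


# Stand-in template and examples, for illustration only.
demo_template = {"domain_1": [{"test_case": {"expected_output": "text"}}]}
healthcare_prompt = build_generation_prompt("healthcare", demo_template)
math_prompt = build_generation_prompt("math", demo_template, examples={"x": [4, 9]})
```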

In [86]:
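# using a regex to select the JSON object from the text output, then parsing it
import re
def extract_json_from_text(text):
    # greedy match from the first '{' to the last '}' in the response
    json_match = re.search(r'\{.*\}', text, re.DOTALL)
    if json_match is None:
        return None
    try:
        # parse here so that dicts, not JSON-encoded strings, are written to file later
        return json.loads(json_match.group())
    except json.JSONDecodeError:
        return None
# applying it to the 3 model responses
prompts = [prompt_1, prompt_2, prompt_3]
ICL_prompts = [extract_json_from_text(prompt) for prompt in prompts]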
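The regular expression in the cell above is greedy, so it captures everything from the first `{` to the last `}` in the response; if the model adds trailing text that itself contains braces, the match is no longer valid JSON. One alternative is to let the standard library's `json.JSONDecoder.raw_decode` parse the first complete object it can find. A minimal sketch; `extract_first_json_object` is a hypothetical helper and the sample response is made up.

```python
import json


def extract_first_json_object(text):
    """Return the first parseable JSON object in text, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores what follows.
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # This '{' did not begin valid JSON; try the next one.
            start = text.find("{", start + 1)
    return None


# Made-up response with trailing braces that would break a greedy regex match.
response = 'Sure! {"math": [{"a": 1}]} Hope that helps {not json}'
parsed = extract_first_json_object(response)
```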

In [89]:
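# writing the output to prompts/field_specific_prompts_ICL.json (mode 'w' overwrites any existing file)
os.makedirs('prompts', exist_ok=True)  # the directory does not exist in a fresh Colab runtime
with open('prompts/field_specific_prompts_ICL.json', 'w') as file:
    json.dump(ICL_prompts, file, indent=4)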
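Opening the file in `'w'` mode replaces whatever it already contains. If the intent is to add new prompts to an existing file, one option is to read the current list, extend it, and write it back. A minimal sketch; `append_prompts` is a hypothetical helper, and it assumes the file holds a JSON list. A temporary directory is used here so the example does not touch real files.

```python
import json
import os
import tempfile


def append_prompts(new_prompts, path):
    """Append new_prompts to the JSON list stored at path, creating the file if needed."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    existing = []
    if os.path.exists(path):
        with open(path) as file:
            existing = json.load(file)
    combined = existing + list(new_prompts)
    with open(path, "w") as file:
        json.dump(combined, file, indent=4)
    return combined


# Demonstration in a temporary directory with placeholder prompts.
out_path = os.path.join(tempfile.mkdtemp(), "prompts", "field_specific_prompts_ICL.json")
append_prompts([{"healthcare": []}], out_path)
append_prompts([{"software": []}, {"math": []}], out_path)
with open(out_path) as file:
    saved = json.load(file)
```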