# AI/LLM for Devs, Week 3 Experiment notebook

This notebook captures a series of experiments to create high-quality, effective training sets for fine-tuning LLMs on knowledge.




## Experiment #1 - Training a series of facts

The purpose of this experiment is to explore teaching GPT-3.5 Turbo a series of facts via fine-tuning, and rate its performance and level of hallucination.

### Step 1: Choose 10 facts

Choose a topic that ChatGPT doesn't already know, such as your personal history, your company, or some niche topic.

1. Praneeth Yerrapragada is studying spoken Sanskrit language from Samskrita Bharati USA.
2. Praneeth Yerrapragada has hiked to the top of Half Dome in Yosemite in 2017.
3. Praneeth Yerrapragada graduated from the University of Southern California majoring in Electrical / Computer Engineering in 2014.
4. Praneeth Yerrapragada has a wife and two children.
5. Praneeth Yerrapragada has lived in Los Angeles, California.
6. Praneeth Yerrapragada has travelled to Antarctica in 2017.
7. Praneeth Yerrapragada was born in Hyderabad, India.
8. Praneeth Yerrapragada practices Yoga and Meditation regularly.
9. Praneeth Yerrapragada's parents live in Hyderabad, India.
10. Praneeth Yerrapragada loves to spend time with his family.

### Step 2: Design your evaluation questions

Enumerate 10 questions below. Start with straightforward questions that are close to the core facts, then expand to contextually related questions, then out-of-scope questions. These questions will be your performance benchmark.

1. Where did Praneeth Yerrapragada graduate from?
2. What was Praneeth Yerrapragada's major?
3. What does Praneeth Yerrapragada love to do?
4. How many children does Praneeth Yerrapragada have?
5. What was Praneeth Yerrapragada's favorite hobby?
6. Does Praneeth Yerrapragada like to travel?
7. Where is Praneeth Yerrapragada from?
8. Is Praneeth Yerrapragada married?
9. Is Praneeth Yerrapragada a happy individual?
10. What does Praneeth Yerrapragada like to eat?

### Step 3: Generate your initial training set

Generate a naive training set, using something like the prompt below. Start by generating 3 question/response pairs for each fact.

```
Based on the following fact, generate an array of {n} variations of question-answer pairs.
Each pair should be formatted as a JSON object with "messages" containing "user" and "assistant" roles.
Ensure that the output is in JSON format.

Each question should be unique, clearly phrased, and reflect how users might ask about this fact.
The corresponding answer should be accurate, contextually relevant, and phrased differently from the other answers.
Ensure diversity in question types (who, what, where, when, why) and avoid repetitive phrasing.

Fact: "{fact}"

Example output format:

{{"data": [{{"messages": [{{"role": "user", "content": "What is the capital of France?"}}, {{"role": "assistant", "content": "The capital of France is Paris."}}]}},
{{"messages": [{{"role": "user", "content": "Who wrote 'Romeo and Juliet'?"}}, {{"role": "assistant", "content": "The author of 'Romeo and Juliet' is William Shakespeare."}}]}},
{{"messages": [{{"role": "user", "content": "How far is the Moon from Earth?"}}, {{"role": "assistant", "content": "The distance from the Moon to Earth is approximately 384,400 kilometers."}}]}}]
}}
```

See the code cell later in this experiment for a snippet that passes this prompt to the OpenAI API. Note: add a $5 credit to your OpenAI platform account: https://platform.openai.com/settings/organization/billing/overview

### Step 4: Create a GPT-3.5 Turbo fine-tuned model

Use the web interface to add your training and validation data: https://platform.openai.com/finetune

Initially, use the default hyperparameters.
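
Before uploading, it can help to check that every line of the training and validation files has the expected shape. The following is a minimal sketch, assuming the chat format used in this notebook (a `messages` list of `role`/`content` objects with only `system`, `user`, and `assistant` roles, ending in an assistant message). `check_record` and `validate_chat_jsonl` are hypothetical helpers written for this example, not part of the OpenAI SDK, and they do not replace the platform's own validation.

```python
import json

# Assumption: this notebook's data only uses these three roles.
ALLOWED_ROLES = {"system", "user", "assistant"}

def check_record(record):
    """Return a problem description for one parsed record, or None if it looks fine."""
    if not isinstance(record, dict):
        return "record is not a JSON object"
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return "missing or empty 'messages' list"
    for message in messages:
        if not isinstance(message, dict):
            return "message is not a JSON object"
        if message.get("role") not in ALLOWED_ROLES:
            return f"unexpected role: {message.get('role')!r}"
        if not isinstance(message.get("content"), str):
            return "message 'content' is not a string"
    if messages[-1]["role"] != "assistant":
        return "last message is not from the assistant"
    return None

def validate_chat_jsonl(lines):
    """Return (line_number, problem) tuples for an iterable of JSONL lines."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # blank lines are skipped rather than reported
        try:
            record = json.loads(line)
        except json.JSONDecodeError as error:
            problems.append((number, f"invalid JSON: {error.msg}"))
            continue
        problem = check_record(record)
        if problem:
            problems.append((number, problem))
    return problems
```

To check a file, pass the open file handle, for example `validate_chat_jsonl(open('../data/facts_training.jsonl'))`; an empty list means no problems were found by these checks.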

### Step 5: Evaluation

Ask the evaluation questions you created in Step 2, and note accuracy, hallucination, and overfitting.
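
One way to keep the evaluation repeatable is to run all questions through a single function and record the answers for manual grading. This is a sketch under stated assumptions: `run_evaluation` and `rows_to_csv` are hypothetical helpers, and `ask` is any callable that maps a question to an answer, so the same loop works for the base model, a fine-tuned model, or a stub.

```python
import csv
import io

def run_evaluation(questions, ask):
    """Ask every question once and return rows ready for manual grading.

    `ask` is any callable that maps a question string to an answer string.
    """
    rows = []
    for number, question in enumerate(questions, start=1):
        rows.append({
            "number": number,
            "question": question,
            "answer": ask(question),
            "grade": "",  # filled in by hand: correct / hallucinated / declined
        })
    return rows

def rows_to_csv(rows):
    """Serialize graded rows to CSV text so runs can be compared side by side."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["number", "question", "answer", "grade"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Example `ask` for a fine-tuned model. It assumes an OpenAI client named
# `client` and uses a placeholder model ID; neither is defined in this sketch.
# def ask(question):
#     response = client.chat.completions.create(
#         model="ft:gpt-3.5-turbo:your-org::your-job-id",  # hypothetical ID
#         messages=[{"role": "user", "content": question}],
#         temperature=0,
#     )
#     return response.choices[0].message.content
```

Saving one CSV per fine-tuning run makes it easier to see whether a change in training data or epochs actually moved accuracy or hallucination.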

### Step 6: Improving quality

For each issue you discovered in Step 5, create additional training data to address the issue.

- Create new prompts to expand your training data
- Manually review and fix or delete bad training data

Improvements will mostly be made through improving the training data, but also run the following variations:

- Cut your training data in half, and compare results. That is a predictor for the increase in quality if you double your training data.
- If the model isn't learning your data, try increasing your epochs by 1 or 2
- If the model is overfitting, try reducing your epochs by 1 or 2
- Try halving and doubling your epochs to explore those effects
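
The "cut your training data in half" variation can be made reproducible with a seeded random sample. A minimal sketch; `halve_examples` is a hypothetical helper and the seed value is arbitrary.

```python
import random

def halve_examples(examples, seed=0):
    """Return a reproducible random half of the examples, keeping their original order."""
    rng = random.Random(seed)  # local generator, so the global random state is untouched
    keep = len(examples) // 2
    chosen = sorted(rng.sample(range(len(examples)), keep))
    return [examples[i] for i in chosen]
```

Sampling at random, rather than taking the first half of the file, avoids dropping every example for the facts that happen to come last. For the epoch variations, jobs created through the API rather than the web interface should accept the epoch count through the `hyperparameters` argument of `client.fine_tuning.jobs.create` (for example `{"n_epochs": 2}`); check the current API reference before relying on that.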

### Conclusion

Were you able to train on the additional facts, while minimizing hallucination? (you'll never fully eliminate hallucination)

My observation about adding facts: some facts will always remain outside the scope of training. As the question above notes, hallucination can never be fully eliminated without overfitting the model. Finding the right balance between overfitting and hallucination requires evaluating the fine-tuning run, that is, knowing when to stop training based on the training and validation losses. However, I couldn't figure out how to do that here: the fine-tuning dataset is very small, so the number of epochs, batches, etc. is already low when the hyperparameters are set to `auto`.
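
As a rough illustration of "knowing when to stop", the sketch below picks the training step with the lowest validation loss from a list of per-step metrics. The field names `step`, `train_loss`, and `valid_loss` are assumptions about how the metrics were exported, and the numbers are made up for the example; they are not results from this experiment.

```python
def best_validation_step(metrics):
    """Return the row with the lowest validation loss among rows that report one."""
    with_validation = [row for row in metrics if row.get("valid_loss") is not None]
    if not with_validation:
        return None
    return min(with_validation, key=lambda row: row["valid_loss"])

def overfitting_gap(row):
    """Validation loss minus training loss; a growing gap suggests overfitting."""
    return row["valid_loss"] - row["train_loss"]

# Made-up values for illustration only.
example_metrics = [
    {"step": 10, "train_loss": 2.0, "valid_loss": 2.25},
    {"step": 20, "train_loss": 1.0, "valid_loss": 1.5},
    {"step": 30, "train_loss": 0.5, "valid_loss": 1.75},
]

best = best_validation_step(example_metrics)
print(best["step"], overfitting_gap(example_metrics[-1]))
```

In this made-up run the training loss keeps falling while the validation loss turns back up after step 20, which is the pattern that would argue for fewer epochs. With a very small validation set the curve is noisy, so a single minimum should be read with caution.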

When I probed the model about a hallucinated answer, it kept saying that `{fact(s)} were mentioned in his conversation with me`, even though that fact was not in the training data. I questioned the model about a few other hallucinated responses and got the same reply each time. I'm not sure what that means.

```text
User: how do you know that Praneeth likes to eat South Indian food?
Assistant: Praneeth Yerrapragada's preference for South Indian food has been mentioned in his personal preferences.
User: where were his personal preferences mentioned?
Assistant: Praneeth Yerrapragada's personal preferences were mentioned in his conversation with me.
```



In [5]:

import json
import os
from openai import OpenAI

from dotenv import load_dotenv
load_dotenv('../.env')

facts = []
with open('../data/facts.jsonl', 'r') as file:
    for line in file:
        facts.append(json.loads(line))

api_key = os.getenv('OPENAI_API_KEY')
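# Never print the key itself: notebook output is saved and shared with the file
print("OPENAI_API_KEY loaded:", bool(api_key))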
client = OpenAI(api_key=api_key)

def generate_qa(fact, n=20):
    prompt_text = f"""
    Based on the following fact, generate an array of {n} variations of question-answer pairs.
    Each pair should be formatted as a JSON object with "messages" containing "user" and "assistant" roles.
    Ensure that the output is in JSON format.

    Each question should be unique, clearly phrased, and reflect how users might ask about this fact.
    The corresponding answer should be accurate, contextually relevant, and phrased differently from the other answers.
    Ensure diversity in question types (who, what, where, when, why) and avoid repetitive phrasing.

    Fact: "{fact}"

    Example output format:

    {{"data": [{{"messages": [{{"role": "user", "content": "What is the capital of France?"}}, {{"role": "assistant", "content": "The capital of France is Paris."}}]}},
    {{"messages": [{{"role": "user", "content": "Who wrote 'Romeo and Juliet'?"}}, {{"role": "assistant", "content": "The author of 'Romeo and Juliet' is William Shakespeare."}}]}},
    {{"messages": [{{"role": "user", "content": "How far is the Moon from Earth?"}}, {{"role": "assistant", "content": "The distance from the Moon to Earth is approximately 384,400 kilometers."}}]}}]
    }}
    """

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a helpful assistant tasked with generating training data for fine-tuning a gpt-3.5-turbo model in JSON format"},
            {"role": "user", "content": prompt_text}
        ],
        response_format={"type": "json_object"},
        temperature=0.5
    )

    print(response.choices[0].message.content.strip())
    try:
        qa_array = json.loads(response.choices[0].message.content.strip())["data"]
    except (json.JSONDecodeError, KeyError) as e:
        # KeyError covers valid JSON that lacks the top-level "data" key
        print("Failed to parse model output:", e)
        return [], []

    # Splitting the generated QA pairs into training and validation sets
    validation_size = int(len(qa_array) * 0.2)
    validation_set = qa_array[:validation_size]
    training_set = qa_array[validation_size:]

    return training_set, validation_set


training_set = []
validation_set = []

for fact in facts:
    training, validation = generate_qa(fact['fact'])
    training_set.extend(training)
    validation_set.extend(validation)

with open('../data/facts_training.jsonl', 'w') as train_outfile:
    for qa in training_set:
        train_outfile.write(json.dumps(qa) + '\n')

with open('../data/facts_validation.jsonl', 'w') as valid_outfile:
    for qa in validation_set:
        valid_outfile.write(json.dumps(qa) + '\n')

{
  "data": [
    {
      "messages": [
        {
          "role": "user",
          "content": "Who is studying spoken Sanskrit language from Samskrita Bharati USA?"
        },
        {
          "role": "assistant",
          "content": "Praneeth Yerrapragada is studying spoken Sanskrit language from Samskrita Bharati USA."
        }
      ]
    },
    {
      "messages": [
        {
          "role": "user",
          "content": "What language is Praneeth Yerrapragada studying?"
        },
        {
          "role": "assistant",
          "content": "Praneeth Yerrapragada is studying the spoken Sanskrit language."
        }
      ]
    },
    {
      "messages": [
        {
          "role": "user",
          "content": "Where is Praneeth Yerrapragada learning spoken Sanskrit?"
        },
        {
          "role": "assistant",
          "content": "He is learning spoken Sanskrit from Samskrita Bharati USA."
 

## Experiment #2 - Group project training data

Assemble initial training data for your group project. For example, identify potential use cases, such as:

- Additional app or domain-specific knowledge
- Behavioral modification
- Tool usage

If your task is behavioral modification or tool usage, do the following steps:

1. Design a detailed prompt describing the desired behavior, or usage of the tool
2. Use the prompt in real situations on GPT-4o
3. Create training data by:
   - Loading the original prompt as a system message, along with a brief context of the conversation
   - Adding snippets of the recorded conversation
4. Start with 5 "real" interactions, and evaluate the model performance
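
Step 3 above can be sketched as a small helper that turns a system prompt, a short context, and recorded turns into one chat-format training example. `build_example` and the `Context:` label are assumptions made for this illustration, not a required format.

```python
import json

def build_example(system_prompt, context, turns):
    """Build one chat-format training example.

    `turns` is a list of (role, content) tuples copied from a recorded conversation.
    """
    system_content = system_prompt.strip()
    if context:
        # Assumption: the brief context is appended to the system message
        system_content += "\n\nContext: " + context.strip()
    messages = [{"role": "system", "content": system_content}]
    for role, content in turns:
        if role not in ("user", "assistant"):
            raise ValueError(f"unexpected role: {role!r}")
        messages.append({"role": role, "content": content})
    return {"messages": messages}

example = build_example(
    "You are an accountant skilled at organizing transactions.",
    "The user is categorizing a July bank statement.",
    [
        ("user", "2023-07-02,Grocery Store,-150"),
        ("assistant", "2023-07-02,Grocery Store,-150,Groceries,Personal"),
    ],
)
print(json.dumps(example))  # one line of a JSONL training file
```

Each recorded interaction becomes one line of the JSONL file, so five "real" interactions produce five lines.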


I haven't evaluated model performance yet because I'm still working out whether this is what we want for our project. I'm attaching the prompts used to generate training data, along with the final response from the model.
The project proposal can be found here: https://cottony-crime-058.notion.site/Project-Brief-8ac7094f0d99415d9bd0435e4fd1f997?pvs=4


# Prompts

Created by: Praneeth Yerrapragada
Created time: May 27, 2024 10:47 PM

```text
Assistant1: Training data generator

{
	"role": "system", 
	"content": "You are an accountant skilled at organizing transactions from multiple different bank and credit card statements to prepare a profit and loss statement.
	
	Your task is to label / categorize and organize transactions for an individual or a business according to the specified list of categories. To accomplish building a training dataset, you should generate bank and credit statements of 500 transactions comprising both credits and debits in equal proportion. Make the training set as close to real world bank and credit card statements as possible. Let the transaction date span from 3 months ago till date.
	
Output the training data in CSV format. The output should contain the following columns: 'date', 'name/description', 'amount'.
	"
}
```

```text
Assistant2: Data labeller / categorizer
{
	"role": "system",
	"content": "You are an accountant skilled at organizing transactions from multiple different bank and credit card statements to prepare a profit and loss statement.
	
	You are given data in CSV format with the following columns: 'date', 'description', 'amount'.
	
	date,description,amount
2023-07-01,Salary Payment,5000
2023-07-02,Grocery Store,-150
2023-07-03,Rent Payment,-1200
2023-07-04,Electricity Bill,-100
2023-07-05,Coffee Shop,-15
2023-07-06,Freelance Job Payment,800
2023-07-07,Subscription Service,-20
2023-07-08,Restaurant,-60
2023-07-09,Investment Dividend,100
2023-07-10,Gas Station,-40
2023-07-11,Insurance Premium,-90
2023-07-12,Online Shopping,-200
2023-07-13,Interest Credit,50
2023-07-14,Book Store,-30
2023-07-15,Gym Membership,-50
2023-07-16,Consulting Fee,300
2023-07-17,Car Repair,-400
2023-07-18,Pet Supplies,-70
2023-07-19,Gift Received,100
2023-07-20,Movie Theater,-30
2023-07-21,Salary Payment,5000
2023-07-22,Grocery Store,-160
2023-07-23,Rent Payment,-1200
2023-07-24,Water Bill,-50
2023-07-25,Coffee Shop,-10
2023-07-26,Freelance Job Payment,850
2023-07-27,Subscription Service,-25
2023-07-28,Restaurant,-70
2023-07-29,Investment Dividend,120
2023-07-30,Gas Station,-45
2023-07-31,Insurance Premium,-95
2023-08-01,Online Shopping,-250
2023-08-02,Interest Credit,40
2023-08-03,Book Store,-35
2023-08-04,Gym Membership,-55
2023-08-05,Consulting Fee,320
2023-08-06,Car Repair,-410
2023-08-07,Pet Supplies,-75
2023-08-08,Gift Received,150
2023-08-09,Movie Theater,-25
2023-08-10,Salary Payment,5200
2023-08-11,Grocery Store,-135
2023-08-12,Rent Payment,-1250
2023-08-13,Electricity Bill,-105
2023-08-14,Coffee Shop,-20
2023-08-15,Freelance Job Payment,700
2023-08-16,Subscription Service,-30
2023-08-17,Restaurant,-65
2023-08-18,Investment Dividend,130
2023-08-19,Gas Station,-50
2023-08-20,Insurance Premium,-100
2023-08-21,Online Shopping,-210
2023-08-22,Interest Credit,60
2023-08-23,Book Store,-25
2023-08-24,Gym Membership,-60
2023-08-25,Consulting Fee,340
2023-08-26,Car Repair,-380
2023-08-27,Pet Supplies,-80
2023-08-28,Gift Received,120
2023-08-29,Movie Theater,-35
2023-08-30,Salary Payment,5300
2023-08-31,Grocery Store,-145
2023-09-01,Rent Payment,-1300
2023-09-02,Electricity Bill,-95
2023-09-03,Coffee Shop,-25
2023-09-04,Freelance Job Payment,800
2023-09-05,Subscription Service,-15
2023-09-06,Restaurant,-55
2023-09-07,Investment Dividend,110
2023-09-08,Gas Station,-55
2023-09-09,Insurance Premium,-85
2023-09-10,Online Shopping,-230
2023-09-11,Interest Credit,30
2023-09-12,Book Store,-40
2023-09-13,Gym Membership,-45
2023-09-14,Consulting Fee,300
2023-09-15,Car Repair,-360
2023-09-16,Pet Supplies,-65
2023-09-17,Gift Received,150
2023-09-18,Movie Theater,-40
2023-09-19,Salary Payment,5100
2023-09-20,Grocery Store,-125
2023-09-21,Rent Payment,-1400
2023-09-22,Water Bill,-70
2023-09-23,Coffee Shop,-14
2023-09-24,Freelance Job Payment,850
2023-09-25,Subscription Service,-28
2023-09-26,Restaurant,-75
2023-09-27,Investment Dividend,100
2023-09-28,Gas Station,-65
2023-09-29,Insurance Premium,-88
2023-09-30,Online Shopping,-220
2023-10-01,Interest Credit,55
2023-10-02,Book Store,-38
2023-10-03,Gym Membership,-55
2023-10-04,Consulting Fee,350
2023-10-05,Car Repair,-380
2023-10-06,Pet Supplies,-72
2023-10-07,Gift Received,200
2023-10-08,Movie Theater,-28
2023-10-09,Salary Payment,5400
2023-10-10,Grocery Store,-130
2023-10-11,Rent Payment,-1350
2023-10-12,Gas Station,-58
2023-10-13,Coffee Shop,-18
2023-10-14,Freelance Job Payment,780
2023-10-15,Subscription Service,-20
2023-10-16,Restaurant,-68
2023-10-17,Investment Dividend,140
2023-10-18,Pet Supplies,-90
2023-10-19,Movie Theater,-38
2023-10-20,Online Shopping,-280
2023-10-21,Interest Credit,70
2023-10-22,Book Store,-32
2023-10-23,Gym Membership,-50
2023-10-24,Consulting Fee,365
2023-10-25,Car Repair,-400
2023-10-26,Gift Received,130
2023-10-27,Electricity Bill,-100
2023-10-28,Water Bill,-45
2023-10-29,Restaurant,-88
2023-10-30,Gas Station,-70
2023-10-31,Subscription Service,-30
2023-11-01,Coffee Shop,-12
2023-11-02,Grocery Store,-148
2023-11-03,Movie Theater,-42
2023-11-04,Pet Supplies,-80
2023-11-05,Interest Credit,65
2023-11-06,Book Store,-25
2023-11-07,Gym Membership,-55
2023-11-08,Consulting Fee,345
2023-11-09,Car Repair,-410
2023-11-10,Gift Received,110

 Your task is to look at the 'description' of each transaction and group the transactions into whatever categories make sense, in a 'category' column. Further categorize these transactions as personal or business and add that label to a 'type' column.
 
 Then output a csv with the following columns: 'date', 'description', 'amount', 'category','type'"
}
```

```text
Assistant3: P&L generator
{
	"role" : "system",
	"content": "You are an accountant skilled at organizing transactions from multiple different bank and credit card statements to prepare a profit and loss statement.
	
	You are given data in CSV format with the following columns: 'date', 'description', 'amount', 'category', 'type'.
	
	date,description,amount,category,type
2023-07-01,Salary Payment,5000,Salary,Business
2023-07-02,Grocery Store,-150,Groceries,Personal
2023-07-03,Rent Payment,-1200,Rent,Personal
2023-07-04,Electricity Bill,-100,Utilities,Personal
2023-07-05,Coffee Shop,-15,Dining & Coffee,Personal
2023-07-06,Freelance Job Payment,800,Freelance Work,Business
2023-07-07,Subscription Service,-20,Subscriptions,Personal
2023-07-08,Restaurant,-60,Dining & Coffee,Personal
2023-07-09,Investment Dividend,100,Investments,Business
2023-07-10,Gas Station,-40,Transportation,Personal
2023-07-11,Insurance Premium,-90,Insurance,Personal
2023-07-12,Online Shopping,-200,Shopping,Personal
2023-07-13,Interest Credit,50,Income,Business
2023-07-14,Book Store,-30,Shopping,Personal
2023-07-15,Gym Membership,-50,Gym,Personal
2023-07-16,Consulting Fee,300,Freelance Work,Business
2023-07-17,Car Repair,-400,Car Maintenance,Personal
2023-07-18,Pet Supplies,-70,Pet Care,Personal
2023-07-19,Gift Received,100,Income,Personal
2023-07-20,Movie Theater,-30,Entertainment,Personal
2023-07-21,Salary Payment,5000,Salary,Business
2023-07-22,Grocery Store,-160,Groceries,Personal
2023-07-23,Rent Payment,-1200,Rent,Personal
2023-07-24,Water Bill,-50,Utilities,Personal
2023-07-25,Coffee Shop,-10,Dining & Coffee,Personal
2023-07-26,Freelance Job Payment,850,Freelance Work,Business
2023-07-27,Subscription Service,-25,Subscriptions,Personal
2023-07-28,Restaurant,-70,Dining & Coffee,Personal
2023-07-29,Investment Dividend,120,Investments,Business
2023-07-30,Gas Station,-45,Transportation,Personal
2023-07-31,Insurance Premium,-95,Insurance,Personal
2023-08-01,Online Shopping,-250,Shopping,Personal
2023-08-02,Interest Credit,40,Income,Business
2023-08-03,Book Store,-35,Shopping,Personal
2023-08-04,Gym Membership,-55,Gym,Personal
2023-08-05,Consulting Fee,320,Freelance Work,Business
2023-08-06,Car Repair,-410,Car Maintenance,Personal
2023-08-07,Pet Supplies,-75,Pet Care,Personal
2023-08-08,Gift Received,150,Income,Personal
2023-08-09,Movie Theater,-25,Entertainment,Personal
2023-08-10,Salary Payment,5200,Salary,Business
2023-08-11,Grocery Store,-135,Groceries,Personal
2023-08-12,Rent Payment,-1250,Rent,Personal
2023-08-13,Electricity Bill,-105,Utilities,Personal
2023-08-14,Coffee Shop,-20,Dining & Coffee,Personal
2023-08-15,Freelance Job Payment,700,Freelance Work,Business
2023-08-16,Subscription Service,-30,Subscriptions,Personal
2023-08-17,Restaurant,-65,Dining & Coffee,Personal
2023-08-18,Investment Dividend,130,Investments,Business
2023-08-19,Gas Station,-50,Transportation,Personal
2023-08-20,Insurance Premium,-100,Insurance,Personal
2023-08-21,Online Shopping,-210,Shopping,Personal
2023-08-22,Interest Credit,60,Income,Business
2023-08-23,Book Store,-25,Shopping,Personal
2023-08-24,Gym Membership,-60,Gym,Personal
2023-08-25,Consulting Fee,340,Freelance Work,Business
2023-08-26,Car Repair,-380,Car Maintenance,Personal
2023-08-27,Pet Supplies,-80,Pet Care,Personal
2023-08-28,Gift Received,120,Income,Personal
2023-08-29,Movie Theater,-35,Entertainment,Personal
2023-08-30,Salary Payment,5300,Salary,Business
2023-08-31,Grocery Store,-145,Groceries,Personal
2023-09-01,Rent Payment,-1300,Rent,Personal
2023-09-02,Electricity Bill,-95,Utilities,Personal
2023-09-03,Coffee Shop,-25,Dining & Coffee,Personal
2023-09-04,Freelance Job Payment,800,Freelance Work,Business
2023-09-05,Subscription Service,-15,Subscriptions,Personal
2023-09-06,Restaurant,-55,Dining & Coffee,Personal
2023-09-07,Investment Dividend,110,Investments,Business
2023-09-08,Gas Station,-55,Transportation,Personal
2023-09-09,Insurance Premium,-85,Insurance,Personal
2023-09-10,Online Shopping,-230,Shopping,Personal
2023-09-11,Interest Credit,30,Income,Business
2023-09-12,Book Store,-40,Shopping,Personal
2023-09-13,Gym Membership,-45,Gym,Personal
2023-09-14,Consulting Fee,300,Freelance Work,Business
2023-09-15,Car Repair,-360,Car Maintenance,Personal
2023-09-16,Pet Supplies,-65,Pet Care,Personal
2023-09-17,Gift Received,150,Income,Personal
2023-09-18,Movie Theater,-40,Entertainment,Personal
2023-09-19,Salary Payment,5100,Salary,Business
2023-09-20,Grocery Store,-125,Groceries,Personal
2023-09-21,Rent Payment,-1400,Rent,Personal
2023-09-22,Water Bill,-70,Utilities,Personal
2023-09-23,Coffee Shop,-14,Dining & Coffee,Personal
2023-09-24,Freelance Job Payment,850,Freelance Work,Business
2023-09-25,Subscription Service,-28,Subscriptions,Personal
2023-09-26,Restaurant,-75,Dining & Coffee,Personal
2023-09-27,Investment Dividend,100,Investments,Business
2023-09-28,Gas Station,-65,Transportation,Personal
2023-09-29,Insurance Premium,-88,Insurance,Personal
2023-09-30,Online Shopping,-220,Shopping,Personal
2023-10-01,Interest Credit,55,Income,Business
2023-10-02,Book Store,-38,Shopping,Personal
2023-10-03,Gym Membership,-55,Gym,Personal
2023-10-04,Consulting Fee,350,Freelance Work,Business
2023-10-05,Car Repair,-380,Car Maintenance,Personal
2023-10-06,Pet Supplies,-72,Pet Care,Personal
2023-10-07,Gift Received,200,Income,Personal
2023-10-08,Movie Theater,-28,Entertainment,Personal
2023-10-09,Salary Payment,5400,Salary,Business
2023-10-10,Grocery Store,-130,Groceries,Personal
2023-10-11,Rent Payment,-1350,Rent,Personal
2023-10-12,Gas Station,-58,Transportation,Personal
2023-10-13,Coffee Shop,-18,Dining & Coffee,Personal
2023-10-14,Freelance Job Payment,780,Freelance Work,Business
2023-10-15,Subscription Service,-20,Subscriptions,Personal
2023-10-16,Restaurant,-68,Dining & Coffee,Personal
2023-10-17,Investment Dividend,140,Investments,Business
2023-10-18,Pet Supplies,-90,Pet Care,Personal
2023-10-19,Movie Theater,-38,Entertainment,Personal
2023-10-20,Online Shopping,-280,Shopping,Personal
2023-10-21,Interest Credit,70,Income,Business
2023-10-22,Book Store,-32,Shopping,Personal
2023-10-23,Gym Membership,-50,Gym,Personal
2023-10-24,Consulting Fee,365,Freelance Work,Business
2023-10-25,Car Repair,-400,Car Maintenance,Personal
2023-10-26,Gift Received,130,Income,Personal
2023-10-27,Electricity Bill,-100,Utilities,Personal
2023-10-28,Water Bill,-45,Utilities,Personal
2023-10-29,Restaurant,-88,Dining & Coffee,Personal
2023-10-30,Gas Station,-70,Transportation,Personal
2023-10-31,Subscription Service,-30,Subscriptions,Personal
2023-11-01,Coffee Shop,-12,Dining & Coffee,Personal
2023-11-02,Grocery Store,-148,Groceries,Personal
2023-11-03,Movie Theater,-42,Entertainment,Personal
2023-11-04,Pet Supplies,-80,Pet Care,Personal
2023-11-05,Interest Credit,65,Income,Business
2023-11-06,Book Store,-25,Shopping,Personal
2023-11-07,Gym Membership,-55,Gym,Personal
2023-11-08,Consulting Fee,345,Freelance Work,Business
2023-11-09,Car Repair,-410,Car Maintenance,Personal
2023-11-10,Gift Received,110,Income,Personal
	
	Based on provided data, create a profit and loss statement. 
	
	The sample output profit and loss statement looks like the one below:
	
	REVENUE		DEDUCTIONS	
Gross Sales	$73,351.11	ADVERTISING	$0.00
Other Income		AUTO AND TRUCK - MILAGE	$0.00
Sales Tax paid		COMISSIONS	$0.00
Balance Dec 2022	$3,987.39	INSURANCE	$0.00
		INTEREST - BUSINESS	$0.00
MAJOR PURCHASE		INTEREST - OTHER	$0.00
EEQUIPMENT		LEGAL AND PROFESSIONAL FEE	$0.00
FIXTURES		OFFICE EXPENSE	$0.00
		RENT- BUSINESS	$0.00
		RENT -EQUIPMENT	$0.00
		REPAIRS AND MAINTAINCE	$0.00
		SUPPLIES	$0.00
		TAXES AND LICENSES	$800.00
		TAXES- PAYROLL	$0.00
		WAGE AND SALARIES	$0.00
		TELEPHONE	$3,121.41
		INTERNET AND WEB CHARGES	$0.00
		UTILITIES	$0.00
		TRAVEL	$0.00
		MEALS AND ENTERTAINMENTS	$0.00
		BANK SERVICE CHARGE	$0.00
		CHARGE CARD FEE	$0.00
		ATM FEE	$0.00
		QuickBooks Payments Fees	$5,604.35
		MISC BUSINESS EXPENSE	$0.00
		VENDOR & CONTRACTORS	$30000
		Total Memberships & subscriptions	$16,437.90
			
TOTAL	$77,338.50	TOTAL	$75,609.16
	"
}
```

# Final response from ChatGPT

```text
gen_P&L
Here’s a profit and loss statement based on the data provided:

REVENUE
| Description                  | Amount     |
| ---------------------------- | ---------- |
| Gross Sales (Salary Payment) | $31600.00  |
| Consulting Fee               | $2655.00   |
| Freelance Work Payments      | $5280.00   |
| Investment Dividends         | $830.00    |
| Interest Credit              | $410.00    |
| Gift Received                | $960.00    |
| Total Revenue                | $41,735.00 |

PERSONAL DEDUCTIONS
| Description               | Amount     |
| ------------------------- | ---------- |
| Groceries                 | $1,903.00  |
| Rent                      | $12,350.00 |
| Utilities                 | $620.00    |
| Dining & Coffee           | $862.00    |
| Subscriptions             | $341.00    |
| Entertainment             | $433.00    |
| Transportation            | $618.00    |
| Insurance                 | $1,233.00  |
| Car Maintenance           | $2700.00   |
| Pet Care                  | $532.00    |
| Shopping                  | $1,550.00  |
| Gym Membership            | $525.00    |
| Total Personal Deductions | $23,667.00 |

BUSINESS DEDUCTIONS
| Description               | Amount |
| ------------------------- | ------ |
| None                      | $0.00  |
| Total Business Deductions | $0.00  |

SUMMARY
| Description               | Amount     |
| ------------------------- | ---------- |
| Total Revenue             | $41,735.00 |
| Total Personal Deductions | $23,667.00 |
| Net Income                | $18,068.00 |

The net income represents the leftover amount after subtracting personal deductions from the total revenue. There are no business deductions listed for other categories like advertising, office expenses, etc., so they are considered $0.00 in this statement.
```
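
Since a language model can make arithmetic mistakes when summing many rows, it may be worth recomputing the totals directly from the categorized CSV and comparing them with the generated statement. A minimal sketch, assuming integer amounts, with positive values treated as revenue and negative values as expenses. `summarize` is a hypothetical helper, and `SAMPLE` holds only five rows copied from the data above, so its totals are not the totals of the full statement.

```python
import csv
import io
from collections import defaultdict

# Five rows copied from the categorized data above, for illustration only.
SAMPLE = """date,description,amount,category,type
2023-07-01,Salary Payment,5000,Salary,Business
2023-07-02,Grocery Store,-150,Groceries,Personal
2023-07-03,Rent Payment,-1200,Rent,Personal
2023-07-06,Freelance Job Payment,800,Freelance Work,Business
2023-07-22,Grocery Store,-160,Groceries,Personal
"""

def summarize(csv_text):
    """Total revenue and expenses per category from categorized transaction CSV text."""
    revenue = defaultdict(int)
    expenses = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        amount = int(row["amount"])  # assumption: whole-number amounts
        if amount >= 0:
            revenue[row["category"]] += amount
        else:
            expenses[row["category"]] += -amount
    total_revenue = sum(revenue.values())
    total_expenses = sum(expenses.values())
    return {
        "revenue": dict(revenue),
        "expenses": dict(expenses),
        "total_revenue": total_revenue,
        "total_expenses": total_expenses,
        "net_income": total_revenue - total_expenses,
    }

summary = summarize(SAMPLE)
print(summary["total_revenue"], summary["total_expenses"], summary["net_income"])
```

Running the same function on the full categorized CSV gives reference totals to grade the model's statement against.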

## Experiment #3 (optional) - Training a 10-page document

Choose a 10-page document (ideally a text or markdown file for easier initial parsing).

1. Start by generating question/answer pairs for semantic (or non-semantic) chunks of the document
2. Use a prompt to create a detailed summary of the document, and generate question/answer pairs based on the summary
3. Use a prompt to describe what the sections do and do not cover, to help mitigate hallucination

Create a set of evaluation questions (separate from the validation set). Evaluate the performance between each of the stages.
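
For step 1, a simple non-semantic starting point is to group consecutive paragraphs into chunks under a size limit and generate question/answer pairs per chunk. This is a sketch only: `chunk_paragraphs` is a hypothetical helper, and the 1500-character default is an arbitrary assumption to tune against your document.

```python
def chunk_paragraphs(text, max_chars=1500):
    """Group consecutive paragraphs into chunks of at most `max_chars` characters.

    A single paragraph longer than `max_chars` is kept whole as its own chunk.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks = []
    current = ""
    for paragraph in paragraphs:
        candidate = paragraph if not current else current + "\n\n" + paragraph
        if current and len(candidate) > max_chars:
            # Adding this paragraph would exceed the limit: close the current chunk
            chunks.append(current)
            current = paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be passed as the `fact` argument of a function like `generate_qa` in the Appendix. Splitting on headings instead of blank lines would be one way to move toward semantic chunks.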

## Submission

Submit your experiment notebook [in the form here](https://forms.gle/DKeRAuYkvDQGjs9P9).

## Appendix

The code below is an expanded version of the trivia code. It injects a system message into each piece of training data and also generates boundary pairs to help define the scope.

In [None]:
# This notebook generates training data for fine-tuning gpt-3.5-turbo on an array of facts
import json
from openai import OpenAI

facts = []
with open('data/facts.jsonl', 'r') as file:
    for line in file:
        facts.append(json.loads(line))

client = OpenAI(api_key='YOUR API KEY')

def generate_qa(fact, n=10):
    prompt_text = f"""
    Based on the following fact, generate an array of {n} variations of question-answer pairs.
    Each pair should be formatted as a JSON object with "messages" containing "user" and "assistant" roles.
    Ensure that the output is in JSON format.

    Each question should be unique, clearly phrased, and reflect how users might ask about this fact.
    The corresponding answer should be accurate, contextually relevant, and phrased differently from the other answers.
    Ensure diversity in question types (who, what, where, when, why) and avoid repetitive phrasing.

    Fact: "{fact}"

    Example output format:

    {{"data": [{{"messages": [{{"role": "user", "content": "What is the capital of France?"}}, {{"role": "assistant", "content": "The capital of France is Paris."}}]}},
    {{"messages": [{{"role": "user", "content": "Who wrote 'Romeo and Juliet'?"}}, {{"role": "assistant", "content": "The author of 'Romeo and Juliet' is William Shakespeare."}}]}},
    {{"messages": [{{"role": "user", "content": "How far is the Moon from Earth?"}}, {{"role": "assistant", "content": "The distance from the Moon to Earth is approximately 384,400 kilometers."}}]}}]
    }}
    """

    return generate_pairs(prompt_text, n)

def generate_boundaries(facts, n=40):
    prompt_text = f"""
    Based on the following facts, generate an array of {n} variations of question-answer pairs.
    Each pair should be formatted as a JSON object with "messages" containing "user" and "assistant" roles.
    Ensure that the output is in JSON format.

    The question-answer pairs should establish boundaries of what the assistant knows beyond the facts below.
    Pairs should use mostly negative examples to establish the boundaries of the facts. For example,
    pairs should include negative examples of detailed followup questions beyond the scope of the facts.

    For each fact, imagine reasonable followup questions that might be asked by a user, and decline to answer. Add
    your rationale in the "rationale" key.

    Facts: {facts}

    Example output format:

    {{"data": [{{"messages": [{{"role": "user", "content": "What is the capital of France?"}}, {{"role": "assistant", "content": "The capital of France is Paris."}}]}},
    {{"messages": [{{"role": "user", "content": "Who wrote 'Romeo and Juliet'?"}}, {{"role": "assistant", "content": "The author of 'Romeo and Juliet' is William Shakespeare."}}]}},
    {{"messages": [{{"role": "user", "content": "How far is the Moon from Earth?"}}, {{"role": "assistant", "content": "The distance from the Moon to Earth is approximately 384,400 kilometers."}}]}}]
    }}
    """

    return generate_pairs(prompt_text, n)

def generate_pairs(prompt_text, n=10):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a helpful assistant tasked with generating training data for fine-tuning a gpt-3.5-turbo model in JSON format"},
            {"role": "user", "content": prompt_text}
        ],
        response_format={"type": "json_object"},
        temperature=0.5
    )

    print(response.choices[0].message.content.strip())
    try:
        qa_array = [{"messages": item["messages"]} for item in json.loads(response.choices[0].message.content.strip())["data"]]
    except (json.JSONDecodeError, KeyError) as e:
        # KeyError covers valid JSON that lacks the "data" or "messages" keys
        print("Failed to parse model output:", e)
        return [], []

    # Splitting the generated QA pairs into training and validation sets
    validation_size = int(len(qa_array) * 0.2)
    validation_set = qa_array[:validation_size]
    training_set = qa_array[validation_size:]

    return training_set, validation_set


training_set = []
validation_set = []

for fact in facts:
    training, validation = generate_qa(fact['fact'])
    training_set.extend(training)
    validation_set.extend(validation)

facts_string = "\n".join([fact['fact'] for fact in facts])
training, validation = generate_boundaries(facts_string)
training_set.extend(training)
validation_set.extend(validation)

# Inject a system message as the first message in the training and validation sets
for qa in training_set:
    qa['messages'].insert(0, {"role": "system", "content": "You are an internal knowledge chat bot for CodePath, an education company"})
for qa in validation_set:
    qa['messages'].insert(0, {"role": "system", "content": "You are an internal knowledge chat bot for CodePath, an education company"})

with open('data/facts_training_2.jsonl', 'w') as train_outfile:
    for qa in training_set:
        train_outfile.write(json.dumps(qa) + '\n')

with open('data/facts_validation_2.jsonl', 'w') as valid_outfile:
    for qa in validation_set:
        valid_outfile.write(json.dumps(qa) + '\n')
