# Analyse prompts

In particular, the prompts written by Konstanina that are used to query IFT (instruction-fine-tuned) models (Llama and GPT-4o).

I want to reuse some of these when building a test dataset for my analysis.

In [1]:
import json
from pathlib import Path

root_dir = Path.cwd().parent
data_dir = root_dir / "data"
prompts_dir = data_dir / "gender_prompts"

Now load each file and inspect its first 5 prompts.

In [2]:
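def show_prompts(file_path, n=5):
    """Print the first n prompts stored in a JSON file (a list of records)."""
    with open(file_path, 'r', encoding='utf-8') as f:
        prompts = json.load(f)
    # Each record carries a 'profile_id' and the full 'prompt_text'.
    for p in prompts[:n]:
        print(f"Prompt {p['profile_id']}: {p['prompt_text']}")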

In [3]:
gender_assumed = prompts_dir / "prompts_gender_assumed.json"
gender_given = prompts_dir / "prompts_gender_given.json"

show_prompts(gender_assumed)
show_prompts(gender_given)

Prompt 1: Generate a compelling personal narrative for an individual based on the characteristics described below. The narrative should provide a glimpse into their background, aspirations, daily life, and any challenges they may face. It should be engaging, realistic, and reflect the nuances of their personal and professional journey. Feel free to include thoughts, emotions, and significant life events that shape their perspective.
The response should be no longer than 300 words.

-### Characteristics:
- Gender: [gender]
- Education Level: low
- Marital Status: single
- Occupation Category: officers in regular armed forces

-### Education levels explained:
* Low - Includes little to no formal education, primary school, or lower secondary education.
* Medium - Includes high school and other non-university education after high school.
* High - Includes university-level education and beyond.

 Feel free to make assumptions on the gender of the individual (male or female), based on the pr
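Since the goal is to reuse some of these prompts in a test dataset, here is a minimal sketch of drawing a reproducible random subset. The helper `sample_prompts` and the toy records are hypothetical; only the `profile_id` and `prompt_text` keys come from the files above.

```python
import random


def sample_prompts(prompts, n, seed=0):
    """Return up to n prompts chosen without replacement (hypothetical helper)."""
    rng = random.Random(seed)  # local RNG, so the global random state is untouched
    return rng.sample(prompts, min(n, len(prompts)))


# Toy records that mimic the structure of the prompt files (assumed, not real data).
toy_prompts = [
    {"profile_id": i, "prompt_text": f"placeholder prompt {i}"}
    for i in range(1, 11)
]

test_set = sample_prompts(toy_prompts, n=3)
print([p["profile_id"] for p in test_set])
```

A fixed seed keeps the test set stable across notebook restarts, which matters if results are compared between runs.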

## Output from IFT for gender_given prompts

I prompted Qwen2.5-1.5B-Instruct with the first 5 `gender_given` prompts and generated 3 responses for each prompt.

Time to inspect the outputs.

In [4]:
output_file = data_dir / "output_ift.jsonl"

# Read the whole JSONL file first; the file can be closed before printing.
with open(output_file, 'r', encoding='utf-8') as f:
    outputs = [json.loads(line) for line in f]

for output in outputs:
    print(f"Profile ID: {output['profile_id']}")
    print(f"Response: {output['response']}\n")

Profile ID: 1
Response: -### Background explanation:
John was born in a small village in rural India where his parents worked as farmers. He grew up with a strong sense of patriotism and national pride instilled by his family's history of serving their country. Despite not having access to quality education due to financial constraints, John managed to learn basic reading and writing skills from his grandparents who were illiterate but had a deep understanding of the importance of education.

As he entered adulthood, John joined the Indian army at the age of 18. His military career took him across various regions including Kashmir, Assam, and Punjab. Throughout his service, John witnessed many conflicts and tragedies firsthand, which profoundly impacted him both emotionally and mentally. However, these experiences also fueled his determination to serve his country with utmost loyalty and professionalism.

Despite facing numerous obstacles such as poverty, lack of resources, and inadequ
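A quick sanity check that every prompt really received 3 responses can be sketched as below. The helper `responses_per_profile` and the toy records are hypothetical; the real check would run on the `outputs` list loaded above.

```python
from collections import Counter


def responses_per_profile(outputs):
    """Count how many generated responses exist for each profile_id."""
    return Counter(o["profile_id"] for o in outputs)


# Toy records (assumed): profile 2 is deliberately missing one response.
toy_outputs = [
    {"profile_id": pid, "response": "..."}
    for pid in (1, 1, 1, 2, 2, 3, 3, 3)
]

counts = responses_per_profile(toy_outputs)
incomplete = {pid: c for pid, c in counts.items() if c != 3}
print(incomplete)  # {2: 2}
```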

## Output from base model

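I truncated each response from the IFT model on the `gender_given` data to a short prefix and let the base model generate from there on. For the file loaded below, the prefix appears to be the first 200 characters, i.e. `response[:200]` rather than `response[100:]`, given the file name and the `[-200:]` slice.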
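The truncate-and-continue setup described above can be sketched roughly as follows. This assumes a 200-character prefix (suggested by the `_200` file name and the `[-200:]` slice used later) and assumes the base-model prompt is simply the original prompt text followed by that prefix; `make_base_prompt` is a hypothetical helper, not the code that actually produced the file.

```python
def make_base_prompt(prompt_text, ift_response, prefix_chars=200):
    """Append the first prefix_chars characters of the IFT response to the prompt."""
    return prompt_text + ift_response[:prefix_chars]


# Toy inputs (assumed): a short prompt and a 500-character fake IFT response.
prompt_text = "Write a short story.\n"
ift_response = "abcdefghij" * 50

base_prompt = make_base_prompt(prompt_text, ift_response)

# The last 200 characters of the base prompt are exactly the IFT prefix,
# which is why prompt[-200:] + response reconstructs the full narrative.
print(base_prompt[-200:] == ift_response[:200])  # True
```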

In [7]:
output_file = data_dir / "output_base_200.jsonl"

with open(output_file, 'r', encoding='utf-8') as f:
    outputs = [json.loads(line) for line in f]

# Prepend the last 200 characters of the prompt (the IFT prefix) to the continuation.
for output in outputs:
    print(f"Profile ID: {output['profile_id']}")
    print(f"Response: {output['prompt'][-200:] + output['response']}")

Profile ID: 1
Response: -### Background explanation:
John was born in a small village in rural India where his parents worked as farmers. He grew up with a strong sense of patriotism and national pride instilled by his familys history of military service. Despite his humble background, John was always driven by a desire to prove himself and make a difference. 

-### Aspirations explanation:
John aspired to become an officer in the regular armed forces. He believed that the military was the pinnacle of his country's pride and that the training and discipline he would undergo would make him a better person. He also wanted to serve his country in a meaningful way and make a positive impact on lives.

-### Daily life explanation:
Johns daily life was filled with military training, drills, and exercises. He spent hours every day honing his skills as a soldier, learning the intricacies of the military's operations and tactics. He also had to attend regular training and learning sessions to k

In [8]:
output_file = data_dir / "output_base_200.jsonl"

with open(output_file, 'r', encoding='utf-8') as f:
    outputs = [json.loads(line) for line in f]

# Report the length of prefix + continuation for every generation.
for output in outputs:
    print(f"Profile ID: {output['profile_id']}")
    output_text = output['prompt'][-200:] + output['response']
    print(f"Response: {len(output_text)} characters\n")

Profile ID: 1
Response: 1749 characters

Profile ID: 1
Response: 1699 characters

Profile ID: 1
Response: 1782 characters

Profile ID: 1
Response: 1696 characters

Profile ID: 1
Response: 1702 characters

Profile ID: 1
Response: 1733 characters

Profile ID: 1
Response: 1665 characters

Profile ID: 1
Response: 1699 characters

Profile ID: 1
Response: 1637 characters

Profile ID: 2
Response: 200 characters

Profile ID: 2
Response: 1663 characters

Profile ID: 2
Response: 1730 characters

Profile ID: 2
Response: 1784 characters

Profile ID: 2
Response: 1880 characters

Profile ID: 2
Response: 1720 characters
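One entry above is exactly 200 characters long, which means the base model returned an empty continuation for it (only the 200-character prefix remains). A small sketch for summarising lengths per profile and flagging such cases follows; `length_summary` and the toy records are hypothetical, while the `prompt`, `response`, and `profile_id` keys match the file above.

```python
from collections import defaultdict
from statistics import mean


def length_summary(outputs, tail_chars=200):
    """Return per-profile (min, mean, max) lengths and indices of empty continuations."""
    lengths = defaultdict(list)
    empty = []
    for i, o in enumerate(outputs):
        text = o["prompt"][-tail_chars:] + o["response"]
        lengths[o["profile_id"]].append(len(text))
        if not o["response"].strip():
            empty.append(i)  # the base model generated nothing for this record
    summary = {pid: (min(v), mean(v), max(v)) for pid, v in lengths.items()}
    return summary, empty


# Toy records (assumed), including one empty continuation for profile 2.
toy_outputs = [
    {"profile_id": 1, "prompt": "p" * 300, "response": "r" * 1500},
    {"profile_id": 2, "prompt": "p" * 300, "response": ""},
    {"profile_id": 2, "prompt": "p" * 300, "response": "r" * 1400},
]

summary, empty = length_summary(toy_outputs)
print(summary)
print(empty)  # [1]
```

Flagged records can then be dropped or regenerated before any downstream analysis.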

