# 4 Generating the Bank for the MicroTasks

To generate the microtask bank, I call an LLM through its API.
For each of the first two core courses of each programme, the model produces one broad question and six disambiguation questions.

The prompts are built from the columns of `df_courses_tasks`, so every microtask is grounded in the actual course information.

We use two prompts: a broad one and a disambiguation one (Kenneth style).

In [1]:
from pathlib import Path
import pandas as pd
#!pip install --upgrade openai
import os, re
from openai import OpenAI
import json
import numpy as np
from tqdm import tqdm  # optional progress bar, pip install tqdm


## 1. Load the data and keep at most two courses per programme

In [2]:
# load the CSV file with the courses for which we have to generate the tasks
silver = Path("../data_programmes_courses/silver")

df_courses_tasks = pd.read_csv(silver / "df_courses_tasks_silver.csv", encoding="utf-8-sig")
print("The shape of the courses tasks dataframe is:", df_courses_tasks.shape)

# keep only first two courses from each programme
df_courses_tasks = df_courses_tasks.groupby("programme_title").head(2).reset_index(drop=True)
print("After keeping only first two courses from each programme the shape is:", df_courses_tasks.shape)

The shape of the courses tasks dataframe is: (36, 21)
After keeping only first two courses from each programme the shape is: (28, 21)
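
To make the filtering step concrete, here is a minimal sketch of what `groupby(...).head(2)` does on a toy frame. The programme names and course codes are made up for illustration; only the column names come from the notebook. Programmes with fewer than two courses simply keep what they have.

```python
import pandas as pd

# Toy data: programme names and course codes are invented for this example.
toy = pd.DataFrame({
    "programme_title": ["History", "History", "History", "Maths", "Physics", "Physics"],
    "code": ["H1", "H2", "H3", "M1", "P1", "P2"],
})

# head(2) keeps the first two rows of each group, in the original row order.
first_two = toy.groupby("programme_title").head(2).reset_index(drop=True)

# Count how many courses survive per programme.
courses_per_programme = first_two["programme_title"].value_counts().to_dict()

print(first_two)
print(courses_per_programme)
```

Because `head(2)` relies on row order, "first two courses" means the first two rows as they appear in the CSV, not any ranking of the courses.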


## 2. Set up OpenAI client 

In [3]:

key_path = Path("../data_bank_microtasks") / "api_key.txt"

# Read the key and strip spaces and newlines
api_key = key_path.read_text(encoding="utf8").strip()

# Create the client using this key
client = OpenAI(api_key=api_key)

models = client.models.list()
#for m in models.data:
#    print(m.id)

model_gpt = "gpt-4.1-mini"  
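
As a possible alternative to reading the key only from a file, the sketch below first looks at an environment variable and falls back to the file. The helper name `load_api_key` is hypothetical and not part of the notebook; `OPENAI_API_KEY` is the variable the OpenAI Python client reads by default.

```python
import os
from pathlib import Path


def load_api_key(key_path, env_var="OPENAI_API_KEY"):
    """Return the API key from the environment if set, else from key_path.

    Hypothetical helper for illustration; the notebook reads the file directly.
    """
    # Prefer the environment variable so the key never has to live in the repo.
    value = os.environ.get(env_var, "").strip()
    if value:
        return value

    # Fall back to the key file, failing with a clear message if it is missing.
    path = Path(key_path)
    if not path.exists():
        raise FileNotFoundError(f"No {env_var} set and no key file at {path}")

    key = path.read_text(encoding="utf8").strip()
    if not key:
        raise ValueError(f"Key file {path} is empty")
    return key
```

With this in place, the client could be created as `OpenAI(api_key=load_api_key(key_path))`, assuming `key_path` is defined as in the cell above.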


## 3. Define the Prompts
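Two system prompts follow. The broad prompt asks for one question with six options, one per RIASEC code. The disambiguation prompt asks for one question with three options, one per letter of a given triple code. Both are based on the information of a single core course.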

In [4]:
SYSTEM_PROMPT_BROAD = """
You generate one broad RIASEC question for a playful study choice tool.

The question should feel like a first step in a real course task.

INPUT
You receive one JSON object with:
- programme_title
- course_code
- course_name
- course_objective
- course_content
- teaching_methods
- assessment

TASK
Create exactly one multiple choice question with six options.

Rules
1. Use the course information to imagine a realistic first year situation.
2. Write a short question string that describes the situation and ends with a question.
3. Create tiny_learn as a list of exactly three short sentences:
   a. definition of the key concept
   b. method reminder
   c. common mistake
4. Create six options labelled A to F.
5. Each option describes a plausible first action the student could take.
6. Each option must be tagged with one RIASEC code:
   R, I, A, S, E, or C.
7. Across the six options you must use each of the six RIASEC codes exactly once.
8. Do not mention RIASEC, personality, or profiles in the visible text.

OUTPUT
Return one JSON object with this shape:

{
  "question": string,
  "tiny_learn": [string, string, string],
  "options": {
    "A": {"text": string, "riasec": "R" | "I" | "A" | "S" | "E" | "C"},
    "B": {...},
    "C": {...},
    "D": {...},
    "E": {...},
    "F": {...}
  }
}

All strings must be single line strings. Do not insert raw newline characters in any value.
"""

SYSTEM_PROMPT_DISAMB = """
You generate one RIASEC disambiguation question for a playful study choice tool.

The question should feel like a first step in a real course task.

INPUT
You receive one JSON object with:
- programme_title
- course_code
- course_name
- course_objective
- course_content
- teaching_methods
- assessment
- triple_code   for example "RIA"

triple_code contains three distinct letters from R, I, A, S, E, C.

TASK
Create exactly one multiple choice question with three options.

Rules
1. Use the course information to imagine a realistic first year situation.
2. Write a short question string that describes the situation and ends with a question.
3. Create tiny_learn as a list of exactly three short sentences:
   a. definition of the key concept
   b. method reminder
   c. common mistake
4. Create three options labelled A, B, C.
5. The three options together must use exactly the three letters in triple_code, one per option.
   For example triple_code "RIA" means one R, one I, one A.
6. Each option describes a plausible first action the student could take.
7. Each option must be tagged with a riasec letter that is one of the letters in triple_code.
8. Do not mention RIASEC, personality, or profiles in the visible text.

OUTPUT
Return one JSON object with this shape:

{
  "question": string,
  "tiny_learn": [string, string, string],
  "options": {
    "A": {"text": string, "riasec": one letter from triple_code},
    "B": {...},
    "C": {...}
  }
}

All strings must be single line strings. Do not insert raw newline characters in any value.
"""
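
The prompts state several structural rules (three `tiny_learn` sentences, labelled options, each expected RIASEC letter used exactly once). A model can still break them, so a check on the parsed output may be useful. The following is a minimal sketch, not part of the original notebook; the function name `validate_question` is an assumption.

```python
def validate_question(obj, expected_letters="RIASEC"):
    """Return a list of problems; an empty list means the object passes.

    expected_letters is "RIASEC" for a broad question, or the triple code
    (for example "RIA") for a disambiguation question.
    """
    problems = []

    question = obj.get("question")
    if not isinstance(question, str) or not question.strip():
        problems.append("question must be a non-empty string")

    tiny = obj.get("tiny_learn")
    if not (isinstance(tiny, list) and len(tiny) == 3
            and all(isinstance(s, str) for s in tiny)):
        problems.append("tiny_learn must be a list of exactly three strings")

    options = obj.get("options")
    if not isinstance(options, dict):
        problems.append("options must be an object")
        return problems

    # Six letters -> labels A..F, three letters -> labels A..C.
    expected_labels = [chr(ord("A") + i) for i in range(len(expected_letters))]
    if sorted(options) != expected_labels:
        problems.append(f"option labels must be {expected_labels}")

    # Every expected RIASEC letter must appear exactly once across the options.
    tags = sorted(str(opt.get("riasec")) for opt in options.values()
                  if isinstance(opt, dict))
    if tags != sorted(expected_letters):
        problems.append(f"riasec tags {tags} do not match {sorted(expected_letters)}")

    return problems


# A well-formed broad question (placeholder texts) passes the check.
good = {
    "question": "What do you do first?",
    "tiny_learn": ["Definition.", "Method.", "Mistake."],
    "options": {label: {"text": "Do something.", "riasec": code}
                for label, code in zip("ABCDEF", "RIASEC")},
}
print(validate_question(good))
```

A failed check could be used to skip or regenerate a question instead of storing it in the bank.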


## 4. Define the helper functions

In [5]:
import json
import re
from collections import defaultdict
from pathlib import Path
from tqdm import tqdm

def truncate(text, max_chars=1200):
    """Short helper to shorten very long fields."""
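    # Treat missing values (None or pandas NaN) as empty text,
    # otherwise NaN would be sent to the model as the string "nan"
    if text is None or pd.isna(text):
        return ""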
    s = str(text)
    if len(s) <= max_chars:
        return s
    return s[:max_chars]

def build_course_payload(row):
    """Context that we pass to the model for one course."""
    return {
        "programme_title": str(row.get("programme_title", "")),
        "course_code": str(row.get("code", "")),
        "course_name": str(row.get("course_name", "")),
        "course_objective": truncate(row.get("course_objective", ""), 800),
        "course_content": truncate(row.get("course_content", ""), 800),
        "teaching_methods": truncate(row.get("additional_information_teaching_methods", ""), 400),
        "assessment": truncate(row.get("method_of_assessment", ""), 400),
    }

def safe_parse_json(raw_text: str):
    """
    Clean and parse model output as JSON.
    Removes newlines and obvious trailing commas, then parses.
    """
    if raw_text is None:
        raise ValueError("Model returned no text")

    text = raw_text.strip()
    if not text:
        raise ValueError("Model returned empty text")

    # normalise whitespace
    text = text.replace("\r\n", " ").replace("\n", " ").replace("\t", " ")
    text = re.sub(r"\s+", " ", text)

    # remove trailing commas before closing braces or brackets
    text = re.sub(r",\s*([}\]])", r"\1", text)

    # try direct
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass

    # try first and last brace
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end == -1 or end <= start:
        print("Could not find JSON object, preview:")
        print(text[:400])
        raise ValueError("No JSON object found")

    candidate = text[start : end + 1]
    candidate = re.sub(r",\s*([}\]])", r"\1", candidate)

    try:
        return json.loads(candidate)
    except json.JSONDecodeError as e:
        print("Still could not parse, preview:")
        print(candidate[:400])
        raise e


def call_broad_question(row, model_name=model_gpt):
    """Ask the model for one broad six-option question for this course row."""
    payload = build_course_payload(row)

    response = client.responses.create(
        model=model_name,
        input=json.dumps(payload),
        instructions=SYSTEM_PROMPT_BROAD,
        max_output_tokens=500,
    )

    raw = response.output_text
    print("BROAD PREVIEW:", repr((raw or "")[:160]))
    return safe_parse_json(raw)


def call_disamb_question(row, triple_code, model_name=model_gpt):
    """Ask the model for one three-option question for this course row and triple code."""
    payload = build_course_payload(row)
    payload["triple_code"] = triple_code

    response = client.responses.create(
        model=model_name,
        input=json.dumps(payload),
        instructions=SYSTEM_PROMPT_DISAMB,
        max_output_tokens=400,
    )

    raw = response.output_text
    print(f"DISAMB {triple_code} PREVIEW:", repr((raw or "")[:160]))
    return safe_parse_json(raw)
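

The helpers above raise on any API or parsing failure, and the generation loop then skips that question. One way to lose fewer questions is to retry a failed call a few times. This is a minimal sketch under that assumption; `call_with_retries` is a hypothetical helper, and the delay values are arbitrary.

```python
import time


def call_with_retries(fn, *args, attempts=3, base_delay=1.0, sleep=time.sleep, **kwargs):
    """Call fn(*args, **kwargs), retrying on any exception with exponential backoff.

    The sleep function is injectable so the helper can be tested without waiting.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fn(*args, **kwargs)
        except Exception as exc:
            last_error = exc
            # Wait 1x, 2x, 4x ... base_delay between attempts, not after the last one.
            if attempt < attempts:
                sleep(base_delay * 2 ** (attempt - 1))
    raise last_error


# Demonstration with a stand-in function that fails twice, then succeeds.
calls = {"n": 0}
delays = []

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ValueError("simulated bad JSON")
    return {"question": "ok"}

result = call_with_retries(flaky, sleep=delays.append)
print(result, delays)
```

In the notebook, a call such as `call_with_retries(call_broad_question, row)` could replace the direct call inside the `try` block.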



## 5. Call the API with the prompts for each course

In [6]:
from collections import defaultdict
from pathlib import Path
import json
from tqdm import tqdm

TRIPLES = ["RIA", "RIS", "REC", "IEC", "ASE", "ASC"]

# final result:
# {
#   "Programme": {
#       "broad": [ {question_code, question, tiny_learn, options}, ... ],
#       "R": [ {question_code, question, tiny_learn, options}, ... ],
#       "I": [...], "A": [...], "S": [...], "E": [...], "C": [...]
#   }
# }
ml_structure = defaultdict(lambda: {
    "broad": [],
    "R": [],
    "I": [],
    "A": [],
    "S": [],
    "E": [],
    "C": [],
})

# track how many questions we have emitted per course
from collections import Counter
question_counter = Counter()

max_courses = 100   # start small, then increase

for i, (_, row) in enumerate(tqdm(df_courses_tasks.iterrows(), total=len(df_courses_tasks))):
    if i >= max_courses:
        break

    programme = str(row.get("programme_title", "UNKNOWN PROGRAMME"))
    course_code = str(row.get("code", ""))

    # 1) broad question for this course
    try:
        broad_obj = call_broad_question(row)
    except Exception as e:
        print(f"Problem making broad question for course {course_code}: {e}")
        continue

    question_counter[course_code] += 1
    broad_qcode = f"{course_code}_{question_counter[course_code]}"

    broad_entry = {
        "question_code": broad_qcode,
        "question": broad_obj["question"],
        "tiny_learn": broad_obj["tiny_learn"],
        "options": broad_obj["options"],
    }

    ml_structure[programme]["broad"].append(broad_entry)

    # 2) six disambiguation questions for this course
    for triple in TRIPLES:
        try:
            disamb_obj = call_disamb_question(row, triple)
        except Exception as e:
            print(f"Problem making disamb {triple} for course {course_code}: {e}")
            continue

        question_counter[course_code] += 1
        disamb_qcode = f"{course_code}_{question_counter[course_code]}"

        disamb_entry = {
            "question_code": disamb_qcode,
            "question": disamb_obj["question"],
            "tiny_learn": disamb_obj["tiny_learn"],
            "options": disamb_obj["options"],
        }

        # attach under each RIASEC letter in the triple
        for letter in set(triple):
            if letter in "RIASEC":
                ml_structure[programme][letter].append(disamb_entry)
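

To illustrate the resulting structure, the sketch below attaches one made-up disambiguation entry under each letter of its triple and serialises the bank to a JSON string. The programme name, question code, and texts are placeholders, and the variable `bank` stands in for `ml_structure`.

```python
import json
from collections import defaultdict

# Same shape as the notebook structure: one list per bucket, per programme.
bank = defaultdict(lambda: {key: [] for key in ["broad", *"RIASEC"]})

# Placeholder entry; a real one comes from the model output.
entry = {
    "question_code": "DEMO101_2",
    "question": "What do you do first?",
    "tiny_learn": ["Definition.", "Method.", "Mistake."],
    "options": {},
}

# A question generated for triple "RIA" is listed under R, I and A.
for letter in "RIA":
    bank["Demo Programme"][letter].append(entry)

counts = {key: len(items) for key, items in bank["Demo Programme"].items()}

# ensure_ascii=False keeps non-ASCII characters (for example Dutch text) readable.
serialised = json.dumps(bank, ensure_ascii=False, indent=2)

print(counts)
```

Note that the same entry object is shared by all three lists, so each disambiguation question appears three times in the serialised output.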


  0%|          | 0/28 [00:00<?, ?it/s]

BROAD PREVIEW: '{\n  "question": "You have just selected an ancient coin from the Allard Pierson collection to start your research. What is your first step to understand its his'
DISAMB RIA PREVIEW: '{\n  "question": "You have just been assigned a research project on a coin from the ancient collection. What should you do first to start your investigation?",\n '
DISAMB RIS PREVIEW: '{\n  "question": "You start your research on an ancient artifact. Which first step do you take to understand its historical significance?",\n  "tiny_learn": [\n   '
DISAMB REC PREVIEW: '{\n  "question": "You are beginning research on an ancient object from the Allard Pierson collection. What is your first step to understand its historical contex'
DISAMB IEC PREVIEW: '{\n  "question": "You have just started researching an ancient artifact for your writing assignment. What should you do first to ensure your research is reliable'
DISAMB ASE PREVIEW: '{\n  "question": "You are about to begin researching an ancie

  4%|▎         | 1/28 [00:31<14:11, 31.52s/it]

DISAMB ASC PREVIEW: '{\n  "question": "You have chosen an ancient coin from the Allard Pierson collection for your research project. What should you do first to start your study effe'
BROAD PREVIEW: '{\n  "question": "You have just started the course and need to select a canonical item not discussed in class for your group presentation. What is your first ste'
DISAMB RIA PREVIEW: '{\n  "question": "You have been assigned to present a small research project on a canonical item for your seminar. What is your first step?",\n  "tiny_learn": [\n '
DISAMB RIS PREVIEW: '{\n  "question": "You need to decide how to start working on your group project about a classical canonical item not discussed in class. Which initial step do yo'
DISAMB REC PREVIEW: '{\n  "question": "You need to prepare for a group presentation on a canonical item not discussed in class. What should you do first?",\n  "tiny_learn": [\n    "A c'
DISAMB IEC PREVIEW: '{\n  "question": "You need to prepare your group presentatio

  7%|▋         | 2/28 [00:58<12:36, 29.11s/it]

DISAMB ASC PREVIEW: '{\n  "question": "You have just been assigned to a group project to research and present on a classical canonical item not covered in class. What is your first s'
BROAD PREVIEW: '{\n  "question": "You have just attended your first lecture on prominent communication theories and received the week\'s reading assignment. What is your first st'
DISAMB RIA PREVIEW: '{\n  "question": "You have just been assigned your first seminar group and given the weekly reading on communication theories. What should you do first to prepar'
DISAMB RIS PREVIEW: '{\n  "question": "You have just attended the first lecture covering various communication theories. What should be your first step to deepen your understanding?"'
DISAMB REC PREVIEW: '{\n  "question": "You have just started the course and need to prepare for the first seminar discussion on communication theories. Which first step will best set'
DISAMB IEC PREVIEW: '{\n  "question": "You have just received the assignment to anal

 11%|█         | 3/28 [01:27<12:04, 28.97s/it]

DISAMB ASC PREVIEW: '{\n  "question": "You just attended the first lecture introducing various communication theories. To prepare for the upcoming seminar, what should you do first?"'
BROAD PREVIEW: '{\n  "question": "You have just been introduced to the basics of phonetics and language acquisition in your Introduction to Linguistics course. You have been giv'
DISAMB RIA PREVIEW: '{\n  "question": "You are starting your first assignment analyzing a dataset of child language samples. What is your initial step to approach the task effectivel'
DISAMB RIS PREVIEW: '{\n  "question": "You have just received a dataset of transcribed child speech to analyze for your first linguistics exercise. What should you do first?",\n  "tin'
DISAMB REC PREVIEW: '{\n  "question": "You receive a dataset of sentences from an unfamiliar language as your first task. What should you do to start analyzing it effectively?",\n  "t'
DISAMB IEC PREVIEW: '{\n  "question": "You have just started the Introduction to Li

 14%|█▍        | 4/28 [01:51<10:49, 27.06s/it]

DISAMB ASC PREVIEW: '{\n  "question": "You receive a dataset of spoken language samples with transcription errors. What is your first step to prepare this data for linguistic analysi'
BROAD PREVIEW: '{\n  "question": "You have just received the first assignment involving limits and derivatives of functions of one variable. What is your first step to start sol'
DISAMB RIA PREVIEW: '{\n  "question": "You are starting to study the properties of continuous functions and want to decide how to begin. What should you do first?",\n  "tiny_learn": ['
DISAMB RIS PREVIEW: '{\n  "question": "You are starting to study a challenging theorem involving limits and continuity. What is your first step to ensure you understand the problem?"'
DISAMB REC PREVIEW: '{\n  "question": "You are starting to study the concept of derivatives in the course. What is the first thing you should do to fully understand the new material?'
DISAMB IEC PREVIEW: '{\n  "question": "You are starting to study functions of one va

 18%|█▊        | 5/28 [02:17<10:08, 26.45s/it]

DISAMB ASC PREVIEW: '{\n  "question": "You are starting the course and want to best prepare for the midterm exam. Which initial approach should you take to understand the course mate'
BROAD PREVIEW: '{\n  "question": "You have just started the Introduction to Programming course and your first task is to write a simple program that asks for a user\'s name and p'
DISAMB RIA PREVIEW: '{\n  "question": "After your first lecture on algorithms and variables, you want to start solving a simple problem given in natural language. What could be your '
DISAMB RIS PREVIEW: '{\n  "question": "You have just started the programming course and are asked to write a small program to check user input for correctness. What should be your fi'
DISAMB REC PREVIEW: '{\n  "question": "In your first week of programming, you encounter a task to write a simple program that uses variables and loops. Which approach should you take'
DISAMB IEC PREVIEW: '{\n  "question": "You just started the Introduction to Programm

 21%|██▏       | 6/28 [02:44<09:46, 26.66s/it]

DISAMB ASC PREVIEW: '{\n  "question": "You have just learned about variables, lists, and writing simple functions in Python. What should you do first to start a programming assignmen'
BROAD PREVIEW: '{\n  "question": "You are starting your first assignment to write a status quaestionis on a historical topic. What is your first step to ensure a solid foundatio'
DISAMB RIA PREVIEW: '{\n  "question": "You have been assigned to write the status quaestionis on a historical topic. What should you do first?",\n  "tiny_learn": [\n    "The status qua'
DISAMB RIS PREVIEW: '{\n  "question": "You are starting your first assignment to write a status quaestionis on a historical topic. What should you do first to begin your research eff'
DISAMB REC PREVIEW: '{\n  "question": "You have to write a status quaestionis on a historical topic. What is your first step to start this assignment?",\n  "tiny_learn": [\n    "Status'
DISAMB IEC PREVIEW: '{\n  "question": "You need to start your first weekly assign

 25%|██▌       | 7/28 [03:11<09:25, 26.92s/it]

DISAMB ASC PREVIEW: '{\n  "question": "You are starting your first assignment to write a status quaestionis about a historical topic. What should you do first to set a solid foundati'
BROAD PREVIEW: '{\n  "question": "Je hebt net de eerste opdracht ontvangen voor het schrijven van een status quaestionis over een historiografisch thema. Wat is jouw eerste stap'
DISAMB RIA PREVIEW: '{\n  "question": "You need to start your status quaestionis on a historiographical theme. What should you do first to build a solid foundation for your paper?",\n'
DISAMB RIS PREVIEW: '{\n  "question": "You have just received the assignment to write a status quaestionis on a historiographical theme. What should you do first to get started?",\n  '
DISAMB REC PREVIEW: '{\n  "question": "Je begint net met het schrijven van je status quaestionis voor het historiografisch thema. Wat is je eerste stap om een helder overzicht te mak'
DISAMB IEC PREVIEW: '{\n  "question": "Je begint aan de literatuurstudie voor je st

 29%|██▊       | 8/28 [03:44<09:34, 28.71s/it]

DISAMB ASC PREVIEW: '{\n  "question": "Je krijgt de opdracht om een status quaestionis te schrijven over een historiografisch thema. Wat doe je als eerste om goed van start te gaan?"'
BROAD PREVIEW: '{\n  "question": "Je hebt een nieuw gedicht ontvangen voor de eerste analyseopdracht. Wat is je eerste stap om het gedicht volgens de structuralistische benaderi'
DISAMB RIA PREVIEW: '{\n  "question": "Je hebt een gedicht gekozen voor je creatieve opdracht maar weet niet zeker hoe je het structuralistisch moet analyseren; wat is je eerste stap'
DISAMB RIS PREVIEW: '{\n  "question": "Je hebt net een dichtregel gelezen die je intrigeert door zijn onverwachte structuur. Wat is je eerste stap om deze regel te begrijpen?",\n  "ti'
DISAMB REC PREVIEW: '{\n  "question": "Je hebt net een gedicht gekozen om te analyseren voor je creatieve opdracht. Wat is een goede eerste stap om het werk grondig te begrijpen?",\n '
DISAMB IEC PREVIEW: '{\n  "question": "You are starting the first analysis assignme

 32%|███▏      | 9/28 [04:14<09:14, 29.20s/it]

DISAMB ASC PREVIEW: '{\n  "question": "Je gaat aan de slag met je eerste creatieve opdracht waarin je een bestaand verhaal bewerkt. Wat is de eerste stap die je neemt om een goede be'
BROAD PREVIEW: '{\n  "question": "You have just read the first chapter of James Wood’s \'How Fiction Works\' and need to start your first short story for the course. What is your '
DISAMB RIA PREVIEW: '{\n  "question": "Je bent net begonnen aan je eerste creatieve schrijfopdracht waarin je een kort verhaal moet schrijven. Wat is een goede eerste stap om een eff'
DISAMB RIS PREVIEW: '{\n  "question": "You\'re tasked with starting your first short story based on concepts from James Wood\'s book. Which first step would help you best understand na'
DISAMB REC PREVIEW: '{\n  "question": "You have just read the chapter on plot structure from Wood\'s book and need to start your own story. What is your first step?",\n  "tiny_learn": '
DISAMB IEC PREVIEW: '{\n  "question": "Je bent net begonnen met het schrijven v

 36%|███▌      | 10/28 [04:42<08:37, 28.77s/it]

DISAMB ASC PREVIEW: '{\n  "question": "You have just read a chapter about narrative perspective and character development; what is your first step to begin your short story?",\n  "tin'
BROAD PREVIEW: '{\n  "question": "At the start of your Ancient Philosophy course, you are given a primary source text from the Greco-Roman tradition to analyze. What is your fir'
DISAMB RIA PREVIEW: '{\n  "question": "You are preparing for a seminar on core texts from both Greco-Roman and Chinese philosophical traditions. What is your first step to engage eff'
DISAMB RIS PREVIEW: '{\n  "question": "You have just started reading a primary source from the Greco-Roman tradition. What should you do first to best understand the material?",\n  "t'
DISAMB REC PREVIEW: '{\n  "question": "You have to prepare for the first seminar by engaging with primary texts on ancient philosophies. What is the best way to start?",\n  "tiny_lear'
DISAMB IEC PREVIEW: '{\n  "question": "You have just started reading a primary sou

 39%|███▉      | 11/28 [05:12<08:14, 29.07s/it]

DISAMB ASC PREVIEW: '{\n  "question": "You are about to start reading a primary philosophical text for the first seminar. What is your first step to understand it effectively?",\n  "t'
BROAD PREVIEW: '{\n  "question": "In your first seminar, you are asked to analyze an argument about the value of knowledge. What is your initial step to engage with this task?",'
DISAMB RIA PREVIEW: '{\n  "question": "You have just started analyzing a key epistemological argument for your first seminar. What is the best initial step to take?",\n  "tiny_learn":'
DISAMB RIS PREVIEW: '{\n  "question": "You have just read a complex epistemological argument about the value of knowledge. What should you do first to get a clear understanding?",\n  '
DISAMB REC PREVIEW: '{\n  "question": "In your first seminar on epistemology, you are asked to evaluate a given argument about the nature of knowledge. What is your first step?",\n  "'
DISAMB IEC PREVIEW: '{\n  "question": "You have just read a challenging philosoph

 43%|████▎     | 12/28 [05:39<07:35, 28.48s/it]

DISAMB ASC PREVIEW: '{\n  "question": "You are starting your first seminar on epistemology and need to prepare. What should you do first to engage actively in the seminar discussions'
BROAD PREVIEW: '{\n  "question": "In your first genetics lab session, you receive a sample of bacterial DNA and need to understand its genome before proceeding. What should you '
DISAMB RIA PREVIEW: '{\n  "question": "You need to prepare for the first partial exam in Genetics by understanding how human genome structure differs from prokaryotic genomes. What i'
DISAMB RIS PREVIEW: '{\n  "question": "You are starting your first assignment on gene expression regulation. What is the best initial step to tackle this task?",\n  "tiny_learn": [\n  '
DISAMB REC PREVIEW: '{\n  "question": "You just began the Genetics course and need to start your first assignment on differences between prokaryotic and eukaryotic genomes. What shou'
DISAMB IEC PREVIEW: '{\n  "question": "You’re starting a genetics assignment on DNA

 46%|████▋     | 13/28 [06:08<07:10, 28.69s/it]

DISAMB ASC PREVIEW: '{\n  "question": "You just started the Genetics course and need to prepare for the first lab practical on DNA structure. What should be your first step?",\n  "tin'
BROAD PREVIEW: '{\n  "question": "You have just been assigned to formulate a scientific research question based on current biomedical literature. What is your first step?",\n  "t'
DISAMB RIA PREVIEW: '{\n  "question": "You have to start your first literature report by defining a research question. What is the best initial approach?",\n  "tiny_learn": [\n    "A r'
DISAMB RIS PREVIEW: '{\n  "question": "You have just received your first scientific paper to analyze for the Introduction to Biomedical Sciences course. What should you do first to s'
DISAMB REC PREVIEW: '{\n  "question": "You are starting your first assignment where you must critically analyze a research article. What is your first step?",\n  "tiny_learn": [\n    "'
DISAMB IEC PREVIEW: '{\n  "question": "You are starting your first assignment o

 50%|█████     | 14/28 [06:34<06:31, 27.97s/it]

DISAMB ASC PREVIEW: '{\n  "question": "You have just formulated a research question about the effect of a new drug on blood pressure. What is your first step to ensure your study is '
BROAD PREVIEW: '{\n  "question": "You have just attended a lecture on limits and continuity and are preparing to solve your first set of exercises. What is your first step to un'
DISAMB RIA PREVIEW: '{\n  "question": "You need to start preparing for your Calculus 1 midterm by choosing a study approach. Which first step would best suit your goal?",\n  "tiny_lea'
DISAMB RIS PREVIEW: '{\n  "question": "You are beginning to study limits and derivatives in Calculus 1. When you encounter a tough limit problem, what is your first step to understan'
DISAMB REC PREVIEW: '{\n  "question": "You are tasked with finding local extreme values of a function but first need to ensure the function is differentiable. What is your first step'
DISAMB IEC PREVIEW: '{\n  "question": "You are starting the study of limits and deri

 54%|█████▎    | 15/28 [07:02<06:03, 27.93s/it]

DISAMB ASC PREVIEW: '{\n  "question": "You have just been introduced to the concept of limits in calculus. What is the best first step to understand how to calculate a limit?",\n  "ti'
BROAD PREVIEW: '{\n  "question": "You have just been introduced to the overall case study for this Business Analytics course. What is your first step to start addressing the cas'
DISAMB RIA PREVIEW: '{\n  "question": "You have just started the Introduction to Business Analytics course and received the first case study. What should you do first to approach the'
DISAMB RIS PREVIEW: '{\n  "question": "You have just been introduced to Business Analytics and need to start your first project. What should you do first?",\n  "tiny_learn": [\n    "Bu'
DISAMB REC PREVIEW: '{\n  "question": "You are starting your first Business Analytics case study and need to decide your first step. What should you do?",\n  "tiny_learn": [\n    "Busi'
DISAMB IEC PREVIEW: '{\n  "question": "You have received a dataset for the case 

 57%|█████▋    | 16/28 [07:29<05:32, 27.70s/it]

DISAMB ASC PREVIEW: '{\n  "question": "You are starting the large case study in Business Analytics and need to decide your initial focus. What should you do first?",\n  "tiny_learn": '
BROAD PREVIEW: '{\n  "question": "You have just started your first programming assignment where you need to write a simple C++ program using if statements and loops. What is you'
DISAMB RIA PREVIEW: '{\n  "question": "You have just been assigned your first programming exercise using C++. What is the best way to start approaching the problem?",\n  "tiny_learn":'
DISAMB RIS PREVIEW: '{\n  "question": "You have just been assigned your first programming task involving loops and conditional statements in C++. How do you start tackling the proble'
DISAMB REC PREVIEW: '{\n  "question": "You just started your first programming assignment which requires writing a small C++ program to solve a problem. What should you do first?",\n '
DISAMB IEC PREVIEW: '{\n  "question": "You just received your first programming as

 61%|██████    | 17/28 [07:55<04:59, 27.25s/it]

DISAMB ASC PREVIEW: '{\n  "question": "You just started the Computer Programming course and need to tackle your first assignment on loops and conditional statements. What\'s your firs'
BROAD PREVIEW: '{\n  "question": "In your first logic exercise class, you need to analyze a propositional formula and decide how to start understanding its structure. What is yo'
DISAMB RIA PREVIEW: '{\n  "question": "You have been given a complex logical formula to simplify as your first exercise. What is your first step to tackle this task?",\n  "tiny_learn"'
DISAMB RIS PREVIEW: '{\n  "question": "You are given a complex logical formula and need to start your analysis. What is your first approach to understand and simplify it?",\n  "tiny_l'
DISAMB REC PREVIEW: '{\n  "question": "You are tackling a first set theory exercise that asks you to analyze a relation on a given set. What is your first step to ensure you understa'
DISAMB IEC PREVIEW: '{\n  "question": "You are given a complex logical formula to 

 64%|██████▍   | 18/28 [08:24<04:35, 27.58s/it]

DISAMB ASC PREVIEW: '{\n  "question": "You have just started working on a logic exercise involving Venn diagrams and set relations. What would be your first step to tackle the proble'
BROAD PREVIEW: '{\n  "question": "As a first task in your Economic Challenges course, you need to prepare by understanding the historical context of a major economic theory. Wha'
DISAMB RIA PREVIEW: '{\n  "question": "In your first week, you are asked to analyze a historical economic challenge and relate it to a current economic issue. What is your initial ap'
DISAMB RIS PREVIEW: '{\n  "question": "At the start of the course, you are asked to analyze an economic theory from a historical period. What is your first step?",\n  "tiny_learn": [\n'
DISAMB REC PREVIEW: '{\n  "question": "You are starting to study different economic theories and notice they emerged as responses to social and political events. What is your first s'
DISAMB IEC PREVIEW: '{\n  "question": "You have just read about classical and moder

 68%|██████▊   | 19/28 [08:50<04:05, 27.23s/it]

DISAMB ASC PREVIEW: '{\n  "question": "You have just started studying economic challenges and need to choose how to begin your coursework. What is your first step?",\n  "tiny_learn": '
BROAD PREVIEW: '{\n  "question": "You have just been introduced to the concept of market equilibrium using mathematical models. What is your first step to start understanding th'
DISAMB RIA PREVIEW: '{\n  "question": "You need to start working on your first assignment where you translate a real-world business problem into mathematical expressions. What is you'
DISAMB RIS PREVIEW: '{\n  "question": "You need to start working on a business case study involving market equilibrium and investment risk. What is your first step?",\n  "tiny_learn":'
DISAMB REC PREVIEW: '{\n  "question": "You have just started the course and face a complex economic problem involving mathematical models. What is your first step to approach this pr'
DISAMB IEC PREVIEW: '{\n  "question": "You are given a business scenario involving 

 71%|███████▏  | 20/28 [09:20<03:43, 27.98s/it]

DISAMB ASC PREVIEW: '{\n  "question": "You have been asked to analyze a dataset related to market prices using mathematical models. What is your first step to ensure you understand t'
BROAD PREVIEW: '{\n  "question": "You have just received the assignment brief asking you to analyze an international business problem using both deductive and inductive research'
DISAMB RIA PREVIEW: '{\n  "question": "You need to start your first research project by deciding how to frame your research question. Which approach will you choose first?",\n  "tiny_'
DISAMB RIS PREVIEW: '{\n  "question": "As a first step in your research project on international business, how would you best begin to define your research question?",\n  "tiny_learn"'
DISAMB REC PREVIEW: '{\n  "question": "You have been given a research topic on an international business problem. What is the best first step to begin your academic study?",\n  "tiny_'
DISAMB IEC PREVIEW: '{\n  "question": "You have just received your first assignmen

 75%|███████▌  | 21/28 [09:48<03:15, 27.88s/it]

DISAMB ASC PREVIEW: '{\n  "question": "You are preparing your first academic research project and need to decide how to start. Which step will help you get focused on the most releva'
BROAD PREVIEW: '{\n  "question": "You are tasked with starting a group project analyzing a multinational company\'s intercultural challenges. What is your first step?",\n  "tiny_l'
DISAMB RIA PREVIEW: '{\n  "question": "You are starting a group assignment analyzing a cross-cultural business case. What is your first step to understand the complex situation?",\n  '
DISAMB RIS PREVIEW: '{\n  "question": "You are given a case study about a conflict in an international team due to cultural misunderstandings. What should you do first to understand '
DISAMB REC PREVIEW: '{\n  "question": "You have been given a case study about a multicultural team facing conflicts in an international company. What is your first step to understand'
DISAMB IEC PREVIEW: '{\n  "question": "You are starting your first group assignmen

 79%|███████▊  | 22/28 [10:17<02:49, 28.30s/it]

DISAMB ASC PREVIEW: '{\n  "question": "You are tasked with analyzing a case study about a multicultural team facing collaboration challenges; what is your first step?",\n  "tiny_learn'
BROAD PREVIEW: '{\n  "question": "You have just been introduced to the concept of a direct proof. What is your first step to start working on a proof for the upcoming assignment'
DISAMB RIA PREVIEW: '{\n  "question": "You just started the course and need to prepare for the first homework assignment involving mathematical proofs. What is the best way to begin?'
DISAMB RIS PREVIEW: '{\n  "question": "You have just learned how to prove simple mathematical statements using direct proof and proof by contradiction. Which step should you take fir'
DISAMB REC PREVIEW: '{\n  "question": "You have just encountered a mathematical statement in your Basic Concepts in Mathematics course. What is your first step in approaching its pro'
DISAMB IEC PREVIEW: '{\n  "question": "You just encountered a problem that involves 

 82%|████████▏ | 23/28 [10:41<02:14, 26.99s/it]

DISAMB ASC PREVIEW: '{\n  "question": "You just received the first assignment with a proof problem. What is your first step to approach it effectively?",\n  "tiny_learn": [\n    "A mat'
BROAD PREVIEW: '{\n  "question": "You are beginning your study of Single Variable Calculus and encounter a problem asking you to find the limit of a complex function as x approa'
DISAMB RIA PREVIEW: '{\n  "question": "You are given a function and need to find its local maximum. What should be your first step?",\n  "tiny_learn": [\n    "A local maximum occurs wh'
DISAMB RIS PREVIEW: '{\n  "question": "You are given a function and asked to find its local maxima and minima. What should you do first?",\n  "tiny_learn": [\n    "Local maxima and min'
DISAMB REC PREVIEW: '{\n  "question": "You are given a function and need to find its local maxima and minima as part of your first assignment. What should you do first?",\n  "tiny_lea'
DISAMB IEC PREVIEW: '{\n  "question": "You are given a function and asked to f

 86%|████████▌ | 24/28 [11:12<01:52, 28.18s/it]

DISAMB ASC PREVIEW: '{\n  "question": "You are starting your first exercise on calculating derivatives and need to decide your initial approach. What should you do first?",\n  "tiny_l'
BROAD PREVIEW: '{\n  "question": "You are beginning your first group assignment on visual and material culture and need to decide how to start exploring the topic. What is your '
DISAMB RIA PREVIEW: '{\n  "question": "You are starting your first group assignment on visual culture and must decide how to analyze an artwork. What should be your first step?",\n  "'
DISAMB RIS PREVIEW: '{\n  "question": "In your first group assignment, you need to analyze how an image communicates cultural meaning. What is your first step?",\n  "tiny_learn": [\n  '
DISAMB REC PREVIEW: '{\n  "question": "You need to start your first group assignment exploring how an image functions in cultural contexts. What is your initial approach?",\n  "tiny_l'
DISAMB IEC PREVIEW: '{\n  "question": "You start your first assignment exploring

 89%|████████▉ | 25/28 [11:36<01:20, 26.99s/it]

DISAMB ASC PREVIEW: '{\n  "question": "After attending the first seminar on concepts like image, medium, and visual culture, what is your next best step to deepen your understanding?'
BROAD PREVIEW: '{\n  "question": "You have just been assigned to prepare a short presentation on the cultural impact of the Enlightenment era for your European Cultural History '
DISAMB RIA PREVIEW: '{\n  "question": "You have just started the European Cultural History course and need to decide how to prepare for the upcoming first quiz on Renaissance and Enl'
DISAMB RIS PREVIEW: '{\n  "question": "You are preparing your first presentation on the cultural impact of the Renaissance. How do you start your research process?",\n  "tiny_learn": '
DISAMB REC PREVIEW: '{\n  "question": "You have just received the syllabus for the European Cultural History course. What is your first step to prepare for the upcoming lecture on th'
DISAMB IEC PREVIEW: '{\n  "question": "You have just started the European Cultural H

 93%|█████████▎| 26/28 [12:03<00:54, 27.01s/it]

DISAMB ASC PREVIEW: '{\n  "question": "You are preparing your first presentation for the European Cultural History course. What should be your initial focus to build a strong foundat'
BROAD PREVIEW: '{\n  "question": "You have just been assigned to analyze a classic ethical dilemma for your seminar discussion. What is your first step in preparing your argumen'
DISAMB RIA PREVIEW: '{\n  "question": "You are starting your first seminar on ethical theories and need to decide how to prepare. What should be your initial approach?",\n  "tiny_lear'
DISAMB RIS PREVIEW: '{\n  "question": "You need to prepare for the first seminar by choosing how to engage with the course material. What is your best first step?",\n  "tiny_learn": ['
DISAMB REC PREVIEW: '{\n  "question": "In your first seminar, you are asked to choose an ethical theory to analyze a case about global poverty. Which approach do you take first?",\n  '
DISAMB IEC PREVIEW: '{\n  "question": "You are starting your first seminar on ethi

 96%|█████████▋| 27/28 [12:31<00:27, 27.17s/it]

DISAMB ASC PREVIEW: '{\n  "question": "In your first seminar on \'Ethics (PPE)\', you need to prepare an initial response to a question about applying virtue ethics to a current moral '
BROAD PREVIEW: '{\n  "question": "You are tasked with finding the maximum value of a function that represents an economic utility model. What is your first step?",\n  "tiny_learn'
DISAMB RIA PREVIEW: '{\n  "question": "You are assigned a problem to find the maximum value of a function with constraints. What should be your first step?",\n  "tiny_learn": [\n    "O'
DISAMB RIS PREVIEW: '{\n  "question": "You are given a problem involving optimization of a function with several variables. What is your first step to start solving it?",\n  "tiny_lea'
DISAMB REC PREVIEW: '{\n  "question": "You need to prepare for your first compulsory exercise in the Maths Lab. Which step should you take first to approach the problem most effectiv'
DISAMB IEC PREVIEW: '{\n  "question": "You are given a problem to optimize a fu

100%|██████████| 28/28 [12:55<00:00, 27.69s/it]

DISAMB ASC PREVIEW: '{\n  "question": "You are given a system of linear equations representing a political voting scenario. What is your first step to analyze the system?",\n  "tiny_l'





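## 6. Saving the Microtask Bank

The generated structure is written to a JSON file, then reloaded to spot-check the question counts for one programme.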

In [8]:
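# Convert to a plain dict before serialising (a no-op copy if it already is one)
ml_structure = dict(ml_structure)

# Make sure the output folder exists
output_dir = Path("../data_bank_microtasks")
output_dir.mkdir(parents=True, exist_ok=True)

out_path = output_dir / "microtasks_prototype.json"

# ensure_ascii=False keeps accented characters readable in the file
with out_path.open("w", encoding="utf8") as f:
    json.dump(ml_structure, f, ensure_ascii=False, indent=2)

print("Saved ML structure to:", out_path)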


Saved ML structure to: ..\data_bank_microtasks\microtasks_prototype.json
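
Before relying on the saved file, it may be worth checking that every question has usable answer options. The cell below is only a minimal sketch of such a check, not part of the generation pipeline. It assumes that each question is a dict with an `options` mapping whose entries carry a `riasec` letter and a `text`, as the broad questions in this notebook do; whether the disambiguation questions share exactly this layout is an assumption. The names `check_question`, `check_bank` and `demo_bank` are hypothetical helpers introduced for illustration.

```python
RIASEC_LETTERS = ("R", "I", "A", "S", "E", "C")


def check_question(question: dict) -> list[str]:
    """Return a list of human-readable problems found in one question."""
    options = question.get("options")
    if not isinstance(options, dict) or not options:
        return ["missing or empty 'options'"]

    problems = []
    for label, option in options.items():
        if not isinstance(option, dict):
            problems.append(f"option {label}: not an object")
            continue
        # Each option should be tagged with exactly one RIASEC letter
        if option.get("riasec") not in RIASEC_LETTERS:
            problems.append(
                f"option {label}: unexpected riasec {option.get('riasec')!r}"
            )
        # Each option needs some non-blank text to show to the student
        text = option.get("text")
        if not isinstance(text, str) or not text.strip():
            problems.append(f"option {label}: empty text")
    return problems


def check_bank(bank: dict) -> dict[str, list[str]]:
    """Map 'programme / section / index' to its problems (clean questions are skipped)."""
    report = {}
    for programme, sections in bank.items():
        for section, questions in sections.items():
            for i, question in enumerate(questions):
                problems = check_question(question)
                if problems:
                    report[f"{programme} / {section} / {i}"] = problems
    return report


# Hand-made example data (not model output), with one deliberately broken option
demo_bank = {
    "Demo Programme": {
        "broad": [
            {
                "options": {
                    "A": {"riasec": "A", "text": "Sketch the coin."},
                    "B": {"riasec": "X", "text": ""},
                }
            }
        ],
        "R": [],
    }
}

for location, problems in check_bank(demo_bank).items():
    print(location, "->", problems)
```

The same call could be run on the real structure, for example `check_bank(ml_structure)`, and an empty result would mean that no question was flagged under these assumptions.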


In [9]:
from pprint import pprint

# Reload the saved file to confirm it round-trips as valid JSON
with out_path.open("r", encoding="utf8") as f:
    data = json.load(f)

print("Programmes:", list(data.keys())[:5])

# Use the first programme as a spot check
prog_name = next(iter(data.keys()))
p = data[prog_name]

print("\nProgramme:", prog_name)
# Count the broad questions and the disambiguation questions per RIASEC letter
for key in ["broad", "R", "I", "A", "S", "E", "C"]:
    print(f" {key} questions:", len(p[key]))

print("\nExample broad question:")
pprint(p["broad"][0])

print("\nExample R disambiguation question:")
if p["R"]:
    pprint(p["R"][0])


Programmes: ['Ancient Studies', 'Communication and Information Studies', 'Econometrics and Operations Research', 'History', 'Literature and Society']

Programme: Ancient Studies
 broad questions: 2
 R questions: 6
 I questions: 6
 A questions: 6
 S questions: 6
 E questions: 6
 C questions: 6

Example broad question:
{'options': {'A': {'riasec': 'A',
                   'text': 'Sketch the coin to capture its details and symbols '
                           'for further artistic analysis.'},
             'B': {'riasec': 'I',
                   'text': 'Visit the library to locate academic articles that '
                           "explain the coin's historical period and usage."},
             'C': {'riasec': 'S',
                   'text': 'Contact the museum curator to ask about the '
                           'conservation process and physical characteristics '
                           'of the coin.'},
             'D': {'riasec': 'R',
                   'text': 'Analyze the coin