<a href="https://colab.research.google.com/github/alexk2206/tds_capstone/blob/Alex-DEV/test_qa_dataset.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Test QA dataset
Created by: Alexander Keßler

Testing a trained and evaluated model requires a separate test dataset. We generate this dataset from scratch, using new questions.

In [1]:
import pandas as pd
import random
import json
from itertools import chain, combinations
from datetime import datetime, timedelta

import google.generativeai as genai
from IPython.display import display, Markdown
from google.colab import userdata
import time

## Load required data

In [2]:
dfs = []

for i in range(1, 6):
    url = f'https://raw.githubusercontent.com/alexk2206/tds_capstone/refs/heads/main/questionnaires/questionnaire{i}.json'
    df = pd.read_json(url)
    df['options'] = df['options'].apply(lambda x: ', '.join([opt['option'] for opt in x]))
    dfs.append(df)

all_questions = pd.concat(dfs, ignore_index=True)

In [3]:
print(f"all_questions shape: {all_questions.shape}")
all_questions

all_questions shape: (25, 4)


Unnamed: 0,id,type,question,options
0,aa2d8cdd-0758-4035-b0b6-ca18e2f380d8,SINGLE_SELECT,Data processing consent,"Yes, No"
1,12e1ed1d-edaa-4e93-8645-de3850e998f9,SINGLE_SELECT,Customer group,"End User, Wholesaler, Distributor, Consultant,..."
2,625012ae-9192-4cf6-a73d-55e1813d6014,MULTI_SELECT,Products interested in,"MY-SYSTEM, Notion, JTS, JS EcoLine, AKW100, AX100"
3,0699fc5a-34a4-4160-bda1-fb135a3615da,MULTI_SELECT,What kind of follow up is planned,"Email, Phone, Schedule a Visit, No action"
4,815dab84-bc5e-4764-9777-0c0126e3173e,MULTI_SELECT,Who to copy in follow up,"Stephan Maier, Joachim Wagner, Erik Schneider,..."
5,3f34e5b3-1cb0-48ea-93d2-3f21b3371b5d,SINGLE_SELECT,Would you like to receive marketing informatio...,"Yes, No"
6,ba042f33-697e-4c6f-924c-b4de2c30f443,SINGLE_SELECT,What industry are you operating in?,"Aerospace, Computers & Networks, Government, M..."
7,7a776cc0-ffe8-4891-b8a9-dd5ff984de13,MULTI_SELECT,What products are you interested in?,"Automotive radar target simulation, Noise figu..."
8,a0148bc7-15b3-41d5-b97c-6420b8bd927c,TEXT,Notes,Please provide any additional information that...
9,5aefc81d-c5d2-41fc-bc7b-6117d1c7671e,SINGLE_SELECT,What type of company is it?,"Construction company, Craft enterprises, Scaff..."


## Setting up a Prompt for ChatGPT

We used ChatGPT to generate new questions for the test dataset. We formatted the prompt in JSON, uploaded it to our GitHub repository, and then loaded it into our environment for execution. Below is the prompt we used:

You are a salesman at a trade fair and want to ask customers who visit your exhibition stand questions. You will be given a list of questions, the question types, and possible options to answer each question. I want you to think of completely new questions, their types, and possible answer options. Your aim is to create 20 questions with each type at least once for a new questionnaire for the next trade fair. Keep it short, as you are allowed to use only up to 32 tokens per question and up to 32 tokens for their options. It is important to come up with completely new questions!

Sample questions divided by // : {questions}

Question type divided by // : {type}

Possible answer options per question divided by // : {options}

New questions with type and answer options formatted as a json:

### For {questions} we used this string:

Data processing consent // Customer group // Products interested in // What kind of follow up is planned // Who to copy in follow up // Would you like to receive marketing information from via e-mail? // What industry are you operating in? // What products are you interested in? // Notes // What type of company is it? // What is the size of your company? // When do you wish to receive a follow-up? // Any additional notes? // Which language is wanted for communication? // What is the type of contact? // What is the contact person interested in? // What phone number can we use for contact? // When does the contact person wish to receive a follow up? // Customer type // Customer satisfaction // Size of the trade fair team (on average) // CRM-System // Productinterests // Searches a solution for // Next steps

### For {type} we used this string:

SINGLE_SELECT // SINGLE_SELECT // MULTI_SELECT // MULTI_SELECT // MULTI_SELECT // SINGLE_SELECT // SINGLE_SELECT // MULTI_SELECT // TEXT // SINGLE_SELECT // SINGLE_SELECT // DATE // TEXT // SINGLE_SELECT // MULTI_SELECT // MULTI_SELECT // NUMBER // MULTI_SELECT // SINGLE_SELECT // SINGLE_SELECT // SINGLE_SELECT // SINGLE_SELECT // MULTI_SELECT // MULTI_SELECT // SINGLE_SELECT

### For {options} we used this string:

Yes, No // End User, Wholesaler, Distributor, Consultant, Planner, Architect, R&D // MY-SYSTEM, Notion, JTS, JS EcoLine, AKW100, AX100 // Email, Phone, Schedule a Visit, No action // Stephan Maier, Joachim Wagner, Erik Schneider, Oliver Eibel, Angelina Haug, Marisa Peng, Johannes Wagner, Jessica Hanke, Sandro Kalter, Jens Roschmann, Domiki Stein, Sean Kennin, Tim Persson // Yes, No // Aerospace, Computers & Networks, Government, Medical, Automotive, Defense, Industrial, Network Operators & Infrastructure, Public Safety / Law Enforcement, Physical Security // Automotive radar target simulation, Noise figure measurements, Double-Pulse Testing, Display port debugging and compliance, High-speed interconnect testing // Please provide any additional information that you would like to share. // Construction company, Craft enterprises, Scaffolding company, Trading company, Production company, Education sector // 1-10, 11-50, 51-200, 201-2000, larger than 2000 // Date // What additional information would you like to share? // German, Italian, Japanese, English, Spanish // Existing customer, Supplier, New customer / Prospect, Press / media, Competitor // 100 Additive Manufacturing, 200 Automation, 300 Advanced Manufacturing, 234 Assembly Systems, 256 Joining Systems for large components, Others // phone number // 1 week, 2 weeks, 3 weeks // New customer, Existing customer, Partner, Applicant // Very satisfied, Satisfied, Unsatisfied, Very unsatisfied // 1-5, 6-10, 11-15, 16-20, 21-30, 31-40, more than 40 // Salesforce, Pipedrive, Close.io, Microsoft Dynamics, HubSpot, CAS, SAP Sales Cloud, Adito // BusinessCards, DataEnrichment, VisitReport, Data Cleansing, DataQuality // Scan business cards, Clean up CRM, Extract data from emails, Improve CRM data quality, Capture trade fair contacts // Offer, Meeting, Call
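
The three placeholder strings above follow one pattern: a single field from every question, joined with ` // `. The snippet below is a minimal sketch of that joining step, assuming the questions are available as plain dictionaries. The two sample rows and the helper name `build_placeholder` are illustrative and not part of the original notebook.

```python
# Hypothetical miniature of the questionnaire data; the notebook itself holds 25 questions.
sample_rows = [
    {"question": "Data processing consent", "type": "SINGLE_SELECT", "options": "Yes, No"},
    {"question": "Notes", "type": "TEXT",
     "options": "Please provide any additional information that you would like to share."},
]

SEPARATOR = " // "


def build_placeholder(rows, key):
    # Join one field of every row into a single separator-delimited string.
    return SEPARATOR.join(row[key] for row in rows)


questions = build_placeholder(sample_rows, "question")
types = build_placeholder(sample_rows, "type")
options = build_placeholder(sample_rows, "options")

print(questions)
print(types)
print(options)
```

Because the options of one question already contain commas, a separator such as ` // ` keeps the boundary between questions unambiguous.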

In [4]:
test_questions_url = 'https://raw.githubusercontent.com/alexk2206/tds_capstone/refs/heads/main/datasets/test_dataset_questions.json'
test_questions = pd.read_json(test_questions_url)
test_questions

Unnamed: 0,question,type,options
0,How did you hear about our exhibition stand?,SINGLE_SELECT,"Social media, Email invitation, Trade fair web..."
1,What is your primary goal at this trade fair?,SINGLE_SELECT,"Networking, Finding suppliers, Learning about ..."
2,Which features are most important in a solution?,MULTI_SELECT,"Ease of use, Cost efficiency, Scalability, Sec..."
3,How would you prefer to receive product updates?,SINGLE_SELECT,"Email, Webinar, Newsletter, Social media, In-p..."
4,Who in your company evaluates new solutions?,MULTI_SELECT,"Team leader, IT department, Procurement, CEO, ..."
5,Do you plan to implement a solution within the...,SINGLE_SELECT,"Yes, No"
6,What is your preferred method of follow-up?,SINGLE_SELECT,"Phone call, Email, Video meeting, In-person vi..."
7,What stage are you in the buying process?,SINGLE_SELECT,"Exploration, Evaluation, Decision-making, Alre..."
8,What challenges are you currently facing in yo...,TEXT,Please share specific challenges or issues.
9,What department are you representing?,SINGLE_SELECT,"R&D, Procurement, Marketing, Operations, Other"


## Define functions

- function for combination creation of MULTI_SELECT questions
- function for budget creation
- function for date creation
- function for creation of note taking prompt
- function for processing different types of questions
- function to scale up the number of questions

In [5]:
def generate_combinations(options_list, max_size):
    # Generates all possible combinations of options from the provided list, with combination sizes ranging from 0 to the minimum of the list length or max_size.
    # Returns a list of these combinations.
    return list(chain.from_iterable(combinations(options_list, r) for r in range(0, min(len(options_list), max_size) + 1)))


def generate_budget():
    # Generates a random budget between $2000 and $18000
    # This budget is then formatted as a string with a dollar sign and returned.
    # The range is between 20 and 180, and the value is multiplied by 100 to get rounded numbers to the hundreds.
    budget = random.randint(20, 180) * 100
    return f"${budget}"


def generate_date(today=None):
    # Generates a random date within the last two weeks, based on the provided 'today' date (or the current date if none is given).
    # Returns a list containing the generated date in 'YYYY-MM-DD' format.
    if today is None:
        today = datetime.today()

    random_days = random.randint(0, 13)
    random_date = today - timedelta(days=random_days)

    date = random_date.strftime('%Y-%m-%d')

    return [date]


def generate_notes():
    # Returns a list containing the placeholder text 'Add additional information here' as the intended answer for text-based questions.
    return ['Add additional information here']
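
As a quick sanity check of the helpers above, the sketch below restates `generate_combinations` and `generate_date` and runs them on small inputs. The option names and the fixed `today` value are chosen only for illustration.

```python
import random
from datetime import datetime, timedelta
from itertools import chain, combinations


def generate_combinations(options_list, max_size):
    # All combinations of size 0 up to min(len(options_list), max_size).
    upper = min(len(options_list), max_size)
    return list(chain.from_iterable(combinations(options_list, r) for r in range(0, upper + 1)))


def generate_date(today=None):
    # Random date within the 14 days ending at 'today', as a one-element list.
    if today is None:
        today = datetime.today()
    random_date = today - timedelta(days=random.randint(0, 13))
    return [random_date.strftime('%Y-%m-%d')]


combos = generate_combinations(['Email', 'Phone', 'Visit'], max_size=2)
print(len(combos))  # 7: one empty tuple, three singles, three pairs
print(combos[0])    # ()

# Passing a fixed 'today' makes the possible date range easy to reason about.
date = generate_date(today=datetime(2025, 1, 29))[0]
print(date)  # some date from 2025-01-16 to 2025-01-29
```

Note that size 0 is included, so one expanded MULTI_SELECT row has an empty intended answer.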

In [6]:
def process_selections(row, max_size):
    # Processes selection-type questions (MULTI_SELECT and SINGLE_SELECT) by generating possible answer combinations
    # for MULTI_SELECT questions and individual options for SINGLE_SELECT questions.
    # Returns a list of dictionaries with expanded question-answer pairs.
    question = row['question']  # Extract the question text from the row
    options_list = row['options']  # Extract the list of options for the question
    question_type = row['type']  # Extract the question type (e.g., MULTI_SELECT, SINGLE_SELECT)
    expanded = []  # Initialize an empty list to store expanded question-answer pairs

    if question_type == 'MULTI_SELECT':
        options_combinations = generate_combinations(options_list, max_size=max_size)
        for combo in options_combinations:
            expanded.append({'question': question, 'type': question_type, 'options': options_list, 'intended_answer': list(combo)})

    elif question_type == 'SINGLE_SELECT':
        for option in options_list:
            expanded.append({'question': question, 'type': question_type, 'options': options_list, 'intended_answer': [option]})

    return expanded


def process_freetext(row):
    # Processes free text-type questions (TEXT, NUMBER, and DATE) by generating appropriate intended answers
    # Returns a list of dictionaries with expanded question-answer pairs.

    question = row['question']
    options_list = row['options']
    question_type = row['type']
    expanded = []

    if question_type == 'TEXT':
        expanded.append({'question': question, 'type': question_type, 'options': options_list, 'intended_answer': generate_notes()})

    elif question_type == 'NUMBER':
        expanded.append({'question': question, 'type': question_type, 'options': options_list, 'intended_answer': generate_budget()})

    elif question_type == 'DATE':
        expanded.append({'question': question, 'type': question_type, 'options': options_list, 'intended_answer': generate_date()})

    return expanded
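
To make the expansion concrete, here is a compact sketch of the selection logic applied to two hand-written rows. `expand_row` is a hypothetical restatement of `process_selections` that takes plain dictionaries instead of DataFrame rows, and the MULTI_SELECT question is made up for the example.

```python
from itertools import chain, combinations


def expand_row(row, max_size):
    # One output record per possible intended answer.
    options = row['options']
    if row['type'] == 'MULTI_SELECT':
        upper = min(len(options), max_size)
        answers = chain.from_iterable(combinations(options, r) for r in range(0, upper + 1))
        answers = [list(combo) for combo in answers]
    elif row['type'] == 'SINGLE_SELECT':
        answers = [[option] for option in options]
    else:
        answers = []
    return [{**row, 'intended_answer': answer} for answer in answers]


single = {'question': 'Do you plan to implement a solution within the next 6 months?',
          'type': 'SINGLE_SELECT', 'options': ['Yes', 'No']}
multi = {'question': 'Which channels do you use?',  # hypothetical question
         'type': 'MULTI_SELECT', 'options': ['Email', 'Phone', 'Visit']}

single_rows = expand_row(single, max_size=6)
multi_rows = expand_row(multi, max_size=6)

print(len(single_rows))  # 2: one record per option
print(len(multi_rows))   # 8: every subset of three options, including the empty one
```

A MULTI_SELECT question with n options and `max_size >= n` therefore yields 2^n records, which is why such questions dominate the expanded data before it is scaled down.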


In [7]:
min_q_amount = 8
max_q_amount = 12

def adjust_question_amount(df, column, random_state):
    # Resamples each unique question in `column` so that it occurs between min_q_amount (8) and max_q_amount (12) times.
    # Groups smaller than the drawn target are sampled with replacement; all other groups are sampled without replacement.
    # This range keeps the test dataset small.
    # Returns the DataFrame with adjusted group sizes.
    random.seed(random_state)
    def adjust_group(group):
        max_amount = random.randint(min_q_amount, max_q_amount)

        if len(group) < max_amount:
            return group.sample(n=max_amount, replace=True, random_state=random_state)
        else:
            return group.sample(n=max_amount, random_state=random_state)

    return df.groupby(column, group_keys=False).apply(adjust_group).reset_index(drop=True)
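
The resampling rule can also be illustrated without pandas. The sketch below is a standard-library analogue, not the notebook's implementation: `resample_groups` and the two sample groups are hypothetical, and it uses one shared random generator, whereas the pandas version passes the same `random_state` to every `group.sample` call.

```python
import random

MIN_Q, MAX_Q = 8, 12


def resample_groups(groups, seed):
    # groups maps a question to the list of its candidate rows.
    rng = random.Random(seed)
    resampled = {}
    for question, rows in groups.items():
        target = rng.randint(MIN_Q, MAX_Q)
        if len(rows) < target:
            # Too few rows: draw with replacement, so duplicates appear.
            resampled[question] = rng.choices(rows, k=target)
        else:
            # Enough rows: draw without replacement, so every row is distinct.
            resampled[question] = rng.sample(rows, k=target)
    return resampled


groups = {
    "yes_no": ["Yes", "No"],                       # smaller than any target
    "many": [f"answer_{i}" for i in range(64)],    # larger than any target
}
result = resample_groups(groups, seed=1)

for question, rows in result.items():
    print(question, len(rows))
```

Sampling with replacement explains why a two-option question such as a Yes/No question ends up with repeated intended answers in the scaled dataset.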

## Apply functions on the dataset
- split dataset into selection questions and free text questions
- create intended answers for selection type questions
- scale up free text questions
- create intended answers for free text questions
- append dataset

In [8]:
# This code processes the `test_questions` DataFrame by first separating the selection questions (MULTI_SELECT and SINGLE_SELECT) from the free text questions (all others).

selection_test_questions = test_questions[(test_questions['type'] == 'MULTI_SELECT') | (test_questions['type'] == 'SINGLE_SELECT')].reset_index(drop=True)
freetext_test_questions = test_questions[(test_questions['type'] != 'MULTI_SELECT') & (test_questions['type'] != 'SINGLE_SELECT')].reset_index(drop=True)

selection_counts = selection_test_questions['type'].value_counts()
freetext_counts = freetext_test_questions['type'].value_counts()

print(f'selection_test_questions shape: {selection_test_questions.shape}, counts per type:\n{selection_counts}')
print(f'freetext_test_questions shape: {freetext_test_questions.shape}, counts per type:\n{freetext_counts}')

print(selection_test_questions)
print(freetext_test_questions)

selection_test_questions shape: (16, 3), counts per type:
type
SINGLE_SELECT    12
MULTI_SELECT      4
Name: count, dtype: int64
freetext_test_questions shape: (4, 3), counts per type:
type
TEXT      2
DATE      1
NUMBER    1
Name: count, dtype: int64
                                             question           type  \
0        How did you hear about our exhibition stand?  SINGLE_SELECT   
1       What is your primary goal at this trade fair?  SINGLE_SELECT   
2    Which features are most important in a solution?   MULTI_SELECT   
3    How would you prefer to receive product updates?  SINGLE_SELECT   
4        Who in your company evaluates new solutions?   MULTI_SELECT   
5   Do you plan to implement a solution within the...  SINGLE_SELECT   
6         What is your preferred method of follow-up?  SINGLE_SELECT   
7           What stage are you in the buying process?  SINGLE_SELECT   
8               What department are you representing?  SINGLE_SELECT   
9          How many employee

In [9]:
# This code splits the 'options' column into a list for the selection questions and applies the `process_selections` function to generate combinations for MULTI_SELECT and individual options for SINGLE_SELECT questions.
# The resulting expanded data is normalized into a new DataFrame, `selection_test_intended_answers`, which contains the generated question-answer pairs.

selection_test_questions.loc[:, 'options'] = selection_test_questions['options'].str.split(', ').copy()
expanded_data = selection_test_questions.apply(lambda row: process_selections(row, max_size=6), axis=1).explode()
selection_test_intended_answers = pd.json_normalize(expanded_data)

print(f'selection_test_intended_answers shape: {selection_test_intended_answers.shape}')
selection_test_intended_answers.sample(25)

selection_test_intended_answers shape: (218, 4)


Unnamed: 0,question,type,options,intended_answer
62,Who in your company evaluates new solutions?,MULTI_SELECT,"[Team leader, IT department, Procurement, CEO,...","[CEO, Other]"
38,Which features are most important in a solution?,MULTI_SELECT,"[Ease of use, Cost efficiency, Scalability, Se...","[Ease of use, Cost efficiency, Security, Support]"
203,What support resources do you need for impleme...,MULTI_SELECT,"[Training, Documentation, Technical support, O...","[Training, Documentation, Onsite assistance]"
206,What support resources do you need for impleme...,MULTI_SELECT,"[Training, Documentation, Technical support, O...","[Training, Technical support, None]"
79,Do you plan to implement a solution within the...,SINGLE_SELECT,"[Yes, No]",[Yes]
201,What support resources do you need for impleme...,MULTI_SELECT,"[Training, Documentation, Technical support, O...","[Onsite assistance, None]"
73,Who in your company evaluates new solutions?,MULTI_SELECT,"[Team leader, IT department, Procurement, CEO,...","[Team leader, IT department, Procurement, CEO]"
176,How soon are you looking for a solution?,SINGLE_SELECT,"[Immediately, 1-3 months, 4-6 months, Over 6 m...",[Immediately]
37,Which features are most important in a solution?,MULTI_SELECT,"[Ease of use, Cost efficiency, Scalability, Se...","[Ease of use, Cost efficiency, Scalability, Su..."
215,What support resources do you need for impleme...,MULTI_SELECT,"[Training, Documentation, Technical support, O...","[Training, Technical support, Onsite assistanc..."


In [10]:
# This code scales the `selection_test_intended_answers` DataFrame.
# Here we can already scale the questions, because every intended answer is already unique.
# It ensures that each unique question has a number of occurrences between 8 and 12, using the `adjust_question_amount` function.

selection_test_intended_answers_scaled = adjust_question_amount(selection_test_intended_answers, 'question', 1)

selection_test_intended_answers_scaled_counts = selection_test_intended_answers_scaled['type'].value_counts()
print(f'selection_test_intended_answers_scaled shape: {selection_test_intended_answers_scaled.shape}\ncounts per type:\n{selection_test_intended_answers_scaled_counts}')
selection_test_intended_answers_scaled

selection_test_intended_answers_scaled shape: (161, 4)
counts per type:
type
SINGLE_SELECT    119
MULTI_SELECT      42
Name: count, dtype: int64




Unnamed: 0,question,type,options,intended_answer
0,Do you plan to implement a solution within the...,SINGLE_SELECT,"[Yes, No]",[No]
1,Do you plan to implement a solution within the...,SINGLE_SELECT,"[Yes, No]",[No]
2,Do you plan to implement a solution within the...,SINGLE_SELECT,"[Yes, No]",[Yes]
3,Do you plan to implement a solution within the...,SINGLE_SELECT,"[Yes, No]",[Yes]
4,Do you plan to implement a solution within the...,SINGLE_SELECT,"[Yes, No]",[No]
...,...,...,...,...
156,Who in your company evaluates new solutions?,MULTI_SELECT,"[Team leader, IT department, Procurement, CEO,...","[Team leader, IT department, CEO, Other]"
157,Who in your company evaluates new solutions?,MULTI_SELECT,"[Team leader, IT department, Procurement, CEO,...","[Team leader, Procurement, CEO]"
158,Who in your company evaluates new solutions?,MULTI_SELECT,"[Team leader, IT department, Procurement, CEO,...","[Procurement, Other]"
159,Who in your company evaluates new solutions?,MULTI_SELECT,"[Team leader, IT department, Procurement, CEO,...","[IT department, CEO, Other]"


In [11]:
# This code scales the `freetext_test_questions` DataFrame.
# Here we have to scale before generating intended answers in order to get different intended answers.
# If we had generated them beforehand, every intended answer for a question would have been the same.
# It ensures that each unique question has a number of occurrences between 8 and 12, using the `adjust_question_amount` function.

freetext_test_questions_scaled = adjust_question_amount(freetext_test_questions, 'question', 1)

freetext_test_questions_scaled_counts = freetext_test_questions_scaled['type'].value_counts()
print(f'freetext_test_questions_scaled shape: {freetext_test_questions_scaled.shape}\ncounts per type:\n{freetext_test_questions_scaled_counts}')
freetext_test_questions_scaled

freetext_test_questions_scaled shape: (39, 3)
counts per type:
type
TEXT      21
DATE      10
NUMBER     8
Name: count, dtype: int64




Unnamed: 0,question,type,options
0,Do you have any specific technical requirements?,TEXT,Please describe your requirements.
1,Do you have any specific technical requirements?,TEXT,Please describe your requirements.
2,Do you have any specific technical requirements?,TEXT,Please describe your requirements.
3,Do you have any specific technical requirements?,TEXT,Please describe your requirements.
4,Do you have any specific technical requirements?,TEXT,Please describe your requirements.
5,Do you have any specific technical requirements?,TEXT,Please describe your requirements.
6,Do you have any specific technical requirements?,TEXT,Please describe your requirements.
7,Do you have any specific technical requirements?,TEXT,Please describe your requirements.
8,Do you have any specific technical requirements?,TEXT,Please describe your requirements.
9,What challenges are you currently facing in yo...,TEXT,Please share specific challenges or issues.


In [12]:
# Split the 'options' column into lists of options (comma-separated) for each row
freetext_test_questions_scaled['options'] = freetext_test_questions_scaled['options'].str.split(', ')

# Initialize an empty list to accumulate the expanded rows
expanded_rows = []

# Iterate over each row in the DataFrame
for _, row in freetext_test_questions_scaled.iterrows():
    question = row['question']
    options_list = row['options']
    question_type = row['type']

    # For NUMBER questions, generate a budget as the intended answer
    if question_type == 'NUMBER':
        expanded_rows.append({'question': question, 'type': question_type, 'options': options_list, 'intended_answer': generate_budget()})

    # For TEXT questions, set a default note as the intended answer
    elif question_type == 'TEXT':
        expanded_rows.append({'question': question, 'type': question_type, 'options': options_list, 'intended_answer' : generate_notes()})

    # For DATE questions, generate a random date as the intended answer
    elif question_type == 'DATE':
        expanded_rows.append({'question': question, 'type': question_type, 'options': options_list, 'intended_answer' : generate_date()})

    # For other types (i.e., if no specific condition matched), use the options list as the intended answer
    else:
        expanded_rows.append({'question': question, 'type': question_type, 'options': options_list, 'intended_answer' : options_list})

freetext_test_intended_answer_scaled = pd.DataFrame(expanded_rows)

In [13]:
# This code combines the selection and freetext intended answer DataFrames into one
# Shuffle the combined DataFrame randomly (fraction 1 means shuffling all rows) and reset the index

combined_test_df = pd.concat([selection_test_intended_answers_scaled, freetext_test_intended_answer_scaled], ignore_index=True)
test_qa_dataset = combined_test_df.sample(frac=1, random_state=1).reset_index(drop=True)

print(f'test_qa_dataset shape: {test_qa_dataset.shape}')
test_qa_dataset.head(30)

test_qa_dataset shape: (200, 4)


Unnamed: 0,question,type,options,intended_answer
0,What department are you representing?,SINGLE_SELECT,"[R&D, Procurement, Marketing, Operations, Other]",[Operations]
1,How soon are you looking for a solution?,SINGLE_SELECT,"[Immediately, 1-3 months, 4-6 months, Over 6 m...",[Not sure]
2,How satisfied are you with the current solutio...,SINGLE_SELECT,"[Very satisfied, Satisfied, Neutral, Unsatisfi...",[Very satisfied]
3,What stage are you in the buying process?,SINGLE_SELECT,"[Exploration, Evaluation, Decision-making, Alr...",[Exploration]
4,What is your estimated budget for this project?,NUMBER,[Please provide an approximate value.],$13500
5,When do you expect to finalize your decision?,DATE,[Select an approximate date.],[2025-01-22]
6,What language do you prefer for communication?,SINGLE_SELECT,"[English, German, French, Spanish, Italian, Ot...",[German]
7,Do you plan to implement a solution within the...,SINGLE_SELECT,"[Yes, No]",[No]
8,How satisfied are you with the current solutio...,SINGLE_SELECT,"[Very satisfied, Satisfied, Neutral, Unsatisfi...",[Unsatisfied]
9,Do you have any specific technical requirements?,TEXT,[Please describe your requirements.],[Add additional information here]


In [14]:
# question count
print(test_qa_dataset['question'].value_counts())

question
How did you hear about our exhibition stand?                       12
What challenges are you currently facing in your industry?         12
Who in your company evaluates new solutions?                       12
What is your preferred method of follow-up?                        11
How would you prefer to receive product updates?                   11
What type of customer relationship are you seeking?                11
What support resources do you need for implementation?             11
What is your primary goal at this trade fair?                      11
What department are you representing?                              11
Which features are most important in a solution?                   11
When do you expect to finalize your decision?                      10
How satisfied are you with the current solutions in your field?    10
Do you have any specific technical requirements?                    9
Do you plan to implement a solution within the next 6 months?       9
What langua

In [15]:
test_qa_dataset.to_json('test_qa_dataset.json', orient='records')
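
With `orient='records'`, pandas writes the DataFrame as a JSON list with one object per row. The sketch below mimics that layout with the standard `json` module to show what a consumer of `test_qa_dataset.json` can expect. The two records are copied from the sample output above for illustration, and the round trip through a string stands in for the file.

```python
import json

# Two illustrative records in the same shape as the rows of test_qa_dataset.
records = [
    {"question": "What department are you representing?", "type": "SINGLE_SELECT",
     "options": ["R&D", "Procurement", "Marketing", "Operations", "Other"],
     "intended_answer": ["Operations"]},
    {"question": "What is your estimated budget for this project?", "type": "NUMBER",
     "options": ["Please provide an approximate value."],
     "intended_answer": "$13500"},
]

serialized = json.dumps(records)   # list of row objects, like orient='records'
restored = json.loads(serialized)

for row in restored:
    print(row["type"], row["intended_answer"])
```

The NUMBER row carries its intended answer as a plain string rather than a list, so downstream code should handle both shapes.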

## Setting up the functions and prompts

In [16]:
# Setting up the API Key and the required model
# We used gemini-2.0-flash-exp because we hit the API limit and ran into too many errors with gemini-1.5-flash.

key = userdata.get('GOOGLE_API_KEY')
genai.configure(api_key=key)
model = genai.GenerativeModel("gemini-2.0-flash-exp")

In [17]:
max_output_tokens = 48

def generate_selection_answer_easy(question, intended_answer):
  # The prompt is designed to make the model respond naturally, as if it doesn't know the answer options, but still uses the intended answer.
  prompt = f"""
  You are asked a question, and you need to provide a natural, conversational answer in the first person. Do not use special characters other than ',' and '.'.
  Act like you really do not know which options there are and the intended answer is your answer.
  When given a range, use a number between the two values.
  Be concise but clear, and avoid unnecessary elaboration. Use up to {max_output_tokens} tokens.
  Question: {question}\n
  Intended answer: {intended_answer}\n
  Answer as a sentence, mentioning and explaining all the provided options:
  """
  response = model.generate_content(
      contents = prompt,
      generation_config = genai.GenerationConfig(
          max_output_tokens=max_output_tokens,
          temperature=2)  # Higher temperature for more varied, natural responses.
    )

  # Strips extra whitespace from the generated response.
  answer = response.text.strip()

  # Introduce a short delay between API calls to avoid hitting rate limits.
  # In our runs, 6 seconds was the sweet spot.
  time.sleep(6)

  # Returns the generated answer with an 'easy' difficulty tag.
  return {"answer": answer, "difficulty": "easy"}


def generate_budget_number_answer_easy(question, intended_answer):
  # The prompt asks for a clear and professional response regarding budget information.
  prompt = f"""
  You are responding to a question about budget. Your response should be clear, concise, and professional,
  naturally integrating the provided budget into your answer. Keep it in the first person, present tense,
  and ensure it sounds conversational and appropriate. Use up to {max_output_tokens} tokens.
  Question: {question}\n
  Intended answer: {intended_answer}\n
  Answer as a sentence, providing the budget in a natural and relevant manner:
  """
  response = model.generate_content(
      contents = prompt,
      generation_config = genai.GenerationConfig(
          max_output_tokens=max_output_tokens,
          temperature=2)
    )

  answer = response.text.strip()

  time.sleep(6)

  return {"answer": answer, "difficulty": "easy"}


def generate_freetext_answer_easy(question, intended_answer):
  # The prompt is tailored for open-ended answers where the model responds naturally, either providing information or politely declining.
  prompt = f"""
  You are being asked if you have any additional notes or information to share.
  Your response should sound natural, in the first person, and can be either brief or more detailed, depending on the situation.
  You can provide additional information but you don't have to and mention it clearly and politely.
  If there isn't anything else to add, express that in a conversational manner. Use up to {max_output_tokens} tokens.
  Question: {question}\n
  Intended answer: {intended_answer}
  Answer as a sentence, providing any additional information or politely stating that there's nothing else to add:
  """
  response = model.generate_content(
      contents = prompt,
      generation_config = genai.GenerationConfig(
          max_output_tokens=max_output_tokens,
          temperature=2)
    )

  answer = response.text.strip()

  time.sleep(6)

  return {"answer": answer, "difficulty": "easy"}



def generate_date_answer_easy(question, intended_answer):
  # The prompt encourages the model to generate a natural, conversational response about a specific date.
  prompt = f"""
  You are asked a question about a specific date, and you need to provide a natural, conversational answer in the first person.
  Include the date from the intended answer in your response, phrasing it naturally as if you're suggesting a meeting.
  Be concise but clear, and use up to {max_output_tokens} tokens.
  Question: {question}\n
  Intended Answer: {intended_answer}\n
  Context: Provide a conversational response mentioning the date in a natural way:
  """
  response = model.generate_content(
      contents = prompt,
      generation_config = genai.GenerationConfig(
          max_output_tokens=max_output_tokens,
          temperature=2)
    )

  answer = response.text.strip()

  time.sleep(6)

  return {"answer": answer, "difficulty": "easy"}
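
The functions above rely on a fixed 6-second pause to stay under the rate limit. A possible complement, not used in this notebook, is to retry a failed call with a growing delay. The sketch below assumes a generic callable and catches a broad `Exception` only for illustration; `call_with_retry`, its parameters, and the simulated `flaky_call` are all hypothetical, and real code would catch the specific error raised by the API client.

```python
import time


def call_with_retry(call, max_attempts=4, base_delay=6.0, sleep=time.sleep):
    # Try the call up to max_attempts times, doubling the wait after each failure.
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            sleep(base_delay * 2 ** (attempt - 1))


# Simulated API call that fails twice before succeeding.
attempts = {"count": 0}


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("simulated rate limit")
    return "ok"


# Record the delays instead of sleeping so the example runs instantly.
delays = []
result = call_with_retry(flaky_call, sleep=delays.append)

print(result)  # ok
print(delays)  # [6.0, 12.0]
```

Injecting the `sleep` function keeps the sketch testable; in a notebook run the default `time.sleep` would apply.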

In [18]:
# Initialize a global variable to track the number of cycles
cycle_count = 0

# Define the function that generates answers for a row of data based on its type
def generate_answer_for_row(row):
    global cycle_count
    cycle_count += 1  # Increment the cycle count for each row processed
    print(f"Cycle: {cycle_count}")  # Print the current cycle number to track progress

    question = row['question']
    intended_answer = row['intended_answer']
    question_type = row['type']

    # Check the type of the question and generate the appropriate answer
    if question_type in ['SINGLE_SELECT', 'MULTI_SELECT']:
        return generate_selection_answer_easy(question, intended_answer)  # For selection type questions
    elif question_type == 'NUMBER': # For number type questions
        return generate_budget_number_answer_easy(question, intended_answer)
    elif question_type == 'TEXT': # For free-text questions
        return generate_freetext_answer_easy(question, intended_answer)
    elif question_type == 'DATE': # For date type questions
        return generate_date_answer_easy(question, intended_answer)
    else:
        return {"answer": "Unknown question type", "difficulty": "unknown"} # Default case for unknown question types

## Generating contexts

In [None]:
# Sampling
# This came in handy to check quickly on generated contexts.
cycle_count = 0

sample_type = "DATE"
sample_size = 5
sample_question = "Size of the trade fair team (on average)"

# Filtering the data lets us inspect the contexts for a specific type or question.

test_qa_dataset_filtered = test_qa_dataset[test_qa_dataset['type'] == sample_type]
#test_qa_dataset_filtered = test_qa_dataset[test_qa_dataset['question'] == sample_question]
#test_qa_dataset_filtered = test_qa_dataset.copy()

sampled_questions = test_qa_dataset_filtered.sample(n=min(sample_size, len(test_qa_dataset_filtered))).reset_index(drop=True)

sampled_questions[['context', 'difficulty']] = sampled_questions.apply(lambda row: pd.Series(generate_answer_for_row(row)), axis=1)

sampled_questions

Cycle: 1
Cycle: 2
Cycle: 3
Cycle: 4
Cycle: 5


Unnamed: 0,question,type,options,intended_answer,context,difficulty
0,When do you expect to finalize your decision?,DATE,[Select an approximate date.],[2025-01-29],Let's aim to have everything finalized around ...,easy
1,When do you expect to finalize your decision?,DATE,[Select an approximate date.],[2025-01-17],"I'm aiming to have everything finalized, maybe...",easy
2,When do you expect to finalize your decision?,DATE,[Select an approximate date.],[2025-01-16],I'm hoping we can wrap things up by January 16...,easy
3,When do you expect to finalize your decision?,DATE,[Select an approximate date.],[2025-01-18],How about we plan on finalizing things by Janu...,easy
4,When do you expect to finalize your decision?,DATE,[Select an approximate date.],[2025-01-22],"I think a meeting around January 22nd, 2025 sh...",easy
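
Besides eyeballing a sample like the one above, a short programmatic check can flag rows that need a second look. The sketch below assumes the `context` and `difficulty` columns produced by `generate_answer_for_row`. It flags empty contexts and rows that fell through to the `"unknown"` fallback. The helper name `find_suspect_rows` and the demo frame are made up for illustration and are not notebook output.

```python
import pandas as pd


def find_suspect_rows(df):
    """Return rows whose context is empty or came from the fallback branch."""
    context = df["context"].fillna("").astype(str).str.strip()
    is_empty = context == ""
    is_fallback = df["difficulty"] == "unknown"
    return df[is_empty | is_fallback]


# Tiny hand-made frame for illustration only (not real generated data).
demo = pd.DataFrame({
    "type": ["DATE", "TEXT", "FOO"],
    "context": ["By January 16, 2025.", "   ", "Unknown question type"],
    "difficulty": ["easy", "easy", "unknown"],
})

suspects = find_suspect_rows(demo)
print(suspects.index.tolist())  # prints [1, 2]
```

An empty result on the real sample would suggest that every row received a non-empty context from one of the known generators. It says nothing about whether the context actually matches the intended answer.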


In [None]:
# Generating contexts for the whole dataset

cycle_count = 0
test_qa_dataset[['context', 'difficulty']] = test_qa_dataset.apply(lambda row: pd.Series(generate_answer_for_row(row)), axis=1)
test_qa_dataset.to_json('test_qa_dataset_with_answers.json', orient='records')

Cycle: 1
Cycle: 2
...
Cycle: 100
Cycle: 1
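
A full run makes one model call per row, and the generator shown above pauses six seconds after each call, so an interruption late in the run is costly. One possible way to make the run resumable is sketched below. It is not part of the original notebook: `generate_with_checkpoint`, `_save_checkpoint` and the checkpoint file are hypothetical, and the sketch assumes a unique integer index and a generator that returns a dict with `"answer"` and `"difficulty"` keys.

```python
import json
import os

import pandas as pd


def _save_checkpoint(done, path):
    # JSON object keys must be strings, so index labels are stringified.
    with open(path, "w", encoding="utf-8") as f:
        json.dump({str(k): v for k, v in done.items()}, f)


def generate_with_checkpoint(df, generate_fn, checkpoint_path, save_every=10):
    """Call generate_fn(row) for each row, skipping rows already checkpointed.

    Assumes df has a unique integer index and generate_fn returns a dict
    with "answer" and "difficulty" keys, like generate_answer_for_row.
    """
    done = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path, encoding="utf-8") as f:
            done = {int(k): v for k, v in json.load(f).items()}

    for idx, row in df.iterrows():
        key = int(idx)
        if key in done:
            continue  # already generated in an earlier run
        done[key] = generate_fn(row)
        if len(done) % save_every == 0:
            _save_checkpoint(done, checkpoint_path)
    _save_checkpoint(done, checkpoint_path)

    out = df.copy()
    out["context"] = [done[int(i)]["answer"] for i in out.index]
    out["difficulty"] = [done[int(i)]["difficulty"] for i in out.index]
    return out
```

With this helper, the cell above could be written as `test_qa_dataset = generate_with_checkpoint(test_qa_dataset, generate_answer_for_row, 'checkpoint.json')`, where `'checkpoint.json'` is an arbitrary file name. If the generator raises partway through, up to `save_every - 1` of the most recent results are lost and the next call picks up from the last saved state.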

In [19]:
test_qa_dataset_with_answers_url = 'https://raw.githubusercontent.com/alexk2206/tds_capstone/refs/heads/main/datasets/test_qa_dataset_with_answers.json'
test_qa_dataset_with_answers = pd.read_json(test_qa_dataset_with_answers_url)
print(f'test_qa_dataset_with_answers shape = {test_qa_dataset_with_answers.shape}')
test_qa_dataset_with_answers.sample(25)

test_qa_dataset_with_answers shape = (200, 6)


Unnamed: 0,question,type,options,intended_answer,context,difficulty
171,What department are you representing?,SINGLE_SELECT,"[R&D, Procurement, Marketing, Operations, Other]",[Procurement],"Oh gosh, I'm not sure which departments there ...",easy
69,What department are you representing?,SINGLE_SELECT,"[R&D, Procurement, Marketing, Operations, Other]",[Other],"Well, I'm not really part of any specific depa...",easy
176,What is your primary goal at this trade fair?,SINGLE_SELECT,"[Networking, Finding suppliers, Learning about...",[Networking],"Oh gosh, I think my primary goal here is just ...",easy
185,What stage are you in the buying process?,SINGLE_SELECT,"[Exploration, Evaluation, Decision-making, Alr...",[Not buying],"Oh, I'm definitely not buying right now.",easy
90,What language do you prefer for communication?,SINGLE_SELECT,"[English, German, French, Spanish, Italian, Ot...",[English],"I'd have to say English, since it is what I us...",easy
195,What type of customer relationship are you see...,SINGLE_SELECT,"[Supplier, Partner, Reseller, End-user, Other]",[Supplier],"Oh, well I guess I am looking for a supplier r...",easy
118,How soon are you looking for a solution?,SINGLE_SELECT,"[Immediately, 1-3 months, 4-6 months, Over 6 m...",[1-3 months],"Hmm, I'd say probably somewhere around 2 month...",easy
8,How satisfied are you with the current solutio...,SINGLE_SELECT,"[Very satisfied, Satisfied, Neutral, Unsatisfi...",[Unsatisfied],"Oh, well, I'm definitely unsatisfied with the ...",easy
143,What is your estimated budget for this project?,NUMBER,[Please provide an approximate value.],$4400,"Right now, my estimated budget for this projec...",easy
104,Do you plan to implement a solution within the...,SINGLE_SELECT,"[Yes, No]",[Yes],"Yes, I think I will do it in that timeframe, i...",easy
