# Research: Critiquing survey questions
This notebook shows some ways of using EDSL to critique and improve survey questions. We parameterize a series of free text questions that prompt an AI agent to provide feedback on and criticism of a draft survey question. We then ask the agent to provide an improved version of the question, both with and without the agent's own responses to the feedback prompts as context. We also compare results across different personas assigned to the agents and across different LLMs.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/expectedparrot/edsl/blob/main/docs/notebooks/critique_questions.ipynb)

In [1]:
# ! pip install edsl

In [2]:
from edsl.questions import QuestionFreeText
from edsl import Scenario, Survey, Agent, Model

In [7]:
q1 = QuestionFreeText(
    question_name="problems",
    question_text="What are some problems with this survey question: {{ draft_question }}",
)

q2 = QuestionFreeText(
    question_name="confusing",
    question_text="What are some ways in which this survey question may be confusing: {{ draft_question }}",
)

q3 = QuestionFreeText(
    question_name="truthful",
    question_text="What are some ways of ensuring that respondents will answer this survey question truthfully: {{ draft_question }}",
)

q4 = QuestionFreeText(
    question_name="revised1",
    question_text="Please provide an improved version of the following survey question: {{ draft_question }}",
)

# This question also prompts the agent to provide an improved version of the draft survey question,
# but we will add the context of responses 1-3 to the prompt (see .add_targeted_memory() step below)
q5 = QuestionFreeText(
    question_name="revised2",
    question_text="Please provide an improved version of the following survey question: {{ draft_question }}",
)

draft_questions = ["Where are you from?", "What is your annual income?"]

scenarios = [Scenario({"draft_question": q}) for q in draft_questions]

personas = [
    "",  # No persona
    "You have some experience in responding to surveys.",
    "You are an expert in survey design and cognitive testing.",
]

agents = [Agent(traits={"persona": p}) for p in personas]

survey = Survey(questions=[q1, q2, q3, q4, q5])

# Here we add the context of responses 1-3 to the prompt for q5:
survey.add_targeted_memory(q5, q1)
survey.add_targeted_memory(q5, q2)
survey.add_targeted_memory(q5, q3)

results = survey.by(scenarios).by(agents).run(progress_bar=True, stop_on_exception=True, cache=False)

Output()
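
To make the structure of the job above concrete, here is a minimal sketch in plain Python. It is not EDSL's implementation: it assumes that `{{ draft_question }}` placeholders are filled from each scenario and that every scenario is paired with every agent. The `render` and `with_memory` helpers, and the format used to append earlier answers, are hypothetical.

```python
import re
from itertools import product


def render(template: str, scenario: dict) -> str:
    """Fill {{ name }} placeholders from a scenario dict (hypothetical stand-in for EDSL's templating)."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda match: str(scenario[match.group(1)]),
        template,
    )


def with_memory(prompt: str, prior: dict) -> str:
    """Append earlier question/answer pairs to a prompt (assumed format, for illustration only)."""
    context = "\n".join(f"{name}: {answer}" for name, answer in prior.items())
    return f"{prompt}\n\nYour earlier responses:\n{context}"


template = "What are some problems with this survey question: {{ draft_question }}"
draft_questions = ["Where are you from?", "What is your annual income?"]
personas = [
    "",
    "You have some experience in responding to surveys.",
    "You are an expert in survey design and cognitive testing.",
]
question_names = ["problems", "confusing", "truthful", "revised1", "revised2"]

scenarios = [{"draft_question": q} for q in draft_questions]

# Every scenario is paired with every persona; each pair answers all five questions.
interviews = list(product(scenarios, personas))
n_prompts = len(interviews) * len(question_names)

first_prompt = render(template, scenarios[0])
print(first_prompt)
print(len(interviews), "interviews,", n_prompts, "question prompts")

# revised2 differs from revised1 only in that its prompt carries the earlier answers.
revised_template = "Please provide an improved version of the following survey question: {{ draft_question }}"
revised2_prompt = with_memory(
    render(revised_template, scenarios[0]),
    {"problems": "<answer to q1>", "confusing": "<answer to q2>", "truthful": "<answer to q3>"},
)
print(revised2_prompt)
```

With two draft questions and three personas this gives six interviews, which matches the six rows of results printed later in the notebook.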

In [4]:
results.columns

['agent.agent_instruction',
 'agent.agent_name',
 'agent.persona',
 'answer.confusing',
 'answer.problems',
 'answer.revised1',
 'answer.revised2',
 'answer.truthful',
 'model.frequency_penalty',
 'model.logprobs',
 'model.max_tokens',
 'model.model',
 'model.presence_penalty',
 'model.temperature',
 'model.top_logprobs',
 'model.top_p',
 'prompt.confusing_system_prompt',
 'prompt.confusing_user_prompt',
 'prompt.problems_system_prompt',
 'prompt.problems_user_prompt',
 'prompt.revised1_system_prompt',
 'prompt.revised1_user_prompt',
 'prompt.revised2_system_prompt',
 'prompt.revised2_user_prompt',
 'prompt.truthful_system_prompt',
 'prompt.truthful_user_prompt',
 'question_options.confusing_question_options',
 'question_options.problems_question_options',
 'question_options.revised1_question_options',
 'question_options.revised2_question_options',
 'question_options.truthful_question_options',
 'question_text.confusing_question_text',
 'question_text.problems_question_text',
 'questio
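
The column names follow a `prefix.name` pattern, where the prefix indicates the kind of data (agent, answer, model, prompt, and so on). As a small sketch in plain Python, using a hand-copied subset of the names above rather than the live `results` object, the columns can be grouped by prefix:

```python
from collections import defaultdict

# A subset of the column names listed above, copied by hand.
columns = [
    "agent.persona",
    "answer.problems",
    "answer.revised1",
    "answer.revised2",
    "model.temperature",
    "prompt.revised2_user_prompt",
    "question_text.problems_question_text",
]

# Split each name at the first dot and collect the names under their prefix.
groups = defaultdict(list)
for column in columns:
    prefix, _, name = column.partition(".")
    groups[prefix].append(name)

for prefix, names in groups.items():
    print(f"{prefix}: {names}")
```

The `prompt.*` columns are presumably the place to check whether the earlier answers were actually included in the prompt for `revised2`.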

In [5]:
results.select(
    "persona", "problems", "confusing", "truthful", "revised1", "revised2"
).print()

agent.persona,answer.problems,answer.confusing,answer.truthful,answer.revised1,answer.revised2
,"The survey question 'Where are you from?' can be problematic due to its ambiguity and potential for misunderstanding. Respondents might be unsure whether to answer with their country of origin, city, state, or current place of residence. It can also be sensitive for individuals who may not want to disclose this information due to privacy concerns or for those who have complex backgrounds and do not identify with a single place. Additionally, the question may not be inclusive of individuals who consider themselves to be from multiple locations. To improve the question, it could be made more specific, such as 'What is your current country of residence?' or 'In which city did you spend most of your childhood?'","The question 'Where are you from?' can be confusing for several reasons. It can be unclear whether it refers to one's birthplace, current place of residence, nationality, or ethnic background. Additionally, for individuals who have moved frequently or come from multicultural backgrounds, the answer may not be straightforward. People might also confuse it with where they are currently living if they are traveling or have recently moved.","To encourage truthful responses to the question 'Where are you from?', you can ensure anonymity, explaining that the data is confidential and will be used for statistical purposes only. You can also make the question optional, reducing pressure on respondents who might not want to disclose their location. Additionally, you can provide a clear and easy-to-understand reason for asking the question, helping respondents see the value in providing accurate information.",I am originally from .,Please specify the city and country where you currently reside.
You have some experience in responding to surveys.,"The question 'Where are you from?' can be problematic due to its ambiguity and potential sensitivity. It may not be clear whether the question refers to someone's current residence, hometown, country of birth, or ethnic origin, which can lead to a wide range of interpretations and responses. Additionally, the question might make some respondents uncomfortable or feel it is an invasion of privacy, especially if they have a complex background or personal history related to their place of origin. It could also be considered too broad and not specific enough to yield useful data for the survey, as the term 'from' can encompass a large geographical area without providing specific insights.","The question 'Where are you from?' can be confusing because it is ambiguous. It could refer to a person's hometown, the place they were born, the city or country they consider home, or their ethnic background. Additionally, for people who have moved frequently or come from multicultural backgrounds, the question might be difficult to answer succinctly. The question also assumes that 'from' refers to a geographical location, which might not be the case for everyone, especially those who identify more with a cultural or social community rather than a physical place.","To encourage truthful answers to the question 'Where are you from?', you can ensure anonymity, explain the importance of truthful responses for the integrity of the data, reassure respondents that there are no right or wrong answers, and make sure the survey is conducted in a secure and confidential manner.",Please specify your place of birth or the location you consider your hometown.,"I was born in Warsaw, Poland, but I currently reside in Toronto, Canada."
You have some experience in responding to surveys.,"The question 'What is your annual income?' can be problematic for several reasons: 1) It lacks specificity regarding the currency, which could lead to confusion for respondents from different countries; 2) It does not specify whether the income should be reported before or after taxes, which can significantly change the reported amount; 3) The question may be considered sensitive or intrusive by some respondents, leading to non-response or inaccurate reporting; 4) It does not provide options for respondents who may not have a traditional income, such as retirees, students, or those unemployed; 5) The question does not clarify if it includes other forms of income like investments, alimony, or government assistance, potentially leading to inconsistent data.","The question 'What is your annual income?' can be confusing due to several reasons: It does not specify whether it refers to gross income or net income after taxes, it does not account for non-monetary benefits that might be considered part of one's compensation, it does not clarify if it includes investment income or only wages from employment, it assumes a stable income, which might not be the case for freelancers or those on variable incomes, and it does not address multiple income households where combined income might be more relevant.","To encourage truthful responses to the question about annual income, you can ensure anonymity and confidentiality, explaining to respondents that their information won't be shared with third parties. Providing a rationale for why the question is important can also help. Implementing a user-friendly survey design and including income ranges rather than asking for a specific figure may increase comfort levels. Additionally, using incentives for completing the survey can motivate honest responses. 
Lastly, assuring respondents that there are no right or wrong answers and that their honesty is crucial for the integrity of the research can promote truthfulness.",Please select the range that best represents your total annual income before taxes:,"Please indicate your total annual income before taxes from all sources (wages, investments, alimony, etc.) in your local currency. Choose the range that best represents your income: - Less than 20,000 - 20,001 to 40,000 - 40,001 to 60,000 - 60,001 to 80,000 - 80,001 to 100,000 - 100,001 to 150,000 - More than 150,000 - Prefer not to say (Your response will remain confidential and is important for our research. There are no right or wrong answers.)"
You are an expert in survey design and cognitive testing.,"The survey question 'Where are you from?' is ambiguous and can lead to several issues. Firstly, it is not specific enough, as respondents might be unsure whether to answer with their country of origin, the city where they were born, the place they currently live, or their ethnic background. Secondly, it assumes that 'from' refers to a geographical location, which might not be applicable to all respondents, such as those with a multicultural background. Additionally, the question may be too broad and not yield data that is useful for the survey's purpose. To improve the question, it should be rephrased for clarity, such as 'What is your country of birth?' or 'In which city do you currently reside?' depending on the context of the survey.","The question 'Where are you from?' can be confusing for several reasons. First, it is ambiguous because it could refer to a person's birthplace, current residence, or the place where they grew up. Second, for individuals who have moved frequently or come from multicultural backgrounds, it may be difficult to pinpoint one specific place. Third, it lacks context and specificity, which might lead to a wide range of interpretations and responses that may not align with the surveyor's intent. To improve clarity, the question could be rephrased to 'What is your current city of residence?' or 'In which city were you born?', depending on the information sought.","To encourage truthful responses to the question 'Where are you from?', one can implement several strategies: 1. Assure anonymity so respondents feel safe to disclose their information without fear of repercussions. 2. Clearly explain the purpose of the question and how the data will be used, which helps to build trust and understanding. 3. Make sure the question is non-intrusive and culturally sensitive to avoid discomfort that might lead to dishonesty. 4. 
Offer a 'prefer not to say' option to respect privacy and reduce the pressure to provide a specific answer. 5. Use a well-designed survey layout that is professional and appears legitimate to increase the perceived importance of honest reporting. 6. Pilot test the question to ensure it does not lead to confusion or misinterpretation, which can affect the accuracy of responses.",,
,"The survey question 'What is your annual income?' may have several problems such as lack of specificity, privacy concerns, cultural sensitivity, potential for misinterpretation, and non-inclusiveness of various income types. It doesn't specify the currency or whether it's before or after taxes, which can lead to inconsistent responses. Some respondents may not feel comfortable sharing this information, leading to a lower response rate or inaccurate data. The question may not translate well across different cultures where income is discussed differently. Additionally, it assumes a regular annual income, which may not apply to freelancers or those with irregular earnings.","The survey question 'What is your annual income?' might be confusing for several reasons. First, it does not specify the currency, which can lead to inaccurate responses from people in different countries. Second, it doesn't clarify whether it's asking for gross income (before taxes and deductions) or net income (after taxes and deductions). Third, it does not account for non-monetary compensation that might be considered part of one's income, such as health benefits or stock options. Fourth, it assumes a regular annual income, which might not apply to freelancers or people with fluctuating earnings. Lastly, it doesn't provide a range or scale for respondents to select from, which could lead to a wide variety of responses that are difficult to categorize and analyze.","Ensuring truthful responses to sensitive questions like annual income can be challenging, but here are some strategies that might help: 1. Guarantee anonymity: Assure respondents that their identities will not be linked to their responses. 2. Emphasize the importance of accurate data: Explain how truthful responses can lead to better decision-making or services that benefit them. 3. Use indirect questioning: Ask questions that allow respondents to provide information about their income bracket rather than an exact figure. 4. 
Provide a secure survey environment: Use encrypted digital platforms or secure paper forms to reassure respondents about the privacy of their data. 5. Include a rationale for the question: Explain why knowing their income is essential for the survey's purpose. 6. Offer incentives: Provide a small reward for completing the survey, which can motivate honest responses. 7. Use data validation: Employ techniques to cross-check the information provided with known data points to encourage accuracy.",Please specify your total annual income before taxes from all sources.,"Please select the range that best represents your total annual income before taxes. This information is confidential and will be used solely for statistical analysis to improve our services. Your individual response will remain anonymous. (All amounts are in USD): - Under $20,000 - $20,000 to $39,999 - $40,000 to $59,999 - $60,000 to $79,999 - $80,000 to $99,999 - $100,000 to $119,999 - $120,000 to $139,999 - $140,000 to $159,999 - $160,000 to $179,999 - $180,000 to $199,999 - Over $200,000 - Prefer not to say"
You are an expert in survey design and cognitive testing.,"Several issues could arise with the question 'What is your annual income?' Some of the problems include lack of specificity, sensitivity of the topic, potential for misunderstanding, and lack of response options. Firstly, the question does not specify whether it refers to before-tax or after-tax income, which can lead to inconsistent responses. Secondly, income can be a sensitive topic, and respondents might be hesitant to provide this information, especially if the survey does not ensure confidentiality. Thirdly, the question does not account for variations in income sources; for example, people might have multiple income streams or non-traditional sources of income that are not easily quantifiable in a single question. Finally, without providing a range or specific options to choose from, the question may yield a wide variety of responses that can be challenging to analyze and could lead to inaccuracies if respondents do not all interpret the question in the same way.","The survey question 'What is your annual income?' can be confusing for several reasons: 1) It does not specify whether the income should be reported before or after taxes (gross vs. net income). 2) It does not clarify if all sources of income should be included, such as investments, government benefits, or only employment income. 3) The question does not state the currency or if the respondent should adjust for purchasing power parity if they are from different countries. 4) It does not define the time period for 'annual' (e.g., the last calendar year, the last 12 months, or the current fiscal year). 5) The question assumes everyone has a stable annual income, which may not be true for freelancers, seasonal workers, or those on variable income. 6) It does not provide options for respondents who are unemployed or retired. 
7) The question may be sensitive, leading to potential inaccuracies due to respondents' discomfort in disclosing their income.","Ensuring that respondents answer the question about their annual income truthfully can be challenging due to the sensitive nature of the topic. However, there are several strategies that can be employed to increase the likelihood of honest responses: 1. Guarantee anonymity: Assure respondents that their individual responses will be confidential and that results will only be reported in aggregate form. 2. Emphasize the importance of accurate data: Explain how truthful responses can lead to better decision-making or more relevant findings that can benefit the respondent's community or demographic. 3. Provide a range of income brackets: Instead of asking for an exact figure, offer categories of income ranges for respondents to select from, which might reduce the discomfort associated with disclosing exact amounts. 4. Include a 'Prefer not to say' option: This allows respondents who are uncomfortable sharing their income to still participate in the survey without providing false information. 5. Ensure the survey is conducted by a trusted organization: Respondents are more likely to be truthful if they trust the organization conducting the survey. 6. Use indirect questioning: For example, ask about spending habits or lifestyle indicators that can be correlated with income levels instead of asking about income directly. 7. Normalize the question: Preface the income question with statements that indicate it is a common question to ask and that many others have already answered honestly.",What is your total annual income before taxes? Please select the range that best represents your income.,"In the past 12 months, what has been your total household income before taxes? Please select the range that best represents your income from the options below. 
All responses will remain confidential and are important for us to ensure the accuracy of our study. - Under $25,000 - $25,000 to $49,999 - $50,000 to $74,999 - $75,000 to $99,999 - $100,000 to $149,999 - $150,000 to $199,999 - $200,000 to $249,999 - $250,000 to $299,999 - $300,000 to $349,999 - $350,000 to $399,999 - $400,000 to $449,999 - $450,000 to $499,999 - Over $500,000 - Prefer not to say"
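
Not every revision above is usable: some are empty, and some answer the draft question instead of rewriting it. The following sketch shows one way to flag such cases for manual review. It parses an excerpt of the output above (the three "Where are you from?" rows, reduced to the persona and the two revision columns) with Python's `csv` module. The `needs_review` heuristic is an assumption made for illustration, not an EDSL feature.

```python
import csv
import io

# Excerpt of the printed output above, keeping only three of the six columns.
excerpt = '''agent.persona,answer.revised1,answer.revised2
,I am originally from .,Please specify the city and country where you currently reside.
You have some experience in responding to surveys.,Please specify your place of birth or the location you consider your hometown.,"I was born in Warsaw, Poland, but I currently reside in Toronto, Canada."
You are an expert in survey design and cognitive testing.,,
'''


def needs_review(text: str) -> bool:
    """Rough heuristic (an assumption): flag empty revisions and first-person statements."""
    stripped = text.strip()
    return not stripped or stripped.startswith(("I ", "I'"))


# csv handles the quoted field that contains commas.
rows = list(csv.DictReader(io.StringIO(excerpt)))

flags = [
    (row["agent.persona"] or "(no persona)", column)
    for row in rows
    for column in ("answer.revised1", "answer.revised2")
    if needs_review(row[column])
]

for persona, column in flags:
    print(f"{persona}: {column} needs review")
```

A heuristic this simple will miss other failure modes, so it is best treated as a first pass before reading the revisions directly.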
