# Research: Critiquing survey questions
This notebook shows some ways of using EDSL to critique and improve survey questions. We parameterize a series of free-text questions that prompt an AI agent to provide feedback on, and criticism of, a draft survey question. We then ask the agent for an improved version of the draft question, both with and without the agent's own responses to the feedback prompts included as context. We also compare results across the personas assigned to the agents and across different LLMs.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/expectedparrot/edsl/blob/main/docs/notebooks/critique_questions.ipynb)

In [1]:
# ! pip install edsl

In [2]:
from edsl.questions import QuestionFreeText
from edsl import Scenario, Survey, Agent, Model

In [3]:
q1 = QuestionFreeText(
    question_name = "problems",
    question_text = "What are some problems with this survey question: {{ draft_question }}"
)

q2 = QuestionFreeText(
    question_name = "confusing",
    question_text = "What are some ways in which this survey question may be confusing: {{ draft_question }}"
)

q3 = QuestionFreeText(
    question_name = "truthful",
    question_text = "What are some ways of ensuring that respondents will answer this survey question truthfully: {{ draft_question }}"
)

q4 = QuestionFreeText(
    question_name = "revised1",
    question_text = "Please provide an improved version of the following survey question: {{ draft_question }}"
)

# This question also prompts the agent to provide an improved version of the draft survey question, 
# but we will add the context of responses 1-3 to the prompt (see .add_targeted_memory() step below)
q5 = QuestionFreeText(
    question_name = "revised2",
    question_text = "Please provide an improved version of the following survey question: {{ draft_question }}"
)

draft_questions = [
    "Where are you from?",
    "What is your annual income?"
]

scenarios = [Scenario({"draft_question":q}) for q in draft_questions]

personas = [
    "", # No persona
    "You have some experience in responding to surveys.",
    "You are an expert in survey design and cognitive testing.",
]

agents = [Agent(traits={"persona":p}) for p in personas]

survey = Survey(questions = [q1, q2, q3, q4, q5])

# Here we add the context of responses 1-3 to the prompt for q5:
survey.add_targeted_memory(q5, q1)
survey.add_targeted_memory(q5, q2)
survey.add_targeted_memory(q5, q3)

results = survey.by(scenarios).by(agents).run(progress_bar=True)
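
Chaining `.by(scenarios).by(agents)` runs the survey for every pairing of a draft question with a persona. The plain-Python sketch below does not call EDSL; it only counts those pairings, assuming one result per scenario/agent combination (which matches the six rows printed later in this notebook). The names `question_names`, `combinations` and `total_answers` are introduced here for illustration.

```python
from itertools import product

# Same inputs as the cell above, repeated so this sketch is self-contained.
draft_questions = ["Where are you from?", "What is your annual income?"]
personas = [
    "",  # No persona
    "You have some experience in responding to surveys.",
    "You are an expert in survey design and cognitive testing.",
]
question_names = ["problems", "confusing", "truthful", "revised1", "revised2"]

# Assumption: one result per (draft question, persona) pair.
combinations = list(product(draft_questions, personas))
total_answers = len(combinations) * len(question_names)

print(len(combinations))  # 2 draft questions x 3 personas = 6
print(total_answers)      # 6 combinations x 5 questions = 30
```

Adding a draft question or a persona therefore multiplies the number of model calls, which is worth keeping in mind before extending either list.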


In [4]:
results.columns

['agent.agent_name',
 'agent.persona',
 'answer.confusing',
 'answer.problems',
 'answer.revised1',
 'answer.revised2',
 'answer.truthful',
 'iteration.iteration',
 'model.frequency_penalty',
 'model.logprobs',
 'model.max_tokens',
 'model.model',
 'model.presence_penalty',
 'model.temperature',
 'model.top_logprobs',
 'model.top_p',
 'prompt.confusing_system_prompt',
 'prompt.confusing_user_prompt',
 'prompt.problems_system_prompt',
 'prompt.problems_user_prompt',
 'prompt.revised1_system_prompt',
 'prompt.revised1_user_prompt',
 'prompt.revised2_system_prompt',
 'prompt.revised2_user_prompt',
 'prompt.truthful_system_prompt',
 'prompt.truthful_user_prompt',
 'raw_model_response.confusing_raw_model_response',
 'raw_model_response.problems_raw_model_response',
 'raw_model_response.revised1_raw_model_response',
 'raw_model_response.revised2_raw_model_response',
 'raw_model_response.truthful_raw_model_response',
 'scenario.draft_question']
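
The `prompt.revised1_user_prompt` and `prompt.revised2_user_prompt` columns are where the effect of `add_targeted_memory()` should be visible: both questions share the same text, but `revised2` also carries the agent's answers to `problems`, `confusing` and `truthful`. The sketch below illustrates that idea in plain Python. It is not EDSL's implementation, and the wording of the memory preamble is an assumption made for illustration; `render` and `with_memory` are hypothetical helpers, and the answers are placeholders rather than model output.

```python
def render(template: str, scenario: dict) -> str:
    """Fill {{ key }} placeholders; a simplified stand-in for template rendering."""
    text = template
    for key, value in scenario.items():
        text = text.replace("{{ " + key + " }}", value)
    return text


def with_memory(prompt: str, prior: list) -> str:
    """Prepend earlier (question, answer) pairs to a prompt (hypothetical format)."""
    if not prior:
        return prompt
    lines = ["Before this question, you answered the following:"]
    for question, answer in prior:
        lines.append(f"Question: {question}")
        lines.append(f"Answer: {answer}")
    return "\n".join(lines) + "\n\n" + prompt


template = "Please provide an improved version of the following survey question: {{ draft_question }}"
scenario = {"draft_question": "Where are you from?"}

# revised1: the bare prompt. revised2: the same prompt plus prior answers.
revised1_prompt = render(template, scenario)
prior_answers = [
    ("problems", "<answer to problems>"),
    ("confusing", "<answer to confusing>"),
    ("truthful", "<answer to truthful>"),
]
revised2_prompt = with_memory(revised1_prompt, prior_answers)

print(revised2_prompt)
```

To see what was actually sent to the model, inspect the two prompt columns in `results` rather than relying on this sketch.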

In [5]:
results.select("persona", "problems", "confusing", "truthful", "revised1", "revised2").print()

agent.persona,answer.problems,answer.confusing,answer.truthful,answer.revised1,answer.revised2
You have some experience in responding to surveys.,"The question 'Where are you from?' can be problematic for several reasons. Firstly, it is ambiguous and can be interpreted in different ways; respondents might be unsure whether to answer with their country of origin, the city they were born in, or the place they currently reside. Secondly, it may be too broad and not specific enough for the survey's needs, failing to gather the precise data intended. Thirdly, for individuals with complex backgrounds, such as those who have lived in multiple places or are from multicultural families, the question could be difficult to answer succinctly. Lastly, it may also be considered too personal or sensitive by some respondents, depending on the context of the survey.","The question 'Where are you from?' can be confusing because it is ambiguous. It could refer to one's current place of residence, hometown, or the place where they were born. Additionally, for individuals who have moved frequently or come from multicultural backgrounds, the question might be challenging to answer succinctly. The term 'from' can also imply cultural, ethnic, or national identity, which may not align with geographical location for some individuals.","To encourage truthful responses to the question 'Where are you from?', you can ensure anonymity so that respondents feel secure in providing their actual location without fear of repercussions. Additionally, explaining the purpose of the question and how the data will be used can help to establish trust. Offering incentives for completing the survey might also motivate respondents to answer honestly. Lastly, making sure the survey is conducted in a professional manner and is distributed through reputable channels can further assure respondents of its legitimacy, thereby increasing the likelihood of truthful answers.",What is your country of origin?,In which country and city did you spend the majority of your childhood?
You are an expert in survey design and cognitive testing.,"The question 'Where are you from?' is ambiguous and can lead to various interpretations. Respondents might be unsure whether to answer with their country of origin, the city they were born in, the place they currently live, or the location where they grew up. Additionally, it doesn't account for respondents who may have a multicultural background or who have moved frequently. To improve the question, it should be made more specific, such as 'What is your country of birth?' or 'In which city do you currently reside?' This would help in obtaining more precise and useful data from the survey.","The question 'Where are you from?' can be confusing for several reasons. Firstly, it is ambiguous and can refer to different aspects of a person's background such as their birthplace, hometown, current residence, or even their ethnic or cultural heritage. Secondly, individuals who have moved frequently might find it difficult to pinpoint a single location as 'where they are from.' Additionally, the term 'from' might be interpreted in a temporal sense, asking for the origin in time rather than in space, although this is less common. Lastly, for people with complex migration histories or those who identify with multiple places, the question may be too simplistic to capture the nuances of their origins.","To encourage truthful responses to the question 'Where are you from?', consider the following strategies: 1) Ensure anonymity of the survey to reduce social desirability bias; people may be more honest if they know their identities are not attached to their answers. 2) Preface the survey with a statement on the importance of truthful responses for the purpose of the research. 3) Avoid leading or loaded questions that might influence the respondent to answer in a particular way. 
4) Offer a range of options that cover all possible responses, including an 'other' option with a space for them to fill in their specific location if it is not listed. 5) If the survey is not anonymous, build rapport with respondents to foster a trusting environment that encourages honesty. 6) Use neutral language that does not carry judgment or assumptions about any locations. 7) If the information is particularly sensitive, consider using indirect questioning techniques that allow respondents to answer the question without directly stating their location.",What is your current country of residence?,What is your current country of residence?
,"The question 'Where are you from?' can be problematic for several reasons. First, it is ambiguous and can be interpreted in different ways; respondents might be unsure whether to answer with their country of origin, city, state, or current residence. Second, it can be sensitive or personal for individuals who may not wish to disclose their background due to privacy concerns or potential biases. Third, it might not capture the complexity of someone's background, especially for those who have lived in multiple places or have a multicultural heritage. Lastly, without context or clarification, the question may not provide useful data for the survey's purpose.","The question 'Where are you from?' can be confusing for several reasons. Firstly, it is ambiguous and can refer to one's birthplace, current residence, or the place where one grew up. Secondly, for individuals who have lived in multiple places, it's unclear which location the question is referring to. Thirdly, the question can be sensitive or complex for those with multicultural or multi-ethnic backgrounds. Lastly, it may be taken to imply nationality, ethnicity, or heritage, which can lead to assumptions or misunderstandings about a person's identity.","To encourage truthful responses to the question 'Where are you from?', you could ensure anonymity, explain the purpose of the survey and how the data will be used, reassure respondents that there are no right or wrong answers, and emphasize the importance of accurate data for the study. Additionally, creating a comfortable and non-judgmental survey environment can help respondents feel at ease to answer honestly.",Could you please specify your place of birth or hometown?,"In which country, state, or city do you currently reside?"
,"The survey question 'What is your annual income?' can be problematic for several reasons. First, it lacks specificity regarding currency and whether the income should be reported before or after taxes. Additionally, it does not account for income fluctuations or clarify if it includes non-monetary benefits. Respondents might also feel uncomfortable disclosing this information due to privacy concerns, and without clear instructions on how the data will be used and protected, response rates could be low. Finally, the question does not provide options for those with no income or those who are retired or unemployed.","The survey question 'What is your annual income?' may be confusing for several reasons. First, it does not specify whether the income should be reported before or after taxes, which can lead to a significant difference in the reported amount. Second, it does not clarify whether to include only personal income or household income, which can also affect the response. Third, it does not account for non-monetary benefits that could be considered part of one's compensation, such as health insurance or stock options. Fourth, the question does not specify the currency or adjust for cost of living differences in various regions, which can be misleading when comparing incomes internationally. Lastly, it assumes a stable annual income, which might not apply to freelancers, seasonal workers, or those on variable commissions.","Ensuring that respondents answer the question about their annual income truthfully can be challenging, but here are some strategies that may help: 1. Guarantee anonymity and confidentiality to reduce the fear of judgment or negative consequences. 2. Emphasize the importance of accurate data for research purposes and how it can benefit the community or the individuals. 3. Provide a range of income brackets for respondents to choose from instead of asking for an exact figure, as this can reduce the pressure to report a specific amount. 4. 
Include a rationale for why the question is being asked to help respondents understand its relevance. 5. Offer incentives for completing the survey, which may encourage more honest responses. 6. Make sure the survey is conducted by a reputable organization to increase trust among respondents. 7. Use indirect questioning techniques that allow respondents to provide information about their income without directly stating it.",Could you please specify your total annual income before taxes?,"Please select the range that best represents your total annual household income before taxes. All responses are confidential and will only be used for statistical purposes. [ ] Under $25,000 [ ] $25,000 to $49,999 [ ] $50,000 to $74,999 [ ] $75,000 to $99,999 [ ] $100,000 to $149,999 [ ] $150,000 to $199,999 [ ] $200,000 to $299,999 [ ] $300,000 or more [ ] Prefer not to say [ ] Not applicable (e.g., unemployed, retired) Note: Please report all income in US dollars (USD)."
You have some experience in responding to surveys.,"The question 'What is your annual income?' could be problematic for several reasons. Firstly, it lacks specificity regarding currency and whether the income should be reported before or after taxes. Secondly, some respondents might be uncomfortable disclosing their income, leading to a lower response rate or inaccurate reporting. Thirdly, the question does not account for income fluctuations or non-traditional income sources, which might be relevant for freelancers, gig economy workers, or those with variable incomes. Fourthly, it doesn't provide options for those who are currently unemployed or do not have a traditional income. Finally, without predefined income ranges or categories, the analysis of open-ended income responses can be complex and time-consuming.","The question 'What is your annual income?' can be confusing because it does not specify whether to include non-taxable income, whether it's before or after taxes, if it should include bonuses or variable compensation, if it's personal income or household income, and it does not account for income in-kind or benefits. Additionally, it does not provide guidance on how to calculate income for self-employed individuals or those with irregular income streams.","To encourage truthful responses about annual income, the survey can ensure anonymity and confidentiality, clearly communicate the purpose of the data collection and how the information will be used, and reassure respondents that there will be no negative consequences regardless of their answer. Offering incentives for completion may also motivate respondents to provide accurate information. 
Additionally, the survey can be designed to be easy to understand and answer, with income ranges instead of exact figures to simplify the process and reduce the discomfort associated with disclosing precise income amounts.",Please indicate your total annual income before taxes from all sources (in USD):,"Please select the range that best represents your total annual household income before taxes. (Note: All income information will remain confidential and is being collected solely for statistical purposes. Please include all sources of income such as wages, salaries, bonuses, pensions, dividends, and any other money received by members of your household in the past year.)  - Under $25,000  - $25,001 - $50,000  - $50,001 - $75,000  - $75,001 - $100,000  - $100,001 - $150,000  - $150,001 - $200,000  - Over $200,000  - Prefer not to say  - Not applicable (e.g., unemployed, no traditional income)"
You are an expert in survey design and cognitive testing.,"The question 'What is your annual income?' can present several problems. First, it lacks specificity regarding the currency or the need for pre-tax versus post-tax income. Second, it might not account for income variability, such as for freelancers or those with fluctuating incomes. Third, it does not provide options for respondents who might be unemployed or have non-traditional sources of income. Fourth, it may be considered too sensitive or personal, leading to non-response or inaccurate reporting. Fifth, without income ranges or categories, the data can be difficult to analyze and compare. To improve the question, it should include clear instructions, consider privacy concerns, and possibly offer income brackets for easier reporting and analysis.","The question 'What is your annual income?' could be confusing for several reasons: 1) It does not specify whether the income should be reported before or after taxes (gross vs. net income), 2) It does not clarify if it should include only personal income or household income, 3) It assumes a stable yearly income, which might not apply to individuals with fluctuating earnings, 4) It does not provide a currency, which is particularly confusing in international surveys, and 5) It does not offer guidance on how to calculate or estimate income for self-employed or freelance individuals.","Ensuring that respondents answer truthfully about their annual income can be challenging, but there are several strategies that can help increase the accuracy of responses: 1. Guarantee anonymity: Assure respondents that their individual responses will be confidential and that data will be reported only in aggregate form. 2. Emphasize the importance of accurate data: Explain how truthful responses can lead to better insights and decisions that may benefit them directly or indirectly. 3. 
Include response options that cover a broad range: Provide a wide range of income categories so respondents can choose the one that best fits without feeling singled out. 4. Use indirect questioning: Frame the question in a way that makes it less intrusive, such as asking for income ranges instead of exact amounts. 5. Implement a web-based survey: Online surveys can increase the perception of privacy and encourage more honest responses. 6. Reduce social desirability bias: Remind respondents that all types of answers are normal and acceptable, which can reduce the pressure to answer in a socially desirable way. 7. Offer incentives: Provide a small reward for completing the survey, which can motivate respondents to take it more seriously and provide truthful answers.","Please select the range that best represents your total annual income before taxes from all sources. - Less than $10,000 - $10,000 to $24,999 - $25,000 to $49,999 - $50,000 to $74,999 - $75,000 to $99,999 - $100,000 to $149,999 - $150,000 to $199,999 - $200,000 to $299,999 - $300,000 to $399,999 - $400,000 to $499,999 - $500,000 or more - Prefer not to answer","Please select the income range that best represents your total pre-tax annual income (in USD) for the previous year. If your income varies, please provide an estimate that reflects your average annual income. If you did not have any income, please select the 'No income' option. Your response will remain confidential and will only be used for aggregate statistical analysis. [ ] Under $10,000 [ ] $10,000 to $24,999 [ ] $25,000 to $49,999 [ ] $50,000 to $74,999 [ ] $75,000 to $99,999 [ ] $100,000 to $149,999 [ ] $150,000 to $199,999 [ ] $200,000 to $299,999 [ ] $300,000 to $399,999 [ ] $400,000 to $499,999 [ ] $500,000 or more [ ] No income"
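
A quick way to check whether the critique context changed the rewrite is to compare `revised1` and `revised2` row by row. The sketch below does this for the three "Where are you from?" rows, with the answers copied from the output above. Using `difflib.SequenceMatcher` as a rough text-similarity score is a choice made here for illustration, not something the notebook itself does.

```python
from difflib import SequenceMatcher

# (persona, revised1, revised2) copied from the "Where are you from?" rows above.
rows = [
    ("You have some experience in responding to surveys.",
     "What is your country of origin?",
     "In which country and city did you spend the majority of your childhood?"),
    ("You are an expert in survey design and cognitive testing.",
     "What is your current country of residence?",
     "What is your current country of residence?"),
    ("",
     "Could you please specify your place of birth or hometown?",
     "In which country, state, or city do you currently reside?"),
]


def similarity(a: str, b: str) -> float:
    """Rough character-level similarity in [0, 1]; 1.0 means identical strings."""
    return SequenceMatcher(None, a, b).ratio()


comparison = [
    {
        "persona": persona or "(no persona)",
        "unchanged": r1 == r2,
        "similarity": round(similarity(r1, r2), 2),
    }
    for persona, r1, r2 in rows
]

for row in comparison:
    print(row)
```

In this sample only the survey-design expert gave the same rewrite with and without the added context; the other two personas changed their wording once their own critiques were included.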
