Question quality is poor #32

@jeremymanning

There are several recurring issues across different questions. They ALL need to be addressed via a careful audit of EVERY question:

| Problem | Example or elaboration | Consequence | Proposed solution |
| --- | --- | --- | --- |
| Question text is too long and/or complex | Some questions include extraneous detail (outside the scope of the to-be-tested concept) that makes parsing more difficult without improving signal | This leads to testing reading comprehension instead of the desired conceptual content | Audit long questions and simplify/reword to focus on the desired concept. Keep questions short, simple, and direct |
| Distractors vary along non-critical dimensions | For example, there's a question about the Terracotta Army (correct answer). Several distractors list "Terracotta Army" followed by additional extraneous text that goes beyond the scope of the initial question (e.g., "from the funerary temple ..." vs "from the burial complex ..." vs "from the mausoleum of ...") | This ends up focusing the test on those minor details instead of the core concept | Reword questions and responses so that the "answers" and "distractors" are very short (1-3ish words) |
| Answers can be determined from context without actually having expertise in the target area | Question: "What hardstone material, mined and carved in China since the Neolithic...". "jade" appears in 3 options so it must be jade. "gemstone" appears in 3 options so it must be gemstone. "virtue and purity" appears in 3 options so it must be that. B is the option that contains all of those, so the answer must be B. You can apply this logic to ~3/4 of the questions | This reduces the utility and signal provided by the questions (about knowledge), since correct responses end up reflecting ability to pattern match more than expertise or knowledge | Carefully audit all questions to determine whether the content of EITHER the question or the response options provides sufficient information, in and of itself, to easily guess the answer without actually having expertise in the tested area |
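As a rough illustration of the third problem, the "it appears in 3 options" reasoning can be approximated mechanically. The sketch below is only a heuristic pre-filter, not a replacement for the careful audit: the function names, the stopword list, and the example options are all hypothetical (the option text is invented to mimic the jade example, since the real options are not quoted here), and the 3-word limit is taken from the "1-3ish words" suggestion in the table.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real audit would likely want a fuller one.
STOPWORDS = {"a", "an", "and", "the", "of", "for", "in", "to", "its", "with"}


def content_words(text):
    """Lowercased set of non-stopword words in one response option."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def guess_by_overlap(options):
    """Guess an answer using only word overlap between options.

    A word counts as a "majority word" if it appears in more than half of
    the options. Returns the key of the single option containing the most
    majority words, or None if no option stands out.
    """
    word_sets = {key: content_words(text) for key, text in options.items()}
    counts = Counter(w for words in word_sets.values() for w in words)
    majority = {w for w, c in counts.items() if c > len(options) / 2}
    scores = {key: len(words & majority) for key, words in word_sets.items()}
    best = max(scores.values())
    winners = [key for key, score in scores.items() if score == best]
    return winners[0] if best > 0 and len(winners) == 1 else None


def too_long(options, max_words=3):
    """Keys of options longer than max_words words."""
    return [key for key, text in options.items() if len(text.split()) > max_words]


# Invented options that mimic the pattern described in the table.
leaky_options = {
    "A": "Jade, a gemstone prized for strength and power",
    "B": "Jade, a gemstone prized for virtue and purity",
    "C": "Jade, a mineral prized for virtue and purity",
    "D": "Marble, a gemstone prized for virtue and purity",
}
short_options = {"A": "Jade", "B": "Marble", "C": "Bronze", "D": "Lacquer"}

print(guess_by_overlap(leaky_options))  # B
print(too_long(leaky_options))          # ['A', 'B', 'C', 'D']
print(guess_by_overlap(short_options))  # None
```

If `guess_by_overlap` returns the correct answer for a question, that question is a strong candidate for rewording; a `None` result does not prove the question is free of context clues.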

Suggested approach:

  1. Create a skill to audit and improve questions for a given domain (follow general approach of generate-questions skill)
  2. For each question in the given domain, audit carefully for the above issues and return a re-worded question + responses
  3. Do this across multiple passes:
     • Pass 1: flag which issues in the table are present and re-word
     • Pass 2: re-audit for all issues in the table. Continue alternating between auditing and fixing until the question passes all audits.
  4. Then update the question.
  5. After all questions have been updated, we will need to re-embed all questions.
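The alternating audit/fix passes in step 3 could be structured roughly as follows. This is a minimal sketch under stated assumptions: `audit_until_clean`, `toy_audit`, and `toy_fix` are hypothetical names, the question is assumed to be a dict with `text` and `options` keys, and the toy audit and fix functions are simple word-count stand-ins for whatever the actual skill (or a human reviewer) would do. The cap on the number of fixes is an added safeguard so the loop cannot run forever.

```python
def audit_until_clean(question, audit, fix, max_fixes=5):
    """Alternate auditing and fixing until the audit reports no issues.

    audit(question) returns a list of issue labels (empty means it passes).
    fix(question, issues) returns a re-worded copy of the question.
    Returns the final question and the number of fixes applied.
    """
    fixes = 0
    while True:
        issues = audit(question)
        if not issues:
            return question, fixes
        if fixes == max_fixes:
            raise RuntimeError(f"still failing after {max_fixes} fixes: {issues}")
        question = fix(question, issues)
        fixes += 1


# Stand-ins only: a real audit would check every issue in the table above.
def toy_audit(question):
    issues = []
    if len(question["text"].split()) > 15:
        issues.append("question_too_long")
    if any(len(option.split()) > 3 for option in question["options"]):
        issues.append("options_too_long")
    return issues


def toy_fix(question, issues):
    """Address only the first flagged issue, so several passes are needed."""
    fixed = dict(question)
    if issues[0] == "question_too_long":
        # Truncation is a placeholder for genuine re-wording.
        fixed["text"] = " ".join(question["text"].split()[:15])
    else:
        fixed["options"] = [" ".join(o.split()[:3]) for o in question["options"]]
    return fixed


# Invented example question for demonstration purposes.
example = {
    "text": " ".join(["word"] * 20),
    "options": ["one two three four", "five six", "seven", "eight nine ten eleven"],
}

cleaned, n_fixes = audit_until_clean(example, toy_audit, toy_fix)
print(n_fixes)             # 2
print(toy_audit(cleaned))  # []
```

Keeping `audit` and `fix` as separate callables means the same loop can be reused per domain, and the stored question is only updated once `audit` returns an empty list.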
