- Data organized by template (268 templates across 12 categories of social bias).
- Online Viewer: Google Spreadsheets
- Sample-level data with placeholders ([N1], [N2], [W1], etc.) filled with attributes.
- A file containing information necessary for model evaluation.
- Test set of `KoBBQ_all_samples.tsv`.
- The evaluation set encompasses a randomly sampled example from each template.
- Result of the social bias verification survey.
- Please note that the English translations were generated by GPT-4 and should be used for reference purposes only.
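The evaluation set is described as a random draw of one example from each template. A minimal sketch of that kind of per-template sampling follows. It assumes the rows are already loaded as dictionaries and that a hypothetical `template` field identifies the template; the released files may encode this differently, for example inside `sample_id`. The function name, field names, and seed are illustrative and not the authors' procedure.

```python
import random
from collections import defaultdict


def sample_one_per_template(rows, template_key, seed=0):
    """Pick one row at random from each template group.

    rows: iterable of dict-like records.
    template_key: function mapping a row to its template identifier
        (hypothetical; adapt to however templates are identified).
    seed: fixed seed so the draw is reproducible.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[template_key(row)].append(row)
    # Visit templates in sorted order so the sequence of draws is stable.
    return [rng.choice(groups[key]) for key in sorted(groups)]


# Toy rows; the field names and values are made up for illustration.
rows = [
    {"template": "Age-001", "sample": 1},
    {"template": "Age-001", "sample": 2},
    {"template": "Age-002", "sample": 1},
    {"template": "Religion-001", "sample": 1},
    {"template": "Religion-001", "sample": 2},
]
picked = sample_one_per_template(rows, lambda r: r["template"], seed=42)
print(len(picked))  # one row per distinct template
```

Fixing the seed keeps the subset reproducible across runs, which matters if results are to be compared between models.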
- `sample_id`: `{category}-{template ID}{context: a, c (counter-biased) / b, d (biased)}-{sample ID}-{context: amb (ambiguous) / dis (disambiguated)}-{question: bsd (biased) / cnt (counter-biased)}`
- `label_annotation`: Categorization of BBQ templates
  - SR: Sample-Removed
  - TM: Target-Modified
  - ST: Simply-Transferred
  - NC: Newly-Created
- `context`: A scenario where two individuals from different social groups engage in behavior related to the given stereotype
- `question`:
  - A biased question asks which group conforms to a given stereotype
  - A counter-biased question asks which group goes against it
- `choices`: Related social group options for the given context
- `biased_answer`: The answer conforming to social biases
- `answer`: The correct answer for the given context and question
- `bbq_id`: The ID of the original sample in the BBQ dataset
- `bbq_category`: The category of the original sample in the BBQ dataset
- `prediction`: Model output to be evaluated (not filled)
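The `sample_id` pattern above can be split into its parts with a regular expression. The sketch below assumes the template ID and sample ID are numeric and that the five parts are joined by hyphens exactly as the pattern shows; the example ID is hypothetical and only illustrates the shape. The names `parse_sample_id` and `SampleId` are introduced here for illustration.

```python
import re
from typing import NamedTuple

# Assumption: template ID and sample ID are digit strings.
SAMPLE_ID_RE = re.compile(
    r"^(?P<category>.+)-(?P<template_id>\d+)(?P<variant>[abcd])"
    r"-(?P<sample_no>\d+)-(?P<context>amb|dis)-(?P<question>bsd|cnt)$"
)


class SampleId(NamedTuple):
    category: str
    template_id: str
    variant: str             # a, c = counter-biased context; b, d = biased
    variant_is_biased: bool
    sample_no: str
    context: str             # amb = ambiguous, dis = disambiguated
    question: str            # bsd = biased, cnt = counter-biased


def parse_sample_id(sample_id: str) -> SampleId:
    """Split a sample_id string into its documented components."""
    m = SAMPLE_ID_RE.match(sample_id)
    if m is None:
        raise ValueError(f"unrecognized sample_id: {sample_id!r}")
    return SampleId(
        category=m["category"],
        template_id=m["template_id"],
        variant=m["variant"],
        variant_is_biased=m["variant"] in ("b", "d"),
        sample_no=m["sample_no"],
        context=m["context"],
        question=m["question"],
    )


# Hypothetical ID, written to match the documented pattern.
parsed = parse_sample_id("Age-001a-001-amb-bsd")
print(parsed.category, parsed.context, parsed.question)
```

Anchoring the pattern at both ends makes malformed IDs fail loudly instead of being partially parsed.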
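- Type 1 (Target or non-target groups must be modified or newly designated.)
  According to conventional views in Korean society, select all social groups that fit the stereotype '{stereotype}'.
  (※ This question does not ask for your personal opinion. Please answer based on stereotypes or conventional views found in Korean society.)
  Options
  □ {group_1}
  □ {group_2}
  □ ...
  □ {group_n}
  □ No relevant stereotype in Korean society
- Type 2 (Only the stereotype needs to be validated with a fixed target group.)
  According to conventional views in Korean society, when comparing {group_1} and {group_2}, select the social group that fits the stereotype '{stereotype}'.
  (※ This question does not ask for your personal opinion. Please answer based on stereotypes or conventional views found in Korean society.)
  Options
  〇 {group_1}
  〇 {group_2}
  〇 No relevant stereotype in Korean society
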
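The survey questions are templates with `{stereotype}` and `{group_n}` placeholders. A minimal sketch of filling the Type 1 (multiple-selection) template is shown below. The English wording is an illustrative rendering rather than the survey's official text, and the function names and the placeholder strings passed in are hypothetical.

```python
# Illustrative English wording of the final "no stereotype" option.
NO_STEREOTYPE_OPTION = "No relevant stereotype in Korean society"


def build_type1_options(groups):
    """Return the checkbox options: one per group, plus the opt-out option."""
    return [f"□ {group}" for group in groups] + [f"□ {NO_STEREOTYPE_OPTION}"]


def build_type1_question(stereotype, groups):
    """Fill the Type 1 template with a stereotype and candidate groups."""
    prompt = (
        "According to conventional views in Korean society, select all "
        f"social groups that fit the stereotype '{stereotype}'."
    )
    return "\n".join([prompt, "Options", *build_type1_options(groups)])


# Placeholder strings stand in for real survey content.
question = build_type1_question("<stereotype text>", ["<group A>", "<group B>"])
print(question)
```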
Group | Respondents | Percentage |
---|---|---|
Gender | ||
Male | 800 | 50.0% |
Female | 800 | 50.0% |
Age | ||
18-24 | 320 | 20.0% |
25-34 | 320 | 20.0% |
35-44 | 320 | 20.0% |
45-54 | 320 | 20.0% |
55+ | 320 | 20.0% |
Domestic Area of Origin | ||
Seoul | 468 | 29.3% |
Gyeonggi, Incheon | 350 | 21.9% |
Gyeongsang, Daegu, Busan, Ulsan | 411 | 25.7% |
Jeolla, Gwangju | 151 | 9.4% |
Chungcheong, Daejeon, Sejong | 156 | 9.8% |
Gangwon | 48 | 3.0% |
Jeju | 16 | 1.0% |
Level of Education | ||
Below high school level | 29 | 1.8% |
High school graduate or equivalent | 378 | 23.6% |
College dropout | 45 | 2.8% |
Associate degree | 209 | 13.1% |
Bachelor's degree | 808 | 50.5% |
Graduate degree | 131 | 8.2% |
Sexual Orientation | ||
Straight | 1474 | 92.1% |
LGBTQ+ | 31 | 1.9% |
Prefer not to mention | 95 | 6.0% |
Disability | ||
No | 1508 | 94.3% |
Yes | 64 | 4.0% |
Prefer not to mention | 28 | 1.8% |
Religion | ||
Christian | 275 | 17.2% |
Catholic | 122 | 7.6% |
Buddhist | 182 | 11.4% |
Islamic | 1 | 0.1% |
No religion | 979 | 61.2% |
Prefer not to mention | 41 | 2.6% |
Political Orientation | ||
Conservative | 223 | 13.9% |
Progressive | 314 | 19.6% |
Moderate | 903 | 56.4% |
Prefer not to mention | 160 | 10.0% |
Marital Status | ||
No | 795 | 49.7% |
Yes | 805 | 50.3% |
Employment Status | ||
Employed - less than 40h/week | 361 | 22.6% |
Employed - more than 40h/week | 748 | 46.8% |
Unemployed - Seeking employment | 182 | 11.4% |
Unemployed - Not seeking employment | 249 | 15.6% |
Retired | 54 | 3.4% |
Disabled - Unable to work | 6 | 0.4% |
Annual Income | ||
Below 13 million KRW | 139 | 8.7% |
13 million-30 million KRW | 249 | 15.6% |
30 million-50 million KRW | 447 | 27.9% |
50 million-76 million KRW | 374 | 23.4% |
76 million-150 million KRW | 355 | 22.2% |
150+ million KRW | 36 | 2.3% |
Total | 1600 | 100% |