-
Notifications
You must be signed in to change notification settings - Fork 1
/
dictionaries.Rmd
627 lines (343 loc) · 37.7 KB
/
dictionaries.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
---
title: "Dictionaries"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{Dictionaries}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
```{r setup, include = FALSE}
library(actdata)
library(dplyr)
```
# EPA dictionary data
actdata makes available a number of what are known as EPA dictionaries. These dictionaries provide the measured evaluation, potency, and activity (EPA) values associated with terms.
In addition to measured (or in one case, estimated) EPA values, the package provides metadata on the data collection and the term. These metadata are provided as additional variables. These variables are:
* **dataset**: an identifier unique to a particular study (e.g. `nc1978`; `morocco2015`). These keys are used within this package's functions to access particular datasets.
* **context**: The country (or other context, such as the internet) where the data was originally collected.
* **year**: The year of data collection. In some cases these are approximate; always check original sources for definitive information.
* **component**: indicates what category the term belongs to. Not all studies provide all possible components, and some focus on only one component. Components include:
* **identity:** Words that can be used to refer to people. Identities can serve as the actors or the objects in an actor-behavior-object ACT event. They are typically nouns (e.g. academic, woman, youngster).
* **modified_identity:** A person with an item that may change their affective connotation (e.g. adolescent with a flat basketball). Note that there are also some identities modified with adjectives mixed into the "identities" category (e.g. white man)--those are classified as identities rather than modified identities at present.
* **behavior:** Actions that people (or other entities, like governments) can perform. Most can serve as the behavior in an actor-behavior-object ACT event. These are typically verbs (e.g. wheedle, acclaim, work).
* **modifier:** Typically adjectives that can be applied to identities (e.g. active, witty, young)
* **setting:** Places and situations (e.g. airplane, alley, worship_service)
* **value:** Traits that people can have (some overlap with modifiers) or concepts or states of being that they may consider desirable or undesirable (e.g. authenticity, accessibility, tradition, risk averse)
* **artifact:** A non-human thing (e.g. cigar, gas guzzler, sports car, slippers)
* **group**: The subsample of the respondents who provided the rating. All datasets provide group = "all", representing the average over the entire sample. Many data sets additionally provide separate summary statistics for male and female respondents. Some dictionaries (e.g., internationaldomesticrelations1981, gaymensanfrancisco1980) provide other subgroups; see dictionary documentation for details. Summary statistics across the whole sample (group = "all) are calculated slightly differently depending on data set. Some dictionaries (e.g. the 2015 US, Morocco, and Egypt dictionaries) are originally published as average values over all respondents. In these cases, `all` is the only provided option. Other dictionaries are originally published in male and female subsets. Average values over all raters are not provided in these originally published sets. In this case, the package provides an approximate average calculated by averaging the values for each subsample. Typically, subsamples are male and female genders, studies recruit approximately equal numbers of men and women, and men and women's ratings do not differ substantially on most terms. In these cases, we expect these approximate average values to be reasonably close to those that we would obtain from an average over all raters. For more information on gender and affect control theory dictionaries, see section 4.1 of David Heise's *Expressive Order* (2007).
* **instcodes:** A fourteen digit binary code that further classifies terms. See the section on institution codes below for details.
## Available dictionaries
This package contains data from 40 different publicly available affect control theory dictionary data sets. Basic information on these dictionaries is shown in the table below. Detailed information on each dictionary, including references, is available at the end of this page.
**Please cite these data!** The data sets included in this package were originally collected by a number of different research teams. When using them for publication, please cite both their publications about the data (linked in the dataset detail section below) and this package.
Not sure which data set to use? If you simply need dependable recent ratings of a wide-ranging set of concepts in a US context (such as is used in much ACT behavior modeling research), I recommend the usfullsurveyor2015 or the usmturk2015 data sets. These are the largest and most recent general dictionaries that are currently available. Other data sets may be useful for questions regarding specific kinds of terms, other cultural contexts, or other points in time.
```{r dict key table, echo = FALSE}
dict_table <- actdata:::get_meta("dict") %>%
dplyr::select(-filetype) %>%
dplyr::select(key, context, year, stats, components) %>%
dplyr::arrange(key)
# , genders, individual, information, notes) %>%
# mutate(linkname = "Webpage")
# dict_table$information <- kableExtra::cell_spec(dict_table$linkname, format = "html", link = dict_table$information)
# dict_table <- dplyr::select(dict_table, -linkname)
names(dict_table) <- c("Dataset key", "Country or context", "Year", "Statistics available", "Components")
# , "Genders", "Individual data available", "More information", "Notes")
dict_table %>%
kableExtra::kable(escape = FALSE) %>%
kableExtra::kable_styling()
```
## Accessing dictionary data: summary statistics
Within the package, all summary data is stored in one data frame named `epa_summary_statistics`. This data frame contains all available EPA means and, when available, institution codes, variances, covariances, and respondent numbers for all terms in all dictionaries. There is one row per term-dataset-gender group.
These data are identical to (and usually sourced directly from) that provided in other places, such as [affectcontroltheory.org](http://affectcontroltheory.org/resources-for-researchers/data-sets-for-simulation/), Interact Java, and a legacy [affect control theory](https://cs.uwaterloo.ca/~jhoey/research/ACTBackup/ACT/data.html) [website](https://cs.uwaterloo.ca/~jhoey/research/ACTBackup/ACT/interact/importable_data.htm).
Researchers will rarely need or want to work with all of these data at one time. To easily build subsets of this data frame, use the `epa_subset()` function. This function allows users to search by term, filter by dataset, return only certain summary statistics, and more. Both the original data frame and these subsets are provided in long form, making them easy to manipulate further using the Tidyverse if needed.
The columns in this data set correspond to those in the dictionary metadata table above. However, to aid sorting, only one year (instead of a range) is provided for all data sets.
```{r subset examples}
# Return all sample average entries for terms containing "friend"
contains_friend <- epa_subset(expr = "friend", group = "all")
head(contains_friend)
# Return all entries for the identity "friend"
friend <- epa_subset(expr = "friend", exactmatch = TRUE, component = "identity")
head(friend)
# Return the entire Ontario 1980 dataset
all_ontario1980 <- epa_subset(dataset = "ontario1980")
head(all_ontario1980)
# Return this same dataset, but only include the female mean values for behaviors
f_mean_ontario1980 <- epa_subset(dataset = "ontario1980", group = "female", component = "behavior", stat = "mean")
head(f_mean_ontario1980)
```
## Accessing dictionary data: individual-level data
EPA summary information is likely to be sufficient for most research questions. Existing research in affect control theory almost always uses summary data. However, the respondent-level data used to compute these summaries may also be useful in particular instances.
A number of data sets, all from 2015 or later, include individual-level data, newly made publicly available in this package. Some of these are subsets of others. These are the following (see the dictionary table above for more information):
1. morocco2015
2. egypt2015
3. usmturk2015
4. dukestudent2015
5. uga2015
6. dukecommunity2015
7. usstudent2015 (a combination of 4 and 5)
8. usfullsurveyor2015 (a combination of 4, 5, and 6)
9. occs2019
10. occs2020
11. artifactmods2022
12. humanvalues2022
13. products2022
All individual data is located in the `individual` data frame within this package. Like the summary datasets, these data are provided in long form, with one respondent's ratings of one term per row. Where available, respondents' gender, race, and age are also included.
To subset these data, use the `epa_subset()` function with the datatype argument set to "individual."
When using any of the US 2015 individual level data sets, keep in mind that usstudent2015 and usfullsurveyor2015 are combinations of other datasets. In the individual data, rows belonging to these sets are included, but are categorized as belonging to the more specific data set (dukestudent2015, uga2015, or dukecommunity2015). You may still use the usstudent2015 and usfullsurveyor2015 keys in `epa_subset()` with `datatype = individual`. The function will return rows from the appropriate combination of data sets.
```{r subset individual data}
# Return all individual-level ratings for identities from the egypt2015 data set
egypt_individual <- epa_subset(dataset = "egypt2015", datatype = "individual", component = "identity")
head(egypt_individual)
# Return all ratings by female-identifying students in the usstudent2015 data set
female_students <- epa_subset(dataset = "usstudent2015", datatype = "individual") %>%
dplyr::filter(gender == "Female")
head(female_students)
```
## Term table
One of the main goals of this package is to make it easy to compare meaning across dictionaries. To this end, the package provides a data frame called `term_table` that shows at a glance which terms are included in which dictionaries. Each column in these tables represents a dictionary (labeled with its key) and each row is a term. Cell entries (0/1) indicate whether or not the specified dictionary has the specified term. These tables can easily be modified further to generate summaries across a set of dictionaries of interest. To see the entries for only a particular component, to search by term, or to limit to a particular set of dictionaries, use dplyr functions to filter the term table.
```{r term table}
# the whole table
head(term_table)
# settings only
set_tt <- term_table %>%
dplyr::filter(component == "setting")
head(set_tt)
# limit to only the two Germany dictionaries and exclude terms in neither
german_tt <- term_table %>%
dplyr::select(term, component, germany1989, germany2007) %>%
dplyr::filter(germany1989 + germany2007 >= 1)
head(german_tt)
# limit to terms that contain "friend"
friend_tt <- term_table %>%
dplyr::filter(stringr::str_detect(term, "friend"))
head(friend_tt)
```
## Institution codes
Contexts restrict the labels that we consider reasonable choices. For instance, if two people are discussing next year's budget in a business meeting, it would seem quite unlikely for one to label the other as a "priest". A business-related label, like "manager" or "employee", or a label that applies across a variety of contexts, like "genius" or "jerk", would seem more realistic.
EPA dictionaries usually contain 14-digit binary strings known as "institution codes" that contain information about what social contexts terms apply within. These codes can be used by analysis software when simulating interaction.
Valid categories are (see Heise's 2007 book *Expressive Order* for details):
* male, female: What genders terms can typically be applied to (identities only)
* overt, surmised: Whether labeling behaviors requires interpretation or insight on the part of the observer (behaviors only)
* place, time: Type of setting (settings only)
* lay, business, law, politics, academe, medicine, religion, family, sexual: Social institutions that terms may belong to. Identities, behaviors, and settings only.
* monadic, group, corporal: How a term requires or implicates others. Identities, behaviors, and settings only.
* adjective, adverb: Part of speech (modifiers)
* emotion, trait, status, feature, emotion_spiral: Categories for modifiers.
This package provides several ways to demystify and make use of these codes. See the function documentation for more details.
* The `epa_subset()` function takes an `institutions` argument that allows a user to filter by institution.
* The `expand_instcodes()` function converts institution code strings into columns containing TRUE/FALSE/NA values. These values indicate whether a category is applicable to a term and, if so, whether the term belongs to the category. Representing institutions in this way makes it easier for users to work with them.
* The `create_instcode()` function takes a component and a set of logical values indicating category membership and returns a properly formatted institution code binary string. This is useful for creating new institution codes.
```{r institution codes}
businesslaw <- epa_subset(dataset = "morocco2015", institutions = c("business", "law"), stat = "mean") %>%
dplyr::select(term, component, instcodes, E, P, A)
head(businesslaw)
# the default is to keep terms for which there are no institution codes.
# Change this behavior using the drop.na.instcodes argument in epa_subset().
businesslaw <- expand_instcodes(businesslaw) %>%
dplyr::select(-E, -P, -A)
head(businesslaw, 7)
newcode <- create_instcode(component = "setting", place = TRUE, family = TRUE, religion = TRUE)
print(paste("The institution code", newcode, "represents a setting that is a place and is relevant to only the family and religion domains."))
```
# Dataset collection details
Data collection efforts are typically intended to either be general or specific. General data collections aim to provide information on the typical, normative culture of a place or context, typically a country. They measure EPA for terms that apply to a wide range of social situations, and use samples of respondents that are argued to be either representative of a country's population at large or "cultural experts" who are deeply familiar with the types of spaces in which the cultural of interest is reproduced (Heise 2007). Data from these general data collections is often used in simulations of actor-behavior-object ACT events. Specific data collections measure EPA for terms in only a particular domain of interest, and may recruit respondents from only a target subpopulation of interest. Such data are sometimes used in simulations of events, but often the reason to collect them is to be able to better describe meanings in a particular domain.
Below, in approximate reverse chronological order within general/specific categories, are details including links for more information about the collection of each of the datasets contained in this package.
## General dictionaries
### mostafaviestimates2022
Description: These EPA values were estimated using Mostafavi, Porter, and Robinson's Bidirectional Encoder Representations from Transforms (BERT) model. Most terms from previously collected dictionaries are included.
Sample: Not applicable; values were estimated, not empirically measured. However, the model was trained on the 2015 US Full Surveyor (usfullsurveyor2015) dataset.
Authors: Moeen Mostafavi, Michael D. Porter, Dawn T. Robinson
Relevant publications/more information: [Mostafavi, Porter, and Robinson 2022](https://arxiv.org/abs/2202.00065)
### Calcutta 2017: calcuttaall2017 and calcuttasubset2017
Description: Semantic differential ratings of 1,469 concepts in Bengali, a language spoken by about 250 million individuals in eastern India and Bangladesh. The calcuttaall2017 dataset contains summary information from all respondents. The calcuttasubset2017 dataset contains summary information calculated using data from just respondents who used scales as expected. See linked paper for details.
Sample: 20 male and 20 female native Bengali speakers living in Calcutta, India
Year collected: 2013
Authors: Shibashis Mukherjee, David Heise
Relevant publications/more information: [Mukherjee and Heise 2017](https://link.springer.com/article/10.3758/s13428-016-0704-6)
### United States 2015: dukecommunity2015, dukestudent2015, uga2015, usfullsurveyor2015, usmturk2015, usstudent2015
Description: Ratings of 929 identities, 814 behaviors, and 660 modifiers collected between 2012 and 2014 from several samples of people living in the United States: students at the University of Georgia (uga2015), students at Duke University (dukestudent2015), non-students living in Durham, North Carolina (dukecommunity2015), and US-based workers on Amazon Mechanical Turk (usmturk2015). Some keys refer to combinations of datasets. usstudent2014 is the combination of dukestudent2015 and uga2015. usfullsurveyor2015 is the combination of both student samples and the community sample (dukestudent2015, uga2015, and dukecommunity2015).
Sample: n = 1368 undergraduates at the University of Georgia (uga2015), n = 216 students at Duke University (dukestudent2015), n = 159 non-students living in Durham, North Carolina (dukecommunity2015), and n = 2615 US Amazon Mechanical Turk workers.
Year collected: 2012-2014
Authors: Lynn Smith-Lovin, Dawn T. Robinson, Bryan C. Cannon, Jesse K. Clark, Robert Freeland, Jonathan H. Morgan, Kimberly B. Rogers
Relevant publications/more information/citations: [usstudent](http://affectcontroltheory.org/usa-student-surveyor-dictionary-2015/), [usfullsurveyor](http://affectcontroltheory.org/usa-combined-surveyor-dictionary-2015/), [uga](http://affectcontroltheory.org/usa-georgia-dictionary-2015/), [usmturk](http://affectcontroltheory.org/usa-online-dictionary-2015/)
### egypt2015
Description: 397 identities, 368 behaviors, and 233 modifiers
Sample: 1716 residents of Cairo, Egypt
Year collected: 2012-2014
Authors: Hamid Latif, Lynn Smith-Lovin, Dawn T. Robinson, Bryan C. Cannon, Brent Curdy, Darys J Kriegel and Jonathan H. Morgan
Relevant publications/more information: [Link to source](http://affectcontroltheory.org/egypt-dictionary-2015/), [Robinson, Smith-Lovin, and Zhao 2020](https://link.springer.com/chapter/10.1007/978-3-030-41231-9_8)
### morocco2015
Description: 397 identities, 368 behaviors, and 233 modifiers
Sample: n = 1546 residents of Rabat, Morocco
Year collected: 2014-2015
Authors: Lynn Smith-Lovin, A. Soudi, Dawn T. Robinson, Bryan C. Cannon, Brent Curdy, Darys J. Kriegel, Jonathan H. Morgan
Relevant publications/more information: [Link to source](http://affectcontroltheory.org/morocco-dictionary-2015/), [Robinson, Smith-Lovin, and Zhao 2020](https://link.springer.com/chapter/10.1007/978-3-030-41231-9_8)
### germany2007
Description: 376 identities, 393 behaviors, and 331 modifiers.
Sample: Ratings were obtained with Surveyor from 1905 subjects (734 male and 1171 female) from all over Germany. The research was advertised as a “study of language and emotion” in an extensive recruitment campaign including mailing lists from different universities, weblogs, newspaper reports and radio interviews. Most of the participants (N = 1029) were between 20 and 29 years of age, but the sample covered all ages, including N = 129 being younger than 20 and N = 92 older than 60 years. The data of 83 persons (4.4 %) were excluded from the analysis, as they had indicated German not being their mother tongue. On average, each stimulus was rated by 29.5 male and 46.4 female raters.
Year collected: 2007
Authors: Tobias Schröder
Relevant publications/more information: [Schröder 2011](https://journals.sagepub.com/doi/abs/10.1177/0261927x10387103). Raw data available through Interact Java applet.
### indiana2003
Description: Ratings of 500 Identities, 500 Behaviors, 300 Modifiers, and 200 Settings were collected at Indiana University, via the Internet using the Surveyor applet.
Sample: n = 1027 Indiana University students enrolled in the business and arts and sciences schools who lived in the U.S.A. at age 16 and were about equally male and female.
Year collected: 2002-2003
Authors: Clare Francis and David R. Heise
Relevant publications/more information: [Information on sample](https://cs.uwaterloo.ca/~jhoey/research/ACTBackup/ACT/PDF/ProjectNotes.pdf), [information on term selection](https://kb-research.ca/archives/ACT/recovered/Project_Magellan___br_Collecting_Cross-cultural_Affective_Meanings_Via_The_Internet.pdf). Raw data available through Interact Java applet.
### ontario2001
Description: Data on 993 Identities, 601 Behaviors, 500 Modifiers, and 200 Settings were gathered with the Attitude program from Guelph, Ontario, undergraduates in 2001-2002. Data on settings were gathered with the Surveyor program at Guelph in 2003. Funded by the Social Science and Humanities Research Council of Canada.
Sample: University of Guelph undergraduate students
Year collected: 2001-2003
Authors: Neil J. MacKinnon
Relevant publications/more information: [Luke 2010](https://uwspace.uwaterloo.ca/handle/10012/5584), [Affect Control Theory website entry](https://affectcontroltheory.org/canada-dictionary-2001-2003/). Raw data available through Interact Java applet.
### china1999
Description: Ratings of 449 Identities, 300 Behaviors, 98 Emotions, 150 Traits, and 149 Settings were obtained with the Attitude program.
Sample: About 380 undergraduate students at Fudan University in Shanghai, China
Year collected: 1991
Authors: Herman W. Smith and Yi Cai
Relevant publications/more information: [Zhao 2022](https://doi-org.proxy.lib.duke.edu/10.1177/00027642211066025). [Affect control theory website](https://affectcontroltheory.org/china-dictionary-1999/). Raw data available through Interact Java applet.
### texas1998
Description: Ratings of 443 Identities, 278 Behaviors, 65 Modifiers, and 1 Setting were collected at Texas Tech University with program Attitude.
Sample: Some disagreement between sources regarding sample size. [Schneider 2007](https://kb-research.ca/archives/ACT/recovered/TX_stereo.ppp.pdf) says 420 undergraduate students received a small monetary incentive for rating 413 identities. The [affect control theory website entry](https://affectcontroltheory.org/usa-texas-1998/) claims sample size is 482. It may be that this data set contains the combined ratings of Texas Tech students with those of additional terms by University of Missouri students collected at around the same time by Herm Smith (see Schneider 2007).
Year collected: 1998
Authors: Andreas Schneider
Relevant publications/more information: [Schneider 2007](https://kb-research.ca/archives/ACT/recovered/TX_stereo.ppp.pdf). Raw data available through Interact Java applet.
### japan1995
Description:
From [Affect Control Theory website](https://affectcontroltheory.org/japan-dictionary-1989-2002/): Ratings of 403 Identities and 307 Behaviors, and a few Settings were obtained with the Attitude program from 323 Tohoku University students in 1989. In 1995 and 1996, 120 women students at Kyoritsu Women's, Japan Women's, and Teikyo Universities and 120 men students at Teikyo and Rikkyo Universities rated an additional 300 settings, 300 modifiers (mainly traits), 200 business identities, and 75 behaviors. Yoichi Murase (Rikkyo University) and Nozomu Matsubara (Tokyo University) provided access to students who rated 102 emotions, 70 behaviors and 55 identities in 2002 using Surveyor. Total numbers of entries in Interact lexicon are: 713 Identities, 455 Behaviors, 426 Modifiers, and 300 Settings. Number of male or female raters generally is about 30 for each concept.
From [Smith, Matsuno, and Umino 1994](https://www.jstor.org/stable/2786706) and [Smith, Umino, and Matsuno 1998](https://www.tandfonline.com/doi/abs/10.1080/0022250X.1998.9990209): 403 identities and 307 behaviors were rated by a convenience sample of 25 men and 25 women at a national university in Japan.
Sample: University students in Japan. Some disagreement among sources regarding sample size.
Year collected: 1989-2002
Authors: Herman W. Smith, Takanori Matsuno, Shuuichirou Ike, and Michio Umino
Relevant publications/more information: [Smith, Matsuno, and Umino 1994](https://www.jstor.org/stable/2786706), [Smith, Umino, and Matsuno 1998](https://www.tandfonline.com/doi/abs/10.1080/0022250X.1998.9990209). Raw data available through Interact Java applet.
### germany1989
Description:
From [Affect Control Theory website](https://affectcontroltheory.org/germany-dictionary-1989/): Ratings of 442 Identities, 295 Behaviors, and 67 Modifiers, selected for back-translatability with the 1978 U.S.A. dictionary were obtained with the Attitude program from 520 Mannheim students. Subjects were matched to the American undergraduate population by proportional inclusion of 12 and 13 grade youths at two German Studenten des Grundstudiums and Gymnasiasten, along with subjects from Mannheim University, which attracts students mainly from the Rhein-Neckar region in former West Germany.
From [Schneider 2004](https://doi.org/10.1525/sop.2004.47.3.313): To correspond to the undergraduate population in the United States, subjects were pupils in the thirteenth grade of Gymnasien1 as well as university students. 380 subjects were recruited from Mannheim University and two Gymnasien in Mannheim, a large industrial city, that attract students mainly from the Rhein-Neckar region in former West Germany.
Sample: German students; disagreement among sources about sample size.
Year collected: 1989
Authors: Andreas Schneider
Relevant publications/more information: [Schneider 2004](https://doi.org/10.1525/sop.2004.47.3.313). Raw data available through Interact Java applet.
### ontario1980
Description: Data on 843 Identities and 593 Behaviors were obtained with paper questionnaires in 1980-1983, and 495 Modifiers were added in 1985-1986. Funded by the Social Science and Humanities Research Council of Canada.
Sample: 5534 (identities and behaviors) or 1260 (modifiers) undergraduate students at the University of Guelph. Each term was rated by approximately 35 men and 35 women.
Year collected: 1980-1986
Authors: Neil J. MacKinnon
Relevant publications/more information: [MacKinnon and Luke 2002](https://www.jstor.org/stable/3341547). Raw data available through Interact Java applet.
### nc1978
Description: From [Affect Control Theory website](https://affectcontroltheory.org/usa-north-carolina-1978/): Ratings of 721 Identities, 600 Behaviors, 440 Modifiers, and 345 Settings were obtained with paper questionnaires from 1,225 North Carolina undergraduates. (Ratings for some emotion words in this data set were obtained by Heise from Indiana University undergraduates in 1985.) Funded by National Institute of Mental Health Grant 1-R01-MH29978-01-SSR.
Sample: From [Smith-Lovin 1988](https://www.taylorfrancis.com/chapters/edit/10.4324/9781315025773-2/impressions-events-lynn-smith-lovin): Data were collected from students in social sciences and humanities classes at the University of North Carolina at Chapel Hill in the 1977-78 school year. Each out-of-context stimulus was rated by approximately 56 subjects. Approximately half of the raters for each stimulus were males and half females.
Year collected: Mostly 1977-1978; some emotions from 1985
Authors: Lynn Smith-Lovin and David R. Heise
Relevant publications/more information: [Smith-Lovin 1988](https://www.taylorfrancis.com/chapters/edit/10.4324/9781315025773-2/impressions-events-lynn-smith-lovin). Raw data available through Interact Java applet.
### nireland1977
Description: Ratings of 528 Identities and 498 Behaviors were obtained with paper questionnaires.
Sample: 319 Belfast teenagers in Catholic high schools
Year collected: 1977
Authors: Willigan and Heise
Relevant publications/more information: [Affect Control Theory website entry](https://affectcontroltheory.org/northern-ireland-dictionary-1977/). Raw data available through Interact Java applet.
## Specific dictionaries
### artifactmods2022
Description: Ratings of 58 identities, 52 physical artifacts, and 212 artifact-modified identities across a range of identities and artifact types.
Sample: n = 825 participants recruited through Amazon Mechanical Turk who had lived in the U.S. over half their lives.
Year collected: 2015
Authors: Rohan Lulham and Daniel B. Shank
Relevant publications/more information: [Lulham and Shank 2022](https://doi.org/10.1177/00027642211066045)
### employeeorg2022
Description: Ratings of organizations (e.g., *library*) and their employees (e.g., *employee of a library*).
Sample: 118-119 participants in the U.S. recruited through Amazon Mechanical Turk.
Year collected: Unknown
Authors: Daniel B. Shank and Alexander Burns
Relevant publications/more information: [Shank and Burns 2022](https://doi.org/10.1016/j.ssresearch.2022.102723)
### humanvalues2022
Description: Ratings of 393 values, traits, etc. Collected with data for Lulham and Shank 2022 (artifactmods2022).
Sample: n = 825 participants recruited through Amazon Mechanical Turk who had lived in the U.S. over half their lives.
Year collected: 2015
Authors: Rohan Lulham, Daniel B. Shank, and Clementine Thurgood
Relevant publications/more information: [Lulham and Shank 2022](https://doi.org/10.1177/00027642211066045)
### products2022
Description: Ratings of 132 everyday products. Collected with data for Lulham and Shank 2022 (artifactmods2022).
Sample: n = 825 participants recruited through Amazon Mechanical Turk who had lived in the U.S. over half their lives.
Year collected: 2015
Authors: Rohan Lulham, Daniel B. Shank, and Clementine Thurgood
Relevant publications/more information: [Lulham and Shank 2022](https://doi.org/10.1177/00027642211066045)
### techvshuman2021
Description: Ratings of 59 roles as performed by a human, a computer, or an AI identity (e.g., a product assembling employee, a product assembling computer system, a product assembling artificial intelligence).
Sample: n = 549 people recruited using Prolific who had lived over half their lives in the United States.
Year collected: Unknown
Authors: Daniel B. Shank, Madison Bowen, Alexander Burns, Matthew Dew
Relevant publications/more information: [Shank, Bowen, Burns, and Dew 2021](https://www.sciencedirect.com/science/article/pii/S2451958821000403)
### generaltech2020
Description: Ratings of 25 general technology terms (e.g. bot, artificial intelligence).
Sample: 59 Amazon Mechanical Turk workers who had lived in the US at least half their lives.
Year collected: Unknown
Authors: Daniel B. Shank, Alexander Burns, Sophia Rodriguez, Madison Bowen
Relevant publications/more information: [Shank, Burns, Rodriguez, and Bowen 2020](https://crisp.org.uiowa.edu/sites/crisp.org.uiowa.edu/files/2020-06/crisp_28_4_shank.pdf)
### Occupations: occs2019 and occs2020
Description: Ratings of U.S. occupational categories. Wave 1, collected in 2019 and 2020, immediately before the onset of the COVID-19 pandemic, contains ratings for all 650 U.S. census occupational categories. Wave 2, collected in 2020 in the early months of the pandemic, contains ratings for 94 occupations, with a focus on those deemed "essential" in the pandemic context.
Sample: Quasi-nationally representative sample from an online Qualtrics panel, n = 2726
Year collected: 2019-2020
Authors: Joseph M. Quinn, Robert E. Freeland, Kimberly B. Rogers, Jesse Hoey, Lynn Smith-Lovin
Relevant publications/more information: [Quinn, Freeland, Rogers, Hoey, and Smith-Lovin 2022](https://doi.org/10.1177/00027642211066041)
### nounphrasegrammar2019
Description: Participants rate 28 social identity concepts, which are either count or collective nouns, presented in one of five grammatical forms. These data have been used to examine the influences of determiners (a/an, the, and all) and grammatical number (singular or plural) on the affective meaning of social identity concepts.
Sample: Between 372 and 384 Amazon Mechanical Turk workers living in the United States.
Year collected: Unknown
Authors: Daniel B. Shank, Sarah E. Hercula, Brent Curdy
Relevant publications/more information: [Shank, Hercula, and Curdy 2019](https://journal.equinoxpub.com/JRDS/article/view/15490)
### groups2019
Description: Ratings of 170 group-related concepts. These were collected as pilot data for Shank, Hercula, and Curdy 2019 (nounphrasegrammar2019) and Shank and Burns 2022 (employeeorg2022).
Relevant publications/more information: [Shank, Hercula, and Curdy 2019](https://journal.equinoxpub.com/JRDS/article/view/15490), [Shank and Burns 2022](https://doi.org/10.1016/j.ssresearch.2022.102723)
### groups2017
Description: EPA ratings of 64 group-related concepts, including primary groups, task groups, categories of people, and collectives.
Sample: 155 Amazon Mechanical Turk workers who had lived in the US the majority of their lives.
Year collected: 2017
Authors: Daniel B. Shank and Alexander Burns
Relevant publications/more information: [Shank and Burns 2018](https://crisp.org.uiowa.edu/sites/crisp.org.uiowa.edu/files/2020-04/crisp_vol_26_5.pdf)
### ugatech2008
Description: Ratings of 80 technology-related items and concepts.
Sample: n = 174 undergraduate students at the University of Georgia; 100 women and 74 men.
Year collected: 2008
Authors: Daniel B. Shank
Relevant publications/more information: [Shank 2010](https://crisp.org.uiowa.edu/sites/crisp.org.uiowa.edu/files/2020-04/15.10.pdf)
### politics2003
Description: Ratings of political actors (individuals, collectives, groups, and organizations) and behaviors (individual and social).
Sample: Students in an introduction to sociology course at the University of Missouri-St Louis. n = 47 men and 74 women.
Year collected: Unclear
Authors: Kyle Irwin
Relevant publications/more information: [Irwin 2003](https://www.proquest.com/dissertations-theses/political-interaction-affective-meaning/docview/250699845/se-2?accountid=10598), [link to source](https://cs.uwaterloo.ca/~jhoey/research/ACTBackup/ACT/interact/subcultures/politics.htm)
### expressive2002
Description: 98 nonverbal behavior terms collected in Study 1 of [Rashotte 2002](https://doi.org/10.2307/3090170). Terms were divided into four subsets. Subset 1 of the behaviors was given to 50 men and 50 women; subset 2 to 47 men and 52 women; subset 3 to 39 men and 59 women; and subset 4 to 36 men and 61 women.
Sample: 230 female and 172 male students in introductory sociology courses at a large public university in the southwestern United States.
Year collected: Unclear
Authors: Lisa Slattery Walker (formerly Lisa Slattery Rashotte)
Relevant publications/more information: [Rashotte 2002](https://doi.org/10.2307/3090170), [link to source](https://cs.uwaterloo.ca/~jhoey/research/ACTBackup/ACT/interact/subcultures/expressive_acts.htm)
### internet1998
Description: Internet-related identities, behaviors, and settings rated by Internet users in 1998.
Sample: 2431 Internet users (56% male, 44% female) who responded to an ad run for six months on the Yahoo! search engine in 1998.
Year collected: 1998
Authors: Adam B. King
Relevant publications/more information: [King 2001](https://doi.org/10.1177/089443930101900402), [link to source](https://cs.uwaterloo.ca/~jhoey/research/ACTBackup/ACT/interact/subcultures/InternetCulture.htm)
### household1994
Description: 53 terms related to household chores, family, and relationships.
Sample: 23 male and 46 female college students
Year collected: 1994
Authors: Amy Kroska
Relevant publications/more information: [Link to source](https://cs.uwaterloo.ca/~jhoey/research/ACTBackup/ACT/interact/subcultures/household.htm)
### internationaldomesticrelations1981
Description: 238 domestic and inter-state behaviors that nations can engage in.
Sample: A total of 57 respondents filled out both the international and domestic semantic differential rating booklets; 28 respondents belonging to a "general" sample (group variable = "nonprofessional") and 29 respondents to an "elite" sample (group variable = "professional"). The assignment of individuals to the two groups was based upon their level of expertise regarding international relations. Members of the elite sample included university professors of international relations, members of the United States' Departments of State and Defense, advanced graduate students of international relations, and members of consulting firms which contract for foreign policy work. The general sample was largely comprised of university undergraduates along with other individuals who have limited or no substantive knowledge or understanding of the workings of the international system. (Azar and Lerner 1981)
Year collected: Unclear
Authors: Edward E. Azar and Steve Lerner
Relevant publications/more information: [Azar and Lerner 1981](https://doi.org/10.1080/03050628108434560), [link to source](https://cs.uwaterloo.ca/~jhoey/research/ACTBackup/ACT/interact/subcultures/international_relations.htm)
### gaymensanfrancisco1980
Description: Ratings of 14 sex-related behaviors gathered from ten San Francisco gay men in the 1980s by Professor Don Barrett, California State University, San Marcos. The "unsafebetter" group is the mean EPA ratings made by those respondents who feel that unsafe sex practices are more pleasurable than safe sex practices. The "safebetter" group is the mean EPA ratings made by those respondents who think that safe sex practices are more pleasurable. Group = "all" returns the average ratings across the sample.
Sample: 10 gay men in San Francisco
Year collected: Unclear; 1980s
Authors: Donald C. Barrett
Relevant publications/more information: [Link to source](https://cs.uwaterloo.ca/~jhoey/research/ACTBackup/ACT/interact/subcultures/gay_sex.htm)