\part{The Placebo Effect}
\section{Introduction to the Placebo Effect}
\label{sec:intr-plac-effect}
The major focus of this thesis was measurement, specifically a multi-method approach to the measurement of the placebo effect. The placebo effect is typically regarded as a nuisance parameter in clinical trials, and yet that is where much of the research which has broadened our understanding of the phenomenon has taken place. Despite years of research, there are few known individual-level predictors of the effect. In fact, some researchers have argued that there is no such entity as a placebo responder \cite{Kaptchuk2008a}.
The placebo is an interesting topic of study in that the term stands as a proxy for those elements of human health which are not determined by features specific to the treatment. Instead, the term placebo refers to the ``non-specific'' parts of the treatment, those which are not attributable to specific biological or mechanical activities forming the basis of modern medicine.
Despite the confusion surrounding the definition and interpretation of placebo (discussed more fully in Section \ref{sec:concept-placebo} below), the term will be retained throughout this thesis, as it forms a useful overarching construct for this research and for other research on the interactions of mental and physical states in health. A definition suitable for the purposes of this thesis will be given in Section \ref{sec:defin-this-thes}.
\section{The Concept of the Placebo}
\label{sec:concept-placebo}
The placebo effect has a long history in medicine, and some researchers believe that almost every treatment developed before the 20th century may have relied primarily on this effect for its curative properties \cite{Shapiro1997,Macedo2003}. Despite this, the concept is not particularly well defined, and different researchers use the same phrase to mean different things \cite{Ernst1995b,hrobjartsson1996uncontrollable}. This section will examine some of these definitions, and look for markers from which a more comprehensive definition of the effect can be outlined.
\subsection{History of the Concept}
\label{sec:history-concept}
Placebos have a long and colourful past, some elements of which are still relevant today. The word itself is taken from the Latin for ``I will please'' and referred originally to the cries of paid mourners at funeral ceremonies \cite{Macedo2003}. Over time, it became associated with any medicine which was given more to please a patient than to actually relieve the symptoms \cite{Kaptchuk1998}. For many years placebos were considered ethically acceptable, if somewhat dubious. However, as more effective and tailored treatments were developed, the use of placebos declined. Following the development of randomised double-blind trials, the placebo increased in importance once more \cite{Kaptchuk1998}, as it became a required part of the process for establishing the efficacy of all new drugs.
The first serious attempt to study the placebo as an effect in itself began with Beecher's article ``The Powerful Placebo'', which argued that placebos were useful in many differing kinds of ailments and diseases, and that their effectiveness did not depend on the intelligence of the patient \cite{beecher1955powerful,Kaptchuk1998}. This article aroused interest in the placebo, and is generally regarded as the first paper which focused on the effect as interesting in its own right, rather than as a device to disguise a doctor's lack of effective treatments.
Beecher made many strong claims in his 1955 article, including an assertion that placebos were effective 35\% of the time. Later research has demonstrated that many of these claims were inaccurate or misleading \cite{Kienle1998}. The main criticisms of Kienle and Kiene will be covered below, but in brief, they found that much of the reported improvement in the studies examined by Beecher could have been due to the natural history of the disease and regression to the mean.
Much of the lack of clarity found within the use and definition of placebo \cite{Macedo2003,Kaptchuk1998} results from its use in two contradictory situations. In the first, that of the randomised controlled trial (RCT), placebos are a control for all effects of treatment not related to the substance or procedure under test \cite{Vickers2000}. In this setting, the aim is that they should be minimised. In the second, the setting of clinical practice, the placebo is imbued with all the authority of medicine and utilised in order to effect changes that may result from mindset or to placate a troublesome patient \cite{Bootzin2003,Sherman2008a}. Macedo and Kaptchuk (in separate articles) argue that both of these approaches give too much power to the concept and contribute to the confusion surrounding its definition and explication.
Much of the confusion results from the terms ``specific'' and ``non-specific'' effects. These particularly confusing terms have led to much agonising and debate over the years. Some have even suggested that these terms should be abandoned \cite{Caspi2002}. The specific parts of a treatment are typically defined as the biologically effective agent, while the non-specific factors are all other parts of the treatment. Price and Benedetti argue that there are three sources of non-specific effects: the patient, the provider and the relationship between them \cite{Price2008}.
Given that placebos can exert extremely specific changes \cite{Amanzio2001,Caspi2002}, they are not non-specific effects; and to the extent that they do not exert changes in outcomes, the concept is irrelevant \cite{Moerman2003}.
\subsection{Placebos in Randomised Trials}
\label{sec:plac-rand-trials}
Placebos are best known for their use in a clinical trial setting. The FDA (Food and Drug Administration) in the United States of America requires all drugs to demonstrate usefulness over and above placebo in order to be licensed. This followed a number of scandals in the late 1950s and early 1960s where treatments which had been in use for many years showed little or no effect under double-blind conditions \cite{Moerman2000a}.
In a typical randomised trial, neither patients nor physicians know whether a particular person receives drug or placebo. If 50\% of those given placebo feel better (with respect to the primary outcome measure of the study), even in the absence of medication, this is called a placebo effect, and the inert procedure or pill used is a placebo. More technically, any mean improvement in the control group can be classified as a placebo effect. However, this is not entirely accurate, as will be seen below.
% The most common situation in which a placebo is used is as in the example above, as a control for the effects of treatment apart from the specific medication being tested (the non specific effects). Here the placebo is typically regarded as a pill or procedure which appears real, but in fact is not. In this situation, a placebo effect is often defined as the effects observed in the placebo arm of a clinical trial.
This definition runs into immediate problems, as the effects in a placebo arm of a clinical trial result from a combination of the placebo effect and other factors, such as regression to the mean \cite{Morton2003}, demand characteristics of clinical trials \cite{Hrobjartsson2001} and the natural history of the disease.
Regression to the mean is the tendency for an extreme score measured at time 1 to be closer to the mean at time 2. This tendency is a property of all measuring tools which are not perfectly reliable \cite{Morton2003}, and can be controlled for by sampling from the general population, as using participants with high scores on the outcome variable to be measured tends to exacerbate the effect.
Sampling from the general population, while a good strategy, is not practical for many trials of new medicines, given the exclusion and inclusion criteria for clinical trials. These criteria normally require that participants in trials suffer from the condition, and have no other effective treatment \cite{Daugherty2008}.
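To make the regression effect described above concrete, the following is a minimal sketch under simplifying assumptions that are introduced here for illustration and are not taken from the studies cited. Suppose the outcome is measured twice, as $X_1$ and $X_2$, with a common population mean $\mu$, equal variances and a test-retest correlation $\rho$, and suppose that the regression of $X_2$ on $X_1$ is linear. The expected second score for a participant whose first score was $x$ is then
\[
  E[X_2 \mid X_1 = x] = \mu + \rho\,(x - \mu),
\]
so that the expected change in the absence of any intervention is $(1 - \rho)(\mu - x)$. With purely hypothetical values of $\mu = 50$, $\rho = 0.7$ and an entry score of $x = 80$, the expected follow-up score is $50 + 0.7 \times 30 = 71$, a fall of 9 points which, on a symptom scale, would look like improvement even though no treatment was given. The further the entry score lies from the mean, and the less reliable the measure (the smaller $\rho$), the larger this artefact becomes.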
Demand characteristics \cite{Fernandez1994,weber1972subject} refer to the tendency for participants in research to give the answer which they believe that the researcher wants to hear. This can result in an exaggeration of symptoms at the first assessment and a tendency to minimise the same symptoms at the end of the study \cite{Vase2005}.
Another factor which can bias the results of a clinical trial is unidentified parallel interventions \cite{Ernst1995b}. These occur when participants in a clinical trial change other factors as the result of being in the trial, thus biasing the results. For example, participants in a clinical trial for hypertension might reduce their salt intake (potentially because the process of enrolment, which typically involves large amounts of information about the condition under treatment, increases the salience of the risks associated with hypertension).
It is important to note that without a no-treatment control group the effects of these confounders cannot be separated from the true placebo response. The no-treatment group serves this purpose, as factors such as regression to the mean and natural history should apply equally to both the placebo and the no-treatment group. The definition of the placebo response typically used is:
\begin{quotation}
the placebo response is the response to treatment in the placebo group less the response to treatment in the no-treatment group.
\end{quotation}
A more precise definition in the context of clinical trials is given by Knipschild et al \cite{Knipschild2005}
\begin{quotation}
the placebo effect ... [is] ... the difference in effect between the placebo group and the spontaneous course in a randomized clinical trial
\end{quotation}
This definition relies upon an understanding of the spontaneous course of an illness in a controlled trial, which can be operationalised as the progress of the no-treatment control group. It is somewhat limiting, but is clearly operationalised for a particular setting, which is useful in terms of clarity around the construct which is being measured.
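The subtraction implied by these two definitions can be written out explicitly. This is a sketch only: it assumes that the contributions to improvement combine additively, which is an assumption of this illustration rather than a claim made by the authors quoted above, and the symbols are introduced here for convenience. Let $\Delta_{P}$ and $\Delta_{N}$ denote the mean change on the primary outcome in the placebo and no-treatment groups respectively, and let $E_{\mathrm{plac}}$, $E_{\mathrm{nat}}$ and $E_{\mathrm{rtm}}$ denote the contributions of the placebo effect, natural history and regression to the mean. Then
\[
  \Delta_{P} = E_{\mathrm{plac}} + E_{\mathrm{nat}} + E_{\mathrm{rtm}},
  \qquad
  \Delta_{N} = E_{\mathrm{nat}} + E_{\mathrm{rtm}},
\]
and the placebo effect is estimated as
\[
  \hat{E}_{\mathrm{plac}} = \Delta_{P} - \Delta_{N}.
\]
With hypothetical values, if the placebo group improves by 12 points and the no-treatment group by 8 points, the estimated placebo effect is 4 points rather than 12. Any factor that does not act equally on both groups would not cancel in this subtraction.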
\subsection{Placebos and Cognitive Performance}
\label{sec:plac-cogn-perf}
Within the confines of the clinical trial, the two definitions above are workable definitions of the placebo effect. The important part of a clinical trial is the test of the active medication, and the placebo is important only insofar as it relates to the testing of this medication.
However, clinical trials are not the only context where placebos are administered, and in other situations these definitions run into problems. Consider some recent work of Oken et al \cite{Oken2008}. In this experiment, healthy seniors were administered placebos which they were told would improve cognitive performance. There were two active groups (given different instructions) and one no-treatment group. The participants were tested for cognitive abilities at the beginning, middle and end of the placebo treatment. The seniors given the placebo pills showed significant increases in cognitive ability over the course of the study, and many were disappointed when they were debriefed and told that they had been given placebos.
While the definitions given above can, at a stretch, account for these results, the study does indicate a need to define the concept of placebo more carefully. Whether or not one accepts the clinical trial definition depends crucially on one's definition of treatment.
For many, this is some device or procedure that restores the organism to optimal function. Alternatively, this view could be described as believing that treatment restores homeostasis to the organism. Indeed, the Oxford English Dictionary agrees with this definition, describing treatment as \textit{``medical care for an illness or injury''}.
In the Oken study above, however, this was not the case, as the researchers were examining the positive effects of placebos, rather than attempting to cure a deficit. We could, however, define treatment as something which improves the performance or health of an organism, and such a definition would not encounter these problems.
It is relatively easy to see how placebos can reduce pain, but it is more difficult to see how such pills can improve cognitive performance. One can argue that decreases in pain result from a response bias \cite{Allan2002}, but the measurements of cognitive performance in the Oken et al study were not subject to these kinds of bias.
One can argue that the expectancy of the participants for improvement led them to actually improve, but this argument begs the question as we then need to define how expectancies exert such effects.
A more plausible explanation for these findings is stereotype threat \cite{schmader2003converging,spencerclaude1999stereotype}. Stereotype threat occurs when a member of a particular group performs badly as a result of worry about being judged in terms of a negative stereotype of that group.
This phenomenon could have accounted for the results observed in the Oken et al study. It may be that the belief that the pills were enabling improved performance compensated for the effects of stereotype threat. An interesting experiment would be to investigate the effects of placebo cognitive pill administration on the performance of women and African Americans in more traditional academic environments.
Another study where placebo effects were demonstrated in a non-clinical setting was that of Crum and Langer \cite{Crum2007}. In this study attendants in hotels were cluster randomised (using the hotel as the unit of sampling) and half of the attendants were informed of how many calories they burned by engaging in their roles as hotel attendants. One month later, the informed group had lost significantly more weight and had improved on both self-report and externally measured dimensions related to weight and health. Again, this effect is difficult to conceptualise in terms of treatment. In this paper, Crum and Langer defined the placebo effect as
\begin{quotation}
any effect of treatment which is mediated more by the participants
beliefs and expectancies rather than the physiological actions of
the treatment.
\end{quotation}
This definition is more widely applicable than our first definition above, but at the cost of introducing two new terms which are not clearly defined, namely beliefs and expectancies. While the second of these terms has a number of specific meanings in psychological thought (for example, process, outcome and response expectancies), belief is such a commonly used word that it has not been precisely defined in any psychological theory. This definition also suffers from the issues surrounding the definition of treatment that were noted above, in Section \ref{sec:plac-rand-trials}.
% One problematic point here is that while we have some evidence for the physiological pathways followed by painkilling placebos \cite{Benedetti2005a,Mayberg2002,Benedetti} we have little to no idea how placebos can exert change on cognitive variables. Such research, especially if these cognitive gains were long lasting, could result in a substantial rethinking of how we measure and approach the concept of intelligence, and particularly its genetic and environmental components.
Another definition (which avoids the treatment definition problem) was quoted from Ross \& Olson (1981) by Flaten and colleagues \cite{Flaten1999}. This definition is as follows:
\begin{quotation}
A placebo or nocebo may be defined as an inactive
substance or a procedure that is administered with
suggestions that it will modify a symptom or sensation
\end{quotation}
This definition avoids the problem of defining treatment, and also avoids the use of the terms beliefs and expectancies. However, it can be seen from the definition that this is achieved only at the cost of introducing suggestion as a possible mediator of the placebo effect. On the face of it, the term is not without merit. In studies of hypnosis, suggestion is commonly regarded as the driver of the observed effects \cite{Kirsch1994} and hypnosis has been proposed as an ethical method to induce placebo responses in participating individuals. This definition is perhaps the best of those that have been examined so far, but it does require us to limit ourselves in the study of placebo to sensory phenomena. Both the Crum \& Langer study and the Oken et al study indicate that effects termed as placebos have a wider sphere of effect.
A definition which can be regarded as quite similar to the one given by Crum \& Langer above is the following:
\begin{quotation}
the nocebo effect is a causation of sickness by expectation of sickness and by associated emotional states
\end{quotation}
While this definition refers to the nocebo effect (negative placebo effects) it focuses on the expectation portion of placebo. However, Benedetti et al have demonstrated that there are at least some placebo effects which cannot be mediated by expectancies \cite{Benedetti2003a}, so this definition does not account for all placebo effects.
One interesting feature (not seen in other definitions) is the focus on emotional states as a cause of the nocebo effect. This does seem to make physiological sense, as placebo responses can be inhibited by a chemical known as cholecystokinin (CCK), which also acts as an anxiogenic \cite{Benedetti2006c}. However, emotional states are typically measured with the same or less precision than placebo, so this definition runs into problems when we attempt to operationalise (and therefore measure) emotional states.
A further definition of the placebo effect was given by Shapiro and Shapiro \cite{Shapiro1997}, in which they defined a placebo effect as ``the effect caused by a placebo treatment''. This definition is wide enough to encompass all the possible placebo effects, which is useful. However, it is so wide that it tells us very little about the placebo itself. It is merely a rephrasing of the term placebo effect and does not point to useful directions for further research. It is also worth noting that this definition excludes the influence of the practitioner involved, and some research indicates that this is the most important part of the effect \cite{Blasi2001,Kaptchuk2008}.
Another definition of the placebo, proposed by Price et al \cite{Price2008}, is that
\begin{quotation}
a placebo is any substance or procedure that simulates a treatment.
\end{quotation}
This definition fails to resolve our problem discussed above with relation to cognitive abilities.
Kienle \& Kiene give two definitions of placebo in their critique of Beecher:
\begin{quotation}
Placebo is defined in two separate ways, firstly as the imitation of a
therapy, and secondly as any self-healing effect.
\end{quotation}
The first definition, imitation of a therapy, is extremely similar to the Price definition quoted above, and the second, while it sounds plausible, is far too vague to be of any use in research. Defining self-healing alone could take many papers, and there is no guarantee that the concept would prove useful in the end.
Another definition \cite{Blasi2001} does resolve the problems encountered with the Price \& Benedetti definitions above. This definition is that
\begin{quotation}
placebos are inert substances that have an effect due to context
\end{quotation}
This definition has some good points, in that it allows for placebo effects in any areas in which they are found, it allows for patient-provider effects and it does not prejudge the causes of the effect.
One major difficulty which can be observed with this definition is that it does not account for active placebos, where a substance which is a medication in one context is administered for a condition where it is not expected to have any pharmacological effect \cite{Kirsch1998}.
These active placebos would contradict the definition of placebo as inert, and yet the research demonstrates that these active placebos can be as effective as the regular inert pill \cite{Flaten2004} and sometimes more effective \cite{Kirsch2002a}.
Another issue with this definition was alluded to above: to the extent that a placebo induces specific changes, it is not inert and therefore, under this definition, would cease to be a placebo, which is clearly nonsensical \cite{Moerman2002b}.
A similar definition (in the context of clinical trials) is given by Knipschild et al \cite{Knipschild2005}
\begin{quotation}
[the placebo effect] ... is the effect of co-interventions in a
treatment study connected to the doctor--patient relationship.
\end{quotation}
Again, this definition is quite precisely operationalised, but it assumes that the active ingredient of placebo is the relationship between provider and patient, which has not been demonstrated to be the case at all.
A more recent definition of the effect, formulated by Colloca and Benedetti
\cite{Colloca2005} says that
\begin{quotation}
the study of the placebo effect is the study of the psychosocial context
surrounding the patient
\end{quotation}
This appears to be an update of the Di Blasi et al definition and seems quite useful. One factor that this definition excludes is the interior experience of the person who has the response. Context refers to external things or processes (at least according to the Oxford English Dictionary), and there is evidence that when participants attend to their own internal somatic sensations, placebo effects increase \cite{Geers2006}. Obviously, the participants' somatic sensations are not part of the context surrounding them, and yet they appear to have an impact upon the response.
% I don't know about the argument above, seems a little weak
One of the most interesting definitions of the placebo response is that developed by Daniel Moerman, an anthropologist, with his conception of the ``meaning response'' \cite{Moerman2002b,Moerman2002a}.
This definition is as follows:
\begin{quotation}
We define the meaning response as the physiologic or psychological
effects of meaning in the origins or treatment of illness; meaning
responses elicited after the use of inert or sham treatment can be
called the ``placebo effect'' when they are desirable and the ``nocebo
effect'' when they are undesirable
\end{quotation}
This definition is particularly interesting as it focuses on the perception of the treatment by the individual, which is a factor often neglected in many definitions of placebo. This definition, while it seems to be among the best that we have, still displays far too much of a focus on treatment and medical conditions to account for placebo effects on cognitive function or exercise performance \cite{Crum2007}.
While this is understandable, given the roots of the placebo concept, it is far too restrictive given that we have evidence of placebos in contexts far removed from medicine, such as sport and cognitive function \cite{Benedetti2007a,Oken2008}.
One problem with this definition rests on our understanding of the word ``meaning''. The OED defines this as \textit{``what is meant by a word, idea or action''}, which is not particularly useful in this context. What Moerman seems to understand by the word is the interpretation placed on the treatment by the individual when they receive it. This again supposes that people are aware of all of the meanings which they possess, when the research would seem to indicate that this is not always the case, as demonstrated by (amongst other evidence) the Shiv et al (2005) study noted below \cite{Shiv2005a}.
Although the Moerman definition is useful in that it points to the individual and cultural interpretations of treatments which appear to facilitate a placebo response, it is still quite imprecisely specified and not particularly useful in practice. % (meaning is internal, and some elements of meaning may reside at a non-verbal, implicit level so there is great difficulty in assessing these meanings in order to devise more effective forms of treatment).
Another interesting study which points towards the need for a more clearly defined placebo concept is that of Shiv et al \cite{Shiv2005a}, where the response to an energy drink was heavily affected by the price.
When participants believed that the energy drink had been discounted, they solved significantly fewer puzzles than when they were given the drink at its full price. An important issue to note in the context of the paragraph above is that when participants' attention was drawn to the discounting, the effect disappeared. This implies that although the signal of the reduced price affected their abilities, this meaning was only activated when their attention was not focused on it (suggesting that they responded to the signal in an implicit fashion, somewhat like priming).
Again, this study cannot be conceptualised as a treatment, as we are looking at an improvement in cognitive performance rather than the relief of some illness or problem.
\subsection{Definition for this thesis}
\label{sec:defin-this-thes}
A definition which can be widened to fit these kinds of placebo effects and mind-body interactions is the definition of Di Blasi et al \cite{Blasi2001}, if we take account of the issue arising from active placebos. Perhaps the definition would work better if we removed the word `inert' and replaced it with `believed inert for the specific condition concerned', so that it reads
\begin{quotation}
a placebo is a treatment believed inert for the specific condition
concerned which has an effect due to context.
\end{quotation}
This definition would need to be supplemented with a more precise definition of context. One attempt at this would be that context is
\begin{quotation}
the internal states, external environment and relationship of the individual to these states and environment and other individuals in their presence
\end{quotation}
Some researchers would not classify the Oken and Shiv studies as looking at placebo effects at all. It is arguable whether or not these effects of cognitive improvement can really be classified as placebo effects, as they could more properly be termed expectancy effects.
Others would also argue that the same is true of all placebo effects, as they appear to be mediated by expectancies \cite{Kirsch1985, Kirsch1997,Montgomery1997}. Recent research does seem to show that expectancies do not mediate all placebo effects \cite{Benedetti2003a}, so the distinction between these two terms is still useful. The only difference between the effects observed in these experiments and those seen in more typical placebo experiments appears to be that the latter can be conceptualised as treatments, while the former cannot.
\subsection{Use of the term placebo}
\label{sec:use-term-placebo}
This highlights a huge problem with the placebo terminology: it has broadened to mean any effect mediated by information and meaning rather than through biochemical mechanisms. An example of this is a recent study of exercise and the ``placebo effect'' \cite{Crum2007}. This fascinating experimental study, which demonstrated that significant improvements in fitness could result from information provided about the health benefits of work, is nonetheless symptomatic of an obscuring trend in social science research: that of treating any influence of mental information on the body as a placebo effect.
This kind of broadening of the concept is useless to continued progress in the field, as it obscures the very real and significant differences between these effects. Some can be attributed to the effects of expectancies, others to meaning, others to subtle communications on the part of the experimenter or treatment provider, and still others to biological mechanisms that are not well understood. The lack of clarity in definitions and terminology is holding back research tremendously. One is almost tempted to ditch the concept entirely, as was proposed in a recent issue of the British Medical Journal \cite{nunn2009s}.
For all the appeal of starting again with a new term, such an approach would have a number of problems. The first of these is that the phrase, however poorly defined, conveys a meaning to researchers about how an effect occurred.
The second problem is what to do with clinical trials. In this situation, the placebo is a necessary control. Does Nunn suggest that we use only equivalence or non-inferiority trials? Obviously this would not work for new treatments, and even in cases where there is an accepted treatment, the statistical requirements with regard to sample size and other matters would make it considerably harder to establish the efficacy of a treatment \cite{Benedetti2008}. This follows because larger sample sizes are required to show, at conventional levels of statistical power, that two treatments do not differ from one another by more than a small margin. %insert benedetti book reference here
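
The sample-size argument can be made concrete with the standard normal-approximation formulas for comparing two group means. This is an illustrative sketch rather than a calculation taken from the cited sources, and the symbols are assumptions introduced here: $\sigma$ is the common standard deviation of the outcome, $1-\beta$ the desired power, $\Delta$ the expected advantage of the treatment over placebo, $\delta$ the non-inferiority margin, $n_{\mathrm{PC}}$ the number of participants per group in a placebo-controlled trial (two-sided level $\alpha$) and $n_{\mathrm{NI}}$ the number per group in a non-inferiority trial (one-sided level $\alpha$, with the two active treatments assumed to be truly equal):
\[
n_{\mathrm{PC}} \approx \frac{2\sigma^{2}\left(z_{1-\alpha/2}+z_{1-\beta}\right)^{2}}{\Delta^{2}},
\qquad
n_{\mathrm{NI}} \approx \frac{2\sigma^{2}\left(z_{1-\alpha}+z_{1-\beta}\right)^{2}}{\delta^{2}} .
\]
If the one-sided level in the second formula is set to half the two-sided level in the first, the $z$ terms coincide and the ratio $n_{\mathrm{NI}}/n_{\mathrm{PC}}$ reduces to $(\Delta/\delta)^{2}$. Since the margin is normally chosen to be smaller than the established treatment's advantage over placebo, a margin of half that advantage would require roughly four times as many participants per group.
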
Another issue with jettisoning the placebo concept is what to replace it with. The literature has established that treatments lacking a currently understood biological basis can have real effects \cite{Meissner2007} in pain \cite{Vase2002}, in depression \cite{Kirsch2002a} and in Parkinson's disease \cite{Benedetti2004a}, among others. How are we to conceptualise these effects, if not in terms of placebo? Perhaps a better approach would be that of G{\o}tzsche \cite{Gotzsche1995}, who suggested that placebo effects be broken down into those attributable to patient-provider interaction, those attributable to a stimulus from the outside world and those attributable to the patient's belief in the treatment. Hr\'objartsson also gave a similar definition, in which he described three types of placebo:
\begin{quotation}
change after a placebo intervention, effect of a placebo intervention and the effect of the patient provider interaction.
\end{quotation}
Of these three definitions, only the third is not tautological, and it has been covered when the Di Blasi definition was discussed above.
Even then, these effects are difficult to disentangle. Consider the case of a researcher giving a participant a placebo as a pain-killing treatment, after which the participant reports less pain. While the administration of the inert substance and the patient's belief in it certainly had an impact, so too did the presence of the researcher, who is the provider in this situation. It is difficult to see how these effects can be separated without specialised designs in which the same participants are administered placebos by multiple researchers.
I would propose that the term placebo effect should be reserved for effects which exert a beneficial change in health due to effects of internal and/or external context. Other effects should be termed expectancy effects, or mind-body effects, or effects of therapeutic relationship (which have a long history in psychology, being typically known as experimenter effects) \cite{rosenthal1969interpersonal,rosenthal1967covert,Rosenthal1956}.
\subsection{Problems with establishing true placebo effects}
The focus of this section is on the definition of placebo, but the definition of placebo effects is without consequence if it is impossible to establish that a placebo effect has or has not occurred in a controlled setting. One of the largest problems with establishing this is when the outcome is a subjective one, such as pain (which is the domain where the placebo was examined in this thesis). Pain is a subjective experience, and equivalent stimulus levels may cause entirely different reactions in two different people \cite{Kirsch1997}.
This issue is often controlled for by calibration of the stimulus levels to the individual participant in an experiment. This allows the researcher to assess to what extent these pain levels are altered by the placebo treatment. Normally, pain intensity and unpleasantness levels are assessed on an eleven-point Visual Analogue Scale (VAS) and these ratings are used as the input data for statistical analysis of different groups.
However, the use of these self-report instruments carries with it a number of problems. Firstly, the issue of demand characteristics arises. Demand characteristics refer to the subtle pressures faced by participants in experiments to live up to the expectations of the researcher \cite{weber1972subject} and were discussed in Section \ref{sec:plac-rand-trials}. The idea is that if a researcher really wants to demonstrate something, then he or she will communicate this subtly to the participants in the study, and they will (theoretically) respond in the manner which the researcher prefers. This factor is one of the major reasons why tests of new drugs must blind the experimenter as well as the participants.
Another large factor which affects placebo analgesia research is the issue of response biases. The problem is that if participants receive a treatment which they believe to be effective, then their thresholds for reporting pain may be altered without any actual biochemical changes. This is the account put forward in Allan \& Siegel's \cite{Allan2002} analysis of the placebo phenomenon in terms of signal detection theory. These two factors were proposed to account for the entirety of the placebo effect in pain by Hr\'objartsson \& G{\o}tzsche \cite{hrobjartsson2001placebo}. This interpretation of placebo seems less likely given that placebo effects have been observed on biochemical levels, and given that placebo analgesia can be removed by the administration of naloxone \cite{Levine1979,Benedetti2003}.
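
The distinction drawn by a signal detection account can be stated with the standard equal-variance Gaussian indices. The notation is introduced here for illustration and is not taken from the cited papers: $H$ is the proportion of painful stimuli reported as painful (the hit rate), $F$ the proportion of non-painful stimuli reported as painful (the false-alarm rate), and $z(\cdot)$ the inverse of the standard normal distribution function. Sensitivity $d'$ and response criterion $c$ are then
\[
d' = z(H) - z(F),
\qquad
c = -\frac{1}{2}\left[\, z(H) + z(F) \,\right].
\]
A pure response-bias account predicts that a placebo shifts $c$ (participants become less willing to label a stimulus as painful) while leaving $d'$ unchanged, whereas a genuine change in sensory processing would appear as a change in $d'$.
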
%there needs to be a section linking this part to the conclusion of this part of the chapter.
To conclude, the concept of the placebo is wide-ranging, and requires careful elucidation if research is not to be held back by mistaken assumptions surrounding the concept. Therefore, there is a need for this confusion to be resolved in one of two manners. Firstly, the definitions of placebo used could be tailored for the specific experiment or condition studied.
This would have the advantage of being able to more precisely define the expected effects and outcomes for this experiment. However, it would also create a large number of defined ``placebo effects'' and this would be problematic for the researcher who wishes to study these effects across a broad range of conditions and treatments. That being said, some researchers in the field continually make the point that there are many placebo effects, not just one, and such a strategy would have the benefit of making this extremely clear \cite{Benedetti2008}.
The second option would be to look for an all-encompassing definition somewhat like Shapiro's, and use this for all research in the field. While this would make matters easier for researchers interested in the broad concept of the placebo, it would create difficulties when new forms of placebo effects come to light in the course of research.
Such an all-encompassing definition would also lose much of the precision that allows for future avenues of research to be discerned from it, and prove much less useful to specialised researchers in the field. To summarise, the best approach here may be to define the placebo as broadly as possible within the confines of effects on health and illness, and encourage researchers to specify exactly what they believe constitutes a placebo effect for the purposes of each study.
% \section{Common Features of Placebo Effects}
\section{Theories of Placebo Effects}
\label{sec:theor-plac-effects}
Following the discussion of the definitions of placebo and some problems arising from the different contexts within which it has been developed, the next step in this review is to examine the theories which have been proposed to account for this phenomenon.
In the case of the placebo effect, there are a few major theories. Each of these will be described in turn and examined for the ways in which it accounts for the effects, and for those features of the effect which it fails to explain (or rejects as being illegitimate). A useful run-down of all of these theories appeared recently \cite{Stewart-Williams2004b} and that review has helped to inform the arguments presented here.
The first major theory which attempted to account for placebo effects was that of conditioning. Building on the demonstration of placebo effects in non-human mammals \cite{Herrnstein1962}, the conditioning theory argued that placebo effects resulted from the learned association between a contingency in the environment (the doctor, pill or medical setting) and healing. This contingency led to the activation of healing mechanisms based on previous experience with the pill.
The conditioning theory has a number of advantages. Firstly, it can account for placebo effects in all mammals, as all seem to be capable of learning through reinforcement. Secondly, it is parsimonious, as it allows us to explain the placebo phenomenon without invoking any new processes or mechanisms. Thirdly, it appears to account for many of the effects.
However, the fatal flaw in the conditioning explanation is that it cannot account for placebo effects from a product which participants have not experienced before. Given the nature of clinical trials, this destroys conditioning as an explanation for all placebo effects. One could assume that the observed placebo effects in clinical trials result from generalised associations with medical treatments \cite{pearce1987model}. This does retain aspects of the theory, and provides a testable hypothesis: to the extent that there are commonalities between the learning environment and the clinical trial environment, placebo responses will be observed.
There do appear to be some placebo responses which are totally mediated by conditioning \cite{Amanzio1999}, but not all of them can be \cite{Benedetti2003a}. Further experimental research has elucidated some of the connections here, in that motor movement in participants with Parkinson's disease and pain can be modulated by expectancies while changes in hormone secretion appear to be modulated by conditioning exclusively (or at least have not been shown to be affected by expectancies) \cite{Benedetti2003a}.
The competing theory to conditioning for the past few decades was the expectancy theory, as expressed by Kirsch \cite{Kirsch1985}. Kirsch coined the term ``response expectancy'' to describe what he called ``the expectation of a non-volitional response''. A ten year review \cite{Kirsch1997} suggests that this theory has applications in hypnosis and placebo effects.
This theory competed with the conditioning theory for over a decade, but the issue was mostly resolved by a 1997 paper \cite{Montgomery1997}, which pitted the expectancy and conditioning explanations against one another. This study used the conditioning manipulation devised by Voudouris \cite{Voudouris1985} where the painful stimulus is reduced after application of a placebo cream to increase the size of the placebo effect.
One group was told of the pain reduction, while the other was not. The group who were told showed no enhanced placebo response, which supported the expectancy theory. A multiple regression also carried out as part of the study indicated that the effects of conditioning were completely mediated by expectancies.
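
The logic of such a mediation analysis can be sketched with the usual three regression equations. The notation is an assumption introduced here, and the original study may have specified its models differently: $X$ denotes the conditioning manipulation, $M$ the measured expectancy and $Y$ the reported pain reduction.
\[
Y = i_{1} + cX + e_{1},
\qquad
M = i_{2} + aX + e_{2},
\qquad
Y = i_{3} + c'X + bM + e_{3}.
\]
Complete mediation corresponds to the direct effect $c'$ being indistinguishable from zero once $M$ is in the model, so that the total effect $c$ is carried by the indirect path $ab$ (for least-squares estimates on the same sample, $c = c' + ab$).
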
This seemed to be convincing evidence in favour of the expectancy theory. However, it is worth noting that some authors \cite{Stewart-Williams2004a} argue that conditioning is a mechanism, not a theory, and they claim that conditioning is one method through which expectancies are formed. This theory does not appear to be plausible given the existence of placebo responses which have only been demonstrated with conditioning mechanisms \cite{Benedetti2003a}.
Given the expectancy theory's superiority within the field at present, it is important to look more closely at what the term means. Some authors \cite{Stewart-Williams2004a} argue that expectancies are necessarily conscious, which is a position which seems improbable, given the lack of awareness that typically accompanies observed placebo effects, and the deception which appears endemic to the field of study \cite{Miller2008a,Miller2008}.
Stewart-Williams \textit{et al} were criticised for their position on expectancies \cite{Kirsch2004} and later retracted it, at least in its strongest form \cite{Stewart-Williams2004}. Expectancy is a catch-all phrase, and while it appears to have applications in a wide variety of areas \cite{Montgomery2007}, the term is far too broad to focus research specifically.
The value of the expectancy framework is that it has provided both a common vocabulary and a common mechanism for the measurement of placebo responses. The very broadness of the construct allows for it to be used in a variety of situations, which has ensured its survival in the field. This is also the worst part of the definition, as its wide-ranging applicability coupled with its lack of falsifiable predictions has meant that it is regarded as a theory in the abstract, but research on the determinants and measurement of expectancies has not progressed much in the past decade.
Recent research has shown that conditioning and expectancies interact to produce much more sustained placebo effects than either alone. The implications of this, and other research findings discussed in this chapter will be expanded upon in Chapter \ref{cha:notes-towards-theory}.
There is some contradictory evidence on the relationship between conditioning and expectancies, as Klinger \textit{et al} \cite{Klinger2007a} found that participants who had been conditioned to respond to placebo actually reported greater pain relief when they were given neutral, rather than positive, instructions. This may indicate that the expectancy subsystem can interfere with the conditioning system, suggesting that the two methods may activate neurotransmitters and hormones which regulate one another. That being said, this is not a finding which has been replicated, and so it may just have resulted from the particular suggestions given to the participants in this study, as occurred in the study of Levine and colleagues \cite{Levine2006}.
A competing perspective on the placebo has been advanced recently by Michael Hyland \cite{Hyland2006}. Hyland's theory is called motivational concordance, and it regards the behaviours which people engage in and the meanings that they attach to these as primary, rather than the cognitive focus of response expectancy theory.
His research seems to show that depending on how a particular therapy is framed, different variables can predict the placebo response. In the study cited above, spirituality predicted the placebo response to Bach herbal essences, while expectancy was not an independent predictor. In further research \cite{Hyland2007} he established that this is only the case when flower essences were framed as a spiritual treatment, and not when they were described as motivational tools.
He also noted that when a placebo sleep therapy (which involved writing down things which participants were grateful for) was utilised, gratitude was the best predictor, and again, expectancy added nothing to the results.
These findings are quite interesting, as they imply that although expectancies may contribute to the placebo effect, they do not account for it totally. Some evidence which would argue for this theory's usefulness may be the effects of psychotherapy.
Psychotherapy is perhaps the very definition of a healing ritual, and some recent evidence suggests that perceived assignment to a real therapy is a much stronger predictor of improvement than actual assignment is \cite{Bausell2005,Linde2007}.
One issue with the theory of Hyland is that it has not undergone extensive testing, and has never been analysed in a double-blind design, so it is unknown at this point to what extent it will generalise across the various conditions of placebo administration. For example, his results could be due to experimenter effects and demand characteristics. This is less likely as interactions between the researcher and the participants were minimised, but the possibility has yet to be seriously examined.
Another recent theory regarding placebo effects is that of Geers \cite{Geers2005a}. Geers \textit{et al} note that motivational approaches to placebos were popular in the past, and suggest that this perspective may prove fruitful for an analysis of placebo effects. Their research used priming techniques in order to influence the desires of participants to respond to the treatment.
The major finding of Geers \textit{et al} across a number of studies \cite{Geers2007,Geers2005a} was that placebo effects were significantly greater after participants had been primed with cooperative goals. Their research also showed that expectancies had an impact, but again it was not independently significant after motivation was controlled for in a stepwise multiple regression procedure\footnote{But see Section \ref{sec:regress-models} for some discussion of problems with this approach.}. Effects of motivation were also demonstrated by Jensen and Karoly \cite{Jensen1991}. The Jensen \textit{et al} research found that motivation was a predictor of placebo response while expectancies were not, while the Geers \textit{et al} studies found that motivation and expectancies interacted to produce the observed effects.
Some research does suggest that goal-directed behaviour may be associated with endogenous dopamine release \cite{Scott2007a}, which could provide a plausible mechanism through which goals and motivation help to activate placebo effects. Some other evidence that would support the theory of Geers is that patients suffering from illness typically show much greater placebo responses to experimental pain than do healthy controls \cite{Klinger2007a}. The research of Geers \textit{et al} used various different priming manipulations to increase motivation to respond, which also suggests that implicit (or unconscious) motivations may be able to influence the response to placebo.
Some researchers, arguing from an anthropological perspective \cite{Thompson2009} have claimed that all of the current theories are far too cognitively focused, and argue for a conception of the placebo response as residing in embodied experience, rather than the constructs used to describe it currently.
This seems like an interesting hypothesis, but it has not undergone much testing. It does, however, fit with recent conceptions of cognition as embodied \cite{wilson2002six}, especially the conception that the body is intimately involved in cognition.
An experimental study \cite{Geers2006} demonstrated increased placebo responses when participants were asked to attend to bodily symptoms, which suggests some role for somatic awareness in the effect. This theory would also fit nicely with the recent meta-analysis \cite{Meissner2007} which noted large placebo effects on peripheral outcome parameters in organs, but very little in hormones.
This would fit the data, as there are feedback mechanisms from organs to brain (through the central and peripheral nervous system), but the hormone system does not have as immediate feedback links to the parts of the brain involved in placebo effects, which would suggest that this theory deserves some credence. This theory is developed further and the implications expanded upon in Chapter \ref{cha:notes-towards-theory}.
A final theory concerning the placebo effect is the framing of Daniel Moerman \cite{Moerman2000a,Moerman2003}, who conceptualises the effect as a meaning response, which is a useful idea as it brings awareness to the important intra-individual factors which underlie the observed placebo effects.
However, research designs and ways of distinguishing meaning from expectancies are sorely lacking, so at present this theory is little more than a clever name change for the same old effect. This theory could be tested by requiring participants in clinical trials to keep diaries of their experience and analysing them using a structured approach to determine the individual construction of experience which presumably underlies the construct of meaning.
\section{Moderators of the Placebo Effect}
\label{sec:moder-plac-effect}
Moving on from theories about the nature of the effect, the next step is to examine factors which can moderate the placebo responses observed in research and clinical practice, covering both experimental data and the results of large meta-analyses.
% The placebo response is interesting in that it pointed towards cognitive influences on physical health long before there was any interest in such matters in other parts of medicine.
This section will address potential moderators of the effect. These come in two broad categories, in that we have the relationships developed from single studies, and we also have the relationships made clear through meta-analytic techniques. Both of these approaches have their strengths and weaknesses.
Single studies can often be biased by the researcher, the setting and factors such as the precise wording of suggestions. Meta-analysis should, in theory, remove these issues as problems, as (assuming the differences follow a normal distribution) they will cancel each other out when an average effect is calculated. However, this can cause problems of its own, as study selection (which is a subjective process) takes on huge importance in the understanding achieved by the meta-analytic procedure.
As an example of these problems, two meta-analyses were conducted on the placebo response in Irritable Bowel Syndrome \cite{Patel2005,Enck2005}. One found that the number of study visits increased the placebo response, while the other claimed that the number of visits decreased the effect. These meta-analyses both included one hundred studies, but only twenty-six were common to both \cite{Klosterhalfen2008}. This example highlights the critical importance of looking at both single studies and meta-analyses if a comprehensive picture is desired.
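
The cancelling-out argument corresponds to the random-effects model commonly used in meta-analysis; the notation below is introduced for illustration only. Each of $k$ studies yields an effect estimate $\hat{\theta}_{i}$ with sampling variance $v_{i}$, modelled as a common mean $\mu$ plus a study-specific deviation $u_{i}$ and sampling error $\varepsilon_{i}$:
\[
\hat{\theta}_{i} = \mu + u_{i} + \varepsilon_{i},
\qquad
u_{i} \sim N(0, \tau^{2}),
\qquad
\varepsilon_{i} \sim N(0, v_{i}),
\]
\[
\hat{\mu} = \frac{\sum_{i=1}^{k} w_{i}\, \hat{\theta}_{i}}{\sum_{i=1}^{k} w_{i}},
\qquad
w_{i} = \frac{1}{v_{i} + \tau^{2}} .
\]
The deviations $u_{i}$ average out in $\hat{\mu}$ only if the included studies are a fair sample of the relevant literature, which is why the choice of studies carries so much weight.
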
\subsection{Effects Discerned from Single Studies}
\label{sec:effects-disc-from}
\subsubsection{Psychological Characteristics}
\label{sec:psych-char}
The first kind of moderators of placebo which will be reviewed are psychological characteristics of the individuals under study. Both state and trait variables may be involved here, though most of the research has focused on traits, as they tend to be easier to measure.
The first trait to examine is that of dispositional optimism, which is often defined as \textit{generalised outcome expectancies about the future}. When using this definition, it seems relatively likely that there may be a relationship between optimism and placebo response, and yet this has only been investigated in recent years.
Dispositional optimism appears to exert some influence on placebo effects in some situations \cite{Geers2005,morton2009reproducibility}. The effect seems to be that those higher in optimism respond better to positive suggestions, while those higher in pessimism respond better to negative suggestions. Another study found that general (but not specific) expectancies had a significant impact on the response to placebo in a meta-analysis of randomised controlled trials of chronic back pain \cite{myers2008patient}. Generalised expectancies are essentially how optimism is defined, given that all expectancies around future states of health are essentially outcome expectancies.
Other studies appear to show that this optimism effect is not general, but rather depends on the context in which the experiment takes place \cite{Hyland2006}. In this experimental study, spirituality rather than optimism was a predictor of the response.
Crucially, this only occurred when the treatment was classified as spiritual. When a gratitude-based treatment was used, gratitude acted as a predictor. These results suggest that any trait which predicts placebo response will likely only be effective in certain situational settings \cite{Kaptchuk2008a}. Moerman's meaning response theory would seem to be the most apt framework for understanding these effects, as the only element which differs in these experimental designs is the meaning which participants assign to the stimulus which acts as a placebo. Alternatively, one could argue that expectancies drive these effects by mediating the impact of other contextually relevant variables. However, taking this position would require abandoning Kirsch's theory that expectancies exert direct physiological effects \cite{Kirsch1985}.
A recent study \cite{morton2009reproducibility} in a placebo analgesia paradigm argues for a stronger interpretation of the role of optimism. This experimental study used a repeated measures design, and utilised a preconditioning method in the first session which is known to increase the size of the placebo response \cite{Voudouris1985}.
While in the first session there was no effect of optimism on the results, in the second session dispositional optimism was significantly correlated with placebo analgesia, explaining 55\% of the variance. This would suggest that while optimism may not produce a placebo response in itself, once a response has been produced it can be effective in maintaining it over time.
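
As a point of reference, and assuming that the variance figure derives from a simple bivariate correlation (an assumption made here, as the coefficient itself is not reported above), 55\% of variance explained corresponds to
\[
|r| = \sqrt{0.55} \approx 0.74 ,
\]
which is well above the value of 0.5 conventionally described as a large correlation.
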
Hyland suggests that the optimism effects on placebo response are mediated through expectancies, and that when these are not a factor, neither is optimism. This sounds plausible, but the relationship could easily go the other way in that optimism could drive the observed effects of expectancies. This is not a question which can be answered without further empirical research, which was conducted as part of this research (see Chapter \ref{cha:primary-research}).
\subsubsection{Gender}
\label{sec:gender}
Gender appears to be an important factor in placebo effects, with differing results being noted depending on the gender interactions between experimenters and participants. A good example of this is the Oken \cite{Oken2008} study which looked at the effects of placebo pills on cognitive functions in older adults. This study had female experimenters, and a large placebo effect was shown for male participants but not for female participants. A Flaten \cite{Flaten2006} study also showed effects of gender on placebo response, as in this study with female experimenters, males did not show a placebo response, but females did. The authors explained this in terms of males being less willing to admit that they were in pain to female experimenters.
Another study \cite{Zubieta2006} showed an enhancement of dopamine production in males following placebo administration but not in females. The authors note that this may be because of physiological differences, but gender was not ruled out as a cause of this effect during this study. Another example of gender influences on placebo treatments was observed by Milling \cite{Milling2007} when looking at the effectiveness of hypnotic, CBT and placebo treatments for pain. They observed that there was a significant effect of gender, but failed to note the gender of the experimenters, which renders the effects of any interaction difficult to interpret.
\paragraph{Suggestion}
Suggestion is a feature which, while prominent in explanations of hypnotic phenomena, is often neglected in studies of the placebo. This is despite the fact that the placebo phenomenon is often brought about by suggesting to participants that they have received an effective treatment. Recently, Kirsch proposed that placebos could be fruitfully considered in terms of suggestion rather than expectancies \cite{Kirsch1999}. This viewpoint seems illuminating, as there are large differences in the size of placebo effects depending on the type of suggestion used.
This line of research began with Kirsch \cite{kirsch1988double}, who looked at the effects of either telling participants that they would receive coffee, or that they might receive coffee. This clever design mirrors the difference between placebo studies and double-blind trials. This experimental study found that when coffee was deceptively administered, there was a much larger effect. This finding has been replicated in a clinical setting using analgesics following surgery, with the same results \cite{Amanzio2001}.
These research findings argue in favour of the suggestion in placebo being one of the major factors in driving the effect. Additionally, other authors have suggested that suggestion and placebo have much in common, and the lack of linkages between them may be due to lack of clarity in definitions \cite{DePascalis2002}.
One experimental study which shows the subtle effects of suggestion was the work of Levine \textit{et al} on motion sickness \cite{Levine2006}. In this study, participants were told that one placebo pill would reduce motion sickness (placebo suggestion) while the other would reduce spinning, but would increase the other effects (nocebo suggestion). Contrary to the hypothesis, participants in the nocebo condition showed the greatest reduction in symptoms. This was probably because of the caveat in the nocebo suggestion that spinning was felt by others to be the worst symptom. This example illustrates the need to be extremely careful when giving suggestions, lest an opposite result be obtained. This point is further discussed in Chapter \ref{cha:primary-research}, and in Chapter \ref{cha:general-discussion}.
Furthermore, some research in hypnosis indicates that even the features associated with hypnosis (lack of memory, lack of volition, etc.) are themselves the result of personal and cultural suggestions \cite{Kirsch1999}. This does seem to suggest that hypnosis and its effects are just as subject to suggestion as any other interpersonal phenomenon. It may be that placebo effects are merely the result of suggestions (conscious or unconscious) which are given in the domain of health, while hypnosis merely refers to suggestions given in the context of hypnotic treatment or entertainment.
Suggestion also appears to be able to override conditioning in some situations \cite{Benedetti2008}. In this study, after pre-conditioning with ketorolac (a non-opioid painkiller), analgesia could be induced with positive suggestions, while if negative suggestions were given, then they were able to override the prior conditioning. This may be an effect of salience asymmetry \cite{Rothermund2004}, where negative stimuli (in this case, suggestion) are more salient, or it could be related to the non-opioid nature of the conditioning. A contradictory finding, discussed above, was that in a trial of placebo analgesia, conditioned participants demonstrated a greater response to placebo following neutral, rather than positive, suggestions \cite{Klinger2007a}.
\paragraph{Somatic Focus}
Somatic focus, or the focus on internal bodily sensations (the proprioceptive sense), appears to have an impact on the response to placebo, though this finding has only been demonstrated in a small number of studies.
This finding arises from the work of Geers \textit{et al} \cite{Geers2006} on somatic focus and its effect on the placebo response. In summary, this experimental study asked half the participants to attend to their somatic sensations following placebo administration, and gave the other half no such instructions. A similar finding was made by Rainville \textit{et al} with regard to hypnotic suggestions \cite{Price2008}.
The participants who focused on their bodily sensations showed an increased placebo effect, which is an interesting finding for many reasons. Firstly, it suggests that the effectiveness of a treatment can be increased by asking participants to pay attention\footnote{This may also be a potential mechanism through which MBSR exerts its beneficial health effects.}.
Secondly, it links in with an explanation given for differences in placebo response across treatments following a meta-analytic review \cite{Meissner2007} where the authors provide evidence that placebo effects are not common where the outcome measure is a hormone level, while they are common where the outcome measure is a peripheral disease parameter. They suggest that this occurs because nervous system feedback loops are available for the second kind of outcome, but not for the first. This finding is discussed further in Chapter \ref{cha:notes-towards-theory}.
\subsubsection{Psychopathology and catastrophising}
\label{sec:psych-catatr}
Some other personality characteristics have been linked to placebo response and also to active treatment response. Firstly, a recent controlled trial \cite{Wasan2005} showed that participants with higher levels of psychopathology (as measured using self-report scales) derived significantly less benefit from analgesic treatment, but significantly more benefit from placebo analgesia treatment. Levels of optimism also correlate with psychopathology, which may be a potential cause of this finding \cite{Carver2010}.
A second finding in this area \cite{Sullivan2008} showed that in a trial of amitriptyline and ketamine versus placebo, the extent to which participants catastrophised about pain determined their treatment response. High catastrophisers reported a large effect from placebo but low effects from the active treatment, while for those low in catastrophising the results were the opposite. It is worth mentioning here that this Sullivan trial was a secondary analysis of a null result, so some caution should be taken in its interpretation. No such caveats apply to the Wasan \textit{et al} trial.
changes as the result of placebo administration. Thirdly, the effect can be altered by gender effects and the setting in which the research is carried out (clinical trial or experiment). Fourthly, the effect is still not completely understood, yet research has come far closer to elucidating the complexities in this one area. For this reason, the placebo analgesia setting is perhaps the best place in which to carry out research aimed at getting to grips with the prediction, enhancement and control of this mysterious phenomenon.
\subsection{Provider Factors}
Another factor which is often claimed to be of importance in placebo effects is the patient-provider relationship. The classic study in this field was performed by Thomas \cite{thomas1987general}, in which patients suffering from unclear symptoms were given either a positive or a negative consultation. The results of this study showed that two thirds of the patients given a positive consultation improved, while only one third of the patients given a negative consultation did. This finding influenced many researchers in the field over the next few decades.
However, more recent research has failed to replicate this effect in a sample of patients with pain problems \cite{Knipschild2005}. This newer study worked with general practitioners in the Netherlands, and although a large sample size was used, they failed to find any significant effects based on the positive consultation.
A few caveats apply here. Firstly, while the Thomas study had just one doctor involved, there were over 40 in the Knipschild study. Secondly, Thomas dealt with many different kinds of patients while Knipschild dealt only with pain patients. Thirdly, the Knipschild study used normal general practice clinics while a student sample was used in the Thomas study. Knipschild and Arntz suggest that the charisma of Thomas may have had something to do with his effectiveness.
They also note that many of the GPs did not like giving negative consultations, and the tape-recorded interviews suggested that they were far more comfortable dispensing clear advice. Although there is substantial evidence for the impact of good patient-provider relationships in the literature \cite{DiBlasi2001}, the specific matter of whether or not a positive consultation improves medical outcomes must be regarded as open at present. The Di Blasi systematic review above looked at the impact of cognitive care and emotional care, and argued that these two features drive much of the patient-provider effects on health outcomes.
A recent study \cite{Kaptchuk2008} using an RCT design with patients suffering from Irritable Bowel Syndrome (IBS) appears to indicate that interaction between patient and provider is a critical part of the placebo response. This study utilised sham acupuncture and divided participants into three groups. One received no treatment, another had sham acupuncture with minimal interaction, while the third group had sham acupuncture and large amounts of interaction. The results showed that the third group had much better recovery rates than either of the other two groups, which would seem to suggest that the patient-provider interaction drives much of the observed placebo response, at least in this setting.
In addition, the documented success of treatments in medicine which have later been shown to be no better than a control treatment definitely involves the placebo effect. This finding comes from Roberts (1993), and is reported in Moerman (2000) \cite{Moerman2000a}. The most famous example of this is surgery for angina pectoris carried out in the 1950s, which was shown to be completely ineffective in blinded trials. A more recent example was a study conducted on osteoarthritis of the knee, where there was no significant effect of the surgery \cite{horng2002placebo}. Nonetheless, this treatment was found effective by many patients before this trial, suggesting that there are significant effects deriving from the provider's interaction with the patient.
Moerman argues that the reason that these treatments were effective is that physicians believed in them, and they communicated this belief to their patients.
This belief (either directly or indirectly) caused the patients to respond well to the treatment \cite{Moerman2000}. Some authors have suggested that these patient-provider effects are the result of neural patterns laid down by caregivers in early childhood \cite{Kradin2004}.
More broadly, patient-provider effects on health outcomes can be conceptualised as a form of experimenter effect, whereby the provider exerts an influence on the results. This is the positive side of demand characteristics, where the patient becomes aware of the beliefs of the provider and responds to these. An extremely good study of these effects was carried out by Walach \textit{et al} \cite{Walach2002}. In this study, two students were recruited as experimenters, and induced to have either positive or negative beliefs about the efficacy of placebo. These experimenters then carried out the same experiment, and achieved results in line with the induced expectancies.
% This would tie in with findings that PhD scientists are far more likely to find a significant placebo effect than are medical doctors can't find a cite for this, but I definitely remember reading about it.
\subsubsection{Treatment Factors}
\paragraph{Type of Placebo}
\label{sec:type-placebo}
This section will examine the effects of different kinds of placebo on their effectiveness. The most recent study in this field looked at the differential effects of two different kinds of placebo therapy \cite{Kaptchuk2006}. In this study two different kinds of placebo (a pill and a medical device) were used, and showed differential outcomes on the recovery of the patients involved in the trial.
These kinds of findings are one of the best proofs that placebos actually produce measurable effects, as if they do not, then the effects of two placebo therapies should not differ significantly from one another \cite{Kaptchuk2006}. In the Kaptchuk et al study, it was shown that a sham device for acupuncture produced much greater placebo responses than an inert pill.
Another study demonstrated that injections were associated with a much greater placebo response than pills in the treatment of migraine \cite{Craen2000}. Physical placebos such as ultrasound have also been shown to have a larger average placebo effect size \cite{Ernst1995b}. The reasons for these differences are not clear at present, but some thoughts on this matter are presented in Chapter \ref{cha:notes-towards-theory}.
Another factor which affects the response to placebo is the name of the placebo. A recent study found that responses to a placebo of the same name remained almost entirely constant, while the same inert cream given a different name evoked different responses from the same group of participants \cite{Whalley2008}. The authors use this finding to argue that the placebo effect is completely inconsistent. This is not a particularly strong argument, as the name given to a placebo is one of the few pieces of information given to the participants in a typical trial, and is therefore one of its most important features.
\paragraph{Price}
\label{sec:price}
Another factor which may affect the placebo response is price \cite{Shiv2005a}. The Shiv et al study utilised an energy drink distributed to college students in an on-campus gym under two conditions. In the first, they merely received the drink and were asked to solve a number of puzzles. In the second, they received the same drink, but were told that the price had been discounted (without being given a reason). The participants in the second condition solved significantly fewer puzzles than those in the first, suggesting an impact of perceived price on the effectiveness of the energy drink.
This is extremely topical, given the nature of the patenting process on pharmaceutical drugs and the proliferation of generic drugs following the expiry of the original patent. This finding probably reflects cultural associations of price with value, and one could hypothesise that in other cultures, items perceived as being of greater value would invoke similar effects.
This finding has been replicated in placebo analgesia \cite{Waber2008}, which is perhaps more relevant to the discussion about pharmaceutical drugs. It is worth noting, however, that in neither study were the participants actually required to pay for the drugs, and such a study would have much greater external validity and relevance to health care policy makers. This research also ties into the classic paper by Branthwaite on branded and unbranded pills for the treatment of headaches \cite{Branthwaite1981}. This experimental study, using an extremely large sample, found that branded placebos were more effective than unbranded placebos, suggesting that either advertising or prior learning can affect the effectiveness of two identical preparations.
Again, the findings discussed in this section can be interpreted in terms of suggestion and expectancies. Price is typically taken as a signal for quality in Western societies, and particular brands of pharmaceuticals and medicines can be associated with relief. That being said, the branding experiments are equally conducive to being explained in terms of conditioning, while the price findings are certainly expectancy driven.
\subsubsection{Evidence from meta-analysis}
\label{sec:evidence-from-meta}
One interesting finding from a meta-analytic study comes from the work of De Craen \cite{Craen2000}. This study showed that injection placebos were more effective than pill placebos in the treatment of migraine. This could be the result of injections being typically associated with stronger painkillers while pills are often sold over the counter, thus leading to stronger response expectancies regarding the injection. This finding can also be explained by the effects of prior experience, and is therefore also compatible with a conditioned model of the placebo effect. It appears to relate to the findings noted above about sham devices and physical devices being more effective than inert pills. While it might seem appropriate to discuss the Hr\'{o}bjartsson and G{\o}tzsche meta-analysis in this section, this debate is better suited to its own section, which is Section \ref{sec:nature-existence}.
\subsection{Setting factors}
\label{sec:setting-factors}
\subsubsection{Geographical differences in placebo response rates}
\label{sec:geogr-diff-plac}
Another fascinating result from meta-analytic study was found by Garud et al \cite{Garud2008}. This meta-analysis looked at the placebo response in ulcerative colitis and found that there were significant differences between the same placebos based on the geographical location of the trials.
In the USA, placebo response rates were 10\% lower than in Europe, suggesting that some cultural force might be driving this difference. It is hard to see what this cultural difference could be though, given the substantial heterogeneity which exists between countries in Europe, which is far greater than the comparable differences between states in the USA.
Nonetheless, this meta-analysis points toward some kind of cultural or geographic factor which can influence the placebo response, though a more nuanced and systematic explanation is lacking at present.
This would link in with earlier work carried out by Moerman into variability of placebo response rates across countries and cultures \cite{Moerman2000}, where differences in ulcer rates and blood pressure across cultures were noted. While this original meta-analysis by Moerman was criticised for a lack of rigour, another analysis by De Craen which did not suffer from these problems \cite{Craen1999a} confirmed these results.
\subsubsection{Effects of trial/study design}
\label{sec:effects-clin-trial}
The placebo response is commonly regarded as a nuisance in clinical trials, and the most recent medical guidelines suggest that placebo controlled trials should only be used when there is no proven alternative \cite{temple2000placebo}. This often makes ethical review committees reluctant to allow placebo conditions, and if they are allowed, the placebo group is often very small, in order to minimise the risk.
A recent meta-regression suggests that this may be counterproductive \cite{Papakostas2009}. This research found that as the size of the placebo group decreased, the size of the placebo response often increased, in some cases meaning that the trials could not show an advantage of drugs over placebo. The authors speculate that, as the participants were informed of the drug:placebo ratio as part of informed consent procedures, they had stronger expectancies on the likely results of treatment and therefore they reported a larger response to treatment, which caused the placebo response rates to increase. This situation is an excellent example of the law of unintended consequences, and is an interesting object of study in its own right.
One finding which provides cause for caution is that effect sizes for placebo tend to correlate with the size of the trial, suggesting that regression to the mean may be responsible for some of the effects observed in clinical trials \cite{Enck2005a}. To a certain extent, given the nature of clinical trials, this is unavoidable, as the selection criteria will tend to select groups of participants most likely to demonstrate this phenomenon. Unfortunately, though no-treatment groups provide an excellent bulwark against this confounder, most clinical trials do not have them.
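One common way to formalise regression to the mean is sketched here, under the simplifying assumption (introduced for illustration, and not made in the cited work) that baseline and follow-up scores $X_1$ and $X_2$ are bivariate normal with a common mean $\mu$, a common variance and a test-retest correlation $\rho$:
\[
  \mathrm{E}\left[X_2 \mid X_1 = x\right] = \mu + \rho\,(x - \mu)
\]
Whenever $\rho < 1$, participants who were enrolled because their baseline score $x$ was extreme are expected to score closer to $\mu$ at follow-up even in the absence of any intervention. This is why a no-treatment arm is needed to separate this artefact from a genuine placebo effect.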
\subsubsection{Certain and Uncertain Expectations}
Perhaps the most important feature of trial design affecting the response to placebo is the influence of suggestion. In most clinical trials, participants are informed that they will receive either active treatment or placebo. Some authors have suggested \cite{kirsch1988double} that this process diminishes expectancies related to the treatment efficacy, which in turn reduces the effects \cite{Kleijnen1994}. The Kirsch study noted above looked at the effects of differing instructions on the results of ingesting placebo caffeine, and showed larger effects when participants were given placebo coffee with suggestions that it was real than when they were told there was a 50\% chance they would receive placebo. A replication attempt by Walach \textit{et al} did not confirm this finding, which may suggest that such an effect is contingent on the cultural background of the participants.
More recently, Amanzio and colleagues \cite{Amanzio2001} replicated this finding with patients recovering from thoracic surgery. The principal finding was that those patients who believed they were getting a real medicine required much less analgesia than those who believed that they might receive placebo. Such an effect could account for the differences in effect sizes seen between experimental and clinical studies of placebo. Indeed, Amanzio \textit{et al} suggested that variations in placebo response were responsible for much of the variability in the response to analgesics in general.
A second, related factor may be the use of suggestions in the experimental research. Participants are typically told that they will receive a powerful painkiller before placebo administration, whereas in the clinical trial, no such instructions are given. This finding regarding certain and uncertain expectations was also replicated in a test of a placebo sleep therapy by Geers \cite{Geers2005a}.
Linking to the discussion above regarding certain and uncertain expectations, it may also be important to examine a paper by Ploghaus et al \cite{Ploghaus2003} where the authors argue that certain expectations of aversive events are associated with fear, while uncertain expectations are associated with anxiety. Anxiety is associated with both the nocebo effect and the production of the hormone CCK, which may be why uncertain expectancies appear to lead to lower placebo effects \cite{Colloca2008b}. These two emotions activate differing parts of the brain, and given the finding that dopamine systems are activated differentially by certain and uncertain expectancies \cite{Scott2007a}, this may point towards some important future avenues for research. This new focus on the brain and body correlates of placebo effects has contributed much to the field, as we will see below in Section \ref{sec:neur-plac-effect}.
\subsubsection{Treatment Preference}
\label{sec:treatment-preference}
There is some evidence that benefits accruing from clinical trials may result from the patients' expectancies about whether or not they have received the real treatment \cite{Bausell2005}. This study showed no difference between sham and real acupuncture but showed large differences between the outcomes of those who believed they received real treatment versus those who did not. This factor is typically ignored in clinical trials, although prominent commentators have argued that it should be taken more into account \cite{Benedetti2007}. This finding was later replicated in a pooled analysis of four clinical trials of acupuncture \cite{Linde2007}. This study suggested that although real acupuncture showed similar improvements regardless of expectancies, the minimal acupuncture group's improvement was dependent on their expectancies around acupuncture and perceived treatment assignment.
In conclusion, the placebo response is a complex phenomenon and can be impacted by internal participant factors, features of the patient-provider relationship, features of the treatment itself, and also features of the setting in which the treatment is administered. Very few studies control all of these factors, and this may contribute to some of the confusion and controversy surrounding the construct.
\subsection{Just how powerful is the placebo?}
One well known meta-analysis suggested that the benefits of placebo were negligible in most areas \cite{hrobjartsson2001placebo}, with the exception of pain trials. While this meta-analysis has been critiqued for its ignorance of psychological studies of placebo \cite{Evans2003, Stewart-Williams2004b} and for the combining of placebo effects across 200 plus treatments \cite{Wickramasekera2001}, it has focused research on the placebo into the area of pain since its publication.
One of the major reasons for the popularity of pain studies in placebo research is probably the large effect sizes, as measured by Cohen's $d$. While effects in some areas range from about $d=.15$ to $d=.25$, the effect sizes in pain studies tend to be much larger, ranging from $d=.45$ to $d=.95$ \cite{Vase2002}. Given that the $d$ measure expresses effect sizes in terms of standard deviations, an effect of between a half and one standard deviation is quite respectable, and allows for smaller studies to examine effects of interest. But again, it has been argued that these effect sizes are illusory, and result from lack of blinding, inadequate controls and poor randomisation \cite{Hrobjartsson2003,Kienle1997}.
These two viewpoints can be reconciled, at least in the opinion of Vase et al (2002) \cite{Vase2002}. In this meta-analysis, Vase and colleagues looked at the sample of placebo pain trials in both the clinical area and the experimental area. What they found was that effect sizes tended to be small when the placebo was used in a clinical trial, and much larger in experimental studies of the placebo effect. This analysis was disputed by Hr\'{o}bjartsson and G{\o}tzsche \cite{Hrobjartsson2003}, who noted problems with the methods of analysis chosen by Vase et al. Even using the more conservative estimates of Hr\'{o}bjartsson and G{\o}tzsche, the effect size from experimental research ($d=0.5$) is still twice as large as those observed in clinical trials. There are a number of factors which differ in these two contexts which could be responsible for these observed differences.
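As a reminder of what these values mean, the conventional textbook definition of Cohen's $d$ for two independent groups is given below. The notation is introduced here for illustration (subscripts $P$ and $C$ for the placebo and comparison groups), and the cited meta-analyses may have used variants of this estimator:
\[
  d = \frac{\bar{x}_P - \bar{x}_C}{s_{\mathrm{pooled}}},
  \qquad
  s_{\mathrm{pooled}} = \sqrt{\frac{(n_P - 1)\,s_P^2 + (n_C - 1)\,s_C^2}{n_P + n_C - 2}}
\]
On this scale, $d = 0.5$ means that the placebo group mean lies half a pooled standard deviation from the comparison group mean.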
% although the two types of research seem to give us different information, they are mutually interdependent, as when enough single studies are conducted, this data becomes ripe pickings for a meta-analysis.
% The major conclusion we can draw from this section is that the placebo response is a complex phenomenon, upon which many factors exert an influence and that we need to bear in mind the individual level factors, the interaction factors, the treatment factors and those factors which only become apparent after looking at aggregated results of many similar studies.
If there is a gap in the literature, it probably results from a paucity of meta-analytic studies on non-clinical trials of placebo (with some notable exceptions \cite{Wampol2007,Vase2002}), but this is a matter that could be easily addressed by future research. It is, however, a matter worth addressing, as the few studies we do have indicate that there are a number of major differences between the data revealed by each of these methodologies.
The above review of the research relating to features associated with response to placebo has established the following.
\begin{itemize}
\item Placebo effects appear to be related to expectancies and through these, optimism
\item Gender may be a factor in the response to placebo
\item The relationship between patient and provider appears to drive some of the observed placebo effects
\item Study design and the expectancies induced by different study designs appear to have a large impact on the outcome of the trials
\item The perceived value of a placebo (as expressed through price and branding) affects the response
\item Treatment preference and beliefs about assignment to treatment appear to be more important than actual treatment assignment.
\end{itemize}
\section{Debates about the Nature and Existence of Placebo Effects}
\label{sec:nature-existence}
The placebo is quite a mysterious phenomenon and does not fit very well into the biomedical paradigm \cite{Kaptchuk1998,Caspi2002}. In this school of thought, specific treatments exert action on specific parts of the body and cause relief from sickness. The huge effectiveness of quinine, vaccination and pasteurisation on the health of European societies took place without any regard to the mental state of the patients receiving the treatment, and this encouraged a model of the body as machine, where certain inputs led to certain outcomes \cite{Caspi2002}.
It was this very commitment to evidence and specific remedies that eventually led to the restoration of the place of the placebo effect in medicine. The first clinical trial was conducted in 1913, but the method was not widely adopted until after World War II \cite{Kaptchuk1998}. The adoption of randomised controlled trials led, on the one hand, to the denigration of the placebo as a mere experimental tool and, on the other, to an environment where its effectiveness in a wide variety of situations could be recognised.
The placebo had concurrently fallen out of official use by doctors as it was regarded as a deception, although recent surveys in this age of far greater attention on ethical prescription show that many medical practitioners do not take this advice \cite{hrobjartsson2003use,sherman2008academic,tilburt2008prescribing,Sherman2008,Sherman2008a,Sherman2008b,Sherman2003,Ross1983,Buckalew1981,Krugman1964,Ross1962}.
% The placebo languished in the bowels of the developing controlled trials until in 1955, when Beecher published The Powerful Placebo and introduced a new excitement into the study of the field. Beecher also introduced some unfortunate canards into the field, with his research noting that around 1/3rd of patients reported a placebo effect. This statistic has been carried from article to article for over fifty years now, despite evidence that it is not even remotely true \cite{Kienle1997}.
The placebo remained a topic of interest to some scientists and psychologists for the next few decades, but many argued that it could not be real on principle, and others that while the effects seemed real, they were due to demand characteristics of the experimental situation or response biases. % Some even invoked signal detection theory in order to explain the effects away \cite{Allan2002}.
This is despite evidence from the early 1960s of placebo effects in other mammals \cite{Herrnstein1962}, which obviously could not be attributed to either of these experimental artefacts. Some argued that if the placebo did have an effect, then it could only be on subjective outcomes \cite{Hrobjartsson2001}.
A new phase of placebo research began in the late 1970s with the demonstration that placebo analgesia could be blocked by naloxone, an opioid antagonist \cite{Levine1978a,Levine1984,Fields1981,Gordon1981,Levine1979}. This research was quickly contextualised by evidence which showed that although some placebo analgesic responses were mediated by endogenous opiates, not all could be \cite{Gracely1983,Levine1984}. These experiments represented perhaps a tipping point in the study of placebo. No longer could these effects be dismissed as mere response biases, and researchers in the field finally had a physiological mechanism to point at to convince doubters that the placebo effect was a real phenomenon and was worthy of study.
The nineteen eighties were a good decade for placebo research. The first placebo conditioning experiments were performed on humans \cite{Voudouris1985}, and a new theory of response expectancy was proposed to account for the effects \cite{Kirsch1985}. These two theories of conditioning and expectancies were tested against one another, and the controversy continues in the field today, despite what many regard as proof positive of the supremacy of expectancies over and above conditioning \cite{Montgomery1997}. More recent research (noted in the placebo analgesia section above) has contextualised the conditions under which placebo effects can be created by conditioning and expectancies \cite{Benedetti2003a}.
In 1998, an article was released that purported to show that anti-depressants were not much more effective than placebo \cite{Kirsch1998}. This research attracted a furore of publicity, and many newspapers argued in favour of ditching medications entirely and resorting to placebo.
While this is a misinterpretation of the research findings (if the patients were given placebo as placebo it is unlikely they would have recovered), it gave new impetus to the field. This research was further backed up in 2002 \cite{Kirsch2002a} by an analysis of all the information submitted to the FDA to prove the efficacy of Selective Serotonin Re-uptake Inhibitors (SSRIs), which revealed little benefit for active drug over placebo. In the field of depression at least, it appeared that the placebo was indeed powerful.
However, in 2001, the field was thrown into controversy. Two well known researchers conducted a meta-analysis on all clinical trials which included both a placebo and a no-treatment condition \cite{hrobjartsson2001placebo} and concluded that there was no evidence of placebo effects on objective parameters and only a minor effect on subjective parameters. This research caused a furore in academic circles, and was widely reported by the same media who had given the placebo such a warm welcome some years earlier. The meta-analysis was widely criticised \cite{Evans2003,Kirsch2001,Wickramasekera2001,Greene2001} on the grounds that it ignored psychological studies of placebo, considered too wide a variety of clinical conditions and did not properly ensure that the no-treatment condition was indeed a no-treatment condition. Nonetheless, this meta-analysis probably focused placebo research into the area of pain for many years afterwards. Updates of the review in 2005 and in 2010 for the Cochrane Library revealed no essential changes in the findings of these authors \cite{Hrobjartsson2004}.
Predictably, a storm of research findings and counter-findings ensued. Vase and colleagues collected data for experimental placebo studies and argued that the null results in the clinical trials studied were due to the lack of suggestion in the clinical trial setting \cite{Vase2002}. The original authors argued that this meta-analysis was sub-standard and that, when the same methods were used as in the original study, much smaller effects were found \cite{Hrobjartsson2003}.
The debate continued in the pages of the Journal of Clinical Psychology \cite{Wampold2005}, where the noted authors argued that placebo effects were revealed in the situations where they were expected, and not elsewhere. Again, the response of Hr\'{o}bjartsson et al was forceful, as they claimed that this was a post hoc rationalisation of the results, and argued in favour of dispensing with the placebo concept outside of clinical trials \cite{Hrobjartsson2007a}. Wampold et al replied to this accusation \cite{Wampol2007} with counter-evidence, and Hr\'{o}bjartsson et al countered that their original conclusions remained solid \cite{Hrobjartsson2007}.
%this section needs more description
The debate rested there for a time. Those researchers working with placebos were convinced that they were working with a real effect, and possessed enough neurobiological and physical evidence that they were not concerned with the by now infamous meta-analysis, while those who did not accept the placebo theories were able to point to a large study that confirmed what they believed to be true. However, more recent research has cast new light on this complex issue.
In a recent meta-analysis \cite{Meissner2007}, Meissner and colleagues reviewed clinical trials using both placebo and no-treatment conditions, and made a surprising discovery. In trials where the outcome measure was a physical parameter, there were large placebo effects, but in trials where the outcome measure was hormone levels, there were no placebo effects. They re-analysed the studies looked at by Hr\'{o}bjartsson and G{\o}tzsche and found that when this classification was utilised, placebo effects were observed in the data. %% this section appears quite a number of times
The debate about the reality and effectiveness of placebos appears to have subsided somewhat, though there are many unresolved issues. While the Meissner et al meta-analysis appears to resolve the major point of contention, more research is definitely needed to ascertain exactly why the effect sizes for placebo responses vary so much as a function of type of study.
\section{Physiology of the Placebo Effect}
\label{sec:neur-plac-effect}
The placebo effect is an interesting phenomenon in that it straddles the boundaries of psychological and physical. This section will examine the research demonstrating the effects of placebo on a neurological level, and then examine other physiological impacts and correlates of placebo administration.
\subsection{Brain Correlates of Placebo}
\label{sec:brain-corr-plac}
A strand of placebo research which has become increasingly important over time is the study of the brain correlates of placebo responses.
% \subsection{Problems with fMRI studies of placebo}
% \label{sec:problems-with-fmri}
Much of the recent research on physiological correlates of placebo effects has been carried out using functional magnetic resonance imaging (fMRI). This method is extremely expensive and time-consuming, and so sample sizes are typically small. Despite this, many researchers have reported surprisingly large effects.
A potential reason for this pattern was proposed by Vul \textit{et al} in a paper which caused much controversy in the field \cite{vul2009puzzlingly}. Essentially, Vul and colleagues pointed out that common methods of analysing brain data were prone to problems of multiple comparisons and selection, as voxels were chosen after the fact on the basis of their significance levels. This statistical error occurs in many contexts (indeed it was also a major problem in genomics) and leads to inferences which tend not to replicate. Some proposed methods of preventing these errors (borrowed from computer science) are discussed in Chapter \ref{cha:methodology}. Vul also pointed out that the maximum correlation possible between two measurements is bounded by the square root of the product of their reliabilities, and provided evidence that this restriction was violated in many fMRI studies. It is important to bear in mind that many of the studies reported here were conducted before the publication of this paper, and some of them may suffer from these same flaws. Issues with particular papers are noted as and when they appear.
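As a sketch of the reliability bound mentioned in this section, the classical attenuation relation from test theory can be written as follows. The symbols are introduced here for illustration and are not taken from the cited paper: $r_{xy}$ is the observed correlation between the two measures, $\rho_{xy}$ the correlation between their error-free true scores, and $r_{xx}$ and $r_{yy}$ the reliabilities of the two measures.
\begin{equation}
r_{xy} = \rho_{xy}\sqrt{r_{xx}\,r_{yy}}
\quad\Rightarrow\quad
|r_{xy}| \le \sqrt{r_{xx}\,r_{yy}}
\end{equation}
For example, assuming purely illustrative reliabilities of $r_{xx}=0.7$ and $r_{yy}=0.8$, the largest observable correlation would be $\sqrt{0.56}\approx 0.75$, so a reported correlation well above this value would be suspect under these assumptions.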
\subsection{Neurobiology of self reported pain}
\label{sec:neur-self-report}
Pain is the area where much of the recent neurological work has been done, as this is where the majority of placebo research has been carried out in recent years. In this section, the correlates and findings from this area of research are summarised.
There appears to be a neural dissociation of the somatic and affective components of pain in the brain with the affective parts activating the dorsal anterior cingulate cortex and the sensory parts involving the somato-sensory cortex \cite{Lieberman2004}. This dissociation is also reflected in the measurement of pain in placebo studies, with intensity and unpleasantness being rated separately \cite{Price2008}.
Another interesting suggestion was that placebo analgesia experiments which show altered brain activity in the rostral anterior cingulate cortex (rACC) and orbito-frontal cortex (OFC) demonstrate the existence of a generalised expectancy network. This hypothesis received some support from a recent experimental study which used either true or false sound cues to create expectancies for particular aversive tastes. This study showed that the rACC and OFC and, to a lesser extent, the dorso-lateral pre-frontal cortex (DLPFC) activated in response to these expectations, suggesting that these parts of the brain may well be associated with the expectancies \cite{Sarinopoulos2006}. Further evidence for this viewpoint is research which shows that the rACC and the RLPFC are related to willed behaviour, which would seem to be associated with expectancies \cite{Beauregard2007a}.
An interesting finding arose from an experimental study of patients suffering from Irritable Bowel Syndrome (IBS) \cite{Lieberman2004}. This study looked at placebo using a disruption theory account, which explains neural changes due to placebo in terms of inhibition. The authors found that although the right ventro-lateral pre-frontal cortex was activated by expectancies of analgesia, this activity was totally mediated by the dorsal anterior cingulate cortex, which suggests that this part of the brain is more foundational to the placebo response.
There is some evidence to suggest that some of the effects may involve both descending and ascending pathways within the brain, judging from the results of a study on mechanical hyperalgesia \cite{Goffaux2007}. This study used a counter-irritation technique in which a basin of water acted as either a placebo or a nocebo. The authors suggested that the reflexes in the arm should not change if the placebo effect was completely cortically mediated, but the results suggest that descending pathways are equally important in placebo analgesia. These pathways are controlled from the mid-brain, and these findings suggest that the placebo effect exerts changes in large portions of the body and is not exclusively a cortical phenomenon. This finding would seem to support a more embodied conception of placebo. This finding, and others like it, are discussed more fully in Section \ref{sec:non-brain-effects} below.
The lateral orbito-frontal cortex has also been associated with placebo analgesia in some studies \cite{Petrovic2002}, and additionally has been associated with the cognitive control of pain. This part of the brain may be associated with the generalised expectancy network suggested above, and this would seem to fit the evidence.
Further evidence in favour of the idea that placebo effects are mediated by both upward and downward pathways to and from the brain comes from the study of Matre \cite{Matre2006a}, who noted large differences in mechanical hyperalgesia between placebo and control areas of the body, again suggesting the involvement of the whole body in the response. In this context, the results of Roelofs et al. \cite{Roelofs2000} are worth considering. Using techniques similar to those of the two studies referenced above (Matre and Goffaux), they found no evidence that placebo effects cause changes in spinal reflex activity. However, this study also found no evidence for a placebo effect in general, which weakens their conclusions. It is worth mentioning that even though they found no significant effects, they did find a correlation between the brain activity and spinal reflexes, which suggests that they found an effect, but that their study was either underpowered or used a badly designed expectancy manipulation (most likely the latter) \cite{Goffaux2007}.
An interesting finding which has come about through placebo research is what is known as the uncertainty principle in analgesia \cite{Colloca2005}, where it is argued that the effects of any analgesic cannot be accurately measured in a clinical situation, as the awareness of being given this substance will activate the opioid system, which will further reduce pain. This finding arises from work done previously, where it was shown that open injections of painkillers or placebo registered far more variability than hidden injections \cite{Benedetti2003c}, suggesting that while physiological responses to analgesia may be similar across people, the awareness of treatment may invoke differential activation of endogenous painkilling systems, which causes the total effects to appear to vary quite substantially \cite{Amanzio2001}. Research has also confirmed that placebo and opioid analgesia share the same neural patterns of activation in the brain \cite{Petrovic2005}.
\subsubsection{Depression and Placebo}
\label{sec:depression-placebo}
Much research has also been done in the area of depression and placebo response. A fascinating study \cite{Hunter2006} suggests that prior to treatment, placebos may induce changes in neurophysiology which predict later treatment response. This is an extremely interesting finding; however, the authors used a new measure (that of EEG cordance) which they developed themselves, and to date there have been no replications of the study. Another useful study of placebo neural activity in depression compared the activation of particular brain regions following treatment with either Prozac or placebo \cite{Mayberg2002}.
This experimental, double blind Positron Emission Tomography (PET) study showed that placebo and Prozac both activated common brain regions in the prefrontal cortex, premotor cortex, posterior insula, posterior cingulate, subgenual cingulate, hypothalamus, thalamus, insula and parahippocampus. Prozac additionally activated areas of the striatum, hippocampus and anterior insula. These findings are intriguing as they support the recent meta-analytic evidence that the placebo response accounts for much of the effect of antidepressants \cite{Kirsch2002a}.
One fascinating finding of the Mayberg et al study is that areas of the striatum were activated. This region of the brain is known to be rich in dopamine receptors, which may suggest that while the placebo response in depression is primarily opioid mediated, SSRIs may also influence the dopamine systems, which may account for their superior effectiveness overall. However, some research shows that psychotherapy activates different brain regions in the treatment of depression, which argues against the existence of a common depression treatment pathway in the brain \cite{Benedetti2008}.
\subsection{Non Brain Effects and Correlates of Placebo response}
\label{sec:non-brain-effects}
\subsubsection{Opioids and Placebo}
\label{sec:opiods-placebo}
The biochemical history of placebo begins with Levine \cite{Levine1978a} and the demonstration that naloxone blocks many placebo pain responses. From this it was inferred that placebo pain relief is mediated by the endogenous opioid system.
This finding has been qualified by research over the past thirty years, which suggests that both opioid and non-opioid systems can be involved in placebo pain responses, depending on the method of inducing the placebo response and the biological system involved \cite{Amanzio2001, Benedetti2003a}. The lasting contribution of this research is that it paved the way for the placebo to come in from the fringes of medical science.
In this area, the work of Benedetti and his colleagues has been instrumental in unveiling the biochemical pathways through which placebos exert their effects, and much of this work is summarised in his book. It appears that both the opioid and dopaminergic systems are involved in the placebo effect. Benedetti and colleagues have also demonstrated that respiratory depression can be induced by placebo administration \cite{Benedetti1999a}.
A further discovery with regard to placebo analgesia is that it can be directed at specific sites in the body \cite{Benedetti1999}. This study induced expectancies of placebo responses at either the right or the left hand, and demonstrated the expected placebo effects. These effects were completely antagonised by naloxone, which suggests that they were mediated by the endogenous opioid system.
This finding is interesting as it suggests that the opioid systems can be activated at specific parts of the body, and not just globally as some former theorists have claimed. A more recent study \cite{Watson2006} found that perhaps 50\% of participants in a placebo analgesia experiment generalised a placebo response across both arms, even though cream was only applied to one arm for each person. This study would suggest that the placebo analgesia phenomenon is quite malleable and subject to individual interpretation.
Further research on the blockade of opioid receptors by naloxone has established that proglumide can be used to increase the size of placebo analgesic effects \cite{Benedetti1995}. Additionally, cholecystokinin (CCK), a chemical which tends to produce anxiety in human participants, has been shown to increase the size of the nocebo effect \cite{Benedetti1996}.
A recent meta-analytic review \cite{Sauro2005} seems to argue that placebo effects in pain are quite large ($d=0.89$) and that naloxone is quite effective in reducing them ($d=0.55$), pointing towards an interpretation of placebo effects in pain being substantially mediated by endogenous opioids.
Unfortunately, Sauro \textit{et al} do not report which kinds of receptors these endogenous opioids bound to, as this was not the primary focus of their meta-analysis. They did find a significant difference between the effect sizes in post-operative and experimental pain: post-operative pain showed an average effect size of $d=0.65$ (95\% CI $0.37$--$0.87$), while the experimental studies showed average effect sizes ranging from $d=0.53$ (95\% CI $0.02$--$1.04$) for shock-induced pain, to $d=0.72$ (95\% CI $0.34$--$1.16$) for capsaicin-induced pain, to $d=0.62$ (95\% CI $1.00$--$1.46$) for ischemic pain, suggesting that ischemic pain is the best way to invoke a substantial placebo effect. Note that all $d$ measures above are Cohen's $d$, and are weighted effect sizes based on the inverse of the variance in the sample from which the estimates were drawn.
The issue of which receptors are implicated in placebo analgesia is important, as they have different effects. Up to eighteen types have been reported, but there are three main types: the mu, kappa and delta receptors. These receptors have differing sites of action, and information regarding which ones are activated by placebo analgesia will presumably improve our understanding of how and when this phenomenon is likely to occur. The anterior cingulate cortex, as discussed above, appears to be involved in placebo analgesia, and is also involved in the opioid system in the brain. This area is rich in mu receptors, which would seem to suggest that this kind of receptor is important in placebo analgesia.
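The effect sizes quoted above can be read against the usual definitions, sketched here under the assumption of a fixed-effect, inverse-variance weighting scheme. The notation is introduced for illustration, and the exact estimator used in the meta-analysis is not restated here: $\bar{x}_{C,i}$ and $\bar{x}_{P,i}$ are the mean pain scores in the control and placebo conditions of study $i$, $s_{i}$ is the pooled standard deviation, and $v_i$ is the sampling variance of $d_i$.
\begin{equation}
d_i = \frac{\bar{x}_{C,i} - \bar{x}_{P,i}}{s_{i}},
\qquad
w_i = \frac{1}{v_i},
\qquad
\bar{d} = \frac{\sum_i w_i d_i}{\sum_i w_i}
\end{equation}
Under the same assumptions, an approximate 95\% confidence interval for the weighted mean is
\begin{equation}
\bar{d} \pm 1.96\,\sqrt{\frac{1}{\sum_i w_i}}
\end{equation}
so that studies with smaller sampling variance contribute more to the pooled estimate and narrow its interval.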
One study which looked at patients suffering from IBS found that naloxone did not reduce the size of placebo effects, which would suggest that these were not opioid mediated \cite{Vase2005}. It remains to be determined why placebo effects in IBS are not opioid mediated, and understanding this may give some insight into the phenomenon.
\subsubsection{Dopamine and Placebo}
\label{sec:dopamine-placebo}
While Benedetti and others have done much of the research into the opioid system, De La Fuente Fernandez \cite{DeLaFuente-FernAndez2002} has published a large amount of research looking at the dopaminergic system. It has been observed that the dopamine system activates not just to reward, but to the expectancy of reward, and that this release varies as a function of the certainty of the expectancies \cite{Scott2007}. In one study, the activation of the dopaminergic systems during placebo analgesia was correlated with activity observed during a monetary reward task, suggesting that the mechanisms of reward are a common feature of placebo effects \cite{Scott2007a}. This finding may also provide a mechanism through which certain and uncertain expectancies exert their effects.
De La Fuente Fernandez has argued that there is a descending link from the OFC to the Periaqueductal Gray Area (PAG) and from here to the amygdala, and that this link is responsible for the observed placebo effects \cite{Fuente-Fernandez2002}. These areas (along with the substantia nigra, which has also been linked with placebo response) produce large amounts of dopamine, and this may play a large role in mediating the effects of placebo on the physical body.
Other research has shown that the expectation of drug triggers large releases of dopamine in the brains of patients with Parkinson's disease, and this release of dopamine directly causes improvement \cite{Pollo2002}. The really interesting question that arises from this research is why, if patients with Parkinson's can release this dopamine when treatment is expected, do their brains not release these levels of dopamine naturally?
This finding should be tempered with other research that indicated that there was no correlation between the amount of dopamine released and the size of the placebo response \cite{Scott2007a}. However, correlations only test for linear relations, and many relationships in the body are non-linear, suggesting that this finding may not be particularly robust.
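To illustrate the point about linearity with a minimal worked example (the variables $X$ and $Y$ are hypothetical and are not drawn from the cited study), suppose that $X$ is distributed symmetrically about zero with finite moments and that $Y = X^2$, so that $Y$ is completely determined by $X$. Then
\begin{equation}
\mathrm{Cov}(X,Y) = \mathrm{E}[X^3] - \mathrm{E}[X]\,\mathrm{E}[X^2] = 0 - 0\cdot\mathrm{E}[X^2] = 0
\end{equation}
so the Pearson correlation, where it is defined, is zero even though the dependence is perfect. An absent linear correlation therefore does not by itself rule out a relationship between two quantities.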
Additionally, the expectancies surrounding motor improvement have been found to correlate with actual motor improvement in both verum and placebo sub-cranial thalamic stimulation for Parkinson's disease.
Expectations of receiving caffeine have been associated with dopamine release, which would seem to provide further evidence that dopaminergic systems are involved in the difference in outcomes between certain and uncertain expectations (Kassien, 2004, cited in \cite{Beauregard2007a}).
Additionally, there is some evidence that the mid-brain dopamine cells associated with addiction and reward project on to areas which are involved with motor and emotional function \cite{DeLaFuente-Fernandez2002}. This ties in with the effects that CCK has on the nocebo effect (as it additionally induces anxiety), and the involvement of motor function again ties in with a conception of the placebo as a much more embodied phenomenon than is currently thought. The implications of these findings will be examined in Chapter \ref{cha:notes-towards-theory}.
It is important to realise that dopamine and opioid systems may interact, and the limbic system within the brain appears to be the site where they do so \cite{Fuente-FernAndez2002}. This section of the brain is associated with both mood and movement, and it may be here that the effects of CCK are exerted, along with those of proglumide, either reducing or increasing the size of the placebo response.
\subsubsection{Nitric Oxide and Placebo}
\label{sec:nitr-oxide-plac}
It has also been hypothesised \cite{Stefano2001,Fricchione2005} that Nitric Oxide (NO) is involved in the placebo effect. These authors argue that the placebo effect is similar to the relaxation response, and they present a substantial amount of evidence that links Nitric Oxide to various health promoting systems in the body. However, all of their evidence is quite circumstantial and no empirical study has tested the involvement of the NO system in the placebo effect.
NO does regulate the production of both dopamine and norepinephrine, and also disinhibits the actions of striatal neurons, which have been associated with placebo effects in Parkinson's and in placebo analgesia \cite{Fricchione2005,Fuente-FernAAndez2001}.
The rostral anterior cingulate cortex has been associated with placebo analgesia in many studies, and this part of the brain does produce much NO, which may suggest that nitric oxide does have some input to placebo analgesia. The findings noted above with regard to the rACC also indicate that this area of the brain is associated with hypnotic analgesia. This may be because both placebo and hypnotic analgesia are phenomena of suggestion, and thus fall under the control of the generalised expectancy network described above.
One intriguing finding, noted above, is that conditioning can affect hormonal and endocrine responses, while expectancies cannot \cite{Benedetti2003a}. This finding is important, as hormonal and endocrine responses were those found by Meissner \textit{et al} not to show significant placebo effects in a review of randomised controlled trials. This would seem to suggest that the only placebo effects which are controlled for in clinical trials are those related to expectancies.
\subsubsection{Placebos and the heart}
\label{sec:placebos-heart}
Heart rate and patterns across heart beats are altered by opioids, and also by endogenous opioids released during the placebo effect. While heart rate and the placebo have not been studied as extensively in recent times as neural or endocrine correlates of the placebo, there has been some work done, mostly in an experimental setting.
One study found that the heart rate was reduced by placebo analgesia, and this response was blocked by naloxone, suggesting that this effect was mediated by endogenous opioids \cite{Benedetti2008}. Matre and colleagues found no effects of placebo administration on blood pressure or heart rate \cite{Matre2006a}, but they appear to have conducted a between groups ANOVA, which is an inefficient means of examining the changes evoked by placebo (as these variables would have changed continuously over time, and would probably be best modelled as time series). Indeed, this problem is common in placebo related research, and the effects of this problem and some possible resolutions are discussed in Chapter \ref{cha:methodology}.
Additionally, one of the studies investigated earlier showed an effect of placebo administration on blood pressure \cite{Shiv2005a}, but only when participants were motivated to solve puzzles (which was the task in that study). This particular effect seems more plausible as the study was examining the effects of an energy drink placebo, and increased blood pressure is a side effect of consuming caffeinated drinks.
Heart rate variability is one measure which has been examined in some studies. Flaten (2008) argued that placebo administration would decrease the ratio of high frequency to low frequency waves, and this hypothesis was supported in their experiment \cite{Aslaksen2008}.
Skin conductance is another measure sometimes used in placebo experiments. Some authors have reported no difference between groups on this measure \cite{Flaten1999}, but these authors also examined differences between groups using an ANOVA method, which is not appropriate for time series data of this kind. One extremely interesting study claimed that pain ratings could be derived from the measurement of skin conductance, and that active drugs changed the response patterns, while placebo administration did not \cite{Fujita2000}.
These kinds of physiological markers of response to placebo are extremely useful as they can be used to determine if a physiological placebo effect is occurring, or if the change in self rated pain is driven by more cognitive re-appraisals of the situation. If more cognitively driven, it would be expected that these changes would lag the changes in self reported pain, whereas if the placebo were mediated locally then it would be expected that the physiological changes would occur in advance of a reported drop in self rated pain.
\section{Review of the Placebo Effect}
\label{sec:revi-plac-effect}
In this section, the major themes which have emerged from the literature review of the placebo effect are recapped. There are a number of major points to bear in mind.
Firstly, the placebo effect is a difficult phenomenon to define precisely, and there is wide variability in what is considered to constitute a placebo effect. Perhaps the best conceptualisation of the effect is that proposed by Gotzsche \cite{Gotzsche1995}, where it is argued that placebo effects should be broken down into three parts: those attributable to the patient-provider interaction, those attributable to the administration of the medicine, and those attributable to the context in which the treatment was delivered.
The next major theme to emerge from this review is that there is one theoretical perspective shared by most researchers in the field, that of response expectancies \cite{Kirsch1997,Kirsch1985}. However, this theory, while still core to the conceptualisation of the effect, has been shaken by a number of demonstrations that expectancies are not always the best predictor of placebo \cite{Hyland2006,Geers2005a}. This theory also suffers from the lack of clear terminology and of measurement tools for expectancies, a deficiency which this research hopes, in some small way, to remedy.
The next theme to emerge from this review is that the placebo effect is a complex phenomenon which has been difficult to study accurately. However, there appear to be a number of characteristics of patient, provider and context which can either facilitate or reduce placebo effects.
The major factors relating to context appear to be the setting in which placebo is administered (clinical trial or experimental), along with the participants' beliefs about treatment assignment. Another major contextual factor appears to be the rationale given for the placebo's effects. In addition, provider factors appear to include the level of suggestion given (certain versus uncertain) and the charisma and authority of the provider. Patient characteristics which have been shown to affect the response to placebo include dispositional optimism and pain catastrophising.
A further theme which emerges from this review is that the various placebo responses observed in clinical and non-clinical populations appear to recruit a number of neurobiological systems, at the very least the opioid and dopamine systems, and potentially the serotonin system also. It also appears that placebo effects can display a high level of specific action at particular parts of the body, and involve both the central and peripheral nervous system.
The meta-analytic evidence, though conflicting, appears to indicate that placebo effects occur when the outcome variable is under the control of the central nervous system, and do not occur nearly as much in the endocrine system. However, this finding is based on clinical trial data and contradicts the successful conditioning of humans to respond to endocrine placebos \cite{Benedetti2003a}. The size of placebo effects is also a matter of some dispute, and appears to significantly differ as a function of study context, as noted above.
In conclusion, the placebo effect is a complex phenomenon which appears to provide a link between the psychological and physiological experience of the world, and which is associated with some psychological and physiological variables.
\part{The Implicit Association Test}
\label{part:impl-assoc-test}
\section{Introduction to Implicit Measures}
\label{sec:intr-impl-meas}
Psychology as a science, and indeed the social sciences more generally, have a problem. They seek to understand the mind and behaviour of individuals in particular contexts and cultures. While behaviour is less problematic to observe scientifically, the observation of mind is fraught with problems. Almost all of the constructs of interest to psychologists (mind, love, experience) are unobservable with the naked eye, and require interpretation in order to be understood.
These constructs are still measured however, but the means to get at them are more indirect and subtle than the microscope. In the case of many variables, psychologists measure what people think and believe from their answers to questions devised by the psychologist, normally referred to as self-report instruments.
This approach has obvious advantages, in that it is quick, cost-effective and can produce results of interest. However, as psychology has matured, a number of problems have become apparent with this approach \cite{Nisbett1977}. The first problem is that, especially in controversial topics, people may attempt to conceal their true beliefs or attitudes. The second, somewhat more philosophical problem, is that people may be unaware of their true beliefs, or at least may profess to believe one thing while behaving in a manner consistent with a belief in the opposite.
The first of these problems is called social desirability \cite{Egloff2003}, and indeed psychologists have developed other self report scales that purport to measure this construct. The second problem, first noted by Freud, is that of unconscious (or implicit) influences \cite{Hofmann2008}. While the system built by Freud no longer forms part of the framework of modern psychology, the contradictions between reports of experience and behaviour remain, and are still relevant to the aims of psychology. %%insert allport research on chinese attitudes and behaviour here
\subsection{Older non-self report methods}
\label{sec:older-non-self}
Some methods have been developed to get around this problem. One of the first techniques used for this purpose was that of free association, where a client was asked to respond to a stimulus word or picture with the first word that came to mind, without censoring the experience. These associations could then be used by the therapist to gain access to material which the client did not consciously report being aware of.
Another method which was used was that of Rorschach ink-blots, where ambiguous ink blots are shown to the client, who interprets them. This technique can also provide insight into the mind of the client, but again this requires interpretation on the part of the therapist. It is this interpretation process that causes these procedures to lack scientific validity in the eyes of many, as what one therapist understands by the client's words may differ completely from what another therapist takes from the same material.
These approaches were abandoned following the rise of behaviourism as the dominant approach within academic psychology, and further eclipsed by the notions of Karl Popper regarding falsifiability as a criterion for scientific theory. It was argued that since unconscious influences could be used to explain any criticism of the theory (and indeed, Freud was prone to doing this) then the theory was not truly scientific. The development of psychometric theory also played a role in the decline of interest in such instruments, as these methods seemed to produce reliable and valid data and scores could be corrected for impression management and social desirability biases by statistical techniques.
\subsection{The turn back to indirect measures}
\label{sec:turn-back-indirect}
In recent years, however, there has been a resurgence of interest in such techniques. This resurgence grew out of the work on implicit memory and learning, where participants would consciously deny awareness of some piece of information while their behaviour seemed to show signs of this knowledge.
This phenomenon can be seen in experiments such as word completion tasks. If participants are given a list of words to memorise, are then distracted by another task, and finally complete a word completion task, they complete the word fragments with words from the memorised list far more frequently than would otherwise be expected. However, they will typically deny this influence on their responding if asked \cite{Wittenbrink2007a}.
These approaches, allied with the continuing failure of self reported attitudes to predict behaviour, caused some researchers to look for another way to measure these constructs \cite{Greenwald1995a}. The result of these investigations was the Implicit Association Test, or IAT for short \cite{Greenwald1998}. The IAT is a reaction time measure which makes inferences about attitudes from the time which it takes participants to categorise words or pictures of a group into one of two categories.
\section{Introduction to the Implicit Association Test (IAT)}
\label{sec:intr-impl-assoc}
The IAT was developed by Greenwald, and he suggested that because of its design, it might be more resistant to social desirability influences and demand characteristics \cite{Greenwald1998}.
Social desirability tendencies would lead people to deny prejudicial behaviour in self report instruments, while as a reaction time measure, the IAT is less easy to fake. Demand characteristics result when participants in an experiment give the answers a researcher wants, rather than their real beliefs or attitudes. Again, it seems intuitively harder to do this within an IAT methodology as it would require tremendous and consistent control of reaction times to stimuli (though this is not impossible, as discussed in Section \ref{sec:iat-contr-faking}).
\subsection{Description of the Procedure}
\label{sec:descr-proc}
The (IAT) is a computer administered procedure which purports to measure implicit associations not directly accessible to consciousness \cite{Greenwald1998}. The test was developed as a result of mounting evidence for learning without awareness in human participants (see above section \ref{sec:intr-impl-meas} for some examples).
This research was reviewed by Greenwald \& Banaji \cite{Greenwald1995a}, where they introduced a distinction between direct and indirect measures of social cognition. They referred to self report instruments as direct measures, and to such techniques as semantic priming as indirect measures. Semantic priming is the tendency for participants in experiments to give answers to ambiguous tasks similar to ones they have recently observed in their environment. They defined implicit associations in the following manner \textit{the unidentified or inaccurately identified traces of past experience}, and this definition implied that self report measures were not the best tools with which to assess these associations.
The procedure works as follows. Firstly, participants sit in front of a computer and are assigned two keys (typically the ``e'' and ``i'' keys) with which to respond to each word or image presented; each stimulus falls into one of two categories. The major measure is reaction time, and an assumption of the method is that categories which are more strongly associated in consciousness will be easier to combine than those which are less associated. The participant is asked to classify words as either pleasant or unpleasant as they appear on the screen; love, hate, good and bad are words typically used in this part of the procedure.
Then, participants are asked to categorise faces as either Black or White. Following this, the two categorisations are combined, with one key being pressed for White or Pleasant and another being pressed for Black or Unpleasant. Then, the labels are reversed, and the participant categorises White faces with Unpleasant and Black faces with Pleasant. Response times are averaged for each participant within each combined task, and the two means are subtracted from one another to produce a difference score, which is referred to as an IAT effect.
The procedure is not limited to assessment of racial attitudes, and has been applied far more widely \cite{Craeynest2008,Greenwald2009, Schmukle2008,Walker2008}. A general schema for the process follows.
Firstly, participants classify words as belonging to either Category X or Category Y, where X and Y are positively (love, flowers etc.) or negatively (hate, insects etc.) associated words or images. Then they classify faces or names as belonging to either Group A or Group B, where the labels are typically descriptive of groups of people. In the next step, these two classifications are combined, with one key being a response for A and X and the other key being used for responses of B and Y.
In the fourth step, the keys for pleasant and unpleasant are reversed, and in the final step the two dimensions are combined in the opposite manner (A and Y or B and X). In practice, only the third and fifth steps are analysed, and the difference between mean response latencies on the two combination tasks is assumed to represent an IAT effect \cite{Greenwald1998}.
In essence, any differences in reaction time in the combination of the two categories are assumed to be due to underlying differences between the relative associations of the concepts. The authors claim that the use of difference scores allows them to prevent issues of processing and response speed variability across individuals from distorting the results. However, this assumption has recently been questioned \cite{Blanton2006}, with the critics claiming that processing speed was a major moderator of the observed variance in scores across the population studied.
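As a rough illustration of the difference score described above, let $\overline{RT}_{3}$ and $\overline{RT}_{5}$ denote a participant's mean response latencies in the third and fifth steps. This notation, and the direction of the subtraction, are assumptions made here for illustration rather than taken from the cited papers.
\[
\mathrm{IAT\ effect} = \overline{RT}_{5} - \overline{RT}_{3}
\]
With hypothetical means of $\overline{RT}_{3} = 700$ ms and $\overline{RT}_{5} = 850$ ms, the IAT effect would be $150$ ms. The subtraction removes any constant $c$ added to both means, since $(\overline{RT}_{5} + c) - (\overline{RT}_{3} + c) = \overline{RT}_{5} - \overline{RT}_{3}$. If, however, general slowness acts multiplicatively, so that every latency is scaled by a factor $k$, the difference becomes $k(\overline{RT}_{5} - \overline{RT}_{3})$, which is one simple way in which processing speed could still influence a difference score.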
\subsection{Uses and Psychometric features of the IAT}
\label{sec:uses-psych-feat}
The method has become very popular, and has been applied to many areas of social psychology, such as attitudes towards fatness \cite{Ahern2008}, towards disability \cite{Pruett2006} and towards smoking \cite{Kahler2007}. A few common features of the measurements taken with this instrument seem to be the following.
Firstly, they are typically weakly correlated with explicit measures of similar attitudes. These correlations average $r=0.39$ \cite{Nosek2005}, and so one could be justified in regarding the two as distinct constructs \cite{Nosek2007a}. Indeed, this is the approach taken by many of the originators and early workers in the field \cite{Greenwald2000,Nosek2007a}.
Secondly, they tend to reveal stronger associations than explicit measures when the topic is politicised or controversial \cite{Greenwald2009}. Thirdly, they tend to reveal similar effects to other techniques such as semantic priming, although IATs seem to be more sensitive to variations within the construct \cite{Wittenbrink2007a}. Fourthly, they have low test-retest reliability. This seems to average around $r=.59$ which, while permissible in a psychometric instrument for theoretical purposes, is far too low for making clinical or legal judgements \cite{Greenwald2000, Blanton2006d}.
Split-half reliabilities are typically higher, averaging around $r=0.80$, but test-retest reliabilities are much lower, even one day afterwards. It is worth noting, however, that the test-retest reliabilities do not drop much further than this, even over periods as long as one year \cite{Egloff2005}.
This may suggest that the IAT response is composed of both state and trait portions, and that the trait portion of the measure is relatively invariant across temporal distance. Some authors have argued that this low test-retest reliability is due to both error variance and person by situation interactions \cite{Gschwendner2008}. Manipulation of accessibility of the constructs measured in the IAT has been shown to improve the temporal stability of the IAT scores, suggesting that particular situations may make the constructs more or less likely to be expressed \cite{Gschwendner2008}.
Additionally, some research has suggested that the picture of implicit attitudes as developed early in life and resistant to change may be incorrect \cite{Gschwendner2008}. This, and the evidence given below in Section \ref{sec:iat-contr-faking}, may point towards implicit measures being more state based than early theory surrounding dual-process models would suggest.
Some controversy has surrounded the use of difference scores as a metric \cite{Blanton2006d}. Blanton \textit{et al} argue that the only condition under which difference scores make sense is when positive and negative stimuli are equally valenced, and that this condition is not met for many of the most popular implicit association tests (specifically, the racism IAT). They also argue that, unless the IAT score can be linked to an observable outcome, it is an arbitrary metric \cite{Blanton2006}. This contention of Blanton and Jaccard rests on modelling the IAT as two separate measures which are then combined. In addition, they appear to argue that successive responses within an IAT trial are independent, which seems like an unjustified assumption. Given that the procedure requires participants to respond to multiple stimuli in quick succession, the responses within blocks are likely to be auto-correlated \cite{mccleary1980applied}, and even the independence of responses across a single IAT is in doubt.
The psychometric characteristics of the IAT are still not completely defined, and no single model has provided a coherent explanation for how and why the effects will occur. However, the measure does appear to have predictive validity (see Section \ref{sec:pred-valid-iat}) so the proper question for psychometric analysis is not whether the IAT has any effects, but rather how these effects occur.
%% Perhaps the best explanation for the effects of IAT's is that the result from individual differences in task-switching abilities, and that these task-swtiching skills are -
\subsection{IAT, Controllability and Faking}
\label{sec:iat-contr-faking}
While it appears that the IAT resists demand characteristics, it does not prevent them entirely \cite{DeHouwer2007b}. De Houwer's research demonstrated that the IAT procedure can be faked when applied to novel attitudes. This research demonstrated a large IAT effect when participants were given positive or negative information about fictitious social groups.
De Houwer \textit{et al} also obtained IAT effects when participants were asked to respond in the opposite way from the information which they had been given, although the effects were smaller, and not all participants were capable of this feat.
The major contribution of this research is that it demonstrates that participants can alter IAT effects by choice, at least when applied to novel or unfamiliar attitudes. The authors suggest that this may restrict the ability of the IAT to study the development of novel attitudes, but it would not affect the study of well developed attitudes, as are looked at in most IAT research.
However, there is a more serious problem, as the research on training participants to fake the IAT indicates \cite{Fiedler2005}. This research carried out three experiments, one on-line and two in a controlled experimental set-up. The results indicated that given prior experience with the IAT, and some instructions on how the measure works, participants were capable of reversing the sign of the IAT effect (which equates to showing an attitude opposite to their own).
Clearly, this is an issue for researchers, especially with such well known IAT measures as the Race IAT. However, the results of this study also showed that mere experience was not enough to allow faking. Participants who were given no information on how to fake were not able to change the direction of their IAT effect. Worse still, the authors sent the faked data to a number of researchers in the field, and only one of them was able to identify the faked results, and this was done with an accuracy of 58\% using an algorithmic procedure.
That being said, the results of Foroni and others \cite{Foroni2005} suggest that there are some interesting features involved in the modulation of IAT effects. This experiment utilised a flower-insect IAT \cite{Greenwald1998} and two conditions. In the first, participants read a story about how the world had changed and insects were now a major source of nutrition for humans, while flowers were poisonous.
The other condition presented the same information, but not in narrative form. The story proved successful at changing the IAT effect to insects positive and flowers negative, while the same information presented in a descriptive fashion was unable to induce these changes. This study also examined the effects of telling participants to fake the IAT, and noted that this was not effective at all.
These two contradictory results present us with a problem to explain. There are two possible explanations which I will address in this paragraph. The first is that without instructions, the IAT is very difficult to fake. The second is that participants may be better able to fake attitudes towards social groups rather than flowers and insects.
The first explanation runs into difficulties, as the Fiedler \textit{et al} study showed small effects of faking even without instructions. The second seems intuitively plausible, as people may have more incentive to conceal negative attitudes towards out-groups than towards insects, and thus may learn how to do so more quickly. Neither of these explanations is particularly satisfying, however, so the issue of how and when the IAT can be faked remains an open question for future research.
\subsection{Predictive Validity of the IAT}
\label{sec:pred-valid-iat}
The major proposed advantage for the use of implicit measures is that they would act as better predictors of behaviour, or allow for more insight into hidden cognitions that were associated with behaviours yet either not accessible or not reported by participants \cite{Greenwald1998}. This section examines the extent to which these hopes have been fulfilled. While the IAT does not appear to be a better predictor of behaviour overall, it does possess some ability to predict behaviours which are typically hard to predict using self report measures.
The classic demonstration of the difference in prediction between implicit and explicit measures is a study \cite{Asendorpf2002} which investigated shyness. In this study, spontaneous shy behaviour was predicted by implicit associations, while controlled shy behaviour was predicted by explicit attitudes. This pattern has become known as double dissociation, and has been observed in a number of studies \cite{Perugini2005}, and then supported by a meta-analysis \cite{Hofmann2005}.
A recent review of implicit measures of self esteem suggests that implicit and explicit self esteem are entirely distinct constructs \cite{Rudolph2008}. Implicit self esteem has been shown to predict response to success or failure \cite{Greenwald2000}. Additionally, the research on implicit and explicit self esteem seems to indicate that individuals can be classified as having particular types of self esteem based on their relative levels of implicit and explicit self esteem, where participants who have high levels of both implicit and explicit self esteem are classified as having genuine self esteem \cite{Meagher2004}. These participants tended to be more resilient and to suffer fewer negative outcomes following a false feedback manipulation designed to reduce self esteem. Although the effect size for the explicit measures was much higher in this study, the implicit measures (the Self Apperception Test and the IAT) appeared to be more sensitive to the emotional tone of the feedback.
In the domain of personality, implicit measures of all Big 5 traits have been correlated with spontaneous behaviour which reflected these traits \cite{Steffens2006} (see Section \ref{sec:iat-personality}).
%%expand on this paragraph
There is some evidence that IATs can predict spontaneous behaviour better than explicit measures \cite{Conner2005,Perugini2005,Grumm2007}. In the Conner \textit{et al} (2005) study, which used an experience sampling methodology, the IAT-measured attitudes predicted how the participants felt on a day to day basis far better than the explicit (self-report) measures, but the explicit measures predicted global ratings better. The extra predictive power afforded by the IAT was only demonstrated for negative affect, while the explicit measures were equally predictive for both positive and negative affect. This is an interesting finding, as it suggests that there are differences in the effects of implicit attitudes on emotion dependent on the valence. As this pattern was not observed with the explicit measures, it may be an avenue for future research aimed at determining how implicit associations predict experience and behaviour.
%%this paragraph makes no sense without introducing the mindfulness findings.
% Interestingly enough, the Hofmann meta-analysis \cite{Hofmann2005} showed that introspection was negatively correlated with implicit-explicit consistencies, which seems strange in light of the relationship to mindfulness observed in the Connor and Brown study. % An important point
% to make here is that mindfulness is pre-reflexive , while introspection reflects on thought, mindfulness merely notes it without attempting to analyse it (even in vipassana meditation, the thought is labelled and then let go).
There exists substantial evidence that IAT measured preference for males over females can result in prejudicial behaviour against females in a simulated interview setting \cite{Greenwald2000,Heider2007}. The magnitude of the IAT scores was correlated with the observer-reported prejudicial behaviour scores.
Some have critiqued these findings as both IAT and behavioural assessments were carried out in the same session, and this may falsely inflate the attitude-behaviour correlation. Another study \cite{McConnell2001} showed substantial correlations between IAT assessed bias and ratings of friendliness given to each participant in a scripted interaction. More recent research has demonstrated that even when separated by a week, attitude assessments using the IAT are significant predictors of verbal and non verbal friendliness with a compatriot of opposite race \cite{Heider2007}.
The predictive validity of the IAT may also vary as a function of domain. An Italian study \cite{Arcuri2008} showed that the IAT was able to predict the future voting behaviour of people based on the results of an IAT measuring attitudes towards left and right wing candidates. This finding is quite impressive, given the difficulties political scientists normally find in predicting the behaviour of undecided voters. However, this should be qualified with the fact that it was self reported voting behaviour that was measured, as opposed to actual voting patterns.
Linking to the notion that the predictive power of the IAT varies across domains is a recent meta-analysis \cite{Greenwald2009} which compared the predictive validity of explicit and implicit measures across a large number of domains. In general, explicit measures appeared to be more predictive, except in the cases of inter-group behaviour and race attitudes. While this meta-analysis appeared to use some unclear selection criteria for studies, in general it was well conducted and the results show that the IAT has good predictive validity, especially in domains where there are social or awareness difficulties with the use of explicit measures.
\subsection{IAT: Personal or Cultural?}
\label{sec:iat:-personal-or}
Another issue in the field of IAT research has been the relationship between implicit attitudes and cultural knowledge. Some have claimed that it is these extra-personal associations which influence response on the IAT \cite{Olson2004}. However, a recent study \cite{Nosek2007a} casts doubt on this interpretation. They found that consistently across domains measured by the IAT the relationship between cultural knowledge and implicit attitudes was almost completely accounted for by variability in explicit attitudes. Using a sample size of over 100,000 they found weak relationships between cultural knowledge and IAT effects. This would seem to indicate that whatever the IAT measures, it is more personal than extra-personal.
The IAT is a new and exciting measure which has opened up new avenues for research by psychologists. The measure can reliably differentiate between groups possessing different attitudes, and seems to produce results which while correlating with explicit attitudes, appear to reflect a different underlying process \cite{Nosek2007a}.
It can predict behaviour quite well in certain situations, especially in matters of importance to the participants and when they are working with limited cognitive resources. It appears to predict spontaneous behaviour much better than explicit measures, which is certainly of interest. This prediction of spontaneous behaviour, coupled with low test-retest reliabilities, would seem to suggest that there are major state components to the measure.
However, the measure still has its problems. There is still no generally accepted theoretical rationale for its effects, and it can be contaminated by such issues as task switching costs \cite{Klauer2005}, processing speed \cite{Blanton2006} and the context in which it is administered \cite{Boysen2006}.
Although it was proposed as a potential ``true pipeline'' to the attitudes of persons, it seems as sensitive to social desirability concerns and the presence of others as explicit measures. Finally, although the measure is not perfect, it has noticeably increased our understanding of human social cognition and is still stimulating new and interesting research, which of course is the major criterion of success for any measure.
\section{Uses of the IAT}%%this title needs to be changed
\label{sec:uses-iat}
The IAT is a useful measure, and we have touched upon some of these uses in previous sections. In this section, we will look at some of the major areas to which it has been applied over the last decade. While the IAT began as a tool to assess attitudes towards social groups, its uses have broadened to include measurement of personality, pain associations and evaluative conditioning. We will look at the success or otherwise of these attempts in the following section.
\subsection{Consumer Research and the IAT}
\label{sec:cons-rese-iat}
Another area where the IAT has seen much use is consumer research \cite{Lane2007,Maison2001}. Research in this area has focused on the effects of implicit attitudes on consumer choices. The IAT has shown some predictive validity here \cite{Maison2004}, but not significantly above the predictive power of explicit attitudes \cite{Greenwald2009}. One area where the IAT has proven useful is in examining the effects of evaluative conditioning on mature brand preferences: one study \cite{Gibson2008} reported that the conditioning could change attitudes (as measured with the IAT) towards brands for which the consumer had no pre-existing preference.
\subsection{IAT and Personality}
\label{sec:iat-personality}
One of the most interesting outgrowths from the IAT has been the measurement of implicit personality. In a recent study, Extraversion and Neuroticism IATs were found to converge with the Extraversion and Conscientiousness traits as assessed by the NEO-FFI \cite{Grumm2007}. However, another recent article, which attempted to assess trait anxiousness and angriness, found that there was a significant interaction effect involving the order of administration of the IATs \cite{Schnabel2006}.
When the angriness IAT was administered first, the results of the anxiousness IAT were highly correlated with it. The converse, however, did not occur. The authors suggest that this may be because the participants applied a coding strategy to the first IAT which they then generalised to the second. This problem may also have arisen because Schnabel et al administered the two IATs directly after one another, rather than with other tasks in between as was done in the Grumm et al paper.
Another recent paper \cite{Boldero2007} which used the Go/No Go Association Test (GNAT) showed that implicit Extraversion and Neuroticism were able to predict reaction time in the experiment. More generally, the implicit attitudes were able to predict scores on the explicit attitude measure.
However, the Big 5 traits predicted were different from the findings of other researchers, suggesting that there may be some method variance involved. This method variance has also caused problems for the IAT \cite{Mierke2003,Greenwald2003a}. Again, this highlights the issue that these new implicit measures may have confounding factors within them which can only be highlighted by further detailed research.
It does appear that implicit personality measurement can predict spontaneous behaviours related to personality traits \cite{Steffens2006}. This study predicted behaviour based on the Big 5 personality traits. There was no effect for extraversion, but the experimental manipulation may have been poorly designed. This finding lends more credence to the double dissociation theory \cite{Asendorpf2002}, which has since been shown to apply in a wide variety of tasks \cite{Perugini2005,Conner2005}.
\subsection{Clinical Use of the IAT}
\label{sec:clinical-use-iat}
The IAT has recently been applied to clinical psychological research, with some success \cite{DeHouwer2002}. However, the test-retest reliability of the IAT averages around 0.6, which is far too low to be used in clinical decision making. This has not stopped many researchers, but their results should be treated with caution, as the precision of measurement may not be adequate for the purposes for which the IAT is being used.
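A rough sketch of why a reliability of this size limits individual-level decisions can be given using the standard error of measurement from classical test theory. This illustration is added here and is not drawn from the cited studies; $s_x$ denotes the standard deviation of observed scores and $r_{xx}$ the reliability.
\[
\mathrm{SEM} = s_x \sqrt{1 - r_{xx}} = s_x \sqrt{1 - 0.6} \approx 0.63\, s_x
\]
Under these assumptions, an approximate 95\% band around an individual's observed score spans about $\pm 1.96 \times 0.63\, s_x \approx \pm 1.24\, s_x$, which is wide relative to the spread of scores in the population.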
One interesting study \cite{Grumm2008} points to a useful area in which the IAT has been applied. In this study, participants with chronic pain conditions were recruited. Their associations were measured using a self+pain design for the IAT, and they then underwent psychotherapy coupled with mindfulness programs to examine whether these treatments were capable of changing their implicit associations.
The study found that they did, with the authors reporting significant drops in self+pain associations over the course of the treatment. This finding is important as it provides validity for the conception of the IAT as measuring something useful, as the scores on the measure tracked the progress of patients through the treatment program, and this is unlikely to have occurred by chance.
The IAT has also been successful in distinguishing between spider and snake phobics and the general population \cite{Egloff2002,Lane2007}, and one interesting avenue of research would be a conceptual replication of the Grumm et al 2008 study using phobia sufferers and CBT to examine whether or not the same pattern of changes in automatic responses would be found.
% Another area where the IAT has been applied is that of evaluative conditioning \cite{Mitchell2003}. This area has not received that much attention, as the results tend to be less spectacular than some of those in social cognition.
A recent paper \cite{Boschen2007} used an IAT and measures of skin conductance to look at the development of fear responses in phobics.
These authors found that while the skin conductance measures responded immediately to the intervention, this change was not reflected in the implicit associations until quite some time afterwards. This would seem to suggest that different mechanisms underlie the information processing biases (measured by the IAT) and the autonomic responses (measured by SCR), which is a useful finding that may contribute to our understanding of the genesis of phobias.
\section{Moderators of the IAT}
\label{sec:moderators-iat}
\subsection{Contextual Moderators of IAT Effects}
\label{sec:cont-moder-iat}
Some contextual factors have been noted to affect IAT measures, even though these were some of the problems which the measure was designed to avoid. A study asking participants to complete the IAT under conditions in which they believed that the experimenter would or would not know their scores (the so-called bogus pipeline) \cite{Boysen2006} showed a notable diminution in IAT effects in a measure of attitudes towards homosexuals when participants believed that the experimenter would know their scores. Further investigations revealed that this did not arise because of social desirability issues, as the effects were similar under a bogus pipeline condition. Another study examining the attitudes of Italian students to Turkish immigrants replicated this finding, with IAT scores being reduced when in the presence of others \cite{Castelli2008}.
\subsection{Cognitive Moderators of IAT effects}
\label{sec:cogn-moder-iat}
Another factor which appears to moderate the observed IAT effects is that of memory resources. A study looking at attitudes towards Blacks and Turks \cite{Hofmann2008a} found that the IAT acted as a far better predictor of behaviour when participants had been asked to remember a list of words than when they were untaxed. This finding was replicated by the same authors using a different sample and IAT, suggesting that it is quite robust. This suggests that the attitudes measured by the IAT are the result of more automatic processes, and will have predictive power to the extent that the matter involved is not the subject of deep processing.
The Hofmann (2008) study described above involved interaction with an experimenter of the out-group measured in the IAT, and the second study separated the IAT from the behaviour assessment by one week, so these results are both good measures of behaviour and unaffected by issues of attitude-behaviour consistency.
One recent study \cite{Perugini2007} showed that the predictive ability of the IAT increases when under conditions of self activation. This was operationalised in these experiments by asking participants to either circle self or non-self related words in a passage of text. This finding was replicated across four domains and was found to be valid in all of them. In one case the correlation between IAT and behaviour was raised from .36 to .76 under conditions of self activation, which is a huge change.
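One way to gauge the size of this change, offered here as an illustration rather than as an analysis from the cited study, is to square the correlations to obtain the proportion of variance in behaviour shared with the IAT.
\[
0.36^{2} \approx 0.13, \qquad 0.76^{2} \approx 0.58
\]
On this reading, the shared variance rises from roughly 13\% to roughly 58\%.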
Some research \cite{Dasgupta2001} suggests that showing exemplars of groups typically the subject of negative associations on the IAT measures (such as Black Americans or Females) can reduce the size of these associations. However, as described below, difficulty in recalling such exemplars can lead to larger IAT effects.
It also appears that IAT effects are influenced by ease of retrieval mechanisms \cite{Gawronski2005}. This extremely well conducted study examined a number of implicit measures and the mechanisms through which they are influenced by context.
Gawronski \textit{et al} class the IAT as a response compatibility measure, and argue that these measures are affected by ease of retrieval from memory. In support of this, participants who generally liked African Americans showed higher levels of implicit preference against this group when asked to generate a high number of either liked or disliked African Americans. Conversely, participants who generally disliked African Americans showed lower levels of prejudice when they generated a lower number of exemplars. The authors explicate this effect in terms of ease of retrieval. The subjective difficulty of generating the exemplars seems to alter the attitudes which the participants report. %%insert Kahneman and Tversky study here.
This is probably affected by attitude-behaviour consistency effects, so it would be interesting to examine whether or not these changes remain stable over time.
The IAT also appears to be affected by attitude importance \cite{Karpinski2005}. In this study, the authors demonstrated that both scores on a Republican/Democrat IAT and a Coke/Pepsi IAT became more predictive of self reported behaviour as attitude importance increased. This would suggest that the measure is more useful in matters where the participants have an investment in the attitude being measured. However, this study used two domains where explicit measures are normally better predictors than implicit measures \cite{Nosek2007d}, so the findings here should be treated with some caution.
Another study \cite{Levesque2007} looked at the effects of mindfulness on expression of implicit attitudes and argued that high mindfulness can stop implicit attitudes from being expressed and, over time, causes them to become more in tune with self reported attitudes. This study used an experience sampling methodology and examined attitudes towards autonomy and heteronomy. The findings suggested that participants high in mindfulness seemed to show higher levels of autonomy in general, and that mindfulness could act as a protective factor against the expression of unwanted implicit attitudes. This finding has also been supported by other recent research \cite{Gschwendner2006}, which noted that Private Self Consciousness seemed to correlate with the expression of implicit attitudes towards Germans and Turks. In addition, the self reported habit scale, a measure which has been found to be negatively correlated with mindfulness, was found to be associated with stronger implicit attitudes and less congruence between explicit and implicit attitudes \cite{Conner2007}.
\subsection{Task Switching and the IAT}
\label{sec:task-switching-iat}
One psychometric characteristic correlated with IAT responses appears to be task-switching ability \cite{Mierke2003}. Mierke and Klauer established that IATs which should not have been correlated showed substantial variance in common. Through a series of experiments they demonstrated that task switching abilities appear to be the cause of this. They also reviewed this work in later research, and established that these differences could be controlled for by using the new $D$ algorithm developed by Greenwald et al \cite{Greenwald2003,Klauer2005}.
The important question for future research in this area is whether task switching ability is independent of IAT scores, in that the distribution of IAT scores is similar across all levels of the trait, or whether it affects the IAT scores significantly. If the former holds, then controlling for this variable should remove the problem, whereas if task switching ability is not independent of IAT scores, then the measure and scoring methods will need to be revised.
\section{Criticisms and Controversies of the IAT}
\label{sec:crit-contr-iat}
The IAT is a relatively recent measurement tool, and yet in its short life it has been involved in a number of disputes and controversies, far out of proportion to the number of criticisms which a measure usually faces \cite{VonHippel2004}. Some of this may be due to the extreme popularity of the measure and its lack of any firm theoretical foundations, while other issues may have more to do with the political nature of some of the results claimed for the measure, especially in the United States. The aim of this section is to examine these issues in chronological order, present the arguments for each side and to examine the changes which have taken place in the use of the IAT since these disputes.
For the first three years of the existence of the IAT, the measure did not seem to have many detractors. Many papers were published, and there seemed a general air of excitement around the measure, which offered a new approach to assessing attitudes.
This time of peace did not last long, however. In 2002, a paper was published \cite{McFarland2002} which showed correlations between ostensibly unrelated IAT measures. The authors argued that the incongruent associations (e.g. Self + terrible) required greater task-switching ability, and that those who were weak on this skill were biased towards appearing to have lower self-esteem and greater prejudice on IAT measures. This paper was highly influential, and its repercussions are still being investigated. Without going into the full detail at this point, it has now been firmly established that differential costs of task switching in the IAT can bias the results \cite{Mierke2001,Mierke2003} (and see Section \ref{sec:task-switching-iat}).
The Mierke and Klauer paper demonstrated correlations between a geometrical shapes IAT and a Race IAT. They exhaustively examined the details of this effect, and concluded that it could affect most IATs. However, these authors also examined the new scoring procedure developed by Greenwald \cite{Greenwald2003a}. This new scoring algorithm, called the $D$ algorithm, appears to control for these differential task-switching costs. It does this by dividing the difference between each participant's mean response times in the two critical conditions by the standard deviation of that participant's response times, which allows us to eliminate the kinds of biases that arise from task switching. It is also worth noting that this task-switching dispute has led to attempts to understand the IAT in terms of these processes, an approach which appears to have had some success \cite{Klauer2005}.
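As a rough sketch of the idea (the notation here is illustrative and is not taken from \cite{Greenwald2003a}), the $D$ score for a single participant can be written as
\[
D = \frac{M_{\mathrm{incompatible}} - M_{\mathrm{compatible}}}{SD_{\mathrm{all}}},
\]
where $M_{\mathrm{incompatible}}$ and $M_{\mathrm{compatible}}$ are that participant's mean response latencies in the two critical conditions, and $SD_{\mathrm{all}}$ is the standard deviation of the same participant's latencies across both conditions. Because a participant who is generally slower and more variable will tend to show both a larger raw difference and a larger standard deviation, expressing the difference in units of the participant's own variability should reduce the influence of such general factors on the score. The published algorithm also specifies how extreme latencies and errors are handled; those details are omitted from this sketch.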
\subsection{Arkes and Tetlock: Prejudice or Rationality?}
\label{sec:arkes-tetl-prej}
In 2004, \textit{Psychological Inquiry} published a criticism of the IAT procedure by Arkes and Tetlock \cite{Arkes2004}. The remainder of the issue was devoted to articles either supporting the original critique or arguing against it, with roughly equal numbers of contributors on either side. Arkes and Tetlock's criticisms have three major strands. The first is that IAT results may reflect cultural stereotypes rather than personal associations; the second is that these negative associations may not be due to prejudice; and the third is that these negative associations may be the result of perfectly rational behaviour. They also make the point that many researchers have moved too quickly from the discovery of implicit associations to the notion of implicit prejudice, which is a position that has some merit.
Their first strand of argument is that the associations revealed by the IAT represent cultural stereotypes. They argue that since some African Americans show implicit prejudice against their own race, then this cannot be the result of a personal attitude, but must be the result of a culturally held belief. They do, however, fail to account for the similar percentage of White IAT participants who show similar negative associations to their own race, and the explanation of shared cultural stereotypes does not seem to hold water in this case.
An article responding to the criticisms \cite{Sears2004} presents the results of survey research which indicates that approximately ten percent of respondents show a preference for a race other than their own. These figures are similar to those obtained with implicit association research, and would seem to cast doubt upon the arguments of Arkes and Tetlock in this case.
Arkes and Tetlock also argue that the low correlations between implicit and explicit measures of attitudes arise because the IAT reflects this shared cultural stereotype, while explicit measures assess the extent to which the participant agrees with it. The authors suggest that the time limit in IAT experiments forces participants to rely upon these shared cultural stereotypes, and that this factor is responsible for the effects. This seems like a plausible explanation; however, there has been work which has linked individual Race IAT scores to prejudiced behaviour \cite{Heider2007}, which argues against this interpretation of the IAT scores. The only possible way in which these two ideas can be reconciled is if we assume that knowledge of a cultural stereotype is correlated with prejudiced behaviour, which would presumably make prejudice researchers the most discriminatory people in the world.
The next part of their argument relies upon the \cite{Olson2004} study, in which the authors examined a personalised IAT. They then argue from analogy that this metric could cause Jesse Jackson to fail the IAT. This argument is somewhat unconvincing, and seems to fly in the face of research indicating that the IAT measures personal associations better than cultural ones \cite{Nosek2008a}. Arkes and Tetlock also argue against the studies which use body language as a measure of prejudice, claiming that these behaviours could also be interpreted as shame or sorrow. Unfortunately, they provide no evidence in support of these claims.
% Their final argument is perhaps the most seductive, especially to readers of a statistical nature. Briefly, they argue that because most crime in America is committed by black people, it is perfectly rational for White Americans to associate negative words with Black Americans, and thus the associations revealed with the IAT are not prejudiced, because they are rational. Leaving aside the matter of whether individuals untrained in higher level mathematics are capable of assessing utility in this fashion, and if so, whether or not they do, this argument neglects to mention that class and income are also significantly influential in crime statistics and we do not have IAT results suggesting that poor people are prejudiced against rich people, and vice versa. Their rational calculation of crime statistics is also fatally flawed by the use of convictions, given the expense of lawyers and the likelihood that most poorer people will plead guilty in America in order to gain a reduced sentence (plea-bargaining).
In summation, the Arkes and Tetlock article appears to present no new research or evidence against the IAT, but merely restates old problems and argues for doubt on the issue. The two points they do make cogently are that the leap from associations to attitudes has been made too quickly, and that participants in the dispute should make reputational bets on the outcomes of experimental studies designed to settle the disagreement in a Bayesian fashion. Unfortunately, this offer does not appear to have been taken up at this point.
\subsection{Rothermund and Wentura: Salience or Associations?}
\label{sec:roth-went-sali}
In 2004, perhaps the most comprehensive critique of the IAT was published \cite{Rothermund2004}. This critique focused on the validity of the association model presumed to underlie the observed IAT effects \cite{Greenwald1998}. Rothermund and Wentura proposed an alternative model, which they called the figure-ground model, and claimed that it could account for the effects.
They argue that salience asymmetries between the different stimuli could be driving the observed effects. They claim that negatively valent words are more noticeable and thus become the figure, while positively valent words become the ground. This then drives the observed effect. In the paper noted above, Rothermund and Wentura produce results which appear to confirm this model, using non-words and strings of numbers as stimuli and producing typical IAT effects.
They note that this cannot be due to pre-existing associations, and claim that these results support the salience asymmetries model of the IAT. This paper is extremely well conducted, and has cast a shadow over IAT research. These criticisms have some merit. However, the authors note in passing that valence and familiarity probably drive these effects, so logically the effects of salience can be controlled for by controlling valence and familiarity in IAT research, which has since become common practice.
They also recommend that all IAT research should involve a word/non-word task in order to assess the extent to which salience asymmetries contribute to the observed IAT effects. Greenwald et al.\ \cite{Greenwald2005} responded to this critique, and argued that the salience asymmetry explanation conflicted with the literature showing impressive correlations with explicit behaviour in known-groups studies and meta-analyses \cite{Greenwald2009}.
While this is a very important point, nonetheless this does not rule out a salience asymmetry explanation totally. Perhaps the most important element to take from this debate is that salience asymmetries need to be controlled for (by making sure valence and familiarity are matched across stimuli) so that we can be certain that any associations revealed are not spurious and reflect real differences in people's conceptions of the matters under study.
\subsection{Blanton and Jaccard: Associations versus Attitudes}
\label{sec:blant-jacc-assoc}
The most recent scrutiny has mostly emanated from Hart Blanton and James Jaccard, along with some of their associates. The first critique began as a commentary on Greenwald's \cite{greenwald2002} \textit{Unified theory of implicit attitudes, stereotypes, self esteem and self concept}, and expanded into a critique of multiplicative models within psychology more generally.
The major problems for Blanton and Jaccard \cite{Blanton2006a} were the following. Greenwald's theory posits that attitudes are associated with the self through a network comprising discrete concepts, many of which are either positively or negatively associated with one another. So far, so good. However, as a result of the theory, Greenwald makes a number of predictions which he proposes to test using multiple regression. It is here that problems arise in the view of Blanton and Jaccard.
Multiple regression and multiplicative models tend to require a rational zero point on the scale used. Blanton and Jaccard argue that this requirement has not been met for either the explicit or implicit measures used by Greenwald et al. They point out that Greenwald et al.\ \cite{greenwald2002} assume that the mid-point of a scale measuring explicit attitudes represents a zero point. This requires that the scale be a perfect representation of the underlying construct, which seems somewhat unlikely. Greenwald often uses difference scores to avoid this kind of problem, but Blanton and Jaccard argue that this is only permissible if the positive and negative items on a scale are equally valent, an assumption that cannot be met for most scales used. However, it is an assumption that can be met for much of Greenwald's work, where stimuli are matched for valence.
Greenwald also tends to use identical stimuli for self-report and IAT instruments, so at least this assumption is met across both methods \cite{Farnham1999,Greenwald1998}. Blanton and Jaccard conclude by suggesting an alternative method of analysis for the data reported in Greenwald et al.'s (2002) article. This led to a reply by Greenwald et al.\ \cite{Greenwald2006b}, which used their previously acquired data to test the model suggested by Blanton and Jaccard. They discerned some problems with their own model, but also serious flaws in the strategy suggested by Blanton and Jaccard. Following simulation and meta-analysis, they produced evidence which supports the multiplicative model, at least in a somewhat weaker form.
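The dependence of multiplicative terms on the zero point can be illustrated with a simple sketch (the symbols are illustrative and are not drawn from either set of authors). Suppose a product of two measures, $X$ and $Z$, is used as a predictor, and that the scale for $X$ is shifted by an arbitrary constant $c$, so that $X' = X + c$. Then
\[
X'Z = (X + c)Z = XZ + cZ.
\]
The product computed on the shifted scale therefore contains an additional term, $cZ$, whose size depends entirely on where the zero of $X$ has been placed. Unless that zero point is meaningful, the product term mixes the multiplicative relationship of interest with the simple effect of $Z$, which is the general concern behind the requirement for a rational zero.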
% If this were the limit of the issues raised by Blanton and Jaccard, it would seem not much more than an abtruse debate about relevant statistical models, with little to no relevance for practical use of the instrument, except perhaps for a greater awareness of its limitations.
Later that year Blanton and Jaccard released an article \cite{Blanton2006} in which they claimed to have identified a number of important confounds in the entire IAT procedure. The first, and perhaps most important of these, was a problem with general processing speed. They re-analysed previous data supplied by Greenwald, and using the practice steps as a measure for general processing speed, found that implicit preferences were strongly correlated with one another across different domains. However, when processing speed was controlled for, there were no significant associations. This raises the disturbing possibility that previous associations reported using the IAT may have been the result of processing speed rather than true effects of implicit attitudes.
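One standard way of controlling for a third variable in this situation is the first-order partial correlation. It is given here only as an illustration, since the exact procedure used in \cite{Blanton2006} is not described above. For two IAT scores $A$ and $B$ and a processing speed index $S$,
\[
r_{AB \cdot S} = \frac{r_{AB} - r_{AS}\, r_{BS}}{\sqrt{(1 - r_{AS}^{2})(1 - r_{BS}^{2})}}.
\]
If $A$ and $B$ are related only through their shared dependence on $S$, then $r_{AB}$ will be close to $r_{AS}\, r_{BS}$ and the partial correlation will be close to zero, which is the pattern the re-analysis is described as finding.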
One major flaw in the argument presented by Gonzalez \textit{et al.} is that it assumes that all responses in the IAT are independent of one another across blocks. This is an important assumption because, if the blocks were not independent, then blocks 2 and 4 could not be used as an index of general processing speed, as these authors did. Given the design of the IAT (a speeded response task where one block follows another almost immediately), this independence appears to be an assumption which needs further empirical testing.
Another major problem identified by this re-analysis concerned the definition of the IAT as an instrument which measures relative attitudes rather than absolute preferences \cite{Greenwald1998}. This assumption was so important that it was reflected in the scoring, where an IAT effect is defined as the difference between response latencies in the two conditions.