---
title: Using repeatability of performance within and across contexts to validate measures of behavioral flexibility
author:
- '[McCune KB](https://www.kelseymccune.com)^1^*'
- '[Blaisdell AP](http://pigeonrat.psych.ucla.edu)^2^'
- "Johnson-Ulrich Z^1^"
- "Sevchik A^4^"
- '[Lukas D](http://dieterlukas.mystrikingly.com)^3^'
- '[MacPherson M](http://maggiepmacpherson.com)^1^'
- '[Seitz B](https://benjaminseitz.wixsite.com/mysite)^2^'
- '[Logan CJ](http://CorinaLogan.com)^3^'
output:
  pdf_document:
    keep_tex: yes
    latex_engine: xelatex
  github_document:
    toc: yes
  word_document:
    toc: no
    toc_depth: '4'
  md_document:
    toc: yes
  html_document:
    toc: yes
    toc_depth: 4
    toc_float:
      collapsed: no
    code_folding: hide
always_allow_html: yes
bibliography: MyLibrary.bib
csl: "https://raw.githubusercontent.com/citation-style-language/styles/master/apa.csl"
urlcolor: blue
header-includes:
- \usepackage{fancyhdr}
- \usepackage{pdflscape}
- \newcommand{\blandscape}{\begin{landscape}}
- \newcommand{\elandscape}{\end{landscape}}
---
1) University of California Santa Barbara, USA
2) University of California Los Angeles, USA
3) Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
4) Arizona State University, Tempe, AZ USA.
Corresponding author: McCune, Kelsey B.
email address: kelseybmccune@gmail.com
Open access, open [code](https://github.com/corinalogan/grackles/blob/master/Files/Preregistrations/g_flexmanip2rep.Rmd), open peer review, open [data](https://knb.ecoinformatics.org/view/doi:10.5063/F1VX0F0W)
[DOI: 10.24072/pci.ecology.100407](https://doi.org/10.24072/pci.ecology.100407)
```{r setup, include=FALSE}
library(knitr)
library(formatR)
knitr::opts_chunk$set(tidy.opts=list(width.cutoff=70),tidy=TRUE)
#Make code chunks wrap text so it doesn't go off the page when knitting to PDF
knitr::opts_chunk$set(echo=F, include=T, results='asis', warning=F, message=F)
#sets global options to display code along with the results https://exeter-data-analytics.github.io/LitProg/r-markdown.html
#set echo=F for knitting to PDF (hide code), and echo=T for knitting to HTML (show code)
```
**This is a revision of the post-study manuscript of the preregistration that was pre-study peer reviewed and received an In Principle Recommendation on 26 Mar 2019 by:**
Aurélie Coulon (2019) Can context changes improve behavioral flexibility? Towards a better understanding of species adaptability to environmental changes. *Peer Community in Ecology*, 100019. [10.24072/pci.ecology.100019](https://doi.org/10.24072/pci.ecology.100019). Reviewers: Maxime Dahirel and Andrea Griffin
**Preregistration:** [html](http://corinalogan.com/Preregistrations/g_flexmanip.html), [pdf](https://github.com/corinalogan/grackles/blob/master/Files/Preregistrations/g_flexmanipPassedPreStudyPeerReview26Mar2019.pdf), [rmd](https://github.com/corinalogan/grackles/blob/d17a75c24df4b90aa607eda452f4fcc496ae9409/Files/Preregistrations/g_flexmanip.Rmd)
**Post-study manuscript:** We submitted the first version of the post-study manuscript to PCI Ecology for post-study peer review on 3 Jan 2022; we revised it per reviewer comments, split this piece from the other, distinct components of the preregistration, and resubmitted on 15 Aug 2022; additional reviewer feedback was incorporated and the manuscript was resubmitted to PCI Ecology on 10 May 2023. This version was recommended, with one minor change to a citation. Recommended preprint: [pdf](https://doi.org/10.32942/X2R59K) at EcoEvoRxiv (version 5), [rmd](https://github.com/corinalogan/grackles/blob/master/Files/Preregistrations/g_flexmanip2rep.Rmd).
\newpage
## ABSTRACT
Research into animal cognitive abilities is increasing quickly and often uses methods where behavioral performance on a task is assumed to represent variation in the underlying cognitive trait. However, because these methods rely on behavioral responses as a proxy for cognitive ability, it is important to validate that the task structure does, in fact, target the cognitive trait of interest rather than non-target cognitive, personality, or motivational traits (construct validity). Although it can be difficult, or impossible, to definitively assign performance to one cognitive trait, one way to validate that a task structure is likely to elicit performance based on the target cognitive trait is to assess the temporal and contextual repeatability of performance. In other words, individual performance is likely to represent an inherent trait when it is consistent across time and across similar or different tasks that theoretically test the same trait. Here, we assessed the temporal and contextual repeatability of performance on tasks intended to test the cognitive trait behavioral flexibility in great-tailed grackles (*Quiscalus mexicanus*). For temporal repeatability, we quantified the number of trials to form a color preference after each of multiple color reversals in a serial reversal learning task. For contextual repeatability, we then compared performance on the serial color reversal task to the latency to switch among solutions on each of two different multi-access boxes. We found that the number of trials to form a preference in reversal learning was repeatable across serial color reversals, and that the latency to switch a preference was repeatable across the color reversal learning and multi-access box contexts. This supports the idea that the reversal learning task structure elicits performance reflective of an inherent trait, and that reversal learning and solution switching on multi-access boxes similarly reflect the inherent trait of behavioral flexibility.
## KEYWORDS
Behavioral flexibility, repeatability, construct validity, animal cognition
\newpage
## INTRODUCTION
Research on the cognitive abilities of non-human animals is important for several reasons. By understanding animal cognitive abilities, we can clarify the factors that influenced the evolution of human cognition and the mechanisms that relate cognition to ecological and evolutionary dynamics, and we can use this knowledge to facilitate more humane treatment of captive animals [@shettleworth2010cognition]. In the last 50 years, comparative psychologists and behavioral ecologists have led a surge in studies innovating methods for measuring cognitive traits in animals. As a result, we have come to understand cognition as the process of acquiring information, followed by storage, retrieval, and use of that information for guiding behavior [@shettleworth2010cognition]. Evidence now exists that various species possess cognitive abilities in both the physical [e.g. object permanence: @salwiczek2009development; causal understanding: @taylor2012end] and social domains [e.g. social learning: @hoppitt2012identification; transitive inference: @maclean2008social].
Cognitive traits are not directly observable and nearly all methods to quantify cognition use behavioral performance as a proxy for cognitive ability. Consequently, it is important to evaluate the validity of the chosen methods for quantifying a cognitive trait. To better understand whether performance on a type of task is likely to reflect a target cognitive trait (i.e., that the method has construct validity), researchers can test for repeatability in individual performance within and across tasks [@volter2018comparative]. However, while many cognitive abilities have been tested with various methods, it is rare for one study to repeatedly test individuals with the same method or to use multiple methods to test for a given cognitive ability. Relying on a single method to measure a cognitive trait is problematic because it is hard to discern whether non-target cognitive, personality, or motivational factors are the cause of variation in performance on the task [@morand2016studying]. For example, the success of pheasants on multiple similar and different problem-solving tasks was related to individual variation in persistence and motivation, rather than problem-solving ability [@van2016problem]. Additionally, performance on cognitive tasks can be affected by different learning styles, where individuals can vary in their perception of the salience of stimuli within a task, the impact of a reward (or non-reward) on future behavior, or the propensity to sample alternative stimuli [@rowe2014measuring]. By assessing the temporal and contextual repeatability of performance, researchers can quantify the proportion of variation in performance that is attributable to consistent individual differences, which are likely to reflect the level of the cognitive trait, relative to other ephemeral factors that affect individual performance [@cauchoix2018repeatability].
Behavioral flexibility, the ability to change behavior when circumstances change, is a general cognitive ability that likely affects interactions with both the social and physical environment [@bond2007serial]. Although by definition behavioral flexibility incorporates plasticity in behavior through learning, there is also evidence that the ability to change behavior could be an inherent trait that varies among individuals and species. For example, the pinyon jay - a highly social species of corvid - made fewer errors in a serial reversal learning task than the more asocial Clark’s nutcracker or Woodhouse’s scrub-jay, but all three species exhibited similar learning curves over successive reversals [@bond2007serial]. This indicates that the three species differed in the level of the inherent ability, but were similar in the plasticity of performance through learning. Behavioral flexibility can be measured using a variety of methods [@mikhalevich_is_2017], but the most popular method is reversal learning [@bond2007serial], where behavioral flexibility is quantified as the speed with which individuals switch a learned preference. However, to our knowledge, no studies have assessed the construct validity of this task by comparing the performance of individuals over time and across different tasks that are predicted to require flexible behavior.
In the wild, this ability to change behavior when circumstances change is expected to result in individuals and species that adapt quickly to novelty by showing a high rate of foraging innovations. For example, cross-taxon correlational studies found that species that were “behaviorally flexible”, in that there were many documented foraging innovations, were also more likely to become invasive when introduced to novel habitats [@sol2002behavioural]. The ability to innovate solutions to novel problems can also be more directly quantified using a multi-access or puzzle box task, where the subject must use new behavior patterns to solve the task to get food. While it is generally assumed that foraging innovation rate corresponds to the cognitive ability behavioral flexibility [@sol2002behavioural], few studies compare innovation performance and solution switching (a measure of flexibility) on a multi-access box task to performance on a different behavioral flexibility task like reversal learning.
We tested two hypotheses about the construct validity of the reversal learning method as a measure of behavioral flexibility in the great-tailed grackle (*Quiscalus mexicanus*; hereafter “grackle”). First, we determined whether performance on a reversal learning task represents an inherent trait by assessing the repeatability of performance across serial reversals (temporal repeatability). Secondly, we determined whether the inherent trait measured by color reversal learning is likely to represent behavioral flexibility by assessing the cross-contextual repeatability of performance on this task with another task also thought to measure flexibility. Our previous research found that behavioral flexibility does affect innovation ability on a multi-access box [@logan2023flexmanip], so here our second hypothesis tested whether individuals show contextual repeatability of flexibility by comparing performance on the color reversal learning task to the latency of solution switching on two different multi-access boxes (Fig. 1). We chose solution switching because it requires similar attention to changing reward contingencies, thus serving as a measure of flexibility, but in a different context (e.g. the food is always visible, there is no color association learning required). In other words, in both color reversal learning and solution switching individuals learned a preferred way to obtain food, but then contingencies changed such that food can no longer be obtained with this learned preference and the grackle must be able to switch to a new method. As a human-associated species, the grackle is an ideal subject for this study because the rapid range expansion suggests that they adapted quickly in response to human-induced rapid environmental change [@wehtje2003range; @summers2023xpop] and the genus *Quiscalus* has a high rate of foraging innovations in the wild [@lefebvre2008brains; @grabrucker2010rare]. Therefore, as their environment may select for flexible and innovative behavior, we believe that these tasks are ecologically relevant and will elicit individual variation in performance.
**Figure 1.** We assessed flexibility as the latency to switch a preference across 3 contexts illustrated here. A) We used two colored containers (tubes) in a color reversal learning task, as well as B) plastic and C) wooden multi-access boxes that each had 4 possible ways (loci) to access food. In each context, after a preference for a color/locus was formed, we made the preferred choice non-functional and then measured the latency of the grackle to switch to a new color/locus.
## METHODS
### Preregistration details
This experiment was one piece (**H3a and H3b**) of a larger project. This project is detailed in the [preregistration](http://corinalogan.com/Preregistrations/g_flexmanip.html) that was written (2017), submitted to PCI Ecology for peer review (July 2018), and received the first round of peer reviews a few days before data collection began (Sep 2018). We revised and resubmitted this preregistration after data collection had started (Feb 2019) and it passed peer review (Mar 2019) before any of the planned analyses had been conducted. See also the [peer review history](https://doi.org/10.24072/pci.ecology.100019) at PCI Ecology. The hypotheses, methods, and analysis plan for this experiment are described in detail in that peer-reviewed preregistration. We give a summary of these methods here, with any changes from the preregistration summarized in the *Deviations from the preregistration* section in the supplementary material and additionally noted in each relevant section of the preregistration (indicated in italics).
### Hypotheses
Our first hypothesis considered whether behavioral flexibility (as measured by reversal learning of a color preference) would be repeatable within individuals across serial reversals. Secondly, we hypothesized that, as an inherent trait, behavioral flexibility results in repeatable performance across other contexts (Fig. 1) that require changing behavior when circumstances change (context 1=reversal learning on colored tubes, context 2=plastic multi-access box, context 3=wooden multi-access box).
### Subjects
Great-tailed grackles were caught in the wild in Tempe, Arizona USA using a variety of trapping methods. All individuals received color leg bands for individual identification and some individuals (n=34) were brought temporarily into aviaries. Grackles were individually housed in aviaries (each 244 cm long by 122 cm wide by 213 cm tall) for a maximum of six months, where they had *ad lib* access to water at all times. During testing, we removed their maintenance diet for up to four hours per day. During this time, they had the opportunity to receive high value food items by participating in tests. Individuals were given three to four days to habituate to the aviaries before we began testing. At the end of testing, all individuals were released back to the wild at the location where they were caught and no euthanasia of subjects was necessary. This research was carried out in accordance with permits from the US Fish and Wildlife Service (scientific collecting permit number MB76700A-0,1,2), US Geological Survey Bird Banding Laboratory (federal bird banding permit number 23872), Arizona Game and Fish Department (scientific collecting license number SP594338 [2017], SP606267 [2018], and SP639866 [2019]), Institutional Animal Care and Use Committee at Arizona State University (protocol number 17-1594R), and the University of Cambridge ethical review process (non-regulated use of animals in scientific procedures: zoo4/17 [2017]).
### Serial color reversal learning
We first used serial reversal learning to measure grackle behavioral flexibility. Briefly, we trained grackles to search in one of two differently colored containers for food (Fig. 1A). We used a random number generator to select the color (e.g. light gray) of the container that would consistently contain a food reward across the initial trials. Within each trial, grackles could choose only one container to look in for food. Eventually, grackles showed a significant preference for the rewarded color container (where preference was defined as a minimum of 17 out of 20 correct choices), completing the initial discrimination trials. We then switched the location of the food to the container of the other color (a reversal). The food reward was then consistently located in the container of this second color (e.g. dark gray) across trials until the grackles learned to switch their preference, after which we would again reverse the food to the original colored container (e.g. light gray) and so on back and forth until they passed the serial reversal learning experiment passing criterion [formed a preference in 2 sequential reversals in 50 or fewer trials: @logan2023flexmanip]. We measured behavioral flexibility on each reversal as the time it took grackles to switch their preference and search in the second colored container on a minimum of 17 out of 20 trials. See the protocol for serial reversal learning [here](https://docs.google.com/document/d/18D80XZV_XCG9urVzR9WzbfOKFprDV62v3P74upu01xU/edit?usp=sharing).
### Multi-access boxes
We additionally used two different multi-access boxes (hereafter “MAB”) to assess behavioral flexibility as the latency to switch loci when a preferred locus becomes non-functional. We randomized the order that grackles received each MAB and all grackles were given time to habituate to the MABs prior to testing. We set up the MABs in the aviary of each grackle with food in and around each apparatus in the days prior to testing. At this point all loci were absent or fixed in open, non-functional positions to prevent any early learning of how to solve each apparatus. We began testing when the grackle was eating comfortably from the MAB. For each MAB, the goal was to measure how quickly the grackle could learn to solve each locus, and then how quickly they could switch to attempting to solve a new locus. Consequently, we measured the number of trials to solve a locus and the number of trials until the grackle attempted a new locus after a previously solved locus was made non-functional (solution switching). See protocols for MAB habituation and testing [here](https://docs.google.com/document/d/18D80XZV_XCG9urVzR9WzbfOKFprDV62v3P74upu01xU/edit?usp=sharing).
**Plastic multi-access box** This apparatus consisted of a box with transparent plastic walls (Fig. 1B). There was a pedestal within the box where the food was placed and 4 different options (loci) set within the walls for accessing the food. One locus was a window that, when opened, allowed the grackle to reach in to grab the food. The second locus was a shovel that the food was placed on such that, when turned, the food fell from the pedestal and rolled out of the box. The third locus was a string attached to a tab that the food was placed on such that, when pulled, the food fell from the pedestal and rolled out of the box. The last locus was a horizontal stick that, when pushed, would shove the food off the pedestal such that it rolled out of the box. Each trial was 10 minutes long, or until the grackle used a locus to retrieve the food item. We reset the box out of view of the grackle to begin the next trial. To pass criterion for a locus, the grackle had to get food out of the box after touching the locus only once (i.e. used a functional behavior to retrieve the food) in more than 2 trials across 2 sessions. Afterward, the locus was made non-functional to encourage the grackle to interact with the other loci.
**Wooden multi-access box** This apparatus consisted of a natural log that contained 4 compartments (loci) covered by transparent plastic doors (Fig. 1C). Each door opened in a different way (open up like a hatch, out to the side like a car door, pull out like a drawer, or push in). During testing, all doors were closed and food was placed in each locus. Each trial lasted 10 minutes or until the grackle opened a door. After solving a locus, the experimenter re-baited that compartment, closed the door out of view of the grackle, and the next trial began. After a grackle solved one locus 3 times, that door was fixed in the open position and the compartment left empty to encourage the grackle to attempt the other loci.
### Repeatability analysis
Repeatability is defined as the proportion of total variation in performance that is attributable to differences among individuals [@nakagawa2010repeatability]. In other words, performance is likely to represent an inherent trait when there is significant among-individual variation in performance across repeated samples.
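Written as a formula, repeatability $R$ is the among-individual variance divided by the sum of the among-individual and residual variances [@nakagawa2010repeatability]:

$$R = \frac{V_{\text{ID}}}{V_{\text{ID}} + V_{\text{residual}}}$$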
To measure repeatability within an individual across serial reversals of a color preference, we modeled the number of trials to pass a reversal (choosing correctly on at least 17 out of 20 sequential trials) as a function of the reversal number (i.e., first time the rewarded color is reversed, second time, third time, etc.) and a random effect for individual. The number of reversals each grackle experienced ranged from 6 to 11 (mean = 7.6), and this range was based on when individuals were able to pass two sequential reversals in 50 or fewer trials, or (in 1 case) when we reached the maximum duration that we were permitted to keep grackles in the aviaries and they needed to be released. We then calculated the adjusted repeatability [@nakagawa2010repeatability] from the variance components for the random effect and the residual variance, after accounting for the variance attributed to reversal number, to determine the proportion of variance attributable to differences among individuals. Although our dependent variable (number of trials to reverse) is a count variable, the distribution of values was not appropriate for a Poisson regression: when checking the fit of our data to a Poisson model, the data were overdispersed and heteroscedastic. However, when log-transformed, the data approximate a normal distribution and are not heteroscedastic, indicating that a Gaussian model fits our log-transformed data well.
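To make this calculation concrete, the sketch below illustrates the workflow (it is an illustration rather than the exact preregistered analysis script): it fits the Gaussian model to the log-transformed number of trials to reverse with MCMCglmm and computes the adjusted repeatability from the posterior variance components. It assumes a data frame `d` with the columns `TrialsToReverse`, `ReverseNumber`, and `ID` from the open data file loaded in the Figure 2 code, with `ReverseNumber` kept numeric.

```{r repeatabilitySketch, eval=FALSE, echo=TRUE}
library(MCMCglmm)  # also attaches coda, which provides as.mcmc() and HPDinterval()

# Uninformative priors for the ID and residual variances (V = 1, nu = 0)
prior <- list(R = list(V = 1, nu = 0),
              G = list(G1 = list(V = 1, nu = 0)))

# Gaussian model of the log-transformed number of trials to reverse, with reversal
# number as a fixed effect and individual identity (ID) as a random effect
m_temporal <- MCMCglmm(log(TrialsToReverse) ~ ReverseNumber,
                       random = ~ID, family = "gaussian", data = d, prior = prior,
                       nitt = 13000, thin = 10, burnin = 3000, verbose = FALSE)

# Adjusted repeatability for each posterior sample:
# among-individual variance / (among-individual variance + residual variance)
rep_post <- as.mcmc(m_temporal$VCV[, "ID"] /
                      (m_temporal$VCV[, "ID"] + m_temporal$VCV[, "units"]))
mean(rep_post)         # posterior mean repeatability
HPDinterval(rep_post)  # 95% credible interval
```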
By design, to reach the ending criterion of the serial reversal learning experiment, grackles became faster at switching across serial reversals. We did attempt to run a model that additionally included a random slope to test whether there were consistent individual differences in the rate at which grackles switched their preferences across reversals. However, we could not get the model to converge with our sample size and the uninformative priors that were preregistered. We therefore retained the preregistered methods to avoid biasing the model output. To determine the statistical significance of the repeatability value, while accounting for the non-independence introduced by this change in reversal speed over time, we compared the actual number of trials to switch a preference in each reversal to simulated data in which birds performed randomly within each reversal.
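One way to implement such a comparison is a within-reversal permutation, sketched below. This illustrates the general logic rather than the exact simulation we ran (see the R code for Analysis Plan > P3a), and it assumes the objects `d`, `prior`, and `rep_post` from the previous sketch.

```{r permutationSketch, eval=FALSE, echo=TRUE}
set.seed(42)
n_perm <- 100  # number of permuted data sets (illustrative; more is better)

perm_rep <- replicate(n_perm, {
  d_perm <- d
  # Shuffle the trials-to-reverse values among birds within each reversal number,
  # breaking any association between individual identity and performance
  d_perm$TrialsToReverse <- ave(d_perm$TrialsToReverse, d_perm$ReverseNumber,
                                FUN = sample)
  m_perm <- MCMCglmm(log(TrialsToReverse) ~ ReverseNumber, random = ~ID,
                     family = "gaussian", data = d_perm, prior = prior,
                     nitt = 13000, thin = 10, burnin = 3000, verbose = FALSE)
  mean(m_perm$VCV[, "ID"] / (m_perm$VCV[, "ID"] + m_perm$VCV[, "units"]))
})

# One-sided p-value: proportion of permuted repeatability values at least as large
# as the observed posterior mean repeatability
mean(perm_rep >= mean(rep_post))
```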
We tested for contextual repeatability by modeling the variance in latency (in seconds) to switch a preference among and within individuals across 3 behavior switching contexts. Note that the time it took to switch a colored tube preference in serial reversal learning was measured in trials, but the time it took to switch loci in the MAB experiment was measured in seconds. We used the trial start times in the serial reversal experiment to convert the latency to switch a preference from number of trials to number of seconds. Therefore, the contexts across which we measured repeatability of performance were the latency to switch a preference to a new color in the color reversal learning task and latency to switch to a new locus after a previously solved locus was made non-functional on both MABs.
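A sketch of this cross-context model is below. It is an illustration rather than the exact analysis script: it reuses `prior` from the earlier sketch, uses the columns `ID`, `Test`, and `Dur.sec` from the open cross-context data file loaded in the Figure 4 code, and uses `SwitchNumber` as a placeholder name for the column holding the switch/reversal number.

```{r xcontextSketch, eval=FALSE, echo=TRUE}
# Cross-context data: one row per preference switch with the latency in seconds
# (Dur.sec), the context (Test), and the bird's identity (ID). "SwitchNumber" is a
# placeholder name for the column holding the switch/reversal number.
xc <- read.csv("https://raw.githubusercontent.com/corinalogan/grackles/master/Files/Preregistrations/g_flexmanip2rep_data_xcontext.csv")

m_context <- MCMCglmm(log(Dur.sec) ~ Test + SwitchNumber, random = ~ID,
                      family = "gaussian", data = xc, prior = prior,
                      nitt = 13000, thin = 10, burnin = 3000, verbose = FALSE)

# Cross-context repeatability from the posterior variance components
rep_context <- as.mcmc(m_context$VCV[, "ID"] /
                         (m_context$VCV[, "ID"] + m_context$VCV[, "units"]))
mean(rep_context)
HPDinterval(rep_context)
```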
We used the DHARMa package [@hartig2019dharma] in R to test whether our model fit our data and was not heteroscedastic, zero-inflated or over-dispersed. We used the MCMCglmm package [@hadfield2010mcmc], with uninformative priors, to model the relationships of interest for our two hypotheses.
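DHARMa works directly with models fitted in packages such as lme4 or glmmTMB, so one simple way to run these checks is on a parallel lme4 fit of the same model structure, as sketched below (an illustration of the workflow rather than the exact diagnostic code used for the analyses reported here).

```{r dharmaSketch, eval=FALSE, echo=TRUE}
library(DHARMa)
library(lme4)

# Poisson fit on the raw counts (the distribution that failed our data checks)
m_pois <- glmer(TrialsToReverse ~ ReverseNumber + (1 | ID),
                data = d, family = poisson)
res_pois <- simulateResiduals(m_pois)
plot(res_pois)               # QQ plot and residual-vs-predicted plot
testDispersion(res_pois)     # over/underdispersion
testZeroInflation(res_pois)  # zero inflation
testQuantiles(res_pois)      # quantile deviations (heteroscedasticity)

# Gaussian fit on the log-transformed counts (the distribution we used)
m_gaus <- lmer(log(TrialsToReverse) ~ ReverseNumber + (1 | ID), data = d)
plot(simulateResiduals(m_gaus))
```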
### Open data
All data are available at the Knowledge Network for Biocomplexity’s data repository: https://knb.ecoinformatics.org/view/doi:10.5063/F1VX0F0W [@mccune2022flexmaniprepdata].
## RESULTS
For our first hypothesis, which tested the temporal repeatability of reversal learning performance, our sample size was 9 individual grackles and 68 total data points (one value for each of the 6-11 reversals that each grackle experienced).
Performance was repeatable within individuals within the context of reversal learning (Fig. 2): we obtained a repeatability value of 0.13 (95% credible interval (CI) = 4.64x10^-16^ - 0.43). We found that, although the lower bound of the credible interval is approximately zero, the mean repeatability value was significantly greater than expected if birds were performing randomly [p=0.003; @nakagawa2010repeatability]. Furthermore, the distribution of the posterior estimates for the actual data was much less skewed towards zero compared to the permuted data of birds performing randomly (Fig. 3; see analysis details in the R code for Analysis Plan > P3a), though, given this uncertainty, we cannot completely exclude the possibility that individual identity does not influence performance. Consequently, and as preregistered, we did not conduct the analysis for the P3a alternative to determine whether a lack of repeatability was due to motivation or hunger.
```{r Fig2, eval = F, echo = F, message = F, warning = F}
library(ggplot2)
d <- read.csv(url("https://raw.githubusercontent.com/corinalogan/grackles/master/Files/Preregistrations/g_flexmanip_data_repeatability.csv"), header=T, sep=",", stringsAsFactors=F)
results <- read.csv("https://raw.githubusercontent.com/corinalogan/grackles/master/Files/Preregistrations/g_flexmanip2rep_SimulationResults.csv")
#remove NAs from the variables that will be in the model
d <- subset(d,!(is.na(d["TrialsToReverse"])))
d <- subset(d,!(is.na(d["ReverseNumber"])))
#include only those birds in the reversal tubes experiment
d <- d[d$TubesOrTouchscreen=="TUBES" & d$ExperimentalGroup=="Manipulation",]
#factor variable
d$ID <- as.factor(d$ID)
#remove pilot birds
d <- d[!d$ID=="Fajita" & !d$ID=="Empanada",]
# FIGURE 2 - performance on serial reversal learning
d$ReverseNumber = as.factor(d$ReverseNumber)
d$ID = as.factor(as.character(d$ID))
maxrev = aggregate(as.numeric(ReverseNumber) ~ ID, data = d, FUN = "max")
colnames(maxrev)[2] = "MaxRev"
d = merge(d, maxrev, by = "ID",all = T)
p<-ggplot(d, aes(x=reorder(ID,MaxRev),y=TrialsToReverse,shape=ReverseNumber,color = ID))+
geom_point(size = 4, stroke = 2, position = position_dodge(width = 0.35)) +
scale_shape_manual(values = c(0:6,8,15,17,19))+
theme_bw() +
theme(axis.text = element_text(size = 15),
axis.title = element_text(size = 18),
legend.title=element_text(size=15),
legend.text=element_text(size=14)) +
labs(x = "Bird", y = "Trials to pass reversal", shape = "Reversal number") +
scale_colour_discrete(guide = "none")
p
```
**Figure 2:** The number of trials each individual took to reverse a preference across serial reversals. The clustering of data points within each individual illustrates the temporal repeatability in performance. Each reversal is indicated by a different shape. Individuals are grouped by color and arranged from fastest to slowest to complete the serial reversal experiment. Note that as per the serial reversal experimental design, data from nearly all individuals include 2 reversals at or below 50 trials. The one exception was Memela, who never increased the speed to switch her preference.
**Figure 3:** Frequency histograms of posterior repeatability estimates from the model testing the latency to switch a color preference in a serial reversal learning experiment. To determine the significance of our repeatability value while accounting for the non-independence of the serial reversal learning experimental design, we compared our repeatability value to repeatability posterior estimates calculated from permuted data where birds performed randomly within each reversal. Estimates from actual data (red) are compared to the distribution of estimates from randomized permutations of the data (green). The vertical red line at 0.13 is the observed mean repeatability estimate reported in this manuscript and it was significantly greater than random.
We then assessed the repeatability of performance across contexts by quantifying whether individuals that were fast to switch a preference in the color reversal task were also fast to switch to attempting a new solution after passing criterion on a different solution on the two MAB tasks. We converted our metric of reversal speed from the number of trials to reverse to the number of seconds to reverse so that the measures across contexts would be on the same scale. We had repeated measures across contexts for 15 grackles that participated in at least one color reversal and one solution switch on either or both MAB tasks. We found significant repeatability across contexts (R=0.36, CI = 0.10 - 0.64, p=0.01; Fig. 4), where latency to switch was consistent within individuals and different among individuals.
```{r Fig4, eval=F, echo=F, message=FALSE, warning=FALSE}
# ADDED JULY 2022 - we realized we mis-specified the model for assessing repeatability of flexibility across contexts. Previously we were analyzing the relationship between overall performance on the reversal learning and MAB tasks, rather than individual consistency of performance. Consequently we conducted an analysis that evaluated performance as a function of task type, reversal/switch number and a random effect of ID.
rev <- read.csv(url("https://raw.githubusercontent.com/corinalogan/grackles/master/Files/Preregistrations/g_flexmanip2rep_data_xcontext.csv"), header=T, sep=",", stringsAsFactors=F) # data file with reversal learning measure in seconds to reverse
rev <- rev[!rev$ID == "Diablo",] #remove Diablo who did not pass criterion on solving any MAB loci and so we can't measure solution switching
## PLOTTING RESULTS ##
library(tidyverse)
library(ggplot2)
rev = rev %>%
group_by(Test) %>%
mutate(scale.dur = scale(Dur.sec, center=T, scale=T)) %>%
ungroup()
rev.all3 = rev %>%
group_by(ID, Test) %>%
summarise(median = median(scale.dur),
lcl = quantile(scale.dur, probs=0.25),
ucl = quantile(scale.dur, probs=0.75)) %>%
ungroup() %>%
mutate(Test = recode(Test, "LatencyMABplastic"="MAB plastic",
"LatencyMABwooden"="MAB wood",
"Reversal" = "Reversal learning")) #%>%
#performance across all tests
x = ggplot(rev.all3, aes(x = reorder(ID, median, mean), y = median, colour = ID, shape = Test)) +
geom_pointrange(aes(ymin=lcl, ymax=ucl), size = 1.5,
position = position_dodge(width = 0.5)) +
scale_shape_manual(values = c(15,17,8,19))+
labs(x = "Bird", y = "Latency to switch (scaled and centered seconds)") +
theme_bw() +
theme(axis.text = element_text(size = 15),
axis.title = element_text(size = 18),
legend.title=element_text(size=15),
legend.text=element_text(size=14)) +
scale_fill_discrete(guide = "none") +
scale_colour_discrete(guide = "none")
x
```

**Figure 4:** Grackle performance in the different contexts for measuring behavioral flexibility: multi-access box (MAB) plastic (square symbol), MAB wood (triangle symbol), and reversal learning with color tubes (star symbol). Points indicate the (scaled and centered) median performance of an individual in each context, and the lines indicate the interquartile range of variation in performance across multiple switches within a context. Some individuals participated in a context but did not experience multiple preference switches, so there is a point but no line. Additionally, some individuals are missing points for a given context because they did not participate. Grackles are ordered on the x-axis from fastest (left) to slowest (right).
## DISCUSSION
We found that individual grackles were consistent in their behavioral flexibility performance during multiple assessments within the same context, and across multiple assessments in different contexts. This indicates that 1) the different methods we used to measure behavioral flexibility all likely measure the same latent inherent aspects affecting performance and 2) there is consistent individual variation in behavioral flexibility, which could impact other traits such as survival and fitness in novel areas, foraging, or social behavior.
In behavioral and cognitive research on animals, it is important to determine that the chosen method measures the trait of interest (construct validity). Many experimental methods may lack construct validity because they were adapted from research on other species [e.g. from humans: @wood1980object], applied to new contexts [e.g. from captive to wild animals: @mccune2019captive], or created from an anthropomorphic perspective [e.g. mirror self recognition tasks: @kohda2022further]. Funding and logistical limitations result in few researchers assessing the appropriateness of their methods by testing construct validity through convergent (similar performance across similar tasks) and discriminant validity (different performance across different tasks). Although our sample size was small, which likely led to moderately large credible intervals, we still found significant temporal and contextual repeatability of switching performance. This evidence for convergent validity indicates these similar tasks are likely assessing aspects of the same latent trait or traits [@volter2018comparative; @morand2022cognitive]. However, performance can potentially be affected by many traits, so future studies manipulating other factors that might influence performance are needed to continue to pinpoint the latent traits governing aspects of performance on cognitive tasks. Thus, it is important to also test for discriminant validity by comparing performance on flexibility tasks with tasks intended to measure different cognitive abilities. For example, it is possible that performance on serial reversal learning and solution switching on the MAB tasks is reflective of a trait other than behavioral flexibility, like inhibition [@maclean2014evolution]. Indeed, we previously found that the more flexible grackles on the serial reversal learning task were also better able to inhibit responding to a non-rewarded stimulus in a go/no-go task thought to measure self-control [@logan2021inhibition]. Consequently, more research is needed to interpret whether some aspect of performance on the go/no-go task reflects behavioral flexibility or whether performance on the reversal learning task is influenced by inhibition.
The repeatability estimate for cross-contextual switching performance was higher than the estimate for switching performance within a context, indicating that a larger portion of the variance in cross-contextual performance is attributable to individual differences (lower residual variance and/or greater among-individual variance). Performance on a task likely depends on multiple cognitive processes, some of which might be more repeatable than others. For example, @lukas2022flexmanip found that performance on the serial reversal learning task was related to two distinct components - the rate of updating an attraction to a colored tube (phi) and the likelihood of deviating from the learned attractions (lambda), where phi appeared to show more individual consistency than lambda. Repeatability might be higher for cross-contextual switching depending on which cognitive processes dominate in a given task and across contexts. Variation in the design of our tasks may lead to higher residual variance in individual performance across reversals because food is hidden in the serial reversal learning task but clearly visible behind transparent plastic barriers in both MAB tasks. After a reversal, to determine which of the two colored tubes to search in for food, grackles cannot rely on short-term memory of the previous location of the food; instead, they must have some motivation to search in a new color of tube (lambda). Consequently, it is possible that higher within-individual variation in performance across serial reversals in the latency to switch was related to the other factors affecting an individual’s decision-making on each trial, like conflicting memories of reward history for each color tube or a tendency to make a choice based on a side bias. In contrast, in the MAB tasks, even if the previously rewarded option is non-functional the grackles can clearly see that the food is still there, which may facilitate motivation to change their behavior regardless of past memories of reward contingencies or bias towards certain stimuli.
The functional importance of behavioral flexibility is that it is thought to facilitate invasion success by allowing individuals to quickly change their behavior when circumstances change in new environments. For example, flexible grackles may innovate new foraging techniques or generalize standard techniques to new food items in novel areas. The great-tailed grackle has rapidly expanded its range [@wehtje2003range; @summers2023xpop], implying that it is able to maintain high survival and fitness in the face of environmental change. Although grackles are a behaviorally flexible species [@logan2016flexibilityproblem], we show here that there are consistent individual differences among grackles in how quickly they are able to change their behavior when circumstances change in multiple different contexts. While some grackles were consistently faster at changing their behavior (e.g., Chilaquile), others were consistently slower (e.g., Yuca). This consistency in performance may seem contradictory to our previous research where we found that we are able to manipulate grackles to be more flexible using serial reversal learning [@logan2023flexmanip]. That behavioral flexibility is both repeatable within individuals across reversals (indicating it is an inherent trait) and manipulable through serial reversals aligns with the idea of behavioral reaction norms [@sih2013understanding]. This idea states that individuals can show consistent individual differences in the baseline or average values of a trait of interest across time or contexts, but the plasticity in the expression of the trait can also consistently vary among individuals. Due to our small sample size, we were not able to explicitly test for behavioral reaction norms, but this is an important next step in understanding consistent individual variation in behavioral flexibility in relation to rapid environmental change. Past experience (developmental or evolutionary) with environmental change influences how plastic individuals are able to be [@sih2013understanding]. To understand the implications of this individual variation in performance in this species that has experienced much environmental change during the range expansion, our future research investigates how behavioral flexibility may relate to proximity to the range edge [@logan2020xpop], and the variety of foraging techniques used in the wild [@logan2019flexforaging].
By first validating the experimental methods for behavioral and cognitive traits, such that we are more certain that our tests are measuring the intended trait, we are better able to understand the causes and consequences of species, population, and individual differences. Individual variation in behavioral flexibility has the potential to influence species adaptation and persistence under human-induced rapid environmental change [@sih2013understanding]. Consequently, we believe the results presented here are a timely addition to the field by demonstrating two potential methods for measuring behavioral flexibility that produced repeatable performance in at least one system.
## ETHICS
This research was carried out in accordance with permits from the:
1) US Fish and Wildlife Service (scientific collecting permit number MB76700A-0,1,2)
2) US Geological Survey Bird Banding Laboratory (federal bird banding permit number 23872)
3) Arizona Game and Fish Department (scientific collecting license number SP594338 [2017], SP606267 [2018], and SP639866 [2019])
4) Institutional Animal Care and Use Committee at Arizona State University (protocol number 17-1594R)
5) University of Cambridge ethical review process (non-regulated use of animals in scientific procedures: zoo4/17 [2017])
## AUTHOR CONTRIBUTIONS
**McCune:** Added MAB log experiment, hypothesis development, protocol development, data collection, data interpretation, write up, revising/editing, materials.
**Blaisdell:** Prediction revision, assisted with programming the reversal learning touchscreen experiment, protocol development, data interpretation, revising/editing.
**Johnson-Ulrich:** Prediction revision, programming, data collection, data interpretation, revising/editing.
**Lukas:** Hypothesis development, simulation development, data interpretation, revising/editing.
**MacPherson:** Data collection, data interpretation, revising/editing.
**Seitz:** Prediction revision, programmed the reversal learning touchscreen experiment, protocol development, data interpretation, revising/editing.
**Sevchik:** Data collection, revising/editing.
**Logan:** Hypothesis development, protocol development, data collection, data analysis and interpretation, revising/editing, materials/funding.
## FUNDING
This research is funded by the Department of Human Behavior, Ecology and Culture at the Max Planck Institute for Evolutionary Anthropology (2017-current), and by a Leverhulme Early Career Research Fellowship to Logan (2017-2018).
## CONFLICT OF INTEREST DISCLOSURE
We, the authors, declare that we have no financial conflicts of interest with the content of this article. CJ Logan is a Recommender and, until 2022, was on the Managing Board at PCI Ecology. D Lukas is a Recommender at PCI Ecology. Also, CJ Logan and D Lukas are associate editors at PeerJ.
## ACKNOWLEDGEMENTS
We thank our PCI Ecology recommender, Aurelie Coulon, and reviewers, Maxime Dahirel and Andrea Griffin, for their feedback on this preregistration; Kevin Langergraber for serving as our ASU IACUC PI; Ben Trumble and Angela Bond for logistical support; Melissa Wilson for sponsoring our affiliations at Arizona State University and lending lab equipment; Kristine Johnson for technical advice on great-tailed grackles; Arizona State University School of Life Sciences Department Animal Care and Technologies for providing space for our aviaries and for their excellent support of our daily activities; Julia Cissewski for tirelessly solving problems involving financial transactions and contracts; Sophie Kaube for logistical support; Richard McElreath for project support; Aaron Blackwell and Ken Kosik for being the UCSB sponsors of the Cooperation Agreement with the Max Planck Institute for Evolutionary Anthropology; Tiana Lam, Anja Becker, and Brynna Hood for interobserver reliability video coding; Sawyer Lung for field support; Alexis Breen for coding multi-access box videos; and our research assistants: Aelin Mayer, Nancy Rodriguez, Brianna Thomas, Aldora Messinger, Elysia Mamola, Michael Guillen, Rita Barakat, Adriana Boderash, Olateju Ojekunle, August Sevchik, Justin Huynh, Jennifer Berens, Amanda Overholt, Michael Pickett, Sam Munoz, Sam Bowser, Emily Blackwell, Kaylee Delcid, Sofija Savic, Brynna Hood, Sierra Planck, and Elise Lange.
\newpage
## SUPPLEMENTARY MATERIALS
### Deviations from the preregistration
**In the middle of data collection**
1) We originally planned to use a touchscreen test of serial reversal learning as one of the contexts in this experiment. However, on 10 April 2019 we **discontinued the reversal learning experiment on the touchscreen** because it appears to measure something other than what we intended to test and it requires a huge time investment for each bird (which consequently reduces the number of other tests they are available to participate in). This is not necessarily surprising because this is the first time touchscreen tests have been conducted in this species, and also the first time (to our knowledge) this particular reversal experiment has been conducted on a touchscreen with birds. We based this decision on data from four grackles (2 in the flexibility manipulation group and 2 in the flexibility control group; 3 males and 1 female). All four of these individuals showed highly inconsistent learning curves and required hundreds more trials to form each preference when compared to their performance on the colored tube reversal experiment. It appears that there is a confounding variable with the touchscreen such that grackles are extremely slow to learn a preference, as indicated by the number of trials they required to pass our criterion of 17 correct trials out of the most recent 20. We will not include the data from this experiment when conducting the cross-test comparisons in the Analysis Plan section of the preregistration.
2) 16 April 2019: Because we discontinued the touchscreen reversal learning experiment, we **added an additional but distinct multi-access box** task, which allowed us to continue to measure flexibility across three different experiments. There are two main differences between the first multi-access box, which is made of plastic, and the new multi-access box, which is made of wood. First, the wooden multi-access box is a natural log in which we carved out 4 compartments. As a result, the apparatus and solving options are more comparable to what grackles experience in the wild, though each compartment is covered by a transparent plastic door that requires different behaviors to open. Furthermore, there is only one food item available in the plastic multi-access box and the bird could use any of 4 loci to reach it. In contrast, the wooden multi-access box has a piece of food in each of the 4 separate compartments.
**Post data collection, pre-data analysis**
3) We completed our simulation to explore the lower boundary of a minimum sample size and determined that **our sample size for the Arizona study site is above the minimum** (see details and code in [Ability to detect actual effects](http://corinalogan.com/Preregistrations/g_flexmanip.html#ability_to_detect_actual_effects); 17 April 2020).
4) We originally planned on testing only **adults** to have a better understanding of what the species is capable of, assuming the abilities we are testing are at their optimal levels in adulthood, and so we could increase our statistical power by eliminating the need to include age as an independent variable in the models. Because the grackles in Arizona were extremely difficult to catch, we ended up testing two juveniles in this experiment. The juveniles’ performance on the three tests was similar to that of the adults; therefore, we decided not to add age as an independent variable in the models to avoid reducing our statistical power.
**Post data collection, mid-data analysis**
5) The distribution of values for the "number of trials to reverse" response variable in the [P3a analysis](#p3a-repeatable-within-individuals-within-a-context-reversal-learning) was not a good fit for the Poisson distribution because it was overdispersed and heteroscedastic. We log-transformed the data to approximate a normal distribution and it passed all of the data checks. Therefore, we used a Gaussian distribution for our model, which fits the log-transformed data well. (24 Aug 2021)
6) We realized we mis-specified the model and variables for evaluating cross-contextual repeatability [P3b analysis](#p3b-individual-consistency-across-contexts). The dependent variable should be latency to switch to a new preference (we previously listed “number of trials to solve”, which is more likely indicative of innovation rather than flexibility). Furthermore, to assess performance across contexts, this dependent variable should be the latency to switch in each of the 3 contexts. Note that the time it took to switch a colored tube preference in serial reversal learning was measured in trials, but the time it took to switch loci in the MAB experiment was measured in seconds. We used the trial start times in the serial reversal experiment to convert the latency to switch a preference from number of trials to number of seconds. In line with this change in the dependent variable, the independent variables are only Context (MAB plastic, MAB wood, reversal learning), and reversal number (the number of times individuals switched a preference when the previously preferred color/locus was made non-functional). Additionally, this dependent variable was heteroscedastic when we used a Poisson model, but passed all data checks when we log-transformed it to use a Gaussian model.
### PREREGISTRATION (detailed methods)
#### HYPOTHESES
##### **H3a**: Behavioral flexibility within a context is repeatable within individuals.
Repeatability of behavioral flexibility is defined as the number of trials to reverse a color preference being strongly negatively correlated within individuals with the number of reversals.
**P3a:** Individuals that are faster to reverse a color preference in the first reversal will also be faster to reverse a color preference in the second, etc. reversal due to natural individual variation.
**P3a alternative:** There is no repeatability in behavioral flexibility within individuals, which could indicate that performance is state dependent (e.g., it depends on their fluctuating motivation, hunger levels, etc.). We will determine whether performance on colored tube reversal learning related to motivation by examining whether the latency to make a choice influenced the results. We will also determine whether performance was related to hunger levels by examining whether the number of minutes since the removal of their maintenance diet from their aviary plus the number of food rewards they received since then influenced the results.
##### **H3b**: The consistency of behavioral flexibility in individuals across contexts (context 1=reversal learning on colored tubes, context 2=multi-access boxes, context 3=reversal learning on touchscreen) indicates their ability to generalize across contexts.
Individual consistency of behavioral flexibility is defined as the number of trials to reverse a color preference being strongly positively correlated within individuals with the latency to solve new loci on each of the multi-access boxes and with the number of trials to reverse a color preference on a touchscreen (total number of touchscreen reversals = 5 per bird).
*If P3a is supported (repeatability of flexibility within individuals)...*
**P3b:** ...and flexibility is correlated across contexts, then the more flexible individuals are better at generalizing across contexts.
**P3b alternative 1:** ...and flexibility is not correlated across contexts, then there is something that influences an individual's ability to discount cues in a given context. This could be the individual's reinforcement history (tested in P3a alternative), their reliance on particular learning strategies (one alternative is tested in H4), or their motivation (tested in P3a alternative) to engage with a particular task (e.g., difficulty level of the task).
#### DEPENDENT VARIABLES
*P3a and P3a alternative 1*
Number of trials to reverse a preference. An individual is considered to have a preference if it chose the rewarded option at least 17 out of the most recent 20 trials (with a minimum of 8 or 9 correct choices out of 10 on the two most recent sets of 10 trials). We use a sliding window to look at the most recent 10 trials for a bird, regardless of when the testing sessions occurred.
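For concreteness, this sliding-window criterion can be expressed as a small helper function like the hypothetical sketch below, which flags the first trial at which a bird has at least 17 correct choices in the most recent 20 trials, with at least 8 correct in each of the two most recent blocks of 10 (one reading of the rule above).

```{r criterionSketch, eval=FALSE, echo=TRUE}
# Hypothetical helper (not the experimenters' scoring code): returns the trial at
# which the preference criterion is first met, or NA if it is never met.
# `choices` is a logical vector, one element per trial, TRUE when the bird chose
# the rewarded option on that trial.
trials_to_criterion <- function(choices, window = 20, needed = 17, per_block = 8) {
  if (length(choices) < window) return(NA_integer_)
  for (t in window:length(choices)) {
    last_window <- choices[(t - window + 1):t]
    block1 <- last_window[1:(window / 2)]            # older block of 10 trials
    block2 <- last_window[(window / 2 + 1):window]   # most recent block of 10 trials
    if (sum(last_window) >= needed &&
        sum(block1) >= per_block && sum(block2) >= per_block) {
      return(t)
    }
  }
  NA_integer_
}

# Example: a bird that is mostly wrong for 10 trials and then correct on every trial
example_choices <- c(rep(c(TRUE, FALSE, FALSE), length.out = 10), rep(TRUE, 30))
trials_to_criterion(example_choices)  # trial on which the criterion is first reached
```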
*P3b: additional analysis: individual consistency in flexibility across contexts + flexibility is correlated across contexts*
Number of trials to solve a new locus on the multi-access boxes. *NOTE: Jul 2022 we realized this variable is more likely to represent innovation, whereas we intend to assess flexibility here. Therefore, we changed this variable to the latency to attempt to switch a preference after the previously rewarded color/locus becomes non-functional.*
#### INDEPENDENT VARIABLES
*P3a: repeatable within individuals within a context*
1) Reversal number
2) ID (random effect because repeated measures on the same individuals)
*P3a alternative 1: was the potential lack of repeatability on colored tube reversal learning due to motivation or hunger?*
1) Trial number
2) Latency from the beginning of the trial to when they make a choice
3) Minutes since maintenance diet was removed from the aviary
4) Cumulative number of rewards from previous trials on that day
5) ID (random effect because repeated measures on the same individuals)
6) Batch (random effect because multiple individuals were tested together in the same batch). Note: a batch is a test cohort consisting of 8 birds being tested simultaneously
*P3b: repeatable across contexts*
*NOTE: Jul 2022 we changed the dependent variable to reflect the general latency to switch a preference (in any of the three tasks) and so IVs 3 (Latency to solve a new locus) & 4 (Number of trials to reverse a preference), below, are redundant. Furthermore, we did not include the touchscreen experiment in this manuscript (previously accounted for with IV 5; see the Deviations section). Therefore, despite being listed here in the preregistration as IVs that we proposed to include in the P3b model, in our post-study manuscript we did not include these IVs in the final model. The IVs instead consisted of: Reversal (switch) number, Context (colored tubes, plastic multi-access box, wooden multi-access box) and ID (random effect because there were repeated measures on the same individuals).*
1) Reversal (switch) number
2) Context (colored tubes, plastic multi-access box, wooden multi-access box, touchscreen)
3) Latency to solve a new locus
4) Number of trials to reverse a preference (colored tubes)
5) Number of trials to reverse a preference (touchscreen)
6) ID (random effect because repeated measures on the same individuals)
#### ANALYSIS PLAN
*P3a: repeatable within individuals within a context (reversal learning)*
**Analysis:** Is reversal learning (colored tubes) repeatable within individuals within a context (reversal learning)? We will obtain repeatability estimates that account for the observed and latent scales, and then compare them with the raw repeatability estimate from the null model. The repeatability estimate indicates how much of the total variance, after accounting for fixed and random effects, is explained by individual differences (ID). We will run this GLMM using the MCMCglmm function in the MCMCglmm package [@hadfield2010mcmc] with a Poisson distribution and log link using 13,000 iterations with a thinning interval of 10, a burnin of 3,000, and minimal priors [V=1, nu=0; @hadfield2014coursenotes]. We will ensure the GLMM shows acceptable convergence [i.e., lag time autocorrelation values <0.01; @hadfield2010mcmc], and adjust parameters if necessary.
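Explicitly, the adjusted repeatability we extract from the model is the proportion of the variance remaining after the fixed effects that is attributable to among-individual differences:

$$R = \frac{\sigma^2_{ID}}{\sigma^2_{ID} + \sigma^2_{residual}}$$

where $\sigma^2_{ID}$ is the among-individual (ID) variance and $\sigma^2_{residual}$ is the residual (units) variance; this is the quantity computed from the posterior variance components in the code chunk below.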
*NOTE (Aug 2021): our data checking process showed that the distribution of values of the data (number of trials to reverse) in this model was not a good fit for the Poisson distribution because it was overdispersed and heteroscedastic. However, when log-transformed, the data approximate a normal distribution and pass all of the data checks; therefore, we used a Gaussian distribution for our model, which fits the log-transformed data well.*
To roughly estimate our ability to detect actual effects (because these power analyses are designed for frequentist statistics, not Bayesian statistics), we ran a power analysis in G*Power with the following settings: test family=F tests, statistical test=linear multiple regression: Fixed model (R^2 deviation from zero), type of power analysis=a priori, alpha error probability=0.05. The number of predictor variables was restricted to only the fixed effects because this test was not designed for mixed models. We reduced the power to 0.70 and increased the effect size until the total sample size in the output matched our projected sample size (n=32). The protocol of the power analysis is here:
*Input:*
Effect size f² = 0.21
α err prob = 0.05
Power (1-β err prob) = 0.7
Number of predictors = 1
*Output:*
Noncentrality parameter λ = 6.7200000
Critical F = 4.1708768
Numerator df = 1
Denominator df = 30
Total sample size = 32
Actual power = 0.7083763
This means that, with our sample size of 32, we have a 71% chance of detecting a medium effect [approximated at f^2^=0.15 by @cohen1988statistical].
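For readers who prefer to stay in R, the same a priori calculation can be reproduced (approximately) with the `pwr` package; this is a convenience cross-check we are adding here, not part of the original G*Power protocol, and the analogous power analyses further below differ only in the effect size and the number of predictors.

```{r powerSketch, eval=F}
# Cross-check of the G*Power protocol above using the pwr package (an alternative
# tool, not part of the preregistered protocol). For a fixed-model multiple
# regression F test: u = numerator df (number of predictors),
# v = denominator df (total sample size - u - 1).
library(pwr)
pwr.f2.test(u = 1, v = 30, f2 = 0.21, sig.level = 0.05)
# power is approximately 0.71, matching the G*Power output (total N = u + v + 1 = 32)
```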
```{r repeatableR, eval=F}
d <- read.csv(url("https://raw.githubusercontent.com/corinalogan/grackles/master/Files/Preregistrations/g_flexmanip2rep_data.csv"), header=T, sep=",", stringsAsFactors=F)
#remove NAs from the variables that will be in the model
d <- subset(d,!(is.na(d["TrialsToReverse"])))
d <- subset(d,!(is.na(d["ReverseNumber"])))
#include only those birds in the reversal tubes experiment
d <- d[d$TubesOrTouchscreen=="TUBES" & d$ExperimentalGroup=="Manipulation",]
#factor variable
d$ID <- as.factor(d$ID)
#remove pilot birds
d <- d[!d$ID=="Fajita" & !d$ID=="Empanada",]
#n=9
#length(unique(d$ID))
# DATA CHECKING
# ADDED Aug 2021 - Although our dependent variable (number of trials to reverse) is a count variable, the distribution of values was not appropriate for a poisson regression. When checking the fit of our data to a Poisson model the data were overdispersed and heteroscedastic. However, when log-transformed the data approximate a normal distribution and pass all of the below data checks, indicating the Gaussian model fits our log-transformed data well.
library(DHARMa)
library(lme4)
simulationOutput <- simulateResiduals(fittedModel = glmer(log(TrialsToReverse) ~ ReverseNumber + (1|ID), family=gaussian, data=d), n=250) #250 simulations, but if want higher precision change n>1000; Log transform because trials does not fit a poisson distribution.
plot(simulationOutput$scaledResiduals) #Expect a flat distribution of the overall residuals, and uniformity in y direction if plotted against any predictor. Looks randomly scattered
testDispersion(simulationOutput) #if under- or over-dispersed, then p-value<0.05, but then check the dispersion parameter and try to determine what in the model could be the cause and address it there, also check for zero inflation.
#p=0.84
testZeroInflation(simulationOutput) #compare expected vs observed zeros, not zero-inflated if p>0.05.
#p=1 so not zero inflated
testUniformity(simulationOutput) #check for heteroscedasticity ("a systematic dependency of the dispersion / variance on another variable in the model" Hartig, https://cran.r-project.org/web/packages/DHARMa/vignettes/DHARMa.html), which is indicated if dots aren't on the red line and p<0.05.
#p=0.81 so NOT heteroscedastic
plot(simulationOutput) #...there should be no pattern in the data points in the right panel.
#No significant problems detected
# GLMM
library(MCMCglmm)
prior = list(R=list(R1=list(V=1,nu=0)), G=list(G1=list(V=1,nu=0)))
serial <- MCMCglmm(log(TrialsToReverse) ~ ReverseNumber, random=~ID, family="gaussian", data=d, verbose=F, prior=prior, nitt=1000000, thin=100, burnin=10000)
summary(serial) #changed nitt, etc to get units to converge. samples =
autocorr(serial$Sol) #Did fixed effects converge (<0.1)? yes
autocorr(serial$VCV) #Did random effects converge (<0.1)? yes
plot(serial)
#REPEATABILITY
#In MCMCglmm, the latent scale adjusted repeatability and its credible interval can simply be obtained by: serial$VCV[,ID]/(serial$VCV[,ID]+serial$VCV[,units]) - advice from Maxime Dahirel
repeata <- serial$VCV[,"ID"]/(serial$VCV[,"ID"]+serial$VCV[,"units"]) #latent scale adjusted repeatability and its credible interval
mean(repeata) #0.13
var(repeata) #0.02 variance
posterior.mode(repeata) #<0.01
HPDinterval(repeata, 0.95) #4.64e-16 to 0.43, probability=0.95
## ADDED Aug 2021
# Is 0.13 a statistically significant repeatability? Test whether it is significantly greater than expected at chance by permuting number of trials to reverse among individuals.
# NOTE: Because the flexibility manipulation requires the last two reversals to be less than or equal to 50 trials, and ReverseNumber is significant, indicating birds generally get faster over time, we must permute TrialsToReverse across birds within ReverseNumber.
# NOTE: This permutation takes a looooong time.
results = rep(NA, 1000)
for(i in 1:1000){
tmp1 = data.frame("ID" = d$ID[which(d$ReverseNumber==1)], "ReverseNumber"=d$ReverseNumber[which(d$ReverseNumber==1)],
"TrialsToReverse" = sample(d$TrialsToReverse[which(d$ReverseNumber==1)],replace=F))
tmp2 = data.frame("ID" = d$ID[which(d$ReverseNumber==2)], "ReverseNumber"=d$ReverseNumber[which(d$ReverseNumber==2)],
"TrialsToReverse" = sample(d$TrialsToReverse[which(d$ReverseNumber==2)],replace=F))
tmp3 = data.frame("ID" = d$ID[which(d$ReverseNumber==3)], "ReverseNumber"=d$ReverseNumber[which(d$ReverseNumber==3)],
"TrialsToReverse" = sample(d$TrialsToReverse[which(d$ReverseNumber==3)],replace=F))
tmp4 = data.frame("ID" = d$ID[which(d$ReverseNumber==4)], "ReverseNumber"=d$ReverseNumber[which(d$ReverseNumber==4)],
"TrialsToReverse" = sample(d$TrialsToReverse[which(d$ReverseNumber==4)],replace=F))
tmp5 = data.frame("ID" = d$ID[which(d$ReverseNumber==5)], "ReverseNumber"=d$ReverseNumber[which(d$ReverseNumber==5)],
"TrialsToReverse" = sample(d$TrialsToReverse[which(d$ReverseNumber==5)],replace=F))
tmp6 = data.frame("ID" = d$ID[which(d$ReverseNumber==6)], "ReverseNumber"=d$ReverseNumber[which(d$ReverseNumber==6)],
"TrialsToReverse" = sample(d$TrialsToReverse[which(d$ReverseNumber==6)],replace=F))
tmp7 = data.frame("ID" = d$ID[which(d$ReverseNumber==7)], "ReverseNumber"=d$ReverseNumber[which(d$ReverseNumber==7)],
"TrialsToReverse" = sample(d$TrialsToReverse[which(d$ReverseNumber==7)],replace=F))
tmp8 = data.frame("ID" = d$ID[which(d$ReverseNumber==8)], "ReverseNumber"=d$ReverseNumber[which(d$ReverseNumber==8)],
"TrialsToReverse" = sample(d$TrialsToReverse[which(d$ReverseNumber==8)],replace=F))
tmp = rbind(tmp1,tmp2,tmp3,tmp4,tmp5,tmp6,tmp7,tmp8)
m <- MCMCglmm(log(TrialsToReverse) ~ ReverseNumber, random=~ID, family="gaussian", data=tmp, verbose=F, prior=prior, nitt=1000000, thin=100, burnin=10000)
if(i==1){
rpt <- m$VCV[,"ID"]/(m$VCV[,"ID"]+m$VCV[,"units"]) #latent scale adjusted repeatability and its credible interval from the model
} else{
rpt = c(rpt, m$VCV[,"ID"]/(m$VCV[,"ID"]+m$VCV[,"units"])) #building by row all of the posterior estimates from each iteration of the permutation
}
results[i] = mean(rpt) #average repeatability estimate from each iteration of the permutation
}
hist(results) # distribution of the average repeatability estimates from each iteration of the permutation, which randomizes trials to reverse among birds within reversals
abline(v=0.13,col="red")
sum(results > 0.13)/1000 # proportion of permuted repeatability estimates that exceed the estimate from our actual data (a permutation p-value)
#p = 0.003 - Our repeatability value of 0.13 is significantly greater than that expected if birds are performing randomly in each reversal
write.csv(rpt, "SimulationResults_FlexManipRepeatability.csv") #very large data frame of 9900000 values
## Addressing M. Dahirel's comment in Revision 3 - incorporating variability in estimates of actual and permuted data into determination of significance
# Take all 9.9 million permuted posterior estimates and pair each one with an estimate randomly sampled (with replacement) from the posterior of the actual data. This incorporates all of the variability in each data set.
# Then ask whether the estimate from the actual data is greater than the permuted estimate to get the proportion of actual repeatability estimates that are greater than random.
rpt_permut = as.data.frame(rpt)
rpt_permut$choice = sample(repeata,nrow(rpt_permut),replace = T)
rpt_permut$gt_rand = rpt_permut$choice > rpt_permut$rpt
sum(rpt_permut$gt_rand)
#7811656
7811656/9900000
#0.79
# i.e., a 21% chance that the repeatability estimated from the observed data is not different from random
## Figure 3:
library(tidyr)
library(ggplot2)
rpt.plot = gather(rpt_permut, data.type, estimate, rpt:choice)
rpt.plot$data.type = as.factor(rpt.plot$data.type)
levels(rpt.plot$data.type)[levels(rpt.plot$data.type)=="rpt"] <- "permuted data"
levels(rpt.plot$data.type)[levels(rpt.plot$data.type)=="choice"] <- "actual data"
png(file = "Figure 3", width = 1100, height = 900, units = "px")
ggplot(rpt.plot, aes(x = estimate, fill = data.type))+
geom_histogram(position = "identity", alpha = 0.5) +
geom_vline(xintercept=0.13, color = "red", size=1) +
theme_bw()+
theme(axis.text = element_text(size = 15),
axis.title = element_text(size = 18),
legend.title = element_text(size = 15),
legend.text = element_text(size = 14)) +
scale_fill_discrete(name = "Data type")
dev.off()
### 10 Nov 2022 - The Poisson model was overdispersed and heteroscedastic - per recommendations from the literature and a reviewer, we added observation-level random effects. However, the model would not converge with the preregistered uninformative priors, so we are not pursuing this alternative analysis: ###
# d$OLRE = as.factor(seq_len(nrow(d)))
# simulationOutput <- simulateResiduals(fittedModel = glmer(TrialsToReverse ~ ReverseNumber + (1|ID) + (1|OLRE), family=poisson, data=d), n=250)
# plot(simulationOutput$scaledResiduals)
# #Looks randomly scattered
# testDispersion(simulationOutput) #if under- or over-dispersed, then p-value<0.05, but then check the dispersion parameter and try to determine what in the model could be the cause and address it there, also check for zero inflation.
# #p=0.53
# testZeroInflation(simulationOutput) #compare expected vs observed zeros, not zero-inflated if p>0.05.
# #p=1 No zeros, so not zero inflated
# testUniformity(simulationOutput) #check for heteroscedasticity ("a systematic dependency of the dispersion / variance on another variable in the model" Hartig, https://cran.r-project.org/web/packages/DHARMa/vignettes/DHARMa.html), which is indicated if dots aren't on the red line and p<0.05. p=0.78 so NOT heteroscedastic
# plot(simulationOutput) #...there should be no pattern in the data points in the right panel.
# #No significant problems detected
#
# prior2 = list(R=list(R1=list(V=1,nu=0)), G=list(G1=list(V=1,nu=0),G2=list(V=1,nu=0)))
# serial2 <- MCMCglmm(TrialsToReverse ~ ReverseNumber, random=~ID + OLRE, family="poisson", data=d, verbose=F, prior=prior, nitt=1000000, thin=1000, burnin=5000)
# ## ERROR - mixed model equations singular: use a (stronger) prior ##
# summary(serial2)
# autocorr(serial2$Sol)
# autocorr(serial2$VCV)
# plot(serial2)
#
# #REPEATABILITY
#
# #In MCMCglmm, the latent scale adjusted repeatability and its credible interval can simply be obtained by: serial$VCV[,ID]/(serial$VCV[,ID]+serial$VCV[,units]) - advice from Maxime Dahirel
# repeata2 <- serial2$VCV[,"ID"]/(serial2$VCV[,"ID"]+serial2$VCV[,"OLRE"]) #latent scale adjusted repeatability and its credible interval
# mean(repeata2)
# var(repeata2)
# posterior.mode(repeata2)
# HPDinterval(repeata2, 0.95)
#
#
# # ADDED Aug 2021
# #Test whether repeatability value is significantly greater than expected at chance by permuting number of trials to reverse among individuals.
# #NOTE: Because the flexibility manipulation requires the last two reversals to be less than or equal to 50 trials, and ReverseNumber is significant, indicating birds generally get faster over time, we must permute TrialsToReverse across birds within ReverseNumber.
#
# results2 = rep(NA, 1000)
# for(i in 1:1000){
# tmp1 = data.frame("ID" = d$ID[which(d$ReverseNumber==1)], "ReverseNumber"=d$ReverseNumber[which(d$ReverseNumber==1)],
# "TrialsToReverse" = sample(d$TrialsToReverse[which(d$ReverseNumber==1)],replace=F))
# tmp2 = data.frame("ID" = d$ID[which(d$ReverseNumber==2)], "ReverseNumber"=d$ReverseNumber[which(d$ReverseNumber==2)],
# "TrialsToReverse" = sample(d$TrialsToReverse[which(d$ReverseNumber==2)],replace=F))
# tmp3 = data.frame("ID" = d$ID[which(d$ReverseNumber==3)], "ReverseNumber"=d$ReverseNumber[which(d$ReverseNumber==3)],
# "TrialsToReverse" = sample(d$TrialsToReverse[which(d$ReverseNumber==3)],replace=F))
# tmp4 = data.frame("ID" = d$ID[which(d$ReverseNumber==4)], "ReverseNumber"=d$ReverseNumber[which(d$ReverseNumber==4)],
# "TrialsToReverse" = sample(d$TrialsToReverse[which(d$ReverseNumber==4)],replace=F))
# tmp5 = data.frame("ID" = d$ID[which(d$ReverseNumber==5)], "ReverseNumber"=d$ReverseNumber[which(d$ReverseNumber==5)],
# "TrialsToReverse" = sample(d$TrialsToReverse[which(d$ReverseNumber==5)],replace=F))
# tmp6 = data.frame("ID" = d$ID[which(d$ReverseNumber==6)], "ReverseNumber"=d$ReverseNumber[which(d$ReverseNumber==6)],
# "TrialsToReverse" = sample(d$TrialsToReverse[which(d$ReverseNumber==6)],replace=F))
# tmp7 = data.frame("ID" = d$ID[which(d$ReverseNumber==7)], "ReverseNumber"=d$ReverseNumber[which(d$ReverseNumber==7)],
# "TrialsToReverse" = sample(d$TrialsToReverse[which(d$ReverseNumber==7)],replace=F))
# tmp8 = data.frame("ID" = d$ID[which(d$ReverseNumber==8)], "ReverseNumber"=d$ReverseNumber[which(d$ReverseNumber==8)],
# "TrialsToReverse" = sample(d$TrialsToReverse[which(d$ReverseNumber==8)],replace=F))
# tmp = rbind(tmp1,tmp2,tmp3,tmp4,tmp5,tmp6,tmp7,tmp8)
# m2 <- MCMCglmm(TrialsToReverse ~ ReverseNumber, random=~ID + OLRE, family="poisson", data=tmp, verbose=F, prior=prior, nitt=10000, thin=50, burnin=500)
# rpt2 <- m2$VCV[,"ID"]/(m2$VCV[,"ID"]+m2$VCV[,"OLRE"]) #latent scale adjusted repeatability and its credible interval
# results2[i] = mean(rpt2)
# }
# #hist(results)
# #abline(v=,col="red")
# #sum(results > )/1000
# #p = -
# FIGURE 2 - performance on serial reversal learning
d$ReverseNumber = as.factor(d$ReverseNumber)
d$ID = as.factor(as.character(d$ID))
maxrev = aggregate(as.numeric(ReverseNumber) ~ ID, data = d, FUN = "max")
colnames(maxrev)[2] = "MaxRev"
d = merge(d, maxrev, by = "ID",all = T)
png(file = "Figure 2", width = 1700, height = 900, units = "px")
ggplot(d, aes(x=reorder(ID,MaxRev),y=TrialsToReverse,shape=ReverseNumber,color = ID))+
geom_point(size = 9, stroke = 4, position = position_dodge(width = 0.35)) +
scale_shape_manual(values = c(0:6,8,15,17,19))+
theme_bw() +
theme(axis.text = element_text(size = 20),
axis.title = element_text(size = 24),
legend.title = element_text(size = 20),
legend.text = element_text(size = 20),
legend.key.height = unit(1.4,"cm")) +
labs(x = "Bird", y = "Trials to pass reversal", shape = "Reversal number") +
scale_colour_discrete(guide = "none")
dev.off()
# results = as.data.frame(results)
# f<-ggplot(results, aes(x=results)) +
# theme_bw() +
# geom_histogram(color="black", fill="light grey",binwidth = 0.01) +
# geom_vline(xintercept = 0.13, color = "red") +
# theme(axis.text = element_text(size = 10),
# axis.title = element_text(size = 16)) +
# labs(x = "Repeatability", y = "Frequency")
# library(ggpubr)
# fig2 = ggarrange(p,f,
# labels = c("A","B"),
# ncol = 1, nrow = 2)
# WE DID NOT end up using the code below because the above gave us what we needed
# Repeatability on the data/observed scale (accounting for fixed effects)
#code from Supplementary Material S2 from Villemereuil et al. 2018 J Evol Biol
# vf <- sapply(1:nrow(serial[["Sol"]]), function(i) {
# var(predict(serial, it=i))
# }) #estimates for each iteration of the MCMC
#
# repeataF <- (vf+serial$VCV[,"ID"])/(vf+serial$VCV[,"ID"]+serial$VCV[,"units"]) #latent scale adjusted + data scale
# posterior.mode(repeataF) #0.998
# HPDinterval(repeataF, 0.95) #0.992 to 0.9998, probability=0.952
#
# # Now compare with the raw repeatability: null model. NOTE: we shouldn't run this one because the reversal was a manipulation so the reverse number must be included
# serialraw <- MCMCglmm(TrialsToReverse ~ 1, random=~ID, family="poisson", data=d, verbose=F, prior=prior, nitt=50000, thin=100, burnin=25000)
# #summary(serialraw)
#
# repeataraw <- serialraw$VCV[,"ID"]/(serialraw$VCV[,"ID"]+serialraw$VCV[,"units"]) #latent scale adjusted repeatability and its credible interval
# posterior.mode(repeataraw) # -0.00002
# HPDinterval(repeataraw, 0.95) #7.2e-16 to 0.18, probability=0.952
```
*P3a alternative: was the potential lack of repeatability on colored tube reversal learning due to motivation or hunger?*
**Analysis:** Because the independent variables could influence each other or measure the same variable, we will analyze them in a single model: a Generalized Linear Mixed Model [GLMM; MCMCglmm function, MCMCglmm package; @hadfield2010mcmc] with a binomial distribution (called categorical in MCMCglmm) and logit link using 13,000 iterations with a thinning interval of 10, a burnin of 3,000, and minimal priors (V=1, nu=0) [@hadfield2014coursenotes]. We will ensure the GLMM shows acceptable convergence [lag time autocorrelation values <0.01; @hadfield2010mcmc], and adjust parameters if necessary. The contribution of each independent variable will be evaluated using the Estimate in the full model. *NOTE (Apr 2021): This analysis is restricted to data from their first reversal because this is the only reversal data that is comparable across the manipulated and control groups.*
To roughly estimate our ability to detect actual effects (because these power analyses are designed for frequentist statistics, not Bayesian statistics), we ran a power analysis in G*Power with the following settings: test family=F tests, statistical test=linear multiple regression: Fixed model (R^2 deviation from zero), type of power analysis=a priori, alpha error probability=0.05. We reduced the power to 0.70 and increased the effect size until the total sample size in the output matched our projected sample size (n=32). The number of predictor variables was restricted to only the fixed effects because this test was not designed for mixed models. The protocol of the power analysis is here:
*Input:*
Effect size f² = 0.31
α err prob = 0.05
Power (1-β err prob) = 0.7
Number of predictors = 4
*Output:*
Noncentrality parameter λ = 11.4700000
Critical F = 2.6684369
Numerator df = 4
Denominator df = 32
Total sample size = 37
Actual power = 0.7113216
This means that, with our sample size of 32, we have a 71% chance of detecting a large effect [approximated at f^2^=0.35 by @cohen1988statistical].
```{r motivation, eval=F}
d <- read.csv(url("https://raw.githubusercontent.com/corinalogan/grackles/master/Files/Preregistrations/g_flexmanip_data_reverseraw.csv"), header=T, sep=",", stringsAsFactors=F)
d <- d[d$Reversal==1,]
#want only data from reversal 1 (their first reversal) because this is the only reversal data that is comparable across birds in the control and manipulated groups
head(d)
# DATA CHECKING
library(DHARMa)
library(lme4)
simulationOutput <- simulateResiduals(fittedModel = glmer(CorrectChoice ~ Trial + LatencyToChoose + MinSinceFoodRemoved + NumberRewardsFromPrevTrials + (1|ID) + (1|Batch), family=binomial, data=d), n=250) #250 simulations, but if want higher precision change n>1000
plot(simulationOutput$scaledResiduals) #Expect a flat distribution of the overall residuals, and uniformity in y direction if plotted against any predictor
testDispersion(simulationOutput) #if under- or over-dispersed, then p-value<0.05, but then check the dispersion parameter and try to determine what in the model could be the cause and address it there, also check for zero inflation
testZeroInflation(simulationOutput) #compare expected vs observed zeros, not zero-inflated if p>0.05
testUniformity(simulationOutput) #check for heteroscedasticity ("a systematic dependency of the dispersion / variance on another variable in the model" Hartig, https://cran.r-project.org/web/packages/DHARMa/vignettes/DHARMa.html), which is indicated if dots aren't on the red line and p<0.05. Also...
plot(simulationOutput) #...there should be no pattern in the data points in the right panel
plotResiduals(simulationOutput, form = d$LatencyToChoose) #plot the residuals against a specific predictor (here the latency to choose)
# GLMM - Is trial the main independent variable associated with learning performance (CorrectChoice) or are other variables associated with performance, including motivation and hunger?
library(MCMCglmm)
prior = list(R=list(R1=list(V=1,nu=0)), G=list(G1=list(V=1,nu=0),G2=list(V=1,nu=0))) #one R structure for the single residual variance and one G structure per random effect (ID, Batch)
rr1 <- MCMCglmm(CorrectChoice ~ Trial + LatencyToChoose + MinSinceFoodRemoved + NumberRewardsFromPrevTrials, random=~ID+Batch, family="categorical", data=d, verbose=F, prior=prior, nitt=13000, thin=10, burnin=3000)
summary(rr1)
autocorr(rr1$Sol) #Did fixed effects converge?
autocorr(rr1$VCV) #Did random effects converge?
```
*P3b: individual consistency across contexts*
**Analysis:** Do those individuals that are faster to reverse a color preference also have lower latencies to switch to new options on the multi-access box? A Generalized Linear Mixed Model [GLMM; MCMCglmm function, MCMCglmm package; @hadfield2010mcmc] will be used with a Poisson distribution and log link using 13,000 iterations with a thinning interval of 10, a burnin of 3,000, and minimal priors (V=1, nu=0) [@hadfield2014coursenotes]. We will ensure the GLMM shows acceptable convergence [lag time autocorrelation values <0.01; @hadfield2010mcmc], and adjust parameters if necessary. We will determine whether an independent variable had an effect or not using the Estimate in the full model.
To roughly estimate our ability to detect actual effects (because these power analyses are designed for frequentist statistics, not Bayesian statistics), we ran a power analysis in G\*Power with the following settings: test family=F tests, statistical test=linear multiple regression: Fixed model (R^2 deviation from zero), type of power analysis=a priori, alpha error probability=0.05. We reduced the power to 0.70 and increased the effect size until the total sample size in the output matched our projected sample size (n=32). The number of predictor variables was restricted to only the fixed effects because this test was not designed for mixed models. The protocol of the power analysis is here:
*Input:*
Effect size f² = 0.21
α err prob = 0.05
Power (1-β err prob) = 0.7
Number of predictors = 1
*Output:*
Noncentrality parameter λ = 6.7200000
Critical F = 4.1708768
Numerator df = 1
Denominator df = 30
Total sample size = 32
Actual power = 0.7083763
This means that, with our sample size of 32, we have a 71% chance of detecting a medium effect [approximated at f^2^=0.15 by @cohen1988statistical].
```{r consistent1, eval=F, echo=F, message=FALSE, warning=FALSE}
# ADDED JULY 2022 - we realized we mis-specified the model for assessing repeatability of flexibility across contexts. Previously we were analyzing the relationship between overall performance on the reversal learning and MAB tasks, rather than individual consistency of performance. Consequently we conducted an analysis that evaluated performance as a function of task type, reversal/switch number and a random effect of ID.
library(lubridate)
library(DHARMa)
library(lme4)
rev <- read.csv(url("https://raw.githubusercontent.com/corinalogan/grackles/master/Files/Preregistrations/g_flexmanip2rep_data_xcontext.csv"), header=T, sep=",", stringsAsFactors=F) # data file with the latency to switch a preference (in seconds) for the reversal learning and MAB tasks. "cent.rev" is the reversal number variable, scaled and centered to facilitate model fit.
rev <- rev[!rev$ID == "Diablo",] #remove Diablo who did not pass criterion on solving any MAB loci and so we can't measure solution switching
simulationOutput <- simulateResiduals(fittedModel = glmer(log(Dur.sec) ~
    Test + cent.rev + (1 | ID), family = gaussian, data = rev), n = 500) #500 simulations; increase n (e.g., >1000) for higher precision
plot(simulationOutput$scaledResiduals) #Expect a flat distribution of the overall residuals, and uniformity in y direction if plotted against any predictor. The Poisson model did not look good; after log-transforming the response and using a Gaussian model, the residuals look randomly scattered
testDispersion(simulationOutput) #if under- or over-dispersed, then p-value<0.05, but then check the dispersion parameter and try to determine what in the model could be the cause and address it there, also check for zero inflation.
#p = 0.78
testZeroInflation(simulationOutput) #compare expected vs observed zeros, not zero-inflated if p>0.05.
#p=1 - there are no zeros, so not zero inflated
testUniformity(simulationOutput) #check for heteroscedasticity ('a systematic dependency of the dispersion / variance on another variable in the model' Hartig, https://cran.r-project.org/web/packages/DHARMa/vignettes/DHARMa.html), which is indicated if dots aren't on the red line and p<0.05.
#p = 0.07 not heteroscedastic
plot(simulationOutput) #...there should be no pattern in the data points in the right panel.
#REPEATABILITY
# GLMM
library(MCMCglmm)
prior = list(R=list(R1=list(V=1,nu=0)), G=list(G1=list(V=1,nu=0)))
xcontx <- MCMCglmm(log(Dur.sec) ~ Test + cent.rev, random=~ID, family="gaussian", data=rev, verbose=F, prior=prior, nitt=1000000, thin=1000, burnin=5000)
autocorr(xcontx$Sol) #Did fixed effects converge (<0.1)? yes
autocorr(xcontx$VCV) #Did random effects converge (<0.1)? yes
plot(xcontx) # all traces look great
#In MCMCglmm, the latent scale adjusted repeatability and its credible interval can simply be obtained by: serial$VCV[,ID]/(serial$VCV[,ID]+serial$VCV[,units]) - advice from Maxime Dahirel
repeata <- xcontx$VCV[,"ID"]/(xcontx$VCV[,"ID"]+xcontx$VCV[,"units"]) #latent scale adjusted repeatability and its credible interval
mean(repeata) #0.36
var(repeata) #0.02 variance
posterior.mode(repeata) #0.40
HPDinterval(repeata, 0.95) #0.10-0.64
library(rptR) # check significance of repeatability using rptR package
xcontx.rpt = rpt(log(Dur.sec) ~ Test + cent.rev + (1|ID), grname = "ID",
data = rev, datatype="Gaussian", nboot=1000, npermut=100)
xcontx.rpt
# R = 0.36, CI: 0.11-0.58, p = 0.01
# Almost exactly the same results for R and the CI as from the MCMCglmm model above.
## PLOTTING RESULTS ##
library(tidyverse)
library(ggplot2)
rev = rev %>%
group_by(Test) %>%
mutate(scale.dur = scale(Dur.sec, center=T, scale=T)) %>%
ungroup()
rev.all3 = rev %>%
group_by(ID, Test) %>%
summarise(median = median(scale.dur),
lcl = quantile(scale.dur, probs=0.25),
ucl = quantile(scale.dur, probs=0.75)) %>%
ungroup() %>%
mutate(Test = recode(Test, "LatencyMABplastic"="MAB plastic",
"LatencyMABwooden"="MAB wood",
"Reversal" = "Reversal learning")) #%>%
# FIGURE 4 - performance across all tests
png(file = "Figure 4",width = 1300, height = 900, units = "px")
ggplot(rev.all3, aes(x = reorder(ID, median, mean), y = median, colour = ID, shape = Test)) +
geom_pointrange(aes(ymin=lcl, ymax=ucl), size = 1.5,
position = position_dodge(width = 0.5)) +
scale_shape_manual(values = c(15,17,8,19))+
labs(x = "Bird", y = "Latency to switch (scaled and centered seconds)") +
theme_bw() +
theme(axis.text = element_text(size = 15),
axis.title = element_text(size = 18),
legend.title=element_text(size=15),
legend.text=element_text(size=14)) +
scale_fill_discrete(guide = "none") +
scale_colour_discrete(guide = "none")
dev.off()
## CODE for previous (mis-specified) analysis ##
# d <- read.csv(url("https://raw.githubusercontent.com/corinalogan/grackles/master/Files/Preregistrations/g_flexmanip_data_reverse.csv"), header=T, sep=",", stringsAsFactors=F)
# setwd("~/Documents/Grackle project")
# d = read.csv("g_flexmanip_data_repeatability.csv")
# #remove NAs from the variables that will be in the model
# d <- subset(d,!(is.na(d["TrialsToReverse"])))
# d <- subset(d,!(is.na(d["ReverseNumber"])))
#
# #include only those birds in the serial reversal tubes experiment, and not values for the initial discrimination
# d <- d[d$ExperimentalGroup=="Manipulation" & !d$Test == "InitialDisc",]
# d <- d[d$TubesOrTouchscreen == "TUBES" & d$ExperimentalGroup == "Manipulation",
# ]
#
# #factor variable
# d$ID <- as.factor(d$ID)
#
# #remove Guacamole & Memela who did not pass the reversal experiment and therefore were not offered the MAB experiments
# d <- d[!d$ID=="Guacamole" & !d$ID=="Memela",]
# #remove Diablo and Chalupa who did not pass criterion on solving any MAB loci and so we can't measure solution switching
# d <- d[!d$ID=="Chalupa" & !d$ID == "Diablo",]
# #remove pilot birds
# d <- d[!d$ID == "Fajita" & !d$ID == "Empanada", ]
#
# #n=6
# #length(unique(d$ID))
#
# # Data checking
# library(DHARMa)
# library(lme4)
# simulationOutput <- simulateResiduals(fittedModel = glmer(log(Value) ~
# Test + ReverseNumber + (1 | ID), family = gaussian, data = d), n = 250) #250 simulations, but if want higher precision change n>1000; Log transform because trials does not fit a poisson distribution.
# plot(simulationOutput$scaledResiduals) #Expect a flat distribution of the overall residuals, and uniformity in y direction if plotted against any predictor. Looks randomly scattered
# testDispersion(simulationOutput) #if under- or over-dispersed, then p-value<0.05, but then check the dispersion parameter and try to determine what in the model could be the cause and address it there, also check for zero inflation. p=0.90
# testZeroInflation(simulationOutput) #compare expected vs observed zeros, not zero-inflated if p>0.05. p=1 - there are no zeros, so not zero inflated
# testUniformity(simulationOutput) #check for heteroscedasticity ('a systematic dependency of the dispersion / variance on another variable in the model' Hartig, https://cran.r-project.org/web/packages/DHARMa/vignettes/DHARMa.html), which is indicated if dots aren't on the red line and p<0.05. p=0.25 so NOT heteroscedastic
# plot(simulationOutput) #...there should be no pattern in the data points in the right panel.
#
# # GLMM color reversal tubes compared with multi-access box
# library(MCMCglmm)
# prior = list(R = list(R1 = list(V = 1, nu = 0), R2 = list(V = 1, nu = 0)), G = list(G1 = list(V = 1,
# nu = 0)))
#
# # plastic
# rm <- MCMCglmm(LatencyMABplastic ~ ReverseNumber * TrialsToReverse, random = ~ID,
# family = "poisson", data = d, verbose = F, prior = prior,
# nitt = 130000, thin = 1000, burnin = 30000)
# #summary(rm)
# # post.mean l-95% CI u-95% CI eff.samp pMCMC
# #(Intercept) 2.08708 -4.45451 11.67734 100 0.66
# #ReverseNumber 1.01476 -2.75484 5.49974 100 0.42
# #TrialsToReverse 0.01693 -0.09999 0.11593 100 0.58
# #ReverseNumber:TrialsToReverse -0.01159 -0.07061 0.03283 100 0.42
# #nothing significant so no consistent individual differences across contexts on MAB plastic and trials to reverse
#
# # wooden
# rmw <- MCMCglmm(LatencyMABwooden ~ ReverseNumber * TrialsToReverse, random = ~ID,
# family = "poisson", data = d, verbose = F, prior = prior,
# nitt = 130000, thin = 1000, burnin = 30000)
# #summary(rmw)
# # post.mean l-95% CI u-95% CI eff.samp pMCMC
# #(Intercept) 3.622381 0.148743 7.810863 159.0 0.08 .
# #ReverseNumber 0.211605 -1.843271 2.126334 100.0 0.88
# #TrialsToReverse 0.032183 -0.019718 0.076067 147.9 0.14
# #ReverseNumber:TrialsToReverse -0.004685 -0.037464 0.014299 100.0 0.62
# #nothing significant so no consistent individual differences across contexts on MAB wooden and trials to reverse
#
#
# # Make a table with the outputs from both models (following https://gkhajduk.github.io/2017-10-25-cleanMCMCglmm/)
# library(dplyr)
#
# # for 1 model
# clean.MCMC <- function(x) {
# sols <- summary(x)$solutions ## pull out relevant info from model summary
# Gcovs <- summary(x)$Gcovariances
# Rcovs <- summary(x)$Rcovariances
# fixed <- data.frame(row.names(sols), sols, row.names = NULL) ## convert to dataframes with the row.names as the first col
# random <- data.frame(row.names(Gcovs), Gcovs, row.names = NULL)
# residual <- data.frame(row.names(Rcovs), Rcovs, row.names = NULL)
# names(fixed)[names(fixed) == "row.names.sols."] <- "variable" ## change the columns names to variable, so they all match
# names(random)[names(random) == "row.names.Gcovs."] <- "variable"
# names(residual)[names(residual) == "row.names.Rcovs."] <- "variable"
# fixed$effect <- "fixed" ## add ID column for type of effect (fixed, random, residual)
# random$effect <- "random"
# residual$effect <- "residual"
# modelTerms <- as.data.frame(bind_rows(fixed, random, residual)) # merge it all together
# }
# check for one model - it works
#oneModel <- clean.MCMC(rmw) # get all the info from summary(modelName)
#oneModel$modelName <- getName.MCMC(rmw) # add the model's name in a new column
#oneModel # check out the created dataframe
# check for multiple models - it works
# dataList <- list(rm, rmw)
# dataListNames <- list("Plastic", "Wooden")
# readyList <- mapply(cbind, lapply(dataList, clean.MCMC), "modelName" = dataListNames, SIMPLIFY = F)
# mcmcOutputs <- as.data.frame(do.call(rbind, readyList), stringsAsFactors = FALSE)
#
# # NOTE: for html, change to type='html', for pdf, change to type='latex' (https://stackoverflow.com/questions/14670299/using-stargazer-with-rstudio-and-knitr , stargazer cheatsheet: https://www.jakeruss.com/cheatsheets/stargazer/#html-formatting)
# library(stargazer)
# stargazer(mcmcOutputs, summary=FALSE, header=FALSE, type='latex', digits=1)
```
\newpage
## [REFERENCES](MyLibrary.bib)