Code and data implementing the statistical analyses for the Neuron publication, "Genetic background limits generalizability of genotype-phenotype relationships."
This repository contains code and data reproducing our analyses of null allele effects in an F1 panel of 30 inbred strains.
Each F1 panel was obtained by breeding +/- C57BL/6J males with +/+ females from 30 inbred strains. This produced a panel of +/+ and +/- littermates that are isogenic at genetic loci other than the target allele. This breeding design was used to independently generate two cohorts of mice, one for the Cacna1c null allele (n = 723) and one for the Tcf7l2 null allele (n = 630).
We collected a variety of physiological and behavioral phenotype data for each of these panels. Mice in the Cacna1c cohort were tested for anxiety, methamphetamine sensitivity, depression-like behavior, and acoustic startle response. Mice in the Tcf7l2 cohort were tested for several behavioral traits (anxiety, fear conditioning, and sensorimotor gating) and several metabolic traits (body weight, baseline blood glucose, and fasting blood glucose). In total, we obtained data for 15 phenotypes, 12 of which are behavioral. These data are stored in CSV files in the data folder.
The R scripts in the code folder implement analyses to assess and summarize the phenotypic effects of two null alleles expressed from crosses of different inbred strains.
Citing this repository
If the data or code in this repository is useful for your research project, please cite our paper:
L. J. Sittig, P. Carbonetto, K. A. Engel, K. S. Krauss, C. Barrios-Camacho and A. A. Palmer (2016). Genetic background limits generalizability of genotype-phenotype relationships. Neuron 91, 1253--1259.
Code and data licenses
All files in the code folder are free software: you can redistribute them under the terms of the GNU General Public License. All the files in the code folder are part of the source code. This source code is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose. See file LICENSE for the full text of the license.
All files in the data folder are released to the public domain under the Creative Commons Zero (CC0) license. To the extent possible under law, the authors have waived all copyright and related or neighboring rights to these data.
R scripts implementing data analyses
The two main R scripts are gen.all.barplots.R and run.all.anova.R. Running these two scripts will reproduce all the bar charts and ANOVA results given in the paper. Script gen.all.barplots.R requires that the plyr and ggplot2 packages be installed on your computer. Also note that the ANOVA tables in the Neuron paper sometimes show the rows in a different order from what is generated by the R script, but the results (e.g., p-values) within each row should be the same.
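As a rough sketch, the two scripts could be run from an R session as follows. This assumes the working directory is the code folder and that the scripts locate the data files on their own; neither is confirmed here. The helper `missing_packages` is a hypothetical name introduced for illustration and is not part of the repository.

```r
# Packages this README lists as required by gen.all.barplots.R.
required <- c("plyr", "ggplot2")

# Hypothetical helper (not part of the repository): returns the subset of
# `pkgs` that is not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(required)
if (length(to_install) > 0) {
  message("Not installed: ", paste(to_install, collapse = ", "),
          " (see install.packages)")
}

# Source each main script only if it is found in the working directory
# (assumed here to be the code folder).
for (script in c("gen.all.barplots.R", "run.all.anova.R")) {
  if (file.exists(script)) {
    source(script)
  } else {
    message("Script not found in ", getwd(), ": ", script)
  }
}
```

If a required package is reported as missing, install it before sourcing gen.all.barplots.R.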
Physiological and behavioral trait data
File pheno.tcf7l2.csv contains physiological and behavioral phenotype data collected for 630 F1 mice from the Tcf7l2 cohort. File pheno.cacna1c.csv contains physiological and behavioral phenotype data for 723 F1 mice from the Cacna1c cohort. These data are stored in comma-delimited ("csv") format, with one line per sample. Missing entries are marked as "NA", following the convention used in R. Use R function read.pheno in file read.data.R to read these data into an R data frame.
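For a quick look at the files without the repository's own reader, a base R call along the following lines should work. This is a sketch only: `read_pheno_basic` is a hypothetical helper, not the repository's read.pheno, and it does none of the processing that read.pheno may perform. The toy file holds made-up values purely to demonstrate the call.

```r
# Hypothetical helper (not the repository's read.pheno): reads a phenotype
# CSV with base R, treating "NA" entries as missing.
read_pheno_basic <- function(path) {
  read.csv(path, na.strings = "NA", stringsAsFactors = FALSE,
           check.names = FALSE)
}

# Toy file with made-up values, used only to show the call. The real files
# are pheno.tcf7l2.csv and pheno.cacna1c.csv in the data folder.
toy <- tempfile(fileext = ".csv")
writeLines(c("id,strain,sex,TCF7L2,fastglucose",
             "1,A,M,WT,NA",
             "2,A,F,HET,100"), toy)

pheno <- read_pheno_basic(toy)
str(pheno)

# Count missing values in one phenotype column.
n_missing <- sum(is.na(pheno$fastglucose))
print(n_missing)
```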
This table includes the following columns:
id: Unique number assigned to each mouse.
strain: Inbred strain representing the mother of the F1 mouse. This is an abbreviation of the standard inbred strain ID.
sex: Sex of mouse (M = male, F = female).
TCF7L2: WT means wild-type (+/+), and HET means heterozygous (+/-).
CACNA1C: WT means wild-type (+/+), and HET means heterozygous (+/-).
bw2, d50bw: Body weight (in g) measured on day 50 of age.
bw3: Body weight (in g) measured on day 100 of age.
fastglucose: Blood glucose level measured after fasting for 16 hours.
baseglucose: Baseline blood glucose level.
centerduration, d1centerdur: Duration in center of field during open field testing.
pctdurlight: Proportion of total time spent in the light half of the light/dark box.
totalactivity: Exploratory activity measured during the 30-min open field test.
d1totalactivity: Exploratory activity measured during day 1 of the 30-min open field test.
d2totalactivity: Exploratory activity measured during day 2 of the 30-min open field test.
d3.d2totalactivity: Exploratory activity measured on day 3 of the 30-min open field test (after methamphetamine injection) minus exploratory activity measured on day 2 of the 30-min open field test (after saline injection).
FCtimeofday: Time of day at which the fear conditioning test was conducted (either "AM" or "PM").
d1tone: Average proportion of freezing to the tone during the two 30-second intervals (180-210 seconds and 240-270 seconds) on day 1 of the conditioned fear tests.
d2context: Average proportion of freezing in response to the test chamber over the 30-180 second interval on day 2 of the conditioned fear tests.
d3tone: Average proportion of freezing to the tone during the two 30-second intervals (180-210 seconds and 240-270 seconds) on day 3 of the conditioned fear tests.
PPIbox: Apparatus used for PPI behavioral testing.
PPI6: Average of the inhibition intensity, taken as the ratio of the prepulse response during 6-dB prepulse trials to the pulse-alone startle amplitude.
PPIstartle, startle: Average startle response during the pulse-alone trials (with 120-dB pulses).
immobdur: Amount of time (in seconds) spent immobile during the forced swim test.
Since the terminology in the Neuron paper sometimes differs from the column names used in the CSV files, we provide the tables below to make it easier to connect the results generated by the R code with the results presented in the paper:
Cacna1c cohort (pheno.cacna1c.csv)
| phenotype name in manuscript (Table 1) | phenotype name in repository |
|---|---|
| percent time in the light box | pctdurlight |
| duration in center | d1centerdur |
| acoustic startle response | startle |
Tcf7l2 cohort (pheno.tcf7l2.csv)
| phenotype name in manuscript (Table 1) | phenotype name in repository |
|---|---|
| duration in center | centerduration |
| contextual fear learning | d2context |
| cued fear learning | d3tone |
| acoustic startle response | PPIstartle |
| fasting blood glucose levels | fastglucose |
| baseline blood glucose levels | baseglucose |
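When working with the data in R, the same correspondence can be kept as named character vectors. This is only a convenience sketch; the vector names `cacna1c_names` and `tcf7l2_names` are hypothetical and do not appear in the repository code.

```r
# Manuscript (Table 1) names mapped to CSV column names, copied from the
# two tables in this README.
cacna1c_names <- c(
  "percent time in the light box" = "pctdurlight",
  "duration in center"            = "d1centerdur",
  "acoustic startle response"     = "startle")

tcf7l2_names <- c(
  "duration in center"            = "centerduration",
  "contextual fear learning"      = "d2context",
  "cued fear learning"            = "d3tone",
  "acoustic startle response"     = "PPIstartle",
  "fasting blood glucose levels"  = "fastglucose",
  "baseline blood glucose levels" = "baseglucose")

# Look up the CSV column for a manuscript phenotype name.
print(tcf7l2_names[["cued fear learning"]])
```

Note that the same manuscript name can map to different columns in the two cohorts (for example, "duration in center"), so the lookup should always be done per cohort.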
README.md: this file.
pheno.tcf7l2.csv: physiological and behavioral phenotype data collected in the Tcf7l2 cohort, stored in CSV text format. See "Physiological and behavioral trait data" for details.
pheno.cacna1c.csv: physiological and behavioral phenotype data collected in the Cacna1c cohort, stored in CSV text format. See "Physiological and behavioral trait data" for details.
read.data.R: R source code defining functions to read and process data stored in text files.
transformation.functions.R: R source code defining some functions used to apply various transformations to the data.
data.manip.R: R source code defining functions to process and manipulate the phenotype data.
misc.R: R source code defining additional functions used in the statistical analyses.
defaults.tcf7l2.R: R source code specifying the default analysis settings for each phenotype analyzed in the Tcf7l2 cohort.
defaults.cacna1c.R: R source code specifying the default analysis settings for each phenotype analyzed in the Cacna1c cohort.
run.all.anova.R: This R script generates the ANOVA results for all phenotypes analyzed in the Tcf7l2 and Cacna1c cohorts.
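To give a sense of the kind of analysis involved, the following sketch fits a two-way ANOVA with strain, genotype, and their interaction to simulated toy data. It is an assumption that a model of this general form is relevant; the models in run.all.anova.R may include covariates, transformations, or other settings defined in the defaults files, none of which are reproduced here. All variable names and values below are made up.

```r
set.seed(1)

# Simulated toy data (not from the study): 3 strains x 2 genotypes x 10 mice.
toy <- expand.grid(rep      = 1:10,
                   strain   = c("s1", "s2", "s3"),
                   genotype = c("WT", "HET"))
toy$strain   <- factor(toy$strain)
toy$genotype <- factor(toy$genotype, levels = c("WT", "HET"))
toy$y        <- rnorm(nrow(toy))   # placeholder phenotype with no real effect

# Two-way ANOVA: main effects of strain and genotype plus their interaction.
fit <- aov(y ~ strain * genotype, data = toy)
tab <- summary(fit)[[1]]
print(tab)
```

The strain-by-genotype interaction row is the one that speaks to whether the effect of a null allele depends on genetic background.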
For questions and feedback, please contact:
Department of Psychiatry
University of California, San Diego
9500 Gilman Drive
La Jolla, California, USA
Laura J. Sittig
Kyle A. Engel
Kathleen S. Krauss
Abraham A. Palmer