# Mutational Parameters

## All estimates

We surveyed the literature for estimates of the deleterious mutation rate, $U$, and the mean deleterious effect of a mutation, $s$. We obtained 35 estimates of $U$ and 51 estimates of $s$ from 23 species and 35 studies (counts as printed by the cells below). When a single study reported multiple estimates for one species based on independent data sets, we used the median of those estimates.
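
As an illustration of this collapsing rule, the sketch below takes the median within each study and species. The rows and the names `raw` and `collapsed` are hypothetical; only the column names follow the data file used in this notebook.

```python
import pandas as pd

# Hypothetical raw estimates: one row per independent data set.
raw = pd.DataFrame({
    'source':  ['Study A', 'Study A', 'Study A', 'Study B'],
    'species': ['Species X', 'Species X', 'Species X', 'Species X'],
    'U':       [0.010, 0.030, 0.020, 0.050],
    's':       [0.10, 0.20, 0.40, 0.05],
})

# One row per (study, species): the median of that study's estimates.
collapsed = raw.groupby(['source', 'species'], as_index=False)[['U', 's']].median()
print(collapsed)
```

Study A contributes a single row holding the medians of its three data sets, while Study B, with one data set, is unchanged.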

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [8]:
data = pd.read_csv('data/mutation_parameters.csv')
# Halve s for the rows flagged in the boolean 'hom' column, then drop the flag.
data.loc[data['hom'], 's'] = data['s'] / 2
del data['hom']
print('n =', pd.notna(data['U']).sum(), 'estimates of U')
print('n =', pd.notna(data['s']).sum(), 'estimates of s')
data

n = 35 estimates of U
n = 51 estimates of s


,species,group,mutator,U,s,method,source,notes
0,Vesicular Stomatitis Virus,virus,False,1.16,0.0022,BM,Elena & Moya 1999,median of 3 experiments
1,Vesicular Stomatitis Virus,virus,False,,0.191,direct mutagenesis,Sanjuan et al. 2004,
2,Tobacco Etch Virus,virus,False,,0.411,direct mutagenesis,Carrasco et al. 2007,
3,PhiX174 bacteriophage,virus,False,,0.512,direct mutagenesis,Vale et al. 2012,average for 2 hosts
4,PhiX174 bacteriophage,virus,False,,0.051,MA,Domingo-Calap et al. 2009,
5,G4 bacteriophage,virus,False,,0.072,MA,Domingo-Calap et al. 2009,
6,F1 bacteriophage,virus,False,,0.027,MA,Domingo-Calap et al. 2009,
7,Qbeta bacteriophage,virus,False,,0.04,MA,Domingo-Calap et al. 2009,
8,SP bacteriophage,virus,False,,0.057,MA,Domingo-Calap et al. 2009,
9,MS2 bacteriophage,virus,False,,0.046,MA,Domingo-Calap et al. 2009,
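
The assignment in the loading cell relies on boolean-mask indexing: only rows where `hom` is true are overwritten, and the right-hand side is aligned on the row index before assignment. A toy sketch of the same pattern follows; the values are made up, and `hom` is assumed to be a boolean column.

```python
import pandas as pd

# Made-up values; 'hom' marks the rows whose s should be halved.
toy = pd.DataFrame({'s': [0.10, 0.20, 0.30], 'hom': [True, False, True]})

# Only the flagged rows (index 0 and 2) are overwritten; row 1 is untouched.
toy.loc[toy['hom'], 's'] = toy['s'] / 2

# Drop the flag column once it has been applied.
toy = toy.drop(columns='hom')
print(toy)
```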


In [9]:
spp = data['species'].value_counts()
print('n =', len(spp), 'species')
spp

n = 23 species


Drosophila melanogaster       9
Escherichia coli              6
Caenorhabditis elegans        5
Saccharomyces cerevisiae      4
Chlamydomonas reinhardtii     3
PhiX174 bacteriophage         3
Daphnia pulicaria             2
Qbeta bacteriophage           2
Vesicular Stomatitis Virus    2
F1 bacteriophage              2
Arabidopsis thaliana          2
phi6 bacteriophage            1
Tobacco Etch Virus            1
G4 bacteriophage              1
Daphnia arenata               1
Tetrahymena thermophila       1
Oscheius myriophila           1
MS2 bacteriophage             1
SP bacteriophage              1
Caenorhabditis briggsae       1
Daktulosphaira vitifoliae     1
Amsinckia douglasiana         1
Amsinckia gloriosa            1
Name: species, dtype: int64

In [10]:
ref = data['source'].value_counts()
print('n =', len(ref), 'studies')
ref

n = 35 studies


Domingo-Calap et al. 2009     8
Garcia-Dorado et al. 1998     4
Baer et al. 2005              3
Deng & Lynch 1997             2
Robert et al. 2018            2
Vassilieva et al. 2000        2
Schoen 2005                   2
Zeyl & DeVisser 2001          2
Burch et al. 2007             1
Long et al. 2013              1
Kraemer et al. 2017           1
Downie 2003                   1
Matsuba et al. 2012           1
Schultz et al. 1999           1
Elena et al. 1998             1
Carrasco et al. 2007          1
Sanjuan et al. 2004           1
Loewe et al. 2003             1
Morgan et al. 2014            1
Joseph & Hall 2004            1
Elena & Moya 1999             1
Bondel et al. 2019            1
Garcia-Dorado et al. 1999     1
Latta et al. 2012             1
Keightley & Caballero 1997    1
Trindade et al. 2010          1
Wloch et al. 2001             1
Kibota & Lynch 1996           1
Vale et al. 2012              1
Fry & Heinsohn 2002           1
Chavarrías et al. 2001        1
Peris et

### Summary statistics

Below are the median and interquartile range (25th and 75th percentiles, in brackets) for $U$ and $s$ across all estimates.

In [11]:
# Drop missing values first: np.percentile does not skip NaN.
U = data['U'].dropna()
print('U =', U.median().round(3), np.percentile(U, [25, 75]).round(3))
s = data['s'].dropna()
print('s =', s.median().round(3), np.percentile(s, [25, 75]).round(3))

U = 0.023 [0.005 0.074]
s = 0.061 [0.03  0.116]
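
The same summary can be wrapped in a small helper. This is a minimal sketch: the name `median_iqr` is hypothetical, and it uses `Series.quantile`, which skips missing values and uses linear interpolation by default, so no explicit filtering is needed.

```python
import numpy as np
import pandas as pd

def median_iqr(x, ndigits=3):
    """Return (median, 25th percentile, 75th percentile), ignoring NaN."""
    x = pd.Series(x, dtype=float)
    # quantile() skips NaN and interpolates linearly between order statistics.
    q25, q50, q75 = x.quantile([0.25, 0.5, 0.75])
    return tuple(round(float(v), ndigits) for v in (q50, q25, q75))

# Toy input with one missing value.
example = median_iqr([1, 2, 3, 4, np.nan])
print(example)
```

With the notebook's data, `median_iqr(data['U'])` and `median_iqr(data['s'])` would be the corresponding calls.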


Three of the estimates were obtained from mutator genotypes:

In [12]:
data[data['mutator']]

,species,group,mutator,U,s,method,source,notes
17,Escherichia coli,bacteria,True,0.005,0.03,ML,Trindade et al. 2010,deficient in the mismatch repair gene mutS
19,Escherichia coli,bacteria,True,0.139,0.0032,Novel,Robert et al. 2018,mutH s from Table S7
23,Saccharomyces cerevisiae,eukaryote,True,0.0228,0.015,BM,Zeyl & DeVisser 2001,excluding petites


In [15]:
# Estimates with U close to the overall median (0.023)
data[(data['U'] > .01) & (data['U'] < .03)]

,species,group,mutator,U,s,method,source,notes
23,Saccharomyces cerevisiae,eukaryote,True,0.0228,0.015,BM,Zeyl & DeVisser 2001,excluding petites
29,Daphnia pulicaria,eukaryote,False,0.015,0.082,BM,Latta et al. 2012,median of 3 experiments
32,Caenorhabditis elegans,eukaryote,False,0.015,0.109,ML,Vassilieva et al. 2000,"Productivity, generation 214"
35,Caenorhabditis elegans,eukaryote,False,0.02,0.06,BM,Matsuba et al. 2012,median of all lines
40,Drosophila melanogaster,eukaryote,False,0.022,0.0955,MD,Garcia-Dorado et al. 1998,data from Mukai et al. 1972
41,Drosophila melanogaster,eukaryote,False,0.02,0.113,MD,Garcia-Dorado et al. 1998,data from Ohnishi 1997
51,Daktulosphaira vitifoliae,eukaryote,False,0.0234,0.116,BM,Downie 2003,"Day 29 survivorship, average of 2 generations"


In [23]:
# Estimates with s close to the overall median (0.061)
data[(data['s'] > .05) & (data['s'] < .07)]

,species,group,mutator,U,s,method,source,notes
4,PhiX174 bacteriophage,virus,False,,0.051,MA,Domingo-Calap et al. 2009,
8,SP bacteriophage,virus,False,,0.057,MA,Domingo-Calap et al. 2009,
20,Saccharomyces cerevisiae,eukaryote,False,0.000126,0.061,ML,Joseph & Hall 2004,
25,Amsinckia douglasiana,eukaryote,False,0.161,0.0515,ML,Schoen 2005,beta = infinity average of two traits
34,Tetrahymena thermophila,eukaryote,False,0.0094,0.0545,BM,Long et al. 2013,Single GE lines
35,Caenorhabditis elegans,eukaryote,False,0.02,0.06,BM,Matsuba et al. 2012,median of all lines
39,Drosophila melanogaster,eukaryote,False,0.032,0.0515,MD,Garcia-Dorado et al. 1998,data from Fernandez & Lopez-Fanjul 1996
47,Drosophila melanogaster,eukaryote,False,0.06026,0.0565,BM,Charlesworth et al. 2004,median of 3 experiments


In [28]:
# Estimates with both s and U inside their (rounded) interquartile ranges
data[(data['s'] >= .03) & (data['s'] <= .116) & (data['U'] >= .005) & (data['U'] <= .074)]

,species,group,mutator,U,s,method,source,notes
13,phi6 bacteriophage,virus,False,0.03,0.093,ML,Burch et al. 2007,MLE nDEL = 56 correcting for selection
17,Escherichia coli,bacteria,True,0.005,0.03,ML,Trindade et al. 2010,deficient in the mismatch repair gene mutS
29,Daphnia pulicaria,eukaryote,False,0.015,0.082,BM,Latta et al. 2012,median of 3 experiments
31,Caenorhabditis elegans,eukaryote,False,0.0052,0.105,ML,Keightley & Caballero 1997,Productivity
32,Caenorhabditis elegans,eukaryote,False,0.015,0.109,ML,Vassilieva et al. 2000,"Productivity, generation 214"
34,Tetrahymena thermophila,eukaryote,False,0.0094,0.0545,BM,Long et al. 2013,Single GE lines
35,Caenorhabditis elegans,eukaryote,False,0.02,0.06,BM,Matsuba et al. 2012,median of all lines
38,Caenorhabditis briggsae,eukaryote,False,0.0495,0.075,BM,Baer et al. 2005,average of 2 experiments
39,Drosophila melanogaster,eukaryote,False,0.032,0.0515,MD,Garcia-Dorado et al. 1998,data from Fernandez & Lopez-Fanjul 1996
40,Drosophila melanogaster,eukaryote,False,0.022,0.0955,MD,Garcia-Dorado et al. 1998,data from Mukai et al. 1972


## Estimates from different species only

The data set includes multiple estimates for some species. To obtain one value per species, we took the median of each species' estimates of $U$ and $s$.

In [13]:
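# Select the numeric columns explicitly: recent pandas versions raise an error
# when median() reaches non-numeric columns such as 'source' or 'notes'.
sppdata = data.groupby('species')[['U', 's']].median()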
print('n =', pd.notna(sppdata['U']).sum(), 'estimates of U')
print('n =', pd.notna(sppdata['s']).sum(), 'estimates of s')
sppdata

n = 16 estimates of U
n = 23 estimates of s


species,U,s
Amsinckia douglasiana,0.161,0.0515
Amsinckia gloriosa,0.198,0.025
Arabidopsis thaliana,0.052,0.1725
Caenorhabditis briggsae,0.0495,0.075
Caenorhabditis elegans,0.015,0.105
Chlamydomonas reinhardtii,0.000816,0.024
Daktulosphaira vitifoliae,0.0234,0.116
Daphnia arenata,0.69,0.035
Daphnia pulicaria,0.5025,0.0935
Drosophila melanogaster,0.052,0.10425


### Summary statistics

Below are the median and interquartile range (25th and 75th percentiles, in brackets) for $U$ and $s$ across species.

In [14]:
# Drop missing values first: np.percentile does not skip NaN.
U = sppdata['U'].dropna()
print('U =', U.median().round(3), np.percentile(U, [25, 75]).round(3))
s = sppdata['s'].dropna()
print('s =', s.median().round(3), np.percentile(s, [25, 75]).round(3))

U = 0.04 [0.008 0.17 ]
s = 0.075 [0.052 0.105]


In [19]:
np.array([600, 30, 100, 900], dtype=int) * .3

array([180.,   9.,  30., 270.])