# Mutation Types and Genomic Element Types in `shadie`

## Mutation Types
Define your mutation types first, using the `MutationType` class. `MutationType` requires at least 3 arguments:
* dominance (float): dominance coefficient of the mutation in diploids
* distribution (str): distribution from which the fitness effect is drawn **to add: how to access a list and explanation of distributions**
* params (float): one or more additional arguments that define the distribution; the number of arguments depends on the distribution

`shadie` defaults:
```python
NEUT = MutationType(0.5, "f", 0.0)        #neutral mutation
SYN = MutationType(0.5, "f", 0.0)         #synonymous
DEL = MutationType(0.1, "g", -0.03, 0.2)  #deleterious
BEN = MutationType(0.8, "e", 0.1)         #beneficial
```

In [3]:
from shadie import MutationType

In [27]:
#"f" is fixed fitness effect (no distribution), so takes a single argument
mut1 = MutationType(0.5, "f", 0.1)

#"e" is an exponential distribution that takes a single argument
mut5 = MutationType(0.5, "e", 0.02)

#"g", "n", and "w" take 2 arguments: 
mut2 = MutationType(0.1, "g", -0.03, 0.2)  #gamma
mut3 = MutationType(0.5, "n", 0.05, 0.1)   #normal
mut4 = MutationType(0.5, "w", -.01, 1.5)    #weibull
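
To make the distribution codes concrete, here is a minimal standard-library sketch of how a single fitness effect could be drawn for each code. It assumes that `shadie` passes the parameters through to SLiM unchanged and so follows SLiM's parameterization: `"e"` takes a mean, `"n"` a mean and standard deviation, `"g"` a mean and shape, and `"w"` a scale and shape. The helper `draw_effect` is hypothetical and is not part of `shadie`.

```python
import math
import random
import statistics


def draw_effect(rng, distribution, *params):
    """Draw one fitness effect; hypothetical helper, not shadie's API."""
    if distribution == "f":      # fixed effect: the single parameter is the value
        (value,) = params
        return value
    if distribution == "e":      # exponential, parameterized by its mean
        (mean,) = params
        return rng.expovariate(1.0 / mean)
    if distribution == "n":      # normal: mean, standard deviation
        mean, sd = params
        return rng.gauss(mean, sd)
    if distribution == "g":      # gamma: mean, shape; a negative mean flips the sign
        mean, shape = params
        scale = abs(mean) / shape
        return math.copysign(rng.gammavariate(shape, scale), mean)
    if distribution == "w":      # Weibull: scale, shape
        scale, shape = params
        return rng.weibullvariate(scale, shape)
    raise ValueError(f"unknown distribution code: {distribution!r}")


rng = random.Random(42)
gamma_draws = [draw_effect(rng, "g", -0.03, 0.2) for _ in range(20000)]
print(statistics.fmean(gamma_draws))   # sample mean, close to -0.03
print(abs(-0.03) / math.sqrt(0.2))     # closed-form sd of this gamma
```

Under these assumptions the closed-form standard deviation of the gamma type, `|mean| / sqrt(shape)`, is about 0.0671, which agrees with the value that `inspect()` reports for `mut2` later in this document.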



Assign each mutation type you create to a variable so that you can save it to a `MutationList`, which your simulation requires. The variable name is saved as a "name" in the `MutationList`, and you can refer to each mutation by this name *or* by its `shadie`-assigned `idx` when creating your genomic element types.

The repr lists the mutation settings:

In [4]:
mut4

<MutationType: m8, 0.5,w, (5.5, 2.4)>

Notice that your mutation type now has an `idx` (in the format m#). This is a unique id that `shadie` uses to keep track of mutations. It may not start at `m1`, as some default mutation types have already been defined by `shadie`. 

If you'd like more information about your mutation, you can use the `inspect()` function to visualize your mutation distribution and see the parameters listed explicitly:

In [29]:
mut5.inspect()

**Mutation Type**
idx: m45
dominance coefficient: 0.5
distribution: e
distribution parameters: (0.02,)
Distribution plot: (plot not shown)
mean: 0.02
standard deviation: 0.0004

## Save your custom mutations to a list
Create a `MutationList` object that contains all the mutations you would like to use in your simulation. You will need to pass this list when you initiate the script. If you are using custom mutations in addition to the defaults, you must include those defaults in your new list.

In [6]:
from shadie import MutationList

In [7]:
#list containing custom mutations
mylist = MutationList(mut1, mut2, mut3, mut4)
mylist

<MutationList: ['m5', 'm6', 'm7', 'm8']>

In [33]:
from shadie import globals
mymixedlist = MutationList(mut1, mut2, mut3, mut4, mut5, globals.BEN, globals.NEUT, globals.DEL)
mymixedlist

<MutationList: ['m44', 'm46', 'm47', 'm48', 'm45', 'm4', 'm1', 'm3']>

**Note:** all neutral mutations in `shadie` are overlaid *after* the SLiM simulation has run. For this reason, you should not create any custom neutral mutations.

However, every genomic element type needs at least one mutation type, so your non-coding regions still need a neutral mutation type. It is highly recommended that you use the `shadie` default `NONCOD`, which sets the mutation rate to 0. If you create your own instead, make sure you set the mutation rate to 0 (see how to do this below).

## Genomic Element Types
Each genomic element type describes a region of the chromosome. It is helpful to think of these regions as "non-coding", "exon", "intron", etc. 
Each type requires 2 lists of equal length:
1. A list of mutations that can occur in that kind of region (e.g. only neutral mutations can occur in non-coding regions)
2. A list of relative frequencies, one for each mutation in the first list

`shadie` defaults:
```python
EXON = ElementType([SYN, DEL, BEN], (2,8,0.1))  #exon
INTRON = ElementType([SYN,DEL], (9,1))          #intron
NONCOD = ElementType(NEUT, 1)                   #non-coding
```
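
Because the frequencies are relative, only their ratios matter. The sketch below is plain Python and does not call `shadie`; `relative_to_proportions` is a hypothetical helper. It checks that the two lists have equal length and converts the default `EXON` weights `(2, 8, 0.1)` into proportions.

```python
def relative_to_proportions(mutations, frequencies):
    """Pair each mutation name with its share of new mutations."""
    if len(mutations) != len(frequencies):
        raise ValueError("mutations and frequencies must have the same length")
    total = sum(frequencies)
    return {name: freq / total for name, freq in zip(mutations, frequencies)}


shares = relative_to_proportions(["SYN", "DEL", "BEN"], (2, 8, 0.1))
for name, share in shares.items():
    print(f"{name}: {share:.3f}")
```

With these weights, roughly 19.8% of new exon mutations would be synonymous, 79.2% deleterious, and 1% beneficial.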

In [35]:
from shadie import ElementType

In [50]:
#ElementType accepts two lists or tuples (a list and a tuple can be mixed, as in ex1)
ex1 = ElementType([mut1, mut2, mut5], (.1, 9, 1))
ex2 = ElementType([mut3, mut4], [1, 9])
ex1, ex2

(<ElementType: 'None', g14, ['m44', 'm46', 'm45'], [0.1, 9, 1], mmJukesCantor(1e-06/3),
 <ElementType: 'None', g15, ['m47', 'm48'], [1, 9], mmJukesCantor(1e-06/3))

You can also access `shadie` `ElementType` globals. It is highly recommended you use the `NONCOD` default for your non-coding regions. 

In [39]:
ncdef = globals.NONCOD
ncdef

<ElementType: 'None', g3, ['m1'], [1], mmJukesCantor(0/3)

However, if you want to create your own, make sure to pass `mutationrate = 0`:

In [42]:
nc1 = ElementType(globals.NEUT, 1, mutationrate = 0)
nc1

<ElementType: 'None', g8, ['m1'], [1], mmJukesCantor(0/3)

The same goes for introns: if you want them to be neutral, set the mutation rate to 0. However, you may want a slightly deleterious mutation to occur in introns. In that case, there is no need to add a neutral mutation to your intron, because neutral mutations will be overlaid after the simulation.

In [45]:
in1 = ElementType(globals.NEUT, 1, mutationrate = 0) #neutral intron
in2 = ElementType([mut4], [1]) #intron in which deleterious mutations can occur. 
                               #Neutral mutations will also occur here
in1, in2

(<ElementType: 'None', g12, ['m1'], [1], mmJukesCantor(0/3),
 <ElementType: 'None', g13, ['m48'], [1], mmJukesCantor(1e-06/3))
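
The `mmJukesCantor(1e-06/3)` at the end of each repr appears to correspond to SLiM's `mmJukesCantor()` function, which builds a 4x4 nucleotide mutation matrix in which every off-diagonal entry equals its argument. Assuming that reading is right, passing one third of the mutation rate gives each nucleotide the full rate in total, and `0/3` switches mutation off. The sketch below reproduces such a matrix in plain Python; `jukes_cantor_matrix` is a hypothetical name, not a `shadie` or SLiM function.

```python
def jukes_cantor_matrix(mutation_rate):
    """Return a 4x4 matrix (rows/columns A, C, G, T) with rate/3 off the diagonal."""
    alpha = mutation_rate / 3
    return [[0.0 if i == j else alpha for j in range(4)] for i in range(4)]


default = jukes_cantor_matrix(1e-06)
neutral_off = jukes_cantor_matrix(0)
print(sum(default[0]))       # total rate of mutation away from A
print(sum(neutral_off[0]))   # 0.0 when the mutation rate is 0
```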

Once you have created all the genomic element types you want in your chromosome, add them to an `ElementList`. The first argument of your `ElementList` is the `MutationList` you made above. This saves both lists so that `shadie` can use them to write the SLiM script. `shadie` will also double-check that all the mutations in your `ElementList` are in your `MutationList`.

In [46]:
from shadie import ElementList

In [49]:
myellist = ElementList(mymixedlist, ex1, ex2, in1, in2, nc1)
myellist

<ElementList: ['g5', 'g6', 'g12', 'g13', 'g8']>
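
The consistency check described above can be pictured as a set-membership test. This is only an illustrative sketch in plain Python, not `shadie`'s actual implementation; `missing_mutations` and the ids `g99` and `m99` are hypothetical, while the other ids are copied from the outputs shown earlier.

```python
def missing_mutations(mutation_ids, element_types):
    """Map each element type to the mutation ids absent from the mutation list."""
    known = set(mutation_ids)
    problems = {}
    for element_id, used in element_types.items():
        absent = [idx for idx in used if idx not in known]
        if absent:
            problems[element_id] = absent
    return problems


mutation_ids = ["m44", "m46", "m47", "m48", "m45", "m4", "m1", "m3"]
element_types = {
    "g14": ["m44", "m46", "m45"],
    "g15": ["m47", "m48"],
    "g12": ["m1"],
}
print(missing_mutations(mutation_ids, element_types))           # {}
print(missing_mutations(mutation_ids, {"g99": ["m1", "m99"]}))  # {'g99': ['m99']}
```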

You can double-check that everything looks good using the `inspect()` function:

In [53]:
myellist.inspect()

**Genomic Element List**
Element types: (<ElementType: 'None', g5, ['m44', 'm46', 'm45'], [0.1, 9, 1], mmJukesCantor(1e-06/3), <ElementType: 'None', g6, ['m47', 'm48'], [1, 9], mmJukesCantor(1e-06/3), <ElementType: 'None', g12, ['m1'], [1], mmJukesCantor(0/3), <ElementType: 'None', g13, ['m48'], [1], mmJukesCantor(1e-06/3), <ElementType: 'None', g8, ['m1'], [1], mmJukesCantor(0/3))
Mutation types: <MutationList: ['m44', 'm46', 'm47', 'm48', 'm45', 'm4', 'm1', 'm3']>

**Genomic Element Type**
name: g5
alternate name: None
mutations: ['m44', 'm46', 'm45']
frequencies: [0.1, 9, 1]

**Mutation Type**
idx: m44
dominance coefficient: 0.5
distribution: f
distribution parameters: (0.1,)
Distribution plot:
**NONE: fixed fitness effect = 0.1**

**Mutation Type**
idx: m46
dominance coefficient: 0.1
distribution: g
distribution parameters: (-0.03, 0.2)
Distribution plot: (plot not shown)
mean: -0.03
standard deviation: 0.0670820393249937

**Mutation Type**
idx: m45
dominance coefficient: 0.5
distribution: e
distribution parameters: (0.02,)
Distribution plot: (plot not shown)
mean: 0.02
standard deviation: 0.0004

**Genomic Element Type**
name: g6
alternate name: None
mutations: ['m47', 'm48']
frequencies: [1, 9]

**Mutation Type**
idx: m47
dominance coefficient: 0.5
distribution: n
distribution parameters: (0.05, 0.1)
Distribution plot: (plot not shown)
mean: 0.05
standard deviation: 0.1

**Mutation Type**
idx: m48
dominance coefficient: 0.5
distribution: w
distribution parameters: (-0.01, 1.5)
Distribution plot: (plot not shown)
mean: -0.004807498567691361
standard deviation: Nan

**Genomic Element Type**
name: g12
alternate name: None
mutations: ['m1']
frequencies: [1]

**Mutation Type**
idx: m1
dominance coefficient: 0.5
distribution: f
distribution parameters: (0.0,)
Distribution plot:
**NONE: fixed fitness effect = 0.0**

**Genomic Element Type**
name: g13
alternate name: None
mutations: ['m48']
frequencies: [1]

**Mutation Type**
idx: m48
dominance coefficient: 0.5
distribution: w
distribution parameters: (-0.01, 1.5)
Distribution plot: (plot not shown)
mean: -0.004807498567691361
standard deviation: Nan

**Genomic Element Type**
name: g8
alternate name: None
mutations: ['m1']
frequencies: [1]

**Mutation Type**
idx: m1
dominance coefficient: 0.5
distribution: f
distribution parameters: (0.0,)
Distribution plot:
**NONE: fixed fitness effect = 0.0**
