# Building a Chromosome in `shadie`

You can find a detailed description of how chromosomes work in SLiM in the official SLiM manual; here we cover only the basics needed to build a chromosome in `shadie`.

Chromosomes in SLiM are made of genomic elements. Recall that each genomic element is defined by the kinds of mutations that can occur within it. Chromosomes can be as simple or as complex as you want. Keep in mind that a simple chromosome reveals dynamics at only a limited number of loci, but it is much easier to interpret and will usually let the simulation run faster than a complex one.
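To make the idea of "a chromosome is an ordered set of genomic elements" concrete, here is a minimal sketch in plain Python. The `Element` class, the mutation labels, and all coordinates are hypothetical and chosen only for illustration; this is not how `shadie` or SLiM stores a chromosome.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """Hypothetical stand-in for a genomic element (not a shadie class)."""
    kind: str         # e.g. "noncds", "exon", "intron"
    start: int        # first base, inclusive
    end: int          # last base, inclusive
    mutations: tuple  # labels of the mutation types allowed in this element

    @property
    def length(self) -> int:
        return self.end - self.start + 1


# A simple chromosome: one gene (exon, intron, exon) flanked by non-coding DNA.
simple_chrom = [
    Element("noncds", 1, 2000, ("neutral",)),
    Element("exon", 2001, 2500, ("deleterious", "beneficial")),
    Element("intron", 2501, 3500, ("neutral",)),
    Element("exon", 3501, 4000, ("deleterious", "beneficial")),
    Element("noncds", 4001, 6000, ("neutral",)),
]

total_length = sum(e.length for e in simple_chrom)
coding_length = sum(e.length for e in simple_chrom if e.kind == "exon")
print(total_length, coding_length)
```

With only five elements it is easy to see which loci can carry non-neutral mutations, which is the interpretability advantage of a simple chromosome.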

## Using `shadie` defaults
`shadie` has a default chromosome, which consists of a gene in the middle (two exons and one intron), flanked by two non-coding regions. 

In [1]:
import shadie
default_chrom = shadie.chromosome.default()

You can visualize the genomic elements on a chromosome using the `draw()` function:

In [2]:
default_chrom.draw();

The `Chromosome` class object in `shadie` offers several ways to define a chromosome. The lowest-effort method is the `shadie.chromosome.random()` function, which attempts to create a 'realistic' chromosome of a given length, in which exons and non-coding regions are interspersed and introns always occur between exons.

Arguments:

- genome size (int): length of chromosome in base pairs
- intron (list): list of ElementTypes that can be used as introns
- exon (list): list of ElementTypes that can be used as exons
- noncds (list): list of ElementTypes that can be used as non-coding sequences
- intron_scale (int): average length of each intron
- cds_scale (int): average length of each exon
- noncds_scale (int): average length of each non-coding sequence
- seed (int): use to generate the same chromosome consistently - otherwise a different random chromosome will be generated each time

Use `shadie.chromosome.random??` to see the defaults.

In [3]:
# create a random chromosome; if no arguments are supplied, defaults are used
random_chrom = shadie.chromosome.random()
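The effect of the `*_scale` and `seed` arguments can be illustrated with a small standalone sketch. It is not `shadie`'s actual algorithm: the function `sketch_random_layout`, the exponential length distribution, and the numeric values below are all assumptions made for illustration. It only reproduces the rules stated above (exons and non-coding regions interspersed, introns only between exons, same seed gives the same chromosome).

```python
import random


def sketch_random_layout(genome_size, noncds_scale, cds_scale, intron_scale, seed=None):
    """Return a list of (kind, start, end) tuples covering bases 1..genome_size."""
    rng = random.Random(seed)  # a fixed seed gives the same layout every time
    scales = {"noncds": noncds_scale, "exon": cds_scale, "intron": intron_scale}
    layout, start, kind = [], 1, "noncds"
    while start <= genome_size:
        # Assumption: lengths are exponential with the given mean (at least 1 bp).
        length = max(1, round(rng.expovariate(1.0 / scales[kind])))
        end = min(start + length - 1, genome_size)
        layout.append((kind, start, end))
        start = end + 1
        if kind == "exon":
            kind = rng.choice(["intron", "noncds"])  # the gene continues or ends
        else:
            kind = "exon"  # non-coding DNA and introns are both followed by an exon
    if layout[-1][0] == "intron":
        # An intron must sit between two exons, so relabel a trailing one.
        _, last_start, last_end = layout[-1]
        layout[-1] = ("noncds", last_start, last_end)
    return layout


layout = sketch_random_layout(
    20000, noncds_scale=5000, cds_scale=1000, intron_scale=1000, seed=42
)
for element in layout[:5]:
    print(element)
```

Calling the function twice with the same `seed` returns identical layouts, while omitting the seed gives a different layout on each call.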

## Visualizing your chromosome
The `draw()` function generates a static `toyplot` of the chromosome. Hover over the different elements to see their metadata.

In [4]:
random_chrom.draw();

The `inspect()` function generates an interactive plot using the `altair` package, which allows you to zoom in on parts of your chromosome. This is useful for long or complex chromosomes. Click and draw in the lower plot to zoom in on a portion of the chromosome. Hover for more info. 

In [5]:
random_chrom.inspect()

The chromosome object has attributes that you can access to check the settings before you begin your simulation:

In [6]:
# mutation types
random_chrom.mutations

[<MutationType: m3, 0.8, e, (0.04,)>,
 <MutationType: m1, 0.5, f, (0.0,)>,
 <MutationType: m2, 0.1, g, (-3.0, 1.5)>]
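Each entry above appears to list the mutation type's name, its dominance coefficient, a one-letter distribution code, and the distribution parameters. In SLiM these codes select the distribution of fitness effects: `f` (fixed value), `e` (exponential, mean), `n` (normal, mean and standard deviation), `g` (gamma, mean and shape), and `w` (Weibull, scale and shape). The sketch below draws selection coefficients for the three types shown, using only Python's `random` module. The helper `draw_selection_coefficient` is hypothetical, and its handling of negative means (draw with the absolute mean, then negate) is an assumption about the intended behavior.

```python
import random


def draw_selection_coefficient(dist, params, rng):
    """Draw one selection coefficient for a SLiM-style distribution code."""
    if dist == "f":  # fixed value
        (value,) = params
        return value
    if dist == "e":  # exponential with the given mean
        (mean,) = params
        sign = -1.0 if mean < 0 else 1.0
        return sign * rng.expovariate(1.0 / abs(mean))
    if dist == "n":  # normal with mean and standard deviation
        mean, sd = params
        return rng.gauss(mean, sd)
    if dist == "g":  # gamma with mean and shape; scale = |mean| / shape
        mean, shape = params
        sign = -1.0 if mean < 0 else 1.0
        return sign * rng.gammavariate(shape, abs(mean) / shape)
    if dist == "w":  # Weibull with scale and shape
        scale, shape = params
        return rng.weibullvariate(scale, shape)
    raise ValueError(f"unknown distribution code: {dist!r}")


rng = random.Random(1)
m1_draw = draw_selection_coefficient("f", (0.0,), rng)             # neutral
m2_draws = [draw_selection_coefficient("g", (-3.0, 1.5), rng) for _ in range(1000)]
m3_draws = [draw_selection_coefficient("e", (0.04,), rng) for _ in range(20000)]
m3_mean = sum(m3_draws) / len(m3_draws)
print(m1_draw, min(m2_draws), round(m3_mean, 3))
```

Read this way, `m1` is neutral, `m2` is always deleterious, and `m3` is mildly beneficial on average.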

In [7]:
# element types
random_chrom.elements

[<ElementType: exon, g1, ['m2', 'm3'], (8, 0.1)>,
 <ElementType: noncds, g3, ['m1'], [1]>]

In [8]:
# dataframe of all the genomic elements
random_chrom.data

| Unnamed: 0 | name | start | end | eltype | script | coding |
|---|---|---|---|---|---|---|
| 0 | noncds | 1 | 9988 | g3 | `<ElementType: noncds, g3, ['m1'], [1]>` | 0 |
| 9988 | exon | 9989 | 11336 | g1 | `<ElementType: exon, g1, ['m2', 'm3'], (8, 0.1)>` | 1 |
| 11336 | noncds | 11337 | 17938 | g3 | `<ElementType: noncds, g3, ['m1'], [1]>` | 0 |
| 17938 | exon | 17939 | 18436 | g1 | `<ElementType: exon, g1, ['m2', 'm3'], (8, 0.1)>` | 1 |
| 18436 | noncds | 18437 | 19635 | g3 | `<ElementType: noncds, g3, ['m1'], [1]>` | 0 |
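A quick sanity check on a table like this is to confirm that the elements tile the chromosome with no gaps or overlaps, and to see how much of it is coding. The sketch below copies the `name`, `start`, `end`, and `coding` values from the output above into plain tuples; it assumes that `start` and `end` are both inclusive, which is consistent with each `start` being one more than the previous `end`.

```python
# (name, start, end, coding) copied from the table above
rows = [
    ("noncds", 1, 9988, 0),
    ("exon", 9989, 11336, 1),
    ("noncds", 11337, 17938, 0),
    ("exon", 17939, 18436, 1),
    ("noncds", 18437, 19635, 0),
]

# Any pair of neighbours where the next start is not previous end + 1.
gaps = [
    (prev[2], cur[1])
    for prev, cur in zip(rows, rows[1:])
    if cur[1] != prev[2] + 1
]

lengths = [end - start + 1 for _, start, end, _ in rows]
coding_bases = sum(n for n, row in zip(lengths, rows) if row[3] == 1)
coding_fraction = coding_bases / sum(lengths)

print("gaps or overlaps:", gaps)
print("total length:", sum(lengths))
print("coding fraction:", round(coding_fraction, 3))
```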


In [13]:
random_chrom.review("chromosome")

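AttributeError: 'ChromosomeRandom' object has no attribute 'review'

As the error shows, the `ChromosomeRandom` object returned by `shadie.chromosome.random()` has no `review()` method; use `draw()` or `inspect()`, shown above, to examine it.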

## Creating a Custom Chromosome
Finally, you can define your own chromosome by providing a pandas dataframe that contains, at minimum, the `name` of each genomic element (as defined by you *or* using `shadie` defaults), its `start base`, and its `end base`. Alternatively, you can provide the internal `idx` of the genomic element in place of the `name`. You must also provide a GenomeList class object, and every genomic element you use must be defined in that list.
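When laying out a custom chromosome by hand, it is often easier to think in element lengths than in coordinates. The sketch below converts an ordered list of `(name, length)` pairs into rows with inclusive `start` and `end` positions. The helper `lengths_to_table` is hypothetical, and the column names `name`, `start`, and `end` are assumed from the `data` table shown earlier; check which column names your version of `shadie` expects before passing such a table to it.

```python
from itertools import accumulate


def lengths_to_table(segments, first_base=1):
    """Turn ordered (name, length) pairs into rows with inclusive coordinates."""
    cumulative = accumulate(length for _, length in segments)
    table = []
    for (name, length), total in zip(segments, cumulative):
        end = first_base + total - 1
        table.append({"name": name, "start": end - length + 1, "end": end})
    return table


segments = [
    ("noncds", 2000),
    ("exon", 500),
    ("intron", 1000),
    ("exon", 500),
    ("noncds", 2000),
]
table = lengths_to_table(segments)
for row in table:
    print(row)
```

A list of dictionaries like `table` can be turned into a dataframe with `pandas.DataFrame(table)`.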

### Interactive Plot
You can use the `review()` function to generate an interactive plot to inspect your chromosome structure further. Draw a region on the bottom plot to view that region in the top plot. 

In [10]:
default_chrom.

SyntaxError: invalid syntax (419111714.py, line 1)

## Random Chromosome with Custom Mutation Types & Genomic Element Types 

In [11]:
from shadie import ElementList
from shadie import ElementType
from shadie import MutationType
from shadie import MutationList

# create your custom mutation types and save them to a MutationList
mut1 = MutationType(0.5, "f", 0)
mut2 = MutationType(0.5, "e", 0.4)
mut3 = MutationType(0.5, "n", 0.4, 0.1)
mut4 = MutationType(0.5, "w", 0.3, 0.2)
mut5 = MutationType(0.5, "g", -0.4, 0.1)

mutlist = MutationList(mut1, mut2, mut3, mut4, mut5)

# create your custom genomic element types and save them to an ElementList
noncod = ElementType(mut1, 1, altname="nc")
exon1 = ElementType([mut2, mut5], [1, 1], altname="ex1")
exon2 = ElementType([mut2, mut3, mut4], [9, 1, 0.02], altname="ex2")
intron1 = ElementType([mut2, mut5], [1, 1], altname="int1")
intron2 = ElementType([mut2, mut5], [1, 1], altname="int2")

mycustomlist = ElementList(mutlist, noncod, exon1, exon2, intron1, intron2)

ImportError: cannot import name 'ElementList' from 'shadie' (/Users/elissa/code/git/hacks/shadie/shadie/__init__.py)
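The second argument to each `ElementType` above is a list of numbers paired with the mutation types. In SLiM, the proportions given to a genomic element type are relative weights rather than probabilities, so `[9, 1, 0.02]` does not need to sum to 1. Assuming `shadie` passes these values through to SLiM unchanged, the sketch below shows the fraction of new mutations each type would receive; `mutation_fractions` is a hypothetical helper, not part of `shadie`.

```python
def mutation_fractions(names, weights):
    """Normalize relative weights into the fraction of mutations per type."""
    total = sum(weights)
    return {name: weight / total for name, weight in zip(names, weights)}


# Weights taken from the `exon2` definition above.
fractions = mutation_fractions(["mut2", "mut3", "mut4"], [9, 1, 0.02])
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.4f}")
```

Under this reading, roughly 90% of mutations in `exon2` would be of type `mut2`, about 10% of type `mut3`, and about 0.2% of type `mut4`.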

In [12]:
from shadie import Build

# initialize the Build class object
custom_build = Build(
    exons=[exon1, exon2],
    introns=[intron1, intron2],
    noncoding=[noncod],
    elementlist=mycustomlist,
)

ImportError: cannot import name 'Build' from 'shadie' (/Users/elissa/code/git/hacks/shadie/shadie/__init__.py)

In [18]:
#run the same random function
custom_build.random()

01:50 | INFO    | random          | Gene added
01:50 | INFO    | random          | Gene added
01:50 | INFO    | random          | Gene added
01:50 | INFO    | random          | Gene added
01:50 | INFO    | random          | Gene added
01:50 | INFO    | random          | Gene added
01:50 | INFO    | random          | Gene added
01:50 | INFO    | random          | Gene added
01:50 | INFO    | random          | Gene added
01:50 | INFO    | random          | Gene added
01:50 | INFO    | random          | Gene added
01:50 | INFO    | random          | Gene added
01:50 | INFO    | random          | Gene added
01:50 | INFO

In [19]:
from shadie import Chromosome

customized_chrom = Chromosome(genome = custom_build)

In [20]:
customized_chrom.review("elements")

Genomic Elements:


| Unnamed: 0 | name | start | finish | eltype | script | type |
|---|---|---|---|---|---|---|
| 0 | nc | 0 | 3847 | g5 | `'g5', c(m5),c(1), mmJukesCantor(1e-06/3)` | noncoding |
| 3848 | ex1 | 3848 | 4063 | g6 | `'g6', c(m6, m9),c(1, 1), mmJukesCantor(1e-06/3)` | exon |
| 4064 | int1 | 4064 | 4639 | g8 | `'g8', c(m6, m9),c(1, 1), mmJukesCantor(1e-06/3)` | intron |
| 4640 | ex2 | 4640 | 4900 | g7 | `'g7', c(m6, m7, m8),c(9, 1, 0.02), mmJukesCant...` | exon |
| 4901 | nc | 4901 | 5780 | g5 | `'g5', c(m5),c(1), mmJukesCantor(1e-06/3)` | noncoding |
| ... | ... | ... | ... | ... | ... | ... |
| 999020 | int2 | 999020 | 999305 | g9 | `'g9', c(m6, m9),c(1, 1), mmJukesCantor(1e-06/3)` | intron |
| 999306 | ex1 | 999306 | 999556 | g6 | `'g6', c(m6, m9),c(1, 1), mmJukesCantor(1e-06/3)` | exon |
| 999557 | int1 | 999557 | 1000162 | g8 | `'g8', c(m6, m9),c(1, 1), mmJukesCantor(1e-06/3)` | intron |
| 1000163 | ex1 | 1000163 | 1000346 | g6 | `'g6', c(m6, m9),c(1, 1), mmJukesCantor(1e-06/3)` | exon |


In [21]:
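import pandas as pd

# NOTE: `final_chromosome` is not defined in the cells above; the output below
# came from a chromosome whose `genome` dataframe has `name` and `type` columns.
# Print each element's name, falling back to its type when no name is set.
for index, row in final_chromosome.genome.iterrows():
    if pd.isna(row["name"]):
        print(row["type"])
    else:
        print(row["name"])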

noncoding
exon
intron
exon
noncoding
exon
noncoding
exon
intron
exon
intron
exon
noncoding
exon
intron
exon
intron
exon
intron
exon
intron
exon
intron
exon
intron
exon
intron
exon
intron
exon
intron
exon
intron
exon
intron
exon
intron
exon
noncoding
exon
intron
exon
noncoding
exon
intron
exon
intron
exon
intron
exon
intron
exon
intron
exon
intron
exon
noncoding
exon
intron
exon
intron
exon
intron
exon
intron
exon
noncoding
exon
noncoding
exon
noncoding
exon
intron
exon
intron
exon
intron
exon
intron
exon
intron
exon
noncoding
exon
intron
exon
noncoding
exon
intron
exon
intron
exon
intron
exon
noncoding
exon
intron
exon
noncoding
exon
intron
exon
intron
exon
intron
exon
intron
exon
intron
exon
intron
exon
intron
exon
noncoding
exon
intron
exon
noncoding
exon
intron
exon
intron
exon
noncoding
exon
intron
exon
intron
exon
intron
exon
intron
exon
noncoding
exon
intron
exon
noncoding
exon
intron
exon
noncoding
exon
intron
exon
intron
exon
intron
exon
intron
exon
intron
exon
intron
exon
nonc

In [22]:
customized_chrom.review("chromosome")

Chromosome Summary

- # of Genes: 220
- Average # exons per gene: 3.8727272727272726
- Average exon length: 259.46596244131456 nt
- Average # introns per gene: 2.8727272727272726
- Average introns length: 452.40822784810126 nt

Static Chromosome Plot:
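Summary statistics of this kind can be derived from the ordered element table. The sketch below is not the code behind `review("chromosome")`; it is an illustration that assumes a "gene" is a maximal run of exons and introns between non-coding elements and that `start` and `finish` are inclusive. The input uses the first five rows of the genomic elements table shown earlier, and `summarize` is a hypothetical helper.

```python
from itertools import groupby


def summarize(elements):
    """Summarize ordered (kind, start, finish) tuples with inclusive coordinates."""
    # Group consecutive elements into "gene" runs (anything that is not noncoding).
    runs = groupby(elements, key=lambda e: e[0] != "noncoding")
    genes = [list(run) for is_gene, run in runs if is_gene]

    def mean_length(kind):
        lengths = [finish - start + 1 for k, start, finish in elements if k == kind]
        return sum(lengths) / len(lengths) if lengths else 0.0

    n_genes = len(genes)
    n_exons = sum(1 for gene in genes for kind, _, _ in gene if kind == "exon")
    return {
        "genes": n_genes,
        "exons_per_gene": n_exons / n_genes if n_genes else 0.0,
        "mean_exon_length": mean_length("exon"),
        "mean_intron_length": mean_length("intron"),
    }


elements = [
    ("noncoding", 0, 3847),
    ("exon", 3848, 4063),
    ("intron", 4064, 4639),
    ("exon", 4640, 4900),
    ("noncoding", 4901, 5780),
]
summary = summarize(elements)
print(summary)
```

Because every intron sits between two exons, each gene has exactly one fewer intron than exons, which matches the averages reported above (3.87 exons and 2.87 introns per gene).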



In [23]:
customized_chrom.review("interactive")

Interactive altair chromosome map:
