# Building a Chromosome for `shadie`

The simplest way to prepare a SLiM simulation with `shadie` is to let the `Chromosome` class generate a single gene automatically. By default the gene consists of a single exon, although you can also specify the number of exons. You can also set the chromosome length (in base pairs) with the `genome_size` argument.

In [1]:
from shadie import Chromosome


In [2]:
# Create a Chromosome object. If no "genome" argument is supplied,
# the Chromosome class generates a single gene.
one_gene = Chromosome(genome_size=2000)

In [3]:
# The Chromosome object can now be inspected using the review() function
Chromosome.review(one_gene, item="mutations")
Chromosome.review(one_gene, item="eltypes")
Chromosome.review(one_gene, item="elements")

# Simpler method-call syntax
one_gene.review("chromosome")

Mutation Types:
 <MutationList: ['m2', 'm3', 'm4']>

Genomic Element Types:
 <ElementList: ['g1']>

Genomic Elements:


| | type | name | start | finish | eltype | script |
|---|---|---|---|---|---|---|
| 0 | exon | exon | 1 | 1999 | g1 | `<ElementType: 'None', g1, ['m2', 'm3', 'm4'], ...` |


Chromosome Summary
\# of Genes: 1
Average # exons per gene: 1.0
Average exon length: 1999.0 nt
Average # introns per gene: 0.0
Average introns length: 0 nt

Static Chromosome Plot:



## Random Chromosome
You can also use the `Build` class of `shadie` to generate a random chromosome for you:

In [2]:
from shadie import Build

In [3]:
# Create a Build object
random_chromosome = Build()

# Run the random() method to generate the chromosome
random_chromosome.random()

01:35 | INFO    | random          | Gene added

In [6]:
print(random_chromosome.mutationlist)

<MutationList: ['m1', 'm2', 'm3', 'm4']>
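
To make the idea of a randomly generated chromosome concrete, here is a minimal, self-contained sketch that lays out alternating noncoding regions and genes (exons separated by introns) along a chromosome. It does not call `shadie` and is not the algorithm that `Build.random()` uses; the function names `add_element` and `toy_random_chromosome`, the length ranges, and the exon counts are all hypothetical choices made for illustration.

```python
import random


def add_element(elements, kind, start, length, genome_size):
    """Append one element (inclusive coordinates) and return the next free position."""
    # Clip the element so it never runs past the end of the chromosome.
    finish = min(start + length - 1, genome_size - 1)
    elements.append({"type": kind, "start": start, "finish": finish})
    return finish + 1


def toy_random_chromosome(genome_size=20000, seed=1):
    """Toy layout: noncoding stretch, then a gene, repeated until the end."""
    rng = random.Random(seed)  # seeded so the layout is reproducible
    elements = []
    pos = 0
    while pos < genome_size:
        # Intergenic (noncoding) region; length range is an arbitrary assumption.
        pos = add_element(elements, "noncoding", pos, rng.randint(1000, 3000), genome_size)
        # A gene is modelled as exon (intron exon)*.
        n_exons = rng.randint(1, 5)
        for i in range(n_exons):
            if pos >= genome_size:
                break
            pos = add_element(elements, "exon", pos, rng.randint(100, 400), genome_size)
            if i < n_exons - 1 and pos < genome_size:
                pos = add_element(elements, "intron", pos, rng.randint(300, 600), genome_size)
    # The last element is simply clipped, so the final gene may be incomplete.
    return elements


for element in toy_random_chromosome()[:5]:
    print(element)
```

The result has the same shape as the elements table that `review("elements")` prints: contiguous, non-overlapping segments, each labelled with a type and inclusive `start`/`finish` coordinates.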


### Pass the Build Object to the Chromosome Class
`random_chromosome` is a `Build` object that can be passed to the `genome` argument of the `Chromosome` class:

In [7]:
final_chromosome = Chromosome(genome = random_chromosome)
final_chromosome.review("elements")

Genomic Elements:

| | type | name | start | finish | eltype | script |
|---|---|---|---|---|---|---|
| 0 | noncoding | | 0 | 3091 | g3 | `<ElementType: 'None', g3, ['m1'], [1], mmJukes...` |
| 3092 | exon | | 3092 | 3280 | g1 | `<ElementType: 'None', g1, ['m2', 'm3', 'm4'], ...` |
| 3281 | intron | | 3281 | 3670 | g2 | `<ElementType: 'None', g2, ['m2', 'm3'], [9, 1]...` |
| 3671 | exon | | 3671 | 4076 | g1 | `<ElementType: 'None', g1, ['m2', 'm3', 'm4'], ...` |
| 4077 | noncoding | | 4077 | 5279 | g3 | `<ElementType: 'None', g3, ['m1'], [1], mmJukes...` |
| ... | ... | ... | ... | ... | ... | ... |
| 995715 | intron | | 995715 | 996132 | g2 | `<ElementType: 'None', g2, ['m2', 'm3'], [9, 1]...` |
| 996133 | exon | | 996133 | 996319 | g1 | `<ElementType: 'None', g1, ['m2', 'm3', 'm4'], ...` |
| 996320 | intron | | 996320 | 996748 | g2 | `<ElementType: 'None', g2, ['m2', 'm3'], [9, 1]...` |
| 996749 | exon | | 996749 | 996978 | g1 | `<ElementType: 'None', g1, ['m2', 'm3', 'm4'], ...` |


In [8]:
final_chromosome.review("chromosome")

Chromosome Summary
\# of Genes: 197
Average # exons per gene: 4.395939086294416
Average exon length: 262.08891454965357 nt
Average # introns per gene: 3.3959390862944163
Average introns length: 448.51270553064273 nt
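
The summary statistics above can be reproduced by hand from an elements table. The sketch below is an assumption about how such numbers could be computed, not `shadie`'s own code. It treats `start` and `finish` as inclusive coordinates (consistent with the single-gene example, where an exon spanning 1 to 1999 is reported as 1999.0 nt) and assumes that consecutive genes are always separated by a noncoding element. The function name `summarize` is hypothetical, and the input rows are the first five rows of the elements table shown earlier.

```python
def summarize(elements):
    """Summarize a list of element dicts with 'type', 'start' and 'finish' keys."""
    n_genes = 0
    in_gene = False
    exon_lengths = []
    intron_lengths = []
    for element in elements:
        if element["type"] == "noncoding":
            # A noncoding element closes the current gene (assumption).
            in_gene = False
            continue
        if not in_gene:
            n_genes += 1
            in_gene = True
        # Inclusive coordinates: a 1..1999 exon is 1999 nt long.
        length = element["finish"] - element["start"] + 1
        if element["type"] == "exon":
            exon_lengths.append(length)
        else:
            intron_lengths.append(length)

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return {
        "genes": n_genes,
        "exons_per_gene": len(exon_lengths) / n_genes if n_genes else 0.0,
        "mean_exon_length": mean(exon_lengths),
        "introns_per_gene": len(intron_lengths) / n_genes if n_genes else 0.0,
        "mean_intron_length": mean(intron_lengths),
    }


rows = [
    {"type": "noncoding", "start": 0, "finish": 3091},
    {"type": "exon", "start": 3092, "finish": 3280},
    {"type": "intron", "start": 3281, "finish": 3670},
    {"type": "exon", "start": 3671, "finish": 4076},
    {"type": "noncoding", "start": 4077, "finish": 5279},
]
print(summarize(rows))
```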

Static Chromosome Plot:



### Interactive Plot
You can use the `review()` function to generate an interactive plot to inspect your chromosome structure further. Draw a region on the bottom plot to view that region in the top plot. 

In [5]:
final_chromosome.review("interactive")

Interactive altair chromosome map:


## Random Chromosome with Custom Mutation Types & Genomic Element Types 

In [11]:
from shadie import ElementList
from shadie import ElementType
from shadie import MutationType
from shadie import MutationList

# Create your custom mutation types and save them to a MutationList.
# Arguments: dominance coefficient, distribution type, distribution parameter(s).
# Distribution codes follow SLiM: "f" fixed, "e" exponential, "n" normal,
# "w" Weibull, "g" gamma.
mut1 = MutationType(0.5, "f", 0)
mut2 = MutationType(0.5, "e", 0.4)
mut3 = MutationType(0.5, "n", 0.4, 0.1)
mut4 = MutationType(0.5, "w", 0.3, 0.2)
mut5 = MutationType(0.5, "g", -0.4, 0.1)

mutlist = MutationList(mut1, mut2, mut3, mut4, mut5)

# Create your custom genomic element types and save them to an ElementList.
# Arguments: mutation type(s), their relative frequencies, and an optional altname.
noncod = ElementType(mut1, 1, altname = "nc")
exon1 = ElementType([mut2, mut5], [1, 1], altname = "ex1")
exon2 = ElementType([mut2, mut3, mut4], [9, 1, .02], altname = "ex2")
intron1 = ElementType([mut2, mut5], [1, 1], altname = "int1")
intron2 = ElementType([mut2, mut5], [1, 1], altname = "int2")

mycustomlist = ElementList(mutlist, noncod, exon1, exon2, intron1, intron2)
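
For orientation, these definitions correspond to SLiM's `initializeMutationType()` and `initializeGenomicElementType()` initialization calls. The sketch below only formats such calls as strings so the mapping is easy to see. It is not how `shadie` writes its scripts: the helper names `mutation_type_call` and `element_type_call` are hypothetical, and the mutation matrix (`mmJukesCantor(...)`) that appears in `shadie`'s `script` column is deliberately left out.

```python
def mutation_type_call(name, dominance, distribution, *params):
    """Format a SLiM initializeMutationType() call as a string."""
    args = ", ".join(str(p) for p in params)
    return f'initializeMutationType("{name}", {dominance}, "{distribution}", {args});'


def element_type_call(name, mutation_names, proportions):
    """Format a SLiM initializeGenomicElementType() call as a string."""
    muts = ", ".join(mutation_names)
    props = ", ".join(str(p) for p in proportions)
    return f'initializeGenomicElementType("{name}", c({muts}), c({props}));'


# Illustrative names only; the ids shadie assigns may differ.
print(mutation_type_call("m1", 0.5, "f", 0.0))
print(mutation_type_call("m2", 0.5, "g", -0.4, 0.1))
print(element_type_call("g1", ["m1", "m2"], [9, 1]))
```

The first two arguments of each mutation type (dominance coefficient and distribution type) and the paired mutation/proportion vectors of each element type line up with the arguments passed to `MutationType` and `ElementType` above.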



In [12]:
from shadie import Build

#initialize the Build class object
custom_build = Build(
    exons = [exon1, exon2], 
    introns = [intron1, intron2], 
    noncoding = [noncod], 
    elementlist = mycustomlist)

In [13]:
#run the same random function
custom_build.random()

12:46 | INFO    | random          | Gene added

In [14]:
from shadie import Chromosome

customized_chrom = Chromosome(genome = custom_build)

In [15]:
customized_chrom.review("elements")

Genomic Elements:

| | name | start | finish | eltype | script | type |
|---|---|---|---|---|---|---|
| 0 | nc | 0 | 2261 | g5 | `'g5', c(m5),c(1), mmJukesCantor(1e-06/3)` | noncoding |
| 2262 | ex2 | 2262 | 2476 | g7 | `'g7', c(m6, m7, m8),c(9, 1, 0.02), mmJukesCant...` | exon |
| 2477 | nc | 2477 | 5056 | g5 | `'g5', c(m5),c(1), mmJukesCantor(1e-06/3)` | noncoding |
| 5057 | ex1 | 5057 | 5257 | g6 | `'g6', c(m6, m9),c(1, 1), mmJukesCantor(1e-06/3)` | exon |
| 5258 | int2 | 5258 | 5741 | g9 | `'g9', c(m6, m9),c(1, 1), mmJukesCantor(1e-06/3)` | intron |
| ... | ... | ... | ... | ... | ... | ... |
| 994934 | int1 | 994934 | 995290 | g8 | `'g8', c(m6, m9),c(1, 1), mmJukesCantor(1e-06/3)` | intron |
| 995291 | ex2 | 995291 | 995661 | g7 | `'g7', c(m6, m7, m8),c(9, 1, 0.02), mmJukesCant...` | exon |
| 995662 | int2 | 995662 | 996122 | g9 | `'g9', c(m6, m9),c(1, 1), mmJukesCantor(1e-06/3)` | intron |
| 996123 | ex1 | 996123 | 996369 | g6 | `'g6', c(m6, m9),c(1, 1), mmJukesCantor(1e-06/3)` | exon |


In [11]:
import pandas as pd

# Print a label for every genomic element:
# its name if one is set, otherwise its type.
for _, row in final_chromosome.genome.iterrows():
    if pd.isna(row["name"]):
        print(row["type"])
    else:
        print(row["name"])

noncoding
exon
intron
exon
intron
...


In [16]:
customized_chrom.review("chromosome")

Chromosome Summary
\# of Genes: 186
Average # exons per gene: 4.553763440860215
Average exon length: 261.16056670602126 nt
Average # introns per gene: 3.553763440860215
Average introns length: 449.0574886535552 nt

Static Chromosome Plot:



In [17]:
customized_chrom.review("interactive")

Interactive altair chromosome map:


KeyError: 'name'

## Custom Chromosome
Finally, you can define your own chromosome by providing a pandas DataFrame that contains, at minimum, the `name` of each genomic element (as defined by you *or* using the `shadie` defaults), its `start base`, and its `end base`. Alternatively, you can provide the internal `idx` of the genomic element in place of the `name`. You must also provide a GenomeList class object, and every genomic element you use must be defined in that list.
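
Before handing a hand-written table to `shadie`, it can help to sanity-check it. The sketch below is a standalone check under stated assumptions, not part of the `shadie` API: it uses plain dictionaries rather than a DataFrame, assumes the column names `name`, `start`, and `finish` (as in the `review("elements")` tables), assumes inclusive coordinates, and the function name `validate_elements` is hypothetical.

```python
def validate_elements(rows, known_names):
    """Return rows sorted by start; raise ValueError if the table is inconsistent."""
    ordered = sorted(rows, key=lambda row: row["start"])
    previous_finish = None
    for row in ordered:
        # Every element must refer to a genomic element type you have defined.
        if row["name"] not in known_names:
            raise ValueError(f"unknown element name: {row['name']!r}")
        if row["finish"] < row["start"]:
            raise ValueError(f"finish before start in element {row['name']!r}")
        # Inclusive coordinates: the next start must lie past the previous finish.
        if previous_finish is not None and row["start"] <= previous_finish:
            raise ValueError(f"element {row['name']!r} overlaps the previous element")
        previous_finish = row["finish"]
    return ordered


# Hypothetical table; the names match the altnames defined earlier in this tutorial.
rows = [
    {"name": "nc", "start": 0, "finish": 999},
    {"name": "int1", "start": 1200, "finish": 1499},
    {"name": "ex1", "start": 1000, "finish": 1199},
]
for row in validate_elements(rows, known_names={"nc", "ex1", "int1"}):
    print(row)
```

The same rows can be loaded into a DataFrame with `pd.DataFrame(rows)` once they pass the check.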