# Peacocks and Pythons; Oh My!

This notebook walks you through the basic Python coding skills and Pandas library skills that you need to generate information on peacock genetics.  It uses some sample genetics provided in the [Tumblr post where a program that manipulates peacock genetics was discussed](https://www.tumblr.com/kedreeva/777970704354312192/transparencies-are-a-a-heavy-load-but-would-still). Note that this is NOT intended to be a website that is easy for Joe Shmoe to use to calculate the results of a particular peacock cross; it is intended to guide you through the skills needed to build the code that would support such a website.

Before we start, we import the necessary Python libraries.

In [74]:
import random

Then we need to create a table of genes for the peacocks.  These are the different chromosomal positions where traits could be stored in the peacock's genetics. We also want to indicate whether a particular gene is linked with another gene. As mentioned in the Tumblr post, peacock genetics are not thoroughly known, so most genes will be unlinked.

To build this table, we create a Python array (Python itself calls this a list), which is basically an ordered list of items.  Since we are making a list of genes, we will name the array "genes".  Inside the array, each gene will be represented by another array, which will hold the information related to that gene, such as the gene's name and its link type.  Each group of genes that are linked together shares a single link type, and unlinked genes use "none".

A couple of notes about Python:
* Arrays are indicated with square brackets: []
* Items in an array are separated with commas: ,
* When creating data in Python that we will be using in the future, we give it a name, then an equals sign, and then the data. This allows us to access that data later by simply using the name.  Most coding languages refer to these as variables.
* If we want to have text in Python (for example, the name of a gene), we need to surround it with quotes: "".  This lets Python know that it is text data, instead of trying to parse it as a variable name or a special command word.
* Lines that start with # are comments, and are not part of the program
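
The notes above can be tried out in a short cell. This is only a minimal sketch; the names `geneName` and `exampleGene` are made up for illustration and are not used anywhere else in this notebook.

```python
# This line is a comment, so Python ignores it
geneName = "Opal"                          # text data goes inside quotes
exampleGene = ["opl", geneName, "none"]    # an array holding three items

# Items in an array are numbered starting from 0
firstItem = exampleGene[0]
print(firstItem)           # prints: opl
print(len(exampleGene))    # prints: 3
```

Note that `geneName` is written without quotes inside the array because it is a variable name, while `"opl"` and `"none"` are written with quotes because they are text.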

In [75]:
genes = [
    # Short Name       Name              Link Type
    ["sex",            "Sex",            "sex"],
    ["pur",            "Purple",         "sex"],
    ["opl",            "Opal",           "none"],
    ["cam",            "Cameo",          "none"],
    ["haz",            "Hazel",          "none"]
]
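
Because `genes` is an array of arrays, reading one value takes two steps: first pick the row, then pick the item inside that row. The sketch below uses a shortened copy of the table so that it runs on its own, and the variable `sexLinked` is a hypothetical name chosen for this example. It also uses a `for` loop and an `if` statement, which are explained further down in this notebook.

```python
# A shortened copy of the genes table so this cell runs on its own
genes = [
    # Short Name  Name      Link Type
    ["sex",       "Sex",    "sex"],
    ["pur",       "Purple", "sex"],
    ["opl",       "Opal",   "none"],
]

# genes[1] is the second row; genes[1][1] is the second item in that row
print(genes[1][1])     # prints: Purple

# Collect the short name of every gene that has the "sex" link type
sexLinked = []
for gene in genes:
    if gene[2] == "sex":
        sexLinked.append(gene[0])

print(sexLinked)       # prints: ['sex', 'pur']
```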

Next, we need to create a table of alleles.  This is how we will primarily decide phenotypes for birds with simple genes, and it will allow us to generate various genotypes.  Each allele is connected to a specific gene and indicates how dominant it is with a number, where higher numbers are more dominant. (For example, an allele that is recessive to everything would be 0, and an allele with a dominance of 1 would always be dominant to that recessive allele.  An allele with a dominance of 2 would be dominant to BOTH the 0 and 1 alleles.)

In [76]:
alleles = [
    # Allele  Allele Name   Gene   Dominance
    ["m",     "Male",       "sex", 1],
    ["f",     "Female",     "sex", 0],
    ["w",     "Not Purple", "pur", 0],
    ["Z(pl)", "Purple",     "pur", 0],
    ["WT",    "Not Opal",   "opl", 1],
    ["o",     "Opal",       "opl", 0],
    ["C",     "Cameo",      "cam", 1],
    ["c",     "Not Cameo",  "cam", 0],
    ["H",     "Hazel",      "haz", 1],
    ["h",     "Indigo",     "haz", 0]
]
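
To make the dominance numbers concrete, here is a small sketch of comparing two alleles. The gene "xyz", its three alleles, and the function name `moreDominant` are all hypothetical and exist only for this example; the column order matches the alleles table above.

```python
# Hypothetical alleles for a made-up gene "xyz"
# Allele  Allele Name  Gene   Dominance
exampleAlleles = [
    ["A",  "Strong", "xyz", 2],
    ["a1", "Medium", "xyz", 1],
    ["a0", "Weak",   "xyz", 0],
]

def moreDominant(first, second):
    # Return whichever allele row has the higher dominance number.
    # If the numbers are equal, the second row is returned.
    if first[3] > second[3]:
        return first
    return second

print(moreDominant(exampleAlleles[0], exampleAlleles[1])[1])  # prints: Strong
print(moreDominant(exampleAlleles[2], exampleAlleles[1])[1])  # prints: Medium
```

One thing to keep in mind: when two alleles share the same dominance number, as "w" and "Z(pl)" do in the table above, a comparison like this cannot rank them, so the result depends only on the order in which the two alleles are given.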

Now that we have a list of genes and alleles, we can create a peacock that has a specific genotype.  To do that, we are going to create a dictionary with arrays of alleles.  In Python, a dictionary allows you to associate a "key" with a "value", and then look up values based on those keys.  Here, each key is a gene's short name and each value is the pair of alleles the peacock carries for that gene.

In [77]:
malePeacock = {
    "sex": ["m", "f"],
    "pur": ["Z(pl)", "Z(pl)"],
    "opl": ["WT", "o"],
    "cam": ["C", "c"],
    "haz": ["H", "h"]
}
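
A few basic dictionary operations are sketched below. The dictionary `examplePeacock` is a shortened, hypothetical bird used only so that this cell runs on its own.

```python
examplePeacock = {
    "sex": ["m", "f"],
    "opl": ["WT", "o"],
}

# Look up a value by its key using square brackets
oplAlleles = examplePeacock["opl"]
print(oplAlleles)         # prints: ['WT', 'o']
print(oplAlleles[0])      # prints: WT

# Check whether a key exists before using it
print("cam" in examplePeacock)    # prints: False

# Assigning to a key adds a new entry (or replaces an existing one)
examplePeacock["cam"] = ["C", "c"]
print(len(examplePeacock))        # prints: 3
```

Looking up a key that does not exist with square brackets raises a `KeyError`, which is why the `in` check can be useful.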

Once we have a peacock, we can write out a human-readable phenotype for it by applying the list of alleles to the peacock and comparing the dominance of its alleles. We do this by creating a for loop that looks at each gene, finds the peacock's two alleles for that gene, and determines whether the peacock is homozygous or het (heterozygous) for it and what the name of the most dominant allele is.

Python Notes:
* Control statements are a special type of code that determines how to run the code after them.  In Python, control statements indicate the code they are responsible for by indents; all code that is managed by a particular control statement will be indented underneath it
* A for loop is a control statement that runs a piece of code a specific number of times. In this case, we are going to run it once for each gene.
    * The `for x in y` statement creates a special variable that changes on each run of the for loop, and allows us to access each gene one at a time
* An if statement is a control statement that only executes the code indented under it if its condition is true
    * == checks if two values are the same
    * != checks if two values are different
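
Here is a small sketch that puts a for loop and an if statement together. The dictionary `exampleGenotype` and the array `hetGenes` are hypothetical names used only in this example.

```python
exampleGenotype = {
    "opl": ["WT", "o"],
    "cam": ["c", "c"],
}

hetGenes = []
for geneId in exampleGenotype:         # runs once for each key in the dictionary
    pair = exampleGenotype[geneId]     # the two alleles for this gene
    if pair[0] != pair[1]:             # two different alleles means het
        hetGenes.append(geneId)
        print(geneId + " is het")
    else:                              # otherwise both alleles are the same
        print(geneId + " is homozygous")
```

Everything indented under the `for` line runs once per gene, and only one of the two `print` lines runs each time, depending on whether the condition after `if` is true.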

In [89]:
def calculateBasicPhenotype(peacock): # Create this code as a function so we can call it in several places
    print("Phenotype for Peacock")
    
    peacockPhenotype = {} # Holds final phenotype for reuse
    
    for gene in genes: # Look at each gene one at a time
        geneId = gene[0] # Get the short name for a gene
        geneAlleles = peacock[geneId] # Get the peacock's genotype for that specific gene

        geneName = gene[1] # Get the display name for the gene

        firstAllele = None
        secondAllele = None # Set the alleles to None initially while we find the information for them
        for allele in alleles: # Look at each allele one at a time
            alleleGene = allele[2]
            if alleleGene != geneId: # check if this is an allele for the gene we are looking at
                continue # If it is not, ignore it and go to the next allele
    
            alleleId = allele[0]
            if geneAlleles[0] == alleleId: # if the first allele in the peacock for this gene is the same as the one we are talking about
                firstAllele = allele # Save its data to the first allele
    
            if geneAlleles[1] == alleleId:# if the second allele in the peacock for this gene is the same as the one we are talking about
                secondAllele = allele
    
            if firstAllele is not None and secondAllele is not None: # Stop looking once both alleles have been found
                break

        if firstAllele is None or secondAllele is None: # After we finish looking at all the alleles, if first and second have not been assigned
            raise Exception("Peacock has an unknown allele in the " + geneName + " gene") # then we should tell the user there was an issue
    
        isHet = "het " if geneAlleles[0] != geneAlleles[1] else ""
    
        firstAlleleDominance = firstAllele[3]
        secondAlleleDominance = secondAllele[3]
        alleleName = "Unknown"
        if firstAlleleDominance > secondAlleleDominance: # Save the phenotype as the more dominant allele
            alleleName = firstAllele[1]
        else:
            alleleName = secondAllele[1]
            
        print(geneName + ": " + isHet + alleleName)
        peacockPhenotype[geneId] = alleleName

    return peacockPhenotype

malePhenotype = calculateBasicPhenotype(malePeacock) # Call the phenotype calculation on our male peacock

Phenotype for Peacock
Sex: het Male
Purple: Purple
Opal: het Not Opal
Cameo: het Cameo
Hazel: het Hazel
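
The function above searches the whole `alleles` table for every gene. One possible alternative, sketched here and not part of the original notebook, is to build a lookup dictionary once and then find each allele directly. The names `alleleLookup` and `lookupAllele` are hypothetical, and the table is shortened so this cell runs on its own. The keys are written as `(gene, allele)` pairs in round brackets, which Python calls tuples.

```python
# A shortened copy of the alleles table so this cell runs on its own
alleles = [
    # Allele  Allele Name  Gene   Dominance
    ["WT",    "Not Opal",  "opl", 1],
    ["o",     "Opal",      "opl", 0],
]

# Build the lookup once: (gene, allele) -> the full allele row
alleleLookup = {}
for allele in alleles:
    alleleLookup[(allele[2], allele[0])] = allele

def lookupAllele(geneId, alleleId):
    # .get returns None for a missing key instead of raising a KeyError
    row = alleleLookup.get((geneId, alleleId))
    if row is None:
        raise Exception("Unknown allele " + alleleId + " in the " + geneId + " gene")
    return row

print(lookupAllele("opl", "o")[1])    # prints: Opal
```

Using the gene as part of the key keeps two alleles with the same short name on different genes from being mixed up.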


Now that we can list the phenotypes for a peacock, we can start handling exceptional cases, such as peach.  To do this, we use if statements that look at the basic phenotype and print extra information whenever a special combination overrides or qualifies the default result.

In [86]:
def calculateAdvancedPhenotype(peacockPhenotype):
    print("Additional phenotype information for peacock:")
    
    if peacockPhenotype["pur"] == "Purple" and peacockPhenotype["cam"] == "Cameo":
        print("Peach: Peach - Overrides other feather colors")

    if peacockPhenotype["haz"] == "Hazel":
        print("Has a possibility of being indigo instead of hazel")

    if peacockPhenotype["haz"] == "Indigo":
        print("Has a possibility of being hazel instead of indigo")

calculateAdvancedPhenotype(malePhenotype)

Additional phenotype information for peacock:
Peach: Peach - Overrides other feather colors
Has a possibility of being indigo instead of hazel
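
If you want to reuse these results later instead of only printing them, one option is to collect the messages in an array and return it. The sketch below is a hypothetical variation on the function above, using the same three rules; the names `describeOverrides` and `examplePhenotype` are made up for this example. It also shows `elif`, which only checks its condition when the `if` before it was false.

```python
def describeOverrides(phenotype):
    # Collect the notes in an array instead of printing them straight away
    notes = []
    if phenotype["pur"] == "Purple" and phenotype["cam"] == "Cameo":
        notes.append("Peach: Peach - Overrides other feather colors")
    if phenotype["haz"] == "Hazel":
        notes.append("Has a possibility of being indigo instead of hazel")
    elif phenotype["haz"] == "Indigo":
        notes.append("Has a possibility of being hazel instead of indigo")
    return notes

examplePhenotype = {"pur": "Purple", "cam": "Cameo", "haz": "Hazel"}
for note in describeOverrides(examplePhenotype):
    print(note)
```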


We can also create a female peacock and run the same analysis on her.

In [87]:
femalePeacock = {
    "sex": ["f", "f"],
    "pur": ["w", "w"],
    "opl": ["o", "o"],
    "cam": ["c", "c"],
    "haz": ["h", "h"]
}

femalePhenotype = calculateBasicPhenotype(femalePeacock)
calculateAdvancedPhenotype(femalePhenotype)

Phenotype for Peacock
Sex: Female
Purple: Not Purple
Opal: Opal
Cameo: Not Cameo
Hazel: Indigo
Additional phenotype information for peacock:
Has a possibility of being hazel instead of indigo


Now that we can properly identify the phenotype of any specific bird, we can start generating children from a breeding pair.  We'll start by generating a single random child from two parents: for each gene, the child receives one randomly chosen allele from the male and one from the female.  Note that because this uses random generation, the results will be different every time you run the code.

In [90]:
def generateRandomChild(male, female):
    childGenotype = {} # Holds final genotype for reuse
    
    for gene in genes:
        geneId = gene[0]
        maleAlleles = male[geneId]
        femaleAlleles = female[geneId]

        childGenotype[geneId] = [maleAlleles[random.randint(0, 1)], femaleAlleles[random.randint(0, 1)]]
    
    print("Generated Child: " + str(childGenotype)) # str() turns the dictionary into text so it can be joined to the label
    return childGenotype

childPeacock = generateRandomChild(malePeacock, femalePeacock)
childPhenotype = calculateBasicPhenotype(childPeacock)
calculateAdvancedPhenotype(childPhenotype)

Generated Child: {'sex': ['f', 'f'], 'pur': ['Z(pl)', 'w'], 'opl': ['WT', 'o'], 'cam': ['c', 'c'], 'haz': ['H', 'h']}
Phenotype for Peacock
Sex: Female
Purple: het Not Purple
Opal: het Not Opal
Cameo: Not Cameo
Hazel: het Hazel
Additional phenotype information for peacock:
Has a possibility of being indigo instead of hazel


With that, you can generate a random clutch from the two parents simply by calling the random child generator as many times as you want.
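
As a rough sketch of that idea, the cell below calls a child generator in a loop and collects the results in an array. So that it runs on its own, it uses shortened parents and a simplified stand-in called `generateChild` rather than the notebook's `generateRandomChild`; the names `generateClutch`, `maleParent`, and `femaleParent` are also hypothetical. The call to `random.seed` is optional and is only there so that rerunning the cell gives the same clutch.

```python
import random

# Shortened parents so this cell runs on its own
maleParent = {"opl": ["WT", "o"], "cam": ["C", "c"]}
femaleParent = {"opl": ["o", "o"], "cam": ["c", "c"]}

def generateChild(male, female):
    # For every gene, take one random allele from each parent
    child = {}
    for geneId in male:
        child[geneId] = [random.choice(male[geneId]), random.choice(female[geneId])]
    return child

def generateClutch(male, female, clutchSize):
    # Call the child generator once per chick and collect the results
    clutch = []
    for i in range(clutchSize):
        clutch.append(generateChild(male, female))
    return clutch

random.seed(42)  # optional: remove this line to get a different clutch each run
clutch = generateClutch(maleParent, femaleParent, 4)
for child in clutch:
    print(child)
```

Each child in the returned array is a genotype dictionary in the same shape as the parents, so it can be passed to the phenotype functions from earlier in the notebook.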