# **Bioinformatics with Jupyter Notebooks for WormBase:**
## **Utilities 3 - Chromosome Map**
Welcome to the last Jupyter notebook in the WormBase tutorial series. Over this series of tutorials, we wrote Python code that retrieves data from the WormBase sites and performs simple analyses on it.

This tutorial will deal with generating a chromosome map for the input gene list. Let's get started!

As always, we begin by importing the required libraries.

In [None]:
from __future__ import print_function
import re
from PIL import Image, ImageFont, ImageDraw, ImageOps
from intermine.webservice import Service
service = Service("http://intermine.wormbase.org/tools/wormmine/service")

The chromosomes on the WormBase site are numbered with Roman numerals. Hence, it will be handy to have a function that converts Roman numerals to integers for easy manipulation!

The function `value` maps each Roman numeral symbol to its integer value, and the `romanToDecimal` function does the actual conversion.

In [None]:
def value(r):
    if (r == 'I'):
        return 1
    if (r == 'V'):
        return 5
    if (r == 'X'):
        return 10
    if (r == 'L'):
        return 50
    if (r == 'C'):
        return 100
    if (r == 'D'):
        return 500
    if (r == 'M'):
        return 1000
    return -1
  
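def romanToDecimal(numeral):
    # `numeral` is the Roman numeral string (renamed so it no longer shadows the built-in `str`)
    res = 0
    i = 0
    while i < len(numeral):
        s1 = value(numeral[i])
        if i + 1 < len(numeral):
            s2 = value(numeral[i + 1])
            if s1 >= s2:
                # same or smaller symbol follows: add the current value
                res = res + s1
                i = i + 1
            else:
                # larger symbol follows: subtractive pair such as IV
                res = res + s2 - s1
                i = i + 2
        else:
            res = res + s1
            i = i + 1

    return res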
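
As a quick sanity check of the conversion logic, here is a minimal sketch that reimplements the same subtractive rule in a compact form and tries it on the autosome labels I to V. The names `ROMAN_VALUES` and `roman_to_int_sketch` are hypothetical and used only for this illustration; they are not part of WormBase or of the cells above.

```python
# Hypothetical compact variant of the conversion above (illustration only).
ROMAN_VALUES = {'I': 1, 'V': 5, 'X': 10, 'L': 50, 'C': 100, 'D': 500, 'M': 1000}

def roman_to_int_sketch(numeral):
    total = 0
    for i, symbol in enumerate(numeral):
        current = ROMAN_VALUES[symbol]
        following = ROMAN_VALUES[numeral[i + 1]] if i + 1 < len(numeral) else 0
        # A smaller value placed before a larger one is subtracted (e.g. IV = 5 - 1).
        total += -current if current < following else current
    return total

autosome_numbers = {label: roman_to_int_sketch(label) for label in ['I', 'II', 'III', 'IV', 'V']}
print(autosome_numbers)
```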

Since we will be visualising the chromosomes, we require special fonts that we can download and use. The following cell will result in a folder named `imp_fonts` which contains only the required fonts!

In [None]:
!wget "https://noto-website-2.storage.googleapis.com/pkgs/NotoSans-hinted.zip"
!unzip "NotoSans-hinted.zip" -d "fonts_for_map"
!mkdir imp_fonts

!mv "fonts_for_map/NotoSans-BoldItalic.ttf" "imp_fonts"
!mv "fonts_for_map/NotoSans-Bold.ttf" "imp_fonts"
!mv "fonts_for_map/NotoSans-Italic.ttf" "imp_fonts"
!mv "fonts_for_map/NotoSans-ExtraBoldItalic.ttf" "imp_fonts"

!rm -r fonts_for_map
!rm "NotoSans-hinted.zip"

The gene names on the chromosome map are drawn tilted at an angle rather than horizontally. The following function deals with that. No changes are necessary!

In [None]:
def rotatetxt(text='test', degree=90):
  
  font = ImageFont.truetype("imp_fonts/NotoSans-ExtraBoldItalic.ttf", size=12)
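  # font.getsize() was removed in Pillow 10; getbbox() (Pillow 8+) returns the same extents
  left, top, right, bottom = font.getbbox(text)
  width, height = right, bottom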

  image = Image.new('L', (height+50, width))
  draw = ImageDraw.Draw(image)
  draw.text((0, 0), text, font=font, fill=255)
    
  writeout = image.rotate(degree, expand=1)
  return writeout

Assign the gene list to the `GeneNames` variable! Separate the genes with a comma followed by a space, and make sure that the entire list is enclosed in double quotes.

We will then prepare the gene list in a format useful for us!

In [None]:
GeneNames = "eat-4, egl-19, C26C6.1, WBGene00006669, F19B2.10, gpa-3"
gene_strings = re.split(r'; |, |\*|\n', GeneNames)  # raw string keeps the \* escape intact
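
To see what the split pattern does, here is a small sketch using the standard-library `re` module and a few names from the list above. The mixed separators in the input string and the variable names `mixed_input` and `gene_list` are assumptions made for this illustration.

```python
import re

# Same names as above, but joined with each separator the pattern accepts.
mixed_input = "eat-4, egl-19; C26C6.1\nWBGene00006669*gpa-3"

# Split on "; ", ", ", "*" or a newline, then drop stray whitespace and empty items.
gene_list = [name.strip() for name in re.split(r'; |, |\*|\n', mixed_input) if name.strip()]
print(gene_list)
```

Note that the pattern expects a space after a comma or semicolon, so `"eat-4,egl-19"` would stay as a single item.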

The next step is to map the genes to the chromosome locations. For this we will use WormMine queries. Refer to tutorial 2 for more info on using WormMine. 

We will make a list of all genes that can be mapped to chromosomes in _C. elegans_ and pair them with their locations.

In [None]:
genes_found = {} #dictionary as we need to pair genes with their locations!
genes_notfound = []

for gene_name in gene_strings:
    gene_name = gene_name.strip()
    
    query = service.new_query("Gene") 
    # the views must include every column that is read from the rows below
    query.add_view("primaryIdentifier", "symbol", "chromosome.primaryIdentifier", "locations.start")
    query.add_constraint("Gene", "LOOKUP", gene_name, "C. elegans", code = "A")
    
    if gene_name not in genes_found:
        rows = list(query.rows())  # run the query once and reuse the results
        if not rows:
            genes_notfound.append(gene_name)
            
        else:
            for generow in rows:
                gene_id = generow["primaryIdentifier"]
                gene_pub_name = generow["symbol"]
                gene_chr = generow["chromosome.primaryIdentifier"]
                gene_loc = generow["locations.start"]
                genes_found.update({gene_id : [gene_pub_name, gene_chr, gene_loc]})
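
The bookkeeping in this loop can be tried offline. The sketch below separates found and not-found names in the same way, but replaces the WormMine query with a hypothetical `lookup` callable. The function name `collect_gene_locations` and everything in `FAKE_ROWS` are placeholders invented for this illustration, not real WormBase identifiers or coordinates.

```python
def collect_gene_locations(gene_names, lookup):
    """Split names into found (keyed by gene ID) and not found.

    `lookup` is a hypothetical callable that returns a list of row
    dictionaries for a name, standing in for the WormMine query.
    """
    found, not_found = {}, []
    for name in gene_names:
        name = name.strip()
        rows = lookup(name)
        if not rows:
            not_found.append(name)
            continue
        for row in rows:
            found[row["primaryIdentifier"]] = [
                row["symbol"],
                row["chromosome.primaryIdentifier"],
                row["locations.start"],
            ]
    return found, not_found

# Placeholder data, NOT real WormBase identifiers or coordinates.
FAKE_ROWS = {
    "gene-a": [{"primaryIdentifier": "ID0001", "symbol": "gene-a",
                "chromosome.primaryIdentifier": "III", "locations.start": 1000}],
}

found, not_found = collect_gene_locations(["gene-a", " gene-b "],
                                          lambda name: FAKE_ROWS.get(name, []))
print(found)
print(not_found)
```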

Now that we have mapped the genes to the chromosomes in the dictionary, we will fetch the chromosome information. Again, we use WormMine.

In [None]:
query = service.new_query("Chromosome")
query.add_view("primaryIdentifier", "length")

chromosomes = {}

for chrrow in query.rows():
    chrom_label = chrrow["primaryIdentifier"]
    length = chrrow["length"]
    
    if 'X' in chrom_label:
        chr_num = (5+1)
    elif 'MtDNA' in chrom_label:
        chr_num = (6+1)
    else:
        chr_num = romanToDecimal(chrom_label)
    
    chromosomes.update({chr_num : [chrom_label, length]})
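
The row numbering used here (autosomes I to V on rows 1 to 5, X on row 6, MtDNA on row 7) can be sketched without a WormMine connection. The helper `chromosome_row` and the lookup table `AUTOSOME_ROWS` are hypothetical names for this illustration; the cell above uses `romanToDecimal` for the autosomes instead of a table.

```python
# Hypothetical lookup table standing in for the Roman numeral conversion.
AUTOSOME_ROWS = {'I': 1, 'II': 2, 'III': 3, 'IV': 4, 'V': 5}

def chromosome_row(label):
    """Return the row of the map on which a chromosome is drawn."""
    if label == 'X':
        return 5 + 1      # sex chromosome goes right after the five autosomes
    if label == 'MtDNA':
        return 6 + 1      # mitochondrial genome goes last
    return AUTOSOME_ROWS[label]

labels = ['MtDNA', 'X', 'IV', 'I', 'III', 'V', 'II']
ordered = sorted(labels, key=chromosome_row)
print(ordered)
```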

Now we have all that we need to generate the chromosome map images!!

We will first assign appropriate values to all the variables that control the image dimensions. Then we will draw a line for each chromosome and mark the gene names on them. You do not need to make any changes to this cell!

In [None]:
spacer = 65                               #increase to increase height
headspace = 60
mitohead = 200
txthead = 15
nameshift = 65
autoscale = 25000                         #decrease to increase width
mitoscale = (autoscale / 1000)

chromosome_map = Image.new('RGB', (int(25000000 / autoscale), (spacer * 8)), (255, 255, 255)) #dimension and color
graph = ImageDraw.Draw(chromosome_map)



for chromosome in chromosomes:
    
    chrnum = chromosome
    chromlabel = chromosomes[chromosome][0]
    chromlength = chromosomes[chromosome][1]
    
    font = ImageFont.truetype("imp_fonts/NotoSans-Bold.ttf", size=20)
    
    if 'MtDNA' in chromlabel: 
        graph.line((mitohead, (spacer * chrnum), (chromlength/mitoscale + mitohead), (spacer * chrnum)), 
                   fill=(0, 100, 255), width=5)
        graph.text((txthead, (spacer * chrnum - 15)),(chromlabel + ' (X 1,000)'),(0,0,0),font=font)
        
    else:
        graph.line((headspace, (spacer * chrnum), (chromlength/autoscale + headspace), (spacer * chrnum)), 
                   fill=(0, 100, 255), width=5)
        graph.text((txthead, (spacer * chrnum - 15)),(chromlabel),(0,0,0),font=font)
        

for gene in genes_found:
    
    geneid = gene
    pubname = genes_found[gene][0]
    chromo = genes_found[gene][1]
    location = genes_found[gene][2]
    
    font = ImageFont.truetype("imp_fonts/NotoSans-BoldItalic.ttf")
    
    if 'MtDNA' in chromo:
        chrnum = 7
        graph.line(((location/mitoscale + mitohead), (spacer * chrnum -2.5 ), (location/mitoscale + mitohead), 
                    (spacer * chrnum - 10)), fill=(0, 0, 0), width=2)
        chromosome_map.paste( ImageOps.colorize(rotatetxt(pubname, 45), (0,0,0), (0,0,0)), 
                 (int(location/mitoscale + mitohead - 5), (spacer * chrnum - nameshift)),  rotatetxt(pubname, 45))
    
    elif 'X' in chromo: 
        chrnum = 6
        graph.line(((location/autoscale + headspace), (spacer * chrnum -2.5 ), (location/autoscale + headspace), 
                    (spacer * chrnum - 10)), fill=(0, 0, 0), width=2)
        chromosome_map.paste( ImageOps.colorize(rotatetxt(pubname, 45), (0,0,0), (0,0,0)), 
                 (int(location/autoscale + headspace - 5), (spacer * chrnum - nameshift)),  rotatetxt(pubname, 45))
        
    else:
        chrnum = romanToDecimal(chromo)
        graph.line(((location/autoscale + headspace), (spacer * chrnum -2.5 ), (location/autoscale + headspace), 
                    (spacer * chrnum - 10)), fill=(0, 0, 0), width=2)
        chromosome_map.paste( ImageOps.colorize(rotatetxt(pubname, 45), (0,0,0), (0,0,0)), 
                 (int(location/autoscale + headspace - 5), (spacer * chrnum - nameshift)),  rotatetxt(pubname, 45))

chromosome_map
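
The mapping from a base-pair position to a pixel position in the cell above is plain arithmetic, and it can be checked on its own. The sketch below reuses the constants from the cell; the function name `gene_tick_position` and the example positions are assumptions chosen only to illustrate the scaling, not real gene coordinates.

```python
# Constants copied from the plotting cell above.
AUTOSCALE = 25000               # base pairs per pixel for chromosomes I-V and X
MITOSCALE = AUTOSCALE / 1000    # MtDNA is drawn 1,000 times larger
HEADSPACE = 60                  # left margin for chromosomes I-V and X
MITOHEAD = 200                  # left margin for MtDNA
SPACER = 65                     # vertical distance between chromosome rows

def gene_tick_position(location, row, mitochondrial=False):
    """Return the (x, y) pixel position of a gene on its chromosome line."""
    if mitochondrial:
        x = location / MITOSCALE + MITOHEAD
    else:
        x = location / AUTOSCALE + HEADSPACE
    y = SPACER * row
    return x, y

# Arbitrary example positions (not real gene coordinates).
print(gene_tick_position(5_000_000, 3))
print(gene_tick_position(10_000, 7, mitochondrial=True))
```

With `autoscale = 25000` the canvas is `25000000 / 25000 = 1000` pixels wide, so decreasing `autoscale` stretches the map horizontally.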

This is the end of the tutorial on generating a chromosome map for a given list of genes using WormMine.

Acknowledgements:
- Worm Tutorials GitHub (https://github.com/Munfred/worm-tutorials)

This is also the end of the WormBase Jupyter Notebook series! Hope it was helpful!!