In [2]:
%matplotlib inline
import functions
import ipywidgets as widgets

In [3]:
%%javascript
IPython.OutputArea.prototype._should_scroll = function(lines) {
    return false;
}


In [4]:
import importlib
importlib.reload(functions)
m = functions.getPathNames()

This app will help you analyse your next-generation sequencing data.

You might want to use some of the graphs for your infographic later. To save an image, right-click on it and choose "Save Image As".

## Important: Select your dataset ID in this menu before running the analysis below.

In [5]:
dataset_id = widgets.Dropdown(options=range(1, 7))
dataset_id

## Examine the data

Your data consists of a set of 10,000 next-generation sequencing **reads**.

Your job is to find out as much as you can about the animal the reads come from.

First, you need to look at your data.

Click on the button below to view 50 random reads from your data.

In [5]:
importlib.reload(functions)
button = widgets.Button(description="View 50 Reads",
                        layout=widgets.Layout(width='300px',
                                             height='50px'))

def onbuttonclick(b):
    # Load the reads for the selected dataset, then show 50 of them at random.
    reads = functions.FastaToDict("../data/sample_%s_reads.fasta" % dataset_id.value)
    functions.sampleReads(reads, 50)

button.on_click(onbuttonclick)
b = display(button)
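
To give a rough idea of what happens when you click the button, here is a minimal sketch of reading a FASTA file and picking reads at random. The helper names `fasta_to_dict` and `sample_reads` are hypothetical stand-ins written for this illustration; they are not the code inside the `functions` module, which may work differently. The example reads are made up.

```python
import io
import random


def fasta_to_dict(handle):
    """Parse FASTA text into {read_name: sequence} (hypothetical helper)."""
    reads = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header line starts a new read.
            name = line[1:]
            reads[name] = ""
        elif name is not None:
            # Sequence lines are joined onto the current read.
            reads[name] += line.upper()
    return reads


def sample_reads(reads, n, seed=None):
    """Return up to n randomly chosen (name, sequence) pairs."""
    rng = random.Random(seed)
    names = rng.sample(sorted(reads), min(n, len(reads)))
    return [(name, reads[name]) for name in names]


# Made-up example data standing in for a real reads file.
example_fasta = io.StringIO(
    ">read_1\nACGTAC\nGGTA\n>read_2\nTTGACA\n>read_3\nCCGGAA\n"
)
example_reads = fasta_to_dict(example_fasta)
for name, seq in sample_reads(example_reads, 2, seed=0):
    print(name, seq)
```

Asking for more reads than exist simply returns all of them, so the same function would work for small test files.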

If you want to see all of your reads, click the button below.

In [6]:
importlib.reload(functions)
button = widgets.Button(description="View All Reads",
                        layout=widgets.Layout(width='300px',
                                             height='50px'))

def onbuttonclick(b):
    functions.goToReads(dataset_id.value)
button.on_click(onbuttonclick)
b = display(button)

## Search your data

It is difficult to learn anything about your data just by looking at it!

Instead, we can search your reads for sequences that look similar to known reference sequences.

### Research the reference microbes

First, you will search your sample for each of 20 microbes that can cause disease in some animals.

You can find out about these microbes by selecting them in the menu below and by searching for them online.

In [7]:
importlib.reload(functions)
M = functions.getDropdown('microbe_name', microbes=m)
w = widgets.interact(functions.plotRefSeq, microbe=M)

### Match your reads to the reference data

We can now check whether any of your reads match these known microbes.
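
As a simplified picture of what "matching" means, the sketch below checks whether each read appears exactly inside a reference sequence. This is an assumption made for illustration: the `match_reads` function is hypothetical, the sequences are made up, and the mapping used by this app may allow mismatches or use a different method entirely.

```python
def match_reads(reads, references):
    """Return {reference_name: [names of reads found as exact substrings]}."""
    matches = {ref_name: [] for ref_name in references}
    for read_name, read_seq in reads.items():
        for ref_name, ref_seq in references.items():
            # Exact substring test: the read must appear letter for letter.
            if read_seq in ref_seq:
                matches[ref_name].append(read_name)
    return matches


# Made-up toy sequences, far shorter than real genomes or reads.
toy_references = {
    "microbe_A": "ACGTACGGTACCTTGA",
    "microbe_B": "TTTTGGGGCCCCAAAA",
}
toy_reads = {"read_1": "ACGGTA", "read_2": "GGGGCC", "read_3": "AAAAAA"}

toy_matches = match_reads(toy_reads, toy_references)
for ref_name, read_names in toy_matches.items():
    print(ref_name, len(read_names), "reads matched")
```

Here `read_3` matches neither reference, which is what you would expect for a read that comes from the animal itself or from a microbe that is not in the reference set.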

In [18]:
importlib.reload(functions)
button = widgets.Button(description="Match Reads to Reference Sequences!",
                        layout=widgets.Layout(width='300px',
                                             height='50px'))

def onbuttonclick(b):
    r = functions.mapReadsDisplay(dataset_id.value, '../data/references.fasta')
button.on_click(onbuttonclick)
b = display(button)

Loading data....

Loaded 10,000 reads
Mapping reads...

1 / 20: Mapping reads to Feline leukemia virus
4347 reads identified matching Feline leukemia virus


Now your reads are mapped to the 20 reference microbe genomes.

Click on the button below to see how many reads matched each reference genome.

In [9]:
importlib.reload(functions)
button = widgets.Button(description="Count Reads per Microbe",
                        layout=widgets.Layout(width='300px',
                                             height='50px'))

def onbuttonclick(b):
    functions.showMappingBar(dataset_id.value)
button.on_click(onbuttonclick)
b = display(button)

You can also see where on each microbe's reference genome your reads matched.

In [10]:
importlib.reload(functions)
button = widgets.Button(description="Show Read Mapping",
                        layout=widgets.Layout(width='300px',
                                             height='50px'))

def onbuttonclick(b):
    functions.showMapping(dataset_id.value)
button.on_click(onbuttonclick)
b = display(button)
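
One way to think about "where reads were found" is coverage: for every position in the reference genome, how many reads overlap it. The sketch below is an illustration under the same exact-match assumption as a simple text search; the `coverage` function and the toy sequences are hypothetical and are not taken from the `functions` module.

```python
def coverage(reference, reads):
    """Count how many reads cover each position of the reference."""
    depth = [0] * len(reference)
    for read in reads:
        start = reference.find(read)
        while start != -1:
            # Every base the read spans gets one more unit of depth.
            for i in range(start, start + len(read)):
                depth[i] += 1
            # Keep looking for further occurrences of the same read.
            start = reference.find(read, start + 1)
    return depth


# Made-up toy data for illustration only.
toy_reference = "ACGTACGGTA"
toy_read_list = ["ACGT", "CGTA", "GGTA"]

toy_depth = coverage(toy_reference, toy_read_list)
print(toy_depth)
```

Positions with a depth of zero are parts of the reference that no read matched, which is why a mapping plot can show gaps.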

Finally, you can view your results as a table. To download them, click the download button and save the output file.

In [11]:
importlib.reload(functions)
button = widgets.Button(description="Show Read Mapping Table",
                        layout=widgets.Layout(width='300px',
                                             height='50px'))

def onbuttonclick(b):
    functions.showMappingTable(dataset_id.value, 1)
button.on_click(onbuttonclick)
button2 = widgets.Button(description="Download Read Mapping Table",
                         layout=widgets.Layout(width='300px',
                                               height='50px'))

def onbuttonclick2(b):
    functions.showMappingTable(dataset_id.value, 2)
button2.on_click(onbuttonclick2)
b = display(button)
b2 = display(button2)
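
If you are curious what a downloadable table looks like behind the scenes, the sketch below turns a set of read counts into CSV text, a format that spreadsheet programs can open. The `mapping_table_csv` function, the column names, and the counts are all hypothetical; the file produced by this app may be laid out differently.

```python
import csv
import io


def mapping_table_csv(read_counts):
    """Return CSV text with one row per microbe (hypothetical layout)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["microbe", "reads_matched"])
    for microbe, count in read_counts.items():
        writer.writerow([microbe, count])
    return buffer.getvalue()


# Made-up counts for illustration only.
toy_counts = {"microbe_A": 12, "microbe_B": 40, "microbe_C": 3}

csv_text = mapping_table_csv(toy_counts)
print(csv_text)
```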

Find the **top three** disease-causing microbes in your sample. You will use these results later.
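
You can read the top three straight off the bar chart or the table. For reference, the same ranking could be done in code as sketched below, using Python's `collections.Counter`. The microbe names and counts are made up for illustration.

```python
from collections import Counter

# Made-up read counts per microbe, for illustration only.
toy_read_counts = Counter(
    {"microbe_A": 12, "microbe_B": 40, "microbe_C": 3, "microbe_D": 25}
)

# most_common(3) returns the three entries with the highest counts.
top_three = toy_read_counts.most_common(3)
for microbe, count in top_three:
    print(microbe, count)
```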

## Match your reads to digestive bacteria

Next, we will look for **digestive bacteria** in the sample. These are helpful bacteria that help our bodies digest food and usually don't cause disease.

There are many more digestive bacteria to look for than disease-causing microbes, so we won't look at them one by one.

Click on the button below to map your reads to 44 digestive bacteria.

In [6]:
importlib.reload(functions)
button = widgets.Button(description="Match Reads to Reference Sequences!",
                        layout=widgets.Layout(width='300px',
                                             height='50px'))

def onbuttonclick(b):
    r = functions.mapReadsDisplay(dataset_id.value, '../data/digest.fasta', typ='digest')
button.on_click(onbuttonclick)
b = display(button)

Loading data....

Loaded 10,000 reads
Mapping reads...

1 / 20: Mapping reads to Christensenella massiliensis
0 reads identified matching Christensenella massiliensis
0 reads identified in total

2 / 20: Mapping reads to Roseburia hominis
0 reads identified matching Roseburia hominis
0 reads identified in total

3 / 20: Mapping reads to Bacteroides vulgatus
0 reads identified matching Bacteroides vulgatus
0 reads identified in total

4 / 20: Mapping reads to Butyricicoccus pullicaecorum
0 reads identified matching Butyricicoccus pullicaecorum
0 reads identified in total

5 / 20: Mapping reads to Intestinibacter bartlettii
525 reads identified matching Intestinibacter bartlettii
525 reads identified in total

6 / 20: Mapping reads to Treponema succinifaciens
17 reads identified matching Treponema succinifaciens
542 reads identified in total

7 / 20: Mapping reads to Enterococcus hirae
1285 reads identified matching Enterococcus hirae
1827 reads identified in total

8 / 20: Mapping reads

In [21]:
importlib.reload(functions)
button = widgets.Button(description="Count Reads per Microbe",
                        layout=widgets.Layout(width='300px',
                                             height='50px'))

def onbuttonclick(b):
    functions.showMappingBar(dataset_id.value, typ='digest')
button.on_click(onbuttonclick)
b = display(button)