<a href="https://colab.research.google.com/github/saulobritto/genome-assembly-gcolab/blob/main/Mapping_Reads_To_A_Reference_Genome.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#Pipeline for mapping reads to a reference genome using your own data.





---
**Pipeline developed by Saulo Britto da Silva**

*Last updated on 26.11.2020*

>Test pipeline developed to map reads to a reference genome using your own data.

---


#Installations

In [None]:
# Download Miniconda installation script
!wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh

# Make it executable
!chmod +x Miniconda3-latest-Linux-x86_64.sh

# Start installation in silent mode
!bash ./Miniconda3-latest-Linux-x86_64.sh -b -f -p /usr/local

# Make conda packages available in current environment
import sys
sys.path.append('/usr/local/lib/python3.7/site-packages/')
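The hard-coded `python3.7` path above only matches if the Miniconda installer ships Python 3.7. A minimal sketch, assuming the `/usr/local` prefix used above, that looks up whichever `site-packages` directories actually exist; the helper name `find_site_packages` is hypothetical and not part of conda.

```python
import glob
import os
import sys


def find_site_packages(prefix="/usr/local"):
    """Return every lib/python3.*/site-packages directory under prefix, sorted."""
    pattern = os.path.join(prefix, "lib", "python3.*", "site-packages")
    return sorted(p for p in glob.glob(pattern) if os.path.isdir(p))


# Append each discovered directory once, instead of hard-coding python3.7.
for path in find_site_packages():
    if path not in sys.path:
        sys.path.append(path)
```

If several Python versions live under the prefix, all of them are appended; narrowing the list to one version may be safer in that case.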

In [None]:
# -y answers the confirmation prompt so the cell does not wait for input
!conda install -y -c bioconda bowtie2

In [None]:
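#SAMtools: download the 1.11 source release, build it, and install it
!wget https://github.com/samtools/samtools/releases/download/1.11/samtools-1.11.tar.bz2
!tar -jvxf samtools-1.11.tar.bz2
%cd samtools-1.11
!make
# Install into the default prefix (/usr/local) so `samtools` is on the PATH
!make install
# Return to the directory the notebook started in
%cd ..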

In [None]:
#Connect Google Drive to the pipeline
from google.colab import drive
drive.mount('/content/drive')

#Bowtie2

In [None]:
!bowtie2 

##Example

mapping short reads against E. coli 2011 outbreak genome


---


###Create bowtie2 index database

example genome: German 2011 E. coli outbreak (database name: ecoli)

!bowtie2-build GCF_000299455.fna ecoli


---


###Bowtie2 mapping

map paired and unpaired reads (FASTQ files) against the E. coli genome database 'ecoli'

!bowtie2 -x ecoli -1 SAMPLE_r1.fastq -2 SAMPLE_r2.fastq -U SAMPLE_single_reads.fastq --no-unal -p 12 -S SAMPLE.sam


---



  -1            read 1 of paired reads
 
  -2            read 2 of paired reads
 
  -U            single unpaired reads
 
  -S SAMPLE.sam write bowtie2 output in SAM format to file SAMPLE.sam
 
  --no-unal     suppress SAM records for reads that failed to align
 
  -p 12         use up to 12 parallel processors
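The options listed above can also be assembled programmatically, which helps when the same mapping is repeated for many samples. This is a minimal sketch that only builds and prints the command; `build_bowtie2_cmd` is a hypothetical helper, and the file names are the placeholders from the example above.

```python
import shlex


def build_bowtie2_cmd(index, out_sam, r1=None, r2=None, unpaired=None, threads=2):
    """Assemble a bowtie2 argument list from the options described above."""
    if (r1 is None) != (r2 is None):
        raise ValueError("paired reads need both r1 and r2")
    if r1 is None and unpaired is None:
        raise ValueError("provide paired reads, unpaired reads, or both")
    cmd = ["bowtie2", "-x", index]
    if r1 is not None:
        cmd += ["-1", r1, "-2", r2]      # paired-end reads
    if unpaired is not None:
        cmd += ["-U", unpaired]          # single unpaired reads
    cmd += ["--no-unal", "-p", str(threads), "-S", out_sam]
    return cmd


cmd = build_bowtie2_cmd(
    "ecoli", "SAMPLE.sam",
    r1="SAMPLE_r1.fastq", r2="SAMPLE_r2.fastq",
    unpaired="SAMPLE_single_reads.fastq", threads=12,
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once bowtie2 is installed; keeping it as a list avoids shell-quoting problems with unusual file names.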

  ---
  Author of this example:

Matthias Scholz

https://twitter.com/panphlan


#SAMtools

##Examples


---


convert a SAM file to a BAM file

!samtools view -b -S SAMPLE.sam > SAMPLE.bam

  -S Input is in SAM format (ignored by samtools 1.0 and later, which detect the input format automatically)

  -b Output in BAM format



---


convert a BAM file to a SAM file

!samtools view -h SAMPLE.bam > SAMPLE.sam



---


sort a BAM file

!samtools sort SAMPLE.bam -o SAMPLE_sorted.bam


---


### Using a unix pipe (input '-')
!cat SAMPLE.bam | samtools sort - -o SAMPLE_sorted.bam

!samtools sort SAMPLE.bam SAMPLE   # legacy syntax for samtools v1.2 and earlier
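The pipe form above also makes it possible to convert and sort in one pass, without writing an unsorted BAM file to disk. A minimal sketch that only builds the pipeline string from the flags already shown in this section; `sam_to_sorted_bam_pipeline` is a hypothetical helper name.

```python
import shlex


def sam_to_sorted_bam_pipeline(sam_path, sorted_bam_path):
    """Return a shell pipeline that converts SAM to BAM and sorts it in one pass."""
    view = ["samtools", "view", "-b", sam_path]              # SAM in, BAM on stdout
    sort = ["samtools", "sort", "-", "-o", sorted_bam_path]  # '-' reads from stdin
    return shlex.join(view) + " | " + shlex.join(sort)


print(sam_to_sorted_bam_pipeline("SAMPLE.sam", "SAMPLE_sorted.bam"))
```

The printed string can be pasted into a cell after `!`, or run with `subprocess.run(pipeline, shell=True, check=True)`.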

---
sort by read name

!samtools sort -n SAMPLE.bam -o SAMPLE_sorted.bam

---
Stats

get number of alignments

!samtools view -c SAMPLE.bam

more statistics about alignments

!samtools flagstat SAMPLE.bam

comprehensive statistics

!samtools stats SAMPLE.bam
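`samtools flagstat` prints one counter per line in the form `<QC-passed> + <QC-failed> <label>`, so the mapped fraction can be pulled out with a few lines of parsing. A minimal sketch, assuming that line layout; the sample text is made up for illustration and is not output from a real run, and `parse_flagstat` and `mapped_fraction` are hypothetical helpers.

```python
import re

# "<passed> + <failed> <label>" with an optional trailing "(...)" annotation
_LINE = re.compile(r"^(\d+) \+ (\d+) (.+?)(?: \(.*\))?$")


def parse_flagstat(text):
    """Map each flagstat label to a (qc_passed, qc_failed) tuple."""
    stats = {}
    for line in text.splitlines():
        match = _LINE.match(line.strip())
        if match:
            passed, failed, label = match.groups()
            stats[label] = (int(passed), int(failed))
    return stats


def mapped_fraction(stats):
    """Fraction of QC-passed records that are mapped."""
    total = stats["in total"][0]
    return stats["mapped"][0] / total if total else 0.0


# Illustrative input only, not real results.
SAMPLE_OUTPUT = """\
1000 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 duplicates
950 + 0 mapped (95.00% : N/A)
"""
print(f"{mapped_fraction(parse_flagstat(SAMPLE_OUTPUT)):.2%}")
```

In a notebook, the text could come from `subprocess.run(["samtools", "flagstat", "SAMPLE.bam"], capture_output=True, text=True).stdout`.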

---
###Get coverage
get coverage of a selected region (e.g., from base 1,958,700 to 1,958,907 of a contig)

!samtools index sampleID.bam

!samtools mpileup -r 'contigName:1,958,700-1,958,907' sampleID.bam

### same in combination with awk to count the total and averaged coverage

!samtools mpileup -r 'contigName:1,958,700-1,958,907' sampleID.bam | awk 'BEGIN{C=0}; {C=C+$4}; END{print C "\t" C/NR}'
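The awk one-liner sums the fourth mpileup column (read depth) and divides by the number of output lines. As far as I know, mpileup leaves out zero-depth positions unless `-a` is given, so that average covers only the reported positions. A minimal Python sketch of the same calculation, with an optional region length to average over the whole region instead; the input lines are made up and `coverage_summary` is a hypothetical helper.

```python
def coverage_summary(mpileup_lines, region_length=None):
    """Sum and average the depth column (4th field) of samtools mpileup output.

    Without region_length the mean divides by the number of reported lines,
    like the awk one-liner. With region_length, unreported positions count
    as zero depth.
    """
    depths = [int(line.split("\t")[3]) for line in mpileup_lines if line.strip()]
    total = sum(depths)
    denominator = region_length if region_length is not None else len(depths)
    mean = total / denominator if denominator else 0.0
    return total, mean


# Illustrative lines only: contig, position, reference base, depth, bases, qualities.
lines = [
    "contigName\t1958700\tA\t3\t...\tIII",
    "contigName\t1958701\tC\t5\t.....\tIIIII",
    "contigName\t1958702\tG\t4\t....\tIIII",
]
print(coverage_summary(lines))                     # mean over reported positions
print(coverage_summary(lines, region_length=208))  # mean over bases 1,958,700-1,958,907
```

The region from base 1,958,700 to 1,958,907 spans 208 positions, which is where the `region_length=208` value comes from.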

---
see also: → Calling SNPs/INDELs with SAMtools/BCFtools

Note: By default, SAMtools mpileup counts only primary alignments: it discards unmapped reads, secondary alignments and duplicates. To also count secondary alignments, BEDtools could be an alternative.
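Which reads are discarded is decided by the FLAG field of each SAM record. The sketch below checks a FLAG value against the bits named in the note; the bit values come from the SAM specification, while including the QC-fail bit in the default mask is my reading of the samtools documentation and should be treated as an assumption. `skipped_by_mpileup` is a hypothetical helper.

```python
# FLAG bits from the SAM specification
UNMAPPED = 0x4
SECONDARY = 0x100
QC_FAIL = 0x200      # assumed to be part of the mpileup default filter
DUPLICATE = 0x400

MPILEUP_DEFAULT_SKIP = UNMAPPED | SECONDARY | QC_FAIL | DUPLICATE


def skipped_by_mpileup(flag):
    """True if any bit of the default skip mask is set in this FLAG value."""
    return bool(flag & MPILEUP_DEFAULT_SKIP)


for flag in (0, 99, 4, 256, 1024):
    print(flag, skipped_by_mpileup(flag))
```

FLAG 99 (paired, properly paired, mate on the reverse strand, first in pair) sets none of the masked bits, so such a read is counted.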

---
###SAMtools documentation

[Samtools homepage](http://www.htslib.org/)

[Samtools documentation](http://www.htslib.org/doc/samtools.html)

[Introduction by Dave Tang](http://davetang.org/wiki/tiki-index.php?page=SAMTools)


Alternative BAM processing tools

→ Sambamba

→ BEDtools

---
Author of this example: 

Matthias Scholz

https://twitter.com/panphlan



#JBrowse 2

In [None]:
!curl -sL https://deb.nodesource.com/setup_15.x | sudo -E bash -
!sudo apt-get install -y nodejs



In [None]:
!npm install -g @jbrowse/cli


In [None]:
!jbrowse --version

@jbrowse/cli/1.0.1 linux-x64 node-v15.3.0


In [None]:
!jbrowse create jbrowse2

 ›   Error: jbrowse2 This directory has existing files and could cause
 ›   conflicts with create. Please choose another directory or use the force
 ›   flag to overwrite existing files


In [None]:
%cd /content/jbrowse2

/content/jbrowse2


In [None]:
# Replace with the location of your BAM file
#jbrowse add-track /data/volvox.bam --load copy



 ›   Error: Could not resolve to a file or a URL:
 ›   "/content/newbler_scaffolds_original.fasta.fai"


In [None]:
### Install ngrok
#!wget https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip
#!unzip ngrok-stable-linux-amd64.zip

### Run ngrok to tunnel the JBrowse server port 9090 to the outside world.
### This command runs in the background.
get_ipython().system_raw('./ngrok http 9090 &')

### Get the public URL where you can access JBrowse. Copy this URL.
! curl -s http://localhost:4040/api/tunnels | python3 -c \
    "import sys, json; print(json.load(sys.stdin)['tunnels'][0]['public_url'])"

http://7f4e2ec1b186.ngrok.io
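The inline one-liner above indexes `['tunnels'][0]` directly, so it raises an `IndexError` if ngrok has not registered a tunnel yet. A minimal sketch of the same lookup that returns `None` in that case; `first_public_url` is a hypothetical helper, and the sample payload is trimmed to the two keys the cell reads (the real response carries more fields).

```python
import json


def first_public_url(tunnels_json):
    """Return the first tunnel's public_url, or None if no tunnel is listed."""
    tunnels = json.loads(tunnels_json).get("tunnels", [])
    return tunnels[0]["public_url"] if tunnels else None


# Trimmed, illustrative payloads in the shape the cell above expects.
ready = '{"tunnels": [{"public_url": "http://7f4e2ec1b186.ngrok.io"}]}'
not_ready = '{"tunnels": []}'

print(first_public_url(ready))
print(first_public_url(not_ready))
```

When the result is `None`, waiting a second or two and querying `http://localhost:4040/api/tunnels` again is usually enough.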


In [None]:
#Start the admin-server
#!jbrowse admin-server

#!npx serve .

   ┌────────────────────────────────────────────────────────────────────┐
   │                                                                    │
   │   Now serving JBrowse                                              │
   │   Navigate to the below URL to configure                           │
   │                                                                    │
   │   - Local:            http://localhost:9090?adminKey=c0e4fab488    │
   │   - On Your Network:  http://172.28.0.2:9090?adminKey=c0e4fab488   │
   │                                                                    │
   └────────────────────────────────────────────────────────────────────┘

If you are running yarn start you can launch http://localhost:3000?adminKey=c0e4fab488&adminServer=http://localhos