
Commit

upd
ctb committed Oct 12, 2016
1 parent 160ee2e commit 611d85d
Showing 3 changed files with 63 additions and 26 deletions.
14 changes: 12 additions & 2 deletions assemble.rst
@@ -1,22 +1,32 @@
Run the MEGAHIT assembler
=========================

`MEGAHIT <https://github.com/voutcn/megahit>`__ is a very fast, quite
good assembler designed for metagenomes.

First, install it::

   cd
   git clone https://github.com/voutcn/megahit.git
   cd megahit
   make

Now, download some data::

   cd /mnt/data
   curl -O https://s3-us-west-1.amazonaws.com/dib-training.ucdavis.edu/metagenomics-scripps-2016-10-12/SRR1976948.abundtrim.subset.pe.fq.gz
   curl -O https://s3-us-west-1.amazonaws.com/dib-training.ucdavis.edu/metagenomics-scripps-2016-10-12/SRR1977249.abundtrim.subset.pe.fq.gz
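
Large downloads occasionally truncate, and a truncated gzip file will fail
partway through assembly. Before going further, it's worth checking each
archive with ``gzip -t``, which verifies the compressed stream without
unpacking it. Here's a small self-contained sketch; the ``demo.fq.gz`` file
and the ``check_gz`` helper are fabricated just for illustration:

```shell
# Make a tiny gzip file so the check can be demonstrated end to end.
echo 'sample reads' | gzip > demo.fq.gz

# gzip -t exits 0 only if the archive's checksums and structure are intact.
check_gz() {
    gzip -t "$1" && echo "$1 looks intact"
}

check_gz demo.fq.gz
```

On the real data you would run ``check_gz`` on each of the two
``*.abundtrim.subset.pe.fq.gz`` files you just downloaded.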

These are data that have already been run through the quality-trimming
steps covered elsewhere in this tutorial.

And, finally, run the assembler! ::

   mkdir /mnt/assembly
   cd /mnt/assembly
   ln -fs ../data/*.subset.pe.fq.gz .

   ~/megahit/megahit --12 SRR1976948.abundtrim.subset.pe.fq.gz,SRR1977249.abundtrim.subset.pe.fq.gz \
       -o combined
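
When the run finishes, the assembled contigs land in a FASTA file inside the
output directory (``combined/final.contigs.fa`` in the MEGAHIT version we
used; check your own output directory). A quick way to see how many contigs
you got is to count FASTA header lines. The helper and the tiny demo FASTA
below are made up for illustration:

```shell
# Count FASTA records: every record starts with a '>' header line.
count_contigs() {
    grep -c '^>' "$1"
}

# Demo on a fabricated two-contig FASTA.
printf '>contig_1\nACGTACGT\n>contig_2\nTTGGCCAA\n' > demo.fa
count_contigs demo.fa    # prints 2
```

Against the real assembly you would run ``count_contigs`` on
``combined/final.contigs.fa``.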

This will take about 25 minutes; at the end you should see output like
this::
4 changes: 2 additions & 2 deletions aws/boot.rst
@@ -42,8 +42,8 @@ Use ami-05384865.
5. Click on "Select."
=====================

6. Choose m4.xlarge.
====================

.. thumbnail:: images/boot-4.png
:width: 20%
71 changes: 49 additions & 22 deletions quality.rst
@@ -1,11 +1,15 @@
Short read quality and trimming
===============================

Start up an instance with ami-05384865 and 200 GB of local storage
(:doc:`aws/boot`). You should also configure your firewall
(:doc:`aws/configure-firewall`) to pass through TCP ports 8000-8888.

Then, `Log into your computer <aws/login-shell.html>`__.

----

You should now be logged into your Amazon computer! You should see
something like this::

   ubuntu@ip-172-30-1-252:~$
@@ -18,16 +22,10 @@ Prepping the computer
Before we do anything else, we need to set up a place to work and
install a few things.

First, let's set up a place to work. Here, we'll make /mnt writeable::

   sudo chmod a+rwxt /mnt

This makes '/mnt' a place where we can put data and working files.
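
The mode string ``a+rwxt`` gives all users read, write, and execute
permission plus the sticky bit -- the same arrangement ``/tmp`` uses, so
anyone can create files there but only a file's owner can delete it. You can
see the effect on any scratch directory:

```shell
# Make a scratch directory and apply the same mode as above.
mkdir -p scratch
chmod a+rwxt scratch

# The mode string should now read drwxrwxrwt (note the trailing 't').
ls -ld scratch
```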

.. note::

/mnt is the location we're going to use on Amazon computers, but
@@ -44,22 +42,53 @@ Installing some software
Run::

   sudo apt-get -y update && \
   sudo apt-get -y install trimmomatic fastqc python-pip \
       samtools zlib1g-dev ncurses-dev

Install Anaconda::

   curl -O https://repo.continuum.io/archive/Anaconda3-4.2.0-Linux-x86_64.sh
   bash Anaconda3-4.2.0-Linux-x86_64.sh

Then update your environment and install khmer::

   source ~/.bashrc
   pip install khmer==2.0

Running Jupyter Notebook
------------------------

Let's also run a Jupyter Notebook in /mnt. First, configure it a teensy bit
more securely, and also have it run in the background::

   jupyter notebook --generate-config
   cat >>/home/ubuntu/.jupyter/jupyter_notebook_config.py <<EOF
   c = get_config()
   c.NotebookApp.ip = '*'
   c.NotebookApp.open_browser = False
   c.NotebookApp.password = u'sha1:5d813e5d59a7:b4e430cf6dbd1aad04838c6e9cf684f4d76e245c'
   c.NotebookApp.port = 8000

   EOF
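
The long ``c.NotebookApp.password`` string is a salted SHA-1 hash of the
login password. If you want a hash for a password of your own, Jupyter ships
a ``notebook.auth.passwd()`` helper that prints one; the sketch below
rebuilds the same ``sha1:salt:digest`` format by hand using only the
standard library (this construction -- digest = SHA1(passphrase + salt) --
is our assumption about the format; prefer the real helper when it's
available):

```shell
python3 - <<'PY'
# Rebuild a Jupyter-style 'sha1:<salt>:<digest>' password hash.
# Assumed construction: digest = SHA1(passphrase + salt).
import hashlib
import os

passphrase = "davis"          # the workshop password used on this page
salt = os.urandom(6).hex()    # 12 hex characters of random salt
digest = hashlib.sha1((passphrase + salt).encode("utf-8")).hexdigest()
print(":".join(("sha1", salt, digest)))
PY
```

Paste the printed string into ``c.NotebookApp.password`` in place of the one
above.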

Now, run! ::

   cd /mnt
   jupyter notebook &

You should be able to visit port 8000 on your AWS computer and see the
Jupyter console. (The password is 'davis'.)

Data source
-----------

We're going to be using a subset of data from `Hu et al.,
2016 <http://mbio.asm.org/content/7/1/e01669-15.full>`__. This paper
from the Banfield lab samples some relatively low diversity environments
and finds a bunch of nearly complete genomes.

(See `DATA.md <https://github.com/ngs-docs/2016-metagenomics-sio/blob/work/DATA.md>`__ for a list of the data sets we're using in this tutorial.)

1. Copying in some data to work with.
-------------------------------------
@@ -150,10 +179,8 @@ to list the files, and you should see:
   SRR1976948_2_fastqc.html
   SRR1976948_2_fastqc.zip

You can download these files using your Jupyter Notebook console, if you
like; or you can look at these copies of them:

* `SRR1976948_1_fastqc/fastqc_report.html <http://2016-metagenomics-sio.readthedocs.io/en/work/_static/SRR1976948_1_fastqc/fastqc_report.html>`__
* `SRR1976948_2_fastqc/fastqc_report.html <http://2016-metagenomics-sio.readthedocs.io/en/work/_static/SRR1976948_2_fastqc/fastqc_report.html>`__
