### Assignment: assemble an ipyrad example data set

Follow the instructions at http://ipyrad.readthedocs.io/API_user-guide.html to assemble a dataset using the ipyrad API. You will need to download the dataset as instructed below; it is different from the one in the linked tutorial. Be sure to download the data into your scratch space and to set the project directory for your ipyrad analysis to your scratch directory. You can use any of the datasets in the downloaded directory. If you have questions, read the ipyrad docs and/or ask in the gitter chatroom.

**When finished, copy this notebook to your assignments/ dir, push it, and make a pull request.**

In [1]:
import ipyrad as ip
import ipyparallel as ipp



### Download the data
You will probably want to move the data to your scratch directory. You can run the code below to download it, or run the same commands from a terminal.

In [3]:
%%bash
## The curl command needs a capital O, not a zero
curl -LkO https://github.com/dereneaton/ipyrad/raw/master/tests/ipsimdata.tar.gz
tar -xvzf ipsimdata.tar.gz

./ipsimdata/
./ipsimdata/pairgbs_example_R2_.fastq.gz
./ipsimdata/pairgbs_wmerge_example_barcodes.txt
./ipsimdata/rad_example_genome.fa
./ipsimdata/pairddrad_example_genome.fa
./ipsimdata/pairgbs_example_R1_.fastq.gz
./ipsimdata/pairgbs_wmerge_example_R2_.fastq.gz
./ipsimdata/rad_example_genome.fa.fai
./ipsimdata/pairddrad_example_R2_.fastq.gz
./ipsimdata/pairddrad_example_genome.fa.sma
./ipsimdata/pairddrad_example_genome.fa.fai
./ipsimdata/pairgbs_wmerge_example_genome.fa
./ipsimdata/pairddrad_wmerge_example_genome.fa
./ipsimdata/pairddrad_example_genome.fa.smi
./ipsimdata/pairgbs_wmerge_example_R1_.fastq.gz
./ipsimdata/rad_example_genome.fa.smi
./ipsimdata/gbs_example_barcodes.txt
./ipsimdata/pairgbs_example_barcodes.txt
./ipsimdata/pairddrad_example_R1_.fastq.gz
./ipsimdata/pairddrad_wmerge_example_barcodes.txt
./ipsimdata/rad_example_barcodes.txt
./ipsimdata/pairddrad_wmerge_example_R1_.fastq.gz
./ipsimdata/pairddrad_wmerge_example_R2_.fastq.gz
./ipsimdata/gbs_example_R1_.fastq.gz



In [2]:
ls ipsimdata/

[0m[01;34mgbs[0m/  [01;34mpairddrad[0m/  [01;34mpairddrad_wmerge[0m/  [01;34mpairgbs[0m/  [01;34mpairgbs_wmerge[0m/  [01;34mrad[0m/


### Connect to an ipcluster instance

In [2]:
ipyclient = ipp.Client()
ipyclient.ids

            Controller appears to be listening on localhost, but not on this machine.
            If this is true, you should specify Client(...,sshserver='you@nickpichome-AB350N-Gaming-WIFI')
            or instruct your controller to listen on an external IP.


[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

In [3]:

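# Create a new Assembly object named "rad"
rad = ip.Assembly("rad")
# Directory where all results are written (point this at your scratch space)
rad.set_params('project_dir', "outfiles")
# Input reads, data type, and barcodes for the simulated RAD dataset
rad.set_params('raw_fastq_path', "./ipsimdata/rad/*.gz")
rad.set_params('datatype', 'rad')
rad.set_params('barcodes_path', './ipsimdata/rad/rad_example_barcodes.txt')
# Reference genome; it is not used while assembly_method stays 'denovo'
rad.set_params('reference_sequence', './ipsimdata/rad/rad_example_genome.fa')
# Print the parameter table to confirm the settings
rad.get_params()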

New Assembly: rad
0   assembly_name               rad                                          
1   project_dir                 ./outfiles                                   
2   raw_fastq_path              ./ipsimdata/rad/*.gz                         
3   barcodes_path               ./ipsimdata/rad/rad_example_barcodes.txt     
4   sorted_fastq_path                                                        
5   assembly_method             denovo                                       
6   reference_sequence          ./ipsimdata/rad/rad_example_genome.fa        
7   datatype                    rad                                          
8   restriction_overhang        ('TGCAG', '')                                
9   max_low_qual_bases          5                                            
10  phred_Qscore_offset         33                                           
11  mindepth_statistical        6                                            
12  mindepth_majrule            6             
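
Before starting a run, it can help to confirm that the paths in the parameter table actually point at files, since a wrong glob or barcodes path only shows up as an error once step 1 starts. The helper below is a small sketch using only the standard library; `check_inputs` is a hypothetical name, not an ipyrad function.

```python
import glob
import os


def check_inputs(raw_fastq_glob, barcodes_path):
    """Return a list of problems with the input paths; an empty list means OK."""
    problems = []
    # raw_fastq_path is a glob pattern, so it must match at least one file.
    if not glob.glob(raw_fastq_glob):
        problems.append(f"no files match raw_fastq_path: {raw_fastq_glob}")
    # barcodes_path is a single file.
    if not os.path.isfile(barcodes_path):
        problems.append(f"barcodes_path not found: {barcodes_path}")
    return problems


# Example call using the values set on the Assembly above:
# print(check_inputs("./ipsimdata/rad/*.gz",
#                    "./ipsimdata/rad/rad_example_barcodes.txt"))
```

An empty list means both paths resolve from the current working directory, which is also the directory the relative paths in the parameter table are interpreted from.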

### Assemble the dataset from step 1 to step 7

In [None]:
# Run all seven assembly steps; force=True overwrites any existing results
rad.run("1234567", show_cluster=True, force=True)

### Print the final assembly stats
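
One possible way to fill in this section is sketched below. It assumes the Assembly object exposes `name` and `stats` attributes, which is my understanding of the ipyrad API but is not shown anywhere in this notebook, so check it against the ipyrad docs. `summarize_assembly` is a hypothetical helper name.

```python
def summarize_assembly(assembly):
    """Build a printable summary from an ipyrad-style Assembly object.

    Assumes `assembly.name` is the assembly name and `assembly.stats`
    is a printable per-sample table (an assumption, see the ipyrad docs).
    """
    lines = [f"Assembly: {assembly.name}"]
    lines.append(str(assembly.stats))
    return "\n".join(lines)


# Hypothetical usage once rad.run(...) has finished:
# print(summarize_assembly(rad))
```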

### Show the location of your assembled output files
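
A minimal sketch for this section is to list the contents of the output directory. It assumes ipyrad writes final outputs to `<project_dir>/<assembly_name>_outfiles`, which would be `./outfiles/rad_outfiles` here; that layout is an assumption based on ipyrad's usual convention and is not confirmed by this notebook. `list_outfiles` is a hypothetical helper name.

```python
from pathlib import Path


def list_outfiles(project_dir, assembly_name):
    """Return sorted paths of the files in the assumed output directory."""
    # Assumed layout: <project_dir>/<assembly_name>_outfiles
    outdir = Path(project_dir) / f"{assembly_name}_outfiles"
    if not outdir.is_dir():
        # Nothing to list, e.g. because step 7 has not been run yet.
        return []
    return sorted(str(p) for p in outdir.iterdir())


# Hypothetical usage with the parameters set above:
# for path in list_outfiles("./outfiles", "rad"):
#     print(path)
```

An empty result usually means the final step has not finished or the project directory differs from the one passed in.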