{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Running in Docker container on Ostrich\n",
"\n",
"#### Started Docker container with the following command:\n",
"\n",
"```docker run -p 8888:8888 -v /Users/sam/data/:/data -v /Users/sam/owl_home/:/owl_home -v /Users/sam/owl_web/:/owl_web -v /Users/sam/gitrepos:/gitrepos -it f99537d7e06a```\n",
"\n",
"The command allows access to Jupyter Notebook over port 8888 and makes my Jupyter Notebook GitHub repo and my data files on Owl/home and Owl/web accessible to the Docker container.\n",
"\n",
"Once the container was running, I started Jupyter Notebook with the following command inside the Docker container:\n",
"\n",
"```jupyter notebook```\n",
"\n",
"This is configured in the Docker container to launch a Jupyter Notebook without a browser on port 8888.\n",
"\n",
"The Docker container is running on an image created from this [Dockerfile (Git commit 443bc42)](https://github.com/sr320/LabDocs/blob/443bc425cd36d23a07cf12625f38b7e3a397b9be/code/dockerfiles/Dockerfile.bio)"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Mon Feb 27 18:32:53 UTC 2017\n"
]
}
],
"source": [
"%%bash\n",
"date"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Check computer specs"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0f2bca9c664b\n"
]
}
],
"source": [
"%%bash\n",
"hostname"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Architecture: x86_64\n",
"CPU op-mode(s): 32-bit, 64-bit\n",
"Byte Order: Little Endian\n",
"CPU(s): 8\n",
"On-line CPU(s) list: 0-7\n",
"Thread(s) per core: 1\n",
"Core(s) per socket: 8\n",
"Socket(s): 1\n",
"Vendor ID: GenuineIntel\n",
"CPU family: 6\n",
"Model: 26\n",
"Model name: Intel(R) Xeon(R) CPU E5520 @ 2.27GHz\n",
"Stepping: 5\n",
"CPU MHz: 2260.998\n",
"BogoMIPS: 4521.99\n",
"Hypervisor vendor: KVM\n",
"Virtualization type: full\n",
"L1d cache: 32K\n",
"L1i cache: 32K\n",
"L2 cache: 256K\n",
"L3 cache: 8192K\n"
]
}
],
"source": [
"%%bash\n",
"lscpu"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Download Jay's Non-Demultiplexed data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### The files in this folder are as follows (email correspondence):\n",
"> Hi Sam,\n",
">\n",
"Your new directory is on the server \"Dimond_170224\", the checksum info in listed in a text file for the three files. Is there a better way to list the checksums? I'm new to this. \n",
"> \n",
">Also, I gave you the 2 reads and the 6bp index file. \n",
"> \n",
">Best,\n",
">\n",
">Shana\n",
"\n",
">Shana McDevitt\n",
"\n",
">Director\n",
"\n",
">Vincent J. Coates Genomics Sequencing Laboratory\n",
"\n",
">California Institute for Quantitative Biosciences (QB3)\n",
"\n",
">University of California, Berkeley"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/data/20170227_jay_data_tmp\n"
]
}
],
"source": [
"cd /data/20170227_jay_data_tmp/"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### The following command uses ```wget``` to download all of the files in the target directory. Here's an explanation of the code:\n",
"\n",
"- ```time```: Evaluates how long it takes for the command to complete. \n",
"\n",
"- ```WGETRC=/data/wgetrc_berk_seq```: Sets the environment variable ```WGETRC``` to the path of the ```wgetrc_berk_seq``` file for the duration of this one command. ```wget``` reads its startup settings from that file, which contains the username and password needed to FTP the data from the UC Berkeley server. This lets me run the command in a Jupyter notebook without pasting the actual username and password into the command string.\n",
"\n",
"- ```-r```: Recursive; i.e. download all things in this directory and anything in any subdirectories.\n",
"\n",
"- ```-np```: No parent; i.e. do not ascend to higher directories.\n",
"\n",
"- ```-nc```: No clobber; i.e. do not overwrite any existing files in the download directory.\n",
"\n",
"- ```-q```: Quiet; i.e. do not print wget status to screen. This is to prevent bogging down the Jupyter notebook with thousands of output lines."
]
},
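{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of how the ```WGETRC``` piece fits together, assuming the credentials file uses wget's standard ```user``` and ```password``` settings. The real ```/data/wgetrc_berk_seq``` file is not shown in this notebook, so the file contents, ```EXAMPLE_USER```, ```EXAMPLE_PASSWORD```, and the ```rc_file``` and ```cmd``` variable names are placeholders. The command is printed rather than run, since the server needs real credentials.\n",
"\n",
"```bash\n",
"# Hypothetical stand-in for /data/wgetrc_berk_seq (placeholder credentials).\n",
"rc_file=\"$(mktemp)\"\n",
"printf 'user = EXAMPLE_USER\\npassword = EXAMPLE_PASSWORD\\n' > \"$rc_file\"\n",
"chmod 600 \"$rc_file\"   # keep the credentials readable by the owner only\n",
"\n",
"# Prefixing the command with WGETRC=... sets the variable for that command only;\n",
"# wget then reads its settings (including the login) from the file at that path.\n",
"cmd=\"WGETRC=$rc_file wget -r -np -nc -q ftp://gslserver.qb3.berkeley.edu/Dimond_170224\"\n",
"echo \"$cmd\"\n",
"\n",
"rm \"$rc_file\"   # clean up the placeholder file\n",
"```\n",
"\n",
"Keeping the login in a separate file means the notebook can be shared on GitHub without exposing the credentials."
]
},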
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"real\t37m7.999s\n",
"user\t0m3.130s\n",
"sys\t20m34.060s\n"
]
}
],
"source": [
"%%bash\n",
"time WGETRC=/data/wgetrc_berk_seq wget -r -np -nc -q ftp://gslserver.qb3.berkeley.edu/Dimond_170224"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"total 0\n",
"drwxr-xr-x 1 srlab staff 102 Feb 27 19:54 gslserver.qb3.berkeley.edu\n"
]
}
],
"source": [
"%%bash\n",
"ls -lh"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/data/20170227_jay_data_tmp/gslserver.qb3.berkeley.edu/Dimond_170224\n"
]
}
],
"source": [
"cd gslserver.qb3.berkeley.edu/Dimond_170224/"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"total 41G\n",
"-rw-r--r-- 1 srlab staff 2.1G Feb 24 23:28 JD002_S0_L005_I1_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 18G Feb 24 23:28 JD002_S0_L005_R1_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 22G Feb 24 23:28 JD002_S0_L005_R2_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 192 Feb 25 00:01 md5sum_report\n"
]
}
],
"source": [
"%%bash\n",
"ls -lh"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"baa87464b77f937fccf496351bb7f000 JD002_S0_L005_I1_001.fastq.gz\r\n",
"e05eea61dbd405c890f241f824b2012b JD002_S0_L005_R1_001.fastq.gz\r\n",
"9e34ddfc4dbdd9a96bd4f8f102f52693 JD002_S0_L005_R2_001.fastq.gz\r\n"
]
}
],
"source": [
"cat md5sum_report"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generate our own MD5 checksums"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"real\t8m0.815s\n",
"user\t0m4.260s\n",
"sys\t4m44.580s\n"
]
}
],
"source": [
"%%bash\n",
"time for i in *.gz\n",
" do\n",
" md5sum \"$i\" >> checksums.md5\n",
" done"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"baa87464b77f937fccf496351bb7f000 JD002_S0_L005_I1_001.fastq.gz\r\n",
"e05eea61dbd405c890f241f824b2012b JD002_S0_L005_R1_001.fastq.gz\r\n",
"9e34ddfc4dbdd9a96bd4f8f102f52693 JD002_S0_L005_R2_001.fastq.gz\r\n"
]
}
],
"source": [
"cat checksums.md5"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Compare MD5 checksums\n",
"\n",
"Visual inspection suggests that these are good to go, but we'll compare them programmatically anyway..."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%%bash\n",
"diff checksums.md5 md5sum_report"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
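"No output means ```diff``` found no differences between the two files. To further verify, we can check the exit status of ```diff``` (0 means the files are identical, 1 means they differ, 2 means an error occurred), which bash stores in the variable ```$?```. Note that ```$?``` only reflects the previous command within the same shell: each ```%%bash``` cell starts a new bash process, so for a meaningful result ```echo $?``` has to run in the same cell as the ```diff``` command."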
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0\n"
]
}
],
"source": [
"%%bash\n",
"echo $?"
]
},
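{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of reading the exit status in the same cell as ```diff```, using two small throwaway files in place of ```checksums.md5``` and ```md5sum_report```. The temporary directory, the file names ```ours.md5``` and ```theirs.md5```, and the ```status``` variable are made up for illustration; the checksum line is copied from the report above.\n",
"\n",
"```bash\n",
"# Throwaway stand-ins for checksums.md5 and md5sum_report.\n",
"tmp=\"$(mktemp -d)\"\n",
"printf 'baa87464b77f937fccf496351bb7f000  JD002_S0_L005_I1_001.fastq.gz\\n' > \"$tmp/ours.md5\"\n",
"cp \"$tmp/ours.md5\" \"$tmp/theirs.md5\"\n",
"\n",
"# Capture the exit status immediately after diff, in the same bash process.\n",
"diff \"$tmp/ours.md5\" \"$tmp/theirs.md5\"\n",
"status=$?\n",
"echo \"diff exit status: $status\"   # 0 = identical, 1 = different, 2 = error\n",
"\n",
"rm -r \"$tmp\"\n",
"```\n",
"\n",
"Saving the value into a variable right away matters because any later command, including ```echo```, overwrites it."
]
},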
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Copy files to directories on Owl\n",
"\n",
"Jay has three different species in his sequencing data, so I'm copying the data to each of three different species folders on Owl. The code below uses the ```--no-clobber``` argument to prevent ```cp``` from overwriting any existing files in the destination directory that might have the same file name."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"real\t130m18.473s\n",
"user\t0m0.020s\n",
"sys\t18m23.290s\n"
]
}
],
"source": [
"%%bash\n",
"time for file in *.gz\n",
" do\n",
" cp --no-clobber \"$file\" /owl_web/nightingales/P_generosa/\n",
" cp --no-clobber \"$file\" /owl_web/nightingales/Porites_spp/\n",
" cp --no-clobber \"$file\" /owl_web/nightingales/A_elegantissima/\n",
" done"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generate new checksums for files copied to Owl"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"real\t31m19.144s\n",
"user\t0m3.860s\n",
"sys\t4m44.720s\n"
]
}
],
"source": [
"%%bash\n",
"time for i in /owl_web/nightingales/P_generosa/JD002_S0_L005*.gz\n",
" do\n",
" md5sum \"$i\" >> temp_checksums.md5\n",
" done"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"real\t27m17.756s\n",
"user\t0m4.390s\n",
"sys\t4m44.170s\n"
]
}
],
"source": [
"%%bash\n",
"time for i in /owl_web/nightingales/Porites_spp/JD002_S0_L005*.gz\n",
" do\n",
" md5sum \"$i\" >> temp_checksums.md5\n",
" done"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"real\t26m45.605s\n",
"user\t0m4.870s\n",
"sys\t4m43.930s\n"
]
}
],
"source": [
"%%bash\n",
"time for i in /owl_web/nightingales/A_elegantissima/JD002_S0_L005*.gz\n",
" do\n",
" md5sum \"$i\" >> temp_checksums.md5\n",
" done"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Compare initial checksums with temporary checksums on Owl"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I screwed up and didn't create/write the ```temp_checksums.md5``` file into the different directories on Owl. Will create a concatenated ```md5sum_report``` file that mimics the contents of the ```temp_checksums.md5``` file."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"baa87464b77f937fccf496351bb7f000 /owl_web/nightingales/P_generosa/JD002_S0_L005_I1_001.fastq.gz\n",
"e05eea61dbd405c890f241f824b2012b /owl_web/nightingales/P_generosa/JD002_S0_L005_R1_001.fastq.gz\n",
"9e34ddfc4dbdd9a96bd4f8f102f52693 /owl_web/nightingales/P_generosa/JD002_S0_L005_R2_001.fastq.gz\n",
"baa87464b77f937fccf496351bb7f000 /owl_web/nightingales/Porites_spp/JD002_S0_L005_I1_001.fastq.gz\n",
"e05eea61dbd405c890f241f824b2012b /owl_web/nightingales/Porites_spp/JD002_S0_L005_R1_001.fastq.gz\n",
"9e34ddfc4dbdd9a96bd4f8f102f52693 /owl_web/nightingales/Porites_spp/JD002_S0_L005_R2_001.fastq.gz\n",
"baa87464b77f937fccf496351bb7f000 /owl_web/nightingales/A_elegantissima/JD002_S0_L005_I1_001.fastq.gz\n",
"e05eea61dbd405c890f241f824b2012b /owl_web/nightingales/A_elegantissima/JD002_S0_L005_R1_001.fastq.gz\n",
"9e34ddfc4dbdd9a96bd4f8f102f52693 /owl_web/nightingales/A_elegantissima/JD002_S0_L005_R2_001.fastq.gz\n"
]
}
],
"source": [
"%%bash\n",
"cat temp_checksums.md5"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%%bash\n",
"cat md5sum_report >> md5sum_report_cat\n",
"cat md5sum_report >> md5sum_report_cat\n",
"cat md5sum_report >> md5sum_report_cat"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"baa87464b77f937fccf496351bb7f000 JD002_S0_L005_I1_001.fastq.gz\n",
"e05eea61dbd405c890f241f824b2012b JD002_S0_L005_R1_001.fastq.gz\n",
"9e34ddfc4dbdd9a96bd4f8f102f52693 JD002_S0_L005_R2_001.fastq.gz\n",
"baa87464b77f937fccf496351bb7f000 JD002_S0_L005_I1_001.fastq.gz\n",
"e05eea61dbd405c890f241f824b2012b JD002_S0_L005_R1_001.fastq.gz\n",
"9e34ddfc4dbdd9a96bd4f8f102f52693 JD002_S0_L005_R2_001.fastq.gz\n",
"baa87464b77f937fccf496351bb7f000 JD002_S0_L005_I1_001.fastq.gz\n",
"e05eea61dbd405c890f241f824b2012b JD002_S0_L005_R1_001.fastq.gz\n",
"9e34ddfc4dbdd9a96bd4f8f102f52693 JD002_S0_L005_R2_001.fastq.gz\n"
]
}
],
"source": [
"%%bash\n",
"cat md5sum_report_cat"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1,9c1,9\n",
"< baa87464b77f937fccf496351bb7f000 JD002_S0_L005_I1_001.fastq.gz\n",
"< e05eea61dbd405c890f241f824b2012b JD002_S0_L005_R1_001.fastq.gz\n",
"< 9e34ddfc4dbdd9a96bd4f8f102f52693 JD002_S0_L005_R2_001.fastq.gz\n",
"< baa87464b77f937fccf496351bb7f000 JD002_S0_L005_I1_001.fastq.gz\n",
"< e05eea61dbd405c890f241f824b2012b JD002_S0_L005_R1_001.fastq.gz\n",
"< 9e34ddfc4dbdd9a96bd4f8f102f52693 JD002_S0_L005_R2_001.fastq.gz\n",
"< baa87464b77f937fccf496351bb7f000 JD002_S0_L005_I1_001.fastq.gz\n",
"< e05eea61dbd405c890f241f824b2012b JD002_S0_L005_R1_001.fastq.gz\n",
"< 9e34ddfc4dbdd9a96bd4f8f102f52693 JD002_S0_L005_R2_001.fastq.gz\n",
"---\n",
"> baa87464b77f937fccf496351bb7f000 /owl_web/nightingales/P_generosa/JD002_S0_L005_I1_001.fastq.gz\n",
"> e05eea61dbd405c890f241f824b2012b /owl_web/nightingales/P_generosa/JD002_S0_L005_R1_001.fastq.gz\n",
"> 9e34ddfc4dbdd9a96bd4f8f102f52693 /owl_web/nightingales/P_generosa/JD002_S0_L005_R2_001.fastq.gz\n",
"> baa87464b77f937fccf496351bb7f000 /owl_web/nightingales/Porites_spp/JD002_S0_L005_I1_001.fastq.gz\n",
"> e05eea61dbd405c890f241f824b2012b /owl_web/nightingales/Porites_spp/JD002_S0_L005_R1_001.fastq.gz\n",
"> 9e34ddfc4dbdd9a96bd4f8f102f52693 /owl_web/nightingales/Porites_spp/JD002_S0_L005_R2_001.fastq.gz\n",
"> baa87464b77f937fccf496351bb7f000 /owl_web/nightingales/A_elegantissima/JD002_S0_L005_I1_001.fastq.gz\n",
"> e05eea61dbd405c890f241f824b2012b /owl_web/nightingales/A_elegantissima/JD002_S0_L005_R1_001.fastq.gz\n",
"> 9e34ddfc4dbdd9a96bd4f8f102f52693 /owl_web/nightingales/A_elegantissima/JD002_S0_L005_R2_001.fastq.gz\n"
]
}
],
"source": [
"%%bash\n",
"diff md5sum_report_cat temp_checksums.md5"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Well, I didn't take into account that the full path to the file would be written to the checksum file. As such, the ```diff``` command sees this. However, the checksums appear to visually match. Will proceed with adding the checksums to the checksum files in each Owl directory."
]
},
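{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to compare the two files programmatically despite the differing paths is to compare only the hash column. A minimal sketch, assuming the usual ```md5sum``` layout of hash first, then file name. The temporary files and the ```hash``` and ```result``` variables are made up for illustration and stand in for ```md5sum_report_cat``` and ```temp_checksums.md5```.\n",
"\n",
"```bash\n",
"# Throwaway stand-ins: same hash, one bare file name and one full Owl path.\n",
"tmp=\"$(mktemp -d)\"\n",
"hash=\"baa87464b77f937fccf496351bb7f000\"\n",
"printf '%s  %s\\n' \"$hash\" \"JD002_S0_L005_I1_001.fastq.gz\" > \"$tmp/report\"\n",
"printf '%s  %s\\n' \"$hash\" \"/owl_web/nightingales/P_generosa/JD002_S0_L005_I1_001.fastq.gz\" > \"$tmp/owl\"\n",
"\n",
"# Keep only field 1 (the hash) of each file, then diff the two hash lists.\n",
"awk '{print $1}' \"$tmp/report\" > \"$tmp/report.hashes\"\n",
"awk '{print $1}' \"$tmp/owl\" > \"$tmp/owl.hashes\"\n",
"if diff \"$tmp/report.hashes\" \"$tmp/owl.hashes\" > /dev/null\n",
"then\n",
"    result=\"match\"\n",
"else\n",
"    result=\"differ\"\n",
"fi\n",
"echo \"$result\"\n",
"\n",
"rm -r \"$tmp\"\n",
"```\n",
"\n",
"This only checks that the hashes agree line by line, so it relies on both files listing the files in the same order."
]
},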
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Append checksums to existing checksum files in each directory"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%%bash\n",
"cat checksums.md5sums.md5sum_report >> /owl_web/nightingales/P_generosa/checksums.md5\n",
"cat md5sum_report >> /owl_web/nightingales/Porites_spp/checksums.md5\n",
"cat md5sum_report >> /owl_web/nightingales/A_elegantissima/checksums.md5"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Whoops! Typo in that first line above! Fixed below"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%%bash\n",
"cat md5sum_report >> /owl_web/nightingales/P_generosa/checksums.md5"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"total 41G\n",
"-rw-r--r-- 1 srlab staff 2.1G Feb 24 23:28 JD002_S0_L005_I1_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 18G Feb 24 23:28 JD002_S0_L005_R1_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 22G Feb 24 23:28 JD002_S0_L005_R2_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 192 Feb 27 20:44 checksums.md5\n",
"-rw-r--r-- 1 srlab staff 192 Feb 25 00:01 md5sum_report\n",
"-rw-r--r-- 1 srlab staff 576 Feb 28 01:13 md5sum_report_cat\n",
"-rw-r--r-- 1 srlab staff 891 Feb 28 01:13 temp_checksums.md5\n"
]
}
],
"source": [
"%%bash\n",
"ls -lh"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"rm -rf /data/20170227_jay_data_tmp/gslserver.qb3.berkeley.edu/"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"total 0\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory\n"
]
}
],
"source": [
"%%bash\n",
"ls -lh /data/20170227_jay_data_tmp/"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Download Jay's demultiplexed data"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/data/20170227_jay_data_tmp\n"
]
}
],
"source": [
"cd /data/20170227_jay_data_tmp/"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"real\t41m4.943s\n",
"user\t0m3.130s\n",
"sys\t17m12.850s\n"
]
}
],
"source": [
"%%bash\n",
"time WGETRC=/data/wgetrc_berk_seq wget -r -np -nc -q ftp://gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/Roberts"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"total 34G\n",
"-rw-r--r-- 1 srlab staff 1.1G Feb 21 23:13 JD002A_S131_L005_R1_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 1.3G Feb 21 23:13 JD002A_S131_L005_R2_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 1.3G Feb 21 23:13 JD002B_S132_L005_R1_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 1.6G Feb 21 23:13 JD002B_S132_L005_R2_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 1.2G Feb 21 23:13 JD002C_S133_L005_R1_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 1.5G Feb 21 23:13 JD002C_S133_L005_R2_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 1.4G Feb 21 23:13 JD002D_S134_L005_R1_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 1.7G Feb 21 23:13 JD002D_S134_L005_R2_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 1.4G Feb 21 23:13 JD002E_S135_L005_R1_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 1.8G Feb 21 23:13 JD002E_S135_L005_R2_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 1.5G Feb 21 23:13 JD002F_S136_L005_R1_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 1.9G Feb 21 23:13 JD002F_S136_L005_R2_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 1.1G Feb 21 23:13 JD002G_S137_L005_R1_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 1.3G Feb 21 23:13 JD002G_S137_L005_R2_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 1.3G Feb 21 23:13 JD002H_S138_L005_R1_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 1.8G Feb 21 23:13 JD002H_S138_L005_R2_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 1.2G Feb 21 23:13 JD002I_S139_L005_R1_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 1.6G Feb 21 23:13 JD002I_S139_L005_R2_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 1.2G Feb 21 23:13 JD002J_S140_L005_R1_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 1.5G Feb 21 23:13 JD002J_S140_L005_R2_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 1.4G Feb 21 23:13 JD002K_S141_L005_R1_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 1.8G Feb 21 23:13 JD002K_S141_L005_R2_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 1.4G Feb 21 23:13 JD002L_S142_L005_R1_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 1.8G Feb 21 23:13 JD002L_S142_L005_R2_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 1.6K Feb 27 19:11 demultiplexed_checksums\n"
]
}
],
"source": [
"%%bash\n",
"ls -lh /data/20170227_jay_data_tmp/gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/Roberts/"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/data/20170227_jay_data_tmp/gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/Roberts\n"
]
}
],
"source": [
"cd /data/20170227_jay_data_tmp/gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/Roberts/"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz\n",
"77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz\n",
"51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz\n",
"0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz\n",
"91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz\n",
"c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz\n",
"27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz\n",
"84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz\n",
"0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz\n",
"896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz\n",
"ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz\n",
"d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz\n",
"83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz\n",
"168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz\n",
"1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz\n",
"0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz\n",
"1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz\n",
"9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz\n",
"37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz\n",
"e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz\n",
"66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz\n",
"02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz\n",
"103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz\n",
"09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz\n"
]
}
],
"source": [
"%%bash\n",
"cat demultiplexed_checksums"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generate our own checksums"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"real\t7m0.125s\n",
"user\t0m6.260s\n",
"sys\t3m52.740s\n"
]
}
],
"source": [
"%%bash\n",
"time for i in *.gz\n",
" do\n",
" md5sum \"$i\" >> checksums.md5\n",
" done"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz\n",
"77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz\n",
"51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz\n",
"0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz\n",
"91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz\n",
"c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz\n",
"27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz\n",
"84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz\n",
"0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz\n",
"896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz\n",
"ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz\n",
"d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz\n",
"83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz\n",
"168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz\n",
"1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz\n",
"0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz\n",
"1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz\n",
"9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz\n",
"37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz\n",
"e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz\n",
"66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz\n",
"02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz\n",
"103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz\n",
"09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz\n"
]
}
],
"source": [
"%%bash\n",
"cat checksums.md5"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Compare MD5 checksums"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%%bash\n",
"diff demultiplexed_checksums checksums.md5"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0\n"
]
}
],
"source": [
"%%bash\n",
"echo $?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Copy files to directories on Owl\n",
"\n",
"Jay has three different species in his sequencing data, so I'm copying the data to each of three different species folders on Owl. The code below uses the ```--no-clobber``` argument to prevent ```cp``` from overwriting any existing files in the destination directory that might have the same file name."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"real\t155m12.124s\n",
"user\t0m0.180s\n",
"sys\t15m13.790s\n"
]
}
],
"source": [
"%%bash\n",
"time for file in *.gz\n",
" do\n",
" cp --no-clobber \"$file\" /owl_web/nightingales/P_generosa/\n",
" cp --no-clobber \"$file\" /owl_web/nightingales/Porites_spp/\n",
" cp --no-clobber \"$file\" /owl_web/nightingales/A_elegantissima/\n",
" done"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generate MD5 checksums for files copied to Owl"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"md5sum: /owl_web/nightingales/P_generosa/JD002[A-Z].gz: No such file or directory\n",
"\n",
"real\t0m3.691s\n",
"user\t0m0.000s\n",
"sys\t0m0.000s\n"
]
}
],
"source": [
"%%bash\n",
"time for i in /owl_web/nightingales/P_generosa/JD002[A-Z]*.gz\n",
" do\n",
" md5sum \"$i\" >> temp_checksums.md5\n",
" done"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
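"Whoops! Typo! As the error message above shows, the glob that was actually run was ```JD002[A-Z].gz``` (missing the ```*```). Fixed below..."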
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"real\t24m29.381s\n",
"user\t0m5.040s\n",
"sys\t3m53.230s\n"
]
}
],
"source": [
"%%bash\n",
"time for i in /owl_web/nightingales/P_generosa/JD002[A-Z]*.gz\n",
" do\n",
" md5sum \"$i\" >> temp_checksums.md5\n",
" done"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"real\t25m3.500s\n",
"user\t0m5.080s\n",
"sys\t3m52.580s\n"
]
}
],
"source": [
"%%bash\n",
"time for i in /owl_web/nightingales/Porites_spp/JD002[A-Z]*.gz\n",
" do\n",
" md5sum \"$i\" >> temp_checksums.md5\n",
" done"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"real\t25m41.897s\n",
"user\t0m4.710s\n",
"sys\t3m52.660s\n"
]
}
],
"source": [
"%%bash\n",
"time for i in /owl_web/nightingales/A_elegantissima/JD002[A-Z]*.gz\n",
" do\n",
" md5sum \"$i\" >> temp_checksums.md5\n",
" done"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"c6d2bab7dabb6043a8565482b7b03cda /owl_web/nightingales/P_generosa/JD002A_S131_L005_R1_001.fastq.gz\n",
"77d17d6425e818a798e28ff3dd7f34f0 /owl_web/nightingales/P_generosa/JD002A_S131_L005_R2_001.fastq.gz\n",
"51346326dfa475706b1c219dd86dc4f2 /owl_web/nightingales/P_generosa/JD002B_S132_L005_R1_001.fastq.gz\n",
"0062c59dc8fcda8fcfa452ebb717419c /owl_web/nightingales/P_generosa/JD002B_S132_L005_R2_001.fastq.gz\n",
"91ae2b454af8343fe79f5db506e71438 /owl_web/nightingales/P_generosa/JD002C_S133_L005_R1_001.fastq.gz\n",
"c428291a87958081fdc647e9d121506d /owl_web/nightingales/P_generosa/JD002C_S133_L005_R2_001.fastq.gz\n",
"27a4bed7f18e5f71372f4676e0369a6e /owl_web/nightingales/P_generosa/JD002D_S134_L005_R1_001.fastq.gz\n",
"84f0710eee4765e9aba0db009d004244 /owl_web/nightingales/P_generosa/JD002D_S134_L005_R2_001.fastq.gz\n",
"0d4e953296924154d616c1143f9c4ad8 /owl_web/nightingales/P_generosa/JD002E_S135_L005_R1_001.fastq.gz\n",
"896b03ed79dbc33113aaf646ce94b65a /owl_web/nightingales/P_generosa/JD002E_S135_L005_R2_001.fastq.gz\n",
"ae11c97c5c877787088e236e8c158346 /owl_web/nightingales/P_generosa/JD002F_S136_L005_R1_001.fastq.gz\n",
"d8968dc209461435a66af6382f049a19 /owl_web/nightingales/P_generosa/JD002F_S136_L005_R2_001.fastq.gz\n",
"83b5e361d8d3c4cff1d464d0428171d1 /owl_web/nightingales/P_generosa/JD002G_S137_L005_R1_001.fastq.gz\n",
"168c2dbf1a7585d2a2a1b13a56e2f4e6 /owl_web/nightingales/P_generosa/JD002G_S137_L005_R2_001.fastq.gz\n",
"1403cafaa172d2b009404e98ef6503ae /owl_web/nightingales/P_generosa/JD002H_S138_L005_R1_001.fastq.gz\n",
"0c2f1b9a951b54694e1b8f287ac82793 /owl_web/nightingales/P_generosa/JD002H_S138_L005_R2_001.fastq.gz\n",
"1abea619075366f23f4510ffb1d9ad26 /owl_web/nightingales/P_generosa/JD002I_S139_L005_R1_001.fastq.gz\n",
"9e930c5276aae350cb34e0f42f954faf /owl_web/nightingales/P_generosa/JD002I_S139_L005_R2_001.fastq.gz\n",
"37e5dff58e642a2f23d4e8eb1d5339bb /owl_web/nightingales/P_generosa/JD002J_S140_L005_R1_001.fastq.gz\n",
"e5da7e7ce6492940461d5dbd50f832c6 /owl_web/nightingales/P_generosa/JD002J_S140_L005_R2_001.fastq.gz\n",
"66f2c2c3f9fcdbdbbc1d91c9661de5a6 /owl_web/nightingales/P_generosa/JD002K_S141_L005_R1_001.fastq.gz\n",
"02db37aebfc0cd0f8e1184e9d444bb2d /owl_web/nightingales/P_generosa/JD002K_S141_L005_R2_001.fastq.gz\n",
"103ca692e4d824eebc907495f6f288d7 /owl_web/nightingales/P_generosa/JD002L_S142_L005_R1_001.fastq.gz\n",
"09e54284660986b049c9cf07dc0ef35e /owl_web/nightingales/P_generosa/JD002L_S142_L005_R2_001.fastq.gz\n",
"c6d2bab7dabb6043a8565482b7b03cda /owl_web/nightingales/Porites_spp/JD002A_S131_L005_R1_001.fastq.gz\n",
"77d17d6425e818a798e28ff3dd7f34f0 /owl_web/nightingales/Porites_spp/JD002A_S131_L005_R2_001.fastq.gz\n",
"51346326dfa475706b1c219dd86dc4f2 /owl_web/nightingales/Porites_spp/JD002B_S132_L005_R1_001.fastq.gz\n",
"0062c59dc8fcda8fcfa452ebb717419c /owl_web/nightingales/Porites_spp/JD002B_S132_L005_R2_001.fastq.gz\n",
"91ae2b454af8343fe79f5db506e71438 /owl_web/nightingales/Porites_spp/JD002C_S133_L005_R1_001.fastq.gz\n",
"c428291a87958081fdc647e9d121506d /owl_web/nightingales/Porites_spp/JD002C_S133_L005_R2_001.fastq.gz\n",
"27a4bed7f18e5f71372f4676e0369a6e /owl_web/nightingales/Porites_spp/JD002D_S134_L005_R1_001.fastq.gz\n",
"84f0710eee4765e9aba0db009d004244 /owl_web/nightingales/Porites_spp/JD002D_S134_L005_R2_001.fastq.gz\n",
"0d4e953296924154d616c1143f9c4ad8 /owl_web/nightingales/Porites_spp/JD002E_S135_L005_R1_001.fastq.gz\n",
"896b03ed79dbc33113aaf646ce94b65a /owl_web/nightingales/Porites_spp/JD002E_S135_L005_R2_001.fastq.gz\n",
"ae11c97c5c877787088e236e8c158346 /owl_web/nightingales/Porites_spp/JD002F_S136_L005_R1_001.fastq.gz\n",
"d8968dc209461435a66af6382f049a19 /owl_web/nightingales/Porites_spp/JD002F_S136_L005_R2_001.fastq.gz\n",
"83b5e361d8d3c4cff1d464d0428171d1 /owl_web/nightingales/Porites_spp/JD002G_S137_L005_R1_001.fastq.gz\n",
"168c2dbf1a7585d2a2a1b13a56e2f4e6 /owl_web/nightingales/Porites_spp/JD002G_S137_L005_R2_001.fastq.gz\n",
"1403cafaa172d2b009404e98ef6503ae /owl_web/nightingales/Porites_spp/JD002H_S138_L005_R1_001.fastq.gz\n",
"0c2f1b9a951b54694e1b8f287ac82793 /owl_web/nightingales/Porites_spp/JD002H_S138_L005_R2_001.fastq.gz\n",
"1abea619075366f23f4510ffb1d9ad26 /owl_web/nightingales/Porites_spp/JD002I_S139_L005_R1_001.fastq.gz\n",
"9e930c5276aae350cb34e0f42f954faf /owl_web/nightingales/Porites_spp/JD002I_S139_L005_R2_001.fastq.gz\n",
"37e5dff58e642a2f23d4e8eb1d5339bb /owl_web/nightingales/Porites_spp/JD002J_S140_L005_R1_001.fastq.gz\n",
"e5da7e7ce6492940461d5dbd50f832c6 /owl_web/nightingales/Porites_spp/JD002J_S140_L005_R2_001.fastq.gz\n",
"66f2c2c3f9fcdbdbbc1d91c9661de5a6 /owl_web/nightingales/Porites_spp/JD002K_S141_L005_R1_001.fastq.gz\n",
"02db37aebfc0cd0f8e1184e9d444bb2d /owl_web/nightingales/Porites_spp/JD002K_S141_L005_R2_001.fastq.gz\n",
"103ca692e4d824eebc907495f6f288d7 /owl_web/nightingales/Porites_spp/JD002L_S142_L005_R1_001.fastq.gz\n",
"09e54284660986b049c9cf07dc0ef35e /owl_web/nightingales/Porites_spp/JD002L_S142_L005_R2_001.fastq.gz\n",
"c6d2bab7dabb6043a8565482b7b03cda /owl_web/nightingales/A_elegantissima/JD002A_S131_L005_R1_001.fastq.gz\n",
"77d17d6425e818a798e28ff3dd7f34f0 /owl_web/nightingales/A_elegantissima/JD002A_S131_L005_R2_001.fastq.gz\n",
"51346326dfa475706b1c219dd86dc4f2 /owl_web/nightingales/A_elegantissima/JD002B_S132_L005_R1_001.fastq.gz\n",
"0062c59dc8fcda8fcfa452ebb717419c /owl_web/nightingales/A_elegantissima/JD002B_S132_L005_R2_001.fastq.gz\n",
"91ae2b454af8343fe79f5db506e71438 /owl_web/nightingales/A_elegantissima/JD002C_S133_L005_R1_001.fastq.gz\n",
"c428291a87958081fdc647e9d121506d /owl_web/nightingales/A_elegantissima/JD002C_S133_L005_R2_001.fastq.gz\n",
"27a4bed7f18e5f71372f4676e0369a6e /owl_web/nightingales/A_elegantissima/JD002D_S134_L005_R1_001.fastq.gz\n",
"84f0710eee4765e9aba0db009d004244 /owl_web/nightingales/A_elegantissima/JD002D_S134_L005_R2_001.fastq.gz\n",
"0d4e953296924154d616c1143f9c4ad8 /owl_web/nightingales/A_elegantissima/JD002E_S135_L005_R1_001.fastq.gz\n",
"896b03ed79dbc33113aaf646ce94b65a /owl_web/nightingales/A_elegantissima/JD002E_S135_L005_R2_001.fastq.gz\n",
"ae11c97c5c877787088e236e8c158346 /owl_web/nightingales/A_elegantissima/JD002F_S136_L005_R1_001.fastq.gz\n",
"d8968dc209461435a66af6382f049a19 /owl_web/nightingales/A_elegantissima/JD002F_S136_L005_R2_001.fastq.gz\n",
"83b5e361d8d3c4cff1d464d0428171d1 /owl_web/nightingales/A_elegantissima/JD002G_S137_L005_R1_001.fastq.gz\n",
"168c2dbf1a7585d2a2a1b13a56e2f4e6 /owl_web/nightingales/A_elegantissima/JD002G_S137_L005_R2_001.fastq.gz\n",
"1403cafaa172d2b009404e98ef6503ae /owl_web/nightingales/A_elegantissima/JD002H_S138_L005_R1_001.fastq.gz\n",
"0c2f1b9a951b54694e1b8f287ac82793 /owl_web/nightingales/A_elegantissima/JD002H_S138_L005_R2_001.fastq.gz\n",
"1abea619075366f23f4510ffb1d9ad26 /owl_web/nightingales/A_elegantissima/JD002I_S139_L005_R1_001.fastq.gz\n",
"9e930c5276aae350cb34e0f42f954faf /owl_web/nightingales/A_elegantissima/JD002I_S139_L005_R2_001.fastq.gz\n",
"37e5dff58e642a2f23d4e8eb1d5339bb /owl_web/nightingales/A_elegantissima/JD002J_S140_L005_R1_001.fastq.gz\n",
"e5da7e7ce6492940461d5dbd50f832c6 /owl_web/nightingales/A_elegantissima/JD002J_S140_L005_R2_001.fastq.gz\n",
"66f2c2c3f9fcdbdbbc1d91c9661de5a6 /owl_web/nightingales/A_elegantissima/JD002K_S141_L005_R1_001.fastq.gz\n",
"02db37aebfc0cd0f8e1184e9d444bb2d /owl_web/nightingales/A_elegantissima/JD002K_S141_L005_R2_001.fastq.gz\n",
"103ca692e4d824eebc907495f6f288d7 /owl_web/nightingales/A_elegantissima/JD002L_S142_L005_R1_001.fastq.gz\n",
"09e54284660986b049c9cf07dc0ef35e /owl_web/nightingales/A_elegantissima/JD002L_S142_L005_R2_001.fastq.gz\n"
]
}
],
"source": [
"%%bash\n",
"cat temp_checksums.md5"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since there are so many files this time, I'm going to strip the leading file path from the filenames so that I can actually use the ```diff``` command to compare checksums. The code below uses ```sed``` to edit the file in place (the ```-i``` argument), keeping a backup of the original file with the extension ```.bak``` (the suffix attached to ```-i```), and deletes everything on each line up to and including the last slash."
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"real\t0m0.021s\n",
"user\t0m0.000s\n",
"sys\t0m0.000s\n"
]
}
],
"source": [
"%%bash\n",
"time sed -i.bak 's/^.*\\///' temp_checksums.md5 "
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"JD002A_S131_L005_R1_001.fastq.gz\n",
"JD002A_S131_L005_R2_001.fastq.gz\n",
"JD002B_S132_L005_R1_001.fastq.gz\n",
"JD002B_S132_L005_R2_001.fastq.gz\n",
"JD002C_S133_L005_R1_001.fastq.gz\n",
"JD002C_S133_L005_R2_001.fastq.gz\n",
"JD002D_S134_L005_R1_001.fastq.gz\n",
"JD002D_S134_L005_R2_001.fastq.gz\n",
"JD002E_S135_L005_R1_001.fastq.gz\n",
"JD002E_S135_L005_R2_001.fastq.gz\n",
"JD002F_S136_L005_R1_001.fastq.gz\n",
"JD002F_S136_L005_R2_001.fastq.gz\n",
"JD002G_S137_L005_R1_001.fastq.gz\n",
"JD002G_S137_L005_R2_001.fastq.gz\n",
"JD002H_S138_L005_R1_001.fastq.gz\n",
"JD002H_S138_L005_R2_001.fastq.gz\n",
"JD002I_S139_L005_R1_001.fastq.gz\n",
"JD002I_S139_L005_R2_001.fastq.gz\n",
"JD002J_S140_L005_R1_001.fastq.gz\n",
"JD002J_S140_L005_R2_001.fastq.gz\n",
"JD002K_S141_L005_R1_001.fastq.gz\n",
"JD002K_S141_L005_R2_001.fastq.gz\n",
"JD002L_S142_L005_R1_001.fastq.gz\n",
"JD002L_S142_L005_R2_001.fastq.gz\n",
"JD002A_S131_L005_R1_001.fastq.gz\n",
"JD002A_S131_L005_R2_001.fastq.gz\n",
"JD002B_S132_L005_R1_001.fastq.gz\n",
"JD002B_S132_L005_R2_001.fastq.gz\n",
"JD002C_S133_L005_R1_001.fastq.gz\n",
"JD002C_S133_L005_R2_001.fastq.gz\n",
"JD002D_S134_L005_R1_001.fastq.gz\n",
"JD002D_S134_L005_R2_001.fastq.gz\n",
"JD002E_S135_L005_R1_001.fastq.gz\n",
"JD002E_S135_L005_R2_001.fastq.gz\n",
"JD002F_S136_L005_R1_001.fastq.gz\n",
"JD002F_S136_L005_R2_001.fastq.gz\n",
"JD002G_S137_L005_R1_001.fastq.gz\n",
"JD002G_S137_L005_R2_001.fastq.gz\n",
"JD002H_S138_L005_R1_001.fastq.gz\n",
"JD002H_S138_L005_R2_001.fastq.gz\n",
"JD002I_S139_L005_R1_001.fastq.gz\n",
"JD002I_S139_L005_R2_001.fastq.gz\n",
"JD002J_S140_L005_R1_001.fastq.gz\n",
"JD002J_S140_L005_R2_001.fastq.gz\n",
"JD002K_S141_L005_R1_001.fastq.gz\n",
"JD002K_S141_L005_R2_001.fastq.gz\n",
"JD002L_S142_L005_R1_001.fastq.gz\n",
"JD002L_S142_L005_R2_001.fastq.gz\n",
"JD002A_S131_L005_R1_001.fastq.gz\n",
"JD002A_S131_L005_R2_001.fastq.gz\n",
"JD002B_S132_L005_R1_001.fastq.gz\n",
"JD002B_S132_L005_R2_001.fastq.gz\n",
"JD002C_S133_L005_R1_001.fastq.gz\n",
"JD002C_S133_L005_R2_001.fastq.gz\n",
"JD002D_S134_L005_R1_001.fastq.gz\n",
"JD002D_S134_L005_R2_001.fastq.gz\n",
"JD002E_S135_L005_R1_001.fastq.gz\n",
"JD002E_S135_L005_R2_001.fastq.gz\n",
"JD002F_S136_L005_R1_001.fastq.gz\n",
"JD002F_S136_L005_R2_001.fastq.gz\n",
"JD002G_S137_L005_R1_001.fastq.gz\n",
"JD002G_S137_L005_R2_001.fastq.gz\n",
"JD002H_S138_L005_R1_001.fastq.gz\n",
"JD002H_S138_L005_R2_001.fastq.gz\n",
"JD002I_S139_L005_R1_001.fastq.gz\n",
"JD002I_S139_L005_R2_001.fastq.gz\n",
"JD002J_S140_L005_R1_001.fastq.gz\n",
"JD002J_S140_L005_R2_001.fastq.gz\n",
"JD002K_S141_L005_R1_001.fastq.gz\n",
"JD002K_S141_L005_R2_001.fastq.gz\n",
"JD002L_S142_L005_R1_001.fastq.gz\n",
"JD002L_S142_L005_R2_001.fastq.gz\n"
]
}
],
"source": [
"%%bash\n",
"cat temp_checksums.md5"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Well, that didn't do what I wanted. It eliminated the first column (the checksums) along with the file path. Actually, the command worked exactly as it should: ```sed``` operates line by line, and the pattern ```^.*\\/``` is anchored to the start of the line and greedy, so it matched everything from the beginning of each line (checksum included) through the last slash, leaving just the file name. Let's restore our file from the ```.bak``` backup file."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%%bash\n",
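"# Copy (rather than move) the backup so the .bak file is still available as input for the awk step\n",
"cp temp_checksums.md5.bak temp_checksums.md5"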
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"c6d2bab7dabb6043a8565482b7b03cda /owl_web/nightingales/P_generosa/JD002A_S131_L005_R1_001.fastq.gz\n",
"77d17d6425e818a798e28ff3dd7f34f0 /owl_web/nightingales/P_generosa/JD002A_S131_L005_R2_001.fastq.gz\n",
"51346326dfa475706b1c219dd86dc4f2 /owl_web/nightingales/P_generosa/JD002B_S132_L005_R1_001.fastq.gz\n",
"0062c59dc8fcda8fcfa452ebb717419c /owl_web/nightingales/P_generosa/JD002B_S132_L005_R2_001.fastq.gz\n",
"91ae2b454af8343fe79f5db506e71438 /owl_web/nightingales/P_generosa/JD002C_S133_L005_R1_001.fastq.gz\n",
"c428291a87958081fdc647e9d121506d /owl_web/nightingales/P_generosa/JD002C_S133_L005_R2_001.fastq.gz\n",
"27a4bed7f18e5f71372f4676e0369a6e /owl_web/nightingales/P_generosa/JD002D_S134_L005_R1_001.fastq.gz\n",
"84f0710eee4765e9aba0db009d004244 /owl_web/nightingales/P_generosa/JD002D_S134_L005_R2_001.fastq.gz\n",
"0d4e953296924154d616c1143f9c4ad8 /owl_web/nightingales/P_generosa/JD002E_S135_L005_R1_001.fastq.gz\n",
"896b03ed79dbc33113aaf646ce94b65a /owl_web/nightingales/P_generosa/JD002E_S135_L005_R2_001.fastq.gz\n",
"ae11c97c5c877787088e236e8c158346 /owl_web/nightingales/P_generosa/JD002F_S136_L005_R1_001.fastq.gz\n",
"d8968dc209461435a66af6382f049a19 /owl_web/nightingales/P_generosa/JD002F_S136_L005_R2_001.fastq.gz\n",
"83b5e361d8d3c4cff1d464d0428171d1 /owl_web/nightingales/P_generosa/JD002G_S137_L005_R1_001.fastq.gz\n",
"168c2dbf1a7585d2a2a1b13a56e2f4e6 /owl_web/nightingales/P_generosa/JD002G_S137_L005_R2_001.fastq.gz\n",
"1403cafaa172d2b009404e98ef6503ae /owl_web/nightingales/P_generosa/JD002H_S138_L005_R1_001.fastq.gz\n",
"0c2f1b9a951b54694e1b8f287ac82793 /owl_web/nightingales/P_generosa/JD002H_S138_L005_R2_001.fastq.gz\n",
"1abea619075366f23f4510ffb1d9ad26 /owl_web/nightingales/P_generosa/JD002I_S139_L005_R1_001.fastq.gz\n",
"9e930c5276aae350cb34e0f42f954faf /owl_web/nightingales/P_generosa/JD002I_S139_L005_R2_001.fastq.gz\n",
"37e5dff58e642a2f23d4e8eb1d5339bb /owl_web/nightingales/P_generosa/JD002J_S140_L005_R1_001.fastq.gz\n",
"e5da7e7ce6492940461d5dbd50f832c6 /owl_web/nightingales/P_generosa/JD002J_S140_L005_R2_001.fastq.gz\n",
"66f2c2c3f9fcdbdbbc1d91c9661de5a6 /owl_web/nightingales/P_generosa/JD002K_S141_L005_R1_001.fastq.gz\n",
"02db37aebfc0cd0f8e1184e9d444bb2d /owl_web/nightingales/P_generosa/JD002K_S141_L005_R2_001.fastq.gz\n",
"103ca692e4d824eebc907495f6f288d7 /owl_web/nightingales/P_generosa/JD002L_S142_L005_R1_001.fastq.gz\n",
"09e54284660986b049c9cf07dc0ef35e /owl_web/nightingales/P_generosa/JD002L_S142_L005_R2_001.fastq.gz\n",
"c6d2bab7dabb6043a8565482b7b03cda /owl_web/nightingales/Porites_spp/JD002A_S131_L005_R1_001.fastq.gz\n",
"77d17d6425e818a798e28ff3dd7f34f0 /owl_web/nightingales/Porites_spp/JD002A_S131_L005_R2_001.fastq.gz\n",
"51346326dfa475706b1c219dd86dc4f2 /owl_web/nightingales/Porites_spp/JD002B_S132_L005_R1_001.fastq.gz\n",
"0062c59dc8fcda8fcfa452ebb717419c /owl_web/nightingales/Porites_spp/JD002B_S132_L005_R2_001.fastq.gz\n",
"91ae2b454af8343fe79f5db506e71438 /owl_web/nightingales/Porites_spp/JD002C_S133_L005_R1_001.fastq.gz\n",
"c428291a87958081fdc647e9d121506d /owl_web/nightingales/Porites_spp/JD002C_S133_L005_R2_001.fastq.gz\n",
"27a4bed7f18e5f71372f4676e0369a6e /owl_web/nightingales/Porites_spp/JD002D_S134_L005_R1_001.fastq.gz\n",
"84f0710eee4765e9aba0db009d004244 /owl_web/nightingales/Porites_spp/JD002D_S134_L005_R2_001.fastq.gz\n",
"0d4e953296924154d616c1143f9c4ad8 /owl_web/nightingales/Porites_spp/JD002E_S135_L005_R1_001.fastq.gz\n",
"896b03ed79dbc33113aaf646ce94b65a /owl_web/nightingales/Porites_spp/JD002E_S135_L005_R2_001.fastq.gz\n",
"ae11c97c5c877787088e236e8c158346 /owl_web/nightingales/Porites_spp/JD002F_S136_L005_R1_001.fastq.gz\n",
"d8968dc209461435a66af6382f049a19 /owl_web/nightingales/Porites_spp/JD002F_S136_L005_R2_001.fastq.gz\n",
"83b5e361d8d3c4cff1d464d0428171d1 /owl_web/nightingales/Porites_spp/JD002G_S137_L005_R1_001.fastq.gz\n",
"168c2dbf1a7585d2a2a1b13a56e2f4e6 /owl_web/nightingales/Porites_spp/JD002G_S137_L005_R2_001.fastq.gz\n",
"1403cafaa172d2b009404e98ef6503ae /owl_web/nightingales/Porites_spp/JD002H_S138_L005_R1_001.fastq.gz\n",
"0c2f1b9a951b54694e1b8f287ac82793 /owl_web/nightingales/Porites_spp/JD002H_S138_L005_R2_001.fastq.gz\n",
"1abea619075366f23f4510ffb1d9ad26 /owl_web/nightingales/Porites_spp/JD002I_S139_L005_R1_001.fastq.gz\n",
"9e930c5276aae350cb34e0f42f954faf /owl_web/nightingales/Porites_spp/JD002I_S139_L005_R2_001.fastq.gz\n",
"37e5dff58e642a2f23d4e8eb1d5339bb /owl_web/nightingales/Porites_spp/JD002J_S140_L005_R1_001.fastq.gz\n",
"e5da7e7ce6492940461d5dbd50f832c6 /owl_web/nightingales/Porites_spp/JD002J_S140_L005_R2_001.fastq.gz\n",
"66f2c2c3f9fcdbdbbc1d91c9661de5a6 /owl_web/nightingales/Porites_spp/JD002K_S141_L005_R1_001.fastq.gz\n",
"02db37aebfc0cd0f8e1184e9d444bb2d /owl_web/nightingales/Porites_spp/JD002K_S141_L005_R2_001.fastq.gz\n",
"103ca692e4d824eebc907495f6f288d7 /owl_web/nightingales/Porites_spp/JD002L_S142_L005_R1_001.fastq.gz\n",
"09e54284660986b049c9cf07dc0ef35e /owl_web/nightingales/Porites_spp/JD002L_S142_L005_R2_001.fastq.gz\n",
"c6d2bab7dabb6043a8565482b7b03cda /owl_web/nightingales/A_elegantissima/JD002A_S131_L005_R1_001.fastq.gz\n",
"77d17d6425e818a798e28ff3dd7f34f0 /owl_web/nightingales/A_elegantissima/JD002A_S131_L005_R2_001.fastq.gz\n",
"51346326dfa475706b1c219dd86dc4f2 /owl_web/nightingales/A_elegantissima/JD002B_S132_L005_R1_001.fastq.gz\n",
"0062c59dc8fcda8fcfa452ebb717419c /owl_web/nightingales/A_elegantissima/JD002B_S132_L005_R2_001.fastq.gz\n",
"91ae2b454af8343fe79f5db506e71438 /owl_web/nightingales/A_elegantissima/JD002C_S133_L005_R1_001.fastq.gz\n",
"c428291a87958081fdc647e9d121506d /owl_web/nightingales/A_elegantissima/JD002C_S133_L005_R2_001.fastq.gz\n",
"27a4bed7f18e5f71372f4676e0369a6e /owl_web/nightingales/A_elegantissima/JD002D_S134_L005_R1_001.fastq.gz\n",
"84f0710eee4765e9aba0db009d004244 /owl_web/nightingales/A_elegantissima/JD002D_S134_L005_R2_001.fastq.gz\n",
"0d4e953296924154d616c1143f9c4ad8 /owl_web/nightingales/A_elegantissima/JD002E_S135_L005_R1_001.fastq.gz\n",
"896b03ed79dbc33113aaf646ce94b65a /owl_web/nightingales/A_elegantissima/JD002E_S135_L005_R2_001.fastq.gz\n",
"ae11c97c5c877787088e236e8c158346 /owl_web/nightingales/A_elegantissima/JD002F_S136_L005_R1_001.fastq.gz\n",
"d8968dc209461435a66af6382f049a19 /owl_web/nightingales/A_elegantissima/JD002F_S136_L005_R2_001.fastq.gz\n",
"83b5e361d8d3c4cff1d464d0428171d1 /owl_web/nightingales/A_elegantissima/JD002G_S137_L005_R1_001.fastq.gz\n",
"168c2dbf1a7585d2a2a1b13a56e2f4e6 /owl_web/nightingales/A_elegantissima/JD002G_S137_L005_R2_001.fastq.gz\n",
"1403cafaa172d2b009404e98ef6503ae /owl_web/nightingales/A_elegantissima/JD002H_S138_L005_R1_001.fastq.gz\n",
"0c2f1b9a951b54694e1b8f287ac82793 /owl_web/nightingales/A_elegantissima/JD002H_S138_L005_R2_001.fastq.gz\n",
"1abea619075366f23f4510ffb1d9ad26 /owl_web/nightingales/A_elegantissima/JD002I_S139_L005_R1_001.fastq.gz\n",
"9e930c5276aae350cb34e0f42f954faf /owl_web/nightingales/A_elegantissima/JD002I_S139_L005_R2_001.fastq.gz\n",
"37e5dff58e642a2f23d4e8eb1d5339bb /owl_web/nightingales/A_elegantissima/JD002J_S140_L005_R1_001.fastq.gz\n",
"e5da7e7ce6492940461d5dbd50f832c6 /owl_web/nightingales/A_elegantissima/JD002J_S140_L005_R2_001.fastq.gz\n",
"66f2c2c3f9fcdbdbbc1d91c9661de5a6 /owl_web/nightingales/A_elegantissima/JD002K_S141_L005_R1_001.fastq.gz\n",
"02db37aebfc0cd0f8e1184e9d444bb2d /owl_web/nightingales/A_elegantissima/JD002K_S141_L005_R2_001.fastq.gz\n",
"103ca692e4d824eebc907495f6f288d7 /owl_web/nightingales/A_elegantissima/JD002L_S142_L005_R1_001.fastq.gz\n",
"09e54284660986b049c9cf07dc0ef35e /owl_web/nightingales/A_elegantissima/JD002L_S142_L005_R2_001.fastq.gz\n"
]
}
],
"source": [
"%%bash\n",
"cat temp_checksums.md5"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
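"Found a solution using ```awk``` (duh, as it's perfect for operating on specific columns). The code below uses awk's ```gsub``` function to replace, in column 2 only (```$2```), the longest string that starts and ends with a forward slash with nothing (the empty quotes), and then prints each modified line. The checksum in column 1 is left untouched. I read from the ```.bak``` backup and redirect the output to the ```temp_checksums.md5``` file. One side effect: assigning to a field makes awk rebuild the line with a single space between columns, so the whitespace may differ from the original file."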
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%%bash\n",
"awk '{gsub(/\\/.*\\//,\"\",$2); print}' < temp_checksums.md5.bak > temp_checksums.md5"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz\n",
"77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz\n",
"51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz\n",
"0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz\n",
"91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz\n",
"c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz\n",
"27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz\n",
"84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz\n",
"0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz\n",
"896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz\n",
"ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz\n",
"d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz\n",
"83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz\n",
"168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz\n",
"1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz\n",
"0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz\n",
"1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz\n",
"9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz\n",
"37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz\n",
"e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz\n",
"66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz\n",
"02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz\n",
"103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz\n",
"09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz\n",
"c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz\n",
"77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz\n",
"51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz\n",
"0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz\n",
"91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz\n",
"c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz\n",
"27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz\n",
"84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz\n",
"0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz\n",
"896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz\n",
"ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz\n",
"d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz\n",
"83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz\n",
"168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz\n",
"1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz\n",
"0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz\n",
"1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz\n",
"9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz\n",
"37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz\n",
"e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz\n",
"66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz\n",
"02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz\n",
"103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz\n",
"09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz\n",
"c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz\n",
"77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz\n",
"51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz\n",
"0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz\n",
"91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz\n",
"c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz\n",
"27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz\n",
"84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz\n",
"0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz\n",
"896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz\n",
"ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz\n",
"d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz\n",
"83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz\n",
"168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz\n",
"1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz\n",
"0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz\n",
"1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz\n",
"9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz\n",
"37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz\n",
"e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz\n",
"66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz\n",
"02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz\n",
"103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz\n",
"09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz\n"
]
}
],
"source": [
"%%bash\n",
"cat temp_checksums.md5"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Compare initial checksums with temporary checksums on Owl"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%%bash\n",
"# Append three copies of the original checksums, one per species directory\n",
"# (P_generosa, Porites_spp, A_elegantissima), to line up with the 72 entries in temp_checksums.md5.\n",
"# Note: >> appends, so re-running this cell adds more copies.\n",
"cat demultiplexed_checksums >> demultiplexed_checksums_cat\n",
"cat demultiplexed_checksums >> demultiplexed_checksums_cat\n",
"cat demultiplexed_checksums >> demultiplexed_checksums_cat"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz\n",
"77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz\n",
"51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz\n",
"0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz\n",
"91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz\n",
"c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz\n",
"27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz\n",
"84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz\n",
"0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz\n",
"896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz\n",
"ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz\n",
"d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz\n",
"83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz\n",
"168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz\n",
"1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz\n",
"0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz\n",
"1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz\n",
"9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz\n",
"37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz\n",
"e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz\n",
"66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz\n",
"02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz\n",
"103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz\n",
"09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz\n",
"c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz\n",
"77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz\n",
"51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz\n",
"0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz\n",
"91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz\n",
"c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz\n",
"27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz\n",
"84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz\n",
"0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz\n",
"896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz\n",
"ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz\n",
"d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz\n",
"83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz\n",
"168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz\n",
"1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz\n",
"0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz\n",
"1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz\n",
"9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz\n",
"37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz\n",
"e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz\n",
"66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz\n",
"02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz\n",
"103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz\n",
"09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz\n",
"c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz\n",
"77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz\n",
"51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz\n",
"0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz\n",
"91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz\n",
"c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz\n",
"27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz\n",
"84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz\n",
"0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz\n",
"896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz\n",
"ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz\n",
"d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz\n",
"83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz\n",
"168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz\n",
"1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz\n",
"0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz\n",
"1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz\n",
"9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz\n",
"37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz\n",
"e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz\n",
"66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz\n",
"02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz\n",
"103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz\n",
"09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz\n"
]
}
],
"source": [
"%%bash\n",
"cat demultiplexed_checksums_cat"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1,72c1,72\n",
"< c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz\n",
"< 77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz\n",
"< 51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz\n",
"< 0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz\n",
"< 91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz\n",
"< c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz\n",
"< 27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz\n",
"< 84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz\n",
"< 0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz\n",
"< 896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz\n",
"< ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz\n",
"< d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz\n",
"< 83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz\n",
"< 168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz\n",
"< 1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz\n",
"< 0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz\n",
"< 1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz\n",
"< 9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz\n",
"< 37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz\n",
"< e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz\n",
"< 66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz\n",
"< 02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz\n",
"< 103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz\n",
"< 09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz\n",
"< c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz\n",
"< 77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz\n",
"< 51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz\n",
"< 0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz\n",
"< 91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz\n",
"< c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz\n",
"< 27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz\n",
"< 84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz\n",
"< 0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz\n",
"< 896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz\n",
"< ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz\n",
"< d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz\n",
"< 83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz\n",
"< 168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz\n",
"< 1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz\n",
"< 0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz\n",
"< 1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz\n",
"< 9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz\n",
"< 37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz\n",
"< e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz\n",
"< 66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz\n",
"< 02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz\n",
"< 103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz\n",
"< 09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz\n",
"< c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz\n",
"< 77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz\n",
"< 51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz\n",
"< 0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz\n",
"< 91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz\n",
"< c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz\n",
"< 27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz\n",
"< 84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz\n",
"< 0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz\n",
"< 896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz\n",
"< ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz\n",
"< d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz\n",
"< 83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz\n",
"< 168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz\n",
"< 1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz\n",
"< 0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz\n",
"< 1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz\n",
"< 9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz\n",
"< 37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz\n",
"< e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz\n",
"< 66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz\n",
"< 02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz\n",
"< 103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz\n",
"< 09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz\n",
"---\n",
"> c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz\n",
"> 77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz\n",
"> 51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz\n",
"> 0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz\n",
"> 91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz\n",
"> c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz\n",
"> 27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz\n",
"> 84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz\n",
"> 0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz\n",
"> 896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz\n",
"> ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz\n",
"> d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz\n",
"> 83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz\n",
"> 168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz\n",
"> 1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz\n",
"> 0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz\n",
"> 1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz\n",
"> 9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz\n",
"> 37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz\n",
"> e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz\n",
"> 66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz\n",
"> 02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz\n",
"> 103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz\n",
"> 09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz\n",
"> c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz\n",
"> 77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz\n",
"> 51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz\n",
"> 0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz\n",
"> 91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz\n",
"> c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz\n",
"> 27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz\n",
"> 84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz\n",
"> 0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz\n",
"> 896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz\n",
"> ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz\n",
"> d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz\n",
"> 83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz\n",
"> 168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz\n",
"> 1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz\n",
"> 0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz\n",
"> 1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz\n",
"> 9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz\n",
"> 37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz\n",
"> e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz\n",
"> 66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz\n",
"> 02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz\n",
"> 103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz\n",
"> 09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz\n",
"> c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz\n",
"> 77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz\n",
"> 51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz\n",
"> 0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz\n",
"> 91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz\n",
"> c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz\n",
"> 27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz\n",
"> 84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz\n",
"> 0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz\n",
"> 896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz\n",
"> ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz\n",
"> d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz\n",
"> 83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz\n",
"> 168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz\n",
"> 1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz\n",
"> 0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz\n",
"> 1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz\n",
"> 9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz\n",
"> 37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz\n",
"> e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz\n",
"> 66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz\n",
"> 02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz\n",
"> 103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz\n",
"> 09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz\n",
"1\n"
]
}
],
"source": [
"%%bash\n",
"diff demultiplexed_checksums_cat temp_checksums.md5\n",
"echo $?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Argh! The delimiter between the two columns is different! That leads ```diff``` to flag every line as different between the two files. Let's try again..."
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"UsageError: %%bash is a cell magic, but the cell body is empty.\n"
]
}
],
"source": [
"%%bash"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%%bash\n",
"awk '{print $1 \" \" $2}' temp_checksums.md5.bak > temp_checksums.md5"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"c6d2bab7dabb6043a8565482b7b03cda /owl_web/nightingales/P_generosa/JD002A_S131_L005_R1_001.fastq.gz\n",
"77d17d6425e818a798e28ff3dd7f34f0 /owl_web/nightingales/P_generosa/JD002A_S131_L005_R2_001.fastq.gz\n",
"51346326dfa475706b1c219dd86dc4f2 /owl_web/nightingales/P_generosa/JD002B_S132_L005_R1_001.fastq.gz\n",
"0062c59dc8fcda8fcfa452ebb717419c /owl_web/nightingales/P_generosa/JD002B_S132_L005_R2_001.fastq.gz\n",
"91ae2b454af8343fe79f5db506e71438 /owl_web/nightingales/P_generosa/JD002C_S133_L005_R1_001.fastq.gz\n",
"c428291a87958081fdc647e9d121506d /owl_web/nightingales/P_generosa/JD002C_S133_L005_R2_001.fastq.gz\n",
"27a4bed7f18e5f71372f4676e0369a6e /owl_web/nightingales/P_generosa/JD002D_S134_L005_R1_001.fastq.gz\n",
"84f0710eee4765e9aba0db009d004244 /owl_web/nightingales/P_generosa/JD002D_S134_L005_R2_001.fastq.gz\n",
"0d4e953296924154d616c1143f9c4ad8 /owl_web/nightingales/P_generosa/JD002E_S135_L005_R1_001.fastq.gz\n",
"896b03ed79dbc33113aaf646ce94b65a /owl_web/nightingales/P_generosa/JD002E_S135_L005_R2_001.fastq.gz\n",
"ae11c97c5c877787088e236e8c158346 /owl_web/nightingales/P_generosa/JD002F_S136_L005_R1_001.fastq.gz\n",
"d8968dc209461435a66af6382f049a19 /owl_web/nightingales/P_generosa/JD002F_S136_L005_R2_001.fastq.gz\n",
"83b5e361d8d3c4cff1d464d0428171d1 /owl_web/nightingales/P_generosa/JD002G_S137_L005_R1_001.fastq.gz\n",
"168c2dbf1a7585d2a2a1b13a56e2f4e6 /owl_web/nightingales/P_generosa/JD002G_S137_L005_R2_001.fastq.gz\n",
"1403cafaa172d2b009404e98ef6503ae /owl_web/nightingales/P_generosa/JD002H_S138_L005_R1_001.fastq.gz\n",
"0c2f1b9a951b54694e1b8f287ac82793 /owl_web/nightingales/P_generosa/JD002H_S138_L005_R2_001.fastq.gz\n",
"1abea619075366f23f4510ffb1d9ad26 /owl_web/nightingales/P_generosa/JD002I_S139_L005_R1_001.fastq.gz\n",
"9e930c5276aae350cb34e0f42f954faf /owl_web/nightingales/P_generosa/JD002I_S139_L005_R2_001.fastq.gz\n",
"37e5dff58e642a2f23d4e8eb1d5339bb /owl_web/nightingales/P_generosa/JD002J_S140_L005_R1_001.fastq.gz\n",
"e5da7e7ce6492940461d5dbd50f832c6 /owl_web/nightingales/P_generosa/JD002J_S140_L005_R2_001.fastq.gz\n",
"66f2c2c3f9fcdbdbbc1d91c9661de5a6 /owl_web/nightingales/P_generosa/JD002K_S141_L005_R1_001.fastq.gz\n",
"02db37aebfc0cd0f8e1184e9d444bb2d /owl_web/nightingales/P_generosa/JD002K_S141_L005_R2_001.fastq.gz\n",
"103ca692e4d824eebc907495f6f288d7 /owl_web/nightingales/P_generosa/JD002L_S142_L005_R1_001.fastq.gz\n",
"09e54284660986b049c9cf07dc0ef35e /owl_web/nightingales/P_generosa/JD002L_S142_L005_R2_001.fastq.gz\n",
"c6d2bab7dabb6043a8565482b7b03cda /owl_web/nightingales/Porites_spp/JD002A_S131_L005_R1_001.fastq.gz\n",
"77d17d6425e818a798e28ff3dd7f34f0 /owl_web/nightingales/Porites_spp/JD002A_S131_L005_R2_001.fastq.gz\n",
"51346326dfa475706b1c219dd86dc4f2 /owl_web/nightingales/Porites_spp/JD002B_S132_L005_R1_001.fastq.gz\n",
"0062c59dc8fcda8fcfa452ebb717419c /owl_web/nightingales/Porites_spp/JD002B_S132_L005_R2_001.fastq.gz\n",
"91ae2b454af8343fe79f5db506e71438 /owl_web/nightingales/Porites_spp/JD002C_S133_L005_R1_001.fastq.gz\n",
"c428291a87958081fdc647e9d121506d /owl_web/nightingales/Porites_spp/JD002C_S133_L005_R2_001.fastq.gz\n",
"27a4bed7f18e5f71372f4676e0369a6e /owl_web/nightingales/Porites_spp/JD002D_S134_L005_R1_001.fastq.gz\n",
"84f0710eee4765e9aba0db009d004244 /owl_web/nightingales/Porites_spp/JD002D_S134_L005_R2_001.fastq.gz\n",
"0d4e953296924154d616c1143f9c4ad8 /owl_web/nightingales/Porites_spp/JD002E_S135_L005_R1_001.fastq.gz\n",
"896b03ed79dbc33113aaf646ce94b65a /owl_web/nightingales/Porites_spp/JD002E_S135_L005_R2_001.fastq.gz\n",
"ae11c97c5c877787088e236e8c158346 /owl_web/nightingales/Porites_spp/JD002F_S136_L005_R1_001.fastq.gz\n",
"d8968dc209461435a66af6382f049a19 /owl_web/nightingales/Porites_spp/JD002F_S136_L005_R2_001.fastq.gz\n",
"83b5e361d8d3c4cff1d464d0428171d1 /owl_web/nightingales/Porites_spp/JD002G_S137_L005_R1_001.fastq.gz\n",
"168c2dbf1a7585d2a2a1b13a56e2f4e6 /owl_web/nightingales/Porites_spp/JD002G_S137_L005_R2_001.fastq.gz\n",
"1403cafaa172d2b009404e98ef6503ae /owl_web/nightingales/Porites_spp/JD002H_S138_L005_R1_001.fastq.gz\n",
"0c2f1b9a951b54694e1b8f287ac82793 /owl_web/nightingales/Porites_spp/JD002H_S138_L005_R2_001.fastq.gz\n",
"1abea619075366f23f4510ffb1d9ad26 /owl_web/nightingales/Porites_spp/JD002I_S139_L005_R1_001.fastq.gz\n",
"9e930c5276aae350cb34e0f42f954faf /owl_web/nightingales/Porites_spp/JD002I_S139_L005_R2_001.fastq.gz\n",
"37e5dff58e642a2f23d4e8eb1d5339bb /owl_web/nightingales/Porites_spp/JD002J_S140_L005_R1_001.fastq.gz\n",
"e5da7e7ce6492940461d5dbd50f832c6 /owl_web/nightingales/Porites_spp/JD002J_S140_L005_R2_001.fastq.gz\n",
"66f2c2c3f9fcdbdbbc1d91c9661de5a6 /owl_web/nightingales/Porites_spp/JD002K_S141_L005_R1_001.fastq.gz\n",
"02db37aebfc0cd0f8e1184e9d444bb2d /owl_web/nightingales/Porites_spp/JD002K_S141_L005_R2_001.fastq.gz\n",
"103ca692e4d824eebc907495f6f288d7 /owl_web/nightingales/Porites_spp/JD002L_S142_L005_R1_001.fastq.gz\n",
"09e54284660986b049c9cf07dc0ef35e /owl_web/nightingales/Porites_spp/JD002L_S142_L005_R2_001.fastq.gz\n",
"c6d2bab7dabb6043a8565482b7b03cda /owl_web/nightingales/A_elegantissima/JD002A_S131_L005_R1_001.fastq.gz\n",
"77d17d6425e818a798e28ff3dd7f34f0 /owl_web/nightingales/A_elegantissima/JD002A_S131_L005_R2_001.fastq.gz\n",
"51346326dfa475706b1c219dd86dc4f2 /owl_web/nightingales/A_elegantissima/JD002B_S132_L005_R1_001.fastq.gz\n",
"0062c59dc8fcda8fcfa452ebb717419c /owl_web/nightingales/A_elegantissima/JD002B_S132_L005_R2_001.fastq.gz\n",
"91ae2b454af8343fe79f5db506e71438 /owl_web/nightingales/A_elegantissima/JD002C_S133_L005_R1_001.fastq.gz\n",
"c428291a87958081fdc647e9d121506d /owl_web/nightingales/A_elegantissima/JD002C_S133_L005_R2_001.fastq.gz\n",
"27a4bed7f18e5f71372f4676e0369a6e /owl_web/nightingales/A_elegantissima/JD002D_S134_L005_R1_001.fastq.gz\n",
"84f0710eee4765e9aba0db009d004244 /owl_web/nightingales/A_elegantissima/JD002D_S134_L005_R2_001.fastq.gz\n",
"0d4e953296924154d616c1143f9c4ad8 /owl_web/nightingales/A_elegantissima/JD002E_S135_L005_R1_001.fastq.gz\n",
"896b03ed79dbc33113aaf646ce94b65a /owl_web/nightingales/A_elegantissima/JD002E_S135_L005_R2_001.fastq.gz\n",
"ae11c97c5c877787088e236e8c158346 /owl_web/nightingales/A_elegantissima/JD002F_S136_L005_R1_001.fastq.gz\n",
"d8968dc209461435a66af6382f049a19 /owl_web/nightingales/A_elegantissima/JD002F_S136_L005_R2_001.fastq.gz\n",
"83b5e361d8d3c4cff1d464d0428171d1 /owl_web/nightingales/A_elegantissima/JD002G_S137_L005_R1_001.fastq.gz\n",
"168c2dbf1a7585d2a2a1b13a56e2f4e6 /owl_web/nightingales/A_elegantissima/JD002G_S137_L005_R2_001.fastq.gz\n",
"1403cafaa172d2b009404e98ef6503ae /owl_web/nightingales/A_elegantissima/JD002H_S138_L005_R1_001.fastq.gz\n",
"0c2f1b9a951b54694e1b8f287ac82793 /owl_web/nightingales/A_elegantissima/JD002H_S138_L005_R2_001.fastq.gz\n",
"1abea619075366f23f4510ffb1d9ad26 /owl_web/nightingales/A_elegantissima/JD002I_S139_L005_R1_001.fastq.gz\n",
"9e930c5276aae350cb34e0f42f954faf /owl_web/nightingales/A_elegantissima/JD002I_S139_L005_R2_001.fastq.gz\n",
"37e5dff58e642a2f23d4e8eb1d5339bb /owl_web/nightingales/A_elegantissima/JD002J_S140_L005_R1_001.fastq.gz\n",
"e5da7e7ce6492940461d5dbd50f832c6 /owl_web/nightingales/A_elegantissima/JD002J_S140_L005_R2_001.fastq.gz\n",
"66f2c2c3f9fcdbdbbc1d91c9661de5a6 /owl_web/nightingales/A_elegantissima/JD002K_S141_L005_R1_001.fastq.gz\n",
"02db37aebfc0cd0f8e1184e9d444bb2d /owl_web/nightingales/A_elegantissima/JD002K_S141_L005_R2_001.fastq.gz\n",
"103ca692e4d824eebc907495f6f288d7 /owl_web/nightingales/A_elegantissima/JD002L_S142_L005_R1_001.fastq.gz\n",
"09e54284660986b049c9cf07dc0ef35e /owl_web/nightingales/A_elegantissima/JD002L_S142_L005_R2_001.fastq.gz\n"
]
}
],
"source": [
"%%bash\n",
"cat temp_checksums.md5"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
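"Yeesh, screwed it up again (but fixed the spacing!): the second column still contains the full path to each file. Here we go with another shot. The awk code is changed from earlier in that ```gsub``` first strips the leading directory path from the second column (everything from the first ```/``` through the last ```/```). The print statement then prints the first column (```$1```), followed by two spaces (that's what's contained in the double quotes), and then the second column (```$2```)."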
]
},
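{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sanity check on the ```gsub``` step, the sketch below runs the same awk program on a single sample line (the hash and path are copied from the listing above). It assumes the target format is the two-space delimiter that ```md5sum``` writes; the ```line``` and ```cleaned``` variable names are just for illustration.\n",
"\n",
"```shell\n",
"# Sample md5sum-style line; the hash and path are taken from the listing above.\n",
"line='c6d2bab7dabb6043a8565482b7b03cda  /owl_web/nightingales/P_generosa/JD002A_S131_L005_R1_001.fastq.gz'\n",
"\n",
"# gsub() deletes everything from the first \"/\" through the last \"/\" in field 2,\n",
"# leaving only the file name; print then joins the two fields with two spaces.\n",
"cleaned=$(printf '%s\\n' \"$line\" | awk '{gsub(/\\/.*\\//,\"\",$2); print $1 \"  \" $2}')\n",
"echo \"$cleaned\"\n",
"```\n",
"\n",
"Because ```.*``` is greedy, the match runs through the last slash, so only the bare file name is left in the second column."
]
},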
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%%bash\n",
"awk '{gsub(/\\/.*\\//,\"\",$2); print $1 \" \" $2}' < temp_checksums.md5.bak > temp_checksums.md5"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz\n",
"77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz\n",
"51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz\n",
"0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz\n",
"91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz\n",
"c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz\n",
"27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz\n",
"84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz\n",
"0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz\n",
"896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz\n",
"ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz\n",
"d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz\n",
"83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz\n",
"168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz\n",
"1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz\n",
"0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz\n",
"1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz\n",
"9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz\n",
"37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz\n",
"e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz\n",
"66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz\n",
"02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz\n",
"103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz\n",
"09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz\n",
"c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz\n",
"77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz\n",
"51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz\n",
"0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz\n",
"91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz\n",
"c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz\n",
"27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz\n",
"84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz\n",
"0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz\n",
"896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz\n",
"ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz\n",
"d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz\n",
"83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz\n",
"168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz\n",
"1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz\n",
"0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz\n",
"1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz\n",
"9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz\n",
"37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz\n",
"e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz\n",
"66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz\n",
"02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz\n",
"103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz\n",
"09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz\n",
"c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz\n",
"77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz\n",
"51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz\n",
"0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz\n",
"91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz\n",
"c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz\n",
"27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz\n",
"84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz\n",
"0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz\n",
"896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz\n",
"ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz\n",
"d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz\n",
"83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz\n",
"168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz\n",
"1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz\n",
"0c2f1b9a951b54694e1b8f287ac82793 JD002H_S138_L005_R2_001.fastq.gz\n",
"1abea619075366f23f4510ffb1d9ad26 JD002I_S139_L005_R1_001.fastq.gz\n",
"9e930c5276aae350cb34e0f42f954faf JD002I_S139_L005_R2_001.fastq.gz\n",
"37e5dff58e642a2f23d4e8eb1d5339bb JD002J_S140_L005_R1_001.fastq.gz\n",
"e5da7e7ce6492940461d5dbd50f832c6 JD002J_S140_L005_R2_001.fastq.gz\n",
"66f2c2c3f9fcdbdbbc1d91c9661de5a6 JD002K_S141_L005_R1_001.fastq.gz\n",
"02db37aebfc0cd0f8e1184e9d444bb2d JD002K_S141_L005_R2_001.fastq.gz\n",
"103ca692e4d824eebc907495f6f288d7 JD002L_S142_L005_R1_001.fastq.gz\n",
"09e54284660986b049c9cf07dc0ef35e JD002L_S142_L005_R2_001.fastq.gz\n"
]
}
],
"source": [
"%%bash\n",
"cat temp_checksums.md5"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0\n"
]
}
],
"source": [
"%%bash\n",
"diff demultiplexed_checksums_cat temp_checksums.md5\n",
"echo $?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Boom! Got it! Now, I'll append the facility checksums to the checksum files in the directories on Owl."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Append checksums to existing checksum files in each directory"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%%bash\n",
"cat demultiplexed_checksums >> /owl_web/nightingales/P_generosa/checksums.md5\n",
"cat demultiplexed_checksums >> /owl_web/nightingales/Porites_spp/checksums.md5\n",
"cat demultiplexed_checksums >> /owl_web/nightingales/A_elegantissima/checksums.md5"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"total 0\n"
]
}
],
"source": [
"%%bash\n",
"cd\n",
"rm -rf /data/20170227_jay_data_tmp/gslserver.qb3.berkeley.edu/\n",
"ls -lh /data/20170227_jay_data_tmp"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"### Download \"undetermined\" FASTQ files left from demultiplexing"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/data/20170227_jay_data_tmp\n"
]
}
],
"source": [
"cd /data/20170227_jay_data_tmp"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"wget: missing URL\n",
"Usage: wget [OPTION]... [URL]...\n",
"\n",
"Try `wget --help' for more options.\n",
"\n",
"real\t0m0.026s\n",
"user\t0m0.010s\n",
"sys\t0m0.000s\n"
]
}
],
"source": [
"%%bash\n",
"time WGETRC=/data/wgetrc_berk_seq wget -r -np -nc -q --accept \"Undetermined*\" --reject \"*S0_L00[1234678]\"ftp://gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Typo! Need a space after the reject list (between the closing quotation mark and the URL)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The wget command below adds an accept list and reject list to download only Jay's sequencing files (his were in Lane 5; L005) and the corresponding MD5 checksum file, named \"Undetermined_checksums\"."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Process is interrupted.\n"
]
}
],
"source": [
"%%bash\n",
"time WGETRC=/data/wgetrc_berk_seq wget -r -np -nc -q --accept \"Undetermined*\" --reject \"*S0_L00[1234678]\" ftp://gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Turns out, the accept/reject lists weren't working - all of the \"Undetermined\" files were being downloaded."
]
},
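{
"cell_type": "markdown",
"metadata": {},
"source": [
"A likely explanation (an assumption, not verified against the server): wget treats an accept/reject entry that contains wildcard characters as a pattern that has to match the whole file name, and ```*S0_L00[1234678]``` has no trailing ```*```, so it can't match names ending in ```_R1_001.fastq.gz```. The sketch below uses the shell's own glob matching, which follows similar rules, to show the difference. The ```matches``` helper is hypothetical and only for illustration.\n",
"\n",
"```shell\n",
"# Hypothetical helper: succeeds if the file name ($1) matches the glob pattern ($2) in full.\n",
"matches() {\n",
"  case \"$1\" in\n",
"    $2) return 0 ;;\n",
"    *)  return 1 ;;\n",
"  esac\n",
"}\n",
"\n",
"f=\"Undetermined_S0_L001_R1_001.fastq.gz\"\n",
"\n",
"# Pattern as written in the reject list: no trailing *, so the match fails.\n",
"matches \"$f\" \"*S0_L00[1234678]\" && echo \"rejected\" || echo \"not rejected\"\n",
"\n",
"# Same pattern with a trailing *: the match succeeds.\n",
"matches \"$f\" \"*S0_L00[1234678]*\" && echo \"rejected\" || echo \"not rejected\"\n",
"```\n",
"\n",
"If that reading is right, adding the trailing ```*``` to the reject pattern might have worked too, but accepting only ```Undetermined_S0_L005*``` is simpler."
]
},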
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"total 0\r\n",
"drwxr-xr-x 1 srlab staff 102 Mar 1 21:35 \u001b[0m\u001b[01;34mgslserver.qb3.berkeley.edu\u001b[0m/\r\n"
]
}
],
"source": [
"ls -lh"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"total 9.1G\r\n",
"-rw-r--r-- 1 srlab staff 528M Feb 21 21:32 Undetermined_S0_L001_R1_001.fastq.gz\r\n",
"-rw-r--r-- 1 srlab staff 745M Feb 21 21:32 Undetermined_S0_L001_R2_001.fastq.gz\r\n",
"-rw-r--r-- 1 srlab staff 2.0G Feb 21 22:00 Undetermined_S0_L002_R1_001.fastq.gz\r\n",
"-rw-r--r-- 1 srlab staff 2.4G Feb 21 22:00 Undetermined_S0_L002_R2_001.fastq.gz\r\n",
"-rw-r--r-- 1 srlab staff 1.9G Feb 21 22:27 Undetermined_S0_L003_R1_001.fastq.gz\r\n",
"-rw-r--r-- 1 srlab staff 1.7G Mar 1 21:53 Undetermined_S0_L003_R2_001.fastq.gz\r\n"
]
}
],
"source": [
"ls -lh gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"rm: cannot remove '*.gz': No such file or directory\r\n",
"rm: cannot remove 'gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/': Is a directory\r\n"
]
}
],
"source": [
"rm *.gz gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%%bash\n",
"for i in gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/*.gz\n",
" do\n",
" rm \"$i\"\n",
" done"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"total 0\r\n"
]
}
],
"source": [
"ls -lh gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"real\t11m7.430s\n",
"user\t0m4.890s\n",
"sys\t4m35.840s\n"
]
}
],
"source": [
"%%bash\n",
"time WGETRC=/data/wgetrc_berk_seq wget -r -np -nc -q --accept \"Undetermined_S0_L005*\" ftp://gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"total 0\r\n",
"drwxr-xr-x 1 srlab staff 102 Mar 1 21:35 \u001b[0m\u001b[01;34mgslserver.qb3.berkeley.edu\u001b[0m/\r\n"
]
}
],
"source": [
"ls -lh"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"real\t0m2.741s\n",
"user\t0m0.020s\n",
"sys\t0m0.040s\n"
]
}
],
"source": [
"%%bash\n",
"time WGETRC=/data/wgetrc_berk_seq wget -r -np -nc -q --accept \"Undetermined_checksums\" ftp://gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"total 0\r\n",
"drwxr-xr-x 1 srlab staff 102 Mar 1 21:35 \u001b[0m\u001b[01;34mgslserver.qb3.berkeley.edu\u001b[0m/\r\n"
]
}
],
"source": [
"ls -lh"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generate new MD5 checksums"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"md5sum: *.gz: No such file or directory\n",
"\n",
"real\t0m0.016s\n",
"user\t0m0.000s\n",
"sys\t0m0.000s\n"
]
}
],
"source": [
"%%bash\n",
"time for i in *.gz\n",
" do\n",
" md5sum \"$i\" >> checksums.md5\n",
" done"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"diff: Undetermined_checksums: No such file or directory\n"
]
}
],
"source": [
"%%bash\n",
"diff Undetermined_checksums checksums.md5\n",
"echo $?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Duh! I ran all of this from the wrong directory. Here we go again..."
]
},
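{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to catch this sooner, sketched below assuming bash and GNU ```md5sum```: check that the ```*.gz``` glob actually matched something before hashing. The ```checksum_dir``` function and its arguments are hypothetical, not part of the original workflow.\n",
"\n",
"```shell\n",
"# Hypothetical helper: write MD5 checksums for every .gz file in a directory ($1)\n",
"# to an output file ($2), and return an error if the directory has no .gz files.\n",
"checksum_dir() {\n",
"  local dir=\"$1\" out=\"$2\" f found=0\n",
"  for f in \"$dir\"/*.gz; do\n",
"    # With no matches, the unexpanded pattern is left in $f; skip it.\n",
"    [ -e \"$f\" ] || continue\n",
"    found=1\n",
"    # Run md5sum from inside the directory so only the bare file name is recorded.\n",
"    (cd \"$dir\" && md5sum \"$(basename \"$f\")\") >> \"$out\"\n",
"  done\n",
"  if [ \"$found\" -eq 0 ]; then\n",
"    echo \"No .gz files in $dir\" >&2\n",
"    return 1\n",
"  fi\n",
"}\n",
"```\n",
"\n",
"Usage would look something like ```checksum_dir gslserver.qb3.berkeley.edu/170217_100PE_HS4KA checksums.md5```, run from ```/data/20170227_jay_data_tmp```."
]
},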
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/data/20170227_jay_data_tmp/gslserver.qb3.berkeley.edu/170217_100PE_HS4KA\n"
]
}
],
"source": [
"cd gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"total 5.2G\n",
"drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Alfaro\n",
"drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Chang\n",
"drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Coates\n",
"drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Doudna\n",
"drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Johnson\n",
"drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Pachter\n",
"drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Roberts\n",
"-rw-r--r-- 1 srlab staff 2.2G Feb 21 23:13 Undetermined_S0_L005_R1_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 3.1G Feb 21 23:13 Undetermined_S0_L005_R2_001.fastq.gz\n",
"-rw-r--r-- 1 srlab staff 142 Feb 28 17:18 Undetermined_checksums\n",
"drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 Wayne\n"
]
}
],
"source": [
"%%bash\n",
"ls -lh"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"real\t1m48.294s\n",
"user\t0m15.830s\n",
"sys\t0m33.760s\n"
]
}
],
"source": [
"%%bash\n",
"time for i in *.gz\n",
" do\n",
" md5sum \"$i\" >> checksums.md5\n",
" done"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0\n"
]
}
],
"source": [
"%%bash\n",
"diff Undetermined_checksums checksums.md5\n",
"echo $?"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"484082c497c7a52fa225cb0983c709a9 Undetermined_S0_L005_R1_001.fastq.gz\n",
"9718d259172f2c05ef97eb0d439c31da Undetermined_S0_L005_R2_001.fastq.gz\n"
]
}
],
"source": [
"%%bash\n",
"cat Undetermined_checksums"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Copy files to Owl"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"real\t20m26.143s\n",
"user\t0m0.040s\n",
"sys\t3m3.080s\n"
]
}
],
"source": [
"%%bash\n",
"time for file in *.gz\n",
" do\n",
" cp --no-clobber \"$file\" /owl_web/nightingales/P_generosa/\n",
" cp --no-clobber \"$file\" /owl_web/nightingales/Porites_spp/\n",
" cp --no-clobber \"$file\" /owl_web/nightingales/A_elegantissima/\n",
" done"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generate MD5 checksums for files copied to Owl"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"real\t5m2.141s\n",
"user\t0m7.910s\n",
"sys\t0m37.020s\n"
]
}
],
"source": [
"%%bash\n",
"time for i in /owl_web/nightingales/P_generosa/Undetermined_S0_L005_R*.gz\n",
" do\n",
" md5sum \"$i\" >> temp_checksums.md5\n",
" done"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"real\t5m42.952s\n",
"user\t0m6.680s\n",
"sys\t0m39.320s\n"
]
}
],
"source": [
"%%bash\n",
"time for i in /owl_web/nightingales/Porites_spp/Undetermined_S0_L005_R*.gz\n",
" do\n",
" md5sum \"$i\" >> temp_checksums.md5\n",
" done"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"real\t5m57.994s\n",
"user\t0m6.940s\n",
"sys\t0m38.520s\n"
]
}
],
"source": [
"%%bash\n",
"time for i in /owl_web/nightingales/A_elegantissima/Undetermined_S0_L005_R*.gz\n",
" do\n",
" md5sum \"$i\" >> temp_checksums.md5\n",
" done"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"484082c497c7a52fa225cb0983c709a9 /owl_web/nightingales/P_generosa/Undetermined_S0_L005_R1_001.fastq.gz\n",
"9718d259172f2c05ef97eb0d439c31da /owl_web/nightingales/P_generosa/Undetermined_S0_L005_R2_001.fastq.gz\n",
"484082c497c7a52fa225cb0983c709a9 /owl_web/nightingales/Porites_spp/Undetermined_S0_L005_R1_001.fastq.gz\n",
"9718d259172f2c05ef97eb0d439c31da /owl_web/nightingales/Porites_spp/Undetermined_S0_L005_R2_001.fastq.gz\n",
"484082c497c7a52fa225cb0983c709a9 /owl_web/nightingales/A_elegantissima/Undetermined_S0_L005_R1_001.fastq.gz\n",
"9718d259172f2c05ef97eb0d439c31da /owl_web/nightingales/A_elegantissima/Undetermined_S0_L005_R2_001.fastq.gz\n"
]
}
],
"source": [
"%%bash\n",
"cat temp_checksums.md5"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"484082c497c7a52fa225cb0983c709a9 Undetermined_S0_L005_R1_001.fastq.gz\n",
"9718d259172f2c05ef97eb0d439c31da Undetermined_S0_L005_R2_001.fastq.gz\n"
]
}
],
"source": [
"%%bash\n",
"cat Undetermined_checksums"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"All six checksums for the copies on Owl match the two facility-provided checksums in `Undetermined_checksums`."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Append facility checksums to checksum files on Owl"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%%bash\n",
"cat Undetermined_checksums >> /owl_web/nightingales/P_generosa/checksums.md5\n",
"cat Undetermined_checksums >> /owl_web/nightingales/Porites_spp/checksums.md5\n",
"cat Undetermined_checksums >> /owl_web/nightingales/A_elegantissima/checksums.md5"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Download barcode HTML file"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I'm not entirely sure what this file is for, but it might be useful to have."
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"real\t0m2.784s\n",
"user\t0m0.000s\n",
"sys\t0m0.070s\n"
]
}
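],
"source": [
"%%bash\n",
"# Recursively fetch only laneBarcode.html from the sequencing facility's FTP server.\n",
"# WGETRC points wget at a custom startup file; -np stays below the given directory, -nc skips files already downloaded, -q suppresses output.\n",
"time WGETRC=/data/wgetrc_berk_seq wget -r -np -nc -q --accept \"laneBarcode.html\" ftp://gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/"
]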
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"total 5.2G\r\n",
"drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 \u001b[0m\u001b[01;34mAlfaro\u001b[0m/\r\n",
"drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 \u001b[01;34mChang\u001b[0m/\r\n",
"drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 \u001b[01;34mCoates\u001b[0m/\r\n",
"drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 \u001b[01;34mDoudna\u001b[0m/\r\n",
"drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 \u001b[01;34mJohnson\u001b[0m/\r\n",
"drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 \u001b[01;34mPachter\u001b[0m/\r\n",
"drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 \u001b[01;34mRoberts\u001b[0m/\r\n",
"-rw-r--r-- 1 srlab staff 2.2G Feb 21 23:13 Undetermined_S0_L005_R1_001.fastq.gz\r\n",
"-rw-r--r-- 1 srlab staff 3.1G Feb 21 23:13 Undetermined_S0_L005_R2_001.fastq.gz\r\n",
"-rw-r--r-- 1 srlab staff 142 Feb 28 17:18 Undetermined_checksums\r\n",
"drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 \u001b[01;34mWayne\u001b[0m/\r\n",
"-rw-r--r-- 1 srlab staff 142 Mar 1 22:18 checksums.md5\r\n",
"drwxr-xr-x 1 srlab staff 102 Mar 1 23:13 \u001b[01;34mgslserver.qb3.berkeley.edu\u001b[0m/\r\n",
"-rw-r--r-- 1 srlab staff 636 Mar 1 22:59 temp_checksums.md5\r\n"
]
}
],
"source": [
"ls -lh"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\" \"http://www.w3.org/TR/html4/loose.dtd\">\n",
"<html xmlns:bcl2fastq=\"http://www.illumina.com/bcl2fastq\">\n",
"<link rel=\"stylesheet\" href=\"../../../../Report.css\" type=\"text/css\">\n",
"<body>\n",
"<table width=\"100%\"><tr>\n",
"<td><p><p>HG3WNBBXX /\n",
" [all projects] /\n",
" [all samples] /\n",
" [all barcodes]</p></p></td>\n",
"<td><p align=\"right\"><a href=\"../../../../HG3WNBBXX/all/all/all/lane.html\">hide barcodes</a></p></td>\n"
]
}
],
"source": [
"%%bash\n",
"head gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/laneBarcode.html"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<td>GTCCCG</td>\n",
"<td>187,142</td>\n",
"<td>CGGACGAG</td>\n",
"<td>290,611</td>\n",
"<td>CGAGGCTG+CACAAAAA</td>\n",
"</tr>\n",
"</table>\n",
"<p></p>\n",
"</body>\n",
"</html>\n"
]
}
],
"source": [
"%%bash\n",
"tail gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/laneBarcode.html"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I'm going to rename the file so that it is clearly associated with these sequencing files, and then copy it to each of the three directories on Owl."
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%%bash\n",
"mv gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/laneBarcode.html JD_L005_laneBarcode.html"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"total 5426292\r\n",
"drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 \u001b[0m\u001b[01;34mAlfaro\u001b[0m/\r\n",
"drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 \u001b[01;34mChang\u001b[0m/\r\n",
"drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 \u001b[01;34mCoates\u001b[0m/\r\n",
"drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 \u001b[01;34mDoudna\u001b[0m/\r\n",
"-rw-r--r-- 1 srlab staff 43031 Feb 22 00:29 JD_L005_laneBarcode.html\r\n",
"drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 \u001b[01;34mJohnson\u001b[0m/\r\n",
"drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 \u001b[01;34mPachter\u001b[0m/\r\n",
"drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 \u001b[01;34mRoberts\u001b[0m/\r\n",
"-rw-r--r-- 1 srlab staff 2281224946 Feb 21 23:13 Undetermined_S0_L005_R1_001.fastq.gz\r\n",
"-rw-r--r-- 1 srlab staff 3275238267 Feb 21 23:13 Undetermined_S0_L005_R2_001.fastq.gz\r\n",
"-rw-r--r-- 1 srlab staff 142 Feb 28 17:18 Undetermined_checksums\r\n",
"drwxr-xr-x 1 srlab staff 68 Mar 1 22:07 \u001b[01;34mWayne\u001b[0m/\r\n",
"-rw-r--r-- 1 srlab staff 142 Mar 1 22:18 checksums.md5\r\n",
"drwxr-xr-x 1 srlab staff 102 Mar 1 23:13 \u001b[01;34mgslserver.qb3.berkeley.edu\u001b[0m/\r\n",
"-rw-r--r-- 1 srlab staff 636 Mar 1 22:59 temp_checksums.md5\r\n"
]
}
],
"source": [
"ls -l"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%%bash\n",
"cp JD_L005_laneBarcode.html /owl_web/nightingales/P_generosa/JD_L005_laneBarcode.html\n",
"cp JD_L005_laneBarcode.html /owl_web/nightingales/Porites_spp/JD_L005_laneBarcode.html\n",
"cp JD_L005_laneBarcode.html /owl_web/nightingales/A_elegantissima/JD_L005_laneBarcode.html"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/owl_web/nightingales/P_generosa/JD_L005_laneBarcode.html\n",
"/owl_web/nightingales/Porites_spp/JD_L005_laneBarcode.html\n",
"/owl_web/nightingales/A_elegantissima/JD_L005_laneBarcode.html\n"
]
}
],
"source": [
"%%bash\n",
"ls /owl_web/nightingales/P_generosa/JD_L005_laneBarcode.html\n",
"ls /owl_web/nightingales/Porites_spp/JD_L005_laneBarcode.html\n",
"ls /owl_web/nightingales/A_elegantissima/JD_L005_laneBarcode.html"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python [default]",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
}