### Running in Docker container on Ostrich

#### Started Docker container with the following command:

```docker run -p 8888:8888 -v /Users/sam/data/:/data -v /Users/sam/owl_home/:/owl_home -v /Users/sam/owl_web/:/owl_web -v /Users/sam/gitrepos:/gitrepos -it f99537d7e06a```

The command allows access to Jupyter Notebook over port 8888 and makes my Jupyter Notebook GitHub repo and my data files on Owl/home and Owl/web accessible to the Docker container.

Once the container was running, I started Jupyter Notebook with the following command inside the Docker container:

```jupyter notebook```

This is configured in the Docker container to launch a Jupyter Notebook without a browser on port 8888.

The Docker container is running on an image created from this [Dockerfile (Git commit 443bc42)](https://github.com/sr320/LabDocs/blob/443bc425cd36d23a07cf12625f38b7e3a397b9be/code/dockerfiles/Dockerfile.bio)

In [1]:
%%bash
date

Mon Feb 27 18:32:53 UTC 2017


### Check computer specs

In [2]:
%%bash
hostname

0f2bca9c664b


In [3]:
%%bash
lscpu

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    1
Core(s) per socket:    8
Socket(s):             1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 26
Model name:            Intel(R) Xeon(R) CPU           E5520  @ 2.27GHz
Stepping:              5
CPU MHz:               2260.998
BogoMIPS:              4521.99
Hypervisor vendor:     KVM
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              8192K


### Download Jay's Non-Demultiplexed data

#### The files in this folder are as follows (email correspondence):
> Hi Sam,
>
>Your new directory is on the server "Dimond_170224", the checksum info in listed in a text file for the three files. Is there a better way to list the checksums? I'm new to this.  
>  
>Also, I gave you the 2 reads and the 6bp index file.  
>  
>Best,
>
>Shana

>Shana McDevitt

>Director

>Vincent J. Coates Genomics Sequencing Laboratory

>California Institute for Quantitative Biosciences (QB3)

>University of California, Berkeley

In [1]:
cd /data/20170227_jay_data_tmp/

/data/20170227_jay_data_tmp


#### The following command uses ```wget``` to download all of the files in the target directory. Here's an explanation of the code:

- ```time```: Evaluates how long it takes for the command to complete. 

- ```WGETRC=/data/wgetrc_berk_seq```: This sets the environment variable ```WGETRC``` to the path of the ```wgetrc_berk_seq``` file for this one command, so that ```wget``` reads its settings from that file. The file contains the username and password needed to download the data via FTP from the UC Berkeley server. Using this allows me to run the command in a Jupyter notebook without pasting the actual username and password into the command string.

- ```-r```: Recursive; i.e. download all things in this directory and anything in any subdirectories.

- ```-np```: No parent; i.e. do not ascend to higher directories.

- ```-nc```: No clobber; i.e. do not overwrite any existing files in the download directory.

- ```-q```: Quiet; i.e. do not print wget status to screen. This is to prevent bogging down the Jupyter notebook with thousands of output lines.
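
For reference, here is a minimal sketch of how such a credentials file could be set up. The actual contents of ```/data/wgetrc_berk_seq``` are not shown in this notebook, so the temporary file, user name, and password below are placeholders. ```user``` and ```password``` are standard wgetrc settings. The ```wget``` command is only printed, not run.

```shell
# Hypothetical stand-in for /data/wgetrc_berk_seq (the real file is not shown here).
rc_file="$(mktemp)"

# wgetrc uses "key = value" lines; both values below are placeholders.
cat > "$rc_file" <<'EOF'
user = EXAMPLE_USER
password = EXAMPLE_PASSWORD
EOF

# Keep the credentials readable by the owner only.
chmod 600 "$rc_file"

# Print (rather than run) the download command that would use this file.
wget_cmd="WGETRC=$rc_file wget -r -np -nc -q ftp://gslserver.qb3.berkeley.edu/Dimond_170224"
echo "$wget_cmd"
```

Setting the variable in front of the command limits it to that single ```wget``` call, so nothing is exported into the rest of the session.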

In [5]:
%%bash
time WGETRC=/data/wgetrc_berk_seq wget -r -np -nc -q ftp://gslserver.qb3.berkeley.edu/Dimond_170224


real	37m7.999s
user	0m3.130s
sys	20m34.060s


In [6]:
%%bash
ls -lh

total 0
drwxr-xr-x 1 srlab staff 102 Feb 27 19:54 gslserver.qb3.berkeley.edu


In [7]:
cd gslserver.qb3.berkeley.edu/Dimond_170224/

/data/20170227_jay_data_tmp/gslserver.qb3.berkeley.edu/Dimond_170224


In [8]:
%%bash
ls -lh

total 41G
-rw-r--r-- 1 srlab staff 2.1G Feb 24 23:28 JD002_S0_L005_I1_001.fastq.gz
-rw-r--r-- 1 srlab staff  18G Feb 24 23:28 JD002_S0_L005_R1_001.fastq.gz
-rw-r--r-- 1 srlab staff  22G Feb 24 23:28 JD002_S0_L005_R2_001.fastq.gz
-rw-r--r-- 1 srlab staff  192 Feb 25 00:01 md5sum_report


In [9]:
cat md5sum_report

baa87464b77f937fccf496351bb7f000  JD002_S0_L005_I1_001.fastq.gz
e05eea61dbd405c890f241f824b2012b  JD002_S0_L005_R1_001.fastq.gz
9e34ddfc4dbdd9a96bd4f8f102f52693  JD002_S0_L005_R2_001.fastq.gz


### Generate our own MD5 checksums

In [10]:
%%bash
time for i in *.gz
    do
    md5sum "$i" >> checksums.md5
    done


real	8m0.815s
user	0m4.260s
sys	4m44.580s


In [11]:
cat checksums.md5

baa87464b77f937fccf496351bb7f000  JD002_S0_L005_I1_001.fastq.gz
e05eea61dbd405c890f241f824b2012b  JD002_S0_L005_R1_001.fastq.gz
9e34ddfc4dbdd9a96bd4f8f102f52693  JD002_S0_L005_R2_001.fastq.gz


### Compare MD5 checksums

Visual inspection suggests that these are good to go, but we'll compare them programmatically anyway...

In [12]:
%%bash
diff checksums.md5 md5sum_report

No output means no differences between the two files. To verify further, we'll check the exit status of the last command run by reading the bash variable ```$?``` (```diff``` exits with 0 when the files are identical, 1 when they differ, and 2 on error). Note that each ```%%bash``` cell runs in its own bash process, so ```$?``` only reflects ```diff``` when both are run in the same cell.

In [13]:
%%bash
echo $?

0
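
A small sketch of reading the exit status in the same cell as ```diff```, so that the value really belongs to ```diff```. The file names and checksum strings here are throwaway placeholders written to a temporary directory, not the real checksum files.

```shell
work_dir="$(mktemp -d)"

# Placeholder checksum lines (not real MD5 values).
printf 'abc123  sample.fastq.gz\n' > "$work_dir/facility.md5"
printf 'abc123  sample.fastq.gz\n' > "$work_dir/local.md5"
printf 'ffffff  sample.fastq.gz\n' > "$work_dir/corrupt.md5"

# Print the exit status of diff for two files: 0 = identical, 1 = different.
compare() {
    if diff "$1" "$2" > /dev/null; then
        echo 0
    else
        echo $?
    fi
}

same_status="$(compare "$work_dir/facility.md5" "$work_dir/local.md5")"
diff_status="$(compare "$work_dir/facility.md5" "$work_dir/corrupt.md5")"

echo "identical files: $same_status, differing files: $diff_status"
rm -r "$work_dir"
```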


### Copy files to directories on Owl

Jay has three different species in his sequencing data, so I'm copying the data to each of three different species folders on Owl. The code below uses the ```--no-clobber``` argument to prevent the program from overwriting any existing files in the destination directory that might have the same file name.

In [14]:
%%bash
time for file in *.gz
    do
    cp --no-clobber "$file" /owl_web/nightingales/P_generosa/
    cp --no-clobber "$file" /owl_web/nightingales/Porites_spp/
    cp --no-clobber "$file" /owl_web/nightingales/A_elegantissima/
    done


real	130m18.473s
user	0m0.020s
sys	18m23.290s
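
To illustrate what ```--no-clobber``` does, here is a minimal sketch using two temporary directories and a tiny placeholder file (the directory and file names are made up for the example). The second copy is skipped because the destination file already exists.

```shell
src_dir="$(mktemp -d)"
dest_dir="$(mktemp -d)"

# First copy: the destination is empty, so the file is copied.
printf 'original\n' > "$src_dir/reads.fastq.gz"
cp --no-clobber "$src_dir/reads.fastq.gz" "$dest_dir/"

# Change the source, then copy again: the existing destination file is kept.
printf 'changed\n' > "$src_dir/reads.fastq.gz"
# The exit status for a skipped file differs between coreutils versions, so ignore it.
cp --no-clobber "$src_dir/reads.fastq.gz" "$dest_dir/" 2> /dev/null || true

dest_content="$(cat "$dest_dir/reads.fastq.gz")"
echo "destination still contains: $dest_content"
```

One consequence is that a file left incomplete by an interrupted copy would also be skipped on a rerun, which is one more reason to check the checksums of the copies.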


### Generate new checksums for files copied to Owl

In [17]:
%%bash
time for i in /owl_web/nightingales/P_generosa/JD002_S0_L005*.gz
    do
    md5sum "$i" >> temp_checksums.md5
    done


real	31m19.144s
user	0m3.860s
sys	4m44.720s


In [18]:
%%bash
time for i in /owl_web/nightingales/Porites_spp/JD002_S0_L005*.gz
    do
    md5sum "$i" >> temp_checksums.md5
    done


real	27m17.756s
user	0m4.390s
sys	4m44.170s


In [19]:
%%bash
time for i in /owl_web/nightingales/A_elegantissima/JD002_S0_L005*.gz
    do
    md5sum "$i" >> temp_checksums.md5
    done


real	26m45.605s
user	0m4.870s
sys	4m43.930s


### Compare initial checksums with temporary checksums on Owl

I screwed up and didn't create/write the ```temp_checksums.md5``` file into the different directories on Owl. Will create a concatenated ```md5sum_report``` file that mimics the contents of the ```temp_checksums.md5``` file.

In [20]:
%%bash
cat temp_checksums.md5

baa87464b77f937fccf496351bb7f000  /owl_web/nightingales/P_generosa/JD002_S0_L005_I1_001.fastq.gz
e05eea61dbd405c890f241f824b2012b  /owl_web/nightingales/P_generosa/JD002_S0_L005_R1_001.fastq.gz
9e34ddfc4dbdd9a96bd4f8f102f52693  /owl_web/nightingales/P_generosa/JD002_S0_L005_R2_001.fastq.gz
baa87464b77f937fccf496351bb7f000  /owl_web/nightingales/Porites_spp/JD002_S0_L005_I1_001.fastq.gz
e05eea61dbd405c890f241f824b2012b  /owl_web/nightingales/Porites_spp/JD002_S0_L005_R1_001.fastq.gz
9e34ddfc4dbdd9a96bd4f8f102f52693  /owl_web/nightingales/Porites_spp/JD002_S0_L005_R2_001.fastq.gz
baa87464b77f937fccf496351bb7f000  /owl_web/nightingales/A_elegantissima/JD002_S0_L005_I1_001.fastq.gz
e05eea61dbd405c890f241f824b2012b  /owl_web/nightingales/A_elegantissima/JD002_S0_L005_R1_001.fastq.gz
9e34ddfc4dbdd9a96bd4f8f102f52693  /owl_web/nightingales/A_elegantissima/JD002_S0_L005_R2_001.fastq.gz


In [21]:
%%bash
cat md5sum_report >> md5sum_report_cat
cat md5sum_report >> md5sum_report_cat
cat md5sum_report >> md5sum_report_cat

In [22]:
%%bash
cat md5sum_report_cat

baa87464b77f937fccf496351bb7f000  JD002_S0_L005_I1_001.fastq.gz
e05eea61dbd405c890f241f824b2012b  JD002_S0_L005_R1_001.fastq.gz
9e34ddfc4dbdd9a96bd4f8f102f52693  JD002_S0_L005_R2_001.fastq.gz
baa87464b77f937fccf496351bb7f000  JD002_S0_L005_I1_001.fastq.gz
e05eea61dbd405c890f241f824b2012b  JD002_S0_L005_R1_001.fastq.gz
9e34ddfc4dbdd9a96bd4f8f102f52693  JD002_S0_L005_R2_001.fastq.gz
baa87464b77f937fccf496351bb7f000  JD002_S0_L005_I1_001.fastq.gz
e05eea61dbd405c890f241f824b2012b  JD002_S0_L005_R1_001.fastq.gz
9e34ddfc4dbdd9a96bd4f8f102f52693  JD002_S0_L005_R2_001.fastq.gz


In [23]:
%%bash
diff md5sum_report_cat temp_checksums.md5

1,9c1,9
< baa87464b77f937fccf496351bb7f000  JD002_S0_L005_I1_001.fastq.gz
< e05eea61dbd405c890f241f824b2012b  JD002_S0_L005_R1_001.fastq.gz
< 9e34ddfc4dbdd9a96bd4f8f102f52693  JD002_S0_L005_R2_001.fastq.gz
< baa87464b77f937fccf496351bb7f000  JD002_S0_L005_I1_001.fastq.gz
< e05eea61dbd405c890f241f824b2012b  JD002_S0_L005_R1_001.fastq.gz
< 9e34ddfc4dbdd9a96bd4f8f102f52693  JD002_S0_L005_R2_001.fastq.gz
< baa87464b77f937fccf496351bb7f000  JD002_S0_L005_I1_001.fastq.gz
< e05eea61dbd405c890f241f824b2012b  JD002_S0_L005_R1_001.fastq.gz
< 9e34ddfc4dbdd9a96bd4f8f102f52693  JD002_S0_L005_R2_001.fastq.gz
---
> baa87464b77f937fccf496351bb7f000  /owl_web/nightingales/P_generosa/JD002_S0_L005_I1_001.fastq.gz
> e05eea61dbd405c890f241f824b2012b  /owl_web/nightingales/P_generosa/JD002_S0_L005_R1_001.fastq.gz
> 9e34ddfc4dbdd9a96bd4f8f102f52693  /owl_web/nightingales/P_generosa/JD002_S0_L005_R2_001.fastq.gz
> baa87464b77f937fccf496351bb7f000  /owl_web/nightingales/Porites_spp/JD002_S0_L005_I1_001.fastq.

Well, I didn't take into account that the full path to each file would be written to the checksum file, so ```diff``` flags every line as different. However, the checksums themselves match on visual inspection. Will proceed with adding the checksums to the checksum files in each Owl directory.
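
One way to avoid the path problem, sketched here with temporary directories standing in for the download folder and an Owl species folder (all names are placeholders), is to run ```md5sum``` from inside the destination directory so that only bare file names are written.

```shell
work_dir="$(mktemp -d)"   # stands in for the download directory
dest_dir="$(mktemp -d)"   # stands in for a species directory on Owl

printf 'example reads\n' > "$work_dir/sample_R1.fastq.gz"
cp "$work_dir/sample_R1.fastq.gz" "$dest_dir/"

# Facility-style report: checksum followed by a bare file name.
(cd "$work_dir" && md5sum *.gz) > "$work_dir/md5sum_report"

# Running md5sum from inside the destination keeps the directory out of the output.
(cd "$dest_dir" && md5sum *.gz) > "$work_dir/temp_checksums.md5"

if diff "$work_dir/md5sum_report" "$work_dir/temp_checksums.md5"; then
    paths_match="yes"
else
    paths_match="no"
fi
echo "reports identical: $paths_match"
```

The parentheses run each ```cd``` in a subshell, so the working directory of the cell itself does not change.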

### Append checksums to existing checksum files in each directory

In [24]:
%%bash
cat checksums.md5sums.md5sum_report >> /owl_web/nightingales/P_generosa/checksums.md5
cat md5sum_report >> /owl_web/nightingales/Porites_spp/checksums.md5
cat md5sum_report >> /owl_web/nightingales/A_elegantissima/checksums.md5

Whoops! Typo in that first line above! Fixed below

In [25]:
%%bash
cat md5sum_report >> /owl_web/nightingales/P_generosa/checksums.md5

In [26]:
%%bash
ls -lh

total 41G
-rw-r--r-- 1 srlab staff 2.1G Feb 24 23:28 JD002_S0_L005_I1_001.fastq.gz
-rw-r--r-- 1 srlab staff  18G Feb 24 23:28 JD002_S0_L005_R1_001.fastq.gz
-rw-r--r-- 1 srlab staff  22G Feb 24 23:28 JD002_S0_L005_R2_001.fastq.gz
-rw-r--r-- 1 srlab staff  192 Feb 27 20:44 checksums.md5
-rw-r--r-- 1 srlab staff  192 Feb 25 00:01 md5sum_report
-rw-r--r-- 1 srlab staff  576 Feb 28 01:13 md5sum_report_cat
-rw-r--r-- 1 srlab staff  891 Feb 28 01:13 temp_checksums.md5


In [27]:
rm -rf /data/20170227_jay_data_tmp/gslserver.qb3.berkeley.edu/

In [28]:
%%bash
ls -lh /data/20170227_jay_data_tmp/

total 0


shell-init: error retrieving current directory: getcwd: cannot access parent directories: No such file or directory


### Download Jay's demultiplexed data

In [1]:
cd /data/20170227_jay_data_tmp/

/data/20170227_jay_data_tmp


In [2]:
%%bash
time WGETRC=/data/wgetrc_berk_seq wget -r -np -nc -q ftp://gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/Roberts


real	41m4.943s
user	0m3.130s
sys	17m12.850s


In [4]:
%%bash
ls -lh /data/20170227_jay_data_tmp/gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/Roberts/

total 34G
-rw-r--r-- 1 srlab staff 1.1G Feb 21 23:13 JD002A_S131_L005_R1_001.fastq.gz
-rw-r--r-- 1 srlab staff 1.3G Feb 21 23:13 JD002A_S131_L005_R2_001.fastq.gz
-rw-r--r-- 1 srlab staff 1.3G Feb 21 23:13 JD002B_S132_L005_R1_001.fastq.gz
-rw-r--r-- 1 srlab staff 1.6G Feb 21 23:13 JD002B_S132_L005_R2_001.fastq.gz
-rw-r--r-- 1 srlab staff 1.2G Feb 21 23:13 JD002C_S133_L005_R1_001.fastq.gz
-rw-r--r-- 1 srlab staff 1.5G Feb 21 23:13 JD002C_S133_L005_R2_001.fastq.gz
-rw-r--r-- 1 srlab staff 1.4G Feb 21 23:13 JD002D_S134_L005_R1_001.fastq.gz
-rw-r--r-- 1 srlab staff 1.7G Feb 21 23:13 JD002D_S134_L005_R2_001.fastq.gz
-rw-r--r-- 1 srlab staff 1.4G Feb 21 23:13 JD002E_S135_L005_R1_001.fastq.gz
-rw-r--r-- 1 srlab staff 1.8G Feb 21 23:13 JD002E_S135_L005_R2_001.fastq.gz
-rw-r--r-- 1 srlab staff 1.5G Feb 21 23:13 JD002F_S136_L005_R1_001.fastq.gz
-rw-r--r-- 1 srlab staff 1.9G Feb 21 23:13 JD002F_S136_L005_R2_001.fastq.gz
-rw-r--r-- 1 srlab staff 1.1G Feb 21 23:13 JD002G_S137_L005_R1_001.fastq.gz
-r

In [5]:
cd /data/20170227_jay_data_tmp/gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/Roberts/

/data/20170227_jay_data_tmp/gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/Roberts


In [6]:
%%bash
cat demultiplexed_checksums

c6d2bab7dabb6043a8565482b7b03cda  JD002A_S131_L005_R1_001.fastq.gz
77d17d6425e818a798e28ff3dd7f34f0  JD002A_S131_L005_R2_001.fastq.gz
51346326dfa475706b1c219dd86dc4f2  JD002B_S132_L005_R1_001.fastq.gz
0062c59dc8fcda8fcfa452ebb717419c  JD002B_S132_L005_R2_001.fastq.gz
91ae2b454af8343fe79f5db506e71438  JD002C_S133_L005_R1_001.fastq.gz
c428291a87958081fdc647e9d121506d  JD002C_S133_L005_R2_001.fastq.gz
27a4bed7f18e5f71372f4676e0369a6e  JD002D_S134_L005_R1_001.fastq.gz
84f0710eee4765e9aba0db009d004244  JD002D_S134_L005_R2_001.fastq.gz
0d4e953296924154d616c1143f9c4ad8  JD002E_S135_L005_R1_001.fastq.gz
896b03ed79dbc33113aaf646ce94b65a  JD002E_S135_L005_R2_001.fastq.gz
ae11c97c5c877787088e236e8c158346  JD002F_S136_L005_R1_001.fastq.gz
d8968dc209461435a66af6382f049a19  JD002F_S136_L005_R2_001.fastq.gz
83b5e361d8d3c4cff1d464d0428171d1  JD002G_S137_L005_R1_001.fastq.gz
168c2dbf1a7585d2a2a1b13a56e2f4e6  JD002G_S137_L005_R2_001.fastq.gz
1403cafaa172d2b009404e98ef6503ae  JD002H_S138_L005_R1_001.fast

### Generate our own checksums

In [7]:
%%bash
time for i in *.gz
    do
    md5sum "$i" >> checksums.md5
    done


real	7m0.125s
user	0m6.260s
sys	3m52.740s


In [8]:
%%bash
cat checksums.md5

c6d2bab7dabb6043a8565482b7b03cda  JD002A_S131_L005_R1_001.fastq.gz
77d17d6425e818a798e28ff3dd7f34f0  JD002A_S131_L005_R2_001.fastq.gz
51346326dfa475706b1c219dd86dc4f2  JD002B_S132_L005_R1_001.fastq.gz
0062c59dc8fcda8fcfa452ebb717419c  JD002B_S132_L005_R2_001.fastq.gz
91ae2b454af8343fe79f5db506e71438  JD002C_S133_L005_R1_001.fastq.gz
c428291a87958081fdc647e9d121506d  JD002C_S133_L005_R2_001.fastq.gz
27a4bed7f18e5f71372f4676e0369a6e  JD002D_S134_L005_R1_001.fastq.gz
84f0710eee4765e9aba0db009d004244  JD002D_S134_L005_R2_001.fastq.gz
0d4e953296924154d616c1143f9c4ad8  JD002E_S135_L005_R1_001.fastq.gz
896b03ed79dbc33113aaf646ce94b65a  JD002E_S135_L005_R2_001.fastq.gz
ae11c97c5c877787088e236e8c158346  JD002F_S136_L005_R1_001.fastq.gz
d8968dc209461435a66af6382f049a19  JD002F_S136_L005_R2_001.fastq.gz
83b5e361d8d3c4cff1d464d0428171d1  JD002G_S137_L005_R1_001.fastq.gz
168c2dbf1a7585d2a2a1b13a56e2f4e6  JD002G_S137_L005_R2_001.fastq.gz
1403cafaa172d2b009404e98ef6503ae  JD002H_S138_L005_R1_001.fast

### Compare MD5 checksums

In [9]:
%%bash
diff demultiplexed_checksums checksums.md5

In [10]:
%%bash
echo $?

0


### Copy files to directories on Owl

Jay has three different species in his sequencing data, so I'm copying the data to each of three different species folders on Owl. The code below uses the ```--no-clobber``` argument to prevent the program from overwriting any existing files in the destination directory that might have the same file name.

In [11]:
%%bash
time for file in *.gz
    do
    cp --no-clobber "$file" /owl_web/nightingales/P_generosa/
    cp --no-clobber "$file" /owl_web/nightingales/Porites_spp/
    cp --no-clobber "$file" /owl_web/nightingales/A_elegantissima/
    done


real	155m12.124s
user	0m0.180s
sys	15m13.790s


### Generate MD5 checksums for files copied to Owl

In [12]:
%%bash
time for i in /owl_web/nightingales/P_generosa/JD002[A-Z].gz
    do
    md5sum "$i" >> temp_checksums.md5
    done

md5sum: /owl_web/nightingales/P_generosa/JD002[A-Z].gz: No such file or directory

real	0m3.691s
user	0m0.000s
sys	0m0.000s


Whoops! Typo! The error message shows the glob was missing the ```*``` after ```[A-Z]```. Fixed below...

In [13]:
%%bash
time for i in /owl_web/nightingales/P_generosa/JD002[A-Z]*.gz
    do
    md5sum "$i" >> temp_checksums.md5
    done


real	24m29.381s
user	0m5.040s
sys	3m53.230s


In [14]:
%%bash
time for i in /owl_web/nightingales/Porites_spp/JD002[A-Z]*.gz
    do
    md5sum "$i" >> temp_checksums.md5
    done


real	25m3.500s
user	0m5.080s
sys	3m52.580s


In [15]:
%%bash
time for i in /owl_web/nightingales/A_elegantissima/JD002[A-Z]*.gz
    do
    md5sum "$i" >> temp_checksums.md5
    done


real	25m41.897s
user	0m4.710s
sys	3m52.660s


In [16]:
%%bash
cat temp_checksums.md5

c6d2bab7dabb6043a8565482b7b03cda  /owl_web/nightingales/P_generosa/JD002A_S131_L005_R1_001.fastq.gz
77d17d6425e818a798e28ff3dd7f34f0  /owl_web/nightingales/P_generosa/JD002A_S131_L005_R2_001.fastq.gz
51346326dfa475706b1c219dd86dc4f2  /owl_web/nightingales/P_generosa/JD002B_S132_L005_R1_001.fastq.gz
0062c59dc8fcda8fcfa452ebb717419c  /owl_web/nightingales/P_generosa/JD002B_S132_L005_R2_001.fastq.gz
91ae2b454af8343fe79f5db506e71438  /owl_web/nightingales/P_generosa/JD002C_S133_L005_R1_001.fastq.gz
c428291a87958081fdc647e9d121506d  /owl_web/nightingales/P_generosa/JD002C_S133_L005_R2_001.fastq.gz
27a4bed7f18e5f71372f4676e0369a6e  /owl_web/nightingales/P_generosa/JD002D_S134_L005_R1_001.fastq.gz
84f0710eee4765e9aba0db009d004244  /owl_web/nightingales/P_generosa/JD002D_S134_L005_R2_001.fastq.gz
0d4e953296924154d616c1143f9c4ad8  /owl_web/nightingales/P_generosa/JD002E_S135_L005_R1_001.fastq.gz
896b03ed79dbc33113aaf646ce94b65a  /owl_web/nightingales/P_generosa/JD002E_S135_L005_R2_001.fastq.gz


Since there are so many files this time, I'm going to strip the leading file path from the filenames so that I can actually use the ```diff``` command to compare checksums. The code below uses ```sed``` to edit the file in place (the ```-i``` argument), automatically creating a backup of the original file with the extension ```.bak```, and deletes everything up to and including the last slash on each line of the specified input file.

In [18]:
%%bash
time sed -i.bak 's/^.*\///' temp_checksums.md5 


real	0m0.021s
user	0m0.000s
sys	0m0.000s


In [19]:
%%bash
cat temp_checksums.md5

JD002A_S131_L005_R1_001.fastq.gz
JD002A_S131_L005_R2_001.fastq.gz
JD002B_S132_L005_R1_001.fastq.gz
JD002B_S132_L005_R2_001.fastq.gz
JD002C_S133_L005_R1_001.fastq.gz
JD002C_S133_L005_R2_001.fastq.gz
JD002D_S134_L005_R1_001.fastq.gz
JD002D_S134_L005_R2_001.fastq.gz
JD002E_S135_L005_R1_001.fastq.gz
JD002E_S135_L005_R2_001.fastq.gz
JD002F_S136_L005_R1_001.fastq.gz
JD002F_S136_L005_R2_001.fastq.gz
JD002G_S137_L005_R1_001.fastq.gz
JD002G_S137_L005_R2_001.fastq.gz
JD002H_S138_L005_R1_001.fastq.gz
JD002H_S138_L005_R2_001.fastq.gz
JD002I_S139_L005_R1_001.fastq.gz
JD002I_S139_L005_R2_001.fastq.gz
JD002J_S140_L005_R1_001.fastq.gz
JD002J_S140_L005_R2_001.fastq.gz
JD002K_S141_L005_R1_001.fastq.gz
JD002K_S141_L005_R2_001.fastq.gz
JD002L_S142_L005_R1_001.fastq.gz
JD002L_S142_L005_R2_001.fastq.gz
JD002A_S131_L005_R1_001.fastq.gz
JD002A_S131_L005_R2_001.fastq.gz
JD002B_S132_L005_R1_001.fastq.gz
JD002B_S132_L005_R2_001.fastq.gz
JD002C_S133_L005_R1_001.fastq.gz
JD002C_S133_L005_R2_001.fastq.gz
JD002D_S13

Well, that didn't work. It eliminated the first column (the checksums). Let's restore our file from the .bak backup file. Actually, I phrased that incorrectly. It did work exactly as it should. Sed edits things by lines, so each line was read and the pattern matching applied, leaving just the file name on each line.
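
For comparison, a short sketch of the difference on a single line taken from the output above. The second ```sed``` expression is my own variant, not something run in this notebook: it anchors on the two-space separator so the checksum column is left untouched.

```shell
line='c6d2bab7dabb6043a8565482b7b03cda  /owl_web/nightingales/P_generosa/JD002A_S131_L005_R1_001.fastq.gz'

# Original pattern: the greedy ^.*\/ runs from the start of the line to the last slash.
name_only="$(printf '%s\n' "$line" | sed 's/^.*\///')"

# Variant: match the two-space separator plus the directory part, and put the separator back.
stripped="$(printf '%s\n' "$line" | sed 's|  .*/|  |')"

echo "$name_only"
echo "$stripped"
```

Using ```|``` as the ```sed``` delimiter avoids having to escape the forward slash in the pattern.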

In [20]:
%%bash
mv temp_checksums.md5.bak temp_checksums.md5

In [21]:
%%bash
cat temp_checksums.md5

c6d2bab7dabb6043a8565482b7b03cda  /owl_web/nightingales/P_generosa/JD002A_S131_L005_R1_001.fastq.gz
77d17d6425e818a798e28ff3dd7f34f0  /owl_web/nightingales/P_generosa/JD002A_S131_L005_R2_001.fastq.gz
51346326dfa475706b1c219dd86dc4f2  /owl_web/nightingales/P_generosa/JD002B_S132_L005_R1_001.fastq.gz
0062c59dc8fcda8fcfa452ebb717419c  /owl_web/nightingales/P_generosa/JD002B_S132_L005_R2_001.fastq.gz
91ae2b454af8343fe79f5db506e71438  /owl_web/nightingales/P_generosa/JD002C_S133_L005_R1_001.fastq.gz
c428291a87958081fdc647e9d121506d  /owl_web/nightingales/P_generosa/JD002C_S133_L005_R2_001.fastq.gz
27a4bed7f18e5f71372f4676e0369a6e  /owl_web/nightingales/P_generosa/JD002D_S134_L005_R1_001.fastq.gz
84f0710eee4765e9aba0db009d004244  /owl_web/nightingales/P_generosa/JD002D_S134_L005_R2_001.fastq.gz
0d4e953296924154d616c1143f9c4ad8  /owl_web/nightingales/P_generosa/JD002E_S135_L005_R1_001.fastq.gz
896b03ed79dbc33113aaf646ce94b65a  /owl_web/nightingales/P_generosa/JD002E_S135_L005_R2_001.fastq.gz


Found a solution using awk (duh, as it's perfect for operating on specific columns). The code below uses the gsub function in awk to replace everything from the first forward slash through the last forward slash in column 2 (```$2```) with nothing (the empty quotes) and then prints each modified line. In this case, I redirect the output of that command to the ```temp_checksums.md5``` file.

In [24]:
%%bash
awk '{gsub(/\/.*\//,"",$2); print}' < temp_checksums.md5.bak > temp_checksums.md5

In [25]:
%%bash
cat temp_checksums.md5

c6d2bab7dabb6043a8565482b7b03cda JD002A_S131_L005_R1_001.fastq.gz
77d17d6425e818a798e28ff3dd7f34f0 JD002A_S131_L005_R2_001.fastq.gz
51346326dfa475706b1c219dd86dc4f2 JD002B_S132_L005_R1_001.fastq.gz
0062c59dc8fcda8fcfa452ebb717419c JD002B_S132_L005_R2_001.fastq.gz
91ae2b454af8343fe79f5db506e71438 JD002C_S133_L005_R1_001.fastq.gz
c428291a87958081fdc647e9d121506d JD002C_S133_L005_R2_001.fastq.gz
27a4bed7f18e5f71372f4676e0369a6e JD002D_S134_L005_R1_001.fastq.gz
84f0710eee4765e9aba0db009d004244 JD002D_S134_L005_R2_001.fastq.gz
0d4e953296924154d616c1143f9c4ad8 JD002E_S135_L005_R1_001.fastq.gz
896b03ed79dbc33113aaf646ce94b65a JD002E_S135_L005_R2_001.fastq.gz
ae11c97c5c877787088e236e8c158346 JD002F_S136_L005_R1_001.fastq.gz
d8968dc209461435a66af6382f049a19 JD002F_S136_L005_R2_001.fastq.gz
83b5e361d8d3c4cff1d464d0428171d1 JD002G_S137_L005_R1_001.fastq.gz
168c2dbf1a7585d2a2a1b13a56e2f4e6 JD002G_S137_L005_R2_001.fastq.gz
1403cafaa172d2b009404e98ef6503ae JD002H_S138_L005_R1_001.fastq.gz
0c2f1b9a95

### Compare initial checksums with temporary checksums on Owl

In [26]:
%%bash
cat demultiplexed_checksums >> demultiplexed_checksums_cat
cat demultiplexed_checksums >> demultiplexed_checksums_cat
cat demultiplexed_checksums >> demultiplexed_checksums_cat

In [27]:
%%bash
cat demultiplexed_checksums_cat

c6d2bab7dabb6043a8565482b7b03cda  JD002A_S131_L005_R1_001.fastq.gz
77d17d6425e818a798e28ff3dd7f34f0  JD002A_S131_L005_R2_001.fastq.gz
51346326dfa475706b1c219dd86dc4f2  JD002B_S132_L005_R1_001.fastq.gz
0062c59dc8fcda8fcfa452ebb717419c  JD002B_S132_L005_R2_001.fastq.gz
91ae2b454af8343fe79f5db506e71438  JD002C_S133_L005_R1_001.fastq.gz
c428291a87958081fdc647e9d121506d  JD002C_S133_L005_R2_001.fastq.gz
27a4bed7f18e5f71372f4676e0369a6e  JD002D_S134_L005_R1_001.fastq.gz
84f0710eee4765e9aba0db009d004244  JD002D_S134_L005_R2_001.fastq.gz
0d4e953296924154d616c1143f9c4ad8  JD002E_S135_L005_R1_001.fastq.gz
896b03ed79dbc33113aaf646ce94b65a  JD002E_S135_L005_R2_001.fastq.gz
ae11c97c5c877787088e236e8c158346  JD002F_S136_L005_R1_001.fastq.gz
d8968dc209461435a66af6382f049a19  JD002F_S136_L005_R2_001.fastq.gz
83b5e361d8d3c4cff1d464d0428171d1  JD002G_S137_L005_R1_001.fastq.gz
168c2dbf1a7585d2a2a1b13a56e2f4e6  JD002G_S137_L005_R2_001.fastq.gz
1403cafaa172d2b009404e98ef6503ae  JD002H_S138_L005_R1_001.fast

In [28]:
%%bash
diff demultiplexed_checksums_cat temp_checksums.md5
echo $?

1,72c1,72
< c6d2bab7dabb6043a8565482b7b03cda  JD002A_S131_L005_R1_001.fastq.gz
< 77d17d6425e818a798e28ff3dd7f34f0  JD002A_S131_L005_R2_001.fastq.gz
< 51346326dfa475706b1c219dd86dc4f2  JD002B_S132_L005_R1_001.fastq.gz
< 0062c59dc8fcda8fcfa452ebb717419c  JD002B_S132_L005_R2_001.fastq.gz
< 91ae2b454af8343fe79f5db506e71438  JD002C_S133_L005_R1_001.fastq.gz
< c428291a87958081fdc647e9d121506d  JD002C_S133_L005_R2_001.fastq.gz
< 27a4bed7f18e5f71372f4676e0369a6e  JD002D_S134_L005_R1_001.fastq.gz
< 84f0710eee4765e9aba0db009d004244  JD002D_S134_L005_R2_001.fastq.gz
< 0d4e953296924154d616c1143f9c4ad8  JD002E_S135_L005_R1_001.fastq.gz
< 896b03ed79dbc33113aaf646ce94b65a  JD002E_S135_L005_R2_001.fastq.gz
< ae11c97c5c877787088e236e8c158346  JD002F_S136_L005_R1_001.fastq.gz
< d8968dc209461435a66af6382f049a19  JD002F_S136_L005_R2_001.fastq.gz
< 83b5e361d8d3c4cff1d464d0428171d1  JD002G_S137_L005_R1_001.fastq.gz
< 168c2dbf1a7585d2a2a1b13a56e2f4e6  JD002G_S137_L005_R2_001.fastq.gz
< 1403cafaa172d2b009404e

Argh! Delimiter between the two columns is different!! That leads to ```diff``` identifying each line as being different in each file. Let's try again...
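
The reason the delimiter changed: when a field such as ```$2``` is modified, awk rebuilds the whole line using its output field separator (```OFS```), which is a single space by default. A minimal sketch on one line from the output above. Setting ```OFS``` to two spaces is one possible fix, shown here as an illustration.

```shell
line='c6d2bab7dabb6043a8565482b7b03cda  /owl_web/nightingales/P_generosa/JD002A_S131_L005_R1_001.fastq.gz'

# Editing $2 makes awk rebuild the line with OFS, a single space by default.
default_ofs="$(printf '%s\n' "$line" | awk '{gsub(/\/.*\//,"",$2); print}')"

# Setting OFS to two spaces restores the md5sum-style separator.
two_space_ofs="$(printf '%s\n' "$line" | awk 'BEGIN{OFS="  "} {gsub(/\/.*\//,"",$2); print}')"

echo "$default_ofs"
echo "$two_space_ofs"
```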

In [29]:
%%bash

UsageError: %%bash is a cell magic, but the cell body is empty.


In [30]:
%%bash
awk '{print $1 "  " $2}' temp_checksums.md5.bak > temp_checksums.md5

In [31]:
%%bash
cat temp_checksums.md5

c6d2bab7dabb6043a8565482b7b03cda  /owl_web/nightingales/P_generosa/JD002A_S131_L005_R1_001.fastq.gz
77d17d6425e818a798e28ff3dd7f34f0  /owl_web/nightingales/P_generosa/JD002A_S131_L005_R2_001.fastq.gz
51346326dfa475706b1c219dd86dc4f2  /owl_web/nightingales/P_generosa/JD002B_S132_L005_R1_001.fastq.gz
0062c59dc8fcda8fcfa452ebb717419c  /owl_web/nightingales/P_generosa/JD002B_S132_L005_R2_001.fastq.gz
91ae2b454af8343fe79f5db506e71438  /owl_web/nightingales/P_generosa/JD002C_S133_L005_R1_001.fastq.gz
c428291a87958081fdc647e9d121506d  /owl_web/nightingales/P_generosa/JD002C_S133_L005_R2_001.fastq.gz
27a4bed7f18e5f71372f4676e0369a6e  /owl_web/nightingales/P_generosa/JD002D_S134_L005_R1_001.fastq.gz
84f0710eee4765e9aba0db009d004244  /owl_web/nightingales/P_generosa/JD002D_S134_L005_R2_001.fastq.gz
0d4e953296924154d616c1143f9c4ad8  /owl_web/nightingales/P_generosa/JD002E_S135_L005_R1_001.fastq.gz
896b03ed79dbc33113aaf646ce94b65a  /owl_web/nightingales/P_generosa/JD002E_S135_L005_R2_001.fastq.gz


Yeesh, screwed it up again (but fixed the spacing!). Here we go with another shot. The awk code is changed from earlier in that the print statement is modified to print the first column (```$1```), followed by two spaces (that's what's contained in the double quotes), and then print the second column (```$2```).

In [32]:
%%bash
awk '{gsub(/\/.*\//,"",$2); print $1 "  " $2}' < temp_checksums.md5.bak > temp_checksums.md5

In [33]:
%%bash
cat temp_checksums.md5

c6d2bab7dabb6043a8565482b7b03cda  JD002A_S131_L005_R1_001.fastq.gz
77d17d6425e818a798e28ff3dd7f34f0  JD002A_S131_L005_R2_001.fastq.gz
51346326dfa475706b1c219dd86dc4f2  JD002B_S132_L005_R1_001.fastq.gz
0062c59dc8fcda8fcfa452ebb717419c  JD002B_S132_L005_R2_001.fastq.gz
91ae2b454af8343fe79f5db506e71438  JD002C_S133_L005_R1_001.fastq.gz
c428291a87958081fdc647e9d121506d  JD002C_S133_L005_R2_001.fastq.gz
27a4bed7f18e5f71372f4676e0369a6e  JD002D_S134_L005_R1_001.fastq.gz
84f0710eee4765e9aba0db009d004244  JD002D_S134_L005_R2_001.fastq.gz
0d4e953296924154d616c1143f9c4ad8  JD002E_S135_L005_R1_001.fastq.gz
896b03ed79dbc33113aaf646ce94b65a  JD002E_S135_L005_R2_001.fastq.gz
ae11c97c5c877787088e236e8c158346  JD002F_S136_L005_R1_001.fastq.gz
d8968dc209461435a66af6382f049a19  JD002F_S136_L005_R2_001.fastq.gz
83b5e361d8d3c4cff1d464d0428171d1  JD002G_S137_L005_R1_001.fastq.gz
168c2dbf1a7585d2a2a1b13a56e2f4e6  JD002G_S137_L005_R2_001.fastq.gz
1403cafaa172d2b009404e98ef6503ae  JD002H_S138_L005_R1_001.fast

In [34]:
%%bash
diff demultiplexed_checksums_cat temp_checksums.md5
echo $?

0


Boom! Got it! Now, I'll append the facility checksums to the checksum files in the directories on Owl.

### Append checksums to existing checksum files in each directory

In [35]:
%%bash
cat demultiplexed_checksums >> /owl_web/nightingales/P_generosa/checksums.md5
cat demultiplexed_checksums >> /owl_web/nightingales/Porites_spp/checksums.md5
cat demultiplexed_checksums >> /owl_web/nightingales/A_elegantissima/checksums.md5

In [36]:
%%bash
cd
rm -rf /data/20170227_jay_data_tmp/gslserver.qb3.berkeley.edu/
ls -lh /data/20170227_jay_data_tmp

total 0


### Download "undetermined" FASTQ files left from demultiplexing

In [1]:
cd /data/20170227_jay_data_tmp

/data/20170227_jay_data_tmp


In [2]:
%%bash
time WGETRC=/data/wgetrc_berk_seq wget -r -np -nc -q --accept "Undetermined*" --reject "*S0_L00[1234678]"ftp://gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/

wget: missing URL
Usage: wget [OPTION]... [URL]...

Try `wget --help' for more options.

real	0m0.026s
user	0m0.010s
sys	0m0.000s


Typo! Need space after reject list (between quotation and URL)

The wget command below adds an accept list and reject list to download only Jay's sequencing files (his were in Lane 5; L005) and the corresponding MD5 checksum file, named "Undetermined_checksums".

In [3]:
%%bash
time WGETRC=/data/wgetrc_berk_seq wget -r -np -nc -q --accept "Undetermined*" --reject "*S0_L00[1234678]" ftp://gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/

Process is interrupted.


Turns out, the accept/reject lists weren't working - all of the "Undetermined" files were being downloaded.
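
A possible explanation, offered as an assumption rather than something verified against the server: when an accept/reject entry contains wildcard characters, ```wget``` treats it as a pattern for the whole file name, and the reject pattern used above has no trailing ```*```. The sketch below approximates that matching with bash ```case``` patterns (not ```wget``` itself) on a few of the file names seen in this notebook.

```shell
files='Undetermined_S0_L001_R1_001.fastq.gz
Undetermined_S0_L003_R2_001.fastq.gz
Undetermined_S0_L005_R1_001.fastq.gz
Undetermined_checksums'

# Count how many names in $files are matched in full by the glob pattern in $1.
count_matches() {
    n=0
    for f in $files; do
        case "$f" in
            $1) n=$((n + 1)) ;;
        esac
    done
    echo "$n"
}

without_star="$(count_matches '*S0_L00[1234678]')"
with_star="$(count_matches '*S0_L00[1234678]*')"
lane5="$(count_matches 'Undetermined_S0_L005*')"

echo "reject pattern as typed: $without_star matches"
echo "reject pattern with trailing *: $with_star matches"
echo "accept pattern for lane 5 only: $lane5 matches"
```

Under that assumption, the pattern as typed matches none of the names, which would be consistent with every "Undetermined" file being downloaded.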

In [4]:
ls -lh

total 0
drwxr-xr-x 1 srlab staff 102 Mar  1 21:35 gslserver.qb3.berkeley.edu/


In [5]:
ls -lh gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/

total 9.1G
-rw-r--r-- 1 srlab staff 528M Feb 21 21:32 Undetermined_S0_L001_R1_001.fastq.gz
-rw-r--r-- 1 srlab staff 745M Feb 21 21:32 Undetermined_S0_L001_R2_001.fastq.gz
-rw-r--r-- 1 srlab staff 2.0G Feb 21 22:00 Undetermined_S0_L002_R1_001.fastq.gz
-rw-r--r-- 1 srlab staff 2.4G Feb 21 22:00 Undetermined_S0_L002_R2_001.fastq.gz
-rw-r--r-- 1 srlab staff 1.9G Feb 21 22:27 Undetermined_S0_L003_R1_001.fastq.gz
-rw-r--r-- 1 srlab staff 1.7G Mar  1 21:53 Undetermined_S0_L003_R2_001.fastq.gz


In [6]:
rm *.gz gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/

rm: cannot remove '*.gz': No such file or directory
rm: cannot remove 'gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/': Is a directory


In [7]:
%%bash
for i in gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/*.gz
    do
    rm "$i"
    done

In [8]:
ls -lh gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/

total 0


In [9]:
%%bash
time WGETRC=/data/wgetrc_berk_seq wget -r -np -nc -q --accept "Undetermined_S0_L005*"  ftp://gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/


real	11m7.430s
user	0m4.890s
sys	4m35.840s


In [10]:
ls -lh

total 0
drwxr-xr-x 1 srlab staff 102 Mar  1 21:35 gslserver.qb3.berkeley.edu/


In [11]:
%%bash
time WGETRC=/data/wgetrc_berk_seq wget -r -np -nc -q --accept "Undetermined_checksums"  ftp://gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/


real	0m2.741s
user	0m0.020s
sys	0m0.040s


In [12]:
ls -lh

total 0
drwxr-xr-x 1 srlab staff 102 Mar  1 21:35 gslserver.qb3.berkeley.edu/


### Generate new MD5 checksums

In [13]:
%%bash
time for i in *.gz
    do
    md5sum "$i" >> checksums.md5
    done

md5sum: *.gz: No such file or directory

real	0m0.016s
user	0m0.000s
sys	0m0.000s


In [14]:
%%bash
diff Undetermined_checksums checksums.md5
echo $?

2


diff: Undetermined_checksums: No such file or directory


Duh! I ran all of this from the wrong directory. Here we go again...

In [15]:
cd gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/

/data/20170227_jay_data_tmp/gslserver.qb3.berkeley.edu/170217_100PE_HS4KA


In [16]:
%%bash
ls -lh

total 5.2G
drwxr-xr-x 1 srlab staff   68 Mar  1 22:07 Alfaro
drwxr-xr-x 1 srlab staff   68 Mar  1 22:07 Chang
drwxr-xr-x 1 srlab staff   68 Mar  1 22:07 Coates
drwxr-xr-x 1 srlab staff   68 Mar  1 22:07 Doudna
drwxr-xr-x 1 srlab staff   68 Mar  1 22:07 Johnson
drwxr-xr-x 1 srlab staff   68 Mar  1 22:07 Pachter
drwxr-xr-x 1 srlab staff   68 Mar  1 22:07 Roberts
-rw-r--r-- 1 srlab staff 2.2G Feb 21 23:13 Undetermined_S0_L005_R1_001.fastq.gz
-rw-r--r-- 1 srlab staff 3.1G Feb 21 23:13 Undetermined_S0_L005_R2_001.fastq.gz
-rw-r--r-- 1 srlab staff  142 Feb 28 17:18 Undetermined_checksums
drwxr-xr-x 1 srlab staff   68 Mar  1 22:07 Wayne


In [17]:
%%bash
time for i in *.gz
    do
    md5sum "$i" >> checksums.md5
    done


real	1m48.294s
user	0m15.830s
sys	0m33.760s


In [18]:
%%bash
diff Undetermined_checksums checksums.md5
echo $?

0


In [19]:
%%bash
cat Undetermined_checksums

484082c497c7a52fa225cb0983c709a9  Undetermined_S0_L005_R1_001.fastq.gz
9718d259172f2c05ef97eb0d439c31da  Undetermined_S0_L005_R2_001.fastq.gz


### Copy files to Owl

In [20]:
%%bash
time for file in *.gz
    do
    cp --no-clobber "$file" /owl_web/nightingales/P_generosa/
    cp --no-clobber "$file" /owl_web/nightingales/Porites_spp/
    cp --no-clobber "$file" /owl_web/nightingales/A_elegantissima/
    done


real	20m26.143s
user	0m0.040s
sys	3m3.080s
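
The three `cp` lines could also be written as an inner loop over the destination directories, which is easier to extend if another species directory gets added. A minimal sketch, assuming made-up `dest_*` directories and `demo_*` files in a scratch directory rather than the real Owl paths:

```shell
# Work in a scratch directory so nothing real is touched.
tmpdir=$(mktemp -d)
cd "$tmpdir"

# Hypothetical destination directories and stand-in files.
mkdir -p dest_one dest_two dest_three
printf 'read one\n' > demo_R1.fastq.gz
printf 'read two\n' > demo_R2.fastq.gz

# Copy each file to each destination, skipping any file that already exists there.
for file in *.gz
    do
    for dest in dest_one dest_two dest_three
        do
        cp --no-clobber "$file" "$dest"/
        done
    done

# Count the copies that landed in the destination directories.
copied_count=$(find dest_one dest_two dest_three -name '*.gz' | wc -l)
echo "$copied_count"
```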


### Generate MD5 checksums for files copied to Owl

In [21]:
%%bash
time for i in /owl_web/nightingales/P_generosa/Undetermined_S0_L005_R*.gz
    do
    md5sum "$i" >> temp_checksums.md5
    done


real	5m2.141s
user	0m7.910s
sys	0m37.020s


In [22]:
%%bash
time for i in /owl_web/nightingales/Porites_spp/Undetermined_S0_L005_R*.gz
    do
    md5sum "$i" >> temp_checksums.md5
    done


real	5m42.952s
user	0m6.680s
sys	0m39.320s


In [23]:
%%bash
time for i in /owl_web/nightingales/A_elegantissima/Undetermined_S0_L005_R*.gz
    do
    md5sum "$i" >> temp_checksums.md5
    done


real	5m57.994s
user	0m6.940s
sys	0m38.520s


In [24]:
%%bash
cat temp_checksums.md5

484082c497c7a52fa225cb0983c709a9  /owl_web/nightingales/P_generosa/Undetermined_S0_L005_R1_001.fastq.gz
9718d259172f2c05ef97eb0d439c31da  /owl_web/nightingales/P_generosa/Undetermined_S0_L005_R2_001.fastq.gz
484082c497c7a52fa225cb0983c709a9  /owl_web/nightingales/Porites_spp/Undetermined_S0_L005_R1_001.fastq.gz
9718d259172f2c05ef97eb0d439c31da  /owl_web/nightingales/Porites_spp/Undetermined_S0_L005_R2_001.fastq.gz
484082c497c7a52fa225cb0983c709a9  /owl_web/nightingales/A_elegantissima/Undetermined_S0_L005_R1_001.fastq.gz
9718d259172f2c05ef97eb0d439c31da  /owl_web/nightingales/A_elegantissima/Undetermined_S0_L005_R2_001.fastq.gz


In [25]:
%%bash
cat Undetermined_checksums

484082c497c7a52fa225cb0983c709a9  Undetermined_S0_L005_R1_001.fastq.gz
9718d259172f2c05ef97eb0d439c31da  Undetermined_S0_L005_R2_001.fastq.gz


Looks like everything matches.
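
Rather than comparing the hashes by eye, the directory part of each path can be stripped so the two checksum lists can be diffed. This is a rough sketch on toy files; the `demo_*` and `copy_*` names are hypothetical stand-ins for the real files and Owl directories.

```shell
# Work in a scratch directory so nothing real is touched.
tmpdir=$(mktemp -d)
cd "$tmpdir"

# Stand-in source files and two stand-in copy locations.
mkdir -p copy_a copy_b
printf 'read one\n' > demo_R1.fastq.gz
printf 'read two\n' > demo_R2.fastq.gz
cp demo_R1.fastq.gz demo_R2.fastq.gz copy_a/
cp demo_R1.fastq.gz demo_R2.fastq.gz copy_b/

# Reference checksums (bare file names) and checksums of the copies (with paths).
md5sum demo_R1.fastq.gz demo_R2.fastq.gz > demo_checksums
md5sum copy_a/*.gz copy_b/*.gz > demo_temp_checksums.md5

# Reduce every line to "hash basename", then keep unique lines only.
awk '{ n = split($2, parts, "/"); print $1, parts[n] }' demo_temp_checksums.md5 | sort -u > copies.txt
awk '{ print $1, $2 }' demo_checksums | sort -u > reference.txt

# An exit status of 0 means every hash seen among the copies matches the reference.
diff reference.txt copies.txt
compare_status=$?
echo "$compare_status"
```

Note that `sort -u` collapses duplicates, so this only shows that no copy has an unexpected hash. It would not notice a file that is missing from one of the directories, and the `awk` split assumes file names without spaces.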

### Append facility checksums to checksum files on Owl

In [26]:
%%bash
cat Undetermined_checksums >> /owl_web/nightingales/P_generosa/checksums.md5
cat Undetermined_checksums >> /owl_web/nightingales/Porites_spp/checksums.md5
cat Undetermined_checksums >> /owl_web/nightingales/A_elegantissima/checksums.md5

### Download barcode HTML file

I'm not entirely sure what this is, but it might be useful to have.

In [27]:
%%bash
time WGETRC=/data/wgetrc_berk_seq wget -r -np -nc -q --accept "laneBarcode.html"  ftp://gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/


real	0m2.784s
user	0m0.000s
sys	0m0.070s


In [28]:
ls -lh

total 5.2G
drwxr-xr-x 1 srlab staff   68 Mar  1 22:07 Alfaro/
drwxr-xr-x 1 srlab staff   68 Mar  1 22:07 Chang/
drwxr-xr-x 1 srlab staff   68 Mar  1 22:07 Coates/
drwxr-xr-x 1 srlab staff   68 Mar  1 22:07 Doudna/
drwxr-xr-x 1 srlab staff   68 Mar  1 22:07 Johnson/
drwxr-xr-x 1 srlab staff   68 Mar  1 22:07 Pachter/
drwxr-xr-x 1 srlab staff   68 Mar  1 22:07 Roberts/
-rw-r--r-- 1 srlab staff 2.2G Feb 21 23:13 Undetermined_S0_L005_R1_001.fastq.gz
-rw-r--r-- 1 srlab staff 3.1G Feb 21 23:13 Undetermined_S0_L005_R2_001.fastq.gz
-rw-r--r-- 1 srlab staff  142 Feb 28 17:18 Undetermined_checksums
drwxr-xr-x 1 srlab staff   68 Mar  1 22:07 Wayne/
-rw-r--r-- 1 srlab staff  142 Mar  1 22:18 checksums.md5
drwxr-xr-x 1 srlab staff  102 Mar  1 23:13 gslserver.qb3.berkeley.edu/
-rw-r--r-- 1 srlab staff  636 Mar  1 22:59 temp_checksums.md5


In [29]:
%%bash
head gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/laneBarcode.html

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html xmlns:bcl2fastq="http://www.illumina.com/bcl2fastq">
<link rel="stylesheet" href="../../../../Report.css" type="text/css">
<body>
<table width="100%"><tr>
<td><p><p>HG3WNBBXX /
        [all projects] /
        [all samples] /
        [all barcodes]</p></p></td>
<td><p align="right"><a href="../../../../HG3WNBBXX/all/all/all/lane.html">hide barcodes</a></p></td>


In [30]:
%%bash
tail gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/laneBarcode.html

<td>GTCCCG</td>
<td>187,142</td>
<td>CGGACGAG</td>
<td>290,611</td>
<td>CGAGGCTG+CACAAAAA</td>
</tr>
</table>
<p></p>
</body>
</html>


I'm going to rename the file so that it's more clearly associated with these sequencing files, and then copy it to each of the directories on Owl.

In [31]:
%%bash
mv gslserver.qb3.berkeley.edu/170217_100PE_HS4KA/laneBarcode.html JD_L005_laneBarcode.html

In [32]:
ls -l

total 5426292
drwxr-xr-x 1 srlab staff         68 Mar  1 22:07 Alfaro/
drwxr-xr-x 1 srlab staff         68 Mar  1 22:07 Chang/
drwxr-xr-x 1 srlab staff         68 Mar  1 22:07 Coates/
drwxr-xr-x 1 srlab staff         68 Mar  1 22:07 Doudna/
-rw-r--r-- 1 srlab staff      43031 Feb 22 00:29 JD_L005_laneBarcode.html
drwxr-xr-x 1 srlab staff         68 Mar  1 22:07 Johnson/
drwxr-xr-x 1 srlab staff         68 Mar  1 22:07 Pachter/
drwxr-xr-x 1 srlab staff         68 Mar  1 22:07 Roberts/
-rw-r--r-- 1 srlab staff 2281224946 Feb 21 23:13 Undetermined_S0_L005_R1_001.fastq.gz
-rw-r--r-- 1 srlab staff 3275238267 Feb 21 23:13 Undetermined_S0_L005_R2_001.fastq.gz
-rw-r--r-- 1 srlab staff        142 Feb 28 17:18 Undetermined_checksums
drwxr-xr-x 1 srlab staff         68 Mar  1 22:07 Wayne/
-rw-r--r-- 1 srlab staff        142 Mar  1 22:18 checksums.md5
drwxr-xr-x 1 srlab staff        10

In [33]:
%%bash
cp JD_L005_laneBarcode.html /owl_web/nightingales/P_generosa/JD_L005_laneBarcode.html
cp JD_L005_laneBarcode.html /owl_web/nightingales/Porites_spp/JD_L005_laneBarcode.html
cp JD_L005_laneBarcode.html /owl_web/nightingales/A_elegantissima/JD_L005_laneBarcode.html

In [34]:
%%bash
ls /owl_web/nightingales/P_generosa/JD_L005_laneBarcode.html
ls /owl_web/nightingales/Porites_spp/JD_L005_laneBarcode.html
ls /owl_web/nightingales/A_elegantissima/JD_L005_laneBarcode.html

/owl_web/nightingales/P_generosa/JD_L005_laneBarcode.html
/owl_web/nightingales/Porites_spp/JD_L005_laneBarcode.html
/owl_web/nightingales/A_elegantissima/JD_L005_laneBarcode.html
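
`ls` only confirms that the files exist. A byte-for-byte comparison with `cmp -s` would also confirm the copies are intact. A minimal sketch on a toy file, with hypothetical `dest_*` directories standing in for the Owl directories:

```shell
# Work in a scratch directory so nothing real is touched.
tmpdir=$(mktemp -d)
cd "$tmpdir"

# Stand-in HTML file and two stand-in destination directories.
mkdir -p dest_one dest_two
printf '<html></html>\n' > demo_laneBarcode.html
cp demo_laneBarcode.html dest_one/
cp demo_laneBarcode.html dest_two/

# cmp -s prints nothing and exits 0 when the two files are identical.
mismatches=0
for dest in dest_one dest_two
    do
    if cmp -s demo_laneBarcode.html "$dest"/demo_laneBarcode.html
    then
        echo "$dest: identical"
    else
        echo "$dest: DIFFERS"
        mismatches=$((mismatches + 1))
    fi
    done
echo "$mismatches"
```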
