Aidan Coyle, afcoyle@uw.edu
Roberts Lab, UW-SAFS
2021-02-02

After an initial analysis, realized that my libraries were incorrect. I had pooled day 0 and day 2 libraries together by temperature treatment, but day 0 hemolymph was extracted prior to any temperature treatments. Here are the changes between this analysis and the previous one:

1. Both individual and pooled libraries are examined, rather than solely individual libraries
2. A balanced sample design is utilized, where an equal number of libraries from each treatment will be examined
3. Day 0 and Day 2 libraries will not be pooled together by temperature treament, as discussed above
4. Day 2 and Day 17 libraries will be pooled by temperature treatment
5. The day comparison (Day 0 vs 17) will be dropped (for now at least)
6. When downloading transcripts, we will check IDs with checksums (failed to last time, meaning we must rebuild kallisto indices
7. As much as possible will be done remotely on the lab's Roadrunner computer, rather than on a local machine. This means that commands will largely be copied and pasted from the command line, rather than ran directly in this Jupyter notebook.




Library IDs are as follows. Asterisks label Day 0 crabs that were part of either the elevated or lowered treatment groups - since at Day 0, they had not yet been exposed to changes away from ambient temperature, they are included as part of the ambient treatment group:

| Crab ID    | Library ID | Day| Temperature |
|-------------|----------------|-------------|----------|
| G        | 272             |   2          |   Elevated       |
| H        | 294             |   2          |   Elevated       |
| I        | 280             |   2          |   Elevated       |
|pooled    | 380825          |   2          |   Elevated       |
| G*       | 173*            |   0*         |   Ambient*       |
| H*       | 72*             |   0*         |   Ambient*       |
| I*       | 127*            |   0*         |   Ambient*       |
| A        | 178             |   0          |   Ambient        |
| A        | 359             |   2          |   Ambient        |
| A        | 463             |   17         |   Ambient        |
| B        | 118             |   0          |   Ambient        |
| B        | 349             |   2          |   Ambient        |
| B        | 481             |   17         |   Ambient        |
| C        | 132             |   0          |   Ambient        |
| C        | 334             |   2          |   Ambient        |
| C        | 485             |   17         |   Ambient        |
| E*       | 151*            |   0*         |   Ambient*       |
| pooled   | 380820          |   2          |   Ambient        |
| E        | 254             |   2          |   Lowered        |
| E        | 445             |   17         |   Lowered        |
| pooled   | 380823          |   2          |   Lowered        |


Trimmed individual libraries were downloaded from Gannet, available [here](https://gannet.fish.washington.edu/Atumefaciens/20200318_cbai_RNAseq_fastp_trimming/), at 22:00 PST on 2021-02-02 

Trimmed pooled libraries were downloaded from Gannet, available [here](https://gannet.fish.washington.edu/Atumefaciens/20200414_cbai_RNAseq_fastp_trimming/), at 24:00 PST on 2021-02-02


Transcriptomes used are cbai_transcriptome_v3.0.fasta and cbai_transcriptome_v2.0.fasta, available [here](https://owl.fish.washington.edu/halfshell/genomic-databank/). Both transcriptomes have not been filtered to exclude hematodinium sequences. Transcriptome checksums are available [here](https://github.com/RobertsLab/resources/wiki/Genomic-Resources)

Transcriptomes were downloaded at 01:00 PST on 2021-02-03

Plan to create indices using both transcriptome v3.0 and v2.0

## Download individual libraries

In [None]:
# Download all files in directory
!wget --no-check-certificate --no-parent --recursive --reject "index.html" https://gannet.fish.washington.edu/Atumefaciens/20200318_cbai_RNAseq_f
astp_trimming/

In [None]:
# Remove all files that aren't .fq.gz or .md5
!rm *.html
!rm *.zip
!rm index.html*
!rm *.json
!rm *.sh
!rm *.log
!rm *.out
!rm *.txt
!rm -r multiqc*

In [None]:
# Move files from data/gannet.fish.washington.edu/Atumefaciens/20200318_cbai_RNAseq_fastp_trimming into data/libraries
!cd ..
!mv 20200318_cbai_RNAseq_fastp_trimming/* ../../libraries
# Delete old directory
!cd ../..
!rm -r gannet.fish.washington.edu

In [None]:
# remove all uninfected libraries, as they won't be part of analysis
!rm 113_R*
!rm 221_R*
!rm 222_R*
!rm 425_R*
!rm 427_R*
!rm 73_R*

In [None]:
# Rename checksum file to clarify it is specific to individual libraries
!mv trimmed_fastq_checksums.md5 trimmed_indivfastq_checksums.md5


In [None]:
# Check that files downloaded properly with checksums
!md5sum -c trimmed_indivfastq_checksums.md5

## Download pooled libraries


In [None]:
# Move up a directory to keep download simpler
!cd ..

In [None]:
# Download all files in directory
!wget --no-check-certificate --no-parent --recursive --reject "index.html" https://gannet.fish.washington.edu/Atumefaciens/20200414_cbai_RNAseq_fastp_trimming/

In [None]:
# Move into our new file structure
cd gannet.fish.washington.edu/Atumefaciens/20200414_cbai_RNAseq_fastp_trimming

In [None]:
# Remove all files that aren't .fq.gz or .md5
!rm *.html
!rm *.zip
!rm index.html*
!rm *.json
!rm *.log
!rm *.out
!rm *.txt
!rm -r multiqc*

Interestingly, this library has 2 checksum files - 20200413_cbai_checkums.md5 (not typo - it is checkums) and trimmed_fastq_checksums.md5
Ran diff, and it appears 20200... is a checksum file for the untrimmed fastq files, and can thus be safely removed.
We will also rename the trimmed_fastq_checksums.md5 file to clarify it is specific to pooled libraries

In [None]:
!rm 20200413_cbai_checkums.md5
!mv trimmed_fastq_checksums.md5 trimmed_pooledfastq_checksums.md5

In [None]:
# Remove all uninfected libraries, as they won't be part of analysis
!rm 380820_*
!rm 380822_*
!rm 380824_*

In [None]:
# Check that files downloaded properly with checksums
!md5sum -c trimmed_pooledfastq_checksums.md5

In [None]:
# Move files from data/gannet.fish.washington.edu/Atumefaciens/20200414_cbai_RNAseq_fastp_trimming into data/libraries
!cd ..
!mv 20200414_cbai_RNAseq_fastp_trimming/* ../../libraries
# Delete old directory
!cd ../..
!rm -r gannet.fish.washington.edu
cd libraries

In [None]:
# Merge libraries by lanes, removing un-merged files
!cat 380821_S2_L001_R1_001.fastp-trim.202004143925.fq.gz 380821_S2_L002_R1_001.fastp-trim.202004144145.fq.gz > 380821_S2_R1_001.fastp-trim.fq.gz
!cat 380821_S2_L001_R2_001.fastp-trim.202004143925.fq.gz 380821_S2_L002_R2_001.fastp-trim.202004144145.fq.gz > 380821_S2_R2_001.fastp-trim.fq.gz
!rm 380821_S2_L00*
!cat 380823_S4_L001_R1_001.fastp-trim.202004144852.fq.gz 380823_S4_L002_R1_001.fastp-trim.202004145106.fq.gz > 380823_S4_R1_001.fastp-trim.fq.gz
!cat 380823_S4_L001_R2_001.fastp-trim.202004144852.fq.gz 380823_S4_L002_R2_001.fastp-trim.202004145106.fq.gz > 380823_S4_R2_001.fastp-trim.fq.gz
!rm 380823_S4_L00*
!cat 380825_S6_L001_R1_001.fastp-trim.202004145835.fq.gz 380825_S6_L002_R1_001.fastp-trim.202004140109.fq.gz > 380825_S6_R1_001.fastp-trim.fq.gz
!cat 380825_S6_L001_R2_001.fastp-trim.202004145835.fq.gz 380825_S6_L002_R2_001.fastp-trim.202004140109.fq.gz > 380825_S6_R2_001.fastp-trim.fq.gz
!rm 380825_S6_L00*

## Download transcriptomes
Again, downloading transcriptome v2.0 and v3.0. Both are unfiltered by taxonomic group and include genes from both C. bairdi and Hematodinium.

In [None]:
!cd transcriptomes
# Download transcriptome 2.0
!curl -O -k https://owl.fish.washington.edu/halfshell/genomic-databank/cbai_transcriptome_v2.0.fasta
# Download transcriptome 3.0
!curl -O -k https://owl.fish.washington.edu/halfshell/genomic-databank/cbai_transcriptome_v3.0.fasta

## Create an index for kallisto

In [None]:
!cd ../../output/kallisto_indices
# Index for transcriptome 2.0
!kallisto index -i kallisto_bairdihemat_index_v2.0.idx ../../data/transcriptomes/cbai_transcriptome_v2.0.fasta

## Run kallisto quantification and build matrix for each comparison

### First, comparing Day 0 ambient vs. Day 17 ambient

#### Jupyter has problems running kallisto
#### Returns "Error: Could not create directory"
#### As a result, ran the following directly in command line

In [2]:
# Quantify ID 118
!kallisto quant \
-i /mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_index_v3.0.idx \
-o mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_indivlibs_transcriptome_v3.0/id118 \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/118_R1_D0_amb.fastp-trim.fq \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/118_R2_D0_amb.fastp-trim.fq


[quant] fragment length distribution will be estimated from the data
Error: could not create directory mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_indivlibs_transcriptome_v3.0/id118

Usage: kallisto quant [arguments] FASTQ-files

Required arguments:
-i, --index=STRING            Filename for the kallisto index to be used for
                              quantification
-o, --output-dir=STRING       Directory to write output to

Optional arguments:
    --bias                    Perform sequence based bias correction
-b, --bootstrap-samples=INT   Number of bootstrap samples (default: 0)
    --seed=INT                Seed for the bootstrap sampling (default: 42)
    --plaintext               Output plaintext instead of HDF5
    --fusion                  Search for fusions for Pizzly
    --single                  Quantify single-end reads
    --single-overhang         Include reads where unobserved rest of fragment is
                              predic

In [1]:
# Quantify ID 132
!kallisto quant \
-i /mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_index_v3.0.idx \
-o mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_indivlibs_transcriptome_v3.0/id132 \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/132_R1_D0_amb.fastp-trim.fq \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/132_R2_D0_amb.fastp-trim.fq


[quant] fragment length distribution will be estimated from the data
Error: could not create directory mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_indivlibs_transcriptome_v3.0/id132

Usage: kallisto quant [arguments] FASTQ-files

Required arguments:
-i, --index=STRING            Filename for the kallisto index to be used for
                              quantification
-o, --output-dir=STRING       Directory to write output to

Optional arguments:
    --bias                    Perform sequence based bias correction
-b, --bootstrap-samples=INT   Number of bootstrap samples (default: 0)
    --seed=INT                Seed for the bootstrap sampling (default: 42)
    --plaintext               Output plaintext instead of HDF5
    --fusion                  Search for fusions for Pizzly
    --single                  Quantify single-end reads
    --single-overhang         Include reads where unobserved rest of fragment is
                              predic

In [None]:
# Quantify ID 178
!kallisto quant \
-i /mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_index_v3.0.idx \
-o mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_indivlibs_transcriptome_v3.0/id178 \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/178_R1_D0_amb.fastp-trim.fq \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/178_R2_D0_amb.fastp-trim.fq

In [None]:
# Quantify ID 463
!kallisto quant \
-i /mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_index_v3.0.idx \
-o mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_indivlibs_transcriptome_v3.0/id463 \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/463_R1_D17_amb.fastp-trim.fq \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/463_R2_D17_amb.fastp-trim.fq

In [None]:
# Quantify ID 481
!kallisto quant \
-i /mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_index_v3.0.idx \
-o mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_indivlibs_transcriptome_v3.0/id481 \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/481_R1_D17_amb.fastp-trim.fq \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/481_R2_D17_amb.fastp-trim.fq

In [None]:
# Quantify ID 485
!kallisto quant \
-i /mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_index_v3.0.idx \
-o mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_indivlibs_transcriptome_v3.0/id485 \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/485_R1_D17_amb.fastp-trim.fq \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/485_R2_D17_amb.fastp-trim.fq

#### End of Kallisto quantification
#### Begin building transcript expression matrix

In [11]:
!cd /mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_pooledlibs_transcriptome_v3.0

In [12]:
!/mnt/c/Users/acoyl/Documents/GradSchool/RobertsLab/Tools/Trinity/trinityrnaseq-v2.11.0/util/abundance_estimates_to_matrix.pl \
--est_method kallisto \
--gene_trans_map 'none' \
--out_prefix kallisto \
--name_sample_by_basedir \
id118/abundance.tsv \
id132/abundance.tsv \
id178/abundance.tsv \
id463/abundance.tsv \
id481/abundance.tsv \
id485/abundance.tsv 

-reading file: library02/abundance.tsv
-reading file: library04/abundance.tsv
-reading file: library06/abundance.tsv
-reading file: library08/abundance.tsv
-reading file: library10/abundance.tsv


* Outputting combined matrix.

/mnt/c/Users/acoyl/Documents/GradSchool/RobertsLab/Tools/Trinity/trinityrnaseq-v2.11.0/util/support_scripts/run_TMM_scale_matrix.pl --matrix kallisto.isoform.TPM.not_cross_norm > kallisto.isoform.TMM.EXPR.matrixCMD: R --no-save --no-restore --no-site-file --no-init-file -q < kallisto.isoform.TPM.not_cross_norm.runTMM.R 1>&2 
/mnt/c/Users/acoyl/Downloads/anaconda3/lib/R/bin/exec/R: error while loading shared libraries: libreadline.so.6: cannot open shared object file: No such file or directory
Error, cmd: R --no-save --no-restore --no-site-file --no-init-file -q < kallisto.isoform.TPM.not_cross_norm.runTMM.R 1>&2  died with ret (32512)  at /mnt/c/Users/acoyl/Documents/GradSchool/RobertsLab/Tools/Trinity/trinityrnaseq-v2.11.0/util/support_scripts/run_TMM_scale_mat

CalledProcessError: Command 'b"/mnt/c/Users/acoyl/Documents/GradSchool/RobertsLab/Tools/Trinity/trinityrnaseq-v2.11.0/util/abundance_estimates_to_matrix.pl \\\n--est_method kallisto \\\n--gene_trans_map 'none' \\\n--out_prefix kallisto \\\n--name_sample_by_basedir \\\nlibrary02/abundance.tsv \\\nlibrary04/abundance.tsv \\\nlibrary06/abundance.tsv \\\nlibrary08/abundance.tsv \\\nlibrary10/abundance.tsv\n"' returned non-zero exit status 25.

## Repeat from "run kallisto quantification and build matrix for each comparison".

### This time, examining ambient vs. low libraries from Day 0-2 (118, 132, 178, 334, 349, 359 vs. 181, 254). Since we already built a Kallisto index for 118/132/178, these will not be rebuilt

In [None]:
# Quantify ID 334
!kallisto quant \
-i /mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_index_v3.0.idx \
-o mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_indivlibs_transcriptome_v3.0/id334 \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/334_R1_D0_amb.fastp-trim.fq \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/334_R2_D0_amb.fastp-trim.fq

In [None]:
# Quantify ID 349
!kallisto quant \
-i /mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_index_v3.0.idx \
-o mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_indivlibs_transcriptome_v3.0/id349 \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/349_R1_D17_amb.fastp-trim.fq \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/349_R2_D17_amb.fastp-trim.fq

In [None]:
# Quantify ID 359
!kallisto quant \
-i /mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_index_v3.0.idx \
-o mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_indivlibs_transcriptome_v3.0/id359 \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/359_R1_D17_amb.fastp-trim.fq \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/359_R2_D17_amb.fastp-trim.fq

In [None]:
# Quantify ID 181
!kallisto quant \
-i /mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_index_v3.0.idx \
-o mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_indivlibs_transcriptome_v3.0/id181 \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/181_R1_D17_amb.fastp-trim.fq \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/181_R2_D17_amb.fastp-trim.fq

In [None]:
# Quantify ID 254
!kallisto quant \
-i /mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_index_v3.0.idx \
-o mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_indivlibs_transcriptome_v3.0/id254 \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/254_R1_D17_amb.fastp-trim.fq \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/254_R2_D17_amb.fastp-trim.fq

#### End of Kallisto quantification
#### Begin building transcript expression matrix

In [11]:
!cd /mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_pooledlibs_transcriptome_v3.0

In [12]:
!/mnt/c/Users/acoyl/Documents/GradSchool/RobertsLab/Tools/Trinity/trinityrnaseq-v2.11.0/util/abundance_estimates_to_matrix.pl \
--est_method kallisto \
--gene_trans_map 'none' \
--out_prefix kallisto \
--name_sample_by_basedir \
id118/abundance.tsv \
id132/abundance.tsv \
id178/abundance.tsv \
id334/abundance.tsv \
id349/abundance.tsv \
id359/abundance.tsv \
id181/abundance.tsv \
id254/abundance.tsv 

-reading file: library02/abundance.tsv
-reading file: library04/abundance.tsv
-reading file: library06/abundance.tsv
-reading file: library08/abundance.tsv
-reading file: library10/abundance.tsv


* Outputting combined matrix.

/mnt/c/Users/acoyl/Documents/GradSchool/RobertsLab/Tools/Trinity/trinityrnaseq-v2.11.0/util/support_scripts/run_TMM_scale_matrix.pl --matrix kallisto.isoform.TPM.not_cross_norm > kallisto.isoform.TMM.EXPR.matrixCMD: R --no-save --no-restore --no-site-file --no-init-file -q < kallisto.isoform.TPM.not_cross_norm.runTMM.R 1>&2 
/mnt/c/Users/acoyl/Downloads/anaconda3/lib/R/bin/exec/R: error while loading shared libraries: libreadline.so.6: cannot open shared object file: No such file or directory
Error, cmd: R --no-save --no-restore --no-site-file --no-init-file -q < kallisto.isoform.TPM.not_cross_norm.runTMM.R 1>&2  died with ret (32512)  at /mnt/c/Users/acoyl/Documents/GradSchool/RobertsLab/Tools/Trinity/trinityrnaseq-v2.11.0/util/support_scripts/run_TMM_scale_mat

CalledProcessError: Command 'b"/mnt/c/Users/acoyl/Documents/GradSchool/RobertsLab/Tools/Trinity/trinityrnaseq-v2.11.0/util/abundance_estimates_to_matrix.pl \\\n--est_method kallisto \\\n--gene_trans_map 'none' \\\n--out_prefix kallisto \\\n--name_sample_by_basedir \\\nlibrary02/abundance.tsv \\\nlibrary04/abundance.tsv \\\nlibrary06/abundance.tsv \\\nlibrary08/abundance.tsv \\\nlibrary10/abundance.tsv\n"' returned non-zero exit status 25.

## Repeat from "run kallisto quantification and build matrix for each comparison".

### This time, examining elevated vs ambient libraries from Day 0-2 (127, 173, 072, 272, 280, 294 vs. 118, 132, 178, 334, 349, 359). Since we already built a Kallisto index for all ambient libraries, these will not be rebuilt

In [None]:
# Quantify ID 127
!kallisto quant \
-i /mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_index_v3.0.idx \
-o mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_indivlibs_transcriptome_v3.0/id127 \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/127_R1_D0_amb.fastp-trim.fq \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/127_R2_D0_amb.fastp-trim.fq

In [None]:
# Quantify ID 173
!kallisto quant \
-i /mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_index_v3.0.idx \
-o mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_indivlibs_transcriptome_v3.0/id173 \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/173_R1_D17_amb.fastp-trim.fq \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/173_R2_D17_amb.fastp-trim.fq

In [None]:
# Quantify ID 072
!kallisto quant \
-i /mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_index_v3.0.idx \
-o mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_indivlibs_transcriptome_v3.0/id072 \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/072_R1_D17_amb.fastp-trim.fq \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/072_R2_D17_amb.fastp-trim.fq

In [None]:
# Quantify ID 272
!kallisto quant \
-i /mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_index_v3.0.idx \
-o mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_indivlibs_transcriptome_v3.0/id272 \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/272_R1_D17_amb.fastp-trim.fq \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/272_R2_D17_amb.fastp-trim.fq

In [None]:
# Quantify ID 280
!kallisto quant \
-i /mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_index_v3.0.idx \
-o mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_indivlibs_transcriptome_v3.0/id280 \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/280_R1_D17_amb.fastp-trim.fq \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/280_R2_D17_amb.fastp-trim.fq

In [None]:
# Quantify ID 294
!kallisto quant \
-i /mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_index_v3.0.idx \
-o mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_indivlibs_transcriptome_v3.0/id294 \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/294_R1_D17_amb.fastp-trim.fq \
/mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/data/indiv_libraries/294_R2_D17_amb.fastp-trim.fq

#### End of Kallisto quantification
#### Begin building transcript expression matrix

In [11]:
!cd /mnt/c/Users/acoyl/Documents/GitHub/hemat_bairdii_transcriptome/output/kallisto_pooledlibs_transcriptome_v3.0

In [12]:
!/mnt/c/Users/acoyl/Documents/GradSchool/RobertsLab/Tools/Trinity/trinityrnaseq-v2.11.0/util/abundance_estimates_to_matrix.pl \
--est_method kallisto \
--gene_trans_map 'none' \
--out_prefix kallisto \
--name_sample_by_basedir \
id127/abundance.tsv \
id173/abundance.tsv \
id072/abundance.tsv \
id272/abundance.tsv \
id280/abundance.tsv \
id294/abundance.tsv \
id118/abundance.tsv \
id132/abundance.tsv \
id178/abundance.tsv \
id334/abundance.tsv \
id349/abundance.tsv \
id359/abundance.tsv 

-reading file: library02/abundance.tsv
-reading file: library04/abundance.tsv
-reading file: library06/abundance.tsv
-reading file: library08/abundance.tsv
-reading file: library10/abundance.tsv


* Outputting combined matrix.

/mnt/c/Users/acoyl/Documents/GradSchool/RobertsLab/Tools/Trinity/trinityrnaseq-v2.11.0/util/support_scripts/run_TMM_scale_matrix.pl --matrix kallisto.isoform.TPM.not_cross_norm > kallisto.isoform.TMM.EXPR.matrixCMD: R --no-save --no-restore --no-site-file --no-init-file -q < kallisto.isoform.TPM.not_cross_norm.runTMM.R 1>&2 
/mnt/c/Users/acoyl/Downloads/anaconda3/lib/R/bin/exec/R: error while loading shared libraries: libreadline.so.6: cannot open shared object file: No such file or directory
Error, cmd: R --no-save --no-restore --no-site-file --no-init-file -q < kallisto.isoform.TPM.not_cross_norm.runTMM.R 1>&2  died with ret (32512)  at /mnt/c/Users/acoyl/Documents/GradSchool/RobertsLab/Tools/Trinity/trinityrnaseq-v2.11.0/util/support_scripts/run_TMM_scale_mat

CalledProcessError: Command 'b"/mnt/c/Users/acoyl/Documents/GradSchool/RobertsLab/Tools/Trinity/trinityrnaseq-v2.11.0/util/abundance_estimates_to_matrix.pl \\\n--est_method kallisto \\\n--gene_trans_map 'none' \\\n--out_prefix kallisto \\\n--name_sample_by_basedir \\\nlibrary02/abundance.tsv \\\nlibrary04/abundance.tsv \\\nlibrary06/abundance.tsv \\\nlibrary08/abundance.tsv \\\nlibrary10/abundance.tsv\n"' returned non-zero exit status 25.

## Repeat from "run kallisto quantification and build matrix for each comparison".

### This time, examining elevated vs low libraries from Day 0-2 (127, 173, 072, 272, 280, 294 vs. 181, 254). Since we already built a Kallisto index for all libraries, we can skip directly to building a matrix of counts

## Repeat from "run kallisto quantification and build matrix for each comparison".

### This time, examining elevated vs ambient libraries from Day 0-2 (127, 173, 072, 272, 280, 294 vs. 118, 132, 178, 334, 349, 359). Since we already built a Kallisto index for all ambient libraries, these will not be rebuilt

In [12]:
!/mnt/c/Users/acoyl/Documents/GradSchool/RobertsLab/Tools/Trinity/trinityrnaseq-v2.11.0/util/abundance_estimates_to_matrix.pl \
--est_method kallisto \
--gene_trans_map 'none' \
--out_prefix kallisto \
--name_sample_by_basedir \
id127/abundance.tsv \
id173/abundance.tsv \
id072/abundance.tsv \
id272/abundance.tsv \
id280/abundance.tsv \
id294/abundance.tsv \
id181/abundance.tsv \
id254/abundance.tsv 

-reading file: library02/abundance.tsv
-reading file: library04/abundance.tsv
-reading file: library06/abundance.tsv
-reading file: library08/abundance.tsv
-reading file: library10/abundance.tsv


* Outputting combined matrix.

/mnt/c/Users/acoyl/Documents/GradSchool/RobertsLab/Tools/Trinity/trinityrnaseq-v2.11.0/util/support_scripts/run_TMM_scale_matrix.pl --matrix kallisto.isoform.TPM.not_cross_norm > kallisto.isoform.TMM.EXPR.matrixCMD: R --no-save --no-restore --no-site-file --no-init-file -q < kallisto.isoform.TPM.not_cross_norm.runTMM.R 1>&2 
/mnt/c/Users/acoyl/Downloads/anaconda3/lib/R/bin/exec/R: error while loading shared libraries: libreadline.so.6: cannot open shared object file: No such file or directory
Error, cmd: R --no-save --no-restore --no-site-file --no-init-file -q < kallisto.isoform.TPM.not_cross_norm.runTMM.R 1>&2  died with ret (32512)  at /mnt/c/Users/acoyl/Documents/GradSchool/RobertsLab/Tools/Trinity/trinityrnaseq-v2.11.0/util/support_scripts/run_TMM_scale_mat

CalledProcessError: Command 'b"/mnt/c/Users/acoyl/Documents/GradSchool/RobertsLab/Tools/Trinity/trinityrnaseq-v2.11.0/util/abundance_estimates_to_matrix.pl \\\n--est_method kallisto \\\n--gene_trans_map 'none' \\\n--out_prefix kallisto \\\n--name_sample_by_basedir \\\nlibrary02/abundance.tsv \\\nlibrary04/abundance.tsv \\\nlibrary06/abundance.tsv \\\nlibrary08/abundance.tsv \\\nlibrary10/abundance.tsv\n"' returned non-zero exit status 25.

File complete. Move to the R file to begin differential gene expression analysis using DESeq2