Trinity: Error; cannot find path to bowtie2 #452

Closed
grace-ac opened this issue Oct 24, 2018 · 32 comments

@grace-ac grace-ac commented Oct 24, 2018

Here's what I have, based on what @kubu4 has done in his notebook.

Questions

  1. Does everything look ok?
  2. What do I do about the bowtie2 error?
[graceac9@mox1 srlab]$ pwd
/gscratch/srlab
[graceac9@mox1 srlab]$ #!/bin/bash
[graceac9@mox1 srlab]$ ## Job Name
[graceac9@mox1 srlab]$ #SBATCH --job-name=20181024_Cbairdi_trinity_01
[graceac9@mox1 srlab]$ ## Allocation Definition 
[graceac9@mox1 srlab]$ #SBATCH --account=srlab
[graceac9@mox1 srlab]$ #SBATCH --partition=srlab
[graceac9@mox1 srlab]$ ## Resources
[graceac9@mox1 srlab]$ ## Nodes 
[graceac9@mox1 srlab]$ #SBATCH --nodes=1   
[graceac9@mox1 srlab]$ ## Walltime (days-hours:minutes:seconds format)
[graceac9@mox1 srlab]$ #SBATCH --time=90:30:00
[graceac9@mox1 srlab]$ ## Memory per node
[graceac9@mox1 srlab]$ #SBATCH --mem=500G
[graceac9@mox1 srlab]$ ## Specify the working directory for this job
[graceac9@mox1 srlab]$ #SBATCH --workdir=/gscratch/srlab/graceac9/analyses/1024
[graceac9@mox1 srlab]$ 
[graceac9@mox1 srlab]$ /gscratch/srlab/programs/trinity/Trinity \
> --seqType fq \
> --max_memory 100G \
> --left /gscratch/srlab/graceac9/data/304428_S1_L001_R1_001.fastq,\
> /gscratch/srlab/graceac9/data/304428_S1_L002_R1_001.fastq \
> --right /gscratch/srlab/graceac9/data/304428_S1_L001_R2_001.fastq,\
> /gscratch/srlab/graceac9/data/304428_S1_L002_R2_001.fastq \
> --trimmomatic \
> --CPU 28
Left read files: $VAR1 = [
          '/gscratch/srlab/graceac9/data/304428_S1_L001_R1_001.fastq',
          '/gscratch/srlab/graceac9/data/304428_S1_L002_R1_001.fastq'
        ];
Right read files: $VAR1 = [
          '/gscratch/srlab/graceac9/data/304428_S1_L001_R2_001.fastq',
          '/gscratch/srlab/graceac9/data/304428_S1_L002_R2_001.fastq'
        ];
Trinity version: Trinity-v2.4.0
** NOTE: Latest version of Trinity is Trinity-v2.8.4, and can be obtained at:
	https://github.com/trinityrnaseq/trinityrnaseq/releases

which: no bowtie2 in (/gscratch/srlab/programs/trinity/trinity-plugins/BIN:/sw/local/bin:/usr/lib64/qt-3.3/bin:/sw/modules-1.775/bin:/usr/lpp/mmfs/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/lusers/graceac9/bin)
which: no bowtie2-build in (/gscratch/srlab/programs/trinity/trinity-plugins/BIN:/sw/local/bin:/usr/lib64/qt-3.3/bin:/sw/modules-1.775/bin:/usr/lpp/mmfs/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/lusers/graceac9/bin)
Error, cannot find path to bowtie2 () or bowtie2-build (), which is now needed as part of Chrysalis' read scaffolding step.  If you should choose to not run bowtie, include the --no_bowtie in your Trinity command.

@kubu4 kubu4 commented Oct 24, 2018

I'm not exactly sure what you've pasted here.

You shouldn't have all of these:

[graceac9@mox1 srlab]$

Or, all of these:

>


@kubu4 kubu4 commented Oct 24, 2018

Where is your script located? I can't find it.

Also, I needed this line when I last ran Trinity (doesn't resolve bowtie error, but will prevent a problem with Trinity later on)

# Load Python Mox module for Python module availability

module load intel-python3_2017

@kubu4 kubu4 commented Oct 24, 2018

The bowtie2 error occurs because bowtie2 isn't currently in your $PATH. That means you'll probably also have to add jellyfish and salmon to your $PATH.

To do so, add the following text to your ~/.bashrc file:

# Custom PATH

export PATH="$PATH:\
/gscratch/srlab/programs/bowtie2-2.3.4.1-linux-x86_64:\
/gscratch/srlab/programs/anaconda3/bin/cutadapt:\
/gscratch/srlab/programs/FastQC:\
/gscratch/srlab/programs/jellyfish-2.2.10/bin:\
/gscratch/srlab/programs/salmon-0.11.2-linux_x86_64/bin:\
/gscratch/srlab/programs/samtools-1.9"
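
One way to confirm that a PATH change like this took effect is to ask the shell which of the required programs it still cannot find. The following is a minimal sketch: the function name `missing_tools` is made up for illustration, and the tool list simply mirrors the programs mentioned in this comment.

```shell
# Print the name of every program in the argument list that cannot be
# found on $PATH. Prints nothing if all of them are found.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Check the programs named in this thread.
missing_tools bowtie2 bowtie2-build jellyfish salmon samtools
```

An empty result means every listed program resolves; any names printed are the ones still missing from `$PATH`.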

@grace-ac grace-ac commented Oct 24, 2018

Yeah, I'm not entirely sure what I'm doing!

I literally just copied and pasted the script from TextWrangler into my terminal while logged in to Mox.

...Which I now understand is not the way to go. Now I'm thinking that I should send the job via the script file rather than copying and pasting it ...

Here's where my script lives: https://github.com/fish546-2018/grace-Cbairdi-transcriptome/blob/master/scripts/20181024_Cbairdi_trinity_01.sh


@kubu4 kubu4 commented Oct 24, 2018

Correct, you cannot run a script by copying and pasting it into the terminal. The script has to be on Mox and then has to be submitted with a special command; see the Mox wiki.
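
The `#SBATCH` lines in the script are Slurm directives, so the command in question is most likely Slurm's `sbatch`; the Mox wiki remains the authority on this. A minimal sketch, with a hypothetical helper name `submit_job` that just adds two sanity checks before handing the script to `sbatch`:

```shell
# Submit a batch script to Slurm after checking that the script exists
# and that sbatch is available on this machine.
submit_job() {
    script="$1"
    if [ ! -f "$script" ]; then
        echo "No such script: $script" >&2
        return 1
    fi
    if ! command -v sbatch >/dev/null 2>&1; then
        echo "sbatch not found; is this a Slurm login node?" >&2
        return 1
    fi
    sbatch "$script"
}

# Example (run from the directory that holds the script):
#   submit_job 20181024_Cbairdi_trinity_01.sh
#   squeue -u "$USER"    # list your queued and running jobs
```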


@kubu4 kubu4 commented Oct 24, 2018

I looked at the script. You need to put the Custom PATH stuff in your ~/.bashrc file.


@kubu4 kubu4 commented Oct 24, 2018

Also, after adding that to your ~/.bashrc file, you'll need to source the file so the computer finds the new info:

source ~/.bashrc
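
Because a line of the form `export PATH="$PATH:..."` appends every time the file is read, sourcing `~/.bashrc` more than once in the same session adds duplicate entries. That is harmless but clutters `$PATH`. A small guard avoids it; `path_append` is a hypothetical helper name, not a shell built-in, and the directory is the bowtie2 location given earlier in this thread.

```shell
# Append a directory to PATH only if it is not already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present: do nothing
        *) PATH="$PATH:$1" ;;        # not present: append it
    esac
}

path_append /gscratch/srlab/programs/bowtie2-2.3.4.1-linux-x86_64
export PATH
```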


@sr320 sr320 commented Oct 24, 2018


@grace-ac grace-ac commented Oct 24, 2018

Is ~/.bashrc file the same thing as the script file?


@sr320 sr320 commented Oct 24, 2018


@sr320 sr320 commented Oct 25, 2018


@kubu4 kubu4 commented Oct 25, 2018

Those programs need to be available in the system $PATH for Trinity to run. Absolute or relative paths have no bearing, as long as the programs are available in the system $PATH.


@kubu4 kubu4 commented Oct 25, 2018

Or, maybe that export PATH command will work in your SBATCH script the way you already have it... Hmmm...
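
For reference, a `PATH` set inside a batch script applies to every command that follows it in that same script, because the job runs the script as an ordinary bash script. A minimal sketch, using the bowtie2 directory given earlier in this thread; the job name is made up for illustration:

```shell
#!/bin/bash
#SBATCH --job-name=path_check   # hypothetical job name

# Extend PATH for this job only; ~/.bashrc is left untouched.
export PATH="$PATH:/gscratch/srlab/programs/bowtie2-2.3.4.1-linux-x86_64"

# Log where bowtie2 resolves so the Slurm output file records it.
command -v bowtie2 || echo "bowtie2 is still not on PATH" >&2
```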


@grace-ac grace-ac commented Oct 25, 2018

well I guess we'll find out! just started running job on Mox


@kubu4 kubu4 commented Oct 25, 2018

Exciting!

However, I noticed you'll be using this version of Trinity:

/gscratch/srlab/programs/trinity/Trinity

That's a pretty old version (2.4.0), nearly two years old at this point. I'd recommend using the newer version:

/gscratch/srlab/programs/Trinity-v2.8.3/Trinity

However, since you'll be using the old version, I don't think you'll actually need those programs in your PATH; I think they came bundled with Trinity back then.


@grace-ac grace-ac commented Oct 28, 2018

Yep, got an error:

[graceac9@mox2 1024]$ head slurm-401714.out 
Left read files: $VAR1 = [
          '/gscratch/srlab/graceac9/data/304428_S1_L001_R1_001.fastq',
          '/gscratch/srlab/graceac9/data/304428_S1_L002_R1_001.fastq'
        ];
Right read files: $VAR1 = [
          '/gscratch/srlab/graceac9/data/304428_S1_L001_R2_001.fastq',
          '/gscratch/srlab/graceac9/data/304428_S1_L002_R2_001.fastq'
        ];
Trinity version: Trinity-v2.4.0
-ERROR: couldn't run the network check to confirm latest Trinity software version.

Will fix the Trinity version in the script and redo.

Any other things I should fix before I send the job again?


@sr320 sr320 commented Oct 28, 2018


@sr320 sr320 commented Oct 28, 2018


@sr320 sr320 commented Oct 28, 2018


@grace-ac grace-ac commented Oct 28, 2018

Oh, I didn't fully read the error correctly... thought it said it couldn't run trinity because it wasn't the updated version.

I'll check the fq files and compare to source!


@grace-ac grace-ac commented Oct 30, 2018

These are the files I had in the script:

304428_S1_L001_R1_001.fastq
304428_S1_L002_R1_001.fastq
304428_S1_L001_R2_001.fastq
304428_S1_L002_R2_001.fastq

These are the files that are in the trinity_out_dir:

304428_S1_L001_R1_001.fastq.PwU.qtrim.fq 
304428_S1_L001_R2_001.fastq.PwU.qtrim.fq
304428_S1_L002_R1_001.fastq.PwU.qtrim.fq
304428_S1_L002_R2_001.fastq.PwU.qtrim.fq

The numbers are correct in the .fastq file names, but I'm not sure whether the added ".PwU.qtrim.fq" extension means that there was a problem?


@sr320 sr320 commented Oct 30, 2018


@kubu4 kubu4 commented Oct 30, 2018

"but there is likely problem with your fq files"

Why do you think this? I'm not following.


@grace-ac grace-ac commented Oct 30, 2018

The files are from /nightingales/C_bairdi/. I downloaded them, and then uploaded them to my data directory on Mox.

I didn't do md5, but Sam did


@kubu4 kubu4 commented Oct 30, 2018

MD5 is a hashing algorithm that generates a short code (checksum) from a file's contents. Transferring data from one place to another can corrupt a file (larger files are more prone to corruption during transfer). To check for this, compare the MD5 checksum originally generated for the file with the checksum generated after the file is transferred. If the checksums match, the transferred file is, for practical purposes, identical to the original. If the checksums do not match, something got corrupted during transfer.

So, any time you copy/move any FastQ files, you should compare the checksums.

Pro tip: Using rsync to copy files actually has this functionality built in and will do it automatically.
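
As a sketch of that workflow on Linux: record checksums at the source, then verify the copies against that record. The function names and the manifest file are hypothetical; `md5sum -c` is the check mode of GNU coreutils `md5sum`.

```shell
# Write one "checksum  filename" line per file in directory $1 to the
# manifest file $2. $2 must be an absolute path outside $1, and $1 is
# assumed to contain only regular files (no subdirectories).
make_manifest() {
    ( cd "$1" && md5sum -- * ) > "$2"
}

# Recompute checksums for the files in directory $1 and compare them with
# manifest $2. Returns non-zero if any listed file is missing or differs.
check_manifest() {
    ( cd "$1" && md5sum -c "$2" )
}
```

Typical use would be to run `make_manifest` on the directory holding the original files, copy the manifest along with the data, and run `check_manifest` on the destination directory.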


@grace-ac grace-ac commented Oct 30, 2018

Right! We learned about that in class a few weeks ago.

Cool! Since I used rsync, the files are fine then?


@sr320 sr320 commented Oct 30, 2018


@grace-ac grace-ac commented Oct 30, 2018

Never trust your data or your tools!


@kubu4 kubu4 commented Oct 30, 2018

"I downloaded them, and then uploaded them to my data"

You used rsync for both steps?

Also, as @sr320 mentioned, it never hurts to check! The process is quick.


@grace-ac grace-ac commented Oct 30, 2018

Nope, I downloaded them without rsync. I uploaded using rsync.

Yes! I will do md5! I'll look back at the textbook and the class notes.


@kubu4 kubu4 commented Oct 30, 2018

Heads up. Macs use a program called md5, while Linux (i.e. Mox) uses a program called md5sum. They will generate the same checksum for a given file, but the output will be formatted differently. This can make it a pain to compare checksums computationally.

However, you can use the -r flag on Macs to get the output to match that of Linux md5sum. E.g.

md5 -r fastq.fq.gz

You can also have md5/md5sum work with wildcards to generate checksums for a set of files:

md5 -r *.fq.gz
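
To sidestep the formatting difference in a script that may run on either system, one option is a small wrapper that prints only the checksum itself. This is a sketch under the assumption that either `md5sum` or `md5` is installed; `file_md5` and `same_md5` are hypothetical names.

```shell
# Print only the MD5 checksum of file $1, using md5sum where it exists
# (Linux) and falling back to md5 -r (macOS).
file_md5() {
    if command -v md5sum >/dev/null 2>&1; then
        md5sum "$1" | awk '{print $1}'
    else
        md5 -r "$1" | awk '{print $1}'
    fi
}

# Succeed (exit status 0) only if files $1 and $2 have the same checksum.
same_md5() {
    [ "$(file_md5 "$1")" = "$(file_md5 "$2")" ]
}
```

For example, `same_md5 original.fq.gz copy.fq.gz && echo "match"` prints `match` only when the two checksums agree.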


@grace-ac grace-ac commented Oct 30, 2018

@kubu4 helped me and we figured out that I was using .fastq files, not the .fastq.gz files.

For some reason my laptop automatically unzips the files...? And they are only 64K, so they aren't the true unzipped files.

[Screenshot: 2018-10-30, 10:51:48 AM]

Anyway, @kubu4 showed me how to rsync the files from /nightingales/C_bairdi/ into my Mox data folder.

[Screenshot: 2018-10-30, 10:58:08 AM]

I ran md5 and compared results to the md5 in nightingales, and the files look good.

[Screenshot: 2018-10-30, 11:28:57 AM]

I have now changed the file names in my .sh and will re-run Trinity

@sr320 sr320 closed this Oct 31, 2018