
Tophat, Tophat2, Bowtie2 ending with metadata warning #62

Closed
jennaj opened this Issue Nov 14, 2017 · 9 comments

@jennaj
Member

jennaj commented Nov 14, 2017

Jobs are ending with a metadata warning. The "autodetect" button does not resolve the problem. Datasets cannot be used, or in some cases downloaded (their .bam.bai files), until this is resolved.

How can I tell if my job has a metadata error? A yellow warning box will appear in the dataset.

screen shot 2017-11-15 at 12 51 25 pm

Temporary workaround: (updated 11-14-17)

  • Under the History menu (gear icon), select "Copy Dataset". On the copy form, select the datasets to copy and place them into the same history, a new history, or another existing history. Alternatively, the entire history can be copied into a new history with "Copy History".
  • For each copied dataset that had the metadata problem, click on the pencil icon to reach the Edit Attributes form and click "autodetect metadata". Allow that job to process.
  • Once the copied results are confirmed, the original history can be permanently deleted to recover space. BUT triple-check first before deleting any data. A test download of the .bam and .bam.bai files that starts without a 500 error indicates the metadata issue is resolved.

ping @natefoo

@jennaj jennaj added the bug label Nov 14, 2017

@jennaj


Member

jennaj commented Nov 14, 2017

Bowtie2 is now ending with this error:

Remote job server indicated a problem running or monitoring this job.

Error on stampede and jetstream. Success on roundup.

Test history https://usegalaxy.org/u/jen/h/variants-coursera-project---used-in-testing

Similar to #61

Could these be memory failures? A few reports of multi-core jobs ending this way, or with metadata problems, have been coming in since around Thursday 11/9, also linked to Jetstream/Stampede.

@natefoo


Member

natefoo commented Nov 14, 2017

The second error is due to the SSL certs on our staging Galaxy servers expiring. I'd renewed all of our certs last month and deployed them, but apparently missed these two servers. Jetstream jobs should run now. Still looking at the metadata issue, which is unrelated.
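Expiring certs like these can be caught ahead of time. A minimal monitoring sketch, assuming only the Python standard library; the hostname is a placeholder, point it at each staging server:

```python
import ssl
import socket
from datetime import datetime, timezone

def cert_days_remaining(not_after):
    """Days until a cert's notAfter timestamp (format returned by ssl.getpeercert())."""
    expires = datetime.strptime(not_after, "%b %d %H:%M:%S %Y %Z").replace(tzinfo=timezone.utc)
    return (expires - datetime.now(timezone.utc)).days

def fetch_not_after(host, port=443):
    """Fetch the notAfter field of host's TLS certificate."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]

if __name__ == "__main__":
    # Placeholder host; check each server that serves jobs.
    host = "usegalaxy.org"
    print(host, cert_days_remaining(fetch_not_after(host)), "days remaining")
```

Run periodically (e.g. from cron) and alert when the remaining days drop below a threshold, so no server gets missed in the next renewal round.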

@jennaj


Member

jennaj commented Nov 14, 2017

Bowtie2 job on

  • roundup (local cluster) == success
  • jetstream == still queued; will update when it completes
  • stampede == error, with metadata yellow box, but result is empty. Details:

accepted hits

WARNING:galaxy.model:Datatype class not found for extension 'fastqsanger.gz'

report stats

Could not locate a Bowtie index corresponding to basename "/cvmfs/data.galaxyproject.org/managed/bowtie2_index/Schizosaccharomyces_pombe_1.1/Schizosaccharomyces_pombe_1.1"
Error: Encountered internal Bowtie 2 exception (#1)
Command: /work/galaxy/main/deps/_conda/envs/mulled-v1-cf272fa72b0572012c68ee2cbf0c8f909a02f29be46918c2a23283da1d3d76b5/bin/bowtie2-align-s --wrapper basic-0 -p 16 -x /cvmfs/data.galaxyproject.org/managed/bowtie2_index/Schizosaccharomyces_pombe_1.1/Schizosaccharomyces_pombe_1.1 --passthrough -U input_f.fastq.gz 
(ERR): bowtie2-align exited with value 1
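The "Could not locate a Bowtie index" error above can be pre-checked by confirming all six index files exist for the basename. A minimal sketch (the CVMFS basename is taken from the error output; large genomes use .bt2l suffixes instead, which this sketch does not cover):

```python
from pathlib import Path

# A small Bowtie2 index consists of six .bt2 files sharing one basename.
BT2_SUFFIXES = (".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2", ".rev.1.bt2", ".rev.2.bt2")

def missing_index_files(basename):
    """Return the expected Bowtie2 index files that are absent for basename."""
    return [basename + s for s in BT2_SUFFIXES if not Path(basename + s).exists()]

if __name__ == "__main__":
    base = ("/cvmfs/data.galaxyproject.org/managed/bowtie2_index/"
            "Schizosaccharomyces_pombe_1.1/Schizosaccharomyces_pombe_1.1")
    missing = missing_index_files(base)
    print("index complete" if not missing else "missing: %s" % ", ".join(missing))
```

Running this on the compute node reporting the error would show whether the CVMFS-cached index is incomplete there.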

BWA job on

  • jetstream = success, no metadata error
@jennaj


Member

jennaj commented Nov 15, 2017

Update: A fix has been made and we are testing again.

@natefoo


Member

natefoo commented Nov 15, 2017

Jobs using locally cached data probably aren't going to work on Stampede. When we move to Stampede2 I'll automate updating the cached data. You can still use jobs with data from the history.

@jennaj


Member

jennaj commented Nov 16, 2017

Agreed. When using a built-in index (aka locally cached data) with Tophat, Tophat2, or HISAT2, the default cluster Roundup (Job Resources left at default, or this cluster specifically chosen) or Jetstream should be used. Let's leave this ticket open until the transition to Stampede2, unless you'd prefer a distinct ticket that references this one?

For those running mapping jobs at https://usegalaxy.org: avoid sending these jobs to the Stampede cluster unless you are using a custom genome from the history; instead, set the Job Resources parameter to use Jetstream. The option is the last one on the tool form, immediately above the "submit" button:

screen shot 2017-11-15 at 7 17 12 pm

For more details about job resource selections, please see this FAQ: https://galaxyproject.org/main/#resouces-available-to-main-site


Note: When inputting fastqsanger.gz formatted data, you may see this warning in the comment field of the output datasets when using Jetstream (but not the default cluster Roundup):

WARNING:galaxy.model:Datatype class not found for extension 'fastqsanger.gz'

This warning does not impact the mapping job's results. They are valid and can be used directly as inputs to downstream tools.

We are doing more testing to see if this warning impacts other tools/use cases and will update this ticket with a link to the new issue if a problem is discovered.

@natefoo


Member

natefoo commented Nov 16, 2017

Hopefully the datatype warning will be eliminated as well. This doesn't need to stay open for Stampede2 as long as everything is working.

@natefoo


Member

natefoo commented Nov 16, 2017

Okay, the fastqsanger.gz datatype warning was still there as of this morning, but it's been fixed in a71ffea.

@jennaj


Member

jennaj commented Nov 30, 2017

I haven't seen this problem in the last two weeks, so I think it is ok to close now too. Thanks!

@jennaj jennaj closed this Nov 30, 2017
