bugs fixed for oncofuse #407

Merged
merged 7 commits into from
Apr 30, 2014

tanglingfung
Contributor

This PR contains initial bug fixes for the oncofuse pipeline.

I disabled the TopHat fusion workflow for now; STAR junction output should work fine for the GRCh37/hg19 genome. Note that oncofuse output follows hg19 chromosome names, though. Sorry for the long delay in fixing this (for #237 and #383).
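As a rough illustration of the chromosome naming point above: GRCh37 (Ensembl-style) names the primary chromosomes 1..22, X, Y and MT, while hg19 (UCSC-style) uses chr1..chr22, chrX, chrY and chrM. The sketch below only maps between those names; the helper names `to_hg19` and `to_grch37` are hypothetical and not part of this PR, and unplaced contigs are deliberately left untouched.

```python
# Hypothetical helpers, not taken from bcbio-nextgen or oncofuse.
# Only the primary chromosomes are mapped; any other name passes through unchanged.
PRIMARY = [str(i) for i in range(1, 23)] + ["X", "Y"]

GRCH37_TO_HG19 = {name: "chr" + name for name in PRIMARY}
GRCH37_TO_HG19["MT"] = "chrM"  # the mitochondrial name differs between the two styles
HG19_TO_GRCH37 = {hg19: grch37 for grch37, hg19 in GRCH37_TO_HG19.items()}


def to_hg19(chrom):
    """Return the hg19-style name for a GRCh37-style primary chromosome."""
    return GRCH37_TO_HG19.get(chrom, chrom)


def to_grch37(chrom):
    """Return the GRCh37-style name for an hg19-style primary chromosome."""
    return HG19_TO_GRCH37.get(chrom, chrom)
```

A mapping like this could be applied to the chromosome columns of the oncofuse output when the rest of a run uses GRCh37 names.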

disable tophat input for now
only works for GRCh37/hg19 genome
add checks to make sure the junction file exists before running oncofuse
@tanglingfung
Contributor Author

I added checks for an empty fusion file from STAR or TopHat and restored TopHat support.
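A minimal sketch of the kind of check described in this thread (the junction file exists and is not empty) is shown below. The function name `junction_file_usable` is hypothetical and this is not the code from the PR; it only illustrates the idea of skipping oncofuse when there is nothing to process.

```python
import os


def junction_file_usable(path):
    """Return True if path is an existing file with at least one non-blank line."""
    if not path or not os.path.isfile(path):
        return False
    with open(path) as handle:
        # Stop at the first line that contains anything other than whitespace.
        return any(line.strip() for line in handle)
```

A caller could return early, without launching the Java process, whenever this returns False.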

@tanglingfung
Contributor Author

Upload to final is handled as well; please let me know if I missed anything.

@roryk
Collaborator

roryk commented Apr 25, 2014

Thanks Paul, and sorry for the delay; I haven't gotten around to testing this yet.

@tanglingfung
Contributor Author

Thanks, Rory, for taking a look. I am just closing overlapping issues; I kept #383 for this case.

@roryk
Collaborator

roryk commented Apr 25, 2014

Hi Paul,

Thanks! I tested this today, and it looks like it fails if no fusions are found:

[2014-04-25 18:30] Uncaught exception occurred
Traceback (most recent call last):
  File "/home/vagrant/anaconda/envs/bcbio/lib/python2.7/site-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/provenance/do.py", line 23, in run
    _do_run(cmd, checks, log_stdout)
  File "/home/vagrant/anaconda/envs/bcbio/lib/python2.7/site-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/provenance/do.py", line 120, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
CalledProcessError: Command 'set -o pipefail; java -Xms750m -Xmx2000m -jar /usr/local/share/java/oncofuse/Oncofuse.jar /v-data/bcbio-nextgen/tests/test_automated_output/align/Test1/Test1Chimeric.out.junction rnastar AVG /v-data/bcbio-nextgen/tests/test_automated_output/align/Test1/oncofuse_out.txt
[Fri Apr 25 18:30:55 UTC 2014] Mapping fusion breakpoints
[Fri Apr 25 18:30:57 UTC 2014] Reading input file
[Fri Apr 25 18:30:58 UTC 2014] [Post-processing RNASTAR input] Taken 776 reads, of them mapped to canonical RefSeq: 0 as encompassing, 0 as spanning. Total 0 exon pairs, of them 0 passed junction coverage filter.
No fusions loaded
' returned non-zero exit status 255
Traceback (most recent call last):
  File "/home/vagrant/anaconda/envs/bcbio/bin/bcbio_nextgen.py", line 5, in <module>
    pkg_resources.run_script('bcbio-nextgen==0.7.9a', 'bcbio_nextgen.py')
  File "/home/vagrant/anaconda/envs/bcbio/lib/python2.7/site-packages/distribute-0.6.49-py2.7.egg/pkg_resources.py", line 507, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/home/vagrant/anaconda/envs/bcbio/lib/python2.7/site-packages/distribute-0.6.49-py2.7.egg/pkg_resources.py", line 1272, in run_script
    execfile(script_filename, namespace, namespace)
  File "/home/vagrant/anaconda/envs/bcbio/lib/python2.7/site-packages/bcbio_nextgen-0.7.9a-py2.7.egg/EGG-INFO/scripts/bcbio_nextgen.py", line 62, in <module>
    main(**kwargs)
  File "/home/vagrant/anaconda/envs/bcbio/lib/python2.7/site-packages/bcbio_nextgen-0.7.9a-py2.7.egg/EGG-INFO/scripts/bcbio_nextgen.py", line 40, in main
    run_main(**kwargs)
  File "/home/vagrant/anaconda/envs/bcbio/lib/python2.7/site-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/pipeline/main.py", line 39, in run_main
    fc_dir, run_info_yaml)
  File "/home/vagrant/anaconda/envs/bcbio/lib/python2.7/site-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/pipeline/main.py", line 82, in _run_toplevel
    for xs in pipeline.run(config, config_file, parallel, dirs, pipeline_items):
  File "/home/vagrant/anaconda/envs/bcbio/lib/python2.7/site-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/pipeline/main.py", line 432, in run
    samples = rnaseq.estimate_expression(samples, run_parallel)
  File "/home/vagrant/anaconda/envs/bcbio/lib/python2.7/site-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/pipeline/rnaseq.py", line 9, in estimate_expression
    samples = run_parallel("generate_transcript_counts", samples)
  File "/home/vagrant/anaconda/envs/bcbio/lib/python2.7/site-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/home/vagrant/anaconda/envs/bcbio/lib/python2.7/site-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/distributed/multi.py", line 84, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"])(joblib.delayed(fn)(x) for x in items):
  File "/home/vagrant/anaconda/envs/bcbio/lib/python2.7/site-packages/joblib-0.8.0a3-py2.7.egg/joblib/parallel.py", line 644, in __call__
    self.dispatch(function, args, kwargs)
  File "/home/vagrant/anaconda/envs/bcbio/lib/python2.7/site-packages/joblib-0.8.0a3-py2.7.egg/joblib/parallel.py", line 391, in dispatch
    job = ImmediateApply(func, args, kwargs)
  File "/home/vagrant/anaconda/envs/bcbio/lib/python2.7/site-packages/joblib-0.8.0a3-py2.7.egg/joblib/parallel.py", line 129, in __init__
    self.results = func(*args, **kwargs)
  File "/home/vagrant/anaconda/envs/bcbio/lib/python2.7/site-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/utils.py", line 47, in wrapper
    return apply(f, *args, **kwargs)
  File "/home/vagrant/anaconda/envs/bcbio/lib/python2.7/site-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/distributed/multitasks.py", line 61, in generate_transcript_counts
    return rnaseq.generate_transcript_counts(*args)
  File "/home/vagrant/anaconda/envs/bcbio/lib/python2.7/site-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/pipeline/rnaseq.py", line 17, in generate_transcript_counts
    oncofuse_file = oncofuse.run(data)
  File "/home/vagrant/anaconda/envs/bcbio/lib/python2.7/site-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/rnaseq/oncofuse.py", line 51, in run
    do.run(cmd, "oncofuse fusion detection", data)
  File "/home/vagrant/anaconda/envs/bcbio/lib/python2.7/site-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/provenance/do.py", line 23, in run
    _do_run(cmd, checks, log_stdout)
  File "/home/vagrant/anaconda/envs/bcbio/lib/python2.7/site-packages/bcbio_nextgen-0.7.9a-py2.7.egg/bcbio/provenance/do.py", line 120, in _do_run
    raise subprocess.CalledProcessError(exitcode, error_msg)
subprocess.CalledProcessError: Command 'set -o pipefail; java -Xms750m -Xmx2000m -jar /usr/local/share/java/oncofuse/Oncofuse.jar /v-data/bcbio-nextgen/tests/test_automated_output/align/Test1/Test1Chimeric.out.junction rnastar AVG /v-data/bcbio-nextgen/tests/test_automated_output/align/Test1/oncofuse_out.txt

This happens in the fusion test in the unit tests. You can run it with nosetests -v -s -a fusion=True. If you edit data/automated/run_info-fusion.yaml and change tophat2 to star, it will use STAR as the aligner.
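The config edit described above can also be scripted. The sketch below works on the YAML text directly and assumes the file contains a line of the form `aligner: tophat2`; that key name and the helper `switch_aligner` are assumptions for illustration, so check the actual file before relying on it.

```python
import re


def switch_aligner(yaml_text, old="tophat2", new="star"):
    """Replace the value on any 'aligner: <old>' line with <new>.

    Assumes (hypothetically) that the aligner is configured on a line
    of the form 'aligner: tophat2'; all other lines are left unchanged.
    """
    pattern = r"^([ \t]*aligner:[ \t]*)" + re.escape(old) + r"[ \t]*$"
    return re.sub(pattern, lambda m: m.group(1) + new, yaml_text, flags=re.MULTILINE)
```

Reading data/automated/run_info-fusion.yaml, passing its contents through this function and writing the result back would have the same effect as the manual edit.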

@tanglingfung
Contributor Author

Sorry @roryk, I forgot to handle the case where no fusions are loaded, and my last commit is still not doing the right thing.
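One possible way to handle this case, sketched under assumptions: in the log above the tool prints "No fusions loaded" and exits with status 255, so a wrapper could treat that combination as "no result" rather than as a failure. The function `run_fusion_caller` is hypothetical and is not the fix that was merged; whether Oncofuse always reports this case the same way is also an assumption.

```python
import subprocess

# Marker text taken from the log output quoted in this thread.
NO_FUSIONS_MARKER = "No fusions loaded"


def run_fusion_caller(cmd, out_file):
    """Run cmd (a list of arguments) and return out_file on success.

    Returns None when the command fails but reports that no fusions were
    loaded; any other failure is re-raised as CalledProcessError.
    """
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # the marker may be on either stream
        universal_newlines=True,
    )
    if proc.returncode == 0:
        return out_file
    if NO_FUSIONS_MARKER in proc.stdout:
        return None
    raise subprocess.CalledProcessError(proc.returncode, cmd, output=proc.stdout)
```

Returning None lets the calling pipeline step skip the oncofuse output for that sample instead of aborting the whole run.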

roryk added a commit that referenced this pull request Apr 30, 2014
Oncofuse bug fixes.

This should be considered pre-alpha, we haven't been able to validate the calls but this should get us started.

Thanks to @tanglingfung for all of his awesome work.
@roryk roryk merged commit 8652a74 into bcbio:master Apr 30, 2014
@roryk
Collaborator

roryk commented Apr 30, 2014

Hi Paul,

Awesome. Thanks for all of your work. It runs through the fusion test and spits out the oncofuse file, but it is empty because our test data doesn't have any fusion events. Do you have anything like a small test data set we could add that should have a fusion event in it?

@tanglingfung
Contributor Author

@roryk, thanks for merging. I am still checking with the author of oncofuse on how best to handle this case.

For a small fusion test dataset, @mjafin suggested in #237 the test data in the FusionMap bundle:
http://www.omicsoft.com/fusionmap/Software/FusionMap_2014-01-01.zip
