GTF_2_GFF3 fails with latest Gencode GTF for human #72
Comments
Hi @amizeranschi. |
For the record, the referenced commits by @dkoppstein have solved my issue. Would be great to see this fix merged into the base pipeline. Unfortunately, my test run still ended up failing, due to a different reason. This is the updated command that I ran:
And below is the new issue. @valentinoruggieri any help with this new error would be much appreciated.
|
@amizeranschi thanks for the confirmation. I don't have access to a Linux box with Docker to run the formal tests, so I am waiting on opening a PR until you or one of the devs can do a test run. However, I am also encountering another issue downstream at the index_gff step with a Gencode GTF... I will try tinkering around a bit, then may need to submit another bug report. For your STAR issue in the short term, it seems like the AWS iGenomes are out of date -- I would suggest using the latest Ensembl or GENCODE FASTA and GTF files, and specifying them on the command line. |
OK, I will give this a try. Either way, I guess the STAR index will have to be rebuilt, correct? I will try using the latest FASTA and GTF from Gencode, I just saw that they recently made a new release, v. 44. |
Yes, the STAR index will have to be rebuilt.
|
OK, then I'll add --save_reference to my command so that I can later reuse the new STAR index. Thanks a lot for your help. |
@amizeranschi @dkoppstein Thank you both for reviewing and testing the code. Regarding the STAR issue, as you pointed out, it comes from the mismatch between the STAR version used to generate the index in the AWS iGenomes (2.5.1b, 2016) and the STAR version used for the mapping (2.7.9a). We are looking for a way to take this discrepancy into account. |
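For anyone who wants to spot this kind of mismatch before a run, here is a minimal sketch. It assumes the STAR index directory contains a genomeParameters.txt file with a versionGenome entry, which is what recent STAR releases write; older indexes may store a different kind of value there. The function names and the toy file content are hypothetical and not part of the pipeline, and a difference between the two strings is only a prompt to check, since STAR itself decides whether an index is compatible.

```python
def read_index_version(param_lines):
    """Return the value of the 'versionGenome' entry, or None if it is absent.

    Assumes whitespace-separated key/value lines, with '#' comment lines,
    as found in a STAR index's genomeParameters.txt.
    """
    for line in param_lines:
        if line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "versionGenome":
            return parts[1]
    return None


def versions_differ(index_version, star_version):
    """Flag a discrepancy; a missing index version is reported as unknown (False)."""
    return index_version is not None and index_version != star_version


# Toy file content for illustration only, not copied from a real index.
toy_params = [
    "### STAR --runMode genomeGenerate\n",
    "versionGenome\t2.7.4a\n",
    "genomeType\tFull\n",
]

index_version = read_index_version(toy_params)
if versions_differ(index_version, "2.7.9a"):
    print(f"Index was built with {index_version}; consider rebuilding it.")
```

In practice the lines would come from reading genomeParameters.txt inside the index directory, and the second argument from whichever STAR version the pipeline runs.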
OK, changing the command to the one below got things moving forward for me:
@dkoppstein I am now getting some GFF-related warnings and errors, but I'm not sure if this is what you were referring to earlier. This is what I get:
|
Adding the option "--keep-genes" to the gffread script of the GTF_2_GFF3 module ( |
@dkoppstein I also ran a built-in test earlier, as you suggested:
and the outcome was similar to my test above. Would you be able to implement the fix suggested by @valentinoruggieri?
|
@amizeranschi I just pushed the suggested fix to my fork. |
Thanks, it seems like that did the trick. The built-in test ran fine now. I'll also run my own test above and see how that turns out. |
Unfortunately, my test still ran into an error. Reminder, this is the command I'm running:
And this was the outcome, with the latest version of @dkoppstein's fork. Any idea about this error, @valentinoruggieri?
|
@amizeranschi it seems the error arises because the "miso_genes" IDs in nextflow.config are slightly different from those in the genome version you used (e.g. ENSG00000004961 vs ENSG00000004961.15). You should change them accordingly. |
I'm afraid I don't fully follow you. Could you please be a bit more specific? Where and what should I change with regard to those "miso_genes" IDs? |
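rnasplice relies on the "miso_genes" param to specify the gene IDs you want to plot via the miso/sashimi_plot function. By default, it includes 3 genes ('ENSG00000004961, ENSG00000005302, ENSG00000147403'); check the nextflow.config file. Gencode GTFs carry a version suffix on each gene ID (e.g. ENSG00000004961.15), so the unversioned defaults do not match.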
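A minimal sketch of how one could look up the matching versioned IDs, assuming a standard tab-separated GTF whose ninth column holds a gene_id "..." attribute. The function names and the toy GTF line are hypothetical and only for illustration; only the .15 version of ENSG00000004961 comes from this thread.

```python
import re

GENE_ID_RE = re.compile(r'gene_id "([^"]+)"')


def gene_ids_in_gtf(lines):
    """Collect every gene_id value found in the attribute column of GTF lines."""
    ids = set()
    for line in lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        match = GENE_ID_RE.search(fields[8])
        if match:
            ids.add(match.group(1))
    return ids


def resolve_gene_ids(wanted, gtf_ids):
    """Map each wanted ID to the ID actually present in the GTF, or None.

    An exact match wins; otherwise IDs are compared with the version
    suffix (everything after the first '.') removed.
    """
    by_base = {}
    for gid in gtf_ids:
        by_base.setdefault(gid.split(".")[0], []).append(gid)
    resolved = {}
    for gid in wanted:
        if gid in gtf_ids:
            resolved[gid] = gid
        else:
            candidates = sorted(by_base.get(gid.split(".")[0], []))
            resolved[gid] = candidates[0] if candidates else None
    return resolved


# Toy GTF content: coordinates and source are made up for illustration.
toy_gtf = [
    "##description: toy example\n",
    'chrX\tHAVANA\tgene\t1\t100\t.\t+\t.\tgene_id "ENSG00000004961.15"; gene_type "protein_coding";\n',
]

resolved = resolve_gene_ids(
    ["ENSG00000004961", "ENSG00000005302"], gene_ids_in_gtf(toy_gtf)
)
print(resolved)
```

With a real annotation, the lines would come from the opened GTF file, and the resolved IDs are the ones to supply through the miso_genes parameter. An ID that resolves to None is absent from the annotation altogether.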
|
I see, thanks a lot for the added details. I have added the following to my command line:
and everything worked perfectly. It might also be helpful to update the documentation to reflect the actual default value for |
I would also suggest changing these values automatically if |
Good point @dkoppstein. I opened a separate issue report here: #78 |
@dkoppstein could you please create a PR with your fixes? |
Description of the bug
The pipeline fails when running with --rmats and the latest GTF file for human for Gencode: https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_43/gencode.v43.annotation.gtf.gz, even though --gencode is specified in the command.
GTF file source: https://www.gencodegenes.org/human/
I'm including a command below that reproduces the error. The sample sheet and contrast sheet are based on the pipeline's test profile, but I've customized the GTF and reference genome.
Command used and terminal output
Relevant files
No response
System information
No response