A few issues #33

Closed
crj32 opened this issue Apr 5, 2019 · 8 comments


crj32 commented Apr 5, 2019

Ruolin

Adding the gene_id from the Ensembl annotation is useful, and much faster than running my Python script to do it afterwards. I'll let you know how this pans out with the downstream tools I use. There are a few other minor issues, though, that would help make the tool better IMO:

  1. A further improvement would be adding the gene_name as well. This is an example from StringTie of the field we are discussing:

gene_id "ERR188044.1"; transcript_id "ERR188044.1.1"; reference_id "NM_018390"; ref_gene_id "NM_018390"; ref_gene_name "PLCXD1"

So we would have your tool's gene_id and transcript_id, plus the ref_gene_name and ref_gene_id from the Ensembl/reference annotation.

  2. If I am running with the --no-quant flag, would it be possible to remove the TPM and FPKM parts from the annotation output file? They should not really be there. It's a minor issue, but we currently get these:

;FPKM "NA";Frac "NA";TPM "NA";

  3. Usually I run StringTie in one directory and it outputs its .gtf files into that same directory, each named according to the original file ID. This is easier than having them all in different directories and then having to rename and move them all before I run cuffmerge. I actually use TACO (https://www.nature.com/articles/nmeth.4078) instead of cuffmerge; it is supposed to be a lot better, and my results made more sense when using it.

  4. I don't really want any log files to be output; I only need the .gtf file. I have to include extra code to clean all this up. Is it possible to have a parameter to get a single .gtf and nothing else?
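
Until the tool handles item 2 itself, the placeholder attributes can be stripped in post-processing. The following is only a rough Python sketch of that workaround, not anything strawberry does internally; the function name `strip_na_quant` is made up for illustration, and it assumes the placeholders always appear exactly as `KEY "NA";` in the ninth GTF column, as in the example above.

```python
import re

# Attribute keys that show up as "NA" placeholders under --no-quant,
# taken from the example in this comment.
QUANT_KEYS = ("FPKM", "Frac", "TPM")

# Matches e.g. `FPKM "NA";` plus any whitespace directly before it.
_NA_ATTR = re.compile(r'\s*\b(?:%s)\s+"NA"\s*;' % "|".join(QUANT_KEYS))


def strip_na_quant(gtf_line: str) -> str:
    """Remove FPKM/Frac/TPM "NA" attributes from a single GTF line.

    Comment lines and lines that do not have nine tab-separated
    columns are returned unchanged.
    """
    if gtf_line.startswith("#"):
        return gtf_line
    fields = gtf_line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return gtf_line
    # Only the attribute column (column 9) is touched.
    fields[8] = _NA_ATTR.sub("", fields[8]).strip()
    return "\t".join(fields) + "\n"
```

Attributes with real values (for example `FPKM "1.5";`) are left alone, so the same filter should be safe to run on quantified output as well.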

Thanks for your time. These are just some ideas for you to review, I am keen on using strawberry in our work.

Chris

Owner

ruolin commented Apr 6, 2019

@crj32 Thank you for your suggestions. I really appreciate people giving feedback, so please don't hesitate to do so. For 3 and 4, I think what you need is an option for output_gtf rather than an output folder, right? I will start working on these issues. It will probably take a few days since I am also busy with other things.

Author

crj32 commented Apr 6, 2019

Yeah, just a custom named output .gtf would be nice. Right you are.

Owner

ruolin commented Apr 6, 2019

What about when there is no annotation? Would you prefer something like ; ref_gene_id ""; ref_gene_name ""?

Author

crj32 commented Apr 7, 2019

I don't mind, because we are always using a reference, but I'd suggest making the names for these IDs the same as StringTie uses, in case any other tools require specific names for compatibility.

I think without a reference, it is probably just going to be 'gene_id' and 'transcript_id'? I think you had that already.
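
To make the two cases concrete, here is a small Python sketch of the behaviour suggested above: the StringTie-style `ref_gene_id` and `ref_gene_name` keys are written only when a reference match exists, and omitted otherwise rather than written as empty strings. The helper `format_attributes` is hypothetical and is not taken from strawberry's source.

```python
from typing import Optional


def format_attributes(
    gene_id: str,
    transcript_id: str,
    ref_gene_id: Optional[str] = None,
    ref_gene_name: Optional[str] = None,
) -> str:
    """Build a GTF attribute string using StringTie-style key names.

    The reference keys are only emitted when a value is available,
    so unannotated transcripts carry just gene_id and transcript_id.
    """
    pairs = [("gene_id", gene_id), ("transcript_id", transcript_id)]
    if ref_gene_id:
        pairs.append(("ref_gene_id", ref_gene_id))
    if ref_gene_name:
        pairs.append(("ref_gene_name", ref_gene_name))
    return " ".join(f'{key} "{value}";' for key, value in pairs)


# With a reference match:
print(format_attributes("ERR188044.1", "ERR188044.1.1", "NM_018390", "PLCXD1"))
# Without a reference match:
print(format_attributes("ERR188044.1", "ERR188044.1.1"))
```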

Author

crj32 commented Apr 8, 2019

I've just noticed something: the gene_id "ERR188044.1"; transcript_id "ERR188044.1.1" values from StringTie include the sample name in the ID. You may have recognised this already.

This is important when it comes to merging the GTFs: I get an error when using TACO (https://tacorna.github.io/) to merge the .gtf files from strawberry, because the .gtf format is not the same as StringTie's. This is an issue for me and, I expect, potentially for other people.
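
As a stopgap before merging, the sample name could be prepended to the IDs in a post-processing step. The Python sketch below is one possible way to do that; `prefix_ids` is a made-up helper, and whether this alone is enough to satisfy TACO has not been verified here.

```python
import re

# Matches gene_id / transcript_id attributes only. The \b keeps
# ref_gene_id from matching, because "_" counts as a word character.
_ID_ATTR = re.compile(r'\b(gene_id|transcript_id) "([^"]*)"')


def prefix_ids(attributes: str, sample: str) -> str:
    """Prepend `sample.` to gene_id and transcript_id values.

    Values that already start with the sample name are left as they
    are, so running the function twice gives the same result.
    """
    def add_prefix(match: "re.Match[str]") -> str:
        key, value = match.group(1), match.group(2)
        if value.startswith(sample + "."):
            return match.group(0)
        return f'{key} "{sample}.{value}"'

    return _ID_ATTR.sub(add_prefix, attributes)
```

This only rewrites the attribute column text, so it would be applied to the ninth tab-separated field of each non-comment GTF line.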

Owner

ruolin commented Apr 10, 2019

@crj32 I think I have completed all your suggested items. Could you try the current master (commit id 09aee96) to see if it meets your needs?

Author

crj32 commented Apr 10, 2019

The VM I do my work on is down today, unfortunately, so we will have to wait. Also, is it OK to get a precompiled binary instead of the source code?

Author

crj32 commented Apr 11, 2019

Good news: my computer is working again. I have compiled from source and will test the new version for you.

ruolin closed this as completed May 18, 2019