Results _ File/ Just one CDS #16

Closed
StephanieRodrigues opened this issue Apr 19, 2019 · 6 comments

Comments

@StephanieRodrigues

StephanieRodrigues commented Apr 19, 2019

SNPGenie_ISSUES.zip
Hi Chase,
It's me again! I thought my problems were solved. In fact, I'm not having any problems running the program, but when I started looking at my results and the output files, something seemed odd.
It seems that the program recognized only my first CDS and printed results just for that one.
I'm attaching the results folder and the GTF, VCF, and FASTA files.
The command line I used was:
./snpgenie.pl --vcfformat=3 --snpreport=CL9800.vcf --fastafile=Ef_Aus0004.fasta --gtffile=Aus004.gtf

Regards,

@singing-scientist
Contributor

It is likely an issue with the GTF file (e.g., non-UNIX line endings). If you can provide it, I'll take a look.

@StephanieRodrigues
Author

Aus004_gtf.zip
Here!
I don't know what's going on. I solved the last issue by changing the CDS as you told me, and then I removed the double quotes. But it's still wrong!

@singing-scientist
Contributor

I see; as discussed in the Troubleshooting section of the SNPGenie documentation, the line endings in all your files must be Unix. However, the line endings here are Mac. You'll have to convert them to Unix in a program like TextWrangler (Mac) or Notepad++ (Windows). Let me know.
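
For anyone who would rather check this from a script than in an editor, here is a minimal Python sketch. It is not part of SNPGenie; the function names are made up for illustration, and it assumes "Mac" line endings means bare carriage returns (`\r`), Windows means `\r\n`, and Unix means `\n`.

```python
def count_line_endings(data: bytes) -> dict:
    """Count each line-ending style in raw file bytes."""
    crlf = data.count(b"\r\n")            # Windows
    cr = data.count(b"\r") - crlf         # classic Mac: \r not followed by \n
    lf = data.count(b"\n") - crlf         # Unix: \n not preceded by \r
    return {"crlf": crlf, "cr": cr, "lf": lf}


def to_unix(data: bytes) -> bytes:
    """Convert Windows and classic Mac line endings to Unix."""
    # Replace \r\n first so each pair becomes a single \n, then any bare \r.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def convert_file(path_in: str, path_out: str) -> dict:
    """Write a Unix-line-ending copy of path_in; return the counts found."""
    with open(path_in, "rb") as handle:
        data = handle.read()
    with open(path_out, "wb") as handle:
        handle.write(to_unix(data))
    return count_line_endings(data)
```

Calling `convert_file("Aus004.gtf", "Aus004_unix.gtf")` (the output name is only an example) would write a converted copy and report what was found; nonzero `cr` or `crlf` counts mean the original file was not Unix-formatted.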

@StephanieRodrigues
Author

Hi Chase, Sorry for the delay!
So I restarted my process and ran gffcompare again to convert my GFF file to a GTF file, and now the GTF file seems to be OK. In my last issue, the program was not recognizing + strands because the word "CDS" was missing. I solved that by opening my file in Excel and replacing the names, but what happened then was the line-ending problem (your last reply in this issue).
Now, after converting my GFF file, neither of those problems occurs, but this is the message I get:

The CDS coordinates for gene gene45 in the gtf file do not yield a set of complete codons,

or are absent from the file. The number of nucleotides must be a multiple of 3.

SNPGenie terminated.

I tried deleting all the CDS lines with this problem, but I am still getting the error. I saw another issue here where someone had the same problem, but in that case the GTF file had multiple transcripts for the same gene. I checked my GTF file, but this does not seem to be the problem.
I'm attaching the GTF file.
Really sorry for all these issues; I really need to run your program.
GTF_Aus0004.zip

Regards,
Stephanie

@singing-scientist
Contributor

Hi @StephanieRodrigues! No worries. I checked gene45 and indeed, there are (50502-49545+1)/3 = 319.3333 codons, so this gene's length (958 nucleotides) is not a multiple of 3. There may be multiple "CDS" lines for the same gene (e.g., protein-coding exons), but the lengths of all "CDS" records for each unique gene name must sum to a multiple of 3.

First, make sure CDS records for the same gene have the same gene name. If there are still problems with some genes, then the problematic (non-multiple of 3) genes should be removed. Let me know.
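
The check described above can be sketched in Python to list every problematic gene at once, rather than finding them one error message at a time. This is only an illustration, not SNPGenie's own code: the function names are hypothetical, and it assumes a tab-delimited GTF with the feature type in column 3, 1-based inclusive start and end in columns 4 and 5, and the gene name in a `gene_id` attribute (quoted or unquoted) in column 9.

```python
import re
from collections import defaultdict

# Assumption: the gene name is the value of gene_id, with or without quotes.
GENE_ID = re.compile(r'gene_id\s+"?([^";]+)"?')


def cds_length_by_gene(gtf_lines):
    """Sum CDS lengths (end - start + 1) per gene name."""
    totals = defaultdict(int)
    for line in gtf_lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) < 9 or fields[2] != "CDS":
            continue  # only CDS records count toward codons
        start, end = int(fields[3]), int(fields[4])
        match = GENE_ID.search(fields[8])
        gene = match.group(1).strip() if match else "(no gene_id)"
        totals[gene] += end - start + 1
    return dict(totals)


def incomplete_genes(gtf_lines):
    """Return {gene: total_cds_length} where the total is not a multiple of 3."""
    totals = cds_length_by_gene(gtf_lines)
    return {gene: n for gene, n in totals.items() if n % 3 != 0}
```

Passing an open file works, for example `incomplete_genes(open("Aus004.gtf"))`. Genes returned by `incomplete_genes` are the ones to fix or remove; a gene split across several CDS lines is only flagged if the lengths under that one gene name do not add up to a multiple of 3.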

@singing-scientist
Contributor

As this issue has been silent for 26 days, I am closing it now. Please feel free to re-open if you have further issues.
