Clarify coordinates for chess2.2.genes for RefSeq entries #2

Open
nh13 opened this issue Jul 23, 2020 · 2 comments
Comments

@nh13

nh13 commented Jul 23, 2020

I was looking at the chess2.2.genes file and this entry:

protein_coding  LMAN1   CHS.24766   chr18:59324939-59359276:-   RefSeq  3998    lectin%2C mannose binding 1

I figured that the coordinates would be the RefSeq coordinates from chess2.2_and_refseq.gtf.gz:

chr18  BestRefSeq  transcript  59327823  59359276  .  -  .  transcript_id  "CHS.24766.4";  gene_id  "CHS.24766";  gene_name  "LMAN1";

But instead it looks like chess2.2.genes takes the smallest start coordinate and the largest end coordinate across the gene's entries in chess2.2_and_refseq.gtf.gz. Is that correct? If so, could the README include some more details?
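A minimal sketch of how this min-start/max-end hypothesis could be checked, assuming the GTF is tab-separated with a `gene_id "..."` attribute in column 9. The function name `gene_spans` is made up for illustration and is not part of CHESS.

```python
import re

GENE_ID_RE = re.compile(r'gene_id\s+"([^"]+)"')

def gene_spans(gtf_lines):
    """Map gene_id -> (seqname, smallest start, largest end, strand)
    over every GTF record that carries that gene_id."""
    spans = {}
    for line in gtf_lines:
        # Skip comments and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        seqname, strand = fields[0], fields[6]
        start, end = int(fields[3]), int(fields[4])
        match = GENE_ID_RE.search(fields[8])
        if match is None:
            continue
        gene_id = match.group(1)
        if gene_id in spans:
            # Widen the span recorded so far for this gene.
            _, lo, hi, _ = spans[gene_id]
            start, end = min(lo, start), max(hi, end)
        spans[gene_id] = (seqname, start, end, strand)
    return spans

# Possible usage against the real file (not run here):
#   import gzip
#   with gzip.open("chess2.2_and_refseq.gtf.gz", "rt") as handle:
#       print(gene_spans(handle)["CHS.24766"])
```

If the result for CHS.24766 matches chr18:59324939-59359276, the .genes coordinates are the union of the gene's records rather than those of any single transcript.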

@alevar
Contributor

alevar commented Jul 23, 2020

Hi Nils,

Thank you for pointing this out! We will include a description of the columns for each file to clarify their meaning.

The .genes file lists coordinates as they appear in the corresponding gene entries of the GFF3 annotation files. In the GFF3 files, gene coordinates do indeed span all child transcript, exon, and CDS entries. The 5th column of the .genes file names the annotation in which the gene was previously annotated, but the coordinates might have been extended based on evidence from GTEx data.
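As a rough illustration of the extension, the sketch below parses the LMAN1 entry quoted in this thread and measures how far its gene span reaches beyond the CHS.24766.4 transcript. The column positions are inferred from that single example line, and the helper names `parse_locus` and `extension` are hypothetical.

```python
import re
from urllib.parse import unquote

# Locus strings look like "chr18:59324939-59359276:-" in the example entry.
LOCUS_RE = re.compile(
    r"^(?P<seqname>.+):(?P<start>\d+)-(?P<end>\d+):(?P<strand>[+\-.])$"
)

def parse_locus(locus):
    """Split 'seqname:start-end:strand' into typed parts."""
    m = LOCUS_RE.match(locus)
    if m is None:
        raise ValueError(f"unrecognised locus: {locus!r}")
    return m["seqname"], int(m["start"]), int(m["end"]), m["strand"]

def extension(gene_span, ref_span):
    """Bases by which the gene span extends past (ref_start, ref_end)
    at the low-coordinate and high-coordinate ends (0 if none)."""
    _, g_start, g_end, _ = gene_span
    r_start, r_end = ref_span
    return max(0, r_start - g_start), max(0, g_end - r_end)

entry = ("protein_coding\tLMAN1\tCHS.24766\tchr18:59324939-59359276:-"
         "\tRefSeq\t3998\tlectin%2C mannose binding 1")
fields = entry.split("\t")
gene = parse_locus(fields[3])

print(fields[4], "|", unquote(fields[6]))    # RefSeq | lectin, mannose binding 1
print(extension(gene, (59327823, 59359276)))  # (2884, 0)
```

A non-zero value only shows that the gene span is wider than this one transcript. Whether the extra bases come from another transcript or from GTEx-based evidence has to be read from the annotation itself.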

Best,

Ales

@nh13
Author

nh13 commented Jul 24, 2020

> but the coordinates might have been extended based on evidence from GTEx data

This explains it, thank you! I'll leave this issue open until the descriptions in the README are updated.
