Mane Plus Clinical tags missing from gff.json.gz files #71

Closed
holtgrewe opened this issue Feb 29, 2024 · 2 comments

Comments


holtgrewe commented Feb 29, 2024

For cdot v0.2.22 and v0.2.23, the cdot-$VERSION.GCF_000001405.40_GRCh38.p14_genomic.110.gff.json.gz files contain MANE Select but not MANE Plus Clinical tags.

The MANE Plus Clinical tags are in the cdot-0.2.22.refseq.grch38.json.gz and cdot-0.2.22.ensembl.Homo_sapiens.GRCh38.110.gff3.json.gz files, however.

@holtgrewe
Contributor Author

I am not 100% certain that I understand how the transcripts are built. However, it appears that multiple GFF files are merged to include historical transcripts. If a transcript with the same accession and version appears in two different GFF files from NCBI, and one has a tag while the other does not, which one wins? The first one or the last one?


davmlaw commented Mar 6, 2024

> cdot-$VERSION.GCF_000001405.40_GRCh38.p14_genomic.110.gff.json.gz contain MANE Select but not MANE Plus Clinical tags.

There are no "MANE Plus" tags in the original GFF file from which the .json.gz above was produced. It does not appear among the GFF files that contain the string:

$ zgrep -l "MANE Plus" *.gff.gz

GCF_000001405.40_GRCh38.p14_genomic.RS_2023_03.gff.gz
GCF_000001405.40_GRCh38.p14_genomic.RS_2023_10.gff.gz
GCF_000001405.40-RS_2023_03_combined_annotation_alignments.gff.gz
GCF_000001405.40-RS_2023_03_genomic.gff.gz

So the MANE Plus Clinical tags appear only in json.gz files built from a GFF that had the tag, or in the merged historical ones.
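The same kind of check can be run against the generated .json.gz files rather than the source GFFs. The following is a minimal sketch, not part of cdot: `files_containing` is a hypothetical helper that does a plain substring search over the decompressed text (the equivalent of `zgrep -l`), so it makes no assumption about the JSON schema.

```python
import gzip


def files_containing(paths, needle):
    """Return the paths whose decompressed text contains `needle`.

    Hypothetical helper: a plain substring scan, like `zgrep -l`.
    It does not parse the JSON, so it cannot tell which field matched.
    """
    hits = []
    for path in paths:
        # Text mode decompresses on the fly; undecodable bytes are replaced
        # rather than raising, since only an ASCII substring is searched for.
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
            if any(needle in line for line in handle):
                hits.append(path)
    return hits
```

Calling it with the cdot file names from this issue and the needle "MANE Plus" would list only the files that carry the tag text somewhere.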

> which one wins? The first one or the last one?

The last one. The idea is that the latest file will have corrected the earlier one.
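As an illustration of the "last one wins" rule, here is a minimal sketch. It is not cdot's actual merge code: `merge_transcripts`, the accession `NM_EXAMPLE.1`, and the record layout are all hypothetical, and it assumes the whole record is replaced rather than merged field by field.

```python
def merge_transcripts(sources):
    """Merge per-file transcript dicts, given in order from oldest to newest.

    Hypothetical sketch: a record from a later file replaces the record
    with the same accession.version from an earlier file as a whole.
    """
    merged = {}
    for transcripts in sources:
        for accession, record in transcripts.items():
            merged[accession] = record  # later file overwrites earlier one
    return merged


# Made-up records: the older file carries the tag, the newer one does not.
older = {"NM_EXAMPLE.1": {"tag": "MANE Plus Clinical"}}
newer = {"NM_EXAMPLE.1": {"tag": None}, "NM_EXAMPLE.2": {"tag": "MANE Select"}}

merged = merge_transcripts([older, newer])
print(merged["NM_EXAMPLE.1"])  # the record from the newer file, without the tag
```

Under this rule a tag present only in an earlier file is dropped when a later file contains the same accession and version, and the reverse order would keep it.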

davmlaw closed this as completed Mar 6, 2024