Provide utility to generate transcript variants from HGVS-ish expressions with gene names #517

Open
reece opened this issue Aug 27, 2018 · 6 comments

reece commented Aug 27, 2018

Unfortunately, some authors generate bogus HGVS expressions that use gene names rather than reference sequences. This issue is a request for functionality that generates plausible transcript-based expressions from such input.

For example, TNFRSF1A has 4 transcripts at the site of rs104895271. So, for a gene-based variant like TNFRSF1A:c.123T>C, return NM_001065.3:c.123T>C, and for TNFRSF1A:n.426T>C, return NR_144351.1:n.426T>C. In general, there might be zero or more plausible variants for a given input.

See code in misc/experimental/hgvs-guess-plausible-transcripts.

reece self-assigned this Aug 27, 2018
reece added this to the 1.3.0 milestone Sep 4, 2018

reece commented Dec 10, 2018

See #267, and particularly this comment:


hgvs-guess-plausible-transcripts works like this:

(3.6) snafu$ ./misc/experimental/hgvs-guess-plausible-transcripts 'HFE2:c.187_188insGAG' 'TNFRSF1A:c.123T>C' 'TNFRSF1A:n.426T>C'
HFE2:c.187_188insGAG	5	NM_213653.3:c.187_188insGAG	NM_202004.3:c.187_188insGAG	NM_145277.4:c.187_188insGAG	NM_001316767.1:c.187_188insGAG	NM_213652.3:c.187_188insGAG
TNFRSF1A:c.123T>C	1	NM_001065.3:c.123T>C
TNFRSF1A:n.426T>C	1	NR_144351.1:n.426T>C

For each quasi-variant on the command line, the script constructs the variant on all of the transcripts for the named gene. If the constructed variant is considered valid (in the hgvs validator sense), then it's displayed. The columns above are: input variant, number of results, and the list of results (all tab-separated).
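The procedure described above can be sketched roughly as follows. This is a minimal illustration, not the code of the experimental script: `guess_plausible_variants`, `format_row`, and both stub helpers are hypothetical names, and the transcript lookup and the validity check are passed in as plain callables so the sketch runs without a database. In real use they would presumably be backed by an hgvs data provider and the hgvs validator.

```python
from typing import Callable, Iterable, List


def guess_plausible_variants(
    quasi_variant: str,
    transcripts_for_gene: Callable[[str], Iterable[str]],
    is_valid: Callable[[str], bool],
) -> List[str]:
    """Return the transcript-based expressions that pass validation."""
    # Split 'GENE:posedit' once, at the first colon.
    gene, sep, posedit = quasi_variant.partition(":")
    if not sep or not gene or not posedit:
        raise ValueError(f"expected GENE:posedit, got {quasi_variant!r}")
    # Build the same posedit on every transcript of the gene ...
    candidates = (f"{ac}:{posedit}" for ac in transcripts_for_gene(gene))
    # ... and keep only those the caller-supplied check accepts.
    return [c for c in candidates if is_valid(c)]


def format_row(quasi_variant: str, results: List[str]) -> str:
    """Tab-separated row: input variant, number of results, results."""
    return "\t".join([quasi_variant, str(len(results)), *results])


# Toy stand-ins (assumptions for illustration only, not real lookups).
_STUB_TX = {"TNFRSF1A": ["NM_001065.3", "NR_144351.1"]}


def stub_transcripts_for_gene(gene: str) -> List[str]:
    return _STUB_TX.get(gene, [])


def stub_is_valid(expr: str) -> bool:
    # Crude placeholder for real validation: c. only on NM_, n. only on NR_.
    ac, _, posedit = expr.partition(":")
    return (ac.startswith("NM_") and posedit.startswith("c.")) or (
        ac.startswith("NR_") and posedit.startswith("n.")
    )


if __name__ == "__main__":
    for qv in ["TNFRSF1A:c.123T>C", "TNFRSF1A:n.426T>C"]:
        hits = guess_plausible_variants(qv, stub_transcripts_for_gene, stub_is_valid)
        print(format_row(qv, hits))
```

Injecting the lookup and the validity check keeps the candidate-generation logic testable on its own. An input with no valid transcript simply yields an empty list, matching the "zero or more plausible variants" expectation in the issue description.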

davmlaw commented Jun 13, 2023

Hi, I've had to implement this, and after putting it in front of medical scientists, I found that the "generate everything" approach was slow and not that helpful to them. What they appear to want is the canonical transcript, which nowadays is the MANE transcript.

The RefSeq and Ensembl GTFs have tags on them, e.g. MANE_select for GRCh38 or Refseq select for GRCh37.
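A tag-based lookup along these lines might look like the sketch below. It is only an illustration under stated assumptions: the function names are hypothetical, the attribute keys (`gene_name` or `gene`, `transcript_id`, `tag`) and the exact spelling of the tag value differ between annotation sources and should be checked against the actual file, and the sample lines are made-up toy data rather than real annotation.

```python
import re
from typing import Dict, Iterable, List, Optional

# GTF attributes look like: key "value"; key "value"; ...
_ATTR_RE = re.compile(r'(\w+) "([^"]*)"')


def parse_gtf_attributes(attr_field: str) -> Dict[str, List[str]]:
    """Collect attribute values per key; keys such as 'tag' may repeat."""
    attrs: Dict[str, List[str]] = {}
    for key, value in _ATTR_RE.findall(attr_field):
        attrs.setdefault(key, []).append(value)
    return attrs


def _norm(tag: str) -> str:
    # Compare tags loosely: ignore case and space/underscore differences.
    return tag.replace(" ", "_").lower()


def select_tagged_transcript(
    gtf_lines: Iterable[str], gene: str, wanted_tag: str = "MANE_select"
) -> Optional[str]:
    """Return the first transcript of `gene` carrying `wanted_tag`, if any."""
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "transcript":
            continue  # only transcript-level records are considered
        attrs = parse_gtf_attributes(fields[8])
        # Assumption: the gene symbol is stored under 'gene_name' or 'gene'.
        if gene not in attrs.get("gene_name", []) + attrs.get("gene", []):
            continue
        if any(_norm(t) == _norm(wanted_tag) for t in attrs.get("tag", [])):
            ids = attrs.get("transcript_id", [])
            if ids:
                return ids[0]
    return None


def _toy_line(attributes: str) -> str:
    # Build a 9-column GTF record with placeholder coordinates.
    return "\t".join(["chr1", "toy", "transcript", "100", "900", ".", "+", ".", attributes])


# Made-up records for illustration only.
SAMPLE_GTF = [
    "#!toy header",
    _toy_line('gene_id "G1"; transcript_id "TX_A.1"; gene_name "GENE1"; tag "basic";'),
    _toy_line('gene_id "G1"; transcript_id "TX_B.2"; gene_name "GENE1"; tag "basic"; tag "MANE_Select";'),
]

if __name__ == "__main__":
    print(select_tagged_transcript(SAMPLE_GTF, "GENE1"))
```

Returning at most one transcript per gene avoids validating the variant against every transcript, which is the slowness reported above. A caller could still fall back to the exhaustive approach when no tagged transcript is found.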

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions bot added the stale label Dec 24, 2023

This issue was closed because it has been stalled for 7 days with no activity.

github-actions bot closed this as not planned Dec 31, 2023
reece reopened this Feb 19, 2024
reece added the resurrected label and removed the stale label Feb 19, 2024

reece commented Feb 19, 2024

This issue was closed by stalebot. It has been reopened to give more time for community review. See biocommons coding guidelines for stale issue and pull request policies. This resurrection is expected to be a one-time event.

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions bot added the stale label May 23, 2024
jsstevenson added the keep alive label and removed the stale and resurrected labels May 23, 2024