Improve Paralogues plugin #681

Merged


@nuno-agostinho nuno-agostinho commented Jan 8, 2024

ENSVAR-2596

Requires Ensembl/ensembl-variation#1069

Changelog

  1. Fetch paralogue variants from VEP cache; the source for paralogue variants is prioritised in this order:
    • From custom VCF file (if vcf option is defined)
    • From VEP cache (in cache/offline mode)
    • From Ensembl API (in database mode)
  2. Simplify remote mode
    • Enable remote mode using paralogues=remote instead of a separate argument mode=remote
    • This change makes it clearer that remote mode applies to paralogue annotation only
  3. Support a VCF file of variants mapped to their paralogous regions
    • This can be created during the release process
    • The new regions option returns the paralogue regions (which are used to look up paralogue variants)
  4. Improve performance when using Tabix-indexed data by using get_data
  5. Fix using plugin with registry to access custom databases
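
The source priority described in item 1 can be illustrated with a small selection function. This is a minimal Python sketch of the ordering only, not the plugin's actual (Perl) implementation; the function name `choose_variant_source`, the `vep_mode` argument and the returned labels are hypothetical names introduced for illustration.

```python
from typing import Optional


def choose_variant_source(vcf: Optional[str], vep_mode: str) -> str:
    """Return a label for where paralogue variants would be fetched from.

    vcf      -- value of the plugin's vcf option, or None if it is not defined
    vep_mode -- hypothetical label for how VEP is run:
                "cache", "offline" or "database"
    """
    # 1. A custom VCF file takes priority whenever the vcf option is defined.
    if vcf:
        return "custom_vcf"
    # 2. Otherwise, in cache/offline mode, variants come from the VEP cache.
    if vep_mode in ("cache", "offline"):
        return "vep_cache"
    # 3. Otherwise, in database mode, variants come from the Ensembl API.
    if vep_mode == "database":
        return "ensembl_api"
    raise ValueError(f"unknown VEP mode: {vep_mode!r}")
```

Note that a defined `vcf` option wins regardless of the mode VEP runs in, which is why it is checked first.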

Matches file

The file of variants mapped to paralogue regions can be created by running VEP with VCF output, a registry pointing to the next release's databases (e112 in this case), and the Paralogues plugin with the recommended options.

More information: refer to the Paralogues plugin page in Confluence.
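
As a rough illustration of the run described above, the command line could be assembled as follows. This is a sketch under assumptions: the helper name `build_matches_command` and all file paths are hypothetical, the recommended plugin options are not listed in this PR (so they are left as a parameter), and a real run may need further VEP flags that are not shown here.

```python
from typing import List, Sequence


def build_matches_command(
    input_vcf: str,
    output_vcf: str,
    registry: str,
    plugin_options: Sequence[str] = (),
) -> List[str]:
    """Assemble a VEP argument list with VCF output, a registry and the plugin.

    plugin_options holds "key=value" strings for the Paralogues plugin; the
    recommended values are not given in this PR, so none are assumed here.
    """
    # VEP plugins are configured as "Name,option1,option2,..."
    plugin = ",".join(["Paralogues", *plugin_options])
    return [
        "vep",
        "--input_file", input_vcf,
        "--output_file", output_vcf,
        "--vcf",                    # write output in VCF format
        "--registry", registry,     # registry pointing to the release databases
        "--plugin", plugin,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where `vep` is installed.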

Testing

Example of variants to test with this plugin:

1	939112	.	G	A
1	44827194-44827195	.	GT	AA
1	8871910	.	C	T
2	166073606	.	G	T
17	4833586	.	G	T

  1. Test fetching variants from the VEP cache by running the plugin in --offline or --cache mode (without a custom VCF)
  2. Check if paralogues=remote works as expected
  3. Check that the matches option works with the ClinVar file (the file location is given in ENSVAR-2596)
    • Test regions=0 and regions=1

@nuno-agostinho nuno-agostinho changed the title from "Paralogues plugin: Fetch variants from VEP cache" to "Improve Paralogues plugin" on Jan 8, 2024

@jamie-m-a jamie-m-a left a comment

Thanks Nuno - those few fixes seem to be working.

Just a quick comment before merging - it's good practice not to leave big chunks of commented-out code in stuff we're merging to the repos. Can you remove it (and keep it locally if you think it might be code you want to use later)? Thanks!

@nuno-agostinho

Thanks for the review, @jamie-m-a! I have now removed the commented-out code.

@jamie-m-a jamie-m-a merged commit c236450 into Ensembl:postreleasefix/112 May 8, 2024