Scripts to automatically update ANNOVAR db

Update ClinVar

If you wish to update ClinVar data for ANNOVAR more frequently than the ANNOVAR releases do, you can use this script (it works with the ClinVar VCF format as of May 2021).


  • a decently recent version of ANNOVAR (tested on 2020Jun08).


Using conda

Clone the repo then run:

conda env create -f environment.yml

to create the environment and install the requirements. Then you should activate the environment:

conda activate update_annovar_db


To test the avinput to annovar db format conversion run:


This will try to convert clinvar/GRCh37/clinvar_test.avinput into a new file, clinvar/GRCh37/clinvar_test.txt, that is compatible with the ANNOVAR db format. If you are happy with the result, you can then run the entire script.


Required arguments

  • -d, --database-type: the database type (e.g. 'clinvar')
  • -a, --annovar-path: path to ANNOVAR perl scripts folder
  • -hp, --humandb-path: path to ANNOVAR humandb/ folder

Optional arguments

  • -g, --genome-version: genome version [GRCh37|GRCh38], default GRCh37
  • -r, --rename: replace the date part of the file name with the given string, e.g. 'latest' renames clinvar_20210501.txt to clinvar_latest.txt
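
For illustration only, the options listed above could be declared with argparse roughly as follows. This is a minimal sketch, not the script's actual source: the function name build_parser and the help strings are assumptions, and only the flag names and the GRCh37 default come from the lists above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented options."""
    parser = argparse.ArgumentParser(
        description="Update an ANNOVAR database (illustrative sketch)"
    )
    # Required arguments
    parser.add_argument("-d", "--database-type", required=True,
                        help="database type, e.g. 'clinvar'")
    parser.add_argument("-a", "--annovar-path", required=True,
                        help="path to ANNOVAR perl scripts folder")
    parser.add_argument("-hp", "--humandb-path", required=True,
                        help="path to ANNOVAR humandb/ folder")
    # Optional arguments
    parser.add_argument("-g", "--genome-version",
                        choices=["GRCh37", "GRCh38"], default="GRCh37",
                        help="genome version (default: GRCh37)")
    parser.add_argument("-r", "--rename", default=None,
                        help="string replacing the date in the file name, "
                             "e.g. 'latest'")
    return parser


if __name__ == "__main__":
    # Example values only; the paths are placeholders.
    args = build_parser().parse_args(
        ["-d", "clinvar", "-hp", "/Path/to/annovar/humandb",
         "-a", "/Path/to/annovar/2020Jun08", "-g", "GRCh38"]
    )
    print(args.database_type, args.genome_version, args.rename)
```

With this layout, omitting -g falls back to GRCh37 and omitting -r leaves the dated file name unchanged.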


python -d clinvar -hp /Path/to/annovar/humandb -a /Path/to/annovar/2020Jun08 -g GRCh38

The script checks the clinvar/{genome_version}/ folder to detect a previous version of ClinVar. It then downloads the most recent ClinVar md5 file and compares it to the md5 of the version present in the clinvar/{genome_version}/ folder (or to nothing if there is no previous version). If the two md5 checksums do not match, the script downloads the latest ClinVar VCF and converts it, in several steps, into the ANNOVAR db format.
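
The checksum comparison described above can be sketched as follows. This is a hedged illustration rather than the script's real code: the helper names parse_md5 and needs_update are hypothetical, the md5 file is assumed to hold the checksum as its first whitespace-separated field, and the download step is left out so the logic can be read on its own.

```python
from pathlib import Path
from typing import Optional


def parse_md5(text: str) -> str:
    """Return the checksum from md5-file content.

    Assumes the checksum is the first whitespace-separated field,
    as in '<checksum>  <file name>'.
    """
    fields = text.split()
    return fields[0].lower() if fields else ""


def needs_update(remote_md5_text: str, local_md5_file: Optional[Path]) -> bool:
    """Decide whether a new ClinVar VCF should be downloaded.

    remote_md5_text: content of the freshly downloaded md5 file.
    local_md5_file: md5 file kept from the previous version, or None.
    """
    # No previous version: always update.
    if local_md5_file is None or not local_md5_file.exists():
        return True
    # Update only when the two checksums differ.
    return parse_md5(remote_md5_text) != parse_md5(local_md5_file.read_text())
```

Comparing only the checksum field, rather than the whole line, keeps the check independent of the file name stored next to it.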


The resulting version of ClinVar is neither decomposed nor normalised before conversion to ANNOVAR format. However, some internal tests have shown that the current ClinVar VCFs are already processed, so these steps are no longer mandatory. Use it at your own risk!