This repo aims to collate publicly available releases of the IMGT/GENE-DB resource of immunoglobulin and T cell receptor germline genes. I am not affiliated with IMGT in any way; I've just gathered these data to make it easier to compare different releases of this resource as, to the best of my knowledge, IMGT only hosts the current release. However, given that people publish papers and tools using and referencing this database, a public historical record seems necessary for ensuring reproducibility.
The releases themselves are stored in the releases/ directory, named by date of access, source of access, and IMGT-provided release number. The majority of the pre-2022 historical releases were recovered from the Internet Archive's Wayback Machine ('WBM' label) using the scripts/harvest-wayback.py script. Some entries (missing most files) were incidentally banked by me using my tool IMGTgeneDL (producing reference sets for stitchr) ('JH-GeneDL'). Newer entries have been directly downloaded from GENE-DB itself ('GENEDB').
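For anyone wanting to look for further archived copies, the Wayback Machine exposes a public CDX index that lists the captures of a given URL. The following is a minimal sketch of querying it, and is not necessarily how scripts/harvest-wayback.py works; the function names are hypothetical and the page URL to search for is left to the caller.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(page_url):
    """Build a CDX query URL listing successful, de-duplicated captures of page_url."""
    params = {
        "url": page_url,
        "output": "json",            # rows come back as a JSON list of lists
        "filter": "statuscode:200",  # keep only captures that returned HTTP 200
        "collapse": "digest",        # skip adjacent captures with identical content
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def snapshot_urls(rows):
    """Turn CDX JSON rows (first row is the field names) into Wayback snapshot URLs."""
    if not rows:
        return []
    header, *records = rows
    ts = header.index("timestamp")
    orig = header.index("original")
    return [f"https://web.archive.org/web/{r[ts]}/{r[orig]}" for r in records]


def list_snapshots(page_url):
    """Fetch the capture list for page_url (needs network access)."""
    with urlopen(build_cdx_query(page_url)) as response:
        return snapshot_urls(json.load(response))
```

Calling list_snapshots() with the address of a GENE-DB page should return one snapshot URL per distinct capture, which can then be downloaded and checked for the release number it reports.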
Note that this is likely an incomplete record of published releases. However, the scripts/harvest-genedb.py script can be used to download the complete current release; I hope to run this script on a regular basis and bank those releases here as well, to provide an ongoing record.
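As an illustration of the kind of comparison this record is meant to enable, here is a minimal sketch that reports which entries were added, removed, or changed between two releases. It assumes the files being compared are FASTA-formatted and that the full header line identifies an entry; the function names are hypothetical and are not part of the scripts in this repo.

```python
def parse_fasta(text):
    """Return {header: sequence} from FASTA text, joining wrapped sequence lines."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def compare_releases(old_text, new_text):
    """Compare two FASTA texts by header (assumed to be a stable identifier)."""
    old, new = parse_fasta(old_text), parse_fasta(new_text)
    shared = set(old) & set(new)
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "changed": sorted(h for h in shared if old[h] != new[h]),
    }
```

The file contents can be supplied with pathlib.Path(...).read_text(), pointing at the equivalent file in two release directories. If headers carry fields that vary between releases, the comparison would need to key on a subset of the header instead.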
As stated on their website and in their publications, IMGT's policy regarding sharing of their data is that:
"... IMGT® software and data are provided to the academic users and NPO's (Not for Profit Organization(s)) under the CC BY-NC-ND 4.0 license. Any other use of IMGT® material, from the private sector, needs a financial arrangement with CNRS."