# Scheduled Integration of ClinGen Gene-Disease Validity Data into WikiData

ClinGen (Clinical Genome Resource) develops curated data on genetic associations between genes and diseases. <br>
License: CC0 (https://clinicalgenome.org/docs/terms-of-use/)

This scheduled bot uses WikidataIntegrator (WDI) to integrate ClinGen Gene-Disease Validity Data into WikiData. <br>
http://jenkins.sulab.org/ <br>
https://github.com/SuLab/GeneWikiCentral/issues/116 <br>
https://search.clinicalgenome.org/kb/gene-validity/ <br>

Python script contributions, in order: Sabah Ul-Hasan, Andra Waagmeester, Andrew Su, Ginger Tsueng

## Checks

- Login should pick up credentials automatically from the given environment
- The for loop looks up both the HGNC Qid and the MONDO Qid for each row (i.e., even if the HGNC Qid is absent or multiple, it still checks MONDO)
- The for loop handles the multiple-Qid case; tested with A2ML1 and corrected afterwards
- The for loop records the correct Qid for HGNC or MONDO whenever one is available
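
The lookup-and-check logic in the list above can be sketched without contacting Wikidata. In this minimal sketch, `build_query` and `classify_bindings` are hypothetical helper names, and the `bindings` lists only mimic the `results.bindings` array of a SPARQL JSON response; they are hand-written for illustration, not fetched.

```python
WD_ENTITY_PREFIX = "http://www.wikidata.org/entity/"

def build_query(variable, prop, identifier):
    """Return a SPARQL query selecting items whose `prop` equals `identifier`."""
    return 'SELECT * WHERE {?%s wdt:%s "%s"}' % (variable, prop, identifier)

def classify_bindings(bindings, variable):
    """Map a list of SPARQL bindings to a Qid, 'absent', or 'multiple'."""
    if len(bindings) == 1:
        return bindings[0][variable]["value"].replace(WD_ENTITY_PREFIX, "")
    return "absent" if not bindings else "multiple"

# Normalize the identifiers the same way the loop does
hgnc_id = "HGNC:23336".replace("HGNC:", "")
mondo_id = "MONDO_0007893".replace("_", ":")

gene_query = build_query("gene", "P354", hgnc_id)
disease_query = build_query("disease", "P5270", mondo_id)

# Hand-written stand-ins for query results (assumption, not live data)
one_hit = [{"gene": {"value": WD_ENTITY_PREFIX + "Q18051234"}}]
gene_qid = classify_bindings(one_hit, "gene")
disease_qid = classify_bindings([], "disease")

print(gene_query)
print(gene_qid, disease_qid)
```

Keeping the classification in one small function makes the three cases (one, none, several) easy to test separately from the network call.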

## Issues

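- Why does the output mark a row as "complete" when the corresponding edit does not appear in WikiData after the scheduled bot runs? One possible cause: in the loop as shown, the only `write(login)` calls sit inside the commented-out print statements, so Status is set to "complete" without anything being written.
- The `create_reference()` and `update_retrieved_if_new_multiple_refs` functions add and/or update the reference on an existing HGNC or MONDO value in a genetic association statement within a 180-day window (URLs from non-ClinGen sources are not overwritten)
- Status values should distinguish: updated, not updated, and skipped (not definitive, or mapping error)
- Maybe get rid of the Definitive column, but keep Gene QID and Disease QID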
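
One way to separate the outcomes listed above is to give each its own Status value instead of a shared "error". The sketch below is an assumption about how that could look, not the bot's current behavior; the function name `row_status` and the labels it returns are hypothetical.

```python
def row_status(classification, gene_qid, disease_qid):
    """Return a status label for one ClinGen row.

    gene_qid / disease_qid hold a Qid string, or 'absent' / 'multiple'
    when the identifier lookup did not yield exactly one item.
    """
    unmapped = ("absent", "multiple")
    if classification != "Definitive":
        return "skipped: not definitive"
    if gene_qid in unmapped or disease_qid in unmapped:
        return "mapping error"
    return "ready to write"

rows = [
    ("Definitive", "Q18034993", "Q5034093"),
    ("Moderate", "Q18035079", "absent"),
    ("Definitive", "Q18035079", "absent"),  # hypothetical combination
]
statuses = [row_status(*r) for r in rows]
print(statuses)
```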


### Relevant modules and libraries

In [4]:
# Installations by shell 
!pip install --upgrade pip # Upgrades pip to the latest version
!pip3 install tqdm # Progress-bar library
!pip3 install termcolor # For color-coding printed output
!pip3 install wikidataintegrator # For reading from and writing to Wikidata

Requirement already up-to-date: pip in /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages (19.3.1)


In [2]:
# Imports
from wikidataintegrator import wdi_core, wdi_login 
from wikidataintegrator.ref_handlers import update_retrieved_if_new_multiple_refs
from datetime import datetime
#from termcolor import colored 

import pandas as pd
import numpy as np

#import ssl
#ssl._create_default_https_context = ssl._create_unverified_context

import os
import copy 
import time 

### Login for running WDI

In [3]:
print("Logging in...") 

# Placeholder credentials: replace with your own ** to be updated to ProteinBoxBot
# Note: because they are set here, the check below always passes; remove these two lines to rely on the environment
os.environ["WDUSER"] = "username" # Uses os package to set the environment variable for the wikidata username
os.environ["WDPASS"] = "password"

# Raises an error if the credentials are not available as environment variables
if "WDUSER" in os.environ and "WDPASS" in os.environ: 
    WDUSER = os.environ['WDUSER']
    WDPASS = os.environ['WDPASS']
else: 
    raise ValueError("WDUSER and WDPASS must be specified in local.py or as environment variables")      

# Logs in and keeps the session as 'login'
login = wdi_login.WDLogin(WDUSER, WDPASS) 

Logging in...
https://www.wikidata.org/w/api.php
Successfully logged in as Sulhasan


### ClinGen gene-disease validity data

In [4]:
# Read as csv
df = pd.read_csv('https://search.clinicalgenome.org/kb/gene-validity.csv', skiprows=6, header=None)  

# Label column headings
df.columns = ['Gene', 'HGNC Gene ID', 'Disease', 'MONDO Disease ID','SOP','Classification','Report Reference URL','Report Date']

# Create time stamp of when downloaded (error if isoformat() used)
timeStringNow = datetime.now().strftime("+%Y-%m-%dT00:00:00Z")

df.head(6) # View first 6 rows

Unnamed: 0,Gene,HGNC Gene ID,Disease,MONDO Disease ID,SOP,Classification,Report Reference URL,Report Date
0,A2ML1,HGNC:23336,Noonan syndrome with multiple lentigines,MONDO_0007893,SOP5,No Reported Evidence,https://search.clinicalgenome.org/kb/gene-vali...,2018-06-07T14:37:47.175Z
1,A2ML1,HGNC:23336,cardiofaciocutaneous syndrome,MONDO_0015280,SOP5,No Reported Evidence,https://search.clinicalgenome.org/kb/gene-vali...,2018-06-07T14:31:03.696Z
2,A2ML1,HGNC:23336,Costello syndrome,MONDO_0009026,SOP5,No Reported Evidence,https://search.clinicalgenome.org/kb/gene-vali...,2018-06-07T14:34:05.324Z
3,A2ML1,HGNC:23336,Noonan syndrome,MONDO_0018997,SOP5,Disputed,https://search.clinicalgenome.org/kb/gene-vali...,2018-06-07T14:23:53.157Z
4,A2ML1,HGNC:23336,Noonan syndrome-like disorder with loose anage...,MONDO_0011899,SOP5,No Reported Evidence,https://search.clinicalgenome.org/kb/gene-vali...,2018-06-07T14:40:11.599Z
5,AARS,HGNC:20,undetermined early-onset epileptic encephalopathy,MONDO_0018614,SOP6,Limited,https://search.clinicalgenome.org/kb/gene-vali...,2018-11-20T17:00:00.000Z
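
The time stamp in the cell above is built with `strftime` rather than `isoformat()`, since the `+YYYY-MM-DDT00:00:00Z` form passed to `WDTime` in this notebook has a leading `+` and a zeroed time part. A minimal sketch comparing the two on a fixed date (the helper name `wikidata_day_timestamp` and the sample date are made up for illustration):

```python
from datetime import datetime

def wikidata_day_timestamp(moment):
    """Format a datetime at day precision, as used for the 'retrieved' reference."""
    return moment.strftime("+%Y-%m-%dT00:00:00Z")

sample = datetime(2019, 11, 8, 17, 30, 5)
wd_time = wikidata_day_timestamp(sample)
iso_time = sample.isoformat()

print(wd_time)   # +2019-11-08T00:00:00Z
print(iso_time)  # 2019-11-08T17:30:05
```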


In [5]:
### Create empty columns for output file (ignore warnings)

df['Status'] = "pending" # Starts as 'pending' for every row; later set to 'error', 'complete', or 'skipped' (meaning previously logged within 180 days)
df['Definitive'] = "" # To be filled with 'yes' or 'no'
df['Gene QID'] = "" # To be filled with the Qid, or with 'absent' or 'multiple'
df['Disease QID'] = "" # To be filled with the Qid, or with 'absent' or 'multiple'

df.head(6)

Unnamed: 0,Gene,HGNC Gene ID,Disease,MONDO Disease ID,SOP,Classification,Report Reference URL,Report Date,Status,Definitive,Gene QID,Disease QID
0,A2ML1,HGNC:23336,Noonan syndrome with multiple lentigines,MONDO_0007893,SOP5,No Reported Evidence,https://search.clinicalgenome.org/kb/gene-vali...,2018-06-07T14:37:47.175Z,pending,,,
1,A2ML1,HGNC:23336,cardiofaciocutaneous syndrome,MONDO_0015280,SOP5,No Reported Evidence,https://search.clinicalgenome.org/kb/gene-vali...,2018-06-07T14:31:03.696Z,pending,,,
2,A2ML1,HGNC:23336,Costello syndrome,MONDO_0009026,SOP5,No Reported Evidence,https://search.clinicalgenome.org/kb/gene-vali...,2018-06-07T14:34:05.324Z,pending,,,
3,A2ML1,HGNC:23336,Noonan syndrome,MONDO_0018997,SOP5,Disputed,https://search.clinicalgenome.org/kb/gene-vali...,2018-06-07T14:23:53.157Z,pending,,,
4,A2ML1,HGNC:23336,Noonan syndrome-like disorder with loose anage...,MONDO_0011899,SOP5,No Reported Evidence,https://search.clinicalgenome.org/kb/gene-vali...,2018-06-07T14:40:11.599Z,pending,,,
5,AARS,HGNC:20,undetermined early-onset epileptic encephalopathy,MONDO_0018614,SOP6,Limited,https://search.clinicalgenome.org/kb/gene-vali...,2018-11-20T17:00:00.000Z,pending,,,


In [6]:
### Create a function for adding references, to be called inside the loop: "create_reference()"

def create_reference(): # Takes no arguments; reads the global 'df' and the loop variable 'index', so call it only inside the loop
    refStatedIn = wdi_core.WDItemID(value="Q64403342", prop_nr="P248", is_reference=True) # ClinGen Qid = Q64403342, 'stated in' Pid = P248 
    timeStringNow = datetime.now().strftime("+%Y-%m-%dT00:00:00Z") # Create time stamp of when downloaded (error if isoformat() used)
    refRetrieved = wdi_core.WDTime(timeStringNow, prop_nr="P813", is_reference=True) # Uses 'timeStringNow' from the line above, 'retrieved' Pid = P813
    refURL = wdi_core.WDUrl(df.loc[index, 'Report Reference URL'], prop_nr="P854", is_reference=True) # 'reference URL' Pid = P854
    return [refStatedIn, refRetrieved, refURL]
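
The Issues section notes that the reference handler works on a 180-day window. The sketch below illustrates only such an age check on a 'retrieved' time stamp in the format produced by `create_reference()`; it is not the code of `update_retrieved_if_new_multiple_refs`, and the helper name `is_older_than` and the sample dates are hypothetical.

```python
from datetime import datetime, timedelta

def is_older_than(retrieved, now, days=180):
    """Return True if a '+YYYY-MM-DDT00:00:00Z' time stamp is more than `days` old."""
    retrieved_dt = datetime.strptime(retrieved, "+%Y-%m-%dT00:00:00Z")
    return now - retrieved_dt > timedelta(days=days)

now = datetime(2020, 1, 1)
old = is_older_than("+2019-01-01T00:00:00Z", now)     # a year earlier
recent = is_older_than("+2019-11-08T00:00:00Z", now)  # under two months earlier

print(old, recent)
```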

### For loop that executes the following through each row of the dataframe 

In [7]:
start_time = time.time() # Keep track of how long it takes loop to run

for index, row in df.iterrows(): # Index is a row number, row is all variables and values for that row
        
    # Identify the string in the Gene or Disease column for a given row
    HGNC = df.loc[index, 'HGNC Gene ID'].replace("HGNC:", "") # .replace() strips the "HGNC:" prefix for the SPARQL query
    MONDO = df.loc[index, 'MONDO Disease ID'].replace("_", ":") # e.g. MONDO_0007893 becomes MONDO:0007893
    
    # SPARQL query to search for Gene or Disease in Wikidata based on HGNC ID (P354) or MONDO ID (P5270)
    sparqlQuery_HGNC = "SELECT * WHERE {?gene wdt:P354 \""+HGNC+"\"}" 
    result_HGNC = wdi_core.WDItemEngine.execute_sparql_query(sparqlQuery_HGNC) # Resultant query
    sparqlQuery_MONDO = "SELECT * WHERE {?disease wdt:P5270 \""+MONDO+"\"}" 
    result_MONDO = wdi_core.WDItemEngine.execute_sparql_query(sparqlQuery_MONDO)
    
    # Assign resultant length of dictionary for either Gene or Disease (number of Qid)
    HGNC_qlength = len(result_HGNC["results"]["bindings"]) 
    MONDO_qlength = len(result_MONDO["results"]["bindings"])
    
    # Conditional utilizing length value for output table, accounts for absent/present combos
    if HGNC_qlength == 1:
        HGNC_qid = result_HGNC["results"]["bindings"][0]["gene"]["value"].replace("http://www.wikidata.org/entity/", "")
        df.at[index, 'Gene QID'] = HGNC_qid # Input HGNC Qid in 'Gene QID' cell  
    if HGNC_qlength < 1: # If no Qid
        df.at[index, 'Status'] = "error" 
        df.at[index, 'Gene QID'] = "absent"  
    if HGNC_qlength > 1: # If multiple Qid
        df.at[index, 'Status'] = "error" 
        df.at[index, 'Gene QID'] = "multiple"
        
    if MONDO_qlength == 1:
        MONDO_qid = result_MONDO["results"]["bindings"][0]["disease"]["value"].replace("http://www.wikidata.org/entity/", "") 
        df.at[index, 'Disease QID'] = MONDO_qid  
    if MONDO_qlength < 1: 
        df.at[index, 'Status'] = "error" 
        df.at[index, 'Disease QID'] = "absent" 
    if MONDO_qlength > 1:
        df.at[index, 'Status'] = "error" 
        df.at[index, 'Disease QID'] = "multiple" 
        
    # Only rows with Classification = 'Definitive' are written; all other rows are flagged and skipped
    if row['Classification']!='Definitive': # If the string is NOT 'Definitive' for the Classification column
        df.at[index, 'Status'] = "error" # Then input "error" in the Status column
        df.at[index, 'Definitive'] = "no" # And 'no' for Definitive column
        continue # Skips rest and goes to next row
    else: # Otherwise
        df.at[index, 'Definitive'] = "yes" # Input 'yes' for Definitive column, go to next step
  
    # Conditional continues to write into WikiData only if 1 Qid for each + Definitive classification 
    if HGNC_qlength == 1 and MONDO_qlength == 1: # 'and', not '&': '&' binds tighter than '==' and would also pass for e.g. 3 MONDO matches
        
        # Call upon create_reference() function created   
        reference = create_reference() 
        
        # Add disease value to gene item page, and gene value to disease item page (symmetry)
        
        # Creates 'genetic association' statement (P2293) whether or not it's already there, and includes the references
        statement_HGNC = [wdi_core.WDItemID(value=MONDO_qid, prop_nr="P2293", references=[copy.deepcopy(reference)])] 
        wikidata_HGNCitem = wdi_core.WDItemEngine(wd_item_id=HGNC_qid, 
                                                  data=statement_HGNC, 
                                                  global_ref_mode='CUSTOM', # Hands reference handling to ref_handler, which applies the 180-day window
                                                  ref_handler=update_retrieved_if_new_multiple_refs, 
                                                  append_value=["P2293"])
        wikidata_HGNCitem.get_wd_json_representation() # Returns the JSON structure that is submitted to the API, helpful for debugging 
        
        statement_MONDO = [wdi_core.WDItemID(value=HGNC_qid, prop_nr="P2293", references=[copy.deepcopy(reference)])] 
        wikidata_MONDOitem = wdi_core.WDItemEngine(wd_item_id=MONDO_qid, 
                                                   data=statement_MONDO, 
                                                   global_ref_mode='CUSTOM',
                                                   ref_handler=update_retrieved_if_new_multiple_refs, 
                                                   append_value=["P2293"])
        wikidata_MONDOitem.get_wd_json_representation()
        
        # Keep names for the log message, and enter 'complete' in Status column
        # Note: the write(login) calls below are commented out, so 'complete' is set here even though nothing is written
        HGNC_name = df.loc[index, 'Gene'] # To output gene name > HGNC ID
        MONDO_name = df.loc[index, 'Disease']
        df.at[index, 'Status'] = "complete" 
        
        #print(colored(HGNC_name,"blue"), "Gene with HGNC ID", 
        #      colored(HGNC,"blue"), "logged as Qid",
        #      colored(wikidata_HGNCitem.write(login),"blue"),
        #      "and")
        #print(colored(MONDO_name,"green"), "Disease with MONDO ID", 
        #      colored(MONDO,"green"), "logged as Qid",
        #      colored(wikidata_MONDOitem.write(login),"green"))
        
        
end_time = time.time() # Captures when loop run ends
print("The total time of this loop is:", end_time - start_time, "seconds, or", (end_time - start_time)/60, "minutes")

# Write output to a .csv file
now = datetime.now() # Retrieves current time and saves it as 'now'
# File name includes an ISO 8601 time stamp, yyyy-mm-ddThh:mm:ss.ffffff (https://en.wikipedia.org/wiki/ISO_8601)
df.to_csv("ClinGenBot_Status-Output_" + now.isoformat() + ".csv")  # isoformat() contains colons, which some file systems reject
df

ABCC9 Gene with HGNC ID 60 logged as Qid Q18034993 and
hypertrichotic osteochondrodysplasia Cantu type Disease with MONDO ID MONDO:0009406 logged as Qid Q5034093
ABCD1 Gene with HGNC ID 61 logged as Qid Q14912808 and
X-linked cerebral adrenoleukodystrophy Disease with MONDO ID MONDO:0010247 logged as Qid Q55345732
ABHD12 Gene with HGNC ID 15868 logged as Qid Q18038087 and
PHARC syndrome Disease with MONDO ID MONDO:0012984 logged as Qid Q32137273
ACAD8 Gene with HGNC ID 87 logged as Qid Q18038564 and
isobutyryl-CoA dehydrogenase deficiency Disease with MONDO ID MONDO:0012648 logged as Qid Q6085391
ACAD9 Gene with HGNC ID 21497 logged as Qid Q18039534 and
acyl-CoA dehydrogenase 9 deficiency Disease with MONDO ID MONDO:0012624 logge

BRIP1 Gene with HGNC ID 20473 logged as Qid Q18047355 and
Fanconi anemia complementation group j Disease with MONDO ID MONDO:0012187 logged as Qid Q32147098
BRWD3 Gene with HGNC ID 17342 logged as Qid Q18053726 and
X-linked syndromic intellectual disability Disease with MONDO ID MONDO:0020119 logged as Qid Q8041560
BSND Gene with HGNC ID 16512 logged as Qid Q18032616 and
Bartter disease type 4a Disease with MONDO ID MONDO:0011242 logged as Qid Q27674850
BTD Gene with HGNC ID 1122 logged as Qid Q18021004 and
biotinidase deficiency Disease with MONDO ID MONDO:0009665 logged as Qid Q776026
BUB1B Gene with HGNC ID 1149 logged as Qid Q17853927 and
mosaic variegated aneuploidy syndrome 1 Disease with MONDO ID MONDO:0009759 logged as Qi

exostoses, multiple, type 1 Disease with MONDO ID MONDO:0007585 logged as Qid Q55950215
EXT2 Gene with HGNC ID 3513 logged as Qid Q17917699 and
exostoses, multiple, type 2 Disease with MONDO ID MONDO:0007586 logged as Qid Q55950216
EYA1 Gene with HGNC ID 3519 logged as Qid Q17917742 and
branchio-oto-renal syndrome Disease with MONDO ID MONDO:0007029 logged as Qid Q2280106
EYA4 Gene with HGNC ID 3522 logged as Qid Q17917387 and
nonsyndromic genetic deafness Disease with MONDO ID MONDO:0019497 logged as Qid Q9079046
F5 Gene with HGNC ID 3542 logged as Qid Q14865116 and
thrombophilia due to activated protein C resistance Disease with MONDO ID MONDO:0008560 logged as Qid Q296104
FANCC Gene with HGNC ID 3584 logged as Qid Q182505

IVD Gene with HGNC ID 6186 logged as Qid Q3195538 and
isovaleric acidemia Disease with MONDO ID MONDO:0009475 logged as Qid Q3278042
KCNQ2 Gene with HGNC ID 6296 logged as Qid Q14914307 and
undetermined early-onset epileptic encephalopathy Disease with MONDO ID MONDO:0018614 logged as Qid Q56014174
KCNQ4 Gene with HGNC ID 6298 logged as Qid Q18033856 and
nonsyndromic genetic deafness Disease with MONDO ID MONDO:0019497 logged as Qid Q9079046
KDM5C Gene with HGNC ID 11114 logged as Qid Q18032784 and
X-linked syndromic intellectual disability Disease with MONDO ID MONDO:0020119 logged as Qid Q8041560
KLHL40 Gene with HGNC ID 30372 logged as Qid Q18050101 and
nemaline myopathy 8 Disease with MONDO ID MONDO:0014138 logged as Qid

autosomal recessive nonsyndromic deafness 9 Disease with MONDO ID MONDO:0010986 logged as Qid Q28024662
OTOG Gene with HGNC ID 8516 logged as Qid Q18055453 and
nonsyndromic genetic deafness Disease with MONDO ID MONDO:0019497 logged as Qid Q9079046
OTOGL Gene with HGNC ID 26901 logged as Qid Q18054103 and
nonsyndromic genetic deafness Disease with MONDO ID MONDO:0019497 logged as Qid Q9079046
PAK3 Gene with HGNC ID 8592 logged as Qid Q18030342 and
X-linked syndromic intellectual disability Disease with MONDO ID MONDO:0020119 logged as Qid Q8041560
PALB2 Gene with HGNC ID 26144 logged as Qid Q18046302 and
Fanconi anemia complementation group N Disease with MONDO ID MONDO:0012565 logged as Qid Q32147054
PCDH19 Gene with HGNC ID 14270

hereditary pheochromocytoma-paraganglioma Disease with MONDO ID MONDO:0017366 logged as Qid Q56013982
SHANK3 Gene with HGNC ID 14294 logged as Qid Q18048003 and
Phelan McDermid syndrome Disease with MONDO ID MONDO:0011652 logged as Qid Q1926345
SHOC2 Gene with HGNC ID 15454 logged as Qid Q18032712 and
Noonan syndrome-like disorder with loose anagen hair Disease with MONDO ID MONDO:0011899 logged as Qid Q55783530
SIX1 Gene with HGNC ID 10887 logged as Qid Q18031509 and
branchio-oto-renal syndrome Disease with MONDO ID MONDO:0007029 logged as Qid Q2280106
SLC16A2 Gene with HGNC ID 10923 logged as Qid Q18031570 and
Allan-Herndon-Dudley syndrome Disease with MONDO ID MONDO:0010354 logged as Qid Q4731121
SLC22A5 Gene with HGNC ID 10969

UBE2A Gene with HGNC ID 12472 logged as Qid Q18032252 and
syndromic X-linked intellectual disability Nascimento type Disease with MONDO ID MONDO:0010461 logged as Qid Q28065624
UBE3A Gene with HGNC ID 12496 logged as Qid Q14878336 and
Angelman syndrome Disease with MONDO ID MONDO:0007113 logged as Qid Q535364
VHL Gene with HGNC ID 12687 logged as Qid Q14905473 and
hereditary pheochromocytoma-paraganglioma Disease with MONDO ID MONDO:0017366 logged as Qid Q56013982
VPS13B Gene with HGNC ID 2183 logged as Qid Q7907439 and
Cohen syndrome Disease with MONDO ID MONDO:0008999 logged as Qid Q1107087
WAS Gene with HGNC ID 12731 logged as Qid Q14863971 and
Wiskott-Aldrich syndrome Disease with MONDO ID MONDO:0010518 logged as Qid Q95

Unnamed: 0,Gene,HGNC Gene ID,Disease,MONDO Disease ID,SOP,Classification,Report Reference URL,Report Date,Status,Definitive,Gene QID,Disease QID
0,A2ML1,HGNC:23336,Noonan syndrome with multiple lentigines,MONDO_0007893,SOP5,No Reported Evidence,https://search.clinicalgenome.org/kb/gene-vali...,2018-06-07T14:37:47.175Z,error,no,Q18051234,absent
1,A2ML1,HGNC:23336,cardiofaciocutaneous syndrome,MONDO_0015280,SOP5,No Reported Evidence,https://search.clinicalgenome.org/kb/gene-vali...,2018-06-07T14:31:03.696Z,error,no,Q18051234,absent
2,A2ML1,HGNC:23336,Costello syndrome,MONDO_0009026,SOP5,No Reported Evidence,https://search.clinicalgenome.org/kb/gene-vali...,2018-06-07T14:34:05.324Z,error,no,Q18051234,Q1136492
3,A2ML1,HGNC:23336,Noonan syndrome,MONDO_0018997,SOP5,Disputed,https://search.clinicalgenome.org/kb/gene-vali...,2018-06-07T14:23:53.157Z,error,no,Q18051234,absent
4,A2ML1,HGNC:23336,Noonan syndrome-like disorder with loose anage...,MONDO_0011899,SOP5,No Reported Evidence,https://search.clinicalgenome.org/kb/gene-vali...,2018-06-07T14:40:11.599Z,error,no,Q18051234,Q55783530
5,AARS,HGNC:20,undetermined early-onset epileptic encephalopathy,MONDO_0018614,SOP6,Limited,https://search.clinicalgenome.org/kb/gene-vali...,2018-11-20T17:00:00.000Z,error,no,Q17707208,Q56014174
6,AASS,HGNC:17366,hyperlysinemia (disease),MONDO_0009388,SOP6,Moderate,https://search.clinicalgenome.org/kb/gene-vali...,2019-11-08T17:00:00.000Z,error,no,Q18035079,absent
7,ABCC9,HGNC:60,hypertrichotic osteochondrodysplasia Cantu type,MONDO_0009406,SOP4,Definitive,https://search.clinicalgenome.org/kb/gene-vali...,2017-09-27T00:00:00,complete,yes,Q18034993,Q5034093
8,ABCD1,HGNC:61,X-linked cerebral adrenoleukodystrophy,MONDO_0010247,SOP4,Definitive,https://search.clinicalgenome.org/kb/gene-vali...,2018-02-07T14:00:00,complete,yes,Q14912808,Q55345732
9,ABHD12,HGNC:15868,PHARC syndrome,MONDO_0012984,SOP5,Definitive,https://search.clinicalgenome.org/kb/gene-vali...,2018-06-28T16:45:15.791Z,complete,yes,Q18038087,Q32137273
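
A quick way to summarize a run is to tally the Status column of the output file. This sketch uses only the standard library and a small inline sample shaped like the table above, with three columns kept for brevity; in practice the rows would come from the written CSV file, and the variable names here are illustrative.

```python
import csv
import io
from collections import Counter

# Inline stand-in for the output CSV (a few rows taken from the table above)
sample_csv = """Gene,Classification,Status
A2ML1,Disputed,error
AARS,Limited,error
ABCC9,Definitive,complete
ABCD1,Definitive,complete
"""

reader = csv.DictReader(io.StringIO(sample_csv))
status_counts = Counter(row["Status"] for row in reader)

for status, count in sorted(status_counts.items()):
    print(status, count)
```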
