# Gene–Disease Network Enrichment with UniProt & BioPortal


## Table of Contents
1. [Environment Setup](#setup)
2. [Imports](#imports)
3. [Configuration & Parameters](#config)
4. [Load Graph](#load-graph)
5. [Enrich Genes with UniProt](#enrich-genes)
6. [Enrich Diseases with MeSH/OMIM](#enrich-diseases)
7. [Visualization](#visualization)
8. [Save & Export](#save-export)
9. [Next Steps](#next-steps)


## 1. Environment Setup <a id='setup'></a>
Install required libraries

In [20]:
!pip install psycopg2-binary pandas


Collecting psycopg2-binary
  Downloading psycopg2_binary-2.9.10-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.9 kB)
Downloading psycopg2_binary-2.9.10-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB)
[?25l   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/3.0 MB[0m [31m?[0m eta [36m-:--:--[0m[2K   [91m━━━━[0m[91m╸[0m[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.4/3.0 MB[0m [31m10.5 MB/s[0m eta [36m0:00:01[0m[2K   [91m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[91m╸[0m [32m3.0/3.0 MB[0m [31m51.2 MB/s[0m eta [36m0:00:01[0m[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m3.0/3.0 MB[0m [31m39.2 MB/s[0m eta [36m0:00:00[0m
[?25hInstalling collected packages: psycopg2-binary
Successfully installed psycopg2-binary-2.9.10


In [28]:
!pip install gql[requests]


Collecting gql[requests]
  Downloading gql-3.5.3-py2.py3-none-any.whl.metadata (9.4 kB)
Collecting graphql-core<3.2.7,>=3.2 (from gql[requests])
  Downloading graphql_core-3.2.6-py3-none-any.whl.metadata (11 kB)
Collecting backoff<3.0,>=1.11.1 (from gql[requests])
  Downloading backoff-2.2.1-py3-none-any.whl.metadata (14 kB)
Downloading backoff-2.2.1-py3-none-any.whl (15 kB)
Downloading graphql_core-3.2.6-py3-none-any.whl (203 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m203.4/203.4 kB[0m [31m9.9 MB/s[0m eta [36m0:00:00[0m
[?25hDownloading gql-3.5.3-py2.py3-none-any.whl (74 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m74.3/74.3 kB[0m [31m6.8 MB/s[0m eta [36m0:00:00[0m
[?25hInstalling collected packages: graphql-core, backoff, gql
Successfully installed backoff-2.2.1 gql-3.5.3 graphql-core-3.2.6


In [32]:
!pip install mygene


Collecting mygene
  Downloading mygene-3.2.2-py2.py3-none-any.whl.metadata (10 kB)
Collecting biothings-client>=0.2.6 (from mygene)
  Downloading biothings_client-0.4.1-py3-none-any.whl.metadata (10 kB)
Downloading mygene-3.2.2-py2.py3-none-any.whl (5.4 kB)
Downloading biothings_client-0.4.1-py3-none-any.whl (46 kB)
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m46.7/46.7 kB[0m [31m3.4 MB/s[0m eta [36m0:00:00[0m
[?25hInstalling collected packages: biothings-client, mygene
Successfully installed biothings-client-0.4.1 mygene-3.2.2


In [37]:
# --- Setup ---
!pip install -q mygene gql[requests] tqdm

import mygene
from tqdm import tqdm
from gql import gql, Client
from gql.transport.requests import RequestsHTTPTransport

# --- Connect to Open Targets GraphQL ---
transport = RequestsHTTPTransport(
    url="https://api.platform.opentargets.org/api/v4/graphql",
    verify=True,
    retries=3,
)
client = Client(transport=transport, fetch_schema_from_transport=True)

# --- Step 1: Map Entrez Gene IDs to Ensembl IDs and HGNC Symbols ---
print("🔎 Mapping Gene::#### nodes to Ensembl IDs...")

mg = mygene.MyGeneInfo()
gene_ids = [n.split("::")[1] for n, d in G.nodes(data=True)
            if d.get("label") == "Gene" and "::" in n]

entrez_to_ensembl = {}
for i in tqdm(range(0, len(gene_ids), 1000)):
    chunk = gene_ids[i:i+1000]
    results = mg.querymany(chunk, scopes='entrezgene', fields='symbol,ensembl.gene', species='human')
    for r in results:
        if "ensembl" in r and "symbol" in r:
            ens = r["ensembl"]
            if isinstance(ens, dict):
                entrez_to_ensembl[r["query"]] = {
                    "symbol": r["symbol"],
                    "ensembl_id": ens["gene"]
                }

# Apply mappings to your graph
for node in G.nodes:
    if node.startswith("Gene::"):
        entrez = node.split("::")[1]
        info = entrez_to_ensembl.get(entrez)
        if info:
            G.nodes[node]["symbol"] = info["symbol"]
            G.nodes[node]["ensembl_id"] = info["ensembl_id"]

print(f"✅ Mapped {len(entrez_to_ensembl)} genes with Ensembl IDs.")

# --- Step 2: Enrich with Open Targets Associations ---
print("🔗 Enriching gene–disease links from Open Targets...")

added_diseases = 0
added_edges = 0
skipped = 0

# Snapshot to avoid iteration issues
gene_nodes = [(n, d.get("ensembl_id")) for n, d in G.nodes(data=True)
              if d.get("label") == "Gene" and d.get("ensembl_id")]

for node, ensembl_id in tqdm(gene_nodes[:500]):  # You can raise the limit as needed
    query = gql(f"""
    query {{
      target(ensemblId: "{ensembl_id}") {{
        associatedDiseases(page: {{ index: 0, size: 10 }}) {{
          rows {{
            disease {{ id name }}
            score
          }}
        }}
      }}
    }}
    """)

    try:
        result = client.execute(query)
        target_block = result.get("target")
        if not target_block:
            skipped += 1
            continue

        rows = target_block.get("associatedDiseases", {}).get("rows", [])
        for row in rows:
            raw_id = row["disease"]["id"]
            disease_id = f"Disease::{raw_id}"
            disease_name = row["disease"]["name"]
            score = row.get("score", 0)

            # Always assign disease label and name
            if not G.has_node(disease_id):
                G.add_node(disease_id, label="Disease", name=disease_name)
                added_diseases += 1
            else:
                G.nodes[disease_id]["label"] = "Disease"
                G.nodes[disease_id]["name"] = disease_name

            if not G.has_edge(node, disease_id):
                G.add_edge(node, disease_id, relation="associated_with", score=score)
                added_edges += 1

    except Exception as e:
        print(f"⚠️ Error for {ensembl_id}: {e}")
        skipped += 1

print(f"\n✅ Enrichment complete.")
print(f"📦 Added {added_diseases} disease nodes and 🔗 {added_edges} edges.")
print(f"⏭️ Skipped {skipped} queries.")


🔎 Mapping Gene::#### nodes to Ensembl IDs...


INFO:biothings.client:querying 1-1000 ...
INFO:biothings.client:Finished.
INFO:biothings.client:Pass "returnall=True" to return complete lists of duplicate or missing query terms.
INFO:biothings.client:querying 1-1000 ...
INFO:biothings.client:Finished.
INFO:biothings.client:Pass "returnall=True" to return complete lists of duplicate or missing query terms.
INFO:biothings.client:querying 1-1000 ...
INFO:biothings.client:Finished.
INFO:biothings.client:Pass "returnall=True" to return complete lists of duplicate or missing query terms.
INFO:biothings.client:querying 1-1000 ...
INFO:biothings.client:Finished.
INFO:biothings.client:Pass "returnall=True" to return complete lists of duplicate or missing query terms.
INFO:biothings.client:querying 1-1000 ...
INFO:biothings.client:Finished.
INFO:biothings.client:Pass "returnall=True" to return complete lists of duplicate or missing query terms.
INFO:biothings.client:querying 1-1000 ...
INFO:biothings.client:Finished.
INFO:biothings.client:Pass

✅ Mapped 17910 genes with Ensembl IDs.
🔗 Enriching gene–disease links from Open Targets...


100%|██████████| 500/500 [01:53<00:00,  4.41it/s]


✅ Enrichment complete.
📦 Added 1197 disease nodes and 🔗 3807 edges.
⏭️ Skipped 3 queries.





In [38]:
!pip install -q fuzzywuzzy python-Levenshtein


[?25l   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/161.7 kB[0m [31m?[0m eta [36m-:--:--[0m[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m161.7/161.7 kB[0m [31m8.3 MB/s[0m eta [36m0:00:00[0m
[?25h[?25l   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/3.1 MB[0m [31m?[0m eta [36m-:--:--[0m[2K   [91m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[91m╸[0m[90m━━━━━━━━━[0m [32m2.4/3.1 MB[0m [31m62.3 MB/s[0m eta [36m0:00:01[0m[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m3.1/3.1 MB[0m [31m40.4 MB/s[0m eta [36m0:00:00[0m
[?25h

## 2. Imports <a id='imports'></a>

In [3]:
import pandas as pd
import networkx as nx

# Load your downloaded files
nodes_df = pd.read_csv("hetionet-nodes.tsv", sep='\t')
edges_df = pd.read_csv("hetionet-edges.sif.gz", sep='\t')

print(f"✅ Nodes loaded: {len(nodes_df)}")
print(f"✅ Edges loaded: {len(edges_df)}")

# Build the graph
G = nx.MultiDiGraph()

# Add nodes
for _, row in nodes_df.iterrows():
    G.add_node(row["id"], label=row["kind"], name=row["name"])

# Add edges
for _, row in edges_df.iterrows():
    G.add_edge(row["source"], row["target"], relation=row["metaedge"])

print(f"✅ Final graph: {G.number_of_nodes()} nodes, {G.number_of_edges()} edges")


✅ Nodes loaded: 47031
✅ Edges loaded: 2250197
✅ Final graph: 47031 nodes, 2250197 edges


In [4]:
import requests
import time

def enrich_genes_with_uniprot(G, max_genes=200):
    print("🧬 Enriching genes from UniProt...")

    # Filter gene nodes
    gene_nodes = [n for n, d in G.nodes(data=True) if d.get("label") == "Gene"]
    gene_nodes = gene_nodes[:max_genes]  # limit for speed

    for node_id in gene_nodes:
        gene_name = G.nodes[node_id].get("name")
        if not gene_name:
            continue

        url = f"https://rest.uniprot.org/uniprotkb/search?query=gene_exact:{gene_name}&format=json&size=1"
        try:
            response = requests.get(url)
            data = response.json()

            if data.get("results"):
                record = data["results"][0]
                G.nodes[node_id]["uniprot_id"] = record.get("primaryAccession", "")
                G.nodes[node_id]["uniprot_description"] = record.get("proteinDescription", {}).get("recommendedName", {}).get("fullName", {}).get("value", "")
                synonyms = record.get("gene", [{}])[0].get("synonyms", [])
                G.nodes[node_id]["gene_synonyms"] = [s.get("value") for s in synonyms]

        except Exception as e:
            print(f"⚠️ {gene_name}: {e}")

        time.sleep(0.1)  # avoid rate limit

    print("✅ Gene enrichment complete.")


In [15]:
import requests
import time
from tqdm import tqdm

# MONDO Normalizer endpoint
normalizer_url = "https://name-resolution-normalizer.prod.monarchinitiative.org/resolve"

# Get first 200 disease nodes
disease_nodes = [n for n, d in G.nodes(data=True) if d.get("label") == "Disease"]
disease_nodes = disease_nodes[:200]

added_nodes = 0
added_edges = 0
linked_pairs = []

headers = {
    "Accept": "application/json",
    "Content-Type": "application/json"
}

# Loop through each disease node and resolve to MONDO
for node_id in tqdm(disease_nodes, desc="🌐 MONDO Normalizer Matching"):
    disease_name = G.nodes[node_id].get("name", "")
    if not disease_name:
        continue

    try:
        payload = {"curie": "", "input": disease_name}
        response = requests.post(normalizer_url, json=payload, headers=headers, timeout=6)
        if response.status_code == 200:
            results = response.json().get("normalized_info", [])
            if results:
                best = results[0]
                mondo_curie = best["id"]["identifier"]  # e.g. MONDO:0005294
                mondo_id = f"MONDO::{mondo_curie.split(':')[-1]}"
                mondo_label = best.get("label", "")

                if not G.has_node(mondo_id):
                    G.add_node(mondo_id,
                               label="Ontology",
                               name=mondo_label,
                               source="MONDO",
                               original_uri=mondo_curie)
                    added_nodes += 1

                if not G.has_edge(node_id, mondo_id):
                    G.add_edge(node_id, mondo_id, relation="has_ontology_term")
                    added_edges += 1
                    linked_pairs.append((G.nodes[node_id]["name"], mondo_label, mondo_id))

    except Exception as e:
        print(f"⚠️ {disease_name}: {e}")
    time.sleep(0.25)

# Summary
print(f"\n✅ MONDO Normalizer enrichment complete.")
print(f"📦 Added {added_nodes} MONDO nodes and 🔗 {added_edges} edges.\n")

for disease, mondo_label, mondo_node in linked_pairs[:10]:
    print(f"🔗 {disease} → {mondo_label} → {mondo_node}")


🌐 MONDO Normalizer Matching:   0%|          | 0/137 [00:00<?, ?it/s]

⚠️ idiopathic pulmonary fibrosis: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd669a950>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:   1%|          | 1/137 [00:00<00:40,  3.37it/s]

⚠️ restless legs syndrome: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66a8bd0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:   1%|▏         | 2/137 [00:00<00:39,  3.43it/s]

⚠️ alcohol dependence: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66aaa50>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:   2%|▏         | 3/137 [00:00<00:37,  3.59it/s]

⚠️ nicotine dependence: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd6698a90>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:   3%|▎         | 4/137 [00:01<00:36,  3.68it/s]

⚠️ lymphatic system cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66abdd0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:   4%|▎         | 5/137 [00:01<00:35,  3.73it/s]

⚠️ pharynx cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b1810>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:   4%|▍         | 6/137 [00:01<00:34,  3.76it/s]

⚠️ duodenum cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b3810>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:   5%|▌         | 7/137 [00:01<00:34,  3.79it/s]

⚠️ ileum cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b9610>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:   6%|▌         | 8/137 [00:02<00:33,  3.80it/s]

⚠️ leprosy: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66bb3d0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:   7%|▋         | 9/137 [00:02<00:33,  3.81it/s]

⚠️ prostate cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b3550>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:   7%|▋         | 10/137 [00:02<00:33,  3.81it/s]

⚠️ stomach cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b1150>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:   8%|▊         | 11/137 [00:02<00:32,  3.82it/s]

⚠️ celiac disease: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66ab950>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:   9%|▉         | 12/137 [00:03<00:32,  3.82it/s]

⚠️ Alzheimer's disease: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66aa990>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:   9%|▉         | 13/137 [00:03<00:32,  3.82it/s]

⚠️ hypertension: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd669b210>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  10%|█         | 14/137 [00:03<00:32,  3.82it/s]

⚠️ nasal cavity cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c0310>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  11%|█         | 15/137 [00:03<00:31,  3.82it/s]

⚠️ age related macular degeneration: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd669a290>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  12%|█▏        | 16/137 [00:04<00:31,  3.83it/s]

⚠️ attention deficit hyperactivity disorder: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66a9210>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  12%|█▏        | 17/137 [00:04<00:31,  3.83it/s]

⚠️ intracranial aneurysm: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66a8610>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  13%|█▎        | 18/137 [00:04<00:31,  3.83it/s]

⚠️ membranous glomerulonephritis: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b2090>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  14%|█▍        | 19/137 [00:05<00:30,  3.83it/s]

⚠️ urinary bladder cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b3590>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  15%|█▍        | 20/137 [00:05<00:30,  3.81it/s]

⚠️ Gilles de la Tourette syndrome: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c1c10>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  15%|█▌        | 21/137 [00:05<00:30,  3.82it/s]

⚠️ sarcoma: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b2e10>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  16%|█▌        | 22/137 [00:05<00:30,  3.80it/s]

⚠️ appendix cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b12d0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  17%|█▋        | 23/137 [00:06<00:29,  3.81it/s]

⚠️ osteoporosis: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66aac90>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  18%|█▊        | 24/137 [00:06<00:29,  3.82it/s]

⚠️ Fuchs' endothelial dystrophy: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66a9190>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  18%|█▊        | 25/137 [00:06<00:29,  3.83it/s]

⚠️ polycystic ovary syndrome: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66999d0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  19%|█▉        | 26/137 [00:06<00:28,  3.83it/s]

⚠️ penile cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c0850>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  20%|█▉        | 27/137 [00:07<00:28,  3.84it/s]

⚠️ gestational diabetes: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66a8b50>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  20%|██        | 28/137 [00:07<00:28,  3.83it/s]

⚠️ ureter cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66a8550>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  21%|██        | 29/137 [00:07<00:28,  3.81it/s]

⚠️ vaginal cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd669a990>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  22%|██▏       | 30/137 [00:07<00:28,  3.82it/s]

⚠️ peripheral nervous system neoplasm: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b2d10>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  23%|██▎       | 31/137 [00:08<00:27,  3.82it/s]

⚠️ tracheal cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b3150>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  23%|██▎       | 32/137 [00:08<00:27,  3.83it/s]

⚠️ head and neck cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c2050>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  24%|██▍       | 33/137 [00:08<00:27,  3.84it/s]

⚠️ Creutzfeldt-Jakob disease: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd69f0290>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  25%|██▍       | 34/137 [00:08<00:26,  3.84it/s]

⚠️ otosclerosis: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd669b490>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  26%|██▌       | 35/137 [00:09<00:26,  3.84it/s]

⚠️ primary biliary cirrhosis: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b3790>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  26%|██▋       | 36/137 [00:09<00:26,  3.84it/s]

⚠️ vitiligo: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfe1ef61f50>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  27%|██▋       | 37/137 [00:09<00:26,  3.85it/s]

⚠️ Graves' disease: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66aa5d0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  28%|██▊       | 38/137 [00:09<00:25,  3.84it/s]

⚠️ malaria: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c1190>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  28%|██▊       | 39/137 [00:10<00:25,  3.84it/s]

⚠️ vulva cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b9310>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  29%|██▉       | 40/137 [00:10<00:25,  3.82it/s]

⚠️ autistic disorder: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66a99d0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  30%|██▉       | 41/137 [00:10<00:25,  3.82it/s]

⚠️ dilated cardiomyopathy: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfe12c4a8d0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  31%|███       | 42/137 [00:11<00:24,  3.82it/s]

⚠️ conduct disorder: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b1410>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  31%|███▏      | 43/137 [00:11<00:24,  3.82it/s]

⚠️ focal segmental glomerulosclerosis: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd6699750>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  32%|███▏      | 44/137 [00:11<00:24,  3.80it/s]

⚠️ gout: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c0950>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  33%|███▎      | 45/137 [00:11<00:24,  3.81it/s]

⚠️ brain cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd669a990>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  34%|███▎      | 46/137 [00:12<00:23,  3.81it/s]

⚠️ uterine fibroid: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66abc10>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  34%|███▍      | 47/137 [00:12<00:23,  3.81it/s]

⚠️ lung cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66a8050>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  35%|███▌      | 48/137 [00:12<00:23,  3.81it/s]

⚠️ Behcet's disease: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c3450>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  36%|███▌      | 49/137 [00:12<00:23,  3.81it/s]

⚠️ Kawasaki disease: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b3750>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  36%|███▋      | 50/137 [00:13<00:22,  3.81it/s]

⚠️ jejunal cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b2410>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  37%|███▋      | 51/137 [00:13<00:22,  3.82it/s]

⚠️ thoracic aortic aneurysm: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd6698f50>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  38%|███▊      | 52/137 [00:13<00:22,  3.82it/s]

⚠️ metabolic syndrome X: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b3c10>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  39%|███▊      | 53/137 [00:13<00:21,  3.82it/s]

⚠️ azoospermia: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b3c90>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  39%|███▉      | 54/137 [00:14<00:21,  3.82it/s]

⚠️ sclerosing cholangitis: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c3510>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  40%|████      | 55/137 [00:14<00:21,  3.80it/s]

⚠️ Parkinson's disease: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c2e50>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  41%|████      | 56/137 [00:14<00:21,  3.81it/s]

⚠️ hypothyroidism: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66a8b90>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  42%|████▏     | 57/137 [00:14<00:20,  3.82it/s]

⚠️ endogenous depression: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66bb5d0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  42%|████▏     | 58/137 [00:15<00:20,  3.83it/s]

⚠️ breast cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b8210>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  43%|████▎     | 59/137 [00:15<00:20,  3.83it/s]

⚠️ glaucoma: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66a9750>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  44%|████▍     | 60/137 [00:15<00:20,  3.83it/s]

⚠️ peritoneum cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c2410>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  45%|████▍     | 61/137 [00:16<00:19,  3.81it/s]

⚠️ vascular cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c2550>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  45%|████▌     | 62/137 [00:16<00:19,  3.79it/s]

⚠️ thyroid cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd69c1d90>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  46%|████▌     | 63/137 [00:16<00:19,  3.81it/s]

⚠️ malignant mesothelioma: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b0590>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  47%|████▋     | 64/137 [00:16<00:19,  3.80it/s]

⚠️ pancreatic cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b8bd0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  47%|████▋     | 65/137 [00:17<00:18,  3.81it/s]

⚠️ epilepsy syndrome: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd69b3b50>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  48%|████▊     | 66/137 [00:17<00:18,  3.82it/s]

⚠️ bone cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b34d0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  49%|████▉     | 67/137 [00:17<00:18,  3.83it/s]

⚠️ melanoma: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66a9b90>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  50%|████▉     | 68/137 [00:17<00:18,  3.83it/s]

⚠️ atherosclerosis: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c2f10>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  50%|█████     | 69/137 [00:18<00:17,  3.82it/s]

⚠️ fallopian tube cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c1d50>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  51%|█████     | 70/137 [00:18<00:17,  3.82it/s]

⚠️ rectum cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66bae90>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  52%|█████▏    | 71/137 [00:18<00:17,  3.82it/s]

⚠️ hepatitis B: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd69eca10>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  53%|█████▎    | 72/137 [00:18<00:16,  3.82it/s]

⚠️ dental caries: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd69c6590>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  53%|█████▎    | 73/137 [00:19<00:16,  3.82it/s]

⚠️ ocular cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c1850>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  54%|█████▍    | 74/137 [00:19<00:16,  3.82it/s]

⚠️ colon cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c0ed0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  55%|█████▍    | 75/137 [00:19<00:16,  3.83it/s]

⚠️ anemia: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd669a550>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  55%|█████▌    | 76/137 [00:19<00:15,  3.83it/s]

⚠️ multiple sclerosis: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b1510>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  56%|█████▌    | 77/137 [00:20<00:15,  3.84it/s]

⚠️ ovarian cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66bb4d0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  57%|█████▋    | 78/137 [00:20<00:15,  3.83it/s]

⚠️ hematologic cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c2950>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  58%|█████▊    | 79/137 [00:20<00:15,  3.83it/s]

⚠️ larynx cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c3e10>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  58%|█████▊    | 80/137 [00:20<00:14,  3.83it/s]

⚠️ kidney cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b1f50>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  59%|█████▉    | 81/137 [00:21<00:14,  3.84it/s]

⚠️ asthma: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd669a410>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  60%|█████▉    | 82/137 [00:21<00:14,  3.83it/s]

⚠️ IgA glomerulonephritis: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66a9110>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  61%|██████    | 83/137 [00:21<00:14,  3.82it/s]

⚠️ germ cell cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b9310>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  61%|██████▏   | 84/137 [00:22<00:13,  3.82it/s]

⚠️ testicular cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd69c75d0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  62%|██████▏   | 85/137 [00:22<00:13,  3.82it/s]

⚠️ malignant glioma: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b1090>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  63%|██████▎   | 86/137 [00:22<00:13,  3.81it/s]

⚠️ chronic obstructive pulmonary disease: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c3390>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  64%|██████▎   | 87/137 [00:22<00:13,  3.82it/s]

⚠️ gallbladder cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c3b50>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  64%|██████▍   | 88/137 [00:23<00:12,  3.81it/s]

⚠️ thymus cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd669bed0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  65%|██████▍   | 89/137 [00:23<00:12,  3.81it/s]

⚠️ atopic dermatitis: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd6d14d10>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  66%|██████▌   | 90/137 [00:23<00:12,  3.82it/s]

⚠️ bipolar disorder: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd69d6110>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  66%|██████▋   | 91/137 [00:23<00:12,  3.81it/s]

⚠️ amyotrophic lateral sclerosis: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd69be210>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  67%|██████▋   | 92/137 [00:24<00:11,  3.82it/s]

⚠️ coronary artery disease: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c37d0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  68%|██████▊   | 93/137 [00:24<00:11,  3.71it/s]

⚠️ meningioma: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b1f50>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  69%|██████▊   | 94/137 [00:24<00:11,  3.76it/s]

⚠️ liver cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66a8450>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  69%|██████▉   | 95/137 [00:24<00:11,  3.78it/s]

⚠️ uterine cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b22d0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  70%|███████   | 96/137 [00:25<00:10,  3.79it/s]

⚠️ adrenal gland cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b8fd0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  71%|███████   | 97/137 [00:25<00:10,  3.80it/s]

⚠️ muscle cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c2290>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  72%|███████▏  | 98/137 [00:25<00:10,  3.79it/s]

⚠️ skin cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c1b50>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  72%|███████▏  | 99/137 [00:25<00:09,  3.80it/s]

⚠️ systemic scleroderma: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c1010>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  73%|███████▎  | 100/137 [00:26<00:09,  3.81it/s]

⚠️ cervical cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd69fcad0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  74%|███████▎  | 101/137 [00:26<00:09,  3.83it/s]

⚠️ allergic rhinitis: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66bbbd0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  74%|███████▍  | 102/137 [00:26<00:09,  3.82it/s]

⚠️ bile duct cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd6699a10>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  75%|███████▌  | 103/137 [00:27<00:08,  3.83it/s]

⚠️ pancreatitis: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66a9990>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  76%|███████▌  | 104/137 [00:27<00:08,  3.83it/s]

⚠️ esophageal cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b3990>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  77%|███████▋  | 105/137 [00:27<00:08,  3.83it/s]

⚠️ middle ear cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c2350>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  77%|███████▋  | 106/137 [00:27<00:08,  3.81it/s]

⚠️ Paget's disease of bone: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b99d0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  78%|███████▊  | 107/137 [00:28<00:07,  3.82it/s]

⚠️ schizophrenia: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd669b750>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  79%|███████▉  | 108/137 [00:28<00:07,  3.82it/s]

⚠️ mediastinal cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c4ed0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  80%|███████▉  | 109/137 [00:28<00:07,  3.83it/s]

⚠️ spinal cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd69f1890>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  80%|████████  | 110/137 [00:28<00:07,  3.83it/s]

⚠️ nephrolithiasis: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b9c50>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  81%|████████  | 111/137 [00:29<00:06,  3.83it/s]

⚠️ retroperitoneal cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c1490>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  82%|████████▏ | 112/137 [00:29<00:06,  3.83it/s]

⚠️ panic disorder: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b2890>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  82%|████████▏ | 113/137 [00:29<00:06,  3.83it/s]

⚠️ acquired immunodeficiency syndrome: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66abe90>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  83%|████████▎ | 114/137 [00:29<00:06,  3.53it/s]

⚠️ migraine: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c4bd0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  84%|████████▍ | 115/137 [00:30<00:06,  3.61it/s]

⚠️ ankylosing spondylitis: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66aa850>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  85%|████████▍ | 116/137 [00:30<00:05,  3.67it/s]

⚠️ rheumatoid arthritis: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b2e50>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  85%|████████▌ | 117/137 [00:30<00:05,  3.72it/s]

⚠️ abdominal aortic aneurysm: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c0110>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  86%|████████▌ | 118/137 [00:31<00:05,  3.75it/s]

⚠️ chronic kidney failure: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b1f50>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  87%|████████▋ | 119/137 [00:31<00:04,  3.77it/s]

⚠️ periodontitis: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd69f2990>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  88%|████████▊ | 120/137 [00:31<00:04,  3.79it/s]

⚠️ osteoarthritis: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c4bd0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  88%|████████▊ | 121/137 [00:31<00:04,  3.81it/s]

⚠️ ulcerative colitis: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfe3fd529d0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  89%|████████▉ | 122/137 [00:32<00:03,  3.82it/s]

⚠️ Crohn's disease: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66a8850>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  90%|████████▉ | 123/137 [00:32<00:03,  3.83it/s]

⚠️ salivary gland cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c0b10>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  91%|█████████ | 124/137 [00:32<00:03,  3.84it/s]

⚠️ psoriasis: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b81d0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  91%|█████████ | 125/137 [00:32<00:03,  3.85it/s]

⚠️ narcolepsy: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c3790>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  92%|█████████▏| 126/137 [00:33<00:02,  3.84it/s]

⚠️ degenerative disc disease: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd6698a50>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  93%|█████████▎| 127/137 [00:33<00:02,  3.84it/s]

⚠️ psoriatic arthritis: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b3790>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  93%|█████████▎| 128/137 [00:33<00:02,  3.84it/s]

⚠️ systemic lupus erythematosus: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66ba7d0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  94%|█████████▍| 129/137 [00:33<00:02,  3.83it/s]

⚠️ Barrett's esophagus: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66b8710>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  95%|█████████▍| 130/137 [00:34<00:01,  3.82it/s]

⚠️ cleft lip: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c02d0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  96%|█████████▌| 131/137 [00:34<00:01,  3.83it/s]

⚠️ type 2 diabetes mellitus: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66abe50>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  96%|█████████▋| 132/137 [00:34<00:01,  3.83it/s]

⚠️ type 1 diabetes mellitus: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd669b590>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  97%|█████████▋| 133/137 [00:34<00:01,  3.83it/s]

⚠️ refractive error: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66a9cd0>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  98%|█████████▊| 134/137 [00:35<00:00,  3.83it/s]

⚠️ alopecia areata: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfe12c4bb50>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  99%|█████████▊| 135/137 [00:35<00:00,  3.82it/s]

⚠️ pleural cancer: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66aba90>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching:  99%|█████████▉| 136/137 [00:35<00:00,  3.82it/s]

⚠️ obesity: HTTPSConnectionPool(host='name-resolution-normalizer.prod.monarchinitiative.org', port=443): Max retries exceeded with url: /resolve (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x7dfdd66c0990>: Failed to resolve 'name-resolution-normalizer.prod.monarchinitiative.org' ([Errno -2] Name or service not known)"))


🌐 MONDO Normalizer Matching: 100%|██████████| 137/137 [00:35<00:00,  3.81it/s]


✅ MONDO Normalizer enrichment complete.
📦 Added 0 MONDO nodes and 🔗 0 edges.






In [19]:
import requests
import time
from tqdm import tqdm

# Limit to 100 diseases for enrichment
disease_nodes = [n for n, d in G.nodes(data=True) if d.get("label") == "Disease"]
disease_nodes = disease_nodes[:100]

added_trials = 0
added_trial_edges = 0
linked_trials = []

import urllib.parse

def search_clinical_trials(disease_name):
    """Robust ClinicalTrials.gov search using raw expr string."""
    url = "https://clinicaltrials.gov/api/query/study_fields"

    params = {
        "expr": f'"{disease_name}"',  # wrap in quotes to allow multi-word phrases
        "fields": "NCTId,BriefTitle",
        "min_rnk": 1,
        "max_rnk": 3,
        "fmt": "json"
    }

    try:
        response = requests.get(url, params=params, timeout=10)
        if response.status_code != 200:
            print(f"❌ HTTP error {response.status_code} for {disease_name}")
            return []

        data = response.json()
        studies = data.get("StudyFieldsResponse", {}).get("StudyFields", [])
        return [(s["NCTId"][0], s["BriefTitle"][0]) for s in studies if s.get("NCTId") and s.get("BriefTitle")]

    except Exception as e:
        print(f"⚠️ Exception for {disease_name}: {e}")
        return []


# Enrich graph
for node_id in tqdm(disease_nodes, desc="🔬 Linking clinical trials"):
    disease_name = G.nodes[node_id].get("name")
    if not disease_name:
        continue

    trials = search_clinical_trials(disease_name)
    for nct_id, title in trials:
        trial_node = f"ClinicalTrial::{nct_id}"

        if not G.has_node(trial_node):
            G.add_node(trial_node, label="ClinicalTrial", name=title)
            added_trials += 1

        if not G.has_edge(node_id, trial_node):
            G.add_edge(node_id, trial_node, relation="studied_in")
            added_trial_edges += 1
            linked_trials.append((disease_name, nct_id, title))

    time.sleep(0.3)

print(f"\n✅ Clinical trial linking complete.")
print(f"📦 Added {added_trials} trial nodes and 🔗 {added_trial_edges} edges.")
for dname, nct, title in linked_trials[:10]:
    print(f"🔗 {dname} → {nct} → {title}")



🔬 Linking clinical trials:   0%|          | 0/100 [00:00<?, ?it/s]

❌ HTTP error 404 for idiopathic pulmonary fibrosis


🔬 Linking clinical trials:   1%|          | 1/100 [00:00<00:37,  2.63it/s]

❌ HTTP error 404 for restless legs syndrome


🔬 Linking clinical trials:   2%|▏         | 2/100 [00:00<00:37,  2.64it/s]

❌ HTTP error 404 for alcohol dependence


🔬 Linking clinical trials:   3%|▎         | 3/100 [00:01<00:36,  2.63it/s]

❌ HTTP error 404 for nicotine dependence


🔬 Linking clinical trials:   4%|▍         | 4/100 [00:01<00:36,  2.64it/s]

❌ HTTP error 404 for lymphatic system cancer


🔬 Linking clinical trials:   5%|▌         | 5/100 [00:01<00:35,  2.64it/s]

❌ HTTP error 404 for pharynx cancer


🔬 Linking clinical trials:   6%|▌         | 6/100 [00:02<00:35,  2.64it/s]

❌ HTTP error 404 for duodenum cancer


🔬 Linking clinical trials:   7%|▋         | 7/100 [00:02<00:35,  2.64it/s]

❌ HTTP error 404 for ileum cancer


🔬 Linking clinical trials:   8%|▊         | 8/100 [00:03<00:35,  2.62it/s]

❌ HTTP error 404 for leprosy


🔬 Linking clinical trials:   9%|▉         | 9/100 [00:03<00:35,  2.60it/s]

❌ HTTP error 404 for prostate cancer


🔬 Linking clinical trials:  10%|█         | 10/100 [00:03<00:34,  2.61it/s]

❌ HTTP error 404 for stomach cancer


🔬 Linking clinical trials:  11%|█         | 11/100 [00:04<00:34,  2.59it/s]

❌ HTTP error 404 for celiac disease


🔬 Linking clinical trials:  12%|█▏        | 12/100 [00:04<00:33,  2.60it/s]

❌ HTTP error 404 for Alzheimer's disease


🔬 Linking clinical trials:  13%|█▎        | 13/100 [00:04<00:33,  2.61it/s]

❌ HTTP error 404 for hypertension


🔬 Linking clinical trials:  14%|█▍        | 14/100 [00:05<00:33,  2.60it/s]

❌ HTTP error 404 for nasal cavity cancer


🔬 Linking clinical trials:  15%|█▌        | 15/100 [00:05<00:32,  2.61it/s]

❌ HTTP error 404 for age related macular degeneration


🔬 Linking clinical trials:  16%|█▌        | 16/100 [00:06<00:32,  2.60it/s]

❌ HTTP error 404 for attention deficit hyperactivity disorder


🔬 Linking clinical trials:  17%|█▋        | 17/100 [00:06<00:31,  2.62it/s]

❌ HTTP error 404 for intracranial aneurysm


🔬 Linking clinical trials:  18%|█▊        | 18/100 [00:06<00:31,  2.61it/s]

❌ HTTP error 404 for membranous glomerulonephritis


🔬 Linking clinical trials:  19%|█▉        | 19/100 [00:07<00:31,  2.60it/s]

❌ HTTP error 404 for urinary bladder cancer


🔬 Linking clinical trials:  20%|██        | 20/100 [00:07<00:30,  2.60it/s]

❌ HTTP error 404 for Gilles de la Tourette syndrome


🔬 Linking clinical trials:  21%|██        | 21/100 [00:08<00:30,  2.60it/s]

❌ HTTP error 404 for sarcoma


🔬 Linking clinical trials:  22%|██▏       | 22/100 [00:08<00:29,  2.61it/s]

❌ HTTP error 404 for appendix cancer


🔬 Linking clinical trials:  23%|██▎       | 23/100 [00:08<00:29,  2.60it/s]

❌ HTTP error 404 for osteoporosis


🔬 Linking clinical trials:  24%|██▍       | 24/100 [00:09<00:29,  2.60it/s]

❌ HTTP error 404 for Fuchs' endothelial dystrophy


🔬 Linking clinical trials:  25%|██▌       | 25/100 [00:09<00:28,  2.61it/s]

❌ HTTP error 404 for polycystic ovary syndrome


🔬 Linking clinical trials:  26%|██▌       | 26/100 [00:09<00:28,  2.62it/s]

❌ HTTP error 404 for penile cancer


🔬 Linking clinical trials:  27%|██▋       | 27/100 [00:10<00:27,  2.63it/s]

❌ HTTP error 404 for gestational diabetes


🔬 Linking clinical trials:  28%|██▊       | 28/100 [00:10<00:27,  2.64it/s]

❌ HTTP error 404 for ureter cancer


🔬 Linking clinical trials:  29%|██▉       | 29/100 [00:11<00:27,  2.63it/s]

❌ HTTP error 404 for vaginal cancer


🔬 Linking clinical trials:  30%|███       | 30/100 [00:11<00:26,  2.64it/s]

❌ HTTP error 404 for peripheral nervous system neoplasm


🔬 Linking clinical trials:  31%|███       | 31/100 [00:11<00:26,  2.64it/s]

❌ HTTP error 404 for tracheal cancer


🔬 Linking clinical trials:  32%|███▏      | 32/100 [00:12<00:25,  2.65it/s]

❌ HTTP error 404 for head and neck cancer


🔬 Linking clinical trials:  33%|███▎      | 33/100 [00:12<00:25,  2.65it/s]

❌ HTTP error 404 for Creutzfeldt-Jakob disease


🔬 Linking clinical trials:  34%|███▍      | 34/100 [00:12<00:24,  2.66it/s]

❌ HTTP error 404 for otosclerosis


🔬 Linking clinical trials:  35%|███▌      | 35/100 [00:13<00:24,  2.65it/s]

❌ HTTP error 404 for primary biliary cirrhosis


🔬 Linking clinical trials:  36%|███▌      | 36/100 [00:13<00:24,  2.64it/s]

❌ HTTP error 404 for vitiligo


🔬 Linking clinical trials:  37%|███▋      | 37/100 [00:14<00:23,  2.64it/s]

❌ HTTP error 404 for Graves' disease


🔬 Linking clinical trials:  38%|███▊      | 38/100 [00:14<00:23,  2.63it/s]

❌ HTTP error 404 for malaria


🔬 Linking clinical trials:  39%|███▉      | 39/100 [00:14<00:23,  2.59it/s]

❌ HTTP error 404 for vulva cancer


🔬 Linking clinical trials:  40%|████      | 40/100 [00:15<00:23,  2.59it/s]

❌ HTTP error 404 for autistic disorder


🔬 Linking clinical trials:  41%|████      | 41/100 [00:15<00:22,  2.60it/s]

❌ HTTP error 404 for dilated cardiomyopathy


🔬 Linking clinical trials:  42%|████▏     | 42/100 [00:16<00:22,  2.61it/s]

❌ HTTP error 404 for conduct disorder


🔬 Linking clinical trials:  43%|████▎     | 43/100 [00:16<00:22,  2.59it/s]

❌ HTTP error 404 for focal segmental glomerulosclerosis


🔬 Linking clinical trials:  44%|████▍     | 44/100 [00:16<00:21,  2.59it/s]

❌ HTTP error 404 for gout


🔬 Linking clinical trials:  45%|████▌     | 45/100 [00:17<00:21,  2.61it/s]

❌ HTTP error 404 for brain cancer


🔬 Linking clinical trials:  46%|████▌     | 46/100 [00:17<00:20,  2.62it/s]

❌ HTTP error 404 for uterine fibroid


🔬 Linking clinical trials:  47%|████▋     | 47/100 [00:17<00:20,  2.63it/s]

❌ HTTP error 404 for lung cancer


🔬 Linking clinical trials:  48%|████▊     | 48/100 [00:18<00:19,  2.64it/s]

❌ HTTP error 404 for Behcet's disease


🔬 Linking clinical trials:  49%|████▉     | 49/100 [00:18<00:19,  2.64it/s]

❌ HTTP error 404 for Kawasaki disease


🔬 Linking clinical trials:  50%|█████     | 50/100 [00:19<00:18,  2.65it/s]

❌ HTTP error 404 for jejunal cancer


🔬 Linking clinical trials:  51%|█████     | 51/100 [00:19<00:18,  2.63it/s]

❌ HTTP error 404 for thoracic aortic aneurysm


🔬 Linking clinical trials:  52%|█████▏    | 52/100 [00:19<00:18,  2.64it/s]

❌ HTTP error 404 for metabolic syndrome X


🔬 Linking clinical trials:  53%|█████▎    | 53/100 [00:20<00:17,  2.62it/s]

❌ HTTP error 404 for azoospermia


🔬 Linking clinical trials:  54%|█████▍    | 54/100 [00:20<00:17,  2.62it/s]

❌ HTTP error 404 for sclerosing cholangitis


🔬 Linking clinical trials:  55%|█████▌    | 55/100 [00:20<00:17,  2.62it/s]

❌ HTTP error 404 for Parkinson's disease


🔬 Linking clinical trials:  56%|█████▌    | 56/100 [00:21<00:16,  2.63it/s]

❌ HTTP error 404 for hypothyroidism


🔬 Linking clinical trials:  57%|█████▋    | 57/100 [00:21<00:16,  2.63it/s]

❌ HTTP error 404 for endogenous depression


🔬 Linking clinical trials:  58%|█████▊    | 58/100 [00:22<00:15,  2.64it/s]

❌ HTTP error 404 for breast cancer


🔬 Linking clinical trials:  59%|█████▉    | 59/100 [00:22<00:15,  2.64it/s]

❌ HTTP error 404 for glaucoma


🔬 Linking clinical trials:  60%|██████    | 60/100 [00:22<00:15,  2.62it/s]

❌ HTTP error 404 for peritoneum cancer


🔬 Linking clinical trials:  61%|██████    | 61/100 [00:23<00:14,  2.62it/s]

❌ HTTP error 404 for vascular cancer


🔬 Linking clinical trials:  62%|██████▏   | 62/100 [00:23<00:14,  2.62it/s]

❌ HTTP error 404 for thyroid cancer


🔬 Linking clinical trials:  63%|██████▎   | 63/100 [00:24<00:14,  2.61it/s]

❌ HTTP error 404 for malignant mesothelioma


🔬 Linking clinical trials:  64%|██████▍   | 64/100 [00:24<00:13,  2.62it/s]

❌ HTTP error 404 for pancreatic cancer


🔬 Linking clinical trials:  65%|██████▌   | 65/100 [00:24<00:13,  2.61it/s]

❌ HTTP error 404 for epilepsy syndrome


🔬 Linking clinical trials:  66%|██████▌   | 66/100 [00:25<00:13,  2.60it/s]

❌ HTTP error 404 for bone cancer


🔬 Linking clinical trials:  67%|██████▋   | 67/100 [00:25<00:12,  2.60it/s]

❌ HTTP error 404 for melanoma


🔬 Linking clinical trials:  68%|██████▊   | 68/100 [00:25<00:12,  2.59it/s]

❌ HTTP error 404 for atherosclerosis


🔬 Linking clinical trials:  69%|██████▉   | 69/100 [00:26<00:11,  2.62it/s]

❌ HTTP error 404 for fallopian tube cancer


🔬 Linking clinical trials:  70%|███████   | 70/100 [00:26<00:11,  2.61it/s]

❌ HTTP error 404 for rectum cancer


🔬 Linking clinical trials:  71%|███████   | 71/100 [00:27<00:11,  2.62it/s]

❌ HTTP error 404 for hepatitis B


🔬 Linking clinical trials:  72%|███████▏  | 72/100 [00:27<00:10,  2.62it/s]

❌ HTTP error 404 for dental caries


🔬 Linking clinical trials:  73%|███████▎  | 73/100 [00:27<00:10,  2.60it/s]

❌ HTTP error 404 for ocular cancer


🔬 Linking clinical trials:  74%|███████▍  | 74/100 [00:28<00:10,  2.59it/s]

❌ HTTP error 404 for colon cancer


🔬 Linking clinical trials:  75%|███████▌  | 75/100 [00:28<00:09,  2.62it/s]

❌ HTTP error 404 for anemia


🔬 Linking clinical trials:  76%|███████▌  | 76/100 [00:29<00:09,  2.61it/s]

❌ HTTP error 404 for multiple sclerosis


🔬 Linking clinical trials:  77%|███████▋  | 77/100 [00:29<00:08,  2.59it/s]

❌ HTTP error 404 for ovarian cancer


🔬 Linking clinical trials:  78%|███████▊  | 78/100 [00:29<00:08,  2.59it/s]

❌ HTTP error 404 for hematologic cancer


🔬 Linking clinical trials:  79%|███████▉  | 79/100 [00:30<00:08,  2.61it/s]

❌ HTTP error 404 for larynx cancer


🔬 Linking clinical trials:  80%|████████  | 80/100 [00:30<00:07,  2.62it/s]

❌ HTTP error 404 for kidney cancer


🔬 Linking clinical trials:  81%|████████  | 81/100 [00:30<00:07,  2.62it/s]

❌ HTTP error 404 for asthma


🔬 Linking clinical trials:  82%|████████▏ | 82/100 [00:31<00:06,  2.61it/s]

❌ HTTP error 404 for IgA glomerulonephritis


🔬 Linking clinical trials:  83%|████████▎ | 83/100 [00:31<00:06,  2.61it/s]

❌ HTTP error 404 for germ cell cancer


🔬 Linking clinical trials:  84%|████████▍ | 84/100 [00:32<00:06,  2.62it/s]

❌ HTTP error 404 for testicular cancer


🔬 Linking clinical trials:  85%|████████▌ | 85/100 [00:32<00:05,  2.63it/s]

❌ HTTP error 404 for malignant glioma


🔬 Linking clinical trials:  86%|████████▌ | 86/100 [00:32<00:05,  2.64it/s]

❌ HTTP error 404 for chronic obstructive pulmonary disease


🔬 Linking clinical trials:  87%|████████▋ | 87/100 [00:33<00:04,  2.65it/s]

❌ HTTP error 404 for gallbladder cancer


🔬 Linking clinical trials:  88%|████████▊ | 88/100 [00:33<00:04,  2.65it/s]

❌ HTTP error 404 for thymus cancer


🔬 Linking clinical trials:  89%|████████▉ | 89/100 [00:33<00:04,  2.63it/s]

❌ HTTP error 404 for atopic dermatitis


🔬 Linking clinical trials:  90%|█████████ | 90/100 [00:34<00:03,  2.64it/s]

❌ HTTP error 404 for bipolar disorder


🔬 Linking clinical trials:  91%|█████████ | 91/100 [00:34<00:03,  2.64it/s]

❌ HTTP error 404 for amyotrophic lateral sclerosis


🔬 Linking clinical trials:  92%|█████████▏| 92/100 [00:35<00:03,  2.64it/s]

❌ HTTP error 404 for coronary artery disease


🔬 Linking clinical trials:  93%|█████████▎| 93/100 [00:35<00:02,  2.64it/s]

❌ HTTP error 404 for meningioma


🔬 Linking clinical trials:  94%|█████████▍| 94/100 [00:35<00:02,  2.64it/s]

❌ HTTP error 404 for liver cancer


🔬 Linking clinical trials:  95%|█████████▌| 95/100 [00:36<00:01,  2.64it/s]

❌ HTTP error 404 for uterine cancer


🔬 Linking clinical trials:  96%|█████████▌| 96/100 [00:36<00:01,  2.63it/s]

❌ HTTP error 404 for adrenal gland cancer


🔬 Linking clinical trials:  97%|█████████▋| 97/100 [00:37<00:01,  2.63it/s]

❌ HTTP error 404 for muscle cancer


🔬 Linking clinical trials:  98%|█████████▊| 98/100 [00:37<00:00,  2.62it/s]

❌ HTTP error 404 for skin cancer


🔬 Linking clinical trials:  99%|█████████▉| 99/100 [00:37<00:00,  2.64it/s]

❌ HTTP error 404 for systemic scleroderma


🔬 Linking clinical trials: 100%|██████████| 100/100 [00:38<00:00,  2.62it/s]


✅ Clinical trial linking complete.
📦 Added 0 trial nodes and 🔗 0 edges.





In [21]:
import psycopg2
import pandas as pd

conn = psycopg2.connect(
    host="aact-db.ctti-clinicaltrials.org",
    dbname="aact",
    user="theiman",
    password="Pth_082704",
    port=5432
)


In [29]:
from gql import gql, Client
from gql.transport.requests import RequestsHTTPTransport

transport = RequestsHTTPTransport(
    url="https://api.platform.opentargets.org/api/v4/graphql",
    verify=True,
    retries=3,
)

client = Client(transport=transport, fetch_schema_from_transport=True)

query = gql("""
query GetTargetAssociations {
  target(ensemblId: "ENSG00000155657") {
    approvedSymbol
    associatedDiseases(page: { index: 0, size: 5 }) {
      count
      rows {
        disease {
          id
          name
        }
        score
      }
    }
  }
}
""")

result = client.execute(query)
print(result)


{'target': {'approvedSymbol': 'TTN', 'associatedDiseases': {'count': 828, 'rows': [{'disease': {'id': 'EFO_0000407', 'name': 'dilated cardiomyopathy'}, 'score': 0.8463293516985615}, {'disease': {'id': 'MONDO_0011400', 'name': 'dilated cardiomyopathy 1G'}, 'score': 0.806061786200876}, {'disease': {'id': 'EFO_0000318', 'name': 'cardiomyopathy'}, 'score': 0.8040243189083583}, {'disease': {'id': 'MONDO_0010870', 'name': 'tibial muscular dystrophy'}, 'score': 0.7954923412028356}, {'disease': {'id': 'MONDO_0012714', 'name': 'early-onset myopathy with fatal cardiomyopathy'}, 'score': 0.7838356772448081}]}}}


In [33]:
import mygene

mg = mygene.MyGeneInfo()
gene_ids = [n.split("::")[1] for n, _ in G.nodes(data=True) if n.startswith("Gene::")]

# Query MyGene.info for mappings
results = mg.querymany(gene_ids[:100], scopes='entrezgene', fields='symbol,ensembl.gene', species='human')

# Build mapping
entrez_to_ensembl = {}
for r in results:
    if "ensembl" in r and "symbol" in r:
        ens = r["ensembl"]
        if isinstance(ens, dict):
            entrez_to_ensembl[r["query"]] = {
                "symbol": r["symbol"],
                "ensembl_id": ens["gene"]
            }

print(list(entrez_to_ensembl.items())[:5])


INFO:biothings.client:querying 1-100 ...
INFO:biothings.client:Finished.
INFO:biothings.client:Pass "returnall=True" to return complete lists of duplicate or missing query terms.


[('1', {'symbol': 'A1BG', 'ensembl_id': 'ENSG00000121410'}), ('10', {'symbol': 'NAT2', 'ensembl_id': 'ENSG00000156006'}), ('100', {'symbol': 'ADA', 'ensembl_id': 'ENSG00000196839'}), ('1000', {'symbol': 'CDH2', 'ensembl_id': 'ENSG00000170558'}), ('100008586', {'symbol': 'GAGE12F', 'ensembl_id': 'ENSG00000236362'})]


In [34]:
from gql import gql
from tqdm import tqdm

added_diseases = 0
added_edges = 0

# Update graph with mapping attributes
for node in G.nodes:
    if node.startswith("Gene::"):
        entrez = node.split("::")[1]
        if entrez in entrez_to_ensembl:
            G.nodes[node]["symbol"] = entrez_to_ensembl[entrez]["symbol"]
            G.nodes[node]["ensembl_id"] = entrez_to_ensembl[entrez]["ensembl_id"]

# Enrich via GraphQL
# Take a snapshot to avoid mutation error
gene_nodes = [(node, data) for node, data in G.nodes(data=True) if data.get("label") == "Gene"]

for node, data in tqdm(gene_nodes):
    ensembl_id = data.get("ensembl_id")
    if not ensembl_id:
        continue

    query = gql(f"""
    query {{
      target(ensemblId: "{ensembl_id}") {{
        associatedDiseases(page: {{ index: 0, size: 10 }}) {{
          rows {{
            disease {{ id name }}
            score
          }}
        }}
      }}
    }}
    """)

    try:
        result = client.execute(query)
        rows = result["target"]["associatedDiseases"]["rows"]

        for row in rows:
            disease_id = f"EFO::{row['disease']['id']}"
            disease_name = row["disease"]["name"]
            score = row["score"]

            if not G.has_node(disease_id):
                G.add_node(disease_id, label="Disease", name=disease_name)

            if not G.has_edge(node, disease_id):
                G.add_edge(node, disease_id, relation="associated_with", score=score)

    except Exception as e:
        print(f"⚠️ Error for {ensembl_id}: {e}")


print(f"\n✅ Finished Open Targets enrichment.")
print(f"📦 Added {added_diseases} new disease nodes and 🔗 {added_edges} edges.")


100%|██████████| 20945/20945 [00:08<00:00, 2461.82it/s]


✅ Finished Open Targets enrichment.
📦 Added 0 new disease nodes and 🔗 0 edges.





In [35]:
from tqdm import tqdm
import mygene

mg = mygene.MyGeneInfo()

gene_ids = [n.split("::")[1] for n, d in G.nodes(data=True)
            if d.get("label") == "Gene" and "::" in n]

# Query MyGene.info in chunks
entrez_to_ensembl = {}
for i in tqdm(range(0, len(gene_ids), 1000)):
    chunk = gene_ids[i:i+1000]
    results = mg.querymany(chunk, scopes='entrezgene', fields='symbol,ensembl.gene', species='human')
    for r in results:
        if "ensembl" in r and "symbol" in r:
            ens = r["ensembl"]
            if isinstance(ens, dict):
                entrez_to_ensembl[r["query"]] = {
                    "symbol": r["symbol"],
                    "ensembl_id": ens["gene"]
                }


INFO:biothings.client:querying 1-1000 ...
INFO:biothings.client:Finished.
INFO:biothings.client:Pass "returnall=True" to return complete lists of duplicate or missing query terms.
INFO:biothings.client:querying 1-1000 ...
INFO:biothings.client:Finished.
INFO:biothings.client:Pass "returnall=True" to return complete lists of duplicate or missing query terms.
INFO:biothings.client:querying 1-1000 ...
INFO:biothings.client:Finished.
INFO:biothings.client:Pass "returnall=True" to return complete lists of duplicate or missing query terms.
INFO:biothings.client:querying 1-1000 ...
INFO:biothings.client:Finished.
INFO:biothings.client:Pass "returnall=True" to return complete lists of duplicate or missing query terms.
INFO:biothings.client:querying 1-1000 ...
INFO:biothings.client:Finished.
INFO:biothings.client:Pass "returnall=True" to return complete lists of duplicate or missing query terms.
INFO:biothings.client:querying 1-1000 ...
INFO:biothings.client:Finished.
INFO:biothings.client:Pass

In [None]:
from difflib import get_close_matches
from tqdm import tqdm

added_trials = 0
added_links = 0
unmatched_conditions = []

# Snapshot of existing disease names for matching
disease_lookup = {
    d.lower(): n for n, d in nx.get_node_attributes(G, "name").items()
    if G.nodes[n].get("label") == "Disease"
}

print("🔬 Linking clinical trials...")

for _, row in tqdm(trials_df.iterrows(), total=len(trials_df)):
    cond = str(row["condition_name"]).lower().strip()
    nct_id = row["nct_id"]
    title = row["brief_title"]

    match = disease_lookup.get(cond)
    if not match:
        # Try fuzzy match (optional)
        guess = get_close_matches(cond, disease_lookup.keys(), n=1, cutoff=0.9)
        if guess:
            match = disease_lookup[guess[0]]
        else:
            unmatched_conditions.append(cond)
            continue

    trial_node = f"ClinicalTrial::{nct_id}"
    if not G.has_node(trial_node):
        G.add_node(trial_node, label="ClinicalTrial", name=title)
        added_trials += 1

    if not G.has_edge(match, trial_node):
        G.add_edge(match, trial_node, relation="studied_in")
        added_links += 1

print(f"\n✅ Trial linking complete.")
print(f"📦 Added {added_trials} trial nodes and 🔗 {added_links} edges.")
print(f"⏭️ Unmatched conditions: {len(unmatched_conditions)}")


🔬 Linking clinical trials...


100%|██████████| 5000/5000 [01:23<00:00, 60.23it/s]


✅ Trial linking complete.
📦 Added 1260 trial nodes and 🔗 1332 edges.
⏭️ Unmatched conditions: 3661





In [39]:
from fuzzywuzzy import fuzz
from tqdm import tqdm

added_trials = 0
added_links = 0
unmatched_conditions = []

# Snapshot of lowercased disease name → node_id
disease_lookup = {
    d.lower(): n for n, d in nx.get_node_attributes(G, "name").items()
    if G.nodes[n].get("label") == "Disease"
}

print("🔬 Linking clinical trials using fuzzy matching...")

for _, row in tqdm(trials_df.iterrows(), total=len(trials_df)):
    cond = str(row["condition_name"]).lower().strip()
    nct_id = row["nct_id"]
    title = row["brief_title"]

    # Try exact match first
    match_node = disease_lookup.get(cond)

    if not match_node:
        # Fuzzy match if exact fails
        best_match = None
        best_score = 0
        for name in disease_lookup:
            score = fuzz.partial_ratio(cond, name)
            if score > best_score and score >= 85:
                best_score = score
                best_match = name

        if best_match:
            match_node = disease_lookup[best_match]
        else:
            unmatched_conditions.append(cond)
            continue

    trial_node = f"ClinicalTrial::{nct_id}"
    if not G.has_node(trial_node):
        G.add_node(trial_node, label="ClinicalTrial", name=title)
        added_trials += 1

    if not G.has_edge(match_node, trial_node):
        G.add_edge(match_node, trial_node, relation="studied_in")
        added_links += 1

print(f"\n✅ Fuzzy trial linking complete.")
print(f"📦 Added {added_trials} trial nodes and 🔗 {added_links} edges.")
print(f"⏭️ Unmatched conditions: {len(unmatched_conditions)}")


🔬 Linking clinical trials using fuzzy matching...


100%|██████████| 5000/5000 [01:55<00:00, 43.35it/s]


✅ Fuzzy trial linking complete.
📦 Added 2348 trial nodes and 🔗 2662 edges.
⏭️ Unmatched conditions: 2194





In [41]:
import requests
import pandas as pd
from tqdm import tqdm
import time

def fetch_mesh_synonyms(apikey, pages=50):
    base_url = "https://data.bioontology.org/ontologies/MESH/classes"
    params = {
        "apikey": apikey,
        "pagesize": 100
    }

    all_data = []

    for page in tqdm(range(1, pages + 1), desc="🔍 Fetching MeSH terms"):
        params["page"] = page
        try:
            r = requests.get(base_url, params=params)
            r.raise_for_status()
            for entry in r.json().get("collection", []):
                name = entry.get("prefLabel")
                synonyms = entry.get("synonym", [])
                if name:
                    all_data.append({
                        "Preferred_Name": name,
                        "Synonyms": synonyms
                    })
            time.sleep(0.3)
        except Exception as e:
            print(f"⚠️ Page {page} failed: {e}")
            continue

    return pd.DataFrame(all_data)

# Fetch 5000+ terms (~50 pages)
mesh_df = fetch_mesh_synonyms(BIOPORTAL_APIKEY, pages=50)


🔍 Fetching MeSH terms: 100%|██████████| 50/50 [00:45<00:00,  1.11it/s]


## 3. Configuration & Parameters <a id='config'></a>'

In [5]:
enrich_genes_with_uniprot(G, max_genes=200)

🧬 Enriching genes from UniProt...
✅ Gene enrichment complete.


In [7]:
enrich_diseases_with_mesh_omim(G, max_diseases=200)


📚 Enriching diseases with MeSH/OMIM using BioPortal (fuzzy match)...
✅ Fuzzy disease enrichment complete.


In [8]:
# Display enriched diseases
count = 0
for node, data in G.nodes(data=True):
    if data.get("label") == "Disease" and data.get("bioportal_id"):
        print(f"🦠 {data['name']}")
        print(f"   BioPortal ID: {data.get('bioportal_id')}")
        print(f"   Label: {data.get('bioportal_label')}")
        print(f"   Synonyms: {data.get('bioportal_synonyms')}")
        print(f"   Link: {data.get('bioportal_links')}\n")
        count += 1
        if count >= 5:
            break



In [9]:
def enrich_diseases_with_mesh_omim(G, max_diseases=200, apikey="YOUR_API_KEY_HERE"):
    print("📚 Enriching diseases with MeSH/OMIM using BioPortal (with API key)...")

    search_url = "https://data.bioontology.org/search"
    disease_nodes = [n for n, d in G.nodes(data=True) if d.get("label") == "Disease"]
    disease_nodes = disease_nodes[:max_diseases]

    for node_id in disease_nodes:
        disease_name = G.nodes[node_id].get("name")
        if not disease_name:
            continue

        try:
            print(f"🔍 Searching: {disease_name}")
            params = {
                "q": disease_name,
                "ontologies": "MESH,OMIM",
                "format": "json",
                "pagesize": 1,
                "apikey": apikey
            }
            response = requests.get(search_url, params=params)
            data = response.json()

            if data.get("collection"):
                match = data["collection"][0]
                G.nodes[node_id]["bioportal_id"] = match.get("@id")
                G.nodes[node_id]["bioportal_label"] = match.get("prefLabel")
                G.nodes[node_id]["bioportal_synonyms"] = match.get("synonym")
                G.nodes[node_id]["bioportal_links"] = match.get("links", {}).get("self")
                print(f"✅ Match: {match.get('prefLabel')} ({match.get('@id')})")
            else:
                print("❌ No match found")

        except Exception as e:
            print(f"⚠️ Error for {disease_name}: {e}")

        time.sleep(0.3)

    print("✅ Enrichment complete.")


In [10]:
enrich_diseases_with_mesh_omim(G, max_diseases=200, apikey="41a17e7d-a975-4fc3-a0c7-f6a1915ea89a")


📚 Enriching diseases with MeSH/OMIM using BioPortal (with API key)...
🔍 Searching: idiopathic pulmonary fibrosis
✅ Match: Idiopathic Pulmonary Fibrosis (http://purl.bioontology.org/ontology/MESH/D054990)
🔍 Searching: restless legs syndrome
✅ Match: Restless Legs Syndrome (http://purl.bioontology.org/ontology/MESH/D012148)
🔍 Searching: alcohol dependence
✅ Match: ALCOHOL DEPENDENCE (http://purl.bioontology.org/ontology/OMIM/103780)
🔍 Searching: nicotine dependence
✅ Match: Nicotine (http://purl.bioontology.org/ontology/MESH/D009538)
🔍 Searching: lymphatic system cancer
✅ Match: Lymphatic System (http://purl.bioontology.org/ontology/MESH/D008208)
🔍 Searching: pharynx cancer
✅ Match: Pharyngeal Neoplasms (http://purl.bioontology.org/ontology/MESH/D010610)
🔍 Searching: duodenum cancer
✅ Match: Duodenal Neoplasms (http://purl.bioontology.org/ontology/MESH/D004379)
🔍 Searching: ileum cancer
✅ Match: Ileum (http://purl.bioontology.org/ontology/MESH/D007082)
🔍 Searching: leprosy
✅ Match: Lepro

In [11]:
# Show sample enriched diseases
count = 0
for node, data in G.nodes(data=True):
    if data.get("label") == "Disease" and data.get("bioportal_id"):
        print(f"🦠 {data['name']}")
        print(f"   MeSH/OMIM: {data.get('bioportal_label')}")
        print(f"   Link: {data.get('bioportal_links')}\n")
        count += 1
        if count >= 5:
            break


🦠 idiopathic pulmonary fibrosis
   MeSH/OMIM: Idiopathic Pulmonary Fibrosis
   Link: https://data.bioontology.org/ontologies/MESH/classes/http%3A%2F%2Fpurl.bioontology.org%2Fontology%2FMESH%2FD054990

🦠 restless legs syndrome
   MeSH/OMIM: Restless Legs Syndrome
   Link: https://data.bioontology.org/ontologies/MESH/classes/http%3A%2F%2Fpurl.bioontology.org%2Fontology%2FMESH%2FD012148

🦠 alcohol dependence
   MeSH/OMIM: ALCOHOL DEPENDENCE
   Link: https://data.bioontology.org/ontologies/OMIM/classes/http%3A%2F%2Fpurl.bioontology.org%2Fontology%2FOMIM%2F103780

🦠 nicotine dependence
   MeSH/OMIM: Nicotine
   Link: https://data.bioontology.org/ontologies/MESH/classes/http%3A%2F%2Fpurl.bioontology.org%2Fontology%2FMESH%2FD009538

🦠 lymphatic system cancer
   MeSH/OMIM: Lymphatic System
   Link: https://data.bioontology.org/ontologies/MESH/classes/http%3A%2F%2Fpurl.bioontology.org%2Fontology%2FMESH%2FD008208



In [12]:
added_nodes = 0
added_edges = 0

for node, data in list(G.nodes(data=True)):
    if data.get("label") == "Disease" and data.get("bioportal_id"):
        mesh_omim_id = data["bioportal_id"]
        label = data.get("bioportal_label", "")
        source = data.get("bioportal_links", "")

        if not G.has_node(mesh_omim_id):
            G.add_node(mesh_omim_id,
                       label="Ontology",
                       name=label,
                       source="BioPortal",
                       original_source=mesh_omim_id)
            added_nodes += 1

        if not G.has_edge(node, mesh_omim_id):
            G.add_edge(node, mesh_omim_id, relation="has_ontology_term")
            added_edges += 1

added_nodes, added_edges


(137, 137)

In [13]:
def enrich_diseases_with_mesh_omim(G, max_diseases=200, apikey="YOUR_API_KEY_HERE"):
    """
    Enrich Disease nodes with MeSH/OMIM terms from BioPortal and structurally link them.
    - Adds new Ontology nodes with simplified IDs (e.g., MeSH::D001943)
    - Adds 'has_ontology_term' edges from Disease → Ontology node
    """
    import requests
    import time

    print("📚 Enriching diseases with MeSH/OMIM using BioPortal...")
    search_url = "https://data.bioontology.org/search"

    disease_nodes = [n for n, d in G.nodes(data=True) if d.get("label") == "Disease"]
    disease_nodes = disease_nodes[:max_diseases]

    added_nodes = 0
    added_edges = 0

    for node_id in disease_nodes:
        disease_name = G.nodes[node_id].get("name")
        if not disease_name:
            continue

        try:
            print(f"🔍 Searching: {disease_name}")
            params = {
                "q": disease_name,
                "ontologies": "MESH,OMIM",
                "format": "json",
                "pagesize": 1,
                "apikey": apikey
            }
            response = requests.get(search_url, params=params)
            data = response.json()

            if data.get("collection"):
                match = data["collection"][0]
                full_uri = match.get("@id")
                label = match.get("prefLabel", "")
                synonyms = match.get("synonym", [])
                source_link = match.get("links", {}).get("self", "")

                # Store on disease node
                G.nodes[node_id]["bioportal_id"] = full_uri
                G.nodes[node_id]["bioportal_label"] = label
                G.nodes[node_id]["bioportal_synonyms"] = synonyms
                G.nodes[node_id]["bioportal_links"] = source_link

                # Create simplified node ID
                if "MESH" in full_uri:
                    readable_id = "MeSH::" + full_uri.split("/")[-1]
                elif "OMIM" in full_uri:
                    readable_id = "OMIM::" + full_uri.split("/")[-1]
                else:
                    readable_id = "Ontology::" + full_uri.split("/")[-1]

                # Add ontology node if missing
                if not G.has_node(readable_id):
                    G.add_node(readable_id,
                               label="Ontology",
                               name=label,
                               source="BioPortal",
                               original_uri=full_uri)
                    added_nodes += 1

                # Add edge
                if not G.has_edge(node_id, readable_id):
                    G.add_edge(node_id, readable_id, relation="has_ontology_term")
                    added_edges += 1

                print(f"✅ Linked: {disease_name} → {readable_id}")

            else:
                print("❌ No match found")

        except Exception as e:
            print(f"⚠️ Error for {disease_name}: {e}")

        time.sleep(0.3)

    print(f"✅ Enrichment complete. Added {added_nodes} nodes and {added_edges} edges.")


In [14]:
enrich_diseases_with_mesh_omim(G, max_diseases=200, apikey="41a17e7d-a975-4fc3-a0c7-f6a1915ea89a")

📚 Enriching diseases with MeSH/OMIM using BioPortal...
🔍 Searching: idiopathic pulmonary fibrosis
✅ Linked: idiopathic pulmonary fibrosis → MeSH::D054990
🔍 Searching: restless legs syndrome
✅ Linked: restless legs syndrome → MeSH::D012148
🔍 Searching: alcohol dependence
✅ Linked: alcohol dependence → OMIM::103780
🔍 Searching: nicotine dependence
✅ Linked: nicotine dependence → MeSH::D009538
🔍 Searching: lymphatic system cancer
✅ Linked: lymphatic system cancer → MeSH::D008208
🔍 Searching: pharynx cancer
✅ Linked: pharynx cancer → MeSH::D010610
🔍 Searching: duodenum cancer
✅ Linked: duodenum cancer → MeSH::D004379
🔍 Searching: ileum cancer
✅ Linked: ileum cancer → MeSH::D007082
🔍 Searching: leprosy
✅ Linked: leprosy → MeSH::D007918
🔍 Searching: prostate cancer
✅ Linked: prostate cancer → OMIM::176807
🔍 Searching: stomach cancer
✅ Linked: stomach cancer → OMIM::MTHU072800
🔍 Searching: celiac disease
✅ Linked: celiac disease → MeSH::D002446
🔍 Searching: Alzheimer's disease
✅ Linked: Alzhe

In [16]:
def enrich_diseases_with_mesh_omim(G, max_diseases=200, apikey="41a17e7d-a975-4fc3-a0c7-f6a1915ea89a"):
    import requests, time

    print("📚 Enriching diseases with MeSH/OMIM using BioPortal (with API key)...")

    search_url = "https://data.bioontology.org/search"
    disease_nodes = [n for n, d in G.nodes(data=True) if d.get("label") == "Disease"]
    disease_nodes = disease_nodes[:max_diseases]

    added = 0
    for node_id in disease_nodes:
        disease_name = G.nodes[node_id].get("name")
        if not disease_name:
            continue

        try:
            print(f"🔍 Searching: {disease_name}")
            params = {
                "q": disease_name,
                "ontologies": "MESH,OMIM",
                "format": "json",
                "pagesize": 1,
                "apikey": apikey
            }
            response = requests.get(search_url, params=params)
            data = response.json()

            if data.get("collection"):
                match = data["collection"][0]
                source = match.get("links", {}).get("ontology", "").split("/")[-1]  # MESH or OMIM
                concept_id = match.get("@id", "").split("/")[-1]
                label = match.get("prefLabel")
                synonyms = match.get("synonym", [])
                uri = match.get("links", {}).get("self")

                if source and concept_id:
                    node_tag = f"{source}::{concept_id}"
                    if not G.has_node(node_tag):
                        G.add_node(node_tag,
                                   label="Ontology",
                                   name=label,
                                   source=source,
                                   synonyms=synonyms,
                                   uri=uri)
                    if not G.has_edge(node_id, node_tag):
                        G.add_edge(node_id, node_tag, relation="has_ontology_term")
                        added += 1

                print(f"✅ Linked: {disease_name} → {label} → {node_tag}")
            else:
                print("❌ No match found")

        except Exception as e:
            print(f"⚠️ Error for {disease_name}: {e}")

        time.sleep(0.3)

    print(f"\n✅ Enrichment complete. Added {added} new ontology links.")


In [17]:
def enrich_diseases_with_mesh_omim(G, max_diseases=200, apikey="41a17e7d-a975-4fc3-a0c7-f6a1915ea89a"):
    import requests, time

    print("📚 Enriching diseases with MeSH/OMIM using BioPortal (with API key)...")

    search_url = "https://data.bioontology.org/search"
    disease_nodes = [n for n, d in G.nodes(data=True) if d.get("label") == "Disease"]
    disease_nodes = disease_nodes[:max_diseases]

    added = 0
    for node_id in disease_nodes:
        disease_name = G.nodes[node_id].get("name")
        if not disease_name:
            continue

        try:
            print(f"🔍 Searching: {disease_name}")
            params = {
                "q": disease_name,
                "ontologies": "MESH,OMIM",
                "format": "json",
                "pagesize": 1,
                "apikey": apikey
            }
            response = requests.get(search_url, params=params)
            data = response.json()

            if data.get("collection"):
                match = data["collection"][0]
                source = match.get("links", {}).get("ontology", "").split("/")[-1]  # MESH or OMIM
                concept_id = match.get("@id", "").split("/")[-1]
                label = match.get("prefLabel")
                synonyms = match.get("synonym", [])
                uri = match.get("links", {}).get("self")

                if source and concept_id:
                    node_tag = f"{source}::{concept_id}"
                    if not G.has_node(node_tag):
                        G.add_node(node_tag,
                                   label="Ontology",
                                   name=label,
                                   source=source,
                                   synonyms=synonyms,
                                   uri=uri)
                    if not G.has_edge(node_id, node_tag):
                        G.add_edge(node_id, node_tag, relation="has_ontology_term")
                        added += 1

                print(f"✅ Linked: {disease_name} → {label} → {node_tag}")
            else:
                print("❌ No match found")

        except Exception as e:
            print(f"⚠️ Error for {disease_name}: {e}")

        time.sleep(0.3)

    print(f"\n✅ Enrichment complete. Added {added} new ontology links.")


In [18]:
enrich_diseases_with_mesh_omim(G, max_diseases=200, apikey="41a17e7d-a975-4fc3-a0c7-f6a1915ea89a")

📚 Enriching diseases with MeSH/OMIM using BioPortal (with API key)...
🔍 Searching: idiopathic pulmonary fibrosis
✅ Linked: idiopathic pulmonary fibrosis → Idiopathic Pulmonary Fibrosis → MESH::D054990
🔍 Searching: restless legs syndrome
✅ Linked: restless legs syndrome → Restless Legs Syndrome → MESH::D012148
🔍 Searching: alcohol dependence
✅ Linked: alcohol dependence → ALCOHOL DEPENDENCE → OMIM::103780
🔍 Searching: nicotine dependence
✅ Linked: nicotine dependence → Nicotine → MESH::D009538
🔍 Searching: lymphatic system cancer
✅ Linked: lymphatic system cancer → Lymphatic System → MESH::D008208
🔍 Searching: pharynx cancer
✅ Linked: pharynx cancer → Pharyngeal Neoplasms → MESH::D010610
🔍 Searching: duodenum cancer
✅ Linked: duodenum cancer → Duodenal Neoplasms → MESH::D004379
🔍 Searching: ileum cancer
✅ Linked: ileum cancer → Ileum → MESH::D007082
🔍 Searching: leprosy
✅ Linked: leprosy → Leprosy → MESH::D007918
🔍 Searching: prostate cancer
✅ Linked: prostate cancer → PROSTATE CANCER →

In [22]:
query = """
SELECT
    c.name AS condition_name,
    s.nct_id,
    s.brief_title
FROM
    conditions c
JOIN
    studies s ON c.nct_id = s.nct_id
WHERE
    s.study_type = 'Interventional'
    AND s.overall_status = 'Recruiting'
    AND c.name IN (
        'skin cancer',
        'obesity',
        'liver cancer',
        'uterine cancer'
    )
LIMIT 50;
"""

trials_df = pd.read_sql(query, conn)
trials_df.head()


  trials_df = pd.read_sql(query, conn)


Unnamed: 0,condition_name,nct_id,brief_title


In [23]:
query = """
SELECT
    c.name AS condition_name,
    s.nct_id,
    s.brief_title
FROM
    conditions c
JOIN
    studies s ON c.nct_id = s.nct_id
LIMIT 20;
"""

trials_df = pd.read_sql(query, conn)
trials_df.head()


  trials_df = pd.read_sql(query, conn)


Unnamed: 0,condition_name,nct_id,brief_title
0,Congenital Adrenal Hyperplasia,NCT00000102,Congenital Adrenal Hyperplasia: Calcium Channe...
1,Lead Poisoning,NCT00000104,Does Lead Burden Alter Neuropsychological Deve...
2,Cancer,NCT00000105,Vaccination With Tetanus and KLH to Assess Imm...
3,Rheumatic Diseases,NCT00000106,41.8 Degree Centigrade Whole Body Hyperthermia...
4,"Heart Defects, Congenital",NCT00000107,Body Water Content in Cyanotic Congenital Hear...


In [24]:
query = """
SELECT
    c.name AS condition_name,
    s.nct_id,
    s.brief_title
FROM
    conditions c
JOIN
    studies s ON c.nct_id = s.nct_id
WHERE
    LOWER(c.name) LIKE '%cancer%'
LIMIT 50;
"""

trials_df = pd.read_sql(query, conn)
trials_df.head()

  trials_df = pd.read_sql(query, conn)


Unnamed: 0,condition_name,nct_id,brief_title
0,Bladder Cancer,NCT06026189,Safely Reduce Cystoscopic Evaluations for Hema...
1,Breast Cancer,NCT04406779,The Frequency of Thyroid Diseases in Women Wit...
2,Early Breast Cancer,NCT05213403,"Innovative ""Scoring System"" in Breast Cancer P..."
3,"Non-colorectal Cancer (Esophagus, Stomach, Liv...",NCT04478175,Prospective Evaluation of a Program for Early ...
4,Head and Neck Cancer,NCT00017277,Radiation Therapy With or Without Epoetin Alfa...


In [25]:
query = """
SELECT
    c.name AS condition_name,
    s.nct_id,
    s.brief_title
FROM
    conditions c
JOIN
    studies s ON c.nct_id = s.nct_id
LIMIT 500;
"""
trials_df = pd.read_sql(query, conn)
trials_df.head()

  trials_df = pd.read_sql(query, conn)


Unnamed: 0,condition_name,nct_id,brief_title
0,Congenital Adrenal Hyperplasia,NCT00000102,Congenital Adrenal Hyperplasia: Calcium Channe...
1,Lead Poisoning,NCT00000104,Does Lead Burden Alter Neuropsychological Deve...
2,Cancer,NCT00000105,Vaccination With Tetanus and KLH to Assess Imm...
3,Rheumatic Diseases,NCT00000106,41.8 Degree Centigrade Whole Body Hyperthermia...
4,"Heart Defects, Congenital",NCT00000107,Body Water Content in Cyanotic Congenital Hear...


In [26]:
query = """
SELECT DISTINCT
    LOWER(TRIM(c.name)) AS condition_name,
    s.nct_id,
    s.brief_title,
    s.overall_status,
    s.study_type
FROM
    conditions c
JOIN
    studies s ON c.nct_id = s.nct_id
WHERE
    LOWER(s.study_type) = 'interventional'
    AND LOWER(s.overall_status) IN ('recruiting', 'not yet recruiting', 'active, not recruiting')
    AND c.name IS NOT NULL
LIMIT 5000;
"""
trials_df = pd.read_sql(query, conn)
trials_df.head()

  trials_df = pd.read_sql(query, conn)


Unnamed: 0,condition_name,nct_id,brief_title,overall_status,study_type
0,03 - bacteria,NCT06988514,Stool Collection for Model,RECRUITING,INTERVENTIONAL
1,177lu-eb-fapi,NCT05400967,Diagnosis of Metastatic Tumors on 68Ga-FAPI PE...,RECRUITING,INTERVENTIONAL
2,18f-fdg pet/ct,NCT07048405,Intermittent Cold Exposure and Brown Adipose T...,RECRUITING,INTERVENTIONAL
3,18 years and over,NCT06300827,The Effect of Structured Digital-Based Educati...,RECRUITING,INTERVENTIONAL
4,19 and 20+ b cell hematologic tumors,NCT05388695,To Observe the Dual-target Chimeric Antigen Re...,RECRUITING,INTERVENTIONAL


In [27]:
added_nodes = 0
added_edges = 0
linked = []

for _, row in trials_df.iterrows():
    disease_name = row["condition_name"].strip().lower()
    nct_id = row["nct_id"]
    title = row["brief_title"]

    # Match disease node by name (case-insensitive)
    matched_nodes = [
        n for n, d in G.nodes(data=True)
        if d.get("label") == "Disease" and d.get("name", "").strip().lower() == disease_name
    ]

    for disease_node in matched_nodes:
        trial_node = f"ClinicalTrial::{nct_id}"

        if not G.has_node(trial_node):
            G.add_node(trial_node, label="ClinicalTrial", name=title)
            added_nodes += 1

        if not G.has_edge(disease_node, trial_node):
            G.add_edge(disease_node, trial_node, relation="studied_in")
            added_edges += 1
            linked.append((disease_node, trial_node))

print(f"\n✅ Trial enrichment complete.")
print(f"📦 Added {added_nodes} trial nodes and 🔗 {added_edges} edges.")
print("🧪 Sample links:")
for u, v in linked[:5]:
    print(f"🔗 {u} → studied_in → {v}")



✅ Trial enrichment complete.
📦 Added 63 trial nodes and 🔗 63 edges.
🧪 Sample links:
🔗 Disease::DOID:7693 → studied_in → ClinicalTrial::NCT02548546
🔗 Disease::DOID:7693 → studied_in → ClinicalTrial::NCT03918460
🔗 Disease::DOID:7693 → studied_in → ClinicalTrial::NCT04307992
🔗 Disease::DOID:7693 → studied_in → ClinicalTrial::NCT04500756
🔗 Disease::DOID:7693 → studied_in → ClinicalTrial::NCT04746677


In [30]:
gene_nodes = [
    (n, d.get("name")) for n, d in G.nodes(data=True)
    if d.get("label") == "Gene" and d.get("name")
]


In [36]:
added_diseases = 0
added_edges = 0
skipped = 0

for node, ensembl_id in tqdm(gene_nodes[:100]):  # test range if needed
    query = gql(f"""
    query {{
      target(ensemblId: "{ensembl_id}") {{
        associatedDiseases(page: {{ index: 0, size: 10 }}) {{
          rows {{
            disease {{ id name }}
            score
          }}
        }}
      }}
    }}
    """)
    try:
        result = client.execute(query)
        target_block = result.get("target")
        if not target_block:
            skipped += 1
            continue

        rows = target_block.get("associatedDiseases", {}).get("rows", [])
        for row in rows:
            raw_id = row["disease"]["id"]
            disease_id = f"Disease::{raw_id}"
            disease_name = row["disease"]["name"]
            score = row.get("score", 0)

            # Always set or update name
            if not G.has_node(disease_id):
                G.add_node(disease_id, label="Disease", name=disease_name)
                added_diseases += 1
            else:
                G.nodes[disease_id]["label"] = "Disease"
                G.nodes[disease_id]["name"] = disease_name

            if not G.has_edge(node, disease_id):
                G.add_edge(node, disease_id, relation="associated_with", score=score)
                added_edges += 1

    except Exception as e:
        print(f"⚠️ Error for {ensembl_id}: {e}")
        skipped += 1

print(f"\n✅ Enrichment complete.")
print(f"📦 Added {added_diseases} disease nodes and 🔗 {added_edges} edges.")
print(f"⏭️ Skipped {skipped} queries.")


100%|██████████| 100/100 [00:16<00:00,  5.96it/s]


✅ Enrichment complete.
📦 Added 0 disease nodes and 🔗 0 edges.
⏭️ Skipped 100 queries.





In [40]:
BIOPORTAL_APIKEY = "41a17e7d-a975-4fc3-a0c7-f6a1915ea89a"  # my real API key


## 4. Load or Build Your Graph <a id='load-graph'></a>

In [1]:
pip install pandas networkx matplotlib requests tqdm




## 5. Enrich Genes with UniProt <a id='enrich-genes'></a>

## 6. Enrich Diseases with MeSH/OMIM <a id='enrich-diseases'></a>

In [6]:
def enrich_diseases_with_mesh_omim(G, max_diseases=200):
    print("📚 Enriching diseases with MeSH/OMIM using BioPortal (fuzzy match)...")

    search_url = "https://data.bioontology.org/search"
    disease_nodes = [n for n, d in G.nodes(data=True) if d.get("label") == "Disease"]
    disease_nodes = disease_nodes[:max_diseases]

    for node_id in disease_nodes:
        disease_name = G.nodes[node_id].get("name")
        if not disease_name:
            continue

        try:
            params = {
                "q": disease_name,
                "ontologies": "MESH,OMIM",
                "format": "json",
                "pagesize": 1  # only need top hit
            }
            response = requests.get(search_url, params=params)
            data = response.json()

            if data.get("collection"):
                match = data["collection"][0]
                G.nodes[node_id]["bioportal_id"] = match.get("@id")
                G.nodes[node_id]["bioportal_label"] = match.get("prefLabel")
                G.nodes[node_id]["bioportal_synonyms"] = match.get("synonym")
                G.nodes[node_id]["bioportal_links"] = match.get("links", {}).get("self")

        except Exception as e:
            print(f"⚠️ {disease_name}: {e}")

        time.sleep(0.3)

    print("✅ Fuzzy disease enrichment complete.")


## 7. Results & Visualization <a id='visualization'></a>

## 8. Save & Export <a id='save-export'></a>

## 9. Next Steps & References <a id='next-steps'></a>
- Consider adding Ensembl integration.
- Extend with additional data sources or visualizations.

In [2]:
# Remove corrupted/existing
!rm -f hetionet-nodes.tsv hetionet-edges.sif.gz

# Download both from GitHub directly
!wget https://github.com/hetio/hetionet/raw/master/hetnet/tsv/hetionet-v1.0-nodes.tsv -O hetionet-nodes.tsv
!wget https://github.com/hetio/hetionet/raw/master/hetnet/tsv/hetionet-v1.0-edges.sif.gz -O hetionet-edges.sif.gz


--2025-07-12 20:56:20--  https://github.com/hetio/hetionet/raw/master/hetnet/tsv/hetionet-v1.0-nodes.tsv
Resolving github.com (github.com)... 140.82.116.4
Connecting to github.com (github.com)|140.82.116.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/hetio/hetionet/master/hetnet/tsv/hetionet-v1.0-nodes.tsv [following]
--2025-07-12 20:56:20--  https://raw.githubusercontent.com/hetio/hetionet/master/hetnet/tsv/hetionet-v1.0-nodes.tsv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.108.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2509984 (2.4M) [text/plain]
Saving to: ‘hetionet-nodes.tsv’


2025-07-12 20:56:20 (45.2 MB/s) - ‘hetionet-nodes.tsv’ saved [2509984/2509984]

--2025-07-12 20:56:20--  https://github.com/hetio/hetionet/raw/master/hetn

Enriching genes from UniProt

Enriching diseases with MeSH/OMIM using BioPortal

In [None]:
api key: "41a17e7d-a975-4fc3-a0c7-f6a1915ea89a"

In [None]:
api key: "41a17e7d-a975-4fc3-a0c7-f6a1915ea89a"


In [31]:
def make_ensembl_query(ensembl_id):
    return gql(f"""
    query {{
      target(ensemblId: "{ensembl_id}") {{
        approvedSymbol
        associatedDiseases(page: {{ index: 0, size: 10 }}) {{
          rows {{
            disease {{ id name }}
            score
          }}
        }}
      }}
    }}
    """)

