# Dependencies
Execute the following in the terminal before running any notebooks:
`pip install -r requirements.txt`

# Exercise 1: 1000 Alzheimer's disease and 1000 cancer papers from PubMed 

In [2]:
# Query Entrez API for PubMed IDs of search term and year and return as a list
import requests

def get_pmids(term, year, retmax):
    # Define esearch URL
    url = f"https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&term={term}+AND+{year}[pdat]&retmode=json&retmax={retmax}"
    
    # Query the Entrez API
    r = requests.get(url)

    # Grab the list of PMIDs
    pmids = r.json()["esearchresult"]["idlist"]
    
    return pmids

get_pmids("lupus", "2004", 10)

['21473028',
 '21210563',
 '18958642',
 '18202459',
 '17822285',
 '17642789',
 '17642773',
 '17642626',
 '17642623',
 '17491665']
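
The f-string in `get_pmids` pastes the search term into the URL unescaped. That works for a single word such as "lupus", but a multi-word term or one with special characters may not be sent as intended. Below is a minimal sketch of building the same query with percent-encoding, using only the standard library. `build_esearch_url` is a hypothetical helper, not part of the notebook.

```python
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_esearch_url(term, year, retmax):
    """Hypothetical helper: same query as get_pmids, with percent-encoding."""
    params = {
        "db": "pubmed",
        "term": f"{term} AND {year}[pdat]",
        "retmode": "json",
        "retmax": retmax,
    }
    # urlencode escapes spaces as "+" and brackets as %5B / %5D
    return f"{ESEARCH_URL}?{urlencode(params)}"

print(build_esearch_url("breast cancer", "2023", 10))
```

With `requests`, passing the same dictionary as `requests.get(ESEARCH_URL, params=params)` should have the same effect, since the library encodes the query string itself.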

In [17]:
# Query Entrez for metadata of PubMed papers given a list of PMIDs
import lxml.etree

def get_metadata(pmids):
    # Convert list of PMIDs to a string for POST payload
    pmids_string = ",".join(pmids)

    url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

    # Define parameters for POST payload
    params = {
        'db': 'pubmed',
        'id': pmids_string,
        'retmode': 'xml'
    }

    # Query the Entrez API
    r = requests.post(url, params)

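    # Report the failure and bail out if the request did not succeed
    if r.status_code != 200:
        print(f"Error: {r.status_code}")
        return None

    # Parse the raw bytes so lxml handles the encoding itself
    doc = lxml.etree.fromstring(r.content)

    # Collect the title and abstract of each article, printing them as we go
    metadata = []
    for article in doc.xpath("//PubmedArticle"):
        title = article.findtext(".//ArticleTitle")
        abstract = " ".join(a.text for a in article.xpath(".//AbstractText") if a.text)
        print(title, abstract, sep="\n")
        metadata.append({"ArticleTitle": title, "AbstractText": abstract})
    return metadata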

pmids = ["32008517", "32008517", "32008517"]
metadata = get_metadata(pmids)

Deep Learning for Alzheimer's Disease Classification using Texture Features.
We propose a classification method for Alzheimer's disease (AD) based on the texture of the hippocampus, which is the organ that is most affected by the onset of AD.


In [8]:
# Process all papers and save metadata to JSON
pmids = get_pmids("alzheimers", "2023", 1000) + get_pmids("cancer", "2023", 1000)
pmids_string = ",".join(pmids)
url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# Define parameters for POST payload
params = {
    'db': 'pubmed',
    'id': pmids_string,
    'retmode': 'xml'
}

# Query the Entrez API
r = requests.post(url, params)
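
The request above sends all 2000 PMIDs in a single POST. If a request of that size turns out to be slow or unreliable, one option is to split the ID list into smaller batches and fetch each batch separately. The sketch below shows only the batching step. The `chunked` helper and the batch size of 200 are assumptions chosen for illustration, not values taken from the notebook or required by NCBI.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Placeholder IDs for illustration only; in the notebook this would be `pmids`
example_ids = [str(n) for n in range(1, 451)]

# One comma-separated payload string per batch
payloads = [",".join(batch) for batch in chunked(example_ids, 200)]
print(len(payloads))
```

Each payload string could then be sent as the `'id'` parameter of its own `requests.post` call, with the XML responses parsed one at a time.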

In [13]:
from lxml import etree
import json

# Parse XML response
doc = etree.fromstring(r.text)
titles = doc.xpath("//ArticleTitle")
abstracts = doc.xpath("//AbstractText")
queries = ["alzheimers"] * 1000 + ["cancer"] * 1000

def get_full_abstract(abstract_elements):
    # Concatenate all abstract text elements to form the full abstract
    full_abstract = ' '.join([abstract_elem.text for abstract_elem in abstract_elements if abstract_elem.text])
    return full_abstract

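# Look up each paper's query by PMID rather than by position in the response
query_by_pmid = dict(zip(pmids, queries))

papers_dict = {}
for article in doc.xpath("//PubmedArticle"):
    # Read the fields per article: an abstract may have several sections or be absent
    pmid = article.findtext(".//PMID")

    papers_dict[pmid] = {
        "ArticleTitle": article.findtext(".//ArticleTitle"),
        "AbstractText": get_full_abstract(article.xpath(".//AbstractText")),
        "query": query_by_pmid.get(pmid)
    }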
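
Titles and abstracts in an efetch response do not pair up by position across the whole document, because an abstract can be split into several labelled `AbstractText` sections or be missing altogether. Below is a minimal sketch of per-article parsing followed by JSON serialization. It uses the standard library's `xml.etree.ElementTree` in place of lxml so that it runs without extra dependencies. The sample XML is hand-written to resemble the response shape and is not real PubMed data, and `parse_articles` is a hypothetical name.

```python
import json
import xml.etree.ElementTree as ET

# Hand-written sample shaped like an efetch response; not real PubMed data
SAMPLE_XML = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>1</PMID>
      <Article>
        <ArticleTitle>First placeholder title</ArticleTitle>
        <Abstract>
          <AbstractText Label="BACKGROUND">Part one.</AbstractText>
          <AbstractText Label="RESULTS">Part <i>two</i>.</AbstractText>
        </Abstract>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>2</PMID>
      <Article>
        <ArticleTitle>Second placeholder title</ArticleTitle>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""

def parse_articles(xml_text):
    """Return {pmid: {"ArticleTitle": ..., "AbstractText": ...}} per article."""
    root = ET.fromstring(xml_text)
    papers = {}
    for article in root.iter("PubmedArticle"):
        pmid = article.findtext(".//PMID")
        title_elem = article.find(".//ArticleTitle")
        # itertext() keeps text nested inside inline tags such as <i>
        title = "".join(title_elem.itertext()) if title_elem is not None else ""
        sections = ["".join(a.itertext()).strip() for a in article.iter("AbstractText")]
        papers[pmid] = {
            "ArticleTitle": title,
            "AbstractText": " ".join(s for s in sections if s),
        }
    return papers

papers = parse_articles(SAMPLE_XML)
print(json.dumps(papers, indent=2))
```

To persist the result, the same dictionary can be written with `json.dump(papers, f)` on a file opened for writing at a path of your choice.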

Episodic memory decline is an early symptom of Alzheimer's disease (AD) - a neurodegenerative disease that has a higher prevalence rate in older females compared to older males. However, little is known about why these sex differences in prevalence rate exist. In the current longitudinal task fMRI study, we explored whether there were sex differences in the patterns of memory decline and brain activity during object-location (spatial context) encoding and retrieval in a large sample of cognitively unimpaired older adults from the Pre-symptomatic Evaluation of Novel or Experimental Treatments for Alzheimer's Disease (PREVENT-AD) program who are at heightened risk of developing AD due to having a family history (+FH) of the disease. The goal of the study was to gain insight into whether there are sex differences in the neural correlates of episodic memory decline, which may advance knowledge about sex-specific patterns in the natural progression to AD. Our results indicate that +FH femal