Prompt: i want to keep up to date on the newest research in trauma therapy, ptsd and cptsd. I believe that searching pubmed is one way to do this, but I would like to write a script to automate it so I don't have to rewrite the queries every time. I would like to search once a month, and then get a list of the 5 most relevant/popular articles about that topic. is it possible to write a python program to do this? do they have an api? or do you recommend some other way

Yes, it’s absolutely possible to automate this with Python. PubMed can be searched programmatically through NCBI's [E-utilities](https://www.ncbi.nlm.nih.gov/books/NBK25501/), the public API for the Entrez databases. The `Bio.Entrez` module from the Biopython library wraps that API, so you can query PubMed and retrieve metadata such as article titles, authors, and links.
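
To make the API part concrete: E-utilities are plain HTTP endpoints, and `Bio.Entrez` builds requests like the one below for you. This is only a sketch of what a search URL looks like. The helper name `build_esearch_url` and its defaults are my own for illustration, not part of Biopython or NCBI's tooling, and the function only builds the URL without sending it.

```python
from urllib.parse import urlencode

# Public E-utilities endpoint for searches
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, max_results=5, days=30, email="your_email@example.com"):
    """Build (but do not send) a PubMed search URL. Hypothetical helper."""
    params = {
        "db": "pubmed",
        "term": term,
        "retmax": max_results,
        "sort": "relevance",
        "datetype": "pdat",  # filter on publication date
        "reldate": days,     # only the last `days` days
        "retmode": "json",
        "email": email,      # NCBI asks callers to identify themselves
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_esearch_url("PTSD OR CPTSD"))
```

Pasting the printed URL into a browser is a quick way to check a query before wiring it into a script.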

Here's a basic example script to:

* Query PubMed for trauma therapy, PTSD, and CPTSD.
* Retrieve metadata for the 5 most relevant articles published in the last 30 days.
* Output titles and links.

### Step 1: Install Biopython

```bash
pip install biopython
```

### Step 2: Example Script




In [None]:
from Bio import Entrez
from datetime import datetime

# Set your email (required by NCBI)
Entrez.email = "your_email@example.com"

def search_pubmed(query, max_results=5):
    handle = Entrez.esearch(
        db="pubmed",
        term=query,
        retmax=max_results,
        sort="relevance",
        datetype="pdat",  # use publication date
        reldate=30        # last 30 days
    )
    record = Entrez.read(handle)
    handle.close()
    return record['IdList']

def fetch_details(id_list):
    ids = ",".join(id_list)
    handle = Entrez.efetch(db="pubmed", id=ids, retmode="xml")
    records = Entrez.read(handle)
    handle.close()
    return records

def format_results(records):
    for i, article in enumerate(records['PubmedArticle']):
        title = article['MedlineCitation']['Article']['ArticleTitle']
        authors = article['MedlineCitation']['Article'].get('AuthorList', [])
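        pub_date_parts = article['MedlineCitation']['Article']['Journal']['JournalIssue']['PubDate']
        # PubDate is a dict (Year/Month/Day, or a free-text MedlineDate), so flatten it for printing
        pub_date = " ".join(str(pub_date_parts[k]) for k in ("Year", "Month", "Day") if k in pub_date_parts) \
            or str(pub_date_parts.get("MedlineDate", "n/a"))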
        article_id = article['MedlineCitation']['PMID']
        link = f"https://pubmed.ncbi.nlm.nih.gov/{article_id}/"

        author_names = ', '.join(
            [f"{a['ForeName']} {a['LastName']}" for a in authors if 'ForeName' in a and 'LastName' in a]
        )
        print(f"{i+1}. {title}")
        print(f"   Authors: {author_names}")
        print(f"   Published: {pub_date}")
        print(f"   Link: {link}\n")

if __name__ == "__main__":
    query = "trauma therapy OR PTSD OR CPTSD"
    ids = search_pubmed(query)
    if ids:
        articles = fetch_details(ids)
        format_results(articles)
    else:
        print("No new articles found.")


### Features to Extend

* Save results to a CSV or HTML report.
* Schedule it monthly using `cron` (Linux/macOS) or Task Scheduler (Windows).
* Add email delivery via `smtplib`.
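
As a rough idea of the first extension, here is a minimal sketch of writing results to a CSV file. It assumes you collect each article into a dict with `title`, `authors`, and `link` keys instead of printing it; the function name `save_to_csv` and the default filename are placeholders I made up.

```python
import csv


def save_to_csv(rows, path="pubmed_digest.csv"):
    """Write article dicts to a CSV file; column names are illustrative."""
    fieldnames = ["title", "authors", "link"]
    with open(path, "w", newline="", encoding="utf-8") as f:
        # extrasaction="ignore" drops any keys that are not listed columns
        writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)
    return path
```

For the monthly schedule, a cron entry starting with `0 8 1 * *` would run a command at 08:00 on the first day of every month; the command itself depends on where you save the script.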

Would you like help turning this into a monthly automation or outputting results into a file or email?

# Updated script to exclude results already seen

In [None]:
import smtplib
import os
import json
from datetime import datetime
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from Bio import Entrez

# === Configuration ===
Entrez.email = "email@place.com" # Required by NCBI

SMTP_SERVER = "smtp.gmail.com"
SMTP_PORT = 587
EMAIL_ADDRESS = "email@place.com"      # Sender
EMAIL_PASSWORD = os.environ.get("EMAIL_PASSWORD", "")  # Gmail App Password, read from the environment (variable name is an assumption)
TO_EMAIL = "email@place.com"           # Recipient

QUERY = "PTSD OR CPTSD"
MAX_RESULTS = 5
SEEN_FILE = "seen_pmids.json"


# === PubMed Query ===
def search_pubmed(query, max_results=10):
    handle = Entrez.esearch(
        db="pubmed",
        term=query,
        retmax=max_results,
        sort="relevance",
        datetype="pdat",
        reldate=30
    )
    record = Entrez.read(handle)
    handle.close()
    return record['IdList']


def fetch_details(id_list):
    ids = ",".join(id_list)
    handle = Entrez.efetch(db="pubmed", id=ids, retmode="xml")
    records = Entrez.read(handle)
    handle.close()
    return records


# === HTML Formatting ===
def format_results(records):
    html = f"""
    <html>
    <body>
        <h2>PubMed Research Digest: Trauma Therapy, PTSD, CPTSD</h2>
        <p><i>Latest articles from the past 30 days</i></p>
        <ol>
    """
    for article in records['PubmedArticle']:
        title = article['MedlineCitation']['Article']['ArticleTitle']
        article_id = article['MedlineCitation']['PMID']
        link = f"https://pubmed.ncbi.nlm.nih.gov/{article_id}/"
        authors = article['MedlineCitation']['Article'].get('AuthorList', [])
        author_str = ', '.join(
            [f"{a.get('ForeName', '')} {a.get('LastName', '')}".strip() for a in authors if 'LastName' in a]
        )
        html += f"<li><b>{title}</b><br>"
        if author_str:
            html += f"<small>{author_str}</small><br>"
        html += f"<a href='{link}'>{link}</a></li><br>"

    html += f"""
        </ol>
        <p style="font-size: small;">Generated on {datetime.now().strftime('%Y-%m-%d %H:%M')}</p>
    </body>
    </html>
    """
    return html


# === Email Sender ===
def send_email(subject, body_html):
    msg = MIMEMultipart("alternative")
    msg['From'] = EMAIL_ADDRESS
    msg['To'] = TO_EMAIL
    msg['Subject'] = subject

    msg.attach(MIMEText(body_html, "html"))

    with smtplib.SMTP(SMTP_SERVER, SMTP_PORT) as server:
        server.starttls()
        server.login(EMAIL_ADDRESS, EMAIL_PASSWORD)
        server.send_message(msg)


# === Seen PMIDs Tracker ===
def load_seen_pmids():
    if os.path.exists(SEEN_FILE):
        with open(SEEN_FILE, "r") as f:
            return set(json.load(f))
    return set()


def save_seen_pmids(pmids):
    with open(SEEN_FILE, "w") as f:
        json.dump(list(pmids), f)


# === Main Runner ===
if __name__ == "__main__":
    try:
        seen_pmids = load_seen_pmids()
        ids = search_pubmed(QUERY, MAX_RESULTS)
        new_ids = [pmid for pmid in ids if pmid not in seen_pmids]

        if new_ids:
            records = fetch_details(new_ids)
            html = format_results(records)
            send_email("Monthly PubMed Digest: Trauma/PTSD Research", html)

            seen_pmids.update(new_ids)
            save_seen_pmids(seen_pmids)

            print(f"Sent {len(new_ids)} new article(s).")
        else:
            print("No new articles since last run.")
    except Exception as e:
        print(f"Error occurred: {e}")


# Final Result (hopefully)

Perfect! Below is the **updated Python script**. Here is what it does:

---

### ✅ Features

* Pulls **PubMed results** restricted with the `"free full text"[Filter]` tag.
* Pulls from two major RSS feeds:

  * *Journal of Traumatic Stress* (Wiley)
  * *European Journal of Psychotraumatology* (fully open access)
* Prioritizes **freely available PDFs**
* Merges results and sends a **monthly digest limited to 10 items**
* Marks PDF availability with ✅
* Avoids duplicate articles using stored PMIDs/URLs
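
One caveat on that last point: PubMed items are keyed by PMID and RSS items by URL, so the same paper arriving from both sources would not be recognized as a duplicate. Below is a minimal sketch of one way to tighten this, assuming article dicts with `id` and `title` keys like the ones the script builds. `normalize_title` and `filter_new` are hypothetical helpers, not part of the script, and title matching is only a heuristic.

```python
import re


def normalize_title(title):
    """Lowercase and collapse punctuation/whitespace so near-identical titles compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def filter_new(articles, seen_ids):
    """Drop articles whose id was already emailed, plus same-run title duplicates."""
    fresh, titles = [], set()
    for art in articles:
        key = normalize_title(art["title"])
        if art["id"] in seen_ids or key in titles:
            continue
        titles.add(key)
        fresh.append(art)
    return fresh
```

Because the first occurrence wins, listing PubMed items before RSS items keeps the PubMed version of a paper that shows up twice.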

---

### ✅ Full Script: `trauma_research_digest.py`




In [None]:
import smtplib
import json
import os
from datetime import datetime
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

from Bio import Entrez
import feedparser

# === Config ===
Entrez.email = "your_email@example.com"

SMTP_SERVER = "smtp.gmail.com"
SMTP_PORT = 587
EMAIL_ADDRESS = "your_email@gmail.com"
EMAIL_PASSWORD = "your_app_password"
TO_EMAIL = "recipient@example.com"

MAX_RESULTS = 10
SEEN_FILE = "seen_articles.json"

# RSS feeds
RSS_FEEDS = {
    "Journal of Traumatic Stress": "https://onlinelibrary.wiley.com/feed/10.1002/(ISSN)1573-6598",
    "European Journal of Psychotraumatology": "https://www.tandfonline.com/feed/rss/zept20"
}


# === Seen article tracking ===
def load_seen():
    if os.path.exists(SEEN_FILE):
        with open(SEEN_FILE, "r") as f:
            return set(json.load(f))
    return set()


def save_seen(seen_set):
    with open(SEEN_FILE, "w") as f:
        json.dump(list(seen_set), f)


# === PubMed search ===
def search_pubmed(term, max_results=10):
    handle = Entrez.esearch(
        db="pubmed",
        term=term,
        retmax=max_results,
        sort="relevance",
        datetype="pdat",
        reldate=30
    )
    record = Entrez.read(handle)
    handle.close()
    return record["IdList"]


def fetch_pubmed_details(id_list):
    ids = ",".join(id_list)
    handle = Entrez.efetch(db="pubmed", id=ids, retmode="xml")
    records = Entrez.read(handle)
    handle.close()

    results = []
    for article in records['PubmedArticle']:
        pmid = article['MedlineCitation']['PMID']
        title = article['MedlineCitation']['Article']['ArticleTitle']
        article_link = f"https://pubmed.ncbi.nlm.nih.gov/{pmid}/"
        authors = article['MedlineCitation']['Article'].get('AuthorList', [])
        author_str = ', '.join(
            [f"{a.get('ForeName', '')} {a.get('LastName', '')}".strip() for a in authors if 'LastName' in a]
        )
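        results.append({
            "source": "PubMed",
            "title": title,
            "link": article_link,
            "authors": author_str,
            "id": f"pmid:{pmid}",
            # efetch records have no "LinkOut" key, so that check was always False.
            # Every hit already passed the "free full text"[Filter] in the search term.
            "pdf": "✅"
        })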
    return results


# === RSS Parsing ===
def fetch_rss_articles():
    articles = []
    for source, url in RSS_FEEDS.items():
        feed = feedparser.parse(url)
        for entry in feed.entries:
            title = entry.get('title', 'No Title')
            link = entry.get('link', '')
            authors = entry.get('author', 'Unknown Author')
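            # EJPT is fully open access; for other feeds fall back to a crude URL check.
            # (Matching "open" in the title would also flag e.g. "open-label" trials.)
            open_access = source == "European Journal of Psychotraumatology"
            pdf = '✅' if open_access or 'pdf' in link.lower() else ''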
            articles.append({
                "source": source,
                "title": title,
                "link": link,
                "authors": authors,
                "id": f"rss:{link}",
                "pdf": pdf
            })
    return articles


# === Email Formatting ===
def format_html_digest(articles):
    html = f"""
    <html><body>
    <h2>Trauma Research Digest</h2>
    <p><i>Latest research from PubMed and leading journals (past 30 days)</i></p>
    <ol>
    """
    for article in articles:
        html += f"<li><b>{article['title']}</b> {article['pdf']}<br>"
        if article['authors']:
            html += f"<small>{article['authors']}</small><br>"
        html += f"<a href='{article['link']}'>{article['link']}</a> <i>({article['source']})</i><br><br></li>"
    html += f"""
    </ol>
    <p style="font-size:small;">Generated {datetime.now().strftime('%Y-%m-%d %H:%M')}</p>
    </body></html>
    """
    return html


# === Send Email ===
def send_email(subject, html_body):
    msg = MIMEMultipart("alternative")
    msg['From'] = EMAIL_ADDRESS
    msg['To'] = TO_EMAIL
    msg['Subject'] = subject
    msg.attach(MIMEText(html_body, "html"))

    with smtplib.SMTP(SMTP_SERVER, SMTP_PORT) as server:
        server.starttls()
        server.login(EMAIL_ADDRESS, EMAIL_PASSWORD)
        server.send_message(msg)


# === Main ===
if __name__ == "__main__":
    try:
        seen = load_seen()

        # Step 1: PubMed (free full text only)
        pubmed_ids = search_pubmed(
            '("psychological trauma"[MeSH Terms] OR "post-traumatic stress disorder"[MeSH Terms] '
            'OR PTSD OR CPTSD OR "complex PTSD") '
            'AND (therapy OR intervention OR treatment OR counseling OR psychotherapy) '
            'AND "free full text"[Filter]',
            max_results=15
        )

        pubmed_articles = fetch_pubmed_details(pubmed_ids)

        # Step 2: RSS feeds
        rss_articles = fetch_rss_articles()

        # Step 3: Combine & filter out seen
        combined = pubmed_articles + rss_articles
        new_articles = [a for a in combined if a["id"] not in seen]

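        # Step 4: Limit total. Split the slots between sources first; otherwise
        # the 15 PubMed hits can fill all MAX_RESULTS places and crowd out every RSS item.
        new_pubmed = [a for a in new_articles if a["source"] == "PubMed"]
        new_rss = [a for a in new_articles if a["source"] != "PubMed"]
        half = MAX_RESULTS // 2
        digest = new_pubmed[:half] + new_rss[:MAX_RESULTS - half]
        # Top up from the leftovers if one source came up short
        leftovers = new_pubmed[half:] + new_rss[MAX_RESULTS - half:]
        digest += leftovers[:MAX_RESULTS - len(digest)]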

        # Step 5: Send and update seen list
        if digest:
            html = format_html_digest(digest)
            send_email("Monthly Trauma Research Digest", html)
            seen.update([a["id"] for a in digest])
            save_seen(seen)
            print(f"Email sent with {len(digest)} new articles.")
        else:
            print("No new articles since last run.")

    except Exception as e:
        print(f"Error: {e}")


---

### 🔁 What to Expect Each Run

* Searches recent **PubMed** articles (free full text only)
* Pulls latest items from **two major trauma journals**
* Filters out previously emailed articles using `seen_articles.json`
* Sends an email digest with **up to 10 items**, tagged with ✅ if PDF/full text is likely available

---

Let me know if you'd like:

* A local `.html` output as a backup
* To add **more journal feeds**
* Abstract previews included (if available)

You're now set for a comprehensive, Open Access–oriented monthly trauma research tracker!
