# Find experts to assess a preprint

In [8]:
url = "https://www.medrxiv.org/content/10.1101/2024.12.30.24319800v1"
doi = "https://doi.org/10.1101/2024.12.30.24319800"
authors_str = "Friederike Hoheisel, Kathrin Maria Fleischer, Kerstin Rubarth, Nuno Sepúlveda, Sandra Bauer, Frank Konietschke, Claudia Kedor, Annika Elisa Stein, Kirsten Wittke, Martina Seifert, Judith Bellmann-Strobl, Josef Mautner, Uta Behrends, Carmen Scheibenbogen, Franziska Sotzny"
title = "Autoantibodies to Arginine-rich Sequences Mimicking Epstein-Barr Virus in Post-COVID and Myalgic Encephalomyelitis/Chronic Fatigue Syndrome"
abstract = """Background Epstein-Barr virus (EBV) infection is a known trigger and risk factor for myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) and post-COVID syndrome (PCS). In previous studies, we found enhanced IgG reactivity to EBV EBNA4 and EBNA6 arginine-rich sequences in postinfectious ME/CFS (piME/CFS).

Objective This study aims to investigate IgG responses to arginine-rich (poly-R) EBNA4 and EBNA6 sequences and homologous human sequences in PCS and ME/CFS.

Methods The IgG responses against poly-R EBNA4 and EBNA6 and corresponding homologous human 15-mer peptides and respective full-length proteins were analyzed using a cytometric bead array (CBA) and a multiplex dot-blot assay. Sera of 45 PCS patients diagnosed according to WHO criteria, with 26 patients fulfilling the Canadian Consensus criteria for ME/CFS (pcME/CFS), 36 patients with non-COVID post-infectious ME/CFS (piME/CFS), and 34 healthy controls (HC) were investigated.

Results Autoantibodies to poly-R peptide sequences of the neuronal antigen SRRM3, the ion channel SLC24A3, TGF-β signaling regulator TSPLY2, angiogenic regulator TSPYL5, as well as to full-length α-adrenergic receptor (ADRA) proteins were more frequent in patients. Several autoantibodies were positively associated with key symptoms of autonomic dysfunction, fatigue, cognition, and pain.

Conclusion Collectively, we identified autoantibodies with new antigen specificities with a potential role in PCS and ME/CFS.

Clinical Implication These finding should prompt further studies on the function of these autoantibodies, their exploitation for diagnostic use, and of drugs targeting autoantibodies.

Capsule summary Our study reveals elevated autoantibodies to EBV-related poly-R sequences and their human homologues in PCS and ME/CFS patients associated with symptom severity, suggesting a potential role in disease pathogenesis."""

**Paths to the goal**

- Semi-automatic workflow: automate large parts of the search and give the user ways to intervene
    - The workflow takes time: not a _search experience_ as in EE2, but rather a research assistant that saves time through massive parallelization and automation
- Studies => candidates
    - For preprints, the metadata (via Crossref) can be queried in OpenAlex by DOI after X days ([example](https://api.openalex.org/works/https://doi.org/10.1101/2024.12.30.24319800)); from there, find related papers and authors from Germany, or go via shared keywords, topics, etc.
    - Search for the preprint in Semantic Scholar => paper [Recommendations API](https://api.semanticscholar.org/api-docs/recommendations); find authors from Germany via the recommended papers
    - Generate queries from title + abstract with an LLM and use them to search databases such as PubMed, OpenAlex, etc. for similar studies that are as recent as possible; have the LLM assess how similar the content of each retrieved study is to the study under review; extract authors from Germany from the similar studies
- Candidates => more candidates
    - Expand co-author/citation networks, extract authors from Germany
- Candidates => suggestions
    - Match authors to OpenAlex / Semantic Scholar / etc. entities (carry over information from previous steps where available)
        - Compare topics
        - Compare topics of the last N publications
        - Include metrics
        - Include how recent the latest publications are
- Useful for searches: https://jina.ai/reader
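
The OpenAlex route from the list above could look roughly like this. It is a minimal sketch under stated assumptions, not the notebook's implementation: the helper names `fetch_openalex_work` and `german_authors` are hypothetical, the sketch assumes that a work record carries `authorships` whose `institutions` have a `country_code`, and the sample record is hand-made rather than real API output. It uses `urllib` from the standard library; `requests` would work the same way.

```python
import json
import urllib.request
from typing import Any

# Same endpoint as the example link in the list above
OPENALEX_WORKS = "https://api.openalex.org/works/"


def fetch_openalex_work(doi_url: str) -> dict[str, Any]:
    """Fetch an OpenAlex work record by its full DOI URL (hypothetical helper)."""
    with urllib.request.urlopen(OPENALEX_WORKS + doi_url, timeout=30) as resp:
        return json.load(resp)


def german_authors(work: dict[str, Any], country: str = "DE") -> list[dict[str, Any]]:
    """Return the authors that list at least one institution in `country`."""
    hits = []
    for authorship in work.get("authorships", []):
        institutions = authorship.get("institutions") or []
        if any(inst.get("country_code") == country for inst in institutions):
            author = authorship.get("author") or {}
            hits.append({
                "id": author.get("id"),
                "name": author.get("display_name"),
                "institutions": [inst.get("display_name") for inst in institutions],
            })
    return hits


# Offline demonstration with an invented record (assumed field layout)
sample_work = {
    "authorships": [
        {
            "author": {"id": "A1", "display_name": "A. Example"},
            "institutions": [{"display_name": "Example Clinic", "country_code": "DE"}],
        },
        {
            "author": {"id": "A2", "display_name": "B. Sample"},
            "institutions": [{"display_name": "Sample Institute", "country_code": "PT"}],
        },
    ]
}
print(german_authors(sample_work))
```

With network access, `german_authors(fetch_openalex_work(doi))` would apply the same filter to the preprint itself once it is indexed.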

In [36]:
import requests
from urllib.parse import urlparse, quote
import json
from funcy import lfilter, lpluck
from typing import Any

In [33]:
query_endpoint = "https://api.semanticscholar.org/graph/v1/paper/search?query="
data_endpoint = "https://api.semanticscholar.org/graph/v1/paper/"
data_batch_endpoint = "https://api.semanticscholar.org/graph/v1/paper/batch"
recommendation_endpoint = "https://api.semanticscholar.org/recommendations/v1/papers/forpaper/"
# "?fields=title,url,year,externalIds"

In [72]:
def get_semantic_id(query: str) -> list[dict[str, Any]]:
    """
    Search the Semantic Scholar API for a query string and return the list of matching papers.
    """

    # Percent-encode the lowercased query so it is safe to use in a URL
    parsed = quote(query.lower())
    # Make a GET request to the Semantic Scholar API with the encoded query string
    r = requests.get(query_endpoint + parsed)

    # Load the response content as JSON
    content = json.loads(r.content)
    # Fall back to an empty list if the payload has no "data" key (e.g. an error response)
    data = content.get("data", [])

    return data

# The Semantic Scholar paper ID can also be obtained directly via the DOI: https://api.semanticscholar.org/api-docs/#tag/Paper-Data/operation/get_graph_get_paper
def get_semantic_id_by_title(title: str) -> str:
    """
    Query the Semantic Scholar API for a title and return the paperId of the first exact title match,
    or the string "not found" if no result matches.
    """

    data = get_semantic_id(title)

    # Keep only results whose title matches exactly (case-sensitive)
    paper_ids = lfilter(lambda x: x["title"] == title, data)

    if paper_ids:
        return paper_ids[0]["paperId"]
    else:
        return "not found"
    
def get_paper_details(paper_id: str, return_fields="title,url,year,externalIds") -> dict[str, Any]:
    """
    Retrieve the details of a paper, given its paper_id and a comma-separated list of return_fields, from the Semantic Scholar API.
    """

    # Make a GET request to the Semantic Scholar API with the specified ID and fields
    r = requests.get(f"{data_endpoint}{paper_id}?fields={return_fields}")
    # Load the response content as JSON
    content = json.loads(r.content)

    return content

def get_paper_details_batch(paper_ids: list[str], return_fields="title,url,year,externalIds") -> list[dict[str, Any]]:
    """
    Retrieve the details of a batch of papers, given paper_ids and return_fields, from the Semantic Scholar API.
    The response is a JSON list with one entry per requested ID.
    """
    # The batch endpoint takes the fields as a query parameter and the IDs in the JSON body
    r = requests.post(data_batch_endpoint, params={"fields":return_fields}, json={"ids":paper_ids})
    content = json.loads(r.content)

    return content
    

def get_recommended_paper_ids(paper_id: str) -> dict[str, Any]:
    """
    Retrieve recommendations for a given paper_id from the Semantic Scholar API.
    The recommended papers are listed under the "recommendedPapers" key of the returned dict.
    """

    r = requests.get(f"{recommendation_endpoint}{paper_id}")
    content = json.loads(r.content)

    return content
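
The comment above `get_semantic_id_by_title` notes that the paper ID can also be resolved from the DOI, which avoids the exact title match. A minimal sketch of that route follows, assuming the Graph API accepts identifiers of the form `DOI:<doi>` in place of a paper ID. The names `doi_to_s2_identifier` and `get_semantic_id_by_doi` are hypothetical and not part of the notebook.

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import quote, urlparse

S2_PAPER_ENDPOINT = "https://api.semanticscholar.org/graph/v1/paper/"


def doi_to_s2_identifier(doi: str) -> str:
    """Turn a bare DOI or a DOI URL into the 'DOI:<doi>' form (hypothetical helper)."""
    doi = doi.strip()
    if doi.lower().startswith("http"):
        # "https://doi.org/10.1101/..." -> "10.1101/..."
        doi = urlparse(doi).path.lstrip("/")
    return f"DOI:{doi}"


def get_semantic_id_by_doi(doi: str) -> Optional[str]:
    """Look up the Semantic Scholar paperId for a DOI; needs network access."""
    identifier = quote(doi_to_s2_identifier(doi), safe=":/")
    url = f"{S2_PAPER_ENDPOINT}{identifier}?fields=paperId"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp).get("paperId")


# Offline check of the identifier construction only
print(doi_to_s2_identifier("https://doi.org/10.1101/2024.12.30.24319800"))
```

Calling `get_semantic_id_by_doi(doi)` with the `doi` variable from the first cell would replace the title search, provided the preprint is already indexed.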


In [25]:
paper_id = get_semantic_id_by_title(title)
paper_id

'7a4115db8f0643989eb91f09da3801b65686f9dd'

In [28]:
get_paper_details(paper_id)

{'paperId': '7a4115db8f0643989eb91f09da3801b65686f9dd',
 'externalIds': {'DOI': '10.1101/2024.12.30.24319800', 'CorpusId': 275141355},
 'url': 'https://www.semanticscholar.org/paper/7a4115db8f0643989eb91f09da3801b65686f9dd',
 'title': 'Autoantibodies to Arginine-rich Sequences Mimicking Epstein-Barr Virus in Post-COVID and Myalgic Encephalomyelitis/Chronic Fatigue Syndrome',
 'year': 2024}

In [43]:
recs = get_recommended_paper_ids(paper_id)
rec_paper_ids = lpluck("paperId", recs["recommendedPapers"])

In [79]:
paper_authors = get_paper_details_batch(rec_paper_ids, "title,authors.authorId,authors.name,authors.externalIds,authors.paperCount,authors.citationCount,authors.hIndex")
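
To turn the batch response into a candidate list, the authors can be aggregated across the recommended papers. The following is a minimal sketch, assuming the response is a list of paper dicts, each with an `authors` list carrying the requested fields. The function name `collect_candidates`, the ranking rule (number of recommended papers first, then `hIndex`) and the second sample paper are illustrative assumptions. The requested fields carry no affiliation data, so the Germany filter would still have to come from another source.

```python
from typing import Any


def collect_candidates(papers: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Aggregate authors over papers by authorId and rank them (hypothetical helper)."""
    candidates: dict[str, dict[str, Any]] = {}
    for paper in papers:
        if not paper:
            # Skip empty entries defensively (assumption: unknown IDs may yield null)
            continue
        for author in paper.get("authors") or []:
            author_id = author.get("authorId")
            if author_id is None:
                continue
            entry = candidates.setdefault(author_id, {
                "authorId": author_id,
                "name": author.get("name"),
                "hIndex": author.get("hIndex") or 0,
                "papers": [],
            })
            entry["papers"].append(paper.get("title"))
    # Authors on more recommended papers come first; hIndex breaks ties
    return sorted(
        candidates.values(),
        key=lambda c: (len(c["papers"]), c["hIndex"]),
        reverse=True,
    )


# Offline demonstration; the second paper is invented for illustration
sample_papers = [
    {"title": "Paper A", "authors": [
        {"authorId": "2244709870", "name": "E. Stein", "hIndex": 2},
        {"authorId": "7177149", "name": "K. Wittke", "hIndex": 18},
    ]},
    {"title": "Paper B (invented)", "authors": [
        {"authorId": "7177149", "name": "K. Wittke", "hIndex": 18},
    ]},
    None,
]
for candidate in collect_candidates(sample_papers):
    print(candidate["name"], len(candidate["papers"]), candidate["hIndex"])
```

In the notebook, `collect_candidates(paper_authors)` would apply the same aggregation to the real response.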

In [83]:
print(json.dumps(paper_authors, indent=2))

[
  {
    "paperId": "c72b5077da352cba455ad3026e294732d139bb4a",
    "title": "Efficacy of repeated immunoadsorption in patients with post-COVID myalgic encephalomyelitis/chronic fatigue syndrome and elevated \u03b22-adrenergic receptor autoantibodies: a prospective cohort study",
    "authors": [
      {
        "authorId": "2244709870",
        "externalIds": {},
        "name": "E. Stein",
        "paperCount": 6,
        "citationCount": 19,
        "hIndex": 2
      },
      {
        "authorId": "2220110766",
        "externalIds": {},
        "name": "C. Heindrich",
        "paperCount": 5,
        "citationCount": 40,
        "hIndex": 2
      },
      {
        "authorId": "7177149",
        "externalIds": {},
        "name": "K. Wittke",
        "paperCount": 34,
        "citationCount": 1241,
        "hIndex": 18
      },
      {
        "authorId": "2242986655",
        "externalIds": {},
        "name": "C. Kedor",
        "paperCount": 29,
        "citationCount": 695,
  