# Why Checking DOIs via API is the “Silver Bullet” for AI Detection
Checking references, specifically through their Digital Object Identifiers (DOIs), is arguably the most definitive method to catch AI hallucinations. Large Language Models (LLMs) like ChatGPT often generate plausible-sounding citations that do not actually exist.

Here is why the Python + doi.org Content Negotiation method is superior:

- Deterministic Accuracy (Binary Result)

  Unlike analyzing writing style or “perplexity” scores, which are probabilistic and prone to false positives, a DOI check is binary. A DOI either exists in the global registry, or it doesn’t.

  Result: 404 Not Found means the DOI is not registered, so the reference is either fabricated or carries a mistyped DOI.

- Detecting “Stolen” DOIs

  AI sometimes hallucinates by attaching a real DOI from an unrelated paper to a fabricated citation.

  The Fix: By retrieving the metadata (JSON) directly from the registry, you can compare the actual title in the database against the title listed in the suspicious paper. If the paper claims to be about “Economics” but the DOI resolves to “Marine Biology,” the citation is fabricated or misattributed, which is a strong signal of AI generation.

- Global Coverage (Not Just One Publisher)

  By querying the central doi.org resolver rather than specific publisher APIs (like Elsevier or Wiley), this method covers any work with a registered DOI, regardless of publisher.

  Efficiency: It follows redirects automatically, finding the metadata whether the DOI is registered with Crossref, DataCite, or mEDRA.

- Scalability and Automation

  Manually clicking 50 links is tedious. This Python script allows for batch processing. You can feed it a list of 100 DOIs and receive a full audit report in a fraction of the time a manual check would take, making it well suited to editors, professors, or automated quality control systems.
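The title comparison described under “Detecting ‘Stolen’ DOIs” can be sketched independently of the network lookup. The example below is a minimal illustration, not a definitive implementation: the helper names `normalize_title` and `titles_match` and the 0.85 similarity threshold are assumptions made here, and a real audit would need to tune the threshold against known-good references.

```python
import re
from difflib import SequenceMatcher


def normalize_title(title):
    """Lowercase the title, replace punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", title.lower())
    return " ".join(text.split())


def titles_match(cited_title, registry_title, threshold=0.85):
    """
    Compare the title printed in the paper with the title returned by the registry.
    Returns (is_match, similarity). The threshold is an assumed starting point.
    """
    a = normalize_title(cited_title)
    b = normalize_title(registry_title)
    similarity = SequenceMatcher(None, a, b).ratio()
    return similarity >= threshold, similarity


# Same title with different casing and punctuation: treated as a match.
print(titles_match("Physics and the real world.", "Physics and the Real World"))

# Unrelated topics behind the same DOI: flagged as a mismatch.
print(titles_match("Monetary Policy in Emerging Markets",
                   "Coral Reef Fish Population Dynamics"))
```

A fuzzy comparison is used rather than strict equality because registry titles often differ from cited titles in capitalization, punctuation, or subtitles.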

```python
import requests

def verify_doi_validity(doi_input):
    """
    Checks if a DOI exists by querying the doi.org resolver directly.
    Returns detailed metadata if valid, or an error status if invalid.
    """
    # Clean the input so that only the bare DOI string remains
    clean_doi = doi_input.strip()
    prefixes = (
        "https://doi.org/", "http://doi.org/",
        "https://dx.doi.org/", "http://dx.doi.org/",
        "doi:",
    )
    for prefix in prefixes:
        if clean_doi.lower().startswith(prefix):
            clean_doi = clean_doi[len(prefix):]
            break

    url = f"https://doi.org/{clean_doi}"

    # Content negotiation: ask for CSL JSON metadata instead of the landing page
    headers = {
        "Accept": "application/vnd.citationstyles.csl+json"
    }

    try:
        response = requests.get(url, headers=headers, allow_redirects=True, timeout=10)

        if response.status_code == 200:
            try:
                data = response.json()
            except ValueError:
                return {"status": "Error", "details": "Response was not valid JSON."}

            # 1. Extracting Title (may be a string or a list of strings)
            title = data.get('title', 'N/A')
            if isinstance(title, list):
                title = title[0] if title else 'N/A'

            # 2. Extracting Journal Name (Container Title)
            journal = data.get('container-title', 'N/A')
            if isinstance(journal, list):
                journal = journal[0] if journal else 'N/A'

            # 3. Extracting First Author's Last Name
            author_lastname = "N/A"
            authors = data.get('author') or []
            if authors:
                # We take the first author in the list
                author_lastname = authors[0].get('family', 'N/A')

            return {
                "status": "Valid",
                "real_title": title,
                "journal": journal,
                "first_author": author_lastname
            }

        elif response.status_code == 404:
            return {"status": "Invalid", "details": "DOI not found"}
        else:
            return {"status": "Error", "details": f"HTTP Code: {response.status_code}"}

    except requests.RequestException as e:
        # Timeouts, DNS failures, refused connections, too many redirects, etc.
        return {"status": "Connection Error", "details": str(e)}

# --- Usage Example ---

doi_list_to_check = [
    "10.1038/nature123",            # Fake
    "10.1007/s10701-005-9016-x",    # Valid (Physics paper)
    "10.1016/j.jbi.2008.04.002",    # Valid (Bioinformatics paper)
    "10.1126/science.fake.999"      # Fake
]

# Header format for the table
print(f"{'DOI':<27} | {'Status':<8} | {'Author':<15} | {'Journal':<20} | {'Real Title'}")
print("-" * 110)

for doi in doi_list_to_check:
    result = verify_doi_validity(doi)

    if result['status'] == "Valid":
        # Clean and shorten strings for table display
        author = str(result['first_author'])[:15]
        journal = str(result['journal'])[:20]
        title = str(result['real_title'])[:35] + "..."

        print(f"{doi:<27} | {result['status']:<8} | {author:<15} | {journal:<20} | {title}")
    else:
        # For errors, we just print the details in the last column
        print(f"{doi:<27} | {result['status']:<8} | {'-':<15} | {'-':<20} | {result.get('details', '-')}")
```


```
DOI                         | Status   | Author          | Journal              | Real Title
--------------------------------------------------------------------------------------------------------------
10.1038/nature123           | Invalid  | -               | -                    | DOI not found
10.1007/s10701-005-9016-x   | Valid    | Ellis           | Foundations of Physi | Physics and the Real World...
10.1016/j.jbi.2008.04.002   | Valid    | Sward           | Journal of Biomedica | Reasons for declining computerized ...
10.1126/science.fake.999    | Invalid  | -               | -                    | DOI not found
```
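The run above checks DOIs one at a time, so a long reference list spends most of its time waiting on the network. One possible way to speed up a batch is to run the lookups concurrently. The sketch below is only an illustration: `audit_dois`, `summarize`, and `fake_checker` are hypothetical names introduced here, and the small `max_workers` default is an assumption chosen to stay gentle on the resolver. The stand-in checker lets the example run offline; in practice you would pass `verify_doi_validity` instead.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def audit_dois(dois, checker, max_workers=4):
    """
    Run `checker` on every unique, non-empty DOI using a small thread pool.
    Returns a dict mapping each DOI to the checker's result.
    """
    # Strip whitespace, drop empty entries, and de-duplicate while keeping order
    unique = list(dict.fromkeys(d.strip() for d in dois if d.strip()))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(checker, unique))
    return dict(zip(unique, results))


def summarize(report):
    """Count how many DOIs ended up in each status."""
    return Counter(result["status"] for result in report.values())


# Stand-in for verify_doi_validity so that the sketch runs without network access.
def fake_checker(doi):
    known = {"10.1007/s10701-005-9016-x"}
    return {"status": "Valid"} if doi in known else {"status": "Invalid"}


report = audit_dois(
    ["10.1038/nature123", "10.1007/s10701-005-9016-x", "10.1038/nature123"],
    fake_checker,
)
for doi, result in report.items():
    print(doi, "->", result["status"])
print(dict(summarize(report)))
```

Passing the checker in as an argument keeps the batching logic separate from the HTTP code, which also makes it easy to test without contacting doi.org.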


In this section, we showed that querying doi.org is an efficient way to check whether the references cited in a paper actually exist.