dlsci

Download scientific paper PDFs by DOI.

Tries open-access (OA) sources first (Unpaywall, Semantic Scholar, arXiv, PMC, bioRxiv, Publisher Direct), then automatically falls back to Sci-Hub if none of them returns a PDF.

Features

  • Six OA sources, tried in priority order: Unpaywall → Semantic Scholar → arXiv → PMC → bioRxiv → Publisher Direct
  • Auto Sci-Hub fallback when OA sources fail
  • Batch download support
  • Structured JSON output for agents

Installation

git clone https://github.com/cypgg/dlsci.git
cd dlsci
pip install requests beautifulsoup4

Usage

# Download a paper (OA + Sci-Hub auto fallback)
python scripts/dlsci.py "10.1038/nature12385"

# Download to specific directory
python scripts/dlsci.py "DOI" --out ./papers

# Disable Sci-Hub (OA sources only)
python scripts/dlsci.py "DOI" --no-scihub

# Batch download
python scripts/dlsci.py --batch dois.txt

# JSON output (for agents)
python scripts/dlsci.py "DOI" --json
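
For agents that call the tool from Python, the commands above can be wrapped in a small helper. This is a minimal sketch, not part of dlsci: `build_command` and `run_dlsci` are hypothetical names, it assumes `--json` can be combined with `--out` and `--no-scihub`, and it assumes the script prints a single JSON document to stdout. The JSON schema is not documented here, so the sketch returns the parsed object without reading any particular field.

```python
import json
import subprocess
import sys


def build_command(doi, out_dir=None, use_scihub=True, script="scripts/dlsci.py"):
    """Build the argv list for one dlsci run, using only the flags shown above."""
    cmd = [sys.executable, script, doi, "--json"]
    if out_dir is not None:
        cmd += ["--out", out_dir]
    if not use_scihub:
        cmd.append("--no-scihub")
    return cmd


def run_dlsci(doi, **kwargs):
    """Run dlsci and parse its stdout as JSON (assumption: stdout is one JSON document)."""
    proc = subprocess.run(build_command(doi, **kwargs), capture_output=True, text=True)
    return json.loads(proc.stdout)
```

Passing the arguments as a list avoids shell quoting problems with DOIs that contain characters such as parentheses or semicolons.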

Environment Variables

Variable                  Description
UNPAYWALL_EMAIL           Email for Unpaywall API (higher OA hit rate)
SEMANTIC_SCHOLAR_API_KEY  Semantic Scholar API key
NCBI_API_KEY              NCBI/PMC API key
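
As an illustration of how these variables typically reach the upstream APIs, the sketch below builds request descriptions without sending them. It is not taken from dlsci's source: the function names are hypothetical, and how dlsci itself behaves when a variable is unset is not documented here. The endpoints are the public Unpaywall v2, Semantic Scholar Graph, and NCBI E-utilities APIs; the PMC search term is an assumption about how a DOI lookup could be phrased.

```python
import os


def unpaywall_request(doi, env=None):
    """Unpaywall expects an email query parameter; skip the source if none is set."""
    env = os.environ if env is None else env
    email = env.get("UNPAYWALL_EMAIL")
    if not email:
        return None  # assumption: this sketch skips Unpaywall instead of using a default
    return {"url": f"https://api.unpaywall.org/v2/{doi}",
            "params": {"email": email}, "headers": {}}


def semantic_scholar_request(doi, env=None):
    """The API key is optional and, when present, is sent in the x-api-key header."""
    env = os.environ if env is None else env
    headers = {}
    key = env.get("SEMANTIC_SCHOLAR_API_KEY")
    if key:
        headers["x-api-key"] = key
    return {"url": f"https://api.semanticscholar.org/graph/v1/paper/DOI:{doi}",
            "params": {"fields": "openAccessPdf"}, "headers": headers}


def ncbi_request(doi, env=None):
    """E-utilities accept an optional api_key query parameter."""
    env = os.environ if env is None else env
    params = {"db": "pmc", "term": f"{doi}[DOI]", "retmode": "json"}
    key = env.get("NCBI_API_KEY")
    if key:
        params["api_key"] = key
    return {"url": "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi",
            "params": params, "headers": {}}
```

Each returned dictionary can be passed to `requests.get(r["url"], params=r["params"], headers=r["headers"])`.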

Sources

  1. Unpaywall - Legal OA via DOI lookup
  2. Semantic Scholar - openAccessPdf field
  3. arXiv - Preprint server
  4. PMC - PubMed Central
  5. bioRxiv/medRxiv - Preprints
  6. Publisher Direct - IOP, MDPI, Frontiers, PLOS
  7. Sci-Hub - Fallback (auto-enabled)
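
The priority order above amounts to a first-success fallback chain. The following is a minimal sketch of that pattern, assuming each source is a function that takes a DOI and returns a PDF URL or `None`. The name `resolve_pdf`, the `"scihub"` label, and the keys of the result dictionary are hypothetical and do not describe dlsci's actual code or JSON output.

```python
def resolve_pdf(doi, resolvers, use_scihub=True):
    """Try each (name, resolver) pair in order and stop at the first PDF URL.

    resolvers is an ordered list of (name, callable) pairs; each callable
    takes a DOI string and returns a URL string or None.
    """
    attempts = []
    for name, resolver in resolvers:
        if name == "scihub" and not use_scihub:
            continue  # mirrors the --no-scihub flag
        try:
            url = resolver(doi)
        except Exception as exc:  # one failing source should not stop the chain
            attempts.append({"source": name, "error": str(exc)})
            continue
        if url:
            return {"doi": doi, "success": True, "source": name,
                    "url": url, "attempts": attempts}
        attempts.append({"source": name, "error": "no PDF found"})
    return {"doi": doi, "success": False, "source": None,
            "url": None, "attempts": attempts}
```

Recording the failed attempts alongside the result makes it easier to see why a DOI ended up at a later source, or at none.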

License

MIT
