Warhead Hunter Python: Batch Target Mining and Programmatic Job Submission for Warhead Hunter
A lightweight Python companion repo for users who want to create Warhead Hunter jobs in batches, reproduce example submissions, and mine additional target structures from UniProt and RCSB.
Submit more targets. Mine more structures. Reproduce more results.
From a simple CSV of targets to reproducible Warhead Hunter batch submissions.
Warhead Hunter Python is a companion utility repository for users who want to interact with Warhead Hunter from Python instead of manually submitting one target at a time through the web interface.
It was built from the original working scripts used to generate a small batch of additional Warhead Hunter results, then cleaned into a more reusable and user-friendly command-line workflow.
Use this repo when you want to:
- submit several Warhead Hunter jobs from a target CSV,
- build API payloads from FASTA files,
- monitor submitted jobs,
- download completed job bundles,
- mine UniProt sequences and RCSB PDB/ligand metadata,
- keep reproducible examples of previous submissions, and
- give other researchers a clean starting point for their own batch workflows.
The main Warhead Hunter web platform is designed for structure-aware ligand exposure analysis and exit-vector discovery.
This Python repo is designed for the practical question that comes next:
How do I create several more Warhead Hunter jobs without doing everything manually?
Instead of copying and pasting target information into a browser repeatedly, you can define targets in a CSV file, point each target to a FASTA file, and submit the full batch from the command line.
- Overview
- Why this repo?
- Core capabilities
- Companion tool ecosystem
- Repository layout
- Quick start
- Endpoint fallback
- Batch submission
- Target mining
- Examples
- Configuration reference
- Data and output policy
- Scientific interpretation
- Citation
- Contact
| Capability | Purpose |
|---|---|
| Batch job submission | Submit multiple Warhead Hunter jobs from a CSV file. |
| Single-target submission | Submit one target directly from command-line arguments. |
| Payload generation | Build JSON API payloads from target names, search queries, and FASTA files. |
| Job monitoring | Refresh job statuses from a saved submission manifest. |
| Bundle download | Download a completed Warhead Hunter job bundle by job ID. |
| UniProt sequence mining | Fetch target FASTA sequences by UniProt accession. |
| RCSB structure mining | Find PDB entries associated with target UniProt accessions. |
| Ligand metadata extraction | Collect non-polymer ligand IDs, names, formula weights, and structure metadata. |
| Reproducible examples | Keep previous AD-related submissions as examples for new users. |
Warhead Hunter Python is part of a broader structure-guided molecular design ecosystem.
Together, these tools support a connected workflow:
Protein / target selection
↓
Batch target mining and FASTA preparation
↓
Warhead Hunter job submission
↓
Solvent exposure and exit-vector analysis
↓
Warhead, linker, or PROTAC design hypothesis
↓
E3 analysis, ternary modeling, and downstream evaluation
warheadhunter-python/
├── warheadhunter_python/
│   ├── __init__.py
│   ├── batch.py                   # CSV loading and batch submission helpers
│   ├── client.py                  # Warhead Hunter API client
│   ├── cli.py                     # Command-line interface
│   ├── fasta.py                   # FASTA parsing helpers
│   └── target_miner.py            # UniProt / RCSB target mining utilities
├── examples/
│   ├── ad_batch_targets.csv       # Example batch-submission CSV
│   ├── ad_mining_targets.csv      # Example target-mining CSV
│   ├── fastas/                    # Example FASTA files
│   ├── mined_targets/             # Example mined FASTA output
│   ├── previous_submissions/      # Previous working payload/response examples
│   └── oga_payload.json           # Single payload example
├── scripts/
│   ├── warheadhunter-python       # Convenience wrapper
│   ├── legacy_ad_warhead_target_miner.py
│   └── legacy_submit_ad_warhead_targets.py
├── requirements.txt
├── pyproject.toml
├── .gitignore
├── LICENSE
└── README.md
Clone the repository:
git clone https://github.com/Joey305/warheadhunter-python.git
cd warheadhunter-python
Create and activate a Python environment:
python -m venv .venv
source .venv/bin/activate
Install the package locally:
pip install -e .
Check that the command-line tool is available:
warheadhunter --help
Check API health:
warheadhunter health
By default, the CLI tries both active endpoints in order:
1. https://warheadhunter.com
2. http://cartman.rove-vernier.ts.net
This means the public deployment is attempted first. If the public API is not yet available, the CLI automatically falls back to the Cartman/Tailscale endpoint, so development and batch processing can continue while the public deployment is being finalized.
You can also provide your own fallback order by repeating --base-url before the command:
warheadhunter \
--base-url https://warheadhunter.com \
--base-url http://cartman.rove-vernier.ts.net \
health
warheadhunter-python is designed for the transition period between the private/dev API and the public Warhead Hunter deployment.
By default, every API command tries:
1. https://warheadhunter.com
2. http://cartman.rove-vernier.ts.net
If the first endpoint is unavailable, returns a server error, or does not expose the expected API route, the client automatically retries the same request against the second endpoint. This preserves the original working Cartman workflow while allowing the public deployment to become the primary endpoint once it is fully live.
The submission manifest includes an api_base_url column so each job records which endpoint actually accepted the request.
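The fallback behavior described above can be sketched roughly as follows. This is an illustration, not the code in client.py: the function names, the injected `send` transport, the `/health` path, and the choice of 404 and 5xx as "try the next endpoint" statuses are all assumptions made for the example.

```python
from typing import Callable, Iterable, Tuple

DEFAULT_BASE_URLS = (
    "https://warheadhunter.com",
    "http://cartman.rove-vernier.ts.net",
)


def should_fall_back(status: int) -> bool:
    # Assumption: 404 stands in for "route not exposed", 5xx for server errors.
    return status == 404 or status >= 500


def request_with_fallback(
    path: str,
    send: Callable[[str], Tuple[int, dict]],
    base_urls: Iterable[str] = DEFAULT_BASE_URLS,
) -> Tuple[str, int, dict]:
    """Try each base URL in order and return (base_url, status, body).

    `send` is a hypothetical transport: it performs the request for a full
    URL, returns (status_code, parsed_body), and raises OSError when the
    endpoint cannot be reached.
    """
    failures = []
    for base_url in base_urls:
        url = base_url.rstrip("/") + path
        try:
            status, body = send(url)
        except OSError as exc:  # connection refused, DNS failure, timeout
            failures.append(f"{base_url}: {exc}")
            continue
        if should_fall_back(status):
            failures.append(f"{base_url}: HTTP {status}")
            continue
        # Any other status is handed back together with the endpoint that
        # produced it, which is what an api_base_url column would record.
        return base_url, status, body
    raise RuntimeError("All endpoints failed: " + "; ".join(failures))


def fake_send(url: str) -> Tuple[int, dict]:
    # Stand-in transport: the public endpoint is unreachable, the second answers.
    if url.startswith("https://warheadhunter.com"):
        raise ConnectionError("public endpoint not reachable")
    return 200, {"status": "ok"}


accepted_by, status, body = request_with_fallback("/health", fake_send)
print(accepted_by, status, body)
```

Injecting the transport keeps the ordering logic testable without network access; a real client would pass a function that wraps its HTTP library of choice.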
To override the default order, repeat --base-url before the subcommand:
warheadhunter \
--base-url https://warheadhunter.com \
--base-url http://cartman.rove-vernier.ts.net \
batch-submit \
--targets examples/ad_batch_targets.csv \
--output-dir outputs/ad_submissions
The easiest workflow is to define targets in a CSV file.
Example:
priority,target_name,uniprot,search_query,fasta_file,why
1,OGA,O60502,"O-GlcNAcase 9BA9 6PM9 5UN9 5M7T",fastas/OGA_O60502.fasta,"Tau-relevant enzyme."
2,DYRK1A,Q13627,"DYRK1A 3ANR 4MQ2 6EIF 7AKL",fastas/DYRK1A_Q13627.fasta,"Tau phosphorylation kinase."
Run the included AD example as a dry run first:
warheadhunter batch-submit \
--targets examples/ad_batch_targets.csv \
--output-dir outputs/ad_submissions \
--dry-run
This writes payload JSON files and a submission manifest without contacting the API.
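To make the dry-run idea concrete, here is a minimal sketch of writing payload files and a manifest with no network calls. The payload field names mirror the CLI flags and CSV columns but are assumptions, not the documented API schema. The manifest columns, the `dry_run` and `build_payload` helpers, and the toy sequence are also hypothetical. File names follow the `<priority>_<target_name>_payload.json` pattern seen in the repository's example output.

```python
import csv
import json
import tempfile
from pathlib import Path


def build_payload(target_name: str, search_query: str, fasta_text: str) -> dict:
    # Assumed field names; the real API schema may differ.
    return {
        "target_name": target_name,
        "search_query": search_query,
        "fasta_seq": fasta_text.strip(),
    }


def dry_run(targets: list, output_dir: Path) -> Path:
    """Write one payload file per target plus a manifest; no network access."""
    output_dir.mkdir(parents=True, exist_ok=True)
    manifest_rows = []
    for target in targets:
        payload = build_payload(
            target["target_name"], target["search_query"], target["fasta_seq"]
        )
        name = f"{target['priority']}_{target['target_name']}_payload.json"
        (output_dir / name).write_text(json.dumps(payload, indent=2), encoding="utf-8")
        # job_id stays empty because nothing was submitted.
        manifest_rows.append(
            {"target_name": target["target_name"], "payload_file": name, "job_id": ""}
        )
    manifest_path = output_dir / "submitted_warhead_jobs.csv"
    with manifest_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=["target_name", "payload_file", "job_id"]
        )
        writer.writeheader()
        writer.writerows(manifest_rows)
    return manifest_path


example_targets = [
    {
        "priority": "1",
        "target_name": "OGA",
        "search_query": "O-GlcNAcase 9BA9 6PM9 5UN9 5M7T",
        "fasta_seq": ">toy_record\nACDEFGHIK",  # placeholder, not the real sequence
    }
]

with tempfile.TemporaryDirectory() as tmp:
    manifest = dry_run(example_targets, Path(tmp) / "ad_submissions")
    written = sorted(p.name for p in manifest.parent.iterdir())

print(written)
```

Inspecting the written payloads before a real submission is a cheap way to catch a wrong FASTA path or a malformed search query.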
When ready, submit the batch:
warheadhunter batch-submit \
--targets examples/ad_batch_targets.csv \
--output-dir outputs/ad_submissions \
--delay 3
The output folder will contain:
outputs/ad_submissions/
├── 1_OGA_payload.json
├── 1_OGA_response.json
├── 2_DYRK1A_payload.json
├── 2_DYRK1A_response.json
└── submitted_warhead_jobs.csv
The manifest file records job IDs, status URLs, results URLs, files URLs, bundle URLs, and the local payload/response files.
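Because the manifest is a plain CSV, it is easy to post-process. As a small sketch, the snippet below tallies how many jobs each endpoint accepted using the api_base_url column. The other column names and the job IDs in the sample data are placeholders invented for the example.

```python
import csv
import io
from collections import Counter


def jobs_per_endpoint(manifest_handle) -> Counter:
    """Count manifest rows by the endpoint recorded in api_base_url."""
    reader = csv.DictReader(manifest_handle)
    return Counter(row["api_base_url"] for row in reader)


# Placeholder manifest content; a real run would open the manifest file instead.
sample_manifest = io.StringIO(
    "target_name,job_id,api_base_url\n"
    "OGA,job-a,http://cartman.rove-vernier.ts.net\n"
    "DYRK1A,job-b,https://warheadhunter.com\n"
    "GSK3B,job-c,https://warheadhunter.com\n"
)

counts = jobs_per_endpoint(sample_manifest)
for endpoint, count in counts.most_common():
    print(f"{endpoint}: {count} job(s)")
```

For a real run, replace the `io.StringIO` object with `open("outputs/ad_submissions/submitted_warhead_jobs.csv", newline="")`.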
Submit one target directly:
warheadhunter submit \
--target-name OGA \
--search-query "O-GlcNAcase 9BA9 6PM9 5UN9 5M7T" \
--fasta-file examples/fastas/OGA_O60502.fasta \
--output-dir outputs/oga_submission
Generate a payload without submitting:
warheadhunter payload \
--target-name OGA \
--search-query "O-GlcNAcase 9BA9 6PM9 5UN9 5M7T" \
--fasta-file examples/fastas/OGA_O60502.fasta \
--output outputs/oga_payload.json
Check one job:
warheadhunter status 2d9b72a8
Refresh all jobs in a manifest:
warheadhunter monitor \
--manifest outputs/ad_submissions/submitted_warhead_jobs.csv \
--output outputs/ad_submissions/refreshed_status.csv
Download a completed bundle:
warheadhunter download-bundle \
2d9b72a8 \
--output outputs/bundles/2d9b72a8.zip
This repo also includes a target miner that can:
- fetch FASTA sequences from UniProt,
- search RCSB for structures linked to each UniProt accession,
- extract basic PDB metadata,
- collect non-polymer ligand metadata, and
- optionally download mmCIF files.
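The first two mining steps can be sketched against the public UniProt REST and RCSB Search API endpoints. This is not the code in target_miner.py: the helper names are hypothetical, and the assumption that --rows-per-target maps onto the RCSB pagination `rows` value is mine. The sketch only builds URLs and query bodies and parses FASTA text, so it runs offline.

```python
import json
from typing import Tuple

UNIPROT_FASTA_URL = "https://rest.uniprot.org/uniprotkb/{accession}.fasta"
RCSB_SEARCH_URL = "https://search.rcsb.org/rcsbsearch/v2/query"
RCSB_MMCIF_URL = "https://files.rcsb.org/download/{pdb_id}.cif"

_REF_SEQ = "rcsb_polymer_entity_container_identifiers.reference_sequence_identifiers"


def uniprot_fasta_url(accession: str) -> str:
    return UNIPROT_FASTA_URL.format(accession=accession)


def rcsb_query_for_uniprot(accession: str, rows: int = 250) -> dict:
    """Build a Search API body that returns PDB entries mapped to a UniProt accession."""

    def terminal(attribute: str, value: str) -> dict:
        return {
            "type": "terminal",
            "service": "text",
            "parameters": {
                "attribute": attribute,
                "operator": "exact_match",
                "value": value,
            },
        }

    return {
        "query": {
            "type": "group",
            "logical_operator": "and",
            "nodes": [
                terminal(f"{_REF_SEQ}.database_accession", accession),
                terminal(f"{_REF_SEQ}.database_name", "UniProt"),
            ],
        },
        "return_type": "entry",
        "request_options": {"paginate": {"start": 0, "rows": rows}},
    }


def parse_fasta(text: str) -> Tuple[str, str]:
    """Split a single-record FASTA string into (header, sequence)."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("FASTA text must start with a '>' header line")
    return lines[0][1:], "".join(lines[1:])


fasta_url = uniprot_fasta_url("P49841")
query = rcsb_query_for_uniprot("P49841", rows=250)
header, sequence = parse_fasta(">toy_record\nACDE\nFGHI\n")  # toy data, not GSK3B

print(fasta_url)
print(json.dumps(query, indent=2))
```

A real miner would POST the query body to `RCSB_SEARCH_URL`, then look up ligand metadata for each returned entry. `RCSB_MMCIF_URL` shows where the optional mmCIF download would point.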
Run the included AD target mining example:
warheadhunter mine-targets \
--targets examples/ad_mining_targets.csv \
--output-dir outputs/ad_warhead_targets \
--rows-per-target 250
Optionally download mmCIF files:
warheadhunter mine-targets \
--targets examples/ad_mining_targets.csv \
--output-dir outputs/ad_warhead_targets \
--rows-per-target 250 \
--download-structures
Expected outputs:
outputs/ad_warhead_targets/
├── target_sequences.fasta
├── target_pdbs_and_ligands.csv
└── pdb_cif_files/                 # only if --download-structures is used
This repository keeps the original working AD-related examples so users can see the exact shape of a successful workflow.
Included example targets:
| Priority | Target | UniProt | Why it was included |
|---|---|---|---|
| 1 | OGA | O60502 | Tau O-GlcNAc / tau pathology target with inhibitor-rich structural biology. |
| 2 | DYRK1A | Q13627 | Tau phosphorylation, A-beta biology, and Down-syndrome/AD overlap. |
| 3 | GSK3B | P49841 | Classic tau kinase target with many ligand-bound structures. |
| 4 | MAPK14 | Q16539 | Neuroinflammation/tau-related kinase axis. |
| 5 | RIPK1 | Q13546 | Neuroinflammation and necroptosis target with inhibitor-bound structures. |
The folder examples/previous_submissions/ contains the previous payloads, API responses, and manifest from the working batch run.
Columns in the batch-submission CSV:
| Column | Required? | Description |
|---|---|---|
| target_name | Yes | Human-readable target/job name. |
| search_query | Yes | Search string sent to Warhead Hunter. Include target name and useful PDB IDs when known. |
| fasta_file | Either this or fasta_seq | Path to a FASTA file. Relative paths are resolved from the CSV location. |
| fasta_seq | Either this or fasta_file | Inline FASTA text. Useful for generated CSVs. |
| priority | No | Optional ordering label used in output filenames. |
| uniprot | No | Optional accession for tracking. |
| why | No | Human-readable rationale. |
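The column rules above translate into a short validation pass. The sketch below is an assumption about how such a loader could look, not the implementation in batch.py. `load_targets` is a hypothetical name, and the error messages are illustrative.

```python
import csv
import io
from pathlib import Path

REQUIRED = ("target_name", "search_query")


def load_targets(handle, csv_dir: Path) -> list:
    """Validate batch rows and resolve relative fasta_file paths against csv_dir."""
    targets = []
    # Line numbers assume one record per physical line; the header is line 1.
    for line_no, row in enumerate(csv.DictReader(handle), start=2):
        clean = {key: (value or "").strip() for key, value in row.items() if key}
        missing = [name for name in REQUIRED if not clean.get(name)]
        if missing:
            raise ValueError(f"line {line_no}: missing {', '.join(missing)}")
        if not clean.get("fasta_file") and not clean.get("fasta_seq"):
            raise ValueError(f"line {line_no}: provide fasta_file or fasta_seq")
        if clean.get("fasta_file"):
            fasta_path = Path(clean["fasta_file"])
            if not fasta_path.is_absolute():
                # Relative paths are anchored at the CSV's folder, not the shell's cwd.
                fasta_path = csv_dir / fasta_path
            clean["fasta_file"] = fasta_path.as_posix()
        targets.append(clean)
    return targets


sample_csv = io.StringIO(
    "priority,target_name,uniprot,search_query,fasta_file,why\n"
    '1,OGA,O60502,"O-GlcNAcase 9BA9 6PM9 5UN9 5M7T",'
    'fastas/OGA_O60502.fasta,"Tau-relevant enzyme."\n'
)

targets = load_targets(sample_csv, csv_dir=Path("examples"))
print(targets[0]["fasta_file"])
```

Failing fast with a line number makes it easier to fix a long target list before any job is submitted.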
Columns in the target-mining CSV:
| Column | Required? | Description |
|---|---|---|
| target_name | Yes | Short target label. |
| uniprot | Yes | UniProt accession used for FASTA and RCSB lookup. |
| gene | No | Gene symbol. |
| protein | No | Protein name. |
| notes | No | Rationale or project notes. |
This repository is intended to stay lightweight and GitHub-friendly.
Recommended to commit:
- source code,
- CLI helpers,
- small example FASTA files,
- small example payloads/responses,
- documentation,
- CSV templates, and
- reproducible configuration files.
Recommended to exclude:
- full Warhead Hunter job output bundles,
- downloaded PDB/mmCIF folders,
- large generated molecular files,
- private datasets,
- credentials, tokens, and secrets,
- temporary logs and local outputs.
Before committing, check repository size:
du -sh .
du -h --max-depth=1 . | sort -hr
Check for large files:
find . -type f -size +50M \
-not -path "./.git/*" \
-not -path "./outputs/*" \
-exec ls -lh {} \;
Warhead Hunter Python helps automate job creation and structure/ligand metadata collection. It does not automatically decide whether a target, ligand, warhead, or exit vector is experimentally valid.
Use the generated outputs as decision-support material alongside:
- binding pose quality,
- structure resolution and model reliability,
- ligand identity and chemical plausibility,
- known SAR,
- local protein-ligand interaction networks,
- solvent exposure,
- synthetic feasibility,
- warhead reactivity,
- linker geometry, and
- biological context.
The recommended mindset is:
Batch automation creates candidates.
Warhead Hunter helps interpret candidates.
Medicinal chemistry expertise decides what to test.
Install the package in editable mode:
pip install -e .
Or run the module directly:
python -m warheadhunter_python.cli --help
Relative fasta_file paths are resolved relative to the CSV file, not necessarily your current shell directory. For example, this works from examples/ad_batch_targets.csv:
fasta_file
fastas/OGA_O60502.fasta
Make sure at least one API endpoint is reachable:
warheadhunter health
The default fallback order is:
https://warheadhunter.com
http://cartman.rove-vernier.ts.net
For a local/private deployment, pass one or more URLs before the command:
warheadhunter \
--base-url http://127.0.0.1:5000 \
--base-url http://cartman.rove-vernier.ts.net \
health
If you use Warhead Hunter Python, please cite the repository and the Warhead Hunter web platform:
Schulz, J.-M. Warhead Hunter Python: Batch Target Mining and Programmatic Job Submission for Warhead Hunter. GitHub repository: https://github.com/Joey305/warheadhunter-python.
Schulz, J.-M. Warhead Hunter: Structure-Aware Solvent Exposure Analysis for Warhead, Linker, and PROTAC Attachment Vector Discovery. Web platform: https://warheadhunter.com.
A manuscript for Warhead Hunter is currently in preparation. Once a preprint or publication is available, update this section with the formal citation, DOI, and manuscript link.
For questions, bug reports, workflow support, or collaboration inquiries:
Python utilities for mining targets, preparing FASTA/payload files, and batch-submitting jobs to Warhead Hunter.
Use Warhead Hunter Python when you want to turn this:
A list of targets + FASTA sequences + search queries
into this:
A reproducible folder of Warhead Hunter payloads, API responses, job IDs, and downloadable result bundles.