🐍 Warhead Hunter Python

Warhead Hunter Python: Batch Target Mining and Programmatic Job Submission for Warhead Hunter

A lightweight Python companion repo for users who want to create Warhead Hunter jobs in batches, reproduce example submissions, and mine additional target structures from UniProt and RCSB.

Submit more targets. Mine more structures. Reproduce more results.

From a simple CSV of targets to reproducible Warhead Hunter batch submissions.

🚀 Overview

Warhead Hunter Python is a companion utility repository for users who want to interact with Warhead Hunter from Python instead of manually submitting one target at a time through the web interface.

It was built from the original working scripts used to generate a small batch of additional Warhead Hunter results, then cleaned into a more reusable and user-friendly command-line workflow.

Use this repo when you want to:

submit several Warhead Hunter jobs from a target CSV,
build API payloads from FASTA files,
monitor submitted jobs,
download completed job bundles,
mine UniProt sequences and RCSB PDB/ligand metadata,
keep reproducible examples of previous submissions, and
give other researchers a clean starting point for their own batch workflows.

🎯 Why This Repo?

The main Warhead Hunter web platform is designed for structure-aware ligand exposure analysis and exit-vector discovery.

This Python repo is designed for the practical question that comes next:

How do I create several more Warhead Hunter jobs without doing everything manually?

Instead of copying and pasting target information into a browser repeatedly, you can define targets in a CSV file, point each target to a FASTA file, and submit the full batch from the command line.

🧭 Repository Navigation

Overview
Why this repo?
Core capabilities
Companion tool ecosystem
Repository layout
Quick start
Endpoint fallback
Batch submission
Target mining
Examples
Configuration reference
Data and output policy
Scientific interpretation
Citation
Contact

✨ Core Capabilities

Capability	Purpose
Batch job submission	Submit multiple Warhead Hunter jobs from a CSV file.
Single-target submission	Submit one target directly from command-line arguments.
Payload generation	Build JSON API payloads from target names, search queries, and FASTA files.
Job monitoring	Refresh job statuses from a saved submission manifest.
Bundle download	Download a completed Warhead Hunter job bundle by job ID.
UniProt sequence mining	Fetch target FASTA sequences by UniProt accession.
RCSB structure mining	Find PDB entries associated with target UniProt accessions.
Ligand metadata extraction	Collect non-polymer ligand IDs, names, formula weights, and structure metadata.
Reproducible examples	Keep previous AD-related submissions as examples for new users.

🧬 Companion Tool Ecosystem

Warhead Hunter Python is part of a broader structure-guided molecular design ecosystem.

The tools in this ecosystem support a connected workflow:

Protein / target selection
        ↓
Batch target mining and FASTA preparation
        ↓
Warhead Hunter job submission
        ↓
Solvent exposure and exit-vector analysis
        ↓
Warhead, linker, or PROTAC design hypothesis
        ↓
E3 analysis, ternary modeling, and downstream evaluation

📦 Repository Layout

warheadhunter-python/
├── warheadhunter_python/
│   ├── __init__.py
│   ├── batch.py              # CSV loading and batch submission helpers
│   ├── client.py             # Warhead Hunter API client
│   ├── cli.py                # Command-line interface
│   ├── fasta.py              # FASTA parsing helpers
│   └── target_miner.py       # UniProt / RCSB target mining utilities
├── examples/
│   ├── ad_batch_targets.csv  # Example batch-submission CSV
│   ├── ad_mining_targets.csv # Example target-mining CSV
│   ├── fastas/               # Example FASTA files
│   ├── mined_targets/        # Example mined FASTA output
│   ├── previous_submissions/ # Previous working payload/response examples
│   └── oga_payload.json      # Single payload example
├── scripts/
│   ├── warheadhunter-python  # Convenience wrapper
│   ├── legacy_ad_warhead_target_miner.py
│   └── legacy_submit_ad_warhead_targets.py
├── requirements.txt
├── pyproject.toml
├── .gitignore
├── LICENSE
└── README.md

⚡ Quick Start

Clone the repository:

git clone https://github.com/Joey305/warheadhunter-python.git
cd warheadhunter-python

Create and activate a Python environment:

python -m venv .venv
source .venv/bin/activate

Install the package locally:

pip install -e .

Check that the command-line tool is available:

warheadhunter --help

Check API health:

warheadhunter health

By default, the CLI tries both active endpoints in order:

1. https://warheadhunter.com
2. http://cartman.rove-vernier.ts.net

The public deployment is attempted first. If the public API is not yet available, the CLI automatically falls back to the Cartman/Tailscale endpoint, so development and batch processing can continue while the public deployment is being finalized.

You can also provide your own fallback order by repeating --base-url before the command:

warheadhunter \
  --base-url https://warheadhunter.com \
  --base-url http://cartman.rove-vernier.ts.net \
  health

🌐 Endpoint Fallback

warheadhunter-python is designed for the transition period between the private/dev API and the public Warhead Hunter deployment.

By default, every API command tries:

1. https://warheadhunter.com
2. http://cartman.rove-vernier.ts.net

If the first endpoint is unavailable, returns a server error, or does not expose the expected API route, the client automatically retries the same request against the second endpoint. This preserves the original working Cartman workflow while allowing the public deployment to become the primary endpoint once it is fully live.

The submission manifest includes an api_base_url column so each job records which endpoint actually accepted the request.
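The fallback rule described above can be sketched in a few lines of Python. This is only an illustration of the behavior, not the code in client.py: the function name `request_with_fallback`, the injectable `transport` callable, and the `/api/health` route are all assumptions made for the example, and treating HTTP 404 as "route not exposed" is one possible reading of that rule.

```python
from typing import Callable, Iterable, Tuple

DEFAULT_BASE_URLS = (
    "https://warheadhunter.com",
    "http://cartman.rove-vernier.ts.net",
)

# A transport takes a full URL and returns (status_code, body).
# It raises OSError when the host cannot be reached.
Transport = Callable[[str], Tuple[int, str]]


def request_with_fallback(
    path: str,
    transport: Transport,
    base_urls: Iterable[str] = DEFAULT_BASE_URLS,
) -> Tuple[str, str]:
    """Return (base_url, body) from the first endpoint that answers usefully."""
    errors = []
    for base_url in base_urls:
        url = base_url.rstrip("/") + path
        try:
            status, body = transport(url)
        except OSError as exc:  # endpoint unavailable
            errors.append(f"{base_url}: {exc}")
            continue
        if status >= 500 or status == 404:  # server error or missing route
            errors.append(f"{base_url}: HTTP {status}")
            continue
        return base_url, body
    raise ConnectionError("all endpoints failed: " + "; ".join(errors))


def fake_transport(url: str) -> Tuple[int, str]:
    """Stand-in transport: the public endpoint is down, the fallback answers."""
    if url.startswith("https://warheadhunter.com"):
        raise OSError("connection refused")
    return 200, '{"status": "ok"}'


# "/api/health" is a hypothetical route used only for this demonstration.
used, body = request_with_fallback("/api/health", fake_transport)
print(used)
```

Returning the base URL alongside the body is what would let a caller fill in a column such as api_base_url for each job.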

To override the default order, repeat --base-url before the subcommand:

warheadhunter \
  --base-url https://warheadhunter.com \
  --base-url http://cartman.rove-vernier.ts.net \
  batch-submit \
  --targets examples/ad_batch_targets.csv \
  --output-dir outputs/ad_submissions

🧪 Batch Submission

The easiest workflow is to define targets in a CSV file.

Example:

priority,target_name,uniprot,search_query,fasta_file,why
1,OGA,O60502,"O-GlcNAcase 9BA9 6PM9 5UN9 5M7T",fastas/OGA_O60502.fasta,"Tau-relevant enzyme."
2,DYRK1A,Q13627,"DYRK1A 3ANR 4MQ2 6EIF 7AKL",fastas/DYRK1A_Q13627.fasta,"Tau phosphorylation kinase."

Run the included AD example as a dry run first:

warheadhunter batch-submit \
  --targets examples/ad_batch_targets.csv \
  --output-dir outputs/ad_submissions \
  --dry-run

This writes payload JSON files and a submission manifest without contacting the API.

When ready, submit the batch:

warheadhunter batch-submit \
  --targets examples/ad_batch_targets.csv \
  --output-dir outputs/ad_submissions \
  --delay 3

The output folder will contain:

outputs/ad_submissions/
├── 1_OGA_payload.json
├── 1_OGA_response.json
├── 2_DYRK1A_payload.json
├── 2_DYRK1A_response.json
└── submitted_warhead_jobs.csv

The manifest file records job IDs, status URLs, results URLs, files URLs, bundle URLs, and the local payload/response files.

🎯 Single-Target Submission

Submit one target directly:

warheadhunter submit \
  --target-name OGA \
  --search-query "O-GlcNAcase 9BA9 6PM9 5UN9 5M7T" \
  --fasta-file examples/fastas/OGA_O60502.fasta \
  --output-dir outputs/oga_submission

Generate a payload without submitting:

warheadhunter payload \
  --target-name OGA \
  --search-query "O-GlcNAcase 9BA9 6PM9 5UN9 5M7T" \
  --fasta-file examples/fastas/OGA_O60502.fasta \
  --output outputs/oga_payload.json
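For readers who want to see what payload generation involves, here is a minimal sketch that parses FASTA text and assembles a JSON-ready dictionary. The key names (`target_name`, `search_query`, `fasta_seq`) simply mirror the CSV column names and are an assumption; the real payload schema is whatever the `warheadhunter payload` command writes. The sequence below is a placeholder, not a real protein.

```python
import json
from typing import Dict, List, Tuple


def parse_fasta(text: str) -> List[Tuple[str, str]]:
    """Parse FASTA text into (header, sequence) pairs."""
    records: List[Tuple[str, str]] = []
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def build_payload(target_name: str, search_query: str, fasta_text: str) -> Dict[str, str]:
    """Assemble a payload dict. Key names are assumptions, not the real schema."""
    if not parse_fasta(fasta_text):
        raise ValueError("no FASTA records found")
    return {
        "target_name": target_name,
        "search_query": search_query,
        "fasta_seq": fasta_text.strip(),
    }


# Placeholder FASTA text; not a real sequence.
example_fasta = ">example placeholder\nACDEFGHI\nKLMNPQRS\n"
payload = build_payload("OGA", "O-GlcNAcase 9BA9 6PM9 5UN9 5M7T", example_fasta)
print(json.dumps(payload, indent=2))
```

Compare the printed structure against a file produced by the CLI before relying on any of these field names.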

📡 Monitoring Jobs

Check one job:

warheadhunter status 2d9b72a8

Refresh all jobs in a manifest:

warheadhunter monitor \
  --manifest outputs/ad_submissions/submitted_warhead_jobs.csv \
  --output outputs/ad_submissions/refreshed_status.csv
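Conceptually, refreshing a manifest means reading each row, asking the API for the current status of its job, and writing the rows back out. The sketch below shows that loop with an injectable status function so it runs without network access. The column names `job_id` and `status`, and the status values, are assumptions for illustration; only `api_base_url` is named in this README.

```python
import csv
import io
from typing import Callable, Dict, List


def refresh_statuses(
    manifest_text: str,
    fetch_status: Callable[[str, str], str],
) -> List[Dict[str, str]]:
    """Return manifest rows with an updated 'status' field."""
    rows = list(csv.DictReader(io.StringIO(manifest_text)))
    for row in rows:
        job_id = (row.get("job_id") or "").strip()
        if not job_id:  # e.g. a dry-run row that was never submitted
            row["status"] = "not_submitted"
            continue
        row["status"] = fetch_status(row.get("api_base_url") or "", job_id)
    return rows


def rows_to_csv(rows: List[Dict[str, str]]) -> str:
    """Serialize rows back to CSV text, using the first row's keys as the header."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical manifest content; the second row has no job ID.
manifest = (
    "target_name,job_id,api_base_url\n"
    "OGA,2d9b72a8,https://warheadhunter.com\n"
    "DYRK1A,,\n"
)

# Stand-in for a real status request against the recorded endpoint.
refreshed = refresh_statuses(manifest, lambda base_url, job_id: "completed")
print(rows_to_csv(refreshed))
```

Querying each job against its own recorded endpoint matters when a batch was split across the public and fallback servers.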

Download a completed bundle:

warheadhunter download-bundle \
  2d9b72a8 \
  --output outputs/bundles/2d9b72a8.zip

⛏️ Target Mining

This repo also includes a target miner that can:

fetch FASTA sequences from UniProt,
search RCSB for structures linked to each UniProt accession,
extract basic PDB metadata,
collect non-polymer ligand metadata, and
optionally download mmCIF files.

Run the included AD target mining example:

warheadhunter mine-targets \
  --targets examples/ad_mining_targets.csv \
  --output-dir outputs/ad_warhead_targets \
  --rows-per-target 250

Optionally download mmCIF files:

warheadhunter mine-targets \
  --targets examples/ad_mining_targets.csv \
  --output-dir outputs/ad_warhead_targets \
  --rows-per-target 250 \
  --download-structures

Expected outputs:

outputs/ad_warhead_targets/
├── target_sequences.fasta
├── target_pdbs_and_ligands.csv
└── pdb_cif_files/              # only if --download-structures is used
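The mining steps map onto public UniProt and RCSB services. The sketch below only builds the request URLs and the search query body, so it runs offline. It reflects the publicly documented UniProt REST and RCSB Search API conventions as I understand them; whether target_miner.py uses exactly these endpoints and attributes is an assumption, and the helper names are made up for the example.

```python
import json

RCSB_SEARCH_URL = "https://search.rcsb.org/rcsbsearch/v2/query"
_REF_SEQ = "rcsb_polymer_entity_container_identifiers.reference_sequence_identifiers"


def uniprot_fasta_url(accession: str) -> str:
    """URL that returns the FASTA record for one UniProt accession."""
    return f"https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def mmcif_url(pdb_id: str) -> str:
    """URL of the mmCIF file for one PDB entry."""
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.cif"


def rcsb_query_for_uniprot(accession: str, rows: int = 250) -> dict:
    """Search body that asks RCSB for entries mapped to a UniProt accession."""

    def terminal(attribute: str, value: str) -> dict:
        return {
            "type": "terminal",
            "service": "text",
            "parameters": {
                "attribute": attribute,
                "operator": "exact_match",
                "value": value,
            },
        }

    return {
        "query": {
            "type": "group",
            "logical_operator": "and",
            "nodes": [
                terminal(f"{_REF_SEQ}.database_accession", accession),
                terminal(f"{_REF_SEQ}.database_name", "UniProt"),
            ],
        },
        "return_type": "entry",
        # 'rows' plays the role of --rows-per-target in this sketch.
        "request_options": {"paginate": {"start": 0, "rows": rows}},
    }


query = rcsb_query_for_uniprot("O60502")
print(uniprot_fasta_url("O60502"))
print(json.dumps(query, indent=2))
```

The query body would be sent as JSON in a POST request to `RCSB_SEARCH_URL`; check the current RCSB documentation before depending on the attribute names.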

🧬 Included Examples

This repository keeps the original working AD-related examples so users can see the exact shape of a successful workflow.

Included example targets:

Priority	Target	UniProt	Why it was included
1	OGA	O60502	Tau O-GlcNAc / tau pathology target with inhibitor-rich structural biology.
2	DYRK1A	Q13627	Tau phosphorylation, A-beta biology, and Down-syndrome/AD overlap.
3	GSK3B	P49841	Classic tau kinase target with many ligand-bound structures.
4	MAPK14	Q16539	Neuroinflammation/tau-related kinase axis.
5	RIPK1	Q13546	Neuroinflammation and necroptosis target with inhibitor-bound structures.

The folder examples/previous_submissions/ contains the previous payloads, API responses, and manifest from the working batch run.

🧾 Configuration Reference

Batch submission CSV

Columns:

Column	Required?	Description
`target_name`	Yes	Human-readable target/job name.
`search_query`	Yes	Search string sent to Warhead Hunter. Include target name and useful PDB IDs when known.
`fasta_file`	Either this or `fasta_seq`	Path to a FASTA file. Relative paths are resolved from the CSV location.
`fasta_seq`	Either this or `fasta_file`	Inline FASTA text. Useful for generated CSVs.
`priority`	No	Optional ordering label used in output filenames.
`uniprot`	No	Optional accession for tracking.
`why`	No	Human-readable rationale.
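Before submitting a large batch, it can help to check the CSV against the rules in the table above. The following sketch does that with the standard library only. The function name `validate_batch_rows` is hypothetical, and the CLI may perform its own, different validation.

```python
import csv
import io
from typing import List


def validate_batch_rows(csv_text: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means the CSV passes."""
    problems: List[str] = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_no, row in enumerate(reader, start=1):

        def value(column: str) -> str:
            # Missing columns and empty cells are both treated as blank.
            return (row.get(column) or "").strip()

        for column in ("target_name", "search_query"):
            if not value(column):
                problems.append(f"row {row_no}: missing {column}")
        # One of the two FASTA sources must be present.
        if not (value("fasta_file") or value("fasta_seq")):
            problems.append(f"row {row_no}: need fasta_file or fasta_seq")
    return problems


good_csv = (
    "priority,target_name,uniprot,search_query,fasta_file,why\n"
    '1,OGA,O60502,"O-GlcNAcase 9BA9 6PM9 5UN9 5M7T",fastas/OGA_O60502.fasta,"Tau-relevant enzyme."\n'
)
bad_csv = "target_name,search_query,fasta_file\nDYRK1A,,\n"

print(validate_batch_rows(good_csv))
print(validate_batch_rows(bad_csv))
```

Row numbers count data rows only, so row 1 is the first line after the header.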

Target mining CSV

Columns:

Column	Required?	Description
`target_name`	Yes	Short target label.
`uniprot`	Yes	UniProt accession used for FASTA and RCSB lookup.
`gene`	No	Gene symbol.
`protein`	No	Protein name.
`notes`	No	Rationale or project notes.

🧹 Data and Output Policy

This repository is intended to stay lightweight and GitHub-friendly.

Recommended to commit:

source code,
CLI helpers,
small example FASTA files,
small example payloads/responses,
documentation,
CSV templates, and
reproducible configuration files.

Recommended to exclude:

full Warhead Hunter job output bundles,
downloaded PDB/mmCIF folders,
large generated molecular files,
private datasets,
credentials, tokens, and secrets,
temporary logs and local outputs.

Before committing, check repository size:

du -sh .
du -h --max-depth=1 . | sort -hr

Check for large files:

find . -type f -size +50M \
  -not -path "./.git/*" \
  -not -path "./outputs/*" \
  -exec ls -lh {} \;

🧠 Scientific Interpretation

Warhead Hunter Python helps automate job creation and structure/ligand metadata collection. It does not automatically decide whether a target, ligand, warhead, or exit vector is experimentally valid.

Use the generated outputs as decision-support material alongside:

binding pose quality,
structure resolution and model reliability,
ligand identity and chemical plausibility,
known SAR,
local protein-ligand interaction networks,
solvent exposure,
synthetic feasibility,
warhead reactivity,
linker geometry, and
biological context.

The recommended mindset is:

Batch automation creates candidates.
Warhead Hunter helps interpret candidates.
Medicinal chemistry expertise decides what to test.

🛠️ Troubleshooting

`warheadhunter: command not found`

Install the package in editable mode:

pip install -e .

Or run the module directly:

python -m warheadhunter_python.cli --help

FASTA path errors

Relative fasta_file paths are resolved relative to the directory that contains the CSV file, not your current shell directory. For example, in examples/ad_batch_targets.csv the following value resolves to examples/fastas/OGA_O60502.fasta:

fasta_file
fastas/OGA_O60502.fasta
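The resolution rule can be expressed with pathlib. This is a sketch of the rule as stated here, with a hypothetical helper name; it is not taken from batch.py.

```python
from pathlib import Path


def resolve_fasta_path(csv_path: str, fasta_file: str) -> Path:
    """Resolve a fasta_file value against the directory containing the CSV."""
    candidate = Path(fasta_file)
    if candidate.is_absolute():
        # Absolute paths are used as given.
        return candidate
    return Path(csv_path).parent / candidate


resolved = resolve_fasta_path(
    "examples/ad_batch_targets.csv",
    "fastas/OGA_O60502.fasta",
)
print(resolved.as_posix())  # examples/fastas/OGA_O60502.fasta
```

If a path error persists, printing the resolved path this way and checking that the file exists is usually the quickest diagnosis.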

API connection errors

Make sure at least one API endpoint is reachable:

warheadhunter health

The default fallback order is:

https://warheadhunter.com
http://cartman.rove-vernier.ts.net

For a local/private deployment, pass one or more URLs before the command:

warheadhunter \
  --base-url http://127.0.0.1:5000 \
  --base-url http://cartman.rove-vernier.ts.net \
  health

🧬 Citation

If you use Warhead Hunter Python, please cite the repository and the Warhead Hunter web platform:

Schulz, J.-M. Warhead Hunter Python: Batch Target Mining and Programmatic Job Submission for Warhead Hunter. GitHub repository: https://github.com/Joey305/warheadhunter-python.

Schulz, J.-M. Warhead Hunter: Structure-Aware Solvent Exposure Analysis for Warhead, Linker, and PROTAC Attachment Vector Discovery. Web platform: https://warheadhunter.com.

A manuscript for Warhead Hunter is currently in preparation. Once a preprint or publication is available, update this section with the formal citation, DOI, and manuscript link.

📬 Contact

For questions, bug reports, workflow support, or collaboration inquiries:

🧾 Repository Description

Python utilities for mining targets, preparing FASTA/payload files, and batch-submitting jobs to Warhead Hunter.

🙌 Practical Takeaway

Use Warhead Hunter Python when you want to turn this:

A list of targets + FASTA sequences + search queries

into this:

A reproducible folder of Warhead Hunter payloads, API responses, job IDs, and downloadable result bundles.
