triwinds/bytehaul

bytehaul

An async HTTP download library for Rust, with Python bindings published on PyPI. It supports resume, multi-connection downloads, a write-back cache, rate limiting, and checksum verification.

Documentation

Features

  • Single & multi-connection downloads — automatic Range probing and fallback
  • Pause / resume — cooperative pause with persisted control files for later continuation
  • Write-back cache — piece-based aggregation to reduce random I/O
  • Memory budget & backpressure — semaphore-based flow control
  • Retry with exponential backoff — configurable max retries, respects Retry-After
  • Rate limiting — shared token-bucket across all workers
  • SHA-256 checksum verification — post-download integrity check
  • Cancellation — cooperative cancel via stop signal
  • Progress reporting — real-time speed, ETA, downloaded bytes, and state
  • Shared network configuration — proxy, custom DNS servers, and IPv6 toggle on the downloader client
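As an illustration of the retry feature listed above, here is a minimal sketch of exponential backoff that honors a server-supplied Retry-After value. It is not bytehaul's implementation: the function name backoff_delay, the base and cap defaults, and the rule "never retry sooner than Retry-After" are all assumptions made for the example.

```python
from typing import Optional


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  retry_after: Optional[float] = None) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    Hypothetical helper: names and defaults are illustrative only.
    """
    # Exponential growth: base, 2*base, 4*base, ... limited to `cap`.
    delay = min(cap, base * (2 ** attempt))
    # If the server sent Retry-After (in seconds), do not retry sooner.
    if retry_after is not None:
        delay = max(delay, retry_after)
    return delay


if __name__ == "__main__":
    for attempt in range(5):
        print(attempt, backoff_delay(attempt))
```

Real downloaders often add random jitter to the delay so that many clients do not retry in lockstep; that is left out here to keep the sketch deterministic.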

Installation

Rust

Add bytehaul to your project via Cargo:

cargo add bytehaul

Or add it manually to your Cargo.toml:

[dependencies]
bytehaul = "0.1.3"

Python

pip install bytehaul

Requires Python 3.9+. A single wheel per platform covers all supported Python versions (abi3).

Quick Start (Rust)

use bytehaul::{DownloadSpec, Downloader};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let downloader = Downloader::builder().build()?;

    let spec = DownloadSpec::new("https://example.com/largefile.zip")
        .output_path("largefile.zip");

    let handle = downloader.download(spec);
    handle.wait().await?;
    println!("Download complete!");
    Ok(())
}

For configuration, progress monitoring, cancellation, and more, see the Advanced Usage Guide.

If you omit output_path, bytehaul chooses a filename automatically: first from the Content-Disposition header, then from the URL path, and finally the fallback name download. Combine this with .output_dir("downloads") to control the destination directory. Absolute output_path values are still accepted when output_dir is not set.
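The filename fallback order described above can be sketched with the Python standard library alone. This is an illustration of the order (Content-Disposition, then URL path, then download), not bytehaul's code: the function pick_filename and the path-stripping step are assumptions added for the example.

```python
import posixpath
from email.message import Message
from typing import Optional
from urllib.parse import unquote, urlsplit


def _safe_basename(name: str) -> str:
    """Keep only the last path component; reject '.', '..' and empty names."""
    name = posixpath.basename(name.replace("\\", "/"))
    return "" if name in ("", ".", "..") else name


def pick_filename(url: str, content_disposition: Optional[str] = None) -> str:
    """Hypothetical helper mirroring the documented fallback order."""
    # 1. Prefer the filename parameter of the Content-Disposition header.
    if content_disposition:
        msg = Message()
        msg["Content-Disposition"] = content_disposition
        header_name = msg.get_filename()
        if header_name:
            # Assumption: strip directories so a header cannot pick a location.
            name = _safe_basename(header_name)
            if name:
                return name
    # 2. Otherwise use the last segment of the URL path (query string ignored).
    name = _safe_basename(unquote(urlsplit(url).path))
    if name:
        return name
    # 3. Last resort: the literal name "download".
    return "download"


if __name__ == "__main__":
    print(pick_filename("https://example.com/files/archive.tar.gz?token=1"))
    print(pick_filename("https://example.com/"))
```

Stripping directory components from the header value is a defensive choice in this sketch; the README does not say how bytehaul sanitizes server-provided names.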

Quick Start (Python)

import bytehaul

# Simple one-line download
bytehaul.download("https://example.com/file.bin", output_path="output.bin")

# Automatic filename detection into a directory
bytehaul.download("https://example.com/file.bin", output_dir="downloads")

# With options
bytehaul.download(
    "https://example.com/file.bin",
    output_path="output.bin",
    max_connections=8,
    max_download_speed=1_000_000,  # 1 MB/s
)

For the full Python API (object API, progress, cancellation, error handling, etc.), see the Python Bindings Guide.
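The max_download_speed option in the example above caps throughput, and the feature list describes the mechanism as a token bucket shared by all workers. The following is a rough sketch of that idea, not the library's internals: the TokenBucket class, its wait_time method, and the injectable clock are hypothetical and exist only to make the behavior easy to follow.

```python
import threading
import time
from typing import Callable


class TokenBucket:
    """Illustrative shared byte budget: `rate` bytes/second, burst of `capacity`."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity          # start with a full bucket
        self.last = clock()
        self._lock = threading.Lock()   # one bucket is shared by every worker

    def wait_time(self, nbytes: int) -> float:
        """Charge `nbytes` and return how long the caller should sleep first."""
        with self._lock:
            now = self.clock()
            # Refill for the time elapsed since the last call, up to capacity.
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            # Spend the tokens; a negative balance is debt the caller waits out.
            self.tokens -= nbytes
            if self.tokens >= 0:
                return 0.0
            return -self.tokens / self.rate


if __name__ == "__main__":
    bucket = TokenBucket(rate=1_000_000, capacity=1_000_000)
    for _ in range(3):
        # Each worker would sleep for this long before using its chunk.
        time.sleep(bucket.wait_time(256 * 1024))
```

Because every worker charges the same bucket, the combined rate stays near the configured limit no matter how many connections are open.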

Architecture

DownloadManager
  └─ DownloadSession
       ├─ Scheduler (piece assignment, segment reclamation)
       ├─ HttpWorker ×N (Range requests, retry)
       │    └─ channel ─→ Writer (WriteBackCache → FileWriter)
       └─ ControlStore (atomic save/load/delete)
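To make the diagram more concrete, here is a small sketch of two of its ideas: splitting a file into pieces that map to HTTP Range requests, and a write-back cache that collects chunks per piece and writes each finished piece in one operation. It is a simplified model under stated assumptions, not bytehaul's code: split_pieces and this WriteBackCache class are hypothetical, and chunks within a piece are assumed to arrive in order.

```python
import io
from typing import BinaryIO, Dict, List, Tuple


def split_pieces(total: int, piece_size: int) -> List[Tuple[int, int]]:
    """Inclusive byte ranges, as in an HTTP `Range: bytes=start-end` header."""
    return [(start, min(start + piece_size, total) - 1)
            for start in range(0, total, piece_size)]


class WriteBackCache:
    """Buffers chunks per piece, then issues one seek + write per piece."""

    def __init__(self, out: BinaryIO, pieces: List[Tuple[int, int]]):
        self.out = out
        self.pieces = pieces
        self.buffers: Dict[int, bytearray] = {}
        self.writes = 0  # number of write calls actually issued

    def push(self, piece_index: int, chunk: bytes) -> None:
        # Assumption: chunks of one piece arrive in order from a single worker.
        start, end = self.pieces[piece_index]
        buf = self.buffers.setdefault(piece_index, bytearray())
        buf.extend(chunk)
        if len(buf) >= end - start + 1:
            # Piece complete: write it at its offset and free the memory.
            self.out.seek(start)
            self.out.write(buf)
            self.writes += 1
            del self.buffers[piece_index]


if __name__ == "__main__":
    data = b"0123456789"
    pieces = split_pieces(len(data), 4)
    out = io.BytesIO()
    cache = WriteBackCache(out, pieces)
    # Chunks from different workers arrive interleaved.
    cache.push(1, data[4:6])
    cache.push(0, data[0:3])
    cache.push(2, data[8:10])
    cache.push(1, data[6:8])
    cache.push(0, data[3:4])
    print(out.getvalue(), cache.writes)
```

Five incoming chunks become three writes here, one per piece, which is the kind of reduction in random I/O the write-back cache feature refers to.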

License

MIT. See LICENSE.
