File downloader for archive.org
Made with 💝 by 🤖
Simply pass the URL of an archive.org details page you want to download and ia-get
will automatically get the XML metadata and download all files to the current working directory.
ia-get https://archive.org/details/<identifier>
I wanted to download high-quality scans of ZZap!64 magazine and some read-only memory from archive.org. Archives of this type often include many large files. Torrents are not always provided, and when they are available they do not index all the available files in the archive.
For every page, archive.org publishes an XML document that indexes every available file.
So I co-authored ia-get to automate the download process.
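As a rough illustration of that idea, here is a minimal sketch of turning a details URL into the URL of the XML file index. The function names are made up for this example, and the `<identifier>_files.xml` location under `/download/` is an assumption about archive.org's layout rather than a copy of what ia-get does internally.

```rust
/// Extract the item identifier from an archive.org details URL.
/// Returns `None` if the URL does not look like a details page.
/// (Hypothetical helper, not taken from the ia-get source.)
fn identifier_from_details_url(url: &str) -> Option<&str> {
    let rest = url.strip_prefix("https://archive.org/details/")?;
    // Keep only the first path segment; drop any trailing slash, query, or fragment.
    let id = rest.split(|c: char| c == '/' || c == '?' || c == '#').next()?;
    if id.is_empty() {
        None
    } else {
        Some(id)
    }
}

/// Build the URL of the XML file index for an item.
/// Assumption: the index lives at /download/<id>/<id>_files.xml.
fn files_xml_url(identifier: &str) -> String {
    format!(
        "https://archive.org/download/{id}/{id}_files.xml",
        id = identifier
    )
}
```

Calling `identifier_from_details_url` on one of the test URLs further down and passing the result to `files_xml_url` gives the address of the XML document to fetch and parse.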
- 🔽 Reliably download files from the Internet Archive
- 🌳 Preserves the original directory structure
- 🔄 Automatically resumes partial or failed downloads
- 🔏 Hash checks to confirm file integrity
- 🌱 Can be run multiple times to update existing downloads
- 📊 Gets all the metadata for the archive
- 📦️ Available for Linux 🐧 macOS 🍏 and Windows 🪟
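To give a feel for how resuming and re-running could work, here is a minimal sketch of the decision made per file. It assumes the expected size of each file is known from the item metadata. The `Action` enum and both functions are hypothetical names for illustration and are not ia-get's actual implementation.

```rust
/// What to do with a file that may already exist locally.
#[derive(Debug, PartialEq)]
enum Action {
    /// Local copy has the expected size; nothing to fetch (its hash can still be checked).
    Skip,
    /// Local copy is partial; continue from this byte offset.
    Resume(u64),
    /// No usable local copy; fetch from the beginning.
    Download,
}

/// Decide how to handle one file, given its local size (if it exists)
/// and the size the remote metadata says it should have.
fn plan_download(local_size: Option<u64>, remote_size: u64) -> Action {
    match local_size {
        Some(n) if n == remote_size => Action::Skip,
        None | Some(0) => Action::Download,
        Some(n) if n < remote_size => Action::Resume(n),
        // Local file is larger than expected, so it cannot be a partial download.
        Some(_) => Action::Download,
    }
}

/// Value for an HTTP `Range` request header that asks for everything
/// from `offset` to the end of the file.
fn range_header(offset: u64) -> String {
    format!("bytes={}-", offset)
}
```

A size match alone does not prove the file is intact, which is why a hash check after the transfer is still worthwhile.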
You can use ia-get to download files from archive.org, including all the metadata and the .torrent file, if there is one.
You can then start seeding the torrent using a pristine copy of the archive and a complete file set.
Such as it is.
cargo build
You can run the built-in unit tests with:
cargo test
This will run tests that verify URL pattern validation and other core functionality.
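For readers new to Rust, a URL validation test of this kind might look roughly like the sketch below. The `is_valid_details_url` function and its rules (HTTPS only, an identifier made of ASCII letters, digits, `_`, `-` and `.`) are assumptions made for this example, not the checks ia-get actually performs.

```rust
/// Hypothetical validator: accepts only
/// https://archive.org/details/<identifier> with an optional trailing slash.
fn is_valid_details_url(url: &str) -> bool {
    let rest = match url.strip_prefix("https://archive.org/details/") {
        Some(rest) => rest,
        None => return false,
    };
    // Allow a single trailing slash after the identifier.
    let id = rest.strip_suffix('/').unwrap_or(rest);
    // Assumed identifier alphabet: ASCII letters, digits, '_', '-' and '.'.
    !id.is_empty()
        && id
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || matches!(c, '_' | '-' | '.'))
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn accepts_details_urls() {
        assert!(is_valid_details_url("https://archive.org/details/deftributetozzap64"));
        assert!(is_valid_details_url("https://archive.org/details/zzapp_64_issue_001_600dpi/"));
    }

    #[test]
    fn rejects_other_urls() {
        assert!(!is_valid_details_url("https://archive.org/download/deftributetozzap64"));
        assert!(!is_valid_details_url("https://archive.org/details/"));
        assert!(!is_valid_details_url("https://example.com/details/deftributetozzap64"));
    }
}
```

Tests inside a `#[cfg(test)]` module like this are compiled and run only by `cargo test`, so they add nothing to the release binary.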
I used these commands to test ia-get during development.
ia-get https://archive.org/details/deftributetozzap64
ia-get https://archive.org/details/zzapp_64_issue_001_600dpi
This program is an experiment 🧪 In late 2023, it was initially co-authored using Chatty Jeeps. When I started this project, I had no experience 👶 with Rust and was curious to see if I could use AI tools to assist in developing a program in a language I do not know.
As featured on the Linux Matters podcast! 🎙️ I am a presenter on Linux Matters and we discussed how the initial version of the program was created using Chatty Jeeps (ChatGPT-4) in Episode 16 - Blogging to the Fediverse.
I discussed that process, its successes, and drawbacks. In a future episode, we will discuss the latest version of the project.
Since that initial MVP, I used Unfold.ai to add features and improve the code 🧑‍💻. All commits from October 27, 2023, until the end of December 2023 that were AI co-authored have full details of the AI contribution in the commit messages. Linux Matters listener Daniel Dewberry submitted a "peer review" of ia-get in January 2024. The project had little development activity until May 2025, when I incorporated the improvements Daniel had suggested.
I've picked up some Rust along the way. Some of the refactoring and redesign comes directly from my brain 🧠, with some assistance from GitHub Copilot using Claude Sonnet 3.7 and Gemini Pro 2.5.