A powerful Puppeteer-based bot that automatically downloads and archives web pages using saveweb2zip.com. Perfect for creating offline backups of websites with their complete structure and assets.
- 🚀 Batch download multiple URLs from a JSON file
- 📁 Automatic file organization with smart naming
- ⏱️ Timestamp-based versioning for duplicate pages
- 🔄 Preserves website structure and assets
- 🛡️ Built-in error handling and retry mechanisms
- 📊 Detailed progress logging
- Node.js (v14 or higher)
- npm (Node Package Manager)
- Clone the repository:

  ```bash
  git clone https://github.com/terminalDZ/saveweb2zip-bot.git
  cd saveweb2zip-bot
  ```

- Install dependencies:

  ```bash
  npm install
  ```

- Create or edit `urls.json` with your target URLs:

  ```json
  [
    "https://example.com/page1.html",
    "https://example.com/page2.html"
  ]
  ```

- Run the bot:

  ```bash
  node bot.js
  ```

The bot will:
- Process each URL in your `urls.json` file
- Download the complete website structure
- Save ZIP files in the `zip` directory
- Automatically handle duplicate filenames by adding timestamps
Downloaded files are saved in the `zip` directory with the following naming convention:
- First download: `pagename.zip`
- Subsequent downloads: `pagename_TIMESTAMP.zip`

Example:

```
zip/
├── about.zip
├── about_2025-01-11T06-34-41.zip
└── contact.zip
```
- Smart File Naming: Automatically extracts page names from URLs
- Duplicate Handling: Uses timestamps to prevent overwriting
- Progress Monitoring: Detailed console logging of each step
- Error Recovery: Built-in retry mechanism for failed downloads
- File Verification: Ensures downloaded files are complete and valid
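A generic retry wrapper along these lines could back the error-recovery behavior (the attempt count and delay are illustrative assumptions, not the bot's actual settings):

```javascript
// Retry an async operation a fixed number of times with a delay
// between attempts, rethrowing the last error if all attempts fail.
async function withRetry(fn, attempts = 3, delayMs = 2000) {
  let lastErr;
  for (let i = 1; i <= attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastErr = err;
      console.warn(`Attempt ${i}/${attempts} failed: ${err.message}`);
      if (i < attempts) await new Promise((r) => setTimeout(r, delayMs));
    }
  }
  throw lastErr;
}

module.exports = { withRetry };
```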
Contributions are welcome! Feel free to:
- Report bugs
- Suggest new features
- Submit pull requests
Idriss Boukmouche
- GitHub: @terminalDZ
- Puppeteer - Headless Chrome Node.js API
- SaveWeb2Zip - Website archiving service
⭐️ If you find this project useful, please consider giving it a star on GitHub!