A standalone GUI application for uploading large datasets to S3-compatible storage (originally developed for VUB Pixiu) with resilient, resume-friendly transfers that can survive network interruptions.
Pixiu Manager provides a user-friendly interface for managing large data uploads to S3 storage using rclone. Originally developed as part of the Veritrace monorepo, it has been extracted as a standalone tool for broader use.
- Resume-Friendly Uploads: Automatically resumes interrupted transfers, skipping already-uploaded files
- Progress Tracking: Real-time upload progress with speed and ETA estimates
- Remote Folder Management: Browse, create, move, and delete remote folders directly from the GUI
- Network Resilience: Built-in VPN keepalive and automatic retry mechanisms
- Sleep Prevention: Optional macOS caffeinate integration to prevent system sleep during uploads
- Parallel Transfers: Configurable parallel file transfers for optimal throughput
- Upload Statistics: Tracks upload history and provides time estimates based on past performance
- Configuration Persistence: Remembers your last settings and preferences
- macOS (tested), Linux (should work), Windows (untested)
- Python 3.7+
- rclone installed and configured
- tkinter (usually included with Python)
- Standard library modules: subprocess, threading, json, pathlib, re
- rclone: Must be installed and configured with your S3 remote
brew install rclone  # macOS
# or follow the rclone installation docs for your platform
1. Clone the repository:
git clone https://codeberg.org/research_coder/pixiu-manager.git
cd pixiu-manager
2. Configure rclone for your S3 storage:
rclone config
Follow the prompts to set up your S3 remote (e.g., VUB-Pixiu:).
3. Run the application:
python3 pixiu_uploader.py
1. Select Source Folder
   - Browse to your local folder containing files to upload
   - Only the contents of the folder will be uploaded, not the folder itself
2. Choose Destination
   - Use the "Browse Remote" button to navigate your S3 storage
   - Select or create a destination folder
3. Configure Options
   - Set parallel transfer count (default: 8)
   - Enable/disable VPN keepalive (keeps connection alive during long uploads)
   - Enable/disable caffeinate (prevents the Mac from sleeping)
4. Start Upload
   - Click "Start Upload" to begin
   - Monitor progress in the output log
   - Upload can be safely stopped and resumed later
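The caffeinate option in step 3 could be wired up roughly as follows. This is a minimal sketch, not the application's actual code: the function name `caffeinate_command` is hypothetical, and it assumes the macOS `caffeinate` utility is invoked with `-i` (prevent idle sleep) and `-w <pid>` (exit when the given process exits).

```python
import sys


def caffeinate_command(upload_pid, platform=sys.platform):
    """Return the caffeinate argument list, or None when not on macOS.

    Hypothetical helper: the real application may invoke caffeinate differently.
    """
    if platform != "darwin":
        # caffeinate only exists on macOS; other platforms get no command.
        return None
    # -i prevents idle sleep; -w makes caffeinate exit when upload_pid exits.
    return ["caffeinate", "-i", "-w", str(upload_pid)]
```

The returned list could be passed to `subprocess.Popen` alongside the upload process, so sleep prevention ends on its own when the upload does.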
The remote folder browser provides:
- Navigation: Double-click folders to browse
- Create: Make new folders (created automatically on first upload)
- Info: View folder size and file count
- Move: Rename or move folders
- Delete: Remove folders and their contents (with double confirmation)
If an upload is interrupted (VPN disconnect, system sleep, manual stop):
- Simply restart the application
- Select the same source and destination
- Click "Start Upload" again
- Already-uploaded files will be skipped automatically
- Partial files will be re-uploaded from the beginning
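The skip behaviour above is handled by rclone itself, which by default compares size and modification time before transferring a file. The sketch below only illustrates the idea with a simplified size comparison; `files_to_upload` and its dictionary inputs are hypothetical and not part of the application.

```python
from typing import Dict, List


def files_to_upload(local: Dict[str, int], remote: Dict[str, int]) -> List[str]:
    """Return the names of local files that still need uploading.

    Both arguments map a relative file name to its size in bytes. A file is
    skipped only when the remote already holds an entry of the same size, so
    missing and size-mismatched (for example partial) files are sent again.
    """
    return sorted(name for name, size in local.items() if remote.get(name) != size)
```

With `local = {"a.bin": 10, "b.bin": 20}` and `remote = {"a.bin": 10, "b.bin": 5}`, only `b.bin` is selected, which mirrors a partial file being re-uploaded from the beginning.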
Settings are automatically saved between sessions:
- Last used source folder
- Last used destination
- Parallel transfer count
- VPN keepalive preference
- Caffeinate preference
Configuration is stored in ~/pixiu-logs/config.json.
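Settings persistence of this kind can be sketched with the standard `json` and `pathlib` modules. The key names and default values below are assumptions chosen to match the list above; the actual layout of config.json may differ.

```python
import json
from pathlib import Path

# Hypothetical keys and defaults; the real config.json may use other names.
DEFAULTS = {
    "source": "",
    "destination": "",
    "transfers": 8,
    "vpn_keepalive": True,
    "caffeinate": True,
}


def load_config(path):
    """Load settings, falling back to DEFAULTS for missing or unreadable data."""
    try:
        stored = json.loads(Path(path).read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        stored = {}
    if not isinstance(stored, dict):
        stored = {}
    return {**DEFAULTS, **stored}


def save_config(path, config):
    """Write settings as JSON, creating the parent directory if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2))
```

Merging stored values over the defaults means a config file written by an older version still loads after new settings are added.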
All uploads are logged to ~/pixiu-logs/ with timestamped filenames:
- upload-YYYYMMDD-HHMMSS.log: Detailed rclone logs
- upload_history.json: Upload statistics and history
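A timestamped log name in the upload-YYYYMMDD-HHMMSS.log pattern can be produced as in the sketch below. The helper name `log_path` and its parameters are hypothetical.

```python
from datetime import datetime
from pathlib import Path


def log_path(log_dir=None, now=None):
    """Build a log file path such as ~/pixiu-logs/upload-20260102-030405.log."""
    log_dir = Path(log_dir) if log_dir is not None else Path.home() / "pixiu-logs"
    now = now or datetime.now()
    # The format spec yields YYYYMMDD-HHMMSS, matching the pattern above.
    return log_dir / f"upload-{now:%Y%m%d-%H%M%S}.log"
```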
Edit pixiu_uploader.py to configure your defaults:
# Lines 30-33
self.remote_name = "YOUR-REMOTE:" # Your rclone remote name
self.remote_base = "your-bucket" # Your S3 bucket name
self.local_base = "/path/to/data" # Default local folder
The VPN keepalive prevents VPN timeouts during long uploads by sending a ping every 5 minutes:
- Automatically stops when upload completes
- Can be toggled in the UI
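A periodic keepalive of this kind can be sketched with a background thread and a `threading.Event`. The class below is an illustration under assumptions, not the application's implementation: the `Keepalive` name is hypothetical, and what a "ping" actually does is left to the caller-supplied `ping` function.

```python
import threading


class Keepalive:
    """Call ping() every `interval` seconds until stop() is called."""

    def __init__(self, ping, interval=300.0):
        self._ping = ping
        self._interval = interval  # 300 s matches the 5-minute period above
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        # Setting the event wakes the waiting thread immediately.
        self._stop.set()
        self._thread.join()

    def _run(self):
        # wait() returns False on timeout (time to ping) and True once stopped.
        while not self._stop.wait(self._interval):
            self._ping()
```

Waiting on the event rather than sleeping lets `stop()` return promptly when the upload completes, instead of blocking for up to a full interval.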
The application tracks:
- Total jobs completed
- Total data uploaded
- Average upload speed
- Historical performance for time estimates
View statistics by clicking "View Stats" in the application.
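One simple way to turn upload history into a time estimate is to divide the pending size by the historical average speed. The sketch below assumes each history entry records `bytes` and `seconds`; those field names, and the functions themselves, are hypothetical and may not match upload_history.json.

```python
def average_speed(history):
    """Return the overall bytes-per-second rate, or None without usable history."""
    total_bytes = sum(job["bytes"] for job in history)
    total_seconds = sum(job["seconds"] for job in history)
    if total_seconds <= 0:
        return None
    return total_bytes / total_seconds


def estimate_seconds(size_bytes, history):
    """Estimate the upload duration for size_bytes from past performance."""
    speed = average_speed(history)
    if not speed:
        return None
    return size_bytes / speed
```

Dividing total bytes by total seconds weights long jobs more heavily than averaging per-job speeds would, which suits long-running uploads.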
Adjustable parallel transfer count:
- Higher values = faster uploads (if bandwidth allows)
- Lower values = more stable on unreliable connections
- Default: 8 parallel transfers
- Automatically adjusts checkers to 2× transfer count
Built-in rclone optimizations:
- 64MB chunk size
- Multipart uploads for files > 200MB
- Aggressive retry settings (30 retries, 50 low-level retries)
- Connection timeouts: 10m general, 5m connection
- Memory-mapped I/O for improved performance
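The settings above map naturally onto rclone command-line flags. The sketch below shows one plausible mapping. The use of the `rclone copy` subcommand and the function name `build_rclone_command` are assumptions; the exact flags the application passes are not documented here.

```python
def build_rclone_command(source, destination, transfers=8):
    """Return an rclone argument list reflecting the settings listed above."""
    return [
        "rclone", "copy", source, destination,  # "copy" subcommand is assumed
        "--transfers", str(transfers),
        "--checkers", str(2 * transfers),        # checkers = 2x transfer count
        "--s3-chunk-size", "64M",                # 64MB chunk size
        "--s3-upload-cutoff", "200M",            # multipart above 200MB
        "--retries", "30",
        "--low-level-retries", "50",
        "--timeout", "10m",                      # general I/O timeout
        "--contimeout", "5m",                    # connection timeout
        "--use-mmap",                            # memory-mapped buffers
    ]
```

For example, `build_rclone_command("/path/to/data", "YOUR-REMOTE:your-bucket/run1", transfers=4)` yields a command with 4 transfers and 8 checkers that could be handed to `subprocess.Popen`.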
Error: git@codeberg.org: Permission denied (publickey)
Solution: Add your SSH public key to your account on the Git hosting service (Codeberg here), or clone over HTTPS instead.
Error: rclone command not found
Solution: Install rclone and ensure it's in your PATH:
brew install rclone  # macOS
# or download from https://rclone.org/downloads/
Error: Failed to read: [remote] not found in config
Solution: Configure your rclone remote:
rclone config
If uploads fail or stall repeatedly:
- Check your VPN connection
- Enable VPN keepalive in the UI
- Reduce parallel transfer count
- Check the detailed logs in ~/pixiu-logs/
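The two rclone errors above can be caught before an upload starts with a small preflight check. This is a sketch under assumptions: `check_rclone` is a hypothetical helper, and it relies on `rclone listremotes` printing one remote per line with a trailing colon. The `which` and `list_remotes` parameters exist only so the check can be exercised without rclone installed.

```python
import shutil
import subprocess


def check_rclone(remote, which=shutil.which, list_remotes=None):
    """Return an error message, or None if rclone and the remote look usable."""
    if which("rclone") is None:
        return "rclone command not found: install rclone and add it to your PATH"
    if list_remotes is None:
        # Ask rclone for its configured remotes, e.g. "VUB-Pixiu:".
        output = subprocess.run(
            ["rclone", "listremotes"], capture_output=True, text=True, check=False
        ).stdout
    else:
        output = list_remotes()
    configured = {line.strip() for line in output.splitlines() if line.strip()}
    if remote not in configured:
        return f"{remote} not found in rclone config: run 'rclone config'"
    return None
```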
pixiu-manager/
├── pixiu_uploader.py    # Main GUI application
├── remote_browser.py    # Remote folder browser dialog
├── upload_manager.py    # Upload execution and statistics
└── README.md            # This file
python3 pixiu_uploader.py
- PixiuUploader: Main application class with tkinter GUI
- RemoteFolderBrowser: Modal dialog for browsing S3 folders
- UploadManager: Handles upload execution and progress parsing
- UploadStats: Tracks upload history and generates statistics
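Progress parsing, as handled by UploadManager, might look like the sketch below. It assumes rclone's stats output contains a line of the form `Transferred: 512.000 MiB / 1.000 GiB, 50%, 10.000 MiB/s, ETA 51s`; the exact format varies between rclone versions, and the function name and regular expression are illustrative rather than taken from the application.

```python
import re

# Assumed shape of rclone's byte-progress line; adjust for your rclone version.
PROGRESS_RE = re.compile(
    r"Transferred:\s+(?P<done>[\d.]+\s*\w+)\s*/\s*(?P<total>[\d.]+\s*\w+),"
    r"\s*(?P<percent>\d+)%,\s*(?P<speed>[\d.]+\s*\w+/s),\s*ETA\s*(?P<eta>\S+)"
)


def parse_progress(line):
    """Return a dict with done, total, percent, speed and eta, or None."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["percent"] = int(fields["percent"])
    return fields
```

Lines that do not match, such as the file-count summary or ordinary log messages, return `None` and can be passed through to the output log unchanged.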
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Make your changes with clear commit messages
- Submit a pull request
This project is licensed under the GNU General Public License v3.0 (GPL-3.0).
See LICENSE for details.
If you use this software in your research, please cite it:
@software{pixiu_manager,
title = {Pixiu Upload Manager},
author = {Wolf, Jeffrey C.},
year = {2026},
url = {https://codeberg.org/research_coder/pixiu-manager}
}
Or use the CITATION.cff file for automatic citation generation.
- Originally developed for the Veritrace project at VUB
- Built on top of the excellent rclone tool
- Extracted from the Veritrace monorepo for standalone use
For issues, questions, or feature requests:
- Open an issue on Codeberg
- Primary repository: https://codeberg.org/research_coder/pixiu-manager
- GitHub mirror: https://github.com/jeffcwolf/pixiu-manager
Note: This tool was developed for academic research data management and prioritizes reliability and resume-capability over speed for large, long-running uploads.