Pixiu Upload Manager

A standalone GUI application for uploading large datasets to S3-compatible storage (originally developed for VUB Pixiu) with resilient, resume-friendly transfers that can survive network interruptions.

Overview

Pixiu Manager provides a user-friendly interface for managing large data uploads to S3 storage using rclone. Originally developed as part of the Veritrace monorepo, it has been extracted as a standalone tool for broader use.

Key Features

  • 🔄 Resume-Friendly Uploads - Automatically resumes interrupted transfers, skipping already-uploaded files
  • 📊 Progress Tracking - Real-time upload progress with speed and ETA estimates
  • πŸ“ Remote Folder Management - Browse, create, move, and delete remote folders directly from the GUI
  • πŸ›‘οΈ Network Resilience - Built-in VPN keepalive and automatic retry mechanisms
  • πŸ’€ Sleep Prevention - Optional macOS caffeinate integration to prevent system sleep during uploads
  • ⚑ Parallel Transfers - Configurable parallel file transfers for optimal throughput
  • πŸ“ˆ Upload Statistics - Tracks upload history and provides time estimates based on past performance
  • πŸ’Ύ Configuration Persistence - Remembers your last settings and preferences

Requirements

System Requirements

  • macOS (tested), Linux (should work), Windows (untested)
  • Python 3.7+
  • rclone installed and configured

Python Dependencies

  • tkinter (usually included with Python)
  • Standard library modules: subprocess, threading, json, pathlib, re

External Dependencies

  • rclone: Must be installed and configured with your S3 remote
    brew install rclone  # macOS
    # or follow rclone installation docs for your platform

Installation

  1. Clone the repository:

    git clone https://codeberg.org/research_coder/pixiu-manager.git
    cd pixiu-manager
  2. Configure rclone for your S3 storage:

    rclone config

    Follow the prompts to set up your S3 remote (e.g., VUB-Pixiu:).

  3. Run the application:

    python3 pixiu_uploader.py

Usage

Basic Upload Workflow

  1. Select Source Folder

    • Browse to your local folder containing files to upload
    • Only the contents of the folder will be uploaded, not the folder itself
  2. Choose Destination

    • Use the "Browse Remote" button to navigate your S3 storage
    • Select or create a destination folder
  3. Configure Options

    • Set parallel transfer count (default: 8)
    • Enable/disable VPN keepalive (keeps connection alive during long uploads)
    • Enable/disable caffeinate (prevents Mac from sleeping)
  4. Start Upload

    • Click "Start Upload" to begin
    • Monitor progress in the output log
    • Upload can be safely stopped and resumed later
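
Under the hood, the workflow above amounts to one rclone invocation. The following is a minimal sketch of how such a command could be assembled, not the application's actual code: the function name `build_copy_command` and the example paths are hypothetical, and the use of `rclone copy` with `--transfers`, `--checkers`, and `--progress` is an assumption based on the behavior this README describes.

```python
from typing import List


def build_copy_command(source: str, remote: str, dest: str, transfers: int = 8) -> List[str]:
    """Assemble an `rclone copy` invocation as an argument list.

    `rclone copy` transfers the contents of `source`, not the folder itself,
    and skips files that already exist unchanged at the destination.
    """
    if transfers < 1:
        raise ValueError("transfers must be at least 1")
    # Normalize "VUB-Pixiu:" + "/bucket/path/" into "VUB-Pixiu:bucket/path".
    target = f"{remote.rstrip(':')}:{dest.strip('/')}"
    return [
        "rclone", "copy", source, target,
        "--transfers", str(transfers),
        "--checkers", str(transfers * 2),  # this README states checkers are 2x transfers
        "--progress",
    ]


if __name__ == "__main__":
    # Hypothetical paths and remote name, for illustration only.
    print(" ".join(build_copy_command("/data/run1", "VUB-Pixiu:", "my-bucket/run1")))
```

Returning a list rather than a shell string lets the result be passed directly to `subprocess.Popen` without quoting issues for paths that contain spaces.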

Remote Folder Management

The remote folder browser provides:

  • Navigation: Double-click folders to browse
  • Create: Make new folders (a new folder appears on the remote once the first file is uploaded into it)
  • Info: View folder size and file count
  • Move: Rename or move folders
  • Delete: Remove folders and their contents (with double confirmation)
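
Each browser action corresponds naturally to an rclone subcommand. The mapping below is a sketch under that assumption; the README does not say which commands the application really runs, and the function name `remote_commands` is hypothetical. The subcommands themselves (`lsf`, `mkdir`, `size`, `moveto`, `purge`) are standard rclone commands.

```python
from typing import Dict, List


def remote_commands(remote_path: str, new_path: str = "") -> Dict[str, List[str]]:
    """Map each browser action to a plausible rclone invocation."""
    return {
        # List only the sub-folders of the current location.
        "navigate": ["rclone", "lsf", "--dirs-only", remote_path],
        "create": ["rclone", "mkdir", remote_path],
        # Reports total object count and bytes as JSON.
        "info": ["rclone", "size", "--json", remote_path],
        "move": ["rclone", "moveto", remote_path, new_path],
        # purge removes the folder and everything in it, hence the double confirmation.
        "delete": ["rclone", "purge", remote_path],
    }
```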

Resume Interrupted Uploads

If an upload is interrupted (VPN disconnect, system sleep, manual stop):

  1. Simply restart the application
  2. Select the same source and destination
  3. Click "Start Upload" again
  4. Already-uploaded files will be skipped automatically
  5. Partial files will be re-uploaded from the beginning
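
The skip behavior in steps 4 and 5 can be illustrated with a small comparison. This is a simplified sketch, not rclone's implementation: by default rclone compares both size and modification time, while this example compares size only to stay short. The function name `files_needing_upload` and the sample listings are hypothetical.

```python
from typing import Dict, List


def files_needing_upload(local: Dict[str, int], remote: Dict[str, int]) -> List[str]:
    """Return relative paths that are missing remotely or differ in size.

    Both arguments map a relative path to a file size in bytes.
    """
    return sorted(
        path for path, size in local.items()
        if remote.get(path) != size  # missing (None) or size mismatch
    )


if __name__ == "__main__":
    local = {"a.bin": 100, "b.bin": 200, "c.bin": 300}
    remote = {"a.bin": 100, "b.bin": 150}  # b.bin differs, c.bin never arrived
    print(files_needing_upload(local, remote))  # ['b.bin', 'c.bin']
```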

Configuration

Application Settings

Settings are automatically saved between sessions:

  • Last used source folder
  • Last used destination
  • Parallel transfer count
  • VPN keepalive preference
  • Caffeinate preference

Configuration is stored in ~/pixiu-logs/config.json.
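
Persisting settings like these usually comes down to a JSON round trip. The sketch below assumes hypothetical key names and default values; only the file location `~/pixiu-logs/config.json` and the default of 8 transfers come from this README.

```python
import json
from pathlib import Path
from typing import Any, Dict

CONFIG_PATH = Path.home() / "pixiu-logs" / "config.json"

# Hypothetical key names and defaults; the real file may differ.
DEFAULTS: Dict[str, Any] = {
    "source_folder": "",
    "destination": "",
    "transfers": 8,
    "vpn_keepalive": True,
    "caffeinate": True,
}


def load_config(path: Path = CONFIG_PATH) -> Dict[str, Any]:
    """Return saved settings merged over defaults; use defaults if the file is missing or invalid."""
    try:
        saved = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):  # json.JSONDecodeError subclasses ValueError
        return dict(DEFAULTS)
    if not isinstance(saved, dict):
        return dict(DEFAULTS)
    return {**DEFAULTS, **saved}


def save_config(settings: Dict[str, Any], path: Path = CONFIG_PATH) -> None:
    """Write settings as JSON, creating the directory if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(settings, indent=2), encoding="utf-8")
```

Merging over defaults means a config file written by an older version, lacking newer keys, still loads cleanly.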

Upload Logs

All uploads are logged to ~/pixiu-logs/ with timestamped filenames:

  • upload-YYYYMMDD-HHMMSS.log - Detailed rclone logs
  • upload_history.json - Upload statistics and history
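
The timestamped filename pattern above maps directly onto `strftime`. A minimal sketch, with the helper name `log_path_for` being hypothetical:

```python
from datetime import datetime
from pathlib import Path
from typing import Optional

LOG_DIR = Path.home() / "pixiu-logs"


def log_path_for(started: Optional[datetime] = None, log_dir: Path = LOG_DIR) -> Path:
    """Build the upload-YYYYMMDD-HHMMSS.log path for a job start time."""
    started = started or datetime.now()
    return log_dir / f"upload-{started.strftime('%Y%m%d-%H%M%S')}.log"


if __name__ == "__main__":
    print(log_path_for(datetime(2025, 3, 7, 14, 5, 9)).name)  # upload-20250307-140509.log
```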

Customizing for Your S3 Storage

Edit pixiu_uploader.py to configure your defaults:

# Lines 30-33
self.remote_name = "YOUR-REMOTE:"  # Your rclone remote name
self.remote_base = "your-bucket"   # Your S3 bucket name
self.local_base = "/path/to/data"  # Default local folder

Advanced Features

VPN Keepalive

Prevents VPN timeout during long uploads by pinging every 5 minutes:

  • Automatically stops when upload completes
  • Can be toggled in the UI
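
A periodic keepalive that stops promptly when the upload ends can be built on `threading.Event`. This is a sketch of the general pattern, not the application's code: the class name `Keepalive` is hypothetical, and what the `ping` callable does (for example, running a system ping against a host behind the VPN) is left to the caller.

```python
import threading
from typing import Callable


class Keepalive:
    """Call `ping` every `interval` seconds on a background thread until stopped."""

    def __init__(self, ping: Callable[[], None], interval: float = 300.0) -> None:
        self._ping = ping
        self._interval = interval  # 300 s = the 5 minutes mentioned above
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self) -> None:
        # Event.wait returns True as soon as stop() is called, so the loop
        # exits immediately instead of sleeping out the remaining interval.
        while not self._stop.wait(self._interval):
            try:
                self._ping()
            except Exception:
                pass  # a failed ping must not kill the keepalive thread

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        if self._thread.is_alive():
            self._thread.join()
```

Using `Event.wait` as the sleep is what makes "automatically stops when upload completes" cheap to implement: the upload code only has to call `stop()` when rclone exits.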

Upload Statistics

The application tracks:

  • Total jobs completed
  • Total data uploaded
  • Average upload speed
  • Historical performance for time estimates

View statistics by clicking "📊 View Stats" in the application.
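
Statistics of this kind can be derived from a list of past jobs. The sketch below assumes each history entry records `bytes` and `seconds`; those field names, like the function names, are hypothetical and may not match the layout of `upload_history.json`.

```python
from typing import Dict, List, Optional


def summarize(history: List[Dict[str, float]]) -> Dict[str, float]:
    """Aggregate job count, total bytes, and average speed over past uploads."""
    total_bytes = sum(job["bytes"] for job in history)
    total_seconds = sum(job["seconds"] for job in history)
    return {
        "jobs": len(history),
        "total_bytes": total_bytes,
        # Weighted by duration: total bytes over total time, not a mean of per-job speeds.
        "avg_bytes_per_sec": total_bytes / total_seconds if total_seconds else 0.0,
    }


def estimate_seconds(pending_bytes: float, history: List[Dict[str, float]]) -> Optional[float]:
    """Estimate upload time from historical speed; None when there is no usable history."""
    speed = summarize(history)["avg_bytes_per_sec"]
    return pending_bytes / speed if speed else None
```

For example, two past jobs of 1000 bytes in 10 s and 3000 bytes in 30 s give an average of 100 bytes/s, so 500 pending bytes are estimated at 5 s.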

Parallel Transfers

Adjustable parallel transfer count:

  • Higher values = faster uploads (if bandwidth allows)
  • Lower values = more stable on unreliable connections
  • Default: 8 parallel transfers
  • Automatically adjusts checkers to 2× transfer count

S3 Optimization

Built-in rclone optimizations:

  • 64MB chunk size
  • Multipart uploads for files > 200MB
  • Aggressive retry settings (30 retries, 50 low-level retries)
  • Timeouts: 10m general (idle I/O), 5m for establishing a connection
  • Memory-mapped I/O for improved performance
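
The settings above correspond to existing rclone flags. The mapping below is an assumption about how the application passes them; the flags themselves are real rclone options, and rclone size suffixes such as `64M` are binary (MiB). The second function works through the multipart arithmetic: files larger than the cutoff are split into chunk-sized parts.

```python
from typing import List

CHUNK_BYTES = 64 * 1024 ** 2    # 64 MiB chunk size
CUTOFF_BYTES = 200 * 1024 ** 2  # 200 MiB multipart cutoff


def s3_tuning_flags() -> List[str]:
    """rclone flags matching the optimizations listed above (assumed mapping)."""
    return [
        "--s3-chunk-size", "64M",
        "--s3-upload-cutoff", "200M",
        "--retries", "30",
        "--low-level-retries", "50",
        "--timeout", "10m",
        "--contimeout", "5m",
        "--use-mmap",
    ]


def multipart_parts(size_bytes: int) -> int:
    """Number of upload parts for a file: 1 at or below the cutoff, else ceil(size / chunk)."""
    if size_bytes <= CUTOFF_BYTES:
        return 1
    return -(-size_bytes // CHUNK_BYTES)  # integer ceiling division


if __name__ == "__main__":
    print(multipart_parts(1024 ** 3))  # a 1 GiB file uploads as 16 parts
```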

Troubleshooting

Permission Denied Errors

git@codeberg.org: Permission denied (publickey)

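Solution: This error comes from cloning the repository over SSH, not from S3. Add your SSH public key to your Codeberg account, or clone over HTTPS as shown under Installation.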

Rclone Not Found

Error: rclone command not found

Solution: Install rclone and ensure it's in your PATH:

brew install rclone  # macOS
# or download from https://rclone.org/downloads/
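
To confirm from Python whether rclone is reachable on your PATH, the standard library's `shutil.which` is enough. A small sketch, with the helper name `find_tool` being hypothetical:

```python
import shutil
from typing import Optional


def find_tool(name: str = "rclone") -> Optional[str]:
    """Return the full path of an executable on PATH, or None if it is not found."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_tool()
    print(f"rclone found at {path}" if path else "rclone is not on PATH")
```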

Remote Configuration Issues

Error: Failed to read: [remote] not found in config

Solution: Configure your rclone remote:

rclone config
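
You can also verify that a remote exists before starting an upload. The sketch below relies on `rclone listremotes`, which prints one configured remote per line with a trailing colon; the function names are hypothetical, and the remote name in the example is only illustrative.

```python
import subprocess
from typing import List


def parse_remotes(listremotes_output: str) -> List[str]:
    """Turn `rclone listremotes` output into names without the trailing colon."""
    return [
        line.strip().rstrip(":")
        for line in listremotes_output.splitlines()
        if line.strip()
    ]


def has_remote(remote: str, listremotes_output: str) -> bool:
    """Check for a remote by name; accepts 'VUB-Pixiu' or 'VUB-Pixiu:'."""
    return remote.rstrip(":") in parse_remotes(listremotes_output)


def remote_is_configured(remote: str) -> bool:
    """Run rclone and check the result; False if rclone is missing or fails."""
    try:
        result = subprocess.run(
            ["rclone", "listremotes"], capture_output=True, text=True, check=False
        )
    except OSError:
        return False
    return result.returncode == 0 and has_remote(remote, result.stdout)
```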

Upload Stalls or Fails

  • Check your VPN connection
  • Enable VPN keepalive in the UI
  • Reduce parallel transfer count
  • Check the detailed logs in ~/pixiu-logs/

Development

Project Structure

pixiu-manager/
├── pixiu_uploader.py    # Main GUI application
├── remote_browser.py    # Remote folder browser dialog
├── upload_manager.py    # Upload execution and statistics
└── README.md            # This file

Running from Source

python3 pixiu_uploader.py

Code Overview

  • PixiuUploader: Main application class with tkinter GUI
  • RemoteFolderBrowser: Modal dialog for browsing S3 folders
  • UploadManager: Handles upload execution and progress parsing
  • UploadStats: Tracks upload history and generates statistics
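
Progress parsing typically means matching rclone's stats line with a regular expression. The following is an illustrative sketch rather than the code in UploadManager: the sample line is an assumed example of rclone's `Transferred:` output, whose exact format varies between rclone versions, and the function name `parse_progress` is hypothetical.

```python
import re
from typing import Dict, Optional

# Matches the byte-count stats line, e.g.
# "Transferred:   1.234 GiB / 5.000 GiB, 25%, 10.500 MiB/s, ETA 6m2s"
PROGRESS_RE = re.compile(
    r"Transferred:\s+(?P<done>[\d.]+\s*\S+)\s*/\s*(?P<total>[\d.]+\s*\S+),"
    r"\s*(?P<percent>\d+)%,\s*(?P<speed>[\d.]+\s*\S+),\s*ETA\s+(?P<eta>\S+)"
)


def parse_progress(line: str) -> Optional[Dict[str, object]]:
    """Extract bytes done, total, percent, speed, and ETA; None if the line does not match."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    fields: Dict[str, object] = dict(match.groupdict())
    fields["percent"] = int(match.group("percent"))
    return fields
```

Lines that lack a speed and ETA, such as the file-count line rclone prints alongside the byte-count line, return `None` and can simply be ignored by the caller.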

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes with clear commit messages
  4. Submit a pull request

License

This project is licensed under the GNU General Public License v3.0 (GPL-3.0).

See LICENSE for details.

Citation

If you use this software in your research, please cite it:

@software{pixiu_manager,
  title = {Pixiu Upload Manager},
  author = {Wolf, Jeffrey C.},
  year = {2026},
  url = {https://codeberg.org/research_coder/pixiu-manager}
}

Or use the CITATION.cff file for automatic citation generation.

Acknowledgments

  • Originally developed for the Veritrace project at VUB
  • Built on top of the excellent rclone tool
  • Extracted from the Veritrace monorepo for standalone use

Support

For issues, questions, or feature requests, open an issue in the repository.


Note: This tool was developed for academic research data management and prioritizes reliability and resume-capability over speed for large, long-running uploads.
