Cloud Scraper

Cloud data export GUI — download your personal data from Google, Microsoft, and Apple.
Built by OpenFactory

Features

  • Google — Gmail, Contacts, Calendar, Drive, Photos
  • Microsoft — Outlook Mail, Contacts, Calendar, OneDrive
  • Apple (experimental) — Contacts, Calendar, iCloud Drive

Connect to one or more providers, select the data types you want, and export everything to a local folder. Exports are written straight to disk on your machine, with no intermediary servers involved.

Install

Option A: Debian/Ubuntu package

sudo apt install cloud-scraper

Available from the OpenFactory package repository. Installs to /opt/openfactory/cloud-scraper/ with a desktop entry and /usr/bin/cloud-scraper launcher.

Option B: From source

git clone https://github.com/openFactory-ai/cloud-scraper.git
cd cloud-scraper
./setup.sh

The setup script installs system dependencies (GTK4, libadwaita, PyGObject) and creates a virtual environment with all Python dependencies.

Usage

# If installed via package
cloud-scraper

# If running from source
python -m data_scraper

Provider Setup

Google

OAuth credentials are bundled with the .deb package. When running from source, place your OAuth credentials at one of:

  • ~/.config/data-scraper/google-credentials.json
  • ./credentials/google.json

Create credentials at Google Cloud Console (Desktop app type).
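
A minimal sketch of how a lookup across the two locations above could work. The function name `find_google_credentials` and the search order are assumptions for illustration, not the project's actual code.

```python
from pathlib import Path
from typing import Optional, Sequence

# Candidate locations taken from the list above.
# Checking the per-user config path first is an assumption.
CANDIDATES = [
    Path.home() / ".config" / "data-scraper" / "google-credentials.json",
    Path("credentials") / "google.json",
]


def find_google_credentials(candidates: Sequence[Path] = CANDIDATES) -> Optional[Path]:
    """Return the first candidate that exists as a regular file, else None."""
    for path in candidates:
        if path.is_file():
            return path
    return None
```

Calling `find_google_credentials()` from the repository root returns the matching path, or `None` if neither file is present.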

Microsoft

An embedded Azure AD client ID is included. To use your own, create ~/.config/data-scraper/microsoft.json:

{"client_id": "your-azure-app-client-id"}

Register an app at Azure Portal with the "Mobile and desktop applications" platform.
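
A hedged sketch of the override behaviour described above: read `client_id` from the config file if it exists, otherwise fall back to a built-in value. `load_microsoft_client_id` and `EMBEDDED_CLIENT_ID` are hypothetical names, and the placeholder string stands in for the real embedded ID, which this README does not show.

```python
import json
from pathlib import Path

# Hypothetical placeholder; the real embedded client ID is not given in this README.
EMBEDDED_CLIENT_ID = "embedded-client-id-placeholder"


def load_microsoft_client_id(config_path=None, default=EMBEDDED_CLIENT_ID):
    """Return client_id from the override file if present and valid, else the default."""
    if config_path is None:
        config_path = Path.home() / ".config" / "data-scraper" / "microsoft.json"
    try:
        data = json.loads(Path(config_path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing, unreadable, or malformed file: keep the default.
        return default
    if not isinstance(data, dict):
        return default
    client_id = data.get("client_id")
    if isinstance(client_id, str) and client_id:
        return client_id
    return default
```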

Apple (experimental)

Uses Apple ID + app-specific password (no OAuth). Generate an app-specific password at appleid.apple.com under Sign-In and Security.

Export Formats

Data Type          Format
Email              .eml files (one per message)
Contacts           .vcf (vCard 3.0)
Calendar           .ics (iCalendar)
Drive / OneDrive   Original files
Photos             Original files
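
To illustrate the "one .eml file per message" layout, here is a small sketch using only the Python standard library. The helper name `write_eml` and the zero-padded file naming are assumptions; the tool's actual naming scheme may differ.

```python
from email.message import EmailMessage
from pathlib import Path


def write_eml(message: EmailMessage, out_dir: Path, index: int) -> Path:
    """Serialize one message to its own .eml file and return the file path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{index:06d}.eml"  # hypothetical naming scheme
    path.write_bytes(message.as_bytes())
    return path


def example_message() -> EmailMessage:
    """Build a small message for demonstration purposes."""
    msg = EmailMessage()
    msg["From"] = "alice@example.com"
    msg["To"] = "bob@example.com"
    msg["Subject"] = "Test"
    msg.set_content("hello")
    return msg
```

A file written this way can be read back with `email.message_from_bytes` or opened in most mail clients.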

Requirements

  • Python 3.11+
  • GTK4 + libadwaita
  • PyGObject

System packages

Debian/Ubuntu:

sudo apt install python3-gi python3-gi-cairo gir1.2-gtk-4.0 gir1.2-adw-1 python3-venv

Fedora:

sudo dnf install gtk4-devel libadwaita-devel python3-gobject
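
A quick way to check whether the requirements above are met, sketched under the assumption that GTK4 and libadwaita are exposed to PyGObject under the `Gtk` 4.0 and `Adw` 1 namespaces. The function names are illustrative only.

```python
import sys


def python_ok(version=sys.version_info, minimum=(3, 11)) -> bool:
    """True if the (major, minor) version is at least the minimum."""
    return tuple(version[:2]) >= minimum


def gtk_stack_ok() -> bool:
    """True if PyGObject can locate the GTK 4 and libadwaita typelibs."""
    try:
        import gi
        gi.require_version("Gtk", "4.0")
        gi.require_version("Adw", "1")
    except (ImportError, ValueError):
        # ImportError: PyGObject missing; ValueError: namespace or version unavailable.
        return False
    return True


if __name__ == "__main__":
    print("Python 3.11+:", python_ok())
    print("GTK4 + libadwaita via PyGObject:", gtk_stack_ok())
```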

Architecture

Cloud Scraper is part of the OpenFactory data sovereignty stack.

YOUR NETWORK              SOVEREIGNTY LAYER              CORPORATE CLOUDS
(you own this)            (your tools)                   (they control this)

 Home Lab / Servers   <── OpenFactory ──────────────>   AWS (hosting/CDN)
 Personal Devices     <──   Build OS images             Google Cloud
 Managed Network      <──   Deploy to fleet             Azure
 Local Data Store     <──   Compliance engine
                                                         Google (Gmail, Drive,
 Deployed OS Images   <── Cloud Scraper ─ ─ ─ ─ ─ ─>     Photos, Calendar)
   Debian, Ubuntu,        Export personal data           Microsoft (Outlook,
   Fedora, openSUSE,      OAuth2 read-only                OneDrive, Teams)
   Elster OS              No intermediary servers        Apple (iCloud, Photos)
                          Saves straight to disk         Other SaaS

OpenFactory builds custom Linux OS images and deploys them to your hardware. Cloud Scraper pulls your personal data out of corporate clouds onto that hardware. Together they form a complete sovereign computing stack — your machines, your OS, your data, no corporate dependency.

License

AGPL-3.0
