🗄️ Claw Drive

Google Drive stores your files. Claw Drive understands them.

😩 Before — 7 layers deep, 3 "final" versions

✨ After — one sentence, file in hand

Claw Drive is an AI-managed personal drive. It auto-categorizes your files, tags them for cross-cutting search, deduplicates by content, and retrieves them in natural language. Files live locally first, with optional Google Drive sync for cloud backup.

Privacy is not a feature — it's the foundation. Your agent never reads file contents without asking. If you don't respond, it defaults to private. Sensitive categories like identity/ are never read, never synced. Your data stays yours.

Features

📂 Auto-categorize — files sorted into the right folder without you thinking about it
🏷️ Smart tagging — cross-category search (a vet invoice is both medical and invoice)
🔍 Natural language retrieval — "find my cat's vet records" just works
🧬 Content-aware dedup — SHA-256 hash check prevents storing the same file twice
🌐 Global file indexing (v0.3.0+) — index files from any directory without moving them
☁️ Google Drive sync — optional real-time backup via fswatch + rclone
🔒 Privacy-first — local-first by default, sensitive categories excluded from sync, default-safe content handling
🛡️ Sensitive file protection — agent asks before reading contents; defaults to private if no reply
📋 Custom metadata — attach structured data (expiry dates, policy numbers, amounts) to any file
👤 Correspondent tracking — record who sent or issued each file
🔄 Reindex — batch re-enrich old files as your agent gets smarter
📛 Original name tracking — renames are recorded, both names searchable
🤖 AI-native — designed for OpenClaw agents, with a CLI under the hood

Install

Homebrew (recommended)

brew install dissaozw/tap/claw-drive
claw-drive init

As an OpenClaw Skill

Clone into your OpenClaw skills directory — OpenClaw picks it up automatically on the next session:

git clone https://github.com/dissaozw/claw-drive.git ~/.openclaw/skills/claw-drive
cd ~/.openclaw/skills/claw-drive
make install   # symlinks claw-drive to ~/.local/bin (or PREFIX=/usr/local make install)
claw-drive init

That's it. Your agent will see the skill on its next session and can start storing files right away.

Updating: cd ~/.openclaw/skills/claw-drive && git pull

Optional: Google Drive Sync

brew install rclone fswatch   # sync dependencies
claw-drive sync auth          # one-time — opens browser for Google auth
claw-drive sync start         # start background sync daemon

Optional: PDF Extraction

PDF content extraction uses PyMuPDF via uv — no global install needed. It runs automatically when the agent stores a PDF with content reading enabled.

Usage

Claw Drive is designed to be used through your AI agent. You don't organize files — your agent does.

Storing files

Send a file to your agent (Telegram, email, etc.) and it handles everything:

Asks about privacy — "Should I read the contents, or keep it private?"
Extracts content (if permitted) — reads PDFs, images, docs to pull out key details
Categorizes the file into the right folder
Names it with a descriptive, date-stamped filename
Checks for duplicates by content hash
Tags it for cross-category search with specific identifiers
Indexes it in INDEX.jsonl with a rich, searchable description
Reports back what it did

📎 "Here's my auto insurance card"

🔒 "Should I read the contents to index it better, or keep it private?"

👤 "Go ahead"

✅ Stored: insurance/acme-auto-id-cards.pdf
Policy ****3441 · 2024 Honda Civic · Effective 1/21/2026–7/21/2026
Tags: insurance, auto, acme, honda-civic, california

If you don't reply, or if you say the file is sensitive, the agent classifies it by filename only and asks for a brief description if needed. Your data is never read without consent.
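
The duplicate check in the steps above relies on a SHA-256 content hash. A minimal sketch of that idea, assuming the hashes are kept in a simple set (`known_hashes` is a hypothetical stand-in, not Claw Drive's actual storage):

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file's bytes in chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_duplicate(path: Path, known_hashes: set[str]) -> bool:
    """Return True if identical content was already seen; otherwise record its hash.

    `known_hashes` is a hypothetical in-memory stand-in for wherever
    the drive really keeps its dedup hashes.
    """
    file_hash = sha256_of_file(path)
    if file_hash in known_hashes:
        return True
    known_hashes.add(file_hash)
    return False
```

Because the hash covers the bytes rather than the filename, a renamed copy of the same file is still caught.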

Retrieving files

Just ask in natural language:

"Find my cat's medical records"
"Show me all invoices from January"
"Do I have a copy of my W-2?"

The agent reads INDEX.jsonl directly and matches entries by meaning rather than by literal string, so a request for "my cat's medical records" can surface a file described as a vet invoice even when the wording differs.
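
To make the index format concrete, here is a plain keyword filter over a JSONL index. It is only a baseline sketch: the agent itself matches by meaning, not substrings. The field names `path`, `desc`, and `tags` (a list of strings) are assumptions about the schema.

```python
import json
from pathlib import Path


def load_index(index_path: Path) -> list[dict]:
    """Read one JSON object per non-empty line (the JSONL convention)."""
    entries = []
    with index_path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                entries.append(json.loads(line))
    return entries


def find_entries(entries: list[dict], *terms: str) -> list[dict]:
    """Keep entries where every term appears in the tags or the description.

    `tags` and `desc` are assumed field names, not a confirmed schema.
    """
    matches = []
    for entry in entries:
        haystack = " ".join(entry.get("tags", [])) + " " + entry.get("desc", "")
        haystack = haystack.lower()
        if all(term.lower() in haystack for term in terms):
            matches.append(entry)
    return matches
```

For example, `find_entries(load_index(Path("INDEX.jsonl")), "vet", "invoice")` would return every entry whose tags or description mention both words.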

What you never have to do

Pick a folder
Think of a filename
Remember where you put something
Manually organize anything

Global File Indexing (v0.3.0+)

Index files from any directory without moving or copying them:

# Index all files in a directory
claw-drive index-global ~/Photos --tags "photos, personal"

# Index with depth limit (only top 2 levels)
claw-drive index-global ~/Documents --max-depth 2

# Index only PDF files
claw-drive index-global ~/Downloads --pattern "*.pdf"

Use cases:

Large photo libraries (no duplication)
Network drives or NAS storage
Existing project directories
Third-party app data (e.g., Obsidian vaults)

How it works:

Files are indexed with absolute paths (path_type: "absolute")
Original files remain in place (not copied)
No deduplication (performance optimization)
Not synced to cloud (only index entries are synced)
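
The behavior listed above can be sketched as a directory walk that records absolute paths and never copies or hashes anything. This is an illustration, not the CLI's implementation: the function name, the entry keys other than `path_type`, and the reading of `--max-depth` (files directly in the directory count as level 1) are assumptions.

```python
import fnmatch
import os
from pathlib import Path
from typing import Optional


def index_global(root: str, tags=(), max_depth: Optional[int] = None,
                 pattern: str = "*") -> list[dict]:
    """Build index entries for files under `root` without moving them."""
    root_path = Path(root).expanduser().resolve()
    entries = []
    for dirpath, dirnames, filenames in os.walk(root_path):
        # Files directly in root are level 1, files in its subfolders level 2, ...
        depth = len(Path(dirpath).relative_to(root_path).parts) + 1
        if max_depth is not None and depth >= max_depth:
            dirnames[:] = []  # prune: do not descend any further
        for name in sorted(filenames):
            if fnmatch.fnmatch(name, pattern):
                entries.append({
                    "path": str(Path(dirpath) / name),  # absolute, file stays in place
                    "path_type": "absolute",
                    "tags": list(tags),
                })
    return entries
```

Calling `index_global("~/Downloads", pattern="*.pdf")` mirrors the third command above: only PDF files produce entries, and nothing on disk changes.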

CLI Reference

The CLI handles write operations — store, sync, migrate — where atomicity matters (dedup + index updates). For read operations (search, list, tags), the agent reads INDEX.jsonl directly.

Command	Description
`claw-drive init`	Initialize drive directory and INDEX.jsonl
`claw-drive store <file> [opts]`	Store a file with categorization, tags, dedup, rename (`--name`), metadata (`--metadata`), correspondent (`--correspondent`)
`claw-drive update <path> [opts]`	Update description, tags, metadata, correspondent, and/or source on an existing entry
`claw-drive delete <path> [--force]`	Delete a file, its index entry, and dedup hash
`claw-drive rm <path> [--force]`	Alias for `delete`
`claw-drive status`	Show drive status (files, size, sync)
`claw-drive sync auth`	Authorize Google Drive (one-time, opens browser)
`claw-drive sync setup`	Check sync dependencies and config
`claw-drive sync start`	Start background sync daemon
`claw-drive sync stop`	Stop sync daemon
`claw-drive sync push`	Manual one-shot sync
`claw-drive sync status`	Show sync daemon state
`claw-drive reindex scan [--output plan.json]`	Scan drive for orphans + export existing entries for re-enrichment
`claw-drive reindex apply <plan.json> [--dry-run]`	Apply enriched reindex plan (add orphans, update existing)
`claw-drive migrate scan <dir> [plan.json]`	Scan a directory into a migration plan
`claw-drive migrate summary [plan.json]`	Show migration plan breakdown
`claw-drive migrate apply [plan.json] [--dry-run]`	Execute migration plan
`claw-drive version`	Show version

Sync

Optional real-time sync to Google Drive (or any rclone backend). Files sync within seconds of any change. Sensitive directories stay local-only.

See docs/sync.md for details.

Migration

Got a messy folder full of unsorted files? Claw Drive's migration workflow handles it:

# 1. Scan the source directory
claw-drive migrate scan ~/messy-folder migration-plan.json

# 2. AI agent classifies each file (fills in category, name, tags, description)

# 3. Review the plan
claw-drive migrate summary migration-plan.json

# 4. Dry run first
claw-drive migrate apply migration-plan.json --dry-run

# 5. Execute
claw-drive migrate apply migration-plan.json

The scan outputs a JSON plan with file metadata (path, size, mime type, extension). The agent fills in classification fields, then apply copies files into Claw Drive with full dedup, indexing, and tagging.
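
As a rough picture of what such a plan could contain, the sketch below collects the four scanned fields and leaves the classification fields empty for the agent. The key names and the function name are hypothetical; the real plan format may differ.

```python
import json
import mimetypes
from pathlib import Path


def scan_for_migration(source_dir: str) -> list[dict]:
    """Collect per-file metadata; classification is left for the agent."""
    plan = []
    for path in sorted(Path(source_dir).expanduser().rglob("*")):
        if not path.is_file():
            continue
        mime, _ = mimetypes.guess_type(path.name)
        plan.append({
            # Fields produced by the scan
            "path": str(path),
            "size": path.stat().st_size,
            "mime": mime or "application/octet-stream",
            "ext": path.suffix.lower(),
            # Fields the agent fills in before `migrate apply`
            "category": None,
            "name": None,
            "tags": [],
            "description": None,
        })
    return plan
```

Writing the result with `json.dump(plan, fh, indent=2)` gives a file that is easy to review by hand before the dry run.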

Custom Metadata

Store structured data alongside your files — expiry dates, policy numbers, amounts, anything the agent can answer questions about without reading the original file.

# Add metadata when storing
claw-drive store insurance-card.pdf -c insurance -d "Auto insurance" \
  --metadata '{"policy":"****3441","expiry":"2026-08","provider":"Farmers"}'

# Add metadata to existing files
claw-drive update insurance/insurance-card.pdf --metadata '{"deductible":"$500"}'

Metadata merges on update — existing fields are preserved, new fields are added. The agent can now answer "when does my insurance expire?" directly from the index, without opening the file.
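
The merge rule can be pictured as a shallow dictionary merge. This sketch assumes that when a key appears in both the existing metadata and the update, the new value wins; the text above does not spell out that case.

```python
def merge_metadata(existing: dict, update: dict) -> dict:
    """Return a new dict: existing fields kept, new fields added.

    Assumption: on a key conflict the value from `update` replaces the old one.
    """
    merged = dict(existing)  # copy so the stored entry is not mutated
    merged.update(update)
    return merged


stored = {"policy": "****3441", "expiry": "2026-08", "provider": "Farmers"}
merged = merge_metadata(stored, {"deductible": "$500"})
# "When does my insurance expire?" can be answered from merged["expiry"]
```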

Correspondent Tracking

Track who sent or issued each file — the person, company, or organization it came from:

# Set on store
claw-drive store invoice.pdf -c finance -d "Q4 invoice" --correspondent "Acme Corp"

# Add to existing file
claw-drive update finance/invoice.pdf --correspondent "Acme Corp"

This lets the agent answer questions like "show me everything from Farmers Insurance" or "what did VEG send me?" by filtering on correspondent.
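
A filter of that kind might look like the following sketch. It assumes index entries are dictionaries with an optional `correspondent` string; the field name is an assumption.

```python
def from_correspondent(entries: list[dict], name: str) -> list[dict]:
    """Case-insensitive substring match on the (assumed) `correspondent` field."""
    needle = name.casefold()
    return [
        entry for entry in entries
        if needle in (entry.get("correspondent") or "").casefold()
    ]
```

A substring match means a query for "farmers" also finds entries recorded as "Farmers Insurance".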

Reindex

Already have files in Claw Drive but want richer descriptions, better tags, or custom metadata? The reindex workflow lets the agent re-analyze everything:

# 1. Scan — exports a plan with all files + current index entries
claw-drive reindex scan --output reindex-plan.json

# 2. Agent enriches the plan:
#    - Orphan files: fills in desc, tags, source, metadata
#    - Existing entries: adds new_desc, new_tags, new_metadata to update

# 3. Preview changes
claw-drive reindex apply reindex-plan.json --dry-run

# 4. Apply
claw-drive reindex apply reindex-plan.json

As your agent gets smarter, your old files benefit too.
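
The orphan part of the scan can be sketched as a set difference between files on disk and paths in the index. Here an orphan is taken to mean a file inside the drive with no index entry, paths are assumed to be stored relative to the drive root, and the function name is hypothetical.

```python
from pathlib import Path


def find_orphans(drive_root: Path, indexed_paths: set[str]) -> list[str]:
    """List drive-relative paths of files that have no index entry."""
    orphans = []
    for path in sorted(drive_root.rglob("*")):
        # Skip directories, the index itself, and dotfiles such as .sync-config
        if not path.is_file() or path.name == "INDEX.jsonl" or path.name.startswith("."):
            continue
        relative = path.relative_to(drive_root).as_posix()
        if relative not in indexed_paths:
            orphans.append(relative)
    return orphans
```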

Original Filename Preservation

When you rename a file on store (--name), Claw Drive records the original filename in the index. This means you can search by either name:

claw-drive store messy-scan-001.pdf -c medical --name "blood-work-2026-02.pdf" -d "..."
# Index records: original_name: "messy-scan-001.pdf"
# Searchable by both "messy-scan" and "blood-work"
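
Searching by either name amounts to checking two fields per entry. A small sketch, assuming the current location is stored under a `path` key next to the `original_name` field shown above:

```python
def matches_name(entry: dict, query: str) -> bool:
    """True if the query appears in the current path or in the original filename."""
    needle = query.casefold()
    candidates = (entry.get("path") or "", entry.get("original_name") or "")
    return any(needle in candidate.casefold() for candidate in candidates)


entry = {"path": "medical/blood-work-2026-02.pdf", "original_name": "messy-scan-001.pdf"}
```

With this entry, both `matches_name(entry, "messy-scan")` and `matches_name(entry, "blood-work")` are true.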

Architecture

You ← natural language → AI Agent (OpenClaw)
                              │
                        claw-drive CLI
                              │
                        ~/claw-drive/        ← local, source of truth
                              │
                        fswatch + rclone     ← optional real-time sync
                              │
                        Google Drive          ← cloud backup

Privacy & Security

Claw Drive treats your files as personal data by default. This isn't an afterthought — it's a core design decision.

The Problem

AI agents that read your files put those contents into conversation transcripts — which are logged permanently. A "helpful" agent that reads your passport scan, tax return, or medical record has now copied that data into a .jsonl log file. That's a leak, not a feature.

The Solution

Claw Drive's agent always asks before reading. And if you don't answer, it assumes the answer is no.

Scenario	Behavior
User says "go ahead"	Full content extraction → rich description + specific tags
User says "keep it private"	Filename-only classification, asks for brief description
User doesn't reply	Defaults to sensitive — no content reading
File goes to `identity/`	Always sensitive — contents never read, never synced
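
The table reads naturally as a small decision function. The sketch below is illustrative only: the real agent interprets replies in natural language, so the fixed set of consent phrases and the function name are hypothetical.

```python
from typing import Optional

ALWAYS_SENSITIVE = {"identity"}                 # categories whose contents are never read
CONSENT_PHRASES = {"go ahead", "yes", "ok"}     # hypothetical stand-in for the agent's judgment


def may_read_contents(category: str, user_reply: Optional[str]) -> bool:
    """Return True only when the user has explicitly allowed content extraction."""
    if category.strip("/") in ALWAYS_SENSITIVE:
        return False  # category rule overrides any reply
    if user_reply is None:
        return False  # no reply defaults to sensitive
    return user_reply.strip().lower() in CONSENT_PHRASES
```

Every path except explicit consent returns False, which is what default-safe means here.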

What "sensitive" means in practice

File contents are never read by the agent
Classification uses filename and user input only
INDEX.jsonl descriptions are kept generic (no SSNs, account numbers, etc.)
identity/ is excluded from cloud sync by default
The file is still stored, hashed (for dedup), tagged, and indexed — just without content extraction

Defense in depth

Layer	Protection
Consent	Agent asks before reading any file
Default-safe	No reply = sensitive
Category rules	`identity/` always sensitive, excluded from sync
Sync exclusion	`.sync-config` exclude list for any category
Index hygiene	No raw sensitive data in descriptions
Local-first	Cloud sync is optional, not default


Categories

Category	Use for
`documents/`	General docs, letters, forms, manuals
`finance/`	Tax returns, bank statements, pay stubs
`insurance/`	Policies, ID cards, claims, coverage docs
`medical/`	Health records, prescriptions, pet health
`travel/`	Boarding passes, itineraries, visas
`identity/`	ID scans, certificates (⚠️ sensitive — excluded from sync)
`receipts/`	Purchase receipts, warranties, invoices
`contracts/`	Leases, employment, legal agreements
`photos/`	Personal photos, document scans
`misc/`	Anything that doesn't fit above
