An open-source CLI backup and restore engine for Microsoft 365 mailboxes. Built with per-tenant envelope encryption, content-addressed deduplication, multi-layer integrity validation, and efficient delta synchronization for scalable, secure operations against S3-compatible object storage.
🔐 Per-tenant encryption — Each tenant gets a unique AES-256-GCM key derived via scrypt. Even if storage is breached, data stays encrypted and requires that tenant's passphrase to decrypt.
🧊 Storage-level immutability (WORM) — S3/MinIO Object Lock with time-based retention enforced by storage itself, not app metadata.
🧬 Content-addressed deduplication — Messages and attachments are stored by SHA-256 hash per mailbox. Identical files are stored once.
📎 Attachment backup & restore — Attachments are fetched via Graph API, deduplicated, and encrypted with message data. Large files use chunked upload sessions during restore.
🛡️ Multi-layer integrity checks — SHA-256 in manifests, Content-MD5 on uploads, and AES-GCM auth tags ensure transport and at-rest integrity.
🔄 Delta sync with fallback — Microsoft Graph delta queries for incremental backups with automatic full-scan fallback on interrupted runs.
📨 EML export — Save backed-up emails as standard .eml files in compressed zip archives with Outlook-compatible folder structure and SHA-256 verification.
📊 Live progress dashboard — ANSI dashboard shows all folders in real time with ETA, speed, and per-folder status.
🧩 Typed SDK — Programmatic API for embedding in other Node.js applications via m365-atlas/sdk.
🏗️ Hexagonal architecture — Ports-and-adapters with dependency injection. Swap storage or mail connectors without touching core logic.
graph LR
CLI["CLI (commander)"] --> Services
SDK["SDK (npm API)"] --> Services
Services --> Ports["Ports (interfaces)"]
Adapters --> Ports
Adapters --> GraphAPI["Microsoft Graph"]
Adapters --> S3["S3 / MinIO"]
subgraph core [Core]
Services
Ports
Domain
end
src/
├── adapters/
│ ├── keystore/ # envelope encryption (AES-256-GCM, scrypt KEK)
│ ├── m365/ # Microsoft Graph connector (delta sync, OAuth2, restore)
│ ├── storage-s3/ # S3 object storage, manifest repo, bucket manager
│ └── tenant-context.factory.ts
├── cli/
│ └── commands/ # backup, list, read, verify, restore, save, delete
├── domain/ # Manifest, Snapshot, Tenant, BackupObject (pure data)
├── ports/ # ObjectStorage, MailboxConnector, RestoreConnector, ManifestRepository, KeyService
├── services/ # MailboxSyncService, RestoreService, SaveService, CatalogService, helpers
└── utils/ # config loader, logger
# install
npm install -g m365-atlas
# start MinIO (or use any S3-compatible endpoint)
cd docker && docker compose up -d
# configure
cp .env.example .env
# fill in tenant_id, client_id, client_secret, s3 credentials, encryption passphrase
# first backup
atlas backup --mailbox user@company.com
# list what was backed up
atlas list
# restore a folder from backup
atlas restore -m user@company.com -f Inbox
# save as EML zip archive
atlas save -m user@company.com -o backup.zipFull CLI and SDK reference: See docs/USAGE.md for complete command options, examples, and programmatic API documentation.
Atlas loads configuration from three sources, merged in this order (later wins):
- Config file:
atlas.config.jsonor.atlas/config.json(searched in cwd, then~/.atlas/) .envfile (loaded via dotenv, does not overwrite existing env vars)- Environment variables (always win)
| Variable | Config field | Required | Description |
|---|---|---|---|
ATLAS_TENANT_ID |
tenant_id |
yes | Azure AD tenant ID |
ATLAS_CLIENT_ID |
client_id |
yes | App registration client ID |
ATLAS_CLIENT_SECRET |
client_secret |
yes | App registration client secret |
ATLAS_S3_ENDPOINT |
s3_endpoint |
yes | S3 endpoint URL (e.g. http://localhost:9002) |
ATLAS_S3_ACCESS_KEY |
s3_access_key |
yes | S3 access key |
ATLAS_S3_SECRET_KEY |
s3_secret_key |
yes | S3 secret key |
ATLAS_S3_REGION |
s3_region |
no | S3 region (default: us-east-1) |
ATLAS_ENCRYPTION_PASSPHRASE |
encryption_passphrase |
yes | Master passphrase for envelope encryption |
Register an application in Azure Portal with these Application permissions (not Delegated):
| Permission | Why |
|---|---|
Mail.Read |
Read mailbox contents via Graph API |
Mail.ReadWrite |
Restore messages and create folders in target mailboxes |
User.Read.All |
Enumerate users / resolve mailbox IDs |
MailboxSettings.Read |
Read mailbox metadata and folder structure |
Mail.ReadWriteis only required foratlas restore. Backup, list, read, save, and verify operations work withMail.Readalone.
After adding permissions, click Grant admin consent for [your tenant] in the API Permissions blade. The app authenticates using OAuth2 Client Credentials flow (@azure/identity ClientSecretCredential).
Atlas uses envelope encryption to isolate tenants cryptographically:
Master passphrase (env var)
|
v
scrypt(passphrase, tenant_id, N=16384, r=8, p=1) --> KEK (256-bit, per-tenant)
|
v
KEK wraps/unwraps a random DEK (AES-256-GCM)
|
v
DEK encrypts all data + manifests for that tenant
- KEK (Key Encryption Key) -- derived deterministically from the passphrase and tenant ID. Never stored; re-derived on every run.
- DEK (Data Encryption Key) -- random 256-bit key, generated once per tenant and stored wrapped (encrypted with KEK) at
_meta/dek.encin the tenant's S3 bucket. - Ciphertext format --
[12-byte IV][16-byte GCM auth tag][ciphertext]. Every encrypt operation uses a fresh random IV. - Manifest encryption -- manifests (containing email subjects and message metadata) are encrypted with the same DEK, ensuring subject lines and folder names are never exposed at rest.
Integrity is validated at three layers:
| Layer | Mechanism | When |
|---|---|---|
| Plaintext | SHA-256 checksum stored in manifest | Backup, verify, restore, save |
| Transport | Content-MD5 header on S3 PUT |
Upload (S3 rejects mismatches) |
| At-rest | AES-256-GCM authentication tag | Every decrypt (tamper = exception) |
Atlas supports storage-enforced immutability on AWS S3 and MinIO.
- Enforced by storage backend: Object Lock retention prevents overwrite/delete based on backend rules.
- Recorded by Atlas: manifests include
object_lock.requestedandobject_lock.effectivefor audit and operations. - Not enforcement: manifest policy metadata is bookkeeping, not control-plane enforcement.
- Bucket must exist and be reachable.
- Bucket versioning must be enabled.
- Bucket Object Lock must be enabled/supported.
If immutable backup is requested and these checks fail, Atlas aborts with explicit error categories:
- versioning disabled
- Object Lock unsupported/disabled
- backend rejected requested mode/headers
Atlas uses content-addressed storage. If an object already exists:
- Atlas reuses it,
- does not re-upload,
- does not extend retention on existing versions.
This means newer snapshots can reference older versions whose retention window was anchored at first write. Atlas records requested/effective snapshot policy, but per-object enforcement follows actual stored object versions.
--retention-daysis required for retention-enforced immutability.--lock-mode complianceis stronger but operationally harder to reverse.- Purge in immutable environments means "attempt full deletion and report leftovers", not guaranteed immediate destruction.
Backups use Microsoft Graph delta queries for incremental sync:
- Initial run -- requests
/users/{id}/mailFolders/{id}/messages/deltawith$selectincludingbody. The API returns all messages across paginated responses. The final@odata.deltaLinkis saved in the manifest. - Subsequent runs -- sends the saved
deltaLink. The API returns only messages created, modified, or deleted since the last sync. - Stale-delta safeguard -- if a saved delta link returns 0 items but the previous manifest had 0 entries (indicating an interrupted prior backup), Atlas discards the link and runs a full enumeration automatically.
- Force full --
atlas backup --fullignores all saved delta links. - Graceful interruption -- Ctrl+C during a backup sets an interrupt flag, saves all already-stored objects and completed delta links into a partial manifest, and marks interrupted folders in the dashboard.
Each tenant gets its own S3 bucket named atlas-{tenant_id}:
atlas-{tenant_id}/
├── _meta/
│ └── dek.enc # wrapped DEK (encrypted with KEK)
├── data/
│ └── {mailbox_id}/
│ ├── {sha256_a} # encrypted message (content-addressed)
│ └── ...
├── attachments/
│ └── {mailbox_id}/
│ ├── {sha256_x} # encrypted attachment (content-addressed)
│ └── ...
└── manifests/
└── {mailbox_id}/
├── {snapshot_id_1}.json # encrypted manifest
└── {snapshot_id_2}.json
pnpm install
pnpm run build # bundle with tsdown
pnpm run test # vitest (unit tests)
pnpm run test:coverage # with v8 coverage
pnpm run lint # eslint
pnpm run format # prettier| Rule | Enforced by |
|---|---|
kebab-case file names |
eslint-plugin-check-file |
snake_case variables, parameters, properties |
@typescript-eslint/naming-convention |
PascalCase types, classes, interfaces |
@typescript-eslint/naming-convention |
| Max 300 lines per file (excluding blanks/comments) | max-lines ESLint rule |
| Single quotes, trailing commas, 100-char print width | Prettier |
@/ path aliases (no relative imports) |
tsconfig.json paths |
| JSDoc on all exported functions | Convention |
When a file approaches 300 lines, the logic should be split into smaller helper files rather than compacted. Each function name should describe exactly what it does without hidden side-effects; multi-responsibility functions are split into a parent that calls focused child functions.
Tests use Vitest with @vitest/coverage-v8. Services are tested via Inversify container wiring with mock adapters -- no network calls in unit tests. The same DI tokens used in production are bound to mock implementations.
- Outlook mailbox backup with delta sync, content-addressed deduplication, and attachment handling
- Outlook mailbox restore with original timestamps, folder structure, cross-mailbox support, and chunked attachment upload
- EML export (save) -- save backed-up emails as
.emlfiles in compressed zip archives with Outlook folder structure, SHA-256 verification, and attachment embedding - CLI interface -- backup, restore, save, verify, list, read, delete, and storage-check commands
- S3 Object Lock / retention policies -- GOVERNANCE and COMPLIANCE mode with storage-enforced immutability
- Typed programmatic SDK --
m365-atlas/sdksubpath with camelCase ES6 API including save operations - Repository docs -- contributing guide, issue/PR templates, code conventions
- Full documentation site -- dedicated docs beyond the README with guides, API reference, and deployment examples
- Entra ID backup -- back up Azure AD / Entra ID directory objects (users, groups, roles, policies)
- Entra ID restore -- restore Entra ID objects from backup
- OneDrive backup -- back up OneDrive files and folder structures with delta sync
- OneDrive restore -- restore OneDrive files to the original or a different user
- SDK coverage for new services -- expose Entra ID and OneDrive operations through
m365-atlas/sdk
See CONTRIBUTING.md for setup instructions, code conventions, architecture overview, and pull request guidelines.
Copyright 2026 Miika Oja-Kaukola
This project is licensed under the Apache License, Version 2.0.
See the LICENSE file for details.