Anthony Tseng edited this page Nov 19, 2018 · 46 revisions
Welcome to Sync! pyramids contain stuff and are cool and mysterious

What is Brave Sync?

Brave Sync is a new way to automatically sync browsing data (bookmarks, preferences, history) between devices running Brave browser. It uses client-side encryption such that Brave's servers cannot read your data, since they do not have access to the encryption keys. Sync is not designed for data backup; that is, if you delete Brave browser from all your devices, you will probably not be able to recover old browsing data.

Brave Sync is disabled by default. Once you enable it on one device, you can add more devices by scanning a QR code or entering a secret phrase on the new device. Both the QR code and the secret phrase encode your Brave Sync encryption keys.

Currently sync is only supported on desktop versions of Brave. We are working on adding it to iOS and Android.

In the future, we plan to add support for syncing Brave Payments data as well as an option to use your own sync server on Amazon S3.

Technical Overview

  • Data to sync: bookmarks, history, site settings, list of devices. User chooses collections to sync.
  • Data encrypted client-side with a single keypair, generated on first run of Sync in a browser client.
  • Server verifies Ed25519 signature over the request in order to authenticate clients.
  • Clients periodically send writes to our web service.
  • Web service async writes to S3.
  • Clients read records directly from S3, or through SQS in the case of bookmarks.
  • To add new devices, copy the private encryption key seed to the new device.
  • Clients resolve conflicts.
  • Server does not have access to any unencrypted sync data, nor does it know which devices are making the updates or how many devices there are.

Architecture

Client library

  • JS
  • To be shared among brave-core/Chromium, browser-laptop/Electron, iOS, Android apps.
  • Runs in an extension background page and webviews respectively.
  • Methods to:
    • Generate Ed25519 keypair and secretbox key from a random seed
    • Export random seed to QR code
    • Import QR code to random seed
    • Request temporary AWS credentials to access a user's S3 sync data
    • Write records (accept a record, encrypt/sign payload, format, write to S3), buffered
    • Read records by polling (query S3, decrypt records) and call the browser callback
    • (?) Resolve conflicts
    • (?) Delete collections
    • (?) Specify email for recovery

Web app

  • JS, Node
  • Heroku (tentative, pending a load test; a possible alternative is AWS API Gateway)
  • Create temporary AWS credentials to access user's S3 prefix
  • Register user

Crypto

To create a new userID, the client generates a 32 byte random seed. This is expanded with HKDF (SHA-512 HMAC) into:

  1. Ed25519 signing keypair
  2. NaCl secretbox authenticated encryption secret key

The hex or base64 encoding of the Ed25519 public key (32 bytes) is the userID which uniquely identifies each "persona". Users may associate multiple devices with the same userID by copying the 32-byte random seed to the new device.

Each sync update is end-to-end encrypted using secretbox by the client so that the server does not see the device ID or any of the sync data. API requests are signed using the Ed25519 keypair, which the web service verifies for client authentication using the userID / public key.

To add a new device, the user has to copy the 32-byte initial random seed to the new device via a QR code or similar. The new device uses the seed to reconstruct all the necessary keys.
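
The derivation described above can be sketched in Node.js as follows. This is a minimal illustration, not the client library's actual code: the HKDF salt and the two `info` labels are hypothetical (the design does not specify them), and `deriveKeys` is an illustrative name. The secretbox key is only derived here; encrypting with it would need a NaCl implementation, which is outside this sketch.

```javascript
const crypto = require('crypto');

// DER prefix that wraps a raw 32-byte Ed25519 seed as a PKCS#8 private key.
const ED25519_PKCS8_PREFIX = Buffer.from('302e020100300506032b657004220420', 'hex');

// Hypothetical HKDF "info" labels; the design does not specify salt or info values.
const INFO_SIGNING = Buffer.from('example-sync-signing');
const INFO_SECRETBOX = Buffer.from('example-sync-secretbox');

// Expands a 32-byte random seed into an Ed25519 keypair and a 32-byte secretbox key.
function deriveKeys(seed) {
  if (seed.length !== 32) throw new Error('seed must be 32 bytes');
  const salt = Buffer.alloc(0); // assumption: empty salt
  const signingSeed = Buffer.from(crypto.hkdfSync('sha512', seed, salt, INFO_SIGNING, 32));
  const secretboxKey = Buffer.from(crypto.hkdfSync('sha512', seed, salt, INFO_SECRETBOX, 32));

  const privateKey = crypto.createPrivateKey({
    key: Buffer.concat([ED25519_PKCS8_PREFIX, signingSeed]),
    format: 'der',
    type: 'pkcs8'
  });
  const publicKey = crypto.createPublicKey(privateKey);

  // The last 32 bytes of the SPKI encoding are the raw Ed25519 public key.
  const rawPublicKey = publicKey.export({ format: 'der', type: 'spki' }).subarray(-32);

  return { privateKey, publicKey, secretboxKey, userId: rawPublicKey.toString('hex') };
}
```

Because every key is a deterministic function of the seed, a second device that receives the same 32 bytes and calls `deriveKeys` arrives at the same userID and the same keys, which is what makes the QR code transfer sufficient.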

Sync data

All sync records now contain a new 16-byte objectId attribute, set by clients on create. It's sent with each write and can't be changed.

All sync PUTs include a 1-byte deviceId, which increments every time a new device is registered. Users can register up to 256 devices.

{
  objectId: bytes // UUID object ID
}

Bookmark (categoryId = 0x00)

{
	location: string,
	title: string, // <title> of a page. Should not be used for folders.
	customTitle: string, // User provided title for bookmark; overrides title
	favicon: string, // URL of the favicon (?) Do we need this
	lastAccessedTime: number, // datetime.getTime(), null if folder
	creationTime: number, //creation time of bookmark
	isFolder: boolean,
	parentFolderObjectId: bytes // UUID object ID of the parent folder
}

HistorySite (categoryId = 0x01)

{
  location: string,
  title: string,
  customTitle: string, // User provided title; overrides title
  lastAccessedTime: number, // datetime.getTime()
  creationTime: number, // creation time of the history entry
  favicon: string // URL of the favicon
}

Preferences (categoryId = 0x02)

SiteSetting

{
  hostPattern: hostPattern,
  zoomLevel: number,
  shieldsUp: boolean,
  adControl: enum, // (showBraveAds | blockAds | allowAdsAndTracking)
  cookieControl: enum, // (block3rdPartyCookie | allowAllCookies)
  safeBrowsing: boolean,
  noScript: boolean, // true = block scripts, false = allow
  httpsEverywhere: boolean,
  fingerprintingProtection: boolean,
  ledgerPayments: boolean, // False if site should not be paid by the ledger. Defaults to true.
  ledgerPaymentsShown: boolean, // False if site should not be paid by the ledger and should not be shown in the UI. Defaults to true.
}

Device

{
  name: string // optional user-chosen name
}

S3

  • We're not storing any object content; keys contain all the data.
  • Generally clients will request a temporary AWS secret which authorizes them to GET, PUT and LIST S3 data directly.
  • Stored data is serialized with protocol buffers.

Users

  • A user's Sync data is stored in S3 with prefix /{userId}/.
  • userId == user's pubkey matching the client generated private key.
  • Users register themselves implicitly with the first S3 PUT with their userId prefix.
  • Users can delete their Sync account by deleting all S3 keys with prefix /{userId}/.

Devices

  • A user has many devices.
  • Each write includes the client's device ID so clients can see which devices have been writing.
  • New clients read the devices list, determine the next device ID, and write the new device ID to S3. Devices store their own device ID in persistent local storage.
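
As a rough sketch of how a new client might pick its device ID from the devices list, assuming IDs start at 0 and are assigned as "highest existing ID plus one" (the function name and the starting value are illustrative, not taken from the design):

```javascript
// Returns the next free device ID (0-255), or null when the 1-byte range is exhausted.
// Assumption: IDs start at 0 and each new device takes the highest existing ID + 1.
function nextDeviceId(existingIds) {
  if (existingIds.length === 0) return 0;
  const next = Math.max(...existingIds) + 1;
  return next <= 0xff ? next : null;
}
```

Since the server cannot see device IDs, nothing in this sketch prevents two devices that register at the same moment from choosing the same ID; a client would need to detect that case itself.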

Collections

  • Browser record collections are stored at /{userId}/{categoryId}/ where categoryId maps to a collection that is one of: bookmark, historySite, preferences.
  • A collection is a write journal with writes (create, update, delete) of records.
  • Note that record data is stored in S3 keys, which are limited to 1024 bytes of UTF-8, so large objects are split into multiple parts.
  • Clients will fetch writes directly with S3 LIST /{userId}/{categoryId}/?start-after={timestamp}
  • Clients will construct Sync state by replaying all journal writes.
  • They'll keep a client-side timestamp of how far they are synced up to, and get updates periodically (15 min?).
  • In case of conflict, clients will preserve the last performed action based on the records' S3 LastModified dates.
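
A minimal sketch of replaying a journal into client-side state, with last-write-wins ordering by LastModified. The entry shape (`lastModified` plus a decrypted `write`) and the use of strings for `objectId` are assumptions made for illustration; the actual client may merge updates differently.

```javascript
// Rebuilds a collection's state from its journal.
// Hypothetical entry shape:
//   { lastModified: number, write: { action, deviceId, objectId, objectData } }
function replayJournal(entries) {
  const state = new Map(); // objectId -> current attributes

  // Apply writes oldest first, so the most recent action on an object wins.
  const ordered = [...entries].sort((a, b) => a.lastModified - b.lastModified);

  for (const { write } of ordered) {
    const { action, objectId, objectData = {} } = write;
    if (action === 'delete') {
      state.delete(objectId);
    } else if (action === 'create') {
      state.set(objectId, { ...objectData });
    } else if (action === 'update') {
      // Updates carry only changed attributes, so merge them over what is there.
      state.set(objectId, { ...(state.get(objectId) || {}), ...objectData });
    }
  }
  return state;
}
```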

Format

/{ version }/{ userId == user pubkey }/{ categoryId }/{ client timestamp }/({ CRC32(full payload) }/{ part })/{ encrypt(serialize([ one or more writes ])) }

write: { action, deviceId, objectId, objectData={key1: value1, ...} }

  • version: 1 byte, format version
  • userId: 32 byte, client public key, base64 (hex?) encoded.
  • categoryId: 1 byte, maps to collection types e.g. bookmarks
  • timestamp: 4 byte
  • part: 1 byte
  • delimiters: /'s - 5 bytes
  • multi-part overhead (for requests >933 bytes): checksum + part index + delimiters == 4 + 1 + 2 = 7 bytes ({ CRC32(full payload) }/{ part }/)
  • encryption overhead: ~40 bytes
  • remaining: ~933 bytes

The max size of a multipart write is 256 * 933 bytes = 238,848 bytes, or roughly 239 kB.
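
The splitting step might look like the sketch below, which takes the 933-byte budget and the 1-byte part index from the list above. `splitPayload` and the returned object shape are illustrative names; how each field is encoded into the key string is not specified here and is left out. For simplicity the sketch attaches a checksum and part index even to single-part payloads, whereas the format above includes them only for multi-part writes.

```javascript
const PART_SIZE = 933; // payload bytes that fit in one key, per the budget above
const MAX_PARTS = 256; // the part index is 1 byte

// Standard bitwise CRC-32 (reflected polynomial 0xEDB88320).
function crc32(buf) {
  let crc = 0xffffffff;
  for (const byte of buf) {
    crc ^= byte;
    for (let i = 0; i < 8; i++) {
      crc = (crc >>> 1) ^ (0xedb88320 & -(crc & 1));
    }
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// Splits an encrypted payload (a Buffer) into key-sized parts.
// Every part carries the CRC32 of the full payload so a reader can
// group the parts and check the reassembled result.
function splitPayload(encrypted) {
  const partCount = Math.max(1, Math.ceil(encrypted.length / PART_SIZE));
  if (partCount > MAX_PARTS) {
    throw new Error('payload exceeds ' + MAX_PARTS * PART_SIZE + ' bytes');
  }
  const checksum = crc32(encrypted);
  const parts = [];
  for (let part = 0; part < partCount; part++) {
    const data = encrypted.subarray(part * PART_SIZE, (part + 1) * PART_SIZE);
    parts.push({ checksum, part, data });
  }
  return parts;
}
```

A reader would collect all parts sharing a timestamp and checksum, concatenate their data in part order, and compare `crc32` of the result with the checksum before decrypting.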

write

// Writes are encrypted as a collection.
  {
    action: enum, // "create", "update", "delete"
    deviceId: bytes, // device doing the write
    objectId: bytes, // object uuid
    objectData: { // (Optional / in case of delete). Changed object attributes
      // title: string // "Cool bookmark"
      // folderId: number
    }
  },

SQS

  • Used to speed up polling for new records
  • Only bookmarks are supported for now
  • If a record's lifetime exceeds 24 hours, or sync was just initialized, clients still pull records from S3
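
One possible reading of these rules as a decision function. This sketch interprets the 24-hour rule as "the client has not fetched for more than 24 hours", which is an assumption; the function and parameter names are illustrative.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Decides where a client should look for new records.
// category: 'bookmark' | 'historySite' | 'preferences'
// lastFetchMs: time of the last successful fetch, or null if sync was just initialized
function pickRecordSource(category, lastFetchMs, nowMs) {
  if (category !== 'bookmark') return 's3';       // SQS only carries bookmarks
  if (lastFetchMs === null) return 's3';          // fresh sync: replay from S3
  if (nowMs - lastFetchMs > DAY_MS) return 's3';  // too old for the queue (assumption)
  return 'sqs';
}
```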

API

  • All client requests are signed with the user's private key, and the result is specified as signedData in the request query or body.
  • Each endpoint contains the public key (user ID), which we use to verify the signed request and determine the user's S3 prefix.
  • If an endpoint takes no params the client should include a signature over the current client timestamp (epoch time in seconds). The server must reject the request as invalid if the timestamp is too old, to partially protect against signature replay attacks.
  • {version} is the same as the keys' format version
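
The timestamp rule for parameterless endpoints could be sketched like this, using Node's built-in Ed25519 support. The 60-second freshness window, the string encoding of the timestamp, and the function names are all assumptions; the design only says the server must reject timestamps that are too old.

```javascript
const crypto = require('crypto');

const MAX_AGE_SECONDS = 60; // hypothetical freshness window; no value is given in the design

// Client side: sign the current epoch time in seconds.
function signTimestamp(privateKey, nowSeconds) {
  const message = Buffer.from(String(nowSeconds));
  const signature = crypto.sign(null, message, privateKey); // null algorithm selects Ed25519
  return { timestamp: nowSeconds, signedData: signature.toString('base64') };
}

// Server side: check the signature against the userId public key,
// then reject timestamps outside the freshness window.
function verifyTimestamp(publicKey, request, nowSeconds) {
  const message = Buffer.from(String(request.timestamp));
  const signature = Buffer.from(request.signedData, 'base64');
  if (!crypto.verify(null, message, publicKey, signature)) return false;
  return Math.abs(nowSeconds - request.timestamp) <= MAX_AGE_SECONDS;
}
```

A captured request can still be replayed inside the window, which is why the text above describes the protection as partial.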

POST /{userId}/credentials

Returns AWS credentials valid for 36 hours, allowing:

  • S3 ListBucket: /brave-sync/{version}/{userId}/*
  • S3 DeleteObject: /brave-sync/{version}/{userId}
  • S3 DeleteObject: /brave-sync/{version}/{userId}/*
  • For collections bookmarks, historySites, preferences:
    • S3 GetObject, PutObject: /brave-sync/{version}/{userId}/{collection}/*

Response:

{
  aws: {
    accessKeyId: string,
    secretAccessKey: string,
    sessionToken: string,
    expiration: string
  },
  s3Post: {
    // This is POST form data to be included with writes to S3.
    // For details see AWS docs:
    // https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-post-example.html
    postData: {
      AWSAccessKeyId: string,
      policy: string,
      signature: string,
      acl: string
    },
    bucket: string
  }
}