This repository has been archived by the owner on Dec 20, 2022. It is now read-only.

A synchronisation engine for Google Drive metadata

m-rots/bernard

Introduction

Bernard is an essential character in my Journey of Transfer narrative as he is in charge of mirroring the state of a Shared Drive to a specified datastore. Specifically, Bernard acts as an engine to fetch changes from the Google Drive API to then propagate these changes to a datastore such as SQLite.

Journey of Transfer is a narrative I am writing with projects named after characters of Westworld. The narrative is my exploration process of the Go language, while building a programme utilising service accounts to upload and sync files to Google Drive.

Bernard is the second character of this narrative and was created as an alternative to Rclone, providing local, low-latency access to Google Drive metadata.

Early Access

Bernard is provided as an early-access preview as the API may still change. Furthermore, not all components have associated tests.

This early-access preview comes with a small CLI to visually reflect the changes Bernard picks up. Once Bernard is proven to be stable and correct, this CLI will be removed.

Building the CLI

  1. Install Go.
  2. Set the environment variable CGO_ENABLED=1 and make sure GCC (or another C compiler) is installed.
  3. Clone this repository and cd into it.
  4. Run: go build -o bernard cmd/bernard/main.go

You should now see a binary called bernard in the current working directory.

Using the CLI

Make sure you create a Service Account which has read access to the Shared Drive in question. Additionally, please check whether you have the Drive API enabled in Google Cloud. Save a JSON key of this service account and store it somewhere you can easily access the file.

Bernard will store a SQLite database file called bernard.db in your current working directory. It is advised to store the JSON key of the service account in the same directory.

The CLI requires three arguments:

  1. full or diff
  2. The ID of the Shared Drive you want to synchronise
  3. The path to the JSON key of the service account

The first argument specifies the operation, where full will activate a full synchronisation of the Shared Drive and diff will fetch the latest changes. You must fully synchronise once before fetching the differences.

The second argument takes a string as input which should be the ID of your Shared Drive. Make sure the Service Account has read access to the Shared Drive in question.

The third argument takes a string as input which should point to the JSON key of the service account on your file system.

CLI example

bernard "full" "1234xxxxxxxxxxxxxVA" "./account.json"

In this example, a full synchronisation is activated for the Shared Drive 1234xxxxxxxxxxxxxVA with the Service Account ./account.json.
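
The three-argument contract described above can be sketched in Go as follows. This is only an illustration of the argument layout, not Bernard's actual CLI code: the cliArgs type and the parseArgs function are hypothetical names introduced for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// cliArgs is a hypothetical container for the three positional arguments.
type cliArgs struct {
	Operation string // "full" or "diff"
	DriveID   string // ID of the Shared Drive
	KeyPath   string // path to the JSON key of the service account
}

// parseArgs validates the positional arguments (without the program name).
func parseArgs(args []string) (cliArgs, error) {
	if len(args) != 3 {
		return cliArgs{}, errors.New("expected 3 arguments: <full|diff> <driveID> <keyPath>")
	}
	if args[0] != "full" && args[0] != "diff" {
		return cliArgs{}, fmt.Errorf("unknown operation %q: want full or diff", args[0])
	}
	if args[1] == "" || args[2] == "" {
		return cliArgs{}, errors.New("drive ID and key path must not be empty")
	}
	return cliArgs{Operation: args[0], DriveID: args[1], KeyPath: args[2]}, nil
}

func main() {
	// A real programme would pass os.Args[1:] here instead of a fixed slice.
	parsed, err := parseArgs([]string{"full", "1234xxxxxxxxxxxxxVA", "./account.json"})
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("operation=%s drive=%s key=%s\n", parsed.Operation, parsed.DriveID, parsed.KeyPath)
}
```

Rejecting an unknown operation up front keeps a typo such as "ful" from reaching the Drive API at all.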

Using Bernard in your Go project

Bernard is available as a Go module. To add Bernard to your Go project run the following command:

go get github.com/m-rots/bernard

Full & Partial Synchronisation

Bernard offers two ways of synchronising the datastore. FullSync() can take a considerable amount of time, depending on the number of files in the Shared Drive: Bernard processes roughly 1000 files every 1-2 seconds in full synchronisation mode.

Please note that the full synchronisation can be incomplete if you make changes to the Shared Drive in the minutes leading up to the full synchronisation.

Once you have fully synchronised the Shared Drive, you can use the PartialSync() to fetch the differences between the last synchronisation (both full and partial) and the current Shared Drive state.
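
The "full once, then partial repeatedly" order can be sketched as a small loop. This is a minimal sketch under assumptions, not part of Bernard: the syncer interface is a hypothetical, narrowed view of the two calls, and countingSyncer is a stand-in that only counts calls so the example runs without Google Drive access.

```go
package main

import "fmt"

// syncer is a hypothetical, narrowed view of the two synchronisation calls.
type syncer interface {
	FullSync(driveID string) error
	PartialSync(driveID string) error
}

// run performs one full synchronisation, then one partial synchronisation
// for every value received on ticks. It returns when ticks is closed or
// as soon as a synchronisation fails.
func run(s syncer, driveID string, ticks <-chan struct{}) error {
	if err := s.FullSync(driveID); err != nil {
		return fmt.Errorf("full sync: %w", err)
	}
	for range ticks {
		if err := s.PartialSync(driveID); err != nil {
			return fmt.Errorf("partial sync: %w", err)
		}
	}
	return nil
}

// countingSyncer is a stand-in that records how often each call happens.
type countingSyncer struct {
	full, partial int
}

func (c *countingSyncer) FullSync(string) error    { c.full++; return nil }
func (c *countingSyncer) PartialSync(string) error { c.partial++; return nil }

func main() {
	// Three buffered ticks, then close the channel so run returns.
	ticks := make(chan struct{}, 3)
	for i := 0; i < 3; i++ {
		ticks <- struct{}{}
	}
	close(ticks)

	s := &countingSyncer{}
	if err := run(s, "1234xxxxxxxxxxxxxVA", ticks); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("full=%d partial=%d\n", s.full, s.partial) // prints "full=1 partial=3"
}
```

In a long-running programme the ticks channel could be fed by a goroutine driven by a time.Ticker. Because Bernard's PartialSync also accepts hooks, a thin adapter would likely be needed to satisfy this narrowed interface.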

Hooks

Hooks allow you to run code between the fetching of changes and the processing of these changes into the datastore. The reference SQLite datastore comes with a NewDifferencesHook() function to check which of the Google-reported files have actually changed. It also retrieves the last-known values of removed items and reports which items have been added (items that do not yet exist in the datastore).

To create a DifferencesHook, you can utilise the following code:

// bernie is the value returned by bernard.New(authenticator, store).
hook, diff := store.NewDifferencesHook()
err = bernie.PartialSync("driveID", hook)

// access diff

diff is a pointer to a Difference struct and is filled with data by the PartialSync function. The Difference struct contains:

  • AddedFiles, a slice of files not currently present in the datastore.
  • AddedFolders, a slice of folders not currently present in the datastore.
  • ChangedFiles, a slice of FileDifferences, providing both the old and new state.
  • ChangedFolders, a slice of FolderDifferences, providing both the old and new state.
  • RemovedFiles, a slice of removed files with their last-known state stored by the datastore.
  • RemovedFolders, a slice of removed folders with their last-known state stored by the datastore.
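
As a rough illustration of how the six slices might be consumed after a partial synchronisation, the sketch below counts the added, changed and removed items. Only the six slice names come from the list above; the element types and their fields (ID, Name, Old, New) are simplified, hypothetical stand-ins so that the example compiles on its own, and summarise is a hypothetical helper.

```go
package main

import "fmt"

// Simplified stand-ins: only the six slice names in Difference are taken
// from the README. The element fields below are hypothetical.
type File struct{ ID, Name string }
type Folder struct{ ID, Name string }
type FileDifference struct{ Old, New File }
type FolderDifference struct{ Old, New Folder }

type Difference struct {
	AddedFiles     []File
	AddedFolders   []Folder
	ChangedFiles   []FileDifference
	ChangedFolders []FolderDifference
	RemovedFiles   []File
	RemovedFolders []Folder
}

// summarise reports how many items were added, changed and removed,
// counting files and folders together.
func summarise(d *Difference) string {
	added := len(d.AddedFiles) + len(d.AddedFolders)
	changed := len(d.ChangedFiles) + len(d.ChangedFolders)
	removed := len(d.RemovedFiles) + len(d.RemovedFolders)
	return fmt.Sprintf("added=%d changed=%d removed=%d", added, changed, removed)
}

func main() {
	diff := &Difference{
		AddedFiles: []File{{ID: "f1", Name: "a.txt"}},
		ChangedFiles: []FileDifference{{
			Old: File{ID: "f2", Name: "old.txt"},
			New: File{ID: "f2", Name: "new.txt"},
		}},
		RemovedFolders: []Folder{{ID: "d1", Name: "archive"}},
	}
	fmt.Println(summarise(diff)) // prints "added=1 changed=1 removed=1"
}
```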

Datastore

The datastore is a core component of Bernard's operations. Bernard provides a reference implementation of a Datastore in the form of a SQLite database. This reference datastore can be expanded to allow other operations on the underlying database/sql interface.

Please note that the reference SQLite datastore uses the CGO-enabled package go-sqlite3. This dependency complicates cross-compilation, as a C toolchain for the target platform is required.

If SQLite is not your database of choice, feel free to open a pull request with support for another database such as MongoDB, Fauna or CockroachDB. I highly advise you to have a look at the datastore/datastore.go and datastore/sqlite/sqlite.go files to get a feel for the operations the Datastore interface should perform.

Authenticator

Bernard exports an Authenticator interface with a single AccessToken method. This method should return a valid access token at all times. It should respond with the access token as a string, its UNIX expiry time as an int64, and an error if the credentials are invalid.

type Authenticator interface {
  AccessToken() (string, int64, error)
}

To get started quickly, you can use Stubbs as it implements the Authenticator interface.
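
If you would rather write your own Authenticator, one possible shape is a wrapper that reuses a token until shortly before its expiry. The sketch below is an assumption-laden illustration, not part of Bernard or Stubbs: cachingAuthenticator, fakeSource and the margin field are hypothetical, and the Authenticator interface is repeated locally only so that the example compiles on its own.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Authenticator repeats the interface shown above so the sketch stands alone.
type Authenticator interface {
	AccessToken() (string, int64, error)
}

// cachingAuthenticator is a hypothetical wrapper that reuses a token
// until margin seconds before its UNIX expiry time.
type cachingAuthenticator struct {
	mu     sync.Mutex
	source Authenticator
	now    func() int64 // current UNIX time in seconds, injectable for tests
	margin int64        // seconds before expiry at which to refresh
	token  string
	expiry int64
}

func (c *cachingAuthenticator) AccessToken() (string, int64, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	// Serve the cached token while it is comfortably valid.
	if c.token != "" && c.now() < c.expiry-c.margin {
		return c.token, c.expiry, nil
	}
	token, expiry, err := c.source.AccessToken()
	if err != nil {
		return "", 0, err
	}
	c.token, c.expiry = token, expiry
	return token, expiry, nil
}

// fakeSource stands in for a real token source and counts fetches.
type fakeSource struct {
	calls int
	now   func() int64
}

func (f *fakeSource) AccessToken() (string, int64, error) {
	f.calls++
	return fmt.Sprintf("token-%d", f.calls), f.now() + 3600, nil
}

func main() {
	now := func() int64 { return time.Now().Unix() }
	src := &fakeSource{now: now}
	auth := &cachingAuthenticator{source: src, now: now, margin: 60}

	first, _, _ := auth.AccessToken()
	second, _, _ := auth.AccessToken()
	fmt.Println(first == second, src.calls) // prints "true 1"
}
```

Refreshing slightly before the reported expiry leaves headroom for requests that are already in flight. Whether a given Authenticator already caches tokens internally is an implementation detail worth checking before adding a wrapper like this.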

Example code

In this example, Stubbs is used as the Authenticator and the reference SQLite datastore is used.

package main

import (
  "fmt"
  "os"

  "github.com/m-rots/bernard"
  "github.com/m-rots/bernard/datastore/sqlite"
  "github.com/m-rots/stubbs"
)

func getAuthenticator() (bernard.Authenticator, error) {
  clientEmail := "stubbs@westworld.iam.gserviceaccount.com"
  privateKey := "-----BEGIN PRIVATE KEY-----\n..."
  scopes := []string{"https://www.googleapis.com/auth/drive.readonly"}

  priv, err := stubbs.ParseKey(privateKey)
  if err != nil {
    // invalid private key
    return nil, err
  }

  account := stubbs.New(clientEmail, &priv, scopes, 3600)
  return account, nil
}

func main() {
  // Use Stubbs as the authenticator
  authenticator, err := getAuthenticator()
  if err != nil {
    fmt.Println("Invalid private key")
    os.Exit(1)
  }

  driveID := "1234xxxxxxxxxxxxxVA"
  datastorePath := "bernard.db"

  store, err := sqlite.New(datastorePath)
  if err != nil {
    // Either the database could not be created,
    // or the SQL schema is broken somehow...
    fmt.Println("Could not create SQLite datastore")
    os.Exit(1)
  }

  bernie := bernard.New(authenticator, store)

  err = bernie.FullSync(driveID)
  if err != nil {
    fmt.Println("Could not fully synchronise the drive")
    os.Exit(1)
  }

  err = bernie.PartialSync(driveID)
  if err != nil {
    fmt.Println("Could not partially synchronise the drive")
    os.Exit(1)
  }
}
