Incremental backups / partial restore #2963

Merged
merged 111 commits into from Mar 22, 2019
Changes from 90 commits
Commits
ab35d12
moved tests one dir up
Jan 31, 2019
99a4f6c
added optional HTTP var 'at' to support incremental backups
Feb 1, 2019
2f03b4a
updated func signature for since value
Feb 1, 2019
03abcef
updated func signature for since value. readTs added to backup request.
Feb 1, 2019
832f82c
added support for partial restore using readTs
Feb 1, 2019
71ce25e
added new command flag -since to restore from a specific readTs.
Feb 1, 2019
f76d829
changed file handler Load signature. added readTs support.
Feb 1, 2019
f997251
changed s3 handler Load signature. added readTs support.
Feb 1, 2019
fd46f60
fixed typo
Feb 1, 2019
430809a
expanded backup handler interface description
Feb 1, 2019
2686eea
more comments
Feb 1, 2019
7562e13
added service test
Feb 1, 2019
6dd8af1
Merge branch 'master' of github.com:/dgraph-io/dgraph into srfrog/iss…
Feb 1, 2019
06c07a8
added incr backup system test
Feb 1, 2019
bc5acb7
split readts and since
Feb 2, 2019
26ce696
Merge branch 'master' of github.com:/dgraph-io/dgraph into srfrog/iss…
Feb 5, 2019
871cd45
fixed bug in backup fan-out code
Feb 5, 2019
a2417b2
removed unused
Feb 5, 2019
ddbe61a
bind compose data dir for restore tests
Feb 5, 2019
b22cf20
renamed FindFilesFunc to WalkPathFunc and changed it to returns dirs …
Feb 5, 2019
cb856f3
changed FilesFindFunc to WalkPathFunc
Feb 5, 2019
039abbc
split off restore as runRestore to make it testable
Feb 5, 2019
03b4e55
ongoing refactoring of backup and restore tests
Feb 5, 2019
ebeb491
removed schema file, added gitignore for test data dirs
Feb 5, 2019
f7efa36
remove unused code
Feb 6, 2019
6e8da19
add time-second granularity for better testing and grouping
Feb 6, 2019
3274fd8
sync writes for new p values. rename readTs to since.
Feb 6, 2019
2b6ebd8
ee/backup/backup_test.go: refactor tests
Feb 6, 2019
c766213
Merge branch 'master' of github.com:/dgraph-io/dgraph into srfrog/iss…
Feb 6, 2019
464b8eb
protos/pb/pb.pb.go: regenerate protos for changes
Feb 6, 2019
2343792
ee/backup/run.go: fixed typo
Feb 6, 2019
9e6805c
contrib/scripts/test-backup-restore.sh: no longer needed
Feb 6, 2019
1eb1fc8
ee/backup/backup.go: decouple Process and Request, add Manifest.
Feb 16, 2019
cee0a06
ee/backup/handler.go: rename newHandler to create and generalize it.
Feb 16, 2019
3142496
ee/backup/file_handler.go: switch to new Create() and update Load()
Feb 16, 2019
669ee38
ee/backup/s3_handler.go: switch to new Create() and update Load()
Feb 16, 2019
ee501b5
worker/backup_ee.go: add manifests and return backup version info.
Feb 16, 2019
e549da1
ee/backup/backup_test.go: only minor updates, not yet done.
Feb 16, 2019
39f7b5e
ee/backup/run.go: removed since arg from Load(), moving to manifests.
Feb 16, 2019
0e06f5e
Merge branch 'master' of github.com:/dgraph-io/dgraph into srfrog/iss…
Feb 20, 2019
392bcd9
x/file.go: added FindFilesFunc alias to WalkPathFunc
Feb 20, 2019
b74f8d2
ee/backup/backup.go: get since version from create() after reading ma…
Feb 20, 2019
806a417
ee/backup/handler.go: added version to object, to return back for Bac…
Feb 20, 2019
e36e77a
ee/backup/run.go: removed log prefix based on PR feedback
Feb 20, 2019
9ebbb1f
worker/backup_ee.go: renamed Since to Version, minor cleanups
Feb 20, 2019
bf9a14c
ee/backup/file_handler.go: added manifest read
Feb 20, 2019
bcbad39
ee/backup/s3_handler.go: added manifest read
Feb 20, 2019
9a53379
removed debug logs
Feb 21, 2019
dcb9373
ee/backup/backup_test.go: run tests in sequence
Feb 21, 2019
535e545
ee/backup/backup_test.go: make data dir ephemeral and remove after tests
Feb 21, 2019
22b6ec2
ee/backup/.gitignore: dont check in the data dir
Feb 21, 2019
5d6e2b1
dgraph/cmd/alpha/admin_backup.go: dont need 'at' value
Feb 21, 2019
1f2fef1
worker/backup_ee.go: dont need since value, we get it from manifests
Feb 21, 2019
25428a5
ee/backup/docker-compose.yml: dont need tracing
Feb 21, 2019
5b9fac3
ee/backup/s3_handler.go: fix bug with pipe
Feb 21, 2019
bb9348e
worker/backup.go: wrong func signature
Feb 21, 2019
3a64b6d
Merge branch 'master' of github.com:/dgraph-io/dgraph into srfrog/iss…
Feb 21, 2019
373a431
changes based on PR feedback
Feb 21, 2019
d0f2b20
removed useless comment
Feb 21, 2019
0244935
removed extra 'at' value
Feb 21, 2019
e975232
ee/backup/backup.go: return nil in Process for clarity
Feb 25, 2019
c15ffb7
backup_test.go: check state for moved movie tablet instead of waiting
Feb 25, 2019
dd312f6
ee/backup/backup_test.go: docker test works better with preexisting dir
Feb 25, 2019
2cbb7f6
re-adding data dir for docker tests
Feb 25, 2019
133f85a
dgraph/cmd/alpha/admin_backup.go: changed text in EE flag check failure
Feb 25, 2019
dccdbb5
ee/backup/docker-compose.yml: data dir not read-only
Feb 26, 2019
e134848
ee/backup/backup.go: print previous and current version v=3
Feb 26, 2019
a167dcd
worker/backup_ee.go: the ctx goes to zero dont cancel it
Feb 26, 2019
92c8d5a
worker/backup_ee.go: fixed issue with versions
Feb 26, 2019
173c7ea
protos/pb.proto: changed RPC response for Backup to return Num
Feb 26, 2019
cfc89a2
proto regenerated
Feb 26, 2019
121f543
worker/backup_ee.go: dont error out when a group's max version is zero
Feb 26, 2019
5c0ac27
ee/backup/backup.go: use snapshot ts for version check
Feb 27, 2019
39f2070
ee/backup/file_handler.go: added version check using snapshot ts
Feb 27, 2019
21c33f3
ee/backup/s3_handler.go: added version check using snapshot ts
Feb 27, 2019
45390f4
protos/pb.proto: added snapshot ts to request
Feb 27, 2019
30fefff
ee/backup/handler.go: added snapshot ts to object for version check
Feb 27, 2019
1567bd7
fixed comments, from obsolete text
Feb 27, 2019
25d895f
worker/backup_ee.go: attach snapshot ts to request
Feb 27, 2019
8446b5a
ee/backup/backup.go: fixed comment
Mar 1, 2019
f8f4e53
Small changes
manishrjain Mar 1, 2019
830b590
ee/backup/backup.go: readded Request object to Process
Mar 1, 2019
4d40889
ee/backup/file_handler.go: removed useless comparison func
Mar 1, 2019
d3e2b83
ee/backup/s3_handler.go: removed useless comparison func
Mar 1, 2019
28976df
ee/backup/backup.go: nicer noop backup response.
Mar 1, 2019
a4d640d
worker/backup_ee.go: readded backup Request, fixed race, nicer noop r…
Mar 1, 2019
011c919
Merge branch 'srfrog/issue-2949_incremental_backups' of github.com:/d…
Mar 1, 2019
21e1f4e
better manifest.json suffix checks
Mar 1, 2019
04464d8
remove debug print, error nicely when no changes, groups report no ch…
Mar 4, 2019
5a6c286
revert some framework changes
Mar 5, 2019
2a7c1fb
Merge branch 'master' of github.com:/dgraph-io/dgraph into srfrog/iss…
Mar 15, 2019
acef487
add support to minio handler using s3 handler code
Mar 15, 2019
fb810f5
remove duplicate check
Mar 15, 2019
0c05799
ee/backup/backup.go: add readTs to manifest for restore checks
Mar 20, 2019
2591b5d
ee/backup/run.go: add optional flag --zero to update startTs after re…
Mar 20, 2019
3a1c0fb
ee/backup/handler.go: change Load() to return the max version Ts
Mar 20, 2019
18560b9
ee/backup/backup_test.go: update Load() signature
Mar 20, 2019
4dff8f7
ee/backup/file_handler.go: use manifest for restore verification
Mar 20, 2019
9bda258
ee/backup/s3_handler.go: use manifest for restore verification
Mar 20, 2019
6064d5d
nicer error
Mar 20, 2019
bd6b441
ee/backup/run.go: remove default zero value
Mar 20, 2019
cb305bf
ee/backup/s3_handler.go: increase worker limit to 100
Mar 20, 2019
e7d3521
minor usability changes
Mar 20, 2019
79d5a32
Merge branch 'master' of github.com:/dgraph-io/dgraph into srfrog/iss…
Mar 20, 2019
33245da
proto regen
Mar 20, 2019
3b47a2e
minor changes
Mar 20, 2019
0f96df0
more usability changes
Mar 21, 2019
530d509
update comments to reflect changes
Mar 21, 2019
44202b7
fix more comments and usability
Mar 21, 2019
362fe59
Merge branch 'master' into srfrog/issue-2949_incremental_backups
manishrjain Mar 22, 2019
20b8463
TODOs to deal with later
manishrjain Mar 22, 2019
17 changes: 9 additions & 8 deletions dgraph/cmd/alpha/admin_backup.go
@@ -31,23 +31,24 @@ func backupHandler(w http.ResponseWriter, r *http.Request) {
 		return
 	}
 	if !Alpha.Conf.GetBool("enterprise_features") {
-		err := x.Errorf("You must enable Dgraph enterprise features.")
-		x.SetStatus(w, err.Error(), "Backup failed.")
+		x.SetStatus(w,
+			"You must enable Dgraph enterprise features first. "+
+				"Restart Dgraph Alpha with --enterprise_features",
+			"Backup failed.")
 		return
 	}
-	dst := r.FormValue("destination")
-	if dst == "" {
-		err := x.Errorf("You must specify a 'destination' value")
-		x.SetStatus(w, err.Error(), "Backup failed.")
+
+	destination := r.FormValue("destination")
+	if destination == "" {
+		x.SetStatus(w, "You must specify a 'destination' value", "Backup failed.")
 		return
 	}
-	if err := worker.BackupOverNetwork(context.Background(), dst); err != nil {
+	if err := worker.BackupOverNetwork(context.Background(), destination); err != nil {
 		x.SetStatus(w, err.Error(), "Backup failed.")
 		return
 	}
 	w.Header().Set("Content-Type", "application/json")
 	x.Check2(w.Write([]byte(`{"code": "Success", "message": "Backup completed."}`)))
-
 }

 func init() {
2 changes: 2 additions & 0 deletions ee/backup/.gitignore
@@ -0,0 +1,2 @@
data/*
!data/.gitkeep
99 changes: 55 additions & 44 deletions ee/backup/backup.go
@@ -14,7 +14,8 @@ package backup

 import (
 	"context"
-	"net/url"
+	"encoding/json"
+	"sync"

 	"github.com/dgraph-io/badger"
 	"github.com/dgraph-io/dgraph/protos/pb"
@@ -23,75 +24,85 @@ import (
"github.com/golang/glog"
)

// ErrBackupNoChanges is returned when the manifest version is equal to the snapshot version.
// This means that no data updates happened since the last backup.
var ErrBackupNoChanges = x.Errorf("No changes since last backup, OK.")

// Request has all the information needed to perform a backup.
type Request struct {
DB *badger.DB // Badger pstore managed by this node.
Sizex uint64 // approximate upload size
Backup *pb.BackupRequest
DB *badger.DB // Badger pstore managed by this node.
Backup *pb.BackupRequest
Manifest *Manifest
Version uint64
}

// Process uses the request values to create a stream writer then hand off the data
// retrieval to stream.Orchestrate. The writer will create all the fd's needed to
// collect the data and later move to the target.
// Returns errors on failure, nil on success.
func (r *Request) Process(ctx context.Context) error {
h, err := r.newHandler()
if err := ctx.Err(); err != nil {
return err
}

handler, err := r.newHandler()
if err != nil {
glog.Errorf("Unable to get handler for request: %+v. Error: %v", r, err)
if err != ErrBackupNoChanges {
glog.Errorf("Unable to get handler for request: %+v. Error: %v", r.Backup, err)
}
return err
}
glog.V(3).Infof("Backup manifest version: %d", r.Version)

stream := r.DB.NewStreamAt(r.Backup.ReadTs)
stream.LogPrefix = "Dgraph.Backup"
// Take full backups for now.
if _, err := stream.Backup(h, 0); err != nil {
// Here we return the max version in the original request object. We will use this
// to create our manifest to complete the backup.
r.Backup.Since, err = stream.Backup(handler, r.Version)
if err != nil {
glog.Errorf("While taking backup: %v", err)
return err
}
if err := h.Close(); err != nil {
glog.V(2).Infof("Backup group %d version: %d", r.Backup.GroupId, r.Backup.Since)
if err = handler.Close(); err != nil {
glog.Errorf("While closing handler: %v", err)
return err
}
glog.Infof("Backup complete: group %d at %d", r.Backup.GroupId, r.Backup.ReadTs)
return err
return nil
}

// Manifest records backup details, these are values used during restore.
// Version is the maximum version seen.
// Groups are the IDs of the groups involved.
// Request is the original backup request.
type Manifest struct {
sync.Mutex
Version uint64 `json:"version"`
Groups []uint32 `json:"groups"`
}

// newHandler parses the requested target URI, finds a handler and then tries to create a session.
// Target URI formats:
// [scheme]://[host]/[path]?[args]
// [scheme]:///[path]?[args]
// /[path]?[args] (only for local or NFS)
//
// Target URI parts:
// scheme - service handler, one of: "s3", "gs", "az", "http", "file"
// host - remote address. ex: "dgraph.s3.amazonaws.com"
// path - directory, bucket or container at target. ex: "/dgraph/backups/"
// args - specific arguments that are ok to appear in logs.
//
// Global args (might not be supported by all handlers):
// secure - true|false turn on/off TLS.
// compress - true|false turn on/off data compression.
//
// Examples:
// s3://dgraph.s3.amazonaws.com/dgraph/backups?secure=true
// gs://dgraph/backups/
// as://dgraph-container/backups/
// http://backups.dgraph.io/upload
// file:///tmp/dgraph/backups or /tmp/dgraph/backups?compress=gzip
func (r *Request) newHandler() (handler, error) {
uri, err := url.Parse(r.Backup.Location)
// Complete will finalize a backup by writing the manifest at the backup destination.
func (r *Request) Complete(ctx context.Context) error {
if err := ctx.Err(); err != nil {
return err
}
// handler, err := create(&object{
// uri: m.Request.Location,
// path: fmt.Sprintf(backupPathFmt, m.Request.UnixTs),
// name: backupManifest,
// version: m.Version,
// })
handler, err := r.newHandler()
if err != nil {
return nil, err
return err
}

// find handler for this URI scheme
h := getHandler(uri.Scheme)
if h == nil {
return nil, x.Errorf("Unable to handle url: %v", uri)
if err = json.NewEncoder(handler).Encode(r.Manifest); err != nil {
return err
}

if err := h.Create(uri, r); err != nil {
return nil, err
if err = handler.Close(); err != nil {
return err
}
return h, nil
glog.Infof("Backup completed OK.")
return nil
}