
rfc: backup/restore/import #8191

@danhhz

Description

There's an in-progress one at https://github.com/paperstreet/cockroach/blob/rfc_bulk/docs/RFCS/bulk_operations.md, but there are a few open questions and experiments to run before it's ready for review.

  • Merge the slow-but-correct prototype of backup/restore
  • Bulk import perf experiments (the existing mechanisms are too slow; maybe proposer-evaluated KV will be fast enough)
    • One node inserting via distsender
    • Many nodes inserting via distsender
    • Leaseholders inserting via distsender
    • Leaseholders writing a WriteBatch into an unreplicated range
    • RocksDB's AddFile
  • SSTable perf experiments
    • Iterating a large sstable: 110-byte KVs, 1.4M op/s, 157 MB/s
    • Writing a large sstable: 110-byte KVs, 350k op/s, 39.1 MB/s
    • Rekeying an sstable (e.g. changing the table ID) on the fly: 110-byte KVs, 1.3M op/s, 143 MB/s
  • Talk to beta customers about requirements ( @petermattis / @spencerkimball )
  • Open Questions
    • How does the raft part of import work?
    • How to iterate all data (including tombstones) in a snapshot's sstables
      • How to skip sstables that we know don't have relevant timestamps
    • Do we care about time travel over newly restored data?
    • Allow restoring to a different table ID? _yes_
    • Where are TableDescriptors stored in a backup? In the metadata file
    • How to restore into an empty range
    • How to go from sql/pgcopy to non-overlapping, sorted kv ranges
    • How to handle a very long running transaction (and how to pick the end timestamp)?
    • Do we need any of the local range keys?
    • Can we point RocksDB at S3? The Env abstraction lets us do this, but we have to write it
    • What needs to be verified when importing kvs?
    • How to lock a table during a restore
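As a quick sanity check on the SSTable experiment numbers above, throughput in MB/s should be roughly op/s × KV size. A small Go snippet using the figures reported above shows the rates line up to within a few percent:

```go
package main

import "fmt"

func main() {
	const kvBytes = 110.0 // each KV is 110 bytes, per the experiments above
	experiments := []struct {
		name string
		ops  float64 // measured ops/sec
	}{
		{"iterate", 1.4e6}, // reported 157 MB/s
		{"write", 350e3},   // reported 39.1 MB/s
		{"rekey", 1.3e6},   // reported 143 MB/s
	}
	for _, e := range experiments {
		// op/s * bytes/op / 1e6 = MB/s
		fmt.Printf("%s: %.1f MB/s\n", e.name, e.ops*kvBytes/1e6)
	}
	// prints: iterate: 154.0 MB/s, write: 38.5 MB/s, rekey: 143.0 MB/s
}
```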
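The rekeying experiment (and the "restore to a different table id" question) boils down to swapping the table-ID prefix of each key as it streams by. A minimal Go sketch of that step, using illustrative string prefixes rather than CockroachDB's real varint key encoding:

```go
package main

import (
	"bytes"
	"fmt"
)

// rekey replaces oldPrefix at the start of key with newPrefix.
// In a real rekeying pass the prefixes would be the encoded old and
// new table IDs; here they are illustrative byte slices.
func rekey(key, oldPrefix, newPrefix []byte) ([]byte, error) {
	if !bytes.HasPrefix(key, oldPrefix) {
		return nil, fmt.Errorf("key %q does not start with prefix %q", key, oldPrefix)
	}
	out := make([]byte, 0, len(newPrefix)+len(key)-len(oldPrefix))
	out = append(out, newPrefix...)
	out = append(out, key[len(oldPrefix):]...)
	return out, nil
}

func main() {
	// Restore a key from (hypothetical) table 51 into table 72.
	k, err := rekey([]byte("/51/pk/1"), []byte("/51"), []byte("/72"))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(k)) // /72/pk/1
}
```

Because only a fixed-length prefix changes, the relative order of keys within the table is preserved, which is what makes rekeying on the fly during an sstable scan cheap.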
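One of the open questions asks how to go from sql/pgcopy input to non-overlapping, sorted KV ranges. A minimal sketch of the sort-then-split step (the `kv` type and `splitSorted` name are illustrative, not CockroachDB APIs):

```go
package main

import (
	"fmt"
	"sort"
)

type kv struct{ key, val string }

// splitSorted sorts kvs by key and cuts them into batches of at most
// batchSize entries. Because the input is sorted first, each batch
// covers a key span that does not overlap any other batch's span.
func splitSorted(kvs []kv, batchSize int) [][]kv {
	sort.Slice(kvs, func(i, j int) bool { return kvs[i].key < kvs[j].key })
	var out [][]kv
	for len(kvs) > 0 {
		n := batchSize
		if n > len(kvs) {
			n = len(kvs)
		}
		out = append(out, kvs[:n])
		kvs = kvs[n:]
	}
	return out
}

func main() {
	// Rows arrive in arbitrary order, as they would from pgcopy.
	kvs := []kv{{"c", "3"}, {"a", "1"}, {"d", "4"}, {"b", "2"}}
	for _, b := range splitSorted(kvs, 2) {
		fmt.Printf("[%s..%s]\n", b[0].key, b[len(b)-1].key)
	}
	// prints: [a..b] then [c..d]
}
```

In practice the split points would come from target range sizes rather than a fixed entry count, but the invariant is the same: sort first, then every cut yields disjoint spans.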
