There's an in-progress one at https://github.com/paperstreet/cockroach/blob/rfc_bulk/docs/RFCS/bulk_operations.md, but there are a few open questions and experiments to run before it's ready for review.

- [x] Merge slow but correct prototype of backup/restore
- [x] Bulk Import perf experiments _the existing mechanisms are too slow, maybe proposer evaluated kv will be fast enough_
  - [ ] One node inserting via distsender
  - [ ] Many nodes inserting via distsender
  - [ ] Leaseholders inserting via distsender
  - [ ] Leaseholders WriteBatch into unreplicated range
  - [ ] RocksDB's AddFile
- [x] SSTable perf experiments
  - [x] Iterating large sstable _110byte kvs, 1.4m op/s, 157MB/s_
  - [x] Writing large sstable _110byte kvs, 350k op/s, 39.1MB/s_
  - [x] Rekeying sstable (ex: change table id) on the fly _110byte kvs, 1.3m op/s, 143MB/s_
- [ ] Talk to beta customers about requirements ( @petermattis / @spencerkimball )
- [ ] Open Questions
  - [ ] How does the raft part of import work?
  - [ ] How to iterate all data (including tombstones) in a snapshot's sstables
  - [ ] How to skip sstables that we know don't have relevant timestamps
  - [ ] Do we care about time travel over newly restored data?
  - [x] Allow restoring to a different table id? _yes_
  - [x] Where are TableDescriptors stored in a backup? _metadata file_
  - [ ] How to restore into an empty range
  - [ ] How to go from sql/pgcopy to non-overlapping, sorted kv ranges
  - [x] How to handle a very long running transaction (and how to pick the end timestamp)?
  - [ ] Do we need any of the local range keys?
  - [x] Can we point rocksdb at s3? _the Env abstraction lets us do this, but we have to write it_
  - [ ] What needs to be verified when importing kvs?
  - [x] How to lock a table during a restore