Skip to content

Commit

Permalink
feat: store blocks under multihash key (#211)
Browse files Browse the repository at this point in the history
This is related to ipfs/js-ipfs#2415

Breaking changes:

- Repo version incremented to `8`, requires a migration
- Blocks are now stored using the multihash, not the full CID
- `repo.blocks.query({})` now returns an async iterator that yields blocks
- `repo.blocks.query({ keysOnly: true })` now returns an async iterator that yields CIDs
  - Those CIDs are v1 with the raw codec

Co-authored-by: achingbrain <alex@achingbrain.net>
  • Loading branch information
AuHau and achingbrain committed Jun 25, 2020
1 parent d4bf852 commit 06a9e27
Show file tree
Hide file tree
Showing 12 changed files with 208 additions and 108 deletions.
57 changes: 41 additions & 16 deletions README.md
Expand Up @@ -40,17 +40,20 @@ This is the implementation of the [IPFS repo spec](https://github.com/ipfs/specs
- [`Promise<Buffer> repo.get(key)`](#promisebuffer-repogetkey)
- [Blocks](#blocks)
- [`Promise<Block> repo.blocks.put(block:Block)`](#promiseblock-repoblocksputblockblock)
- [`AsyncIterator<Block> repo.blocks.putMany(source)`](#asynciteratorblock-repoblocksputmanysource)
- [`Promise<Buffer> repo.blocks.get(cid)`](#promisebuffer-repoblocksgetcid)
- [`AsyncIterable<Buffer> repo.blocks.getMany(source)`](#asynciterablebuffer-repoblocksgetmanysource)
- [`AsyncIterator<Block> repo.blocks.putMany(source:AsyncIterable<Block>)`](#asynciteratorblock-repoblocksputmanysourceasynciterableblock)
- [`Promise<Block> repo.blocks.get(cid:CID)`](#promiseblock-repoblocksgetcidcid)
- [`AsyncIterable<Block> repo.blocks.getMany(source:AsyncIterable<CID>)`](#asynciterableblock-repoblocksgetmanysourceasynciterablecid)
- [`Promise<boolean> repo.blocks.has (cid:CID)`](#promiseboolean-repoblockshas-cidcid)
- [`Promise<boolean> repo.blocks.delete (cid:CID)`](#promiseboolean-repoblocksdelete-cidcid)
- [`AsyncIterator<Block|CID> repo.blocks.query (query)`](#asynciteratorblockcid-repoblocksquery-query)
- [`Promise<CID> repo.blocks.delete(cid:CID)`](#promisecid-repoblocksdeletecidcid)
- [`AsyncIterator<CID> repo.blocks.deleteMany(source)`](#asynciteratorcid-repoblocksdeletemanysource)
- [`AsyncIterator<CID> repo.blocks.deleteMany(source:AsyncIterable<CID>)`](#asynciteratorcid-repoblocksdeletemanysourceasynciterablecid)
- [Datastore](#datastore)
- [`repo.datastore`](#repodatastore)
- [Config](#config)
- [`Promise repo.config.set(key:string, value)`](#promise-repoconfigsetkeystring-value)
- [`Promise repo.config.replace(value)`](#promise-repoconfigreplacevalue)
- [`Promise<?> repo.config.get(key:string)`](#promise-repoconfiggetkeystring)
- [`Promise repo.config.set(key:String, value:Object)`](#promise-repoconfigsetkeystring-valueobject)
- [`Promise repo.config.replace(value:Object)`](#promise-repoconfigreplacevalueobject)
- [`Promise<?> repo.config.get(key:String)`](#promise-repoconfiggetkeystring)
- [`Promise<Object> repo.config.getAll()`](#promiseobject-repoconfiggetall)
- [`Promise<boolean> repo.config.exists()`](#promiseboolean-repoconfigexists)
- [Version](#version)
Expand Down Expand Up @@ -229,31 +232,53 @@ Get a value at the root of the repo

* `block` should be of type [Block][]

#### `AsyncIterator<Block> repo.blocks.putMany(source)`
#### `AsyncIterator<Block> repo.blocks.putMany(source:AsyncIterable<Block>)`

Put many blocks.

* `source` should be an AsyncIterable that yields entries of type [Block][]

#### `Promise<Buffer> repo.blocks.get(cid)`
#### `Promise<Block> repo.blocks.get(cid:CID)`

Get block.

* `cid` is the content id of type [CID][]

#### `AsyncIterable<Buffer> repo.blocks.getMany(source)`
#### `AsyncIterable<Block> repo.blocks.getMany(source:AsyncIterable<CID>)`

Get block.
Get many blocks

* `source` should be an AsyncIterable that yields entries of type [CID][]

#### `Promise<boolean> repo.blocks.has (cid:CID)`

Indicate if a block is present for the passed CID

* `cid` should be of the type [CID][]

#### `Promise<boolean> repo.blocks.delete (cid:CID)`

Deletes a block

* `cid` should be of the type [CID][]

#### `AsyncIterator<Block|CID> repo.blocks.query (query)`

Query what blocks are available in blockstore.

If `query.keysOnly` is true, the returned iterator will yield [CID][]s, otherwise it will yield [Block][]s

* `query` is a object as specified in [interface-datastore](https://github.com/ipfs/interface-datastore#query).

Datastore:

#### `Promise<CID> repo.blocks.delete(cid:CID)`

* `cid` should be of the type [CID][]

Delete a block

#### `AsyncIterator<CID> repo.blocks.deleteMany(source)`
#### `AsyncIterator<CID> repo.blocks.deleteMany(source:AsyncIterable<CID>)`

* `source` should be an Iterable or AsyncIterable that yields entries of the type [CID][]

Expand All @@ -269,7 +294,7 @@ This contains a full implementation of [the `interface-datastore` API](https://g

Instead of using `repo.set('config')` this exposes an API that allows you to set and get a decoded config object, as well as, in a safe manner, change any of the config values individually.

#### `Promise repo.config.set(key:string, value)`
#### `Promise repo.config.set(key:String, value:Object)`

Set a config value. `value` can be any object that is serializable to JSON.

Expand All @@ -281,11 +306,11 @@ const config = await repo.config.get()
assert.equal(config.a.b.c, 'c value')
```

#### `Promise repo.config.replace(value)`
#### `Promise repo.config.replace(value:Object)`

Set the whole config value. `value` can be any object that is serializable to JSON.

#### `Promise<?> repo.config.get(key:string)`
#### `Promise<?> repo.config.get(key:String)`

Get a config value. Returned promise resolves to the same type that was set before.

Expand Down Expand Up @@ -379,7 +404,7 @@ Returned promise resolves to a `boolean` indicating the existence of the lock.

### Migrations

When there is a new repo migration and the version of repo is increased, don't
When there is a new repo migration and the version of the repo is increased, don't
forget to propagate the changes into the test repo (`test/test-repo`).

**For tools that run mainly in the browser environment, be aware that disabling automatic
Expand Down
7 changes: 3 additions & 4 deletions package.json
Expand Up @@ -52,8 +52,7 @@
"it-first": "^1.0.2",
"just-range": "^2.1.0",
"memdown": "^5.1.0",
"multihashes": "^1.0.1",
"multihashing-async": "^0.8.0",
"multihashing-async": "^1.0.0",
"ncp": "^2.0.0",
"rimraf": "^3.0.0",
"sinon": "^9.0.2"
Expand All @@ -69,11 +68,11 @@
"debug": "^4.1.0",
"err-code": "^2.0.0",
"interface-datastore": "^1.0.2",
"ipfs-repo-migrations": "^0.2.1",
"ipfs-repo-migrations": "^1.0.0",
"ipfs-utils": "^2.2.0",
"ipld-block": "^0.9.1",
"it-map": "^1.0.2",
"it-pipe": "^1.1.0",
"it-pushable": "^1.4.0",
"just-safe-get": "^2.0.0",
"just-safe-set": "^2.1.0",
"multibase": "^1.0.1",
Expand Down
7 changes: 5 additions & 2 deletions src/blockstore-utils.js
Expand Up @@ -16,15 +16,18 @@ exports.cidToKey = cid => {
throw errcode(new Error('Not a valid cid'), 'ERR_INVALID_CID')
}

return new Key('/' + multibase.encode('base32', cid.buffer).toString().slice(1).toUpperCase(), false)
return new Key('/' + multibase.encode('base32', cid.multihash).toString().slice(1).toUpperCase(), false)
}

/**
* Transform a datastore Key instance to a CID
* As Key is a multihash of the CID, it is reconstructed using IPLD's RAW codec.
* Hence it is highly probable that stored CID will differ from a CID retrieved from blockstore.
*
* @param {Key} key
* @returns {CID}
*/
exports.keyToCid = key => {
return new CID(multibase.decode('b' + key.toString().slice(1).toLowerCase()))
// Block key is of the form /<base32 encoded string>
return new CID(1, 'raw', multibase.decode('b' + key.toString().slice(1).toLowerCase()))
}
142 changes: 71 additions & 71 deletions src/blockstore.js
Expand Up @@ -5,7 +5,8 @@ const ShardingStore = core.ShardingDatastore
const Block = require('ipld-block')
const { cidToKey, keyToCid } = require('./blockstore-utils')
const map = require('it-map')
const pipe = require('it-pipe')
const drain = require('it-drain')
const pushable = require('it-pushable')

module.exports = async (filestore, options) => {
const store = await maybeWithSharding(filestore, options)
Expand All @@ -23,47 +24,39 @@ function maybeWithSharding (filestore, options) {
function createBaseStore (store) {
return {
/**
* Query the store.
* Query the store
*
* @param {Object} query
* @param {Object} options
* @returns {AsyncIterator<Block>}
* @returns {AsyncIterator<Block|CID>}
*/
async * query (query, options) { // eslint-disable-line require-await
yield * store.query(query, options)
async * query (query, options) {
for await (const { key, value } of store.query(query, options)) {
if (query.keysOnly) {
yield keyToCid(key)
continue
}

yield new Block(value, keyToCid(key))
}
},

/**
* Get a single block by CID.
* Get a single block by CID
*
* @param {CID} cid
* @param {Object} options
* @returns {Promise<Block>}
*/
async get (cid, options) {
const key = cidToKey(cid)
let blockData
try {
blockData = await store.get(key, options)
return new Block(blockData, cid)
} catch (err) {
if (err.code === 'ERR_NOT_FOUND') {
const otherCid = cidToOtherVersion(cid)

if (!otherCid) {
throw err
}

const otherKey = cidToKey(otherCid)
const blockData = await store.get(otherKey, options)
await store.put(key, blockData)
return new Block(blockData, cid)
}
const blockData = await store.get(key, options)

throw err
}
return new Block(blockData, cid)
},

/**
* Like get, but for more.
* Like get, but for more
*
* @param {AsyncIterator<CID>} cids
* @param {Object} options
Expand All @@ -74,8 +67,9 @@ function createBaseStore (store) {
yield this.get(cid, options)
}
},

/**
* Write a single block to the store.
* Write a single block to the store
*
* @param {Block} block
* @param {Object} options
Expand All @@ -86,59 +80,75 @@ function createBaseStore (store) {
throw new Error('invalid block')
}

const exists = await this.has(block.cid)
const key = cidToKey(block.cid)
const exists = await store.has(key, options)

if (exists) {
return this.get(block.cid, options)
if (!exists) {
await store.put(key, block.data, options)
}

await store.put(cidToKey(block.cid), block.data, options)

return block
},

/**
* Like put, but for more.
* Like put, but for more
*
* @param {AsyncIterable<Block>|Iterable<Block>} blocks
* @param {Object} options
* @returns {AsyncIterable<Block>}
*/
async * putMany (blocks, options) { // eslint-disable-line require-await
yield * pipe(
blocks,
(source) => {
// turn them into a key/value pair
return map(source, (block) => {
return { key: cidToKey(block.cid), value: block.data }
})
},
(source) => {
// put them into the datastore
return store.putMany(source, options)
},
(source) => {
// map the returned key/value back into a block
return map(source, ({ key, value }) => {
return new Block(value, keyToCid(key))
})
// we cannot simply chain to `store.putMany` because we convert a CID into
// a key based on the multihash only, so we lose the version & codec and
// cannot give the user back the CID they used to create the block, so yield
// to `store.putMany` but return the actual block the user passed in.
//
// nb. we want to use `store.putMany` here so bitswap can control batching
// up block HAVEs to send to the network - if we use multiple `store.put`s
// it will not be able to guess we are about to `store.put` more blocks
const output = pushable()

// process.nextTick runs on the microtask queue, setImmediate runs on the next
// event loop iteration so is slower. Use process.nextTick if it is available.
const runner = process && process.nextTick ? process.nextTick : setImmediate

runner(async () => {
try {
await drain(store.putMany(async function * () {
for await (const block of blocks) {
const key = cidToKey(block.cid)
const exists = await store.has(key, options)

if (!exists) {
yield { key, value: block.data }
}

// there is an assumption here that after the yield has completed
// the underlying datastore has finished writing the block
output.push(block)
}
}()))

output.end()
} catch (err) {
output.end(err)
}
)
})

yield * output
},

/**
* Does the store contain block with this cid?
* Does the store contain block with this CID?
*
* @param {CID} cid
* @param {Object} options
* @returns {Promise<bool>}
*/
async has (cid, options) {
const exists = await store.has(cidToKey(cid), options)
if (exists) return exists
const otherCid = cidToOtherVersion(cid)
if (!otherCid) return false
return store.has(cidToKey(otherCid), options)
async has (cid, options) { // eslint-disable-line require-await
return store.has(cidToKey(cid), options)
},

/**
* Delete a block from the store
*
Expand All @@ -149,6 +159,7 @@ function createBaseStore (store) {
async delete (cid, options) { // eslint-disable-line require-await
return store.delete(cidToKey(cid), options)
},

/**
* Delete a block from the store
*
Expand All @@ -157,12 +168,9 @@ function createBaseStore (store) {
* @returns {Promise<void>}
*/
async * deleteMany (cids, options) { // eslint-disable-line require-await
yield * store.deleteMany((async function * () {
for await (const cid of cids) {
yield cidToKey(cid)
}
}()), options)
yield * store.deleteMany(map(cids, cid => cidToKey(cid)), options)
},

/**
* Close the store
*
Expand All @@ -173,11 +181,3 @@ function createBaseStore (store) {
}
}
}

function cidToOtherVersion (cid) {
try {
return cid.version === 0 ? cid.toV1() : cid.toV0()
} catch (err) {
return null
}
}
2 changes: 1 addition & 1 deletion src/constants.js
@@ -1,5 +1,5 @@
'use strict'

module.exports = {
repoVersion: 7
repoVersion: 8
}

0 comments on commit 06a9e27

Please sign in to comment.