Reduce number of dependencies needed for importer #186

@Gozala

Context

I have been working on nftstorage/nft.storage#837 and ran into complications using the importer, because it requires a blockstore implementation:

/**
 * @param {AsyncIterable<ImportCandidate> | Iterable<ImportCandidate> | ImportCandidate} source
 * @param {Blockstore} blockstore
 * @param {UserImporterOptions} options
 */
export async function * importer (source, blockstore, options = {}) {

Unfortunately, that is not a simple API to supply, given that it has a large number of methods:

export interface Blockstore extends Store<CID, Uint8Array> {}

export interface Store<Key, Value> {
  open: () => Promise<void>
  close: () => Promise<void>
  put: (key: Key, val: Value, options?: Options) => Promise<void>
  get: (key: Key, options?: Options) => Promise<Value>

  has: (key: Key, options?: Options) => Promise<boolean>
  delete: (key: Key, options?: Options) => Promise<void>
  putMany: (
    source: AwaitIterable<Pair<Key, Value>>,
    options?: Options
  ) => AsyncIterable<Pair<Key, Value>>
  getMany: (
    source: AwaitIterable<Key>,
    options?: Options
  ) => AsyncIterable<Value>
  deleteMany: (
    source: AwaitIterable<Key>,
    options?: Options
  ) => AsyncIterable<Key>

  batch: () => Batch<Key, Value>
  query: (query: Query<Key, Value>, options?: Options) => AsyncIterable<Pair<Key, Value>>
  queryKeys: (query: KeyQuery<Key>, options?: Options) => AsyncIterable<Key>
}

https://github.com/ipfs/js-ipfs-interfaces/blob/17a18d9af34a39ea7b066d523893c3254439f50b/packages/interface-blockstore/src/index.ts#L29-L31
https://github.com/ipfs/js-ipfs-interfaces/blob/17a18d9af34a39ea7b066d523893c3254439f50b/packages/interface-store/src/index.ts#L23-L175

I also suspect that the importer does not need all of those methods to do its job. Given its name, I would expect it to need only a subset of the write API.
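One way to check that suspicion empirically is to hand the importer a store that records which methods it touches. The following is a minimal sketch, not part of any existing package: `traceStore` and `STORE_METHODS` are hypothetical names, and the method list is simply copied from the `Store` interface quoted above.

```javascript
// Method names taken from the Store interface quoted above.
const STORE_METHODS = [
  'open', 'close', 'put', 'get', 'has', 'delete',
  'putMany', 'getMany', 'deleteMany', 'batch', 'query', 'queryKeys'
]

/**
 * Hypothetical helper: wraps a partial store and records every method call.
 * Methods missing from `partial` throw, so unexpected usage surfaces immediately.
 *
 * @param {Record<string, Function>} partial - object implementing any subset of Store
 */
function traceStore (partial) {
  const called = new Set()
  const store = {}

  for (const name of STORE_METHODS) {
    store[name] = (...args) => {
      called.add(name)
      if (typeof partial[name] !== 'function') {
        throw new Error(`Blockstore.${name} was called but is not implemented`)
      }
      return partial[name](...args)
    }
  }

  return { store, called }
}
```

Passing `store` as the `blockstore` argument and inspecting `called` afterwards would show which subset of the interface is exercised for a given input and set of options.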

Proposal

Option 1

I would like to propose loosening the requirements on the importer, so that something like a BlockWriter or possibly a CarEncoder could be used instead.
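Until the requirement is loosened, a thin adapter could bridge the gap. The sketch below assumes a hypothetical writer object exposing only `put(cid, bytes)`; it does not reflect the actual BlockWriter or CarEncoder APIs. It also assumes that `Pair` has the shape `{ key, value }` and that the importer only uses write methods, which is exactly the open question in this issue.

```javascript
/**
 * Hypothetical adapter: exposes a write-only `writer` ({ put(cid, bytes) })
 * as a Blockstore-shaped object. Read, delete and query methods throw.
 *
 * @param {{ put: (cid: any, bytes: Uint8Array) => Promise<void> }} writer
 */
function blockstoreFromWriter (writer) {
  const unsupported = (name) => () => {
    throw new Error(`${name} is not supported by a write-only blockstore`)
  }

  return {
    open: async () => {},
    close: async () => {},
    put: (cid, bytes) => writer.put(cid, bytes),
    // Assumes Pair is { key, value }; yields each pair back after writing it.
    async * putMany (source) {
      for await (const { key, value } of source) {
        await writer.put(key, value)
        yield { key, value }
      }
    },
    get: unsupported('get'),
    has: unsupported('has'),
    delete: unsupported('delete'),
    getMany: unsupported('getMany'),
    deleteMany: unsupported('deleteMany'),
    batch: unsupported('batch'),
    query: unsupported('query'),
    queryKeys: unsupported('queryKeys')
  }
}
```

If the importer turns out to call any of the unsupported methods, the thrown error names the method, which would help decide how small the required interface can be.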

Option 2

It seems that the importer does two tasks:

  1. Importing blocks into the blockstore
  2. Emitting the created unixfs entries

It may be a good idea to untangle DAG assembly from importing. For example, we could refactor the API so that, instead of writing blocks and emitting unixfs entries, it emits entries that carry their own block iterators, so the blocks can be written into a blockstore as needed.

I realize this is already partly the case, given that the dag builder passes things to the tree builder, which then flushes them into the blockstore:

async function * treeBuilder (source, block, options) {
  /** @type {Dir} */
  let tree = new DirFlat({
    root: true,
    dir: true,
    path: '',
    dirty: true,
    flat: true
  }, options)

  for await (const entry of source) {
    if (!entry) {
      continue
    }

    tree = await addToTree(entry, tree, options)

    if (!entry.unixfs || !entry.unixfs.isDirectory()) {
      yield entry
    }
  }

  if (options.wrapWithDirectory) {
    yield * flushAndYield(tree, block)
  } else {
    for await (const unwrapped of tree.eachChildSeries()) {
      if (!unwrapped) {
        continue
      }

      yield * flushAndYield(unwrapped.child, block)
    }
  }
}
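
To make the Option 2 idea more concrete, here is a rough, synchronous sketch of entries that carry their own block iterators. Everything in it is hypothetical: `assemble`, `writeEntries`, the entry shape and the placeholder string keys are illustrations only (a real implementation would be async and would produce CIDs and unixfs nodes).

```javascript
/**
 * Hypothetical DAG assembly step: yields entries without writing anything.
 * Each entry exposes a lazy `blocks()` iterator of { key, value } pairs.
 *
 * @param {Iterable<{ path: string, chunks: Uint8Array[] }>} files
 */
function * assemble (files) {
  for (const { path, chunks } of files) {
    yield {
      path,
      * blocks () {
        let index = 0
        for (const value of chunks) {
          // Placeholder key; a real implementation would derive a CID here.
          yield { key: `${path}#${index++}`, value }
        }
      }
    }
  }
}

/**
 * Hypothetical import step: the caller decides where blocks go by supplying `put`.
 *
 * @param {Iterable<{ path: string, blocks: () => Iterable<{ key: string, value: Uint8Array }> }>} entries
 * @param {(key: string, value: Uint8Array) => void} put
 */
function writeEntries (entries, put) {
  const written = []
  for (const entry of entries) {
    for (const { key, value } of entry.blocks()) {
      put(key, value)
    }
    written.push(entry.path)
  }
  return written
}
```

With this split, a caller could pipe `blocks()` into a blockstore, a CAR encoder, or nothing at all, and the assembly step would no longer depend on the full `Blockstore` interface.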


Labels: kind/discussion (topical discussion; usually not changes to codebase), need/analysis (needs further analysis before proceeding)
