Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configuration free implementation #36

Merged
merged 27 commits into from
Sep 25, 2020
Merged

Configuration free implementation #36

merged 27 commits into from
Sep 25, 2020

Conversation

Gozala
Copy link
Contributor

@Gozala Gozala commented Sep 17, 2020

This supersedes #35. It vendors base-x and bundles base32 & base58btc codecs with CID implementation so no dependency injections will be necessary. All the tests pass now, but I need to spend bit more time on code coverage.

@Gozala Gozala requested a review from mikeal September 17, 2020 10:32
@rvagg
Copy link
Member

rvagg commented Sep 18, 2020

Is this going to get a ComposedCodec like you've done for the bases, but for the ipld codecs? I'm still bothered by the "I have a block that I need to decode but can't be sure what codec I'll need until I have the CID" case that a registry nicely solves.

For the CarReader interface, what kind of object is it returning? Can I get it to return a singular Block thing or do I need to go back to returning raw bytes that you then compose into these objects using information you discern from the pieces rather than having the CarReader do it via some other tool (Block)? What if my CAR contains 3 different codecs, do I need to go fishing around in CIDs to select which codecs to use on the binary data or will I be able to hand that concern off? This is the end that I want to see. The example of encoding a block looks nice in here, but decoding when you don't have strict 1:1 correspondence isn't so nice - even with UnixFS you need to select between dag-pb and raw decoders; will I need to do that myself?

@Gozala
Copy link
Contributor Author

Gozala commented Sep 18, 2020

Is this going to get a ComposedCodec like you've done for the bases, but for the ipld codecs?

I was thinking about it and I would love to have something like ComposedDecoder for IPLD codes as well, however base decoder can figure out which decoder to apply from the input, which is not the case for encoded blocks. If we had a notion of "multiblock" (I have asked something along those lines in the past) it would all fit nicely together.

I'm still bothered by the "I have a block that I need to decode but can't be sure what codec I'll need until I have the CID" case that a registry nicely solves.

How does the registry solves this problem ? You still need to know hat codec to use don't you ?


What I have been thinking would be nice to have some representation of block + codec info let's call it a multiblock for now. That would enable composition just like with multibase decoders. Maybe we could have something like this ?

interface MultiBlockEncoder<Code> {
  encode(input:Multiblock<Code>):EncodedMultiblock<Code>

  or(other:MultiblockEncoder<OtherCode>):MultiblockEncoder<Code|OtherCode>
}

interface MultiBlockDecoder<Code> {
  decode(input:EncodedMultiblock<Code>):Multiblock<Code>

  or(other:MultiBlockDecoder<OtherCode>):MultiBlockDecoder<Code>
}

type Multiblock<Code> = {
  code: Code
  data: any
}

type EncodedMultiblock<Code> = {
  code: Code,
  bytes: Uint8Array
}

This would enable composeincg both encoders and decoders e.g.:

const codec = dagRaw.or(dagJSON).or(dagCbor).or(dagPb)
codec.encode({ code: dagCbor.code, data: { ... } })
codec.encode({ code: dagPb.code, data: {...} })

This could also work just fine with js-block and Block class in this pull request. Because they do comply with Multiblock and interface.

@Gozala
Copy link
Contributor Author

Gozala commented Sep 18, 2020

For the CarReader interface, what kind of object is it returning? Can I get it to return a singular Block thing or do I need to go back to returning raw bytes that you then compose into these objects using information you discern from the pieces rather than having the CarReader do it via some other tool (Block)? What if my CAR contains 3 different codecs, do I need to go fishing around in CIDs to select which codecs to use on the binary data or will I be able to hand that concern off? This is the end that I want to see. The example of encoding a block looks nice in here, but decoding when you don't have strict 1:1 correspondence isn't so nice - even with UnixFS you need to select between dag-pb and raw decoders; will I need to do that myself?

So given the CAR encoding:

|--------- Header --------| |---------------------------------- Data -----------------------------------|

[ varint | DAG-CBOR block ] [ varint | CID | block ] [ varint | CID | block ] [ varint | CID | block ] …

If we can formalize this following byte range [ CID | block ] into a general thing, let's call it Multiblock. We'll have a very nice abstraction where Block instances are decoded representations of those byte ranges. That would also enable composable ipld mulitecodecs as illustrated in previous comment. As of CarReader it would just need to be given an IPLD (muliti)codec on which it would be able to do codec.decode(bytes)

I also have to highlight that need for CID+Block abstraction is something I have run and asked for myself in the past. I think back than I was proposing to introduce notion of "inline blocks".

@Gozala
Copy link
Contributor Author

Gozala commented Sep 23, 2020

@mikeal I have got test coverage to here

-----------------------|---------|----------|---------|---------|-------------------
File                   | % Stmts | % Branch | % Funcs | % Lines | Uncovered Line #s 
-----------------------|---------|----------|---------|---------|-------------------
All files              |     100 |    96.21 |     100 |     100 |                   
 src                   |     100 |      100 |     100 |     100 |                   
  basics-import.js     |     100 |      100 |     100 |     100 |                   
  basics.js            |     100 |      100 |     100 |     100 |                   
  bytes.js             |     100 |      100 |     100 |     100 |                   
  cid.js               |     100 |      100 |     100 |     100 |                   
  index.js             |     100 |      100 |     100 |     100 |                   
  legacy.js            |     100 |      100 |     100 |     100 |                   
  varint.js            |     100 |      100 |     100 |     100 |                   
 src/bases             |     100 |      100 |     100 |     100 |                   
  base.js              |     100 |      100 |     100 |     100 |                   
  base16.js            |     100 |      100 |     100 |     100 |                   
  base32.js            |     100 |      100 |     100 |     100 |                   
  base58.js            |     100 |      100 |     100 |     100 |                   
  base64-import.js     |     100 |      100 |     100 |     100 |                   
  base64.js            |     100 |      100 |     100 |     100 |                   
 src/codecs            |     100 |      100 |     100 |     100 |                   
  codec.js             |     100 |      100 |     100 |     100 |                   
  json.js              |     100 |      100 |     100 |     100 |                   
  raw.js               |     100 |      100 |     100 |     100 |                   
 src/hashes            |     100 |      100 |     100 |     100 |                   
  digest.js            |     100 |      100 |     100 |     100 |                   
  hasher.js            |     100 |      100 |     100 |     100 |                   
  sha2.js              |     100 |      100 |     100 |     100 |                   
 test                  |     100 |    91.98 |     100 |     100 |                   
  test-bytes.js        |     100 |      100 |     100 |     100 |                   
  test-cid.js          |     100 |    94.74 |     100 |     100 | 26-30,36-39       
  test-legacy.js       |     100 |       85 |     100 |     100 | 16-20             
  test-multibase.js    |     100 |    91.67 |     100 |     100 | 17-20             
  test-multicodec.js   |     100 |       80 |     100 |     100 | 11-15             
  test-multihash.js    |     100 |     92.5 |     100 |     100 | 30-34             
 test/fixtures         |     100 |      100 |     100 |     100 |                   
  invalid-multihash.js |     100 |      100 |     100 |     100 |                   
  valid-multihash.js   |     100 |      100 |     100 |     100 |                   
-----------------------|---------|----------|---------|---------|-------------------
ERROR: Coverage for branches (96.21%) does not meet global threshold (100%)

Only places it seems to complain about are the throw statements in testThrow functions that I am suspecting is some problem with c8 and I was not able to figure out how to fix. Since you are familiar with its quirks, would you mind taking a peek please ?

I have also moved block.js from here to a separate #39 so it does not need to block this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants