Skip to content
This repository has been archived by the owner on Jul 15, 2018. It is now read-only.

Sdk2 kvpair #102

Merged
merged 2 commits into from
Dec 17, 2017
Merged

Sdk2 kvpair #102

merged 2 commits into from
Dec 17, 2017

Conversation

jaekwon
Copy link
Contributor

@jaekwon jaekwon commented Dec 17, 2017

Instead of defining your own KVPair, always use this one.

@jaekwon jaekwon changed the base branch from sdk2 to sdk2-nilkeys December 17, 2017 04:53
@jaekwon jaekwon changed the base branch from sdk2-nilkeys to sdk2 December 17, 2017 04:54
// {"D4_a__1=", url},
// {"D4/a++1=", std},
// {"D4_a0-9=", url},
// {"Finey-1=", url},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a second I thought this was an ode to Hal Finney but then noticed it was just one n :)

Copy link
Contributor

@odeke-em odeke-em left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @jaekwon for this change, I have posted some thoughts and suggestions. I could definitely be wrong but I think we can simplify Byte by implementing HexByte, Base64Byte etc, instead of HexEncoder, Base64Encoder etc, and I show the problem with an example. Please take a look.

common/bytes.go Outdated
// }
type ByteEncoder interface {
Marshal(bytes []byte) ([]byte, error)
Unmarshal(dst *[]byte, src []byte) error
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pros: Good work on the Unmarshal! Your signature will help gut out allocations and byte slice copies everytime, by enabling byte slice reuse.

var sav []byte
if err := cmn.ByteCodec.Unmarshal(&sav, data); err != nil {
}
// More stuff
if err := cmn.ByteCodec.Unmarshal(&sav, pipedIn); err != nil {
}

as opposed to the traditional style as from the Go stdlib which will always allocate a slice no matter what

var sav []byte
blob, err := cmn.ByteCodec.Unmarshal(data)
if err != nil {
}
copy(sav, blob)
// More stuff
if err := cmn.ByteCodec.Unmarshal(&sav, pipedIn); err != nil {
}

However, the user doesn't need to worry about pre-allocating a slice before Unmarshaling.

Cons:

  • With the commendation above though, I haven't yet seen any codec employing byte slice reuse. HexEncoder does *dst, err = hex.DecodeString which still performs the byte slice allocation, assigning to *dst and completely not reusing the preallocated slice(if that was the intention of the user).
  • The new signature if employing byte slice reuse makes it so that the user has to think about how many bytes that they are expecting. If we aren't going to be resuing pre-allocated slices, perhaps let's just revert to the old style?

Perhaps in the future, let's work on fulfilling the purpose of that signature (*[]byte, []byte). The approach is still commendable though.

common/bytes.go Outdated
Unmarshal(dst *[]byte, src []byte) error
}

// hexEncoder implements ByteEncoder encoding the slice as a hexidecimal
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Encoder/Codec/g as we do both Encoding and Decoding

common/bytes.go Outdated

// String gets a simple string for printing (the json output minus quotes)
func (b Bytes) String() string {
raw, err := encoder.Marshal(b)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that this method is fixated on the json encoder.

common/bytes.go Outdated
return encoder.Marshal(b)
}

func (b *Bytes) UnmarshalJSON(data []byte) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely could be wrong but here are some thoughts:
I am not convinced that we should have a default codec and then use it in here magically on behalf of the user.
Why not let the user call say HexCodec.UnmarshalJSON(data)? What happens when we have two different programs that each import this package but might want to differently encode Byte?

For example if I have a struct with the address stored in hex but the digest, stored in base64

type Data struct {
    Address cmn.Byte `json:"addr"` // <- to be stored in hex
    PassphraseDigest cmn.Byte `json:"pass_addr"` // <-- to be stored in Base64
}

then oops in order to successfully encode/decode that struct, I will have to implement a custom MarshalJSON, UnmarshalJSON that has to set the codec after each field has been traversed or invoke a separate codec for each field. Moreover this most likely would be done concurrently encoding and decoding other structs. It also would require the user to be an advanced Go user to successfully use the data types thus a higher mental load on the user. For example, this is what I think it will take for a user to successfully marshal the struct defined above that has fields that both are of type cmn.Bytes but they want the representations to be different on the wire:

func (d *Data) MarshalJSON() ([]byte, error) {
   var err error
   repr := make(map[string]interface{})
   if repr["addr"], err = cmn.HexCodec.MarshalJSON(repr.Address); err != nil {
      return nil, err
   }
   if repr["pass_digest"], err = cmn.Base64Codec.MarshalJSON(repr.PassphraseDigest); err != nil {
      return nil, err
   }
   // Then for the other fields of the Data struct
   overallMap := make(map[string]interface{})
   overallBlob, err := json.Marshal(d)
   if err != nil {
      return nil, err
   }
   if err := json.Unmarshal(overallBlob, &overallMap); err != nil {
      return nil, err
   }
   // Now overwrite the keys of the custom base64 and hex keys encoded differently together
   for k, v := range repr {
       overallMap[k] = v
   }
   return json.Marshal(overallMap)
}

and notice how that requires advanced knowledge of the encoding/json package but also that the complexity increases with every single custom field added plus it'll do double work in the step of encoding the non-cmn.Bytes fields

Proposed change

I defined separate types with different codecs, I now only need to define the struct as:

type Data struct {
   Address cmn.HexBytes `json:"addr"`
   PassphraseDigest cmn.Base64Bytes `json:"pass_digest"`
}

Instead of implementing *Coder, perhaps we couldimplement different *Byte types, each with its respective Marshal and String methods. This way the user doesn't have to first set a global codec everytime, plus they now have better control in structs.

type JSONBytes []byte
var _ json.Codec = (*JSONBytes)(nil)
var _ fmt.Stringer = (*JSONBytes)(nil)
type HexBytes baseBytes
var _ json.Codec = (*HexBytes)(nil)
var _ fmt.Stringer = (*HexBytes)(nil)
type Base64Bytes baseBytes
var _ json.Codec = (*Base64Bytes)(nil)
var _ fmt.Stringer = (*Base64Bytes)(nil)

// Implement the respective *arshal* and String methods for each type

sm.sorted = true
}

// Returns a copy of sorted KVPairs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this really a copy ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no. good catch...

@jaekwon
Copy link
Contributor Author

jaekwon commented Dec 17, 2017

@odeke-em I agree with you that the original encoding logic was problematic in many ways. For now, all I need in common/bytes is a universal way to refer to Bytes and KVPair types to support hex encoding, so I've simplified the logic.

Much of your original comments apply to the github.com/tendermint/go-wire/data module, so please submit a new Issue on that repo with what you've written here, w/ consideration for this new cmn.Bytes.

@odeke-em
Copy link
Contributor

Cool, thank you @jaekwon for the PR and for considering those comments. For sure, I've opened tendermint/go-amino#65.

@zramsay zramsay deleted the sdk2-kvpair branch December 29, 2017 15:21
ebuchman pushed a commit that referenced this pull request Dec 31, 2017
* Canonical KVPair in common
* Simplify common/Bytes to just hex encode
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants