Skip to content
This repository has been archived by the owner on Mar 9, 2021. It is now read-only.

kv cleanups #28

Merged
merged 6 commits into from
Apr 3, 2020
Merged

kv cleanups #28

merged 6 commits into from
Apr 3, 2020

Conversation

phritz
Copy link
Contributor

@phritz phritz commented Apr 2, 2020

this is the first batch of kv cleanups. note there are separate logical commits.

in the last commit to make construction of maps more efficient (stop copying noms maps to new noms maps) i uncovered several bugs related to newlines in encoded values, and fixed them. eg we were including newlines in the Operation.Value when generating a diff, checksumming over the newline byte, etc.

Progress towards rocicorp/replicache#22

@vercel
Copy link

vercel bot commented Apr 2, 2020

This pull request is being automatically deployed with ZEIT Now (learn more).
To see the status of your deployment, click below or on the icon next to each commit.

🔍 Inspect: https://zeit.co/rocicorp/diff-server/718snyl1c
✅ Preview: https://diff-server-git-kv.rocicorp.now.sh

Copy link
Contributor

@aboodman aboodman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK I didn't look at the tests closely, but red flag on the stuff with canonicalization and correctness. Maybe we should just bite the bullet and write the canonicalizer now. Or re-canon in Remove and Diff ?

}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
if tt.wantPanic {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test you got the panic you were expecting? There are lots of reasons things might panic (I've actually gotten bitten by tests like this before).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

kv/map.go Outdated
@@ -16,31 +18,55 @@ import (
type Map struct {
noms types.ValueReadWriter
nm types.Map
c Checksum
Sum Checksum
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oooh a totally public field. Bold. Daring. But this implies caller is allowed to set Sum to some totally different value. That doesn't seem like what we want.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're right. done.

kv/map.go Outdated
func (m Map) Empty() bool {
return m.nm.Empty()
}

func bytesFromNomsValue(value types.Valuable) ([]byte, error) {
// Here we could check value.Kind() if we wanted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ToJSON does that error handling for us, so this comment can go away.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

kv/map.go Outdated Show resolved Hide resolved
kv/map.go Outdated
return me.Build()
}

// WrapMap wraps a noms Map with additional logic replicache needs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// WrapMap wraps a noms Map with additional logic replicache needs.
// FromNoms creates a map from an existing Noms Map and Checksum.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

kv/map.go Show resolved Hide resolved
kv/map.go Show resolved Hide resolved
kv/map.go Outdated
if err != nil {
return Map{}, fmt.Errorf("failed to convert value for key %s to json", k)
}
if err := me.SetCanonicalized(k, canonicalizedValue); err != nil {
Copy link
Contributor

@aboodman aboodman Apr 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this function and SetCanonicalized can go away:

  • Change definition of types.ClientViewResponse.ClientView to map[string]json.RawMessage
  • When processing clientView iterate the decoded map[string]json.RawMessage and call editor.Set repeatedly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no performance benefit that I can see of current method to what I proposed above, and it reuses more code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

kv/map.go Show resolved Hide resolved
kv/patch.go Show resolved Hide resolved
@@ -49,6 +75,7 @@ func (m Map) NomsMap() types.Map {
}

// Get returns the json bytes for the given key, which must exist.
// Note the json bytes gotten include a trailing newline.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry I totally forgot that our ToJSON canonicalized. In that case, I agree with you that that function and kv.Map should be committing to return canonical JSON, including stripping the newline. So sorry.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok will put this in a followup

func (me *MapEditor) Set(key string, value []byte) error {
var v interface{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not rely on the fact that Get() canonicalizes?

The implementation of MapEditor.Set could be:

  1. Call nomsjson.FromJSON to decode to a Noms value
  2. Call nomsjson.ToJSON to get back a canonical bytearray
  3. Compute the checksum
  4. Store the Noms value

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because you have no guarantee that that noms value from step 1 is going to be identical to the noms value you'd get if you parsed the canonical json from step 2. in fact, it will not be identical in many cases eg if the original value has unnormalized strings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i added a comment to this effect. this stuff is trickier than it looks.

kv/map.go Outdated
}

// SetCanonicalized sets the value for a given key. It assumes value is canonicalized.
func (me *MapEditor) SetCanonicalized(key string, value []byte) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can go away. See comment at NewMapFromPile.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

kv/map.go Outdated
Sum Checksum
}

// NewMap returns a new Map. This constructor will panic if there is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is only used by tests, and is inefficient due to the copy from string to []byte. Maybe just move it into a test file to avoid having it accidentally be considered a part of the real interface.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its used by several packages so i renamed it to NewMapForTest and placed it in a separate file called testing.go.

kv/map.go Outdated
if err != nil {
return Map{}, fmt.Errorf("failed to convert value for key %s to json", k)
}
if err := me.SetCanonicalized(k, canonicalizedValue); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no performance benefit that I can see of current method to what I proposed above, and it reuses more code.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants