Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync fork with upstream #1

Merged
merged 125 commits into from
Aug 5, 2022
Merged

Sync fork with upstream #1

merged 125 commits into from
Aug 5, 2022

Conversation

itwaseasy
Copy link

@itwaseasy itwaseasy commented Jul 24, 2022

I synced our fork and upstream, but removed the following fixes that we made:

The reason is that, from my point of view, disabling these features is incorrect behavior that can lead to problems. These fixes prevent go-json from checking if a value implements the json.Marshaler interface, so it doesn't call it even if the method is present. For example, all of our go types in proto repo have this method implemented by easyjson so it can parse the data correctly. But currently, at least for Native where the bidder crashed, go-json tries to deserialize the data itself instead of proxying it to the correct method: https://github.com/remerge/openrtb-proto-go/blob/d979c2a6cdac53ce523b7ea5070dc2b7a93a2e4f/native.go#L30

Actually, the tests that fail in our fork check exactly this behavior:

go-json/decode_test.go

Lines 285 to 304 in 35e99d2

type unmarshalJSON struct {
v int
}
func (u *unmarshalJSON) UnmarshalJSON(b []byte) error {
var v int
if err := json.Unmarshal(b, &v); err != nil {
return err
}
u.v = v
return nil
}
func Test_UnmarshalJSON(t *testing.T) {
t.Run("*struct", func(t *testing.T) {
var v unmarshalJSON
assertErr(t, json.Unmarshal([]byte(`10`), &v))
assertEq(t, "unmarshal", 10, v.v)
})
}

Moreover, if we'll decide to switch other services to this library, they may not work correctly because of this. For example, we have this code in act: https://github.com/remerge/act/blob/946805336494abb77541b8b6afc22930496657ea/adobe.go#L76-L84 With this library, this code will not be called and the developer will not be aware of it at all.

Also, I don't see any performance difference before and after these changes.

- Introduced a two phase compilation to calculate Opcode index accurately
- Fix display number of Opcode
- Improve memory footprint for Opcode
Fix operation conversion for PtrHead to Head in Recursive type
…performance

Improve map encoding performance
Optimize encoding path for escaped string
Supports dynamic filtering of struct fields
@gburanov
Copy link

it would be best to ask @dynamix why he added these 2 commits...

@dynamix is it performance related?

Copy link

@dynamix dynamix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@itwaseasy - @gburanov is correct, this is performance related.

from my point of view, disabling these features is incorrect behavior that can lead to problems.
Incorrect in what sense? There is no reason a JSON decoder has to support the json.Unmarshaler interface. In our case it actually leads to the a behaviour we do not want.

What effectively happens is that the methods generated by easyjson are used for the complete decoding of the OpenRTB object graph and not go-json. Which means we don't get any of the performance gains 😭.

Regarding performance difference - what exactly did you benchmark? There is a major difference if you benchmark this with JSON representing a populated OpenRTB request decoded into our Go structs (Some old results here: https://github.com/remerge/ohoh-json#is-it-fast , unfortunately not yet finished https://github.com/remerge/ohoh-json)

The main reason this fork exists is for improvements targeted at the use case we have in the bidder - it isn't meant to be used generic library 😏.

What we can do: document the behaviour and the reasoning and fix the tests after merging with upstream.

Regarding the crash: it seems the partner sends stuff following an old standard - we haven't seen any crashes in go-json in the last 18 month. I think we rather have to fix how we handle this special case.

@itwaseasy
Copy link
Author

Incorrect in what sense? There is no reason a JSON decoder has to support the json.Unmarshaler interface. In our case it actually leads to the a behaviour we do not want.

In the sense that upstream declares this library as a "drop-in replacement of encoding/json", and in our case it's not. But yes, if this library is supposed to be used only in bidder, then this is understandable.

Regarding performance difference - what exactly did you benchmark? There is a major difference if you benchmark this with JSON representing a populated OpenRTB request decoded into our Go structs (Some old results here: https://github.com/remerge/ohoh-json#is-it-fast , unfortunately not yet finished https://github.com/remerge/ohoh-json)

I used the benchmarks that are in the library.

What we can do: document the behaviour and the reasoning and fix the tests after merging with upstream.

OK, I'll revert the changes and remove these tests because they test logic that is missed in our fork.

@itwaseasy itwaseasy requested a review from dynamix August 4, 2022 12:33
@itwaseasy
Copy link
Author

itwaseasy commented Aug 4, 2022

Well, I disabled a few thousand lines of tests because they are all based on the Marshaler/Unmarshaler interface, please take a look.

@gburanov
Copy link

gburanov commented Aug 4, 2022

I think if it is just tests it is fine

@itwaseasy itwaseasy merged commit 18cfc43 into master Aug 5, 2022
@itwaseasy itwaseasy deleted the 5718-sync-upstream branch August 5, 2022 08:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

7 participants