Refactor FPD Merge Functions #3601

SyntaxNode · 2024-03-28T05:54:34Z

Replaces the FPD merge methods with a smarter implementation.

With the goal of parsing the FPD overlay JSON only once, the existing algorithm cloned the entire app / site / user object before unmarshalling the overlay. A clone is necessary because we share object references between bidder requests to save on allocations, so we need to clone a pointer field before making a bidder-specific change to it.

Instead of cloning the entire object up front, we can save on allocations and processing time in most cases by tying into the unmarshal process and making a clone as needed. This new approach (now possible with the move to json-iter) also builds ext merging into the unmarshal process and eliminates the need for the extMerger helper.
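To make the clone-as-needed idea concrete, here is a minimal sketch. It is not the PR's implementation: it uses the standard library's encoding/json rather than json-iter decoder hooks, and it decodes the overlay in two passes rather than one. The App and Publisher types and the mergeCloneApp helper are hypothetical stand-ins invented for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Publisher and App are hypothetical, trimmed-down stand-ins for the
// openrtb2 types; they exist only for this illustration.
type Publisher struct {
	ID   string `json:"id,omitempty"`
	Name string `json:"name,omitempty"`
}

type App struct {
	ID        string     `json:"id,omitempty"`
	Publisher *Publisher `json:"publisher,omitempty"`
}

// mergeCloneApp is a hypothetical helper. It applies overlay to app and
// clones the Publisher pointer only when the overlay actually touches it,
// so an object shared with other bidder requests is never mutated.
func mergeCloneApp(app *App, overlay []byte) error {
	var fields map[string]json.RawMessage
	if err := json.Unmarshal(overlay, &fields); err != nil {
		return err
	}
	if raw, ok := fields["id"]; ok {
		if err := json.Unmarshal(raw, &app.ID); err != nil {
			return err
		}
	}
	if raw, ok := fields["publisher"]; ok {
		var pub Publisher
		if app.Publisher != nil {
			pub = *app.Publisher // clone before the bidder-specific change
		}
		p := &pub
		// Decoding into the clone overwrites only the fields present
		// in the overlay; a JSON null sets p to nil.
		if err := json.Unmarshal(raw, &p); err != nil {
			return err
		}
		app.Publisher = p
	}
	return nil
}

func main() {
	shared := &Publisher{ID: "pub-1", Name: "Original"}
	request := App{ID: "app-1", Publisher: shared}

	bidderApp := request // shallow copy: the Publisher pointer is still shared
	if err := mergeCloneApp(&bidderApp, []byte(`{"publisher":{"name":"Overlay"}}`)); err != nil {
		panic(err)
	}

	fmt.Println(request.Publisher.Name)   // Original
	fmt.Println(bidderApp.Publisher.Name) // Overlay
	fmt.Println(bidderApp.Publisher.ID)   // pub-1
}
```

When the overlay does not mention publisher, no clone is made at all, which is where the allocation savings in the common case would come from.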

Benchmarks:

Smallest:
App + ID, ID Overwritten
BenchmarkAppCopy-12         976220   1073 ns/op    872 B/op   14 allocs/op
BenchmarkAppMergeClone-12  1706986  700.8 ns/op    560 B/op   11 allocs/op

Small:
App - ID + Publisher, ID + Publisher Overwritten
BenchmarkAppCopy-12        1883467  664.5 ns/op    872 B/op    8 allocs/op
BenchmarkAppMergeClone-12  2491872  517.9 ns/op    576 B/op    6 allocs/op

Medium / Average Case:
App - Some Set, ID + Publisher Overwritten
BenchmarkAppCopy-12          95799  12359 ns/op   8708 B/op  164 allocs/op
BenchmarkAppMergeClone-12   449222   2645 ns/op   1856 B/op   44 allocs/op

Medium - All Fields Set On Request:
App - All Set, ID + Publisher Overwritten
BenchmarkAppCopy-12          58051  21394 ns/op  14798 B/op  280 allocs/op
BenchmarkAppMergeClone-12   928129   1257 ns/op   1760 B/op   34 allocs/op

Largest / Worst:
App - All Set, All Overwritten
BenchmarkAppCopy-12          34359  33525 ns/op  17426 B/op  405 allocs/op
BenchmarkAppMergeClone-12    33030  37734 ns/op  19092 B/op  435 allocs/op

The new algorithm uses less CPU time and allocates less memory in every case except the worst one. I wouldn't expect the worst case to ever occur, but I'm sharing the benchmark for full transparency. The new algorithm really shines in the medium / average cases.

guscarreon

I didn't find the benchmarks in the contributed code. Should we include them?

guscarreon · 2024-04-25T14:49:57Z

util/jsonutil/merge_test.go

+		)
+
+		err := MergeClone(imp, []byte(`{"banner":nul}`))
+		require.EqualError(t, err, "cannot unmarshal openrtb2.Imp.Banner: expect ull")


Should it read "expect null" instead of "expect ull"? Does tryExtractErrorMessage have an off-by-one bug when building the error message string?

This is not a bug in this PR. That is the error message from json-iter. This test is specifically targeting the failure branches of iter.ReadNil. I'll add a comment to clarify.

guscarreon · 2024-04-25T15:32:35Z

util/jsonutil/merge_test.go

+
+		err := MergeClone(imp, []byte(`{"banner":malformed}`))
+		require.EqualError(t, err, "cannot unmarshal openrtb2.Imp.Banner: expect { or n, but found m")
+	})


If we add this test case, it'll pass. Do we want MergeClone to have this behavior with empty, non-null arrays? To save memory, should we modify it so that imp.IframeBuster ends up nil instead?

t.Run("nil-existing-empty-incoming", func(t *testing.T) {
	var (
		imp = &openrtb2.Imp{}
	)

	err := MergeClone(imp, []byte(`{"iframeBuster":[]}`))
	require.NoError(t, err)

	assert.Equal(t, []string{}, imp.IframeBuster, "new-val")
})

No. It is proper to parse an empty JSON array to an empty slice. There is sometimes a difference between a null and an empty array.

guscarreon · 2024-04-25T16:34:16Z

util/jsonutil/merge.go

+	// token, so must be handled in this decoder.
+	if iter.ReadNil() {
+		*(*unsafe.Pointer)(ptr) = nil
+		d.mapType.UnsafeSet(ptr, d.mapType.UnsafeNew())


Is this call initializing ptr to a new empty map? If so, slices and pointers in lines 80 and 104 are simply set to nil. Should slices in line 104 be initialized too?

I do not fully understand why setting the pointer to nil isn't sufficient for maps. This is code I copied and pasted from json-iter. Perhaps it's an incorrect assumption, but I assume this is done for a reason.

guscarreon

LGTM

SyntaxNode added 4 commits March 28, 2024 01:30

MergeClone

0c6a8ce

Improved Slice Clone Avoidance Check

c016c45

Merge branch 'master' into json-merge

f3f7823

Remove Old Comments

127a78c

hhhjort requested review from VeronikaSolovei9 and guscarreon March 28, 2024 17:24

hhhjort assigned VeronikaSolovei9 and guscarreon Mar 28, 2024

bsardo assigned bsardo and VeronikaSolovei9 and unassigned VeronikaSolovei9 and bsardo Apr 1, 2024

SyntaxNode added the do not port label Apr 9, 2024

"complex" test for slice reflectutil

56887ff

VeronikaSolovei9 previously approved these changes Apr 22, 2024

guscarreon reviewed Apr 25, 2024

strange error explanation comments

48f5b8c

SyntaxNode dismissed VeronikaSolovei9’s stale review via 48f5b8c April 25, 2024 18:43

guscarreon approved these changes Apr 25, 2024

VeronikaSolovei9 approved these changes Apr 25, 2024

bsardo merged commit 36dea2c into prebid:master Apr 26, 2024
3 checks passed
