chore: gateway stores singular event batches #3256

Sidddddarth · 2023-04-29T10:35:48Z

Description

The gateway unpacks each incoming batch of events and stores every event as its own single-event batch.

gateway receives:

{
  "batch": [
    {event-1},
    {event-2}
  ]
}

gateway stores:

jobid - 1:
{
  "batch": [
    {event-1}
  ]
}

jobid - 2:
{
  "batch": [
    {event-2}
  ]
}

This is an alternative to another PR, in which the gateway simply stores each singular event in the gateway table.
The current approach avoids changes in the processor and does not bear the burden of transitioning from the previous behaviour, which is evident in the other PR.
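A minimal sketch of the splitting described above, assuming the request body has the `{"batch": [...]}` shape shown earlier. The `batchPayload` type and `splitBatch` helper are hypothetical names used for illustration only; they are not taken from the PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// batchPayload mirrors the {"batch": [...]} shape from the description.
// Events are kept as raw JSON so their contents pass through untouched.
type batchPayload struct {
	Batch []json.RawMessage `json:"batch"`
}

// splitBatch is a hypothetical helper: it returns one single-event
// batch payload per event found in the incoming request body.
func splitBatch(body []byte) ([][]byte, error) {
	var in batchPayload
	if err := json.Unmarshal(body, &in); err != nil {
		return nil, fmt.Errorf("invalid JSON: %w", err)
	}
	out := make([][]byte, 0, len(in.Batch))
	for _, event := range in.Batch {
		// Wrap each event in its own batch of length one.
		single, err := json.Marshal(batchPayload{Batch: []json.RawMessage{event}})
		if err != nil {
			return nil, err
		}
		out = append(out, single)
	}
	return out, nil
}

func main() {
	payloads, err := splitBatch([]byte(`{"batch":[{"id":"event-1"},{"id":"event-2"}]}`))
	if err != nil {
		panic(err)
	}
	for i, p := range payloads {
		fmt.Printf("jobid - %d: %s\n", i+1, p)
	}
}
```

Because each stored payload keeps the `batch` wrapper, a consumer that already expects a batch can read it without changes, which matches the motivation given above.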

Notion Ticket

relevant slack thread

Security

The code changed/added as part of this pull request won't create any security issues with how the software is being used.

codecov · 2023-05-02T14:03:02Z

Codecov Report

Patch coverage: 67.45% and project coverage change: +0.11 🎉

Comparison is base (3f88f50) 68.46% compared to head (619e20e) 68.58%.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #3256      +/-   ##
==========================================
+ Coverage   68.46%   68.58%   +0.11%     
==========================================
  Files         329      329              
  Lines       52742    52778      +36     
==========================================
+ Hits        36109    36196      +87     
+ Misses      14301    14252      -49     
+ Partials     2332     2330       -2

Impacted Files	Coverage Δ
jobsdb/jobsdb.go	`74.61% <44.61%> (+1.03%)`	⬆️
gateway/gateway.go	`75.98% <78.88%> (-0.29%)`	⬇️
gateway/configuration.go	`96.42% <100.00%> (+0.06%)`	⬆️
processor/processor.go	`87.53% <100.00%> (+0.86%)`	⬆️
processor/worker.go	`94.11% <100.00%> (+15.68%)`	⬆️

... and 8 files with indirect coverage changes


gateway/gateway.go

lvrach · 2023-05-03T09:40:58Z

gateway/gateway.go

+			EventCount:   1,
+			WorkspaceId:  workspaceId,


Should we also populate the payload size field?

Suggested change

-			EventCount:   1,
-			WorkspaceId:  workspaceId,
+			EventCount:   1,
+			PayloadSize:  len(payload),
+			WorkspaceId:  workspaceId,

Also, we might want to populate the userID of the payload instead of firstUserID, although that breaks compatibility with our current approach and might introduce performance downsides.

True.
We consider each individual event's userID for suppression purposes anyway.
Could you elaborate on the performance downsides? I don't see a drastic effect from this at all.

adopted👍

gateway/gateway.go

atzoum · 2023-05-08T06:28:39Z

Would it be easy to introduce this behaviour behind a configuration flag (enabled by default), i.e. be able to disable it in case we observed any undesired side-effects?
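One way such a flag could look, as a rough sketch only. The `gatewayConfig` type, the `storeSingularBatches` field, and the `groupEvents` helper are all assumed names for illustration; the thread does not say how the flag was actually wired.

```go
package main

import "fmt"

// gatewayConfig is a hypothetical stand-in for the gateway's settings.
type gatewayConfig struct {
	// storeSingularBatches is an assumed flag name. true mirrors the
	// "enabled by default" behaviour suggested in the comment.
	storeSingularBatches bool
}

// defaultGatewayConfig returns the configuration with the new behaviour on.
func defaultGatewayConfig() gatewayConfig {
	return gatewayConfig{storeSingularBatches: true}
}

// groupEvents decides how events are grouped before storage: one batch per
// event when the flag is on, otherwise the whole request as a single batch.
func groupEvents(cfg gatewayConfig, events []string) [][]string {
	if !cfg.storeSingularBatches {
		return [][]string{events}
	}
	batches := make([][]string, 0, len(events))
	for _, e := range events {
		batches = append(batches, []string{e})
	}
	return batches
}

func main() {
	events := []string{"event-1", "event-2"}
	fmt.Println(len(groupEvents(defaultGatewayConfig(), events)))
	fmt.Println(len(groupEvents(gatewayConfig{storeSingularBatches: false}, events)))
}
```

Keeping the decision in one function means the flag can be flipped to restore the previous grouping without touching the storage path.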

gateway/gateway.go

lvrach · 2023-05-17T07:14:20Z

jobsdb/jobsdb.go

+	var (
+		err error
+		res map[uuid.UUID]string
+	)
+	storeCmd := func() error {
+		command := func() interface{} {
+			dsList := jd.getDSList()
+			res, err = jd.internalStoreEachBatchRetryInTx(ctx, tx.Tx(), dsList[len(dsList)-1], jobBatches)
+			return res
+		}
+		res, _ = jd.executeDbRequest(newWriteDbRequest("store_each_batch_retry", nil, command)).(map[uuid.UUID]string)
+		return err
+	}
+	if tx.storeSafeTxIdentifier() != jd.Identifier() {
+		_ = jd.inStoreSafeCtx(ctx, storeCmd)
+		return res, err
+	}
+	_ = storeCmd()
+	return res, err


How does the error handling work in this part of the code?

As per my understanding, we always return nil and ignore errors.

The error returned by the call below is assigned to the outer err (line 2999), which is what the function returns:
res, err = jd.internalStoreEachBatchRetryInTx(ctx, tx.Tx(), dsList[len(dsList)-1], jobBatches)
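The pattern under discussion can be reduced to a small sketch: a closure assigns to variables declared in the enclosing function, so the error survives even though the executor only hands back an `interface{}`. The `execute`, `store`, and `doStore` functions below are hypothetical simplifications, not the jobsdb code.

```go
package main

import (
	"errors"
	"fmt"
)

// execute stands in for a request executor that only returns the
// command's result as interface{}, with no error return of its own.
func execute(command func() interface{}) interface{} {
	return command()
}

// doStore is a hypothetical storage call that can fail.
func doStore(fail bool) (map[string]string, error) {
	if fail {
		return nil, errors.New("copy in failed")
	}
	return map[string]string{"job-1": ""}, nil
}

// store mirrors the shape of the diff: err and res are declared once,
// and the closure writes to them with "=" rather than ":=".
func store(fail bool) (map[string]string, error) {
	var (
		err error
		res map[string]string
	)
	command := func() interface{} {
		// Assigns to the outer err and res; no new variables are created.
		res, err = doStore(fail)
		return res
	}
	res, _ = execute(command).(map[string]string)
	return res, err
}

func main() {
	_, err := store(true)
	fmt.Println(err != nil)
	res, err := store(false)
	fmt.Println(len(res), err)
}
```

The error is carried by variable capture rather than by a return value, which is easy to misread in review. Using `:=` inside the closure would shadow `err` and silently drop the failure.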

gateway/gateway.go

jobsdb/jobsdb.go

atzoum · 2023-05-19T06:27:39Z

jobsdb/jobsdb.go

+	if err != nil {
+		return failAll(err), nil
+	}
+	jd.logger.Errorf("Copy In command failed with error %v", err)


do we need this?

gateway/gateway.go

atzoum · 2023-05-19T08:41:18Z

jobsdb/jobsdb.go

+	if err != nil {
+		return failAll(err), nil
+	}
+	err = jd.internalStoreJobsInTx(ctx, tx, ds, lo.Flatten(jobBatches))


One side effect of using this here is that the store_jobs metric will be published in addition to store_jobs_retry_each.

Thankfully, this only happens where the table_prefix is gateway, so we can excuse it (at least when inferring job storage times from the stats, although the redundant stats would still exist).
We could pull the stat out of internalStoreJobsInTx and have the caller publish it (maybe via a wrapper?).

Just realised I could simply use jd.doStoreJobsInTx(ctx, tx, ds, jobList) instead of jd.internalStoreJobsInTx.
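The idea of keeping the metric in a wrapper and calling the inner function directly can be sketched as follows. The `store` type and its methods are hypothetical; only the metric names store_jobs and store_jobs_retry_each come from the thread.

```go
package main

import "fmt"

// store is a hypothetical stand-in that records which metrics were published.
type store struct {
	published []string
}

func (s *store) publish(name string) {
	s.published = append(s.published, name)
}

// doStoreJobs performs the write only; it publishes no metric.
func (s *store) doStoreJobs(jobs []string) error {
	return nil // the actual write is omitted in this sketch
}

// storeJobs is the instrumented wrapper used by regular callers.
func (s *store) storeJobs(jobs []string) error {
	defer s.publish("store_jobs")
	return s.doStoreJobs(jobs)
}

// storeEachBatchRetry publishes its own metric and calls the inner
// function directly, so store_jobs is not published a second time.
func (s *store) storeEachBatchRetry(batches [][]string) error {
	defer s.publish("store_jobs_retry_each")
	var all []string
	for _, b := range batches {
		all = append(all, b...)
	}
	return s.doStoreJobs(all)
}

func main() {
	s := &store{}
	_ = s.storeEachBatchRetry([][]string{{"job-1"}, {"job-2"}})
	fmt.Println(s.published)
}
```

With this split, each entry point publishes exactly one metric, which avoids the redundant stat described above.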

fracasula

I'm not too comfortable with code that accesses elements of slices returned by other functions without checking their length, but if you're positive there's always at least one element when err is not nil then go ahead.

i.e. when you do jobIDReqMap[jobData.jobs[0].UUID] = req and jobData is returned by gateway.getJobDataFromRequest(req).

gateway/gateway_test.go

Co-authored-by: Francesco Casula <fracasula@users.noreply.github.com>

Sidddddarth · 2023-05-29T06:28:44Z

I'm not too comfortable with code that accesses elements of slices returned by other functions without checking their length, but if you're positive there's always at least one element when err is not nil then go ahead.

i.e. when you do jobIDReqMap[jobData.jobs[0].UUID] = req and jobData is returned by gateway.getJobDataFromRequest(req).

True.
Returning Invalid JSON if length is 0.

github-actions bot added server-team with tests labels Apr 29, 2023

Sidddddarth changed the title ~~Chore.gw store singular batch~~ chore: gateway stores singular event batches Apr 29, 2023

Sidddddarth force-pushed the chore.gwStoreSingularBatch branch 3 times, most recently from 5e0c2dc to 2ff4893 Compare May 2, 2023 12:11

Sidddddarth force-pushed the chore.gwStoreSingularBatch branch 2 times, most recently from cd416aa to 59989e2 Compare May 3, 2023 08:16

Sidddddarth marked this pull request as ready for review May 3, 2023 09:04

Sidddddarth requested review from fracasula and atzoum May 3, 2023 09:26

lvrach reviewed May 3, 2023

Sidddddarth force-pushed the chore.gwStoreSingularBatch branch from 59989e2 to 1fe9d9f Compare May 4, 2023 10:47

atzoum reviewed May 8, 2023

Sidddddarth force-pushed the chore.gwStoreSingularBatch branch 2 times, most recently from ecf9b40 to 2d3c18b Compare May 16, 2023 10:55

lvrach reviewed May 17, 2023

Sidddddarth force-pushed the chore.gwStoreSingularBatch branch 4 times, most recently from 1ea52e8 to bdc167a Compare May 19, 2023 06:07

atzoum reviewed May 19, 2023

atzoum reviewed May 19, 2023

atzoum reviewed May 19, 2023

atzoum reviewed May 19, 2023

Sidddddarth force-pushed the chore.gwStoreSingularBatch branch from c033510 to afb08aa Compare May 19, 2023 12:55

atzoum approved these changes May 24, 2023

fracasula approved these changes May 24, 2023

Sidddddarth and others added 9 commits May 29, 2023 11:33

chore: gateway stores singular event batches

92d75e9

chore: don't use map

352c677

store each batch retry in tx

db564b4

fixup! store each batch retry in tx

d7e5780

fixup! store each batch retry in tx

ec8ffea

fixup! store each batch retry in tx

c57b915

fixup! store each batch retry in tx

bf66944

Update gateway/gateway_test.go

2f3ad5c

Co-authored-by: Francesco Casula <fracasula@users.noreply.github.com>

fixup! Update gateway/gateway_test.go

8421711

Sidddddarth force-pushed the chore.gwStoreSingularBatch branch from 38e4f69 to 8421711 Compare May 29, 2023 06:04

fixup! Update gateway/gateway_test.go

619e20e

Sidddddarth merged commit 1ccec6e into master May 29, 2023
30 checks passed

Sidddddarth deleted the chore.gwStoreSingularBatch branch May 29, 2023 07:27

rudder-server-bot mentioned this pull request Jun 16, 2023

chore: prerelease 1.10.0-rc.1 #3506

Merged

devops-github-rudderstack mentioned this pull request Jun 16, 2023

chore: release 1.10.0 #3513

Merged

This was referenced Jun 21, 2023

chore: prerelease 1.10.0-rc.2 #3533

Merged

chore: prerelease 1.10.0-rc.3 #3544

Merged
