chore: gateway stores singular event batches #3256
5e0c2dc to 2ff4893
Codecov Report
Patch coverage:
@@            Coverage Diff             @@
##           master    #3256      +/-   ##
==========================================
+ Coverage   68.46%   68.58%   +0.11%
==========================================
  Files         329      329
  Lines       52742    52778      +36
==========================================
+ Hits        36109    36196      +87
+ Misses      14301    14252      -49
+ Partials     2332     2330       -2
cd416aa to 59989e2
gateway/gateway.go
Outdated
EventCount: 1,
WorkspaceId: workspaceId,
Should we also populate the payload size field?
Suggested change:
- EventCount: 1,
- WorkspaceId: workspaceId,
+ EventCount: 1,
+ PayloadSize: len(payload),
+ WorkspaceId: workspaceId,
Also, we might want to populate the userID of the payload instead of firstUserID, although that breaks compatibility with our current approach and might introduce performance downsides.
True
We consider each individual event's userID for suppression purposes anyway.
Could you elaborate on the performance downsides? I don't see a drastic effect from this at all.
adopted👍
59989e2 to 1fe9d9f
Would it be easy to introduce this behaviour behind a configuration flag (enabled by default), i.e. to be able to disable it in case we observe any undesired side effects?
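A minimal sketch of what such a flag could look like. The names `Config`, `StoreSingularBatches`, `DefaultConfig` and `batchesToStore` are hypothetical and are not taken from the gateway code; events are modelled as plain strings to keep the example short.

```go
package main

// Config is a hypothetical settings holder; the field name is an
// assumption, not the gateway's real configuration key.
type Config struct {
	StoreSingularBatches bool
}

// DefaultConfig enables the new behaviour by default, as suggested above.
func DefaultConfig() Config {
	return Config{StoreSingularBatches: true}
}

// batchesToStore returns one batch per event when the flag is on,
// and the original batch as a single unit when it is off.
func batchesToStore(cfg Config, events []string) [][]string {
	if !cfg.StoreSingularBatches {
		return [][]string{events}
	}
	out := make([][]string, 0, len(events))
	for _, e := range events {
		out = append(out, []string{e})
	}
	return out
}
```

With the flag turned off, the old single-batch path stays reachable without a redeploy of different code.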
ecf9b40 to 2d3c18b
jobsdb/jobsdb.go
Outdated
var (
	err error
	res map[uuid.UUID]string
)
storeCmd := func() error {
	command := func() interface{} {
		dsList := jd.getDSList()
		res, err = jd.internalStoreEachBatchRetryInTx(ctx, tx.Tx(), dsList[len(dsList)-1], jobBatches)
		return res
	}
	res, _ = jd.executeDbRequest(newWriteDbRequest("store_each_batch_retry", nil, command)).(map[uuid.UUID]string)
	return err
}
if tx.storeSafeTxIdentifier() != jd.Identifier() {
	_ = jd.inStoreSafeCtx(ctx, storeCmd)
	return res, err
}
_ = storeCmd()
return res, err
How does the error handling work in this part of the code?
As per my understanding, we always return nil and we are ignoring errors.
The error returned by the call below (line 2999) is assigned to err:
res, err = jd.internalStoreEachBatchRetryInTx(ctx, tx.Tx(), dsList[len(dsList)-1], jobBatches)
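To illustrate why the outer err is not lost even though the return values of the closures are discarded, here is a reduced sketch of the same pattern. `runCommand`, `doStore` and `store` are hypothetical stand-ins for `jd.executeDbRequest`, `jd.internalStoreEachBatchRetryInTx` and the reviewed function; only the closure mechanics are meant to match.

```go
package main

import "errors"

// runCommand stands in for jd.executeDbRequest: it only hands back the
// command's result, so an error has to travel some other way.
func runCommand(command func() interface{}) interface{} {
	return command()
}

// doStore is a hypothetical stand-in for the actual store call.
func doStore(fail bool) (map[string]string, error) {
	if fail {
		return nil, errors.New("store failed")
	}
	return map[string]string{"job-1": ""}, nil
}

// store mirrors the shape of the reviewed code.
func store(fail bool) (map[string]string, error) {
	var (
		err error
		res map[string]string
	)
	storeCmd := func() error {
		command := func() interface{} {
			// Plain "=" assigns to the outer res and err.
			// Using ":=" here would shadow them and drop the error.
			res, err = doStore(fail)
			return res
		}
		res, _ = runCommand(command).(map[string]string)
		return err
	}
	// The return value is discarded, but err was already set by the closure.
	_ = storeCmd()
	return res, err
}
```

The pattern works, but it depends on nobody changing `=` to `:=` inside the closure, which is probably why it reads as if errors were ignored.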
1ea52e8 to bdc167a
jobsdb/jobsdb.go
Outdated
if err != nil {
	return failAll(err), nil
}
jd.logger.Errorf("Copy In command failed with error %v", err)
do we need this?
jobsdb/jobsdb.go
Outdated
if err != nil {
	return failAll(err), nil
}
err = jd.internalStoreJobsInTx(ctx, tx, ds, lo.Flatten(jobBatches))
One side effect of using this here is that the store_jobs metric will be published in addition to store_jobs_retry_each.
Thankfully, because this only happens where the table_prefix is gateway, we can excuse this (at least when inferring job storage times from the stats, though redundant stats would still exist).
We could pull the stat part out of internalStoreJobsInTx and have the caller publish said stat (maybe a wrapper?).
Just realised, I could simply use jd.doStoreJobsInTx(ctx, tx, ds, jobList) instead of jd.internalStoreJobsInTx.
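A rough sketch of the separation discussed here, assuming the inner function does the write without any metric and the wrapper is what publishes store_jobs. The `store` type, its methods and the `stats` map are hypothetical; they only mimic the relationship between `internalStoreJobsInTx` and `doStoreJobsInTx` implied by this thread.

```go
package main

// stats counts how often each named metric is published;
// a stand-in for a real stats client.
type stats map[string]int

type store struct {
	published stats
}

// doStore performs the write without touching any metric.
func (s *store) doStore(jobs []string) error {
	return nil
}

// internalStore wraps doStore and publishes "store_jobs".
func (s *store) internalStore(jobs []string) error {
	s.published["store_jobs"]++
	return s.doStore(jobs)
}

// storeEachBatchRetry publishes only its own metric because it calls
// doStore directly instead of going through internalStore.
func (s *store) storeEachBatchRetry(batches [][]string) error {
	s.published["store_jobs_retry_each"]++
	var all []string
	for _, b := range batches {
		all = append(all, b...)
	}
	return s.doStore(all)
}
```

Calling the inner function from the retry path keeps each code path publishing exactly one metric.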
c033510 to afb08aa
I'm not too comfortable with code that accesses slice elements returned by other functions without checking the slice's length, but if you're positive there's always at least one element when err is not nil then go ahead.
i.e. when you do jobIDReqMap[jobData.jobs[0].UUID] = req and jobData is returned by gateway.getJobDataFromRequest(req).
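For reference, a defensive variant could look roughly like this. The `job` and `jobData` types and the `firstJobUUID` helper are simplified, hypothetical stand-ins for the gateway's own types; the point is only the length check before indexing.

```go
package main

import "errors"

// job and jobData are reduced stand-ins for the gateway types.
type job struct {
	UUID string
}

type jobData struct {
	jobs []job
}

var errNoJobs = errors.New("no jobs in request")

// firstJobUUID returns the UUID of the first job, or an error for an
// empty slice instead of panicking on jobs[0].
func firstJobUUID(d jobData) (string, error) {
	if len(d.jobs) == 0 {
		return "", errNoJobs
	}
	return d.jobs[0].UUID, nil
}
```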
Co-authored-by: Francesco Casula <fracasula@users.noreply.github.com>
38e4f69 to 8421711
True.
Description
The gateway opens a batch of events and stores each event inside it as its own batch.
gateway receives:
gateway stores:
This is tangential to another PR in which the gateway simply stores the singular event in the gateway table.
The current approach avoids changes in the processor and does not carry the burden of transitioning from the previous behaviour, which is evident in the other PR.
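As a rough illustration of the splitting described above, the sketch below re-wraps every event of an incoming batch in its own single-element batch. It assumes a payload of the shape `{"batch": [...]}`; the `batchPayload` type and `splitBatch` function are hypothetical names, and the real gateway carries more fields than shown here.

```go
package main

import "encoding/json"

// batchPayload mirrors an assumed request shape: {"batch": [event, ...]}.
type batchPayload struct {
	Batch []json.RawMessage `json:"batch"`
}

// splitBatch re-wraps every event of the incoming payload in its own
// single-element batch, leaving the event bodies untouched.
func splitBatch(payload []byte) ([][]byte, error) {
	var in batchPayload
	if err := json.Unmarshal(payload, &in); err != nil {
		return nil, err
	}
	out := make([][]byte, 0, len(in.Batch))
	for _, ev := range in.Batch {
		single, err := json.Marshal(batchPayload{Batch: []json.RawMessage{ev}})
		if err != nil {
			return nil, err
		}
		out = append(out, single)
	}
	return out, nil
}
```

Because each stored job keeps the batch envelope, a consumer that expects batches can keep reading them unchanged, which matches the stated goal of not touching the processor.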
Notion Ticket
relevant slack thread
Security