This repository has been archived by the owner on Nov 1, 2023. It is now read-only.

Re-think what we include in our events #2937

Open
tevoinea opened this issue Mar 22, 2023 · 0 comments
Labels
enhancement New feature or request

Comments

@tevoinea
Member

tevoinea commented Mar 22, 2023

For the second time, we've hit an issue where the data we're trying to send through Azure Queue is too large. The first time (#2742, #2788), we worked around it with ITruncatable, and that worked OK, but we probably don't want to be implementing ITruncatable for everything.

The problem

Many of the events we publish contain TaskConfig/JobConfig/rendered crash reports which can all contain lots of data. We also include the job id/task id.

If we keep only the job id/task id and expect recipients to query the state when they receive the webhook, information may be lost.

For example:

  1. Task starts -> in the db: task id: 123abc, state: started, other info: ...
  2. Send TaskStarted webhook (task id: 123abc)
  3. User receives the webhook
  4. The task crashes and we update the state in the db to Failed
  5. User queries the task id and sees the state Failed

In that example, the db state for the task at step 1 is lost forever: by the time the user queries, the row has already been overwritten, so they can't check any state related to the task as it was when it started.
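
To make the race concrete, here is a minimal sketch in Python (not OneFuzz code). The in-memory `db` dict, the `IdOnlyEvent` and `SnapshotEvent` types, and the helper functions are all hypothetical stand-ins. It contrasts an event that carries only the task id with one that also carries the small, bounded state value captured at publish time.

```python
from dataclasses import dataclass

# Hypothetical in-memory stand-in for the task table.
db = {}


@dataclass(frozen=True)
class IdOnlyEvent:
    """Event that carries only the identifier; the recipient must query for state."""
    task_id: str


@dataclass(frozen=True)
class SnapshotEvent:
    """Event that also carries the state as it was when the event was published."""
    task_id: str
    state: str


def start_task(task_id):
    # Step 1: record the task as started.
    db[task_id] = {"state": "started"}
    # Step 2: publish the event (both variants, for comparison).
    return IdOnlyEvent(task_id), SnapshotEvent(task_id, db[task_id]["state"])


def fail_task(task_id):
    # Step 4: the task crashes and the row is overwritten in place.
    db[task_id]["state"] = "failed"


id_only, snapshot = start_task("123abc")
fail_task("123abc")

# Step 5: the recipient of the id-only event queries and sees the later state.
queried_state = db[id_only.task_id]["state"]

print(queried_state)   # state seen by querying after the crash
print(snapshot.state)  # state preserved inside the event itself
```

The id-only recipient observes only the latest state, while the snapshot event still reports the state at publish time. Keeping a small, fixed-size snapshot (such as the state value) in the event is one possible middle ground between sending everything and sending only ids.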

The solution

I created this issue so we can brainstorm solutions.

The only requirement for a solution is that if we choose to continue using Azure Queue, we need some reasonable expectation that the events don't have unbounded size. For example, we know GUIDs and task states have a limited length when they're serialized.
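
As a rough illustration of what "bounded size" could look like, the sketch below (Python, not OneFuzz code) computes the worst-case serialized size of an event made only of GUIDs and an enum state, and compares it with the 64 KiB maximum message size documented for Azure Queue storage. The `TaskState` values, the `TaskStateEvent` shape, and the JSON field names are assumptions made for illustration. The base64 step is also an assumption: it models clients that base64-encode message bodies, which inflates the payload by about a third.

```python
import base64
import json
import uuid
from dataclasses import dataclass
from enum import Enum

# Azure Queue storage documents a 64 KiB maximum message size.
MAX_QUEUE_MESSAGE_BYTES = 64 * 1024


class TaskState(Enum):
    # Hypothetical state names; the real set may differ.
    STARTED = "started"
    RUNNING = "running"
    FAILED = "failed"
    STOPPED = "stopped"


@dataclass(frozen=True)
class TaskStateEvent:
    # Only fixed-length or enumerable fields, so the size is bounded.
    job_id: uuid.UUID
    task_id: uuid.UUID
    state: TaskState


def serialize(event: TaskStateEvent) -> bytes:
    payload = {
        "job_id": str(event.job_id),
        "task_id": str(event.task_id),
        "state": event.state.value,
    }
    return json.dumps(payload).encode("utf-8")


def worst_case_encoded_size() -> int:
    # Every UUID serializes to 36 characters, so the size varies only
    # with the state name; take the maximum over all states.
    sample = uuid.UUID(int=0)
    return max(
        len(base64.b64encode(serialize(TaskStateEvent(sample, sample, state))))
        for state in TaskState
    )


if __name__ == "__main__":
    size = worst_case_encoded_size()
    print(f"worst-case encoded size: {size} bytes")
    print(f"fits in one queue message: {size <= MAX_QUEUE_MESSAGE_BYTES}")
```

A check like this could run as a unit test per event type, so that adding an unbounded field (a config blob, a rendered report) to an event fails fast instead of surfacing as a queue error in production.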

AB#45326
