feat: add configurable async email delivery for workshop invitations #2592

Merged
mroderick merged 4 commits into codebar:master from mroderick:feature/delayed-email-async
Apr 25, 2026

@mroderick
Collaborator

@mroderick mroderick commented Apr 25, 2026

Summary

  • Configure ActiveJob to use DelayedJob adapter for background email processing
  • Add ApplicationJob with global error handling for failed jobs
  • Create AsyncEmailConcern for chapter-based async email feature flag
  • Update WorkshopInvitationManager to send emails asynchronously for chapters in ASYNC_EMAIL_CHAPTER_IDS

Hypothesis

DelayedJob jobs time out after 540s on Heroku when processing bulk workshop invitations. This happens because emails are sent synchronously (deliver_now) inside loops, so every email blocks the job on SMTP I/O.

At scale:

  • London chapter: 5,454 students + 1,988 coaches = 7,442 potential invitations
  • Recent workshops send 2,000-4,000 invitations
  • 4,000 × ~200ms per email = ~800 seconds > 540s timeout
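
The estimate above can be checked with a few lines of Ruby. This is only a back-of-the-envelope sketch: the ~200ms per-email figure and the 540s limit come from the description, while the constant and method names are made up for illustration.

```ruby
# Back-of-the-envelope check of the timeout hypothesis.
# Figures come from the PR description; all names are illustrative only.
JOB_TIMEOUT_SECONDS = 540
MILLISECONDS_PER_EMAIL = 200 # rough estimate of blocking SMTP I/O per email

# Total time in seconds to send `count` emails one after another.
def sync_send_seconds(count, per_email_ms = MILLISECONDS_PER_EMAIL)
  count * per_email_ms / 1000
end

# Largest number of emails that fits inside the timeout when sent synchronously.
def max_sync_emails(timeout_s = JOB_TIMEOUT_SECONDS, per_email_ms = MILLISECONDS_PER_EMAIL)
  timeout_s * 1000 / per_email_ms
end

[2_000, 4_000].each do |count|
  seconds = sync_send_seconds(count)
  verdict = seconds > JOB_TIMEOUT_SECONDS ? "exceeds" : "fits within"
  puts "#{count} emails: ~#{seconds}s, #{verdict} the #{JOB_TIMEOUT_SECONDS}s limit"
end
puts "Break-even: about #{max_sync_emails} emails"
```

Under these assumed figures the break-even point is roughly 2,700 emails per job, which is inside the 2,000-4,000 range quoted for recent workshops.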

How the Fix Works

  1. For chapters in ASYNC_EMAIL_CHAPTER_IDS, emails are enqueued with .deliver_later instead of blocking with .deliver_now
  2. Each email becomes an independent DelayedJob job queued in the database
  3. The original "send invitations" job only enqueues work, so it finishes quickly instead of taking minutes - no timeout
  4. The worker processes queued email jobs separately
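
The four steps above can be illustrated with a small plain-Ruby simulation. It does not use Rails: `FakeMail`, `JobQueue`, and `send_invitations` are hypothetical stand-ins for the mailer, the DelayedJob table, and the invitation job, and the `deliver_now`/`deliver_later` methods here only mimic the shape of the calls named above.

```ruby
# Plain-Ruby simulation of sync vs. async delivery. Nothing here is the
# project's real code: FakeMail, JobQueue and send_invitations are
# hypothetical stand-ins used only to show the control flow.

# Stands in for the jobs table: enqueueing is cheap, the work happens later.
class JobQueue
  attr_reader :jobs

  def initialize
    @jobs = []
  end

  def enqueue(&job)
    @jobs << job
  end

  # What a worker would do: run every queued job, one at a time.
  def work_off
    @jobs.shift.call until @jobs.empty?
  end
end

# Stands in for a mail message with the two delivery styles.
class FakeMail
  def initialize(recipient, outbox, queue)
    @recipient = recipient
    @outbox = outbox
    @queue = queue
  end

  # Blocks the caller until the mail is "sent".
  def deliver_now
    @outbox << @recipient
  end

  # Returns immediately; the send happens when a worker runs the job.
  def deliver_later
    @queue.enqueue { deliver_now }
  end
end

# The invitation loop: async chapters enqueue, everyone else blocks.
def send_invitations(recipients, async:, outbox:, queue:)
  recipients.each do |recipient|
    mail = FakeMail.new(recipient, outbox, queue)
    async ? mail.deliver_later : mail.deliver_now
  end
end

outbox = []
queue = JobQueue.new
send_invitations(%w[a@example.com b@example.com], async: true, outbox: outbox, queue: queue)
puts "after invitation job: sent=#{outbox.size}, queued=#{queue.jobs.size}"
queue.work_off
puts "after worker run:     sent=#{outbox.size}, queued=#{queue.jobs.size}"
```

The point of the sketch is that the invitation loop returns before any mail is sent when `async` is true; the sending cost moves to whichever process later drains the queue.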

What happens with Heroku Scheduler running every 10 minutes:

| Time    | Event                                                      |
|---------|------------------------------------------------------------|
| T+0     | Invitation job runs, queues 2,000 emails as separate jobs  |
| T+10min | jobs:workoff runs, picks up first batch of email jobs      |
| T+20min | Next jobs:workoff, picks up more                           |
| ...     | Continues until all emails sent                            |
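
The timeline above can be modelled roughly as follows. This is a hypothetical sketch, not a measurement: the method name and the `per_run` capacity (how many email jobs one scheduled run gets through) are assumptions, and real throughput depends on SMTP latency and how long each scheduled worker stays alive.

```ruby
# Hypothetical model of the timeline: each scheduled run drains at most
# `per_run` jobs. `per_run` is an assumed capacity, not a measured value.
def workoff_timeline(queued:, per_run:, interval_minutes: 10)
  raise ArgumentError, "per_run must be positive" unless per_run.positive?

  timeline = []
  minute = 0
  remaining = queued
  while remaining.positive?
    minute += interval_minutes
    processed = [per_run, remaining].min
    remaining -= processed
    timeline << { minute: minute, processed: processed, remaining: remaining }
  end
  timeline
end

# 2,000 queued emails with an assumed capacity of 1,000 jobs per run.
workoff_timeline(queued: 2_000, per_run: 1_000).each do |run|
  puts "T+#{run[:minute]}min processed=#{run[:processed]} remaining=#{run[:remaining]}"
end
```

With a smaller assumed capacity the same queue simply takes more scheduler runs to drain, which is the trade-off the description accepts.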

Expected behavior:

  • Emails arrive in batches over 10-20 minutes instead of all at once
  • No single job exceeds 540s because each email is its own job (~1-2s each)
  • Total send time increases but completes reliably

Trade-off: Emails go out over ~10-20 minutes rather than in a single ~13-minute synchronous run, which would exceed the 540s timeout. This is preferable to timed-out failures.

Feature Flag

ASYNC_EMAIL_CHAPTER_IDS env var:

  • Empty (default): All chapters use sync delivery (safe fallback)
  • Specific IDs: Only those chapters use async delivery
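
A minimal sketch of how such a flag could be evaluated is shown here. Only the env var name and the `async_email_for_chapter?` helper name come from this PR; the parsing rules (comma-separated integers, blanks ignored) and the `parse_chapter_ids` helper are assumptions for illustration and may differ from the actual concern.

```ruby
# Sketch of the feature-flag check. The parsing rules are an assumption;
# only the env var name and the async_email_for_chapter? helper name come
# from the PR description. parse_chapter_ids is a hypothetical helper.
def parse_chapter_ids(raw)
  raw.to_s.split(",").map(&:strip).reject(&:empty?).map { |id| Integer(id, 10) }
end

# True only when the chapter is explicitly listed; unset or empty means sync.
def async_email_for_chapter?(chapter_id, raw_ids = ENV["ASYNC_EMAIL_CHAPTER_IDS"])
  parse_chapter_ids(raw_ids).include?(chapter_id)
end

puts async_email_for_chapter?(1, "1, 7") # chapter 1 is listed
puts async_email_for_chapter?(2, "1, 7") # chapter 2 is not listed
puts async_email_for_chapter?(1, nil)    # flag unset: sync delivery for everyone
```

Using `Integer(id, 10)` rather than `to_i` makes a malformed value raise instead of silently becoming chapter 0, which seems the safer failure mode for a rollout flag.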

Recommended rollout:

  1. Deploy with empty ASYNC_EMAIL_CHAPTER_IDS (safe - no behavior change)
  2. Set ASYNC_EMAIL_CHAPTER_IDS=1 to enable async for London only
  3. Monitor for 1-2 workshops
  4. Gradually add more chapters or remove the limit entirely

Changes

  1. config/application.rb - Set config.active_job.queue_adapter = :delayed_job and add feature flag config
  2. app/jobs/application_job.rb - Base job class with global error handling
  3. app/models/concerns/async_email_concern.rb - Concern with async_email_for_chapter? helper
  4. app/models/concerns/workshop_invitation_manager_concerns.rb - 7 conditional async email calls

@mroderick mroderick force-pushed the feature/delayed-email-async branch from 9ef1965 to 1e557b8 Compare April 25, 2026 08:31
@mroderick mroderick marked this pull request as draft April 25, 2026 08:33
@mroderick mroderick force-pushed the feature/delayed-email-async branch from b8708c9 to 1569e2d Compare April 25, 2026 08:54
@mroderick mroderick force-pushed the feature/delayed-email-async branch from 1569e2d to 6285351 Compare April 25, 2026 09:04
@mroderick
Collaborator Author

I am considering a way to use SendGrid's API to send email in batches of up to 1k each, but that's a bigger undertaking. This PR fixes the immediate problem today, and I can carefully design the other solution without rushing or jeopardising the workshop invitations functionality.

@mroderick mroderick requested review from olleolleolle and till and removed request for olleolleolle April 25, 2026 09:21
@mroderick
Collaborator Author

Once this is merged, I'll speak to some chapter organisers from the larger chapters to get them to help test drive this.

The last thing we need is for all chapter organisers to be disrupted at the same time.

@mroderick mroderick marked this pull request as ready for review April 25, 2026 09:22
Collaborator

@till till left a comment

I am not overly familiar with all the Rubyisms, but is codebar already using delayed jobs? It seems so, as there is no additional setup in this PR. Otherwise, LGTM.

Then my only question is: do we have enough database storage for all these new rows?

Comment thread app/jobs/application_job.rb
@mroderick
Collaborator Author

> Then my only question is: do we have enough database storage for all these new rows?

$ heroku pg:info -a codebar-production
=== DATABASE_URL, HEROKU_POSTGRESQL_MAUVE_URL

Plan:                  Standard 0
Status:                Available
Data Size:             593 MB / 64 GB (0.9%)
Tables:                48

I think we'll be ok for a while.

We could add a job that cleans out old (1-2y+) rows.

@mroderick mroderick requested a review from till April 25, 2026 10:04
@mroderick mroderick merged commit fd1a03b into codebar:master Apr 25, 2026
8 checks passed