Bug Description
When sending a campaign that targets a dynamic segment, the campaign can become permanently stuck in SENDING status if any contact leaves the segment (e.g., unsubscribes) between when totalRecipients is calculated and when their batch is processed.
Root Cause
There are two interacting issues:
1. Dynamic segment is re-evaluated during batch processing
In CampaignService.processBatch(), getRecipientsCursor() rebuilds the recipient WHERE clause on every batch by calling buildRecipientWhereAsync(). For dynamic segments, this re-evaluates the segment condition at query time. If a contact leaves the segment (unsubscribes, data changes, etc.) after totalRecipients was set but before the cursor reaches their position, that contact is silently excluded — no email record is created and no error is logged.
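The effect can be reproduced in a small in-memory model. This is a sketch, not Plunk's code: `contacts`, `inSegment`, and `fetchBatch` are hypothetical stand-ins for the contact table, the dynamic segment condition, and a cursor query whose filter is rebuilt on every batch.

```javascript
// In-memory model of cursor pagination over a live segment filter.
// All names here are hypothetical; this is not Plunk's implementation.
const contacts = Array.from({ length: 10 }, (_, i) => ({ id: i + 1, subscribed: true }));

// Stand-in for the dynamic segment condition, evaluated at query time.
const inSegment = (c) => c.subscribed;

// Stand-in for a cursor query: the filter is re-applied on every call.
function fetchBatch(afterId, size) {
  return contacts.filter((c) => c.id > afterId && inSegment(c)).slice(0, size);
}

// Computed once, before any batch runs.
const totalRecipients = contacts.filter(inSegment).length;

const emailsCreated = [];
let cursor = 0;
let batchNumber = 0;
for (;;) {
  const batch = fetchBatch(cursor, 4);
  if (batch.length === 0) break;
  batch.forEach((c) => emailsCreated.push(c.id));
  cursor = batch[batch.length - 1].id;
  batchNumber += 1;
  // Contact 9 unsubscribes after the first batch, before the cursor reaches it.
  if (batchNumber === 1) contacts[8].subscribed = false;
}

// Prints { totalRecipients: 10, created: 9 }
console.log({ totalRecipients, created: emailsCreated.length });
```

Every batch succeeds, yet one fewer email exists than the total computed up front, and nothing in the loop can notice the gap.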
2. Completion check requires sentCount to reach totalRecipients
In email-processor.js, the campaign transitions to SENT only when:
    if (sentCount >= campaign.totalRecipients) {
      await prisma.campaign.update({
        where: { id: email.campaignId },
        data: { status: CampaignStatus.SENT, sentCount },
      });
    }
If even one contact left the segment during processing, sentCount will never reach totalRecipients, and the campaign remains in SENDING forever. There is no other mechanism (timeout, periodic check, etc.) to finalize the campaign.
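Using the counts from the production incident reported in this issue, the stuck state can be modeled in a few lines. `onEmailSent` is a hypothetical stand-in for the per-email handler, assumed to apply the same condition as the snippet above; it is not the actual processor.

```javascript
// Hypothetical model of the per-email completion check; not the actual processor.
const campaign = { status: 'SENDING', totalRecipients: 39884, sentCount: 0 };

// Called once per successfully sent email, applying the same >= condition.
function onEmailSent(c) {
  c.sentCount += 1;
  if (c.sentCount >= c.totalRecipients) {
    c.status = 'SENT';
  }
}

// Only 39,883 emails exist because one contact left the segment mid-send.
for (let i = 0; i < 39883; i += 1) onEmailSent(campaign);

// Prints: SENDING 39883
console.log(campaign.status, campaign.sentCount);
```

Because the check only runs when an email is sent, no further event arrives after the last email to re-evaluate it.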
Steps to Reproduce
- Create a campaign targeting a dynamic segment with a large number of contacts
- Send the campaign
- While the campaign is sending (batches are processing), have a contact that hasn't been reached by the cursor yet unsubscribe or otherwise leave the segment
- The campaign will complete all batches successfully but remain in SENDING status indefinitely
Real-World Impact
We hit this in production with a campaign of 39,884 recipients. One contact unsubscribed during the ~90-second batch processing window. All 39,883 remaining emails were sent successfully, but the campaign was stuck in SENDING status with sentCount: 39883 vs totalRecipients: 39884.
Suggested Fixes
A few possible approaches (not mutually exclusive):
- Add a completion check after the last batch in processBatch(): When hasMore is false, compare sentCount to totalRecipients and transition to SENT if they're close enough or if no more recipients exist.
- Snapshot recipients at send time: Instead of re-evaluating the dynamic segment during each batch, resolve the recipient list once at send start (or use totalRecipients as the source of truth rather than the live segment query).
- Add a periodic sweep/finalizer: A background job that checks for campaigns in SENDING status where all batches have completed and no emails are PENDING, and transitions them to SENT.
- Use >= with the actual email count: After the last batch completes, count actual emails created for the campaign and use that as the basis for completion, rather than requiring it to match the pre-computed totalRecipients.
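As a rough sketch of how the completion-check, finalizer, and actual-count ideas could combine, the function below decides completion from the emails that actually exist rather than from the pre-computed total. `finalizeIfDone` and its inputs (`hasMore`, `createdCount`, `pendingCount`) are hypothetical names. In a real fix these counts would presumably come from queries against the emails table. Reconciling totalRecipients to the created count is also an assumption, loosely mirroring the manual workaround.

```javascript
// Hypothetical completion rule; names and fields are assumptions, not Plunk's schema.
// A campaign is treated as finished once batching is done and nothing is pending.
function finalizeIfDone(campaign, { hasMore, createdCount, pendingCount }) {
  if (campaign.status !== 'SENDING') return campaign; // nothing to finalize
  if (hasMore || pendingCount > 0) return campaign;   // work is still outstanding
  // Base completion on emails that were actually created for the campaign.
  return { ...campaign, status: 'SENT', totalRecipients: createdCount };
}

// Example with the counts from the production incident.
const stuck = { id: 'example', status: 'SENDING', totalRecipients: 39884, sentCount: 39883 };
const finalized = finalizeIfDone(stuck, { hasMore: false, createdCount: 39883, pendingCount: 0 });

// Prints: SENT 39883
console.log(finalized.status, finalized.totalRecipients);
```

The same function could be called either right after the last batch or from a periodic sweep, since it only depends on current counts and leaves campaigns with outstanding work untouched.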
Workaround
Manually update the database:
    UPDATE campaigns
    SET status = 'SENT', "totalRecipients" = (
      SELECT count(*) FROM emails WHERE "campaignId" = '<campaign-id>' AND "sentAt" IS NOT NULL
    )
    WHERE id = '<campaign-id>';
Environment
- Plunk image: ghcr.io/useplunk/plunk:sha-7834b9e
- Self-hosted via Docker Compose