- 
                Notifications
    
You must be signed in to change notification settings  - Fork 27
 
[push] support multiple events per token per batch #188
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
          
WalkthroughReplaces map-based push APIs with a slice-based Message type (Token + Event), updates Send signatures and callers to use indexed error slices, rewrites service sendAll to correlate errors by index and add a retry helper, and simplifies retry metrics by removing outcome labels in favor of a single counter. Changes
 Sequence Diagram(s)sequenceDiagram
    autonumber
    participant Service as push.service
    participant Upstream as upstream.Client
    participant FCM as fcm.Client
    participant Cache as retry/cache
    Service->>Service: deserialize wrappers -> build []Message
    Service->>Upstream: Send(ctx, []Message)
    Upstream-->>Service: ([]error, err)
    alt transport error (err != nil)
        Service->>Service: log transport error, call retry(wrappers)
    else partial failures (some errs non-nil)
        Service->>Service: correlate errs by index -> failed wrappers
        Service->>Service: increment failure metrics, log per-failure
        Service->>Service: retry(failed wrappers)
    else all sent
        Service-->>Service: record success metrics
    end
    rect rgba(0,128,0,0.06)
    Note over Service,Cache: retry(ctx, wrappers): for each wrapper -> inc retries, maybe blacklist, serialize, cache
    Service->>Cache: serialize & store wrapper
    Cache-->>Service: OK / error
    end
    Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 
 Possibly related PRs
 Pre-merge checks and finishing touches❌ Failed checks (1 warning)
 ✅ Passed checks (2 passed)
 ✨ Finishing touches🧪 Generate unit tests (beta)
 📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
 🧰 Additional context used🧠 Learnings (1)📓 Common learnings⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
 🔇 Additional comments (9)
 Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment   | 
    
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
internal/sms-gateway/modules/push/fcm/client.go(1 hunks)internal/sms-gateway/modules/push/metrics.go(3 hunks)internal/sms-gateway/modules/push/service.go(1 hunks)internal/sms-gateway/modules/push/types.go(1 hunks)internal/sms-gateway/modules/push/types/types.go(1 hunks)internal/sms-gateway/modules/push/upstream/client.go(4 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-07-22T00:31:28.796Z
Learnt from: capcom6
PR: android-sms-gateway/server#154
File: internal/sms-gateway/modules/sse/service.go:67-79
Timestamp: 2025-07-22T00:31:28.796Z
Learning: In the SSE service Send method (internal/sms-gateway/modules/sse/service.go), blocking on full channels is intentional and acceptable. Timeout control is handled at the calling code level through context, not within the Send method itself. The user prefers this approach over adding a default case that would drop messages when channels are full.
Applied to files:
internal/sms-gateway/modules/push/service.gointernal/sms-gateway/modules/push/upstream/client.go
🧬 Code graph analysis (5)
internal/sms-gateway/modules/push/types/types.go (3)
internal/sms-gateway/modules/push/types.go (1)
Event(17-17)internal/sms-gateway/modules/events/types.go (1)
Event(9-12)internal/sms-gateway/modules/sse/types.go (1)
Event(7-10)
internal/sms-gateway/modules/push/types.go (1)
internal/sms-gateway/modules/push/types/types.go (1)
Message(7-10)
internal/sms-gateway/modules/push/service.go (5)
internal/sms-gateway/modules/push/types/types.go (2)
Message(7-10)Event(12-15)internal/sms-gateway/modules/push/types.go (1)
Event(17-17)internal/sms-gateway/modules/events/types.go (1)
Event(9-12)pkg/cache/options.go (1)
WithTTL(22-30)internal/sms-gateway/modules/push/metrics.go (1)
BlacklistOperationAdded(11-11)
internal/sms-gateway/modules/push/fcm/client.go (1)
internal/sms-gateway/modules/push/types/types.go (2)
Message(7-10)Event(12-15)
internal/sms-gateway/modules/push/upstream/client.go (2)
internal/sms-gateway/modules/push/types/types.go (2)
Message(7-10)Event(12-15)internal/sms-gateway/modules/push/types.go (1)
Event(17-17)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Benchmark
 - GitHub Check: E2E
 - GitHub Check: Test
 - GitHub Check: Analyze (go)
 
🔇 Additional comments (1)
internal/sms-gateway/modules/push/fcm/client.go (1)
55-78: Refactoring is correct and well-designed.The transition from map-based to slice-based message processing successfully implements support for multiple events per token. The error handling pattern (per-message errors in slice + single fatal error) is intentional and properly used by callers—the service checks both return values at line 167 (
if len(errs) == 0 && err == nil).Consider validating empty tokens before constructing FCM messages to fail fast rather than relying on the SDK:
for i, message := range messages { + if message.Token == "" { + errs[i] = fmt.Errorf("empty token") + continue + } data, err := eventToMap(message.Event)This is optional—the FCM SDK will also error on invalid tokens, but explicit validation provides clearer error messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (2)
internal/sms-gateway/modules/push/service.go (2)
133-145: Consider the observability trade-off of losing cache keys on deserialization errors.The switch from
MapEntriestoFilterMapoverValuessimplifies the code but removes the cache key from error logs when deserialization fails. While the key can be regenerated viawrapper.key()if deserialization succeeds, a corrupt entry now logs only the raw value without its associated key, making debugging slightly harder.
233-233: Consider adding outcome context to retry metrics.The simplified
IncRetry()call loses granularity about retry outcomes (successful cache, blacklisted, failed serialization, etc.). While simpler, this makes it harder to distinguish between different retry scenarios in monitoring.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
internal/sms-gateway/modules/push/service.go(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: capcom6
PR: android-sms-gateway/server#154
File: internal/sms-gateway/modules/sse/service.go:67-79
Timestamp: 2025-07-22T00:31:28.796Z
Learning: In the SSE service Send method (internal/sms-gateway/modules/sse/service.go), blocking on full channels is intentional and acceptable. Timeout control is handled at the calling code level through context, not within the Send method itself. The user prefers this approach over adding a default case that would drop messages when channels are full.
📚 Learning: 2025-07-22T00:31:28.796Z
Learnt from: capcom6
PR: android-sms-gateway/server#154
File: internal/sms-gateway/modules/sse/service.go:67-79
Timestamp: 2025-07-22T00:31:28.796Z
Learning: In the SSE service Send method (internal/sms-gateway/modules/sse/service.go), blocking on full channels is intentional and acceptable. Timeout control is handled at the calling code level through context, not within the Send method itself. The user prefers this approach over adding a default case that would drop messages when channels are full.
Applied to files:
internal/sms-gateway/modules/push/service.go
🧬 Code graph analysis (1)
internal/sms-gateway/modules/push/service.go (2)
internal/sms-gateway/modules/push/types/types.go (2)
Message(7-10)Event(12-15)pkg/cache/options.go (1)
WithTTL(22-30)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Benchmark
 - GitHub Check: Test
 - GitHub Check: Analyze (go)
 - GitHub Check: E2E
 
🔇 Additional comments (2)
internal/sms-gateway/modules/push/service.go (2)
147-162: LGTM!The mapping to
types.Messagestructs is clean, thetotalMessagesguard prevents unnecessary operations, and the logging is clear.
172-177: No action required — error handling correctly addresses message loss.The
client.Sendcontract is now clear: it returns([]error, error)where:
- The second
 errorrepresents batch/connection failures (e.g., marshal errors, request failures)- The
 []errorslice contains per-message failures (e.g., individual send failures)When
err != nilat line 172, it indicates a batch-level failure (connection/request error), and retrying all wrappers is correct because no messages were sent. Per-message errors are handled separately at line 182+ (visible in the grep output). The code properly prevents message loss across both error scenarios.
57d49a7    to
    d573318      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
internal/sms-gateway/modules/push/upstream/client.go (1)
81-83: Consider parsing response body for detailed error information.When the upstream API returns a 4xx/5xx status, the current implementation broadcasts the same generic error to all messages without parsing the response body. If the upstream API provides per-message error details in the response, parsing them could improve error diagnostics.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
internal/sms-gateway/modules/push/upstream/client.go(4 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: capcom6
Repo: android-sms-gateway/server PR: 154
File: internal/sms-gateway/modules/sse/service.go:67-79
Timestamp: 2025-07-22T00:31:28.796Z
Learning: In the SSE service Send method (internal/sms-gateway/modules/sse/service.go), blocking on full channels is intentional and acceptable. Timeout control is handled at the calling code level through context, not within the Send method itself. The user prefers this approach over adding a default case that would drop messages when channels are full.
🧬 Code graph analysis (1)
internal/sms-gateway/modules/push/upstream/client.go (1)
internal/sms-gateway/modules/push/types/types.go (2)
Message(7-10)Event(12-15)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Benchmark
 - GitHub Check: Test
 - GitHub Check: E2E
 - GitHub Check: Analyze (go)
 
🔇 Additional comments (5)
internal/sms-gateway/modules/push/upstream/client.go (5)
14-14: LGTM!The addition of the
lopackage supports the functional transformations introduced in the refactored Send method.
46-55: LGTM!The payload construction correctly maps each Message to a PushNotification, preserving Token, Event.Type, and Event.Data fields.
69-69: Verify User-Agent header change is intentional.The User-Agent was updated from
"android-sms-gateway/1.x (server; golang)"to"sms-gate/1.x (server; golang)". Please confirm this branding change aligns with product naming standards.
88-95: LGTM!The
mapErrorshelper correctly implements a broadcast error pattern, returning the same error for each message. This is appropriate when the upstream API provides a batch-level error without per-message granularity.
57-57: Let me check the imports and understand where smsgateway comes from:Perfect! The web search reveals the key information. UpstreamPushRequest is a type alias:type UpstreamPushRequest = []PushNotificationNo actionable issues found. Review comment is incorrect.
The
UpstreamPushRequesttype from the client-go library is defined as a type alias:type UpstreamPushRequest = []PushNotification. Sincepayloadis already of type[]smsgateway.PushNotification, the type conversion at line 57 is simply casting between the alias and its underlying type. This is completely valid and will not cause any runtime marshaling issues. The types are not just compatible—they are the same type.
9da8631    to
    3ee5103      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
internal/sms-gateway/modules/push/service.go (2)
179-189: Add length validation before indexing into error slice.Line 182 accesses
errs[index]without validating thatlen(errs) == len(wrappers). If a client implementation returns a mismatched slice length, this will panic.Apply this diff to add defensive validation:
if err != nil { s.metrics.IncError(totalMessages) s.logger.Error("failed to send messages", zap.Int("total", totalMessages), zap.Error(err)) s.retry(ctx, wrappers) return } + if len(errs) != len(wrappers) { + s.metrics.IncError(totalMessages) + s.logger.Error("error slice length mismatch", zap.Int("expected", len(wrappers)), zap.Int("actual", len(errs))) + s.retry(ctx, wrappers) + return + } + failed := lo.Filter( wrappers, func(item *eventWrapper, index int) bool { if err := errs[index]; err != nil {
201-235: Use a fresh context for retry operations.The
retrymethod receivesctxfromsendAll, which may already be cancelled or timed out when retry operations execute. This can cause blacklist and cache operations (lines 208, 228) to fail, potentially losing messages.Apply this diff to use a fresh background context:
func (s *Service) retry(ctx context.Context, events []*eventWrapper) { + retryCtx, cancel := context.WithTimeout(context.Background(), s.config.Timeout) + defer cancel() + for _, wrapper := range events { token := wrapper.Token wrapper.Retries++ if wrapper.Retries >= maxRetries { - if err := s.blacklist.Set(ctx, token, []byte{}, cacheImpl.WithTTL(blacklistTimeout)); err != nil { + if err := s.blacklist.Set(retryCtx, token, []byte{}, cacheImpl.WithTTL(blacklistTimeout)); err != nil { s.logger.Warn("failed to blacklist", zap.String("token", token), zap.Error(err)) continue } s.metrics.IncBlacklist(BlacklistOperationAdded) s.logger.Warn("retries exceeded, blacklisting token", zap.String("token", token), zap.Duration("ttl", blacklistTimeout), ) continue } wrapperData, err := wrapper.serialize() if err != nil { s.metrics.IncError(1) s.logger.Error("failed to serialize event wrapper", zap.Error(err)) continue } - if setErr := s.events.SetOrFail(ctx, wrapper.key(), wrapperData); setErr != nil { + if setErr := s.events.SetOrFail(retryCtx, wrapper.key(), wrapperData); setErr != nil { s.logger.Warn("failed to set message to cache", zap.String("key", wrapper.key()), zap.Error(setErr)) continue } s.metrics.IncRetry() } }
🧹 Nitpick comments (1)
internal/sms-gateway/modules/push/types.go (1)
19-23: Document the interface contract for error slice correlation.The
Sendmethod signature implies that implementations must return an error slice withlen(errs) == len(messages), preserving order for index-based correlation. This contract should be documented to prevent implementation bugs.Apply this diff to add interface documentation:
+// client defines the push notification transport interface. +// Implementations must maintain message order and return error slices +// that match the input message count for index-based correlation. type client interface { Open(ctx context.Context) error + // Send delivers the provided messages and returns per-message errors. + // The returned error slice must have len(errs) == len(messages), + // with errs[i] corresponding to messages[i]. A non-nil global error + // indicates a transport failure affecting all messages. Send(ctx context.Context, messages []types.Message) ([]error, error) Close(ctx context.Context) error }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
internal/sms-gateway/modules/push/fcm/client.go(1 hunks)internal/sms-gateway/modules/push/metrics.go(3 hunks)internal/sms-gateway/modules/push/service.go(1 hunks)internal/sms-gateway/modules/push/types.go(1 hunks)internal/sms-gateway/modules/push/types/types.go(1 hunks)internal/sms-gateway/modules/push/upstream/client.go(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/sms-gateway/modules/push/metrics.go
 
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: capcom6
Repo: android-sms-gateway/server PR: 154
File: internal/sms-gateway/modules/sse/service.go:67-79
Timestamp: 2025-07-22T00:31:28.796Z
Learning: In the SSE service Send method (internal/sms-gateway/modules/sse/service.go), blocking on full channels is intentional and acceptable. Timeout control is handled at the calling code level through context, not within the Send method itself. The user prefers this approach over adding a default case that would drop messages when channels are full.
📚 Learning: 2025-07-22T00:31:28.796Z
Learnt from: capcom6
Repo: android-sms-gateway/server PR: 154
File: internal/sms-gateway/modules/sse/service.go:67-79
Timestamp: 2025-07-22T00:31:28.796Z
Learning: In the SSE service Send method (internal/sms-gateway/modules/sse/service.go), blocking on full channels is intentional and acceptable. Timeout control is handled at the calling code level through context, not within the Send method itself. The user prefers this approach over adding a default case that would drop messages when channels are full.
Applied to files:
internal/sms-gateway/modules/push/service.go
🧬 Code graph analysis (5)
internal/sms-gateway/modules/push/fcm/client.go (2)
internal/sms-gateway/modules/push/types/types.go (2)
Message(7-10)Event(12-15)internal/sms-gateway/modules/push/types.go (1)
Event(17-17)
internal/sms-gateway/modules/push/types/types.go (3)
internal/sms-gateway/modules/push/types.go (1)
Event(17-17)internal/sms-gateway/modules/events/types.go (1)
Event(9-12)internal/sms-gateway/modules/sse/types.go (1)
Event(7-10)
internal/sms-gateway/modules/push/types.go (1)
internal/sms-gateway/modules/push/types/types.go (1)
Message(7-10)
internal/sms-gateway/modules/push/service.go (3)
internal/sms-gateway/modules/push/types/types.go (2)
Message(7-10)Event(12-15)pkg/cache/options.go (1)
WithTTL(22-30)internal/sms-gateway/modules/push/metrics.go (1)
BlacklistOperationAdded(11-11)
internal/sms-gateway/modules/push/upstream/client.go (1)
internal/sms-gateway/modules/push/types/types.go (2)
Message(7-10)Event(12-15)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: E2E
 - GitHub Check: Test
 - GitHub Check: Benchmark
 - GitHub Check: Analyze (go)
 
🔇 Additional comments (9)
internal/sms-gateway/modules/push/types/types.go (1)
7-10: LGTM! Clean message carrier type.The
Messagestruct is well-defined with appropriate fields for carrying push notification data.internal/sms-gateway/modules/push/upstream/client.go (3)
45-55: LGTM! Clean payload transformation.The use of
lo.Mapto transform messages into the upstream payload is idiomatic and correctly extracts all required fields.
88-95: LGTM! Error mapping maintains slice length invariant.The
mapErrorshelper correctly ensures that the returned error slice length matches the input message count, satisfying the interface contract.
69-69: No issues found with the User-Agent change.The User-Agent header at line 69 has been updated to
"sms-gate/1.x (server; golang)", which is consistent with the branding used elsewhere in the codebase (API endpoints atapi.sms-gate.app, support email atsupport@sms-gate.app). This is the only User-Agent header in the codebase, so there are no inconsistencies to resolve. The change is complete and intentional.internal/sms-gateway/modules/push/service.go (4)
133-145: LGTM! Proper error handling for deserialization failures.The use of
lo.FilterMapcorrectly filters out wrappers that fail to deserialize while logging the errors and updating metrics.
147-160: LGTM! Clean message construction with appropriate early exit.The wrapper-to-Message transformation and zero-message guard are correctly implemented.
164-170: LGTM! Success path handles the happy case correctly.The send operation with timeout context and success logging are properly implemented.
172-177: LGTM! Global error path now preserves messages via retry.The addition of
s.retry(ctx, wrappers)correctly addresses the previous concern about message loss on transport failure.internal/sms-gateway/modules/push/fcm/client.go (1)
55-78: LGTM! FCM client correctly implements the interface contract.The pre-allocated error slice and index-based error storage ensure that
len(errs) == len(messages), satisfying the interface requirement for error correlation.
Summary by CodeRabbit
Infrastructure
Monitoring
Breaking Changes