
Conversation

@PhongChuong
Collaborator

Fixes: #4015

@product-auto-label bot added the api: pubsub (Issues related to the Pub/Sub API) label on Jan 22, 2026

codecov bot commented Jan 22, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 94.87%. Comparing base (5f5904e) to head (00d8054).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #4338   +/-   ##
=======================================
  Coverage   94.87%   94.87%           
=======================================
  Files         188      188           
  Lines        7237     7241    +4     
=======================================
+ Hits         6866     6870    +4     
  Misses        371      371           


/// # Ok(())
/// # }
/// ```
pub async fn resume_publish<T: std::convert::Into<std::string::String>>(
Collaborator

This should be a synchronous operation. We don't need to wait for the resume to complete. Any messages that are put into the channel afterward will reach the unpaused worker.

Collaborator Author

While messages that are put into the channel afterward will reach the unpaused worker, I think there are cases where the application would want a signal that the worker is now unpaused and publishing is back to normal behavior.

This is achievable with an async call, or potentially a long blocking sync call. My preference is for it to be async and to let the application choose to .await on it if needed.

Collaborator

I'm not sure I understand the use case. Could you explain it a bit more? If you call a synchronous (fast) resume_publish(), all publishes that occur after it will be batched and sent (until the next error, of course). Why do we need to wait for the signal to reach the background worker? What additional benefit is there?

Collaborator Author

Imagine that generating a message is expensive and the message has a short TTL. In this case, the application may choose to await until the worker is ready again instead of having the messages potentially expire due to some other ongoing process (e.g., a flush).

Of course, this use case is completely imaginary, but I also think that giving the application the option is beneficial. Is there a reason why a fast synchronous resume is preferred over an async version?

Collaborator

> In this case, the application may choose to await until the worker is ready again instead of having the messages potentially expire due to some other ongoing process (e.g., a flush).

This is interesting, but I think maybe they should be awaiting on the flush, not the resume_publish, in this case.

One reason is to keep resume_publish cheap: it's basically flipping a bit, and we immediately know that future publishes will get batched, so there is no need to delay in saying that it's done.

Also, from discussing this with Alex early in the design process, he thought this should be a synchronous operation (if I am remembering correctly).

Collaborator Author

As discussed offline, the issue with sync is that the worker may be blocked on a long flush operation, causing a large delay before the resume_publish is processed because it is awaiting on other ordering keys. The ideal case would be for resume_publish to be sync and for flush not to block other operations from being processed.
I'll update resume_publish to be sync, and we can update flush in a later PR.
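
A minimal sketch of the resulting usage pattern, assuming the publisher API shown in this PR (publish, a flush operation, and resume_publish taking an ordering key) with resume_publish made synchronous as agreed above; the exact types and signatures here are assumptions, not the final API, and imports are omitted:

```rust
// Sketch only: assumes the publisher API in this PR, with resume_publish
// made synchronous as agreed in this thread.
async fn recover_after_error(publisher: &Publisher) {
    // Cheap and synchronous: unpauses the ordering key. Messages published
    // on this key afterward are batched again.
    publisher.resume_publish("ordering key with error");

    // If the application needs a completion signal, it should await the
    // flush of outstanding messages rather than the resume itself.
    publisher.flush().await;

    // Publishing on the key now follows the normal batched path.
    let _ = publisher
        .publish(
            PubsubMessage::new()
                .set_ordering_key("ordering key with error")
                .set_data("msg after resume"),
        )
        .await;
}
```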

let got = publisher
    .publish(
        PubsubMessage::new()
            .set_ordering_key("ordering key with error")
Collaborator

Should this be the same as line 1318, without the error? I'm not sure the name is that much more descriptive than just using key1 and key2 if we want a second one.

Collaborator Author

Good catch.
Updated to use a variable instead.
I'll follow this PR up with a cleanup of the tests.

Comment on lines +1582 to +1599
got_err = publisher
    .publish(
        PubsubMessage::new()
            .set_ordering_key("ordering key with error 0")
            .set_data("msg 2"),
    )
    .await
    .unwrap_err();
let source = got_err
    .source()
    .and_then(|e| e.downcast_ref::<crate::error::PublishError>());
assert!(
    matches!(
        source,
        Some(crate::error::PublishError::OrderingKeyPaused(()))
    ),
    "{got_err:?}"
);
Collaborator

nit: Maybe this could be in a helper; it's kind of verbose, and these tests become hard to skim.

Collaborator Author

Acknowledged.
I'll follow this PR up with a cleanup of the tests.
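
For illustration, the repeated assertion from the snippet above could be folded into a small helper along these lines (a sketch of a possible follow-up, not code in this PR; the helper name is made up, and it reuses the error types already referenced in the test):

```rust
// Hypothetical test helper, not part of this PR.
fn assert_ordering_key_paused<E: std::error::Error>(got_err: &E) {
    // Walk to the underlying PublishError and check that it reports the
    // ordering key as paused.
    let source = got_err
        .source()
        .and_then(|e| e.downcast_ref::<crate::error::PublishError>());
    assert!(
        matches!(
            source,
            Some(crate::error::PublishError::OrderingKeyPaused(()))
        ),
        "{got_err:?}"
    );
}
```

The test above would then shrink to a single line: assert_ordering_key_paused(&got_err);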

@PhongChuong marked this pull request as ready for review on January 23, 2026 03:22
@PhongChuong requested a review from a team as a code owner on January 23, 2026 03:22
@PhongChuong requested a review from dbolduc on January 23, 2026 03:24
Comment on lines 32 to 39
@@ -35,8 +35,8 @@ pub(crate) enum ToBatchWorker {
     Publish(BundledMessage),
     /// A request to flush all outstanding messages.
     Flush(oneshot::Sender<()>),
-    // TODO(#4015): Add a resume function to allow resume Publishing on a ordering key after a
-    // failure.
+    /// A request to resume publishing.
+    ResumePublish(),
Collaborator

note: I'm only now noticing that ToWorker and ToBatchWorker are basically the same thing. I don't know if we can save anything performance-wise by using the same type, but it could be a place to consider refactoring in the future.
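
Purely as a sketch of that refactoring idea (ToWorker is not shown in this diff, so this assumes its variants end up mirroring ToBatchWorker's), one option would be a single message enum parameterized over the publish payload:

```rust
use tokio::sync::oneshot;

// Hypothetical shared worker message type, not part of this PR. It assumes
// both workers need the same three operations and differ only in the
// publish payload.
pub(crate) enum ToWorkerMsg<P> {
    /// A request to publish a payload.
    Publish(P),
    /// A request to flush all outstanding messages.
    Flush(oneshot::Sender<()>),
    /// A request to resume publishing.
    ResumePublish(),
}
```

The batch worker would then use ToWorkerMsg<BundledMessage>, and the other worker would plug in its own payload type.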

@PhongChuong merged commit 6aaad0e into googleapis:main on Jan 23, 2026
30 checks passed
@PhongChuong deleted the orderingResume branch on January 24, 2026 03:05



Development

Successfully merging this pull request may close these issues.

Pubsub: Support Publisher resume on ordering key when a failure occurs
