This commit refactors this library, almost from scratch, in order to achieve:

1. More conventional and easier-to-use APIs;
2. More correct and flexible error handling;
3. Internal segmentation of batches inside the Google publisher.

The initial motivation was 3, but then the old way of implicit error handling came up as a problem as well. I attempted to introduce a mechanism to handle errors in a more granular way without adjusting the API of the library too much, but it turned out not to be very feasible.

The new API is centered around the new `hedwig::Message` trait, which users implement to associate data types with the information that the hedwig library needs: the topic name, the validators applicable to the data, the encoding mechanisms, etc. Furthermore, it now allows retaining all of the auxiliary data alongside the message data, giving further flexibility in how this library is used. All in all, the trait was designed to move the message-encoding-specific pieces of code away from the business logic as much as possible. For now the implementation of this trait remains manual, but nothing prevents it from being `derive`able in some way.

---

The error handling was made entirely explicit in the new API. In particular, we no longer implicitly re-queue messages in the batch (nor do we reuse the batches anymore). Instead, the users receive a `Stream` of `(result, topic, validated message)`. They can then choose to `push` the validated message back into a new batch they are constructing to retry the message, or handle the error in whatever other way makes sense to them.

One important improvement as part of this change is the simplification of behaviour when the futures/streams are cancelled.
Previously, dropping the `publish` future would leave one in a fairly difficult-to-understand state where the batch might have some of the messages re-queued and some of them cancelled, etc. Since there is no longer any state remaining in `hedwig`, dropping streams to cancel makes significantly more sense as well.

---

Google PubSub has various limitations on the API requests that are made to its endpoints. As we are in charge of batching the requests up to implement publishing more efficiently, it is also up to us to ensure that we do not exceed any of the limits upstream has placed on us. To do this, we implement a batch segmentation algorithm in `GoogleMessageSegmenter`, which the `GooglePubSubPublisher` uses internally to split up the messages.

Fixes #18
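
For illustration, the batch segmentation idea described above could be sketched roughly as follows. This is not the actual `GoogleMessageSegmenter` code: the function name `segment`, its signature, and the choice to cap both message count and total payload bytes are assumptions made for this example, and the real limits should be taken from the Pub/Sub quota documentation.

```rust
/// Hypothetical sketch: split `messages` into consecutive segments so that each
/// segment holds at most `max_count` messages and at most `max_bytes` of payload.
///
/// Returns `Err(index)` if the message at `index` alone exceeds `max_bytes`,
/// since no segmentation could make such a message fit.
fn segment(
    messages: &[Vec<u8>],
    max_count: usize,
    max_bytes: usize,
) -> Result<Vec<&[Vec<u8>]>, usize> {
    assert!(max_count > 0, "a segment must be allowed to hold at least one message");

    let mut segments = Vec::new();
    let mut start = 0; // index of the first message in the current segment
    let mut bytes = 0; // payload bytes accumulated in the current segment

    for (i, msg) in messages.iter().enumerate() {
        if msg.len() > max_bytes {
            return Err(i);
        }
        // Close the current segment if adding this message would break a limit.
        if i - start == max_count || bytes + msg.len() > max_bytes {
            segments.push(&messages[start..i]);
            start = i;
            bytes = 0;
        }
        bytes += msg.len();
    }

    // Emit whatever is left over as the final segment.
    if start < messages.len() {
        segments.push(&messages[start..]);
    }
    Ok(segments)
}
```

A caller would then issue one publish request per returned segment. Reporting the oversized message by index, rather than silently dropping it, keeps the error handling explicit in the same spirit as the rest of this change.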