Directly return a DeliveryFuture from FutureProducer::send_copy #53
I'll get the tests fixed up. |
Thanks! Yes I agree that using the current interface is a pain. The new one seems to make much more sense. The warning message happens because you changed from the deprecated Also, could you remove the change to |
So, I opted to just eat the unused result from the Tests are hanging, going to investigate. |
Not sure what's up with the tests. Can reproduce locally with docker-compose, can't reproduce testing outside of a container against a kafka server. |
Iirc the only special configuration required to run the tests externally is
that the default number of partitions for autocreated topics should be 3.
Are you testing this on top of master? There might be issues with the
latest version of librdkafka which is still a RC and hasn't been thoroughly
tested with rust-rdkafka. You could try on top of the 0.11.1 branch of
rust-rdkafka.
I'll take a look as well.
…On Thu, Jun 29, 2017, 23:56 william light ***@***.***> wrote:
Not sure what's up with the tests. Can reproduce locally with
docker-compose, can't reproduce testing outside of a container against a
kafka server.
Yeah, I'm setting the partition number to 3 (otherwise a handful of tests fail immediately). |
I see the tests failing occasionally with the new 0.11.0-RC1 version of librdkafka in master. I'll try to debug and fix the issue. |
Tests should be working now. |
Returning a Result made it very difficult to use `send_copy()` in a chain of futures, and, since a Future is just an asynchronous Result, this commit changes the DeliveryFuture to yield a KafkaError on failure.
Alright, rebased onto |
Thanks for the PR, I'll merge it. Coming back to the example you provided: an I'll give it a try and see if it gets more ergonomic. Feedback is welcome :) |
I had a look at upgrading our project to 0.12 with this change and am a bit confused about pre-delivery errors. We match on |
If one of these errors occurs it should cause an error result in the returned future when it's polled the first time. The only difference will be that there is no clear separation between the errors that can happen immediately and the ones that can happen asynchronously. Would this change prevent you from handling those errors properly? |
I'm not sure if this is an issue for us. I'll try to upgrade again with this knowledge and will let you know. |
I was able to do some further testing and this indeed breaks our use case. We sometimes have traffic peaks, and in that case we hold off producing for a bit. Another piece of error handling we do is cleaning up a message if it's too big. Before this pull request this was possible (sample from our code):
Ok(f) => return Ok(f),
Err(KafkaError::MessageProduction(RDKafkaError::QueueFull)) => {
warn!("Producer queue full for {} on attempt {}, waiting one second", topic, attempt);
thread::sleep(Duration::from_secs(1));
},
Err(KafkaError::MessageProduction(RDKafkaError::MessageSizeTooLarge)) => {
return Err(ErrorKind::KafkaMessageTooLarge(size).into())
},
Err(err) => bail!(err)
After the pull request it's only possible after waiting for the future. But then it's too late: we're producing a bunch of stuff and need the error when it actually occurs, not later. So for us this change does not make sense. |
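For readers following along, the retry pattern described in this comment can be sketched in plain Rust without any Kafka dependency. This is only an illustration under stated assumptions: `EnqueueError`, `SendError`, and `send_with_retry` are hypothetical names, not part of rust-rdkafka, and the `try_enqueue` closure stands in for whatever call reports enqueue-time failures synchronously.

```rust
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for errors reported at enqueue time
/// (not a rust-rdkafka type).
#[derive(Debug, PartialEq)]
enum EnqueueError {
    QueueFull,
    MessageTooLarge(usize),
}

/// Hypothetical outcome of the retry loop below.
#[derive(Debug, PartialEq)]
enum SendError {
    TooLarge(usize),
    GaveUp { attempts: u32 },
}

/// Calls `try_enqueue` up to `max_attempts` times.
/// A full queue triggers a sleep and a retry; an oversized message
/// is reported to the caller immediately so it can be set aside.
fn send_with_retry<T, F>(
    mut try_enqueue: F,
    max_attempts: u32,
    backoff: Duration,
) -> Result<T, SendError>
where
    F: FnMut() -> Result<T, EnqueueError>,
{
    for _attempt in 1..=max_attempts {
        match try_enqueue() {
            // Enqueued: hand back whatever handle the producer returned.
            Ok(handle) => return Ok(handle),
            // Back off and try again on the next loop iteration.
            Err(EnqueueError::QueueFull) => thread::sleep(backoff),
            // No point retrying: surface the size right away.
            Err(EnqueueError::MessageTooLarge(size)) => {
                return Err(SendError::TooLarge(size))
            }
        }
    }
    Err(SendError::GaveUp { attempts: max_attempts })
}
```

The point of the sketch is that both decisions (wait, or set the message aside) are made before anything else is produced, which is only possible when the enqueue call reports its error synchronously.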
Thanks for the report! So, in my understanding, the main issue is that there are some errors that you want to handle immediately, before sending other messages, and others that you can deal with later on.
Would this work for you? |
Yes, there are two distinct categories of errors when producing in librdkafka: a message can fail to be accepted onto the producer queue, and the actual producing can fail. In my mind it's a bad choice to hide this. Generally I want to get the error when it occurs, not as part of handling the delivery reports that the future provides. In the case of a message that is too large I need the error straight away, so I have a chance to split the message into smaller parts, for example. If you cannot get this error when producing, you'd have to keep every single message around in memory until the delivery report can be checked. So I really do think this is not a good change. We're hiding behaviour of the underlying library just for convenience. This has real downsides too. |
Not an expert on Futures, but maybe |
I agree that we shouldn't hide behavior of the underlying library if not strictly needed, and making the API a bit nicer might not be a reason strong enough. At the same time, I feel there must be a better way than a Result. The |
kafka_producer
.send_copy::<_, ()>("test", None, Some(&*format!("msg on {}", topic)), None, None)
.map_err(|_| io::Error::new(io::ErrorKind::Other, "failed to send kafka message"))
.and_then(|_| future::ok(()))
.boxed()
and this is the code using a producer
.send_copy2::<_, ()>("test", None, Some(&*format!("msg on {}", topic)), None, None)
.and_then(|delivery_future| delivery_future.and_then(|_delivery_report| future::ok(())))
.map_err(|_e| io::Error::new(io::ErrorKind::Other, "failed to send kafka message"))
.boxed()
I named the variables to show what they mean (also notice that this code won't check the value of the delivery_report, meaning that some errors won't be detected). @thijsc I think this solution should work for you, while remaining ergonomic enough. I pushed a |
@thijsc I'm experimenting with a new API for both the |
Sorry for the slow response, I was working on some other stuff in the meantime. For us not having the It's not only waiting for a full queue to clear up. Another very important use case is knowing whether a message is too big. This is also something we need to know beforehand so we can set aside that message for further processing. If this API stays as is, it means we'll have to rewrite our entire code base to move to the base producer with some polling threads. I could also live with a smarter If our use case is a huge outlier I could live with that decision. But I do feel that this is all pretty normal. It's the way librdkafka is structured, so I do not see a good argument for abstracting over that. |
I definitely think your use case is not an outlier, and I'd like to find an API that works for you and at the same time makes it easy to use the send in a chain of futures, which in my understanding can only be done by returning something that implements Future. I thought that the new API would be able to cover your case:
Having automatic "queue full" handling (1) should help in most use cases. If that's not the case in yours, I'd like to know why and see if maybe we can come up with something better. The same goes for (2): the message is returned when the production fails, so implementing the logic you were mentioning should still be possible. Maybe it's harder because you'll have to handle it asynchronously and you won't have access to the producer anymore? I see other alternatives: for example it should be possible to have additional methods defined on the Could you give me some more details on why the new API doesn't work? Would an approach based on the |
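The two ideas in this comment (block for a while when the queue is full, and hand the message back to the caller on failure) can be illustrated with the standard library's bounded channel, whose `try_send` already returns the rejected value. This is a rough analogy only: `send_or_return` is a hypothetical helper, the `SyncSender` stands in for the producer queue, and nothing here reflects the actual rust-rdkafka API.

```rust
use std::sync::mpsc::{SyncSender, TrySendError};
use std::thread;
use std::time::{Duration, Instant};

/// Tries to enqueue `msg`, retrying while the bounded queue is full
/// until `block_for` has elapsed. On failure the original message is
/// returned to the caller so it can be split, stored, or retried later.
fn send_or_return<T>(
    queue: &SyncSender<T>,
    mut msg: T,
    block_for: Duration,
    poll_every: Duration,
) -> Result<(), T> {
    let deadline = Instant::now() + block_for;
    loop {
        match queue.try_send(msg) {
            Ok(()) => return Ok(()),
            // Queue is full: the channel gives the message back.
            Err(TrySendError::Full(returned)) => {
                if Instant::now() >= deadline {
                    return Err(returned);
                }
                msg = returned;
                thread::sleep(poll_every);
            }
            // Receiver is gone: retrying cannot help.
            Err(TrySendError::Disconnected(returned)) => return Err(returned),
        }
    }
}
```

Because the caller gets the message back in the `Err` case, it does not need to keep its own copy in memory while waiting for the outcome.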
The |
Good. I'll make all of its methods public and add some documentation. |
I tried to port our code to the new We basically need exactly what the Another thought: Could we split the futures stuff out into a feature and make the crate fully work without it? |
Returning a Result made it very difficult to use `send_copy()` in a chain of futures, and, since a Future is just an asynchronous Result, this commit changes the DeliveryFuture to yield a KafkaError on failure. Previously, the DeliveryFuture would only yield Canceled, and, to continue supporting this, a new variant of KafkaError was added, `FutureCanceled`.
This makes the FutureProducer API considerably more ergonomic. For example, in a function which returns a `BoxFuture<(), ::std::io::Error>`, this is what was necessary before:
This becomes even more complicated with the fact that match arms must return the same type and the types of chained Futures can easily get very complex.
Compare that to the code one can write with this PR applied:
Much less verbose, and, more importantly, it flows naturally as part of a chain of futures.
Full disclosure: there is a compilation warning for the unused `Result` returned by `tx.send()` in two places, but this warning was present before the change and I'm also not sure what the correct way to handle an error arising in such a situation actually is.