storage: add more log context to continuous_batch_parser::consume_records #25395
WillemKauf merged 2 commits into redpanda-data:dev
5da6e28 to 085f754
return verify_read_iobuf(
         get_stream(), sz, "parser::consume_records", _recovery)
  .then([this](result<iobuf> record) -> ss::future<result<stop_parser>> {
      if (!record) {
          return ss::make_ready_future<result<stop_parser>>(record.error());
      }
      _consumer->consume_records(std::move(record.value()));
      return _consumer->consume_batch_end().then([](stop_parser sp) {
          return ss::make_ready_future<result<stop_parser>>(sp);
      });
  });
  .then(
    [this](result<iobuf, parser_errc> record)
This was being implicitly converted to std::error_code.
why does only this call site need to change? seems like there are other callers of verify_read_iobuf that didn't change, and then it's not clear why the conversion was a problem in the first place.
"it's not clear why the conversion was a problem in the first place"
I'm outputting the error type in the new log line added here, and to_string(parser_errc) is defined here.
"why does only this call site need to change?"
This is the only call site which has the return type written out explicitly (others use auto).
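To make the point above concrete, here is a minimal, self-contained sketch of why the enum type has to survive up to the log line. Everything in it is a hypothetical stand-in: the `parser_errc` values, the `parser_category` class, and the helpers `erase_type` and `has_to_string` are illustrative and are not the definitions in the Redpanda tree. It assumes C++17 or later.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <system_error>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the parser's error enum; values are illustrative.
enum class parser_errc { none = 0, input_stream_not_enough_bytes = 1 };

// Opt the enum into implicit conversion to std::error_code.
namespace std {
template<>
struct is_error_code_enum<parser_errc> : true_type {};
} // namespace std

// A to_string overload that exists only for the enum type.
constexpr std::string_view to_string(parser_errc e) {
    switch (e) {
    case parser_errc::none:
        return "none";
    case parser_errc::input_stream_not_enough_bytes:
        return "input_stream_not_enough_bytes";
    }
    return "unknown";
}

// Hypothetical error category so the enum can be wrapped in std::error_code.
struct parser_category final : std::error_category {
    const char* name() const noexcept override { return "parser"; }
    std::string message(int v) const override {
        return std::string(to_string(static_cast<parser_errc>(v)));
    }
};

// Found via ADL by std::error_code's converting constructor.
inline std::error_code make_error_code(parser_errc e) {
    static parser_category category;
    return {static_cast<int>(e), category};
}

// Implicit conversion: the enum type is erased into a (value, category) pair.
inline std::error_code erase_type(parser_errc e) { return e; }

// Detects whether an unqualified to_string(T) call compiles.
template<typename T, typename = void>
struct has_to_string : std::false_type {};
template<typename T>
struct has_to_string<T, std::void_t<decltype(to_string(std::declval<T>()))>>
  : std::true_type {};

// The overload is reachable for the enum, but not once it became an error_code.
static_assert(has_to_string<parser_errc>::value);
static_assert(!has_to_string<std::error_code>::value);

inline void check_erasure() {
    auto ec = erase_type(parser_errc::input_stream_not_enough_bytes);
    assert(ec.value() == 1); // only the integer value and category remain
}
```

With a call site that spells the result type as carrying `std::error_code`, only the erased form is available, so a `to_string(parser_errc)` overload cannot be selected without casting back. Call sites that use `auto` simply pick up whatever error type the callee returns.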
"This is the only call site which has the return type written out explicitly (others use auto)."
right, so in consume_header when auto b = verify... contains an error then we have
    if (!b) {
        co_return b.error();
    }
in a function that returns result<stop_parser> instead of result<stop_parser, parser_errc>, but now it's getting a parser_errc instead of std::error_code, so presumably some new conversion is happening somewhere? to just say it was being implicitly converted doesn't seem complete enough to understand the impact or non-impact of the change.
Good point, I am not correcting all of the sites of implicit conversion with this change, only this level/call site where I want to log the error; other users still have the implicit conversion to std::error_code.
There's even this awkward conversion back to parser_errc in parser.cc...
I can go back through and correct all the call sites if you'd like to see that in this PR for completion's sake.
it's not so much about "correcting" as it is about identifying the concern, and addressing it fully in the commit message and/or changing code. i only did the first part.
Amended the commit message to indicate why only this call site was corrected.
@WillemKauf i still don't feel like i understand the implications of this. the conversion escapes through public methods like consume_header() via b.error(). std::error_code and other error codes have different boolean polarities.
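The polarity concern can be illustrated with standard-library types only. In this sketch `std::optional<int>` is merely a stand-in for a result-like type whose boolean conversion means "holds a value"; it is not the `result<>` type used in the PR, and the helper names `failed`, `succeeded`, and `check_polarity` are made up for illustration.

```cpp
#include <cassert>
#include <optional>
#include <system_error>

// std::error_code: contextually true means "an error is set" (value() != 0).
inline bool failed(const std::error_code& ec) { return static_cast<bool>(ec); }

// Result-like stand-in: contextually true means "a value is present".
inline bool succeeded(const std::optional<int>& r) {
    return static_cast<bool>(r);
}

inline void check_polarity() {
    std::error_code ok; // default-constructed: value 0, no error
    std::error_code bad = std::make_error_code(std::errc::io_error);
    assert(!failed(ok));
    assert(failed(bad));

    std::optional<int> value = 7;        // success case
    std::optional<int> empty = std::nullopt; // failure case
    assert(succeeded(value));
    assert(!succeeded(empty));
}
```

The same `if (!x)` therefore reads as "no error" for a bare `std::error_code` and as "failure" for a result-like wrapper, which is why it matters which of the two a given check is applied to.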
I see the concern. This was merged (accidentally) because I had auto-merge enabled and another reviewer approved.
If we don't fully grasp the implications of this change we can revert the commit, and add the log line without the error code being printed.
we don't need to revert it, but we need to explain how things are fine. because just saying that we were casting to std::error_code incorrectly feels like there are potentially hidden semantic changes.
With this change, we have simply moved the location of the implicit cast of parser_errc to std::error_code from verify_read_iobuf() to the return statement in each of its three callers.
All of these functions return a result<> type, whose error_type (template argument S) defaults to std::error_code (lines 31 to 35 in 419f004).
The boolean polarity of the error code return type doesn't matter here: we are using operator bool() from outcome::basic_result to check for errors.
There are no outward functional changes to public users of these functions after the return type change/implicit cast location shift.
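The argument above can be checked against a toy model. The sketch below is not the outcome library and not the Redpanda code: `result` is a deliberately small hypothetical class, the `parser_errc` values and category are illustrative, and `read_before`, `read_after`, `consume_before`, and `consume_after` are invented names that only mimic the shape of a callee and its caller. It assumes C++17 or later.

```cpp
#include <cassert>
#include <string>
#include <system_error>
#include <type_traits>
#include <utility>
#include <variant>

// Hypothetical error enum, registered as convertible to std::error_code.
enum class parser_errc { none = 0, input_stream_not_enough_bytes = 1 };
namespace std {
template<>
struct is_error_code_enum<parser_errc> : true_type {};
} // namespace std

struct parser_category final : std::error_category {
    const char* name() const noexcept override { return "parser"; }
    std::string message(int) const override { return "parser error"; }
};
inline std::error_code make_error_code(parser_errc e) {
    static parser_category category;
    return {static_cast<int>(e), category};
}

// Toy result type: contextually true means success, like a result-style API.
template<typename T, typename E = std::error_code>
class result {
public:
    result(T v) : _state(std::in_place_index<0>, std::move(v)) {}
    // Accept any error type that is implicitly convertible to E.
    template<
      typename U,
      typename = std::enable_if_t<
        std::is_convertible_v<U, E> && !std::is_convertible_v<U, T>>>
    result(U e) : _state(std::in_place_index<1>, E(std::move(e))) {}

    explicit operator bool() const { return _state.index() == 0; }
    const T& value() const { return std::get<0>(_state); }
    const E& error() const { return std::get<1>(_state); }

private:
    std::variant<T, E> _state;
};

enum class stop_parser { no, yes };

// "Before": the callee's result carries std::error_code, so the enum is
// converted inside the callee.
inline result<int> read_before(bool fail) {
    if (fail) return parser_errc::input_stream_not_enough_bytes;
    return 42;
}
// "After": the callee keeps the enum type in its result.
inline result<int, parser_errc> read_after(bool fail) {
    if (fail) return parser_errc::input_stream_not_enough_bytes;
    return 42;
}

inline result<stop_parser> consume_before(bool fail) {
    auto r = read_before(fail);
    if (!r) return r.error(); // already a std::error_code
    return stop_parser::no;
}
inline result<stop_parser> consume_after(bool fail) {
    auto r = read_after(fail);
    if (!r) return r.error(); // parser_errc -> std::error_code happens here
    return stop_parser::no;
}
```

In this model both callers hand back an equal `std::error_code` (same value, same category), and the `if (!r)` checks go through the result type's own boolean conversion rather than that of the error code, which matches the reasoning in the comment above.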
This was being implicitly converted to `std::error_code`, and a future commit will attempt to log the error using `to_string(parser_errc)`. For that reason, the return type and the single call site where the log line is to be added are amended with the proper type, `result<iobuf, parser_errc>`, in order to properly log the error.
…records`
This `ERROR` log line would be more helpful with added information about the batch consumer type and record batch header. Add a new `ERROR` log line to output these in the cold path.
085f754 to 735e93c
@dotnwat do you still feel there are any blockers for this PR?
The ERROR log line we see appearing in the continuous_batch_parser would be more helpful with added information about the batch consumer type and record batch header.
Add a new ERROR log line to output this additional context in the cold path.
Backports Required
Release Notes