Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ioengines.c:346: td_io_queue: Assertion `res == 0' failed #1521

Conversation

horshack-dpreview
Copy link
Contributor

Assertion in ioengines.c::td_io_queue() fails for pthread_mutex_unlock() on overlap_check mutex when serialize_overlap=1, io_submit_mode=offload, and verify= are used together.

backend.c::fio_io_sync() invokes td_io_queue(), which expects the caller to have ownership of the overlap_check mutex when serialize_overlap and offloading are configured, as part of the overlap-check interlock with IO_U_F_FLIGHT. The mutex is not acquired for this path because it's not an I/O requiring an overlap check.

The fix is to refine the conditional that triggers td_io_queue() to release the overlap_check mutex. Rather than using broad config options, the conditional now uses a new io_u flag named IO_U_F_OVERLAP_LOCK, which is only set for the offload worker thread path that acquires the mutex.

Link: #1520

Signed-off-by: Adam Horshack (horshack@live.com)

Assertion in ioengines.c::td_io_queue() fails for pthread_mutex_unlock()
on overlap_check mutex when serialize_overlap=1, io_submit_mode=offload,
and verify=<any> are used together.

backend.c::fio_io_sync() invokes td_io_queue(), which expects the caller
to have ownership of the overlap_check mutex when serialize_overlap
and offloading are configured, as part of the overlap-check interlock
with IO_U_F_FLIGHT. The mutex is not acquired for this path because it's
not an I/O requiring an overlap check.

The fix is to refine the conditional that triggers td_io_queue() to
release the overlap_check mutex. Rather than using broad config options,
the conditional now uses a new io_u flag named IO_U_F_OVERLAP_LOCK, which
is only set for the offload worker thread path that acquires the mutex.

Link: axboe#1520

Signed-off-by: Adam Horshack (horshack@live.com)
@axboe axboe merged commit d08dbc0 into axboe:master Feb 28, 2023
@axboe
Copy link
Owner

axboe commented Feb 28, 2023

This is causing a crash, and I should've done a better job reviewing it. The rate-submit.c is nonsensical. Did you test this at all? I'm going to revert the change for now.

@horshack-dpreview
Copy link
Contributor Author

horshack-dpreview commented Feb 28, 2023

This is causing a crash, and I should've done a better job reviewing it. The rate-submit.c is nonsensical. Did you test this at all? I'm going to revert the change for now.

I tested it against a matrix of configurations, like I do all my changes. I have my own automated test bench to do this. What options caused it to crash and what about the change to rate-submit.c is nonsensical?

@axboe
Copy link
Owner

axboe commented Feb 28, 2023

Just run:

./fio t/jobs/t0013.fio

and it'll insta-crash.

@horshack-dpreview
Copy link
Contributor Author

Just run:

./fio t/jobs/t0013.fio

and it'll insta-crash.

Reproduced. I don't know how I added that line to rate-submit.c without seeing the obvious terminating value issue for td. I'm sorry.

I was more perplexed how my test runs didn't catch it. Turns out it only faults when numjobs is an even multiple of 8, which none of my test runs are. Digging into it, for_each_td will only yield an ending td value of NULL when the number of threads allocated is an even multiple of JOBS_PER_SEG (8), since that's the only scenario where the thread_segment[n+1] will have a NULL value for threads:

fio/fio.h

Lines 587 to 593 in 9943adf

static inline struct thread_data *tnumber_to_td(unsigned int tnumber)
{
struct thread_segment *seg;
seg = &segments[tnumber / JOBS_PER_SEG];
return &seg->threads[tnumber & (JOBS_PER_SEG - 1)];
}

To prevent others from making my same careless mistake I'd like to propose changing for_each_td to make it impossible to use the terminating value of td. This can be accomplished by 1) moving the scope of td inside for_each_td or 2) explicitly assigning td to NULL at the end of the loop. The first proposal is cleaner and more efficient but touches a lot of files - the latter has only one touch but is less efficient. Please let me know what you think.

@axboe
Copy link
Owner

axboe commented Feb 28, 2023

Option 1 seems nicer. Even if efficiency isn't THAT important so we could tolerate just clearing it, it'd still just crash rather than be safe by design.

@horshack-dpreview
Copy link
Contributor Author

horshack-dpreview commented Feb 28, 2023

@axboe Thanks. I wanted to give both td and i the same macro-scope treatment but in C/C++ if you declare any type within the for() loop initializer all variables must be of that type. For example, the following is invalid:

#define for_each_td(td, i)  \
    for (int (i) = 0, struct thread_data *(td) = &segments[0].threads[0]; (i) < (int) thread_number; (i)++, (td) = tnumber_to_td((i)))

Even if we wanted to only restrict the scope of td the following is invalid too, since i is not the same type as td declared:

#define for_each_td(td, i)  \
    for (struct thread_data *(td) = &segments[0].threads[0], int (i) = 0; (i) < (int) thread_number; (i)++, (td) = tnumber_to_td((i)))

To work around this i must be initialized outside the for() loop initializer:

#define for_each_td(td, i)  \
    (i) = 0;                \
    for (struct thread_data *(td) = &segments[0].threads[0]; (i) < (int) thread_number; (i)++, (td) = tnumber_to_td((i)))

Which is fine but not beautiful. Next issue: Every snippet which uses for_each_td will get a -Wunused-variable warning because the outer-scope version of td is not used in most of fio's code - it's only declared to use inside for_each_td. The cleanest solution would be to remove all existing local declarations of td intended for use in for_each_td (and the loop index variable too while we're at it). But that means we now have to use a hard-coded variable name for the construct, since C/C++ doesn't provide a way to pass a variable name to a macro and have that macro declare a variable with that name...if the name doesn't already exist (which it does in the outer scope for the current implementation). The solution is to always use the name td within the for_each_td block, which is fine except in 4 places where the code uses td2 instead, since td is used as the function argument name in those cases. For those we'd have to rename that existing td to something else, which seems manageable.

It's doable but comes down to style and whether the juice is worth the squeeze. The alternative is the original option 2, where we assure td ends with a value of NULL so we at least catch the usage via a segfault. To implement that I would modify tnumber_to_td() to range check and return NULL for a thread # that is >= #threads.

@horshack-dpreview
Copy link
Contributor Author

@axboe, Here is the corrected fix for the original issue. Would you like me to create a new pull request or can you pull it from it from my branch?

Link: horshack-dpreview@aac9c03

horshack-dpreview added a commit to horshack-dpreview/fio that referenced this pull request Mar 2, 2023
I recently introduced a bug caused by reusing a struct thread_data *td
after the end of a for_each_td() loop construct.

Link: axboe#1521 (comment)

To prevent others from making this same mistake, this commit refactors
for_each_td() so that both the struct thread_data * and the loop index
variable are placed inside their own scope for the loop. This will cause
any reference to those variables outside the for_each_td() to produce an
undeclared identifier error, provided the outer scope doesn't already
reuse those same variable names for other code within the routine (which
is fine, since the scope will be different).

Because C/C++ doesn't let you declare two different variable types
within the scope of a for() loop initializer, creating a scope both both
struct thread_data * and the loop index required explicitly declaring a
scope with a curly brace. This means for_each_td() includes an opening
curly brace to create the scope, which means all uses of for_each_td()
must now end with an invocation of a new macro named end_for_each_td()
to emit an ending curly brace to match the scope brace created by
for_each_td():

	for_each_td(td, i) {
		while (td->runstate < TD_EXITED)
			sleep(1);
	}
	end_for_each_td()

The alternative is to end every for_each_td() construct with an inline
curly brace, which is off-putting since the implementation of an extra
opening curly brace is abstracted in for_each_td():

	for_each_td(td, i) {
		while (td->runstate < TD_EXITED)
			sleep(1);
	}}

Most fio logic only declares "struct thread_data *td" and "int i" for
use in for_each_td(), which means those declarations will now cause
-Wunused-variable warnings since they're not used outside the scope the
refactored for_each_td(). Those declarations have been removed.

Implementing this change caught a latent bug in eta.c::calc_thread_status()
that accesses the ending value of struct thread_data *td after the end
of for_each_td(), now manifesting as a compile error, so working as
designed :) That bug is corrected as part of this commit.
horshack-dpreview added a commit to horshack-dpreview/fio that referenced this pull request Mar 2, 2023
I recently introduced a bug caused by reusing a struct thread_data *td
after the end of a for_each_td() loop construct.

Link: axboe#1521 (comment)

To prevent others from making this same mistake, this commit refactors
for_each_td() so that both the struct thread_data * and the loop index
variable are placed inside their own scope for the loop. This will cause
any reference to those variables outside the for_each_td() to produce an
undeclared identifier error, provided the outer scope doesn't already
reuse those same variable names for other code within the routine (which
is fine, since the scope will be different).

Because C/C++ doesn't let you declare two different variable types
within the scope of a for() loop initializer, creating a scope both both
struct thread_data * and the loop index required explicitly declaring a
scope with a curly brace. This means for_each_td() includes an opening
curly brace to create the scope, which means all uses of for_each_td()
must now end with an invocation of a new macro named end_for_each_td()
to emit an ending curly brace to match the scope brace created by
for_each_td():

	for_each_td(td, i) {
		while (td->runstate < TD_EXITED)
			sleep(1);
	}
	end_for_each_td()

The alternative is to end every for_each_td() construct with an inline
curly brace, which is off-putting since the implementation of an extra
opening curly brace is abstracted in for_each_td():

	for_each_td(td, i) {
		while (td->runstate < TD_EXITED)
			sleep(1);
	}}

Most fio logic only declares "struct thread_data *td" and "int i" for
use in for_each_td(), which means those declarations will now cause
-Wunused-variable warnings since they're not used outside the scope the
refactored for_each_td(). Those declarations have been removed.

Implementing this change caught a latent bug in eta.c::calc_thread_status()
that accesses the ending value of struct thread_data *td after the end
of for_each_td(), now manifesting as a compile error, so working as
designed :) That bug is corrected as part of this commit.

Signed-off-by: Adam Horshack (horshack@live.com)
@horshack-dpreview
Copy link
Contributor Author

@axboe, I've implemented the discussed changes to for_each_td():

#1531

horshack-dpreview added a commit to horshack-dpreview/fio that referenced this pull request Mar 2, 2023
I recently introduced a bug caused by reusing a struct thread_data *td
after the end of a for_each_td() loop construct.

Link: axboe#1521 (comment)

To prevent others from making this same mistake, this commit refactors
for_each_td() so that both the struct thread_data * and the loop index
variable are placed inside their own scope for the loop. This will cause
any reference to those variables outside the for_each_td() to produce an
undeclared identifier error, provided the outer scope doesn't already
reuse those same variable names for other code within the routine (which
is fine because the scopes are separate).

Because C/C++ doesn't let you declare two different variable types
within the scope of a for() loop initializer, creating a scope for both
struct thread_data * and the loop index required explicitly declaring a
scope with a curly brace. This means for_each_td() includes an opening
curly brace to create the scope, which means all uses of for_each_td()
must now end with an invocation of a new macro named end_for_each_td()
to emit an ending curly brace to match the scope brace created by
for_each_td():

	for_each_td(td, i) {
		while (td->runstate < TD_EXITED)
			sleep(1);
	}
	end_for_each_td()

The alternative is to end every for_each_td() construct with an inline
curly brace, which is off-putting since the implementation of an extra
opening curly brace is abstracted in for_each_td():

	for_each_td(td, i) {
		while (td->runstate < TD_EXITED)
			sleep(1);
	}}

Most fio logic only declares "struct thread_data *td" and "int i" for use in
for_each_td(), which means those declarations will now cause -Wunused-variable
warnings since they're not used outside the scope of the refactored
for_each_td(). Those declarations have been removed.

Implementing this change caught a latent bug in eta.c::calc_thread_status()
that accesses the ending value of struct thread_data *td after the end
of for_each_td(), now manifesting as a compile error, so working as
designed :) That bug is corrected as part of this commit.

Signed-off-by: Adam Horshack (horshack@live.com)
horshack-dpreview added a commit to horshack-dpreview/fio that referenced this pull request Mar 2, 2023
I recently introduced a bug caused by reusing a struct thread_data *td
after the end of a for_each_td() loop construct.

Link: axboe#1521 (comment)

To prevent others from making this same mistake, this commit refactors
for_each_td() so that both the struct thread_data * and the loop index
variable are placed inside their own scope for the loop. This will cause
any reference to those variables outside the for_each_td() to produce an
undeclared identifier error, provided the outer scope doesn't already
reuse those same variable names for other code within the routine (which
is fine because the scopes are separate).

Because C/C++ doesn't let you declare two different variable types
within the scope of a for() loop initializer, creating a scope for both
struct thread_data * and the loop index required explicitly declaring a
scope with a curly brace. This means for_each_td() includes an opening
curly brace to create the scope, which means all uses of for_each_td()
must now end with an invocation of a new macro named end_for_each_td()
to emit an ending curly brace to match the scope brace created by
for_each_td():

	for_each_td(td, i) {
		while (td->runstate < TD_EXITED)
			sleep(1);
	}
	end_for_each_td()

The alternative is to end every for_each_td() construct with an inline
curly brace, which is off-putting since the implementation of an extra
opening curly brace is abstracted in for_each_td():

	for_each_td(td, i) {
		while (td->runstate < TD_EXITED)
			sleep(1);
	}}

Most fio logic only declares "struct thread_data *td" and "int i" for use in
for_each_td(), which means those declarations will now cause -Wunused-variable
warnings since they're not used outside the scope of the refactored
for_each_td(). Those declarations have been removed.

Implementing this change caught a latent bug in eta.c::calc_thread_status()
that accesses the ending value of struct thread_data *td after the end
of for_each_td(), now manifesting as a compile error, so working as
designed :) That bug is corrected as part of this commit.

Signed-off-by: Adam Horshack (horshack@live.com)
horshack-dpreview added a commit to horshack-dpreview/fio that referenced this pull request Mar 3, 2023
I recently introduced a bug caused by reusing a struct thread_data *td
after the end of a for_each_td() loop construct.

Link: axboe#1521 (comment)

To prevent others from making this same mistake, this commit refactors
for_each_td() so that both the struct thread_data * and the loop index
variable are placed inside their own scope for the loop. This will cause
any reference to those variables outside the for_each_td() to produce an
undeclared identifier error, provided the outer scope doesn't already
reuse those same variable names for other code within the routine (which
is fine because the scopes are separate).

Because C/C++ doesn't let you declare two different variable types
within the scope of a for() loop initializer, creating a scope for both
struct thread_data * and the loop index required explicitly declaring a
scope with a curly brace. This means for_each_td() includes an opening
curly brace to create the scope, which means all uses of for_each_td()
must now end with an invocation of a new macro named end_for_each()
to emit an ending curly brace to match the scope brace created by
for_each_td():

	for_each_td(td) {
		while (td->runstate < TD_EXITED)
			sleep(1);
	} end_for_each();

The alternative is to end every for_each_td() construct with an inline
curly brace, which is off-putting since the implementation of an extra
opening curly brace is abstracted in for_each_td():

	for_each_td(td) {
		while (td->runstate < TD_EXITED)
			sleep(1);
	}}

Most fio logic only declares "struct thread_data *td" and "int i" for use in
for_each_td(), which means those declarations will now cause -Wunused-variable
warnings since they're not used outside the scope of the refactored
for_each_td(). Those declarations have been removed.

Implementing this change caught a latent bug in eta.c::calc_thread_status()
that accesses the ending value of struct thread_data *td after the end
of for_each_td(), now manifesting as a compile error, so working as
designed :)

Signed-off-by: Adam Horshack (horshack@live.com)
horshack-dpreview added a commit to horshack-dpreview/fio that referenced this pull request Mar 3, 2023
I recently introduced a bug caused by reusing a struct thread_data *td
after the end of a for_each_td() loop construct.

Link: axboe#1521 (comment)

To prevent others from making this same mistake, this commit refactors
for_each_td() so that both the struct thread_data * and the loop index
variable are placed inside their own scope for the loop. This will cause
any reference to those variables outside the for_each_td() to produce an
undeclared identifier error, provided the outer scope doesn't already
reuse those same variable names for other code within the routine (which
is fine because the scopes are separate).

Because C/C++ doesn't let you declare two different variable types
within the scope of a for() loop initializer, creating a scope for both
struct thread_data * and the loop index required explicitly declaring a
scope with a curly brace. This means for_each_td() includes an opening
curly brace to create the scope, which means all uses of for_each_td()
must now end with an invocation of a new macro named end_for_each()
to emit an ending curly brace to match the scope brace created by
for_each_td():

	for_each_td(td) {
		while (td->runstate < TD_EXITED)
			sleep(1);
	} end_for_each();

The alternative is to end every for_each_td() construct with an inline
curly brace, which is off-putting since the implementation of an extra
opening curly brace is abstracted in for_each_td():

	for_each_td(td) {
		while (td->runstate < TD_EXITED)
			sleep(1);
	}}

Most fio logic only declares "struct thread_data *td" and "int i" for use in
for_each_td(), which means those declarations will now cause -Wunused-variable
warnings since they're not used outside the scope of the refactored
for_each_td(). Those declarations have been removed.

Implementing this change caught a latent bug in eta.c::calc_thread_status()
that accesses the ending value of struct thread_data *td after the end
of for_each_td(), now manifesting as a compile error, so working as
designed :)

Signed-off-by: Adam Horshack (horshack@live.com)
horshack-dpreview added a commit to horshack-dpreview/fio that referenced this pull request Mar 3, 2023
I recently introduced a bug caused by reusing a struct thread_data *td
after the end of a for_each_td() loop construct.

Link: axboe#1521 (comment)

To prevent others from making this same mistake, this commit refactors
for_each_td() so that both the struct thread_data * and the loop index
variable are placed inside their own scope for the loop. This will cause
any reference to those variables outside the for_each_td() to produce an
undeclared identifier error, provided the outer scope doesn't already
reuse those same variable names for other code within the routine (which
is fine because the scopes are separate).

Because C/C++ doesn't let you declare two different variable types
within the scope of a for() loop initializer, creating a scope for both
struct thread_data * and the loop index required explicitly declaring a
scope with a curly brace. This means for_each_td() includes an opening
curly brace to create the scope, which means all uses of for_each_td()
must now end with an invocation of a new macro named end_for_each()
to emit an ending curly brace to match the scope brace created by
for_each_td():

	for_each_td(td) {
		while (td->runstate < TD_EXITED)
			sleep(1);
	} end_for_each();

The alternative is to end every for_each_td() construct with an inline
curly brace, which is off-putting since the implementation of an extra
opening curly brace is abstracted in for_each_td():

	for_each_td(td) {
		while (td->runstate < TD_EXITED)
			sleep(1);
	}}

Most fio logic only declares "struct thread_data *td" and "int i" for use in
for_each_td(), which means those declarations will now cause -Wunused-variable
warnings since they're not used outside the scope of the refactored
for_each_td(). Those declarations have been removed.

Implementing this change caught a latent bug in eta.c::calc_thread_status()
that accesses the ending value of struct thread_data *td after the end
of for_each_td(), now manifesting as a compile error, so working as
designed :)

Signed-off-by: Adam Horshack (horshack@live.com)
horshack-dpreview added a commit to horshack-dpreview/fio that referenced this pull request Mar 3, 2023
I recently introduced a bug caused by reusing a struct thread_data *td
after the end of a for_each_td() loop construct.

Link: axboe#1521 (comment)

To prevent others from making this same mistake, this commit refactors
for_each_td() so that both the struct thread_data * and the loop index
variable are placed inside their own scope for the loop. This will cause
any reference to those variables outside the for_each_td() to produce an
undeclared identifier error, provided the outer scope doesn't already
reuse those same variable names for other code within the routine (which
is fine because the scopes are separate).

Because C/C++ doesn't let you declare two different variable types
within the scope of a for() loop initializer, creating a scope for both
struct thread_data * and the loop index required explicitly declaring a
scope with a curly brace. This means for_each_td() includes an opening
curly brace to create the scope, which means all uses of for_each_td()
must now end with an invocation of a new macro named end_for_each()
to emit an ending curly brace to match the scope brace created by
for_each_td():

	for_each_td(td) {
		while (td->runstate < TD_EXITED)
			sleep(1);
	} end_for_each();

The alternative is to end every for_each_td() construct with an inline
curly brace, which is off-putting since the implementation of an extra
opening curly brace is abstracted in for_each_td():

	for_each_td(td) {
		while (td->runstate < TD_EXITED)
			sleep(1);
	}}

Most fio logic only declares "struct thread_data *td" and "int i" for use in
for_each_td(), which means those declarations will now cause -Wunused-variable
warnings since they're not used outside the scope of the refactored
for_each_td(). Those declarations have been removed.

Implementing this change caught a latent bug in eta.c::calc_thread_status()
that accesses the ending value of struct thread_data *td after the end
of for_each_td(), now manifesting as a compile error, so working as
designed :)

Signed-off-by: Adam Horshack (horshack@live.com)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants