New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Correct order of ossl_condvar_signal in quic_multistream_test #22616
Conversation
quic_multistream test was issuing a signal on a condvar after dropping the corresponding mutex, not before, leading to potential race conditions in the reading of the associated data Fixes openssl#22588
fake_now in the quictestlib is read/written by potentially many threads, and as such should have a surrounding lock to prevent WAR/RAW errors as caught by tsan: 2023-11-03T16:27:23.7184999Z ================== 2023-11-03T16:27:23.7185290Z WARNING: ThreadSanitizer: data race (pid=18754) 2023-11-03T16:27:23.7185720Z Read of size 8 at 0x558f6f9fe970 by main thread: 2023-11-03T16:27:23.7186726Z #0 qtest_create_quic_connection_ex <null> (quicapitest+0x14aead) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7187665Z openssl#1 qtest_create_quic_connection <null> (quicapitest+0x14b220) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7188567Z openssl#2 test_quic_write_read quicapitest.c (quicapitest+0x150ee2) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7189561Z openssl#3 run_tests <null> (quicapitest+0x2237ab) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7190294Z openssl#4 main <null> (quicapitest+0x223d2b) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7190720Z 2023-11-03T16:27:23.7190902Z Previous write of size 8 at 0x558f6f9fe970 by thread T1: 2023-11-03T16:27:23.7191607Z #0 qtest_create_quic_connection_ex <null> (quicapitest+0x14aecf) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7192505Z openssl#1 run_server_thread quictestlib.c (quicapitest+0x14b1d6) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7193361Z openssl#2 thread_run quictestlib.c (quicapitest+0x14cadf) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7193848Z 2023-11-03T16:27:23.7194220Z Location is global 'fake_now.0' of size 8 at 0x558f6f9fe970 (quicapitest+0x1af4970) 2023-11-03T16:27:23.7194636Z 2023-11-03T16:27:23.7194816Z Thread T1 (tid=18760, running) created by main thread at: 2023-11-03T16:27:23.7195465Z #0 pthread_create <null> (quicapitest+0xca12d) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7196317Z openssl#1 qtest_create_quic_connection_ex <null> (quicapitest+0x14adcb) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7197214Z openssl#2 qtest_create_quic_connection <null> (quicapitest+0x14b220) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7198111Z openssl#3 test_quic_write_read quicapitest.c (quicapitest+0x150ee2) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7198940Z openssl#4 run_tests <null> (quicapitest+0x2237ab) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7199661Z openssl#5 main <null> (quicapitest+0x223d2b) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7200083Z 2023-11-03T16:27:23.7200862Z SUMMARY: ThreadSanitizer: data race (/home/runner/work/openssl/openssl/test/quicapitest+0x14aead) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) in qtest_create_quic_connection_ex
LGTM...pending this coming out of draft before approval. |
I think this is probably worth merging, even though it doesn't fix the nonstop problem yet, as locking shared data (fake_now), and correcting the mutex/signal order is the right thing to do in terms of programming patterns |
Few comments, otherwise looks good |
This might fix the issues I've noticed on VMS. I'll have a look this evening |
/* zero time can have a special meaning, bump it */ | ||
qtest_add_time(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this racy? Could you just ossl_time_add() inside the write lock?
Alternatively you can see this function as something that should be never used in racy contexts and drop the lock calls completely here. Using this in a racy context would mean that we need to have multiple fake_now storages per use actually because otherwise one thread could reset the fake time for another innocent thread and the time would go backwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is racy. qtest_add_time retakes the global lock, so there aren't any read/write races strictly speaking. It is theoretically possible that another thread might read the zero time, but given that this is during setup, there should be no other threads running
24 hours has passed since 'approval: done' was set, but as this PR has been updated in that time the label 'approval: ready to merge' is not being automatically set. Please review the updates and set the label manually. |
As the last fixup is only a trivial code style fix no re-approvals are IMO needed. |
Merged to the master and 3.2 branches. Thank you. |
quic_multistream test was issuing a signal on a condvar after dropping the corresponding mutex, not before, leading to potential race conditions in the reading of the associated data Fixes #22588 Reviewed-by: Hugo Landau <hlandau@openssl.org> Reviewed-by: Paul Dale <pauli@openssl.org> Reviewed-by: Matt Caswell <matt@openssl.org> Reviewed-by: Tomas Mraz <tomas@openssl.org> (Merged from #22616) (cherry picked from commit 22b482a)
fake_now in the quictestlib is read/written by potentially many threads, and as such should have a surrounding lock to prevent WAR/RAW errors as caught by tsan: 2023-11-03T16:27:23.7184999Z ================== 2023-11-03T16:27:23.7185290Z WARNING: ThreadSanitizer: data race (pid=18754) 2023-11-03T16:27:23.7185720Z Read of size 8 at 0x558f6f9fe970 by main thread: 2023-11-03T16:27:23.7186726Z #0 qtest_create_quic_connection_ex <null> (quicapitest+0x14aead) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7187665Z #1 qtest_create_quic_connection <null> (quicapitest+0x14b220) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7188567Z #2 test_quic_write_read quicapitest.c (quicapitest+0x150ee2) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7189561Z #3 run_tests <null> (quicapitest+0x2237ab) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7190294Z #4 main <null> (quicapitest+0x223d2b) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7190720Z 2023-11-03T16:27:23.7190902Z Previous write of size 8 at 0x558f6f9fe970 by thread T1: 2023-11-03T16:27:23.7191607Z #0 qtest_create_quic_connection_ex <null> (quicapitest+0x14aecf) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7192505Z #1 run_server_thread quictestlib.c (quicapitest+0x14b1d6) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7193361Z #2 thread_run quictestlib.c (quicapitest+0x14cadf) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7193848Z 2023-11-03T16:27:23.7194220Z Location is global 'fake_now.0' of size 8 at 0x558f6f9fe970 (quicapitest+0x1af4970) 2023-11-03T16:27:23.7194636Z 2023-11-03T16:27:23.7194816Z Thread T1 (tid=18760, running) created by main thread at: 2023-11-03T16:27:23.7195465Z #0 pthread_create <null> (quicapitest+0xca12d) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7196317Z #1 qtest_create_quic_connection_ex <null> (quicapitest+0x14adcb) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7197214Z #2 qtest_create_quic_connection <null> (quicapitest+0x14b220) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7198111Z #3 test_quic_write_read quicapitest.c (quicapitest+0x150ee2) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7198940Z #4 run_tests <null> (quicapitest+0x2237ab) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7199661Z #5 main <null> (quicapitest+0x223d2b) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7200083Z 2023-11-03T16:27:23.7200862Z SUMMARY: ThreadSanitizer: data race (/home/runner/work/openssl/openssl/test/quicapitest+0x14aead) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) in qtest_create_quic_connection_ex Reviewed-by: Hugo Landau <hlandau@openssl.org> Reviewed-by: Paul Dale <pauli@openssl.org> Reviewed-by: Matt Caswell <matt@openssl.org> Reviewed-by: Tomas Mraz <tomas@openssl.org> (Merged from #22616) (cherry picked from commit 11179b3)
quic_multistream test was issuing a signal on a condvar after dropping the corresponding mutex, not before, leading to potential race conditions in the reading of the associated data Fixes #22588 Reviewed-by: Hugo Landau <hlandau@openssl.org> Reviewed-by: Paul Dale <pauli@openssl.org> Reviewed-by: Matt Caswell <matt@openssl.org> Reviewed-by: Tomas Mraz <tomas@openssl.org> (Merged from #22616)
fake_now in the quictestlib is read/written by potentially many threads, and as such should have a surrounding lock to prevent WAR/RAW errors as caught by tsan: 2023-11-03T16:27:23.7184999Z ================== 2023-11-03T16:27:23.7185290Z WARNING: ThreadSanitizer: data race (pid=18754) 2023-11-03T16:27:23.7185720Z Read of size 8 at 0x558f6f9fe970 by main thread: 2023-11-03T16:27:23.7186726Z #0 qtest_create_quic_connection_ex <null> (quicapitest+0x14aead) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7187665Z #1 qtest_create_quic_connection <null> (quicapitest+0x14b220) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7188567Z #2 test_quic_write_read quicapitest.c (quicapitest+0x150ee2) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7189561Z #3 run_tests <null> (quicapitest+0x2237ab) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7190294Z #4 main <null> (quicapitest+0x223d2b) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7190720Z 2023-11-03T16:27:23.7190902Z Previous write of size 8 at 0x558f6f9fe970 by thread T1: 2023-11-03T16:27:23.7191607Z #0 qtest_create_quic_connection_ex <null> (quicapitest+0x14aecf) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7192505Z #1 run_server_thread quictestlib.c (quicapitest+0x14b1d6) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7193361Z #2 thread_run quictestlib.c (quicapitest+0x14cadf) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7193848Z 2023-11-03T16:27:23.7194220Z Location is global 'fake_now.0' of size 8 at 0x558f6f9fe970 (quicapitest+0x1af4970) 2023-11-03T16:27:23.7194636Z 2023-11-03T16:27:23.7194816Z Thread T1 (tid=18760, running) created by main thread at: 2023-11-03T16:27:23.7195465Z #0 pthread_create <null> (quicapitest+0xca12d) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7196317Z #1 qtest_create_quic_connection_ex <null> (quicapitest+0x14adcb) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7197214Z #2 qtest_create_quic_connection <null> (quicapitest+0x14b220) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7198111Z #3 test_quic_write_read quicapitest.c (quicapitest+0x150ee2) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7198940Z #4 run_tests <null> (quicapitest+0x2237ab) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7199661Z #5 main <null> (quicapitest+0x223d2b) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7200083Z 2023-11-03T16:27:23.7200862Z SUMMARY: ThreadSanitizer: data race (/home/runner/work/openssl/openssl/test/quicapitest+0x14aead) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) in qtest_create_quic_connection_ex Reviewed-by: Hugo Landau <hlandau@openssl.org> Reviewed-by: Paul Dale <pauli@openssl.org> Reviewed-by: Matt Caswell <matt@openssl.org> Reviewed-by: Tomas Mraz <tomas@openssl.org> (Merged from #22616)
Unfortunately, this didn't fix the VMS test issue either... [sigh] |
quic_multistream test was issuing a signal on a condvar after dropping the corresponding mutex, not before, leading to potential race conditions in the reading of the associated data Fixes #22588 Reviewed-by: Hugo Landau <hlandau@openssl.org> Reviewed-by: Paul Dale <pauli@openssl.org> Reviewed-by: Matt Caswell <matt@openssl.org> Reviewed-by: Tomas Mraz <tomas@openssl.org> (Merged from openssl/openssl#22616) Signed-off-by: fly2x <fly2x@hitls.org>
fake_now in the quictestlib is read/written by potentially many threads, and as such should have a surrounding lock to prevent WAR/RAW errors as caught by tsan: 2023-11-03T16:27:23.7184999Z ================== 2023-11-03T16:27:23.7185290Z WARNING: ThreadSanitizer: data race (pid=18754) 2023-11-03T16:27:23.7185720Z Read of size 8 at 0x558f6f9fe970 by main thread: 2023-11-03T16:27:23.7186726Z #0 qtest_create_quic_connection_ex <null> (quicapitest+0x14aead) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7187665Z #1 qtest_create_quic_connection <null> (quicapitest+0x14b220) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7188567Z #2 test_quic_write_read quicapitest.c (quicapitest+0x150ee2) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7189561Z #3 run_tests <null> (quicapitest+0x2237ab) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7190294Z #4 main <null> (quicapitest+0x223d2b) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7190720Z 2023-11-03T16:27:23.7190902Z Previous write of size 8 at 0x558f6f9fe970 by thread T1: 2023-11-03T16:27:23.7191607Z #0 qtest_create_quic_connection_ex <null> (quicapitest+0x14aecf) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7192505Z #1 run_server_thread quictestlib.c (quicapitest+0x14b1d6) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7193361Z #2 thread_run quictestlib.c (quicapitest+0x14cadf) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7193848Z 2023-11-03T16:27:23.7194220Z Location is global 'fake_now.0' of size 8 at 0x558f6f9fe970 (quicapitest+0x1af4970) 2023-11-03T16:27:23.7194636Z 2023-11-03T16:27:23.7194816Z Thread T1 (tid=18760, running) created by main thread at: 2023-11-03T16:27:23.7195465Z #0 pthread_create <null> (quicapitest+0xca12d) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7196317Z #1 qtest_create_quic_connection_ex <null> (quicapitest+0x14adcb) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7197214Z #2 qtest_create_quic_connection <null> (quicapitest+0x14b220) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7198111Z #3 test_quic_write_read quicapitest.c (quicapitest+0x150ee2) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7198940Z #4 run_tests <null> (quicapitest+0x2237ab) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7199661Z #5 main <null> (quicapitest+0x223d2b) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) 2023-11-03T16:27:23.7200083Z 2023-11-03T16:27:23.7200862Z SUMMARY: ThreadSanitizer: data race (/home/runner/work/openssl/openssl/test/quicapitest+0x14aead) (BuildId: d06f7b04830b55de9c8482b398a1781472d1c7d5) in qtest_create_quic_connection_ex Reviewed-by: Hugo Landau <hlandau@openssl.org> Reviewed-by: Paul Dale <pauli@openssl.org> Reviewed-by: Matt Caswell <matt@openssl.org> Reviewed-by: Tomas Mraz <tomas@openssl.org> (Merged from openssl/openssl#22616) Signed-off-by: fly2x <fly2x@hitls.org>
quic_multistream test was issuing a signal on a condvar after dropping the corresponding mutex, not before, leading to potential race conditions in the reading of the associated data
Related to #22588