Explore failing //lte/gateway/c/core/oai/test/mme_app_task:mme_procedures_test with --config=asan #11955

themarwhal · 2022-03-04T00:05:07Z (edited by LKreutzer)

The test passes with --config=lsan and without any --config.
With --config=asan we get the following log output:

TASK_MME_APP terminated
lte/gateway/c/core/oai/test/mme_app_task/test_mme_procedures.cpp:2435: Failure
Actual function call count doesn't match EXPECT_CALL(*s1ap_handler, s1ap_mme_handle_handover_command( check_params_in_mme_app_handover_command( mme_ue_s1ap_id, new_enb_id)))...
         Expected: to be called once
           Actual: never called - unsatisfied and active
lte/gateway/c/core/oai/test/mme_app_task/test_mme_procedures.cpp:2424: Failure
Actual function call count doesn't match EXPECT_CALL(*s1ap_handler, s1ap_mme_handle_handover_request( check_params_in_mme_app_handover_request( mme_ue_s1ap_id, 0)))...
         Expected: to be called once
           Actual: never called - unsatisfied and active

themarwhal · 2022-03-07T23:01:41Z

Thanks @LKreutzer !

themarwhal · 2022-03-08T13:43:17Z

Actually, I seem to still see this test failing on master :o @LKreutzer are you seeing this too? (I can't tell if it's just flaky or not though)

LKreutzer · 2022-03-08T13:50:57Z

@themarwhal sorry about the confusion! Yes, the PR #11966 fixes only one asan error in the mme_procedures_test, but there are others that remain, which we have not been able to fix so far.

themarwhal · 2022-03-08T13:53:59Z

got it! Thanks ;D

LKreutzer · 2022-03-09T15:25:06Z

The remaining errors are in the TestFailedPagingForPendingBearers and TestS1HandoverSuccess tests.

It seems that for both tests the errors originate from a race condition on the EXPECT_CALL macros. In each test the two EXPECT_CALL macros seem to be running in parallel threads. We found that often only one of the MATCHER_P2 macros is called. The behaviour seems to be undefined.

From the Google Mock documentation: "Important note: Google Mock requires expectations to be set before the mock functions are called, otherwise the behavior is undefined. In particular, you mustn't interleave EXPECT_CALL()s and calls to the mock functions." It might be that the EXPECT_CALL macros are interleaved with calls to the mock functions in these cases.

We experimented with adding testing::Mock::VerifyAndClearExpectations(s1ap_handler.get()); or .InSequence(seq) but so far without success.
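
To illustrate the suspected ordering problem, here is a minimal sketch that does not use Google Mock at all. FakeHandler, RunExpectationFirst and RunTriggerFirst are hypothetical names invented for this example, and the "missed call" behaviour is a simplification: with real gMock the outcome of interleaving is undefined rather than a clean miss. The sketch only shows why an expectation that is armed after the task thread has already called the handler ends up "never called".

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a mocked S1AP handler; this is NOT gMock.
class FakeHandler {
 public:
  // Rough counterpart of EXPECT_CALL(...).Times(n): must run before the trigger.
  void ExpectCalls(int n) {
    assert(n >= 0);
    std::lock_guard<std::mutex> lock(mu_);
    expected_ = n;
    armed_ = true;
  }

  // Called from the task thread, in the role of the mocked handler function.
  void Handle() {
    std::lock_guard<std::mutex> lock(mu_);
    if (armed_) ++matched_;  // calls made before arming are never counted
    cv_.notify_all();
  }

  // Waits until the expected number of calls arrived or the timeout expired.
  bool WaitSatisfied(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(mu_);
    return cv_.wait_for(lock, timeout,
                        [this] { return armed_ && matched_ >= expected_; });
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  bool armed_ = false;
  int expected_ = 0;
  int matched_ = 0;
};

// Expectation is armed first, then the event is triggered on another thread.
bool RunExpectationFirst() {
  FakeHandler handler;
  handler.ExpectCalls(1);
  std::thread task([&handler] { handler.Handle(); });
  bool ok = handler.WaitSatisfied(std::chrono::milliseconds(2000));
  task.join();
  return ok;
}

// The trigger finishes before the expectation is armed (the losing side of
// the race, forced deterministically here by joining the thread first).
bool RunTriggerFirst() {
  FakeHandler handler;
  std::thread task([&handler] { handler.Handle(); });
  task.join();
  handler.ExpectCalls(1);
  return handler.WaitSatisfied(std::chrono::milliseconds(50));
}
```

In this sketch RunExpectationFirst() reports the expectation as satisfied, while RunTriggerFirst() reports it as unsatisfied, which mirrors the "never called - unsatisfied and active" message in the log above. Waiting on a condition variable instead of sleeping is one possible way to keep the check independent of thread scheduling.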

ssanadhya · 2022-03-15T20:44:12Z

@pruthvihebbani , could you please look at these failures? Several of these are stemming from the fact that EXPECT_CALL is registered after calling the function triggering the event in the EXPECT_CALL.

electronjoe · 2022-03-15T20:48:49Z

Reproduction

If you have access to GitHub Codespaces (you should if you are a member of GH Magma):
Go over to my Bazel prototype branch
Click on the green Code drop-down button
Select the Codespaces toggle
Select New Codespace and when asked select 16 core (why not!)
This will spin up a GitHub codespace that contains my branch
Once the codespace is up (VS Code in the browser), type the following in the terminal window and hit enter:
bazel test //lte/gateway/c/core/oai/test/mme_app_task:mme_procedures_test --config=asan
This triggers a bazel build of all components needed to run the test, then runs the test. You will see ASAN failures.

LKreutzer · 2022-03-16T09:10:47Z

FYI: after moving the EXPECT_CALLs in the TestS1HandoverSuccess and TestFailedPagingForPendingBearers tests to the earliest possible location, we find that they FAILED in only 10 out of 1000 bazel runs, instead of mostly failing (e.g. FAILED in 624 out of 1000 bazel runs). This seems to support the idea that there is a race condition.

ssanadhya · 2022-03-16T17:56:27Z

@LKreutzer , could you please redo the above analysis now that #12141 is merged?

LKreutzer · 2022-03-17T10:06:34Z

@ssanadhya @themarwhal Findings regarding the flakiness (on the current master) (#12166):

Running the tests 1000 times without asan or lsan with
bazel test //lte/gateway/c/core/oai/test/mme_app_task:mme_procedures_test --runs_per_test=1000
results in FAILED in 126 out of 1000 runs (instead of FAILED in 20 out of 100 before this change).
Running the tests 100 times with asan
bazel test //lte/gateway/c/core/oai/test/mme_app_task:mme_procedures_test --config=asan --runs_per_test=100
results in FAILED in 24 out of 100 runs.
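
Since the runs above use different totals (1000 vs 100), it can help to normalise them before comparing. A small sketch, where FailureRate and PrintReportedRates are hypothetical helpers and the inputs are just the "FAILED in X out of Y" numbers quoted in this comment:

```cpp
#include <cassert>
#include <cstdio>

// Fraction of failed runs, taken from bazel's "FAILED in X out of Y" summary.
double FailureRate(int failed, int total) {
  assert(total > 0 && failed >= 0 && failed <= total);
  return static_cast<double>(failed) / total;
}

// Prints the three rates reported above as percentages.
void PrintReportedRates() {
  std::printf("no sanitizer, current master: %.1f%%\n",
              100.0 * FailureRate(126, 1000));
  std::printf("no sanitizer, before change:  %.1f%%\n",
              100.0 * FailureRate(20, 100));
  std::printf("asan, current master:         %.1f%%\n",
              100.0 * FailureRate(24, 100));
}
```

Normalised this way, the run without sanitizers fails less often than before the change, while the asan run fails roughly twice as often as the run without sanitizers. The sample sizes differ, so this is only a rough comparison.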

The errors are still related to the EXPECT_CALLs, but now occur in a number of different TEST_F cases.

ssanadhya · 2022-03-17T19:34:57Z

@LKreutzer , thanks for doing the analysis. Let's continue the discussion on #12166 .

themarwhal mentioned this issue Mar 4, 2022

bazel: Bazelify MME #9714

Closed

LKreutzer mentioned this issue Mar 4, 2022

fix(mme): fix asan error in mme_app_ip_imsi #11966

Merged

1 task

themarwhal linked a pull request Mar 7, 2022 that will close this issue

fix(mme): fix asan error in mme_app_ip_imsi #11966

Merged

1 task

themarwhal closed this as completed Mar 7, 2022

LKreutzer reopened this Mar 8, 2022

ssanadhya mentioned this issue Mar 15, 2022

[agw][unit tests] Failure in MME service unit tests with ASAN enabled #12128

Closed

4 tasks

ssanadhya assigned pruthvihebbani Mar 15, 2022

ssanadhya mentioned this issue Mar 15, 2022

fix(mme): Fix ASAN errors in EMM context conversion unit test #12133

Merged

pruthvihebbani linked a pull request Mar 16, 2022 that will close this issue

fix(mme): Fix flaky unit tests #12141

Merged

ssanadhya closed this as completed in #12141 Mar 16, 2022

themarwhal mentioned this issue Mar 17, 2022

chore: remove manual tags from mme and spgw procedure tests #12164

Closed

1 task
