fix: selected nonsanity tests are stabilized #13543

Merged
merged 1 commit into from Aug 10, 2022

Contributor

@nstng nstng commented Aug 8, 2022

Signed-off-by: Nils Semmelrock <nils.semmelrock@tngtech.com>

Summary

Running the nonsanity tests against an instance with Magma installed from a .deb package showed some instability.

This PR addresses the following tests:

  • test_activate_deactivate_multiple_dedicated.py
    • A short sleep changed the outcome from 0/10 green runs to 10/10.
  • test_no_attach_complete_with_mme_restart.py
    • UE_ATTACH_ACCEPT_IND seems to be sent after the MME restart in various places. I applied an existing mitigation in more places -> 10/10 green runs.
  • test_sctp_shutdown_while_mme_is_stopped.py
    • Only updated an outdated print.
    • This test still fails reliably for me with
      • AssertionError: Timeout (180 sec) occurred while waiting for response message in s1ap_utils.py:221
      • from self._s1ap_wrapper._s1setup() in test_sctp_shutdown_while_mme_is_stopped.py:86
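
As a rough illustration of the mitigation mentioned for test_no_attach_complete_with_mme_restart.py, the sketch below skips stale responses until a message of a different type arrives. `MsgType`, `Response`, and `next_response_ignoring` are hypothetical stand-ins, not the actual s1ap_types or s1ap_utils API; the only assumption taken from the tests is that a response exposes an integer `msg_type`.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterable


class MsgType(Enum):
    """Hypothetical stand-in for s1ap_types.tfwCmd; the values are made up."""
    UE_ATTACH_ACCEPT_IND = 1
    UE_NW_INIT_DETACH_REQUEST = 2
    UE_CTX_REL_IND = 3


@dataclass
class Response:
    """Hypothetical stand-in for a response carrying an integer msg_type."""
    msg_type: int


def next_response_ignoring(
    get_response: Callable[[], Response],
    ignored: Iterable[MsgType],
    max_ignored: int = 10,
) -> Response:
    """Return the first response whose type is not in `ignored`.

    Gives up after `max_ignored` skipped messages so that a misbehaving
    peer cannot make the test loop forever.
    """
    ignored_values = {m.value for m in ignored}
    for _ in range(max_ignored + 1):
        response = get_response()
        if response.msg_type not in ignored_values:
            return response
        print("Ignoring retransmitted message of type", response.msg_type)
    raise AssertionError(f"More than {max_ignored} ignored messages received")


# Usage with a canned message sequence instead of a real S1AP tester.
queue = iter([
    Response(MsgType.UE_ATTACH_ACCEPT_IND.value),
    Response(MsgType.UE_ATTACH_ACCEPT_IND.value),
    Response(MsgType.UE_NW_INIT_DETACH_REQUEST.value),
])
result = next_response_ignoring(
    lambda: next(queue), [MsgType.UE_ATTACH_ACCEPT_IND],
)
```

The upper bound on skipped messages is a design choice of this sketch: it turns an endless stream of retransmissions into a test failure instead of a hang.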

Test Plan

  • standard magma, magma_test, magma_trfserver setup
  • ~/magma$ ./bazel/scripts/run_integ_tests.sh --setup-nonsanity
  • ./bazel/scripts/run_integ_tests.sh --skip-setup-teardown-nonsanity //lte/gateway/python/integ_tests/s1aptests:test_activate_deactivate_multiple_dedicated
  • ./bazel/scripts/run_integ_tests.sh --skip-setup-teardown-nonsanity //lte/gateway/python/integ_tests/s1aptests:test_no_attach_complete_with_mme_restart
  • ./bazel/scripts/run_integ_tests.sh --skip-setup-teardown-nonsanity //lte/gateway/python/integ_tests/s1aptests:test_sctp_shutdown_while_mme_is_stopped
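
The 0/10 and 10/10 figures in the summary come from repeated runs. One way to automate such a count is sketched below. `count_green_runs` is a hypothetical helper, not part of the Magma repository, and the sketch assumes that the test script exits with status 0 on a green run.

```python
import subprocess
import sys
from typing import Sequence


def count_green_runs(cmd: Sequence[str], runs: int = 10) -> int:
    """Run `cmd` `runs` times and return how many runs exited with status 0."""
    green = 0
    for i in range(1, runs + 1):
        completed = subprocess.run(cmd, check=False)
        ok = completed.returncode == 0
        green += ok
        print(f"run {i}/{runs}: {'green' if ok else 'red'}")
    return green


# In a Magma checkout this might be invoked as follows
# (assumption: exit status 0 means the test was green):
#   count_green_runs([
#       "./bazel/scripts/run_integ_tests.sh",
#       "--skip-setup-teardown-nonsanity",
#       "//lte/gateway/python/integ_tests/s1aptests:"
#       "test_no_attach_complete_with_mme_restart",
#   ])

# Self-contained demonstration with a command that always succeeds.
demo_green = count_green_runs([sys.executable, "-c", "pass"], runs=3)
```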

Additional Information

  • This change is backwards-breaking

@pull-request-size pull-request-size bot added the size/M Denotes a PR that changes 30-99 lines. label Aug 8, 2022
@github-actions
Contributor

github-actions bot commented Aug 8, 2022

Thanks for opening a PR! 💯

A couple initial guidelines

Howto

  • Reviews. The "Reviewers" listed for this PR are the Magma maintainers who will shepherd it.
  • Checks. All required CI checks must pass before merge.
  • Merge. Once approved and passing CI checks, use the ready2merge label to indicate the maintainers can merge your PR.

More info

Please take a moment to read through the Magma project's

If this is your first Magma PR, also consider reading

@github-actions github-actions bot added the component: agw Access gateway-related issue label Aug 8, 2022
@github-actions
Contributor

github-actions bot commented Aug 8, 2022

Oops! Looks like you failed the Python Format Check.

Howto

♻️ Updated: ✅ The check is passing the Python Format Check after the last commit.

@github-actions
Contributor

github-actions bot commented Aug 8, 2022

feg-workflow

2 files, 203 suites, 40s ⏱️
374 tests: 374 ✔️, 0 💤, 0 ❌
388 runs: 388 ✔️, 0 💤, 0 ❌

Results for commit 486f16e.

♻️ This comment has been updated with latest results.

@nstng nstng marked this pull request as ready for review August 8, 2022 18:13
@nstng nstng requested review from a team and ssanadhya August 8, 2022 18:13
@nstng nstng force-pushed the fix_selected_nonsanity_tests branch 2 times, most recently from f93ff7d to fe3aee4 Compare August 8, 2022 18:28
@github-actions
Contributor

github-actions bot commented Aug 8, 2022

dp-workflow

15 tests: 15 ✔️, 0 💤, 0 ❌
1 suites, 1 files, 3m 49s ⏱️

Results for commit 486f16e.

♻️ This comment has been updated with latest results.

@github-actions
Contributor

github-actions bot commented Aug 8, 2022

agw-workflow

615 tests: 611 ✔️, 4 💤, 0 ❌
2 suites, 2 files, 3m 55s ⏱️

Results for commit 486f16e.

♻️ This comment has been updated with latest results.


self.assertEqual(
response.msg_type,
response = self._s1ap_wrapper.s1_util.get_response(
Member

This fix for the test case has some technical issues: get_response() does not expect the message type as an argument, and the function assertResponseIgnoringAttachAccept is not used. Moreover, I tested this fix and the test case still seems to be flaky.

The current issue with the test case is that it handles the retransmitted NW initiated detach only once, whereas the MME can send it multiple times because it comes up early after the restart. Please have a look at the fixed test case I have attached here:
test_no_attach_complete_with_mme_restart.py.txt
The test case is completely stable with this change after multiple iterations.

Here is the diff:

root@aaman-10765-varqim:~/repo/magma/lte/gateway/python/integ_tests# git diff s1aptests/test_no_attach_complete_with_mme_restart.py
diff --git a/lte/gateway/python/integ_tests/s1aptests/test_no_attach_complete_with_mme_restart.py b/lte/gateway/python/integ_tests/s1aptests/test_no_attach_complete_with_mme_restart.py
index 109fbf401..ab0becbc9 100644
--- a/lte/gateway/python/integ_tests/s1aptests/test_no_attach_complete_with_mme_restart.py
+++ b/lte/gateway/python/integ_tests/s1aptests/test_no_attach_complete_with_mme_restart.py
@@ -119,7 +119,7 @@ class TestNoAttachCompleteWithMmeRestart(unittest.TestCase):
         # Receive NW initiated detach request
         response = self._s1ap_wrapper.s1_util.get_response()

-        while response.msg_type == s1ap_types.tfwCmd.UE_ATTACH_ACCEPT_IND:
+        while response.msg_type == s1ap_types.tfwCmd.UE_ATTACH_ACCEPT_IND.value:
             print(
                 "Received Attach Accept retransmission from before restart",
                 "Ignoring...",
@@ -140,22 +140,6 @@ class TestNoAttachCompleteWithMmeRestart(unittest.TestCase):
             nw_init_detach_req.Type,
             s1ap_types.ueNwInitDetType_t.TFW_RE_ATTACH_REQUIRED.value,
         )
-        # Receive NW initiated detach request
-        response = self._s1ap_wrapper.s1_util.get_response()
-        self.assertEqual(
-            response.msg_type,
-            s1ap_types.tfwCmd.UE_NW_INIT_DETACH_REQUEST.value,
-        )
-        nw_init_detach_req = response.cast(s1ap_types.ueNwInitdetachReq_t)
-        print(
-            "**************** Received NW initiated Detach Req with detach "
-            "type set to ",
-            nw_init_detach_req.Type,
-        )
-        self.assertEqual(
-            nw_init_detach_req.Type,
-            s1ap_types.ueNwInitDetType_t.TFW_RE_ATTACH_REQUIRED.value,
-        )

         print("**************** Sending Detach Accept")
         # Send detach accept
@@ -168,6 +152,28 @@ class TestNoAttachCompleteWithMmeRestart(unittest.TestCase):

         # Wait for UE context release command
         response = self._s1ap_wrapper.s1_util.get_response()
+
+        # Meanwhile ignore retransmitted NW Initiated Detach Request messages.
+        # This script waits for 20 seconds after MME restart, but most of the
+        # times MME comes up early after restart and retransmits multiple NW
+        # Initiated Detach Request messages multiple times on T3422 Timer expiry
+        while (
+            response.msg_type
+            == s1ap_types.tfwCmd.UE_NW_INIT_DETACH_REQUEST.value
+        ):
+            nw_init_detach_req = response.cast(s1ap_types.ueNwInitdetachReq_t)
+            print(
+                "**************** Received retransmitted NW Initiated Detach "
+                "Req with detach type set to",
+                nw_init_detach_req.Type,
+                "Ignoring...",
+            )
+            self.assertEqual(
+                nw_init_detach_req.Type,
+                s1ap_types.ueNwInitDetType_t.TFW_RE_ATTACH_REQUIRED.value,
+            )
+            response = self._s1ap_wrapper.s1_util.get_response()
+
         self.assertEqual(
             response.msg_type,
             s1ap_types.tfwCmd.UE_CTX_REL_IND.value,
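
The first hunk of the diff adds `.value` to the comparison in the while loop. Assuming tfwCmd is a plain `enum.Enum` (not an `IntEnum`) and `msg_type` is an integer, the comparison without `.value` is never true, so the loop body would never run. The sketch below uses a hypothetical `TfwCmd` enum with made-up numeric values to show the difference.

```python
from enum import Enum


class TfwCmd(Enum):
    """Hypothetical stand-in for s1ap_types.tfwCmd; the values are made up."""
    UE_ATTACH_ACCEPT_IND = 1
    UE_NW_INIT_DETACH_REQUEST = 2


msg_type = 1  # integer message type as carried in a response

# A plain Enum member never compares equal to its underlying integer ...
matches_member = msg_type == TfwCmd.UE_ATTACH_ACCEPT_IND
# ... so the comparison has to use the member's .value.
matches_value = msg_type == TfwCmd.UE_ATTACH_ACCEPT_IND.value

print(matches_member, matches_value)  # prints: False True
```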

Contributor Author

Thank you. Wow, I must have mixed something up in my commit ... assertResponseIgnoringAttachAccept should have been used in all calls after the MME restart to filter out possible UE_ATTACH_ACCEPT_IND responses. Sorry for that.

But your solution seems more sophisticated so I applied it (with minor comment changes). Is this OK for you?

Member

@VinashakAnkitAman VinashakAnkitAman left a comment

Left some comments.

Signed-off-by: Nils Semmelrock <nils.semmelrock@tngtech.com>
@nstng nstng force-pushed the fix_selected_nonsanity_tests branch from fe3aee4 to 486f16e Compare August 10, 2022 12:57
Member

@VinashakAnkitAman VinashakAnkitAman left a comment

LGTM

@VinashakAnkitAman VinashakAnkitAman merged commit c181ca4 into magma:master Aug 10, 2022
maxhbr pushed a commit to maxhbr/magma that referenced this pull request Aug 13, 2022
Signed-off-by: Nils Semmelrock <nils.semmelrock@tngtech.com>
Labels
backported-v1.8 component: agw Access gateway-related issue size/M Denotes a PR that changes 30-99 lines.
3 participants