fix(ci): Add debug output in order to make it easier to investiga… #14238

Conversation

crasu

@crasu crasu commented Oct 21, 2022

Add debug output to make it easier to investigate services that do not start in integ tests.

Summary

Over the last few days, eventd has failed to start up several times during the agw integ tests. Since this does not happen locally, the added debug output will enable us to investigate these issues further. Currently I can only reproduce this on master ;-(

Example:
1 2 3 4

Integ Run

https://github.com/crasu/magma/actions/runs/3300444273 (green run ;-(
https://github.com/crasu/magma/actions/runs/3320152230 (some flaky test failures)

@pull-request-size pull-request-size bot added the size/S Denotes a PR that changes 10-29 lines. label Oct 21, 2022
@github-actions

Thanks for opening a PR! 💯

A couple initial guidelines

Howto

  • Reviews. The "Reviewers" listed for this PR are the Magma maintainers who will shepherd it.
  • Checks. All required CI checks must pass before merge.
  • Merge. Once approved and passing CI checks, use the ready2merge label to indicate the maintainers can merge your PR.

More info

Please take a moment to read through the Magma project's

If this is your first Magma PR, also consider reading

@github-actions github-actions bot added the component: agw Access gateway-related issue label Oct 21, 2022
@github-actions

github-actions bot commented Oct 21, 2022

Oops! Looks like you failed the Python Format Check.

Howto

♻️ Updated: ✅ The check is passing the Python Format Check after the last commit.

@github-actions

github-actions bot commented Oct 21, 2022

feg-workflow

2 files, 203 suites, 40s ⏱️
374 tests: 374 ✔️ passed, 0 💤 skipped, 0 failed
388 runs: 388 ✔️ passed, 0 💤 skipped, 0 failed

Results for commit 6114fd5.

♻️ This comment has been updated with latest results.

@github-actions

github-actions bot commented Oct 21, 2022

dp-workflow

1 file, 1 suite, 2m 18s ⏱️
14 tests: 14 ✔️ passed, 0 💤 skipped, 0 failed

Results for commit 6114fd5.

♻️ This comment has been updated with latest results.

@github-actions

github-actions bot commented Oct 21, 2022

agw-workflow

2 files, 2 suites, 4m 30s ⏱️
615 tests: 611 ✔️ passed, 4 💤 skipped, 0 failed

Results for commit 6114fd5.

♻️ This comment has been updated with latest results.

@github-actions

github-actions bot commented Oct 24, 2022

Oops! Looks like you failed the DCO check. Be sure to sign all your commits.

Howto

♻️ Updated: ✅ The check is passing the DCO check after the last commit.

@crasu crasu force-pushed the pr/improve-debug-output-for-services-running-integ-tests branch from ae81277 to e0bd11a Compare October 24, 2022 06:47
@@ -22,6 +22,6 @@ state_recovery:

# Number of restarts of services_check that triggers recovery process
restart_threshold: 2
-interval_check_mins: 3
+interval_check_mins: 1
Contributor

Why this change? The PR description lists systemd issues.

Contributor Author

Can be removed; no idea how that slipped in here.

@@ -105,6 +106,11 @@ def _query_state_of_services(self, service_status):
print(f' {active_state}')
print(f' {start_time}')

def get_failed_service_info(self, failed_service):
Contributor

Currently this test only runs for a systemd setup. But if TBD (PR from @mpfirrmann) gets merged in its current state, calling this function will not give helpful output. I assume the errors='ignore' will just cause the output to be: Unit eventd.service could not be found.

Contributor Author

For the docker case we would probably need a docker logs call, once that PR exists/gets merged.
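The split between the systemd and docker cases discussed in this thread could look roughly like the sketch below. It is not the code from this PR: the helper names `build_info_command` and `collect_failed_service_info`, the `use_docker` switch, the `--tail 50` limit, and the idea that a container is named after its service are all assumptions made for illustration. Only `systemctl status`, `docker logs`, and decoding with errors='ignore' come from the discussion.

```python
import subprocess
from typing import Callable, List


def build_info_command(service: str, use_docker: bool = False) -> List[str]:
    """Return the command that fetches diagnostics for a failed service."""
    if use_docker:
        # Assumption: the container carries the same name as the service.
        return ["docker", "logs", "--tail", "50", service]
    return ["systemctl", "status", "--no-pager", service]


def collect_failed_service_info(
    service: str,
    use_docker: bool = False,
    run: Callable = subprocess.run,
) -> str:
    """Run the diagnostic command and return its combined output as text."""
    result = run(
        build_info_command(service, use_docker),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so error messages are kept
        check=False,  # a failed unit makes systemctl exit non-zero
    )
    # Undecodable bytes are dropped instead of raising an exception.
    return result.stdout.decode("utf-8", errors="ignore")
```

The `run` parameter exists only so the helper can be exercised without systemd or docker being installed; in a test run it would default to `subprocess.run`.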

@crasu crasu force-pushed the pr/improve-debug-output-for-services-running-integ-tests branch from e0bd11a to 71d3aa3 Compare October 25, 2022 10:29
@crasu

crasu commented Oct 25, 2022

Looks like the issues were caused by ntp. Syslog contains the following message:
Oct 19 19:24:47 magma-dev eventd[67686]: 19 Oct 19:24:47 ntpdate[67686]: no server suitable for synchronization found
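To check other runs for the same symptom, the log could be scanned for this message. The following is a minimal sketch, not part of the PR: the function name `find_ntp_failures` and the regular expression are assumptions derived from the single syslog line quoted above.

```python
import re
from typing import Iterable, List

# Pattern taken from the one syslog line quoted in this comment.
NTP_FAILURE = re.compile(
    r"ntpdate\[\d+\]: no server suitable for synchronization found"
)


def find_ntp_failures(lines: Iterable[str]) -> List[str]:
    """Return every log line that reports a failed ntpdate synchronization."""
    return [line.rstrip("\n") for line in lines if NTP_FAILURE.search(line)]
```

It accepts any iterable of lines, so an open log file can be passed directly; where that file lives on the test VM is not stated in this thread.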

…e services not starting in integ tests. Add debug output to ntpdate as this seems to cause the failed integ tests.

Signed-off-by: Christian Krämer <christian.kraemer@tngtech.com>
@crasu crasu force-pushed the pr/improve-debug-output-for-services-running-integ-tests branch from 71d3aa3 to 6114fd5 Compare October 25, 2022 11:29
@@ -15,7 +15,7 @@ Description=Magma eventd service
[Service]
Type=simple
EnvironmentFile=/etc/environment
-ExecStartPre=/usr/sbin/ntpdate pool.ntp.org
+ExecStartPre=/usr/sbin/ntpdate -vd pool.ntp.org
Contributor Author

This should give us some debug output on master.


@crasu crasu requested a review from nstng October 25, 2022 12:51
@crasu crasu self-assigned this Oct 25, 2022
@crasu crasu marked this pull request as ready for review October 25, 2022 17:38
@crasu crasu requested a review from a team October 25, 2022 17:38
@crasu crasu requested a review from a team as a code owner October 25, 2022 17:38
@crasu crasu requested a review from ulaskozat October 25, 2022 17:38
@crasu crasu merged commit 2c6f85c into magma:master Oct 27, 2022
@crasu crasu deleted the pr/improve-debug-output-for-services-running-integ-tests branch October 27, 2022 08:04
@nstng nstng linked an issue Nov 1, 2022 that may be closed by this pull request
lucasgonze pushed a commit to lucasgonze/magma that referenced this pull request Feb 29, 2024
…e services not starting in integ tests. Add debug output to ntpdate as this seems to cause the failed integ tests. (magma#14238)


Signed-off-by: Christian Krämer <christian.kraemer@tngtech.com>
Labels
component: agw Access gateway-related issue size/S Denotes a PR that changes 10-29 lines.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

eventd is correctly started for the integration tests
4 participants