
GitHub actions #2714

Merged
merged 1 commit into iovisor:master on Jan 29, 2020

Conversation

@dalehamel (Member) commented Jan 20, 2020

Run BCC test suite with GitHub Actions

With this commit, pushes to branches can trigger tests directly on
GitHub Actions.

It will test against kernels 4.14 and 5.0.0, which correspond
to the latest Ubuntu LTS kernel releases.

Tests are run for both Debug and Release modes.

Tests run inside Docker, in an Ubuntu 19.04 base container.

For the GitHub workflow:

- The test_libbcc suite is run first, to potentially fail fast on these faster
unit tests.
- The Python test suite is run next.

Some of these tests are allowed to fail, but the failure is still reported:

- In catch2 tests, using the [!mayfail] tag; these failures are displayed in
the test summary.
- In Python unittests, using the `@mayFail("Reason")` decorator, which is
introduced for consistency with catch2 (a rough sketch follows below). These report
the reason for the failure, and log it to an artifact of the test run.
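
As a rough sketch only (assuming a plain unittest-style decorator; the name
mayFail matches the description above, but the log file name and exact behaviour
here are illustrative, not necessarily what this PR implements):

# Illustrative sketch only -- not necessarily the exact helper added in this PR.
import functools
import traceback

def mayFail(reason):
    """Run the test, but record an allowed failure instead of failing the suite."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            try:
                return func(self, *args, **kwargs)
            except Exception:
                # Keep the reason and traceback visible in an uploaded artifact,
                # without turning CI red.
                with open("mayfail.log", "a") as log:  # hypothetical artifact path
                    log.write("ALLOWED FAILURE: %s (%s)\n%s\n"
                              % (func.__name__, reason, traceback.format_exc()))
        return wrapper
    return decorator

A test would then be annotated with @mayFail("Reason") directly above its def,
analogous to tagging a catch2 test case with [!mayfail].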

test_libbcc

Currently this appears to be running more assertions than on Jenkins:

test cases:   37 |   35 passed | 2 failed
assertions: 2414 | 2412 passed | 2 failed

But on Jenkins for libbcc I see:

3: test cases:  37 |  36 passed | 1 failed
3: assertions: 626 | 624 passed | 2 failed

Python tests

There are a few Python tests that needed to be marked with @mayFail for now. Fixing these is, I think, outside the scope of adding GitHub Actions, and these failures seem to be legitimate.

These failures are summarized as part of the action run, and the logs are explicitly added as an artifact. An issue should be cut to fix these up once this is merged.

@dalehamel (Member, Author)

FYI @yonghong-song because you mentioned being interested in this in #2710

@yonghong-song (Collaborator)

@dalehamel Looks great. Thanks! I am new to GitHub Actions, but it looks like a really good alternative to Jenkins.

@yonghong-song (Collaborator)

For the test_libbcc failure, it requires the following (test_pinned_table.cc)

  if (system("mount | grep /sys/fs/bpf")) {
    REQUIRE(system("mkdir -p /sys/fs/bpf") == 0);
    REQUIRE(system("mount -o nosuid,nodev,noexec,mode=700 -t bpf bpf /sys/fs/bpf") == 0);
    mounted = true;
  }

Maybe the above failed in the GitHub VMs?

For the py_test_brb and py_test_brb2 failures, most likely some Python networking packages are missing, but off the top of my head I cannot remember which ones. Do you know which distro has this issue in GitHub Actions?

@dalehamel (Member, Author)

> For the test_libbcc failure, it requires the following (test_pinned_table.cc)

Oh ok, I'll check this out. I should be able to do this as part of the test initialization. Thanks for the tip @yonghong-song
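
For illustration, mounting bpffs during test setup could look roughly like the
sketch below, mirroring the shell commands in the C++ snippet above; the helper
name ensure_bpffs_mounted is hypothetical, not code from this PR:

# Hypothetical setup helper: make sure bpffs is mounted before the suite runs,
# mirroring the mount logic from test_pinned_table.cc quoted above.
import os
import subprocess

def ensure_bpffs_mounted(path="/sys/fs/bpf"):
    # "mount | grep" exits 0 when bpffs is already mounted; nothing to do then.
    if subprocess.call("mount | grep -q %s" % path, shell=True) == 0:
        return
    os.makedirs(path, exist_ok=True)
    subprocess.check_call(["mount", "-o", "nosuid,nodev,noexec,mode=700",
                           "-t", "bpf", "bpf", path])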

> For the py_test_brb and py_test_brb2 failures, most likely some Python networking packages are missing, but off the top of my head I cannot remember which ones. Do you know which distro has this issue in GitHub Actions?

I installed these; check the Dockerfile. The failures still occur; it looks like a device is not in the state it expects.

@yonghong-song (Collaborator)

> I installed these; check the Dockerfile. The failures still occur; it looks like a device is not in the state it expects.

The test also failed in my fc31 environment. I will take a look today as well.

@dalehamel (Member, Author)

> For the test_libbcc failure, it requires the following (test_pinned_table.cc)

Oh, this was a red herring - these aren't test failures at all. It looks like this test case expects a failure and tests for it, so this is just the (expected) output. Nowhere else is this table mentioned.

I think that this test is working as expected.

@yonghong-song (Collaborator)

I debugged the test_brb.py failure on fc31. The following change fixes the issue. I will send a pull request soon.

[yhs@localhost python]$ git diff
diff --git a/examples/networking/simulation.py b/examples/networking/simulation.py
index 5395d5df..92921776 100644
--- a/examples/networking/simulation.py
+++ b/examples/networking/simulation.py
@@ -65,7 +65,8 @@ class Simulation(object):
                 raise
 
         if out_ifc: out_ifc.up().commit()
-        ns_ipdb.interfaces.lo.up().commit()
+        if ns_ipdb.interfaces.lo['state'] != 'up':
+            ns_ipdb.interfaces.lo.up().commit()
         ns_ipdb.initdb()
         in_ifc = ns_ipdb.interfaces[in_ifname]
         with in_ifc as v:

Basically, with the pyroute2 package on fc31, the 'lo' interface is already up. It looks like committing an already-up interface on fc31 causes exceptions.

The test_brb2 test failed with the following:

Traceback (most recent call last):
  File "./test_brb2.py", line 176, in test_brb2
    self.assertEqual(self.pem_stats[c_uint(0)].value, 12)
AssertionError: 13 != 12

An off-by-one error. The 12 is based on precise packet counting. Maybe there is a retransmit or something.

@dalehamel (Member, Author)

@yonghong-song oh great - so those are the root of the `CommitException: target state is not set` issue?

> I will send a pull request soon.

Awesome!

What should I do for this PR? There are a couple of failing tests still, but I'm not sure which ones also fail on Jenkins.

@yonghong-song (Collaborator)

I merged the patch to fix the test_brb.py issue on fc31. Could you rebase and resubmit? How do I check the GitHub Actions results?

@yonghong-song (Collaborator)

[buildbot, test this please]

@yonghong-song (Collaborator)

[buildbot, ok to test]

@yonghong-song (Collaborator)

I just kicked off the Jenkins test. I will post the test results once it is done.

@dalehamel (Member, Author)

> Could you rebase and resubmit?

Yeah, I'm working on this now on a separate branch; I'm rebasing your changes.

> How do I check the GitHub Actions results?

Just push a copy of this branch to your private fork for now, then you can see the results on the Actions tab. For me this looks like:

https://github.com/dalehamel/bcc/actions

@dalehamel (Member, Author) commented Jan 22, 2020

Hmm, the tests are hanging; I think that one of my tests may have left a zombie Ruby process...

Edit: yeah, it needs the --kill-child flag or it will hang indefinitely.
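
For illustration only, the general pattern for making sure a spawned helper
process cannot outlive its test looks something like the sketch below; the
--kill-child flag belongs to the existing test tooling, and this standalone
context manager is an assumption, not the PR's code:

# Illustrative only: terminate and reap the traced child in teardown so a
# leftover Ruby process cannot hang the CI run indefinitely.
import subprocess

class TracedTarget(object):
    def __init__(self, argv):
        self.argv = argv

    def __enter__(self):
        self.proc = subprocess.Popen(self.argv)
        return self.proc

    def __exit__(self, *exc):
        self.proc.kill()   # roughly what a --kill-child style option would do
        self.proc.wait()   # reap the child so no zombie is left behind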

@dalehamel (Member, Author)

Hmm, now the tests are hanging on py_test_brb. Interesting...

@yonghong-song (Collaborator)

> Hmm, now the tests are hanging on py_test_brb. Interesting...

Could you apply the following change

-bash-4.4$ git diff                                                                                                   
diff --git a/examples/networking/simulation.py b/examples/networking/simulation.py
index b7b5ed28..e2386fd0 100644
--- a/examples/networking/simulation.py
+++ b/examples/networking/simulation.py
@@ -76,7 +76,7 @@ class Simulation(object):
                 ns_ipdb.interfaces.lo.up().commit()
         else:
             ns_ipdb.interfaces.lo.up().commit()
-        ns_ipdb.initdb()
+        # ns_ipdb.initdb()
         in_ifc = ns_ipdb.interfaces[in_ifname]
         with in_ifc as v:
             v.ifname = ns_ifc
-bash-4.4$

to see whether the test can pass or not?

@dalehamel (Member, Author)

> to see whether the test can pass or not?

@yonghong-song it still seems to hang on this test in the CI environment unfortunately...

@dalehamel (Member, Author)

@yonghong-song I reverted your commit on my branch and it no longer hangs, interesting.

@dalehamel (Member, Author)

It also seems like the kernel version is not being detected correctly: I see errors in tests that are only supposed to run on 4.17 or newer, but they are running on the 4.15 kernel :/
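
For context, the kind of version gate those tests rely on typically looks like
the sketch below (this helper is illustrative, not the test suite's own
utility); note that inside a container platform.release() still reports the
host kernel, so Docker by itself should not change the result:

# Sketch of a kernel-version gate: parse "major.minor" out of the running
# kernel's release string and compare it against the required minimum.
import platform
import re

def kernel_version_ge(major, minor):
    m = re.match(r"(\d+)\.(\d+)", platform.release())
    return bool(m) and (int(m.group(1)), int(m.group(2))) >= (major, minor)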

@yonghong-song (Collaborator)

Looks like GitHub Actions works? The only failure is test-script? Not sure what it is.

@dalehamel (Member, Author)

@yonghong-song the libbcc tests pass (except for the 3 that are marked to allow failures, which report but don't trigger a failure), but for the make test step I have `|| true` hardcoded right now, so that I can examine the test failures (otherwise, once a failure happens, all runs stop). In this step there are still a few Python failures.

I'll try to SSH into the test runner to see if I can figure out these failures. I'd really like it if this could be merged fully green, but these last few Python tests are proving tricky.

@dalehamel (Member, Author)

It looks like in the CI environment the operstate for lo is UNKNOWN, which is why the commit() call is failing. I think this might be a byproduct of having Docker running? In any case, the link appears to actually be up and functional, but the test is failing because the state is UNKNOWN and not explicitly UP. I'll see if I can work around this.
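
A minimal sketch of that workaround, assuming the pyroute2 IPDB API already
used in simulation.py, and treating UNKNOWN the same as UP (that equivalence is
the assumption here, not something this PR guarantees):

# Sketch: only force loopback up when it is genuinely down; inside containers
# its operstate is often UNKNOWN even though the link works fine.
from pyroute2 import IPDB

ipdb = IPDB()
try:
    lo = ipdb.interfaces['lo']
    if lo['state'] not in ('up', 'unknown'):
        lo.up().commit()
finally:
    ipdb.release()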

@yonghong-song (Collaborator)

> It looks like in the CI environment the operstate for lo is UNKNOWN, which is why the commit() call is failing. I think this might be a byproduct of having Docker running? In any case, the link appears to actually be up and functional, but the test is failing because the state is UNKNOWN and not explicitly UP. I'll see if I can work around this.

Great. Thanks for debugging this.

@dalehamel (Member, Author)

OK, I've fixed a bunch of the tests, but there are still a few failing.

On the 4.15 kernel:

	 15 - py_test_debuginfo (Failed)
	 17 - py_test_brb2 (Failed)
	 29 - py_test_tools_smoke (Failed)
	 31 - py_test_usdt (Failed)
	 33 - py_test_usdt3 (Failed)

and on the 5.0.0 kernel:

	 11 - py_test_trace3_c (Failed)
	 15 - py_test_debuginfo (Failed)
	 17 - py_test_brb2 (Failed)
	 22 - py_test_stackid (Failed)
	 29 - py_test_tools_smoke (Failed)
	 31 - py_test_usdt (Failed)
	 32 - py_test_usdt2 (Failed)
	 33 - py_test_usdt3 (Failed)

I'm wondering if I should just mark these as "ok to fail" for now; I think it will probably take some time to get them ship-shape again. Are all of these passing on Jenkins, @yonghong-song?

@dalehamel (Member, Author)

@yonghong-song OK, this should be pretty close to ready for review. I added a Python test helper to decorate tests that fail on GitHub Actions with @mayFail("reason"). This allows these tests to fail, while still reporting a warning and the reason for the failure, without making CI red on GitHub Actions.

The better solution would be to actually fix these tests on GitHub Actions, but that will require some more debugging time for each of them. This approach keeps track of the failing tests so they can be improved incrementally. I can file an issue for these failing tests if you're OK with this approach.

@dalehamel dalehamel force-pushed the github-actions branch 3 times, most recently from e3bdabc to 026ab01, on January 29, 2020 02:57
@dalehamel (Member, Author)

I've gotten this to be green by introducing a Python parallel to the catch2 [!mayfail] tag, in the form of the @mayFail decorator. I think this is now ready for review.

@yonghong-song (Collaborator)

@dalehamel Thanks! Looks great. Perfect is the enemy of the good; we can address the remaining test issues later. I will wait to merge until tomorrow morning, just in case anybody has additional comments.

@yonghong-song yonghong-song merged commit a47c44f into iovisor:master Jan 29, 2020