Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

helpers: try to get exact lines from dmesg that appeared during a test #76

Merged
merged 1 commit into from
Jan 18, 2019

Conversation

i-rinat
Copy link
Contributor

@i-rinat i-rinat commented Jan 17, 2019

It turned out the current version of DmesgOopsFinder is not precise enough. It tries to cut everything before line with substring "Start test" in it. But none of the tests I tried, push such line to the kernel buffer. In that case the whole output of dmesg was used. That makes a single line with "ERROR" substring to mark all subsequent tests as failed.

We can remove that fallback, but then there is no way to determine which parts of the log to take based only on the log contents anyway. However, we know that we need only the part that appeared after the moment DmesgOopsFinder is constructed.

The code in the patch polls output of dmesg -w to extract only lines we are interested in.

@i-rinat i-rinat requested a review from vankoven January 17, 2019 22:35
Copy link
Contributor

@vankoven vankoven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your inconvenience here, but the proposed solution can't work on CI. See my comment below. I'm not happy with the current DmesgOopsFinder implementation, it calls dmesg twice and copies the full buffer between two vms to get very short piece of system log. I proposed to use journalctl to get very precise system log part, see point 4 in #56

It tries to cut everything before line with substring "Start test" in it. But none of the tests I tried, push such line to the kernel buffer.

What was the tests that doesn't push that line? We need to fix that.

helpers/dmesg.py Outdated
@@ -3,6 +3,8 @@
from __future__ import print_function
import re
from . import remote
import subprocess
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're still use python2 for the tests, and official documentation encourage to use the backported subprocess32 module instead. I've also run into some troubles with original subprocess module, like lack of required timeout argument for Popen call. Please use subprocess32 and fallback to default if not found.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It turned out, subprocess module is not required here, so it's remove.

helpers/dmesg.py Outdated
@@ -37,15 +39,29 @@ class DmesgOopsFinder(object):

def __init__(self):
self.node = remote.tempesta
self.proc = subprocess.Popen(['dmesg', '-w'], stdout=subprocess.PIPE,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You create subprocess directly. The test framework originally was designed to use several virtual (or physical) machines at once. One of the main reasons was to make test framework tolerant to Tempesta crashes. If both Tempesta and the framework are started on the same vm, Tempesta crash will lose the tests progress. If framework is started on a separate vm, it's more tolerant to Tempesta crashes: if crash happens, the framework waits Tempesta vm to reboot and continues with the next tests.

However during development, we start the framework and Tempesta on the same vm just for simplicity. So the proposed change works for you. But it won't work for CI.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. Moved back to self.node.run_cmd(). Now it runs journalctl to get a slice of the kernel buffer.

@i-rinat i-rinat force-pushed the ri-sharpen-dmesg-oops-finder branch from 7ef2aa0 to 2d3d84c Compare January 18, 2019 13:25
helpers/dmesg.py Outdated
@@ -37,15 +38,11 @@ class DmesgOopsFinder(object):

def __init__(self):
self.node = remote.tempesta
self.start_time = time.time()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tempesta can be started on other vm, not the same that was used for framework itself. So time between systems may differ, and we need to get current time value on Tempesta node.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to date +%s.%N running on the remote host.

@i-rinat
Copy link
Contributor Author

i-rinat commented Jan 18, 2019

I proposed to use journalctl to get very precise system log part, see point 4 in #56

Yep, new patch version uses journalctl. I was not using it before because documentation said: «Date specifications should be of the format "2012-10-30 18:17:16"». That format doesn't have a sub-second precision, and therefore may cause issues with some lines slipping in and out. But it turned out, journalctl also accepts UNIX timestamps with sub-second precision. Used that.

What was the tests that doesn't push that line? We need to fix that.

("Start test" was not printed to a kernel log until -v flag was used when starting tests. Fixed in #77.)

@i-rinat i-rinat force-pushed the ri-sharpen-dmesg-oops-finder branch from 2d3d84c to 228e1f1 Compare January 18, 2019 13:56
Copy link
Contributor

@vankoven vankoven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Please mark p.4 of #56 as resolved

@i-rinat i-rinat merged commit 7c8e51c into master Jan 18, 2019
@i-rinat i-rinat deleted the ri-sharpen-dmesg-oops-finder branch January 18, 2019 20:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants