subtests.docker_cli: Add --sig-proxy tests #65
Conversation
```python
                self.stderr_strip)
        else:
            return super(OutputGoodBase, self).__str__()
```
Interface change...flagging for 0.6.1 (not a bad thing)
Re:
I agree. Wow, this is a Godzilla of a test 😀 I stuck it on 0.6.1 due to (small) interface change. A quick skim of the code and nothing stood out at me, I'll let you know if anything breaks when integrating with other changes. Thanks! |
The interface (library calls, variables, ...) is untouched; it only adds some info to the log... Or you can just cherry-pick the test and add the framework modification later... |
Untrue |
Ugggg, this PR is being a real bugger to integrate. @ldoktor next time, please squash all the related updates into a single commit. i.e. The 6 commits here about the 'kill' test should all go into one commit with details on all the changes. The commits unrelated to the 'kill' test should go into separate PRs. I'm trying to cherry-pick and rebase my way around, but it's really taking waaay longer than I'd like (partly b/c it's a very complicated set of tests)... |
...okay, they seemed to squash and merge okay, running both tests now just for my own sanity (it's quickly fading 😫) |
Uuuggg, I've spent nearly all afternoon on this one PR and...
...everything is failing 😢 Perhaps worse though is both the kill and sigproxy tests are sooooooooooooo huge and complicated, I don't even know where to begin debugging. Please don't take it personally, the only commits I can accept here for 0.6.1 are 23526ad and e364d0d. Can you please re-work the 'kill' test to be a LOT simpler and do the same for sig-proxy?
We don't need to test every aspect in every combination. Just pick one or two and write something simple, 3-400 lines tops! Any larger, and it should be broken down into separate tests or dropped. The more simple/basic the better - remember novices may need to read, extend, and maintain the code. |
Hi @cevich, sorry about that, I tried to make it informative enough, so when you don't cut the verbose output, you have the reasons described... The output using ...
```
16:05:29 ERROR| RUNNING ---- ---- timestamp=1398175529 localtime=Apr 22 16:05:29 ERROR: random_num failed to run_once: DockerTestFail: Check line not in docker output, signal was probably not passed/handled properly.
missing: Received 20, ignoring...
stopped_log: set([25, 1, 5, 20, 15])
docker_output: ['Received 1, ignoring...', 'Received 5, ignoring...', 'Received 15, ignoring...', 'Received 18, ignoring...', 'Received 25, ignoring...']
16:05:29 DEBUG| RUNNING ---- ---- timestamp=1398175529 localtime=Apr 22 16:05:29 DEBUG: Traceback (most recent call last):
  File "/home/medic/Work/Projekty/autotest/autotest-ldoktor/client/tests/docker/dockertest/subtest.py", line 527, in call_subsubtest_method
    method()
  File "/home/medic/Work/Projekty/autotest/autotest-ldoktor/client/tests/docker/subtests/docker_cli/kill/kill.py", line 316, in run_once
    raise xceptions.DockerTestFail(msg)
DockerTestFail: Check line not in docker output, signal was probably not passed/handled properly.
missing: Received 20, ignoring...
stopped_log: set([25, 1, 5, 20, 15])
docker_output: ['Received 1, ignoring...', 'Received 5, ignoring...', 'Received 15, ignoring...', 'Received 18, ignoring...', 'Received 25, ignoring...']
# signal 20 was not passed to the docker (we were expecting signals 25, 1, 20, 15; in the output you can see that the 20 is missing...)
16:05:47 ERROR| RUNNING ---- ---- timestamp=1398175547 localtime=Apr 22 16:05:47 ERROR: random_name failed to run_once: DockerTestFail: Check line not in docker output, signal was probably not passed/handled properly.
missing: Received 20, ignoring...
stopped_log: set([1, 2, 4, 5, 6, 7, 8, 10, 11, 12, 14, 15, 16, 20, 21, 22, 23, 24, 26, 27, 28, 29, 30])
docker_output: ['Received 1, ignoring...', 'Received 4, ignoring...', 'Received 5, ignoring...', 'Received 6, ignoring...', 'Received 7, ignoring...', 'Received 8, ignoring...', 'Received 10, ignoring...', 'Received 11, ignoring...', 'Received 12, ignoring...', 'Received 14, ignoring...', 'Received 15, ignoring...', 'Received 16, ignoring...', 'Received 18, ignoring...', 'Received 23, ignoring...', 'Received 24, ignoring...', 'Received 26, ignoring...', 'Received 27, ignoring...', 'Received 28, ignoring...', 'Received 29, ignoring...', 'Received 30, ignoring...', 'Received 2, ignoring...']
16:05:47 DEBUG| RUNNING ---- ---- timestamp=1398175547 localtime=Apr 22 16:05:47 DEBUG: Traceback (most recent call last):
  File "/home/medic/Work/Projekty/autotest/autotest-ldoktor/client/tests/docker/dockertest/subtest.py", line 527, in call_subsubtest_method
    method()
  File "/home/medic/Work/Projekty/autotest/autotest-ldoktor/client/tests/docker/subtests/docker_cli/kill/kill.py", line 316, in run_once
    raise xceptions.DockerTestFail(msg)
DockerTestFail: Check line not in docker output, signal was probably not passed/handled properly.
missing: Received 20, ignoring...
stopped_log: set([1, 2, 4, 5, 6, 7, 8, 10, 11, 12, 14, 15, 16, 20, 21, 22, 23, 24, 26, 27, 28, 29, 30])
docker_output: ['Received 1, ignoring...', 'Received 4, ignoring...', 'Received 5, ignoring...', 'Received 6, ignoring...', 'Received 7, ignoring...', 'Received 8, ignoring...', 'Received 10, ignoring...', 'Received 11, ignoring...', 'Received 12, ignoring...', 'Received 14, ignoring...', 'Received 15, ignoring...', 'Received 16, ignoring...', 'Received 18, ignoring...', 'Received 23, ignoring...', 'Received 24, ignoring...', 'Received 26, ignoring...', 'Received 27, ignoring...', 'Received 28, ignoring...', 'Received 29, ignoring...', 'Received 30, ignoring...', 'Received 2, ignoring...']
# The same situation, only the expected set is bigger (also the output of docker)
check: Received 27, ignoring...
output:[]
16:06:57 DEBUG| RUNNING ---- ---- timestamp=1398175617 localtime=Apr 22 16:06:57 DEBUG: Traceback (most recent call last):
  File "/home/medic/Work/Projekty/autotest/autotest-ldoktor/client/tests/docker/dockertest/subtest.py", line 527, in call_subsubtest_method
    method()
  File "/home/medic/Work/Projekty/autotest/autotest-ldoktor/client/tests/docker/subtests/docker_cli/kill/kill.py", line 335, in run_once
    raise xceptions.DockerTestFail(msg)
DockerTestFail: Check line not in docker output, signal was probably not passed/handled properly.
check: Received 27, ignoring...
output:[]
# here we expect signal 27, but none was received.
16:07:51 DEBUG| DockerTestFail: Sub-subtest failures: set(['attach_sigproxy_stress', 'run_sigproxy', 'bad', 'sigstop', 'run_sigproxy_stress', 'random_num', 'attach_sigproxy', 'random_name'])
16:07:52 INFO | FAIL docker/subtests/docker_cli/kill.test_1-of-1 docker/subtests/docker_cli/kill.test_1-of-1 timestamp=1398175672 localtime=Apr 22 16:07:52 Sub-subtest failures: set(['attach_sigproxy_stress', 'run_sigproxy', 'bad', 'sigstop', 'run_sigproxy_stress', 'random_num', 'attach_sigproxy', 'random_name'])
16:07:52 INFO | END FAIL docker/subtests/docker_cli/kill.test_1-of-1 docker/subtests/docker_cli/kill.test_1-of-1 timestamp=1398175672 localtime=Apr 22 16:07:52
16:07:52 DEBUG| Persistent state client._record_indent now set to 1
16:07:52 DEBUG| Persistent state client.unexpected_reboot deleted
16:07:52 DEBUG| Persistent state client.steps now set to []
16:07:52 INFO | Good: []; Not Good: ['non-default-images.sh']; Details: (non-default-images.sh, {'exit': 3, 'stderr': 'Found unexpected image: ldoktor/fedora-qemu latest c4619e7bccac3e5425257ae530434b5514d40d267e48137a6bbc383fb059f6a2\n', 'stdout': ''})
16:07:52 INFO | Environment checks failed! Blame docker/subtests/docker_cli/kill
```
So you can see that usually some signal was not passed. (You can also hit another issue, where a signal was not sent because its name is unknown to the ...) Anyway, I'll try to simplify this test. About the split patches: I'm used to putting different stuff into separate patches to help the reviewer follow the problem. Later it's also easier to rip out the part which is making trouble. Anyway I can squash them, if you don't want these benefits... |
Hehe, you wrote your own better output:
In other words, if you can write a simpler explanation for something...use it! So just have the test literally output: "Signal(s) 20 was not passed to docker, though signals 21, 21, 15 were." or "Signal(s) 20, 21, 15 passed to docker but signal(s) 20 not received". Then there is nothing to try and "decode", the output is immediately human-accessible and there's no question what it means 😀 |
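A minimal sketch of what producing such a message could look like; the function and variable names below are invented for illustration and are not taken from the actual test:

```python
def describe_signal_result(expected, received):
    """Build a human-readable message from two sets of signal numbers."""
    missing = sorted(expected - received)
    passed = sorted(expected & received)
    if missing:
        return ("Signal(s) %s not passed to docker, though signal(s) %s were."
                % (", ".join(str(s) for s in missing),
                   ", ".join(str(s) for s in passed)))
    return "All expected signal(s) %s were passed to docker." % (
        ", ".join(str(s) for s in passed))

# e.g. expected {25, 1, 5, 20, 15}, everything except 20 showed up in the output
print(describe_signal_result(set([25, 1, 5, 20, 15]), set([25, 1, 5, 15, 18])))
```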
Re:
Another great explanation that could be helpful output: Sorry I didn't give more details last week. IIRC, the other main problems in this test were:
Also, don't apologize! Writing anything (including code) is nearly always more about "re-writing" than getting it right the first time. I normally end up catching myself about half-way through a new test with "This is too complicated". At that point I mercilessly hold down the delete key for a bit, and go on from there. Finally, it's HARD to make code accessible/readable; it's much easier to write compact and line-efficient. The art is in knowing what to expand and be more verbose, what is okay already, and what should be thrown out. IMHO, the greatest programming tool is the delete key 😀 |
OK, next week I'll look at it. |
Hi Chris, I tried but failed to do 1 and 2, but the rest should be better now... The configurable stuff (apart from the test command) can't be hardcoded, as the inherited tests need to set the params differently. So the superclass needs to be tweakable, and making it configurable via config is IMO the best way... At first look it might seem strange, but I believe that on a second look even the code is easily understandable. The output should be self-explanatory enough. |
Yep, I've been through the |
Suggestion re:
Remember the sub/subsubtests are in fact subclasses, so you're free to extend that interface as needed for use within your modules. For example, instead of a configurable option, you could add a class attribute, then inherit or override it as needed in further sub-classes. This helps when extending the test case, makes it easy to extract/update those attributes from configuration (if/when needed), or to extract them + their generalized callers for use in the dockertest API. Having a class attribute also reduces the amount of documentation you need to write (though documenting the class attribute is helpful to readers). |
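A rough sketch of the class-attribute approach being suggested; the class names and attributes below are hypothetical, not the actual dockertest API:

```python
class KillBase(object):
    # Signals exercised by the base sub-subtest; subclasses simply override
    # this class attribute instead of reading a per-test config option.
    signals = (1, 15)
    # Hypothetical stress command, likewise overridable per subclass.
    stress_cmd = "true"

    def run_once(self):
        for signum in self.signals:
            self.send_signal(signum)

    def send_signal(self, signum):
        raise NotImplementedError("subclasses implement the delivery method")


class RandomNum(KillBase):
    # Extending the "interface" is just overriding the attribute.
    signals = (1, 5, 15, 20, 25)
```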
Yes, but I prefer the configurable way, as sometimes I want to test just certain signals, or I want to change the stress command (usually in case of failure I want to simplify the test conditions, and this allows me to do it in the config). |
@ldoktor Much better split out into separate files. We're almost there; now please see what you can do to clean up the pylint findings. Thanks. |
DockerTestFail: Sub-subtest failures: set(['attach_sigproxy_stress', 'run_sigproxy', 'parallel_stress', 'bad', 'sigstop', 'run_sigproxy_stress', 'random_num', 'attach_sigproxy', 'random_name'])
|
This pull request adds the possibility to use os.kill instead of "docker kill" as a test command. Additionally it adds the possibility to run the main docker command as a docker attach process. Both features are utilized in 4 new tests, which test the sig-proxy feature of the main process (run or attach). Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
This test checks the `docker run`/`docker attach` --sig-proxy feature. It spawns a container process (run/attach) and, using subprocess.send_signal, kills it with the desired signals. Then it verifies they were (not) forwarded based on the expected values. There is an outstanding bug where sig-proxy can't handle multiple signals without a delay; this test uses a workaround with time.sleep(0.1). Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
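Roughly, the mechanism this commit message describes could look like the standalone sketch below; the image name, the trap-based container command, and the timings are assumptions for illustration, not the subtest's actual code:

```python
import signal
import subprocess
import time

# Run a foreground container whose shell reports any SIGUSR1 (10) it traps.
# With --sig-proxy=true the docker client should forward signals it receives
# to the container's main process.
proc = subprocess.Popen(
    ["docker", "run", "--rm", "--sig-proxy=true", "fedora", "bash", "-c",
     "trap 'echo Received 10, ignoring...' 10; while :; do sleep 1; done"],
    stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

time.sleep(3)                      # crude wait for the container to start
proc.send_signal(signal.SIGUSR1)   # signal the docker client, expect proxying
time.sleep(0.1)                    # workaround: sig-proxy needs a gap between signals
proc.send_signal(signal.SIGTERM)   # proxied SIGTERM lets the container exit
output = proc.communicate()[0]

# Verify the forwarded signal showed up in the container's output.
assert b"Received 10, ignoring..." in output
```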
Split the test into separate files, improve the log output and clean up the code. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
It seems there are still some pylint findings that need addressing:
|
I have no idea why this one really fails, we know passing signals works in the simple cases...
Maybe it's a problem caused by recent updates to DockerCmd? Either way, it's not clear why this is failing even from looking at the debug log.
There shouldn't be failures in cleanup() for an expected condition like a container not existing (a tolerant-cleanup sketch follows these notes).
For each of these, the debug log shows
This is completely misleading,
There's no need to spam the console like that. Just the bare minimum is fine, like the next test does. I realize we have some Bugzillas open, but I know there are more working 'kill' & 'sigproxy' features than we have passing tests for. Perhaps expose the known failures in one simple sub-test, and add another few that all PASS for the working features. That would be much more useful for CI than a bunch of tests that fail out of the gate while the working stuff goes untested. |
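On the cleanup() point above, a minimal sketch of treating a missing container as an expected condition rather than a failure; plain subprocess is used here and the error-string match is an assumption, not the dockertest helpers:

```python
import subprocess

def cleanup(container_name):
    # Force-remove the container; if it never existed (an expected situation
    # for some sub-subtests), don't turn that into a test failure.
    proc = subprocess.Popen(["docker", "rm", "-f", container_name],
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    _, err = proc.communicate()
    if proc.returncode != 0 and b"No such container" not in err:
        raise RuntimeError("cleanup of %s failed: %s" % (container_name, err))
```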
This patch simplifies the config by turning unnecessary options into test-specific variables. Additionally it adjusts some logging to the recent framework changes. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
Hi Chris, could you please provide the full debug log? From the snippets you posted here I'm unable to analyze the problem. As for the run_sigproxy tests, I never saw them failing. Anyway, please look at the latest changes; I stripped out most of the test-specific config variables and simplified the code a bit. |
Btw when this test is accepted, I'll send another PR which removes the 3s wait for container initialization and another variant of the |
Latest docker doesn't return the exit code, but follows this schema:
* with tty=on `docker kill` => 0
* with tty=off `docker kill` => 255
* bash `kill -9 $cont_pid` => 137
Skip the exit code check for now and wait until the behavior stabilizes.
Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
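A hedged sketch of how the relaxed check described in this commit might look; the accepted codes simply mirror the schema above, and the function name is made up:

```python
# Exit codes observed for a killed container, per this commit message:
#   0   -> `docker kill` with tty=on
#   255 -> `docker kill` with tty=off
#   137 -> plain `kill -9 $cont_pid` (128 + SIGKILL)
KNOWN_KILL_EXIT_CODES = frozenset([0, 255, 137])

def check_kill_exit_status(exit_status, strict=False):
    # While docker's behavior is unstable, only warn on unexpected codes
    # unless strict checking is explicitly requested.
    if exit_status in KNOWN_KILL_EXIT_CODES:
        return True
    if strict:
        raise AssertionError("Unexpected exit status: %d" % exit_status)
    print("Warning: unexpected exit status %d, skipping check" % exit_status)
    return False
```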
Sorry, I did not get a chance to look at these improvements yet; I have been troubleshooting the iptables test all day. That is done now, so I can take a look. |
Lukas, I'm still seeing some pylint problems here that need fixing:
The W0703's look like they could be fixed using a pair of try...finally's instead of try...except's. Then there are these old ones that would be nice to get rid of as well, but they're not new:
Which would resolve issue #167. Not sure if I mentioned it to you or not, but just in case: We're using pylint as a quick way to enforce a common set of code/style standards. It's not perfect, but it does have the useful side-effect of generally helping the code to be more readable and maintainable by others. Since there's a good chance somebody else will have to extend or maintain these tests in the future, it's nice to be helpful to them by at least meeting the basic pylint benchmarks. Anyway, see what you can do. Sometimes it's just a matter of consolidating lists into dicts, making some extra methods, or tweaking a hierarchy. I can try and help with some fixes if you're too overwhelmed or tired of looking at this code. Just let me know. |
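To make the W0703 (broad except) suggestion concrete, here is a generic before/after sketch; the `container` object and its methods are invented for illustration:

```python
# Before: catches Exception (pylint W0703) only to guarantee cleanup runs.
def run_broad(container):
    try:
        container.start()
        container.verify_output()
    except Exception:            # pylint: disable=W0703
        container.remove()       # cleanup only happens on failure
        raise

# After: try...finally ensures cleanup always runs and avoids naming
# an overly-general exception class at all.
def run_with_finally(container):
    try:
        container.start()
        container.verify_output()
    finally:
        container.remove()
```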
This patch doesn't change any logic; it only refactors the code to avoid all pylint/PEP 8 style errors. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
Hi @cevich, I do know pylint/pep8 messages and always read them carefully (I even have them in Eclipse). In some places it makes sense to ignore them, and I'm not a big fan of packing a simple test with too many functions. But if this is the style you expect in docker_test, I can produce it. Please let me know what you think about the latest version. |
Okay, will do. Yes, I prefer it; besides enforcing style, having it show totally clean makes new possible problems faster to spot when reviewing. Yep, I'll take a peek at the new version right after lunch... |
@ldoktor Pylint looks better now. However, the console and log spam make it really hard to see what is failing. Also two sub-subtests are not making use of your
I'm still seeing the failure from |
The confusing part for me is, clearly 'USR1' is not a bad signal, so why shouldn't it return zero? The other challenge for me (can be fixed later) is that I think most of these tests fail due to the same one or two bugzillas. In both cases, the failure cause is well known and fixes should be coming soon. How about modifying all but one or two of the tests so they PASS, and leaving the failures that are targeted at the specific bugs? My reasoning is, all the failing tests will basically get skipped in continuous integration, negating any benefit they may provide in exercising the functioning code-paths. Anyway, it can be fixed later, but think about it. |
Pylint looks better; the code sometimes not. But I can live with it 😄. About the log spam, yes. As I wrote in your rebase, I left the beauty part for another pull request because I have 2 other subtests in a queue. The kill command is a rebase problem, sorry for that. The
Yes, most of the failures fail due to the same BZ now, but the tests are different, and once the BZs are fixed they will look for specific regressions. Actually this test might soon start working, as the 3 most troubled signals are about to be declared as non-passable. Then I'll just add them into |
Oooohh, okay, I didn't understand that was your intention. Yep, incremental fixes/additions are always welcome. Okay, I'll double-check/fix that my rebase didn't break |
During rebases, the appending of results moved after the negative check, causing the results of the previous kill_cmd to be printed instead of the current ones. Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
@cevich, please see and include the latest fix |
will do. |
Okay, had to re-jigger some things around, but I made sure your fix is in there before squashing everything down into a few commits. Not so good if we ever need to bisect, but much easier to see at a glance what changed between versions...running final checks before merging, then I can close #167 too, w00t! |
Okay, merged in via #201 |
Hi guys,
this pull request changes the old kill test to make it possible to run some of its tests using `os.kill` on `docker run` and `docker attach`, thus testing the `--sig-proxy=true` option on already existing tests. Additionally it adds a basic test, which verifies `--sig-proxy` works how it should (both positive and negative tests). The first commit modifies `OutputGood` to log the command output in case of failure. This is really essential for log analysis.
Regards,
Lukáš