
@kparal (Collaborator) commented Feb 1, 2018

This converts the task to the Standard Interface [1] [2]. It's a slight variation of SI, because SI is only concerned with package-specific tasks, not generic tasks (running on all packages). The small differences are described in libtaskotron's readme file (currently in the develop branch). We call such tasks "Taskotron generic tasks". [3]

In a nutshell, runtask.yml is removed and tests.yml (an ansible playbook) is used instead. The task needs to run as root when run through runtask, because that's what SI demands. Of course you can adjust the playbook however you see fit and change how it's run locally (ansible-playbook tests.yml), so it might be possible to have a local way of executing the same task without root.

I had to disable integration testing in the test suite for this very reason (unless you want to run it as root locally, hum). I'm not sure how soon (or whether) we'll have a solution for this in the taskotron runner itself.

One of the few upsides to SI is that you can now run the task pretty much independently on any test system. I.e. you don't need to run it through runtask or any other runner; you run it directly through ansible (on a remote machine, as root, of course), and you see exactly what's happening. The only thing required is a couple of ansible variables, in this case just taskotron_item (and perhaps taskotron_supported_arches).
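
For illustration, a direct run could pass those via an extra-vars file (a sketch only; vars.yml is a made-up name, and the values are just the playbook defaults quoted later in this PR):

```yaml
# vars.yml - hypothetical extra-vars file, used as:
#   ansible-playbook tests.yml -e @vars.yml
taskotron_item: python-gear-0.11.0-1.fc27  # NEVR of the Koji build to test
taskotron_supported_arches:                # arches the task should consider
  - x86_64
```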

I had to add download_rpms.py as a workaround for downloading RPMs. Originally I had written this using just the koji download-build command (called directly from the playbook), but with the recent feature of also downloading build.log files, I had to go back to using libtaskotron's koji directive (the koji command can't download build.log files when called from the command line).

Currently all tasks running on http://taskotron-dev.fedoraproject.org/ need to be based on SI and are pulled from the develop branch, so I had to disable python-versions there until this change is accepted. From my point of view, since SI is the inevitable future, I'd be glad to have this code merged to develop so that we can continue running it and polishing the rough edges. If you'd prefer having it in a differently named branch, our tooling can't really do that right now, but of course everything can be patched with enough effort. (For us, it would be much easier to have this in develop.)

https://taskotron.fedoraproject.org/ continues to run the old-style tasks (with runtask.yml) pulled from the master branch for the moment. I don't have an ETA for using SI tasks on the production server.
(The stg server might be going away soon, so I'm not mentioning it.)

[1] https://fedoraproject.org/wiki/CI/Tests
[2] https://fedoraproject.org/wiki/Changes/InvokingTests
[3] https://pagure.io/taskotron/libtaskotron/blob/develop/f/README.rst

@hroncok self-requested a review on February 1, 2018 14:28
taskotron_generic_task: true
# below are fallback vars for local testing
artifacts: ./artifacts
taskotron_item: python-gear-0.11.0-1.fc27 # you should really override at least this :)
@hroncok (Member):

What does this do? What is python-gear and why does it say we need to change it?

@kparal (Collaborator, author):

That's a default variable value (a random recent koji package with python code in it), so that you can run ansible-playbook tests.yml without studying and adding any variables on the command line, and still see it doing something.

@hroncok (Member):

ok

download_rpms.py Outdated

koji = koji_directive.KojiDirective()

print 'Downloading rpms for %s into %s' % (koji_build, rpmsdir)
@hroncok (Member):

It's 2018 and you've just written a Python 2 only script. Every time you do that, some little part of me dies. We are not going to maintain a Legacy Python only script in our repo. It's already tedious to keep in mind that our code here needs to be Python 2 compatible; this just makes it worse. I suppose this code should be part of taskotron, not our repo, but I'll get to that in a different section.

@kparal (Collaborator, author):

That's because I'm not familiar with Python 3, even in 2018 :-/ This code is a semi-temporary file that I've already copied to several tasks, and I'm definitely not happy about it. It's there because:
a) invoking the koji download-build command directly is not sufficient for this task (you need build.log files)
b) you can't interact with python directly in ansible, so you need a wrapper that accepts cmdline arguments and converts them into python calls which use our koji_directive helper library

In the future, we'd like to have a native ansible module that exposes all the important input arguments, so that you can call it directly and don't need a wrapper.

Of course, if you decide to convert this to Python 2/3 compatible code in the meantime (or have an even better idea), I have no issue with it (quite the opposite).

on a production machine!)::

.. code:: console

    $ ansible-playbook tests.yml -e taskotron_item=<nevr>
@hroncok (Member):

Is this how you run the checks in production now instead of the runtask command? Will this command give the yaml output that we can parse as we did in our integration tests? We want to make sure the tests work in Taskotron, not that some ansible playbook works.

@kparal (Collaborator, author):

Basically, yes. We set up more variables. Here's an example:
http://taskotron-dev.fedoraproject.org/artifacts/all/6ea80e0c-0756-11e8-920b-525400ee7c53/task_output/tests.yml/taskotron/ansible_vars.json.gz

The ResultYAML file will be stored in {{artifacts}}/taskotron/results.yaml, which for local execution defaults to ./artifacts/taskotron/results.yaml. That's exactly the file we will then send to ResultsDB.
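
For a rough idea of what that file contains, a ResultYAML file for this check might look something like this (a sketch; the field values are illustrative, not taken from an actual run):

```yaml
results:
  - item: python-gear-0.11.0-1.fc27   # the tested Koji build (NEVR)
    type: koji_build
    checkname: dist.python-versions
    outcome: PASSED
```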

This is as close as possible to actual taskotron execution, so verifying this is useful, I believe. Of course it's not the same as running through the whole taskotron stack.

@hroncok (Member):

Our integration tests now read the results.yaml file.

artifacts: ./artifacts
taskotron_item: python-gear-0.11.0-1.fc27 # you should really override at least this :)
taskotron_supported_binary_arches:
- x86_64
@hroncok (Member):

Did we just get rid of all the other architectures?

@kparal (Collaborator, author):

Again, that's a default variable value, for local execution when developing/experimenting. It is expected that you override this (and the taskotron stack will do that for you, with the list of arches we want to execute on).

@hroncok (Member):

ok

tests.yml Outdated
- name: Run task
shell: >
./python_versions_check.py {{ taskotron_item }} {{ workdir.path }}
@hroncok (Member):

Can we invoke python2 here directly instead of adding a shebang and executable flag? Would that break something?

@kparal (Collaborator, author):

Shouldn't break anything, I believe. I'm simply used to the current approach, that's why I wrote it that way.
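
For clarity, the suggested change would look roughly like this (a sketch of the same Run task step, with the interpreter named explicitly instead of relying on the shebang):

```yaml
# Sketch: invoke the interpreter explicitly; the script then needs no
# executable bit or shebang line.
- name: Run task
  shell: >
    python2 ./python_versions_check.py {{ taskotron_item }} {{ workdir.path }}
```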

tests.yml Outdated
- block:
- name: Download RPMs from Koji
shell: >
./download_rpms.py {{ taskotron_item }} {{ workdir.path }}
@hroncok (Member):

Instead of our home-brewed script, maybe a taskotron_download_rpms --with_logs command should exist, be maintained upstream, and be used here. How are the other checks doing this?

@kparal (Collaborator, author):

It could, yes. As I replied above, a preferable approach is a native Ansible module. But right now, this is what I have ready.
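
To sketch where this could be heading, such a native module might eventually be used like this (entirely hypothetical; neither the module name nor its parameters exist yet):

```yaml
# Hypothetical future step, once a native Ansible module exists upstream:
- name: Download RPMs from Koji
  taskotron_koji_download:         # made-up module name
    nevr: "{{ taskotron_item }}"   # build to download
    dest: "{{ workdir.path }}"     # where to store the RPMs
    build_logs: true               # also fetch build.log files
```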

tox.ini Outdated
basepython = python3
commands = python -m pytest -v {posargs} test/integration
sitepackages = False
;; disabled during transition to ansiblized Taskotron (Standard Interface support)
@hroncok (Member):

What is the actual reason to disable the tests? If I change the integration tests to run the ansible playbook in mock, would that work?

@hroncok (Member):

On it.

@hroncok (Member):

OK, there is a way. I have some very rough code ready and am testing it.

@kparal (Collaborator, author):

I have no idea :-) The reason I disabled it is that I don't have time ATM to implement running the test suite in docker/mock/something else, and I hoped you either wouldn't care that much (since with the ansible playbook, the taskotron runner is now a very thin wrapper around all of this), or would implement it yourself. Or I might get to it eventually. Just not immediately.

@hroncok (Member):

pushed, let's see what Travis says

dnf:
name: "{{ item }}"
state: latest
with_items:
@hroncok (Member):

What Fedora version is this running on in production?

@kparal (Collaborator, author):

The approach hasn't changed: we pick the image to run on based on the item tested. So in this case we parse the NVR and use the disttag to decide which image to spin up.

Once everything is installed you can run the task on a Koji build
using the
``name-(epoch:)version-release`` (``nevr``) identifier.
To run this task locally, execute the following command as root (don't do this
@hroncok (Member):

This is bad design: it worked before, and now it requires this extra step (getting a system where this can run). Once I set up a mock-based workaround for this, I'll provide a way to run it not only in the integration tests, but also manually.

@kparal (Collaborator, author):

Agreed. I proposed some ideas on how to mitigate this, at least for local execution, in the PR description.

@hroncok (Member):

Added a mock section below.

For example::

.. code:: console

    $ ansible-playbook tests.yml -e taskotron_item=python-gear-0.11.0-1.fc27
@hroncok (Member):

There was a --debug switch to runtask. How do I turn on debug logging now?

@kparal (Collaborator, author) commented Feb 1, 2018:

Currently, when you use runtask --debug, we do two things:

  1. Switch taskotron runner to debug mode
  2. Append -vv to ansible-playbook invocation.

Since the task no longer runs on the same machine and in the same process as the runner, our setting no longer applies to the libtaskotron libraries used inside the task. However, you can set up your own ansible variable and, if it is present, switch your task to debug output (for libtaskotron libraries, simply use logging.getLogger('libtaskotron').setLevel(logging.DEBUG)). Even better, we can expose an ansible variable taskotron_debug: bool and you can react to that. That way the task will show debug output even when run through runtask, and there will be a standard way to toggle this for all tasks (that support it).

That being said, all of this might be of limited use and I'm actually not sure it's needed. Since the task is now run through ansible, you no longer see its standard output (at least not in real time). Instead, you're required to create a {{artifacts}}/test.log file containing the log output of your task. That can always be at debug level, if you want. Or you can create test.log with standard-level output and test.log.debug with debug-level output, every time. It's your preference. We will always archive the whole {{artifacts}} dir. Still, if you prefer a boolean toggle, I can expose it easily.
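
If we did expose such a toggle, a task could react to it with something along these lines (a sketch only; taskotron_debug is merely the proposed variable, and the --debug flag on the script is made up for illustration):

```yaml
# Sketch: pass a (hypothetical) --debug flag to the check script when
# the proposed taskotron_debug variable is set to true.
- name: Run task
  shell: >
    ./python_versions_check.py {{ taskotron_item }} {{ workdir.path }}
    {{ '--debug' if taskotron_debug | default(false) else '' }}
```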

@hroncok (Member):

keep it how it is, thanks for the explanation

tests.yml Outdated
- name: Compute architectures to download
set_fact:
download_arches: "{{ taskotron_supported_arches | join(',') }}"
download_arches: "{{ taskotron_supported_binary_arches | join(',') }}"
@kparal (Collaborator, author):

Thanks. Actually, I messed this up a little when converting from koji download-build to download_rpms.py, so I'll fix it in a follow-up commit.

@hroncok force-pushed the feature/ansiblize branch 2 times, most recently from 37f5cc1 to b1833b5 on February 1, 2018 18:59
@hroncok (Member) commented Feb 1, 2018

Rebased. Integration tests now use mock. The Tests section of the README needs some adjusting, but I don't have enough energy any more.

Thoughts, @irushchyshyn?

@hroncok requested a review from irushchyshyn on February 1, 2018 19:43
@hroncok added the WIP label on Feb 1, 2018
@kparal (Collaborator, author) commented Feb 2, 2018

Awesome work, Miro, thanks. You deserve a 🍪 .

@irushchyshyn (Collaborator):

> Thoughts, @irushchyshyn?

I just took a quick look over the code and the general idea and it looks good. Great work @kparal and @hroncok !

I would like to play with it a little more today and get a better idea of what is going on. I will let you know, but feel free to merge if this is pressing.

files = ['taskotron_python_versions'] + glob.glob('*.py') + ['tests.yml']
mockenv.copy_in(files)
yield mockenv
mockenv.orphanskill()
@hroncok (Member):

I'm not sure if this is the right way of doing this, but I found no way to "kill all processes from mock".

Note that sometimes when I tried to Ctrl+C the tests, it only killed the mock process and the entire pytest thing detached from my stdin, so it kept running in the background and I could not Ctrl+C it any more. This is very inconvenient, so I recommend running the integration tests with --maxfail=1 when running interactively.

print(err, file=sys.stderr) # always print stderr in this case
raise RuntimeError('runtask exited with {}'.format(proc.returncode))
results = parse_results(err)
mock.shell('ansible-playbook tests.yml -e taskotron_item={}'.format(nevr))
@hroncok (Member):

This ends up with a positive exit code when the check FAILs. So I didn't check the result. However, there is failure (of the check) and failure (of the entire circus), and we should check the exit status and try to guess (at least if it's negative, it was killed and we should raise KeyboardInterrupt or something like that).

@hroncok (Member):

Added, it's either 0 or 2.

@kparal (Collaborator, author):

To clarify: the SI spec says the playbook exit code reflects the result of the check - 0 means PASS, non-zero means FAIL. It doesn't have the concept of generic tests. To stay close to the SI spec, we opted (at least for now) for exiting with 0 if all tests pass and non-zero if at least one check fails in our generic tasks. Unfortunately that prevents us from distinguishing a failed test from a crashed test. So, as long as {{artifacts}}/taskotron/results.yaml is present, we consider the test to have finished fine. This is something we should discuss with the SI spec authors in the future.

mock.shell('ansible-playbook tests.yml -e taskotron_item={}'.format(nevr))
mock.copy_out('artifacts', clean_target=True)

with open('artifacts/test.log') as f:
@hroncok (Member):

The log contains all the previous runs; we need to purge it before each test.

@hroncok (Member):

testing this

@kparal (Collaborator, author):

I might be missing something, but test.log should be overwritten in the Download RPMs from Koji ansible step. A possible approach is also to use a different artifacts directory for each integration test suite invocation.

@hroncok (Member):

It contained other stuff. Now I need to preserve a separate artifacts dir for each nevr, but that's fine. It will also make debugging test suite failures easier.

@hroncok (Member) commented Feb 2, 2018:

The test suite actually only worked because it got preserved. So this was clearly a bug in the design of the tests (previously, each runtask run had a different artifact path; now they shared it).

@hroncok (Member):

Pushed. Also, this may allow us to add a --fake switch that will not run the checks but use the stuff from the artifacts directories (e.g. if you are only changing the tests but not the implementation).

@hroncok (Member):

Added that as well.

Previously this only worked by accident

Bonus: We can now inspect the contents of the folder, if there is a failure
*.pyo
*.pyc
__pycache__
/artifacts
@irushchyshyn (Collaborator):

I'm not sure why this gets removed.
A separate directory gets created for each nevr only during the integration tests so far. When running the ansible-playbook ... command directly, the results get saved to artifacts.

@kparal (Collaborator, author):

Are you overriding the artifacts variable on the command line?
ansible-playbook ... -e artifacts=path

@kparal (Collaborator, author):

Never mind, maybe I misunderstood the question. Either way, I also believe it's better to keep this in .gitignore. When somebody runs the playbook with the default vars, that's where the artifacts get created.

@irushchyshyn (Collaborator):

I am not overriding artifacts, I just took the README and tried running it as it says :)
I will add it back.

Besides, while test.log gets overwritten with each subsequent run, output.log keeps previous results (also with default vars).

@hroncok (Member):

ok, add it back, my bad, I hadn't considered direct runs.

@kparal (Collaborator, author) commented Feb 3, 2018:

@irushchyshyn When running it locally, it's your duty to clear the artifacts dir (or not). The task can't know your intended use case. Of course, you can e.g. generate random artifacts dir paths inside the playbook (the same way it generates workdirs), in case the variable is not set externally. It's your task; you can do whatever you prefer for local runs :) Local runs are mostly for development.
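
A sketch of how the playbook could generate such a path when none is supplied (this assumes Ansible's tempfile module and would also require dropping the static artifacts: ./artifacts default from the playbook vars):

```yaml
# Create a unique artifacts dir unless one was passed in via -e artifacts=...
- name: Create a fresh artifacts directory for this run
  tempfile:
    state: directory
    prefix: artifacts_
  register: generated_artifacts
  when: artifacts is not defined

- name: Point the artifacts variable at the generated directory
  set_fact:
    artifacts: "{{ generated_artifacts.path }}"
  when: artifacts is not defined
```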

@irushchyshyn (Collaborator):

@kparal thanks for the explanation.

@irushchyshyn (Collaborator) left a review:

I have added some changes to the README Tests section; otherwise it looks good to me 👍

@irushchyshyn merged commit 569ed88 into develop on Feb 2, 2018
@irushchyshyn (Collaborator):

@kparal thank you for this change!
Merged to develop, let me know if there are any issues enabling it for http://taskotron-dev.fedoraproject.org/

@irushchyshyn deleted the feature/ansiblize branch on February 2, 2018 18:22
@kparal (Collaborator, author) commented Feb 3, 2018

I'll enable it on Monday and let you know if there are any issues. Thanks for the fast responses in this PR.

@kparal (Collaborator, author) commented Feb 5, 2018

It's running on taskotron-dev now once again:
http://taskotron-dev.fedoraproject.org/resultsdb/results?&testcases=dist.python-versions

@hroncok (Member) commented Feb 21, 2018

@kparal Can we merge this to master?

@kparal (Collaborator, author) commented Feb 21, 2018

Unfortunately, not yet. I'm still working on getting the libtaskotron code merged to master.

@hroncok (Member) commented Feb 21, 2018

Do you have any (even rough) estimate? I just need to know whether we need to backport new checks that we want to get running before the Bodhi activation point of F28.

@kparal (Collaborator, author) commented Feb 22, 2018

I'll discuss this with @tflink and get back to you tomorrow.

@kparal (Collaborator, author) commented Feb 22, 2018

The current plan is that we could do the switch in 1-2 weeks, hopefully. Does that work for you?

@hroncok (Member) commented Feb 22, 2018

Doing this before 2018-03-06 (Bodhi activation point) would be perfect, but 2 weeks is fine. Thanks.

@hroncok (Member) commented Mar 6, 2018

For whoever is reading this: today is the Bodhi activation point, yet this was already deployed via #55.
