Zuul CI tracking issue #42

Closed
jodh-intel opened this issue Jun 25, 2018 · 16 comments

Comments

@jodh-intel
Contributor

One requirement we have for some of the repo tests is to provide a terminal to the jobs (shim tests for example). Jenkins does not do this whereas Travis does.

We've got a temporary hack in place to work around the issue:

Related: kata-containers/proxy#74

/cc @cboylan.

@jodh-intel
Contributor Author

Ideally, if we drop Travis + Jenkins and move to zuul, we'd need coverage for the currently supported architectures:

  • x86_64
  • ARM64
  • PPC64le

But we may also need OSX support (see kata-containers/shim#87, kata-containers/proxy#82).

/cc @Pennyzct, @nitkon, @raravena80.

Related: kata-containers/kata-containers#22.

@mnaser
Member

mnaser commented Jun 26, 2018

@jodh-intel

There is currently no coverage for PPC64le or ARM64. I don't see why we should link one with the other.

@cboylan

cboylan commented Jun 26, 2018

In general, all of this is possible; it is just going to require work. Nothing about Zuul prevents us from running tests on x86_64, arm64, or PPC64le. In fact, our current installation has x86_64 and arm64 build resources. Those arm64 instances won't work for Kata, as they are VMs and arm64 has no nested virt support, but they do show that we can operate on multiple arches.

At the risk of getting into too many implementation details, the rough requirements are a nodepool driver to provision instances (this may simply be the static host driver with a static list of instances) and an ansible connection driver to talk to whatever operating system was booted. SSH is the common implementation today, but people are using a Windows connection driver too.

My intent getting into this was to largely replace Jenkins with Zuul. In this case whatever testing you are running on travis (like OS X) can remain there for now. If we get access to PPC or OSX build resources we can add those into nodepool and have zuul consume them, but I'm not sure that is necessary as a first step?

@nitkon

nitkon commented Jun 26, 2018

@cboylan: just FYI, nested virtualization is also not supported on ppc64le.

@grahamwhaley
Contributor

I feel I should note to you all #39, which implies you either need a clean way to run fresh bare metal nodes (via some allocate and deploy mechanism such as nodepool and ansible I guess), or you need a way to create fresh VMs on demand (like we do via the Jenkins Azure plugin, and for our bare-metal machines for metrics CI with https://github.com/kata-containers/ci/tree/master/VMs/metrics for instance).
Just noting, as this may add a layer of complexity into your final solutions.

@jodh-intel
Contributor Author

Hi @cboylan - thanks for the update. You are right, I think the first step is to create a zuul setup that would mirror what we currently have in Jenkins (which is currently x86_64 only) to allow us to consider dropping Jenkins.

I agree that assuming we do drop Jenkins, we'll also have to retain Travis for some time until we can migrate the features we use there into zuul.

To summarise, those remaining features we need that Travis provides are:

What Travis lacks is virtualisation extensions, meaning we can really only run unit tests there. Hence, if an arch is only using Travis, test coverage will be very low since none of the tests in https://github.com/kata-containers/tests can be run. So osx and ppc64le will not have good test coverage until we can move them off Travis.

What Travis does provide is a terminal for the test jobs. Jenkins doesn't, so we've had to introduce a hack to fake it. It will be interesting to see the environment zuul provides.
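
For illustration only: one common way to give a process a terminal when the CI runner does not provide one is to allocate a pseudo-terminal. The sketch below is not the hack referred to in this thread; `run_with_pty` is a hypothetical helper written with Python's standard `pty` module, and it assumes a Linux-like host.

```python
import os
import pty
import subprocess
import sys


def run_with_pty(argv):
    """Run argv with stdin/stdout/stderr attached to a pseudo-terminal.

    Hypothetical helper for illustration. Returns (exit_code, output_bytes).
    """
    master_fd, slave_fd = pty.openpty()
    try:
        proc = subprocess.Popen(
            argv, stdin=slave_fd, stdout=slave_fd, stderr=slave_fd, close_fds=True
        )
    except Exception:
        os.close(master_fd)
        raise
    finally:
        # The parent never uses the slave end; the child keeps its own copy.
        os.close(slave_fd)

    chunks = []
    while True:
        try:
            data = os.read(master_fd, 4096)
        except OSError:
            break  # Linux reports EIO once the child side has been closed
        if not data:
            break  # other platforms report end-of-file instead
        chunks.append(data)
    os.close(master_fd)
    return proc.wait(), b"".join(chunks)


if __name__ == "__main__":
    # The child reports whether it believes it is attached to a terminal.
    check = "import sys; print(sys.stdin.isatty(), sys.stdout.isatty())"
    code, output = run_with_pty([sys.executable, "-c", check])
    print(code, output)
```

Note that the terminal line discipline rewrites newlines, so captured output typically ends in `\r\n` rather than `\n`.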

@jodh-intel
Contributor Author

Zuul question: out of interest, would it be possible to define a separate timeout for each architecture?

See: kata-containers/osbuilder#122.

@cboylan

cboylan commented Jun 27, 2018

Yes, each job can have its own timeout value. The valid range is 1 second to the service configured maximum (I think ours is something like 5 hours?). In this case we'd just have a different job config for each architecture specifying different timeouts as necessary.
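
As a rough sketch of the "one job config per architecture" idea, the snippet below generates job definitions shaped like Zuul `job` entries (`name` and `timeout` in seconds). Everything concrete here is an assumption for illustration: the job base name, the per-architecture values, and the 5 hour ceiling, which the comment above only guesses at.

```python
# Assumed ceiling: the thread only guesses "something like 5 hours".
MAX_TIMEOUT_SECONDS = 5 * 60 * 60

# Hypothetical per-architecture timeouts, in seconds.
ARCH_TIMEOUTS = {"x86_64": 3600, "arm64": 7200, "ppc64le": 7200}


def build_jobs(base_name, arch_timeouts, max_timeout=MAX_TIMEOUT_SECONDS):
    """Return one job definition per architecture, each with its own timeout."""
    jobs = []
    for arch, timeout in sorted(arch_timeouts.items()):
        # Valid range per the comment above: 1 second up to the service maximum.
        if not 1 <= timeout <= max_timeout:
            raise ValueError(f"timeout for {arch} out of range: {timeout}")
        jobs.append({"job": {"name": f"{base_name}-{arch}", "timeout": timeout}})
    return jobs


if __name__ == "__main__":
    for job in build_jobs("kata-runsh", ARCH_TIMEOUTS):
        print(job)
```

Dumping the resulting list with a YAML library would give something close to a Zuul job configuration file, though a real job would also need a nodeset and playbooks.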

@jodh-intel
Contributor Author

Nice! Thanks @cboylan ;)

@Pennyzct
Contributor

Hi~ @jodh-intel Sorry for the delayed comment😅.
We are working on the Jenkins CI set-up for arm64. We have hit some issues and failures, and we will raise an issue to address them. @jongwu
If the first step is to create a zuul setup that mirrors what you currently have in Jenkins, the same problems may also occur on arm64.

@kalyxin02

While we are fixing the existing issues and failures on ARM when running the CI tests, especially with jenkins_job_build.sh, could anyone shed some light on how Zuul CI is to be configured or set up? Is there any documentation about this written down for Kata?

@grahamwhaley
Contributor

The Zuul CI is still in bringup for Kata. Once up, I expect us to store the config, and hopefully a guide document as well, in this repo (like we do for Jenkins).
I don't think we have specific Zuul Kata info yet - but /cc @cboylan @chavafg for input there.

@chavafg
Contributor

chavafg commented Jul 10, 2018

Hi, I think we still do not have documents, but this is the updated configuration of the Zuul job, which consists of ansible playbooks and roles:
https://git.openstack.org/cgit/openstack-infra/openstack-zuul-jobs/tree/playbooks/kata-runsh
https://git.openstack.org/cgit/openstack-infra/openstack-zuul-jobs/tree/roles/kata-setup

@cboylan may have more info.

@kalyxin02

@grahamwhaley @chavafg OK, thanks for the information and the ansible scripts. After talking with @gnawux, since there is still a long way to go before zuul is finally configured, we'd better make the original Jenkins configuration work for ARM first.

@jodh-intel
Contributor Author

Hi @cboylan - could you give another update on zuul? Every (?) build appears to fail under it so it seems that there are still teething issues with it.

Also, would it be possible to change the URL for the zuul CI run for individual PRs to a unique URL rather than just https://zuul.openstack.org/?

GabyCT pushed a commit to GabyCT/ci that referenced this issue Feb 12, 2019
Don't run `gofmt` when testing with golang tip.

Fixes kata-containers#42.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
@jodh-intel jodh-intel added this to To do in Issue backlog Aug 10, 2020
@GabyCT
Contributor

GabyCT commented May 26, 2021

Closing this issue as it is not related to Kata 2.x.

@GabyCT GabyCT closed this as completed May 26, 2021
Issue backlog automation moved this from To do to Done May 26, 2021