
Containerize build using LXD #92

Merged
merged 5 commits from feature/lxd into master on Feb 23, 2017

Conversation

johnsca
Contributor

@johnsca commented Feb 19, 2017

This is a pretty significant refactor, obviously. I'd really like to see all of the logic not directly related to managing the LXD image, Jenkins jobs, and Juju config (and possibly the release logic) moved into the underlying tooling (cwr, bundletester, matrix). Specifically, I think we need a well-defined way of providing general override information for bundles for the purposes of testing. This would need to cover not just overriding specific charms with other revs or builds from repos, but also things like adding a testing-specific charm, overriding the default number of units, etc. Having all of that in the tooling would make the charm much simpler.

In the meantime, we might consider moving much of the logic into the cwrbox image. It would allow us to push out updates to the logic in the container that would be picked up on the next build (unless a given deployment was using a locally attached resource version of the cwrbox image, in which case it would be manual for that deployment).

On the point of the image source, manually hosting the tarball in S3 was the quickest way to have it work out of the box, but is less than ideal. Ideally, we could run a public LXD remote server, but that would require more resources and a domain, and I'm not sure how to lock down all operations other than copying images from it, or whether you even can. I also looked into running a simplestreams host for the images, which would be read-only out of the box, but that requires repackaging the image that gets exported (because simplestreams doesn't support unified images and only supports xz compression), and we'd still need to host that.
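
For context, here is a minimal sketch of the two image-delivery paths being weighed above, in the style of the charm's Python code; the function names and the remote name/URL are illustrative and not part of this PR:

```python
import subprocess


def import_cwrbox_from_tarball(tarball_path):
    # Import a unified image tarball (e.g. downloaded from S3 or attached
    # as a charm resource) into the local LXD image store under one alias.
    subprocess.check_call(['lxc', 'image', 'import', tarball_path,
                           '--alias', 'cwrbox'])


def copy_cwrbox_from_remote(remote_name, remote_url):
    # Alternative path: add a public LXD remote and copy the image from it.
    # This assumes someone hosts such a remote, which is exactly the open
    # question above.
    subprocess.check_call(['lxc', 'remote', 'add', remote_name, remote_url,
                           '--protocol', 'lxd', '--public'])
    subprocess.check_call(['lxc', 'image', 'copy',
                           '{}:cwrbox'.format(remote_name), 'local:',
                           '--alias', 'cwrbox'])
```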

@ktsakalozos
Contributor

Nice work! It's a lot of work, though; I wish we could have done it in smaller steps so it would have been easier to review.

In any case, I have taken it for a spin on AWS and LXD; here are the errors I got: http://pastebin.ubuntu.com/24034264/ and http://pastebin.ubuntu.com/24033990/

Is noble-spider your pet? :)

@johnsca
Contributor Author

johnsca commented Feb 20, 2017

@ktsakalozos Ah, I missed that `lxd init` would need to be run when deploying on a fresh machine / VM. I also improved the job console output by turning off script debugging, adding some additional informational echoes, and ensuring that `set -e` is always on.

@ktsakalozos
Contributor

I removed the old cwr subordinate, added the new one, and got the following error:
http://pastebin.ubuntu.com/24039183/

Then I logged into Jenkins and did a `lxc image remove cwrbox`.
After resolving the above error I got this one:
http://pastebin.ubuntu.com/24039207/

On a clean install of jenkins+cwr on lxd:
http://pastebin.ubuntu.com/24039408/

When deployed on a new image, the LXD storage pool won't be configured.
The charm needs to ensure that `lxd init` is run to do so.  If deployed
on a localhost/lxd provider with an already initialized LXD, the charm
should continue gracefully.

Also turned off script debugging and added additional echoes to improve
the job console log.

Also ensure that immediate exit on any error is enabled for all jobs by
setting it at the top of cwr-helpers.sh.
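
A minimal sketch of the behaviour this commit describes, assuming the charm drives LXD via subprocess; the "already initialized?" probe below is illustrative, and the charm's actual check may differ:

```python
import subprocess


def ensure_lxd_initialized():
    # On a localhost/lxd provider the host's LXD is already set up; a root
    # disk device in the default profile is used here as a rough indicator
    # of that, so we continue gracefully without touching it.
    profile = subprocess.check_output(
        ['lxc', 'profile', 'show', 'default']).decode('utf-8')
    if 'root:' in profile:
        return
    # Fresh machine / VM: run a non-interactive init with default answers
    # so the storage pool gets configured.
    subprocess.check_call(['lxd', 'init', '--auto'])
```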
@johnsca
Contributor Author

johnsca commented Feb 21, 2017

I rebased against master and fixed the NoneType exception (run_as doesn't pass through kwargs like I thought it did).

The second failure is somewhat expected; if you delete the image, you'll also need to remove the signature file at /var/lib/jenkins/cwrbox.tar.gz.sig or the hash value from unitdata to get it to re-import the image. However, it looks like `set -e` is not working for some reason; that's a significant issue, but I can't see any obvious cause.

The last error I can't replicate, likely because I'm using ZFS for my LXD storage. I'll try to replicate by bootstrapping Juju with LXD on an Amazon instance, but any debugging you can do on your end would be appreciated.
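
To make the re-import gate described above concrete, here is a rough sketch assuming the hash is kept in charmhelpers unitdata; the key name and helper names are hypothetical:

```python
import hashlib

from charmhelpers.core import unitdata


def _tarball_hash(tarball_path):
    with open(tarball_path, 'rb') as fh:
        return hashlib.sha256(fh.read()).hexdigest()


def image_needs_import(tarball_path):
    # Re-import only when the tarball's hash differs from the one recorded
    # at the last import.  Deleting the image with `lxc image remove` alone
    # is not enough; the recorded hash (or the cwrbox.tar.gz.sig file) has
    # to be cleared as well, which is the behaviour described above.
    return _tarball_hash(tarball_path) != unitdata.kv().get('cwrbox.image.hash')


def record_import(tarball_path):
    unitdata.kv().set('cwrbox.image.hash', _tarball_hash(tarball_path))
```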

@johnsca
Contributor Author

johnsca commented Feb 21, 2017

This seems to be the issue with `-e`: https://stackoverflow.com/questions/4072984/set-e-in-a-function. In short, `set -e` is ignored inside a function (or other compound command) that is invoked as part of a condition, e.g. under `if` or `||`.

When using directory-backed storage for LXD, the perms require that the
containers be marked as privileged.  We were already mapping the
container's root user to the charm's jenkins user, so we don't get any
additional security from unprivileged containers anyway.
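
As a sketch of what this amounts to when the container is created (assuming the charm launches it via the lxc CLI; the function and container names are illustrative):

```python
import subprocess


def launch_build_container(name):
    # With directory-backed storage the container must run privileged;
    # since the container's root is already mapped to the charm's jenkins
    # user, unprivileged containers weren't buying extra isolation anyway.
    subprocess.check_call([
        'lxc', 'launch', 'cwrbox', name,
        '-c', 'security.privileged=true',
    ])
```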
@johnsca
Contributor Author

johnsca commented Feb 21, 2017

All of the issues that @ktsakalozos hit are resolved now.

@kwmonroe
Contributor

kwmonroe commented Feb 21, 2017

This is working great for me. I tested with cwr-52 and ran a charm job and a bundle job concurrently. Watching `ps` on the jenkins unit, I saw multiple cwr processes with multiple containers active. This is a huge improvement -- previously, two simultaneous cwr processes had a high likelihood of stomping on each other's system-level deps.

I really want to push the merge button because I'm that excited about this. However, I'll let @ktsakalozos do it so he can verify his earlier comments have been addressed in cwr-52.

+1, lgtm.

@kwmonroe
Contributor

kwmonroe commented Feb 22, 2017

Nooooo! I spoke too soon. The bundle job finished clean, but the charm job hit a connection timeout :(

http://juju.does-it.net:8081/job/charm_openjdk_in_cs__kwmonroe_bundle_java_devenv/6/consoleFull

Edit: it seemed to be a transient issue; re-running both jobs succeeded. I retract my "Noooooo", but I would like to see the connection-timeout issue handled better.

This was from a previous attempt to manage networking with an older
version of lxd.
@johnsca
Contributor Author

johnsca commented Feb 22, 2017

@kwmonroe The timeout seems to be from deployer connecting to the API in the middle of a test run (during "reset"), so it doesn't seem related to this PR. It also seems to have cleared up on a subsequent run.

@lazypower

This looks super cool, but Travis seems to hate it :(

install_sources:
  description: PPAs from which to install LXD and Juju
  type: string
  default: |


I have a dumb question: why use the apt packages over the snaps? It seems like a lot of tooling isn't going to be maintained in debs anymore... I cite:

  • charm-tools
  • conjure-up

as two candidates in question. Are we signing up for pain later by not integrating with snaps out of the gate?

@johnsca
Contributor Author

I had run into issues with the snaps during development before I found out about the squashfuse workaround. It would probably be good to switch to snaps where possible, though snaps do make the restricted-network story more complicated. Is there a way to run a snap mirror similar to an apt mirror?


@johnsca
Contributor Author

johnsca commented Feb 22, 2017

@chuckbutler The Travis failures are due to an upstream packaging issue with libcharmstore when installing charm-tools on trusty. We're waiting on @marcoceppi to resolve that. I tried to use the snap, but that failed due to this issue. I'd like it if we could figure out a way to use the snap in Travis, but I have no idea how to proceed there.

@@ -54,7 +53,7 @@ def add_job():
         branch = "*/master"
     elif repo_access == 'poll':
         trigger = TRIGGER_PERIODICALLY
-        skip_builds = SKIP_BUILDS
+        skip_builds = 'skip_builds'
Contributor


Since this string is in two places, it's probably better to leave it in the constant. That avoids the problem where someone alters one down the line but not the other.
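
In other words, something along these lines (just a sketch of the suggestion, using the names from the diff above, not the actual job code):

```python
SKIP_BUILDS = 'skip_builds'  # single definition of the marker string


def add_job():
    ...
    # Reference the constant at every call site so a later edit to one
    # use cannot silently diverge from the other.
    skip_builds = SKIP_BUILDS
```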

@pengale
Copy link
Contributor

pengale commented Feb 22, 2017

Overall, I am +1 on this. Nothing major jumped out at me in a read-through of the code, and I'm able to deploy to AWS without errors and to set up and run the tests.

@pengale
Copy link
Contributor

pengale commented Feb 22, 2017

@kwmonroe The timeout that you ran into is more likely a problem with the charm in general, rather than a problem with containerizing, correct?

If so, I think that we should merge this ...

@ktsakalozos
Contributor

LGTM2! Merging it!

@ktsakalozos merged commit d1c8e73 into master Feb 23, 2017
@kwmonroe deleted the feature/lxd branch February 23, 2017 20:00