Improve testing infrastructure for macOS/Windows by using buildkite #621

Closed
cweagans opened this issue Jan 25, 2018 · 16 comments

@cweagans
Contributor

Problems

  • Watching real time output of Windows and Mac builds is impossible
  • Builds randomly fail
  • Test environments aren't "pristine"
  • Cross platform testing of Docker Images would be difficult
  • todo: other problems?

Proposed solution

My initial idea was to entirely replace Circle and Surf with a Jenkins instance, but there aren't any services that allow for EC2-style API driven management of macOS or Windows 10 Pro boxes that are capable of running a VM (which is a hard requirement for Docker for Mac/Windows). Working around that is possible, but somewhat expensive, so I would instead like to propose an iterative approach that improves what we have in place already.

Phase 0

Cost: $0

  • Continue using CircleCI for Linux builds and producing build artifacts.
  • Deprecate Surf in favor of Buildkite. Buildkite is essentially a bring-your-own-compute build service, and they explicitly support Mac and Windows. This allows for watching real time build output, as well as manually starting/killing builds as necessary. Buildkite is free for open source projects.
  • Clear the entire testing environment before and after every Mac and Windows test run. This ensures that every test is starting from the same state, and will reduce cases where the side effects of a previous test impact a subsequent test. I think this should just be filesystem cleanup (i.e. remove old Git clones and such) and kill all containers and remove all of the saved images. This will cause the normal builds to take a bit longer, but I think the tradeoff between speed and reliability is worthwhile here.
  • Put the test logic for Mac and Windows into the ddev repository, similar to the CircleCI config.
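The cleanup step in Phase 0 could look roughly like the following. This is a minimal sketch, not the project's actual build script: the function names, the `scratch_dir` location for old clones, and the injected `runner` are all hypothetical, and the test command is a placeholder. Only the `docker ps -aq`, `docker rm -f`, `docker images -q`, and `docker rmi -f` invocations are standard Docker CLI usage.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Callable, List, Sequence

# A runner takes a command and returns its stdout; injectable so the
# logic can be exercised without a Docker daemon.
Runner = Callable[[Sequence[str]], str]


def run(cmd: Sequence[str]) -> str:
    """Run a command and return its stdout (raises if the command fails)."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def clean_docker(runner: Runner = run) -> List[List[str]]:
    """Remove every container and every image; return the removal commands issued."""
    issued: List[List[str]] = []
    containers = runner(["docker", "ps", "-aq"]).split()
    if containers:
        cmd = ["docker", "rm", "-f", *containers]
        runner(cmd)
        issued.append(cmd)
    images = runner(["docker", "images", "-q"]).split()
    if images:
        # Several tags can share one image ID, so de-duplicate, keeping order.
        cmd = ["docker", "rmi", "-f", *dict.fromkeys(images)]
        runner(cmd)
        issued.append(cmd)
    return issued


def clean_scratch(scratch_dir: Path) -> None:
    """Delete old clones and leave an empty directory behind (hypothetical layout)."""
    if scratch_dir.exists():
        shutil.rmtree(scratch_dir)
    scratch_dir.mkdir(parents=True)


def run_clean(test_cmd: Sequence[str], scratch_dir: Path, runner: Runner = run) -> str:
    """Clean before the test run, and again afterwards even if the run fails."""
    clean_docker(runner)
    clean_scratch(scratch_dir)
    try:
        return runner(test_cmd)
    finally:
        clean_docker(runner)
        clean_scratch(scratch_dir)
```

On a build agent this might be called as `run_clean(["make", "test"], Path("/tmp/ci-scratch"))`, where both arguments are illustrative. Removing every image means each build pulls again, which is the speed-for-reliability tradeoff mentioned above.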

Phase 1

Cost: ~$69/mo/machine (2 machines needed initially - one mac, one windows)

  • Use something like MacStadium (which is the Mac hosting service that Travis CI and Unity3D use for their cloud builds) to bring the Windows 10 Pro and macOS boxes into the cloud. This will allow us to easily manage the boxes as a team.
  • This would also allow us to (relatively) easily grow and shrink the number of build agents as needed to accommodate additional testing.
  • Note that we only realistically need two machines. We're at around 100 build hours per month on each OS, which is not even close to the maximum capacity of a single machine.
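As a rough check on the capacity claim, here is the arithmetic as a small sketch. The 30-day month and round-the-clock availability are assumptions; the 100 build hours and the $69 per machine figures come from the proposal above.

```python
# Assumption: a 30-day month with the machine available around the clock.
HOURS_PER_MONTH = 24 * 30


def utilization(build_hours: float, machines: int = 1) -> float:
    """Fraction of the available machine hours consumed by builds."""
    return build_hours / (machines * HOURS_PER_MONTH)


def monthly_cost(machines: int, price_per_machine: float = 69.0) -> float:
    """Total monthly hosting cost in dollars."""
    return machines * price_per_machine


print(f"{utilization(100):.1%}")  # 13.9% of one machine per OS
print(monthly_cost(2))            # 138.0 dollars per month for two machines
```

At roughly 14% utilization per OS, one machine each for Mac and Windows leaves considerable headroom, provided builds can be queued rather than run concurrently.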

Phase 2

Cost: indeterminate - depends on amount of load on build agents.

  • Start testing our other projects with the new setup, including ddev-ui, docker images, etc. Pretty much anything that's open source can be piped through Buildkite.

Comments, concerns, or haikus welcome!

@rfay
Member

rfay commented Jan 25, 2018

If you've had good experience with buildkite, I'd sure be happy to see if it's more mature.

Note that the test logic for Mac and Windows is already in the repository (build.sh), and it is documented. Also, all our builds use make test for the build logic, and that should remain the case. But there is setup required, of course.

We have in fact set up the surf builds to start clean - all images are updated, all containers are removed at the beginning.

I'm glad to hear about MacStadium, but what would you use for Windows in Phase 1? I'd sure like both of those to be on cloud services.

In all cases, the "cost" is the cost of developing and maintaining these systems, which swamps any infrastructure costs by orders of magnitude, so I basically think that we should only (very slight overstatement) think about the development and maintenance costs.

@cweagans
Contributor Author

> If you've had good experience with buildkite, I'd sure be happy to see if it's more mature.

I've used it at past companies. It's rock solid. The main benefit is that there's a bit more visibility into the build process. Real time output, being able to kick off/kill a build, etc.

> Note that the test logic for mac and windows is in the repository and documented - build.sh, and there is documentation

Ah, great! I wasn't sure on build.sh.

> We have in fact set up the surf builds to start clean - all images are updated, all containers are removed at the beginning.

I think we should be removing all running containers and removing all images. Start from a completely empty Docker. I recall one situation where a cached image was causing some issues with a subsequent build. If we just nuke all of the images from orbit as part of build cleanup, I think that might improve reliability a bit.

> I'm glad to hear about MacStadium, but what would you use for Windows in Phase 1? I'd sure like both of those to be on cloud services.

MacStadium's machines can run Windows. See the OS dropdown here https://www.macstadium.com/mac-mini/#modelselect -- we could totally just pull the trigger on that and do that in Phase 0 if you're eager to get them into the cloud. I was shooting for small incremental changes to make it easier to back out if there's a problem.

> In all cases, the "cost" is the cost of developing and maintaining these systems, which swamps any infrastructure costs by orders of magnitude, so I basically think that we should only (very slight overstatement) think about the development and maintenance costs.

Absolutely. This is partially why I leaned toward a hosted service like Buildkite. I don't want to be in the business of building and maintaining a distributed CI system unless/until we need to. For what it's worth, I don't think any one of the three phases mentioned above will be a significant lift. They're pretty small chunks of work.

@rfay
Member

rfay commented Jan 25, 2018

I suspect that buildkite would go in very easily, and using MacStadium should be exactly equivalent to what we have, so perhaps phases 0 and 1 are just fine and not costly at all.

@rfay
Member

rfay commented Jan 25, 2018

Seems to me like phases 0 and 1 are a no-brainer and would involve very little cost and development effort.

Testing for ddev-ui on windows will be absolutely critical, so phase 2 ends up being driven by that.

Our container tests are still pretty rudimentary, but it would still be nice to run them on multiple machines.

Currently the surf build assumes it owns the whole machine. But our needs are low enough that if buildkite is able to serialize all builds of all types, then that's not a problem.

@cweagans
Contributor Author

> But our needs are low enough that if buildkite is able to serialize all builds of all types, then that's not a problem.

Just one last note on this: Buildkite assumes that the agent can do whatever needs to be done, but the Buildkite service is ultimately responsible for deciding what to run and maintaining the queue of jobs to send to the worker nodes. This is why I was pushing the totally clean build environment -- if we start testing Docker image builds on this infrastructure, then we don't want any possibility of that impacting the ddev tests (or vice versa).

You noted in Slack that updating images before a build solved our immediate problem for the issue I mentioned, but that may not be the case if we're building multiple products.

I'm not opposed to combining Phases 0 and 1 if the machine cost is not an issue. It doesn't look like it would be a huge lift either way, and I think it would be a pretty nice workflow improvement.
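To illustrate the concern in this comment, here is a toy model of one machine working through a shared queue of jobs from different projects, with a cleanup step around each job. It is only a sketch: `SerialDispatcher` and everything in it is hypothetical and does not use or imitate Buildkite's actual API.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

Job = Tuple[str, Callable[[], None]]


class SerialDispatcher:
    """Toy stand-in for a job queue feeding a single build machine."""

    def __init__(self, cleanup: Callable[[], None]) -> None:
        self.cleanup = cleanup
        self.queue: Deque[Job] = deque()
        self.log: List[str] = []

    def submit(self, name: str, job: Callable[[], None]) -> None:
        """Add a job to the back of the queue."""
        self.queue.append((name, job))

    def drain(self) -> None:
        """Run queued jobs one at a time, cleaning before and after each."""
        while self.queue:
            name, job = self.queue.popleft()
            self.cleanup()
            self.log.append(f"start {name}")
            try:
                job()
                self.log.append(f"pass {name}")
            except Exception:
                self.log.append(f"fail {name}")
            finally:
                self.cleanup()
```

Because the cleanup runs on both sides of every job, an image left behind by one project's build cannot be seen by the next project's tests, whatever order the queue delivers them in.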

@rfay
Member

rfay commented Jan 30, 2018

I added this to "to do" on our working sprint. I think it's a totally valid approach and I'd love to see it. This has to be prioritized from a resource standpoint though (money and time) and so will have to be sold to the powers that be on that. It's worth starting that process as soon as possible.

@rickmanelius
Contributor

Updated flags because I believe this is currently WIP.

@rickmanelius
Contributor

Just referencing the PR #639

@rickmanelius
Contributor

@pgalligan80126 @sgrandchamp
Just cc'ing you to make you aware of the approval of the new service "MacStadium" that will need to get added to our monthly budget under recurring software expenses. Thanks!

@sgrandchamp

@rickmanelius Please coordinate directly with @pgalligan80126 on the appropriate place to add this projected expense. In our previous document we had a place for all software subscriptions that netted out a monthly total.

@rfay rfay changed the title from "Proposal: Improve testing infrastructure" to "Improve testing infrastructure for macOS/Windows by using buildkite" Feb 19, 2018
@rfay
Member

rfay commented Feb 21, 2018

I was trying to get our MacStadium Windows machine to run Docker yesterday and failed. Ticket https://portal.macstadium.com/tickets/47328 explains the request. "Hyper-V cannot be installed. Virtualization support is disabled in the firmware". This also prevents VirtualBox from running appropriately, of course.

Searching on the web about this problem reminded me that I have this problem periodically with the Mac I use to run Windows for testing. If it's been turned off, docker can't work on windows unless you boot into macOS first, then boot into Windows.

So we probably need a physical Windows machine that we can rent, just like the ones MacStadium offers. It's conceivable that somebody has a cloud machine that can do virtualization. The good news is there's lots more Windows for rent than Macs. The bad news is they generally have Windows Server on them.
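The firmware problem described in this comment can be checked for before trying to install Docker on a candidate machine. The sketch below parses the text printed by the Windows `systeminfo` command. The function names are hypothetical, and the two strings it looks for are the English-language wording as I recall it from the "Hyper-V Requirements" section, so treat that format as an assumption to verify on the target machine.

```python
import re
import subprocess
from typing import Optional


def virtualization_enabled(systeminfo_text: str) -> Optional[bool]:
    """Return True/False from systeminfo output, or None if it cannot be determined."""
    # When Hyper-V is already running, systeminfo reports a hypervisor
    # instead of listing the individual requirements.
    if "A hypervisor has been detected" in systeminfo_text:
        return True
    match = re.search(r"Virtualization Enabled In Firmware:\s*(Yes|No)", systeminfo_text)
    if match is None:
        return None
    return match.group(1) == "Yes"


def check_this_machine() -> Optional[bool]:
    """Run systeminfo locally (Windows only) and interpret its output."""
    text = subprocess.run(
        ["systeminfo"], check=True, capture_output=True, text=True
    ).stdout
    return virtualization_enabled(text)
```

A result of `False` would correspond to the "Virtualization support is disabled in the firmware" error quoted above, and `None` means the output did not contain either expected string.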

@rfay
Member

rfay commented Feb 21, 2018

I see https://iweb.com/dedicated-server/pricing has all their dedicated servers doing Hyper-V. As expected, they're all Windows Server. This is kind of a catch-22, unless you can do nested virtualization with Windows Server (put Win 10 Pro in Hyper-V, have it use nested virtualization, and do testing in the 10 Pro virtual machine). I don't know if that's possible.

Docker CE doesn't run on Windows Server, which comes with Docker EE now.

Edit: Nested virtualization is possible, https://docs.microsoft.com/en-us/system-center/vmm/vm-nested-virtualization?view=sc-vmm-1801

So it's conceivable we could run a few versions of Windows inside a Windows Server machine.
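For reference, Hyper-V exposes nested virtualization per guest through the `Set-VMProcessor` cmdlet with `-ExposeVirtualizationExtensions $true`, run on the host while the guest is powered off. The sketch below only builds that PowerShell invocation; the helper name and the VM name are hypothetical, and nothing in this thread confirms that the team used this approach.

```python
from typing import List


def nested_virt_command(vm_name: str) -> List[str]:
    """Build the PowerShell call that exposes virtualization extensions to a guest."""
    # PowerShell escapes a single quote inside a single-quoted string by doubling it.
    quoted = "'" + vm_name.replace("'", "''") + "'"
    return [
        "powershell",
        "-NoProfile",
        "-Command",
        f"Set-VMProcessor -VMName {quoted} -ExposeVirtualizationExtensions $true",
    ]


# "win10pro-test" is an illustrative guest name, not one from this project.
print(" ".join(nested_virt_command("win10pro-test")))
```

The resulting list can be passed to `subprocess.run` on the Hyper-V host. Inside the Windows 10 Pro guest, Hyper-V (and so Docker for Windows) would then have the virtualization extensions it requires.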

@rfay
Member

rfay commented Feb 27, 2018

Azure has a Win 10 Pro instance, https://azuremarketplace.microsoft.com/en-us/marketplace/apps/Microsoft.Windows10RS3Prox64?tab=Overview - don't know if it would support Hyper-V.

Cameron also noted that we could get a real Windows Enterprise machine with OVH.com that would do Hyper-V, and perhaps then run various Windows versions inside it.

@cweagans
Contributor Author

I also got a quote from a local datacenter - it works out to be about the same as OVH in terms of pricing, except that we'd have to provide the hardware.

@cweagans
Contributor Author

@rfay rfay added this to the v0.19.0 milestone May 23, 2018
@dclear dclear removed this from the v0.19.0 milestone May 29, 2018
rfay added a commit that referenced this issue May 29, 2018
@rickmanelius rickmanelius added this to the v0.19.0 milestone May 29, 2018
@rickmanelius
Contributor

PR was merged. Going to consider this complete. Thanks!
