Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Windows] Fix vcpkg installation #1340

Conversation

dibir-magomedsaygitov
Copy link
Contributor

@dibir-magomedsaygitov dibir-magomedsaygitov commented Jul 31, 2020

Description

We need to install vcpkg from the latest release instead of master branch commit.

Related issue:

#1332

Check list

  • Related issue / work item is attached
  • Tests are written (if applicable)
  • Documentation is updated (if applicable)
  • Changes are tested and related VM images are successfully generated

@maxim-lobanov
Copy link
Contributor

/azp run windows2016, windows2019

@azure-pipelines
Copy link

Azure Pipelines successfully started running 2 pipeline(s).

@kcgen
Copy link

kcgen commented Jul 31, 2020

This week's master branch of vspkg was broken due to their lack of strong CI gating on their part to handle a recent change in cmake. This break rippled across vspkg users, impacting many projects.

Causing responses like this:

We need to install vcpkg from the latest release instead of master branch commit.

"Fixing" this by falling back to vspkg's last release package will re-introduce approximately ~300 bugs that GitHub VM users have relied upon being fixed (which will now rear their heads again).

A more robust solution involves testing the latest master of vspkg first, before accepting it:

  • Run a handful of low-level packages installations - like png, zlib, and gpg. Any one of these would have caught this week's issue.
  • If any fail, then rollback to the prior passing vspkg.
  • In the worst case here, falling back to the "last passing vspkg" is the sweet spot: it gives VM users a working vspkg system, while still including the vast majority of near-term vspkg fixes.

@maxim-lobanov
Copy link
Contributor

@kcgen , thank you for your feedback. That all makes sense.
Unfortunately, we don't have capacity to test the latest master of vcpkg, before accepting it. So we have to trust VCPKG team here.
Actually, We have two solutions:

  • Always install latest master
  • Always install latest release

Since we don't have much experience with VCPKG, probably you have better context what is preferred.

@strega-nil, Hello!
Could you please take a look at the discussion in this PR and in #1332 and advice the best solution?

@kcgen
Copy link

kcgen commented Jul 31, 2020

Unfortunately, we don't have capacity to test the latest master of vcpkg, before accepting it. So we have to trust VCPKG team here.

Thanks for explaining the dilemma @maxim-lobanov - yes, the problem of "we don't have capacity to test" completely dwarfs this vspkg issue. Are you telling me the VM images that (millions?) of users rely on every week are loaded with software that hasn't undergone even the most basic of sanity checks?

Surely Microsoft and GitHub have Windows VMs where all of the candidate packages can be installed and some dead-simple powershell commands could execute the software (starting with the most simplest of tests (vspkg install zlib + return-code pass/fail is all we're talking about) prior to accepting that software?

You could additionally have a virus scanner and behavior/network scanner inspect the process to, and include that in your CI gate.

This is so fundamental that I hope it's someone's full time job there. It's only a matter of time before just one of these "trusted-but-not-tested" packages are trojaned with a GitHub SECRETs harvester.

Back to this vspkg issue - this issue alone has blown up for them, causing apologies and broken workflows; they see the immediate need for strong CI gating to prevent this from ever happening again. I suspect that will be in-place in the coming days or weeks. Whatever trust you currently had in their master branch, you will be able to trust it a whole lot more going forward.

In the meantime, I understand they've fixed this issue too (but I'm still affected because the GitHub VM won't absorb the fix until the next generation cycle).

Perhaps the safest course of action would be to temporarily roll back to the stable release for the next couple weeks - and inform your users ( which might be other GitHub teams, like Workflows, etc) who in turn can inform their end-users (like myself) that vspkg is being reverted and prior fixes might be broken again.

This will give vspkg time to implement their strong CI gating and get any kinks worked out. Once that's stable, I would then roll forward to their master branch again as breaking-changed will be blocked before they hit master.

@maxim-lobanov
Copy link
Contributor

@kcgen,
As for the validation, we have validation gates that contain a bunch of smoke (very very simple) tests to validate that the tool is installed on image successfully and antivirus checks. (Unfortunately, vcpkg is not covered properly right now and we are planning to do that soon).

When I said "we don't have capacity to validate latest master of VCPKG", I meant a bit different: we need to generate image, do validation of image, if vcpkg doesn't work properly, we have to do PR to rollback it and hardcode some old version, regenerate image and etc. It will take much time and could block and delay image rollout and other important updates. Considering the fact that we have ton of tools installed and every could be affected by the same (release contains bugs). We could risk never deploy image in this case because we hang in rolling back versions :)
I understand your point and we are trying to improve our test-coverage constantly but unfortunately, it is not possible to follow up all tools and all release notes. And as you mentioned above, it is always dilemma between good test-coverage from our side and trusting tool owners. (Absolutely, agree that this particular issue is pretty obvious and could be caught from our side).
It is what I mentioned "We have to trust tool owners" that their releases don't contain bugs (at least critical).

As for the Vcpkg, we do even worse thing - always pick up master branch. It means every day a lot of commits could break everything (especially, if tool doesn't have own strong validation as you said)
Personally, I prefer pickup the latest release instead of latest master as you mentioned. In this case, vcpkg version could be a bit outdated but the risk of critical bug is lower.

@kcgen
Copy link

kcgen commented Jul 31, 2020

@maxim-lobanov - thanks for the clarification! Great to hear that smoke tests are in place for major components and is improving for lesser components like vspkg.

Personally, I prefer pickup the latest release instead of latest master as you mentioned.

With testing of vspkg weak on both sides, I think falling back to the stable release is the best you can do. Once automatic CI gating is in place, then you can safely move up to master and users will benefit from the post-relaese fixes.

@kcgen
Copy link

kcgen commented Aug 1, 2020

@maxim-lobanov: some new information that changes the above assumptions:

If vcpkg finds cmake installed on the host, then it will blindly use that instead of its own internally-bundled proven-and-tested version.

This exposes a version race-condition: if cmake on the host has new API breaking-changes versus the latest version that the vcpkg devs have tested with (and embedded), then vcpkg will fail when it tries to interface with the host cmake.

So our proposed solution above, rolling back to the last stable release-build of vxpkg, won't solve the problem. It only "knows" how to interface with cmake 1.17 but the version of cmake on the GitHub VMs is 1.18 -- so this version race-condition will continue to exist.

On GitHub's side, this can be solved by:

  • not installing cmake on the host (vxpkg will then use its internal version), or
  • install cmake, but only up to the version that vxpkg has been tested with (1.17 for prior versions, or 1.18 as of today's latest vcpkg master)
  • only permit new cmake versions once vxpkg has been tested with them

On vcpkg's side:

  • I've proposed to the vcpkg team that vcpkg be changed to check the host cmake version, and only use that if its version is equal-to or less-than what vcpkg is using internally. This will eliminate that race condition.

microsoft/vcpkg#12569 (comment)

@strega-nil
Copy link

@kcgen I will say that breaking changes in cmake are exceedingly rare, and are mostly bugfixes; our code was buggy, and was only allowed by cmake also being buggy.

@kcgen
Copy link

kcgen commented Aug 2, 2020

breaking changes in cmake are exceedingly rare

So to prevent this from happening again, the answer is ... Do nothing?

  • This issue knocked out vcpkg for almost a week.

  • GitHub's CI virtual machines are still knocked out because the infrastructure team re-generates the images once a week. My project still cannot merge PRs.

  • Airgap deployments bundling today's "fixed" vcpkg that are cmake-1.18 compatible carry a constant risk that any /future/ cmake version will break vxpkg.

wreckage aside, why are we still trying to avoid the seat belt? What benefit does vxpkg gain by -not- limiting the version of cmake used to that which vxpkg has been tested against?

Good design always prefers hard proofs over leaps of faith, when the two options carry equal costs. This is one case where a trivial version check avoids the leap of faith.

our code was buggy, and was only allowed by cmake also being buggy.

All the more reason to only trust up-to the tested version. Bugs will continue to be added by both projects; that's just a development truism we all are afflicted with :-)

@strega-nil
Copy link

Perhaps. We'll have to look at it, especially because we cannot download tools on some platforms.

@kcgen
Copy link

kcgen commented Aug 2, 2020

Thanks @strega-nil.

@ygj6 ygj6 mentioned this pull request Aug 3, 2020
5 tasks
@maxim-lobanov
Copy link
Contributor

Thank you for your discussion, that is helpful.

Honestly, I don't think that synchronize cmake and vcpkg version on image suites for us:

  • It will add maintenance work to check compatibility and update cmake manually if it is allowed on weekly basis
  • Customers who don't care about vcpkg but want to use the latest cmake won't be happy to wait additional weeks until new cmake is tested and approved by vcpkg team (usually, we get requests to update cmake right after releasing new version - that is the reason why we just pickup the latest version automatically)

If vcpkg has strong dependency on cmake version, it makes sense to include cmake to vcpkg bundle and update it regularly and use host version of cmake only if it is compatible as it was proposed above. The current approach doesn't suit for using in CI systems.

We are going to hold off / close current PRs to switch vcpkg to latest release because as we have discussed above it will cause more impact and will cause issue with cmake.
Let's move further discussion to #1332

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants