Deplatforming #7049

eddyashton · 2025-06-11T14:20:21Z

The build-time distinction between SNP and Virtual is unhelpful and results in a huge amount of duplication in our CMake. It was necessary on SGX, where the targets had significantly different build options, but it no longer serves any purpose.

This PR removes the COMPILE_TARGET option from CMake. This has a bunch of knock-on effects, that I think fall into a few distinct camps:

De-duplicated targets. We no longer have ccf_kv.host and ccf_kv.snp, we just have ccf_kv. I've manually tried to verify in each instance that the calls to each target were always identical, but I may have missed some.
Removal of PLATFORM_ preprocessor define. Most places were #if PLATFORM_VIRTUAL || PLATFORM_SNP, so the #if/#endif pair can be trivially removed. Deciding what kind of quote we produce is now a run-time decision (each binary is capable of trying to produce either format), based on the existing enclave.platform config argument. I think that ought to be promoted to a CLI argument (security critical), but that's deferred.
Test clarity. Functions that check whether an ioctl device is available (in C++ and Python) are now named to show that they're testing for SNP_SUPPORT. The decision of whether this run (of an e2e test or a node) should try to get a virtual or SNP attestation is driven by configuration, not by this capability detection.
Simpler library names. We still (for now) produce a single cchost and then a .so for each "enclave" application. But those applications are now libjs_generic.so, rather than libjs_generic.virtual.so/libjs_generic.snp.so.
Gluing those together, it turns out it is still convenient to have a global "default test platform" in CMake, so that we can indicate whether we think we're testing virtual or SNP. That looks a bit like COMPILE_TARGET, but it's now called CCF_TEST_PLATFORM, and is hopefully used in few enough places to clarify intent. Perhaps the default value should be derived from capability detection? TBD.
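
A minimal sketch of what "a run-time decision" between quote formats could look like, written in Python for brevity rather than in the node's C++. The generator functions, the `QUOTE_GENERATORS` table and the `{"enclave": {"platform": ...}}` config shape are illustrative assumptions, not the code in this PR:

```python
from typing import Callable, Dict


def make_virtual_quote(report_data: bytes) -> dict:
    # Hypothetical stand-in for the virtual attestation generator.
    return {"format": "virtual", "report_data": report_data.hex()}


def make_snp_quote(report_data: bytes) -> dict:
    # Hypothetical stand-in: a real node would request a report from the
    # SNP guest device here.
    return {"format": "snp", "report_data": report_data.hex()}


# Both generators are present in the same binary; nothing is chosen at
# build time.
QUOTE_GENERATORS: Dict[str, Callable[[bytes], dict]] = {
    "virtual": make_virtual_quote,
    "snp": make_snp_quote,
}


def generate_quote(config: dict, report_data: bytes) -> dict:
    # Assumed config shape: {"enclave": {"platform": "SNP"}}.
    platform = config["enclave"]["platform"].lower()
    try:
        generator = QUOTE_GENERATORS[platform]
    except KeyError:
        raise ValueError(f"Unsupported enclave.platform: {platform!r}") from None
    return generator(report_data)
```

The point is only that the choice becomes a lookup on a configuration value, so a single build can serve both platforms.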

I've only run a handful of builds and tests locally, so this is likely missing some obvious breakages. Opening as a Draft initially and awaiting CI coverage.

Also, I think this is a big enough breaking change that it shouldn't affect 6.0, so I propose a release/6.x branch.

Follow-ups:

Remove enclave section in configuration #6623
De-duplicate builds in CI: build once and copy to test runners
Tear out more enclave/platform distinctions in the test infra. I'm always wary of how hard we can go on this while maintaining some lts_ story, but I think right now we can be pretty aggressive? But let's do it separately to derisk.

UPDATE

As often happens, this PR lived for a while and changed significantly in that period. The description above is now slightly inaccurate, but kept for posterity. The final merge leans hard on platform detection to avoid any required top-level configuration. Overrides via env var are still possible, but the default behaviour (independently per-node and in Python infra) is to test once at startup for SNP device support, and otherwise fall back to virtual.
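
A rough sketch of that default behaviour, for illustration only. The device paths and the environment variable name `EXAMPLE_PLATFORM_OVERRIDE` are assumptions made for this example; the description above does not name them:

```python
import os
from typing import Callable, Mapping, Optional, Sequence

# Assumed SNP guest device paths; adjust to whatever the node actually probes.
ASSUMED_SNP_DEVICES = ("/dev/sev", "/dev/sev-guest")
# Hypothetical override variable, named here purely for the example.
OVERRIDE_ENV_VAR = "EXAMPLE_PLATFORM_OVERRIDE"


def detect_platform(
    devices: Sequence[str] = ASSUMED_SNP_DEVICES,
    environ: Optional[Mapping[str, str]] = None,
    exists: Callable[[str], bool] = os.path.exists,
) -> str:
    """Return "snp" or "virtual": explicit override first, then device probe."""
    environ = os.environ if environ is None else environ
    override = environ.get(OVERRIDE_ENV_VAR)
    if override is not None:
        value = override.strip().lower()
        if value not in ("snp", "virtual"):
            raise ValueError(f"Invalid {OVERRIDE_ENV_VAR}: {override!r}")
        return value
    # No override: SNP if any candidate device exists, else fall back to virtual.
    return "snp" if any(exists(d) for d in devices) else "virtual"
```

A caller would run this once at startup and keep the result, so that later code paths all agree on the platform.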

achamayou · 2025-06-11T15:27:45Z

> Deciding what kind of quote we produce is now a run-time decision (each binary is capable of trying to produce either format), based on the existing enclave.platform config argument. I think that ought to be promoted to a CLI argument (security critical), but that's deferred.

It seems to me that in 99% of cases, this can be decided automatically without security implications, because either the join policy or the consortium-issued transition-service-to-open proposal decides whether the attestation is acceptable.

Having a way to force virtual is likely useful, either through an env var or through the configuration, but is not security critical for the reason above. Whether that's good usability is arguable.

> Gluing those together it turns out it is still convenient to have a global "default test platform" in CMake, so that we can indicate whether we think we're testing virtual or SNP. That looks a bit like COMPILE_TARGET, but it's now called CCF_TEST_PLATFORM, and hopefully used in few enough places to clarify intent. Perhaps the default value should be derived from capability detection? TBD.

Because we (still) use ctest as a launcher, I agree that this is good, but that's something we could decide to revisit, as the decrease in compile-time parameters also diminishes the value of that layer.

achamayou · 2025-06-11T15:31:18Z

> Also, I think this is a big enough breaking change that it shouldn't affect 6.0, so I propose a release/6.x branch.

100%

eddyashton · 2025-06-11T16:28:30Z

> It seems to me that in 99% of cases, this can be automatically decided without security implications, because either the join policy, or the consortium-issued transition service to open decide whether the attestation is acceptable.

> Having a way to force virtual is likely useful, either through an env var or through the configuration, but is not security critical for the reason above. Whether that's good usability is arguable.

I'm coming round to the idea of automatic platform detection, let's discuss tomorrow. Just listing a few discussion points:

Can we still force the platform to emit a certain kind of attestation when we want to test that?
Do we silently lose test coverage when the automated capability detection fails/is flaky?
Do we promote visibility of the chosen platform somewhere? It's in the attestation if we get that far, and I guess we argue that behaviour up to that point is identical? So we log as soon as we branch, and store it in the KV ASAP.

achamayou · 2025-06-11T16:44:27Z

> Can we still force the platform to emit a certain kind of attestation when we want to test that?

I assume you mean a virtual attestation? I would say yes, not because I can think of a reason why it might be useful, but because it seems really easy to do, and could come in handy when isolating failures.

> Do we silently lose test coverage when the automated capability detection fails/is flaky?

So long as the tests/test infra enforce a specific platform is being tested, I don't think so? If we are worried that the test auto-detection can fail silently too, we could push the number of tests run to bencher, and set up a threshold alert?
Having said that ./tests --platform=SNP is pretty simple, and not subject to this failure mode at all. It may well be the right thing to do.
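
To make the explicit option concrete, here is a small sketch of a test entry point that accepts `--platform` and refuses to run if detection disagrees. The helper names are hypothetical, and the detected value is passed in so the example stays self-contained:

```python
import argparse
from typing import List, Optional


def parse_args(argv: List[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Example test launcher")
    # type=str.lower is applied before the choices check, so "SNP" is accepted.
    parser.add_argument(
        "--platform", type=str.lower, choices=["snp", "virtual"], default=None
    )
    return parser.parse_args(argv)


def check_expected_platform(expected: Optional[str], detected: str) -> str:
    """Fail loudly if the caller demanded a platform that was not detected."""
    detected = detected.lower()
    if expected is not None and expected.lower() != detected:
        raise RuntimeError(
            f"Expected platform {expected!r} but detected {detected!r}"
        )
    return detected
```

With this shape, an SNP CI job that passes `--platform=SNP` cannot quietly degrade to virtual coverage if detection fails.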

> Do we promote visibility of the chosen platform somewhere? It's in the attestation if we get that far, and I guess we argue that behaviour up-to that point is identical? So we log as soon as we branch, and store it in the KV ASAP.

The attestation gets stored in Tx0. I think we want to tighten the initial consortium check to only open if we're happy with the platform, and we should be good from there.

eddyashton · 2025-06-13T16:19:47Z

I'm currently convinced that auto-detection is fine, and we don't need any explicit/tested overrides. In my head:

This means a CCF node will look for SNP IOCTL devices, and produce an SNP attestation if it finds them (at this point requiring some additional SNP endorsement server config), else it'll fall back and produce a virtual attestation. The Python infra will do the same check independently, to decide whether it is expecting SNP attestations (with associated tests), and to confirm the attestation format before submitting service open. CMake needs no knowledge of what platforms the tests are running on/targeting, so this can go entirely (I think - maybe something about test labels that need to contain platform info, but I think that can be inserted later). Then we convince ourselves that some of the tests are actually running on SNP, by writing a standalone "confirm_this_is_snp" script/util, and calling that early in the appropriate CI jobs.
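
As a sketch of the standalone "confirm_this_is_snp" idea, something along these lines could run early in an SNP CI job. The device paths are assumptions for illustration, and the function returns an exit code rather than exiting so that it is easy to test:

```python
import os
import sys
from typing import Callable, Sequence

# Assumed SNP guest device paths; not taken from the PR text.
ASSUMED_SNP_DEVICES = ("/dev/sev", "/dev/sev-guest")


def confirm_this_is_snp(
    devices: Sequence[str] = ASSUMED_SNP_DEVICES,
    exists: Callable[[str], bool] = os.path.exists,
) -> int:
    """Return 0 if any SNP device is present, 1 otherwise."""
    found = [d for d in devices if exists(d)]
    if found:
        print(f"SNP device(s) found: {', '.join(found)}")
        return 0
    print(
        f"No SNP device found (checked: {', '.join(devices)})",
        file=sys.stderr,
    )
    return 1
```

A CI step could end with `sys.exit(confirm_this_is_snp())`, so a job that believes it is on SNP hardware fails immediately when it is not.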

Possible things we can support in future:

Running SNP nodes from non-SNP host-infra. We're moving away from this in the current infra, since it's expensive to maintain and unused. The story here would be simple - the detection needs to run on the target, not the host.
A manual override for getting a virtual attestation despite SNP being available. I've wanted to do this once before, but very manually. Tempted to bury a magic env var that provides this, and exclude it from any automated testing? Plumbing the same override through the Python infra would be expensive and unnecessary, but is also easier to patch in manually later.
Running cross-platform upgrade tests. This would need either of the above options - infra launches remote nodes, or is able to override what some local nodes do. Not currently tested, can add when it's needed.

Copilot

Pull Request Overview

This PR removes the build‐time split between SNP and Virtual targets in both CMake and code, consolidating enclave handling into a single build with runtime platform detection.

Eliminates COMPILE_TARGET and PLATFORM_* defines, simplifying CMake scripts and sample configs
Introduces infra.platform_detection for test suites and replaces compile‐time checks with is_snp()/is_virtual()
Updates C++ host/enclave layers to use ccf::pal::platform and refactors quote generation

Reviewed Changes

Copilot reviewed 70 out of 71 changed files in this pull request and generated 1 comment.

File	Description
tla/make_traces.sh	Removed `-DCOMPILE_TARGET=virtual` from trace‐generation script
tests/infra/platform_detection.py	Added runtime detection (`is_snp`, `is_virtual`, `SNP_SUPPORT`)
tests/infra/network.py	Unified key setup to always symlink
tests/connections.py	Replaced conditional timeout with fixed value
tests/infra/utils.py	Removed enclave_type parameter and adjusted signature

Comments suppressed due to low confidence (2)

tests/infra/network.py:542

The original code used cp for SNP platforms and ln -s for Virtual; unconditionally using ln -s may break the intended behavior on SNP. Consider applying infra.platform_detection.is_snp() here to choose cp when true, preserving the prior logic.

        cmd = ["ln", "-s"]

tests/connections.py:193

[nitpick] Tests previously used a longer timeout when running on SNP (timeout=3); fixing it at 1s may introduce flakiness on SNP platforms. Consider restoring a conditional timeout based on infra.platform_detection.is_snp().

                            timeout=1,
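
For reference, the two suppressed suggestions amount to restoring a platform-dependent choice at each site. A sketch of what that would look like follows; the function names are hypothetical, `is_snp` is passed in as a plain boolean rather than calling infra.platform_detection, and this reflects the review suggestion, not the merged code:

```python
from typing import List


def key_setup_command(is_snp: bool, src: str, dst: str) -> List[str]:
    # Prior behaviour per the review comment: copy on SNP, symlink on Virtual.
    cmd = ["cp"] if is_snp else ["ln", "-s"]
    return cmd + [src, dst]


def connection_timeout_s(is_snp: bool) -> int:
    # Prior behaviour per the review comment: 3s on SNP, 1s otherwise.
    return 3 if is_snp else 1
```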

eddyashton added 8 commits June 11, 2025 09:32

Stop building platform-specific variants of 3rd party libraries

d08edb1

Cleanups from the last one, squash CCF_PROJECT values

bb5f363

Stop building platform-specifics of _more_ libraries

cd10c75

Probably too much for 1 commit! Remove COMPILE_TARGET, retain a CCF_TEST_PLATFORM for some test control

43d89c5

Ophiological surgery

5d4db59

Remove snp.IS_SNP (retain related snp.SNP_SUPPORT), and format etc

1b8c48c

More format

bc09875

Undocument removed option

3e8bd88

eddyashton added 2 commits June 11, 2025 14:42

Wrong case for type

9989345

Merge branch 'main' of github.com:microsoft/CCF into no_platforms

cff3bcd

lts_compat needs some restoration

76c29a5

eddyashton added 3 commits June 11, 2025 17:19

Format

14f6229

Remove debug logging

c5941ee

Merge branch 'main' of github.com:microsoft/CCF into no_platforms

9bbfcac

eddyashton added the run-long-test label Jun 13, 2025

eddyashton added 3 commits June 13, 2025 14:23

Better version check, for lts?

b6fcb46

ahaerh

0aec063

Merge branch 'main' of github.com:microsoft/CCF into no_platforms

f680361

eddyashton added 6 commits June 16, 2025 09:41

Merge branch 'main' of github.com:microsoft/CCF into no_platforms

0c95a1a

Doesn't feel right, but bump the version

b8167f1

Update docs to remove platform-specific instructions

408972f

Add an infra/platform.py file for platform auto-detection

e0122b4

Try removing platform-dependent timeout

3c6dfe8

Steps towards removing args.platform

f0753fb

eddyashton and others added 6 commits July 4, 2025 14:04

Merge branch 'main' of github.com:microsoft/CCF into no_platforms

d45195c

Update docs

cbb82fb

Bump version, add CHANGELOG

e4a2323

Merge branch 'main' into no_platforms

021dbfc

Merge branch 'main' of https://github.com/microsoft/CCF into no_platforms

a4e1547

Bump version

4db8bd9

eddyashton marked this pull request as ready for review July 9, 2025 13:31

eddyashton requested a review from a team as a code owner July 9, 2025 13:31