-
Notifications
You must be signed in to change notification settings - Fork 95
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(agent): Check subnet canister ranges #580
fix(agent): Check subnet canister ranges #580
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work @oggy-dfin! I left a few comments and nits. One main comment was around using await
when calling fetchRootKey
.
@@ -87,8 +87,8 @@ export const request = async (options: { | |||
const response = await agent.readState(canisterId, { | |||
paths: [encodedPaths[index]], | |||
}); | |||
const cert = new Certificate(response, agent); | |||
const verified = await cert.verify(); | |||
const cert = new Certificate(response.certificate, agent.fetchRootKey()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agent.fetchRootKey()
is asynchronous, so this would need an await
- otherwise, the value that gets passed to the Certificate constructor is a Promise.
const rootKey = await agent.fetchRootKey();
const cert = new Certificate(response.certificate, rootKey);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Certiificate
constructor takes a Promise
, it's later awaited on in Certificate.verify
, where it's used - I didn't want to block before the point where it was needed (not that it makes much of a difference in practice, as the user pretty much has to call verify
to do anything with the certificate).
const cert = new Certificate(response, agent); | ||
const verified = await cert.verify(); | ||
const cert = new Certificate(response.certificate, agent.fetchRootKey()); | ||
const verified = await cert.verify(canisterId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
quick check here - this changes the interface of verify
. We should bump agent-js to the next minor version instead of the next patch version.
also, is there anywhere else in the codebase that calls verify()
without arguments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have the same concern. Not all certificates contain a path with canister_id. How do we verify the certificate if we only want to read non canister related paths, e.g. /time
, /subnet/
and /request_status
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point - I'll bump the version up. We could also consider rejigging the type of verify
(or getting rid of it) - see the answer to your next comment.
For the other point (other places calling verify
), I'd hope that the Typescript type system is sound enough to catch this? However, I did miss one place, revealed by the failed PR tests (I didn't check these, my bad), in e2e
, because I never called the TS compiler on that package. Let's discuss with Kyle to add a command that builds the whole monorepo in the right order and catches all such errors? I tried to do this as we discussed a couple of weeks ago, but didn't get anywhere in an hour or so that I spent on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For @chenyan-dfinity 's point: any call to read_state
(including those to get time, subnet and request status) has to have an effective canister ID, and the scope of the certificate must be checked against this ID:
https://internetcomputer.org/docs/current/references/ic-interface-spec/#http-read-state
Is there another way to obtain such a certificate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd hope that the Typescript type system is sound enough to catch this?
Nope. TS type system is NOT sound. Never rely on its type checker.
any call to read_state (including those to get time, subnet and request status) has to have an effective canister ID, and the scope of the certificate must be checked against this ID
Good point. Then we could use effective_canister_id
as the argument name? There is also an exception that provisional_create_canister_with_cycles
allows effective canister id to be any id (https://docs.dfinity.systems/spec/public/#http-effective-canister-id). So we need to at least disable the check for the provisional API.
Also, can this effective_canister_id
be a member of the Certificate class, so that we don't have to pass the canister id when calling verify
? This also makes it a non-breaking change. We can disable the check if the effective_canister_id is missing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we need to at least disable the check for the provisional API.
I think the interface spec is actually wrong on this, as it says that any canister is an effective canister ID for provisional_create_canister_with_cycles
, but then also requires that the delegation is checked for the call to read_state
. I think we should disable the check for the case when aaaaa-aa
is the effective canister ID, as per the spec it can only be the effective canister ID for the provisional_create_canister_with_cycles
call. I'll discuss this with Jens once he's back from vacation.
Also, can this effective_canister_id be a member of the Certificate class, so that we don't have to pass the canister id when calling verify? This also makes it a non-breaking change. We can disable the check if the effective_canister_id is missing.
Since we're fixing a security flaw, I think we have to make a breaking change (callers that don't change the call would stay vulnerable). After discussing with Jason, since it's a breaking change already, I've changed the API to perform the verification immediately in the constructor - we couldn't find a use case where you'd want to delay the verification (do you maybe see one?). Also, I unified the error reporting to always use exceptions, instead of a mixture of exceptions and Booleans as was previously the case for verify
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the interface spec is actually wrong on this
I think the spec is fine. We just need to disable the check when the effective canister id is aaaaa-aa
. agent-rs does the same thing: https://github.com/dfinity/agent-rs/blob/main/ic-agent/src/agent/mod.rs#L1214
Since we're fixing a security flaw, I think we have to make a breaking change (callers that don't change the call would stay vulnerable)
Not sure it has to be a breaking change. Callers that didn't upgrade agent-js to the fixed version stays vulnerable. It doesn't have to be a breaking change to fix a vulnerability.
I've changed the API to perform the verification immediately in the constructor - we couldn't find a use case where you'd want to delay the verification (do you maybe see one?).
There might be a use case that the caller doesn't want to do a verification. If we verify in the constructor, it can be harder to skip verification? The problem is that bls verify is written in Wasm, some platforms don't support wasm execution, e.g. mobile platform, chrome plugins. In those environments, they either have to use a slower JS implementation or skip the verification completely. I know a grant awardee is writing a Chrome plugin to inspect all IC traffic, and cannot run verify in the plugin setting. Since it's a debugging tool, it's not too bad to disable the verification.
try { | ||
const result = await cert.verify(canisterId); | ||
expect(result).toEqual(false); | ||
} catch (e) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question about this test - it sounds like you want to test that an exception gets thrown. shouldn't you put the expect
in the catch
block then? and rejig the test so that the call to verify
throws an error? one way you can force verify
to throw is by using a mock
on the other hand, if all you want to check is that result
is a boolean, the try / catch
isnt necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem here is that the interface of verify
is a bit smelly. It signals some errors by returning false
, but others by throwing instead. Here I know that it will throw because I wrote it, but I wanted to cover both ways of reporting errors.
IMO we should clean it up. I think an event better alternative would be to have the constructor take the canisterId
as the parameter and internally call verify
, which blows up earlier which IMO is a good thing (TM). Thoughts? Should I do that if we're already mucking around with the interfaces and doing backwards incompatible changes?
packages/agent/src/certificate.ts
Outdated
|
||
constructor(response: ReadStateResponse, private _agent: Agent = getDefaultAgent()) { | ||
this.cert = cbor.decode(new Uint8Array(response.certificate)); | ||
constructor(certificate: ArrayBuffer, private _rootKey: Promise<ArrayBuffer>) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_rootKey
should probably be type ArrayBuffer
- Promises are typically not passed into constructors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we change the interface to stick the canisterId
into the certificate that makes total sense - if we for whatever reason want to call verify
separately (I don't know why we would), a Promise
would allow to delay blocking until we need it.
packages/agent/src/certificate.ts
Outdated
throw new Error('fail to verify delegation certificate'); | ||
} | ||
|
||
const lookup = cert.lookup(['subnet', d.subnet_id, 'public_key']); | ||
if (!lookup) { | ||
if (canisterId.compareTo(Principal.managementCanister()) != 'eq') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
javascript is weird, so this should be !==
instead of !=
. If the latter is used, type coercion is performed. please disregard if type coercion is intentional here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this post is correct, there'd be no coercion in TS (it'd be a compile-time error instead)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not rely on TS for correctness. If one side the operand is any
type, the check will pass, and we still get coercion when translated to JS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, changed to !==
packages/principal/src/index.test.ts
Outdated
const anonymous = Principal.anonymous(); | ||
const principal1 = Principal.fromText('ryjl3-tyaaa-aaaaa-aaaba-cai'); | ||
const principal2 = Principal.fromText('ivg37-qiaaa-aaaab-aaaga-cai'); | ||
for (let p of [anonymous, principal1, principal2]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lint error here: the tests are failing because this is a let
instead of const
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the same check is missing in agent-rs
.
if (!d) { | ||
if (!this._rootKey) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why removing this logic? fetchRootKey()
should only be called in local replica. The root key for the mainnet is hardcoded and doesn't require explicit fetching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching this - my fix had made things worse! However, I do think that this is the wrong place for this logic - there's no real need for Certificate
to know anything about the agent IMO (I ended up staring at this piece of code for a while when first looking it, and obviously got it wrong), it just needs a root key.
I changed the code now so that the callers use the root key if available. We can probably extract a convenience method on Agent
to do this.
const cert = new Certificate(response, agent); | ||
const verified = await cert.verify(); | ||
const cert = new Certificate(response.certificate, agent.fetchRootKey()); | ||
const verified = await cert.verify(canisterId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have the same concern. Not all certificates contain a path with canister_id. How do we verify the certificate if we only want to read non canister related paths, e.g. /time
, /subnet/
and /request_status
?
The check was added to |
f46ce59
to
626077f
Compare
626077f
to
76ac41c
Compare
const cert = new Certificate(response, agent); | ||
const verified = await cert.verify(); | ||
const cert = new Certificate(response.certificate, agent.fetchRootKey()); | ||
const verified = await cert.verify(canisterId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd hope that the Typescript type system is sound enough to catch this?
Nope. TS type system is NOT sound. Never rely on its type checker.
any call to read_state (including those to get time, subnet and request status) has to have an effective canister ID, and the scope of the certificate must be checked against this ID
Good point. Then we could use effective_canister_id
as the argument name? There is also an exception that provisional_create_canister_with_cycles
allows effective canister id to be any id (https://docs.dfinity.systems/spec/public/#http-effective-canister-id). So we need to at least disable the check for the provisional API.
Also, can this effective_canister_id
be a member of the Certificate class, so that we don't have to pass the canister id when calling verify
? This also makes it a non-breaking change. We can disable the check if the effective_canister_id is missing.
packages/agent/src/certificate.ts
Outdated
throw new Error('fail to verify delegation certificate'); | ||
} | ||
|
||
const lookup = cert.lookup(['subnet', d.subnet_id, 'public_key']); | ||
if (!lookup) { | ||
if (canisterId.compareTo(Principal.managementCanister()) != 'eq') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not rely on TS for correctness. If one side the operand is any
type, the check will pass, and we still get coercion when translated to JS.
packages/agent/src/polling/index.ts
Outdated
const verified = await cert.verify(); | ||
const rootKey = agent.rootKey == null ? agent.fetchRootKey() : Promise.resolve(agent.rootKey); | ||
const cert = new Certificate(state.certificate, rootKey); | ||
const verified = await cert.verify(canisterId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a function to compute the effective canister id, especially when the client is calling aaaaa-aa
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have sufficient information at this point in the code to compute this (there's no request); per the comment, the function already expects an effective canister ID. I don't see any special computation of the effective ID, but I may have missed something.
So I think you're right and we should add it, but I don't think this PR is breaking anything beyond what's broken already. In particular, to read the status of calls to the management canister other than provisional_create_canister_with_cycles
, you need to compute the effective canister ID regardless of whether you're properly checking the certificate, and then read_state
on that ID, otherwise (assuming that the replica actually follows the interface spec) you don't get a response.
9a9e9a7
to
4203c2e
Compare
One note as I'm starting to review this - while I agree that this change should go out under I'll go ahead and clean that up, which should also remove some overhead in the changed files view |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've contributed some doc comments for Principal since I try to include them for all new methods and to add more coverage over time
chore: reverts package versions
Co-authored-by: Kyle Peacock <kylpeacock@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, thanks a lot for implementing the range check @oggy-dfin!
Please find a few comments below.
packages/agent/src/certificate.ts
Outdated
*/ | ||
rootKey: ArrayBuffer; | ||
/** | ||
* The effective canister ID of the request that generated the certificate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Isn't this the canister ID of the canister that generated the certificate? For example:
* The effective canister ID of the request that generated the certificate. | |
* The effective canister ID of the canister that generated the certificate. This canister ID is used in the request URL when issuing the request from the agent. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The canister doesn't necessarily generate the certificate - the result can be just the call to read_state
(for example, to read the status of a request). But you're right, it can also be generated by the canister. I think we can say effective canister ID/signing canister ID, depending on the context. I'll formulate something along those lines.
packages/agent/src/certificate.ts
Outdated
* @see {@link CreateCertificateOptions} | ||
* @param {ArrayBuffer} options.certificate The bytes of the certificate | ||
* @param {ArrayBuffer} options.rootKey The root key to verify against | ||
* @param {Principal} options.canisterId The root key to verify against |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
* @param {Principal} options.canisterId The root key to verify against | |
* @param {Principal} options.canisterId The ID of the canister that generated the certificate |
*/ | ||
public static async create(options: CreateCertificateOptions): Promise<Certificate> { | ||
const cert = new Certificate(options.certificate, options.rootKey, options.canisterId); | ||
await cert.verify(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like that you can now only create certs that are valid! 👍
} | ||
|
||
test('delegation check fails for canisters outside of the subnet range', async () => { | ||
// Use a different principal than the happy path, which isn't in the delegation ranges. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I would have to check the allowed ranges myself here. Would it help to provide the allowed range(s) as a comment here?
also, since it would be cheap: would it make sense to add tests for the boundary cases at the (closed) ends of a range?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH I don't even know what the subnet ranges in this certificate are - I just copied these over from the Rust tests. But even if I did know what they were, I don't know if our textual encoding preserves the ordering?
Boundary cases totally make sense, I'll have a look at those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can actually decode this (it's cbor, https://cbor.me comes in handy).
So the first level of decoding reveals
"delegation": {"subnet_id": h'D77B2A2F7199B9A8AEC93FE6FB588661358CF12223E9A3AF7B4EBAC402', "certificate": h'D9D9F7A26474726565830182045820AE023F28C3B9D966C8FB09F9ED755C828AADB5152E00AAF700B18C9C067294B483018302467375626E6574830182045820E83BB025F6574C8F31233DC0FE289FF546DFA1E49BD6116DD6E8896D90A4946E830182045820E782619092D69D5BEBF0924138BD4116B0156B5A95E25C358EA8CF7E7161A661830183018204582062513FA926C9A9EF803AC284D620F303189588E1D3904349AB63B6470856FC4883018204582060E9A344CED2C9C4A96A0197FD585F2D259DBD193E4EADA56239CAC26087F9C58302581DD77B2A2F7199B9A8AEC93FE6FB588661358CF12223E9A3AF7B4EBAC402830183024F63616E69737465725F72616E6765738203581BD9D9F781824A000000000020000001014A00000000002FFFFF010183024A7075626C69635F6B657982035885308182301D060D2B0601040182DC7C0503010201060C2B0601040182DC7C050302010361009933E1F89E8A3C4D7FDCCCDBD518089E2BD4D8180A261F18D9C247A52768EBCE98DC7328A39814A8F911086A1DD50CBE015E2A53B7BF78B55288893DAA15C346640E8831D72A12BDEDD979D28470C34823B8D1C3F4795D9C3984A247132E94FE82045820996F17BB926BE3315745DEA7282005A793B58E76AFEB5D43D1A28CE29D2D158583024474696D6582034995B8AAC0E4EDA2EA16697369676E61747572655830ACE9FCDD9BC977E05D6328F889DC4E7C99114C737A494653CB27A1F55C06F4555E0F160980AF5EAD098ACC195010B2F7'}
The included delegation cert then specifies for the subnet (in addition to it's public key) the canister_ranges D9D9F781824A000000000020000001014A00000000002FFFFF0101
(which is again cbor. Decoded this results in the following ranges:
[(CanisterId(jrlun-jiaaa-aaaab-aaaaa-cai), CanisterId(v2nog-2aaaa-aaaab-p777q-cai))]
Which are the following principals:
Principal(PrincipalInner { len: 10, bytes: [0, 0, 0, 0, 0, 32, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] }), Principal(PrincipalInner { len: 10, bytes: [0, 0, 0, 0, 0, 47, 255, 255, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] })
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @frederikrothenberger !
I now realized that the interface spec also provides handy tools for decoding the text representation (which indeed doesn't preserve the lexicographic ordering, due to the use of crc-32).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the checks now, switched to a hex representation of the principals as that makes the ranges clearly visible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks for addressing this @oggy-dfin !
delegation?: { subnet_id: ArrayBuffer; certificate: ArrayBuffer }; | ||
}; | ||
|
||
test('certificate verification fails for an invalid signature', async () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding these tests!
The following comment suggests to add more tests, generally for cert verification. I think that should not be part of this PR however, but maybe it makes sense to create a separate ticket / PR for it, WDYT?
IIUC these are now all the tests we have for cert verification right? It seems to me that there is potential to improve the testing, e.g. for the following cases:
- Cert without delegation verifies
- Cert with length 1 delegation chain verifies
- Cert with length >1 delegation chain verifies
- Cert with wrong range in single delegation fails
- Cert with wrong range in >1 depth delegation chain fails
- Cert with invalid signature in single delegation fails
- Cert with invalid signature in >1 depth delegation chain fails
- Tests where the tree reconstruction or path lookup fails
@frederikrothenberger wrote Rust code to generate certificates for tests and actually used that code to generate certificate test vectors in the context of the service worker, see here. IIUC this could be used to generate test vectors for the above cases as well!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes total sense, certification is important. Created a ticket now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfect, thanks for creating the ticket.
Description
We were missing checks that ensured that responses were signed by the correct subnet.
In detail, when the user reads the state of a canister
C
, the IC returns a state tree in a responseResp
. To ensure the integrity of theResp
, the state tree must be signed. It can be signed directly by the root key of the IC, or by a different key (in particular, a subnet key) that is certified by the IC root key through a (possibly nested) delegation. Such delegations are scoped, and valid only for certain canister ranges. ForResp
to be valid,C
has to be included in the ranges of all the delegations, but this check was missing. See:https://internetcomputer.org/docs/current/references/ic-interface-spec/#certification
https://internetcomputer.org/docs/current/references/ic-interface-spec/#http-read-state
https://internetcomputer.org/docs/current/references/ic-interface-spec/#canister-signatures
Fixes SDK-315
How Has This Been Tested?
Unit tests
Checklist: