From e77919621d611a597c2b6b8615aadfa34d0509f1 Mon Sep 17 00:00:00 2001 From: David Enyeart Date: Mon, 25 Nov 2019 01:53:17 -0500 Subject: [PATCH] [FAB-17135] Private data sharing doc Document private data enhancements in v2.0 including private data sharing patterns and example. Signed-off-by: David Enyeart --- docs/source/private-data/private-data.md | 219 ++++++++++++++++++++++- 1 file changed, 213 insertions(+), 6 deletions(-) diff --git a/docs/source/private-data/private-data.md b/docs/source/private-data/private-data.md index 985f27dd125..ee5e7332f46 100644 --- a/docs/source/private-data/private-data.md +++ b/docs/source/private-data/private-data.md @@ -21,9 +21,8 @@ A collection is the combination of two elements: 1. **The actual private data**, sent peer-to-peer [via gossip protocol](../gossip.html) to only the organization(s) authorized to see it. This data is stored in a - private state database on the peers of authorized organizations (sometimes - called a "side" database, or "SideDB"), which can be accessed from chaincode - on these authorized peers. + private state database on the peers of authorized organizations, + which can be accessed from chaincode on these authorized peers. The ordering service is not involved here and does not see the private data. Note that because gossip distributes the private data peer-to-peer across authorized organizations, it is required to set up anchor peers on the channel, @@ -45,6 +44,12 @@ third party can then compute the hash of the private data and see if it matches state on the channel ledger, proving that the state existed between the collection members at a certain point in time. +In some cases, you may decide to have a set of collections each comprised of a +single organization. For example an organization may record private data in their own +collection, which could later be shared with other channel members and +referenced in chaincode transactions. 
We'll see examples of this in the sharing +private data topic below. + ### When to use a collection within a channel vs. a separate channel * Use **channels** when entire transactions (and ledgers) must be kept @@ -91,9 +96,7 @@ private data collections **(PDC)** can be defined to share private data between: Using this example, peers owned by the **Distributor** will have multiple private databases inside their ledger which includes the private data from the **Distributor**, **Farmer** and **Shipper** relationship and the -**Distributor** and **Wholesaler** relationship. Because these databases are kept -separate from the database that holds the channel ledger, private data is -sometimes referred to as "SideDB". +**Distributor** and **Wholesaler** relationship. ![private-data.private-data](./PrivateDataConcept-3.png) @@ -140,6 +143,210 @@ documentation on [transaction flow](../txflow.html). their copy of the private state database and private writeset storage. The private data is then deleted from the `transient data store`. +## Sharing private data + +In many scenarios private data keys/values in one collection may need to be shared with +other channel members or with other private data collections, for example when you +need to transact on private data with a channel member or group of channel members +who were not included in the original private data collection. The receiving parties +will typically want to verify the private data against the on-chain hashes +as part of the transaction. + +There are several aspects of private data collections that enable the +sharing and verification of private data: + +* First, you don't necessarily have to be a member of a collection to write to a key in + a collection, as long as the endorsement policy is satisfied. + Endorsement policy can be defined at the chaincode level, key level (using state-based + endorsement), or collection level (starting in Fabric v2.0). 
+ +* Second, starting in v1.4.2 there is a chaincode API GetPrivateDataHash() that allows + chaincode on non-member peers to read the hash value of a private key. This is an + important feature as you will see later, because it allows chaincode to verify private + data against the on-chain hashes that were created from private data in previous transactions. + +This ability to share and verify private data should be considered when designing +applications and the associated private data collections. +While you can certainly create sets of multilateral private data collections to share data +among various combinations of channel members, this approach may result in a large +number of collections that need to be defined. +Alternatively, consider using a smaller number of private data collections (e.g. +one collection per organization, or one collection per pair of organizations), and +then sharing private data with other channel members, or with other +collections as the need arises. Starting in Fabric v2.0, implicit organization-specific +collections are available for any chaincode to utilize, +so that you don't even have to define these per-organization collections when +deploying chaincode. + +### Private data sharing patterns + +When modeling private data collections per organization, multiple patterns become available +for sharing or transferring private data without the overhead of defining many multilateral +collections. Here are some of the sharing patterns that could be leveraged in chaincode +applications: + +* **Use a corresponding public key for tracking public state** - + You can optionally have a matching public key for tracking public state (e.g. asset + properties, current ownership, etc.), and for every organization that should have access + to the asset's corresponding private data, you can create a private key/value in each + organization's private data collection. 
+ +* **Chaincode access control** - + You can implement access control in your chaincode, to specify which clients can + query private data in a collection. For example, store an access control list + for a private data collection key or range of keys, then in the chaincode get the + client submitter's credentials (using GetCreator() chaincode API or CID library API + GetID() or GetMSPID() ), and verify they have access before returning the private + data. Similarly you could require a client to pass a passphrase into chaincode, + which must match a passphrase stored at the key level, in order to access the + private data. Note, this pattern can also be used to restrict client access to public + state data. + +* **Sharing private data out of band** - + As an off-chain option, you could share private data out of band with other + organizations, and they can hash the key/value to verify it matches + the on-chain hash by using GetPrivateDataHash() chaincode API. For example, + an organization that wishes to purchase an asset from you may want to verify + an asset's properties and that you are the legitimate owner by checking the + on-chain hash, prior to agreeing to the purchase. + +* **Sharing private data with other collections** - + You could 'share' the private data on-chain with chaincode that creates a matching + key/value in the other organization's private data collection. You'd pass the + private data key/value to chaincode via transient field, and the chaincode + could confirm a hash of the passed private data matches the on-chain hash from + your collection using GetPrivateDataHash(), and then write the private data to + the other organization's private data collection. + +* **Transferring private data to other collections** - + You could 'transfer' the private data with chaincode that deletes the private data + key in your collection, and creates it in another organization's collection. 
+ Again, use the transient field to pass the private data upon chaincode invoke, + and in the chaincode use GetPrivateDataHash() to confirm that the data exists in + your private data collection, before deleting the key from your collection and + creating the key in another organization's collection. To ensure that a + transaction always deletes from one collection and adds to another collection, + you may want to require endorsements from additional parties, such as a + regulator or auditor. + +* **Using private data for transaction approval** - + If you want to get a counterparty's approval for a transaction before it is + completed (e.g. an on-chain record that they agree to purchase an asset for + a certain price), the chaincode can require them to 'pre-approve' the transaction, + by writing a private key to either their private data collection or your collection, + which the chaincode will then check using GetPrivateDataHash(). In fact, this is + exactly the same mechanism that the built-in lifecycle system chaincode uses to + ensure organizations agree to a chaincode definition before it is committed to + a channel. Starting with Fabric v2.0, this pattern + becomes more powerful with collection-level endorsement policies, to ensure + that the chaincode is executed and endorsed on the collection owner's own trusted + peer. Alternatively, a mutually agreed key with a key-level endorsement policy + could be used, which is then updated with the pre-approval terms and endorsed + on peers from the required organizations. + +* **Keeping transactors private** - + Variations of the prior pattern can also eliminate leaking the transactors for a given + transaction. For example, a buyer indicates agreement to buy in their own collection, + then in a subsequent transaction the seller references the buyer's private data in + their own private data collection. 
The proof of transaction with hashed references + is recorded on-chain. Only the buyer and seller know that they are the transactors, + but they can reveal the pre-images if a need-to-know arises, such as in a subsequent + transaction with another party who could verify the hashes. + +Coupled with the patterns above, it is worth noting that transactions with private +data can be bound to the same conditions as regular channel state data, specifically: + +* **Key level transaction access control** - + You can include ownership credentials in a private data value, so that subsequent + transactions can verify that the submitter has ownership privilege to share or transfer + the data. In this case the chaincode would get the submitter's credentials + (e.g. using GetCreator() chaincode API or CID library API GetID() or GetMSPID()), + combine them with other private data that gets passed to the chaincode, hash the result, + and use GetPrivateDataHash() to verify that it matches the on-chain hash before + proceeding with the transaction. + +* **Key level endorsement policies** - + As with normal channel state data, you can use state-based endorsement + to specify which organizations must endorse transactions that share or transfer + private data, using SetPrivateDataValidationParameter() chaincode API, + for example to specify that only an owner's organization peer, custodian's organization + peer, or other third party must endorse such transactions. + +### Private data sharing example + +The private data sharing patterns mentioned above can be combined to enable powerful +chaincode-based applications. For example, consider how an asset transfer scenario +could be implemented using per-organization private data collections: + +* An asset may be tracked by a UUID key in public chaincode state. Only the asset's + ownership is recorded; nothing else is known about the asset. 
+ +* The chaincode will require that any transfer request must originate from the owning client, + and the key is bound by state-based endorsement requiring that a peer from the + owner's organization and a regulator's organization must endorse any transfer requests. + +* The asset owner's private data collection contains the private details about + the asset, keyed by a hash of the UUID. Other organizations and the ordering + service will only see a hash of the asset details. + +* Let's assume the regulator is a member of each collection as well, and therefore + persists the private data, although this need not be the case. + +A transaction to trade the asset would unfold as follows: + +1. Off-chain, the owner and a potential buyer strike a deal to trade the asset + for a certain price. + +2. The seller provides proof of their ownership, by either passing the private details + out of band, or by providing the buyer with credentials to query the private + data on their node or the regulator's node. + +3. Buyer verifies a hash of the private details matches the on-chain public hash. + +4. The buyer invokes chaincode to record their bid details in their own private data collection. + The chaincode is invoked on buyer's peer, and potentially on regulator's peer if required + by the collection endorsement policy. + +5. The current owner (seller) invokes chaincode to sell and transfer the asset, passing in the + private details and bid information. The chaincode is invoked on peers of the + seller, buyer, and regulator, in order to meet the endorsement policy of the public + key, as well as the endorsement policies of the buyer and seller private data collections. + +6. The chaincode verifies that the submitting client is the owner, verifies the private + details against the hash in the seller's collection, and verifies the bid details + against the hash in the buyer's collection. 
The chaincode then writes the proposed + updates for the public key (setting ownership to the buyer, and setting endorsement + policy to be the buying organization and regulator), writes the private details to the + buyer's private data collection, and potentially deletes the private details from seller's + collection. Prior to final endorsement, the endorsing peers ensure private data is + disseminated to any other authorized peers of the seller and regulator. + +7. The seller submits the transaction with the public data and private data hashes + for ordering, and it is distributed to all channel peers in a block. + +8. Each peer's block validation logic will consistently verify the endorsement policy + was met (buyer, seller, regulator all endorsed), and verify that public and private + state that was read in the chaincode has not been modified by any other transaction + since chaincode execution. + +9. All peers commit the transaction as valid since it passed validation checks. + Buyer peers and regulator peers retrieve the private data from other authorized + peers if they did not receive it at endorsement time, and persist the private + data in their private data state database (assuming the private data matched + the hashes from the transaction). + +10. With the transaction completed, the asset has been transferred, and other + channel members interested in the asset may query the history of the public + key to understand its provenance, but will not have access to any private + details unless an owner shares it on a need-to-know basis. + +The basic asset transfer scenario could be extended for other considerations, +for example the transfer chaincode could verify that a payment record is available +to satisfy payment versus delivery requirements, or verify that a bank has +submitted a letter of credit, prior to the execution of the transfer chaincode. 
+And instead of transactors directly hosting peers, they could transact through +custodian organizations who are running peers. + ## Purging private data For very sensitive data, even the parties sharing the private data might want