Covalent: A Smart Policy Platform
The Internet is only as smart as the data it carries. Right now, data is mindless, easily exposed and abused. Imagine a new Internet composed of Smart Data that remembers, keeps a secret, learns from its users, and is open for good. COVA contributes to the making of this new Internet.
The existing Internet protocol suite provides end-to-end data communication, specifying how data should be packetized, addressed, transmitted, routed, and received. However, it does not specify how data should be used. Effective data usage control must regulate what happens to data after access has been granted. That requirement makes the problem extremely difficult, and it remains a last frontier of the Internet protocol suite.
Covalent is a new addition to the Internet protocol suite that specifies and enforces how data should be used. Under the Covalent protocol, data carries a "smart policy". Whereas a usual data usage policy is expressed in natural language and is enforceable only by law, a "smart policy" is specified in a programming language and is enforceable by code.
Example Data Usage Policies
I want …
- that my credit card number may only be used once for a one-time charge. It will be erased from the server after the payment is completed.
- that a patient’s CT Scan may be processed by an aggregate function and not rendered individually.
- that my Snapchat photo may only be rendered individually and not be processed by facial recognition.
- that these text messages can only be viewed after tonight.
- that this e-book can only be shared 15 times.
- that a shopper’s mobile GPS data may not be combined with her other data to construct an intrusive customer profile.
- that k-anonymity or differential privacy is applied before my data can be used in aggregate.
- that only a specific open-source codebase can use my data.
- that an ad or disclaimer has to be viewed before the rest of the content.
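To make the idea of a policy "specified in a programming language" concrete, here is a minimal Python sketch of two of the policies above: the e-book that can be shared only 15 times and the messages that can be viewed only after a given time. The names `SmartData` and `PolicyViolation` are hypothetical and are not part of Covalent. The sketch only shows how such rules can be stated as code; an in-process object like this could be bypassed by whoever holds it, which is the gap a trusted enforcement layer is meant to close.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


class PolicyViolation(Exception):
    """Raised when a requested use of the data is not permitted."""


@dataclass
class SmartData:
    """Hypothetical container pairing a payload with its usage policy."""

    payload: bytes
    max_shares: Optional[int] = None        # e.g. 15 for the e-book example
    not_before: Optional[datetime] = None   # e.g. "viewable only after tonight"
    shares_used: int = 0

    def view(self, now: datetime) -> bytes:
        # Time-based rule: refuse to release the payload too early.
        if self.not_before is not None and now < self.not_before:
            raise PolicyViolation("data may not be viewed yet")
        return self.payload

    def share(self) -> bytes:
        # Count-based rule: refuse once the share budget is spent.
        if self.max_shares is not None and self.shares_used >= self.max_shares:
            raise PolicyViolation("share limit reached")
        self.shares_used += 1
        return self.payload
```

With `SmartData(b"...", max_shares=15)`, the first 15 calls to `share()` succeed and the 16th raises `PolicyViolation`.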
Covalent experiments with a "Hardware + Software" approach.
- "Hardware" refers to the lower layer of Trusted Execution Environments (TEEs). We first incentivize the creation of a network of TEE nodes, which provides a scalable amount of "trusted" computing power. This trusted computing power lets us make an important assumption about how critical "supervisor" programs behave. With this assumption, a much wider repertoire of tools for policy definition and enforcement becomes available to the upper layer.
- "Software" refers to the upper layer: a Policy Specification Language (PSL) that specifies data usage policies in a programming language, and a Policy Enforcement Framework (PEF) that enforces the PSL through code. A simplified demo of the PSL + PEF setup is the implementation of usage policies commonly used in AI ("that a patient’s CT Scan may be processed by an aggregate function and not rendered individually."). We then work out a generalization in which a wider range of data usage policies can be expressed and enforced.
- [Done] Private Testnet: The Private Testnet is built to this specification. This Alpha release implements a standalone blockchain network of TEE nodes (Intel SGX) that communicate with a smart contract system (currently bigchaindb/tendermint). The source code is available here.
- Public Testnet: Adjust the setup according to these schematics in preparation for Smart Policy software. Release this version for public testing.
- [Done] Private Testnet: The Private Testnet is built to this specification. It is a starting point for our over-arching Smart Policy framework. It implements a specific Smart Policy through a specific system architecture. The specific Smart Policy is: "only certain functions from the Python Sklearn library can be executed on data, and the returned outputs are checked for compliance". The specific architecture uses the TEE nodes only as control units, while the heavy-lifting computation takes place in a cloud environment that has inherited the TEE's security and privacy guarantees. This Smart Policy and system architecture are adequate and practical for a large number of data scientists and AI practitioners. The source code is available here.
- Public Testnet: Creation of a new language (Centrifuge) and a new virtual machine (Cova Virtual Machine) in order to implement "Smart Policy" so that a wide range of data usage policies can be expressed and enforced. Release this version for public testing.
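The Private Testnet policy described above ("only certain functions ... can be executed on data, and the returned outputs are checked for compliance") can be pictured with a small sketch. This is not the Covalent implementation. It assumes a hypothetical allowlist, uses functions from Python's standard `statistics` module as stand-ins for approved Sklearn functions so that it runs without extra dependencies, and assumes a simple output rule (a single scalar computed over at least `MIN_RECORDS` records) as one possible meaning of "compliance".

```python
import statistics
from typing import Callable, Dict, Sequence


class PolicyViolation(Exception):
    """Raised when a request or its output breaks the usage policy."""


# Hypothetical allowlist: stand-ins for the approved library functions.
ALLOWED_FUNCTIONS: Dict[str, Callable[[Sequence[float]], float]] = {
    "mean": statistics.mean,
    "median": statistics.median,
    "stdev": statistics.stdev,
}

# Assumed threshold: aggregates over very few records can reveal individuals.
MIN_RECORDS = 5


def run_under_policy(function_name: str, records: Sequence[float]) -> float:
    """Run an allowlisted aggregate on the data and check the output."""
    # Input check: only allowlisted functions may touch the data.
    if function_name not in ALLOWED_FUNCTIONS:
        raise PolicyViolation(f"function {function_name!r} is not allowed")
    if len(records) < MIN_RECORDS:
        raise PolicyViolation("too few records for an aggregate result")

    result = ALLOWED_FUNCTIONS[function_name](records)

    # Output check: only a single number may leave, never the raw records.
    if isinstance(result, bool) or not isinstance(result, (int, float)):
        raise PolicyViolation("output is not a scalar aggregate")
    return float(result)
```

In this sketch the caller never receives the records themselves; a request such as `run_under_policy("sorted", data)` is rejected before any computation takes place. In the architecture described above, a check of this kind would sit in the TEE control unit rather than in ordinary application code.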