diff --git a/README.bigpicture.md b/README.bigpicture.md deleted file mode 100644 index baeb8794..00000000 --- a/README.bigpicture.md +++ /dev/null @@ -1,199 +0,0 @@ -# Postgres data security with CipherStash - -## Introduction - -This reference guide provides a comprehensive overview of CipherStash's encryption in use solution, including the CipherStash Proxy and the Encrypt Query Language (EQL). -It is designed for developers and engineers who need to implement robust data security in PostgreSQL without sacrificing performance or usability. - -## Table of Contents - -1. [Encryption in use](#1-encryption-in-use) - - [1.1 What is encryption in use?](#11-what-is-encryption-in-use) - - [1.2 Why use encryption in use?](#12-why-use-encryption-in-use) -2. [CipherStash Proxy](#2-cipherstash-proxy) - - [2.1 Overview](#21-overview) - - [2.2 How it works](#22-how-it-works) - - [2.3 Setup and configuration](#23-setup-and-configuration) -3. [Encrypt Query Language (EQL)](#3-encrypt-query-language-eql) - - [3.1 Overview](#31-overview) - - [3.2 Key components](#32-key-components) - - [3.2.1 Encrypted columns](#321-encrypted-columns) - - [3.2.2 EQL functions](#322-eql-functions) - - [3.2.3 Data format](#323-data-format) - - [3.3 Using EQL](#33-using-eql) - - [3.3.1 Write operations](#331-write-operations) - - [3.3.2 Read operations](#332-read-operations) -4. [Best practices](#4-best-practices) -5. [Advanced topics](#5-advanced-topics) - - [5.1 Integrating without proxy](#51-integrating-without-proxy) -6. [Conclusion](#6-conclusion) - -## 1. Encryption in use - -### 1.1 What is encryption in use? - -Encryption in use is the practice of keeping data encrypted even while it's being processed or queried in the database. -Unlike traditional encryption methods that secure data only at rest (on disk) or in transit (over the network), encryption in use keeps the data encrypted while operations are being performed on the data. 
-This provides an additional layer of security against unauthorized access — an adversary needs access to the encrypted data _and_ encryption keys. - -### 1.2 Why use encryption in use? - -While encryption at rest and in transit are essential, they don't protect data when the database server itself is compromised. -Encryption in use mitigates this risk by ensuring that: - -- **Data remains secure**: Even if the database server is breached, the data remains encrypted and unreadable without the proper keys. -- **Compliance controls are stronger**: When you need stronger data security controls than what SOC2/SOC3 or ISO27001 mandate, encryption in use helps you meet those stringent requirements. - -## 2. CipherStash Proxy - -### 2.1 Overview - -CipherStash Proxy is a transparent proxy that sits between your application and your PostgreSQL database. -It intercepts SQL queries and handles the encryption and decryption of data on-the-fly. -This enables encryption in use without significant changes to your application code. - -### 2.2 How it works - -- **Intercepts queries**: CipherStash Proxy captures SQL statements from the client application. -- **Encrypts data**: For write operations, it encrypts the plaintext data before sending it to the database. -- **Decrypts data**: For read operations, it decrypts the encrypted data retrieved from the database before returning it to the client. -- **Maintains searchability**: Ensures that the encrypted data is searchable and retrievable without sacrificing performance or application functionality. -- **Manages encryption keys**: Securely handles encryption keys required for encrypting and decrypting data. - -### 2.3 Setup and configuration - -1. **Getting started**: Follow the official [Getting Started guide](https://cipherstash.com/docs/getting-started/cipherstash-proxy) to install and configure CipherStash Proxy. -3. 
**Application modification**: Update your application's database connection configuration to point to the Proxy instead of the database directly. - -**Example connection string update:** - -```plaintext -# Original -postgresql://user:password@postgres.host:5432/mydb - -# Updated -postgresql://user:password@cipherstash-proxy.host:6432/mydb -``` - -## 3. Encrypt Query Language (EQL) - -### 3.1 Overview - -Encrypt Query Language (EQL) is a set of PostgreSQL functions and data types provided by CipherStash to work with encrypted data and indexes. -EQL allows you to perform queries on encrypted data without decrypting it, supporting operations like equality checks, range queries, and unique constraints. - -### 3.2 Key components - -#### 3.2.1 Encrypted columns - -Encrypted columns are defined using the `cs_encrypted_v1` domain type, which extends the `jsonb` type with additional constraints to ensure data integrity. - -**Example table definition:** - -```sql -CREATE TABLE users ( - id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, - name_encrypted cs_encrypted_v1 -); -``` - -#### 3.2.2 EQL functions - -EQL provides specialized functions to interact with encrypted data: - -- **`cs_ciphertext_v1(val JSONB)`**: Extracts the ciphertext for decryption by CipherStash Proxy. -- **`cs_match_v1(val JSONB)`**: Retrieves the match index for equality comparisons. -- **`cs_unique_v1(val JSONB)`**: Retrieves the unique index for enforcing uniqueness. -- **`cs_ore_v1(val JSONB)`**: Retrieves the Order-Revealing Encryption index for range queries. 
- -#### 3.2.3 Data Format - -Encrypted data is stored as `jsonb` with a specific schema: - -- **Plaintext Payload (Client Side):** - - ```json - { - "v": 1, - "k": "pt", - "p": "plaintext value", - "e": { - "t": "table_name", - "c": "column_name" - } - } - ``` - -- **Encrypted Payload (Database Side):** - - ```json - { - "v": 1, - "k": "ct", - "c": "ciphertext value", - "e": { - "t": "table_name", - "c": "column_name" - } - } - ``` - -Please refer to the [EQL reference guide](https://cipherstash.com/docs/getting-started/cipherstash-encrypt) for more information on the `jsonb` schema. - -### 3.3 Using EQL - -#### 3.3.1 Write Operations - -When inserting data: - -1. **Application sends plaintext**: Wrap the plaintext in the appropriate JSON structure. - - ```sql - INSERT INTO users (name_encrypted) VALUES ('{"p": "Alice"}'); - ``` - -2. **Proxy encrypts data**: CipherStash Proxy encrypts the plaintext before storing it in the database. - -#### 3.3.2 Read Operations - -When querying data: - -1. **Use EQL functions**: Wrap encrypted columns and query parameters with EQL functions. - - ```sql - SELECT cs_ciphertext_v1(name_encrypted) - FROM users - WHERE cs_match_v1(name_encrypted) @> cs_match_v1('{"p": "Alice"}'); - ``` - -2. **Proxy decrypts data**: CipherStash Proxy decrypts the results before returning them to the application. - -## 4. Best Practices - -- **Leverage CipherStash Proxy**: Use CipherStash Proxy to handle encryption/decryption transparently. -- **Utilize EQL functions**: Always use EQL functions when interacting with encrypted data. -- **Define constraints**: Apply database constraints to maintain data integrity. -- **Secure key management**: Ensure encryption keys are securely managed and stored. -- **Monitor performance**: Keep an eye on query performance and optimize as needed. - -## 5. 
Advanced Topics - -### 5.1 Integrating without CipehrStash Proxy - -> The SDK approach is currently in development, but if you're interested in contributing, please start a discussion [here](https://github.com/cipherstash/cipherstash). - -For advanced users who prefer to handle encryption within their application: - -- **SDKs available**: Use CipherStash SDKs (at the moment, Rust and TypeScript) to manage encryption/decryption. -- **Manual encryption**: Implement encryption logic in your application code. -- **Data conformity**: Ensure encrypted data matches the expected `jsonb` schema. -- **Key management**: Handle encryption keys securely within your application. - -**Note**: This approach increases complexity and is recommended only if CipherStash Proxy does not meet specific requirements. - -## 6. Conclusion - -CipherStash's encryption in use solution, comprising CipherStash Proxy and EQL, provides a practical way to enhance data security in Postgres databases. -By keeping data encrypted even during processing, you minimize the risk of data breaches and comply with stringent security standards without significant changes to your application logic. - -**Contact Support:** For further assistance, start a discussion [here](https://github.com/cipherstash/cipherstash). diff --git a/README.md b/README.md index 2bfe7114..335ca86e 100644 --- a/README.md +++ b/README.md @@ -1,74 +1,180 @@ # CipherStash Encrypt Query Language (EQL) +[![Why we built EQL](https://img.shields.io/badge/Why%20we%20built%20EQL-8A2BE2)](https://github.com/cipherstash/encrypt-query-language/blob/main/WHYEQL.md) + Encrypt Query Language (EQL) is a set of abstractions for transmitting, storing & interacting with encrypted data and indexes in PostgreSQL. EQL provides a data format for transmitting and storing encrypted data & indexes, and database types & functions to interact with the encrypted material. ## Table of Contents -- [1. 
Encryption in use](#1-encryption-in-use) - - [1.1 What is encryption in use?](#11-what-is-encryption-in-use) - - [1.2 Why use encryption in use?](#12-why-use-encryption-in-use) -- [2. CipherStash Proxy](#2-cipherstash-proxy) - - [2.1 Overview](#21-overview) - - [2.2 How it works](#22-how-it-works) - - [2.3 How EQL works with CipherStash Proxy](#23-how-eql-works-with-cipherstash-proxy) - - [2.3.1 Writes](#231-writes) - - [2.3.2 Reads](#232-reads) -- [3. Encrypt Query Language (EQL)](#3-encrypt-query-language-eql) - - [3.1 Encrypted columns](#31-encrypted-columns) - - [3.2 EQL functions](#32-eql-functions) - - [3.3 Index functions](#33-index-functions) - - [3.3 Query Functions](#33-query-functions) - - [3.4 Data Format](#34-data-format) - - [3.4.1 Helper packages](#341-helper-packages) -- [4. Getting started](#4-getting-started) - - [4.1 Prerequisites](#41-prerequisites) - - [4.2 Installation](#42-installation) - - [4.3 Add a table with encrypted columns](#43-add-a-table-with-encrypted-columns) - - [4.4 Inserting data](#44-inserting-data) - - [4.5 Querying data](#45-querying-data) +- [Getting started](#getting-started) + - [Prerequisites](#prerequisites) + - [Installation](#installation) + - [Add a table with encrypted columns](#add-a-table-with-encrypted-columns) + - [Inserting data](#inserting-data) + - [Querying data](#querying-data) + - [Adding an index and enable encryption](#adding-an-index-and-enable-encryption) + - [Removing an index and disabling encryption](#removing-an-index-and-disabling-encryption) +- [CipherStash Proxy](#cipherstash-proxy) + - [How EQL works with CipherStash Proxy](#how-eql-works-with-cipherstash-proxy) + - [Writes](#writes) + - [Reads](#reads) +- [Encrypt Query Language (EQL)](#encrypt-query-language-eql) + - [Encrypted columns](#encrypted-columns) + - [EQL functions](#eql-functions) + - [Index functions](#index-functions) + - [Query Functions](#query-functions) + - [Data Format](#data-format) + - [Helper packages](#helper-packages) + 
+## Getting started + +The following guide assumes you have the prerequisites installed and running, and are running the SQL statements through your CipherStash Proxy instance. + +### Prerequisites + +- [PostgreSQL 14+](https://www.postgresql.org/download/) +- [Cipherstash Proxy](https://cipherstash.com/docs/getting-started/cipherstash-proxy) +- [Cipherstash Encrypt](https://cipherstash.com/docs/getting-started/cipherstash-encrypt) + - You can use the empty `cipherstash/dataset.yml` file in the `cipherstash` directory, as EQL does not require a dataset to be configured but it does need to be initialized (we plan to fix this in the future). + +EQL relies on [Cipherstash Proxy](https://cipherstash.com/docs/getting-started/cipherstash-proxy) and [Cipherstash Encrypt](https://cipherstash.com/docs/getting-started/cipherstash-encrypt) for low-latency encryption & decryption. +We plan to support direct language integration in the future. + +> Note: You will need to copy the `cipherstash/cipherstash-proxy.toml.example` file to `cipherstash/cipherstash-proxy.toml` and update the values to match your environment before running the script. + +### Installation + +In order to use EQL, you must first install the EQL extension in your PostgreSQL database. +You can do this by running the following command, which will execute the SQL from the `src/install.sql` file: + +Update the database credentials based on your environment. + +```bash +psql -U postgres -d postgres -f src/install.sql +``` + +### Add a table with encrypted columns + +Create a table with encrypted columns. +For this example, we'll use the `users` table, with a plaintext `email` column and an encrypted `email_encrypted` column. 
+ +```sql +CREATE TABLE IF NOT EXISTS "users" ( + "id" serial PRIMARY KEY NOT NULL, + "email" varchar, + "email_encrypted" "cs_encrypted_v1" +); +``` + +In some instances, especially when using language-specific ORMs, EQL also supports `jsonb` columns rather than the `cs_encrypted_v1` domain type. + +### Inserting data + +When inserting data into the encrypted column, you must wrap the plaintext in the appropriate EQL payload. +These statements must be run through the CipherStash Proxy in order to **encrypt** the data. + +```sql +INSERT INTO users (email_encrypted) VALUES ('{"v":1,"k":"pt","p":"test@test.com","i":{"t":"users","c":"email_encrypted"}}'); +``` + +For reference, the EQL payload is defined as a `jsonb` with a specific schema: + +```json +{ + "v": 1, + "k": "pt", + "p": "test@test.com", + "i": { + "t": "users", + "c": "email_encrypted" + } +} +``` + +### Querying data -## 1. Encryption in use +When querying data, you must wrap the encrypted column in the appropriate EQL payload. +These statements must be run through the CipherStash Proxy in order to **decrypt** the data. -EQL enables encryption in use, without significant changes to your application code. -A variety of searchable encryption techniques are available, including: +```sql +SELECT email_encrypted FROM users; +``` -- **Matching** - Equality or partial matches -- **Ordering** - comparison operations using order revealing encryption -- **Uniqueness** - enforcing unique constraints +For reference, the EQL payload is defined as a `jsonb` with a specific schema: -### 1.1 What is encryption in use? +```json +{ + "v": 1, + "k": "ct", + "c": "test@test.com", + "i": { + "t": "users", + "c": "email_encrypted" + } +} +``` -Encryption in use is the practice of keeping data encrypted even while it's being processed or queried in the database. 
-Unlike traditional encryption methods that secure data only at rest (on disk) or in transit (over the network), encryption in use keeps the data encrypted while operations are being performed on the data. -This provides an additional layer of security against unauthorized access — an adversary needs access to the encrypted data _and_ encryption keys. +### Adding an index and enable encryption -### 1.2 Why use encryption in use? +To add an index to the encrypted column, you must run the `cs_add_index_v1` function. +This function takes the following parameters: -While encryption at rest and in transit are essential, they don't protect data when the database server itself is compromised. -Encryption in use mitigates this risk by ensuring that: +- `table_name`: The name of the table containing the encrypted column. +- `column_name`: The name of the encrypted column. +- `index_name`: The name of the index. +- `cast_as`: The type of the index (text, int, small_int, big_int, real, double, boolean, date, jsonb). +- `opts`: An optional JSON object containing additional index options. -- **Data remains secure**: Even if the database server is breached, the data remains encrypted and unreadable without the proper keys. -- **Compliance controls are stronger**: When you need stronger data security controls than what SOC2/SOC3 or ISO27001 mandate, encryption in use helps you meet those stringent requirements. +For the example above, and using a match index, the following statement would be used: -## 2. CipherStash Proxy +```sql +SELECT cs_add_index_v1('users', 'email_encrypted', 'match', 'text', '{"token_filters": [{"kind": "downcase"}], "tokenizer": { "kind": "ngram", "token_length": 3 }}'); +``` -### 2.1 Overview +Once you have added an index, you must enable encryption. +This will update the encryption configuration to include the new index. -CipherStash Proxy is a transparent proxy that sits between your application and your PostgreSQL database. 
-It intercepts SQL queries and handles the encryption and decryption of data on-the-fly. -This enables encryption in use without significant changes to your application code. +```sql +SELECT cs_encrypt_v1(); +SELECT cs_activate_v1(); +``` -### 2.2 How it works +In this example `cs_encrypt_v1` and `cs_activate_v1` are called to immediately set the new Encrypt config to **active** for demonstration purposes. +In a production environment, you will need to consider a migration strategy to ensure the encryption config is updated based on the current state of the database. -- **Intercepts queries**: CipherStash Proxy captures SQL statements from the client application. -- **Encrypts data**: For write operations, it encrypts the plaintext data before sending it to the database. -- **Decrypts data**: For read operations, it decrypts the encrypted data retrieved from the database before returning it to the client. -- **Maintains searchability**: Ensures that the encrypted data is searchable and retrievable without sacrificing performance or application functionality. -- **Manages encryption keys**: Securely handles encryption keys required for encrypting and decrypting data. +See the [reference guide on migrations](https://cipherstash.com/docs/getting-started/cipherstash-encrypt#migrations) for more information. -### 2.3 How EQL works with CipherStash Proxy +### Removing an index and disabling encryption + +To remove an index from the encrypted column, you must run the `cs_remove_index_v1` function. +This function takes the following parameters: + +- `table_name`: The name of the table containing the encrypted column. +- `column_name`: The name of the encrypted column. +- `index_name`: The name of the index. + +For the example above, and using a match index, the following statement would be used: + +```sql +SELECT cs_remove_index_v1('users', 'email_encrypted', 'match'); +``` + +Once you have removed an index, you must disable encryption. 
+This will update the encryption configuration to exclude the removed index. + +```sql +SELECT cs_encrypt_v1(); +``` + +--- + +## CipherStash Proxy + +Read more about CipherStash Proxy in the [WHYEQL.md](https://github.com/cipherstash/encrypt-query-language/blob/main/WHYEQL.md#cipherstash-proxy) file. + +### How EQL works with CipherStash Proxy EQL uses **CipherStash Proxy** to mediate access to your PostgreSQL database and provide low-latency encryption & decryption. @@ -78,28 +184,30 @@ At a high level: - references to the column in sql statements are wrapped in a helper function - Cipherstash Proxy transparently encrypts and indexes data -#### 2.3.1 Writes +#### Writes 1. Database client sends `plaintext` data encoded as `jsonb` -2. CipherStash Proxy encrypts the `plaintext` and encodes the `ciphertext` value and associated indexes into the `jsonb` payload -3. The data is written to the encrypted column +1. CipherStash Proxy encrypts the `plaintext` and encodes the `ciphertext` value and associated indexes into the `jsonb` payload +1. The data is written to the encrypted column ![Insert](/diagrams/overview-insert.drawio.svg) -#### 2.3.2 Reads +#### Reads 1. Wrap references to the encrypted column in the appropriate EQL function -2. CipherStash Proxy encrypts the `plaintext` -3. PostgreSQL executes the SQL statement -4. CipherStash Proxy decrypts any returned `ciphertext` data and returns to client +1. CipherStash Proxy encrypts the `plaintext` +1. PostgreSQL executes the SQL statement +1. CipherStash Proxy decrypts any returned `ciphertext` data and returns to client ![Select](/diagrams/overview-select.drawio.svg) -## 3. Encrypt Query Language (EQL) +--- + +## Encrypt Query Language (EQL) Before you get started, it's important to understand some of the key components of EQL. 
-### 3.1 Encrypted columns +### Encrypted columns Encrypted columns are defined using the `cs_encrypted_v1` [domain type](https://www.postgresql.org/docs/current/domains.html), which extends the `jsonb` type with additional constraints to ensure data integrity. @@ -112,7 +220,7 @@ CREATE TABLE users ( ); ``` -### 3.2 EQL functions +### EQL functions EQL provides specialized functions to interact with encrypted data: @@ -121,11 +229,11 @@ EQL provides specialized functions to interact with encrypted data: - **`cs_unique_v1(val JSONB)`**: Retrieves the unique index for enforcing uniqueness. - **`cs_ore_v1(val JSONB)`**: Retrieves the Order-Revealing Encryption index for range queries. -### 3.3 Index functions +### Index functions These Functions expect a `jsonb` value that conforms to the storage schema. -#### 3.3.1 cs_add_index +#### cs_add_index ```sql cs_add_index(table_name text, column_name text, index_name text, cast_as text, opts jsonb) @@ -139,8 +247,7 @@ cs_add_index(table_name text, column_name text, index_name text, cast_as text, o | cast_as | The PostgreSQL type decrypted data will be cast to | Optional. Defaults to `text` | opts | Index options | Optional for `match` indexes (see below) - -###### cast_as +##### cast_as Supported types: - text @@ -150,7 +257,7 @@ Supported types: - boolean - date -###### match opts +##### match opts A match index enables full text search across one or more text fields in queries. @@ -171,41 +278,7 @@ The default Match index options are: } ``` -- `tokenFilters`: a list of filters to apply to normalise tokens before indexing. -- `tokenizer`: determines how input text is split into tokens. -- `m`: The size of the backing [bloom filter](https://en.wikipedia.org/wiki/Bloom_filter) in bits. Defaults to `2048`. -- `k`: The maximum number of bits set in the bloom filter per term. Defaults to `6`. - -**Token Filters** - -There are currently only two token filters available `downcase` and `upcase`. 
These are used to normalise the text before indexing and are also applied to query terms. An empty array can also be passed to `tokenFilters` if no normalisation of terms is required. - -**Tokenizer** - -There are two `tokenizer`s provided: `standard` and `ngram`. -The `standard` simply splits text into tokens using this regular expression: `/[ ,;:!]/`. -The `ngram` tokenizer splits the text into n-grams and accepts a configuration object that allows you to specify the `tokenLength`. - -**m** and **k** - -`k` and `m` are optional fields for configuring [bloom filters](https://en.wikipedia.org/wiki/Bloom_filter) that back full text search. - -`m` is the size of the bloom filter in bits. `filterSize` must be a power of 2 between `32` and `65536` and defaults to `2048`. - -`k` is the number of hash functions to use per term. -This determines the maximum number of bits that will be set in the bloom filter per term. -`k` must be an integer from `3` to `16` and defaults to `6`. - -**Caveats around n-gram tokenization** - -While using n-grams as a tokenization method allows greater flexibility when doing arbitrary substring matches, it is important to bear in mind the limitations of this approach. -Specifically, searching for strings _shorter_ than the `tokenLength` parameter will not _generally_ work. - -If you're using n-gram as a token filter, then a token that is already shorter than the `tokenLength` parameter will be kept as-is when indexed, and so a search for that short token will match that record. -However, if that same short string only appears as a part of a larger token, then it will not match that record. -In general, therefore, you should try to ensure that the string you search for is at least as long as the `tokenLength` of the index, except in the specific case where you know that there are shorter tokens to match, _and_ you are explicitly OK with not returning records that have that short string as part of a larger token. 
- -#### 3.3.2 cs_modify_index +#### cs_modify_index ```sql _cs_modify_index_v1(table_name text, column_name text, index_name text, cast_as text, opts jsonb) @@ -214,7 +287,7 @@ _cs_modify_index_v1(table_name text, column_name text, index_name text, cast_as Modifies an existing index configuration. Accepts the same parameters as `cs_add_index` -#### 3.3.3 cs_remove_index +#### cs_remove_index ```sql cs_remove_index_v1(table_name text, column_name text, index_name text) @@ -222,11 +295,11 @@ cs_remove_index_v1(table_name text, column_name text, index_name text) Removes an index configuration from the column. -### 3.3 Query Functions +### Query Functions These Functions expect a `jsonb` value that conforms to the storage schema, and are used to perform search operations. -#### 3.3.1 cs_ciphertext_v1 +#### cs_ciphertext_v1 ```sql cs_ciphertext_v1(val jsonb) @@ -235,7 +308,7 @@ cs_ciphertext_v1(val jsonb) Extracts the ciphertext from the `jsonb` value. Ciphertext values are transparently decrypted in transit by Cipherstash Proxy. -#### 3.3.2 cs_match_v1 +#### cs_match_v1 ```sql cs_match_v1(val jsonb) @@ -244,7 +317,7 @@ cs_match_v1(val jsonb) Extracts a match index from the `jsonb` value. Returns `null` if no match index is present. -#### 3.3.3 cs_unique_v1 +#### cs_unique_v1 ```sql cs_unique_v1(val jsonb) @@ -253,7 +326,7 @@ cs_unique_v1(val jsonb) Extracts a unique index from the `jsonb` value. Returns `null` if no unique index is present. -#### 3.3.4 cs_ore_v1 +#### cs_ore_v1 ```sql cs_ore_v1(val jsonb) @@ -262,7 +335,7 @@ cs_ore_v1(val jsonb) Extracts an ore index from the `jsonb` value. Returns `null` if no ore index is present. -### 3.4 Data Format +### Data Format Encrypted data is stored as `jsonb` with a specific schema: @@ -312,136 +385,8 @@ Cipherstash proxy handles the encoding, and EQL provides the functions. | o.1 | ORE index | Ciphertext index value. Encrypted by proxy. | u.1 | Uniqueindex | Ciphertext index value. Encrypted by proxy. 
-#### 3.4.1 Helper packages +#### Helper packages We have created a few langague specific packages to help you interact with the payloads: - [@cipherstash/eql](https://github.com/cipherstash/encrypt-query-language/tree/main/javascript/packages/eql): This is a TypeScript implementation of EQL. - -## 4. Getting started - -The following guide assumes you have the prerequisites installed and running, and are running the SQL statements through your CipherStash Proxy instance. - -### 4.1 Prerequisites - -- [PostgreSQL 14+](https://www.postgresql.org/download/) -- [Cipherstash Proxy](https://cipherstash.com/docs/getting-started/cipherstash-proxy) -- [Cipherstash Encrypt](https://cipherstash.com/docs/getting-started/cipherstash-encrypt) - - It's important to have your dataset configured for encryption before you start using EQL. - - You can use the `cipherstash/dataset.yml` file in the `cipherstash` directory as a starting point. - -EQL relies on [Cipherstash Proxy](https://cipherstash.com/docs/getting-started/cipherstash-proxy) and [Cipherstash Encrypt](https://cipherstash.com/docs/getting-started/cipherstash-encrypt) for low-latency encryption & decryption. -We plan to support direct language integration in the future. - -> Note: You will need to copy the `cipherstash/cipherstash-proxy.toml.example` file to `cipherstash/cipherstash-proxy.toml` and update the values to match your environment before running the script. - -### 4.2 Installation - -In order to use EQL, you must first install the EQL extension in your PostgreSQL database. -You can do this by running the following command, which will execute the SQL from the `src/install.sql` file: - -Update the database credentials based on your environment. - -```bash -psql -U postgres -d postgres -f src/install.sql -``` - -### 4.3 Add a table with encrypted columns - -Create a table with encrypted columns. 
-For this example, we'll use the `users` table, with a plaintext `email` column and an encrypted `email_encrypted` column. - -```sql -CREATE TABLE IF NOT EXISTS "users" ( - "id" serial PRIMARY KEY NOT NULL, - "email" varchar, - "email_encrypted" "cs_encrypted_v1" -); -``` - -### 4.4 Inserting data - -When inserting data into the encrypted column, you must wrap the plaintext in the appropriate EQL payload. - -```sql -INSERT INTO users (email_encrypted) VALUES ('{"v":1,"k":"pt","p":"test@test.com","i":{"t":"users","c":"email_encrypted"}}'); -``` - -For reference, the EQL payload is defined as a `jsonb` with a specific schema: - -```json -{ - "v": 1, - "k": "pt", - "p": "test@test.com", - "i": { - "t": "users", - "c": "email_encrypted" - } -} -``` - -### 4.5 Querying data - -When querying data, you must wrap the encrypted column in the appropriate EQL payload. - -```sql -SELECT email_encrypted FROM users WHERE cs_match_v1(email_encrypted) @> cs_match_v1('{"v":1,"k":"pt","p":"test@test.com","i":{"t":"users","c":"email_encrypted"}}'); -``` - -For reference, the EQL payload is defined as a `jsonb` with a specific schema: - -```json -{ - "v": 1, - "k": "ct", - "c": "test@test.com", - "i": { - "t": "users", - "c": "email_encrypted" - } -} -``` - ---- - -In progress... - -## Add an encrypted column - -TODO: Do we need this? - -```SQL --- Alter tables from the configuration -cs_create_encrypted_columns_v1() - --- Explicit alter table -ALTER TABLE users ADD column email_encrypted cs_encrypted_v1; -``` - -## Add an index for searchability - -EQL supports three types of indexes: - -- match -- ore (order revealing encryption) -- unique - -Indexes are managed using EQL functions and can be baked into an existing database migration process. - -```sql --- Add an ore index to users.name -cs_add_index('users', 'name', 'ore'); - --- Remove an ore index from users.name -cs_remove_index('users', 'name', 'ore'); -``` - -Adding the index to your configuration does not _encrypt_ the data. 
- -The encryption process needs to update every row in the target table. -Depending on the size of the target table, this process can be long-running. - -{{LINK TO MIGRATOR DETAILS HERE}} - -.... more to come diff --git a/WHYEQL.md b/WHYEQL.md new file mode 100644 index 00000000..e015a031 --- /dev/null +++ b/WHYEQL.md @@ -0,0 +1,97 @@ +# Postgres data security with CipherStash + +This article gives a high-level overview of CipherStash's encryption in use solution, including the CipherStash Proxy and the Encrypt Query Language (EQL). + +It is designed for developers and engineers who need to implement robust data security in PostgreSQL without sacrificing performance or usability. + +## Table of Contents + +1. [Encryption in use](#encryption-in-use) + - [What is encryption in use?](#what-is-encryption-in-use) + - [Why use encryption in use?](#why-use-encryption-in-use) +2. [CipherStash Proxy](#cipherstash-proxy) + - [Proxy overview](#proxy-overview) + - [How it works](#how-it-works) +3. [Encrypt Query Language (EQL)](#encrypt-query-language-eql) +4. [Best practices](#best-practices) +5. [Advanced topics](#advanced-topics) + - [Integrating without proxy](#integrating-without-proxy) +6. [Conclusion](#conclusion) + +## Encryption in use + +EQL enables encryption in use, without significant changes to your application code. +A variety of searchable encryption techniques are available, including: + +- **Matching** - Equality or partial matches +- **Ordering** - comparison operations using order revealing encryption +- **Uniqueness** - enforcing unique constraints + +### What is encryption in use? + +Encryption in use is the practice of keeping data encrypted even while it's being processed or queried in the database. +Unlike traditional encryption methods that secure data only at rest (on disk) or in transit (over the network), encryption in use keeps the data encrypted while operations are being performed on the data. 
+This provides an additional layer of security against unauthorized access — an adversary needs access to the encrypted data _and_ encryption keys. + +### Why use encryption in use? + +While encryption at rest and in transit are essential, they don't protect data when the database server itself is compromised. +Encryption in use mitigates this risk by ensuring that: + +- **Data remains secure**: Even if the database server is breached, the data remains encrypted and unreadable without the proper keys. +- **Compliance controls are stronger**: When you need stronger data security controls than what SOC2/SOC3 or ISO27001 mandate, encryption in use helps you meet those stringent requirements. + +## CipherStash Proxy + +### Proxy overview + +CipherStash Proxy is a transparent proxy that sits between your application and your PostgreSQL database. +It intercepts SQL queries and handles the encryption and decryption of data on-the-fly. +This enables encryption in use without significant changes to your application code. + +### How it works + +- **Intercepts queries**: CipherStash Proxy captures SQL statements from the client application. +- **Encrypts data**: For write operations, it encrypts the plaintext data before sending it to the database. +- **Decrypts data**: For read operations, it decrypts the encrypted data retrieved from the database before returning it to the client. +- **Maintains searchability**: Ensures that the encrypted data is searchable and retrievable without sacrificing performance or application functionality. +- **Manages encryption keys**: Securely handles encryption keys required for encrypting and decrypting data. + +## Encrypt Query Language (EQL) + +Encrypt Query Language (EQL) is a set of PostgreSQL functions and data types provided by CipherStash to work with encrypted data and indexes. +EQL allows you to perform queries on encrypted data without decrypting it, supporting operations like equality checks, range queries, and unique constraints. 
+ +To get started, see the root [README.md](https://github.com/cipherstash/encrypt-query-language?tab=readme-ov-file#getting-started) file. + +## Best Practices + +- **Leverage CipherStash Proxy**: Use CipherStash Proxy to handle encryption/decryption transparently. +- **Utilize EQL functions**: Always use EQL functions when interacting with encrypted data. +- **Define constraints**: Apply database constraints to maintain data integrity. +- **Secure key management**: Ensure encryption keys are securely managed and stored. +- **Monitor performance**: Keep an eye on query performance and optimize as needed. + +## Advanced Topics + +### Integrating without CipherStash Proxy + +> The SDK approach is currently in development, but if you're interested in contributing, please start a discussion [here](https://github.com/cipherstash/cipherstash). + +For advanced users who prefer to handle encryption within their application: + +- **SDKs available**: Use CipherStash SDKs (at the moment, Rust and TypeScript) to manage encryption/decryption. +- **Manual encryption**: Implement encryption logic in your application code. +- **Data conformity**: Ensure encrypted data matches the expected `jsonb` schema. +- **Key management**: Handle encryption keys securely within your application. + +**Note**: This approach increases complexity and is recommended only if CipherStash Proxy does not meet specific requirements. + +## Conclusion + +CipherStash's encryption in use solution, comprising CipherStash Proxy and EQL, provides a practical way to enhance data security in Postgres databases. +By keeping data encrypted even during processing, you minimize the risk of data breaches and comply with stringent security standards without significant changes to your application logic. + +To get started, see the root [README.md](https://github.com/cipherstash/encrypt-query-language?tab=readme-ov-file#getting-started) file. 
+ +**Contact Support:** For further assistance, raise an issue [here](https://github.com/cipherstash/encrypt-query-language/issues). diff --git a/cipherstash/README.md b/cipherstash/README.md index c651c27c..b5526705 100644 --- a/cipherstash/README.md +++ b/cipherstash/README.md @@ -1 +1 @@ -psql -h localhost -p 6432 -U postgres.wvhsiwlbufuixlvdunxr -d postgres +psql -h localhost -p 6432 -U postgres -d postgres diff --git a/cipherstash/dataset.yml b/cipherstash/dataset.yml index 4ff94d04..6eafaeba 100644 --- a/cipherstash/dataset.yml +++ b/cipherstash/dataset.yml @@ -1,43 +1 @@ -tables: - - path: users - fields: - - name: email_encrypted - in_place: false - mode: plaintext-duplicate - cast_type: utf8-str - indexes: - - version: 1 - kind: match - tokenizer: - kind: ngram - token_length: 3 - token_filters: - - kind: downcase - k: 6 - m: 2048 - include_original: true - - version: 1 - kind: ore - - version: 1 - kind: unique - - path: User - fields: - - name: email_encrypted - in_place: false - mode: plaintext-duplicate - cast_type: utf8-str - indexes: - - version: 1 - kind: match - tokenizer: - kind: ngram - token_length: 3 - token_filters: - - kind: downcase - k: 6 - m: 2048 - include_original: true - - version: 1 - kind: ore - - version: 1 - kind: unique +tables: [] \ No newline at end of file diff --git a/cipherstash/start.sh b/cipherstash/start.sh index b4764d84..050ad513 100755 --- a/cipherstash/start.sh +++ b/cipherstash/start.sh @@ -1,3 +1,4 @@ #!/bin/bash -docker run -p 6432:6432 -e CS_STATEMENT_HANDLER=mylittleproxy -e LOG_LEVEL=debug -v $(pwd)/cipherstash-proxy.toml:/etc/cipherstash-proxy/cipherstash-proxy.toml cipherstash/cipherstash-proxy:cipherstash-proxy-v0.0.25 \ No newline at end of file +# The version is hard coded as we are in active development of some of the EQL features +docker run -p 6432:6432 -e CS_STATEMENT_HANDLER=mylittleproxy -e LOG_LEVEL=debug -v $(pwd)/cipherstash-proxy.toml:/etc/cipherstash-proxy/cipherstash-proxy.toml 
cipherstash/cipherstash-proxy:pr-1008 \ No newline at end of file diff --git a/src/install.sql b/src/install.sql index a9a7661e..b7683453 100644 --- a/src/install.sql +++ b/src/install.sql @@ -1,473 +1,3 @@ --- --- PostgreSQL CipherStash Extension --- - --- --- Name: pgcrypto; Type: EXTENSION; Schema: -; Owner: - --- - --- CREATE EXTENSION IF NOT EXISTS pgcrypto WITH SCHEMA public; - - --- --- Name: ore_64_8_v1_term; Type: TYPE; Schema: public; --- - -CREATE TYPE public.ore_64_8_v1_term AS ( - bytes bytea -); - --- --- Name: ore_64_8_v1; Type: TYPE; Schema: public; --- - -CREATE TYPE public.ore_64_8_v1 AS ( - terms public.ore_64_8_v1_term[] -); - --- --- Name: compare_ore_64_8_v1(public.ore_64_8_v1, public.ore_64_8_v1); Type: FUNCTION; Schema: public; --- - -CREATE FUNCTION public.compare_ore_64_8_v1(a public.ore_64_8_v1, b public.ore_64_8_v1) RETURNS integer - LANGUAGE plpgsql - AS $$ - DECLARE - cmp_result integer; - BEGIN - -- Recursively compare blocks bailing as soon as we can make a decision - RETURN compare_ore_array(a.terms, b.terms); - END -$$; - --- --- Name: compare_ore_64_8_v1_term(public.ore_64_8_v1_term, public.ore_64_8_v1_term); Type: FUNCTION; Schema: public; --- - -CREATE FUNCTION public.compare_ore_64_8_v1_term(a public.ore_64_8_v1_term, b public.ore_64_8_v1_term) RETURNS integer - LANGUAGE plpgsql - AS $$ - DECLARE - eq boolean := true; - unequal_block smallint := 0; - hash_key bytea; - target_block bytea; - - left_block_size CONSTANT smallint := 16; - right_block_size CONSTANT smallint := 32; - right_offset CONSTANT smallint := 136; -- 8 * 17 - - indicator smallint := 0; - BEGIN - IF a IS NULL AND b IS NULL THEN - RETURN 0; - END IF; - - IF a IS NULL THEN - RETURN -1; - END IF; - - IF b IS NULL THEN - RETURN 1; - END IF; - - IF bit_length(a.bytes) != bit_length(b.bytes) THEN - RAISE EXCEPTION 'Ciphertexts are different lengths'; - END IF; - - FOR block IN 0..7 LOOP - -- Compare each PRP (byte from the first 8 bytes) and PRF block (8 byte - -- 
chunks of the rest of the value). - -- NOTE: - -- * Substr is ordinally indexed (hence 1 and not 0, and 9 and not 8). - -- * We are not worrying about timing attacks here; don't fret about - -- the OR or !=. - IF - substr(a.bytes, 1 + block, 1) != substr(b.bytes, 1 + block, 1) - OR substr(a.bytes, 9 + left_block_size * block, left_block_size) != substr(b.bytes, 9 + left_block_size * BLOCK, left_block_size) - THEN - -- set the first unequal block we find - IF eq THEN - unequal_block := block; - END IF; - eq = false; - END IF; - END LOOP; - - IF eq THEN - RETURN 0::integer; - END IF; - - -- Hash key is the IV from the right CT of b - hash_key := substr(b.bytes, right_offset + 1, 16); - - -- first right block is at right offset + nonce_size (ordinally indexed) - target_block := substr(b.bytes, right_offset + 17 + (unequal_block * right_block_size), right_block_size); - - indicator := ( - get_bit( - encrypt( - substr(a.bytes, 9 + (left_block_size * unequal_block), left_block_size), - hash_key, - 'aes-ecb' - ), - 0 - ) + get_bit(target_block, get_byte(a.bytes, unequal_block))) % 2; - - IF indicator = 1 THEN - RETURN 1::integer; - ELSE - RETURN -1::integer; - END IF; - END; -$$; - - --- --- Name: compare_ore_array(public.ore_64_8_v1_term[], public.ore_64_8_v1_term[]); Type: FUNCTION; Schema: public; --- - -CREATE FUNCTION public.compare_ore_array(a public.ore_64_8_v1_term[], b public.ore_64_8_v1_term[]) RETURNS integer - LANGUAGE plpgsql - AS $$ - DECLARE - cmp_result integer; - BEGIN - IF (array_length(a, 1) = 0 OR a IS NULL) AND (array_length(b, 1) = 0 OR b IS NULL) THEN - RETURN 0; - END IF; - IF array_length(a, 1) = 0 OR a IS NULL THEN - RETURN -1; - END IF; - IF array_length(b, 1) = 0 OR a IS NULL THEN - RETURN 1; - END IF; - - cmp_result := compare_ore_64_8_v1_term(a[1], b[1]); - IF cmp_result = 0 THEN - -- Removes the first element in the array, and calls this fn again to compare the next element/s in the array. 
- RETURN compare_ore_array(a[2:array_length(a,1)], b[2:array_length(b,1)]); - END IF; - - RETURN cmp_result; - END -$$; - - --- --- Name: ore_64_8_v1_eq(public.ore_64_8_v1, public.ore_64_8_v1); Type: FUNCTION; Schema: public; --- - -CREATE FUNCTION public.ore_64_8_v1_eq(a public.ore_64_8_v1, b public.ore_64_8_v1) RETURNS boolean - LANGUAGE sql - AS $$ - SELECT compare_ore_64_8_v1(a, b) = 0 -$$; - - --- --- Name: ore_64_8_v1_gt(public.ore_64_8_v1, public.ore_64_8_v1); Type: FUNCTION; Schema: public; --- - -CREATE FUNCTION public.ore_64_8_v1_gt(a public.ore_64_8_v1, b public.ore_64_8_v1) RETURNS boolean - LANGUAGE sql - AS $$ - SELECT compare_ore_64_8_v1(a, b) = 1 -$$; - - --- --- Name: ore_64_8_v1_gte(public.ore_64_8_v1, public.ore_64_8_v1); Type: FUNCTION; Schema: public; --- - -CREATE FUNCTION public.ore_64_8_v1_gte(a public.ore_64_8_v1, b public.ore_64_8_v1) RETURNS boolean - LANGUAGE sql - AS $$ - SELECT compare_ore_64_8_v1(a, b) != -1 -$$; - - --- --- Name: ore_64_8_v1_lt(public.ore_64_8_v1, public.ore_64_8_v1); Type: FUNCTION; Schema: public; --- - -CREATE FUNCTION public.ore_64_8_v1_lt(a public.ore_64_8_v1, b public.ore_64_8_v1) RETURNS boolean - LANGUAGE sql - AS $$ - SELECT compare_ore_64_8_v1(a, b) = -1 -$$; - - --- --- Name: ore_64_8_v1_lte(public.ore_64_8_v1, public.ore_64_8_v1); Type: FUNCTION; Schema: public; --- - -CREATE FUNCTION public.ore_64_8_v1_lte(a public.ore_64_8_v1, b public.ore_64_8_v1) RETURNS boolean - LANGUAGE sql - AS $$ - SELECT compare_ore_64_8_v1(a, b) != 1 -$$; - - --- --- Name: ore_64_8_v1_neq(public.ore_64_8_v1, public.ore_64_8_v1); Type: FUNCTION; Schema: public; --- - -CREATE FUNCTION public.ore_64_8_v1_neq(a public.ore_64_8_v1, b public.ore_64_8_v1) RETURNS boolean - LANGUAGE sql - AS $$ - SELECT compare_ore_64_8_v1(a, b) <> 0 -$$; - - --- --- Name: ore_64_8_v1_term_eq(public.ore_64_8_v1_term, public.ore_64_8_v1_term); Type: FUNCTION; Schema: public; --- - -CREATE FUNCTION public.ore_64_8_v1_term_eq(a public.ore_64_8_v1_term, b 
public.ore_64_8_v1_term) RETURNS boolean - LANGUAGE sql - AS $$ - SELECT compare_ore_64_8_v1_term(a, b) = 0 -$$; - - --- --- Name: ore_64_8_v1_term_gt(public.ore_64_8_v1_term, public.ore_64_8_v1_term); Type: FUNCTION; Schema: public; --- - -CREATE FUNCTION public.ore_64_8_v1_term_gt(a public.ore_64_8_v1_term, b public.ore_64_8_v1_term) RETURNS boolean - LANGUAGE sql - AS $$ - SELECT compare_ore_64_8_v1_term(a, b) = 1 -$$; - - --- --- Name: ore_64_8_v1_term_gte(public.ore_64_8_v1_term, public.ore_64_8_v1_term); Type: FUNCTION; Schema: public; --- - -CREATE FUNCTION public.ore_64_8_v1_term_gte(a public.ore_64_8_v1_term, b public.ore_64_8_v1_term) RETURNS boolean - LANGUAGE sql - AS $$ - SELECT compare_ore_64_8_v1_term(a, b) != -1 -$$; - - --- --- Name: ore_64_8_v1_term_lt(public.ore_64_8_v1_term, public.ore_64_8_v1_term); Type: FUNCTION; Schema: public; --- - -CREATE FUNCTION public.ore_64_8_v1_term_lt(a public.ore_64_8_v1_term, b public.ore_64_8_v1_term) RETURNS boolean - LANGUAGE sql - AS $$ - SELECT compare_ore_64_8_v1_term(a, b) = -1 -$$; - - --- --- Name: ore_64_8_v1_term_lte(public.ore_64_8_v1_term, public.ore_64_8_v1_term); Type: FUNCTION; Schema: public; --- - -CREATE FUNCTION public.ore_64_8_v1_term_lte(a public.ore_64_8_v1_term, b public.ore_64_8_v1_term) RETURNS boolean - LANGUAGE sql - AS $$ - SELECT compare_ore_64_8_v1_term(a, b) != 1 -$$; - - --- --- Name: ore_64_8_v1_term_neq(public.ore_64_8_v1_term, public.ore_64_8_v1_term); Type: FUNCTION; Schema: public; --- - -CREATE FUNCTION public.ore_64_8_v1_term_neq(a public.ore_64_8_v1_term, b public.ore_64_8_v1_term) RETURNS boolean - LANGUAGE sql - AS $$ - SELECT compare_ore_64_8_v1_term(a, b) <> 0 -$$; - - --- --- Name: <; Type: OPERATOR; Schema: public; --- - -CREATE OPERATOR public.< ( - FUNCTION = public.ore_64_8_v1_term_lt, - LEFTARG = public.ore_64_8_v1_term, - RIGHTARG = public.ore_64_8_v1_term, - COMMUTATOR = OPERATOR(public.>), - NEGATOR = OPERATOR(public.>=), - RESTRICT = scalarltsel, - JOIN = 
scalarltjoinsel -); - - --- --- Name: <; Type: OPERATOR; Schema: public; --- - -CREATE OPERATOR public.< ( - FUNCTION = public.ore_64_8_v1_lt, - LEFTARG = public.ore_64_8_v1, - RIGHTARG = public.ore_64_8_v1, - COMMUTATOR = OPERATOR(public.>), - NEGATOR = OPERATOR(public.>=), - RESTRICT = scalarltsel, - JOIN = scalarltjoinsel -); - - --- --- Name: <=; Type: OPERATOR; Schema: public; --- - -CREATE OPERATOR public.<= ( - FUNCTION = public.ore_64_8_v1_term_lte, - LEFTARG = public.ore_64_8_v1_term, - RIGHTARG = public.ore_64_8_v1_term, - COMMUTATOR = OPERATOR(public.>=), - NEGATOR = OPERATOR(public.>), - RESTRICT = scalarlesel, - JOIN = scalarlejoinsel -); - - --- --- Name: <=; Type: OPERATOR; Schema: public; --- - -CREATE OPERATOR public.<= ( - FUNCTION = public.ore_64_8_v1_lte, - LEFTARG = public.ore_64_8_v1, - RIGHTARG = public.ore_64_8_v1, - COMMUTATOR = OPERATOR(public.>=), - NEGATOR = OPERATOR(public.>), - RESTRICT = scalarlesel, - JOIN = scalarlejoinsel -); - - --- --- Name: <>; Type: OPERATOR; Schema: public; --- - -CREATE OPERATOR public.<> ( - FUNCTION = public.ore_64_8_v1_term_neq, - LEFTARG = public.ore_64_8_v1_term, - RIGHTARG = public.ore_64_8_v1_term, - NEGATOR = OPERATOR(public.=), - MERGES, - HASHES, - RESTRICT = eqsel, - JOIN = eqjoinsel -); - - --- --- Name: <>; Type: OPERATOR; Schema: public; --- - -CREATE OPERATOR public.<> ( - FUNCTION = public.ore_64_8_v1_neq, - LEFTARG = public.ore_64_8_v1, - RIGHTARG = public.ore_64_8_v1, - NEGATOR = OPERATOR(public.=), - MERGES, - HASHES, - RESTRICT = eqsel, - JOIN = eqjoinsel -); - - --- --- Name: =; Type: OPERATOR; Schema: public; --- - -CREATE OPERATOR public.= ( - FUNCTION = public.ore_64_8_v1_term_eq, - LEFTARG = public.ore_64_8_v1_term, - RIGHTARG = public.ore_64_8_v1_term, - NEGATOR = OPERATOR(public.<>), - MERGES, - HASHES, - RESTRICT = eqsel, - JOIN = eqjoinsel -); - - --- --- Name: =; Type: OPERATOR; Schema: public; --- - -CREATE OPERATOR public.= ( - FUNCTION = public.ore_64_8_v1_eq, - LEFTARG = 
public.ore_64_8_v1, - RIGHTARG = public.ore_64_8_v1, - NEGATOR = OPERATOR(public.<>), - MERGES, - HASHES, - RESTRICT = eqsel, - JOIN = eqjoinsel -); - --- --- Name: >; Type: OPERATOR; Schema: public; --- - -CREATE OPERATOR public.> ( - FUNCTION = public.ore_64_8_v1_term_gt, - LEFTARG = public.ore_64_8_v1_term, - RIGHTARG = public.ore_64_8_v1_term, - COMMUTATOR = OPERATOR(public.<), - NEGATOR = OPERATOR(public.<=), - RESTRICT = scalargtsel, - JOIN = scalargtjoinsel -); - - --- --- Name: >; Type: OPERATOR; Schema: public; --- - -CREATE OPERATOR public.> ( - FUNCTION = public.ore_64_8_v1_gt, - LEFTARG = public.ore_64_8_v1, - RIGHTARG = public.ore_64_8_v1, - COMMUTATOR = OPERATOR(public.<), - NEGATOR = OPERATOR(public.<=), - RESTRICT = scalargtsel, - JOIN = scalargtjoinsel -); - - --- --- Name: >=; Type: OPERATOR; Schema: public; --- - -CREATE OPERATOR public.>= ( - FUNCTION = public.ore_64_8_v1_term_gte, - LEFTARG = public.ore_64_8_v1_term, - RIGHTARG = public.ore_64_8_v1_term, - COMMUTATOR = OPERATOR(public.<=), - NEGATOR = OPERATOR(public.<), - RESTRICT = scalarlesel, - JOIN = scalarlejoinsel -); - - --- --- Name: >=; Type: OPERATOR; Schema: public; --- - -CREATE OPERATOR public.>= ( - FUNCTION = public.ore_64_8_v1_gte, - LEFTARG = public.ore_64_8_v1, - RIGHTARG = public.ore_64_8_v1, - COMMUTATOR = OPERATOR(public.<=), - NEGATOR = OPERATOR(public.<), - RESTRICT = scalarlesel, - JOIN = scalarlejoinsel -); - DROP CAST IF EXISTS (text AS ore_64_8_v1_term); DROP FUNCTION IF EXISTS cs_match_v1; @@ -489,12 +19,13 @@ DROP DOMAIN IF EXISTS cs_unique_index_v1; CREATE DOMAIN cs_match_index_v1 AS smallint[]; CREATE DOMAIN cs_unique_index_v1 AS text; +CREATE DOMAIN cs_ste_vec_v1 AS text[]; -- cs_encrypted_v1 is a column type and cannot be dropped if in use DO $$ BEGIN IF NOT EXISTS (SELECT 1 FROM pg_type WHERE typname = 'cs_encrypted_v1') THEN - CREATE DOMAIN cs_encrypted_v1 AS JSONB; + CREATE DOMAIN cs_encrypted_v1 AS JSONB; END IF; END $$; @@ -504,8 +35,7 @@ CREATE FUNCTION 
_cs_encrypted_check_kind(val jsonb) RETURNS BOOLEAN LANGUAGE sql IMMUTABLE STRICT PARALLEL SAFE BEGIN ATOMIC - RETURN - (val->>'k' = 'ct' AND val ? 'c' AND NOT val ? 'p'); + RETURN (val->>'k' = 'ct' AND val ? 'c') AND NOT val ? 'p'; END; @@ -590,6 +120,28 @@ BEGIN ATOMIC RETURN cs_unique_v1_v0_0(col); END; +-- extracts json containment index from an encrypted column +CREATE OR REPLACE FUNCTION cs_ste_vec_v1_v0_0(col jsonb) + RETURNS cs_ste_vec_v1 + LANGUAGE sql IMMUTABLE STRICT PARALLEL SAFE +BEGIN ATOMIC + SELECT ARRAY(SELECT jsonb_array_elements(col->'sv'))::cs_ste_vec_v1; +END; + +CREATE OR REPLACE FUNCTION cs_ste_vec_v1_v0(col jsonb) + RETURNS cs_ste_vec_v1 + LANGUAGE sql IMMUTABLE STRICT PARALLEL SAFE +BEGIN ATOMIC + RETURN cs_ste_vec_v1_v0_0(col); +END; + +CREATE OR REPLACE FUNCTION cs_ste_vec_v1(col jsonb) + RETURNS cs_ste_vec_v1 + LANGUAGE sql IMMUTABLE STRICT PARALLEL SAFE +BEGIN ATOMIC + RETURN cs_ste_vec_v1_v0_0(col); +END; + -- casts text to ore_64_8_v1_term (bytea) CREATE FUNCTION _cs_text_to_ore_64_8_v1_term_v1_0(t text) RETURNS ore_64_8_v1_term @@ -671,14 +223,15 @@ CREATE FUNCTION _cs_config_check_indexes(val jsonb) RETURNS BOOLEAN LANGUAGE sql IMMUTABLE STRICT PARALLEL SAFE BEGIN ATOMIC - SELECT jsonb_object_keys(jsonb_path_query(val, '$.tables.*.*.indexes')) = ANY('{match_1, ore_1, ore_1_term, unique_1}'); + SELECT jsonb_object_keys(jsonb_path_query(val, '$.tables.*.*.indexes')) = ANY('{match, ore, unique, json}'); END; + CREATE FUNCTION _cs_config_check_cast(val jsonb) RETURNS BOOLEAN LANGUAGE sql IMMUTABLE STRICT PARALLEL SAFE BEGIN ATOMIC - SELECT jsonb_array_elements_text(jsonb_path_query_array(val, '$.tables.*.*.cast_as')) = ANY('{text, int}'); + SELECT jsonb_array_elements_text(jsonb_path_query_array(val, '$.tables.*.*.cast_as')) = ANY('{text, int, small_int, big_int, real, double, boolean, date, jsonb}'); END; @@ -689,7 +242,7 @@ ALTER DOMAIN cs_configuration_data_v1 DROP CONSTRAINT IF EXISTS cs_configuration ALTER DOMAIN 
cs_configuration_data_v1 ADD CONSTRAINT cs_configuration_data_v1_check CHECK ( - VALUE ?& array['s', 'tables'] AND + VALUE ?& array['v', 'tables'] AND VALUE->'tables' <> '{}'::jsonb AND _cs_config_check_cast(VALUE) AND _cs_config_check_indexes(VALUE) @@ -733,6 +286,8 @@ DROP FUNCTION IF EXISTS cs_encrypt_v1(); DROP FUNCTION IF EXISTS cs_activate_v1(); DROP FUNCTION IF EXISTS cs_discard_v1(); +DROP FUNCTION IF EXISTS cs_refresh_encrypt_config(); + DROP FUNCTION IF EXISTS _cs_config_default(); DROP FUNCTION IF EXISTS _cs_config_match_1_default(); @@ -748,7 +303,7 @@ CREATE FUNCTION _cs_config_default(config jsonb) AS $$ BEGIN IF config IS NULL THEN - SELECT jsonb_build_object('s', 1, 'tables', '{}') INTO config; + SELECT jsonb_build_object('v', 1, 'tables', jsonb_build_object()) INTO config; END IF; RETURN config; END; @@ -763,7 +318,7 @@ AS $$ tbl jsonb; BEGIN IF NOT config #> array['tables'] ? table_name THEN - SELECT jsonb_build_object(table_name, '{}') into tbl; + SELECT jsonb_build_object(table_name, jsonb_build_object()) into tbl; SELECT jsonb_set(config, array['tables'], tbl) INTO config; END IF; RETURN config; @@ -780,9 +335,8 @@ AS $$ col jsonb; BEGIN IF NOT config #> array['tables', table_name] ? 
column_name THEN - SELECT jsonb_build_object(column_name, - jsonb_build_object('indexes', json_build_object())) into col; - SELECT jsonb_set(config, array['tables', table_name], col) INTO config; + SELECT jsonb_build_object('indexes', jsonb_build_object()) into col; + SELECT jsonb_set(config, array['tables', table_name, column_name], col) INTO config; END IF; RETURN config; END; @@ -824,7 +378,7 @@ BEGIN ATOMIC 'm', 2048, 'include_original', true, 'tokenizer', json_build_object('kind', 'ngram', 'token_length', 3), - 'token_filters', json_build_object('kind', 'downcase')); + 'token_filters', json_build_array(json_build_object('kind', 'downcase'))); END; -- @@ -846,7 +400,7 @@ AS $$ RAISE EXCEPTION '% index exists for column: % %', index_name, table_name, column_name; END IF; - IF NOT cast_as = ANY('{text, int}') THEN + IF NOT cast_as = ANY('{text, int, small_int, big_int, real, double, boolean, date, jsonb}') THEN RAISE EXCEPTION '% is not a valid cast type', cast_as; END IF; @@ -952,9 +506,9 @@ CREATE FUNCTION cs_encrypt_v1() RETURNS boolean AS $$ BEGIN - IF NOT cs_ready_for_encryption_v1() THEN - RAISE EXCEPTION 'Some pending columns do not have an encrypted target'; - END IF; + -- IF NOT cs_ready_for_encryption_v1() THEN + -- RAISE EXCEPTION 'Some pending columns do not have an encrypted target'; + -- END IF; IF NOT EXISTS (SELECT FROM cs_configuration_v1 c WHERE c.state = 'pending') THEN RAISE EXCEPTION 'No pending configuration exists to encrypt'; @@ -1084,6 +638,13 @@ AS $$ END; $$ LANGUAGE plpgsql; +CREATE FUNCTION cs_refresh_encrypt_config() + RETURNS void +LANGUAGE sql STRICT PARALLEL SAFE +BEGIN ATOMIC + RETURN NULL; +END; + -- DROP and CREATE functions -- Function types cannot be changed after creation so we DROP for flexibility DROP FUNCTION IF EXISTS cs_select_pending_columns_v1; @@ -1255,4 +816,4 @@ BEGIN INTO result; RETURN result; END; -$$ LANGUAGE plpgsql; +$$ LANGUAGE plpgsql; \ No newline at end of file diff --git a/src/uninstall.sql 
b/src/uninstall.sql new file mode 100644 index 00000000..7f9536ae --- /dev/null +++ b/src/uninstall.sql @@ -0,0 +1,67 @@ +-- Drop constraints on domains +ALTER DOMAIN cs_encrypted_v1 DROP CONSTRAINT IF EXISTS cs_encrypted_v1_check; +ALTER DOMAIN cs_configuration_data_v1 DROP CONSTRAINT IF EXISTS cs_configuration_data_v1_check; + +-- Drop functions +DROP FUNCTION IF EXISTS cs_count_encrypted_with_active_config_v1(text, text); +DROP FUNCTION IF EXISTS cs_rename_encrypted_columns_v1(); +DROP FUNCTION IF EXISTS cs_create_encrypted_columns_v1(); +DROP FUNCTION IF EXISTS cs_ready_for_encryption_v1(); +DROP FUNCTION IF EXISTS cs_select_target_columns_v1(); +DROP FUNCTION IF EXISTS cs_select_pending_columns_v1(); +DROP FUNCTION IF EXISTS _cs_diff_config_v1(jsonb, jsonb); +DROP FUNCTION IF EXISTS cs_add_column_v1(text, text); +DROP FUNCTION IF EXISTS cs_remove_column_v1(text, text); +DROP FUNCTION IF EXISTS cs_add_index_v1(text, text, text, text, jsonb); +DROP FUNCTION IF EXISTS cs_remove_index_v1(text, text, text); +DROP FUNCTION IF EXISTS cs_modify_index_v1(text, text, text, text, jsonb); +DROP FUNCTION IF EXISTS cs_encrypt_v1(); +DROP FUNCTION IF EXISTS cs_activate_v1(); +DROP FUNCTION IF EXISTS cs_discard_v1(); +DROP FUNCTION IF EXISTS cs_refresh_encrypt_config(); +DROP FUNCTION IF EXISTS _cs_config_default(jsonb); +DROP FUNCTION IF EXISTS _cs_config_match_1_default(); +DROP FUNCTION IF EXISTS _cs_config_add_table(text, jsonb); +DROP FUNCTION IF EXISTS _cs_config_add_column(text, text, jsonb); +DROP FUNCTION IF EXISTS _cs_config_add_cast(text, text, text, jsonb); +DROP FUNCTION IF EXISTS _cs_config_add_index(text, text, text, jsonb, jsonb); +DROP FUNCTION IF EXISTS cs_ciphertext_v1(jsonb); +DROP FUNCTION IF EXISTS cs_ciphertext_v1_v0(jsonb); +DROP FUNCTION IF EXISTS cs_ciphertext_v1_v0_0(jsonb); +DROP FUNCTION IF EXISTS cs_match_v1(jsonb); +DROP FUNCTION IF EXISTS cs_match_v1_v0(jsonb); +DROP FUNCTION IF EXISTS cs_match_v1_v0_0(jsonb); +DROP FUNCTION IF EXISTS 
cs_unique_v1(jsonb); +DROP FUNCTION IF EXISTS cs_unique_v1_v0(jsonb); +DROP FUNCTION IF EXISTS cs_unique_v1_v0_0(jsonb); +DROP FUNCTION IF EXISTS cs_ste_vec_v1(jsonb); +DROP FUNCTION IF EXISTS cs_ste_vec_v1_v0(jsonb); +DROP FUNCTION IF EXISTS cs_ste_vec_v1_v0_0(jsonb); +DROP FUNCTION IF EXISTS cs_ore_64_8_v1(jsonb); +DROP FUNCTION IF EXISTS cs_ore_64_8_v1_v0(jsonb); +DROP FUNCTION IF EXISTS cs_ore_64_8_v1_v0_0(jsonb); +DROP FUNCTION IF EXISTS _cs_text_to_ore_64_8_v1_term_v1_0(text) CASCADE; +DROP FUNCTION IF EXISTS _cs_encrypted_check_kind(jsonb); +DROP FUNCTION IF EXISTS _cs_config_check_indexes(jsonb); +DROP FUNCTION IF EXISTS _cs_config_check_cast(jsonb); + +-- Drop cast +DROP CAST IF EXISTS (text AS ore_64_8_v1_term); + +-- Drop indexes +DROP INDEX IF EXISTS cs_configuration_v1_index_active; +DROP INDEX IF EXISTS cs_configuration_v1_index_pending; +DROP INDEX IF EXISTS cs_configuration_v1_index_encrypting; + +-- Drop table +DROP TABLE IF EXISTS cs_configuration_v1; + +-- Drop domains +DROP DOMAIN IF EXISTS cs_match_index_v1; +DROP DOMAIN IF EXISTS cs_unique_index_v1; +DROP DOMAIN IF EXISTS cs_ste_vec_v1; +DROP DOMAIN IF EXISTS cs_encrypted_v1; -- Note: This domain cannot be dropped if it's in use +DROP DOMAIN IF EXISTS cs_configuration_data_v1; + +-- Drop type +DROP TYPE IF EXISTS cs_configuration_state_v1;