Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add uniqueNulls property to unique constraints pattern #648

Merged
merged 1 commit into from Nov 7, 2019
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
110 changes: 89 additions & 21 deletions specs/patterns.md
Expand Up @@ -778,27 +778,52 @@ Data packages can also be declared inline in the data catalog:

None known.

## Table Schema: Unique constraint
## Table Schema: Unique constraints

### Overview

A `primaryKey` uniquely identifies each row in a table. Per SQL standards, it
cannot contain `null` values. This pattern implements the SQL UNIQUE constraint
by introducing a `uniqueKeys` array, defining one or more row uniqueness
constraints which do support `null` values.
constraints which do support `null` values. An additional `uniqueNulls` property
controls how `null` values are to be treated in unique constraints.

### Specification

The `uniqueKeys` property, if present, `MUST` be an array. Each entry in the
array must be a `uniqueKey`. A `uniqueKey` `MUST` be an object and `MUST` have
the following properties:
#### `uniqueKeys` (add)

- `fields` – A string or array (structured as per `primaryKey`) specifying the
resource field or fields required to be unique for each row in the table.
- `nullUnique` – A boolean that dictates whether `null` are considered unique
(if `true`) or like any other value (if `false`).
The `uniqueKeys` property, if present, `MUST` be an array. Each entry
(`uniqueKey`) in the array `MUST` be a string or array (structured as per
`primaryKey`) specifying the resource field or fields required to be unique for
each row in the table.

#### Examples
#### `uniqueNulls` (add)

The `uniqueNulls` property is a boolean that dictates how `null` values should
be treated by all unique constraints set on a resource.

- If `true` (the default), `null` values are treated as unique (per most SQL
databases). By this definition, `1, null, null` is UNIQUE.
- If `false`, `null` values are treated like any other value (per Microsoft SQL
Server, Python pandas, R data.frame, Google Sheets). By this definition, `1,
null, null` is NOT UNIQUE.

#### `foreignKeys` (edit)

Per SQL standards, `null` values are permitted in both the local and reference
keys of a foreign key. However, reference keys `MUST` be unique and are
therefore equivalent to a `uniqueKey` set on the reference resource (the meaning
of which is determined by the reference `uniqueNulls`).

Furthermore, per SQL standards, the local key `MAY` contain keys with field
values not present in the reference key if and only if at least one of the
fields is locally `null`. For example, `(1, null)` is permitted locally even if
the reference is `[(2, 1), (3, 1)]`. This behavior is the same regardless of the
value of `uniqueNulls`.

### Examples

#### `null` in unique constraints

| a | b | c | d |
|---|---|---|---|
Expand All @@ -812,29 +837,72 @@ The above table meets the following primary key and two unique key constraints:
{
"primaryKey": ["a"],
"uniqueKeys": [
{
"fields": ["b", "c"],
"nullUnique": true
},
{
"fields": ["c", "d"],
"nullUnique": false
}
]
["b", "c"],
["c", "d"]
],
"uniqueNulls": true
}
```

The primary key `(a)` only contains unique, non-`null` values. In contrast, the
unique keys can contain `null` values. Although unique key `(b, c)` contains two
identical keys `(2, null)`, this is permitted because `nullUnique: true`
identical keys `(2, null)`, this is permitted because `uniqueNulls: true`
specifies that `null` values are unique. This behavior is consistent with the
UNIQUE constraint of PostgreSQL and most other SQL implementations, as
illustrated by this
[dbfiddle](https://dbfiddle.uk/?rdbms=postgres_11&fiddle=34cab8ba7d74b488d215a96f7e83c096).
The same keys would be considered duplicates if `nullUnique: false`, consistent
The same keys would be considered duplicates if `uniqueNulls: false`, consistent
with the UNIQUE constraint of Microsoft SQL Server, as illustrated by this
[dbfiddle](https://dbfiddle.uk/?rdbms=sqlserver_2019l&fiddle=34cab8ba7d74b488d215a96f7e83c096).

#### Setting unique constraints

For a given resource, unique constraints can be set for one field using a
field's `unique` constraint, for one or multiple fields using a `uniqueKey`, and
for one or multiple fields using a `foreignKey` referencing the resource. Each
of the following examples set a unique constraint on field `a`:

**Field `constraints`**

```json
{
"fields": [
{
"name": "a",
"constraints": {
"unique": true
}
}
]
}
```

**`uniqueKeys`**

```json
{
"uniqueKeys": [
"a"
]
}
```

**`foreignKeys`**

```json
{
"foreignKeys": [
{
"fields": "a",
"reference": {
"resource": "",
"fields": "a"
}
}
]
}
```

### Implementations

None known.