Rename register_(hash)_functions to plural for consistency
nyurik committed Mar 20, 2024
1 parent 79afd55 commit 356e15a
Showing 11 changed files with 113 additions and 76 deletions.
8 changes: 5 additions & 3 deletions .github/workflows/ci.yml
@@ -24,6 +24,8 @@ jobs:
- uses: Swatinem/rust-cache@v2
if: github.event_name != 'release' && github.event_name != 'workflow_dispatch'
- run: just test
- name: Check semver
uses: obi1kenobi/cargo-semver-checks-action@v2

msrv:
name: Test MSRV
@@ -117,9 +119,9 @@ jobs:
EXTENSION_FILE: target/${{ matrix.target }}/release/examples/${{ matrix.file }}
SQLITE3_BIN: ${{ matrix.sqlite3 }}
run: ./tests/test-ext.sh
# - name: Test ${{ matrix.target }} extension
# if: matrix.target != 'aarch64-apple-darwin'
# run: just sqlite3=${{ matrix.sqlite3 }} extension_file=target/${{ matrix.target }}/release/examples/${{ matrix.file }} test-ext
# - name: Test ${{ matrix.target }} extension
# if: matrix.target != 'aarch64-apple-darwin'
# run: just sqlite3=${{ matrix.sqlite3 }} extension_file=target/${{ matrix.target }}/release/examples/${{ matrix.file }} test-ext
- name: Package
run: |
pushd target/${{ matrix.target }}/release/examples
2 changes: 1 addition & 1 deletion Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "sqlite-hashes"
version = "0.6.0" # This value is also used in the README.md
version = "0.7.0" # This value is also used in the README.md
description = "Hashing functions for SQLite with aggregation support: MD5, SHA1, SHA256, SHA512, FNV-1a, xxHash"
authors = ["Yuri Astrakhan <YuriAstrakhan@gmail.com>"]
repository = "https://github.com/nyurik/sqlite-hashes"
117 changes: 76 additions & 41 deletions README.md
@@ -6,22 +6,31 @@
[![crates.io version](https://img.shields.io/crates/l/sqlite-hashes.svg)](https://github.com/nyurik/sqlite-hashes/blob/main/LICENSE-APACHE)
[![CI build](https://github.com/nyurik/sqlite-hashes/actions/workflows/ci.yml/badge.svg)](https://github.com/nyurik/sqlite-hashes/actions)


Implement SQLite hashing functions with aggregation support, including MD5, SHA1, SHA224, SHA256, SHA384, SHA512, FNV-1a, xxHash. Functions are available as a loadable extension, or as a Rust library.
Implement SQLite hashing functions with aggregation support, including MD5, SHA1, SHA224, SHA256, SHA384, SHA512,
FNV-1a, xxHash. Functions are available as a loadable extension, or as a Rust library.

See also [SQLite-compressions](https://github.com/nyurik/sqlite-compressions) extension for gzip & brotli compressions.

## Usage

This SQLite extension adds hashing functions like `sha256(...)`, `sha256_hex(...)`, `sha256_concat` and `sha256_concat_hex` for multiple hashing algorithms. The `sha256` and `sha256_concat` function returns a blob value, while the `*_hex` return a HEX string similar to SQLite's own `hex()` function.
This SQLite extension adds hashing functions like `sha256(...)`, `sha256_hex(...)`, `sha256_concat`
and `sha256_concat_hex` for multiple hashing algorithms. The `sha256` and `sha256_concat` function returns a blob value,
while the `*_hex` return a HEX string similar to SQLite's own `hex()` function.

Functions support any number of arguments, e.g. `sha256('foo', 'bar', 'baz')`, hashing them in order as if they were concatenated. Functions can hash text and blob values, but will raise an error on other types like integers and floating point numbers. All `NULL` values are ignored. When calling the built-in SQLite `hex(NULL)`, the result is an empty string, so `sha256_hex(NULL)` will return an empty string as well to be consistent.
Functions support any number of arguments, e.g. `sha256('foo', 'bar', 'baz')`, hashing them in order as if they were
concatenated. Functions can hash text and blob values, but will raise an error on other types like integers and floating
point numbers. All `NULL` values are ignored. When calling the built-in SQLite `hex(NULL)`, the result is an empty
string, so `sha256_hex(NULL)` will return an empty string as well to be consistent.
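
The "hashed in order as if concatenated" and "`NULL` values are ignored" rules can be illustrated without the crate. The following is a minimal standalone sketch using a hand-written 64-bit FNV-1a; the helper names `fnv1a` and `fnv1a_args` are hypothetical, `None` stands in for SQL `NULL`, and this is not the crate's implementation, only an illustration of the semantics described above.

```rust
/// Standard 64-bit FNV-1a offset basis and prime.
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Feed `bytes` into a running FNV-1a state and return the updated state.
fn fnv1a_update(state: u64, bytes: &[u8]) -> u64 {
    bytes
        .iter()
        .fold(state, |h, &b| (h ^ u64::from(b)).wrapping_mul(FNV_PRIME))
}

/// Hash a single byte string.
fn fnv1a(bytes: &[u8]) -> u64 {
    fnv1a_update(FNV_OFFSET, bytes)
}

/// Hash several optional arguments in order (hypothetical helper).
/// `None` plays the role of SQL NULL and is skipped.
fn fnv1a_args(args: &[Option<&str>]) -> u64 {
    args.iter()
        .flatten()
        .fold(FNV_OFFSET, |h, a| fnv1a_update(h, a.as_bytes()))
}

fn main() {
    // Three arguments with a NULL in between hash like one concatenated string.
    let multi = fnv1a_args(&[Some("foo"), None, Some("bar"), Some("baz")]);
    assert_eq!(multi, fnv1a(b"foobarbaz"));
    println!("{multi:016X}");
}
```

Because the state is updated argument by argument, no intermediate concatenated buffer is needed; the same idea carries over to any streaming hash.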

The `*_concat` functions support aggregate to compute combined hash over a set of values like a column in a table, e.g. `sha256_concat` and `sha256_concat_hex`. Just like scalar functions, multiple arguments are also supported, so you can compute a hash over a set of columns, e.g. `sha256_concat(col1, col2, col3)`.
The `*_concat` functions support aggregate to compute combined hash over a set of values like a column in a table,
e.g. `sha256_concat` and `sha256_concat_hex`. Just like scalar functions, multiple arguments are also supported, so you
can compute a hash over a set of columns, e.g. `sha256_concat(col1, col2, col3)`.

**Note:** The window functionality is not supported in the loadable extension, only when used as as a Rust crate. PRs welcome.
**Note:** The window functionality is not supported in the loadable extension, only when used as as a Rust crate. PRs
welcome.

### Extension

To use as an extension, load the `libsqlite_hashes.so` shared library into SQLite.

```bash
@@ -32,15 +41,19 @@ sqlite> SELECT md5_hex('Hello world!');
```

### Rust library
To use as a Rust library, add `sqlite-hashes` to your `Cargo.toml` dependencies. Then, register the needed functions with `register_hash_functions(&db)`. This will register all available functions, or you can use `register_gzip_functions(&db)` or `register_brotli_functions(&db)` to register just the needed ones (you may also disable the default features to reduce compile time and binary size).

To use as a Rust library, add `sqlite-hashes` to your `Cargo.toml` dependencies. Then, register the needed functions
with `register_hash_functions(&db)`. This will register all available functions, or you can
use `register_md5_functions(&db)` or `register_sha256_functions(&db)` to register just the needed ones (you may also
disable the default features to reduce compile time and binary size).

```rust
use sqlite_hashes::{register_hash_functions, rusqlite::Connection};

fn main() {
// Connect to SQLite DB and register needed hashing functions
let db = Connection::open_in_memory().unwrap();
// can also use hash-specific ones like register_sha256_function(&db)
// can also use hash-specific ones like register_sha256_functions(&db)
register_hash_functions(&db).unwrap();

// Hash 'password' using SHA-256, and dump resulting BLOB as a HEX string
@@ -62,70 +75,88 @@ fn main() {
```

### Aggregate and Window Functions
When `aggregate` or `window` feature is enabled (default), there are functions to compute combined hash over a set of values like a column in a table, e.g. `sha256_concat` and `sha256_concat_hex`. Just like scalar functions, multiple arguments are also supported, so you can compute a hash over a set of columns, e.g. `sha256_concat(col1, col2, col3)`. Note that the window functionality is not supported in the loadable extension.

When `aggregate` or `window` feature is enabled (default), there are functions to compute combined hash over a set of
values like a column in a table, e.g. `sha256_concat` and `sha256_concat_hex`. Just like scalar functions, multiple
arguments are also supported, so you can compute a hash over a set of columns, e.g. `sha256_concat(col1, col2, col3)`.
Note that the window functionality is not supported in the loadable extension.

#### IMPORTANT NOTE: ORDERING

SQLite does NOT guarantee the order of rows when executing aggregate functions. A query `SELECT group_concat(v) FROM tbl ORDER BY v;` will NOT concatenate values in sorted order, but will use some internal storage order instead. Other databases like PostgreSQL support `SELECT string_agg(v ORDER BY v) FROM tbl;`, but SQLite does not.
SQLite does NOT guarantee the order of rows when executing aggregate functions. A
query `SELECT sha256_concat(v) FROM tbl ORDER BY v;` will NOT concatenate values in sorted order, but will use some
internal storage order instead.

One common workaround is to use a subquery, e.g. `SELECT group_concat(v) FROM (SELECT v FROM tbl ORDER BY v);`. This is NOT guaranteed to work in future versions of SQLite. See [discussion](https://sqlite.org/forum/info/a49d9c4083b5350c) for more details.
SQLite [v3.44.0](https://www.sqlite.org/changes.html#version_3_44_0)(2023-11-01) added support for the `ORDER BY` clause
**inside** the aggregate function call, e.g. `SELECT sha256_concat(v ORDER BY v) FROM tbl;`. Make sure to use that to
guarantee consistent results.
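
Why the row order matters can be seen from a toy example. The sketch below uses a hand-written 64-bit FNV-1a (not the crate's code; `fnv1a_rows` is a hypothetical helper) to show that the same rows fed in a different order give a different combined hash, while sorting them first makes the result reproducible.

```rust
/// Standard 64-bit FNV-1a offset basis and prime.
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Combined hash over a sequence of rows, fed in iteration order
/// (hypothetical stand-in for an aggregate such as `sha256_concat`).
fn fnv1a_rows<'a>(rows: impl IntoIterator<Item = &'a str>) -> u64 {
    rows.into_iter().fold(FNV_OFFSET, |h, row| {
        row.bytes()
            .fold(h, |h, b| (h ^ u64::from(b)).wrapping_mul(FNV_PRIME))
    })
}

fn main() {
    // Rows as they might come back in some internal storage order.
    let storage_order = ["bbb", "ccc", "aaa"];

    // The same rows, sorted before hashing.
    let mut sorted = storage_order;
    sorted.sort_unstable();

    // Different order, different combined hash.
    assert_ne!(fnv1a_rows(storage_order), fnv1a_rows(sorted));
    // The sorted aggregate matches hashing the concatenated string.
    assert_eq!(fnv1a_rows(sorted), fnv1a_rows(["aaabbbccc"]));
}
```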

In order to guarantee the ordering, you must use a window function.
For older SQLite versions, one common workaround was to use a subquery,
e.g. `SELECT group_concat(v) FROM (SELECT v FROM tbl ORDER BY v);`. This is
NOT guaranteed to work in future versions of SQLite. See [discussion](https://sqlite.org/forum/info/a49d9c4083b5350c)
for more details.

```sql
Another way for older SQLite to guarantee the ordering is to use a window function.

```sql,ignore
SELECT sha256_concat_hex(v)
OVER (ORDER BY v ROWS
BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
FROM tbl
LIMIT 1;
```

The hashing window functions will only work if the starting point of the window is not moving (`UNBOUNDED PRECEDING`). To force a non-NULL value, use COALESCE:
The hashing window functions will only work if the starting point of the window is not moving (`UNBOUNDED PRECEDING`).
To force a non-NULL value, use COALESCE:

```sql
```sql,ignore
SELECT coalesce(
(SELECT sha256_concat_hex(v)
OVER (ORDER BY v ROWS
BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
FROM tbl
LIMIT 1),
sha256_hex('')
);
(SELECT sha256_concat_hex(v)
OVER (ORDER BY v ROWS
BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
FROM tbl
LIMIT 1),
sha256_hex('')
);
```

Note that window functions are only available in SQLite 3.25 and later, so a bundled SQLite version must be used, at least for now.
Note that window functions are only available in SQLite 3.25 and later, so a bundled SQLite version must be used, at
least for now.

```rust
use sqlite_hashes::{register_hash_functions, rusqlite::Connection};

fn main() {
let db = Connection::open_in_memory().unwrap();
register_hash_functions(&db).unwrap();
let db = Connection::open_in_memory().unwrap();
register_hash_functions(&db).unwrap();

// Pre-populate the DB with some data. Note that the b values are not alphabetical.
db.execute_batch("
// Pre-populate the DB with some data. Note that the b values are not alphabetical.
db.execute_batch("
CREATE TABLE tbl(id INTEGER PRIMARY KEY, v TEXT);
INSERT INTO tbl VALUES (1, 'bbb'), (2, 'ccc'), (3, 'aaa');
").unwrap();

let sql = "SELECT sha256_concat_hex(v) OVER (
let sql = "SELECT sha256_concat_hex(v) OVER (
ORDER BY v ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
FROM tbl LIMIT 1;";
let hash: String = db.query_row_and_then(&sql, [], |r| r.get(0)).unwrap();
assert_eq!(hash, "FB84A45F6DF7D1D17036F939F1CFEB87339FF5DBDF411222F3762DD76779A287");
// The above window aggregation example is equivalent to this scalar hash:
let sql = "SELECT sha256_hex('aaabbbccc');";
let hash: String = db.query_row_and_then(&sql, [], |r| r.get(0)).unwrap();
assert_eq!(hash, "FB84A45F6DF7D1D17036F939F1CFEB87339FF5DBDF411222F3762DD76779A287");
let hash: String = db.query_row_and_then(&sql, [], |r| r.get(0)).unwrap();
assert_eq!(hash, "FB84A45F6DF7D1D17036F939F1CFEB87339FF5DBDF411222F3762DD76779A287");

// The above window aggregation example is equivalent to this scalar hash:
let sql = "SELECT sha256_hex('aaabbbccc');";
let hash: String = db.query_row_and_then(&sql, [], |r| r.get(0)).unwrap();
assert_eq!(hash, "FB84A45F6DF7D1D17036F939F1CFEB87339FF5DBDF411222F3762DD76779A287");
}
```

## Crate features
By default, this crate will compile with all features. You can enable just the ones you need to reduce compile time and binary size.

By default, this crate will compile with all features. You can enable just the ones you need to reduce compile time and
binary size.

```toml
[dependencies]
sqlite-hashes = { version = "0.6", default-features = false, features = ["hex", "window", "sha256"] }
sqlite-hashes = { version = "0.7", default-features = false, features = ["hex", "window", "sha256"] }
```

* **trace** - enable tracing support, logging all function calls and their arguments
@@ -141,13 +172,17 @@ sqlite-hashes = { version = "0.6", default-features = false, features = ["hex",
* **fnv** - enable FNV-1a hash support
* **xxhash** - enable xxh32, xxh64, xxh3_64, xxh3_128 hash support

The **loadable_extension** feature should only be used when building a `.so` / `.dylib` / `.dll` extension file that can be loaded directly into sqlite3 executable.
The **loadable_extension** feature should only be used when building a `.so` / `.dylib` / `.dll` extension file that can
be loaded directly into sqlite3 executable.

## Development
* This project is easier to develop with [just](https://github.com/casey/just#readme), a modern alternative to `make`. Install it with `cargo install just`.

* This project is easier to develop with [just](https://github.com/casey/just#readme), a modern alternative to `make`.
Install it with `cargo install just`.
* To get a list of available commands, run `just`.
* To run tests, use `just test`.
* On `git push`, it will run a few validations, including `cargo fmt`, `cargo clippy`, and `cargo test`. Use `git push --no-verify` to skip these checks.
* On `git push`, it will run a few validations, including `cargo fmt`, `cargo clippy`, and `cargo test`.
Use `git push --no-verify` to skip these checks.

## License

24 changes: 12 additions & 12 deletions src/lib.rs
@@ -34,37 +34,37 @@ pub use crate::state::HashState;
mod md5;

#[cfg(feature = "md5")]
pub use crate::md5::register_md5_function;
pub use crate::md5::register_md5_functions;

#[cfg(feature = "sha1")]
mod sha1;

#[cfg(feature = "sha1")]
pub use crate::sha1::register_sha1_function;
pub use crate::sha1::register_sha1_functions;

#[cfg(feature = "sha224")]
mod sha224;

#[cfg(feature = "sha224")]
pub use crate::sha224::register_sha224_function;
pub use crate::sha224::register_sha224_functions;

#[cfg(feature = "sha256")]
mod sha256;

#[cfg(feature = "sha256")]
pub use crate::sha256::register_sha256_function;
pub use crate::sha256::register_sha256_functions;

#[cfg(feature = "sha384")]
mod sha384;

#[cfg(feature = "sha384")]
pub use crate::sha384::register_sha384_function;
pub use crate::sha384::register_sha384_functions;

#[cfg(feature = "sha512")]
mod sha512;

#[cfg(feature = "sha512")]
pub use crate::sha512::register_sha512_function;
pub use crate::sha512::register_sha512_functions;

#[cfg(feature = "fnv")]
mod fnv;
@@ -133,17 +133,17 @@ pub use crate::xxhash::register_xxhash_functions;
/// ```
pub fn register_hash_functions(conn: &Connection) -> Result<()> {
#[cfg(feature = "md5")]
register_md5_function(conn)?;
register_md5_functions(conn)?;
#[cfg(feature = "sha1")]
register_sha1_function(conn)?;
register_sha1_functions(conn)?;
#[cfg(feature = "sha224")]
register_sha224_function(conn)?;
register_sha224_functions(conn)?;
#[cfg(feature = "sha256")]
register_sha256_function(conn)?;
register_sha256_functions(conn)?;
#[cfg(feature = "sha384")]
register_sha384_function(conn)?;
register_sha384_functions(conn)?;
#[cfg(feature = "sha512")]
register_sha512_function(conn)?;
register_sha512_functions(conn)?;
#[cfg(feature = "fnv")]
register_fnv_functions(conn)?;
#[cfg(feature = "xxhash")]
6 changes: 3 additions & 3 deletions src/md5.rs
@@ -12,16 +12,16 @@ use crate::scalar::create_hash_fn;
///
/// ```
/// # use sqlite_hashes::rusqlite::{Connection, Result};
/// # use sqlite_hashes::register_md5_function;
/// # use sqlite_hashes::register_md5_functions;
/// # fn main() -> Result<()> {
/// let db = Connection::open_in_memory()?;
/// register_md5_function(&db)?;
/// register_md5_functions(&db)?;
/// let hash: Vec<u8> = db.query_row("SELECT md5('hello')", [], |r| r.get(0))?;
/// let expected = b"\x5d\x41\x40\x2a\xbc\x4b\x2a\x76\xb9\x71\x9d\x91\x10\x17\xc5\x92";
/// assert_eq!(hash, expected);
/// # Ok(())
/// # }
/// ```
pub fn register_md5_function(conn: &Connection) -> Result<()> {
pub fn register_md5_functions(conn: &Connection) -> Result<()> {
create_hash_fn::<Md5>(conn, "md5")
}
6 changes: 3 additions & 3 deletions src/sha1.rs
@@ -11,16 +11,16 @@ use crate::rusqlite::{Connection, Result};
///
/// ```
/// # use sqlite_hashes::rusqlite::{Connection, Result};
/// # use sqlite_hashes::register_sha1_function;
/// # use sqlite_hashes::register_sha1_functions;
/// # fn main() -> Result<()> {
/// let db = Connection::open_in_memory()?;
/// register_sha1_function(&db)?;
/// register_sha1_functions(&db)?;
/// let hash: Vec<u8> = db.query_row("SELECT sha1('hello')", [], |r| r.get(0))?;
/// let expected = b"\xaa\xf4\xc6\x1d\xdc\xc5\xe8\xa2\xda\xbe\xde\x0f\x3b\x48\x2c\xd9\xae\xa9\x43\x4d";
/// assert_eq!(hash, expected);
/// # Ok(())
/// # }
/// ```
pub fn register_sha1_function(conn: &Connection) -> Result<()> {
pub fn register_sha1_functions(conn: &Connection) -> Result<()> {
crate::scalar::create_hash_fn::<Sha1>(conn, "sha1")
}
6 changes: 3 additions & 3 deletions src/sha224.rs
@@ -11,16 +11,16 @@ use crate::rusqlite::{Connection, Result};
///
/// ```
/// # use sqlite_hashes::rusqlite::{Connection, Result};
/// # use sqlite_hashes::register_sha224_function;
/// # use sqlite_hashes::register_sha224_functions;
/// # fn main() -> Result<()> {
/// let db = Connection::open_in_memory()?;
/// register_sha224_function(&db)?;
/// register_sha224_functions(&db)?;
/// let hash: Vec<u8> = db.query_row("SELECT sha224('hello')", [], |r| r.get(0))?;
/// let expected = b"\xea\x09\xae\x9c\xc6\x76\x8c\x50\xfc\xee\x90\x3e\xd0\x54\x55\x6e\x5b\xfc\x83\x47\x90\x7f\x12\x59\x8a\xa2\x41\x93";
/// assert_eq!(hash, expected);
/// # Ok(())
/// # }
/// ```
pub fn register_sha224_function(conn: &Connection) -> Result<()> {
pub fn register_sha224_functions(conn: &Connection) -> Result<()> {
crate::scalar::create_hash_fn::<Sha224>(conn, "sha224")
}
6 changes: 3 additions & 3 deletions src/sha256.rs
@@ -11,16 +11,16 @@ use crate::rusqlite::{Connection, Result};
///
/// ```
/// # use sqlite_hashes::rusqlite::{Connection, Result};
/// # use sqlite_hashes::register_sha256_function;
/// # use sqlite_hashes::register_sha256_functions;
/// # fn main() -> Result<()> {
/// let db = Connection::open_in_memory()?;
/// register_sha256_function(&db)?;
/// register_sha256_functions(&db)?;
/// let hash: Vec<u8> = db.query_row("SELECT sha256('hello')", [], |r| r.get(0))?;
/// let expected = b"\x2c\xf2\x4d\xba\x5f\xb0\xa3\x0e\x26\xe8\x3b\x2a\xc5\xb9\xe2\x9e\x1b\x16\x1e\x5c\x1f\xa7\x42\x5e\x73\x04\x33\x62\x93\x8b\x98\x24";
/// assert_eq!(hash, expected);
/// # Ok(())
/// # }
/// ```
pub fn register_sha256_function(conn: &Connection) -> Result<()> {
pub fn register_sha256_functions(conn: &Connection) -> Result<()> {
crate::scalar::create_hash_fn::<Sha256>(conn, "sha256")
}