Docs: Intro, Cross compile install, and Configuration cleanup #274

Merged
merged 7 commits into from Mar 31, 2023
207 changes: 13 additions & 194 deletions README.md
@@ -35,187 +35,27 @@ PL/Rust itself is a [`pgx`](https://github.com/tcdi/pgx)-based Postgres extensio
plrust` function are themselves mini-pgx extensions. `pgx` is a generalized framework for developing Postgres extensions with Rust. Like this project, `pgx`
is developed by [TCDI](https://www.tcdi.com).

The following sections discuss PL/Rusts safety guarantees, configuration settings, and installaiton instructions.
The following sections discuss PL/Rust's safety guarantees, configuration settings, and installation instructions.

# General Safety, by Rust

Quoted from the "Rustonomicon":

> Safe Rust is the true Rust programming language. If all you do is write Safe Rust, you will never have to worry
> about type-safety or memory-safety. You will never endure a dangling pointer, a use-after-free, or any other kind
> of Undefined Behavior (a.k.a. UB).

This is the universe in which PL/Rust functions live. If a PL/Rust function compiles, it carries these guarantees
from the Rust compiler and won't "crash." This quality is important for natively compiled code running in a
production database.

## What about `unsafe`?

PL/Rust uses the Rust compiler itself to wholesale **disallow** the use of `unsafe` in user functions. If
a `LANGUAGE plrust` function uses `unsafe` it won't compile.

Generally, what this means is that PL/Rust functions cannot call `unsafe fn`s, cannot call `extern "C"`s into
Postgres itself, and cannot dereference pointers.

This is accomplished using Rust's built-in `#![forbid(unsafe_code)]` lint.

3rd-party crate dependencies are allowed to use `unsafe`. We'll discuss this below.
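
As a rough illustration of the `#![forbid(unsafe_code)]` lint described above, here is a minimal sketch in plain Rust. The module name `user_function` and the `strlen` body are hypothetical, and scoping the lint to a module is only an assumption made so the snippet stands alone; it is not necessarily how PL/Rust applies the lint internally.

```rust
// Sketch only: the lint is scoped to a module here so the snippet is
// self-contained. How PL/Rust wires the lint in is not shown.
mod user_function {
    #![forbid(unsafe_code)]

    /// Hypothetical user function body: byte length of an optional text value.
    pub fn strlen(val: Option<&str>) -> Option<i64> {
        val.map(|s| s.len() as i64)
    }

    // Uncommenting the function below makes compilation fail, because
    // `forbid(unsafe_code)` rejects any `unsafe` block:
    //
    // pub fn first_byte(ptr: *const u8) -> u8 {
    //     unsafe { *ptr }
    // }
}
```

Because the lint level is `forbid` rather than `deny`, code inside the module cannot turn it back off with `#[allow(unsafe_code)]`.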

## What about `pgx`?

If `pgx` is a "generalized framework for developing Postgres extensions with Rust", and if PL/Rust user functions
are themselves "mini-pgx extensions", what prevents a `LANGUAGE plrust` function from using any part of `pgx`?

The [`plrust-trusted-pgx`](https://github.com/tcdi/plrust/tree/main/plrust-trusted-pgx) crate does!

`plrust-trusted-pgx` is a tightly-controlled "re-export crate" on top of `pgx` that exposes the bare minimum necessary for
PL/Rust user functions to compile along with the bare minimum, **safe** features of `pgx`.

The crate is versioned independently of both `pgx` and `plrust` and is published on [crates.io](https://crates.io/crates/plrust-trusted-pgx).
By default, a PL/Rust user function uses the version set in the project repository when PL/Rust itself
was compiled. However, the `plrust.trusted_pgx_version` GUC can be set to select a specific version.
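
A minimal sketch of pinning the version, assuming the GUC is set in `postgresql.conf`. The value `X.Y.Z` is a placeholder, not a real release number; substitute a version actually published on crates.io.

```conf
# Placeholder value: replace X.Y.Z with a published plrust-trusted-pgx version.
plrust.trusted_pgx_version = 'X.Y.Z'
```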

The intent is that `plrust-trusted-pgx` can evolve independently of both `pgx` and `plrust`.

There are a few "unsafe" parts of `pgx` exposed through `plrust-trusted-pgx`, but PL/Rust's ability to block `unsafe`
renders them unusable from PL/Rust user functions. `plrust-trusted-pgx`'s docs are available on [docs.rs](https://docs.rs/plrust-trusted-pgx).

## Trusted with `postgrestd` on Linux x86_64/aarch64

The "trusted" version of PL/Rust uses a unique fork of Rust's `std` entitled
[`postgrestd`](https://github.com/tcdi/postgrestd) when compiling `LANGUAGE plrust` user functions. `postgrestd` is
a specialized Rust compilation target which disallows access to the filesystem and the host operating system.

Currently, `postgrestd` is only supported on Linux x86_64 and aarch64 platforms.

When `plrust` user functions are compiled and linked against `postgrestd`, they are prohibited from using the
filesystem, executing processes, and otherwise interacting with the host operating system.

In order for PL/Rust to use `postgrestd`, its Rust compilation targets must be installed on the Postgres server.
This happens via plrust's [`plrust/build`](plrust/build) script, which clones `postgrestd`, compiles it, by
default, for both x86_64 and aarch64 architectures, and ultimately places a copy of the libraries Rust needs
for `std` into the appropriate "sysroot", which is where rustc looks for those libraries when building.

## What about Rust compiler bugs?

PL/Rust uses its own "rustc driver" which enables it to apply custom lints to the user's `LANGUAGE plrust` function.
In general, these lints will fail compilation if the user's code uses certain code idioms or patterns which we know to
have "I-Unsound" issues.

PL/Rust contains a small set of lints to block what the developers have deemed the most egregious "I-Unsound" Rust bugs.

Should new Rust bugs be found and detection lints be developed for PL/Rust, those lints can be applied to new user
function compilations, and PL/Rust can ensure that functions executed in the future had those lints applied at compile time.

Note that this is done on a best-effort basis and does *not* provide a strong level of security. It's not a sandbox,
and as such, it's likely that a skilled hostile attacker who is sufficiently motivated could find ways around it
(PostgreSQL itself is not a particularly hardened codebase, after all). You should ensure such actors cannot execute SQL
on your database, but to be clear: this is true regardless of whether or not PL/Rust is installed. Having said that, any
issues found with our implementation will be taken seriously, and should be [reported appropriately](./SECURITY.md).

## The `trusted` Feature Flag

PL/Rust has a feature flag simply named `trusted`. When compiled with the `trusted` feature flag PL/Rust will
**always** use the `postgrestd` targets when compiling user functions. Again, this is only supported on x86_64 and
aarch64 Linux systems.

`postgrestd` and the `trusted` feature flag are **not** supported on other platforms. As such, PL/Rust cannot be
considered fully trusted on those platforms.

If the `trusted` feature flag is not used when compiling PL/Rust, which is the default, then `postgrestd` is **not**
used when compiling user functions. They still benefit from Rust's general compile-time safety
checks, forced usage of the `plrust-trusted-pgx` crate, and PL/Rust's `unsafe` blocking, but they will be able to access the
filesystem and communicate with the host operating system as the user running the connected Postgres backend
(typically, this is a user named `postgres`).

# PL/Rust is also a Cross Compiler

In this day and age of sophisticated and flexible Postgres replication, along with cloud providers offering
Postgres on, and replication to, disparate CPU architectures, it's important that plrust, since it stores the user
function binary bytes in a database table, support running that function on a replicated Postgres server of a
different CPU architecture.

*cross compilation has entered the chat*

By default, plrust will not perform cross compilation. It must be turned on through configuration.

Configuring a *host* to properly cross compile can take anything from minimal effort to individual feats of
heroic effort. Reading the (still in-progress) guide at https://github.com/tcdi/pgx/blob/master/CROSS_COMPILE.md
can help. Generally speaking, it's not too awful to set up on Debian-based Linux systems, such as Ubuntu. Basically,
you install the "cross compilation toolchain" `apt` package for the *other* platform.

For full "trusted" PL/Rust user functions, `postgrestd` is required and must also be installed.

# Installing PL/Rust

Installing PL/Rust and especially `postgrestd` requires a normal installation of Rust via
[`rustup`](https://rustup.rs) and for the relevant locations to be writeable on the building host.
See the [Install PL/Rust](https://tcdi.github.io/plrust/install-plrust.html)
section of the documentation for notes on installing PL/Rust and its dependencies.

These steps assume cross compilation is also going to be used. If not, simply remove references to the architecture
that isn't yours.

## Install `cargo-pgx`

PL/Rust is a [`pgx`](https://github.com/tcdi/pgx)-based Postgres extension and requires `cargo-pgx` to be installed.

```bash
$ cargo install cargo-pgx --version 0.7.2 --locked
$ cargo pgx init
```

Next, let's clone this repo:

```bash
$ git clone https://github.com/tcdi/plrust.git
$ cd plrust
```

## Cross Compilation Support

If you want cross-compilation support, install the Rust targets for aarch64 and x86_64, then install `postgrestd`.
These are necessary to cross compile `postgrestd` and PL/Rust user functions.

```bash
$ cd plrust
$ rustup target install aarch64-unknown-linux-gnu
$ rustup target install x86_64-unknown-linux-gnu
```

Once finished, while still in the `plrust` subdirectory, run the `postgrestd` build script. This
example assumes that the `pg_config` binary from Postgres v15 is on your `$PATH`. If v15 is not your intended
Postgres version, change `PG_VER` to the proper major version number.

```bash
$ PG_VER=15 \
STD_TARGETS="x86_64-postgres-linux-gnu aarch64-postgres-linux-gnu" \
./build
```
See the
[Cross compilation](https://tcdi.github.io/plrust/install-cross-compile.html)
section of the documentation for cross-compilation details.

(note: the above environment variables are the default... you can just run `./build`)

This will take a bit of time as it clones the `postgrestd` repository, builds it for two architectures, and finally
runs PL/Rust's entire test suite in "trusted" mode.

## Install PL/Rust

Installing the `plrust` extension is simple. Make sure the `pg_config` binary for the Postgres installation on the
host is in the `$PATH`, and simply run:

```bash
$ cargo pgx install --release --features "trusted"
```

Alternatively, you can specify the path to `pg_config`:

```bash
$ cargo pgx install --release --features "trusted" -c /path/to/pg_config
```

If you'd prefer PL/Rust be "untrusted" and haven't also installed `postgrestd` for at least the host architecture,
you can omit the `--features "trusted"` arguments.

# Configuration
## Configuration

See the [PostgreSQL Configuration](https://tcdi.github.io/plrust/config-pg.html)
section of the documentation for notes on configuring PL/Rust in
@@ -225,26 +65,7 @@ section of the documentation for notes on configuring PL/Rust in
----


For PL/Rust to cross compile user functions, it needs to know which CPU architectures to target. These are set via
`plrust.compilation_targets`, a comma-separated list of values, of which only `x86_64` and `aarch64` are
currently supported.

The architecture linker names have sane defaults and shouldn't need to be changed (unless the host is some
esoteric Linux distro we haven't encountered yet).

The `plrust.{arch}_pgx_bindings_path` settings should be treated as required, even though PL/Rust will happily cross compile without them. If unspecified,
PL/Rust will use the pgx bindings of the host architecture for the cross compilation target architecture too. In other words, if the host
is `x86_64` and PL/Rust is configured to cross compile to `aarch64` and the `plrust.aarch64_pgx_bindings_path` is *not* configured, it'll
blindly use the bindings it already has for `x86_64`. This may or may not actually work.

To get the bindings, install `cargo-pgx` on the other system and run `cargo pgx cross pgx-target`. That'll generate a tarball. Copy that back
to the primary host machine and untar it somewhere (plrust doesn't care where), and use that path as the configuration setting.

Note that it is perfectly fine (and really, expected) to set all of these configuration settings on both architectures.
plrust will silently ignore the one for the current host. In other words, plrust only uses them when cross compiling for
the other architecture.
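
As a sketch of the settings described above, the following `postgresql.conf` fragment configures an `x86_64` host to also compile for `aarch64`. The setting names come from the text above; the bindings path is a hypothetical location for the untarred output of `cargo pgx cross pgx-target`, and the exact list formatting (here, no spaces after the comma) is an assumption.

```conf
# Compile user functions for both supported architectures.
plrust.compilation_targets = 'x86_64,aarch64'

# Hypothetical path: wherever the tarball generated on the aarch64 machine was extracted.
plrust.aarch64_pgx_bindings_path = '/path/to/aarch64/pgx-bindings'
```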

### Lints
## Lints

See the [Lints section](https://tcdi.github.io/plrust/config-lints.html)
of the documentation.
@@ -294,15 +115,15 @@ strlen

In the Postgres world it seems common for procedural languages to have two styles, "trusted" and "untrusted". The consensus is to name those as "lang" and "langu", respectively -- where the "u" is supposed to represent "untrusted" (see "plperl" vs. "plperlu" for example).

plrust does not do this. The only thing that Postgres uses to determine if a language handler is considered "trusted" is if it was created using `CREATE TRUSTED LANGUAGE`. It does not inspect the name.
PL/Rust does not do this. The only thing that Postgres uses to determine if a language handler is considered "trusted" is if it was created using `CREATE TRUSTED LANGUAGE`. It does not inspect the name.

plrust stores the compiled user function binaries as a `bytea` in an extension-specific table uniquely key'd with its compilation target.
PL/Rust stores the compiled user function binaries as a `bytea` in an extension-specific table uniquely keyed by its compilation target.

As such, compiling a function with an "untrusted" version of plrust, then installing the "trusted" version and trying to run that function will fail -- "trusted" and "untrusted" are considered different compilation targets and are not compatible with each other, even if the underlying hardware is exactly the same.
As such, compiling a function with an "untrusted" version of PL/Rust, then installing the "trusted" version and trying to run that function will fail -- "trusted" and "untrusted" are considered different compilation targets and are not compatible with each other, even if the underlying hardware is exactly the same.
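
To make the consequence of keying binaries by compilation target concrete, here is a small Rust sketch. It is not PL/Rust's actual schema or code: `FunctionStore`, its fields, and the function id are hypothetical, and the sketch assumes an "untrusted" build uses a target such as `x86_64-unknown-linux-gnu` while a "trusted" build uses `x86_64-postgres-linux-gnu`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the extension's table of compiled functions.
/// The key pairs a function id with the compilation target it was built for.
#[derive(Default)]
pub struct FunctionStore {
    binaries: HashMap<(u32, String), Vec<u8>>,
}

impl FunctionStore {
    /// Record the compiled bytes of `fn_id` for one target.
    pub fn insert(&mut self, fn_id: u32, target: &str, bytes: Vec<u8>) {
        self.binaries.insert((fn_id, target.to_string()), bytes);
    }

    /// Look up the binary for `fn_id` built for exactly `target`.
    pub fn lookup(&self, fn_id: u32, target: &str) -> Option<&[u8]> {
        self.binaries
            .get(&(fn_id, target.to_string()))
            .map(|bytes| bytes.as_slice())
    }
}
```

A binary stored under the untrusted target is simply not found when the lookup uses the trusted target, even though both targets describe the same hardware.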

This does mean that it is not possible to install both "trusted" and "untrusted" versions of plrust on the same Postgres database cluster.
This does mean that it is not possible to install both "trusted" and "untrusted" versions of PL/Rust on the same Postgres database cluster.

In the future, as `postgrestd` is ported to more platforms, we will seriously consider having both `plrust` and `plrustu`. Right now, since "trusted" is only possible on Linux x86_64/aarch64, our objective is to drive production installations to be "trusted", while allowing non-Linux developers the ability to use `LANGUAGE plrust` too.
In the future, as `postgrestd` is ported to more platforms, we will seriously consider having both `plrust` and `plrustu`. Right now, since "trusted" is only possible on Linux `x86_64`/`aarch64`, our objective is to drive production installations to be "trusted", while allowing non-Linux developers the ability to use `LANGUAGE plrust` too.


# Security Notice
@@ -312,5 +133,3 @@ Please read the [Security](SECURITY.md) for directions on reporting a potential
# License

PL/Rust is licensed under "The PostgreSQL License", which can be found [here](LICENSE.md).

[docs-rs-tracing-directive]: https://docs.rs/tracing-subscriber/0.3.11/tracing_subscriber/filter/struct.EnvFilter.html
14 changes: 5 additions & 9 deletions doc/src/SUMMARY.md
@@ -4,9 +4,10 @@

# Installation

- [Install Prerequisites](./install-prerequisites.md)
- [Install PL/Rust](./install-plrust.md)
- [Update PL/Rust](./update-plrust.md)
- [Cross compilation](./install-cross-compile.md)


# PL/Rust Usage

@@ -17,17 +18,12 @@
- [Triggers](./triggers.md)
- [SPI](./spi.md)
- [Trusted and Untrusted PL/Rust](./trusted-untrusted.md)
- [Rules and Regulations](./rules-regulations.md)

# PL/Rust Configuration

- [PostgreSQL configuration](./config-pg.md)
- [Lints](./config-lints.md)
- [Environment variables](./config-env-var.md)


- [Rules and Regulations](./rules-regulations.md)

# PL/Rust Under the Hood

- [Architecture](./architecture.md)
- [Designing for Trust](./designing-for-trust.md)
- [Lints](./config-lints.md)
- [Environment variables](./config-env-var.md)