Start the documentation: overview, TableLocator, default gateways
sad-spirit committed Aug 31, 2023
1 parent be43e4b commit 2f2e54b
Showing 4 changed files with 299 additions and 6 deletions.
14 changes: 8 additions & 6 deletions README.md
@@ -13,11 +13,11 @@ Using those packages immediately allows

## Design goals

-* Code generation should not be necessary, default gateway implementations should be useful as-is.
-* Gateways should be aware of the table properties: columns, primary key, foreign keys.
-* It should be possible to cache the generated SQL, skipping the whole parsing/building process.
-* Therefore, API should encourage building parametrized queries.
-* It should be possible to combine queries built by several Gateways via joins / `EXISTS()`
+* Code generation is not necessary, default gateway implementations are useful as-is.
+* Gateways are aware of the table metadata: columns, primary key, foreign keys.
+* It is possible to cache the generated SQL, skipping the whole parsing/building process.
+* API encourages building parametrized queries.
+* Queries built by several Gateways can be combined via joins / `EXISTS()` / etc.

## Usage example

@@ -125,5 +125,7 @@ where gw_2."name" ~* $1::"text"

## Documentation

-TBD
+* [Package overview](./docs/index.md)
+* [`TableLocator` class](./docs/locator.md)
+* [`TableGateway` interface and its implementations](./docs/gateways.md)

129 changes: 129 additions & 0 deletions docs/gateways.md
@@ -0,0 +1,129 @@
# Gateways

## `TableGateway` interface

This interface extends `TableDefinition` (thus gateways provide access to table metadata) and defines
four methods corresponding to SQL statements:
* `delete($fragments = null, array $parameters = []): ResultSet`
* `insert(array<string, mixed>|SelectCommon|SelectProxy $values, $fragments = null, array $parameters = []): ResultSet`
* `select($fragments = null, array $parameters = []): SelectProxy`
* `update(array $set, $fragments = null, array $parameters = []): ResultSet`

The `$fragments` parameter of the above methods can be one of the following:
* `\Closure` - this is used for ad-hoc queries;
* Implementation of `Fragment` or `FragmentBuilder`;
* Most commonly, iterable over `Fragment` or `FragmentBuilder` implementations.

`$values` (when an array) / `$set` parameter for `insert()` / `update()` is an associative array of the form
`'column name' => 'value'`. Here `'value'` may be either a literal or an instance of `Expression` which is used
to set the column value to an SQL expression:
```PHP
$documentsGateway->insert([
    'id' => 1,
    'title' => 'default',
    'added' => new Expression('now()')
]);
```

Note also that while `delete()` / `insert()` / `update()` methods immediately return `ResultSet` objects,
`select()` returns a `SelectProxy` instance.

### Ad-hoc queries

Sometimes the query AST needs to be modified in a completely custom way. Passing a closure as `$fragments` to one of
the above methods allows exactly this:
```PHP
$gateway->delete(function (Delete $delete) {
    // Modify the $delete query any way you like
});
```

The downside is that a query built in that way will not be cached.

### `SelectProxy` interface

Unlike other methods of `TableGateway`, `select()` *will not* immediately execute the generated `SELECT` statement,
but will return a proxy object. An implementation of `SelectProxy` should contain all the data needed to execute
`SELECT` (and `SELECT COUNT(*)`), with actual queries executed only when `getIterator()` or `executeCount()` is called,
respectively.

The most common case still looks as if `select()` returned a `ResultSet`:
```PHP
foreach ($gateway->select($fragments) as $row) {
    // process the row
}
```

But having a proxy object allows less common cases as well:
* A query returning the total number of rows that satisfy the given conditions is frequently needed as well
(e.g. for pagination); this is done with `executeCount()`;
* The configured object can be used inside a more complex query; this is covered by the `createSelectAST()` method.
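
The first of these cases can be sketched as follows. `fetchWithTotal()` is a hypothetical helper, not part of the package; it relies only on `executeCount()` and on the proxy being iterable, and it assumes the count can be safely cast to an integer.

```PHP
<?php
use sad_spirit\pg_gateway\SelectProxy;

/**
 * Hypothetical helper: returns the rows of an already configured select
 * together with the total number of rows matching its conditions.
 */
function fetchWithTotal(SelectProxy $select): array
{
    return [
        // Executes the SELECT COUNT(*) query (assumed to be castable to int)
        'total' => (int) $select->executeCount(),
        // Iterating the proxy executes the SELECT statement itself
        'rows'  => iterator_to_array($select, false),
    ];
}

// Possible usage, with $gateway and $fragments prepared elsewhere:
// $page = fetchWithTotal($gateway->select($fragments));
```

Two separate queries are executed here, so the helper is only worth calling when the total is actually displayed.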

The package provides a default implementation in the `TableSelect` class. It is immutable, as are
all other Fragments.

## `TableGateway` implementations

The package contains three implementations of the `TableGateway` interface. An instance of one of these will be
returned by `GenericTableGateway::create()` or `$tableLocator->get()` if the locator was not configured
with a custom gateway factory.

What exactly will be returned depends on whether `PRIMARY KEY` constraint was defined on the table and the number
of columns in that key.
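
The selection rule can be summarised with a small sketch. `defaultGatewayClassFor()` is a hypothetical helper written for illustration only; it maps the list of primary key columns to the short name of the class described in the sections that follow and does not reproduce the package's own code.

```PHP
<?php
// Hypothetical helper, not part of pg_gateway: picks the short class name of
// the default gateway based on the number of columns in the primary key.
function defaultGatewayClassFor(array $primaryKeyColumns): string
{
    switch (count($primaryKeyColumns)) {
        case 0:
            // No PRIMARY KEY constraint on the table
            return 'GenericTableGateway';
        case 1:
            // Single-column primary key
            return 'PrimaryKeyTableGateway';
        default:
            // Two or more columns in the key
            return 'CompositePrimaryKeyTableGateway';
    }
}

echo defaultGatewayClassFor(['user_id', 'role_id']), PHP_EOL; // CompositePrimaryKeyTableGateway
```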

### `GenericTableGateway`

This is the simplest gateway implementation, an instance of which is returned for tables that do not have a primary key
defined. In addition to the methods defined in the interface, it provides methods that create statements:
* `createDeleteStatement(FragmentList $fragments): NativeStatement`
* `createInsertStatement(FragmentList $fragments): NativeStatement`
* `createUpdateStatement(FragmentList $fragments): NativeStatement`

The results of those can be used for e.g. `prepare()` / `execute()`. `FragmentList` is an object that keeps all
the fragments used in a query and possibly parameter values for those. It is usually created via
`FragmentList::normalize()` from whatever can be passed as `$fragments` to `TableGateway` methods.

There are also several builder methods defined, these return `Fragment`s / `FragmentBuilder`s configured for
that particular gateway.

### `PrimaryKeyTableGateway`

If a table has a `PRIMARY KEY` constraint defined and the key has only one column, then an instance of this class
will be returned. It implements an additional `PrimaryKeyAccess` interface with the following methods:
* `deleteByPrimaryKey(mixed $primaryKey): ResultSet`
* `selectByPrimaryKey(mixed $primaryKey): SelectProxy`
* `updateByPrimaryKey(mixed $primaryKey, array $set): ResultSet`
* `upsert(array $values): array`

The last method builds and executes an `INSERT ... ON CONFLICT DO UPDATE ...` statement returning the primary key of
the inserted / updated row:
```PHP
$documentsGateway->upsert([
    'id' => 1,
    'title' => 'New title'
]);
```
will most probably return `['id' => 1]`.

The class also defines a `primaryKey(mixed $value): ParametrizedCondition` method which returns a condition used internally
by the methods listed above. A condition obtained this way can be combined with other Fragments.


### `CompositePrimaryKeyTableGateway`

When the table's `PRIMARY KEY` constraint contains two or more columns, this class will be used. We assume that
such a table is generally used for defining an M:N relationship and provide a method that allows replacing
all records related to a key from one side of the relationship:
* `replaceRelated(array $primaryKeyPart, iterable $rows): array`

Assuming the schema defined in [README](../README.md), we can use this method to replace the list of roles
assigned to the user after e.g. editing the user's profile:
```PHP
$tableLocator->atomic(function (TableLocator $locator) use ($userData, $roles) {
    $pkey = $locator->get('example.users')
        ->upsert($userData);

    return $locator->get('example.users_roles')
        ->replaceRelated($pkey, $roles);
});
```
67 changes: 67 additions & 0 deletions docs/index.md
@@ -0,0 +1,67 @@
# sad_spirit/pg_gateway

The Table Data Gateway serves as a gateway to a table in the database: it provides methods that mirror the most common
table operations (`delete()`, `insert()`, `select()`, `update()`) and encapsulates the SQL code needed to actually
perform these operations.

As `pg_gateway` is built upon the [pg_wrapper](https://github.com/sad-spirit/pg-wrapper)
and [pg_builder](https://github.com/sad-spirit/pg-builder) packages, it does not provide database abstraction
and only targets Postgres. This allows leveraging its strengths, such as the rich type system and expressive SQL syntax,
while possibly sacrificing some flexibility.

Some specific design decisions were made for `pg_gateway`; these are outlined below and discussed in more detail
on separate pages.

## Database is the source of truth

The package does not try to generate database schema based on some classes. Instead, it uses the existing schema
to configure the table gateways:
* List of table columns is used for building Conditions depending on columns and for configuring the output of the query;
* `PRIMARY KEY` constraints allow finding rows by primary key and `upsert()` operations;
* `FOREIGN KEY` constraints are used to perform joins.

There is also no need to specify data types outside of SQL: the underlying packages take care of converting
both the output columns and the input parameters. It is sufficient to write
```
field = any(:param::integer[])
```
in your Condition and the package will expect an array of integers for a value of `param` parameter
and properly convert that array for RDBMS's consumption. Output columns are transparently converted to proper PHP types
as well thanks to `pg_wrapper`.

## Queries are built as ASTs

The `pg_builder` package contains a partial reimplementation of PostgreSQL's own query parser. It allows converting
manually written SQL into an Abstract Syntax Tree, analyzing and manipulating this tree,
and finally converting it back to an SQL string.

`pg_gateway` in turn allows direct access to the AST being built and provides its own manipulation options.
For example, it is possible to configure a `SELECT` targeting one table via its gateway's `select()` method
and then embed this `SELECT` into a query being built by a gateway to another table. The fact that we aren't dealing
with strings here allows applying additional conditions and updating table aliases, even if (parts of) the SQL
were provided as strings initially.

The obvious downside is that parsing SQL and building SQL from AST are expensive operations, so we provide means
to cache the complete query.

## Preferring parametrized queries

While Postgres only allows positional parameters like `$1` in queries, `pg_builder` package accepts named
ones like `:param` that are later converted to native positional ones.
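
The idea of the conversion can be illustrated with a deliberately naive sketch. `namedToPositional()` is a hypothetical function, not the `pg_builder` implementation: it uses a regular expression, so unlike a real parser it would also rewrite `:name` sequences that appear inside string literals or comments.

```PHP
<?php
// Naive illustration only (not how pg_builder does it): replaces :name
// placeholders with $1, $2, ... and returns the name => position map.
function namedToPositional(string $sql): array
{
    $positions = [];
    $converted = preg_replace_callback(
        // The lookbehind skips the second colon of a "::" typecast
        '/(?<!:):([A-Za-z_][A-Za-z0-9_]*)/',
        function (array $match) use (&$positions): string {
            $name = $match[1];
            if (!isset($positions[$name])) {
                // A repeated name reuses the position assigned on first sight
                $positions[$name] = count($positions) + 1;
            }
            return '$' . $positions[$name];
        },
        $sql
    );

    return [$converted, $positions];
}

[$sql, $map] = namedToPositional('field = any(:param::integer[]) and other > :min');
echo $sql, PHP_EOL; // field = any($1::integer[]) and other > $2
```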

As was mentioned above, there is no need to specify parameter types outside of the SQL they appear in.
There are also means to pass parameter values alongside query parts that use them.

These features make it easy to assemble a query from several parts containing parameter placeholders, instead of
substituting literals into the query. Parametrized queries can be cached and reused later with other parameter values.

## Reusable query parts

The main concept of the package is `Fragment`: it serves as a sort of proxy to a part of the query AST.
Every query being built starts from the base AST (e.g. `SELECT self.* from table_name as self`) and then
Fragments are applied to it. Those may modify the list of returned columns or add conditions to the `WHERE` clause.

Fragments and related classes have a `getKey()` method that should return a string uniquely identifying the Fragment
based on its contents. It is assumed that applying Fragments with the same keys results in the same changes
to the query. These keys are combined to generate a cache key for the complete query, which makes it possible to skip
the parse / build operations.
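
One way to picture the combination of keys is the following sketch. `combineFragmentKeys()` is a hypothetical function and the key format is made up; it also assumes that a fragment which cannot be identified reports a `null` key, which makes the whole query non-cacheable.

```PHP
<?php
// Hypothetical helper, not the package's code: derives one statement key
// from the keys of all fragments applied to a query.
function combineFragmentKeys(string $statementType, array $fragmentKeys): ?string
{
    // Assumption: a null key means the fragment cannot be identified,
    // so no cache key can be generated for the complete query.
    if (in_array(null, $fragmentKeys, true)) {
        return null;
    }
    // The order of keys is kept, as fragments are applied in order
    return $statementType . '.' . hash('sha256', implode('|', $fragmentKeys));
}

var_dump(combineFragmentKeys('select', ['columns.all', 'condition.id']) !== null); // bool(true)
var_dump(combineFragmentKeys('select', ['columns.all', null]));                    // NULL
```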
95 changes: 95 additions & 0 deletions docs/locator.md
@@ -0,0 +1,95 @@
# TableLocator class

This class serves as a facade to features of `pg_gateway` and the packages it depends on. It is also used
to create table gateways.

It is recommended to pass an instance of this class as a dependency instead of individual gateway objects.

## Constructor arguments

`TableLocator`'s constructor has the following signature:
```PHP
use sad_spirit\pg_wrapper\Connection;
use sad_spirit\pg_gateway\TableGatewayFactory;
use sad_spirit\pg_builder\StatementFactory;
use Psr\Cache\CacheItemPoolInterface;

public function __construct(
    Connection $connection,
    ?TableGatewayFactory $gatewayFactory = null,
    ?StatementFactory $statementFactory = null,
    ?CacheItemPoolInterface $statementCache = null
) {
    // ...
}
```

As you can see, the only required argument is the `Connection` object:
```PHP
$locator = new TableLocator(new Connection('...connection string...'));
```

If `$gatewayFactory` is given, it will be used when calling `get()` method, otherwise `get()` will
return an instance of a [default gateway](./gateways.md).

If `$statementFactory` is omitted, a factory for the given `Connection` will be created
via `StatementFactory::forConnection()` method.

`$statementCache` can be any [PSR-6](https://www.php-fig.org/psr/psr-6/) cache implementation. If given,
it will be used for caching complete statements. Note that table metadata will be cached using
the metadata cache of `Connection` object, if one is available.

`$connection` and `$statementFactory` objects are later accessible via getters:
* `getConnection(): Connection`
* `getStatementFactory(): StatementFactory`


## Facade methods

* `atomic(callable $callback, bool $savepoint = false): mixed` - calls `Connection::atomic()` passing
the `TableLocator` instance as the first argument to the given callback. This executes the callback atomically
(i.e. within a database transaction).
* `getParser(): Parser` - returns an instance of `Parser` used by `StatementFactory`
* `createFromString(string $sql): Statement` - calls the same method of `StatementFactory`, parses
SQL of a complete statement returning its AST.
* `createFromAST(Statement $ast): NativeStatement` - calls the same method of `StatementFactory`, builds an SQL string
from AST and returns object encapsulating this string and parameter data.
* `getTypeConverterFactory(): TypeNameNodeHandler` - returns the type converter factory object used by `Connection`
* `createTypeNameNodeForOID($oid): TypeName` - calls the same method of `TypeNameNodeHandler`, returns `TypeName` node
corresponding to database type OID that can be used in statement AST.

## Creating statements

`createNativeStatementUsingCache(\Closure $factoryMethod, ?string $cacheKey): NativeStatement` method is used
by `TableGateway` and `SelectProxy` implementations for creating statements.

Note the return type: the goal of this method is to avoid parse / build operations and return the actual pre-built SQL.
The `$factoryMethod` closure, on the other hand, should return an instance of `Statement`; consider the actual
implementation of `GenericTableGateway::createInsertStatement()`:
```PHP
public function createInsertStatement(FragmentList $fragments): NativeStatement
{
    return $this->tableLocator->createNativeStatementUsingCache(
        function () use ($fragments): Insert {
            $insert = $this->tableLocator->getStatementFactory()->insert(new InsertTarget(
                $this->getName(),
                new Identifier(TableGateway::ALIAS_SELF)
            ));
            $fragments->applyTo($insert);
            return $insert;
        },
        $this->generateStatementKey(self::STATEMENT_INSERT, $fragments)
    );
}
```

## Creating gateways (the `Locator` part)

Gateways are created using the `get(string|QualifiedName $name): TableGateway` method. This will call the `create()`
method of the `TableGatewayFactory` implementation that was passed to the constructor and will fall back to
`GenericTableGateway::create()` if there is either no factory or its `create()` method returned `null`.

If a gateway was already created for the given table name, the existing instance will be returned.

It is recommended to always provide a qualified name (`schema_name.table_name`) for a table: the package does not try
to process `search_path` and will just assume that an unqualified name belongs to the `public` schema.
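
The two rules above (reusing existing instances and defaulting to the `public` schema) can be pictured with a simplified sketch. `GatewayRegistry` is a hypothetical class that only stands in for the behaviour of `get()`; it works with plain strings and ignores identifier quoting, which a real implementation has to handle.

```PHP
<?php
// Hypothetical stand-in for the caching behaviour of TableLocator::get();
// identifier quoting and QualifiedName objects are deliberately ignored.
final class GatewayRegistry
{
    /** @var array<string, object> gateways indexed by qualified table name */
    private array $gateways = [];
    /** @var callable creates a gateway for a qualified table name */
    private $factory;

    public function __construct(callable $factory)
    {
        $this->factory = $factory;
    }

    public function get(string $name): object
    {
        // An unqualified name is assumed to belong to the "public" schema
        $qualified = strpos($name, '.') === false ? 'public.' . $name : $name;
        if (!isset($this->gateways[$qualified])) {
            $this->gateways[$qualified] = ($this->factory)($qualified);
        }
        return $this->gateways[$qualified];
    }
}

$registry = new GatewayRegistry(function (string $name): object {
    return (object) ['table' => $name];
});
var_dump($registry->get('users') === $registry->get('public.users')); // bool(true)
```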
