
Create scaffolding for adding Postgres support #171

Merged
12 commits merged into CardanoSolutions:master on Jun 12, 2024

Conversation

domMayhew (Contributor)

Add 'postgres' flag and dummy Postgres module

Add 'postgres' flag to:

  1. make targets:
    1. make configure
    2. make /dist/bin/kupo
  2. kupo.cabal:
    1. Conditionally include the Postgres module vs the SQLite module
    2. Add cpp-options: -Dpostgres
  3. Kupo.App.Database:
    1. Import Postgres or SQLite module depending on flag (see the sketch below)
    2. Postgres module is currently a duplicate of the SQLite module

Also add additional files to make clean so that the project is actually recompiled after a clean
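For item 3, a minimal sketch of the CPP-based selection (assuming the backend modules are named `Kupo.App.Database.Postgres` and `Kupo.App.Database.SQLite`; the actual module names in the PR may differ):

```haskell
{-# LANGUAGE CPP #-}

-- Sketch only: re-export whichever backend was selected at configure time
-- via `cpp-options: -Dpostgres`.
module Kupo.App.Database
    ( module DB
    ) where

#ifdef postgres
import Kupo.App.Database.Postgres as DB
#else
import Kupo.App.Database.SQLite as DB
#endif
```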

@domMayhew marked this pull request as draft on April 25, 2024 20:05
@KtorZ (Member) left a comment

Sounds about right so far. I take it that the current Postgres module is a copy of the SQLite module at the moment? (It's written in the PR description 👍)

@domMayhew (Contributor, Author) commented May 29, 2024

New work completed

  1. Create a DBPool type that abstracts the setup/instantiation of a DB, providing a few different types of connections (through a resource pool), and shutdown (a rough sketch follows after this list). This allows SQLite to create its two separate pools and use its locking mechanism to provide a priority ("long-lived") connection, without forcing Postgres to follow the same semantics. Postgres can create a single pool of connections for readers and writers, for example, and does not need to create a DB file on initial start up.
    1. The types of connections supported are simply the types that were already in use: a blocking connection, a connection that will fail if the max number of resources is already in use, and an "exclusive writer" connection (this is the "long-lived" connection).
  2. Rename WorkDir to DatabaseLocation and add a --postgres-url=URL option to its sum types. The alternative to this approach is to conditionally compile the WorkDir option, but this increases the number of places that conditional compilation is taking place. It also forces us to conditionally compile the parser for the option. By simply adding an extra sum type, we reduce the static type safety since we have to check at runtime that a --postgres-url has not been passed to an SQLite binary, but we significantly reduce the complexity of conditional compilation.
  3. Make minor modifications to test files to keep up with the changes.
    1. One significant change is that an in-memory DB created by DatabaseSpec is not created with a test-specific connection string. It is created with the same connection string as running instances of Kupo. Does this mean that if a test is run on a machine that has a running instance of Kupo there will be a conflict between the test in-memory DB and the actual in-memory DB?
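A rough sketch of the shape such a `DBPool` abstraction could take (the field names, `ConnectionType` constructors, and `Database` placeholder are illustrative assumptions, not the exact API introduced in this PR):

```haskell
{-# LANGUAGE RankNTypes #-}
module DBPoolSketch where

-- Illustrative stand-ins for kupo's real types.
data ConnectionType = ReadOnly | ReadWrite | WriteOnly
data Database = Database -- stands for the record of database operations

-- One possible shape for a backend-agnostic pool of connections.
data DBPool m = DBPool
    { -- Block until a connection of the requested type becomes available.
      withDatabaseBlocking :: forall a. ConnectionType -> (Database -> m a) -> m a
      -- Fail fast when every connection of that type is already in use.
    , tryWithDatabase :: forall a. ConnectionType -> (Database -> m a) -> m (Maybe a)
      -- The priority, long-lived "exclusive writer" connection.
    , withLongLivedConnection :: forall a. (Database -> m a) -> m a
      -- Release all pooled connections on shutdown.
    , destroyResources :: m ()
    }
```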

Current Issues

  1. The Makefile currently fails to build the Postgres version because it depends on the libpq system library. I am not sure what the best way to access this is. A couple of options:
    1. As part of the configure stage, clone the PostgreSQL repo and, in the build phase, provide the libpq directory to cabal.
    2. Modify the flake to include libpq.
    3. Create a small wrapper around the flake to add libpq to the dev shell.
  2. The unit tests are failing for me, both in this branch and in master.

Work still to be done

  1. Write the Postgres module.
  2. Consider if the concurrency parameters need to change for Postgres (i.e., how many concurrent connections do we want to support?)
  3. Determine if the Postgres module should support a priority/long-lived connection like the SQLite module does, or if that should just be a normal connection.
  4. Determine how to handle WriteOnly connections and database copying.
  5. Write tests to support both SQLite and Postgres compilations.
  6. Make any changes necessary to allow migrations to run on a PostgreSQL instance.

@domMayhew marked this pull request as ready for review on June 3, 2024 18:01
@KtorZ (Member) commented Jun 11, 2024

> Rename WorkDir to DatabaseLocation and add a --postgres-url=URL option to its sum types. The alternative to this approach is to conditionally compile the WorkDir option, but this increases the number of places that conditional compilation is taking place.

I agree that the full alternative is somewhat of a stretch, and that sounds good. To not confuse people, though, I'd still suggest hiding --postgres-url from the CLI (using `hidden`) unless the flag is present. That would keep the code rather simple and give the same illusion from an interface standpoint 😅.
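A minimal sketch with optparse-applicative, assuming the parser always includes the option and only its visibility is CPP-guarded (the option name and help text are illustrative; the final commits ended up using `internal`, see below):

```haskell
{-# LANGUAGE CPP #-}
module PostgresUrlOption where

import Options.Applicative

-- Sketch only: the option always parses, so no extra conditional parsing is
-- needed, but it disappears from --help unless compiled with -Dpostgres.
postgresUrlOption :: Parser String
postgresUrlOption =
    strOption $ mconcat
        [ long "postgres-url"
        , metavar "URL"
        , help "Connection URL of the PostgreSQL database."
#ifndef postgres
        -- 'internal' hides the option from both the short and the long help;
        -- 'hidden' would only hide it from the short help.
        , internal
#endif
        ]
```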

> Does this mean that if a test is run on a machine that has a running instance of Kupo there will be a conflict between the test in-memory DB and the actual in-memory DB?

No; because in-memory databases using the :memory: connection string are separate and live in-process. There are ways to create shared in-memory databases with SQLite using connection flags provided by the underlying open_v2 method, and this is what happens when one passes --in-memory as a flag:

```haskell
InMemory Nothing ->
    ("file::kupo:?mode=memory&cache=shared", openFlags)
```

What's not obvious here is that the file name is :kupo:, and it identifies a named memory location. Using different names would yield different, segmented memory locations. This is why the tests use either :memory:, which is actually a special name in SQLite that always creates a separate in-memory location (so tests don't conflict with one another), or they specify their own in-memory location like:

            [ ( "in-memory"
              , \test ->
                    test (InMemory (Just "file::concurrent-read-write:?cache=shared&mode=memory"))
              )

This cannot conflict with a live running instance because users cannot actually provide an InMemory with Just a file location, only InMemory Nothing. The Just side is only available from test code.
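To summarize the behaviour in code, here is a simplified, illustrative sketch of the connection-string selection (the constructor names mirror the snippets above; the function itself is not kupo's actual implementation):

```haskell
{-# LANGUAGE LambdaCase #-}
module InMemorySketch where

-- 'Just' on InMemory is only reachable from test code; the CLI only ever
-- produces 'InMemory Nothing'.
data DatabaseFile
    = OnDisk FilePath
    | InMemory (Maybe String)

connectionString :: DatabaseFile -> String
connectionString = \case
    OnDisk fp ->
        fp
    -- What --in-memory produces: one shared, named in-memory database
    -- identified by ":kupo:".
    InMemory Nothing ->
        "file::kupo:?mode=memory&cache=shared"
    -- Tests either use plain ":memory:" (always private to its connection)
    -- or name their own shared location, so they never clash with a live
    -- kupo instance.
    InMemory (Just str) ->
        str
```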

> The Makefile currently fails to build the Postgres version because it depends on the libpq system library. I am not sure what the best way to access this is. A couple of options:

It should be possible by using the -iog-full version of the devx nix shell instead of -iog, which includes PostgreSQL (primarily for db-sync, but therefore of good use to us!).

@KtorZ (Member) commented Jun 11, 2024

> The unit tests are failing for me, both in this branch and in master.

How so?

Add 'postgres' flag to:
1. `make` targets:
   1. `make configure`
   2. `make /dist/bin/kupo`
2. kupo.cabal:
   1. Conditionally include the Postgres module vs the SQLite module
   2. Add `cpp-options: -Dpostgres`
3. Kupo.App.Database: import Postgres or SQLite module depending on flag

Also add additional files to `make clean` so that the project is actually recompiled after a `clean`
Previously, creating a SQLite DB file, creating read-only and read-write connection pools, and gatekeeping connection types (e.g., `ReadWrite`)
against the Kupo instance type (e.g., a read-only replica) were exposed by the `SQLite` module and performed in `Kupo`.

Now, those functions are encapsulated within the DB-specific modules to allow unique implementations. For instance, PostgreSQL does not have a DB file
and does not need separate pools for read-only and read-write connections.
Use a single data structure, `databaseLocation`, to describe either a work directory or in-memory when compiled for SQLite,
or a remote URL when compiled for PostgreSQL. This provides strong type safety but comes at the cost of having to include an
argument parser in the DB modules, which is not really where it belongs. Alternatively, conditional compilation could be
added to the argument parsing module, but that also seems out of place. Ultimately, including all 3 alternatives in
`databaseLocation` and validating at runtime may be a cleaner solution, despite the loss of type safety.
Add a command line option to specify a remote URL as the DB location instead of a work directory
or in-memory. This was added as another constructor of the existing option type to prevent additional
conditional compilation and avoid the need to conditionally compile the option parser. This
comes at the expense of type safety, as each DB module throws a runtime error if an
incompatible DB location is provided.

Trace logging of the max concurrency configuration was kept in `Kupo` rather than the DB modules
due to a cyclical dependency between `App/Configuration`, `SQLite`/`Postgres`, and `App/Database`.
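A sketch of the `databaseLocation` shape described in the last two commit messages, with assumed constructor names; the runtime check stands in for whatever validation each DB module actually performs:

```haskell
{-# LANGUAGE LambdaCase #-}
module DatabaseLocationSketch where

-- Illustrative constructors; the real type covers the same three cases.
data DatabaseLocation
    = Dir FilePath      -- --workdir DIR
    | InMemory          -- --in-memory
    | Remote String     -- --postgres-url=URL

-- In the SQLite module, an incompatible location becomes a runtime error
-- rather than a compile-time one; this is the trade-off discussed above.
requireLocalLocation :: DatabaseLocation -> IO ()
requireLocalLocation = \case
    Remote _ ->
        fail "a --postgres-url was provided, but this binary was compiled for SQLite"
    _ ->
        pure ()
```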
Clean the whole build directory so cabal actually rebuilds the binary, and include the postgres flag in the configure stage
A Member left an inline review comment on a diff hunk around `longestRollback`, `dbFile`, `(action InstallIndexesIfNotExist)`, and `(withDatabaseBlocking dbPool) ReadOnly (action InstallIndexesIfNotExist)`:

These changes slightly alter the behavior. Before, the main background connection was not actually part of the pool, mostly because there was no need: that connection never goes back to the pool; it's a long-running connection that persists for the entire life of the application.

It's not a big deal though, as it will only consume one reader connection when set up as a read-only replica. It makes things easier down the line.
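In other words (a sketch under assumed names, using resource-pool's `withResource`): the background connection is now just a pool resource that is checked out once and held for the application's lifetime, which is what consumes one reader slot on a read-only replica.

```haskell
module LongLivedConnectionSketch where

import Data.Pool (Pool, withResource)

-- Stand-in for a real database connection.
data Connection = Connection

-- Before: the background connection was opened outside the pool and never
-- returned. Now: it is taken from the pool and simply held until the
-- application exits, occupying one slot for the whole run.
runWithBackgroundConnection :: Pool Connection -> (Connection -> IO a) -> IO a
runWithBackgroundConnection pool mainLoop =
    withResource pool mainLoop
```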

KtorZ and others added 5 commits June 11, 2024 12:26
  We cannot just use 'InMemory' for tests as this creates a shared in-memory database (each test therefore conflicting with one another). This is why 'DatabaseFile' allows for passing an extra path to the InMemory variant. Note that having both 'DatabaseFile' and 'DatabaseLocation' feels somewhat redundant and just adds confusion overall. We shall perhaps consider getting rid of one.
  Ideally, we should just not create any resource pool when in read-only mode. But the impact on the API is sufficiently annoying to not be worth it. So we still create a read/write pool which simply remains idle for the entire application lifetime. We must, however, ensure that the pool size is >= 1, otherwise it crashes at runtime. At the same time, we don't want to report any writer capabilities in the logs.
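A small sketch of that guard, assuming the older `createPool` API from resource-pool (the connection type and open/close actions are placeholders):

```haskell
module ReadOnlyPoolSketch where

import Data.Pool (Pool, createPool)

-- Placeholder connection and lifecycle actions.
data Connection = Connection

openConnection :: IO Connection
openConnection = pure Connection

closeConnection :: Connection -> IO ()
closeConnection _ = pure ()

-- Even on a read-only replica the read/write pool is still created; it just
-- stays idle for the whole application lifetime.
newReadWritePool :: Int -> IO (Pool Connection)
newReadWritePool requestedSize =
    createPool openConnection closeConnection
        1                      -- number of stripes
        30                     -- seconds to keep idle connections alive
        (max 1 requestedSize)  -- never allow a zero-sized pool
```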
Including `newDBPoolFromFile` in the external API of `App/Database`
creates issues for the Postgres module, which does not have any
notion of a DB file. Instead, we add a `FilePath` to
`Configuration.InMemory` but do not expose the ability to provide a
filepath on the command line. This allows us to provide a filepath
for testing purposes but does not allow the user to supply one, all
while maintaining good abstraction boundaries.
Due to an issue with CPP, multiline string literals had to be replaced with
concatenated single line literals.

`internal` was used instead of `hidden` to hide the options from the
long help text as well as the short help.
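For context on the CPP issue: GHC's preprocessor treats a trailing backslash as a line continuation, which mangles Haskell's multiline string gaps in any module compiled with CPP. Splitting the literal into concatenated single-line pieces avoids it, roughly like this illustrative example:

```haskell
module CppStringsSketch where

-- A string gap such as
--
--   usageText = "Location of the database: \
--               \either a working directory or a PostgreSQL URL."
--
-- breaks under {-# LANGUAGE CPP #-}; concatenation does not.
usageText :: String
usageText =
    "Location of the database: "
        <> "either a working directory or a PostgreSQL URL."
```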
@KtorZ merged commit 5d8f52c into CardanoSolutions:master on Jun 12, 2024
@KtorZ (Member) commented Jun 12, 2024

Merging this already as we've got a few things to do on top which might conflict; so to save everyone time, let's bring new changes as new PRs :)
