Skip to content

Commit

Permalink
postgres repo: started section about performance tuning/monitoring
Browse files Browse the repository at this point in the history
  • Loading branch information
virgo47 committed Mar 31, 2021
1 parent 68b2f98 commit 732db6c
Showing 1 changed file with 102 additions and 65 deletions.
167 changes: 102 additions & 65 deletions docs/repository/db/postgresql.adoc
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,6 @@

== MidPoint Configuration


=== Database create

[source,sql]
Expand All @@ -14,6 +13,16 @@ CREATE USER midpoint WITH PASSWORD 'password' LOGIN SUPERUSER;
CREATE DATABASE midpoint WITH OWNER = midpoint ENCODING = 'UTF8' TABLESPACE = pg_default LC_COLLATE = 'en_US.UTF-8' LC_CTYPE = 'en_US.UTF-8' CONNECTION LIMIT = -1;
----

[NOTE]
====
.Collation
You're free to use different collation if it suits your needs although the following must be considered:
* PostgreSQL support for collation is based on available OS collations or ICU collations (these can't be used as default for the database).
Check https://www.postgresql.org/docs/current/collation.html[collation documentation] for your version of PG to see more.
* Assure that case-sensitivity or insensitivity does not break task bucketing.
See bug:MID-6468[] for example.
====

=== Configuration sample

Expand All @@ -33,58 +42,49 @@ CREATE DATABASE midpoint WITH OWNER = midpoint ENCODING = 'UTF8' TABLESPACE = pg
----


=== Driver

Driver for this database is bundled in MidPoint WAR archive.


== Tips and Tricks


=== Ubuntu


==== Create Database User

[source]
----
sudo -u postgres createuser --pwprompt --no-superuser --no-createdb --no-createrole midpoint
----


==== Delete Database User

[source]
----
sudo -u postgres dropuser midpoint
----


==== Delete Database

[source]
----
sudo -u postgres dropdb midpoint
----


==== Create database

[source]
----
sudo -u postgres createdb --owner=midpoint midpoint
----


==== Run databse script

[source]
----
psql --host=localhost --username=midpoint < postgresql-schema.sql
----


=== Working with explicit schema

[source]
Expand All @@ -95,34 +95,78 @@ GRANT USAGE, SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA midpoint TO
GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA midpoint TO midpoint_user_role;
----

== Performance tuning

== Development installation on Windows
=== Statement statistics

[WARNING]
While the module/extension exists in previous PostgreSQL versions the table `pg_stat_statements`
has more and renamed columns, so the examples below work only from version 13 on.

To track slow queries it is recommended to enable module https://www.postgresql.org/docs/current/pgstatstatements.html[pg_stat_statements].
To enable it the module must be added to `postgresql.conf` like this:

----
shared_preload_libraries = 'pg_stat_statements'
----

If you already use `shared_preload_libraries`, add it to the comma separated list.
Database restart is required after this configuration change.

To track slow queries one can use query like this:

[source,sql]
----
-- before the first time, run this, otherwise the view doesn't exist
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- remove "_exec" from columns for PG < 13
select
(total_exec_time / 1000 / 60)::numeric(20,4) as total_min, -- min as minutes here
mean_exec_time::numeric(20,2) as avg_ms,
max_exec_time::numeric(20,2) as max_ms,
calls,
query
from pg_stat_statements
-- optional where to limit it to one database, if needed (e.g. shared/test DB)
-- where dbid = (select oid from pg_database where datname = 'midpoint')
order by 1 desc -- change order as necessary
limit 50;
----

(Tested in March 2020 with PostgreSQL 12.2.)
=== Tuning indexes

TODO
// TODO

== Development installation on Windows

(Tested in March 2020 with PostgreSQL 12.2)

=== Installation, initdb

To avoid any problems with admin rights we will install link:https://www.enterprisedb.com/download-postgresql-binaries[Postgres from ZIP archive] as suggested on link:https://www.postgresql.org/download/windows/[this page] (for advanced users ☺).

* Unpack the archive, e.g. to *c:\work\tools\pgsql* (*pgsql* is the name of zipped directory).
* Unpack the archive, e.g. to `c:\work\tools\pgsql` (`pgsql` is the name of zipped directory).

* Optional: Add *C:\work\tools\pgsql\bin* to your PATH environment variable for convenience.
* Optional: Add `C:\work\tools\pgsql\bin` to your `PATH` environment variable for convenience.

* Optional: Decide where you want your postgres data and set *PGDATA* environment variable.
This is useful for *initdb* and *pg_ctl* commands, but can be also provided with switches.
I strongly recommend to set this one.

* Run (and provide admin password and [.underline]#remember it#): *initdb.exe -E utf8 --auth-host=scram-sha-256 -U admin -W*
This is useful for `initdb` and `pg_ctl` commands, but can be also provided with switches.
I strongly recommend setting this one.

** If you don't have *PGDATA* set, provide the expected directory name with *-D* switch.

** By default superuser is named *postgres*, you can use that, if you want.
This is useful on Linux where the name matches system user name, less so on Windows.
* Run (and provide admin password and [.underline]#remember it#):
+
----
initdb.exe -E utf8 --auth-host=scram-sha-256 -U admin -W
----

** If you don't have `PGDAT`* set, provide the expected directory name with `-D` switch.

** By default, superuser is named `postgres`, you can use that, if you want.
This is useful on Linux where the name matches system username, less so on Windows.

If we try to mess with locales (e.g. *LC_CTYPE* is set) *initdb* command may fail mysteriously with:
If we try to mess with locales (e.g. `LC_CTYPE` is set) `initdb` command may fail mysteriously with:

[source]
----
Expand All @@ -133,16 +177,13 @@ Check your installation.

This is the start of the locale related problems on Windows, but not the end of it.

* Start the database with: *pg_ctl start*

* Again, use *-D* switch if you don't have *PGDATA* set.

* Log in: *psql -W postgres admin*

** *-W* forces the password prompt and is not necessary, unless you already set some *psql* environment variables; *postgres* is name of the default database, *admin* is the user name.
* Start the database with: `pg_ctl start`

* Again, use `-D` switch if you don't have `PGDATA` set.

* Log in: `psql -W postgres admin`

** `-W` forces the password prompt and is not necessary, unless you already set some `psql` environment variables; `postgres` is name of the default database, `admin` is the username.

=== Creating user and database

Expand All @@ -154,12 +195,15 @@ CREATE USER midpoint WITH PASSWORD 'password' LOGIN SUPERUSER;
CREATE DATABASE midpoint WITH OWNER = midpoint ENCODING = 'UTF8' TABLESPACE = pg_default CONNECTION LIMIT = -1;
----

If we try the locales provided above, it will very likely complain about it and fallback to something like *CPUTF-8*.
If we try the locales provided above, it will very likely complain about it and fallback to something like `CPUTF-8`.

Now the database is up and running, operated manually with *pg_ctl start*/*stop*, fully under our control, no services, no access rights problems (provided the directories you used are available to you).
Now the database is up and running, operated manually with `pg_ctl start`/`stop`, fully under our control, no services, no access rights problems (provided the directories you used are available to you).

Then we want to populate the database, e.g. from the root of Midpoint repository: *psql midpoint midpoint < config\sql\postgresql-4.2-all.sql*
Then we want to populate the database, e.g. from the root of Midpoint repository:

----
psql midpoint midpoint < config\sql\postgresql-4.2-all.sql
----

=== Separate audit database

Expand All @@ -181,34 +225,28 @@ Alternatively, copy paste only audit related tables and other objects:

* foreign keys containing *audit* - *fk_audit_delta* through *fk_audit_resource*.


=== Configuring midPoint

* Adjust the Midpoint configuration.
If you use multiple databases I recommend to add new *midpoint.home*, e.g.: *c:\work\tmp\midpoint-home-postgres*
If you use multiple databases I recommend adding a new *midpoint.home*, e.g.: *c:\work\tmp\midpoint-home-postgres*

* Add *config.xml* there with repository configured to use Postgres (the same as at the start of this page).
* Add `config.xml` there with the repository configured to use Postgres (the same as at the start of this page).

* Start your Midpoint application with switch: *-Dmidpoint.home=c:\work\tmp\midpoint-home-postgres*
* Start your Midpoint application with switch: `-Dmidpoint.home=c:\work\tmp\midpoint-home-postgres`

To make *psql* easier to use you can set environment variables *PGHOST=localhost* (probably default), *PGPORT=5432* (probably default), *PGDATABASE=midpoint*, *PGUSER=midpoint*, *PGPASSWORD=password* (normally obviously unsafe).
To make `psql` easier to use you can set environment variables `PGHOST=localhost` (probably default), `PGPORT=5432` (probably default), `PGDATABASE=midpoint`, `PGUSER=midpoint`, `PGPASSWORD=password` (normally obviously unsafe).

[NOTE]
====
Installation up to this point does not allow connection from other machines and we don't need it for Midpoint and PostgreSQL running on the same machine.
In the next experiment, with PostgreSQL on virtual machine, we have to solve this issue as well.
====

In the next experiment, with PostgreSQL on a virtual machine, we have to solve this issue as well.

== Vagrantbox with PostgreSQL on Linux

(Tested in March 2020, based on bento/ubuntu-19.10 box, PostgreSQL is 11.7)

To shield ourselves from non-linux operating system, we can use power of link:https://www.vagrantup.com/[Vagrant]+link:https://www.virtualbox.org/[VirtualBox] combo - both must be installed in advance.

Instead of following the steps lower, you can also use polished Vagrantbox from here: link:https://github.com/virgo47/midpoint-vagrantboxes/tree/master/vagrant-midpoint-db-postgresql[https://github.com/virgo47/midpoint-vagrantboxes/tree/master/vagrant-midpoint-db-postgresql]

Instead of following the steps lower, you can also use polished Vagrantbox from here: link:https://github.com/virgo47/midpoint-vagrantboxes/tree/master/vagrant-midpoint-db-postgresql[vagrant-midpoint-db-postgresql]

=== Virtual machine provisioning

Expand Down Expand Up @@ -275,52 +313,52 @@ EOF
end
----

Now inside the _vagrant directory_ (the one with *Vagrantfile*) run: *vagrant up*
Now inside the _vagrant directory_ (the one with `Vagrantfile`) run: `vagrant up`

[NOTE]
====
No PostgreSQL can run on the host taking the 5432 port.
We either have to stop the Postgres on the host, or use different port for host.
We either have to stop the Postgres on the host, or use different port for the host.
This is perfectly fine, but we have to change it later in JDBC URL as it will not be default anymore.

====


=== Using the VM

This will provision the virtual machine and starts it up for us.
Next vagrant up is faster, first provisioning always lasts longer.
Now we want to log into the machine: *vagrant ssh*

We can check that everthing works just running *psql*, checking the list of databases there with *\l* - but we still don't have *midpoint* database populated, *\d* shows nothing.
Let's get out with: *\q*
We can check that everthing works just running `psql`, checking the list of databases there with `\l`.
As we still don't have `midpoint` database populated, `\d` shows nothing.
Let's get out with: `\q`

To halt the VM later we need to get out of it (exit) and run *vagrant halt* on our host machine, still in our _vagrant directory_. To restart it later just run *vagrant up* again.
To halt the VM later we need to get out of it (exit) and run `vagrant halt` on our host machine, still in our _vagrant directory_.
To restart it later just run `vagrant up` again.
VM remembers everything when used like this.


=== Populating the DB

There are two ways how to run the initial script:

* We can copy *postgresql-4.2-all.sql* to the vagrant directory, which is shared with the virtual machine as its */vagrant* directory.
Than we can run from inside the box (as *vagrant* user): *psql < /vagrant/postgresql-4.2-all.sql*
* We can copy `postgresql-4.3-all.sql` to the vagrant directory, which is shared with the virtual machine as its `/vagrant` directory.
Then we can run from inside the box (as `vagrant` user):
+
----
psql < /vagrant/postgresql-4.2-all.sql
----

* Or we can use *psql* from the host machine, if PostgreSQL is available there.
Database port 5432 is forwarded from guest to host again as 5432, so it seems to be running on host directly.
* Or we can use `psql` from the host machine, if PostgreSQL is available there.
Database port 5432 is forwarded from guest to host again as 5432, so it seems to be running on the host directly.

The first way has an advantage that we don't need to install anything additional on the host - Midpoint and driver is obvious prerequisite that we need anyway.
From here on it's again about setting the configuration, preferably in newly created *midpoint.home* that we provide to Midpoint using *-D* switch.

From here on it's again about setting the configuration, preferably in newly created `${midpoint.home}` that we provide to Midpoint using `-D` switch.

=== Vagrant advantages

One big advantage of using Vagrant is repeatability.
When we're done with our VM or we want to recreated it from scratch (e.g. something went wrong beyond repair) we just destroy it with: *vagrant destroy -f*
When we're done with our VM or we want to recreate it from scratch (e.g. something went wrong beyond repair) we just destroy it with: `vagrant destroy -f`

After that we just start it again with: *vagrant up*
After that we just start it again with: `vagrant up`

This time it does not need to download the base box, but it still needs to provision it - that is to run update, PG install and other commands in the *Vagrantfile*, but this is much faster than manual work.
This time it does not need to download the base box, but it still needs to provision it - that is to run update, PG install and other commands in the `Vagrantfile`, but this is much faster than manual work.

[WARNING]
====
Expand All @@ -333,4 +371,3 @@ It may be needed to use VirtualBox to really get rid of some stuck VM.
Despite these possible annoying problems, I highly recommend Vagrant for provisioning of development environments.
====

0 comments on commit 732db6c

Please sign in to comment.