This repository contains just enough code to start all of the services required to run the reference data service. Once you have cloned this repository into your local IDE, run `git pull` to ensure you have the most up-to-date version of the schema. Next, build a reference data image by running `./dev.sh rebuild` in bash (or `.\dev.ps1 -Rebuild` in PowerShell). On a Windows device, use PowerShell to create your reference data image.
Then you can start it by running `./dev.sh` (or `.\dev.ps1` in PowerShell). The script maps the current `flyway` subdirectory to `/flyway` and the current `../csv` directory to `/csvs`. Once started, your terminal runs inside the reference data container.
By default, the image does not include the Keycloak service, because most of the time you don't need it and it increases startup times dramatically. If you need to build an image with Keycloak, run `./dev.sh rebuild full` (or `.\dev.ps1 -Rebuild -Full`).
Next, run `./flyway.sh`; this pushes schema changes to the pgweb instance. Run it first so that your local instance has the most up-to-date schema. When making CSV changes this is not required, but it is good practice to run the command, as it re-reads the schema and rebuilds the tables to include your CSV changes.
Any version of the container runs an instance of pgweb (a tiny Postgres web UI), which you can access at http://localhost:8081/ in your browser. It's already pre-connected and pre-authenticated to the local reference data instance. The tool lets you view the schema and data, export data to CSV, etc.
NOTE: If you are only using this code to run the Reference Data service, customising the run is unnecessary and you can skip to the section on making schema changes.
The image serves the following ports inside the container:
- 3000 - PostgREST
- 5432 - Postgres DB
- 8000 - Keycloak
- 8081 - pgweb
It's better not to expose them, so that they don't conflict with anything running on your machine, but if you need to expose, say, port 5432, just use plain docker syntax, i.e. run this instead:

docker run -it --rm -v "$(pwd)/flyway:/flyway" -v "$(dirname $(pwd))/csv:/csvs" -p 5432:5432 refdata

You can customise the command if you need to map the flyway and/or csv directories to a different location.
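As a minimal sketch of such a customisation (the host path variables here are assumptions; the script only prints the resulting command so you can review it before running):

```shell
# Sketch: build a customised `docker run` command for the refdata image.
# FLYWAY_DIR and CSV_DIR are hypothetical host paths; adjust to your layout.
FLYWAY_DIR="${FLYWAY_DIR:-$(pwd)/flyway}"
CSV_DIR="${CSV_DIR:-$(dirname "$(pwd)")/csv}"

# Assemble the command as a string so it can be reviewed before running.
CMD="docker run -it --rm -v \"$FLYWAY_DIR:/flyway\" -v \"$CSV_DIR:/csvs\" -p 5432:5432 refdata"

echo "$CMD"
```

Set `FLYWAY_DIR` or `CSV_DIR` in your environment to point at different locations, then run the printed command.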
For any schema change, create a separate branch from master for all of your changes.
- Create a new flyway file in `schemas/reference` - follow the flyway documentation if you are not familiar with this system. Look at the existing scripts and try to be like a ninja - make your scripts blend in. Please refer to the table requirements document to understand how it should be done. The file format we use for the reference schema is `V1__example.sql`, where two underscores are used.
- Change `flyway.target` in `docker/flyway_reference_docker.conf` to your script number.
- Raise a PR and wait for validation to complete. Check the build logs if anything goes wrong - you will get a detailed reason why something is not working.
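For example, if your new migration were named `V12__add_port_codes.sql` (a hypothetical name), the target line in `docker/flyway_reference_docker.conf` would read something like the following - match the exact value format used by the existing entries in that file:

```
flyway.target=12
```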
You can test changes made outside the docker image from inside the docker image, pushing them to your local pgweb instance, by running `./flyway.sh`, because the local `flyway` directory is mapped as a docker volume inside the container.
Run `./validate.sh`. When the validation script runs you will see a progress bar in your terminal; check that the file you changed is displayed, to make sure the script has recognised your file when running the tests.
This repository does not contain any CSV data. Assuming you have already cloned the CSV data folder one level above, test CSV changes by running `./csv.sh` inside the docker container, which maps the local `../csv` folder as `/csvs`. This command only pushes CSV data to your local pgweb instance; to apply changes to the master branch, follow the same process as outlined for schema changes.
When running the reference data image via bash on a Windows device, the container does not always recognise flyway changes, and as a result the validate script will not pick up any errors. It is unclear why this occurs; the current workaround is to use PowerShell to build the reference data image on Windows.
Use psql (`psql -U postgres -d ref`) to check a table's schema:

SELECT column_name, data_type FROM information_schema.columns WHERE table_name = 'table_name';

or to get table values:

SELECT * FROM table_name;
As we have duplicate IDs (but unique `id` + `validfrom`), there is no way to create the usual foreign key constraint, unless the constraint uses `id` + `validfrom` as both the source and target key. This is not what you want, though, because the latest version of the target table's record is always preferred.
To validate that a column's values are in the range of a target table's `id` column, you can add a trigger. Say the `address` table has a `countryid` column, and it has to be in the range of the `id` column of the `country` table:
create trigger address_countryid before insert on address
for each row execute procedure validate_id('countryid', 'country', 'id');
`validate_id` is a custom function which, at the time of writing, is defined as follows:
-- the following function validates that a column is in range of other table column's values
-- expects 3 parameters
-- 0: source column name
-- 1: target table name
-- 2: target column name
create or replace function validate_id() returns trigger as $$
declare
    src TEXT;
    tbl TEXT;
    dest TEXT;
    value TEXT;
    in_range BOOLEAN;
begin
    src := tg_argv[0];
    tbl := tg_argv[1];
    dest := tg_argv[2];
    -- log
    --raise notice '% -> %.%', src, tbl, dest;
    -- dynamically query whether source value is in range
    -- i.e.
    -- select [new value] in (select [id] from [dest_table])
    execute format('select ($1).%s in (select %I from %I)', src, dest, tbl)
        using NEW
        into in_range;
    if in_range then
        return NEW;
    end if;
    -- raise proper error
    -- get source value
    execute format('select ($1).%s::text', src)
        using NEW
        into value;
    raise exception '% % is not in range of %.%', src, value, tbl, dest;
end;
$$ language plpgsql;
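To illustrate, with the trigger above in place, inserting an `address` row whose `countryid` does not exist in `country` fails. This is a sketch against a live instance; the column list here is hypothetical and must match your actual `address` schema:

```sql
-- Assuming country does not contain an id of 999, this insert is rejected
-- by the address_countryid trigger:
insert into address (id, validfrom, countryid) values (1, '2020-01-01', 999);
-- ERROR:  countryid 999 is not in range of country.id
```

The error text comes from the `raise exception` at the end of `validate_id`, which reports the source column, the offending value, and the target table and column.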
From psql you can execute the `\copy` command:
\copy (select * from port) to /csvs/port_export.csv with csv