#55: Updated documentation (#56)
* #55: Updated documentation
Co-authored-by: Sebastian Bär <redcatbear@ursus-minor.de>
ckunki committed Aug 29, 2022
1 parent 3a4ac56 commit 61c880a
Showing 10 changed files with 222 additions and 75 deletions.
8 changes: 5 additions & 3 deletions README.md
@@ -31,16 +31,18 @@ If you want to set up a Virtual Schema for a different database system, please h

### Information for Users

* [PostgreSQL dialect User Guide](doc/user_guide/postgresql_user_guide.md)
* [General Virtual Schema User Guide][user-guide]
* [Virtual Schema User Guide](https://docs.exasol.com/database_concepts/virtual_schemas.htm)
* [PostgreSQL Dialect User Guide](doc/user_guide/postgresql_user_guide.md)
* [List of supported capabilities](doc/generated/capabilities.md)
* [Changelog](doc/changes/changelog.md)
* [Dependencies](dependencies.md)

Find all the documentation in the [Virtual Schemas project][vs-doc].

## Information for Developers

* [Virtual Schema API Documentation][vs-api]
* [Virtual Schema API Documentation](https://github.com/exasol/virtual-schema-common-java/blob/main/doc/development/api/virtual_schema_api.md)
* [Remote logging](https://docs.exasol.com/db/latest/database_concepts/virtual_schema/logging.htm)

## Additional Resources

1 change: 1 addition & 0 deletions doc/changes/changelog.md
@@ -1,5 +1,6 @@
# Changes

* [2.0.4](changes_2.0.4.md)
* [2.0.3](changes_2.0.3.md)
* [2.0.2](changes_2.0.2.md)
* [2.0.1](changes_2.0.1.md)
19 changes: 19 additions & 0 deletions doc/changes/changes_2.0.4.md
@@ -0,0 +1,19 @@
# Virtual Schema for PostgreSQL 2.0.4, released 2022-08-29

Code name: Documentation and Dependencies update

## Summary

Fixed vulnerability [sonatype-2022-4402](https://ossindex.sonatype.org/vulnerability/sonatype-2022-4402) reported by ossindex for dependency [org.postgresql:postgresql:jar:42.4.0](https://ossindex.sonatype.org/component/pkg:maven/org.postgresql/postgresql@42.4.0?utm_source=ossindex-client&utm_medium=integration&utm_content=1.8.1) in compile by updating dependency.

Updated documentation, fixed broken links, and added information specific to PostgreSQL virtual schemas.

## Documentation

* #55: Updated documentation

## Dependency Updates

### Compile Dependency Updates

* Updated `org.postgresql:postgresql:42.4.0` to `42.5.0`
92 changes: 92 additions & 0 deletions doc/developers_guide/developers_guide.md
@@ -0,0 +1,92 @@
# Developers Guide

This guide contains information for developers.

## Password for writing to BucketFS of your Exasol database

If you are running Exasol in a Docker container, the following script retrieves the password for writing to BucketFS of your Exasol database:

```shell
CONTAINER=<container id>
export BUCKETFS_PASSWORD=$(
docker exec -it $CONTAINER \
grep WritePass /exa/etc/EXAConf \
| sed -e 's/.* = //' \
| tr -d '\r' \
| base64 -d)
```

## Remote Logging

When [creating a Virtual Schema](../user_guide/postgresql_user_guide.md#creating-a-virtual-schema) you can enable remote access to information logged by the virtual schema adapter, see [Remote logging](https://docs.exasol.com/db/latest/database_concepts/virtual_schema/logging.htm).

Please note that remote logging

* imposes security risks on your system
* may affect the performance of your system
* should be used only for debugging and development purposes but not in productive scenarios

## Finding Out the Port of a PostgreSQL Database Installation

The default port of PostgreSQL is `5432`.

To find out the port of an installation that does not use the default, use:

```shell
# Take the first input line, turn each run of spaces into a tab, and print field $1.
function hfield () { head -1 | sed -e 's/  */\t/g' | cut -f "$1" ; }
SENDQ=$(ss -tl | grep postgresql | hfield 3)
ss -tln | grep $SENDQ | hfield 4
```
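
As an alternative to inspecting open sockets, the port can often be read from the server configuration file. The following is a minimal sketch, assuming the configuration file is readable and contains at most a plain `port = <number>` setting; the helper name `pg_conf_port` is hypothetical and not part of PostgreSQL.

```shell
# Hypothetical helper: print the port configured in a postgresql.conf file.
# Commented-out lines ("#port = ...") are ignored; if no active "port" line
# exists, the PostgreSQL default 5432 is printed instead.
pg_conf_port () {
    local conf="$1" port
    # Keep only the number of each active "port = <number>" line; the last one wins.
    port=$(sed -n -E 's/^[[:space:]]*port[[:space:]]*=[[:space:]]*([0-9]+).*/\1/p' "$conf" | tail -n 1)
    echo "${port:-5432}"
}
```

For example, `pg_conf_port /etc/postgresql/10/main/postgresql.conf` prints the port for the installation path used elsewhere in this guide. On a running server, `sudo -u postgres psql -tAc 'SHOW port;'` reports the effective value.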

## Making PostgreSQL Service Listen to External Connections

To let the Exasol database access your PostgreSQL database through a virtual schema, you may need to make the PostgreSQL service listen to external connections.

See also:

* [Configuring PostgreSQL to allow remote connections](https://www.bigbinary.com/blog/configure-postgresql-to-allow-remote-connection)
* ["FATAL: no pg_hba.conf entry for host" (Stack Exchange)](https://dba.stackexchange.com/questions/83984/connect-to-postgresql-server-fatal-no-pg-hba-conf-entry-for-host)

Please note:

* Accepting external connections imposes security risks on your PostgreSQL database.
* In case you are not sure please contact your local IT security officer.
* The following steps are only suitable for limited experiments in a secure sandbox environment.

Use `sudo vi` to add the following line to file `/etc/postgresql/10/main/postgresql.conf`:
```
listen_addresses = '*'
```

Use `sudo vi` to add the following line to file `/etc/postgresql/10/main/pg_hba.conf`:
```
# TYPE DATABASE USER CIDR-ADDRESS METHOD
host all all 0.0.0.0/0 md5
```
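
The entry above admits connections from any address. A narrower variant restricts access to a single address range. The sketch below only prints such an entry; the helper name `hba_entry` and the range `192.168.1.0/24` are illustrative assumptions, not values taken from this project.

```shell
# Hypothetical helper: print a pg_hba.conf entry that admits password-authenticated
# (md5) connections to all databases for all users, but only from the given CIDR range.
hba_entry () {
    printf 'host\tall\tall\t%s\tmd5\n' "$1"
}
```

A possible use is `hba_entry 192.168.1.0/24 | sudo tee -a /etc/postgresql/10/main/pg_hba.conf`. A changed `listen_addresses` setting only takes effect after a server restart, for example with `sudo systemctl restart postgresql` on systems that use systemd.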

## First Steps With PostgreSQL

See also ["Getting started with PostgreSQL"](https://www3.ntu.edu.sg/home/ehchua/programming/sql/PostgreSQL_GetStarted.html).

Check out the list of [PostgreSQL Database clients](https://wiki.postgresql.org/wiki/PostgreSQL_Clients) to find one that suits your needs.
For the following examples we chose the command-line client `psql`, which is included in the default installation.

| Command | Description |
|---------|-------------|
| `sudo apt install postgresql` | Install PostgreSQL |
| `sudo -u postgres psql -c 'CREATE DATABASE mytest;'` | Create database named `mytest` |
| `sudo -u postgres createuser --superuser $USER` | Create a PostgreSQL superuser named after your current OS user |
| `psql mytest` | Connect to database `mytest` using the current user |

Helpful commands in database client:

| Command | Comment |
| -------- | --------- |
| `SELECT version();` | Display version of installed database |
| `\h <command>` | help for command `<command>` |
| `\c <database-name>` | connect to database `<database-name>` |
| `\l` | list databases |
| `\dt` | list tables |
| `\password <user>` | set password for user `<user>` |
| `CREATE TABLE mytable (name VARCHAR, age INT);` | Create a table |

3 changes: 0 additions & 3 deletions doc/dialects/postgresql.md

This file was deleted.

151 changes: 95 additions & 56 deletions doc/user_guide/postgresql_user_guide.md
@@ -2,17 +2,36 @@

[PostgreSQL](https://www.postgresql.org/) is an open-source Relational Database Management System (RDBMS).

## Uploading the JDBC Driver to EXAOperation
## Uploading the JDBC Driver to Exasol BucketFS

First download the [PostgreSQL JDBC driver](https://jdbc.postgresql.org/).
Driver version 42.2.6 or later is recommended if you want to establish a TLS-secured connection.
1. Download the [PostgreSQL JDBC driver](https://jdbc.postgresql.org/).
Driver version 42.2.6 or later is recommended if you want to establish a TLS-secured connection.
2. Upload the driver to BucketFS, see https://docs.exasol.com/db/latest/administration/on-premise/bucketfs/accessfiles.htm.<br />
Hint: Put the driver into folder `default/drivers/jdbc/` to register it for [ExaLoader](#registering-the-jdbc-driver-for-exaloader), too.
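
The upload step can be sketched with `curl`. This is an illustration under assumptions, not the project's official procedure: it assumes BucketFS is reachable over plain HTTP, that the write password is available in the environment variable `BUCKETFS_PASSWORD`, and that host, port, and bucket are supplied by you. The function names are hypothetical.

```shell
# Hypothetical helper: build the BucketFS URL for a file inside a bucket.
# Arguments: host, port, bucket name, path inside the bucket.
bucketfs_target_url () {
    printf 'http://%s:%s/%s/%s\n' "$1" "$2" "$3" "$4"
}

# Hypothetical helper: upload a local file into a directory of a bucket.
# Arguments: local file, host, port, bucket name, directory inside the bucket.
# BucketFS write access uses the user name "w" together with the write password.
bucketfs_upload () {
    local file="$1" url
    url=$(bucketfs_target_url "$2" "$3" "$4" "$5/$(basename "$file")")
    curl --fail --user "w:$BUCKETFS_PASSWORD" -X PUT -T "$file" "$url"
}
```

For example, `bucketfs_upload postgresql-42.4.2.jar <host> <port> default drivers/jdbc` would place the driver in the folder mentioned in the hint above.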

1. [Create a bucket in BucketFS](https://docs.exasol.com/administration/on-premise/bucketfs/create_new_bucket_in_bucketfs_service.htm)
1. Upload the driver to BucketFS
## Registering the JDBC driver for ExaLoader

In order to enable the ExaLoader to fetch data from the external database you must register the driver for ExaLoader as described in the [Installation procedure for JDBC drivers](https://github.com/exasol/docker-db/#installing-custom-jdbc-drivers).
1. ExaLoader expects the driver in BucketFS folder `default/drivers/jdbc`.<br />
If you uploaded the driver for UDF to a different folder, then you need to [upload](#uploading-the-jdbc-driver-to-exasol-bucketfs) the driver again.
2. Additionally you need to create a file `settings.cfg` and [upload](#uploading-the-jdbc-driver-to-exasol-bucketfs) it to the same folder in BucketFS:

```
DRIVERNAME=POSTGRES_JDBC_DRIVER
JAR=<jar file containing the jdbc driver>
DRIVERMAIN=org.postgresql.Driver
PREFIX=jdbc:postgresql:
FETCHSIZE=100000
INSERTSIZE=-1
```

| Variable | Description |
|----------|-------------|
| `<jar file containing the jdbc driver>` | E.g. postgresql-42.4.2.jar |
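
Creating `settings.cfg` can be scripted so that the JAR name always matches the uploaded driver. The following is a minimal sketch; the function name `write_settings_cfg` is hypothetical, and the values are the ones listed above.

```shell
# Hypothetical helper: write a settings.cfg for the PostgreSQL JDBC driver.
# Arguments: name of the driver JAR, optional target file (default: settings.cfg).
write_settings_cfg () {
    local jar="$1" target="${2:-settings.cfg}"
    cat > "$target" <<EOF
DRIVERNAME=POSTGRES_JDBC_DRIVER
JAR=$jar
DRIVERMAIN=org.postgresql.Driver
PREFIX=jdbc:postgresql:
FETCHSIZE=100000
INSERTSIZE=-1
EOF
}
```

Calling `write_settings_cfg postgresql-42.4.2.jar` writes the file into the current directory, from where it can be uploaded to the same BucketFS folder as the driver.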

## Installing the Adapter Script

Upload the latest available release of [PostgrSQL Virtual Schema JDBC Adapter](https://github.com/exasol/postgresql-virtual-schema/releases) to Bucket FS.
[Upload](#uploading-the-jdbc-driver-to-exasol-bucketfs) the latest available release of [PostgreSQL Virtual Schema JDBC Adapter](https://github.com/exasol/postgresql-virtual-schema/releases) to BucketFS.

Then create a schema to hold the adapter script.

@@ -23,16 +42,17 @@ CREATE SCHEMA ADAPTER;
The SQL statement below creates the adapter script, defines the Java class that serves as entry point and tells the UDF framework where to find the libraries (JAR files) for Virtual Schema and database driver.

```sql
--/
CREATE OR REPLACE JAVA ADAPTER SCRIPT ADAPTER.JDBC_ADAPTER AS
%scriptclass com.exasol.adapter.RequestDispatcher;
%jar /buckets/<BFS service>/<bucket>/virtual-schema-dist-9.0.5-postgresql-2.0.3.jar;
%jar /buckets/<BFS service>/<bucket>/virtual-schema-dist-9.0.5-postgresql-2.0.4.jar;
%jar /buckets/<BFS service>/<bucket>/postgresql-<postgresql-driver-version>.jar;
/
```

## Defining a Named Connection

Define the connection to PostgreSQL as shown below. We recommend using TLS to secure the connection.
Define the connection to the PostgreSQL database as shown below. We recommend using TLS to secure the connection.

```sql
CREATE OR REPLACE CONNECTION POSTGRESQL_CONNECTION
@@ -41,21 +61,40 @@ USER '<user>'
IDENTIFIED BY '<password>';
```

If your setup does not support SSL, remove the suffix `?ssl=true&sslfactory=org.postgresql.ssl.DefaultJavaSSLFactory`.


| Variable | Description |
|----------|-------------|
| `<host>` | Hostname or IP address of the machine hosting your PostgreSQL database. |
| `<port>` | Port of the PostgreSQL database, default is `5432`, see also [Developer's guide](../developers_guide/developers_guide.md#finding-out-the-port-of-a-postgresql-database-installation). |
| `<schema name>` | Name of the database schema you want to use in the PostgreSQL database. |

See also [Making PostgreSQL Service Listen to External Connections](../developers_guide/developers_guide.md#making-postgresql-service-listen-to-external-connections) in the Developer's guide.

## Creating a Virtual Schema

Below you see how a PostreSQL Virtual Schema is created.
Use the following SQL command in Exasol database to create a PostgreSQL Virtual Schema:

```sql
CREATE VIRTUAL SCHEMA <virtual schema name>
USING ADAPTER.JDBC_ADAPTER
USING ADAPTER.JDBC_ADAPTER
WITH
CATALOG_NAME = '<catalog name>'
SCHEMA_NAME = '<schema name>'
CONNECTION_NAME = 'POSTGRESQL_CONNECTION'
;
```

## Postgres Identifiers
| Variable | Description |
|----------|-------------|
| `<virtual schema name>` | Name of the virtual schema you want to use. |
| `<catalog name>` | Name of the catalog, usually equivalent to the name of the PostgreSQL database. |
| `<schema name>` | Name of the database schema you want to use in the PostgreSQL database. |

See also section [Remote logging](../developers_guide/developers_guide.md#remote-logging) in the developers guide.

## PostgreSQL Identifiers

In contrast to Exasol, PostgreSQL does not treat identifiers as specified in the SQL standard. PostgreSQL folds unquoted identifiers to lower case instead of upper case. The adapter has two modes for handling this:

@@ -72,7 +111,7 @@ Regardless of this, you can create or refresh the virtual schema by specifying t

```sql
CREATE VIRTUAL SCHEMA <virtual schema name>
USING ADAPTER.JDBC_ADAPTER
USING ADAPTER.JDBC_ADAPTER
WITH
CATALOG_NAME = '<catalog name>'
SCHEMA_NAME = '<schema name>'
@@ -87,24 +126,24 @@ ALTER VIRTUAL SCHEMA postgres SET IGNORE_ERRORS = 'POSTGRESQL_UPPERCASE_TABLES';
```
However, you **will not be able to query the identifier containing the upper case character**. An error is thrown when querying the virtual table.

A best practice for this mode is: **never quote identifiers** (in the PostgreSQL Schema as well as in the Exasol Virtual Schema). This way everything works without having to change your queries.
A best practice for this mode is: **never quote identifiers** (in the PostgreSQL Schema as well as in the Exasol Virtual Schema). This way everything works without having to change your queries.<br />
An alternative is to use the second mode for identifier handling (see below).

### PostgreSQL-like identifier handling

If you use quotes on the PostgreSQL side and have identifiers with uppercase characters, then it is recommended to use this mode. The PostgreSQL-like identifier handling performs no conversions but mirrors the PostgreSQL metadata as is. A small example to make this clear:
```sql
--Postgres Schema
--PostgreSQL Schema
CREATE TABLE "MyTable"("Col1" VARCHAR(100));
CREATE TABLE MySecondTable(Col1 VARCHAR(100));
--Postgres Queries
--PostgreSQL Queries
SELECT "Col1" FROM "MyTable";
SELECT Col1 FROM MySecondTable;
```
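
The folding rule that PostgreSQL applies to the identifiers above can be emulated in a few lines. This is only an illustration of the rule, not code from the adapter: the function name `pg_fold_identifier` is hypothetical, and escaped quotes inside quoted identifiers are not handled.

```shell
# Hypothetical helper: show the name PostgreSQL stores for an identifier.
# Quoted identifiers keep their case (the surrounding quotes are stripped);
# unquoted identifiers are folded to lower case.
pg_fold_identifier () {
    case "$1" in
        \"*\")
            local inner="${1#\"}"
            echo "${inner%\"}"
            ;;
        *)
            echo "$1" | tr '[:upper:]' '[:lower:]'
            ;;
    esac
}
```

With the tables above, `pg_fold_identifier '"MyTable"'` yields `MyTable`, while `pg_fold_identifier MySecondTable` yields `mysecondtable`. This is why the quoted table keeps its uppercase characters in the PostgreSQL metadata.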
```sql
--Create Virtual Schema on EXASOL side
CREATE VIRTUAL SCHEMA <virtual schema name>
USING ADAPTER.JDBC_ADAPTER
USING ADAPTER.JDBC_ADAPTER
WITH
CATALOG_NAME = '<catalog name>'
SCHEMA_NAME = '<schema name>'
@@ -142,50 +181,50 @@ A best practice for this mode is: **always quote identifiers** (in the PostgreSQ

| PostgreSQL Data Type | Supported | Converted Exasol Data Type| Known limitations
|--------------------------|--------------|---------------------------|-------------------
| BIGINT | ✓ | DECIMAL(19,0) |
| BIGSERIAL | ✓ | DECIMAL(19,0) |
| BIT | ✓ | BOOLEAN |
| BIT VARYING | ✓ | VARCHAR(5) |
| BOX | ✓ | VARCHAR(2000000) |
| BYTEA | ✓ | VARCHAR(2000000) |
| BOOLEAN | ✓ | BOOLEAN |
| CHARACTER | ✓ | CHAR |
| CHARACTER VARYING | ✓ | VARCHAR |
| CIDR | ✓ | VARCHAR(2000000) |
| CIRCLE | ✓ | VARCHAR(2000000) |
| DATE | ✓ | DATE |
| DOUBLE PRECISION | ✓ | DOUBLE |
| INET | ✓ | VARCHAR(2000000) |
| INTEGER | ✓ | DECIMAL(10,0) |
| INTERVAL | ✓ | VARCHAR(2000000) |
| JSON | ✓ | VARCHAR(2000000) |
| JSONB | ✓ | VARCHAR(2000000) |
| LINE | ✓ | VARCHAR(2000000) |
| LSEG | ✓ | VARCHAR(2000000) |
| MACADDR | ✓ | VARCHAR(2000000) |
| MONEY | ✓ | DOUBLE |
| NUMERIC | ✓ | VARCHAR(2000000) | Stored in Exasol as VARCHAR, because PostgreSQL NUMERIC values can exceed Exasol Decimal limit which makes it impossible to use Virtual Schemas.
| PATH | ✓ | VARCHAR(2000000) |
| POINT | ✓ | VARCHAR(2000000) |
| POLYGON | ✓ | VARCHAR(2000000) |
| REAL | ✓ | DOUBLE |
| SMALLINT | ✓ | DECIMAL(5,0) |
| SMALLSERIAL | ? (untested) | |
| SERIAL | ? (untested) | |
| TEXT | ✓ | VARCHAR(2000000) |
| TIME | ✓ | VARCHAR(2000000) |
| TIME WITH TIME ZONE | ✓ | VARCHAR(2000000) |
| TIMESTAMP | ✓ | TIMESTAMP |
| TIMESTAMP WITH TIME ZONE | ✓ | TIMESTAMP |
| TSQUERY | ✓ | VARCHAR(2000000) |
| TSVECTOR | ✓ | VARCHAR(2000000) |
| UUID | ✓ | VARCHAR(2000000) |
| XML | ✓ | VARCHAR(2000000) |
| BIGINT | ✓ | DECIMAL(19,0) |
| BIGSERIAL | ✓ | DECIMAL(19,0) |
| BIT | ✓ | BOOLEAN |
| BIT VARYING | ✓ | VARCHAR(5) |
| BOX | ✓ | VARCHAR(2000000) |
| BYTEA | ✓ | VARCHAR(2000000) |
| BOOLEAN | ✓ | BOOLEAN |
| CHARACTER | ✓ | CHAR |
| CHARACTER VARYING | ✓ | VARCHAR |
| CIDR | ✓ | VARCHAR(2000000) |
| CIRCLE | ✓ | VARCHAR(2000000) |
| DATE | ✓ | DATE |
| DOUBLE PRECISION | ✓ | DOUBLE |
| INET | ✓ | VARCHAR(2000000) |
| INTEGER | ✓ | DECIMAL(10,0) |
| INTERVAL | ✓ | VARCHAR(2000000) |
| JSON | ✓ | VARCHAR(2000000) |
| JSONB | ✓ | VARCHAR(2000000) |
| LINE | ✓ | VARCHAR(2000000) |
| LSEG | ✓ | VARCHAR(2000000) |
| MACADDR | ✓ | VARCHAR(2000000) |
| MONEY | ✓ | DOUBLE |
| NUMERIC | ✓ | VARCHAR(2000000) | Stored in Exasol as VARCHAR, because PostgreSQL NUMERIC values can exceed Exasol Decimal limit which makes it impossible to use Virtual Schemas.
| PATH | ✓ | VARCHAR(2000000) |
| POINT | ✓ | VARCHAR(2000000) |
| POLYGON | ✓ | VARCHAR(2000000) |
| REAL | ✓ | DOUBLE |
| SMALLINT | ✓ | DECIMAL(5,0) |
| SMALLSERIAL | ? (untested) | |
| SERIAL | ? (untested) | |
| TEXT | ✓ | VARCHAR(2000000) |
| TIME | ✓ | VARCHAR(2000000) |
| TIME WITH TIME ZONE | ✓ | VARCHAR(2000000) |
| TIMESTAMP | ✓ | TIMESTAMP |
| TIMESTAMP WITH TIME ZONE | ✓ | TIMESTAMP |
| TSQUERY | ✓ | VARCHAR(2000000) |
| TSVECTOR | ✓ | VARCHAR(2000000) |
| UUID | ✓ | VARCHAR(2000000) |
| XML | ✓ | VARCHAR(2000000) |

## Testing information

In the following matrix you find combinations of JDBC driver and dialect version that we tested.

| Virtual Schema Version | PostgreSQL Version | Driver Name | Driver Version |
|------------------------|--------------------|------------------------|-----------------|
| Latest | PostgreSQL 14.2 | PostgreSQL JDBC Driver | 42.3.3 |
| Latest | PostgreSQL 14.2 | PostgreSQL JDBC Driver | 42.4.2 |
2 changes: 1 addition & 1 deletion pk_generated_parent.pom
@@ -3,7 +3,7 @@
<modelVersion>4.0.0</modelVersion>
<groupId>com.exasol</groupId>
<artifactId>postgresql-virtual-schema-generated-parent</artifactId>
<version>2.0.3</version>
<version>2.0.4</version>
<packaging>pom</packaging>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>