PostgreSQL is an open-source Relational Database Management System (RDBMS).
- Download the PostgreSQL JDBC driver. Driver version 42.2.6 or later is recommended if you want to establish a TLS-secured connection.
- Upload the driver to BucketFS, see https://docs.exasol.com/db/latest/administration/on-premise/bucketfs/accessfiles.htm.
Hint: Put the driver into folderdefault/drivers/jdbc/
to register it for ExaLoader, too.
In order to enable the ExaLoader to fetch data from the external database you must register the driver for ExaLoader as described in the Installation procedure for JDBC drivers.
- ExaLoader expects the driver in BucketFS folder
default/drivers/jdbc
.
If you uploaded the driver for UDF to a different folder, then you need to upload the driver again. - Additionally you need to create a file
settings.cfg
and upload it to the same folder in BucketFS:
DRIVERNAME=POSTGRES_JDBC_DRIVER
JAR=<jar file containing the jdbc driver>
DRIVERMAIN=org.postgresql.Driver
PREFIX=jdbc:postgresql:
FETCHSIZE=100000
INSERTSIZE=-1
Variable | Description |
---|---|
<jar file containing the jdbc driver> |
E.g. postgresql-42.4.2.jar |
Upload the latest available release of PostgreSQL Virtual Schema JDBC Adapter to Bucket FS.
Then create a schema to hold the adapter script.
CREATE SCHEMA ADAPTER;
The SQL statement below creates the adapter script, defines the Java class that serves as entry point and tells the UDF framework where to find the libraries (JAR files) for Virtual Schema and database driver.
--/
CREATE OR REPLACE JAVA ADAPTER SCRIPT ADAPTER.JDBC_ADAPTER AS
%scriptclass com.exasol.adapter.RequestDispatcher;
%jar /buckets/<BFS service>/<bucket>/virtual-schema-dist-9.0.5-postgresql-2.0.4.jar;
%jar /buckets/<BFS service>/<bucket>/postgresql-<postgresql-driver-version>.jar;
/
Define the connection to the PostgreSQL database as shown below. We recommend using TLS to secure the connection.
CREATE OR REPLACE CONNECTION POSTGRESQL_CONNECTION
TO 'jdbc:postgresql://<host>:<port>/<database name>?ssl=true&sslfactory=org.postgresql.ssl.DefaultJavaSSLFactory'
USER '<user>'
IDENTIFIED BY '<password>';
If your setup does not support SSL then simply remove suffix ?ssl=true&sslfactory=org.postgresql.ssl.DefaultJavaSSLFactory
.
Variable | Description |
---|---|
<host> |
Hostname or ip address of the machine hosting you PostgreSQL database. |
<port> |
Port of the PostgreSQL database, default is 5432 , see also Developer's guide. |
<schema name> |
Name of the database schema you want to use in the PostgreSQL database. |
See also Making PostgreSQL Service Listen to External Connections in the Developer's guide.
Use the following SQL command in Exasol database to create a PostgreSQL Virtual Schema:
CREATE VIRTUAL SCHEMA <virtual schema name>
USING ADAPTER.JDBC_ADAPTER
WITH
CATALOG_NAME = '<catalog name>'
SCHEMA_NAME = '<schema name>'
CONNECTION_NAME = 'POSTGRESQL_CONNECTION'
;
Variable | Description |
---|---|
<virtual schema name> |
Name of the virtual schema you want to use. |
<catalog name> |
Name of the catalog, usally equivalent to the name of the PostgreSQL database. |
<schema name> |
Name of the database schema you want to use in the PostgreSQL database. |
See also section Remote logging in the developers guide.
In contrast to Exasol, PostgreSQL does not treat identifiers as specified in the SQL standard. PostgreSQL folds unquoted identifiers to lower case instead of upper case. The adapter has two modes for handling this:
This is the default mode for handling identifiers, but identifier conversion can also be set explicitly using the following property:
ALTER VIRTUAL SCHEMA <virtual schema name> SET POSTGRESQL_IDENTIFIER_MAPPING = 'CONVERT_TO_UPPER';
In this mode you do not have to care about identifier handling. Everything will work as expected out of the box as long as you do not use quoted identifiers (in the PostgreSQL Schema as well as in the Exasol Virtual Schema). More specifically everything will work as long as there are no identifiers in the PostgreSQL database that contain upper case characters. If that is the case an error is thrown when creating or refreshing the virtual schema. Regardless of this, you can create or refresh the virtual schema by specifying the adapter to ignore this particular error as shown below:
CREATE VIRTUAL SCHEMA <virtual schema name>
USING ADAPTER.JDBC_ADAPTER
WITH
CATALOG_NAME = '<catalog name>'
SCHEMA_NAME = '<schema name>'
CONNECTION_NAME = 'POSTGRESQL_CONNECTION'
IGNORE_ERRORS = 'POSTGRESQL_UPPERCASE_TABLES'
;
You can also set this property to an existing virtual schema:
ALTER VIRTUAL SCHEMA postgres SET IGNORE_ERRORS = 'POSTGRESQL_UPPERCASE_TABLES';
However, you will not be able to query the identifier containing the upper case character. An error is thrown when querying the virtual table.
A best practice for this mode is: never quote identifiers (in the PostgreSQL Schema as well as in the Exasol Virtual Schema). This way everything works without having to change your queries.
An alternative is to use the second mode for identifier handling (see below).
If you use quotes on the PostgreSQL side and have identifiers with uppercase characters, then it is recommended to use this mode. The PostgreSQL like identifier handling does no conversions but mirrors the PostgreSQL metadata as is. A small example to make this clear:
--PostgreSQL Schema
CREATE TABLE "MyTable"("Col1" VARCHAR(100));
CREATE TABLE MySecondTable(Col1 VARCHAR(100));
--PostgreSQL Queries
SELECT "Col1" FROM "MyTable";
SELECT Col1 FROM MySecondTable;
--Create Virtual Schema on EXASOL side
CREATE VIRTUAL SCHEMA <virtual schema name>
USING ADAPTER.JDBC_ADAPTER
WITH
CATALOG_NAME = '<catalog name>'
SCHEMA_NAME = '<schema name>'
CONNECTION_NAME = 'POSTGRESQL_CONNECTION'
POSTGRESQL_IDENTIFIER_MAPPING = 'PRESERVE_ORIGINAL_CASE'
;
-- Open Schema and see what tables are there
open schema postgres;
select * from cat;
-- result -->
-- TABLE_NAME TABLE_TYPE
-- ----------------------
-- MyTable | TABLE
-- mysecondtable | TABLE
As you can see MySecondTable
is displayed in lower case in the virtual schema catalog. This is exactly like it is on the PostgreSQL side, but since unquoted identifiers are folded differently in PostgreSQL you cannot query the table like you did in PostgreSQL:
--Querying the virtual schema
--> this works
SELECT "Col1" FROM postgres."MyTable";
--> this does not work
SELECT Col1 FROM postgres.MySecondTable;
--> Error:
-- [Code: 0, SQL State: 42000] object "POSTGRES"."MYSECONDTABLE" not found [line 1, column 18]
--> this works
SELECT "col1" FROM postgres."mysecondtable";
Unquoted identifiers are converted to lowercase on the PostgreSQL side, and since there is no catalog conversion these identifiers are also lowercase in Exasol. To query a lowercase identifier you must use quotes in Exasol, because everything that is unquoted gets folded to uppercase.
A best practice for this mode is: always quote identifiers (in the PostgreSQL Schema as well as in the Exasol Virtual Schema). This way everything works without having to change your queries.
PostgreSQL Data Type | Supported | Converted Exasol Data Type | Known limitations |
---|---|---|---|
BIGINT | ✓ | DECIMAL(19,0) | |
BIGSERIAL | ✓ | DECIMAL(19,0) | |
BIT | ✓ | BOOLEAN | |
BIT VARYING | ✓ | VARCHAR(5) | |
BOX | ✓ | VARCHAR(2000000) | |
BYTEA | ✓ | VARCHAR(2000000) | |
BOOLEAN | ✓ | BOOLEAN | |
CHARACTER | ✓ | CHAR | |
CHARACTER VARYING | ✓ | VARCHAR | |
CIDR | ✓ | VARCHAR(2000000) | |
CIRCLE | ✓ | VARCHAR(2000000) | |
DATE | ✓ | DATE | |
DOUBLE PRECISION | ✓ | DOUBLE | |
INET | ✓ | VARCHAR(2000000) | |
INTEGER | ✓ | DECIMAL(10,0) | |
INTERVAL | ✓ | VARCHAR(2000000) | |
JSON | ✓ | VARCHAR(2000000) | |
JSONB | ✓ | VARCHAR(2000000) | |
LINE | ✓ | VARCHAR(2000000) | |
LSEG | ✓ | VARCHAR(2000000) | |
MACADDR | ✓ | VARCHAR(2000000) | |
MONEY | ✓ | DOUBLE | |
NUMERIC | ✓ | VARCHAR(2000000) | Stored in Exasol as VARCHAR, because PostgreSQL NUMERIC values can exceed Exasol Decimal limit which makes it impossible to use Virtual Schemas. |
PATH | ✓ | VARCHAR(2000000) | |
POINT | ✓ | VARCHAR(2000000) | |
POLYGON | ✓ | VARCHAR(2000000) | |
REAL | ✓ | DOUBLE | |
SMALLINT | ✓ | DECIMAL(5,0) | |
SMALLSERIAL | ? (untested) | ||
SERIAL | ? (untested) | ||
TEXT | ✓ | VARCHAR(2000000) | |
TIME | ✓ | VARCHAR(2000000) | |
TIME WITH TIME ZONE | ✓ | VARCHAR(2000000) | |
TIMESTAMP | ✓ | TIMESTAMP | |
TIMESTAMP WITH TIME ZONE | ✓ | TIMESTAMP | |
TSQUERY | ✓ | VARCHAR(2000000) | |
TSVECTOR | ✓ | VARCHAR(2000000) | |
UUID | ✓ | VARCHAR(2000000) | |
XML | ✓ | VARCHAR(2000000) |
In the following matrix you find combinations of JDBC driver and dialect version that we tested.
Virtual Schema Version | PostgreSQL Version | Driver Name | Driver Version |
---|---|---|---|
Latest | PostgreSQL 14.2 | PostgreSQL JDBC Driver | 42.4.2 |