PostgreSQL 9.5 IMPORT FOREIGN SCHEMA #10

Closed
robe2 opened this Issue Dec 22, 2014 · 9 comments

pramsey (Owner) commented Dec 23, 2014

Wow, this is pretty much exactly what we needed at the start, reading the table definitions from the server instead of hand-defining them.

davidfetter commented Dec 30, 2014

Indeed. Any ideas as to the scope of this work?

robe2 (Contributor) commented Dec 27, 2015

I've started playing around with this. I don't think it's that much work. A lot of the logic @pramsey has in ogrfdw_info can be repurposed for this, and I've already cut in the IMPORT FOREIGN SCHEMA glue I saw in the postgres_fdw code.

One thing I'm not sure how to handle is IMPORT FOREIGN SCHEMA <remote_schema>; the remote_schema part is required. Two ways to handle this come to mind:

  1. come up with some bogus remote schema name like 'all' that would mean "ignore the schema" and that people would have to specify

  2. just ignore the schema if the data source does not support schemas

For schema checking, I was going to assume that for data sources that have schemas, the table names come through with dots.

For example, when I connect to SQL Server, table names come down as


dbo.Orders
dbo.Products

etc., but I'm not sure whether that holds for all or most OGR data sources, or whether OGR has its own logic for handling schemas. I don't think it does.

The other minor issue is handling the EXCEPT and LIMIT TO clauses of IMPORT FOREIGN SCHEMA.

One benefit of using a specific schema word to mean "ignore schemas" is that if a true schema name is used, we know the EXCEPT / LIMIT TO lists need not include the schema name (whereas in the 'all' case they would have to be fully qualified). A rough sketch of both flavors is below.
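Something along these lines (server and local schema names here are placeholders; none of this is implemented yet):

-- option 1: a bogus remote schema word meaning "ignore the schema";
-- the 'all' from above would need quoting, since ALL is a reserved word
IMPORT FOREIGN SCHEMA "all"
  LIMIT TO ("dbo.Orders", "dbo.Products")  -- fully qualified, since no real schema is named
  FROM SERVER some_server INTO some_local_schema;

-- a true remote schema name: LIMIT TO can then use bare table names
IMPORT FOREIGN SCHEMA dbo
  LIMIT TO ("Orders", "Products")
  FROM SERVER some_server INTO some_local_schema;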

robe2 (Contributor) commented Dec 27, 2015

Okay, getting warm now: I have successfully imported a table with IMPORT FOREIGN SCHEMA. Now to work out some kinks.

It does seem that there is a mechanism for options:

http://www.postgresql.org/docs/9.5/static/sql-importforeignschema.html

So for the first option, I wanted to define a "launder column names" option. If it is not set to true, the column names come in exactly as they are in the remote data source, instead of being forced to lower case and having unsavory names changed. I'm not sure whether the laundering should apply to the table name as well (minus the schema part), or whether we should have a separate option for that.

-- UPDATE -- this is sweet

Just generated 101 foreign tables from an Access database, without having my column and table names laundered, in 861 ms.

robe2 (Contributor) commented Dec 27, 2015

Okay, I decided to go with two separate laundering options, defaulting both to true if not specified.

So I have:

launder_table_names
launder_column_names

So, for example, I use statements like this:

CREATE SERVER svr_legacy_btu
  FOREIGN DATA WRAPPER ogr_fdw
  OPTIONS (datasource 'C:\data\BTUData.mdb', format 'ODBC');

DROP SCHEMA IF EXISTS btu CASCADE;
CREATE SCHEMA btu;

IMPORT FOREIGN SCHEMA public
  FROM SERVER svr_legacy_btu INTO btu OPTIONS (launder_table_names 'false');

Since I set it to false, the casing and weird junk in my table names are preserved.

Similarly if I do this:


DROP SCHEMA IF EXISTS btu CASCADE;
CREATE SCHEMA btu;

IMPORT FOREIGN SCHEMA public
  FROM SERVER svr_legacy_btu
  INTO btu OPTIONS (launder_table_names 'false', launder_column_names 'false');

Both table and column name casing and weird characters are preserved. Now all I have left is to tackle EXCEPT / LIMIT TO and do something with the remote schema, and I think my patch (when I submit it, after @pramsey has committed my 9.5rc1 fix patch) will be feature complete :).

P.S. I also noticed that for some reason I have to use a schema name that exists in my db for the remote schema, which seems pretty lame. Not sure if this is a bug or just a limitation of the existing IMPORT FOREIGN SCHEMA functionality.

I was mistaken about the schema: I was trying to use all for the remote schema, and that's a reserved word. So, user error on that one.
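Note: quoting would at least get it past the parser, since quoted identifiers are never treated as keywords (whether the FDW would then do anything special with it is a separate question):

IMPORT FOREIGN SCHEMA "all"
  FROM SERVER svr_legacy_btu INTO btu;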

robe2 (Contributor) commented Dec 28, 2015

For the remote schema functionality, I decided to go with using the remote_schema as a prefix instead of as a real schema. The reason is that things like WFS could have deeply nested names with periods, and the concept then also becomes useful for data sources that don't have schemas.

For the ignore-schemas option, I decided to go with the name ogr_all for the remote schema.

So for example:

CREATE SERVER fgdbtest
  FOREIGN DATA WRAPPER ogr_fdw
  OPTIONS (
    datasource 'c:/projects/sources/pgsql_ogr_fdw_robe/data/Querying.gdb',
    format 'OpenFileGDB' );

DROP SCHEMA IF EXISTS fgdbq CASCADE;
CREATE SCHEMA fgdbq;

-- will import just the CitiesInfo table (basically all tables starting with CitiesInf) --
IMPORT FOREIGN SCHEMA "CitiesInf"
  FROM SERVER fgdbtest INTO fgdbq;

-- will import all tables --
DROP SCHEMA IF EXISTS fgdbq CASCADE;
CREATE SCHEMA fgdbq;

IMPORT FOREIGN SCHEMA ogr_all
  FROM SERVER fgdbtest INTO fgdbq;

Tested it out and works nicely.
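The laundering options from my earlier comment should combine with this as well; an untested sketch:

-- import everything, preserving the original table and column names
IMPORT FOREIGN SCHEMA ogr_all
  FROM SERVER fgdbtest INTO fgdbq
  OPTIONS (launder_table_names 'false', launder_column_names 'false');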

robe2 (Contributor) commented Dec 29, 2015

Okay, I screwed up again and put the wrong number on my pull request; it should have been linked to this ticket - #47.

robe2 (Contributor) commented Dec 31, 2015

Okay, I confirmed with Tom Lane that EXCEPT / LIMIT TO is handled by the core post-processing, which filters out CREATE FOREIGN TABLE statements that violate the LIMIT TO / EXCEPT lists.

So that explains why it worked as expected when I set launder_table_names 'false'. The driver-specific weirdness was really just about the funkiness of the names.

I think we should still handle LIMIT TO / EXCEPT ourselves first (like postgres_fdw does), just because, depending on the data source, building the full table definitions might be expensive.

I changed the logic I had, and the README, to reflect that LIMIT TO / EXCEPT should NOT use the raw layer names (unless table name laundering is off) but should use the post-laundered names. That does mean users need some idea of how the laundering works, though.
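So, for example (an untested sketch, assuming the CitiesInfo layer launders to citiesinfo):

-- laundering on (the default): use the post-laundered name
IMPORT FOREIGN SCHEMA ogr_all
  LIMIT TO (citiesinfo)
  FROM SERVER fgdbtest INTO fgdbq;

-- table name laundering off: use the raw layer name
IMPORT FOREIGN SCHEMA ogr_all
  LIMIT TO ("CitiesInfo")
  FROM SERVER fgdbtest INTO fgdbq
  OPTIONS (launder_table_names 'false');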

pramsey added a commit that referenced this issue Jan 5, 2016

Merge pull request #47 from robe2/master
#10 PostgreSQL 9.5 IMPORT FOREIGN SCHEMA support
robe2 (Contributor) commented Jan 5, 2016

Don't forget to close this one out.

pramsey closed this Jan 5, 2016
