New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DDL is incomplete #282

Closed
jyutzler opened this Issue Feb 7, 2017 · 15 comments

Comments

Projects
None yet
3 participants
@jyutzler
Contributor

jyutzler commented Feb 7, 2017

I found that GeoPackages in the wild were failing /opt/features/vector_features/data/feature_table_integer_primary_key because the primary key did not have a NOT NULL of 1 as per http://www.geopackage.org/spec/#example_feature_table_cols. (opengeospatial/ets-gpkg10#31) Upon further review it is because they are using our DDL from http://www.geopackage.org/spec/#_sample_feature_table_informative
The DDL needs to be updated to have a NOT NULL in it.

I suspect this problem exists in multiple tables.

@jyutzler

This comment has been minimized.

Contributor

jyutzler commented Feb 8, 2017

One way of addressing this problem is by creating repair scripts as described here. Basically you make a new table with the right schema, copy the data from the original table to the new table, drop the old table, and rename the new one.

kwrobot pushed a commit to aashish24/gdal-svn that referenced this issue Feb 8, 2017

rouault
GPKG: declare feature id column of features tables and tile pyramid u…
…ser data tables as NOT NULL (fixes #6807, relates to ​opengeospatial/geopackage#282)

git-svn-id: https://svn.osgeo.org/gdal/trunk/gdal@37323 f0d54148-0727-0410-94bb-9a71ac55c965

rouault added a commit to OSGeo/gdal that referenced this issue Feb 8, 2017

GPKG: declare feature id column of features tables and tile pyramid u…
…ser data tables as NOT NULL (fixes #6807, relates to ​opengeospatial/geopackage#282)

git-svn-id: https://svn.osgeo.org/gdal/trunk@37323 f0d54148-0727-0410-94bb-9a71ac55c965

rouault added a commit to OSGeo/gdal that referenced this issue Feb 8, 2017

GPKG: declare feature id column of features tables and tile pyramid u…
…ser data tables as NOT NULL (fixes #6807, relates to ​opengeospatial/geopackage#282)

git-svn-id: https://svn.osgeo.org/gdal/branches/2.1@37324 f0d54148-0727-0410-94bb-9a71ac55c965
@jratike80

This comment has been minimized.

jratike80 commented Feb 9, 2017

By reading http://www.sqlite.org/lang_createtable.html#primkeyconst in SQLite the column is automatically NOT NULL if it is primary key AND integer.

According to the SQL standard, PRIMARY KEY should always imply NOT NULL. Unfortunately, due to a bug in some early versions, this is not the case in SQLite. Unless the column is an INTEGER PRIMARY KEY or the table is a WITHOUT ROWID table or the column is declared NOT NULL, SQLite allows NULL values in a PRIMARY KEY column.

I wonder if this change is necessary of if it would be better to edit the test. Execute "PRAGMA table_info(table)", and if id column has type=INTEGER and pk=1 then it is OK.

@rouault

This comment has been minimized.

Contributor

rouault commented Feb 9, 2017

You're right @jratike80 . I read too fast/wrongly the SQLite documentation. What is surprising, and I'd say a sqlite bug, is that if you do "PRAGMA table_info(table_name)", then a column of type INTEGER PRIMARY KEY returns notnull = 0, whereas a column of type INTEGER PRIMARY KEY NOT NULL returns notnull = 1. In both cases, I'd expect notnull = 1 to be returned given SQLite doc.

@jyutzler

This comment has been minimized.

Contributor

jyutzler commented Feb 9, 2017

The test will definitely be weakened for GPKG 1.0/1.1. It isn't pragmatic to make the test fail on nearly every single GPKG in existence.

The SWG will discuss what to do about 1.2 on Monday.

@jratike80

This comment has been minimized.

jratike80 commented Feb 9, 2017

More important than to check if INTEGER PRIMARY KEY AUTOINCREMENT column is NOT NULL (according to documentation it is always NOT NULL), would be to check if column is really AUTOINCREMENT. If it is not the ids of deleted rows may be re-used and that can lead to troubles if tables have foreing key relations.

Unfortunately it does not seem to be straight forward to check if column is AUTOINCREMENT. The "SELECT COUNT(*) FROM sqlite_sequence WHERE name='yourtable';" test that is explained in http://stackoverflow.com/questions/20979239/how-to-tell-if-a-sqlite-column-is-autoincrement works but I confirmed with SQLite version 3.9.2 that the sequence really appears only after inserting data into the table. If table is empty the only place to find this information is propably in the "sql" field of sqlite_master table:

SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'example';

Empty tables which have never contained any data are marginal cases but I suppose that such can exist in GeoPackage.

@jyutzler

This comment has been minimized.

Contributor

jyutzler commented Feb 9, 2017

Yes, I think we have to be prepared for the possibility that someone has generated a GeoPackage template that will be empty until it is populated in the field.

@jyutzler

This comment has been minimized.

Contributor

jyutzler commented Feb 13, 2017

In the SWG today we determined that we will add a separate test to ensure that there are no NULLs in the data. We will also add a test to check to see if the IDs are in fact unique.

@jyutzler jyutzler added this to the 1.2-comment period milestone Feb 14, 2017

@jyutzler

This comment has been minimized.

Contributor

jyutzler commented Feb 15, 2017

@jyutzler

This comment has been minimized.

Contributor

jyutzler commented Feb 27, 2017

The PR #283 was approved by the SWG today. Summary:

  • the test is nerfed for GPKG 1.0/1.1
  • the DDL was updated for GPKG 1.2
  • there is an additional test to confirm that the IDs are unique
  • Appropriate information will be added to the release notes

@jyutzler jyutzler closed this Feb 27, 2017

jyutzler added a commit to jyutzler/geopackage that referenced this issue Mar 31, 2017

jyutzler added a commit to jyutzler/geopackage that referenced this issue Mar 31, 2017

@jyutzler jyutzler referenced this issue Mar 31, 2017

Merged

I282 #331

@jyutzler jyutzler reopened this Mar 31, 2017

@jyutzler

This comment has been minimized.

Contributor

jyutzler commented Mar 31, 2017

Reopening the issue because expecting the provider to add the NOT NULL bit to the DDL is just too annoying since it has no operational effect.

The proposed way ahead is to be a little smart about this possibility in the tests and only fail if there is something actually wrong with the primary key such as NULL or duplicate values existing. This would make the change non-substantive.

opengeospatial/ets-gpkg12@55c3af9 addresses the issue in the tests.

@rouault

This comment has been minimized.

Contributor

rouault commented Mar 31, 2017

Actually I'm thinking about an issue in using PRAGMA table_info(). The issue is with views. In that case PRAGMA table_info() desn't know how to identify the primary key. This is actually an issue @jratike80 raised on the GDAL mailing list. Contrary to Spatialite that has a system table to indicate which column is the primary key for views, GeoPackage has not this.

See

$ sqlite3
SQLite version 3.11.0 2016-02-15 17:29:24
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> create table my_table (ID INTEGER PRIMARY KEY AUTOINCREMENT);
sqlite> pragma table_info(my_table);
0|ID|INTEGER|0||1
sqlite> create view my_view AS SELECT ID FROM my_table;
sqlite> pragma table_info(my_view);
0|ID|INTEGER|0||0
@jyutzler

This comment has been minimized.

Contributor

jyutzler commented Mar 31, 2017

Of course system tables can get out of sync with the rest of the database so I'm not sure that is a solution.
If you have a proposal for an abstract test that will work with views I will implement it in the ETS.

@rouault

This comment has been minimized.

Contributor

rouault commented Mar 31, 2017

If you have a proposal for an abstract test that will work with views I will implement it in the ETS

Unfortunately what I found in my researchs show that it is pretty hard, and you need to analyze at hand the DDL of the CREATE VIEW. In the general case, a view has no primary key if there are joins, etc...

@jratike80

This comment has been minimized.

jratike80 commented Apr 1, 2017

Views and primary keys are discussed also in #207

There is a concrete and common use case: FID of the table from where the geometry is selected into the view should be found in order to utilize the spatial index (R-Tree) of the main table.
Addidional remark: Spatial index of the main table must not always be utilized:
CREATE VIEW "min_x_view" AS SELECT ST_MinX(geom).
However, it could be utilised even in some other cases than pure SELECT GEOMETRY, for example with some imaginary simplification extension: SELECT ST_Simplify(geom,10) could well utilize the R-Tree of the main table.

Should we create a new issue for dealing with spatial views and spatial indexes?

@jyutzler

This comment has been minimized.

Contributor

jyutzler commented Apr 3, 2017

@jratike80 Yes, I'd like that to be a separate issue.

Re-closing this issue. Note that I have updated the change notes to demote this to an administrative change now that we have resolved it in such a way that it has no operational effect. jyutzler/ogc16-126@8ac9ab1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment