Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix GPKG and Spatialite UNIQUE and NOT NULL constraints detection #36802

Merged
merged 6 commits into from
May 30, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions python/core/auto_generated/qgssqliteutils.sip.in
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,8 @@ Returns a string list of SQLite (and spatialite) system tables

.. versionadded:: 3.8
%End


};

/************************************************************************
Expand Down
31 changes: 30 additions & 1 deletion src/core/providers/ogr/qgsogrprovider.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,6 @@ email : sherman at mrcc.com
#include "qgsprovidermetadata.h"
#include "qgsogrdbconnection.h"
#include "qgsgeopackageproviderconnection.h"

#include "qgis.h"


Expand All @@ -56,6 +55,11 @@ email : sherman at mrcc.com
#include <ogr_srs_api.h>
#include <cpl_string.h>

// Temporary solution until GDAL Unique suppport is available
#include "qgssqliteutils.h"
#include <sqlite3.h>
// end temporary

#include <limits>
#include <memory>

Expand Down Expand Up @@ -1099,6 +1103,24 @@ void QgsOgrProvider::loadFields()
mFirstFieldIsFid = !fidColumn.isEmpty() &&
fdef.GetFieldIndex( fidColumn ) < 0;

// This is a temporary solution until GDAL Unique support is available
QSet<QString> uniqueFieldNames;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you considering to have a GDAL >= 3.2 path that uses the GDAL API (in this PR or a follow-up once OSGeo/gdal#2622 is merged) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rouault yes, that was the idea.
But I'm now thinking to remove the OGR part from this PR leaving spatialite (because it fixes another bug about NOT NULL) and wait for GDAL API to be ready, there is failing test about number of connections, https://travis-ci.org/github/qgis/QGIS/builds/692197527#L2175
I don't get exactly why it happens, because the solution in the patch does actually open an additional connection but it should be closed immediately thanks to the sqlite3_database_unique_ptr.

Anyway I've already spent faaaaar too much time on this issue and I don't think I can spend more.

I also tried a different implementation where I pass the sqlite3 handle to the uniqueValues method (and make it static) but it seems there is no way to retrieve it from the OGR layer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes the failure is mysterious... And there is indeed no way to get the sqlite3 handle from a GDALDataset.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rouault I've checked and the sqlite3_database_unique_ptr deleter actually called sqlite3_close_v2, how do you feel about changing the test expected value?

I can add a note that it is a temporary solution until we use the GDAL implementation.

I'd really like to avoid putting yet another sql parser inside the OGR provider.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do you feel about changing the test expected value?

I can add a note that it is a temporary solution until we use the GDAL implementation.

yeah, we can do that. Odd though



if ( mGDALDriverName == QLatin1String( "GPKG" ) )
{
sqlite3_database_unique_ptr dsPtr;
if ( dsPtr.open_v2( mFilePath, SQLITE_OPEN_READONLY, nullptr ) == SQLITE_OK )
{
QString errMsg;
uniqueFieldNames = QgsSqliteUtils::uniqueFields( dsPtr.get(), mOgrLayer->name(), errMsg );
if ( ! errMsg.isEmpty() )
{
QgsMessageLog::logMessage( tr( "GPKG error searching for unique constraints on fields for table %1" ).arg( QString( mOgrLayer->name() ) ), tr( "OGR" ) );
}
}
}

int createdFields = 0;
if ( mFirstFieldIsFid )
{
Expand Down Expand Up @@ -1233,6 +1255,13 @@ void QgsOgrProvider::loadFields()
newField.setConstraints( constraints );
}

if ( uniqueFieldNames.contains( OGR_Fld_GetNameRef( fldDef ) ) )
{
QgsFieldConstraints constraints = newField.constraints();
constraints.setConstraint( QgsFieldConstraints::ConstraintUnique, QgsFieldConstraints::ConstraintOriginProvider );
newField.setConstraints( constraints );
}

// check if field has default value
QString defaultValue = textEncoding()->toUnicode( OGR_Fld_GetDefault( fldDef ) );
if ( !defaultValue.isEmpty() && !OGR_Fld_IsDefaultDriverSpecific( fldDef ) )
Expand Down
87 changes: 87 additions & 0 deletions src/core/qgssqliteutils.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,13 @@
#include <sqlite3.h>
#include <cstdarg>
#include <QVariant>
#include <QSet>

// Temporary solution until GDAL Unique suppport is available
#include <regex>
#include <sstream>
#include <algorithm>
// end temporary

void QgsSqlite3Closer::operator()( sqlite3 *database )
{
Expand Down Expand Up @@ -114,6 +121,86 @@ int sqlite3_database_unique_ptr::exec( const QString &sql, QString &errorMessage
return ret;
}

QSet<QString> QgsSqliteUtils::uniqueFields( sqlite3 *connection, const QString &tableName, QString &errorMessage )
{
QSet<QString> uniqueFieldsResults;
char *zErrMsg = 0;
std::vector<std::string> rows;
QString sql = sqlite3_mprintf( "select sql from sqlite_master where type='table' and name=%q", quotedIdentifier( tableName ).toStdString().c_str() );
auto cb = [ ](
void *data /* Data provided in the 4th argument of sqlite3_exec() */,
int /* The number of columns in row */,
char **argv /* An array of strings representing fields in the row */,
char ** /* An array of strings representing column names */ ) -> int
{
static_cast<std::vector<std::string>*>( data )->push_back( argv[0] );
return 0;
};

int rc = sqlite3_exec( connection, sql.toUtf8(), cb, ( void * )&rows, &zErrMsg );
if ( rc != SQLITE_OK )
{
errorMessage = zErrMsg;
sqlite3_free( zErrMsg );
return uniqueFieldsResults;
}

// Match identifiers with " or ` or no delimiter (and no spaces).
std::smatch uniqueFieldMatch;
static const std::regex sFieldIdentifierRe { R"raw(\s*(["`]([^"`]+)["`])|(([^\s]+)\s).*)raw" };
for ( auto tableDefinition : rows )
{
tableDefinition = tableDefinition.substr( tableDefinition.find( '(' ), tableDefinition.rfind( ')' ) );
std::stringstream tableDefinitionStream { tableDefinition };
while ( tableDefinitionStream.good() )
{
std::string fieldStr;
std::getline( tableDefinitionStream, fieldStr, ',' );
std::string upperCaseFieldStr { fieldStr };
std::transform( upperCaseFieldStr.begin(), upperCaseFieldStr.end(), upperCaseFieldStr.begin(), ::toupper );
if ( upperCaseFieldStr.find( "UNIQUE" ) != std::string::npos )
{
if ( std::regex_search( fieldStr, uniqueFieldMatch, sFieldIdentifierRe ) )
{
const std::string quoted { uniqueFieldMatch.str( 2 ) };
uniqueFieldsResults.insert( QString::fromStdString( quoted.length() ? quoted : uniqueFieldMatch.str( 4 ) ) );
}
}
}
}
rows.clear();

// Search indexes:
sql = sqlite3_mprintf( "SELECT sql FROM sqlite_master WHERE type='index' AND"
" tbl_name='%q' AND sql LIKE 'CREATE UNIQUE INDEX%%'" );
rc = sqlite3_exec( connection, sql.toUtf8(), cb, ( void * )&rows, &zErrMsg );
if ( rc != SQLITE_OK )
{
errorMessage = zErrMsg;
sqlite3_free( zErrMsg );
return uniqueFieldsResults;
}

if ( rows.size() > 0 )
{
static const std::regex sFieldIndexIdentifierRe { R"raw(\(\s*[`"]?([^",`\)]+)["`]?\s*\))raw" };
for ( auto indexDefinition : rows )
{
std::string upperCaseIndexDefinition { indexDefinition };
std::transform( upperCaseIndexDefinition.begin(), upperCaseIndexDefinition.end(), upperCaseIndexDefinition.begin(), ::toupper );
if ( upperCaseIndexDefinition.find( "UNIQUE" ) != std::string::npos )
{
indexDefinition = indexDefinition.substr( indexDefinition.find( '(' ), indexDefinition.rfind( ')' ) );
if ( std::regex_search( indexDefinition, uniqueFieldMatch, sFieldIndexIdentifierRe ) )
{
uniqueFieldsResults.insert( QString::fromStdString( uniqueFieldMatch.str( 1 ) ) );
}
}
}
}
return uniqueFieldsResults;
}

QString QgsSqliteUtils::quotedString( const QString &value )
{
if ( value.isNull() )
Expand Down
13 changes: 13 additions & 0 deletions src/core/qgssqliteutils.h
Original file line number Diff line number Diff line change
Expand Up @@ -153,6 +153,7 @@ class CORE_EXPORT sqlite3_database_unique_ptr : public std::unique_ptr< sqlite3,
* \since QGIS 3.6
*/
int exec( const QString &sql, QString &errorMessage SIP_OUT ) const;

};

/**
Expand Down Expand Up @@ -200,6 +201,18 @@ class CORE_EXPORT QgsSqliteUtils
* \since QGIS 3.8
*/
static QStringList systemTables();

/**
* Returns a list of field names for \a connection and \a tableName having a UNIQUE constraint,
* fields that are part of a UNIQUE constraint that spans over multiple fields
* are not returned.
* \note the implementation is the same of GDAL but the test coverage is much
* better in GDAL.
* \since QGIS 3.14
* \note not available in Python bindings
*/
static QSet<QString> uniqueFields( sqlite3 *connection, const QString &tableName, QString &errorMessage ) SIP_SKIP;

};

#endif // QGSSQLITEUTILS_H
34 changes: 29 additions & 5 deletions src/providers/spatialite/qgsspatialiteprovider.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -893,7 +893,7 @@ void QgsSpatiaLiteProvider::fetchConstraints()
char **results = nullptr;
char *errMsg = nullptr;

// this is not particularly robust but unfortunately sqlite offers no way to check directly
// this is not robust but unfortunately sqlite offers no way to check directly
// for the presence of constraints on a field (only indexes, but not all constraints are indexes)
QString sql = QStringLiteral( "SELECT sql FROM sqlite_master WHERE type='table' AND name=%1" ).arg( QgsSqliteUtils::quotedIdentifier( mTableName ) );
int columns = 0;
Expand All @@ -910,23 +910,47 @@ void QgsSpatiaLiteProvider::fetchConstraints()
;
else
{
// Use the same logic implemented in GDAL for GPKG
QSet<QString> uniqueFieldNames;
{
QString errMsg;
uniqueFieldNames = QgsSqliteUtils::uniqueFields( mSqliteHandle, mTableName, errMsg );
if ( ! errMsg.isEmpty() )
{
QgsMessageLog::logMessage( tr( "Error searching for unique constraints on fields for table %1" ).arg( mTableName ), tr( "spatialite" ) );
}
}

QString sqlDef = QString::fromUtf8( results[ 1 ] );
// extract definition
QRegularExpression re( QStringLiteral( "\\((.*)\\)" ) );
QRegularExpression re( QStringLiteral( R"raw(\((.*)\))raw" ) );
QRegularExpressionMatch match = re.match( sqlDef );
if ( match.hasMatch() )
{
const QString matched = match.captured( 1 );
for ( auto &field : matched.split( ',' ) )
{
field = field.trimmed();
QString fieldName = field.left( field.indexOf( ' ' ) );
QString definition = field.mid( field.indexOf( ' ' ) + 1 );
QString fieldName;
QString definition;
const QChar delimiter { field.at( 0 ) };
if ( delimiter == '"' || delimiter == '`' )
{
const int start { field.indexOf( delimiter ) + 1};
const int end { field.indexOf( delimiter, start ) };
fieldName = field.mid( start, end - start );
definition = field.mid( end + 1 );
}
else
{
fieldName = field.left( field.indexOf( ' ' ) );
definition = field.mid( field.indexOf( ' ' ) + 1 );
}
int fieldIdx = mAttributeFields.lookupField( fieldName );
if ( fieldIdx >= 0 )
{
QgsFieldConstraints constraints = mAttributeFields.at( fieldIdx ).constraints();
if ( definition.contains( QLatin1String( "unique" ), Qt::CaseInsensitive ) || definition.contains( QLatin1String( "primary key" ), Qt::CaseInsensitive ) )
if ( uniqueFieldNames.contains( fieldName ) || definition.contains( QLatin1String( "primary key" ), Qt::CaseInsensitive ) )
constraints.setConstraint( QgsFieldConstraints::ConstraintUnique, QgsFieldConstraints::ConstraintOriginProvider );
if ( definition.contains( QLatin1String( "not null" ), Qt::CaseInsensitive ) || definition.contains( QLatin1String( "primary key" ), Qt::CaseInsensitive ) )
constraints.setConstraint( QgsFieldConstraints::ConstraintNotNull, QgsFieldConstraints::ConstraintOriginProvider );
Expand Down
Loading