Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP

Loading…

Warning about invalid entry names #469

Open
wants to merge 1 commit into from

4 participants

@olivierdalang

Hi !

When using QgsProject.instance().writeEntry(scope, key, value), it's really easy to make the whole project file unreadable.

This happens because scope and keys are used as XML elements names in the project file.

In case a plugin developer uses user input as the key, he may not notice that he's endangering the user's files.

The commit sanitize scope and keys to make sure the project's XML remains valid whatever scope/key is used. The regexp are done using this : http://www.w3.org/TR/REC-xml/#NT-NameStartChar

http://hub.qgis.org/issues/7371 can be closed upon merge...

Thanks !

Olivier

@mhugent mhugent was assigned
@olivierdalang

Will this be merged for 2.0 ? (I'm currently working on a plugin that could rely on this)

@mhugent
Owner

Denis commited something similar in fe7f26f
That lets Qt remove the invalid characters from the xml names. Is this sufficient for your usage?

@olivierdalang

Hi ! Sorry for the delay answering to this.

I just tested fe7f26f and it does indeed solve the problem of having an invalid XML file.

But still some strange behaviour can occur with the writeEntry function :

QgsProject.instance().writeEntry('testingWriteEntry','abc','abc')  #works
QgsProject.instance().writeEntry('testingWriteEntry','123','123') #not written to file at all (since an XML tag can't start with a number)
QgsProject.instance().writeEntry('testingWriteEntry','def','def') #works too
QgsProject.instance().writeEntry('testingWriteEntry','def','def again') #works (overwrites def)
QgsProject.instance().writeEntry('testingWriteEntry','d e f','d e f') #bugs (append another def node, since spaces are dropped from xml tags)

This can be a problem, since a plugin developer may ignore that the entry keys are directly used in the XML file or ignore what characters are allowed as XML tags.

Also, in master, the entries are always stored in memory, but may not be written to file (or with an altered tag). So when testing, it can seem to work, but when saving and reloading the project, some entries can be lost or altered.
Those problems are hard to spot and hard to debug.

A scenario would be a developer using user input as a writeEntry key (for instance if one wants to store several user-named settings) : he may not think of sanitizing the input, and get those annoying bugs.

I don't know if this should be considered a bug or not. Maybe a notice in the documentation would be enough ? Or a message in the console when an illegal name is used ?

But if it's a bug, then my pull request still makes sense.

Since the sanitation is made in the makeKeyTokens method (which is used at read and write), the alteration of the name won't be as problematic as in master (where the tags name are altered at writing to file only).

Thinking of it, in my pull request, the problem is only partially solved : if a character is invalid, it replaces it with an underscore, which doesn't fix similar keys with invalid characters being different.

So this solves

QgsProject.instance().writeEntry('testingWriteEntry','123','123') #stored as _123

but not

QgsProject.instance().writeEntry('testingWriteEntry','a b c','abc') #stored as a_b_c
QgsProject.instance().writeEntry('testingWriteEntry','a<b_c','abc') #also stored as a_b_c (overwrite)

(so for that latter case, it may be a good idea to modify the pull request to "encode" illegal characters, rather than just replace them with an underscore)

What do you think ?

@olivierdalang

So what do you think ? is it worth working on that (resolving that latter case) ?
If so, does someone know what string encoding method could fit this need ?

@mhugent
Owner

Sorry for the long delay on my side. Maybe it is better to simply reject writing keys with invalid xml names (and say so in the API docs) rather than automatically fix it. Automatically fixing might surprise the user and lead to bugs which are hard to track down.

@m-kuhn
Collaborator

I support @mhugent. I cannot see a good reason why arbitrary keys are allowed, but potential problems with colliding keys. Rejecting (with an error message) is more stable and sounds absolutely valid.

src/core/qgsproject.cpp
@@ -64,7 +64,28 @@ QStringList makeKeyTokens_( QString const &scope, QString const &key )
// be sure to include the canonical root node
keyTokens.push_front( "properties" );
- return keyTokens;
+ //sanitize keys since an unvalid xml name will make the project file unreadable !
+ QStringList sanitizedKeyTokens = QStringList();
+ for (int i = 0; i < keyTokens.size(); ++i){
+ QString sanitizedKey = keyTokens.at(i);
+
+ //replace invalid chars in XML document ( http://www.w3.org/TR/REC-xml/#NT-NameChar )
+ //note : it seems this should also keep \x10000-\xEFFFF but then a lot of unwanted chars remain...
+ QString nameCharRegexp = QString( "[^:A-Z_a-z\\xC0-\\xD6\\xD8-\\xF6\\xF8-\\x2FF\\x370-\\x37D\\x37F-\\x1FFF\\x200C-\\x200D\\x2070-\\x218F\\x2C00-\\x2FEF\\x3001-\\xD7FF\\xF900-\\xFDCF\\xFDF0-\\xFFFD\\-\\.0-9\\xB7\\x0300-\\x036F\\x203F-\\x2040]" );
+ QString nameStartCharRegexp = QString( "^[^:A-Z_a-z\\xC0-\\xD6\\xD8-\\xF6\\xF8-\\x2FF\\x370-\\x37D\\x37F-\\x1FFF\\x200C-\\x200D\\x2070-\\x218F\\x2C00-\\x2FEF\\x3001-\\xD7FF\\xF900-\\xFDCF\\xFDF0-\\xFFFD]" );
+
+ if( sanitizedKey.contains( QRegExp(nameCharRegexp) ) || sanitizedKey.contains( QRegExp(nameStartCharRegexp) ) ){
+
+ //TODO :
+ // >>> DISPLAY SOME KIND OF MESSAGE <<<

Okay for the rejection + error message...

I made the changes and the pull request is almost ready excepted that I did not manage to display a message here...

I tried with QgsDebugMsg(), but the console didn't print anything...

Can someone help ?

@m-kuhn Collaborator
m-kuhn added a note

Strange that it does not work for you. QgsDebugMsg() reliably sends all the output to stderr according to my experience.

However, I assume this message will be interesting also for people developing plugins?
You could use the QgsMessageLog which will also be available in release versions, so plugin devs using release versions also get the messages.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@olivierdalang

Ok it seems to work with QgsMessageLog !
So now, when an invalid key is used, it logs a message and outputs a debug message as well.

I also updated the docstrings to briefly explain the key's constraints (as well as removed some obsolete notes).

I'd say the pull request is ready to merge, but please review it with some attention since I'm still not very comfortable with C++.

@3nids
Collaborator

@olivierdalang can you maybe rebase this, sorry for not reviewing this?

@olivierdalang olivierdalang check validity of keys used by writeEntry and readEntry
and print a message in the console in case of invalid key

update of docstrings (obsolete note removed, note added)
313fe6a
@olivierdalang

@3nids ok, done.

Now, a message is printed to the console in case the keys are not valid, but the behaviour is not changed : the key is still set and dropped on file save. So it's more a warning than a sanitation...

I also removed the note The key string <em>must</em> include '/'s. E.g., "/foo" not "foo". since it seem to be obsolete.

Again: I'd say the pull request is ready to merge, but please review it with some attention since I'm still not very comfortable with C++.

@olivierdalang olivierdalang changed the title from Sanitize entry names to avoid make project unreadable to Warning about invalid entry names
@3nids
Collaborator

@m-kuhn or @mhugent can one of you two review this?

@m-kuhn
Collaborator

Hi Olivier,

would it be ok for you to write some small tests? I think it will be very straightforward for this (some keys that are rejected, some that are not). The regex looks quite cryptic to just review like that :-)

I also wonder if you think that this should be merged for 2.4 still? I am a bit afraid that there is a possibility to render working plugins unusable and the remaining time for bugfixing gets shorter. So maybe this should rather be treated as a feature for 2.6. What's your opinion, @olivierdalang @mhugent ?

@olivierdalang

Hi !

I'll try to write some tests in the following days (never did yet, but it's indeed probably a good opportunity to learn).

About the merge, it shouldn't make plugins unusable : the latest version of the PR only prints a message to the console, it doesn't reject the key (since they work, they are just not saved to file).

So, for the merge, it's up to you, but I think at least we should merge the update of the docstring to explain that the key must be valid xml tags, and not just any string.

The regexp is very cryptic (hehe, it's a regexp), but it simply searches for characters that are not in the allowed range defined on http://www.w3.org/TR/REC-xml/#NT-NameChar and http://www.w3.org/TR/REC-xml/#NT-NameStartChar (the latter being only for the first character).

@m-kuhn
Collaborator

@olivierdalang
This is our oldest pull request, would be nice to merge it :)

Do you think you can handle writing a test for it?

@olivierdalang

@m-kuhn
Hehe it seems I should have had some time to work on those tests since the 4th of june...

If you want to merge soon, maybe the best would be to find someone to write these tests (I guess it's easy for someone who knows how it works), if not, I still have making those tests on my long term todo list (I should be able to do this before 2015).

@m-kuhn
Collaborator

Yes, I think it should be trivial. Some known good and known bad entries can be written and the result of makeKeyTokens be checked.
Looking forward to this :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Commits on Jun 1, 2014
  1. @olivierdalang

    check validity of keys used by writeEntry and readEntry

    olivierdalang authored
    and print a message in the console in case of invalid key
    
    update of docstrings (obsolete note removed, note added)
This page is out of date. Refresh to see the latest.
View
4 python/core/qgsproject.sip
@@ -175,7 +175,7 @@ class QgsProject : QObject
keys would be the familiar QSettings-like '/' delimited entries, implying
a hierarchy of keys and corresponding values
- @note The key string <em>must</em> include '/'s. E.g., "/foo" not "foo".
+ @note The key string must be valid xml tag names in order to be saved to the file.
*/
//@{
//! @note available in python bindings as writeEntryBool
@@ -192,8 +192,6 @@ class QgsProject : QObject
keys would be the familiar QSettings-like '/' delimited entries,
implying a hierarchy of keys and corresponding values
-
- @note The key string <em>must</em> include '/'s. E.g., "/foo" not "foo".
*/
//@{
QStringList readListEntry( const QString & scope, const QString & key, QStringList def = QStringList(), bool *ok = 0 ) const;
View
19 src/core/qgsproject.cpp
@@ -27,6 +27,7 @@
#include "qgslayertreeregistrybridge.h"
#include "qgslogger.h"
#include "qgsmaplayerregistry.h"
+#include "qgsmessagelog.h"
#include "qgspluginlayer.h"
#include "qgspluginlayerregistry.h"
#include "qgsprojectfiletransform.h"
@@ -67,6 +68,24 @@ QStringList makeKeyTokens_( QString const &scope, QString const &key )
// be sure to include the canonical root node
keyTokens.push_front( "properties" );
+ //check validy of keys since an unvalid xml name will will be dropped upon saving the xml file. If not valid, we print a message to the console.
+ for (int i = 0; i < keyTokens.size(); ++i){
+ QString keyToken = keyTokens.at(i);
+
+ //invalid chars in XML are found at http://www.w3.org/TR/REC-xml/#NT-NameChar
+ //note : it seems \x10000-\xEFFFF is valid, but it when added to the regexp, a lot of unwanted characters remain
+ QString nameCharRegexp = QString( "[^:A-Z_a-z\\xC0-\\xD6\\xD8-\\xF6\\xF8-\\x2FF\\x370-\\x37D\\x37F-\\x1FFF\\x200C-\\x200D\\x2070-\\x218F\\x2C00-\\x2FEF\\x3001-\\xD7FF\\xF900-\\xFDCF\\xFDF0-\\xFFFD\\-\\.0-9\\xB7\\x0300-\\x036F\\x203F-\\x2040]" );
+ QString nameStartCharRegexp = QString( "^[^:A-Z_a-z\\xC0-\\xD6\\xD8-\\xF6\\xF8-\\x2FF\\x370-\\x37D\\x37F-\\x1FFF\\x200C-\\x200D\\x2070-\\x218F\\x2C00-\\x2FEF\\x3001-\\xD7FF\\xF900-\\xFDCF\\xFDF0-\\xFFFD]" );
+
+ if( keyToken.contains( QRegExp(nameCharRegexp) ) || keyToken.contains( QRegExp(nameStartCharRegexp) ) ){
+
+ QString errorString = QObject::tr("Entry token invalid : '%1'. The token will not be saved to file.").arg(keyToken);
+ QgsMessageLog::logMessage( errorString, QString::null, QgsMessageLog::CRITICAL );
+
+ }
+
+ }
+
return keyTokens;
} // makeKeyTokens_
View
4 src/core/qgsproject.h
@@ -220,7 +220,7 @@ class CORE_EXPORT QgsProject : public QObject
keys would be the familiar QSettings-like '/' delimited entries, implying
a hierarchy of keys and corresponding values
- @note The key string <em>must</em> include '/'s. E.g., "/foo" not "foo".
+ @note The key string must be valid xml tag names in order to be saved to the file.
*/
//@{
//! @note available in python bindings as writeEntryBool
@@ -237,8 +237,6 @@ class CORE_EXPORT QgsProject : public QObject
keys would be the familiar QSettings-like '/' delimited entries,
implying a hierarchy of keys and corresponding values
-
- @note The key string <em>must</em> include '/'s. E.g., "/foo" not "foo".
*/
//@{
QStringList readListEntry( const QString & scope, const QString & key, QStringList def = QStringList(), bool *ok = 0 ) const;
Something went wrong with that request. Please try again.