Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Workaround for PostgreSQL inconsistent bytea encoding requirements. #594

Closed
wants to merge 1 commit into from

Conversation

demiankatz
Copy link
Member

It appears that bytea fields are handled differently in PostgreSQL 8 vs. 9. The result is that running VuFind on PostgreSQL 9 throws a fatal error when a user performs a search containing a backslash. This PR works around this problem by base64-encoding the serialized search object when PostgreSQL is being used to avoid any dangerous characters and flatten out differences between versions.

This is obviously a hack -- see #599 for a better solution.

@jochen-lienhard
Copy link
Contributor

Still the same problem:

Here more error message, I had forgotten in the last mail.

I think the problem starts with postgres version 9.3

Previous exceptions:
PDOException

SQLSTATE[22P02]: Invalid text representation: 7 ERROR: invalid input syntax for type bytea

#0 /opt/rdsui/master/vendor/zendframework/zendframework/library/Zend/Db/Adapter/Driver/Pdo/Statement.php(239): PDOStatement->execute()
#1 /opt/rdsui/master/vendor/zendframework/zendframework/library/Zend/Db/TableGateway/AbstractTableGateway.php(307): Zend\Db\Adapter\Driver\Pdo\Statement->execute()
#2 /opt/rdsui/master/vendor/zendframework/zendframework/library/Zend/Db/TableGateway/AbstractTableGateway.php(263): Zend\Db\TableGateway\AbstractTableGateway->executeInsert(Object(Zend\Db\Sql\Insert))
#3 /opt/rdsui/master/module/VuFind/src/VuFind/Db/Table/Search.php(192): Zend\Db\TableGateway\AbstractTableGateway->insert(Array)
#4 /opt/rdsui/master/module/VuFind/src/VuFind/Controller/AbstractSearch.php(398): VuFind\Db\Table\Search->saveSearch(Object(VuFind\Search\Results\PluginManager), Object(VuFind\Search\Solr\Results), 'pfrhmne1a1fvm0q...', Object(Zend\Db\ResultSet\ResultSet))
#5 /opt/rdsui/master/module/VuFind/src/VuFind/Controller/AbstractSearch.php(305): VuFind\Controller\AbstractSearch->saveSearchToHistory(Object(VuFind\Search\Solr\Results))
#6 /opt/rdsui/master/module/VuFind/src/VuFind/Controller/SearchController.php(510): VuFind\Controller\AbstractSearch->resultsAction()
#7 /opt/rdsui/master/vendor/zendframework/zendframework/library/Zend/Mvc/Controller/AbstractActionController.php(82): VuFind\Controller\SearchController->resultsAction()
#8 [internal function]: Zend\Mvc\Controller\AbstractActionController->onDispatch(Object(Zend\Mvc\MvcEvent))
#9 /opt/rdsui/master/vendor/zendframework/zendframework/library/Zend/EventManager/EventManager.php(444): call_user_func(Array, Object(Zend\Mvc\MvcEvent))
#10 /opt/rdsui/master/vendor/zendframework/zendframework/library/Zend/EventManager/EventManager.php(205): Zend\EventManager\EventManager->triggerListeners('dispatch', Object(Zend\Mvc\MvcEvent), Object(Closure))
#11 /opt/rdsui/master/vendor/zendframework/zendframework/library/Zend/Mvc/Controller/AbstractController.php(118): Zend\EventManager\EventManager->trigger('dispatch', Object(Zend\Mvc\MvcEvent), Object(Closure))
#12 /opt/rdsui/master/vendor/zendframework/zendframework/library/Zend/Mvc/DispatchListener.php(93): Zend\Mvc\Controller\AbstractController->dispatch(Object(Zend\Http\PhpEnvironment\Request), Object(Zend\Http\PhpEnvironment\Response))
#13 [internal function]: Zend\Mvc\DispatchListener->onDispatch(Object(Zend\Mvc\MvcEvent))
#14 /opt/rdsui/master/vendor/zendframework/zendframework/library/Zend/EventManager/EventManager.php(444): call_user_func(Array, Object(Zend\Mvc\MvcEvent))
#15 /opt/rdsui/master/vendor/zendframework/zendframework/library/Zend/EventManager/EventManager.php(205): Zend\EventManager\EventManager->triggerListeners('dispatch', Object(Zend\Mvc\MvcEvent), Object(Closure))
#16 /opt/rdsui/master/vendor/zendframework/zendframework/library/Zend/Mvc/Application.php(314): Zend\EventManager\EventManager->trigger('dispatch', Object(Zend\Mvc\MvcEvent), Object(Closure))
#17 /opt/rdsui/master/public/index.php(81): Zend\Mvc\Application->run()
#18 {main}

@demiankatz
Copy link
Member Author

That's strange. I'm running PostgreSQL 9.3 as well, and this branch seems to prevent that problem. Did you start with a clean database and checkout of this branch? Is it possible that old data in an existing database is contributing to the problem? If you perform a non-fatal search, can you confirm that the bytea field is being correctly populated with base64-encoded data (instead of plain PHP serialized text) when using this branch?

@jochen-lienhard
Copy link
Contributor

I will look for the problem in the next days.

@jochen-lienhard
Copy link
Contributor

It works. Sorry, it seems my master branch was not up to date.

Do you have an idea, what I must do to use this solution in release 2.5.x?

Why don't you remove the 'POSTGRES' check and use always the base64 encode.
Do you think the database will increase to much?

@demiankatz
Copy link
Member Author

Glad to hear that it's working for you. It should be possible to apply the diffs from this patch directly to release 2.5.x and have it work -- the workaround is fairly self-contained. However, it's probably a good idea to make sure we have a consensus on the approach before pushing it to production, simply because once these things are in the database, they'll have to be supported indefinitely (or converted by some automatic reformatting process).

There are a few reasons I included the POSTGRES check:

  • Because this is a workaround hack rather than something I consider a good design, I prefer to use it as little as possible and affect the smallest number of users. That way if something changes in the future, we have a smaller number of affected installations that might need to take some action to upgrade.
  • The base64 encoding/decoding introduces a performance hit and, as you say, makes the data take up more space in the database. It also reduces human readability in cases where someone is inspecting the database for troubleshooting. All around, it's not something we should do unless we really have to.

I'm still open to the possibility of a more targeted, more elegant solution to this -- perhaps a less intrusive escaping mechanism than base64. However, base64 is a good way to ensure that there aren't any edge cases we're failing to anticipate -- it reduces every possible scenario to safe characters.

I hope to discuss this in a little more detail on the next developers call... but if your users are being negatively impacted in the meantime, as I say, you should be able to just patch your database class manually. If we come up with a different solution as the result of later discussion, it should be possible to manually correct any saved entries beginning with B64__ (and I don't imagine saved searches are very common, so it may not matter much at all).

@jochen-lienhard
Copy link
Contributor

We are just in the test phase with vufind 2, so there is no problem for our users at the moment.

@berndoberknapp
Copy link

Writing bytea data requires escaping, but my understanding is that PDO should escape the data automatically when a prepared query is used (which is the case according to the PostgreSQL logs). I'm not familiar with the ZF Db layer but writing bytea data seems to be supported with ParameterContainer::TYPE_LOB. I couldn't find a reference to TYPE_LOB or 'lob' in the VuFind code, so maybe we have to tell the ZF that search_object is a LOB?

@demiankatz
Copy link
Member Author

I think you may be correct, but I'm not sure how we can access the ParameterContainer object from the scope of a TableGateway object in order to specify the appropriate parameter. The ZF2 documentation is weak on this point, and the code is a maze! Perhaps one option would be to add the EventFeature to the TableGateway and then hook the preInsert action in order to add appropriate parameters to the Insert object... but this seems overly complicated. It seems like there ought to be a way to simply set default parameters on a table-by-table basis, but I can't find anything that simple or straightforward. I'll look into this a little more later when I have more time.

@demiankatz
Copy link
Member Author

I'm not entirely happy with its complexity, but I believe I have found a true fix to the problem as opposed to just a workaround. See #599. I'm closing this PR, since I believe the new one works along more appropriate general lines. Thanks, @berndoberknapp, for helping to point me in the right direction!

@demiankatz demiankatz closed this Feb 19, 2016
@demiankatz demiankatz deleted the bytea-workaround branch May 4, 2016 19:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants