-
-
Notifications
You must be signed in to change notification settings - Fork 2.7k
Provide Postgresql support for Piwik #500
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
See ticket #425. |
Thank you for the suggestion! Piwik officially supports MySQL only for now. |
Let’s go here : http://github.com/klando/pgpiwik/tree/master If you are interesting, then : git clone git://github.com/klando/pgpiwik.git |
too bad. I would have loved to see this happen. I would say that people that would use postgres would rival that of mysql. |
Do I interpret right that you expect to have more tickets opened to request postgres? :-) I usually don’t even bother to check a mysql-only project, let alone open a bug about it in their project, so your assumption about the size of the postgresql userbase may be extremely skewed. Apart from letting you know that many people use db other than mysql it is a bit more important to note that even php makes it possible to write db backend independent code, which could support nearly any database backends, including, but not limited to psql and sqlite. I do not believe it would visibly make development harder to use standard SQL and standard DB API, since at a first blick your code is pretty much mysql independent (only a few places may require tuning) and you seem to use backend independent code (at least you include it in the distribution). So it may only required a few changes here and there. Pity I do not like php at all, so I cannot offer to patch it up, and you’d rejct anyway from an outsider. ;-) So, please consider to use database backend independent code and API. It would be nice and The Right Thing®™. |
Re-closing. This is a resource issue; not a techical one. We need people to step up to: |
Replying to [vipsoft](comment:7): > Re-closing. This is a resource issue; not a techical one. |
A good ressource to read by wordpress team: http://codex.wordpress.org/Using_Alternative_Databases |
Well, since Piwik uses ZF for database handling, it wouldn't be that hard, "just" editing the SQL statements to be compatible with other DBMS. I see further problems in plugin development, there will be coder who make their plugins only MySQL compatible. |
Blockers:
|
I'm keeping MySQL on my VPS just because Piwik requires it. I decided to start support for PostgreSQL in Piwik, the plugin approach seems the right first try. When it stabilizes in my production VPS, I'll report here. |
I strongly ask for Postgres support as MySQL has various bugs and critical problems (like security, speed, scaling). Supporting any database is very easy. Everybody knows that plain SQL queries within sourcecode are a big risks, because some people know how to write good queries, others don't. Injection and speed issues are unavoidable. There must be something like http://search.cpan.org/~abraxxa/DBIx-Class-0.08195/lib/DBIx/Class.pm for PHP. Switching shouldn't take that much time, if you still use plain queries in sourcecode, and will greatly improve security and portability. PS: MySQL is already on it's way to become payware and I doubt that many users would pay for it just to use Piwik :-) |
I'm interested to use Piwik with Postgresql too ! Right now, the only reason why i'm still using Google Analytics is because Piwik don't support pgsql... Is there any chance to use piwik with alternatives databases soon ? |
Replying to grin:
+1 for Postgres support |
Another +1 for PostgreSQL support. Piwik is a really nice piece of software, but I definitely won't install another DBMS so I can use it. |
Another +1 for a non mysql version of Piwik here! PostgreSQL would be great! Also other great software like Drupal is supporting postgresql nowadays. |
like we said we are keen to do it if someone submits a patch. There is clearly a lot of interest out there but not yet someone who has the skills and time to invest in such a huge coding spree! |
Reopening, as it appears that the devs are now welcoming patches for this request. Also +1'ing, because I refuse to deal with MySQL and Oracle anymore. I could switch to Maria DB, but Postgres seems to have all the momentum. Would love this feature. Without it, I'm going to go looking for another solution. |
+1 here as well. I've looked at Piwik several times, and mysql has been the deal breaker each time. |
+1 here as well. I'm using Piwik for a volunteer organization, and we are sensitive to the privacy issues posed by for instance Google analytics. We're using Piwik with mysql locally right now, but all other resources are running against a postgres cluster, with all the bells and whistles, most importantly proper data backup and protection routines. It would be great to get rid of the lone mysql service running locally. |
+1. We have a small VPS that we use for hosting our site (Apache+Postgresql), and it's working quite nicely, but having to add MySQL just for Piwik uses up almost all memory, even if we configure MySQL to use as little as possible. |
+1 MySQL is a big joke. (It never makes me laugh though.) Unfortunately MariaDB has to have all the MySQL strangeness in order to work as a drop-in replacement. This, however, makes it just as irrelevant as an alternative. I want to get rid of that installation for the same reasons already mentioned by others:
|
I'm interested in taking on the work of adding support to PostgreSQL. The actual amount of SQL that needs to be changed doesn't seem too dramatic. I'm more concerned with the following:
Most importantly (for me), does anyone on the core team have the time to map out an approach for a database abstraction layer? Once that's done, I think we can handle the pgsql-specific changes. Also, any thoughts on the other concerns above? EDIT: Looks like there's some active discussion on this in the mailing list. I'll follow up there. |
Thanks for showing your interest in this topic! While we are not actively engaged in this work (yet) many team members are interested in running other DBs with Piwik. the issue is, that it's super complex.
We could map out an approach, but honestly, unless we do the full work, do you think anyone would do it? If anyone is really interested, yes we could work for a few days / weeks on the basics of DB abstraction layer. But converting the whole codebase to use such layer, is rather huge and complicated task. If you know any software engineer/developer with 3-4 weeks available and a big brain willing to explode, and a big interest in this work, please let us know here! |
(Apologies, I think I misspoke -- it looks like Piwik has a db abstraction layer, in the form of the Db class and the Zend_Db_Adapter. What's missing is a data-access layer, since all of the SQL code is intermingled with the business logic.) Matt, I can spare a developer for 3-4 weeks, but I doubt we'll have time to do much more than that right now. So I'm hesitant to dive in, given that we might realize after 2 or 3 weeks that we'd need to spend a lot more time on this. If that happened, I would need to find an alternative solution quickly. One creative solution is to do this incrementally, by updating the code file-by-file to use a new data access layer. This could only work if some of the code could be updated to use a new data access layer, but the rest of the code was left as-is. I'm not sure how exactly to do this, but I'm guessing the two approaches could co-exist, since most or all queries seem to go through the Db class. This would allow the core team to implement and test an approach that you're happy with, since you'd only have to implement a data-access tier for a few files. That would prove that it works, and your code would give us a template to follow. You wouldn't even need a PostgreSQL database to test those changes; just set up two different MySQL logins with access to the same schema. Once you're done with the proof of concept, we would come in and move the SQL into a new data-access tier for as much of the code as possible, following your template. Our initial focus would be all of the SQL that is specific to MySQL, since the app would be usable for both PostgreSQL and MySQL if we get through all of that code and leave the rest of the files as-is. But we'd try to get through all of the SQL while we had focus. I can offer at least 3 person-weeks of work, and hopefully a bit more, which might be enough to get through all of the code. Other developers could help with this as well, if any are available. And any remaining files could be handled by the community or by us as time permits. I'd be more comfortable diving in if we followed this approach, because we'd be more likely to have a production-quality solution within 3-4 weeks that we could deploy. The big downside is that the code might end up in a Frankenstein-like state for some period of time, with some code calling to a data-access tier, and other code with inline SQL. You might not consider that to be "production-quality". Having said that, a data-access tier might help structure the code more cleanly. If so, then the incremental approach would let you migrate the code towards that structure without having to take on all of the work within a single release. |
Ah, I see a few classes under the Piwik\DataAccess namespace. Is this the start of a full data-access tier? Or is this a solution for a much smaller problem? |
Please see https://github.com/sri-soham/piwik . This is an (almost) working version of Piwik with PostgreSQL suppport. |
Thanks, @low. I saw a link to that repo in another ticket, so our developers have been looking closely at it for the past few days. I myself looked too quickly; I saw a change to a file that commingled a MySQL and PostgreSQL error code (core/DataAccess/ArchiveTableCreator.php maybe?), so I quickly assumed that there was no db abstraction layer in this repo. I assumed incorrectly that @sri-soham was working initially on a proof of concept, without changing the code structure. One of my developers mentioned that this repo uses DAOs to encapsulate both data access and database abstraction, so I looked more closely. The DAO implementation looks pretty darn solid to me. The pgsql classes extending the MySQL classes to take advantage of common logic. It would be straightforward to add support for additional DBMSes, if that was desirable. Matt et al, have you been following this work? What do you think of this implementation? If the remaining issues are resolved, is it reasonable to assume that this code or something like it will make it into the upstream repo in the foreseeable future? |
Increasing priority to reflect the very high interest of the community in supporting other databases! Please keep commenting. I am a bit swamped at the moment, but I'll do my best to review the code and give a feedback here in the next week or so. Thanks for your patience and trying to make this happen! |
hey Jon! I have posted a message to Sridhar on the piwik-hackers mailing list: http://lists.piwik.org/pipermail/piwik-hackers/2014-February/001452.html I updated the description of this ticket with: see the Piwik fork with Postgresql support. Great contribution by Sridhar from the community @sridhar if you read this, maybe you could update the port to sync with master? I am looking at this diff in particular: sri-soham/piwik@piwik:master...master If you confirm this is the right place to look, I will discuss the fork with the Piwik team and will come back to you here about our feedback! |
If you don't do stupid stuff postgres scales really well and does not need necessarily need the live MySQL does. This does also not need to be repeated for every change. Only the ones related to queries are affected also git has you covered to do that reasonable. Supporting multiple DBs is possible and one of the main reasons database abstractions exist and I am going so far to say that matomo would be at an even better place if it would have gone with postgres instead of MySQL since the beginning. It is one of two products I am using which does not support postgres and MySQL is using quadruple the ram, has longer start up times and just does not work as well with containers. And change is always possible if someone wants to do it. Right now we are finding excuses to not do it rather than solving problems. |
I don't mean to mind you as I don't necessarily have a contention to what you said. I would just like to add that the up and coming generation opt for Postgres over mySQL and as I am 27 with a couple of years in the game I can say that it's more than just a focus on edge cases. Not considering the switch could also spell a shorter life span for matomo. |
Just found this out now, and really disappointed this still isn't a thing in 2020. |
@arjan-s The fork (https://github.com/sri-soham/piwik) was last updated 5 years ago, so I doubt it will work anymore. |
Thanks for the feedback @arjan-s the FAQ has been updated to clarify the situation: at: https://matomo.org/faq/how-to-install/faq_55/ (and removed mention of the fork) |
Postgres can be implemented if the project does not use manual sql query. example cycle orm |
Correct, but Matomo does build all sql queries manually and doesn't use a query builder. And switching Matomo to an ORM is around the same order of magnitude work than also supporting Postgresql. |
Also with query builders its not easily possible to tweak queries for performance, which is essential for us. |
+1 for postgres support |
I'm interested in implementing this as I have extensive experience with PostgreSQL and PostGIS. I'd just need to find or make time to do this. Also, I'll need to get familiar with the code base. If you have any questions about PostgreSQL if you're doing this please don't hesitate to get in touch. I'll let you know once I've read all the comments and code base in context and I can start so we don't double up efforts. |
This gives me a crazy idea: This would mean, instead of having to port applications from MySQL to PostgreSQL, we "just" write a middleware that translates MySQL queries to PostgreSQL queries and then forwards the translation to PostgreSQL and returns the result using the MySQL client/server protocol. This could even be optimised by caching the translations in memory, redis or even PostgreSQL itself. It is similar to the Linux Wine or Cedega ideas that translate system calls for example from DirectX to OpenGL and could be quite fast if implemented in a low-level language such as C++. I got the idea because I have used Sphinx for fulltext search engines and it uses a MySQL-like server. One can simply connect to it with a MySQL client application, even with MySQL libraries and even though it has a different SQL language than MySQL. It could be a PostgreSQL extension that runs on a different port that you connect to with the MySQL library. This would solve this problem for all applications that only support the MySQL protocol and could boost PostgreSQL up in usage. |
Maybe the code for these could be leveraged somehow: |
Easier might be to write classes in PHP that extend PDO and override methods and translate MySQL to PostgreSQL on that level and make it a PHP library that Matomo could leverage. The translations could be limited to the ones used by Matomo or another client application for that matter. |
Hi @geekdenz, As interesting as this idea sounds, I think implementing it would be an enormous efford. |
Hi @Findus23 Thanks for your reply. In my last comment, I meant to say that vendors such as Matomo could provide translations for just their query patterns. That would make the module quite simple actually. Wine is by no means perfect, but works for quite a few games and apps. It would be an incremental effort supporting some queries from the start but not all. Matomo could be the first vendor to work with the module. PostgreSQL is the most advanced Open Source DBMS with the best standard support of SQL. I am pretty sure it supports at least what MySQL supports and therefore believe it is a super set. That means anything in MySQL can be implemented in a PostgreSQL structure, but maybe not the other way around. |
I already ported applications from MySQL to PostgreSQL. According to my experience, a good part of simple queries are easy to migrate. But special performance critical queries might be completely rewritten, because optimizations for MySQL might have opposite effects with PostgreSQL. Furthermore, if multiple queries are required for one task, a rewrite of the application logic might be required. Don't get me wrong, I would really like to see PostgreSQL support in Matomo! |
Doesn't that mean a PDO extension allowing prepared statements to be mapped to translations would still work if an application such as Matomo implements their own mappings? It would be a super simple module that Matomo could use and someone could monitor the queries that go through with testing the functionality and then they could be written. |
imho the most correct way would be to move all db access to some kind of interface with defined methods and returnable structures and allow to implement other database support as plugins |
@lafriks we are using database adapters to connect to the database. Those could in theory simply be replaced with a postgre adapter. But that was actually never the problem. We are building a lot queries that are optimized for mysql, some even use features that won't be available in another database. All those queries are done all across over the code. The most common solution for something like this would for sure be using a database query builder. For simpler queries that wouldn't be a problem. But we have a lot very complex queries, where a query builder would not even work. So that would actually mean we would need to write each query for each database we would like to support, which would make everything a lot more complex. |
imho even complex queries should have a mapping from any RDBMS to any other. There are bound to be complex queries that will need to get translated and optimised but it should be possible according to relational algebra, which I believe can be proven mathematically. I had a look through some of the source code. See It says <?php
return self::get()->query($sql, $parameters); and earlier <?php
/** @var \Zend_Db_Adapter_Abstract $db */
$db = self::get(); which implies the Zend_Db_Adapter_Abstract class is used. My proposition is to create an intermediary class <?php
class MySQL_PG_Translator
{
// this map can be stored in a DB or memcached compatible middleware like redis
// if it gets big and would still have O(1) access speed as it would be a map op
private $queries = [
'SELECT * FROM `Table1` t1 JOIN `Table2` t2 ON (t1.id=t2.t1_id) WHERE prop1=?'
=>
'SELECT * FROM "Table1" AS t1 JOIN "Table2" AS t2 ON (t1.id=t2.t1_id) WHERE prop1=?',
// ...
];
public function translate($query)
{
return $this->queries[$query];
}
}
class MySQL_PG_Translator_Zend_Db_Adapter extends Zend_Db_Adapter_Abstract
{
// DI Translator
protected $translator;
// ...
public function query($query, $params = [])
{
$translatedQuery = $this->translator->translate($query);
// ... other maybe necessary stuff
return parent::query($translatedQuery, $params);
}
// other methods like fetchAll etc
}
// use in Db, something like
$db = @Adapter::factory($dbConfig['adapter'], $dbConfig);
} The amount of queries that need to be translated should be reasonably finite, which would make the MySQL_PG_Translator class fairly managable even with a lot of queries. The large amount of work would then be to find all the queries that are run and then rewrite them. One could put the adapter in place, just translate to itself and still use MySQL, run all the tests and log the queries to a file. Then they can be translated in isolation. Of course there could be generated queries where this does not quite work, because in theory there might be infinite possibilities for them. But they would probably have a pattern that could be matched with a regex. In the worst case a transformation engine could be used such as ANTLR: |
FYI: Not sure how important it is but Matomo has an infinite amount of SQL queries because of features like Segmentation and https://matomo.org/custom-report/ . They don't just change where conditions or anything but they might change queries quite a bit with various subqueries, different joins etc. Translating query by query wouldn't work there potentially. |
Yes, @tsteur I agree it has to be weighed how important this issue is and I agree with @mattab 's initial decision to not do this as it will be a big effort. We would have to do some serious work either way. Since there are an infinite amount of possible queries the only way along these lines would be to write an actual translator for SQL. Would it maybe be easier to reason with users that installing a free Middleware such as MySQL or MariaDB additional to PostgreSQL is a small price to pay for a great product? |
I guess in the end the effort you put into it, and the advantage you get with PostgreSQL is minimal and it might be a different story if you were to think about an alternative storage that scales a lot better say 10-100 times etc and brings more unique advantages/benefits |
Hence my proposal was to have interface with all needed storage functions/abstraction that could be implemented as plugin. This way it could be possible even to let's say combine db for users and config + some other, ex. let's say time series database for metric data |
+1 interested in mamoto with postgres |
I want to be able to use a postgres database (instead of mysql).
please see http://piwik.org/faq/how-to-install/#faq_55
Latest update: see the [Piwik fork with Postgresql support](https://github.com/sri-soham/piwik). Great contribution by Sridhar from the community
Keywords: wishlist
The text was updated successfully, but these errors were encountered: