New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Doctrine DBAL instead of Wikibase Database #29

Merged
merged 1 commit into from May 8, 2014

Conversation

4 participants
@JeroenDeDauw
Member

JeroenDeDauw commented May 2, 2014

This change also tackled some design issues. Primarily the
cohesion, feature envy and responsibility mixing in the
DataValueHandler code and the weirdness of the SchemaConfig class.

@JeroenDeDauw

This comment has been minimized.

Show comment
Hide comment
@JeroenDeDauw

JeroenDeDauw May 3, 2014

Member

@mwjames Please forget my suggestions to use Wikibase Database in SMW. I will likely kill most of Wikibase Database, as it was reinventing the wheel. (see wmde/WikibaseDatabase#45) And based on experience gained with this commit, I can say that DBAL is a good fit for SMW as well, much better so than the MW DB abstract layer.

Member

JeroenDeDauw commented May 3, 2014

@mwjames Please forget my suggestions to use Wikibase Database in SMW. I will likely kill most of Wikibase Database, as it was reinventing the wheel. (see wmde/WikibaseDatabase#45) And based on experience gained with this commit, I can say that DBAL is a good fit for SMW as well, much better so than the MW DB abstract layer.

@mwjames

This comment has been minimized.

Show comment
Hide comment
@mwjames

mwjames May 3, 2014

Everything that makes the test environment not depend on MediaWikiTestCase (especially for the DB stuff) together with a possibility to separate transaction handling from MW-core will be an improvement to the current state of affairs.

mwjames commented May 3, 2014

Everything that makes the test environment not depend on MediaWikiTestCase (especially for the DB stuff) together with a possibility to separate transaction handling from MW-core will be an improvement to the current state of affairs.

@JeroenDeDauw

This comment has been minimized.

Show comment
Hide comment
@JeroenDeDauw

JeroenDeDauw May 3, 2014

Member

Everything that makes the test environment not depend on MediaWikiTestCase ... an improvement to the current state of affairs.

Sure thing. And switching to DBAL can help with that. Do however note how for this particular component, there was no dependency on MW to begin with, so there are lots of different advantages as well.

Member

JeroenDeDauw commented May 3, 2014

Everything that makes the test environment not depend on MediaWikiTestCase ... an improvement to the current state of affairs.

Sure thing. And switching to DBAL can help with that. Do however note how for this particular component, there was no dependency on MW to begin with, so there are lots of different advantages as well.

This was referenced May 3, 2014

Use Doctrine DBAL instead of Wikibase Database
This change also tackled some design issues. Primarily the
cohesion, feature envy and responsibility mixing in the
DataValueHandler code and the weirdness of the SchemaConfig class.
@filbertkm

This comment has been minimized.

Show comment
Hide comment
@filbertkm

filbertkm May 6, 2014

Collaborator

I see the QueryInterface in WikibaseDatabase has a MediaWiki implementation that uses LazyDBConnectionProvider which uses the load balancer to get an appropriate connection.

how does this handle load balancing?

Collaborator

filbertkm commented May 6, 2014

I see the QueryInterface in WikibaseDatabase has a MediaWiki implementation that uses LazyDBConnectionProvider which uses the load balancer to get an appropriate connection.

how does this handle load balancing?

@JeroenDeDauw

This comment has been minimized.

Show comment
Hide comment
@JeroenDeDauw

JeroenDeDauw May 6, 2014

Member

how does this handle load balancing?

It does not at this point.

Since this storage tool is not bound to MediaWiki at all, I don't see why we would try to put the two together in production. It seems much easier and safer to use a dedicated database. (Much like we'd be doing if we used an alternate storage technology such as Lucene) And if we end up wanting to use more databases for performance reasons, we can add load balancing capabilities. To me it is not at all clear this will be the road we want to go down, since we have many other options available as well.

Member

JeroenDeDauw commented May 6, 2014

how does this handle load balancing?

It does not at this point.

Since this storage tool is not bound to MediaWiki at all, I don't see why we would try to put the two together in production. It seems much easier and safer to use a dedicated database. (Much like we'd be doing if we used an alternate storage technology such as Lucene) And if we end up wanting to use more databases for performance reasons, we can add load balancing capabilities. To me it is not at all clear this will be the road we want to go down, since we have many other options available as well.

private $config;
protected $schema;

This comment has been minimized.

@thiemowmde

thiemowmde May 6, 2014

Contributor
  • Protected next to private?
  • Same order as in constructor please.
@thiemowmde

thiemowmde May 6, 2014

Contributor
  • Protected next to private?
  • Same order as in constructor please.
* @param TableBuilder $tableBuilder
*/
public function __construct( StoreConfig $storeConfig, Schema $storeSchema, TableBuilder $tableBuilder ) {
public function __construct( StoreConfig $storeConfig, StoreSchema $storeSchema, AbstractSchemaManager $schemaManager ) {
$this->config = $storeConfig;

This comment has been minimized.

@thiemowmde

thiemowmde May 6, 2014

Contributor

Use the same name everywhere please. $this->storeConfig in this case. Yes, I'm aware it's not part of this change. ;)

@thiemowmde

thiemowmde May 6, 2014

Contributor

Use the same name everywhere please. $this->storeConfig in this case. Yes, I'm aware it's not part of this change. ;)

*/
public function __construct( $storeName, $tablePrefix, array $dataValueHandlers ) {
public function __construct( $storeName ) {
$this->name = $storeName;

This comment has been minimized.

@thiemowmde

thiemowmde May 6, 2014

Contributor

I know this is painfully nitpicking. But "name"? The name of the config? Use "storeName" everywhere.

@thiemowmde

thiemowmde May 6, 2014

Contributor

I know this is painfully nitpicking. But "name"? The name of the config? Use "storeName" everywhere.

* @param DataValueHandlers $dataValueHandlers
*/
public function __construct( $tablePrefix, DataValueHandlers $dataValueHandlers ) {
$this->tablePrefix = $tablePrefix;

This comment has been minimized.

@thiemowmde

thiemowmde May 6, 2014

Contributor

No type check? Not that I insist on one. ;-)

@thiemowmde

thiemowmde May 6, 2014

Contributor

No type check? Not that I insist on one. ;-)

$factory->newMySQLSchemaModifier()
new SQLStore(
new StoreSchema(
'qr_',

This comment has been minimized.

@thiemowmde

thiemowmde May 6, 2014

Contributor

The explanation for the qr_ got lost. It said "QueryR Replicator QueryEngine". To be honest, this doesn't help me much. ;-)

@thiemowmde

thiemowmde May 6, 2014

Contributor

The explanation for the qr_ got lost. It said "QueryR Replicator QueryEngine". To be honest, this doesn't help me much. ;-)

@thiemowmde

This comment has been minimized.

Show comment
Hide comment
@thiemowmde

thiemowmde May 6, 2014

Contributor

+1.5! ;-) I read all added lines (I really did) as well as the relevant parts of the removed lines. Aside from some marginal complaints about naming conventions (some a matter of taste or bikesheding, all of them can be refactored later very easily) I'm happy with this move towards DBAL.

Keep in mind this does reduce the complexity on our side but introduces an external dependency and therefor, because it's external, increases the overall complexity of our system. This is not necessary a bad thing. Just be honest and keep this in mind.

I would like to have a quick chat with @filbertkm tomorrow. I will very likely hit merge then.

Contributor

thiemowmde commented May 6, 2014

+1.5! ;-) I read all added lines (I really did) as well as the relevant parts of the removed lines. Aside from some marginal complaints about naming conventions (some a matter of taste or bikesheding, all of them can be refactored later very easily) I'm happy with this move towards DBAL.

Keep in mind this does reduce the complexity on our side but introduces an external dependency and therefor, because it's external, increases the overall complexity of our system. This is not necessary a bad thing. Just be honest and keep this in mind.

I would like to have a quick chat with @filbertkm tomorrow. I will very likely hit merge then.

*
* @return self
*/
public function selectFromWhere(array $columnNames, $tableName, array $criteria) {

This comment has been minimized.

@thiemowmde

thiemowmde May 6, 2014

Contributor

Just for the record, I'm not completely happy with the three public methods this class introduces. Personally I find them a bit unintuitive. It feels like they are a bit to "magic". Maybe it's just the method names. They could be a bit more expressive. $tableName should be the first param in my opinion.

It's perfectly fine as a bunch of helper methods inside WikibaseQueryEngine. They do the job and can easily be refactored later. No problem.

@thiemowmde

thiemowmde May 6, 2014

Contributor

Just for the record, I'm not completely happy with the three public methods this class introduces. Personally I find them a bit unintuitive. It feels like they are a bit to "magic". Maybe it's just the method names. They could be a bit more expressive. $tableName should be the first param in my opinion.

It's perfectly fine as a bunch of helper methods inside WikibaseQueryEngine. They do the job and can easily be refactored later. No problem.

@JeroenDeDauw

This comment has been minimized.

Show comment
Hide comment
@JeroenDeDauw

JeroenDeDauw May 6, 2014

Member

@thiemowmde So to clarify, you are fine with this (and the test follow up) being merged, and the issues being addresses in further follow ups? (This will allow me to address these, and other TODOs for QueryEngine much faster, using smaller and easier to review commits)

Member

JeroenDeDauw commented May 6, 2014

@thiemowmde So to clarify, you are fine with this (and the test follow up) being merged, and the issues being addresses in further follow ups? (This will allow me to address these, and other TODOs for QueryEngine much faster, using smaller and easier to review commits)

@filbertkm

This comment has been minimized.

Show comment
Hide comment
@filbertkm

filbertkm May 7, 2014

Collaborator

@JeroenDeDauw seems reasonable to have the query database separate from the main site database, but these types of setups still tend to be within the load balancer configuration and think it's still something we want to take advantage of. I assume the query database will be on more than one box and we need some way of getting the appropriate connection.

We might at least want a ConnectionFactory that can take a MediaWiki Database object (where we can get the LB connection info) and create a DBAL Connection.

generally DBAL is great (been using it for quite sometime for external projects, assumed you knew about it) and am quite happy with it. However, I am concerned we are not handling performance aspects adequately here.

Collaborator

filbertkm commented May 7, 2014

@JeroenDeDauw seems reasonable to have the query database separate from the main site database, but these types of setups still tend to be within the load balancer configuration and think it's still something we want to take advantage of. I assume the query database will be on more than one box and we need some way of getting the appropriate connection.

We might at least want a ConnectionFactory that can take a MediaWiki Database object (where we can get the LB connection info) and create a DBAL Connection.

generally DBAL is great (been using it for quite sometime for external projects, assumed you knew about it) and am quite happy with it. However, I am concerned we are not handling performance aspects adequately here.

@filbertkm

This comment has been minimized.

Show comment
Hide comment
@filbertkm

filbertkm May 7, 2014

Collaborator

where is OutOfBoundsException used? (same for all the use X below)

Collaborator

filbertkm commented on src/SQLStore/StoreSchema.php in eec80ac May 7, 2014

where is OutOfBoundsException used? (same for all the use X below)

This comment has been minimized.

Show comment
Hide comment
@JeroenDeDauw

JeroenDeDauw May 7, 2014

Member

I don't know, might be a leftover from the old code. We can easily optimize imports in a follow up commit. (Doing so now will just invite conflicts in existing follow ups for essentially no gain)

Member

JeroenDeDauw replied May 7, 2014

I don't know, might be a leftover from the old code. We can easily optimize imports in a follow up commit. (Doing so now will just invite conflicts in existing follow ups for essentially no gain)

@filbertkm

This comment has been minimized.

Show comment
Hide comment
@filbertkm

filbertkm May 7, 2014

Collaborator

why all this spacing and indenting stuff?

Collaborator

filbertkm commented on README.md in eec80ac May 7, 2014

why all this spacing and indenting stuff?

This comment has been minimized.

Show comment
Hide comment
@JeroenDeDauw

JeroenDeDauw May 7, 2014

Member

To indent the list :) You are to used to wikitext!

Member

JeroenDeDauw replied May 7, 2014

To indent the list :) You are to used to wikitext!

@filbertkm

This comment has been minimized.

Show comment
Hide comment
@filbertkm

filbertkm May 7, 2014

Collaborator

not part of your change, but is this used?

Collaborator

filbertkm commented on src/SQLStore/DVHandler/EntityIdHandler.php in eec80ac May 7, 2014

not part of your change, but is this used?

This comment has been minimized.

Show comment
Hide comment
@JeroenDeDauw

JeroenDeDauw May 7, 2014

Member

This commit removes the usage of this class here, so the reference can be deleted

Member

JeroenDeDauw replied May 7, 2014

This commit removes the usage of this class here, so the reference can be deleted

@JeroenDeDauw

This comment has been minimized.

Show comment
Hide comment
@JeroenDeDauw

JeroenDeDauw May 7, 2014

Member

I assume the query database will be on more than one box and we need some way of getting the appropriate connection.

This is not clear to me. If we really need to support that, than I prefer to have this tracked by a dedicated item. Can you please create one?

However, I am concerned we are not handling performance aspects adequately here.

"First make it work, then make it correct/clean, then make it fast". This commit falls into the second part. And it makes room for moving onto the third. Indeed, I have already started with taking a closer look at performance (see for instance #32), though this is quite hard to continue with while this remains sitting on a branch.

I fully agree we should look into the load balancing topic, and other performance questions, in more detail. The fastest way we can do this is by having this merged. Adding more requirements onto this already to big commit is going to slow this work, and everything that depends on it, down.

Member

JeroenDeDauw commented May 7, 2014

I assume the query database will be on more than one box and we need some way of getting the appropriate connection.

This is not clear to me. If we really need to support that, than I prefer to have this tracked by a dedicated item. Can you please create one?

However, I am concerned we are not handling performance aspects adequately here.

"First make it work, then make it correct/clean, then make it fast". This commit falls into the second part. And it makes room for moving onto the third. Indeed, I have already started with taking a closer look at performance (see for instance #32), though this is quite hard to continue with while this remains sitting on a branch.

I fully agree we should look into the load balancing topic, and other performance questions, in more detail. The fastest way we can do this is by having this merged. Adding more requirements onto this already to big commit is going to slow this work, and everything that depends on it, down.

@thiemowmde

This comment has been minimized.

Show comment
Hide comment
@thiemowmde

thiemowmde May 8, 2014

Contributor

So to clarify, you are fine with this (and the test follow up) being merged, and the issues being addresses in further follow ups?

@JeroenDeDauw Yes, absolutely. If you can convince @filbertkm we can merge both. All my comments are about private details that can easily be addressed in a follow up. Just don't forget it (e.g. the missing explanation for the qr_ abbreviation).

Contributor

thiemowmde commented May 8, 2014

So to clarify, you are fine with this (and the test follow up) being merged, and the issues being addresses in further follow ups?

@JeroenDeDauw Yes, absolutely. If you can convince @filbertkm we can merge both. All my comments are about private details that can easily be addressed in a follow up. Just don't forget it (e.g. the missing explanation for the qr_ abbreviation).

thiemowmde added a commit that referenced this pull request May 8, 2014

Merge pull request #29 from wmde/dbal
Use Doctrine DBAL instead of Wikibase Database

@thiemowmde thiemowmde merged commit 7f2ec8f into master May 8, 2014

2 checks passed

continuous-integration/travis-ci The Travis CI build passed
Details
default Scrutinizer: Errored — Travis: Passed
Details

@thiemowmde thiemowmde deleted the dbal branch May 8, 2014

@mwjames

This comment has been minimized.

Show comment
Hide comment
@mwjames

mwjames May 8, 2014

but these types of setups still tend to be within the load balancer configuration and think it's still something we want to take advantage of.

Ocramius has pointed towards https://github.com/doctrine/dbal/blob/master/lib/Doctrine/DBAL/Connections/MasterSlaveConnection.php

mwjames commented May 8, 2014

but these types of setups still tend to be within the load balancer configuration and think it's still something we want to take advantage of.

Ocramius has pointed towards https://github.com/doctrine/dbal/blob/master/lib/Doctrine/DBAL/Connections/MasterSlaveConnection.php

@JeroenDeDauw

This comment has been minimized.

Show comment
Hide comment
@JeroenDeDauw
Member

JeroenDeDauw commented May 8, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment