WIP - 3.0 - Connection prefix #4505

Closed
wants to merge 78 commits into
from

Projects

None yet
@HavokInspiration
Member

As mentionned in #3843, a fresh start to implement the connection prefix feature.
At this point, this commit puts the PR back to the state it was, but with an up to date Cake 3.0 version (and better managed branches on my side)

Refs #2666 and #4118

To do :

  • Update Schema system so it takes into account the connection prefix (if any)
  • Maybe update the \ORM\Table class (I have a written note about that on my desk, so I guess "past me" put the finger on something I may have forgotten)
  • Make Fixtures apply connection prefixes

I'll probably have some questions about testing with / without a prefix and making the fixtures apply the prefix. I know I got stuck on something regarding this but can't exactly remember what. I'll dive in this again and ask you as they come up.

@HavokInspiration HavokInspiration Begin implementing connection prefix feature to the ORM
The table name is wrapped in a TableNameExpression which will be fully resolved when the Query is converted to SQL.
13fb246
@markstory markstory added this to the 3.0.0 milestone Sep 6, 2014
@dereuromark dereuromark commented on an outdated diff Sep 6, 2014
src/Database/Connection.php
+ *
+ * @param string|array|ExpressionInterface $names The names of the tables
+ *
+ * @see \Cake\Database\Expression\TableNameExpression
+ * @return string|array|ExpressionInterface Full tables names
+ */
+ public function fullTableName($names) {
+ $prefix = "";
+
+ if (isset($this->_config["prefix"]) && $this->_config["prefix"] !== "") {
+ $prefix = $this->_config["prefix"];
+ }
+
+ if (is_string($names)) {
+ $names = new TableNameExpression($names, $prefix);
+ } else {
@dereuromark
dereuromark Sep 6, 2014 Member

using elseif you could save one level of indentation.

@dereuromark dereuromark commented on an outdated diff Sep 6, 2014
src/Database/Schema/Collection.php
@@ -108,6 +108,9 @@ public function describe($name, array $options = []) {
}
}
$config = $this->_connection->config();
+
+ $prefix = (isset($config["prefix"]) && $config["prefix"] !== "") ? $config["prefix"] : "";
@dereuromark
dereuromark Sep 6, 2014 Member

We usually prefer single quotes here for consistency. But just nitpick.

HavokInspiration added some commits Sep 6, 2014
@HavokInspiration HavokInspiration Implements connection prefix to the Schema system
Various methods of the schema system now calls a method that resolves the full table name before performing the query.
When an instance of \Schema\Table is created, the name is wrapped in a TableNameExpression. Whenever the full name of the Table instance is needed, the expression is converted.
570dd0a
@HavokInspiration HavokInspiration Merge branch '3.0' into 3.0-connection-prefix
Conflicts:
	src/Database/Schema/MysqlSchema.php
	src/Database/Schema/PostgresSchema.php
	src/Database/Schema/SqliteSchema.php
	src/Database/Schema/SqlserverSchema.php
345a8c1
@HavokInspiration
Member

I just pushed my first draft for the Schema system update which brings questions about unit testing with a prefix.

What would be the way to tests all of the framework with a prefix ?
My first solution was to put a prefix in the tests configuration and see how it goes.
Currently, this option would not go well as some tests use raw sql queries - so hard coded table name - to assert. The tests could, of course, be updated with the prefix concatenated in these raw sql queries.

But I'm not sure this would be the best solution. I'm afraid this would not be as "global" as needed.

Since I suck quite a lot at unit testing, I'm asking for help : what is the best way ?

@HavokInspiration
Member

Some tests are failling, I'll take a look at it.

Edit : It seems to come from the \Schema\Collection::describe() method.

@HavokInspiration HavokInspiration Fix CS
a451c44
@markstory
Member

You might want to look at the existing DatabaseTestsuite for an example of running all the tests with/withou identifier quoting. This might be a way to do complete integration tests with prefixes.

@ravage84 ravage84 and 4 others commented on an outdated diff Sep 9, 2014
src/Database/Connection.php
@@ -192,6 +193,44 @@ public function isConnected() {
}
/**
+ * Wrap the table name in a TableNameExpression with the current config prefix
+ *
+ * @param string|array|TableNameExpression $names The names of the tables
+ *
+ * @see \Cake\Database\Expression\TableNameExpression
@ravage84
ravage84 Sep 9, 2014 Member

Nitpicking:
Please move @see after @return and remove the white line above.

We haven't set any coding standard on this, but to me the following makes most sense:

  1. @param
  2. @return
  3. @throws
  4. Any other, like @see or @link
@thaJeztah
thaJeztah Sep 9, 2014

@ravage84 In case you're going to create coding standards on that; many times I see @link or @see being used as part of the description (e.g. This method does .... and is further handled by @see othermethod()). In this context, it could be more logical to put @see and @link before @param.

(Just suggesting, for you to consider)

@ravage84
ravage84 Sep 9, 2014 Member

Well, I wasn't going to except people see a use/need in it.
But we could define that any other annotations either need to be BEFORE or AFTER param, return and throws, so that these three are always together.

@thaJeztah
thaJeztah Sep 9, 2014

👍 fully agree; @param, @return and @throws together basically are the signature of a method, so putting them together is logical. Other annotations before or after; I'm on favor of before, for reasons mentioned, but maybe other people have compelling reasons not to do that.

@ADmad
ADmad Sep 9, 2014 Member

Existing CS rules already expect @param to be first tag, so extras would have to go after.

@thaJeztah
thaJeztah Sep 9, 2014

If that's the case, it's settled. I looked in the docs, but couldn't find it (Coding Standards), maybe that section needs updating.

@dereuromark
dereuromark Sep 9, 2014 Member

There is no section about the annotation order yet. But it would sure be nice to have it added to have clarification on that.

@ravage84
ravage84 Sep 9, 2014 Member

Yep, not everything that is covered through phpcs rules is also covered on the coding standards page in the docs.

One example docblock for a class, property an dmethod each should suffice, I guess.
Or one example where you see a class with one property and one method.

@thaJeztah
thaJeztah Sep 9, 2014

I've created a placeholder issue here #4559 so that this can be discussed in the right context if any discussion is still required

@HavokInspiration
HavokInspiration Sep 10, 2014 Member

Well, this line was inspiring :p
I'll update this as per the new guidelines

@ravage84 ravage84 commented on an outdated diff Sep 9, 2014
src/Database/IdentifierQuoter.php
@@ -157,6 +184,29 @@ protected function _quoteJoins($joins) {
}
/**
+ * Quotes the table name (either from a from or a join) taking into account
@ravage84
ravage84 Sep 9, 2014 Member

You could split that up into the long description, e.g:

/**
 * Quotes the table name
 *
 * Either from a from or a join, taking into account
 * if the $name parameter is a TableNameExpression or not.
@ravage84 ravage84 and 1 other commented on an outdated diff Sep 9, 2014
src/Database/QueryCompiler.php
@@ -196,7 +197,9 @@ protected function _buildFromPart($parts, $query, $generator) {
protected function _buildJoinPart($parts, $query, $generator) {
$joins = '';
foreach ($parts as $join) {
- if ($join['table'] instanceof ExpressionInterface) {
+ if ($join['table'] instanceof TableNameExpression) {
@ravage84
ravage84 Sep 9, 2014 Member

The following 5 lines are almost the same as the next change below.
May be extract that into a protected method?
Not sure if this makes sense and if it can be really reused, though...

@HavokInspiration
HavokInspiration Sep 10, 2014 Member

That seems feasable.

@ravage84 ravage84 commented on an outdated diff Sep 9, 2014
src/Database/Schema/BaseSchema.php
@@ -46,6 +48,51 @@ public function __construct(Driver $driver) {
}
/**
+ * Retrieves the prefix from the current connection $config
+ *
+ * @param array $config Configuration array for the current Connection
+ * @return string The prefix for the current connection
+ */
+ public function getConnectionPrefix($config) {
+ return (isset($config['prefix']) && is_string($config['prefix'])) ? $config['prefix'] : '';
@ravage84
ravage84 Sep 9, 2014 Member

I would prefer an if here.

HavokInspiration added some commits Sep 10, 2014
@HavokInspiration HavokInspiration Merge branch '3.0' into 3.0-connection-prefix ad844e7
@HavokInspiration HavokInspiration Optimization based on comments eadbc45
@HavokInspiration HavokInspiration Merge branch '3.0' into 3.0-connection-prefix
34d45cb
@HavokInspiration HavokInspiration Fix Schema error builds
5bf34e2
@HavokInspiration HavokInspiration Fix build for SqlserverSchema
2ad0daf
@HavokInspiration HavokInspiration Merge remote-tracking branch 'upstram/3.0' into 3.0-connection-prefix a6d73f4
@HavokInspiration HavokInspiration The Query::into() method should apply the connection prefix if needed
5f1d8ec
@HavokInspiration HavokInspiration The Query::update() method should apply the connecion prefix if needed
19fb2fc
HavokInspiration added some commits Oct 2, 2014
@HavokInspiration HavokInspiration Apply connection prefix in fixtures c739426
@HavokInspiration HavokInspiration Finish updating the Database system so it fully applies connection pr…
…efixes
6209554
@HavokInspiration HavokInspiration Merge branch '3.0' into 3.0-connection-prefix
Conflicts:
	tests/TestCase/Database/ConnectionTest.php
	tests/TestCase/Database/QueryTest.php
0d5d04c
@HavokInspiration HavokInspiration Update test cases for Mysql and SqlServer 0bae5b5
@HavokInspiration HavokInspiration Merge branch '3.0' into 3.0-connection-prefix
Conflicts:
	src/Database/QueryCompiler.php
	tests/TestCase/Database/QueryTest.php
dad32c3
@HavokInspiration
Member

New update.
I finally found some time to work on this again.
Right now, all the Database system should work with prefix.
All query clauses should now be compatible with connection prefixes if needed (ORDER BY, GROUP BY, etc). To make this happend, I had to upgrade the TableNameExpression so it takes into account an optionnal field name.
In some cases (such as manual joins conditions), I had to preg_replace string parts matching the table.field pattern while excluding aliases.
Fixtures now works with a connection prefix if one is given.

If you give it a try and add a prefix in the tests bootstrap, there will still be lots of failure as well as errors.
Next step is to work on the ORM.

Waiting for you feedback. This set of changes were made them accross multiple very short sessions of work so it probably needs to be harmonised or simply rewritten. And I'm really not sure about what I did in the tests...

HavokInspiration added some commits Oct 16, 2014
@HavokInspiration HavokInspiration Fix Sqlserver failing tests
e0413eb
@HavokInspiration HavokInspiration Fix CS
da2de26
@markstory markstory and 1 other commented on an outdated diff Oct 18, 2014
src/Database/Connection.php
+ *
+ * @param string $field The field
+ * @return TableNameExpression Full field names
+ *
+ * @see \Cake\Database\Expression\TableNameExpression
+ */
+ public function fullFieldName($field) {
+ $prefix = '';
+
+ if (isset($this->_config['prefix']) && $this->_config['prefix'] !== '') {
+ $prefix = $this->_config['prefix'];
+ }
+
+ if (is_string($field) && strpos($field, '.') !== false) {
+ list($tableName, $fieldName) = explode('.', $field);
+ $field = new TableNameExpression($tableName, $prefix, $fieldName);
@markstory
markstory Oct 18, 2014 Member

Would it make more sense for the arguments to be 'prefix, table, field'?

@HavokInspiration
HavokInspiration Oct 21, 2014 Member

I have no problem with it. It would more logical in a way, as they would be in the order they are displayed.
I first went with table as first argument for its importance (it is basically the cornerstone of all of this). I can make the change if it seems more logical to you.

@markstory markstory and 1 other commented on an outdated diff Oct 18, 2014
src/Database/IdentifierQuoter.php
@@ -157,6 +184,28 @@ protected function _quoteJoins($joins) {
}
/**
+ * Quotes the table name
+ *
+ * @param string|TableNameExpression|QueryExpression|\Cake\Database\Query $name Table name to quote
+ *
+ * @return string|TableNameExpression|QueryExpression|\Cake\Database\Query
+ */
+ protected function _quoteTableName($name) {
+ if ($name instanceof TableNameExpression) {
+ $tableName = $name->getName();
+ if (is_string($tableName)) {
@markstory
markstory Oct 18, 2014 Member

What else could it be? I guess it could also be a query for example.

@HavokInspiration
HavokInspiration Oct 21, 2014 Member

Actually, you are right. In early versions, the Expression wrapped more than strings. But as far as I recall, I changed that to simplify the purpose of the Expression : there was no point in wrapping a Query since there would be nothing to prefix. So I only limit the Expression wrapping to string. The test might be useless (I will check that).

@markstory markstory and 1 other commented on an outdated diff Oct 18, 2014
src/Database/Query.php
@@ -1239,6 +1286,10 @@ public function values($data) {
* @return $this
*/
public function update($table) {
+ if ($this->_connection instanceof \Cake\Database\Connection) {
+ $table = $this->_connection->fullTableName($table);
+ }
@markstory
markstory Oct 18, 2014 Member

Can't this be done when the query is converted into SQL?

@HavokInspiration
HavokInspiration Oct 21, 2014 Member

I don't understand. The Connection::fullTableName() method only wraps $table in a instance of TableNameExpression. It doesn't prefix anything yet (not until the query is converted to SQL anyway).
Or I am missing something and it's not about that at all

@markstory
markstory Oct 22, 2014 Member

I was thinking this could be done by the QueryCompiler, but that might be a pile of work as well.

@HavokInspiration
HavokInspiration Oct 22, 2014 Member

Interresting.
I guess, in a way, the prefixing could be done only in the QueryCompiler (well, at first glance, it should be possible). This way the prefixing would only be done if the query is compiled, which would optimize the process as if the query is never compiled, there would be less work done for nothing. It would also concentrate all the prefixing at one place, which would be easier to maintain.

I don't know if you meant this but I'm convinced 😄.

I will probably finish my update for the having clause and branch my current work to try this. I am probably missing things but I don't see why this could not be done.
This will require a lot of change though. The TableNameExpression might even be useless.

@markstory markstory and 1 other commented on an outdated diff Oct 18, 2014
src/Database/QueryCompiler.php
@@ -44,6 +45,14 @@ class QueryCompiler {
];
/**
+ * List of query clauses that need to have the connection prefix applied
+ * to their fields name
+ *
+ * @var array
+ */
+ protected $_fullFieldsParts = ['order', 'group'];
@markstory
markstory Oct 18, 2014 Member

What about having?

@HavokInspiration
HavokInspiration Oct 21, 2014 Member

Indeed, it is missing. I will add it.

@markstory markstory and 1 other commented on an outdated diff Oct 18, 2014
tests/TestCase/Database/Schema/SqlserverSchemaTest.php
+ $config = $connection->config();
+ $prefix = isset($config["prefix"]) && is_string($config["prefix"]) ? $config["prefix"] : "";
+
+ return $prefix;
+ }
+
+/**
+ * Will apply connection prefix to a raw SQL query.
+ * Prefixes are to be represented by the character ~
+ *
+ * @param string $query Query as a string that should be prefixed
+ * @return string The given query with the connection prefix, if any
+ */
+ public function applyConnectionPrefix($query) {
+ $query = str_replace('~', $this->prefix, $query);
+ return $query;
@markstory
markstory Oct 18, 2014 Member

These methods are in quite a few places, perhaps they should be in a trait?

HavokInspiration added some commits Oct 25, 2014
@HavokInspiration HavokInspiration Moves the table name expression wrapping to the Query compiler
This way, if a query is being generated but is never compiled, some overhead would have been avoided as no table name would have been wrapped in an Expression.
7ef11fa
@HavokInspiration HavokInspiration Merge branch '3.0' into 3.0-connection-prefix
10b42f7
@HavokInspiration
Member

Just pushed a commit where I moved the expression wrapping to the QueryCompiler. This way, if the Query is generated but never used, less work is done.

HavokInspiration added some commits Oct 28, 2014
@HavokInspiration HavokInspiration Group some functions needed in tests in a Trait and make the having c…
…lause in the QueryCompiler apply the full table name
79056ca
@HavokInspiration HavokInspiration Merge branch '3.0' into 3.0-connection-prefix
d80f20b
@HavokInspiration HavokInspiration commented on an outdated diff Oct 28, 2014
src/Database/QueryCompiler.php
+ }
+
+/**
+ * Helper function used to apply full field name (meaning applying the connection prefix if the field names in
+ * the order clause contains the table name)
+ * This functions is only applied on ORDER clause
+ *
+ * @param ExpressionInterface $parts Parts of the query clause
+ * @param \Cake\Database\Query $query The query that is being compiled
+ * @param \Cake\Database\ValueBinder $generator The placeholder and value binder object
+ * @return array|ExpressionInterface The variable $parts modified
+ */
+ protected function _orderFullFieldsName($parts, $query, $generator) {
+ if ($parts instanceof ExpressionInterface) {
+ $parts->iterateParts(function ($condition, &$key) use ($parts, $query, $generator) {
+ $key = $query->connection()->fullFieldName($key);
@HavokInspiration
HavokInspiration Oct 28, 2014 Member

This should only be done if $key is not an alias

@HavokInspiration HavokInspiration commented on an outdated diff Oct 28, 2014
src/Database/QueryCompiler.php
+ }
+
+/**
+ * Helper function used to apply full field name (meaning applying the connection prefix if the field names in
+ * the group clause contains the table name)
+ * This functions is only applied on GROUP clause
+ *
+ * @param array $parts Parts of the query clause
+ * @param \Cake\Database\Query $query The query that is being compiled
+ * @param \Cake\Database\ValueBinder $generator The placeholder and value binder object
+ * @return array|ExpressionInterface The variable $parts modified
+ */
+ protected function _groupFullFieldsName($parts, $query, $generator) {
+ if (!empty($parts)) {
+ foreach ($parts as $key => $part) {
+ $parts[$key] = $query->connection()->fullFieldName($part);
@HavokInspiration
HavokInspiration Oct 28, 2014 Member

Same thing as in _orderFullFieldsName

HavokInspiration added some commits Oct 30, 2014
@HavokInspiration HavokInspiration Prefix application in some clauses is only made if the field name doe…
…s not have an aliased table name.
57233ed
@HavokInspiration HavokInspiration Upgrade the regex with word boundaries and add tests to avoid regress…
…ion in case the regular expression have to be modified in the future
745f815
@HavokInspiration HavokInspiration Moves the full table name resolution of the join clause in the Query …
…Compiler. This way, when the prefix is applied, all alias are stored in Query::tablesAliases.
cf17763
@HavokInspiration HavokInspiration Merge remote-tracking branch 'upstram/3.0' into 3.0-connection-prefix 0944f16
@HavokInspiration HavokInspiration Remove a method that is not used anymore
db04dd5
@HavokInspiration HavokInspiration Fix CS
fc50e39
@HavokInspiration HavokInspiration The Schema::getFullTableName should apply the connection prefix only …
…if the table name does not start with the prefix

The prefix is also applied if the table name is the prefix (this should deal with the case where the table name is equal to the prefix)
efd8ecf
@HavokInspiration HavokInspiration The ModelTask::listAll() strips out the prefix in the table name if o…
…ne is configured

Since ModelTask::listAll() is responsible for giving the potential models or controllers, the prefix is unnecessary
22ffa3c
@HavokInspiration HavokInspiration Removes extra space d0e261a
@HavokInspiration HavokInspiration Merge remote-tracking branch 'upstram/3.0' into 3.0-connection-prefix
585cee0
@HavokInspiration HavokInspiration Rewrote the Schema::getFullTableName method
This was causing a lot of issues as some tests were not properly done
8f318cc
@HavokInspiration HavokInspiration Add instanceof check to fix SqlServer build 6bcd84f
@HavokInspiration HavokInspiration Merge remote-tracking branch 'upstram/3.0' into 3.0-connection-prefix
2bf0ac0
@HavokInspiration HavokInspiration The Table object created in the Collection::describe() method should …
…have the connection prefix so it correctly reflects the real table state
1e3d000
@HavokInspiration HavokInspiration The table names used in the ModelTask (specifically around associatio…
…ns) should be stripped out of the connection prefixes

This allows the associations columns name to follow the conventions even if a table name is prefixed
9e735c9
@HavokInspiration HavokInspiration The cache key for model metadata should be prefixed 91cdf85
@HavokInspiration
Member

Hi there.

I finally reached the point where all tests pass when you configure a prefix in the test configuration (locally, with Sqlite, waiting for Travis and AppVeyor for the others). Yay.

Now, I will try to optimize everything and see if there is no inconsistencies (it will be easier now that I have a full picture).
I'll also add some tests for Connection.
I saw that a BetweenExpression was created, I'll try to add cases to make sure it properly works with a connection prefix.

I started adding some notes, I'll check this out later.

@HavokInspiration HavokInspiration Merge remote-tracking branch 'upstram/3.0' into 3.0-connection-prefix
7ad554d
@HavokInspiration HavokInspiration commented on an outdated diff Nov 6, 2014
src/Shell/Task/ModelTask.php
@@ -670,7 +670,16 @@ public function listAll() {
}
$this->_modelNames = [];
- $this->_tables = $this->_getAllTables();
+ $tables = $this->_getAllTables();
+
+ $db = ConnectionManager::get($this->connection);
+ $prefix = $db->getPrefix();
@HavokInspiration
HavokInspiration Nov 6, 2014 Member

This is useless

@HavokInspiration HavokInspiration commented on an outdated diff Nov 6, 2014
src/Database/Query.php
+ return $this;
+ }
+
+/**
+ * Checks wheter $name is or contain (e.g. 'table.field') a aliased table name.
+ *
+ * @param string $name Table or field name
+ *
+ * @return bool
+ */
+ public function hasTableAlias($name) {
+ if (is_string($name) && strpos($name, '.') !== false) {
+ list($tableName, $fieldName) = explode('.', $name);
+ return isset($this->tablesAliases[$tableName]);
+ }
+
@HavokInspiration
HavokInspiration Nov 6, 2014 Member

This method is not completely doing what it says. It properly works only if $name is in the form of "table.field".

@HavokInspiration HavokInspiration commented on the diff Nov 6, 2014
src/Database/Schema/BaseSchema.php
+ }
+
+ return $prefix;
+ }
+
+/**
+ * Resolves the full table name for the table name $tableName
+ *
+ * @param string|TableNameExpression $tableName Table name
+ * @param array $config Configuration array for the current Connection
+ * @param bool $quoted Whether the table name should be quoted
+ * @return string The full table name
+ */
+ public function getFullTableName($tableName, $config, $quoted = true) {
+ $prefix = $this->getConnectionPrefix($config);
+
@HavokInspiration
HavokInspiration Nov 6, 2014 Member

See if it can't get an instance of Connection through _driver ? This also applies for other portions of code underneath this. This can not be done

@markstory
Member

Looking good @HavokInspiration depending on when the RC gets cut this may end up waiting until 3.1, but I don't suspect that there will be a long delay between 3.0.x and 3.1.x. I know both @lorenzo and I have a few ideas for 3.1 and after.

@HavokInspiration
Member

No problem on my side for 3.1, since it's not fully ready. I still have some tests to write and maybe modifications for the QueryCompiler (especially for the BetweenExpression).
I'll just have to monitor potential changes to the ORM until this can be merged.

I'm just wondering how is it possible to have it constantly tested. From experience, I can tell that editing something and having tests passing without a prefix, does not mean that it will pass with one. I don't really know to what extent it might be possible to add a configuration with a prefix to Travis though.

@josegonzalez
Member

@HavokInspiration adding another build target in travis is certainly possible.

HavokInspiration added some commits Nov 7, 2014
@HavokInspiration HavokInspiration Update ConnectionTests
- Update the applyFullTableName tests :
 . Changed a table name to a more conventional name
 . Used the $condition var
 . Added more complex cases
- Added tests for the Connection::isTableNamePrefixed() method
- Added tests for Connection::rawTableName() method
- Added tests for Connection::fullFieldName() method
8bf8dd2
@HavokInspiration HavokInspiration Fixes the Database\Query::hasTableAlias() method
The method only worked when a field name (e.g. table.field) was given. A string (meaning a single table name) without a dot would always return false, which was wrong.
Tests were also added.
30b1d5d
@HavokInspiration HavokInspiration Remove a useless variable d542b02
@HavokInspiration HavokInspiration Mixes the indexation of tables aliases in the Query with a call to Qu…
…ery::join()
ed6f834
@HavokInspiration HavokInspiration Use '=== false' instead of '!$var' 706601d
@HavokInspiration HavokInspiration Merge remote-tracking branch 'upstram/3.0' into 3.0-connection-prefix
149d722
@HavokInspiration
Member

@josegonzalez Ok, I'll look up in Travis' doc to see what I can do (and how I can do it)

I started working on the BetweenExpression, which made me realized I did not do anything for the where clause... So I'm working on that.

HavokInspiration added some commits Nov 14, 2014
@HavokInspiration HavokInspiration First draft for making the where clause of a Query apply the connecti…
…on prefix

The purpose of the TableNameExpression was made broader as it can now wrap expressions or larger strings than table name. This way, this can avoid problems with strings prefixed twice (or even three times in some rare edge cases).
Lots of test cases were added to avoid regressions
Refactoring will be needed when the functionnality goes stable.
Some tests in the ORM are currently failing, they will be addressed in a future commit
40ba88a
@HavokInspiration HavokInspiration This set of changes completely changes the way the prefix is applied …
…in complex clauses

Instead of saving the table aliases and apply prefix only if an alias is not found, table names are stored and prefix is applied only if it is found. This prevents associations aliased to be prefixed in some cases.
Tests were updated accordingly.
4cb5749
@HavokInspiration HavokInspiration Add the second parameters to Connection::fullFieldName() called in th…
…e processing of the select and group clauses

This allows the processed part to be considered as a full SQL snippet and not just only a table name
b7b95c6
@HavokInspiration HavokInspiration Merge remote-tracking branch 'upstram/3.0' into 3.0-connection-prefix d4f8872
@HavokInspiration
Member

I just uptaded the PR with some changes around the TableNameExpression. It has now a "larger" purpose as it can wrap more than table name so it can apply the table name on condition (or even field stored in Expression).
This should normally prevent table name to be double prefixed if a Query is reused.
I have a big work of refactoring at this point. That switch in the QueryCompiler is not really sexy. I'll probably create a new interface that allow Expressions to prefix their table names.

@jippi jippi commented on an outdated diff Nov 17, 2014
src/Database/Connection.php
+
+ return $prefix;
+ }
+
+/**
+ * Checks if $tableName is prefixed
+ * If no prefix is configured, the method will return false
+ *
+ * @param string $tableName Table name to check
+ *
+ * @return bool
+ */
+ public function isTableNamePrefixed($tableName) {
+ $prefix = $this->getPrefix();
+
+ if ($prefix !== '' && strpos($tableName, $prefix) === 0 && $tableName !== $prefix) {
@jippi
jippi Nov 17, 2014 Member

why not just return the value of that condition stack?

return ($prefix !== '' && strpos($tableName, $prefix) === 0 && $tableName !== $prefix)

@jippi jippi commented on an outdated diff Nov 17, 2014
src/Database/Connection.php
+ return false;
+ }
+
+/**
+ * Public method that will resolve the full table name of a table
+ *
+ * @param string|array|TableNameExpression $names The names of the tables
+ * @return array|TableNameExpression Full tables names
+ *
+ * @see \Cake\Database\Expression\TableNameExpression
+ */
+ public function fullTableName($names) {
+ $prefix = $this->getPrefix();
+
+ if (is_string($names) || $names instanceof TableNameExpression) {
+ $names = $this->_fullTableName($names);
@jippi
jippi Nov 17, 2014 Member

just return $names and drop the elseif

@jippi jippi commented on an outdated diff Nov 17, 2014
src/Database/Connection.php
+
+ return $field;
+ }
+
+/**
+ * Apply the full table name on conditions string
+ * Will replace string such as `articles.id` to `prefix_articles.id`
+ *
+ * @param string $condition Condition extracted from a QueryExpression
+ * @param string $exclude String to be excluded as a table name
+ * @return string
+ */
+ public function applyFullTableName($condition, $targets) {
+ $prefix = $this->getPrefix();
+
+ if ($prefix !== '') {
@jippi
jippi Nov 17, 2014 Member

un-nest this block :)

@jippi jippi commented on an outdated diff Nov 17, 2014
src/Database/Expression/TableNameExpression.php
+ public function getPrefix() {
+ return $this->_prefix;
+ }
+
+/**
+ * Constructor
+ *
+ * @param string $name Table name
+ * @param string $prefix Prefix to prepend
+ * @param bool $snippet Whether this expression represents a SQL snippet or just a table name (optionnaly with a prefix)
+ *
+ * @return void
+ */
+ public function __construct($name, $prefix, $snippet = false, $tablesNames = []) {
+ if ($snippet === false) {
+ if (strpos($name, '.') !== false) {
@jippi
jippi Nov 17, 2014 Member

un-nest this block

@jippi jippi commented on an outdated diff Nov 17, 2014
src/Database/Expression/TableNameExpression.php
+ $this->_tablesNames = $tablesNames;
+
+ if (!empty($field)) {
+ $this->_field = $field;
+ }
+ }
+
+/**
+ * Change the $_quoted property that to tell that the $_value property was quoted
+ *
+ * @param bool $quoted Boolean indicating whether the $_value property was quoted or not
+ *
+ * @return void
+ */
+ public function setQuoted($quoted = true) {
+ if ($quoted === true) {
@jippi
jippi Nov 17, 2014 Member

$this->_quoted = (bool)$quoted

@jippi jippi commented on an outdated diff Nov 17, 2014
src/Database/Query.php
@@ -1540,6 +1554,42 @@ public function valueBinder($binder = null) {
}
/**
+ * Extract table aliases from $tables and stores them in Query::_tableAliases
+ *
+ * @param string|array $tables Table name or array of table name
+ *
+ * @return $this
+ */
+ protected function _extractTableNames($tables) {
+ if (!empty($tables)) {
@jippi
jippi Nov 17, 2014 Member

if empty, return $this

move the indented block -1 in

@jippi
Member
jippi commented Nov 17, 2014

code-style wise there is plenty of overly nested code that can be unwrapped into a cleaner state..

logic wise I have no idea :p

@HavokInspiration
Member

@jippi Indeed 😕. I'll clean it up and take an extra look around. Thanks for pointing them out.

@HavokInspiration
Member

While working on this, I realized that when I use the connection config option "quoteIdentifiers", everythign broke down (which didn't surprised me). So I started digging to find a solution for this and came accross the IdentifierQuoter which I seriously read this time. Found out it was used in a queryTranslator which lead me to think that this entire PR might be wrong (not entirely, but for the most part).

Why not having a "TableNamePrefixer" which would traverse the entire Query object to prefix table names (possibly through the TableNameExpression) (in the same way the IdentifierQuoter does his job).
This way, this should focus most of the table name prefixing in one class, and the compiler would only have to stringify expressions and would almost be untouched from the current 3.0 branch state.

This raises one major problem : performance. This would mean that if a prefix is used AND identifiers are quoted, the Query object would be traversed twice before beggining to be compiled to a SQL string (as opposed to now : the prefix is applied during query compilation).

Note that this is written without much deep thoughts, so I don't even know if this is not something that would get complicated. I just wanted to have some feedback over the general idea before eventually start again.

@markstory
Member

This would mean that if a prefix is used AND identifiers are quoted, the Query object would be traversed twice before beggining to be compiled to a SQL string (as opposed to now : the prefix is applied during query compilation).

Both identifier quoting and table prefixing are workaround for non-ideal hosting environments/legacy database designs. If people are super concerned about performance they should be working towards solving one or both of these issues.

@lorenzo
Member
lorenzo commented Nov 19, 2014

I think we could greatly simplify all this code by not making automatic replacements in every situation, but only when the ORM is used and only for the FROM and JOIN clauses

@lorenzo
Member
lorenzo commented Nov 19, 2014

Also, given the complexity of the code so far, the expression translation process should be extracted to another class, in the same way the Identifier quoting process was implemented

@HavokInspiration
Member

Both identifier quoting and table prefixing are workaround for non-ideal hosting environments/legacy database designs. If people are super concerned about performance they should be working towards solving one or both of these issues.

That's the primary use I had for table name prefix when I had only one database on my very cheap hosting environement. But if you dig a bit, some people consider prefixing tables names a security measure against SQL injections, as it would be more difficult to target a table if you don't know its full name. But in that case, the problem lies elsewhere. It would just be a small additional countermeasure in case of a security breach.

Also, given the complexity of the code so far, the expression translation process should be extracted to another class, in the same way the Identifier quoting process was implemented

My thought exactly. This is what triggered the questionning.

I think we could greatly simplify all this code by not making automatic replacements in every situation, but only when the ORM is used and only for the FROM and JOIN clauses

Meaning that if I call the Query system without using the ORM as a starting point, no prefix ?
It would indeed simplify a lot of things.

@markstory
Member

Table prefixing as a security measure is a new reason to me. I'm not sure I believe that it is an effective countermeasure against injection though.

@josegonzalez
Member

Doesn't using the ORM protect against that?

@thaJeztah

..and in case it would't protect against it (which would be a serious issue of course), SHOW TABLES; would reveal the prefixes I guess.

@HavokInspiration
Member

Table prefixing as a security measure is a new reason to me. I'm not sure I believe that it is an effective countermeasure against injection though.

Agreed, I never bought that myself for the reason @thaJeztah stated. Was just repeating what I came across some time ago.
Anyway, I'll probably close this soon and will submit a new PR (third time is a charm, isn't it ?!) focusing only on the ORM with a cleaner way.
Hopefully this time it will be the good one.

@HavokInspiration
Member

The more I think about it, the more I feel that having connection prefix only apply to queries made from the ORM feels a little bit incomplete.

For instance, the Connection object can instanciate a Query for simple operations (through Connection::insert() or Connection::update()). However, the instance is from Database\Query and not from the ORM. Should that mean that the Query does not prefix table name ? Even though the Connection object is aware of the connection prefix ? That feels odd to me.

While I understand the wish (and the need) to not add complexity in the base code for features that are, as Mark said, "for non-ideal hosting environments/legacy database designs", it would seem like a half working feature to me.
Or maybe I just didn't understand Jose's comment.

I do admit that this needs some serious clean up though (I started working on TableNamePrefixer to transform the Query). Maybe after the clean up this will be more... "affordable" ?
But I understand this add quite a layer of complexity "just for that". And maybe most of the queries will be made from the ORM and the support needs to only be for it.

I don't really if I'm right or not to think that so... what do you think ?

I also thought of an alternate solution : make the connection prefix support available through a plugin.
Since a class name can be given for the connection provider and the database driver, this should be enough to load all the custom classes needed (including Schemas for table reflection). This way, the Database\Query object could be applying the connection prefix and this would feel less incomplete. And on top of that, this would not overload the core with more code.
This might need some more thinking though, I didn't even try this out.

HavokInspiration added some commits Dec 2, 2014
@HavokInspiration HavokInspiration Move the prefixing logic in a class (TableNamePrefixer) that acts mor…
…e or less like the IdentifierQuoter.

The Query object is traversed if a connection prefix has been found. Each clause that needs to be prefixed will be processed to have its table name prefixed.
3eb9b0d
@HavokInspiration HavokInspiration Merge branch '3.0' into 3.0-connection-prefix 810bac5
@HavokInspiration HavokInspiration Clean up 7731a42
@HavokInspiration HavokInspiration Merge branch '3.0' into 3.0-connection-prefix d0ba91e
@HavokInspiration HavokInspiration Fix CS 48e1885
@HavokInspiration HavokInspiration Merge remote-tracking branch 'upstream/2.0' into 3.0-connection-prefix 63d6234
@HavokInspiration HavokInspiration Use statement clean up 03a1d1b
@HavokInspiration HavokInspiration Refactor 96fa07e
@HavokInspiration HavokInspiration Merge remote-tracking branch 'upstream/3.0' into 3.0-connection-prefix 7958f67
@HavokInspiration HavokInspiration Refactor the TableNamePrefixer
Expressions are now traversed and prefixed in a cleaner way
b018e07
@HavokInspiration HavokInspiration Remove modifications that are no longer necessary 0ba94b0
@HavokInspiration HavokInspiration Refactor so that a Connection configured with the "quoteIdentifiers" …
…option set to true works all the way through with a prefix

The TableNameExpression was refactored a bit to sustain more parameters for the constructor without having tons of arguments.
e646837
@HavokInspiration HavokInspiration Merge remote-tracking branch 'upstream/3.0' into 3.0-connection-prefix 5f6ddf1
@HavokInspiration HavokInspiration Remove commented out code 28a493c
@HavokInspiration HavokInspiration Merge remote-tracking branch 'upstream/3.0' into 3.0-connection-prefix
Conflicts:
	src/Shell/Task/ModelTask.php
	tests/TestCase/Database/QueryTest.php
53a404e
@HavokInspiration HavokInspiration Update the call to the "field" property setter for objects implementi…
…ng the FieldInterface
bfb226c
@HavokInspiration HavokInspiration A ORM Table instance now returns its full table name (meaning includi…
…ng the connection prefix) if the table() method is called
3a578f3
@HavokInspiration HavokInspiration Delete a file that should not be there anymore 26e627e
@HavokInspiration HavokInspiration Merge remote-tracking branch 'upstream/3.0' into 3.0-connection-prefix
fa897a0
@HavokInspiration
Member

Hi there.
Just a little update.
I did not give up on the task. So far I refactored this a bit :

  • I added a TableNamePrefixer class that more or less acts as the IdentifierQuoter class. It traverses the Query object to prefix the various parts and Expressions. This led to far less modifications to the QueryCompiler (except for the "stringification" of Expressions).
  • I reworked the TableNameExpression a bit so it can have more parameters without having to pass a million arguments to the constructor.

I also started to make instances of ORM\Table to correctly output the table name when the table() method is called, but my current implementations is not working properly. I am currently working on this.

As mentionned in my last post, I am also currently trying to implement this as a plugin.
So far, it seems possible. Most of the work is to extend most of the Query system as there are some files that needs to be edited since they contain hard-coded dependencies to other class that needs modifications.

So far I don't really know what's best.
The plugin is nice in the way that it does not "pollute" the core of Cake with things that will be necessary in edge cases. However, it brings some contraints for the end user :

  • I had to extends TestFixture which means that end users will have to make their fixtures extends the TestFixture of the plugin if they are to use connection prefixes. This is needed because Fixtures schema is built with a Schema\Table instance.
  • I will have to extend ORM\Table to make the table() method output the correct table name when table() is called.
    Also, if a Query is to be built "the hard way", they would have to use the Query from the plugin.
    Not really big constraints, but they should be known.

That's where I am so far.

@lorenzo lorenzo modified the milestone: 3.0.0, 3.1.0 Jan 2, 2015
@HavokInspiration
Member

Hi there.
Sooooo. Where am I more than a month since my last post ? I surely did not give up on this (but I'll admit that this is bigger than I imagined 😛).

About the pluginization (is that a word ?) of the feature, I did give up.
While very tempting as it would have been simpler to manage, it turns out that there was too much things to rewrite and it would have been really messy to maintain.

Right now, I started almost from scratch but I did not go for a "node" based solution (meaning without a TableNameExpression as in this PR). I instead went with a regex based solution which is, in my opinion, a bit cleaner as most of the heavy lifting will be done only by two classes (instead of having to modify the QueryCompiler and everything with expression).

It's too early for me to submit a PR right now, I have some polishing to do as well as some tests to add and fix (I've just set up Travis to work on my fork and it tells me that Postgre does not really like me).
I will probably close this PR when the new one is ready.

Cheers.

@markstory
Member

Sounds good @HavokInspiration. Thanks for the update.

@josephzidell

@HavokInspiration While I don't have the time to work on this, can I suggest a different approach? Instead of defining a prefix on the connection, define it on the Table. Would that simplify things?

@HavokInspiration
Member

@josephzidell Thanks for your input. It also crossed my mind at some point.

However, the Connection object is the common object between everything that needs to apply a prefix to table names : Schema | Schema\Table object, ORM\Table object, the base Query system, etc.
Fixtures are also created using an instance of Connection (through a Schema\Table object to be precise) and they are populated with a Query instance generated from the Connection.
Which is why having the prefix in the Connection object is the perfect place to have it work with the entire framework, even though it needs more work to implement it.

@burzum
Member
burzum commented Feb 23, 2015

@HavokInspiration nice work so far! Please keep working on it! 👍

@HavokInspiration
Member

Yup, I'm still working on it.
Fixtures and adding permutations to the DatabaseSuite gave me a hard time though.
I'm adding some tests right now. And after that I'll have to do some polishing (I have a couple of things to make DRY).

@lorenzo
Member
lorenzo commented Feb 26, 2015

For people still wanting support for prefixes in their tables, there is a way to support it using events. This will probably work for all table operations. Add it to your bootstrap.php file:

EventManager::instance()->on('Model.initialize', function ($event) {
    $instance = $event->subject();
    $instance->table('prefix_' . $instance->table());
});
@AD7six
Member
AD7six commented Feb 26, 2015

If that suggestion works @lorenzo I am very much in favor of documenting and promoting it - it's much less code, and removes all potential complexity from cakephp's code.

@ionas
Contributor
ionas commented Mar 7, 2015

Does it work on a per plugin basis? Does it work with cake bake?

@lorenzo
Member
lorenzo commented Mar 7, 2015

@ionas works for plugin, but not bake

@ionas
Contributor
ionas commented Mar 8, 2015

@lorenzo any "holistic" approach that works on a per plugin basis (plugin's AppModel) and is respected by cake bake, and the whole ORM. Use case would be plugin models that have generic names like User, Product, Address, Article, Content, etc.

@HavokInspiration
Member

Closing this.
The code base has been out of sync for months, and it doesn't seem to fit what people expects from the feature.
If the decision to deal with prefix at the Connection level is taken, I'll submit a new one with another approach of the solution I've been working on.
Thanks for everyone's attention, I learnt a lot :)

@burzum
Member
burzum commented Jul 31, 2015

@lorenzo will this work also with fixtures? Or will you have to change that for tests to not use the prefix? Haven't tried and don't have the time to do so right now.

EventManager::instance()->on('Model.initialize', function ($event, $instance) {
    $instance->table('prefix_' . $instance->table());
});
@lorenzo
Member
lorenzo commented Jul 31, 2015

I would as long as you apply the prefix conditionally depending on the connection

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment