Mark Story edited this page Jun 17, 2013 · 26 revisions

Note: This wiki page reflects my thoughts, and doesn't indicate changes that will actually happen in CakePHP.

The current model class/system has served CakePHP well for the last 6 years. However, it's showing its age, and some of the decisions made in the past have aged poorly. There have been advancements in PHP as well, which allow the creation of a better data access layer.

Problems

The existing model has a few problems.

  • Frankenstein - Is it a record, or a table? Currently it's both.
  • Inconsistent API - Model::read() for example.
  • No query object - Queries are always defined as arrays, which has some limitations and restrictions.
  • Returns arrays - This is a common complaint about CakePHP, and has probably reduced adoption at some levels.
  • No record object - This makes attaching formatting methods difficult/impossible.
  • Containable - Should be part of the ORM, not a crazy hacky behaviour.
  • Recursive - This should be controlled by defining which associations are included, not by a level of recursiveness.
  • DboSource - It's a beast, and Model relies on it more than on the datasource. That separation could be cleaner.
  • Validation - Should be separate; it's a giant crazy function right now. Making it a reusable bit would make the framework more extensible.

Possible solution

Separate the row/record behaviour. App models would represent the record, and there would be an easy way to replace a table gateway for a specific model. This allows extension at the record and table gateway/entity manager level. Separating the row/table/query/driver layers allows a more flexible and robust API. Developers can choose the API that provides them what they need.

A multi-tiered API gives more extension points, increases the testability of the code, and gives developers tools to do the different kinds of jobs they need to do. Active record is great for handling single rows and simple associations. A more fully featured query API allows reporting-type features and aggregate operations to be performed. Aggregate and reporting queries don't work well with an ActiveRecord approach, and having a more complete query API will allow easy access to those kinds of features.

Example Table/Record API

<?php
// create a new record
$post = Post::create();
  
// save that record and its associations
$post->save();

// delete that record; depending on cascades, associations would be deleted too.
$post->delete();
Post::table()->delete($post);

// Fetch a record with some conditions
// Since conditions are provided, the query is evaluated right away.
// Since the query type is 'first' a single Record will be returned instead
// of a RecordSet.
$post = Post::find('first', array(
      'conditions' => array('Post.id' => 1)
));

// Gets a query object
$query = Post::find('all')->where('Post.active', true);
$query->order('Post.id', 'ASC');

// Iterating a query executes & fetches the results.
// Queries always result in a recordset.  Using find('first') can
// get a single record instead.
foreach ($query as $post) { ... }

// Include associations.  By default no associations are included.
// Included associations will be joined if possible, based on the association type
// and join information.
$posts = Post::find('all', array('contain' => array('Comment', 'User')));
$query = Post::find('all')->contain('Comment', 'User');

// update all the posts
Post::updateAll($conditions, $values);

Table objects

Table objects are a new concept to CakePHP. By default models would use a generic instance, which could be customized per model. Table objects provide a number of features:

  • They can load/persist/delete objects.
  • They provide access to table level operations like find(), delete(), update(), create().
  • They provide access to the schema, using a schema object to reflect on the existing schema.
  • They build association chains for eager/lazy loading later.

Many of the static methods on the Record classes proxy to the table object, for ease of use. Methods like

  • Post::find() => table->find()
  • Post::create() => table->create()
  • Post::updateAll() => table->update()
  • Post::deleteAll() => table->delete()
  • Post::findAllBySomeField() => table->__call()

All proxy to the table instance retrieved through Post::table().
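
The proxying described above could be built on PHP's `__callStatic()` magic method. What follows is only a rough sketch under my own assumptions: `SketchTable`, `SketchRecord`, the table-name guess, and the return values are all made up for illustration and are not proposed CakePHP classes.

```php
<?php
// Hypothetical sketch: none of these class names come from CakePHP itself.
class SketchTable
{
    private $name;

    public function __construct($name)
    {
        $this->name = $name;
    }

    // Stand-in for a real finder; it just describes what it would do.
    public function find($type = 'all')
    {
        return "find {$type} on {$this->name}";
    }

    // Magic finders such as findAllByTitle() land here.
    public function __call($method, $args)
    {
        if (strpos($method, 'findAllBy') === 0) {
            $field = lcfirst(substr($method, strlen('findAllBy')));
            return array('field' => $field, 'value' => $args[0]);
        }
        throw new BadMethodCallException("Unknown method {$method}");
    }
}

abstract class SketchRecord
{
    private static $tables = array();

    // One table gateway per record class, created on first use.
    // Assumption: the table name is the lowercased class name plus "s".
    public static function table()
    {
        $class = get_called_class();
        if (!isset(self::$tables[$class])) {
            self::$tables[$class] = new SketchTable(strtolower($class) . 's');
        }
        return self::$tables[$class];
    }

    // Forward any unknown static call to the table gateway.
    public static function __callStatic($method, $args)
    {
        return call_user_func_array(array(static::table(), $method), $args);
    }
}

class Post extends SketchRecord
{
}

$viaRecord = Post::find('all');           // proxied through __callStatic()
$viaTable = Post::table()->find('all');   // called on the table directly
$magic = Post::findAllByTitle('great');   // __callStatic() then __call()
```

Both call styles end up on the same table instance, so replacing the table gateway for one model would change the behaviour of the static shortcuts as well.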

Behaviors and callbacks

The existing CakePHP behaviors best fit onto Table objects because of the callback system. Table objects could handle callbacks like:

Query callbacks

  • beforeFind($model, $event, $query)
  • afterFind($model, $event, $recordSet)

Save/delete/validate callbacks

  • beforeDelete($model, $event)
  • afterDelete($model, $event)
  • beforeSave($model, $event)
  • afterSave($model, $event)
  • beforeValidate($model, $event)
  • afterValidate($model, $event)

All of these callbacks could be implemented as static methods on the Record class, instance methods on the table object, or in behaviors. Additionally, arbitrary callbacks could be registered with the table object.

<?php
Post::table()->Events->on('afterSave', function ($model, $event) { ... }, $options);

$options would allow controlling priority, and any other settings required. The $event object could be used to cancel further callbacks using $event->stop().
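
To make the priority and `$event->stop()` ideas a bit more concrete, here is a minimal sketch of what an event registry might look like. `SketchEvent`, `SketchEvents`, `trigger()`, and the rule that lower priority numbers run first (with a default of 10) are assumptions of mine, not a settled design.

```php
<?php
// Hypothetical sketch; SketchEvent and SketchEvents are illustrative names.
class SketchEvent
{
    public $name;
    private $stopped = false;

    public function __construct($name)
    {
        $this->name = $name;
    }

    public function stop()
    {
        $this->stopped = true;
    }

    public function isStopped()
    {
        return $this->stopped;
    }
}

class SketchEvents
{
    private $listeners = array();

    // Assumption: lower priority numbers run first, default is 10.
    public function on($name, $callback, array $options = array())
    {
        $priority = isset($options['priority']) ? $options['priority'] : 10;
        $this->listeners[$name][$priority][] = $callback;
    }

    // Calls listeners in priority order until one of them stops the event.
    public function trigger($name, $model)
    {
        $event = new SketchEvent($name);
        if (empty($this->listeners[$name])) {
            return $event;
        }
        $byPriority = $this->listeners[$name];
        ksort($byPriority);
        foreach ($byPriority as $callbacks) {
            foreach ($callbacks as $callback) {
                call_user_func($callback, $model, $event);
                if ($event->isStopped()) {
                    return $event;
                }
            }
        }
        return $event;
    }
}

$log = array();
$events = new SketchEvents();
$events->on('afterSave', function ($model, $event) use (&$log) {
    $log[] = 'late';
});
$events->on('afterSave', function ($model, $event) use (&$log) {
    $log[] = 'early';
    $event->stop(); // cancels the remaining callbacks
}, array('priority' => 1));

$event = $events->trigger('afterSave', new stdClass());
```

Here the second listener was registered later but runs first because of its priority, and since it stops the event the 'late' listener never runs.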

Behaviors

Behaviors would be retro-fitted to provide additional callbacks to models in a horizontally reusable way. They could also be used to define custom finder methods that are applied to multiple models. When a behavior is attached to a model, the priority of its callbacks can be defined.

<?php
static $actsAs = array(
    'Tree' => array('priority' => 5)
);

Or individually on methods:

<?php
static $actsAs = array(
    'Tree' => array('priority' => array('beforeFind' => 5, 'afterFind' => 6))
);

Normal instance methods on a behavior could be accessed through the table as a 'mixin' or statically on the class as they are currently.

<?php
class TestBehavior extends \Cake\Model\Behavior {
    function doIt() {
    }
}

Post::doIt();
Post::table()->doIt();

Could both work.
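
One way the 'mixin' access could work is for the table to forward unknown method calls to its attached behaviors through `__call()`. This is a sketch only: `attach()`, the class names, and passing the table as the first argument to the behavior method are assumptions on my part.

```php
<?php
// Hypothetical sketch of behavior methods exposed as table mixins.
class SketchTreeBehavior
{
    // Assumption: the table is passed in as the first argument.
    public function doIt($table)
    {
        return 'did it for ' . $table->name;
    }
}

class SketchMixinTable
{
    public $name;
    private $behaviors = array();

    public function __construct($name)
    {
        $this->name = $name;
    }

    public function attach($behavior)
    {
        $this->behaviors[] = $behavior;
    }

    // Unknown methods are looked up on the attached behaviors, in order.
    public function __call($method, $args)
    {
        foreach ($this->behaviors as $behavior) {
            if (method_exists($behavior, $method)) {
                array_unshift($args, $this);
                return call_user_func_array(array($behavior, $method), $args);
            }
        }
        throw new BadMethodCallException("Unknown method {$method}");
    }
}

$table = new SketchMixinTable('posts');
$table->attach(new SketchTreeBehavior());
$result = $table->doIt();
```

The static `Post::doIt()` form would then only need to forward to the table, which in turn finds the behavior method.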

Table methods

  • schema() Get schema for the table.
  • find() Find record(s)
  • delete()
  • update()
  • insert()

<?php
$options = array(
    'table' => 'posts',
    'connection' => 'default',
    'primaryKey' => 'id', // could be an array for composite keys.
);
$table = new \Cake\Model\Table($options);

// short form, fetches the object in the registry.
$table = Post::table();

// get the schema.
$schema = $table->schema();

// get a new record.
$post = $table->create();

// get a query object
$query = $table->find('all');
$query = $table->find();

// Delete a pile of records
$conditions = array('Post.id' => array(1,2,3));
$table->delete($conditions);

// Update a pile of records
$table->update($fields, $conditions);


// Magic methods.
$results = $table->findAllByTitle('great');
$results = $table->countByTitle('super');

Associations

Table classes also manage and maintain association data. Associations would still be declared on the model classes as static properties:

<?php
class Post extends \App\Model\AppModel {

  static $belongsTo = array(
    'User' => array('className' => 'User')
  );

  static $hasMany = array(
    'Comment' => array('className' => 'Comment')
  );
}

Associations would be available on record classes as either lazy-load properties or eagerly loaded result sets.

<?php
$post = Post::find('first', array(
  'conditions' => array('Post.title' => 'First post'),
));

// get the second comment
$post->comments[1];

Associations would be exposed on records as the lowercase pluralized form of the association name.
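
Lazy-loaded association properties could be implemented with `__get()`, running the extra query the first time the property is read and caching the result. A small sketch, assuming a made-up `SketchLazyRecord` class and a closure that stands in for the real association query:

```php
<?php
// Hypothetical sketch of lazy-loaded association properties.
class SketchLazyRecord
{
    private $data;
    private $loaders;
    private $loaded = array();
    public $queries = 0; // counts how many "queries" were run

    public function __construct(array $data, array $loaders)
    {
        $this->data = $data;
        $this->loaders = $loaders;
    }

    public function __get($name)
    {
        // Plain columns come straight from the row data.
        if (array_key_exists($name, $this->data)) {
            return $this->data[$name];
        }
        // Associations are loaded on first access and then cached.
        if (isset($this->loaders[$name])) {
            if (!array_key_exists($name, $this->loaded)) {
                $this->queries++;
                $this->loaded[$name] = call_user_func($this->loaders[$name], $this);
            }
            return $this->loaded[$name];
        }
        return null;
    }
}

$post = new SketchLazyRecord(
    array('id' => 1, 'title' => 'First post'),
    array('comments' => function ($post) {
        // Stand-in for a query like "comments WHERE post_id = ?".
        return array(
            array('post_id' => $post->id, 'text' => 'First!'),
            array('post_id' => $post->id, 'text' => 'Second.'),
        );
    })
);

$second = $post->comments[1]; // triggers the load
$first = $post->comments[0];  // served from the cache
```

An eagerly loaded association would simply pre-fill the cache when the record is built, so the same property access works in both cases.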

Manipulating associations

Adding associations on the fly would be done on the table objects:

<?php
Post::table()->addAssociation(new Cake\Model\Association\HasMany($options));
Post::table()->hasMany('Author', [...]);

Removing associations would also be done on the table level:

<?php
Post::table()->removeAssociation('Author');

There would be several built-in associations that map to the existing associations:

  • HasOne
  • BelongsTo
  • HasMany
  • HasAndBelongsToMany
  • HasManyThrough (replaces 'with')

Record classes

Record classes contain the logic related to a single row/instance. They can:

  • Access record data.
  • Provide row level formatting, like getName().
  • Access associated data and perform operations on instance data.
  • Validate themselves.
  • Save, delete, and update themselves.
  • Check for new-ness.
  • Use a serializer to convert to a format, for example to('json').
  • Use 'traits' for shared behaviour. These could either be actual PHP traits, or use a compatibility class that emulates them for PHP 5.3.

Methods:

  • save()
  • delete()
  • isNew()/exists()
  • dirty() // check if an attribute has changed.
  • validate()
  • to() // serialize to a format.

<?php
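// Conditions are passed to find() so a Record comes back, not a query object.
$post = Post::find('first', array(
    'conditions' => array('Post.id' => 1)
));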
$post->title; // get the post's title.

// The text of the first comment attached to the post.  
// Since no 'contain' was used, comments are lazy loaded with an extra query.
$post->Comment[0]->text;

// If an association was defined as an eager load, it will be done as a join, and
// eagerly loaded.
$post->User->id;

// Updating a single field.
$post->saveField('name', 'Mark');
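
A few of the record methods listed above (isNew(), dirty(), to()) are easy to sketch in isolation. The following is only an illustration under my own assumptions: the class name, the constructor arguments, and the idea of tracking changes by keeping a copy of the original data are not part of any agreed design.

```php
<?php
// Hypothetical sketch of change tracking and serialization on a record.
class SketchDirtyRecord
{
    private $original;
    private $current;
    private $new;

    public function __construct(array $data = array(), $new = true)
    {
        $this->original = $data;
        $this->current = $data;
        $this->new = $new;
    }

    public function __get($name)
    {
        return array_key_exists($name, $this->current) ? $this->current[$name] : null;
    }

    public function __set($name, $value)
    {
        $this->current[$name] = $value;
    }

    public function isNew()
    {
        return $this->new;
    }

    // With no argument: has anything changed? With a name: has that field changed?
    public function dirty($name = null)
    {
        if ($name === null) {
            return $this->current !== $this->original;
        }
        $before = array_key_exists($name, $this->original) ? $this->original[$name] : null;
        $after = array_key_exists($name, $this->current) ? $this->current[$name] : null;
        return $before !== $after;
    }

    // Only 'json' is handled in this sketch.
    public function to($format)
    {
        if ($format === 'json') {
            return json_encode($this->current);
        }
        throw new InvalidArgumentException("Unsupported format {$format}");
    }
}

$post = new SketchDirtyRecord(array('id' => 1, 'title' => 'Old title'), false);
$cleanBefore = $post->dirty();
$post->title = 'New title';
$json = $post->to('json');
```

Knowing which fields are dirty would also let save() write only the changed columns.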

Getters and Setters

Having to write attribute accessors for every property is a pain. However, there are many times you want to have methods handle the get/set of a field. By defining a get/set method you can override the default behavior:

<?php
class User extends \App\Model\AppModel {
    function setPassword($value) {
        return $this->_hash($value);
    }
}

$user = new User();
$user->password = 'something';
var_dump($user->password); // Hashed value.

By returning the updated value, you can mutate values that are going to be set on an attribute.
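
Under the hood this could be done with `__set()` and `__get()` looking for a matching setX()/getX() method. The sketch below assumes a made-up base class, uses sha1() purely as a placeholder for a real password hash, and assumes getters receive the stored value as an argument, which is my own guess at a signature.

```php
<?php
// Hypothetical sketch of accessor dispatch in a base record class.
class SketchAccessorRecord
{
    private $attributes = array();

    public function __set($name, $value)
    {
        // If setName() exists, whatever it returns is what gets stored.
        $setter = 'set' . ucfirst($name);
        if (method_exists($this, $setter)) {
            $value = $this->{$setter}($value);
        }
        $this->attributes[$name] = $value;
    }

    public function __get($name)
    {
        $value = isset($this->attributes[$name]) ? $this->attributes[$name] : null;
        // Assumption: getName() receives the stored value and returns the public one.
        $getter = 'get' . ucfirst($name);
        if (method_exists($this, $getter)) {
            return $this->{$getter}($value);
        }
        return $value;
    }
}

class User extends SketchAccessorRecord
{
    public function setPassword($value)
    {
        // sha1() is only a stand-in here, not a recommended password hash.
        return sha1($value);
    }

    public function getName($value)
    {
        return ucfirst((string)$value);
    }
}

$user = new User();
$user->password = 'something';
$user->name = 'mark';
$user->email = 'mark@example.com'; // no accessor, stored as-is
```

Fields without a matching method fall through to plain storage, so accessors only need to be written where they add something.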

Validation

Validation is currently contained in one giant function that munges through various validation definitions, runs callbacks and creates error messages. Ideally, model validation would be a separate class that can be re-used in other places as well. Separating the responsibilities will also ease testing, and make validation easier to replace should someone want that.

Whenever a model is validated the current validation rules will be converted into a validation object and the record will be validated against it.

<?php
use App\Model\AppModel;

class Post extends AppModel {
    static $validate = array(
        'title' => array('rule' => 'notEmpty', 'on' => 'create')
    );
}

$post->validate(); // returns a boolean
$post->errors; // array of invalid fields + messages.

The user-level API would be consistent with previous versions. However, there would also be a standalone validation API available.

<?php
$validator = new \Cake\Model\Validation();
$validator->add('title', array(
    'rule' => function ($value) { },
    'message' => 'Wrong!'
));
// Either a record or a hash of properties to validate.
$errors = $validator->validate(array());

Create/Update flags would be supported only when using a Record instance.
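
A standalone validator along these lines does not need much code. This is a minimal sketch, assuming a hypothetical `SketchValidator` class, rules that are callables returning true when the value is acceptable, and an error format of field name to list of messages.

```php
<?php
// Hypothetical sketch of a standalone validator.
class SketchValidator
{
    private $rules = array();

    public function add($field, array $rule)
    {
        $this->rules[$field][] = $rule;
        return $this;
    }

    // Takes a hash of properties and returns array(field => array(messages)).
    public function validate(array $data)
    {
        $errors = array();
        foreach ($this->rules as $field => $rules) {
            $value = isset($data[$field]) ? $data[$field] : null;
            foreach ($rules as $rule) {
                if (!call_user_func($rule['rule'], $value)) {
                    $errors[$field][] = $rule['message'];
                }
            }
        }
        return $errors;
    }
}

$validator = new SketchValidator();
$validator->add('title', array(
    'rule' => function ($value) {
        // A simple "not empty" check.
        return is_string($value) && trim($value) !== '';
    },
    'message' => 'Wrong!',
));

$errors = $validator->validate(array('title' => ''));
$clean = $validator->validate(array('title' => 'Hello'));
```

A record's validate() could build one of these from its $validate definition and return whether the error list came back empty.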

Query API

The Query represents an object that will turn into a datasource query. For an RDBMS this would generate SQL; for Mongo it could create an array of Mongo query parameters. Since RDBMS and other datasources implement different APIs, a basic shared API could contain:

  • where()
  • exclude() (where + not)
  • limit()
  • page()
  • order()

A SQL API should support a fuller feature set, including features CakePHP has never offered before:

  • where()
  • exclude() (where + not)
  • limit()
  • page()
  • order()
  • from()
  • fields()
  • join()
  • group()
  • having()
  • union()

<?php
$table = Post::table();
$query = $table->find(); // get a query object bound to the posts table.
$query->where(array('Post.title LIKE' => '%' . $name))
      ->limit(10)
      ->group('Post.author');

// Run the query. Iterating the query will also execute it.
$resultSet = $query->execute();

// queries can be converted to strings and executed on the database.
$db->execute((string) $query);


// Get things out of a query object.
// These methods would return results as an array.
$row = $query->one();
$rows = $query->all();

Driver API

Drivers represent connections to datasources such as relational databases or NoSQL solutions. They should offer methods for creating connections, preparing statements, escaping field content, and performing CRUD operations.

Methods

  • connect()
  • connected()
  • disconnect()
  • enabled()
  • create($table, $attributes) - Returns an insert query builder.
  • update($table, $values, $conditions) - Returns an update query builder.
  • delete($table, $conditions) - Returns a delete query builder.
  • read($table, $conditions) - Returns a select query builder.
  • selectDatabase($db)
  • listCollections()
  • describe($table)
  • begin()
  • commit()
  • rollback()

Expression objects

Expression objects allow the insertion of arbitrary SQL / query logic, and can be used anywhere a literal value could be used.

<?php
$expr = $db->expression('some sql');
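
The key difference between an expression and a literal value is that literals get bound as parameters while expressions are inserted into the query text as-is. A small sketch of that distinction, where `SketchExpression` and `sketchBuildUpdate()` are hypothetical names and the UPDATE builder is deliberately simplistic:

```php
<?php
// Hypothetical sketch: expressions are inlined, plain values are bound.
class SketchExpression
{
    private $sql;

    public function __construct($sql)
    {
        $this->sql = $sql;
    }

    public function __toString()
    {
        return $this->sql;
    }
}

// Returns array($sql, $params) for a very simple UPDATE statement.
function sketchBuildUpdate($table, array $values)
{
    $sets = array();
    $params = array();
    foreach ($values as $field => $value) {
        if ($value instanceof SketchExpression) {
            // Raw SQL goes straight into the statement.
            $sets[] = $field . ' = ' . $value;
        } else {
            // Everything else becomes a bound parameter.
            $sets[] = $field . ' = ?';
            $params[] = $value;
        }
    }
    return array('UPDATE ' . $table . ' SET ' . implode(', ', $sets), $params);
}

list($sql, $params) = sketchBuildUpdate('posts', array(
    'title' => 'Hello',
    'modified' => new SketchExpression('NOW()'),
));
```

Since expression text is never escaped, it should only ever come from application code and not from user input.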

Open issues

  • Callbacks - some callbacks are table level (beforeFind, afterFind), the rest are record level.
  • Handling saving of associated objects.

Appendix

  • FluentPDO as an example of a desirable query API.