Path handling

dbu edited this page Jul 30, 2011 · 5 revisions

Overview

One of the big questions is how paths should be handled, specifically whether paths can be considered equivalent to IDs in the other ODMs.

See also http://groups.google.com/group/symfony-cmf-devs/browse_thread/thread/73c2e88037876362

Conclusion: The path is the document id for phpcr-odm.

IDs vs. paths

An ID is usually a semi-random string (or integer) with no special meaning aside from uniquely identifying a Document. A path, on the other hand, is constructed from the parent path plus a unique subpath.

IDs usually have the following characteristics:

  1. unique
  2. non-NULL
  3. immutable (for the most part)

Examples: 42, 234ldsfdsf23432423

Paths are:

  1. unique
  2. non-NULL
  3. somewhat less immutable

Examples: /foo/bar (bar node on /foo parent), /test/bar/foo (foo node on /test/bar parent)
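
The relationship between a path, its parent path and its node name can be sketched in plain PHP. The helper names joinPath and splitPath are hypothetical and not part of PHPCR or phpcr-odm; the sketch assumes absolute, slash-separated paths like the examples above.

```php
<?php
// Hypothetical helpers for illustration only; not part of PHPCR or phpcr-odm.

/** Build an absolute path from a parent path and a node name. */
function joinPath(string $parentPath, string $nodeName): string
{
    // The root node is "/", so strip a trailing slash to avoid "//".
    return rtrim($parentPath, '/') . '/' . $nodeName;
}

/** Split an absolute path into [parent path, node name]. */
function splitPath(string $path): array
{
    // Assumes $path starts with "/" and names a node below the root.
    $pos = strrpos($path, '/');
    $parent = $pos === 0 ? '/' : substr($path, 0, $pos);

    return [$parent, substr($path, $pos + 1)];
}
```

With these helpers, /test/bar/foo splits into the parent /test/bar and the node name foo, and joining them again yields the original path.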

Problem definition

The key issue is that in some cases a user might want to simply reference a node, while in others the user might want to move the node instead. The latter case in particular means the path is not immutable, which could lead to confusion.

Additional considerations

The goal of PHPCR ODM is to make it easier to share both knowledge and code with other ODMs. The resulting behavior of PHPCR ODM should, however, not violate best practices. Therefore, whenever there is no way to find a compatible API that does not violate best practices for PHPCR, it is preferred to not provide anything. If necessary, users will need to access the Jackalope API to reach PHPCR-specific functionality that has no counterpart in the other ODMs, similar to how one would fall back to the DBAL to manually start a transaction with the ORM.

Use cases

Creating a new Document instance and storing it

In this case it needs to be attached to an existing node (or implicitly to the root?) and a unique relative path needs to be defined.

$article = new Foo\Article();
$dm->persist($article);
$dm->flush();

Renaming a subpath of a node

The name of the subpath is changed without affecting the parent node.

// $comment is assumed to be a previously stored Document
$path = explode('/', $comment->path);
$path[count($path) - 1] = 'somenewname';
$comment->path = implode('/', $path);
$dm->flush();
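
The string manipulation behind such a rename can be isolated in a small helper. This is a minimal sketch; renameLastSegment is a hypothetical name and not an existing phpcr-odm function. It only computes the new path and does not touch the repository.

```php
<?php
// Hypothetical helper for illustration only; not part of phpcr-odm.

/** Replace the last segment of a path, leaving the parent path untouched. */
function renameLastSegment(string $path, string $newName): string
{
    $segments = explode('/', $path);
    // The last array element is the node's own name.
    $segments[count($segments) - 1] = $newName;

    return implode('/', $segments);
}
```

Whether assigning the resulting path back to the Document should trigger a move on flush is exactly the open question of this page.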

Associating a previously stored Document instance with another one

In this case it is unclear whether the intention is a reference, a move operation or multiple parent nodes.

$comment = $fooArticle->comments[0];
$barArticle->comments[] = $comment;
$dm->flush();

A variation of this is if the previously stored Document is unset on the previous parent node.

$comment = $fooArticle->comments[0];
unset($fooArticle->comments[0]);
$barArticle->comments[] = $comment;
$dm->flush();

Related concepts in the ODM

MongoDB identifies documents within a collection by an _id field, which by default is an auto-generated ObjectId rather than a simple incremented integer. CouchDB uses UUIDs. Neither has a concept similar to paths in PHPCR/JCR, but PHPCR/JCR does have support for UUIDs. However, assigning a UUID is optional and generally discouraged.

Both the MongoDB and CouchDB ODMs support the concept of weak references and embedded documents. Embedded documents cannot be referenced elsewhere (is this true?). PHPCR/JCR also has the concept of normal references, which are similar to foreign keys; however, their use is generally discouraged. The concept of embedded documents does not exist in PHPCR/JCR, but there is the concept of parent<->child Documents.

Another aspect is that by default both MongoDB and CouchDB provide auto-generated IDs. Currently PHPCR ODM does not provide this. In theory the RepositoryPathGenerator could have a default implementation in the DocumentRepository that would simply combine the parent path (or the root node if no parent is set) and a randomly generated subpath.
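
Such a default could look roughly like the following. This is only a sketch of the idea, not the actual RepositoryPathGenerator interface: the function name generateDefaultPath and its signature are assumptions made for illustration.

```php
<?php
// Hypothetical sketch; the real RepositoryPathGenerator API may look different.

/**
 * Combine the parent path (or the root node if none is given)
 * with a randomly generated subpath.
 */
function generateDefaultPath(?string $parentPath = null): string
{
    // Fall back to the root node when no parent is set.
    $parent = $parentPath ?? '/';

    // 8 random bytes, hex encoded to a 16 character node name.
    $name = bin2hex(random_bytes(8));

    return rtrim($parent, '/') . '/' . $name;
}
```

Note that a random name makes collisions unlikely but does not strictly guarantee uniqueness; a real implementation would probably have to check whether the path already exists.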

Real world example

A text block will in most cases only be used in one place, while an image inside an asset library will likely be used in multiple locations.

Maybe a good approach would be to configure the text node as an embedded document and the image as a weak reference. As a result, assigning the text to a new parent would end up with multiple parents (do we want to encourage this, or rather throw an Exception?), or, when it is un-assigned from the old parent in the same transaction, it would be a move. Something configured as a reference would obviously use a weak reference. Maybe we should add support for optionally using normal references as well; then again, we could define that for these kinds of references the Jackalope API should be used.
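
The decision rule proposed above can be written down as a small function. This is a sketch of the proposal only; the constants, the function name resolveAssignment and the returned labels are all hypothetical and do not exist in phpcr-odm.

```php
<?php
// Hypothetical sketch of the proposed rule; nothing here is phpcr-odm API.

const MAPPING_EMBEDDED = 'embedded';
const MAPPING_REFERENCE = 'reference';

/**
 * Decide what assigning an already stored Document to a new parent means.
 *
 * @param string $mapping              how the field is configured
 * @param bool   $removedFromOldParent whether it was unset on the old
 *                                     parent in the same transaction
 */
function resolveAssignment(string $mapping, bool $removedFromOldParent): string
{
    if ($mapping === MAPPING_REFERENCE) {
        // e.g. an image from the asset library
        return 'weak-reference';
    }

    if ($mapping === MAPPING_EMBEDDED) {
        // e.g. a text block: unset + assign is a move, assign alone
        // would create a second parent (or could throw instead).
        return $removedFromOldParent ? 'move' : 'multi-parent';
    }

    throw new InvalidArgumentException("Unknown mapping: $mapping");
}
```

If multiple parents should not be encouraged, the 'multi-parent' branch is where an Exception would be thrown instead.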

Implementation

Some notes on how tree operations are done in PHPCR: Tree Operations