Ceph ODM

This repository provides a Ceph ODM based on the Doctrine mapper skeleton.


Installation

The recommended way to install this project is with Composer:

$ composer require coffreo/ceph-odm

Basic usage

First, you need to instantiate an Amazon S3 client:

$s3Client = new \Aws\S3\S3Client([
    'region' => '',
    'version' => '2006-03-01',
    'endpoint' => 'http://my-ceph-server/',
    'use_path_style_endpoint' => true,
    'credentials' => ['key' => 'userAccessKey', 'secret' => 'userSecretKey'],
]);

use_path_style_endpoint is important: it makes the client generate path-style URLs such as http://my-ceph-server/mybucket instead of virtual-hosted-style URLs, in which the bucket name is part of the host name.

Once your client is instantiated, use it to create your ObjectManager:

$objectManager = \Coffreo\CephOdm\Factory\ObjectManagerFactory::create($s3Client);

Note that you can pass a Doctrine\Common\EventManager as the second argument of create if you have to deal with Doctrine events.
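
A minimal sketch of that second argument, reusing $s3Client from above. Which events the library dispatches is not covered here, so listener registration is left as a placeholder:

```php
// Create a Doctrine event manager; listeners or subscribers would be registered on it.
$eventManager = new \Doctrine\Common\EventManager();

// Pass it as the second argument of create, as described above.
$objectManager = \Coffreo\CephOdm\Factory\ObjectManagerFactory::create($s3Client, $eventManager);
```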

Create a bucket

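Before creating objects, you must create a bucket to store them in:

$objectManager->persist(new \Coffreo\CephOdm\Entity\Bucket('my-bucket'));
$objectManager->flush(); // assuming the standard Doctrine ObjectManager::flush(), which executes pending operations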

Create a new object

$object = new \Coffreo\CephOdm\Entity\File();
$object->setBucket(new \Coffreo\CephOdm\Entity\Bucket('my-bucket'));
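$object->setAllMetadata(['my-metadata1' => 'my-value1', 'my-metadata2' => 'my-value2']);
$objectManager->persist($object);
$objectManager->flush(); // assuming the standard Doctrine ObjectManager::flush(), which executes pending operations

echo $object->getId(); // e.g. e223fc11-8046-4a84-98e2-0de912d071e9, available once the object is stored

Be careful: only lowercase strings are accepted as metadata keys.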
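
If your metadata keys come from user input or another system, one way to satisfy this constraint is to lowercase them before calling setAllMetadata. The helper below is hypothetical (it is not part of the library) and relies only on PHP's built-in array_change_key_case:

```php
// Hypothetical helper, not provided by the library: lowercases every metadata key.
// If two keys differ only by case, the later one overwrites the earlier one.
function normalizeMetadataKeys(array $metadata): array
{
    return array_change_key_case($metadata, CASE_LOWER);
}

$metadata = normalizeMetadataKeys(['My-Metadata1' => 'my-value1', 'MY-METADATA2' => 'my-value2']);
// $metadata can now be passed to $object->setAllMetadata($metadata)
```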

Update an object

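$object->addMetadata('my-metadata2', 'my-new-metadata-value');
$objectManager->flush(); // assuming the standard Doctrine ObjectManager::flush(), which saves the change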

Remove an object
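
A minimal sketch, assuming the object manager exposes the standard Doctrine ObjectManager methods remove() and flush(), and that $object was previously loaded or persisted:

```php
// Schedule the object for removal, then execute the pending operation.
$objectManager->remove($object);
$objectManager->flush();
```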


Duplicate an object

You can easily clone an object by persisting it again. The only thing to keep in mind is to detach the entity first:

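$object = $fileRepository->find(/* ... */);
$objectManager->detach($object);

// You can update (or not) the object properties before saving it

$objectManager->persist($object);
$objectManager->flush(); // assuming the standard Doctrine ObjectManager::flush()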

The object will be saved with a new id. You can also save it to another bucket:

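$object = $fileRepository->find(/* ... */);
$objectManager->detach($object);

$object->setBucket(new \Coffreo\CephOdm\Entity\Bucket('my-bucket-2'));
// You can update (or not) the object properties before saving it

$objectManager->persist($object);
$objectManager->flush(); // assuming the standard Doctrine ObjectManager::flush()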


Find an object by its identifiers

Bucket and id are the primary identifiers of objects.

$fileRepository = $objectManager->getRepository(\Coffreo\CephOdm\Entity\File::class);
$object = $fileRepository->find([new \Coffreo\CephOdm\Entity\Bucket('my-bucket'), 'e223fc11-8046-4a84-98e2-0de912d071e9']);

echo $object->getFilename();    // test.txt

In repository find methods, the bucket can be given in your criteria either as a bucket name or as a Bucket object:

$object = $fileRepository->find([new \Coffreo\CephOdm\Entity\Bucket('my-bucket'), 'e223fc11-8046-4a84-98e2-0de912d071e9']);

is equivalent to:

$object = $fileRepository->find(['my-bucket', 'e223fc11-8046-4a84-98e2-0de912d071e9']);

Other find methods

$objects = $fileRepository->findAll();  // All objects of all buckets

$objects = $fileRepository->findBy(['bucket' => 'my-bucket']);  // All objects of the bucket
$objects = $fileRepository->findBy(['id' => 'e223fc11-8046-4a84-98e2-0de912d071e9']); // All objects in any bucket of the given id

The previous statements only return objects owned by the logged-in user. For now, you can only search on bucket and/or id.

Filter results by metadata

You can also use metadata as a filter:

$objects = $fileRepository->findBy(['bucket' => 'my-bucket', 'metadata' => ['mymetadata' => 'myvalue']]);

Be careful: this is only a filter. It is not native: all files are retrieved first, and the filtering is done afterwards. Furthermore, the criterion 'metadata' => [] does not return only the files that have no metadata. It means that no metadata filter is applied, so all files matching the other criteria, if any, are returned.
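
The semantics described above can be illustrated with plain PHP arrays standing in for files. This is only a sketch, not the library's implementation: filterByMetadata is a hypothetical function, and it assumes a file matches when every requested key/value pair is present in its metadata:

```php
// Illustration only: plain arrays stand in for files already retrieved from the server.
function filterByMetadata(array $files, array $wanted): array
{
    return array_values(array_filter($files, function (array $file) use ($wanted): bool {
        foreach ($wanted as $key => $value) {
            if (!array_key_exists($key, $file['metadata']) || $file['metadata'][$key] !== $value) {
                return false;
            }
        }
        return true; // also reached when $wanted is empty: no metadata filter at all
    }));
}

$files = [
    ['id' => 'a', 'metadata' => ['mymetadata' => 'myvalue']],
    ['id' => 'b', 'metadata' => []],
];

$matching = filterByMetadata($files, ['mymetadata' => 'myvalue']); // keeps only file 'a'
$all = filterByMetadata($files, []);                               // keeps both files
```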

Sort results

The results can be sorted, but this is not a database sort. The sort is done programmatically, so it is not optimized, and it is applied after the bucket limit. By default, the results are ordered by bucket name and id. To order a query by the filename metadata (descending) and then by id (ascending):

$objects = $fileRepository->findBy([], ['metadata' => ['filename' => -1], 'id' => 1]);
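
Because the sort runs after the limit, a sorted and truncated result is not the same as the first entries of a fully sorted list. The sketch below illustrates this with plain PHP arrays. The data and the limit of 2 are made up for the example, and this is not the library's code:

```php
// Illustration only: three files as the server might list them, with a limit of 2.
$files = [
    ['id' => '1', 'filename' => 'a.txt'],
    ['id' => '2', 'filename' => 'c.txt'],
    ['id' => '3', 'filename' => 'z.txt'],
];

// The limit is applied first...
$limited = array_slice($files, 0, 2);

// ...and only then the sort: filename descending, then id ascending.
usort($limited, function (array $x, array $y): int {
    return strcmp($y['filename'], $x['filename']) ?: strcmp($x['id'], $y['id']);
});

// $limited now holds c.txt then a.txt; z.txt is absent although it sorts first overall.
```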

Truncated results

For the find methods that return several files (findBy and findAll), if there are too many results (more than the limit you specified, or 1000 by default), getBucketsTruncated returns the names of the buckets whose files could not all be returned:

// Let's set the limit to 10
$objects = $fileRepository->findBy(['bucket' => 'mybucket'], [], 10);
foreach ($objects->getBucketsTruncated() as $bucketName) {
    // Some files of the bucket $bucketName ('mybucket' in our case) were not returned
}

Resume truncated queries

You can use the continue parameter to resume a previously truncated query. For instance, to retrieve the files of mybucket that were not returned by the query above:

// This call may have to be repeated: call it in a loop until $objects->getBucketsTruncated() returns an empty array.
$objects = $fileRepository->findBy(['bucket' => 'mybucket'], [], null, 1);

To make this possible, the repository keeps, for each bucket, a pointer to the last file returned. Note that this pointer is updated whenever another query is run on the bucket; the calls below all update the pointer for bucket mybucket:

  • findBy(['bucket' => 'mybucket'])
  • findOneBy(['id' => 'myid'])
  • findBy([])
  • findAll

Only find never modifies the internal pointer.

Here is another example, which retrieves all the files of the connected user:

$truncated = [];
do {
   $objects = $fileRepository->findBy([], [], null, $truncated ? 1 : 0);
   // Do something with objects
   $truncated = $objects->getBucketsTruncated();
} while ($truncated);

Note that you can also use findAll for the first call.

Finally, the findByFrom method returns files starting after the given identifier:

$objects = $fileRepository->findByFrom(['bucket' => 'mybucket'], ['mybucket' => 'myid3']);
// Returns files myid4, myid5, myid6... but not myid3
// Since the criteria specify the bucket, you can simplify this to: findByFrom(['bucket' => 'mybucket'], 'myid3')

Lazy load

When a query can return multiple results (i.e. a query that does not specify both bucket and id), bin and metadata are not loaded immediately, since getting them requires an additional server call per result. In these cases the library uses a lazy-load strategy and retrieves bin and metadata only when getBin, getAllMetadata, getMetadata or setMetadata is called. You normally won't have to worry about this, but it is useful to be aware of it.
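
A short sketch of where the extra calls happen. It assumes that the result of findBy can be iterated with foreach and that the id is already known from the listing; neither point is confirmed in this section:

```php
$objects = $fileRepository->findBy(['bucket' => 'my-bucket']);

foreach ($objects as $object) {
    echo $object->getId();                 // assumed to need no additional server call
    $metadata = $object->getAllMetadata(); // triggers the lazy load for this file
}
```

With many results, this means one additional server call per file whose bin or metadata is read.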
