Contains facilities to group data together

CrazyCodr/Data/Grouper

This package contains facilities to easily regroup data from any enumerable source.

This package features a grouping iterator accompanied by different groupers and aggregators that you can use to regroup and aggregate data. The GroupingIterator is not an IteratorIterator per se, because it has to precompile every result instead of just iterating the inner iterator.

Table of contents

  1. Installation
  2. Creating a basic grouping iterator
  3. Supporting many groupers at once
  4. Adding aggregators to the lot
  5. Creating your own testable classes

Installation

To install it, add this requirement to your composer.json:

{
    "require": {
        "crazycodr/data-grouper": "2.*"
    }
}

Then run composer install or composer update as necessary.

Creating a basic grouping iterator

Creating a grouping iterator requires at least three items:

  1. A GroupResult used to contain the results of your different groups
  2. A GroupingIterator used to iterate your data and provide an iterator on the resulting data
  3. A Grouper used to regroup your data and create more "Results"

(Note: This code assumes that you have an array-based data source whose rows have the keys name, type, sex and age)
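For illustration, a data source matching the note above might look like the following. The names and values are hypothetical; only the four keys come from this README.

```php
<?php
// Hypothetical sample records. The keys (name, type, sex, age) are the ones
// read by the closures used throughout this README; the values are made up.
$data = array(
    array('name' => 'John Doe',  'type' => 'manager',  'sex' => 'male',   'age' => 35),
    array('name' => 'Jane Doe',  'type' => 'engineer', 'sex' => 'female', 'age' => 30),
    array('name' => 'Jim Smith', 'type' => 'engineer', 'sex' => 'male',   'age' => 41),
    array('name' => 'Ann Jones', 'type' => 'manager',  'sex' => 'female', 'age' => 28),
);
```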

$mygrouper = new GroupingIterator(new GroupResult(), $data);
$mygrouper->addGrouper(new ClosureGrouper(function($a){ return $a['type']; }), 'type');

//Prepare the data by calling "rewind".
//Note that each "rewind" (the first operation of a foreach) pre/re-compiles all the data.
//Because the grouping is repeated every time, a large or slow data source can make the
//iterator slow. Avoid repeated foreach loops, or put a caching iterator between your data
//and your grouping iterator.
$mygrouper->rewind();

//Iterating the master iterator would simply return all the values as we already know them.
//Instead, iterate the result of getGroups(), which returns the first grouping level ("type" in this case).
foreach($mygrouper->getGroups() as $group)
{
    echo '<h1>Group '.$group->getGroupValue().'</h1>';
    //Use a distinct variable name so the original $data source is not overwritten
    foreach($group as $employee)
    {
        echo 'Employee: '.$employee['name'].'<br>';
    }
    echo '<hr>';
}
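The advice about slow data sources can be sketched in plain PHP. This is only one possible approach, and it assumes that GroupingIterator accepts any array or Traversable, which this README implies but does not state. The fetchEmployees function and the $fetchCount counter are hypothetical stand-ins for an expensive source.

```php
<?php
// Hypothetical slow source: every call to this function repeats the expensive work.
function fetchEmployees(&$fetchCount)
{
    $rows = array(
        array('name' => 'John Doe', 'type' => 'manager'),
        array('name' => 'Jane Doe', 'type' => 'engineer'),
    );
    foreach ($rows as $row) {
        $fetchCount++; // stands in for a slow query or remote call
        yield $row;
    }
}

$fetchCount = 0;

// Read the source exactly once and keep the rows in memory.
$cached = new ArrayIterator(iterator_to_array(fetchEmployees($fetchCount), false));

// Iterating (and therefore rewinding) the cached copy twice
// does not touch the slow source again.
$names = array();
foreach ($cached as $row) { $names[] = $row['name']; }
foreach ($cached as $row) { $names[] = $row['name']; }
```

Under that assumption, $cached would be passed to the GroupingIterator in place of the raw source, so that repeated rewinds only re-read the in-memory rows.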

Supporting many groupers at once

To support many groupers, just add another grouper to the GroupingIterator. Loop over the groups as usual, then loop over the groups of each group (getGroups) to get the subgroups produced by the new grouper.

//Set up a grouping iterator that will group employees by sex, then by type
$mygrouper = new GroupingIterator(new GroupResult(), $data);
$mygrouper->addGrouper(new ClosureGrouper(function($a){ return $a['sex']; }), 'sex');
$mygrouper->addGrouper(new ClosureGrouper(function($a){ return $a['type']; }), 'type');

//Remember to rewind to prepare the datasource
$mygrouper->rewind();

//Iterate the master iterator's groups, then iterate the subgroups of each group.
foreach($mygrouper->getGroups() as $group)
{
    echo '<h1>Sex group '.$group->getGroupValue().'</h1>';
    foreach($group->getGroups() as $subgroup)
    {
        //Read the value and the rows from the subgroup, not from the parent group
        echo '<h2>Type group '.$subgroup->getGroupValue().'</h2>';
        foreach($subgroup as $employee)
        {
            echo 'Employee: '.$employee['name'].'<br>';
        }
        echo '<hr>';
    }
}
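To make the shape of the result easier to picture, here is a library-free sketch of two-level grouping using the same two closures. It is not how GroupingIterator is implemented; the sample rows and the $tree variable are hypothetical.

```php
<?php
// Hypothetical sample records.
$data = array(
    array('name' => 'John Doe',  'type' => 'manager',  'sex' => 'male'),
    array('name' => 'Jane Doe',  'type' => 'engineer', 'sex' => 'female'),
    array('name' => 'Jim Smith', 'type' => 'engineer', 'sex' => 'male'),
    array('name' => 'Ann Jones', 'type' => 'manager',  'sex' => 'female'),
);

// The same closures that are handed to the two ClosureGrouper instances above.
$bySex  = function ($a) { return $a['sex']; };
$byType = function ($a) { return $a['type']; };

// First level keyed by sex, second level keyed by type, leaves are employee names.
$tree = array();
foreach ($data as $row) {
    $tree[$bySex($row)][$byType($row)][] = $row['name'];
}

// Walk the two levels in the same order as the nested foreach loops above.
foreach ($tree as $sex => $types) {
    foreach ($types as $type => $names) {
        echo $sex.' / '.$type.': '.implode(', ', $names)."\n";
    }
}
```

The order in which groupers are added determines the nesting: the first grouper produces the outer groups and the second produces the subgroups inside each of them.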

Adding aggregators to the lot

One really convenient use for the grouper is to calculate aggregated values on the fly for each group. Adding an aggregator is as simple as what we have done so far: create an aggregator, such as a CountAggregator or an AverageClosureAggregator, and add it with addAggregator.

//Set up a grouping iterator that will group employees by sex, then by type, and compute their count and average age
$mygrouper = new GroupingIterator(new GroupResult(), $data);
$mygrouper->addGrouper(new ClosureGrouper(function($a){ return $a['sex']; }), 'sex');
$mygrouper->addGrouper(new ClosureGrouper(function($a){ return $a['type']; }), 'type');
$mygrouper->addAggregator(new CountAggregator('employeeCount'));
$mygrouper->addAggregator(new AverageClosureAggregator(function($a){ return $a['age']; }), 'averageAge');

//Remember to rewind to prepare the datasource
$mygrouper->rewind();

//Iterate the master iterator's groups, then iterate the subgroups of each group.
//Aggregated values are read from the group (or subgroup) they belong to, not from $this.
foreach($mygrouper->getGroups() as $group)
{
    echo '<h1>Sex group '.$group->getGroupValue().' (Average age of the '.$group->getAggregationValue('employeeCount').' employees is '.$group->getAggregationValue('averageAge').' years old)</h1>';
    foreach($group->getGroups() as $subgroup)
    {
        echo '<h2>Type group '.$subgroup->getGroupValue().' (Average age of the '.$subgroup->getAggregationValue('employeeCount').' employees is '.$subgroup->getAggregationValue('averageAge').' years old)</h2>';
        foreach($subgroup as $employee)
        {
            echo 'Employee: '.$employee['name'].'<br>';
        }
        echo '<hr>';
    }
}

The different aggregators that exist are the following:

  1. MinClosureAggregator
  2. MaxClosureAggregator
  3. SumClosureAggregator
  4. AverageClosureAggregator
  5. CountAggregator

All aggregators except the CountAggregator are closure based, meaning you must give them a closure that retrieves the value to work with. The CountAggregator only increments an internal counter as it receives data, so it doesn't need any special value to work with.
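As a rough illustration of what each aggregator is expected to compute for a single group, the following plain-PHP sketch applies a value-extracting closure to a few hypothetical rows. It does not use the library's classes and makes no claim about how they are implemented.

```php
<?php
// Hypothetical rows belonging to one group.
$group = array(
    array('name' => 'John Doe',  'age' => 35),
    array('name' => 'Jim Smith', 'age' => 41),
    array('name' => 'Al Brown',  'age' => 26),
);

// The kind of closure a closure-based aggregator receives:
// it extracts the value to aggregate from one row.
$age = function ($a) { return $a['age']; };

// Extract the value from every row of the group.
$ages = array_map($age, $group);

$result = array(
    'count'   => count($group),                   // counting needs no closure
    'min'     => min($ages),
    'max'     => max($ages),
    'sum'     => array_sum($ages),
    'average' => array_sum($ages) / count($ages),
);

print_r($result);
```

The count is the only value that can be derived without looking inside the rows, which is why CountAggregator is the one aggregator that takes no closure.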

Creating your own testable classes

The point of this library is to avoid re-creating the iterators and their sub-components each time, and to make the whole lot easy to test. To this end, simply create concrete extensions of your iterators and sub-components and then test them.

class SexGrouper extends ClosureGrouper
{
    public function __construct()
    {
        //Pass the closure itself to the parent, as when building a ClosureGrouper directly
        parent::__construct(function($a){ return $a['sex']; });
    }
}
class TypeGrouper extends ClosureGrouper
{
    public function __construct()
    {
        parent::__construct(function($a){ return $a['type']; });
    }
}
class GroupByTypeAndSexIterator extends GroupingIterator
{
    public function __construct($data)
    {
        parent::__construct(new GroupResult(), $data);
        $this->addGrouper(new SexGrouper(), 'sex');
        $this->addGrouper(new TypeGrouper(), 'type');
        $this->addAggregator(new CountAggregator('employeeCount'));
        $this->addAggregator(new AverageClosureAggregator(function($a){ return $a['age']; }), 'averageAge');
    }
}

It might look extreme, but this way you create concrete, functional groupers that can be reused and tested. Note that PHPUnit data providers are a great way to test Groupers and Aggregators, but they are an awkward fit for testing GroupingIterators.

class SexGrouperTest extends PHPUnit_Framework_TestCase
{

    /**
    * @dataProvider sexGrouperDataProvider
    */
    public function testGroupedOnValue($expected, $testdata)
    {
        $grouper = new SexGrouper();
        $this->assertEquals($expected, $grouper->getGroupedOnValue($testdata));
    }

    //Each data set is passed to the test method as positional arguments: ($expected, $testdata)
    public function sexGrouperDataProvider()
    {
        return array(
            array('male', array('name' => 'John Doe', 'age' => 35, 'sex' => 'male')),
            array('female', array('name' => 'Jane Doe', 'age' => 30, 'sex' => 'female')),
        );
    }

}