Skip to content

Commit

Permalink
DRAFT some sort of summary builder thing
Browse files Browse the repository at this point in the history
  • Loading branch information
addshore committed Aug 17, 2015
1 parent 69af4eb commit 7eaaf4e
Show file tree
Hide file tree
Showing 2 changed files with 160 additions and 0 deletions.
91 changes: 91 additions & 0 deletions src/Summary/SummaryBuilder.php
Original file line number Diff line number Diff line change
@@ -0,0 +1,91 @@
<?php

namespace Wikibase\DataModel\Services\Summary;

use Comparable;
use Diff\DiffOp\AtomicDiffOp;
use Diff\DiffOp\Diff\Diff;
use Wikibase\DataModel\Entity\EntityDocument;
use Wikibase\DataModel\Services\Diff\EntityDiffer;
use Wikibase\DataModel\Services\Diff\ItemDiff;

/**
* This is a stupid rough draft.....@deprecated
* RFC......
*
* TODO: i18n, plurals, linking stuff?, priortise certain changes, limit?
*/
class SummaryBuilder {

/**
* @param EntityDocument $orig
* @param EntityDocument $changed
*
* @return string
*/
public function get( EntityDocument $orig, EntityDocument $changed ) {

This comment has been minimized.

Copy link
@JeroenDeDauw

JeroenDeDauw Aug 19, 2015

Contributor

Sounds like a generic getter like this. How about something like getSummaryFor ?


//TODO could just check the diff is null?
if ( $orig instanceof Comparable && $changed instanceof Comparable ) {

This comment has been minimized.

Copy link
@JeroenDeDauw

JeroenDeDauw Aug 17, 2015

Contributor

This should still work correct even if you drop && $changed instanceof Comparable

if ( $orig->equals( $changed ) ) {
return 'Null change';
}
}

//TODO isEmpty should be in an interface?

This comment has been minimized.

Copy link
@JeroenDeDauw

JeroenDeDauw Aug 19, 2015

Contributor

Seen this issue before. Created a ticket for it now at https://phabricator.wikimedia.org/T109581

Do note that the code in this class has related OCP violations, though indirectly: it assumes all entities have labels, descriptions, aliases and statements, and won't hold into account additional types of content from new entity types. For this class at least, it makes sense to solve those two issues at the same time since they have the same root cause.

How about having a summary formatter per entity type?

if ( method_exists( $changed, 'isEmpty' ) ) {
if ( $changed->isEmpty( $changed ) ) {
return 'Entity Emptied';
}
}

//TODO inject entityDiffer
$entityDiffer = new EntityDiffer();
$entityDiff = $entityDiffer->diffEntities( $orig, $changed );

/** @var Diff[] $diffs */
$diffs = array();
$diffs['labels'] = $entityDiff->getLabelsDiff();
$diffs['descriptions'] = $entityDiff->getDescriptionsDiff();
$diffs['aliases'] = $entityDiff->getAliasesDiff();
$diffs['claims'] = $entityDiff->getClaimsDiff();
if( $entityDiff instanceof ItemDiff ) {
$diffs['sitelinks'] = $entityDiff->getSiteLinkDiff();
}

foreach ( $diffs as $type => $diff ) {
if( $diff->isEmpty() ) {
unset( $diffs[$type] );
}
}

$summaryParts = array();
foreach( $diffs as $type => $diff ) {
/** @var AtomicDiffOp[] $diffs */
$typeDiffs = array();
$typeDiffs['added'] = $diff->getAdditions();
$typeDiffs['removed'] = $diff->getRemovals();
$typeDiffs['changed'] = $diff->getChanges();

$typeSummaryParts = array();
foreach ( $typeDiffs as $typeType => $typeDiff ) {
if( !empty( $typeDiff ) ) {
$typeSummaryParts[] = $typeType . ' ' . count( $typeDiff ) . ' ' . $type;
}
}

if( !empty( $typeSummaryParts ) ) {
$summaryParts[] = implode( ', ', $typeSummaryParts );
}

}

if ( !empty( $summaryParts ) ) {
return implode( ': ', $summaryParts );
}

return 'Could not create a good summary.....';

}

This comment has been minimized.

Copy link
@JeroenDeDauw

JeroenDeDauw Aug 19, 2015

Contributor

Split split split split


}
69 changes: 69 additions & 0 deletions tests/unit/Summary/SummaryBuilderTest.php
Original file line number Diff line number Diff line change
@@ -0,0 +1,69 @@
<?php

namespace Wikibase\DataModel\Services\Tests;

use Wikibase\DataModel\Entity\EntityDocument;
use Wikibase\DataModel\Entity\Item;
use Wikibase\DataModel\Services\Summary\SummaryBuilder;

class SummaryBuilderTest extends \PHPUnit_Framework_TestCase {

public function provideStuff() {
$emptyItem = new Item();

$itemWithOneLabelEn = new Item();
$itemWithOneLabelEn->setLabel( 'en', 'Foo' );

$itemWithOneDescriptionEnFoo = new Item();
$itemWithOneDescriptionEnFoo->setDescription( 'en', 'Foo' );

$itemWithOneDescriptionEnBar = new Item();
$itemWithOneDescriptionEnBar->setDescription( 'en', 'Bar' );

$itemWithOneDescriptionDe = new Item();
$itemWithOneDescriptionDe->setDescription( 'de', 'Foo' );

return array(
array(
$emptyItem,
$emptyItem,
'Null change'
),
array(
$itemWithOneDescriptionEnFoo,
$itemWithOneDescriptionEnFoo,
'Null change'
),
array(
$itemWithOneDescriptionEnFoo,
$emptyItem,
'Entity Emptied'
),
array(
$emptyItem,
$itemWithOneDescriptionEnFoo,
'added 1 descriptions'
),
array(
$itemWithOneDescriptionEnFoo,
$itemWithOneDescriptionEnBar,
'changed 1 descriptions'
),
array(
$itemWithOneDescriptionEnFoo,
$itemWithOneDescriptionDe,
'added 1 descriptions, removed 1 descriptions'
),
);
}

/**
* @dataProvider provideStuff
*/
public function testGetPropertyIds( EntityDocument $from, EntityDocument $to, $expectedSummary ) {
$builder = new SummaryBuilder();
$summary = $builder->get( $from, $to );
$this->assertEquals( $expectedSummary, $summary );
}

}

5 comments on commit 7eaaf4e

@addshore
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other notes / thoughts:

  • If a summary is approaching / over a specifed limit then fallback to saying "Entity changed" or something like that
  • Instead of a string this could simply be some sort of code (that could be used in summary formatted stuff) which means it might not go in this repo...... but then it coul just return something like l+1:d1-1+1|c+(P123|Foo)-(P99|Q12)|s+1 ...... maybe that looks too ugly...

@thiemowmde
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be a public method to call this with a Diff object. We do have code that works on a Diff object and not on two Entity objects.

@addshore
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The two entity objects are used to determine if the entity has been emptied and could probably be used to detect a new entity creation.
I don't think either can be done with a Diff object alone

@thiemowmde
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is a precalculated EntityDiff object, why not reuse it? All you need to do to make this possible is to split this code into two public functions.

@JeroenDeDauw
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. If there are cases where we only have an EntityDiff and not the entities from which the diff was created, and we want to make a summary based on the diff, then that could also be done via this code. And as long as it is structured sanely, we can trivially expose that functionality by making a method public.

Do we have such places though? I understand that there are ones that already have an EntityDiff object, though as addshore pointed out, you can make better summaries if you have the entities themselves. (Or at least one of them and the diff.)

Please sign in to comment.