Skip to content

Enhancement of ElasticSearch support, SearchManager, and Serialization #77

Merged
merged 14 commits into from Jun 19, 2013

3 participants

@MrHash
MrHash commented Apr 28, 2013

Various updates to ElasticSearch integration. Breaking changes to Solr/Lucene clients but possibly not difficult to implement updated interface requirements.

In general the fundamental difficulty is in the serialization of the mappings and the entities. So far the annotation updates gives a richer support for the mapping property generation, and a serializer option has been added to provide an appropriate means of serializing entities into the required format, which would otherwise be difficult with mappings alone.

MrHash added some commits Apr 24, 2013
@MrHash MrHash Support Elastica 0.20.x client
Breaks BC because of namespace change
f41f359
@MrHash MrHash Added field mappings into class metadata 8cb5b60
@MrHash MrHash Enrichment of SearchManager, ElasticSearch client, and serializer sup…
…port

Mostly updates to ElasticSearch integration with breaking changes for
other clients. Also added support for more complex annotations, Unit of
Work style persistence, and support for different entity serialization
methods for persistence. Updated for Elastica 0.20.x support.
6d1195f
@MrHash MrHash Skip object serialization during removal cf59371
@MrHash MrHash Refactoring client interface clean up concrete classes cf274a9
@Baachi Baachi commented on an outdated diff Apr 29, 2013
lib/Doctrine/Search/ElasticSearch/Client.php
use Doctrine\Search\SearchClientInterface;
+use Doctrine\Search\Mapping\ClassMetadata;
+use Elastica\Client as Elastica_Client;
@Baachi
Baachi added a note Apr 29, 2013

Please rename it to ElasticaClient.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@Baachi Baachi commented on an outdated diff Apr 29, 2013
lib/Doctrine/Search/SearchManager.php
/**
* Constructor
*
* @param Configuration $config
* @param SearchClientInterface $sc
*/
- public function __construct(Configuration $config, SearchClientInterface $sc)
+ public function __construct(Configuration $config, SearchClientInterface $sc, SerializerInterface $se = null)
@Baachi
Baachi added a note Apr 29, 2013

Can you move the SerializerInterface to the Configuration class? I think its a better place for it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@Baachi Baachi commented on the diff Apr 29, 2013
lib/Doctrine/Search/SearchManager.php
@@ -89,6 +105,22 @@ public function getClassMetadata($className)
{
return $this->metadataFactory->getMetadataFor($className);
}
+
+ /**
+ * @return SearchClientInterface
+ */
+ public function getClient()
+ {
+ return $this->searchClient;
@Baachi
Baachi added a note Apr 29, 2013

Why you need this?

@MrHash
MrHash added a note Apr 29, 2013

We need this for creating indexes and some other tasks at the moment. If all the underlying client methods can be abstracted we can eventually do away with it but for now it is useful.

@Baachi
Baachi added a note Apr 29, 2013

Okay make sense.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@Baachi Baachi commented on an outdated diff Apr 29, 2013
lib/Doctrine/Search/SearchManager.php
@@ -121,8 +153,8 @@ public function persist($object)
if (!is_object($object)) {
throw new UnexpectedTypeException($object, 'object');
}
-
- //$this->searchClient->createIndex($index, $type, $query);
+
+ $this->persisted[] = $object;
@Baachi
Baachi added a note Apr 29, 2013

scheduledForPersist?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@Baachi Baachi commented on an outdated diff Apr 29, 2013
lib/Doctrine/Search/SearchManager.php
- }
-
- /**
- * Bulk action
- *
- * @param object $object
- *
- * @throws UnexpectedTypeException
- */
- public function bulk($object)
- {
- if (!is_object($object)) {
- throw new UnexpectedTypeException($object, 'object');
- }
+
+ $this->removed[] = $object;
@Baachi
Baachi added a note Apr 29, 2013

scheduledForDelete?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@Baachi Baachi commented on an outdated diff Apr 29, 2013
lib/Doctrine/Search/SearchManager.php
}
+
+ protected function commitPersisted()
+ {
+ $documents = $this->sortObjects($this->persisted);
@Baachi
Baachi added a note Apr 29, 2013

Please respect the CS.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@Baachi
Baachi commented Apr 29, 2013

I'm not really familar with elasticsearch but i know that the entities must be serialized to json. I don't know, its the best way to go, because it make to use solr more difficult.

@MrHash
MrHash commented Apr 29, 2013

My understanding is that Solr also requires entities to be serialized or is this incorrect? In any case the serialization can be made configurable.

@Baachi
Baachi commented Apr 29, 2013

I think serialize is the wrong word. The library should extract the data from the entity/document and pass this to the search backend. This library should only deal with plain php arrays, the serialization process should be done in solarium or elastica.

@MrHash MrHash Support for Entity hydration, Pagination
Search manager now supports querying through a magic wrapper class.
Hydration is provided by default when an entity manager is provided
through configuration or injection, or directly through a custom
Doctrine Query if provided. Hydration can be bypassed to return search
results directly. Currently only functional for ElasticSearch.
aadbf8c
@MrHash
MrHash commented Apr 29, 2013

Search request example with pagination:

$sm = $context->getDatabaseConnection('elasticsearch');
$em = $context->getDatabaseConnection('doctrine');
$sm->setEntityManager($em);

$q = $sm->createQuery()
    ->from('Entities\Feature')
    ->searchWith(new ElasticaQuery())
    ->hydrateWith($em->createQuery('SELECT e FROM Entities\Feature e WHERE e.id IN (:ids)')
    ->addSort('_score') //Client specific methods are passed through to the searchWith query object
    ->addSort(['start' => ['order' => 'desc']])
    ->useResultCache(true);  //Hydration via Doctrine uses result cache

$adapter = new Doctrine\Search\ElasticSearch\PaginationAdapter($q);
$pager = new Zend\Paginator\Paginator($adapter);
$pager->setCurrentPageNumber($page);  //Paging is actually handled at the search engine level
$pager->setItemCountPerPage($limit);
$results = $pager->getCurrentItems();
@MrHash
MrHash commented Apr 29, 2013

Regarding serialization, yes i think you are right. In fact the JMS serialization adapter returns an array, but the library will natively support array serialization soon. This may have seemed misleading, as was the __toString default. So these minor changes can be made easily.

@Baachi
Baachi commented Apr 29, 2013

Wow nice 👍
One problem, the library should also work with the MongoDB ODM.

@MrHash
MrHash commented Apr 29, 2013

Well Doctrine2 already supports MongoDB no? I don't use it so i'm not familiar with the API.

@Baachi
Baachi commented Apr 29, 2013

Quick introduction:

  • doctrine2 is the ORM which deals with PostgreSQL, MySQL and SQLite.
  • MongoDB ODM deals only with the mongodb driver

There are 2 different libraries but have nearly the same api.

@MrHash
MrHash commented Apr 29, 2013

Well I'm afraid i may not be the best person to handle such an integration. Perhaps you could tell me why these modifications are not suitable for the ODM? Surely it's just a case of a different hydration abstraction or is the search completely different also?

@MrHash
MrHash commented Apr 30, 2013

I don' really understand what the problem is that you are concerned about. There is a Query::hydrateWith method which can accept any type of hydration query. So all that would be required is a little abstraction to provide a different default hydration query or hydration call. It is possible to determine the type and method of hydration by checking the type of EntityManager or hydration Query that has been provided to the SearchManager.

If you are concerned about implementing a full-text search service on the MongoDB layer then that would require the implementation of a MongoDB search client which conforms to the SearchClientInterface specification. Therefore I do not see any problem with these developments. The fact that Mongo uses map/reduce and RDBMS is SQL based is irrelevant because the Doctrine libraries abstract away that difference.

@beberlei beberlei and 1 other commented on an outdated diff Apr 30, 2013
lib/Doctrine/Search/Configuration.php
@@ -23,6 +23,9 @@
use Doctrine\Common\Cache\ArrayCache;
use Doctrine\Search\Mapping\ClassMetadataFactory;
use Doctrine\Common\Persistence\Mapping\Driver\MappingDriver;
+use Doctrine\Search\SerializerInterface;
+use Doctrine\Search\Serializer\CallbackSerializer;
+use Doctrine\ORM\EntityManager;
@beberlei
Doctrine member
beberlei added a note Apr 30, 2013

hard dependency on the EntityManager, we should work with ObjetManager interface here

@MrHash
MrHash added a note Apr 30, 2013

Agreed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@beberlei beberlei and 1 other commented on an outdated diff Apr 30, 2013
lib/Doctrine/Search/Query.php
+ {
+ $this->doctrineQuery = $doctrineQuery;
+ if($parameter) $this->hydrationParameter = $parameter;
+ return $this;
+ }
+
+ /**
+ * Return a provided Doctrine Query or a default.
+ *
+ * @return DoctrineQuery
+ */
+ protected function getHydrationQuery()
+ {
+ if($this->doctrineQuery) return $this->doctrineQuery;
+
+ $em = $this->getSearchManager()->getEntityManager();
@beberlei
Doctrine member
beberlei added a note Apr 30, 2013

This needs to be EntityManager independent.

@MrHash
MrHash added a note Apr 30, 2013

The EntityManager dependency here is only in order to build a default hydration query. This dependency can be removed completely by requiring the user to provide a hydration query.

@beberlei
Doctrine member
beberlei added a note Apr 30, 2013

This is not how dependencies work, This class has a dependency on the ORM, if its used or not in some cases. The Doctrine\Search namespace however should be persistence agnostic.

@MrHash
MrHash added a note Apr 30, 2013

So you are saying the property names should be changed? Or should i remove the default query completely and leave an exception saying that a hydration query is required. This removes pretty much any dependency on the ORM anyway since the provided query will have the required ObjectManager reference in it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@beberlei beberlei commented on an outdated diff Apr 30, 2013
lib/Doctrine/Search/Mapping/Driver/AnnotationDriver.php
@@ -163,10 +181,7 @@ private function addValuesToMetadata(array $reflectedClassProperties, ClassMetad
if (false === property_exists($metadata, $propertyName)) {
throw new DriverException\PropertyDoesNotExistsInMetadataException($reflectedProperty->getName());
} else {
- $metadata->$propertyName = $class->$propertyName;
- /*I am not sure if that is needed
- * $metadata->addField($reflectedProperty);
- $metadata->addFieldMapping($reflectedProperty);*/
+ if(!is_null($class->$propertyName)) $metadata->$propertyName = $class->$propertyName;
@beberlei
Doctrine member
beberlei added a note Apr 30, 2013

please wrap the line in { } and put it on an extra line

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@beberlei beberlei commented on the diff Apr 30, 2013
lib/Doctrine/Search/Mapping/Driver/AnnotationDriver.php
- $metadata = $this->addValuesToMetadata($reflFieldAnnotations->getProperties(),
- $metadata,
- $documentsFieldAnnotation);
-
+ return $metadata;
+ }
+
+ /**
+ * Extract the methods annotations.
+ *
+ * @param \ReflectionMethod[] $reflMethods
+ * @param ClassMetadata $metadata
+ *
+ * @return ClassMetadata
+ */
+ private function extractMethodsAnnotations(array $reflMethods, ClassMetadata $metadata)
@beberlei
Doctrine member
beberlei added a note Apr 30, 2013

this method is very complex and nested very deep. Is this necessary?

@MrHash
MrHash added a note Apr 30, 2013

This was a quick duplication of the method used to extract property annotations. This allows us to support virtual properties since the annotations allows a name to override the actual property/method name in the mapping properties generation from the entity. Ultimately It could be refactored along with the property annotation extraction method and perhaps other parts of this class as the requirements become clearer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
MrHash added some commits Apr 30, 2013
@MrHash MrHash Change depency to ObjectManager cef7b81
@MrHash MrHash CS fix 7b588c4
@MrHash MrHash Remove ORM dependency
Hydration query is now required to be specified if hydration mode is
not set to bypass.
ae1e70b
@Baachi Baachi commented on an outdated diff Jun 18, 2013
lib/Doctrine/Search/SearchManager.php
}
+
+ protected function commitPersisted()
@Baachi
Baachi added a note Jun 18, 2013

Should be private at the moment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@Baachi Baachi commented on an outdated diff Jun 18, 2013
lib/Doctrine/Search/SearchManager.php
}
+
+ protected function commitPersisted()
+ {
+ $documents = $this->sortObjects($this->scheduledForPersist);
+
+ foreach($documents as $index => $documentTypes)
+ {
+ foreach($documentTypes as $type => $documents)
+ {
+ $this->searchClient->addDocuments($index, $type, $documents);
+ }
+ }
+ }
+
+ protected function commitRemoved()
@Baachi
Baachi added a note Jun 18, 2013

Should be private at the moment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@Baachi Baachi commented on an outdated diff Jun 18, 2013
lib/Doctrine/Search/SearchManager.php
+ }
+
+ protected function commitRemoved()
+ {
+ $documents = $this->sortObjects($this->scheduledForDelete, false);
+
+ foreach($documents as $index => $documentTypes)
+ {
+ foreach($documentTypes as $type => $documents)
+ {
+ $this->searchClient->removeDocuments($index, $type, $documents);
+ }
+ }
+ }
+
+ protected function sortObjects(array $objects, $serialize = true)
@Baachi
Baachi added a note Jun 18, 2013

Should be private at the moment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@Baachi
Baachi commented Jun 18, 2013

@MrHash Sorry for the very long delay. I would like to merge this PR, can you fix the last comments?
Thank you for your work!

@MrHash
MrHash commented Jun 18, 2013

@Baachi Updated as requested. I understand there is an outstanding issue with compatibility with the Doctrine ODM query api but i think that should be fairly trivial to implement.

@Baachi Baachi merged commit e17d196 into doctrine:master Jun 19, 2013
@Baachi
Baachi commented Jun 19, 2013

PR is merged, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.