provide an alternative query implementation using ES/Solr #127

Closed
lsmith77 opened this Issue · 12 comments

5 participants

@lsmith77
Owner

any transport layer should be easily configurable to use this implementation for full text search as a replacement for whatever it natively provides. this way users get easy access to a high-performance full text search solution with additional capabilities like faceting that are not covered by PHPCR.

@dbu
Owner

well, this will make the php-only implementation depend on a java product again. but i guess it can still make sense: devs are more familiar with solr, as are hosters.

@lsmith77
Owner

exactly ..

@lsmith77
Owner

and i guess someone could then also easily integrate Zend_Search_Lucene

@ruflin

@lsmith77 I will take a closer look at the implementation to better understand how this could work together with https://github.com/ruflin/Elastica

@lsmith77
Owner

essentially PHPCR provides a query syntax called SQL2 (this is the string serialization of the QOM API, which defines queries as an object graph). you can find more details about it here:
http://www.h2database.com/jcr/grammar.html

Note: I don't think we necessarily need to support joins initially.
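
To make the idea concrete, here is a rough sketch of how a single SQL2 full-text constraint could be turned into an Elasticsearch query body. It is an illustration only, not how jackalope does or would do it: the function name `contains_to_es` and the regex are made up for this example, it handles just one `CONTAINS()` clause via string matching instead of walking the QOM object graph, and mapping to `match` / `multi_match` is one plausible choice among several.

```python
import re

# Hypothetical helper: recognises one JCR-SQL2 full-text constraint such as
#   CONTAINS(page.title, 'jackalope')   or   CONTAINS(page.*, 'jackalope')
# A real adapter would walk the QOM object graph instead of using a regex.
CONTAINS_RE = re.compile(
    r"CONTAINS\(\s*(?:(\w+)\.)?\[?([\w:]+|\*)\]?\s*,\s*'((?:[^']|'')*)'\s*\)",
    re.IGNORECASE,
)


def contains_to_es(sql2: str) -> dict:
    """Translate the first CONTAINS() constraint into an Elasticsearch query body."""
    match = CONTAINS_RE.search(sql2)
    if match is None:
        raise ValueError("no CONTAINS() constraint found")
    _selector, prop, expr = match.groups()
    expr = expr.replace("''", "'")  # SQL2 escapes a quote by doubling it
    if prop == "*":
        # Search all fields; assumes the index allows wildcard field patterns.
        return {"query": {"multi_match": {"query": expr, "fields": ["*"]}}}
    return {"query": {"match": {prop: expr}}}


sql2 = (
    "SELECT * FROM [nt:unstructured] AS page "
    "WHERE CONTAINS(page.title, 'jackalope')"
)
print(contains_to_es(sql2))
```

Joins are deliberately out of scope here, in line with the note above.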

@dbu
Owner

i think we should close this and keep the task at jackalope/jackalope-doctrine-dbal#14 - or close the other one, if we want to provide this on the transport agnostic jackalope level.

@lsmith77
Owner

i think it makes more sense to do this in a transport-agnostic way, as I see no benefit or simplification in doing it per transport.

@lsmith77 lsmith77 referenced this issue in jackalope/jackalope-doctrine-dbal
Closed

add optional support for using ElasticSearch/Solr for full text search #14

@awdng

i came here from the other ticket. basically what i am looking for is using elasticsearch with jackalope-doctrine-dbal. As i have zero experience with elasticsearch, i was hoping to find something already implemented for the cmf, but i guess it's still open?
Would creating a phpcr provider for https://github.com/FriendsOfSymfony/FOSElasticaBundle/tree/master/Doctrine be the right place to start?

@dbu
Owner

hi @awdng

what we meant in here is using elasticsearch to provide the phpcr built-in query language. we currently use the sqlite/mysql/postgres full text search capabilities for this, which is ok for small amounts of data but does not scale. is this what you are looking for? if you can run java on your server, the other option that already exists would be using jackalope-jackrabbit. that one uses the java jackrabbit server, which has built-in search with an integrated lucene.

or do you want a search where you connect directly to elasticsearch? what exactly for? if you use phpcr-odm as well, your best bet would indeed be to add phpcr-odm support to FOSElasticaBundle. somebody started this but did not wrap it up (yet?): FriendsOfSymfony/FOSElasticaBundle#739

cheers, david

@awdng

hi @dbu
well yeah, it's kind of a weird problem because the client did not want to use jackrabbit for the main backend, so we used jackalope with doctrine dbal to build a custom CMS. Now we need to provide a solid search function, so i thought elasticsearch would be a viable option, even though jackrabbit is basically already based on lucene itself, but we can't make changes to the data layer at this point.
Running an elasticsearch server is not a problem though.

I spent a couple of hours yesterday and got basic phpcr doctrine dbal support working in FOSElasticaBundle which, as i now see, is probably very similar to the ticket you linked to. My version is still rough but works. updating the index on object updates does not work yet, but i am confident i can fix that.

But now that you mention the full text search of mysql: that sounds rather simple, but our CMF data is rather complex by now. How would full text search work exactly?

@dbu
Owner

the doc is not extensive. we have this: http://phpcr.readthedocs.org/en/latest/book/query.html - to go further, either look at the query object classes and methods (best using autocomplete of your IDE) or the SQL2 documentation of jcr (the java version of the content repository that is the original we ported to php). the phpcr-odm also provides a query builder that is on the object level rather than the phpcr node level: http://doctrine-phpcr-odm.readthedocs.org/en/latest/reference/query-builder.html - you can also query documents with a phpcr query, just as you can make the orm load entities with sql queries.

is what you built done with raw phpcr nodes, or with the doctrine phpcr-odm? if it's phpcr-odm and you could provide input on the pull request, that would be appreciated of course.

btw, switching to jackalope-jackrabbit should not involve any changes in your code. both adhere to the same specification. there is an xml export/import functionality that works with any phpcr (and jcr!) repository if you need to migrate data. this could be an option if the search features match your needs, but search performance becomes an issue. jackrabbit however provides no direct access to its lucene index, only through SQL2 or the query builder.

in the end, particularly for mixed setups, i think elasticsearch support in phpcr-odm still is a valid scenario.

if you want ES support as this ticket was intended, it would still go through SQL2/query builder, and would be hooked into https://github.com/jackalope/jackalope-doctrine-dbal/blob/master/src/Jackalope/Transport/DoctrineDBAL/Client.php#L2159 (and all methods that update data). users of jackalope-doctrine-dbal could then swap which search system they use.
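
A language-neutral sketch of that hook-in, written in Python only to keep it short and runnable. Everything in it is hypothetical: `SearchAdapter`, `InMemorySearch`, `Transport` and their methods are invented for illustration and do not correspond to the jackalope code. The point is only the shape: every write path also updates the search index, and the query path delegates to whichever adapter was configured.

```python
from typing import Dict, List, Protocol, Tuple


class SearchAdapter(Protocol):
    """Hypothetical contract a search backend (ES, Solr, ...) would fulfil."""

    def index(self, path: str, properties: Dict[str, str]) -> None: ...
    def remove(self, path: str) -> None: ...
    def search(self, text: str) -> List[str]: ...


class InMemorySearch:
    """Toy adapter standing in for a real search server."""

    def __init__(self) -> None:
        self._docs: Dict[str, str] = {}

    def index(self, path: str, properties: Dict[str, str]) -> None:
        # Re-indexing the same path simply overwrites the old entry.
        self._docs[path] = " ".join(properties.values()).lower()

    def remove(self, path: str) -> None:
        self._docs.pop(path, None)

    def search(self, text: str) -> List[str]:
        needle = text.lower()
        return sorted(p for p, body in self._docs.items() if needle in body)


class Transport:
    """Stand-in for a transport: stores nodes and keeps the index in sync."""

    def __init__(self, search: SearchAdapter) -> None:
        self._nodes: Dict[str, Dict[str, str]] = {}
        self._search = search

    def store_node(self, path: str, properties: Dict[str, str]) -> None:
        self._nodes[path] = dict(properties)
        self._search.index(path, properties)  # every write updates the index

    def delete_node(self, path: str) -> None:
        self._nodes.pop(path, None)
        self._search.remove(path)  # every delete updates the index too

    def query_fulltext(self, text: str) -> List[Tuple[str, Dict[str, str]]]:
        # The query path only asks the adapter for matching paths.
        return [(p, self._nodes[p]) for p in self._search.search(text)]


transport = Transport(InMemorySearch())
transport.store_node("/cms/home", {"title": "Welcome to Jackalope"})
transport.store_node("/cms/about", {"title": "About us"})
print(transport.query_fulltext("jackalope"))
```

Swapping the search system then means passing a different adapter to the transport; the storage code does not change.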

@dbu
Owner

we are now planning a large refactoring of jackalope and this should become much easier then. added it to the list of possible search adapters: https://github.com/Jackalope2/jackalope/wiki/Query-Adapters

@dbu dbu closed this