add support for full text search ie. CONTAINS() #50

Closed
lsmith77 opened this Issue Jul 21, 2012 · 9 comments

@lsmith77
Member

lsmith77 commented Jul 21, 2012

conceptually this should be similar to http://docs.doctrine-project.org/projects/doctrine1/en/latest/en/manual/searching.html
i.e. using an inverted index (http://en.wikipedia.org/wiki/Inverted_index): we would split all text into words, store those in a table, and use that table as a pre-filter to reduce the number of rows that are considered for the XPath queries.
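
A minimal sketch of that pre-filter idea, assuming an in-memory PHP array in place of the index table. The function names (`tokenize`, `indexNode`, `candidateNodes`) and the integer node ids are hypothetical and not part of Jackalope:

```php
<?php
// Hypothetical sketch: an in-memory inverted index standing in for an index table.

/** Split text into unique lowercase words (naive, ASCII-only tokenizer). */
function tokenize(string $text): array
{
    $words = preg_split('/\W+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    return array_values(array_unique($words));
}

/** Record every word of $text as pointing to node $nodeId. */
function indexNode(array &$index, int $nodeId, string $text): void
{
    foreach (tokenize($text) as $word) {
        $index[$word][$nodeId] = true;
    }
}

/** Ids of nodes that contain every word of $query: the rows left for the XPath check. */
function candidateNodes(array $index, string $query): array
{
    $result = null;
    foreach (tokenize($query) as $word) {
        $ids = $index[$word] ?? [];
        $result = $result === null ? $ids : array_intersect_key($result, $ids);
    }
    return array_keys($result ?? []);
}

$index = [];
indexNode($index, 1, 'The quick brown fox');
indexNode($index, 2, 'A lazy brown dog');
indexNode($index, 3, 'Quick thinking');

$brown = candidateNodes($index, 'brown');            // nodes 1 and 2
$quickBrown = candidateNodes($index, 'quick brown'); // node 1 only
```

Only the ids returned by `candidateNodes()` would then need the more expensive XPath check. In a real transport the array would presumably be a table keyed by word and node id.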

for specs of CONTAINS() see http://www.day.com/specs/jcr/1.0/8.5.4.5_CONTAINS.html
it's ok if initially we do not support the full "lucene" style query syntax
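
To illustrate what a reduced syntax could look like, here is a sketch that assumes only two rules: whitespace-separated terms must all be present, and a term prefixed with `-` must be absent. Quoted phrases and `OR` are left out, and the function names are hypothetical:

```php
<?php
// Hypothetical sketch of a reduced CONTAINS() search expression syntax.
// Assumed rules: all plain terms must match, "-term" must not match.

/** Split an expression into required and excluded terms. */
function parseSearchExpression(string $expression): array
{
    $parsed = ['required' => [], 'excluded' => []];
    $terms = preg_split('/\s+/', strtolower(trim($expression)), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($terms as $term) {
        if ($term[0] === '-' && strlen($term) > 1) {
            $parsed['excluded'][] = substr($term, 1);
        } else {
            $parsed['required'][] = $term;
        }
    }
    return $parsed;
}

/** Check a text against the expression, comparing whole lowercase words. */
function matchesExpression(string $text, string $expression): bool
{
    $words = preg_split('/\W+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    $parsed = parseSearchExpression($expression);

    return array_diff($parsed['required'], $words) === []
        && array_intersect($parsed['excluded'], $words) === [];
}

$hit = matchesExpression('JCR full text search', 'jcr search');
$miss = matchesExpression('JCR with Lucene', 'jcr -lucene');
```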

see also http://www.h2database.com/jcr/grammar.html for the grammar of SQL2

also not sure .. but maybe this could help to get started http://code.google.com/p/inverted-index/

ghost assigned cryptocompress Jul 21, 2012

Member

lsmith77 commented Jul 23, 2012

a quick implementation using equality matching with normal xpath, rather than full text searching, just to show where the code will need to hook in:

diff --git a/src/Jackalope/Transport/DoctrineDBAL/Query/QOMWalker.php b/src/Jackalope/Transport/DoctrineDBAL/Query/QOMWalker.php
index 7f9ba8c..e34017a 100644
--- a/src/Jackalope/Transport/DoctrineDBAL/Query/QOMWalker.php
+++ b/src/Jackalope/Transport/DoctrineDBAL/Query/QOMWalker.php
@@ -173,6 +173,9 @@ class QOMWalker
         if ($constraint instanceof QOM\SameNodeInterface) {
             return $this->walkSameNodeConstraint($constraint);
         }
+        if ($constraint instanceof QOM\FullTextSearchInterface) {
+            return $this->walkFullTextSearchConstraint($constraint);
+        }

         throw new InvalidQueryException("Constraint " . get_class($constraint) . " not yet supported.");
     }
@@ -187,6 +190,15 @@ class QOMWalker
     }

     /**
+     * @param \PHPCR\Query\QOM\FullTextSearchInterface $constraint
+     * @return string
+     */
+    public function walkFullTextSearchConstraint(QOM\FullTextSearchInterface $constraint)
+    {
+        return $this->sqlXpathExtractValue($this->getTableAlias($constraint->getSelectorName()), $constraint->getPropertyName()).' = '. $this->conn->quote($constraint->getFullTextSearchExpression());
+    }
+
+    /**
      * @param QOM\PropertyExistenceInterface $constraint
      */
     public function walkPropertyExistanceConstraint(QOM\PropertyExistenceInterface $constraint)

Member

lsmith77 commented Jul 23, 2012

just some notes

[16:42] syncNode($uuid, $path, $parent, $type, $isNewNode, $props = array(), $propsData = array())
[16:42] copyNode($srcAbsPath, $dstAbsPath, $srcWorkspace = null)
[16:43] deleteNode($path)
[16:43] deleteProperty($path)
[16:43] in these methods i have to update the index... am i missing something?
[16:43] yeah
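
One way to picture the index updates in those write methods, as a sketch only: the `WordIndex` class below is hypothetical, keeps the index in memory, and uses simplified arguments rather than the real `syncNode()` / `deleteNode()` signatures.

```php
<?php
// Hypothetical sketch: keeping a word index in step with node writes.
// WordIndex is illustrative only and is not part of Jackalope's Client API.

class WordIndex
{
    /** @var array word => array of node paths (used as a set) */
    private $paths = [];

    /** Call after a node is written (e.g. from syncNode): re-index its string properties. */
    public function update(string $path, array $props): void
    {
        $this->forget($path, false); // drop stale words of this node only
        foreach ($props as $value) {
            if (!is_string($value)) {
                continue;
            }
            foreach (preg_split('/\W+/', strtolower($value), -1, PREG_SPLIT_NO_EMPTY) as $word) {
                $this->paths[$word][$path] = true;
            }
        }
    }

    /** Call when a node is deleted (e.g. from deleteNode): forget it and its descendants. */
    public function remove(string $path): void
    {
        $this->forget($path, true);
    }

    /** Paths of nodes whose indexed text contains the word $word. */
    public function lookup(string $word): array
    {
        return array_keys($this->paths[strtolower($word)] ?? []);
    }

    private function forget(string $path, bool $withDescendants): void
    {
        foreach ($this->paths as $word => $set) {
            foreach (array_keys($set) as $indexed) {
                $isDescendant = $withDescendants && strpos($indexed, $path . '/') === 0;
                if ($indexed === $path || $isDescendant) {
                    unset($this->paths[$word][$indexed]);
                }
            }
            if ($this->paths[$word] === []) {
                unset($this->paths[$word]); // no node uses this word any more
            }
        }
    }
}

$index = new WordIndex();
$index->update('/cms/home', ['title' => 'Welcome home']);
$index->update('/cms/home/news', ['title' => 'Latest news']);
$index->update('/cms/home', ['title' => 'Welcome back']); // re-sync replaces the old words
$index->remove('/cms/home/news');
```

`deleteProperty()` could presumably be handled by re-running `update()` with the node's remaining properties, and `copyNode()` by indexing the copy under its destination path.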

cordoval commented Jul 24, 2012

what do those passages refer to? just curious

Contributor

cryptocompress commented Jul 24, 2012

yes, in QOMWalker the index has to be looked up, and in these Client methods the index should be updated

Member

lsmith77 commented Jul 24, 2012

@cordoval these are methods in the transport client that get called by jackalope core to write data to the given transport's backend. as such, these are the places where any additional index needs to be updated.

Member

lsmith77 commented Jul 26, 2012

@cryptocompress did you check whether it works using the XPath contains() function?
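
For reference, XPath 1.0 `contains()` is a case-sensitive substring test, which is weaker than word-based full text search. A small sketch using PHP's DOM extension; the XML layout is an assumed, simplified stand-in for the stored properties, and `propertyContains` is a hypothetical helper:

```php
<?php
// Hypothetical sketch: substring matching with the XPath 1.0 contains() function.
// The XML below is an assumed, simplified layout, not the real storage format.

$xml = <<<XML
<node>
  <property name="title"><value>Full text search with Jackalope</value></property>
  <property name="body"><value>Nothing to see here</value></property>
</node>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

/** True if the named property's value contains $needle (case-sensitive). */
function propertyContains(DOMXPath $xpath, string $property, string $needle): bool
{
    // XPath 1.0 string literals cannot escape quotes, so this sketch rejects them.
    if (strpos($property . $needle, '"') !== false) {
        throw new InvalidArgumentException('Double quotes are not supported in this sketch.');
    }
    $query = sprintf(
        'boolean(//property[@name="%s"]/value[contains(., "%s")])',
        $property,
        $needle
    );
    return $xpath->evaluate($query);
}

$found = propertyContains($xpath, 'title', 'text search');
$wrongCase = propertyContains($xpath, 'title', 'TEXT');
```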

Contributor

cryptocompress commented Jul 26, 2012

sorry, had no time. will do it now.
as described here: http://www.day.com/specs/jcr/1.0/8.5.4.5_CONTAINS.html

Contributor

cryptocompress commented Aug 5, 2012

very early but functional xpath-full-text-search can be found here:
cryptocompress@78af6a4

feedback desired :)

Member

lsmith77 commented Oct 22, 2012

ok implemented something ultra simple with LIKE '%foo%' here eaac413
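
A rough sketch of what such a `LIKE '%foo%'` match involves, not necessarily what commit eaac413 does. The helper name `likePattern` is hypothetical, and backslash is assumed as the escape character (some databases need it declared with an `ESCAPE` clause):

```php
<?php
// Hypothetical helper: turn a search term into a LIKE pattern for substring matching.
// "%" and "_" are LIKE wildcards, so they are escaped to be matched literally.

function likePattern(string $term, string $escapeChar = '\\'): string
{
    // The escape character itself is replaced first so added escapes are not doubled.
    $escaped = str_replace(
        [$escapeChar, '%', '_'],
        [$escapeChar . $escapeChar, $escapeChar . '%', $escapeChar . '_'],
        $term
    );
    return '%' . $escaped . '%';
}

$pattern = likePattern('foo');      // %foo%
$tricky = likePattern('100%_off');  // %100\%\_off%
```

The resulting pattern would still have to be quoted or bound as a parameter, e.g. via `$this->conn->quote()` as in the diff above, before being placed after `LIKE`.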

lsmith77 closed this Oct 22, 2012
