Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use node references tables in join query to better performance #312

Open
hason opened this issue Jan 14, 2016 · 3 comments
Open

Use node references tables in join query to better performance #312

hason opened this issue Jan 14, 2016 · 3 comments

Comments

@hason
Copy link

hason commented Jan 14, 2016

The QueryBuilder

$qb
    ->fromDocument(Article::class, 'a')
    ->addJoinInner()
        ->right()->document(ArticleCategory::class, 'c')->end()
        ->condition()->equi('a.category', 'c.uuid')->end()
    ->end()
    ->where()
        ->eq()->field('c.uuid')->literal($contentDocument->getUuid())->end()
    ->end()
    ->andWhere()
        ->eq()->field('a.publishable')->literal(true)->end()
    ->end()
    ->orderBy()->desc()->field('a.publishStartDate')->end()
;

is actual transformed to very slow sql (sqlite 400 ms)

SELECT 
  n0.path AS n0_path, 
  n0.identifier AS n0_identifier, 
  n0.props AS n0_props, 
  n1.path AS n1_path, 
  n1.identifier AS n1_identifier, 
  n1.props AS n1_props 
FROM 
  phpcr_nodes n0 
  INNER JOIN phpcr_nodes n1 ON (
    n0.workspace_name = n1.workspace_name 
    AND n1.type IN ('nt:unstructured', 'rep:root') 
    AND EXTRACTVALUE(
      n0.props, '//sv:property[@sv:name="category"]/sv:value[1]'
    ) = n1.identifier
  ) 
WHERE 
  n0.workspace_name = ? 
  AND n0.type IN ('nt:unstructured', 'rep:root') 
  AND (
    (
      (
        n1.identifier = 'c39b134f-81dd-4ef8-937d-d0fd975c349b' 
        AND EXTRACTVALUE(
          n0.props, '//sv:property[@sv:name="publishable"]/sv:value[1]'
        ) = '1'
      ) 
      AND (
        EXTRACTVALUE(
          n0.props, 'count(//sv:property[@sv:name="phpcr:class"]/sv:value[text()="AppBundle\Document\Article"]) > 0'
        ) 
        OR EXTRACTVALUE(
          n0.props, 'count(//sv:property[@sv:name="phpcr:classparents"]/sv:value[text()="AppBundle\Document\Article"]) > 0'
        )
      )
    ) 
    AND (
      EXTRACTVALUE(
        n1.props, 'count(//sv:property[@sv:name="phpcr:class"]/sv:value[text()="AppBundle\Document\ArticleCategory"]) > 0'
      ) 
      OR EXTRACTVALUE(
        n1.props, 'count(//sv:property[@sv:name="phpcr:classparents"]/sv:value[text()="AppBundle\Document\ArticleCategory"]) > 0'
      )
    )
  ) 
ORDER BY 
  CAST(
    EXTRACTVALUE(
      n0.numerical_props, '//sv:property[@sv:name="publishStartDate"]/sv:value[1]'
    ) AS DECIMAL
  ) DESC, 
  EXTRACTVALUE(
    n0.props, '//sv:property[@sv:name="publishStartDate"]/sv:value[1]'
  ) DESC

The sql is much more faster with references tables (4 ms):

SELECT 
  n0.path AS n0_path, 
  n0.identifier AS n0_identifier, 
  n0.props AS n0_props, 
  n1.path AS n1_path, 
  n1.identifier AS n1_identifier, 
  n1.props AS n1_props 
FROM 
  phpcr_nodes n0 
  INNER JOIN phpcr_nodes n1 ON (
    n0.workspace_name = n1.workspace_name 
    AND n1.type IN ('nt:unstructured', 'rep:root')
  ) 
  INNER JOIN phpcr_nodes_weakreferences n2 ON (
    n2.target_id = n1.id 
    AND n2.source_id = n0.id 
    AND n2.source_property_name = "category"
  ) 
WHERE 
  n0.workspace_name = ? 
  AND n0.type IN ('nt:unstructured', 'rep:root') 
  AND (
    (
      (
        n1.identifier = 'c39b134f-81dd-4ef8-937d-d0fd975c349b' 
        AND EXTRACTVALUE(
          n0.props, '//sv:property[@sv:name="publishable"]/sv:value[1]'
        ) = '1'
      ) 
      AND (
        EXTRACTVALUE(
          n0.props, 'count(//sv:property[@sv:name="phpcr:class"]/sv:value[text()="AppBundle\Document\Article"]) > 0'
        ) 
        OR EXTRACTVALUE(
          n0.props, 'count(//sv:property[@sv:name="phpcr:classparents"]/sv:value[text()="AppBundle\Document\Article"]) > 0'
        )
      )
    ) 
    AND (
      EXTRACTVALUE(
        n1.props, 'count(//sv:property[@sv:name="phpcr:class"]/sv:value[text()="AppBundle\Document\ArticleCategory"]) > 0'
      ) 
      OR EXTRACTVALUE(
        n1.props, 'count(//sv:property[@sv:name="phpcr:classparents"]/sv:value[text()="AppBundle\Document\ArticleCategory"]) > 0'
      )
    )
  ) 
ORDER BY 
  CAST(
    EXTRACTVALUE(
      n0.numerical_props, '//sv:property[@sv:name="publishStartDate"]/sv:value[1]'
    ) AS DECIMAL
  ) DESC, 
  EXTRACTVALUE(
    n0.props, '//sv:property[@sv:name="publishStartDate"]/sv:value[1]'
  ) DESC
@dbu
Copy link
Member

dbu commented Jan 15, 2016

i kind of wonder if it even makes sense that the condition reads ->condition()->equi('a.category', 'c.uuid') instead of ->condition()->equi('a.category', 'c'). the odm query builder should not use the phpcr level things...

on the phpcr level, i don't know if we can detect this situation - and we probably don't know if its a weak or a hard reference, and those are in different tables. but maybe we can, or can use both tables or something. or find a way to pass hints from phcpr-odm to the phcpr query builder - the performance difference is quite massive.

if you have time to improve some of this, it would be great.

@hason
Copy link
Author

hason commented Jan 18, 2016

i kind of wonder if it even makes sense that the condition reads ->condition()->equi('a.category', 'c.uuid') instead of ->condition()->equi('a.category', 'c'). the odm query builder should not use the phpcr level things...

->condition()->equi('a.category', 'c') throws Invalid field specification, expected string like "[alias].[field_name]", got "c"

@dbu
Copy link
Member

dbu commented Jan 21, 2016

yes, its how it currently works, but it feels like a violation of abstractions :-(

@dbu dbu added the performance label Apr 7, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants