Task Query with Authorization and Identity link are slow #2217

ThorbenLindhauer · 2017-02-06T14:43:33Z

This issue was imported from JIRA:

Field	Value
JIRA Link	CAM-7441
Reporter	J3VD3gn What is this name? This pseudonym name was generated based on the user name in JIRA to protect the personal data of our JIRA users. You can use this identifier to search for issues by the same reporter.
Has restricted visibility comments	true

The task query in combination with authorization and identity link are slow on MySQL.

To reproduce start 5000 process instances of the invoice process.

EXPLAIN SELECT Count( DISTINCT res.id_ )
FROM ACT*RU*TASK res
inner join ACT*RU*IDENTITYLINK i
ON i.task*id_ = res.id*
left join ACT*RE*PROCDEF procdef
ON res.proc*def_id_ = procdef.id*
left join ( SELECT a.*
FROM ACT*RU*AUTHORIZATION a
WHERE a.type_ < 2 AND
( a.user*id* IN ( 'demo', '*' ) OR
a.group*id* IN ( 'camunda-admin', 'management', 'accounting', 'sales' ) ) AND
(( a.resource*type* = 7 AND
a.perms_ & 2 = 2 OR
a.resource*type* = 6 AND
a.perms_ & 64 = 64 )) ) auth
ON ( auth.resource*id_ IN ( res.id_, procdef.key*, '*' ) ) #<-- this is the problematic part 
WHERE res.assignee_ IS NULL AND
res.assignee_ IS NULL AND
 i.type_ = 'candidate' AND
( i.group*id* IN ( 'accounting', 'camunda-admin', 'management', 'sales' ) ) AND
 res.suspension*state* = 1 AND
( ( res.case*execution_id* IS NOT NULL ) OR
( auth.resource*id* IS NOT NULL ) );

With authorization the query returns a result after 30s, without authorization after 1 s.
On 7.4 with the old authorization query it runs as well much faster.

Problem is that on this combination the existing indices are not used by MySQL.

Hint:
Since we test in the performance test suite the task query with authorization and 50k task.
I assume that the combination with the identity link table is the problem.

Links:

is related to https://jira.camunda.com/browse/SUPPORT-6219
is related to https://jira.camunda.com/browse/CAM-11277

ThorbenLindhauer · 2022-02-11T07:58:17Z

This comment was imported from JIRA and written by user @tmetzke

Copied from CAM-14092:

Hello Tobias,

Thank you for reply.

We are using MySQL 5.7.35.

The total rows for tables are:

SELECT COUNT(1) FROM ACTREPROCDEF; --> 3;
SELECT COUNT(1) FROM ACTRUAUTHORIZATION; --> 11868;
SELECT COUNT(1) FROM ACTRUTASK; --> 11746;

ps: As you can see tables are small in row numbers.

I'm talking about the query for a request like this:

GET /task/count?taskDefinitionKey=atender*reclamacao&unassigned=true&firstResult=0&maxResults=25&sortBy=dueDate&sortOrder=desc&taskVariables=grupo_eq_GEROP<ins>-</ins>COPER&processDefinitionKey=tratativa*backoffice

OR

GET /task/count?assigned=true&firstResult=0&maxResults=10&sortBy=dueDate&sortOrder=desc&processDefinitionKey=tratativa_backoffice

I attached the queries and explains, you can see below the query time for each strategy.

<^{explain-in-strategy.html>}

120,097 sec / Lost connection...

<^{explain-or-strategy.html>}

^{29,106 sec / 0,0000069 sec}

^<explain-multiples-joins-strategy.html>
^

0,779 sec / 0,0000050 sec

Please let me know if you need something more.

Regards,

Denis.

mojtaba-khallash · 2023-01-14T08:43:35Z

Hi @ThorbenLindhauer

I have the same problem in task table with 14k records. The duration of calculating the count/list request is about 1 minute and 30 seconds.

SELECT		COUNT(DISTINCT RES.ID_ )
FROM		ACT_RU_TASK RES
LEFT JOIN	ACT_RU_IDENTITYLINK I ON I.TASK_ID_ = RES.ID_
LEFT JOIN	ACT_RE_PROCDEF D ON RES.PROC_DEF_ID_ = D.ID_
INNER JOIN (
    SELECT	A.*
    FROM	ACT_RU_AUTHORIZATION A
    WHERE	A.TYPE_ < 2     
    AND		( 
		A.USER_ID_ IN ( 'user1', '*')
		OR A.GROUP_ID_ IN  (  'G1', 'G2', 'G3', 'G4', 'G5', 'G6' ) 
    )
    AND (A.RESOURCE_TYPE_ = 7 AND A.PERMS_ & 2 = 2 OR A.RESOURCE_TYPE_ = 6 AND A.PERMS_ &64 = 64)
) AUTH ON AUTH.RESOURCE_ID_ IN ( RES.ID_ , D.KEY_ , '*' )
WHERE	RES.SUSPENSION_STATE_ = 1
AND	AUTH.RESOURCE_ID_ IS NOT NULL;

In general, when combining tables, if possible, it is better to move Inner Joins to the beginning of the query and Full Joins to the end of the query. I changed task.xml by putting the I_JOIN after AuthorizationCheck join, the time was reduced to about 30 seconds for both count/list request.

SELECT		COUNT(DISTINCT RES.ID_ )
FROM		ACT_RU_TASK RES
LEFT JOIN	ACT_RE_PROCDEF D ON RES.PROC_DEF_ID_ = D.ID_
INNER JOIN (
    SELECT	A.*
    FROM	ACT_RU_AUTHORIZATION A
    WHERE	A.TYPE_ < 2     
    AND		( 
		A.USER_ID_ IN ( 'user1', '*')
		OR A.GROUP_ID_ IN  (  'G1', 'G2', 'G3', 'G4', 'G5', 'G6' ) 
    )
    AND (A.RESOURCE_TYPE_ = 7 AND A.PERMS_ & 2 = 2 OR A.RESOURCE_TYPE_ = 6 AND A.PERMS_ &64 = 64)
) AUTH ON AUTH.RESOURCE_ID_ IN ( RES.ID_ , D.KEY_ , '*' )
LEFT JOIN	ACT_RU_IDENTITYLINK I ON I.TASK_ID_ = RES.ID_
WHERE	RES.SUSPENSION_STATE_ = 1
AND	AUTH.RESOURCE_ID_ IS NOT NULL;

If you agree with this idea, I will prepare PR.

ThorbenLindhauer · 2023-01-24T10:02:48Z

Hi @mojtaba-khallash,

Thanks for sharing this idea. I haven't heard of it yet, but it can well be that this improves performance. Is there any reference that contributes your statement that the join order matters besides the observed execution time? If you have, it would also be nice to see query plans for the original and the refactored query.

Cheers,
Thorben

mojtaba-khallash · 2023-01-25T09:56:58Z

Hi @ThorbenLindhauer

The main idea is based on the number of final results per join. The lowest number of results is selected in Inner Join and the largest in Full Join.

Original query

Refactored query

ThorbenLindhauer · 2023-01-26T15:22:49Z

Thanks for sharing that. I'm a little surprised that the query planner isn't able to rewrite the order of joins to the most optimal one, but I guess it just cannot tell that it's irrelevant if we get less output rows after all the joins are applied.

Feel free to create a pull request for that change. We should create a separate ticket for this optimiziation, too. The description here is about indexes on MySQL not being applied and it's unlikely the join order changes this.

Btw did you modify the query from your previous comment, such as removing predicates? I'm asking because the join with ACT_RU_IDENTITYLINK_ is not needed at all.

mojtaba-khallash · 2023-01-26T16:10:51Z

Sorry, I made the change in the source code and tested it on MS SQL Server, but apparently two predicates were not sent when writing the comment!

Original query

SELECT		COUNT(DISTINCT RES.ID_ )
FROM		ACT_RU_TASK RES
LEFT JOIN	ACT_RU_IDENTITYLINK I ON I.TASK_ID_ = RES.ID_
LEFT JOIN	ACT_RE_PROCDEF D ON RES.PROC_DEF_ID_ = D.ID_
INNER JOIN (
    SELECT	A.*
    FROM	ACT_RU_AUTHORIZATION A
    WHERE	A.TYPE_ < 2     
    AND		( 
		A.USER_ID_ IN ( 'user1', '*')
		OR A.GROUP_ID_ IN  (  'G1', 'G2', 'G3', 'G4', 'G5', 'G6' ) 
    )
    AND (A.RESOURCE_TYPE_ = 7 AND A.PERMS_ & 2 = 2 OR A.RESOURCE_TYPE_ = 6 AND A.PERMS_ &64 = 64)
) AUTH ON AUTH.RESOURCE_ID_ IN ( RES.ID_ , D.KEY_ , '*' )
WHERE	I.TYPE_ = 'candidate' 
AND	I.GROUP_ID_ IN (N'G1') 
AND	RES.SUSPENSION_STATE_ = 1
AND	AUTH.RESOURCE_ID_ IS NOT NULL;

Refactored query

SELECT		COUNT(DISTINCT RES.ID_ )
FROM		ACT_RU_TASK RES
LEFT JOIN	ACT_RE_PROCDEF D ON RES.PROC_DEF_ID_ = D.ID_
INNER JOIN (
    SELECT	A.*
    FROM	ACT_RU_AUTHORIZATION A
    WHERE	A.TYPE_ < 2     
    AND		( 
		A.USER_ID_ IN ( 'user1', '*')
		OR A.GROUP_ID_ IN  (  'G1', 'G2', 'G3', 'G4', 'G5', 'G6' ) 
    )
    AND (A.RESOURCE_TYPE_ = 7 AND A.PERMS_ & 2 = 2 OR A.RESOURCE_TYPE_ = 6 AND A.PERMS_ &64 = 64)
) AUTH ON AUTH.RESOURCE_ID_ IN ( RES.ID_ , D.KEY_ , '*' )
LEFT JOIN	ACT_RU_IDENTITYLINK I ON I.TASK_ID_ = RES.ID_
WHERE	I.TYPE_ = 'candidate' 
AND	I.GROUP_ID_ IN (N'G1') 
AND	RES.SUSPENSION_STATE_ = 1
AND	AUTH.RESOURCE_ID_ IS NOT NULL;

ThorbenLindhauer · 2023-02-02T13:16:07Z

Not sure I understand the query plans. Both plans begin (at the right hand side) with an inner join (second join to the right) between ACT_RU_IDENTITYLINK and ACT_RU_TASK. That's not what the SQL specifies, or am I missing something here?

If you could export the plans as sqlplan files (XML), that would be fantastic, then it's easier for me to answer such questions myself.

mojtaba-khallash · 2023-02-02T15:26:56Z

Files are attached. I don't think I left anything else :)

sqlplan.zip

Do you get another sqlplan by executing these two queries in the database? Let me know if I'm wrong somewhere.

ThorbenLindhauer · 2023-02-17T10:18:18Z

Hey @mojtaba-khallash,

Sorry for the delay here, I was pretty busy the last two weeks.

Thanks for sharing the plans, I had a first look at them. Before I move ahead: Did you take generate these plans against a database that has a production-like amount of data?

The reason I am asking is that the stats seem off. For example, the clustered index scan on ACT_RU_TASK in the original query plan (i.e. the part that implements SELECT ID_, PROC_DEF_ID_ FROM ACT_RU_TASK WHERE SUSPENSION_STATE_ = 1; node id 7 in the plan) has "Estimated Number of Rows" and "Estimated Number of Rows to be read" with a value of 1.

mojtaba-khallash · 2023-02-24T15:48:07Z

Hi @ThorbenLindhauer

No I did not have access to the original data and reported on the test database. The following files are executed on the original data. Of course, the data has changed since the first comment was posted, but the improvement is still visible.

actual-execution-plan.zip

ThorbenLindhauer · 2023-03-07T08:21:38Z

Thanks for sharing the actual plans, I had a closer look now.

I can see that neither of the plans use an index to access the authorization table for this query and in both queries the authorization check is the expensive operation (nested loop join (node id 6) in the original query plan; nested loops join (node id 7) and key lookup (node id 28) in the refactored query plan). I believe that two factors contribute to the costs: They use table scans on ACT_RU_AUTHORIZATION (or key lookups for a very large number of rows) and they use nested loops joins (instead of e.g. more efficient merge joins).

The improvement you report is okay, but not really an order of magnitude better. 90 secs vs 30 secs is not a drastic improvement, of course it's noticable in real life, but it doesn't seem like solving the root problem that we see here. In that sense, I'm rather reluctant to rewrite the query like proposed and would rather look in other directions that may improve the query more.

Picking up on the analysis from the first paragraph, there are two ways:

Ensure that a more efficient join is used. I don't know how to tackle this one, since I don't understand under which circumstances SQL server chooses which join type.
Avoid full table scans or key lookups with a lot of inputs for ACT_RU_AUTHORIZATION. Here you could see if you can find an index that captures the following predicate better:

WHERE A.TYPE_ < 2
  AND (
    A.USER_ID_ IN (
      N'testUser'
      ,'*'
      )
    OR A.GROUP_ID_ IN (
      N'TenantAdmin'
      ,N'SaleExpert'
      ,N'SupportManager'
      ,N'SaleManager'
      ,N'SupportExpert'
      ,N'ContentExpert'
      )
    )
  AND (
    A.RESOURCE_TYPE_ = 7
    AND A.PERMS_ & 2 = 2
    OR A.RESOURCE_TYPE_ = 6
    AND A.PERMS_ & 64 = 64
    )

In the engine, we already have two indexes:

create unique index ACT_UNIQ_AUTH_USER on ACT_RU_AUTHORIZATION (TYPE_,USER_ID_,RESOURCE_TYPE_,RESOURCE_ID_) where USER_ID_ is not null;
create unique index ACT_UNIQ_AUTH_GROUP on ACT_RU_AUTHORIZATION (TYPE_,GROUP_ID_,RESOURCE_TYPE_,RESOURCE_ID_) where GROUP_ID_ is not null;

Neither of them is used. I would guess because the predicate is on both USER_ID_ and GROUP_ID_ and is an OR condition. Maybe one index that uses both fields would work better. I recommend to look into this with a DBA, since this depends on more factors than what we can see in the query plan (e.g. table cardinalities and data distribution) and I cannot claim I am a SQL server expert.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Task Query with Authorization and Identity link are slow #2217

Task Query with Authorization and Identity link are slow #2217

ThorbenLindhauer commented Feb 6, 2017 •

edited

ThorbenLindhauer commented Feb 11, 2022

mojtaba-khallash commented Jan 14, 2023

ThorbenLindhauer commented Jan 24, 2023

mojtaba-khallash commented Jan 25, 2023

ThorbenLindhauer commented Jan 26, 2023

mojtaba-khallash commented Jan 26, 2023 •

edited

ThorbenLindhauer commented Feb 2, 2023

mojtaba-khallash commented Feb 2, 2023

ThorbenLindhauer commented Feb 17, 2023

mojtaba-khallash commented Feb 24, 2023

ThorbenLindhauer commented Mar 7, 2023

Task Query with Authorization and Identity link are slow #2217

Task Query with Authorization and Identity link are slow #2217

Comments

ThorbenLindhauer commented Feb 6, 2017 • edited

ThorbenLindhauer commented Feb 11, 2022

mojtaba-khallash commented Jan 14, 2023

ThorbenLindhauer commented Jan 24, 2023

mojtaba-khallash commented Jan 25, 2023

ThorbenLindhauer commented Jan 26, 2023

mojtaba-khallash commented Jan 26, 2023 • edited

ThorbenLindhauer commented Feb 2, 2023

mojtaba-khallash commented Feb 2, 2023

ThorbenLindhauer commented Feb 17, 2023

mojtaba-khallash commented Feb 24, 2023

ThorbenLindhauer commented Mar 7, 2023

ThorbenLindhauer commented Feb 6, 2017 •

edited

mojtaba-khallash commented Jan 26, 2023 •

edited