Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Task Query with Authorization and Identity link are slow #2217

Open
ThorbenLindhauer opened this issue Feb 6, 2017 · 11 comments
Open

Task Query with Authorization and Identity link are slow #2217

ThorbenLindhauer opened this issue Feb 6, 2017 · 11 comments
Labels
scope:core-api Changes to the core API: engine, dmn-engine, feel-engine, REST API, OpenAPI type:bug Issues that describe a user-facing bug in the project.

Comments

@ThorbenLindhauer
Copy link
Member

ThorbenLindhauer commented Feb 6, 2017

This issue was imported from JIRA:

Field Value
JIRA Link CAM-7441
Reporter J3VD3gn
What is this name? This pseudonym name was generated based on the user name in JIRA to protect the personal data of our JIRA users. You can use this identifier to search for issues by the same reporter.
Has restricted visibility comments true

The task query in combination with authorization and identity link are slow on MySQL.

To reproduce start 5000 process instances of the invoice process.

EXPLAIN SELECT Count( DISTINCT res.id_ )
FROM ACT*RU*TASK res
inner join ACT*RU*IDENTITYLINK i
ON i.task*id_ = res.id*
left join ACT*RE*PROCDEF procdef
ON res.proc*def_id_ = procdef.id*
left join ( SELECT a.*
FROM ACT*RU*AUTHORIZATION a
WHERE a.type_ < 2 AND
( a.user*id* IN ( 'demo', '*' ) OR
a.group*id* IN ( 'camunda-admin', 'management', 'accounting', 'sales' ) ) AND
(( a.resource*type* = 7 AND
a.perms_ & 2 = 2 OR
a.resource*type* = 6 AND
a.perms_ & 64 = 64 )) ) auth
ON ( auth.resource*id_ IN ( res.id_, procdef.key*, '*' ) ) #<-- this is the problematic part 
WHERE res.assignee_ IS NULL AND
res.assignee_ IS NULL AND
 i.type_ = 'candidate' AND
( i.group*id* IN ( 'accounting', 'camunda-admin', 'management', 'sales' ) ) AND
 res.suspension*state* = 1 AND
( ( res.case*execution_id* IS NOT NULL ) OR
( auth.resource*id* IS NOT NULL ) );

With authorization the query returns a result after 30s, without authorization after 1 s.
On 7.4 with the old authorization query it runs as well much faster.

Problem is that on this combination the existing indices are not used by MySQL.

Hint:
Since we test in the performance test suite the task query with authorization and 50k task.
I assume that the combination with the identity link table is the problem.

Links:

@ThorbenLindhauer
Copy link
Member Author

This comment was imported from JIRA and written by user @tmetzke


Copied from CAM-14092:

Hello Tobias, 

Thank you for reply.

  1. We are using MySQL 5.7.35.

 

  1. The total rows for tables are:

SELECT COUNT(1) FROM ACTREPROCDEF; --> 3;
SELECT COUNT(1) FROM ACTRUAUTHORIZATION; --> 11868;
SELECT COUNT(1) FROM ACTRUTASK; --> 11746;

ps: As you can see tables are small in row numbers.

 

  1. I'm talking about the query for a request like this:

GET /task/count?taskDefinitionKey=atender*reclamacao&unassigned=true&firstResult=0&maxResults=25&sortBy=dueDate&sortOrder=desc&taskVariables=grupo_eq_GEROP<ins>-</ins>COPER&processDefinitionKey=tratativa*backoffice

OR

GET /task/count?assigned=true&firstResult=0&maxResults=10&sortBy=dueDate&sortOrder=desc&processDefinitionKey=tratativa_backoffice

 

  1. I attached the queries and explains, you can see below the query time for each strategy.

<explain-in-strategy.html>

120,097 sec / Lost connection...

<explain-or-strategy.html>

29,106 sec / 0,0000069 sec

<explain-multiples-joins-strategy.html>
^

0,779 sec / 0,0000050 sec

Please let me know if you need something more.

 

Regards,

Denis.

@mojtaba-khallash
Copy link

Hi @ThorbenLindhauer

I have the same problem in task table with 14k records. The duration of calculating the count/list request is about 1 minute and 30 seconds.

SELECT		COUNT(DISTINCT RES.ID_ )
FROM		ACT_RU_TASK RES
LEFT JOIN	ACT_RU_IDENTITYLINK I ON I.TASK_ID_ = RES.ID_
LEFT JOIN	ACT_RE_PROCDEF D ON RES.PROC_DEF_ID_ = D.ID_
INNER JOIN (
    SELECT	A.*
    FROM	ACT_RU_AUTHORIZATION A
    WHERE	A.TYPE_ < 2     
    AND		( 
		A.USER_ID_ IN ( 'user1', '*')
		OR A.GROUP_ID_ IN  (  'G1', 'G2', 'G3', 'G4', 'G5', 'G6' ) 
    )
    AND (A.RESOURCE_TYPE_ = 7 AND A.PERMS_ & 2 = 2 OR A.RESOURCE_TYPE_ = 6 AND A.PERMS_ &64 = 64)
) AUTH ON AUTH.RESOURCE_ID_ IN ( RES.ID_ , D.KEY_ , '*' )
WHERE	RES.SUSPENSION_STATE_ = 1
AND	AUTH.RESOURCE_ID_ IS NOT NULL;

In general, when combining tables, if possible, it is better to move Inner Joins to the beginning of the query and Full Joins to the end of the query. I changed task.xml by putting the I_JOIN after AuthorizationCheck join, the time was reduced to about 30 seconds for both count/list request.

SELECT		COUNT(DISTINCT RES.ID_ )
FROM		ACT_RU_TASK RES
LEFT JOIN	ACT_RE_PROCDEF D ON RES.PROC_DEF_ID_ = D.ID_
INNER JOIN (
    SELECT	A.*
    FROM	ACT_RU_AUTHORIZATION A
    WHERE	A.TYPE_ < 2     
    AND		( 
		A.USER_ID_ IN ( 'user1', '*')
		OR A.GROUP_ID_ IN  (  'G1', 'G2', 'G3', 'G4', 'G5', 'G6' ) 
    )
    AND (A.RESOURCE_TYPE_ = 7 AND A.PERMS_ & 2 = 2 OR A.RESOURCE_TYPE_ = 6 AND A.PERMS_ &64 = 64)
) AUTH ON AUTH.RESOURCE_ID_ IN ( RES.ID_ , D.KEY_ , '*' )
LEFT JOIN	ACT_RU_IDENTITYLINK I ON I.TASK_ID_ = RES.ID_
WHERE	RES.SUSPENSION_STATE_ = 1
AND	AUTH.RESOURCE_ID_ IS NOT NULL;

If you agree with this idea, I will prepare PR.

@ThorbenLindhauer
Copy link
Member Author

Hi @mojtaba-khallash,

Thanks for sharing this idea. I haven't heard of it yet, but it can well be that this improves performance. Is there any reference that contributes your statement that the join order matters besides the observed execution time? If you have, it would also be nice to see query plans for the original and the refactored query.

Cheers,
Thorben

@mojtaba-khallash
Copy link

Hi @ThorbenLindhauer

The main idea is based on the number of final results per join. The lowest number of results is selected in Inner Join and the largest in Full Join.

image


  • Original query

image

  • Refactored query

image

@ThorbenLindhauer
Copy link
Member Author

Thanks for sharing that. I'm a little surprised that the query planner isn't able to rewrite the order of joins to the most optimal one, but I guess it just cannot tell that it's irrelevant if we get less output rows after all the joins are applied.

Feel free to create a pull request for that change. We should create a separate ticket for this optimiziation, too. The description here is about indexes on MySQL not being applied and it's unlikely the join order changes this.

Btw did you modify the query from your previous comment, such as removing predicates? I'm asking because the join with ACT_RU_IDENTITYLINK_ is not needed at all.

@mojtaba-khallash
Copy link

mojtaba-khallash commented Jan 26, 2023

Sorry, I made the change in the source code and tested it on MS SQL Server, but apparently two predicates were not sent when writing the comment!

  • Original query
SELECT		COUNT(DISTINCT RES.ID_ )
FROM		ACT_RU_TASK RES
LEFT JOIN	ACT_RU_IDENTITYLINK I ON I.TASK_ID_ = RES.ID_
LEFT JOIN	ACT_RE_PROCDEF D ON RES.PROC_DEF_ID_ = D.ID_
INNER JOIN (
    SELECT	A.*
    FROM	ACT_RU_AUTHORIZATION A
    WHERE	A.TYPE_ < 2     
    AND		( 
		A.USER_ID_ IN ( 'user1', '*')
		OR A.GROUP_ID_ IN  (  'G1', 'G2', 'G3', 'G4', 'G5', 'G6' ) 
    )
    AND (A.RESOURCE_TYPE_ = 7 AND A.PERMS_ & 2 = 2 OR A.RESOURCE_TYPE_ = 6 AND A.PERMS_ &64 = 64)
) AUTH ON AUTH.RESOURCE_ID_ IN ( RES.ID_ , D.KEY_ , '*' )
WHERE	I.TYPE_ = 'candidate' 
AND	I.GROUP_ID_ IN (N'G1') 
AND	RES.SUSPENSION_STATE_ = 1
AND	AUTH.RESOURCE_ID_ IS NOT NULL;

image


  • Refactored query
SELECT		COUNT(DISTINCT RES.ID_ )
FROM		ACT_RU_TASK RES
LEFT JOIN	ACT_RE_PROCDEF D ON RES.PROC_DEF_ID_ = D.ID_
INNER JOIN (
    SELECT	A.*
    FROM	ACT_RU_AUTHORIZATION A
    WHERE	A.TYPE_ < 2     
    AND		( 
		A.USER_ID_ IN ( 'user1', '*')
		OR A.GROUP_ID_ IN  (  'G1', 'G2', 'G3', 'G4', 'G5', 'G6' ) 
    )
    AND (A.RESOURCE_TYPE_ = 7 AND A.PERMS_ & 2 = 2 OR A.RESOURCE_TYPE_ = 6 AND A.PERMS_ &64 = 64)
) AUTH ON AUTH.RESOURCE_ID_ IN ( RES.ID_ , D.KEY_ , '*' )
LEFT JOIN	ACT_RU_IDENTITYLINK I ON I.TASK_ID_ = RES.ID_
WHERE	I.TYPE_ = 'candidate' 
AND	I.GROUP_ID_ IN (N'G1') 
AND	RES.SUSPENSION_STATE_ = 1
AND	AUTH.RESOURCE_ID_ IS NOT NULL;

image

@ThorbenLindhauer
Copy link
Member Author

Not sure I understand the query plans. Both plans begin (at the right hand side) with an inner join (second join to the right) between ACT_RU_IDENTITYLINK and ACT_RU_TASK. That's not what the SQL specifies, or am I missing something here?

If you could export the plans as sqlplan files (XML), that would be fantastic, then it's easier for me to answer such questions myself.

@mojtaba-khallash
Copy link

Files are attached. I don't think I left anything else :)

sqlplan.zip

Do you get another sqlplan by executing these two queries in the database? Let me know if I'm wrong somewhere.

@ThorbenLindhauer
Copy link
Member Author

Hey @mojtaba-khallash,

Sorry for the delay here, I was pretty busy the last two weeks.

Thanks for sharing the plans, I had a first look at them. Before I move ahead: Did you take generate these plans against a database that has a production-like amount of data?

The reason I am asking is that the stats seem off. For example, the clustered index scan on ACT_RU_TASK in the original query plan (i.e. the part that implements SELECT ID_, PROC_DEF_ID_ FROM ACT_RU_TASK WHERE SUSPENSION_STATE_ = 1; node id 7 in the plan) has "Estimated Number of Rows" and "Estimated Number of Rows to be read" with a value of 1.

@mojtaba-khallash
Copy link

Hi @ThorbenLindhauer

No I did not have access to the original data and reported on the test database. The following files are executed on the original data. Of course, the data has changed since the first comment was posted, but the improvement is still visible.

actual-execution-plan.zip

@ThorbenLindhauer
Copy link
Member Author

Thanks for sharing the actual plans, I had a closer look now.

I can see that neither of the plans use an index to access the authorization table for this query and in both queries the authorization check is the expensive operation (nested loop join (node id 6) in the original query plan; nested loops join (node id 7) and key lookup (node id 28) in the refactored query plan). I believe that two factors contribute to the costs: They use table scans on ACT_RU_AUTHORIZATION (or key lookups for a very large number of rows) and they use nested loops joins (instead of e.g. more efficient merge joins).

The improvement you report is okay, but not really an order of magnitude better. 90 secs vs 30 secs is not a drastic improvement, of course it's noticable in real life, but it doesn't seem like solving the root problem that we see here. In that sense, I'm rather reluctant to rewrite the query like proposed and would rather look in other directions that may improve the query more.

Picking up on the analysis from the first paragraph, there are two ways:

  1. Ensure that a more efficient join is used. I don't know how to tackle this one, since I don't understand under which circumstances SQL server chooses which join type.
  2. Avoid full table scans or key lookups with a lot of inputs for ACT_RU_AUTHORIZATION. Here you could see if you can find an index that captures the following predicate better:
WHERE A.TYPE_ < 2
  AND (
    A.USER_ID_ IN (
      N'testUser'
      ,'*'
      )
    OR A.GROUP_ID_ IN (
      N'TenantAdmin'
      ,N'SaleExpert'
      ,N'SupportManager'
      ,N'SaleManager'
      ,N'SupportExpert'
      ,N'ContentExpert'
      )
    )
  AND (
    A.RESOURCE_TYPE_ = 7
    AND A.PERMS_ & 2 = 2
    OR A.RESOURCE_TYPE_ = 6
    AND A.PERMS_ & 64 = 64
    )

In the engine, we already have two indexes:

create unique index ACT_UNIQ_AUTH_USER on ACT_RU_AUTHORIZATION (TYPE_,USER_ID_,RESOURCE_TYPE_,RESOURCE_ID_) where USER_ID_ is not null;
create unique index ACT_UNIQ_AUTH_GROUP on ACT_RU_AUTHORIZATION (TYPE_,GROUP_ID_,RESOURCE_TYPE_,RESOURCE_ID_) where GROUP_ID_ is not null;

Neither of them is used. I would guess because the predicate is on both USER_ID_ and GROUP_ID_ and is an OR condition. Maybe one index that uses both fields would work better. I recommend to look into this with a DBA, since this depends on more factors than what we can see in the query plan (e.g. table cardinalities and data distribution) and I cannot claim I am a SQL server expert.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
scope:core-api Changes to the core API: engine, dmn-engine, feel-engine, REST API, OpenAPI type:bug Issues that describe a user-facing bug in the project.
Projects
None yet
Development

No branches or pull requests

2 participants