Description
Expected behaviour
Calls to /transfers/status should not fail, and the synchronous getUnitStatuses task should not fail, even on an instance with millions of Jobs.
Current behaviour
When loading /transfers/status, the call fails after around 30 seconds. Debugging showed that the timeout occurs during the call to getUnitStatuses.
Looking at https://github.com/artefactual/archivematica/blob/42b7dfa46026d78dd16e41504c6aa5a41cb4dd33/src/archivematica/MCPServer/server/rpc_server.py#L336-L415, we see that it runs this query:
SELECT SIPUUID,
MAX(UNIX_TIMESTAMP(createdTime) + createdTimeDec) AS timestamp
FROM Jobs
WHERE unitType=%s AND NOT SIPUUID LIKE '%%None%%'
GROUP BY SIPUUID;
When we run this query manually, it takes over a minute, so it is likely at least one cause of the timeout.
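For readers unfamiliar with the query, here is a minimal sketch of what it computes: the most recent job timestamp per SIPUUID for one unit type, skipping rows whose SIPUUID contains "None". It runs against an in-memory SQLite table rather than MySQL, so the four-column table, the sample rows, and the use of `strftime('%s', ...)` as a stand-in for `UNIX_TIMESTAMP` (treating times as UTC) are all assumptions made for illustration. It says nothing about performance.

```python
import sqlite3

# Hypothetical, cut-down Jobs table; the real table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Jobs ("
    "SIPUUID TEXT, unitType TEXT, createdTime TEXT, createdTimeDec REAL)"
)

# Invented sample rows, for illustration only.
rows = [
    ("aaaa", "unitTransfer", "2024-01-01 00:00:00", 0.25),
    ("aaaa", "unitTransfer", "2024-01-01 00:00:10", 0.50),
    ("bbbb", "unitTransfer", "2024-01-02 00:00:00", 0.00),
    ("cccc", "unitSIP", "2024-01-03 00:00:00", 0.00),
    ("None", "unitTransfer", "2024-01-04 00:00:00", 0.00),
]
conn.executemany("INSERT INTO Jobs VALUES (?, ?, ?, ?)", rows)

# Same shape as the query in the issue. SQLite has no UNIX_TIMESTAMP,
# so strftime('%s', ...) is used as an approximation (assumes UTC).
query = """
SELECT SIPUUID,
       MAX(CAST(strftime('%s', createdTime) AS INTEGER) + createdTimeDec) AS timestamp
FROM Jobs
WHERE unitType = ? AND NOT SIPUUID LIKE '%None%'
GROUP BY SIPUUID
"""

# Map each SIPUUID to the timestamp of its latest job.
latest = dict(conn.execute(query, ("unitTransfer",)).fetchall())
for sip_uuid, timestamp in sorted(latest.items()):
    print(sip_uuid, timestamp)
```

The result has one row per transfer no matter how many Jobs rows it has, which is why the output is small even when the table is large.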
Steps to reproduce
Since this depends on database state, it is probably hard to reproduce directly. We have around 3 million Jobs in our system.
Your environment (version of Archivematica, operating system, other relevant details)
We're using Archivematica 1.17.
Here's the output of EXPLAIN ANALYZE for the query:
+---------+
| EXPLAIN |
+---------+
| -> Group aggregate: max((unix_timestamp(Jobs.createdTime) + Jobs.createdTimeDec)) (cost=392147 rows=43406) (actual time=3.96..65252 rows=20258 loops=1)
    -> Filter: ((Jobs.unitType = 'unitTransfer') and (not((Jobs.SIPUUID like '%%None%%')))) (cost=368046 rows=241017) (actual time=2.16..64626 rows=971805 loops=1)
        -> Index scan on Jobs using Jobs_SIPUUID_246989a2 (cost=368046 rows=2.71e+6) (actual time=2.15..63944 rows=3.02e+6 loops=1)
 |
+---------+
1 row in set (1 min 5.27 sec)
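To make it easier to see where the time goes, here is a small sketch that pulls the "actual time" figures out of the plan above and reports each node's share of the total. The regular expression and the `parse_plan` helper are illustrative, not part of Archivematica or MySQL. It assumes the second number in `actual time=a..b` is the time in milliseconds to return all rows, and that a node's time includes its children.

```python
import re

# Plan lines copied from the EXPLAIN output above (table borders omitted).
PLAN = """\
-> Group aggregate: max((unix_timestamp(Jobs.createdTime) + Jobs.createdTimeDec)) (cost=392147 rows=43406) (actual time=3.96..65252 rows=20258 loops=1)
-> Filter: ((Jobs.unitType = 'unitTransfer') and (not((Jobs.SIPUUID like '%%None%%')))) (cost=368046 rows=241017) (actual time=2.16..64626 rows=971805 loops=1)
-> Index scan on Jobs using Jobs_SIPUUID_246989a2 (cost=368046 rows=2.71e+6) (actual time=2.15..63944 rows=3.02e+6 loops=1)
"""

# Captures the node name, the time to the last row, and the actual row count.
NODE = re.compile(
    r"->\s*(?P<name>[^:(]+).*"
    r"\(actual time=(?P<first>\d+(?:\.\d+)?)\.\.(?P<last>\d+(?:\.\d+)?)"
    r" rows=(?P<rows>[\d.e+]+) loops=\d+\)"
)


def parse_plan(text):
    """Return (node name, ms to last row, actual rows) for each plan line."""
    nodes = []
    for line in text.splitlines():
        m = NODE.search(line)
        if m:
            nodes.append((m["name"].strip(), float(m["last"]), float(m["rows"])))
    return nodes


nodes = parse_plan(PLAN)
total_ms = nodes[0][1]  # the root node's time covers the whole query
for name, last_ms, rows in nodes:
    print(f"{name:<50} {last_ms / 1000:6.1f} s  {last_ms / total_ms:6.1%}  {rows:>12,.0f} rows")
```

On these figures the index scan takes about 63.9 of the 65.3 seconds, roughly 98% of the total. It reads about 3.02 million rows, of which 971,805 pass the filter. That suggests most of the cost is in reading the whole Jobs_SIPUUID_246989a2 index rather than in the grouping step.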
For Artefactual use:
Before you close this issue, you must check off the following:
- All pull requests related to this issue are properly linked
- All pull requests related to this issue have been merged
- A testing plan for this issue has been implemented and passed (testing plan information should be included in the issue body or comments)
- Documentation regarding this issue has been written and merged (if applicable)
- Details about this issue have been added to the release notes (if applicable)