Description
Expected behaviour
The MCPClient worker queue handles errors raised by the Transcribe job.
Current behaviour
There are a few key aspects to this problem:
- Task batching. The MCPServer creates transcription task batches by iterating through:
  - Files listed in the database, then
  - Files found in the filesystem under the objects directory.
- Transcription output. The Transcribe job writes tesseract output to the objects/metadata/OCRfiles directory.
- Argument parsing. The Transcribe job expects two arguments: a task ID and a file UUID. Both are validated as UUIDs during argument parsing.
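The argument parsing behaviour described in the last point can be reproduced with a minimal sketch. It assumes the job builds its parser with argparse and uses uuid.UUID as the argument type, which is consistent with the usage line and error message in the logs but is not copied from the actual job code. The names build_parser and parse_or_exit_code are hypothetical.

```python
import argparse
import uuid


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser that mirrors the usage line shown in the logs;
    # it is an assumption, not the real Transcribe job code.
    parser = argparse.ArgumentParser(prog="archivematicaClient.py")
    parser.add_argument("task_uuid", type=uuid.UUID)
    parser.add_argument("file_uuid", type=uuid.UUID)
    return parser


def parse_or_exit_code(argv):
    """Return the parsed namespace, or the SystemExit code if parsing fails."""
    try:
        return build_parser().parse_args(argv)
    except SystemExit as exc:
        # argparse reports invalid arguments by printing usage to stderr
        # and raising SystemExit, not a regular Exception subclass.
        return exc.code


if __name__ == "__main__":
    task = str(uuid.uuid4())
    print(parse_or_exit_code([task, str(uuid.uuid4())]))
    print(parse_or_exit_code([task, "None"]))
```

Passing the string None as the file UUID makes argparse exit instead of raising an ordinary exception, which is why the failure mode differs from a plain validation error.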
When processing a transfer with a large number of transcribable files, the MCPServer may submit transcription tasks to gearman before it finishes iterating through all files. This was discovered in a client transfer with about 1,700 files, using the default batch size of 128 in the MCPServer settings.
As a result, the first tasks begin creating output in objects/metadata/OCRfiles while the MCPServer is still batching. These newly created files are picked up during the filesystem iteration, even though they didn’t exist at the start of the batching process. Since these files don't have a corresponding UUID (they are output files, not input transfer files), the MCPServer assigns None as the UUID.
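The interaction between early submission and the filesystem iteration can be illustrated with a small simulation. This is only a sketch under stated assumptions: tasks run synchronously as soon as a batch is full (the real tasks run asynchronously through gearman), the batch size is shrunk to 2, and iter_candidates, batch_and_submit, run_task and demo are hypothetical names that do not come from the MCPServer code.

```python
import tempfile
import uuid
from pathlib import Path

BATCH_SIZE = 2  # kept tiny for the demo; the issue mentions a default of 128


def run_task(objects_dir: Path, name: str) -> None:
    """Stand-in for the Transcribe job: write an output file."""
    out_dir = objects_dir / "metadata" / "OCRfiles"
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / f"{name}.txt").write_text("ocr output")


def iter_candidates(objects_dir: Path, known: dict):
    """Yield (relative path, uuid): database files first, then the filesystem."""
    for rel, file_uuid in known.items():
        yield rel, file_uuid
    # The directory is only scanned once the database files are exhausted,
    # so anything written in the meantime is visible here.
    for path in sorted(objects_dir.rglob("*")):
        rel = path.relative_to(objects_dir).as_posix()
        if path.is_file() and rel not in known:
            yield rel, None  # no database row, hence no UUID


def batch_and_submit(objects_dir: Path, known: dict) -> list:
    submitted, batch = [], []
    for item in iter_candidates(objects_dir, known):
        batch.append(item)
        if len(batch) == BATCH_SIZE:
            # Early submission: tasks run before iteration has finished.
            for rel, file_uuid in batch:
                if file_uuid is not None:
                    run_task(objects_dir, Path(rel).stem)
            submitted.extend(batch)
            batch = []
    submitted.extend(batch)
    return submitted


def demo() -> list:
    with tempfile.TemporaryDirectory() as tmp:
        objects_dir = Path(tmp) / "objects"
        objects_dir.mkdir()
        known = {}
        for name in ("a.tif", "b.tif", "c.tif"):
            (objects_dir / name).write_text("image data")
            known[name] = str(uuid.uuid4())
        return batch_and_submit(objects_dir, known)


if __name__ == "__main__":
    for rel, file_uuid in demo():
        print(rel, file_uuid)
```

With three input files and a batch size of two, the first batch runs before the filesystem scan starts, so the scan finds the two output files already written and submits them with None as the UUID.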
When the transcription task is later run on these new files, the argument parser raises a SystemExit error due to invalid arguments. This error is not handled by the MCPClient worker queue, causing it to enter a broken state and spawn zombie processes. Over time, this leads to increased memory usage and stalls the job.
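Why a SystemExit can slip past error handling is easier to see in isolation. SystemExit derives from BaseException rather than Exception, so a handler written as except Exception does not catch it. The sketch below is not the MCPClient worker queue code; run_narrow, run_guarded and the two failing callables are hypothetical and only show one way a worker could turn both kinds of failure into an exit code.

```python
import sys


def failing_with_exit():
    # Mimics what argparse does on invalid arguments.
    sys.exit(2)


def failing_with_error():
    raise ValueError("bad input")


def run_narrow(func) -> int:
    """Handle only Exception subclasses; SystemExit propagates to the caller."""
    try:
        func()
        return 0
    except Exception:
        return 1


def run_guarded(func) -> int:
    """Convert both ordinary errors and SystemExit into an exit code."""
    try:
        func()
        return 0
    except Exception:
        return 1
    except SystemExit as exc:
        # exc.code may be None or a string; fall back to 1 in those cases.
        return exc.code if isinstance(exc.code, int) else 1


if __name__ == "__main__":
    print(run_guarded(failing_with_error))
    print(run_guarded(failing_with_exit))
    try:
        run_narrow(failing_with_exit)
    except SystemExit as exc:
        print("escaped the narrow handler with code", exc.code)
```

Whether catching SystemExit in the worker or avoiding the exit in the job is the better fix is a design decision for the maintainers; the sketch only shows the mechanism.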
This is what the MCPClient logs show when the problem occurs:
Jun 02 18:00:47 server python[3280065]: usage: archivematicaClient.py [-h] task_uuid file_uuid
Jun 02 18:00:47 server python[3280065]: archivematicaClient.py: error: argument file_uuid: invalid UUID value: 'None'
Your environment (version of Archivematica, operating system, other relevant details)
Archivematica 1.17.0.
In Archivematica 1.16.0, the Transcribe job does not validate the file UUID parameter. As a result, it later fails when attempting to look up the file in the database, producing the following error:
['“None” is not a valid UUID.']
Traceback (most recent call last):
File "/pyenv/data/versions/3.9.22/lib/python3.9/site-packages/django/db/models/fields/__init__.py", line 2688, in to_python
return uuid.UUID(**{input_form: value})
File "/pyenv/data/versions/3.9.22/lib/python3.9/uuid.py", line 177, in __init__
raise ValueError('badly formed hexadecimal UUID string')
ValueError: badly formed hexadecimal UUID string
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/src/src/MCPClient/lib/client/job.py", line 142, in JobContext
yield
File "/src/src/MCPClient/lib/clientScripts/transcribe_file.py", line 182, in call
job.set_status(main(job, task_uuid, file_uuid))
File "/src/src/MCPClient/lib/clientScripts/transcribe_file.py", line 108, in main
file_ = File.objects.get(uuid=file_uuid)
File "/pyenv/data/versions/3.9.22/lib/python3.9/site-packages/django/db/models/manager.py", line 87, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/pyenv/data/versions/3.9.22/lib/python3.9/site-packages/django/db/models/query.py", line 623, in get
clone = self._chain() if self.query.combinator else self.filter(*args, **kwargs)
File "/pyenv/data/versions/3.9.22/lib/python3.9/site-packages/django/db/models/query.py", line 1436, in filter
return self._filter_or_exclude(False, args, kwargs)
File "/pyenv/data/versions/3.9.22/lib/python3.9/site-packages/django/db/models/query.py", line 1454, in _filter_or_exclude
clone._filter_or_exclude_inplace(negate, args, kwargs)
File "/pyenv/data/versions/3.9.22/lib/python3.9/site-packages/django/db/models/query.py", line 1461, in _filter_or_exclude_inplace
self._query.add_q(Q(*args, **kwargs))
File "/pyenv/data/versions/3.9.22/lib/python3.9/site-packages/django/db/models/sql/query.py", line 1546, in add_q
clause, _ = self._add_q(q_object, self.used_aliases)
File "/pyenv/data/versions/3.9.22/lib/python3.9/site-packages/django/db/models/sql/query.py", line 1577, in _add_q
child_clause, needed_inner = self.build_filter(
File "/pyenv/data/versions/3.9.22/lib/python3.9/site-packages/django/db/models/sql/query.py", line 1492, in build_filter
condition = self.build_lookup(lookups, col, value)
File "/pyenv/data/versions/3.9.22/lib/python3.9/site-packages/django/db/models/sql/query.py", line 1319, in build_lookup
lookup = lookup_class(lhs, rhs)
File "/pyenv/data/versions/3.9.22/lib/python3.9/site-packages/django/db/models/lookups.py", line 27, in __init__
self.rhs = self.get_prep_lookup()
File "/pyenv/data/versions/3.9.22/lib/python3.9/site-packages/django/db/models/lookups.py", line 341, in get_prep_lookup
return super().get_prep_lookup()
File "/pyenv/data/versions/3.9.22/lib/python3.9/site-packages/django/db/models/lookups.py", line 85, in get_prep_lookup
return self.lhs.output_field.get_prep_value(self.rhs)
File "/pyenv/data/versions/3.9.22/lib/python3.9/site-packages/django/db/models/fields/__init__.py", line 2672, in get_prep_value
return self.to_python(value)
File "/pyenv/data/versions/3.9.22/lib/python3.9/site-packages/django/db/models/fields/__init__.py", line 2690, in to_python
raise exceptions.ValidationError(
django.core.exceptions.ValidationError: ['“None” is not a valid UUID.']
This ValidationError is handled correctly by the MCPClient worker queue.
For Artefactual use:
Before you close this issue, you must check off the following:
- All pull requests related to this issue are properly linked
- All pull requests related to this issue have been merged
- A testing plan for this issue has been implemented and passed (testing plan information should be included in the issue body or comments)
- Documentation regarding this issue has been written and merged (if applicable)
- Details about this issue have been added to the release notes (if applicable)