Description
Expected behaviour
MCPClient resource usage follows a predictable pattern, allowing the number of clients to be scaled depending on the available system resources.
Current behaviour
MCPClient resource usage can be quite variable. A typical script (say assign_file_uuids) is one thread running in one process. Many scripts fork a single additional process via Popen (e.g. examine_contents runs bulk_extractor), which isn't really a problem in itself (although it depends on the behaviour of what's forked; some commands, such as ffmpeg or compression, are much more demanding of CPU than others). However, scripts that define concurrent_instances (e.g. archivematica_clamscan) fork a number of child processes (number of system CPUs * 2, as they fork twice). Some scripts (e.g. characterize_file) both define concurrent_instances and fork via Popen, resulting in 1 + (number of CPUs * 3) executing processes. The problem compounds on high-core systems running multiple MCPClients.
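For illustration, here is a minimal sketch of the forking pattern described above. This is not MCPClient's actual implementation; the use of multiprocessing.Pool and the helper names are assumptions, but it shows how a script that advertises concurrent_instances and also shells out via Popen multiplies processes per client.

```python
# Minimal sketch of the pattern described above -- NOT the actual MCPClient
# code. It assumes concurrent_instances is derived from the CPU count and
# that each worker shells out via Popen.
import multiprocessing
import subprocess


def concurrent_instances():
    # Scripts such as archivematica_clamscan / characterize_file advertise
    # how many instances may run at once.
    return multiprocessing.cpu_count()


def run_one(path):
    # Each worker forks one more process for the external tool
    # (bulk_extractor, FITS, ffmpeg, ...). "echo" is a stand-in here.
    subprocess.Popen(["echo", path]).wait()


if __name__ == "__main__":
    # One pool of workers per batch of tasks; each worker adds a Popen'd
    # command on top, so the per-client process count multiplies quickly.
    with multiprocessing.Pool(concurrent_instances()) as pool:
        pool.map(run_one, ["file-%d" % i for i in range(125)])
```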
In a worst-case situation, if I run 8 MCPClients on a 16-core system (which seems like it would leave a reasonable amount of room for MySQL, the Dashboard, MCPServer, the Storage Service and other services), and I run a single transfer with over 1000 files (so that one batch of 125 tasks is passed to each of the 8 clients), then when it hits characterize_file the number of running Python processes suddenly jumps from 8 to 264 (8 clients * (1 master process + (16 cores * 2 forks))), plus 128 command processes (FITS via nailgun in this case). In the particular case of characterize_file this behaviour is problematic because it's not clear that nailgun can actually process multiple requests at a time, so most of these processes are probably just waiting for the single running nailgun server.
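The worst-case figures above can be reproduced with a little arithmetic (this is an estimate based on the counts in this report, not a measurement):

```python
def worst_case_processes(clients=8, cores=16):
    """Estimate process counts from the figures above (not a measurement)."""
    python_per_client = 1 + cores * 2            # master + forked workers
    python_total = clients * python_per_client   # 8 * 33 = 264
    command_total = clients * cores              # Popen'd commands (FITS here)
    return python_total, command_total


print(worst_case_processes())  # -> (264, 128)
```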
Steps to reproduce
- Configure multiple MCPClients on a multi-core system.
- Run a large (1000+ files) transfer.
- Observe the number of processes (see the sketch after this list).
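One possible way to watch the counts is a rough helper that scans /proc on Linux; the patterns below are only examples and may need adjusting for your deployment:

```python
# Count running processes whose command line matches a pattern (Linux only).
import os


def count_processes(pattern):
    count = 0
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open("/proc/%s/cmdline" % pid, "rb") as f:
                cmdline = f.read().replace(b"\0", b" ").decode(errors="replace")
        except OSError:
            continue  # process exited while we were scanning
        if pattern in cmdline:
            count += 1
    return count


for pattern in ("MCPClient", "fits", "bulk_extractor"):
    print(pattern, count_processes(pattern))
```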
Your environment (version of Archivematica, operating system, other relevant details)
docker-compose / qa/1.x
For Artefactual use:
Before you close this issue, you must check off the following:
- All pull requests related to this issue are properly linked
- All pull requests related to this issue have been merged
- A testing plan for this issue has been implemented and passed (testing plan information should be included in the issue body or comments)
- Documentation regarding this issue has been written and merged
- Details about this issue have been added to the release notes (if applicable)