Description
The instrumentation tracks the processing spans that it creates for received SQS messages in a more or less global dictionary on the Boto3SQSInstrumentor.
The receipt handle of each received SQS message is used as the dictionary key.
Spans/entries are removed from this dictionary by the instrumentation only in one of the following cases:
- the SQS message is deleted with boto3's sqs.delete_message API
- the SQS message is deleted with boto3's sqs.delete_message_batch API
- the same SQS message is received again with boto3's sqs.receive_message API
According to the AWS docs, the receipt handle of a received message changes with every call to sqs.receive_message.
This means that the instrumentation's attempt to remove the span/entry from the dictionary in the instrumented sqs.receive_message function does not work as intended: the earlier entry is stored under the old receipt handle, so a lookup with the new handle never finds it. Memory is leaked whenever a message isn't properly deleted, e.g. if an exception happens before the sqs.delete_message API can be called.
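The leak described above can be reproduced without AWS. The following is only a minimal sketch, not the instrumentation's actual code: processing_spans, receive_message and delete_message are hypothetical stand-ins, and the fake receive simply hands out a fresh receipt handle on every call, as the AWS docs describe.

```python
import itertools

# Hypothetical stand-in for the dictionary kept on Boto3SQSInstrumentor.
processing_spans = {}
_handle_ids = itertools.count(1)


def receive_message(message_id):
    """Simulate sqs.receive_message: same message, new receipt handle each time."""
    receipt_handle = f"receipt-handle-{next(_handle_ids)}"
    # Cleanup attempt on receive: it can only look up the *new* handle,
    # so the entry stored under the previous handle is never found.
    processing_spans.pop(receipt_handle, None)
    processing_spans[receipt_handle] = f"processing span for {message_id}"
    return receipt_handle


def delete_message(receipt_handle):
    """Simulate sqs.delete_message: the only path that removes an entry here."""
    processing_spans.pop(receipt_handle, None)


# The same message is received three times because processing keeps failing
# before delete_message can be called.
handles = [receive_message("message-1") for _ in range(3)]
entries_before_delete = len(processing_spans)

# Only the last delivery is finally processed and deleted.
delete_message(handles[-1])
entries_after_delete = len(processing_spans)
```

In this sketch, the entries stored under the first two receipt handles stay in the dictionary after the final delete, because nothing ever refers to those handles again.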
Another case where memory might be leaked is when multiprocessing is used, e.g. when SQS messages are received in a parent process and the processing and deletion of messages is done in separate child processes.
In the parent process, the instrumentation adds the processing spans to the dictionary in sqs.receive_message, but when the messages are deleted and their spans removed in a child process, the dictionary in the parent process is unaffected.
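The multiprocessing case can be illustrated in the same way. This is a rough sketch under stated assumptions: processing_spans, child_delete and run_demo are hypothetical names, no real SQS calls are made, and the "fork" start method is assumed to be available (POSIX systems only).

```python
import multiprocessing as mp

# Hypothetical stand-in for the instrumentation's span dictionary.
processing_spans = {}


def child_delete(receipt_handle, result_queue):
    # The child operates on its own copy of the dictionary; removing the
    # entry here does not touch the parent's copy.
    processing_spans.pop(receipt_handle, None)
    result_queue.put(len(processing_spans))


def run_demo():
    # Assumption: the "fork" start method exists on this platform.
    ctx = mp.get_context("fork")
    processing_spans.clear()
    # Parent: entry added when the message is received.
    processing_spans["receipt-handle-1"] = "processing span"
    result_queue = ctx.Queue()
    worker = ctx.Process(target=child_delete, args=("receipt-handle-1", result_queue))
    worker.start()
    child_size = result_queue.get()  # read before join to avoid blocking
    worker.join()
    # Returns the dictionary size seen by the child and by the parent.
    return child_size, len(processing_spans)


if __name__ == "__main__":
    print(run_demo())
```

The child reports an empty dictionary after its delete, while the parent still holds the entry it added on receive.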
There might be other cases, but essentially, if for some reason an SQS message isn't deleted in the same context as the sqs.receive_message operation, the instrumentation leaves orphaned entries in its 'global' dictionary and in turn leaks memory.