RuntimeError: dictionary changed size during iteration #653

@ghost

Description

We experienced an incident the other day where our servers were overloaded (100% CPU usage) and the New Relic agent was crashing, which made things much worse. A dictionary is being modified while it's being iterated over. Looking at the code, there does not seem to be any protection against this.

Expected Behavior
The agent does not crash and handles concurrency properly.

Troubleshooting

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/uvicorn/protocols/http/h11_impl.py", line 404, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/usr/local/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.9/site-packages/newrelic/api/asgi_application.py", line 359, in nr_async_asgi
    return await coro
  File "/usr/local/lib/python3.9/site-packages/newrelic/common/async_proxy.py", line 148, in __next__
    return self.send(None)
  File "/usr/local/lib/python3.9/site-packages/newrelic/common/async_proxy.py", line 120, in send
    return self.__wrapped__.send(value)
  File "/usr/local/lib/python3.9/site-packages/newrelic/common/async_proxy.py", line 110, in __exit__
    trace_cache().record_event_loop_wait(self.enter_time, time.time())
  File "/usr/local/lib/python3.9/site-packages/newrelic/core/trace_cache.py", line 362, in record_event_loop_wait
    for trace in self._cache.values():
  File "/usr/local/lib/python3.9/weakref.py", line 248, in values
    for wr in self.data.values():
RuntimeError: dictionary changed size during iteration

Steps to Reproduce
This is not easy to reproduce, but I can try to put together a sample application if needed. Reading the code, I did not see anything that guards against this scenario; the fix could be as simple as iterating over a copy of the dict.
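To illustrate the failure mode and the suggested fix, here is a minimal sketch. It uses a plain `dict` as a stand-in for the agent's cache (the traceback shows the real one is weakref-backed), and the function names are hypothetical, not taken from the agent's code. It only shows why iterating a live view fails when the dict grows mid-loop and why iterating a snapshot does not; whether a snapshot alone is enough for the agent under real thread concurrency is not established here.

```python
def mutate_while_iterating(cache):
    """Insert into the dict while iterating its live values() view.

    The dict changes size mid-loop, so the next iteration step raises
    RuntimeError: dictionary changed size during iteration.
    """
    for value in cache.values():
        cache[f"new-{value}"] = value


def mutate_over_snapshot(cache):
    """Iterate over a snapshot list instead of the live view.

    list(cache.values()) copies the values up front, so inserting
    into the dict inside the loop no longer disturbs the iteration.
    """
    for value in list(cache.values()):
        cache[f"new-{value}"] = value
    return len(cache)


if __name__ == "__main__":
    try:
        mutate_while_iterating({"a": 1, "b": 2})
    except RuntimeError as exc:
        print("live view failed:", exc)

    print("snapshot size:", mutate_over_snapshot({"a": 1, "b": 2}))
```

The trade-off is that the loop sees the cache as it was when the snapshot was taken, so entries added during the loop are not visited.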

Your Environment

fastapi==0.75.0
uvicorn==0.18.3
gunicorn==20.1.0
newrelic==8.1.0.180

Python FastAPI application running in a Docker container in Kubernetes.

Metadata

Labels: bug (incorrect or flawed agent behavior), needs-triage (requires initial review by maintainers)
