Closed
Labels
bug: Incorrect or flawed agent behavior.
needs-triage: Requires initial review by maintainers.
Description
We experienced an incident the other day where our servers were overloaded (100% CPU usage) and the New Relic agent was crashing, which made things much worse. A dictionary is being modified while it is being iterated over. Looking at the code, there does not seem to be any protection against this happening.
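For illustration, the same error can be triggered deterministically in a single thread by inserting into a `weakref.WeakValueDictionary` while looping over its `values()`. This is only a minimal sketch, not agent code: `Trace` and `iterate_and_insert` are hypothetical names, and in the real incident the mutation presumably came from concurrent activity rather than from the loop body itself.

```python
import weakref


class Trace:
    """Hypothetical stand-in for a cached trace object (not the agent's class)."""

    def __init__(self, name):
        self.name = name


def iterate_and_insert(cache, new_trace):
    """Insert into the cache while iterating it; return the error message, or None."""
    try:
        for _trace in cache.values():
            # Growing the underlying dict mid-iteration invalidates the iterator.
            cache[new_trace.name] = new_trace
    except RuntimeError as exc:
        return str(exc)
    return None


# Strong references keep the cached objects alive for the duration of the demo.
first, second = Trace("first"), Trace("second")
cache = weakref.WeakValueDictionary({"first": first})
message = iterate_and_insert(cache, second)
print(message)
```

On CPython the printed message should match the last line of the traceback under Troubleshooting.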
Expected Behavior
The agent does not crash and handles concurrency properly.
Troubleshooting
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/uvicorn/protocols/http/h11_impl.py", line 404, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/usr/local/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
    return await self.app(scope, receive, send)
  File "/usr/local/lib/python3.9/site-packages/newrelic/api/asgi_application.py", line 359, in nr_async_asgi
    return await coro
  File "/usr/local/lib/python3.9/site-packages/newrelic/common/async_proxy.py", line 148, in __next__
    return self.send(None)
  File "/usr/local/lib/python3.9/site-packages/newrelic/common/async_proxy.py", line 120, in send
    return self.__wrapped__.send(value)
  File "/usr/local/lib/python3.9/site-packages/newrelic/common/async_proxy.py", line 110, in __exit__
    trace_cache().record_event_loop_wait(self.enter_time, time.time())
  File "/usr/local/lib/python3.9/site-packages/newrelic/core/trace_cache.py", line 362, in record_event_loop_wait
    for trace in self._cache.values():
  File "/usr/local/lib/python3.9/weakref.py", line 248, in values
    for wr in self.data.values():
RuntimeError: dictionary changed size during iteration
Steps to Reproduce
This is not easy to reproduce, but I can try to put together a sample application if needed. Reading the code, I could not see anything that guards against this scenario. The fix could be as simple as iterating over a copy of the dict.
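A minimal sketch of the "iterate over a copy" idea, assuming the cache is a `weakref.WeakValueDictionary` as the traceback suggests. `Trace` and `live_traces` are hypothetical names and not part of the agent. The sketch snapshots the weak references with `valuerefs()` and dereferences each one, skipping entries that were collected in the meantime. Whether this is sufficient under real thread contention is untested here.

```python
import weakref


class Trace:
    """Hypothetical stand-in for a cached trace object (not the agent's class)."""

    def __init__(self, name):
        self.name = name


def live_traces(cache):
    """Yield the live values of a WeakValueDictionary from a snapshot of its refs."""
    # Snapshot first, so later inserts or removals cannot invalidate this loop.
    for ref in list(cache.valuerefs()):
        trace = ref()
        if trace is not None:  # the referent may have been garbage collected
            yield trace


# Strong references keep the cached objects alive for the duration of the demo.
a, b, c = Trace("a"), Trace("b"), Trace("c")
cache = weakref.WeakValueDictionary({"a": a, "b": b})

seen = []
for trace in live_traces(cache):
    cache["c"] = c  # mutating the cache mid-loop no longer raises
    seen.append(trace.name)
```

The loop only sees entries that existed when the snapshot was taken, so anything inserted during iteration (here `"c"`) is picked up on the next pass rather than the current one.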
Your Environment
fastapi==0.75.0
uvicorn==0.18.3
gunicorn==20.1.0
newrelic==8.1.0.180
Python FastAPI application running in a docker container in kubernetes.