OSError: inotify instance limit reached #24

Open
ericmjl opened this issue Apr 30, 2020 · 29 comments

ericmjl commented Apr 30, 2020

When using cachier with some long-running functions, I get the following error:

OSError: inotify instance limit reached

A fuller stack trace is below. I have tried my best to obscure proprietary information, so please forgive me if that breaks the syntax in places.

What would be the best next step for debugging this? (No pressure on your side, @shaypal5, I have no problems debugging on my own, just wanted to know your thoughts.)
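One possible first debugging step, sketched here under the assumption that this runs on Linux with `/proc` mounted, is to compare the per-user inotify instance limit with the number of inotify instances the current process holds. The helper names `read_inotify_instance_limit` and `count_inotify_instances` are hypothetical and are not part of cachier or watchdog. The limit applies per user while the count is per process, so this is only a rough indicator.

```python
import os
from pathlib import Path
from typing import Optional

# Standard Linux location of the per-user inotify instance limit.
INOTIFY_LIMIT_FILE = Path("/proc/sys/fs/inotify/max_user_instances")


def read_inotify_instance_limit(
    limit_file: Path = INOTIFY_LIMIT_FILE,
) -> Optional[int]:
    """Return the per-user inotify instance limit, or None if unavailable."""
    try:
        return int(limit_file.read_text().strip())
    except (OSError, ValueError):
        # Not on Linux, /proc not mounted, or unexpected file content.
        return None


def count_inotify_instances(
    fd_dir: Path = Path("/proc/self/fd"),
) -> Optional[int]:
    """Count this process's file descriptors that are inotify instances."""
    try:
        fds = list(fd_dir.iterdir())
    except OSError:
        return None
    count = 0
    for fd in fds:
        try:
            # On Linux, an inotify descriptor resolves to this link target.
            if os.readlink(fd) == "anon_inode:inotify":
                count += 1
        except OSError:
            # The descriptor was closed meanwhile, or the entry is not a link.
            continue
    return count


if __name__ == "__main__":
    print("per-user limit:", read_inotify_instance_limit())
    print("held by this process:", count_inotify_instances())
```

Calling `count_inotify_instances()` before and after a call to the cached function would show whether instances accumulate across the repeated `wait_on_entry_calc` frames in the trace.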

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
 in 
----> 1 s = my_long_func("arg")

~/path/to/my/env/lib/python3.7/site-packages/cachier/core.py in func_wrapper(*args, **kwds)
169 return _calc_entry(core, key, func, args, kwds)
170 _print('No entry found. No current calc. Calling like a boss.')
--> 171 return _calc_entry(core, key, func, args, kwds)
172
173 def clear_cache():

~/path/to/my/env/lib/python3.7/site-packages/cachier/core.py in _calc_entry(core, key, func, args, kwds)
65 core.mark_entry_being_calculated(key)
66 # _get_executor().submit(core.mark_entry_being_calculated, key)
---> 67 func_res = func(*args, **kwds)
68 core.set_entry(key, func_res)
69 # _get_executor().submit(core.set_entry, key, func_res)

~/path/to/my/src.py in my_long_func.py(kwarg, another_arg)
47 @cachier(stale_after=timedelta(weeks=1))
48 def my_long_func.py(kwarg: str, another_arg: bool = False):
---> 49 res_df = wrapped_func(kwarg=kwarg, another_arg="string")
50 if another_arg:
51 return res_df.query("valid_data == True")

~/path/to/my/env/lib/python3.7/site-packages/cachier/core.py in func_wrapper(*args, **kwds)
165 _print('No value but being calculated. Waiting.')
166 try:
--> 167 return core.wait_on_entry_calc(key)
168 except RecalculationNeeded:
169 return _calc_entry(core, key, func, args, kwds)

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
191 if observer.isAlive():
192 # print('Timedout waiting. Starting again...')
--> 193 return self.wait_on_entry_calc(key)
194 # print("Returned value: {}".format(event_handler.value))
195 return event_handler.value

~/path/to/my/env/lib/python3.7/site-packages/cachier/pickle_core.py in wait_on_entry_calc(self, key)
187 recursive=True
188 )
--> 189 observer.start()
190 observer.join(timeout=1.0)
191 if observer.isAlive():

~/path/to/my/env/lib/python3.7/site-packages/watchdog/observers/api.py in start(self)
253 def start(self):
254 for emitter in self._emitters:
--> 255 emitter.start()
256 super(BaseObserver, self).start()
257

~/path/to/my/env/lib/python3.7/site-packages/watchdog/utils/__init__.py in start(self)
108
109 def start(self):
--> 110 self.on_thread_start()
111 threading.Thread.start(self)
112

~/path/to/my/env/lib/python3.7/site-packages/watchdog/observers/inotify.py in on_thread_start(self)
119 def on_thread_start(self):
120 path = unicode_paths.encode(self.watch.path)
--> 121 self._inotify = InotifyBuffer(path, self.watch.is_recursive)
122
123 def on_thread_stop(self):

~/path/to/my/env/lib/python3.7/site-packages/watchdog/observers/inotify_buffer.py in __init__(self, path, recursive)
33 BaseThread.__init__(self)
34 self._queue = DelayedQueue(self.delay)
---> 35 self._inotify = Inotify(path, recursive)
36 self.start()
37

~/path/to/my/env/lib/python3.7/site-packages/watchdog/observers/inotify_c.py in __init__(self, path, recursive, event_mask)
186 inotify_fd = inotify_init()
187 if inotify_fd == -1:
--> 188 Inotify._raise_error()
189 self._inotify_fd = inotify_fd
190 self._lock = threading.Lock()

~/path/to/my/env/lib/python3.7/site-packages/watchdog/observers/inotify_c.py in _raise_error()
413 raise OSError("inotify watch limit reached")
414 elif err == errno.EMFILE:
--> 415 raise OSError("inotify instance limit reached")
416 else:
417 raise OSError(os.strerror(err))

</pre>
</details>
@ericmjl
Author

ericmjl commented Apr 30, 2020

I went digging into the codebase of cachier, and saw the following that might be responsible for the recursive inotify-ing:

        observer.join(timeout=1.0)
        if observer.is_alive():
            # print('Timedout waiting. Starting again...')
            return self.wait_on_entry_calc(key)

What if it were instead changed to the following?

        while observer.is_alive():
            observer.join(timeout=1.0)

Would this bork how Watchdog is supposed to work? Would it help with the nested recursive calls to self.wait_on_entry_calc?
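For what it's worth, the control-flow difference between the two snippets above can be sketched with a plain `threading.Thread` standing in for the watchdog observer. This is only an illustration under that assumption: `wait_without_recursion` and the sleeping worker are hypothetical names, not cachier or watchdog code, and the sketch says nothing about whether a real observer ever stops on its own.

```python
import threading
import time


def wait_without_recursion(worker: threading.Thread, poll_seconds: float = 0.05) -> int:
    """Re-join the same thread until it exits; return the number of join attempts."""
    attempts = 0
    while worker.is_alive():
        # A timed-out join simply loops again: no new thread, no deeper stack.
        worker.join(timeout=poll_seconds)
        attempts += 1
    return attempts


# Hypothetical stand-in for the observer thread: it finishes after about 0.2 s.
worker = threading.Thread(target=time.sleep, args=(0.2,))
worker.start()
attempts = wait_without_recursion(worker)
print(f"worker finished after {attempts} join attempt(s)")
```

The point of the loop is that a timeout re-joins the one existing thread, whereas the recursive version re-enters the function and builds a fresh observer on every timeout.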

@tsp-kucbd

@ericmjl Did you find the solution to this? We encounter the same inotify problem when implementing cachier in our prediction server.

@ericmjl
Author

ericmjl commented Jun 28, 2020

@tsp-kucbd I tried using a while loop, as you can see in the PR #25 referenced above. Please try it out and see if it works for you -- if so, we should ping back to @shaypal5 to see how we can fix the macOS errors that are causing the PR to fail. (I have to admit, I'm not very well-versed with the codebase, and it took me a long time to narrow down the error.)

@shaypal5
Collaborator

Oh, I forgot about your PR to fix this, @ericmjl. Feel free to ping me again to get me to take a look, but I suggest you two try to take a stab at this.

@ericmjl
Author

ericmjl commented Jun 30, 2020

No worries, @shaypal5 😄 I assumed you were busy and needed to finish up work!

@tsp-kucbd

> @tsp-kucbd I tried using a while loop, as you can see in the PR #25 referenced above. Please try it out and see if it works for you -- if so, we should ping back to @shaypal5 to see how we can fix the macOS errors that are causing the PR to fail. (I have to admit, I'm not very well-versed with the codebase, and it took me a long time to narrow down the error.)

I tried, but unfortunately it did not work. Also, this is not under MacOS but on a Linux server ...

@ericmjl
Author

ericmjl commented Jul 1, 2020

> I tried, but unfortunately it did not work. Also, this is not under MacOS but on a Linux server ...

Without a stack trace, it's going to be tough to see what's happening. I am not sure whether you made the change in the source code, and whether the changes were reflected in your environment, or whether something else was happening.

@shaypal5
Collaborator

shaypal5 commented Sep 7, 2020

See my comment on the related PR, which fails an important test.
This is now in the hands of you guys, or some other interested party.

@shaypal5 shaypal5 added the bug label Sep 7, 2020
@paul-matthews

paul-matthews commented Feb 19, 2021

I'm also getting this error when calling the cached function through an anonymous function (a lambda) used as the worker function.

Environment:
Raspberry Pi Zero
Linux 4.19.97+ #1294 Thu Jan 30 13:10:54 GMT 2020 armv6l GNU/Linux
Raspbian GNU/Linux 10 (buster)
Python 3.7.3
cachier 1.5.0

Expected:
Cachier would call worker function

Actual:
Cachier never calls function and after a significant delay triggers below error

[Errno 24] inotify instance limit reached
Traceback (most recent call last):
  File "/path/to/source/run.py", line 274, in <module>
    main(stored_data, args.area_code, args.area_population, args.skip_cache)
  File "/path/to/source/run.py", line 73, in main
    data = get_covid_data(area_code, dates_to_overwrite)
  File "/path/to/source/run.py", line 182, in get_covid_data
    resp = util.exponential_backoff(lambda: get_location_data_for_date(area_code, d, overwrite_cache=overwrite, verbose_cache=True))
  File "/path/to/source/util.py", line 62, in exponential_backoff
    raise last_exception
  File "/path/to/source/util.py", line 55, in exponential_backoff
    return fn()
  File "/path/to/source/run.py", line 182, in <lambda>
    resp = util.exponential_backoff(lambda: get_location_data_for_date(area_code, d, overwrite_cache=overwrite, verbose_cache=True))
  File "/home/someuser/.local/lib/python3.7/site-packages/cachier/core.py", line 228, in func_wrapper
    return core.wait_on_entry_calc(key)
  File "/home/someuser/.local/lib/python3.7/site-packages/cachier/pickle_core.py", line 204, in wait_on_entry_calc
    observer.start()
  File "/home/someuser/.local/lib/python3.7/site-packages/watchdog/observers/api.py", line 256, in start
    emitter.start()
  File "/home/someuser/.local/lib/python3.7/site-packages/watchdog/utils/__init__.py", line 93, in start
    self.on_thread_start()
  File "/home/someuser/.local/lib/python3.7/site-packages/watchdog/observers/inotify.py", line 118, in on_thread_start
    self._inotify = InotifyBuffer(path, self.watch.is_recursive)
  File "/home/someuser/.local/lib/python3.7/site-packages/watchdog/observers/inotify_buffer.py", line 35, in __init__
    self._inotify = Inotify(path, recursive)
  File "/home/someuser/.local/lib/python3.7/site-packages/watchdog/observers/inotify_c.py", line 187, in __init__
    Inotify._raise_error()
  File "/home/someuser/.local/lib/python3.7/site-packages/watchdog/observers/inotify_c.py", line 431, in _raise_error
    raise OSError(errno.EMFILE, "inotify instance limit reached")
OSError: [Errno 24] inotify instance limit reached

@wjaskowski

I am also experiencing this error.

@shaypal5
Collaborator

Please see my comments on PR #25 if you want to help get this fixed.

@endlisnis

I also am experiencing this error.

@Wendroff

I'm also getting this error when calling from a subprocess. I used the forkserver start method on a Linux server. The traceback is shown below.

Traceback (most recent call last):
  File "/home/wangzhefu/qlib_ext/qlib_ext/src/self_workflow/post_process.py", line 33, in cal_index_score
    index_con_df = get_index_components_weight(index_wind_code=index_windcode, date=date)
  File "/home/wangzhefu/.conda/envs/qlib_only/lib/python3.8/site-packages/cachier/core.py", line 228, in func_wrapper
    return core.wait_on_entry_calc(key)
  File "/home/wangzhefu/.conda/envs/qlib_only/lib/python3.8/site-packages/cachier/pickle_core.py", line 208, in wait_on_entry_calc
    return self.wait_on_entry_calc(key)
  File "/home/wangzhefu/.conda/envs/qlib_only/lib/python3.8/site-packages/cachier/pickle_core.py", line 208, in wait_on_entry_calc
    return self.wait_on_entry_calc(key)
  File "/home/wangzhefu/.conda/envs/qlib_only/lib/python3.8/site-packages/cachier/pickle_core.py", line 208, in wait_on_entry_calc
    return self.wait_on_entry_calc(key)
  [Previous line repeated 15 more times]
  File "/home/wangzhefu/.conda/envs/qlib_only/lib/python3.8/site-packages/cachier/pickle_core.py", line 204, in wait_on_entry_calc
    observer.start()
  File "/home/wangzhefu/.conda/envs/qlib_only/lib/python3.8/site-packages/watchdog/observers/api.py", line 256, in start
    emitter.start()
  File "/home/wangzhefu/.conda/envs/qlib_only/lib/python3.8/site-packages/watchdog/utils/__init__.py", line 93, in start
    self.on_thread_start()
  File "/home/wangzhefu/.conda/envs/qlib_only/lib/python3.8/site-packages/watchdog/observers/inotify.py", line 118, in on_thread_start
    self._inotify = InotifyBuffer(path, self.watch.is_recursive)
  File "/home/wangzhefu/.conda/envs/qlib_only/lib/python3.8/site-packages/watchdog/observers/inotify_buffer.py", line 35, in __init__
    self._inotify = Inotify(path, recursive)
  File "/home/wangzhefu/.conda/envs/qlib_only/lib/python3.8/site-packages/watchdog/observers/inotify_c.py", line 187, in __init__
    Inotify._raise_error()
  File "/home/wangzhefu/.conda/envs/qlib_only/lib/python3.8/site-packages/watchdog/observers/inotify_c.py", line 431, in _raise_error
    raise OSError(errno.EMFILE, "inotify instance limit reached")
OSError: [Errno 24] inotify instance limit reached

@endlisnis

I also experience this problem. I would have imagined this is the exact scenario that cachier is designed for (long run-to-complete tasks). I'm amazed this has stayed open for more than a year. Is this project active?

@shaypal5
Collaborator

shaypal5 commented Oct 6, 2021

@endlisnis The project is active, yes, but I myself (the author of the package) am not actively using it anymore, so I rarely sit down to work on new features or to debug scenarios I have no idea how to reproduce.

I put in enough time to answer issues, walk people through making PRs, etc. But yeah, this is an extremely small open source project that is in the stage where it relies on the community for improvements and bug fixes. ¯\_(ツ)_/¯

Again, I'd love to help anyone who wants to get into researching this and making a PR, and I would advise anyone attempting this to start off where PR #25 has left off.

@shaypal5
Collaborator

shaypal5 commented Oct 6, 2021

The fix on PR #25 was released in v1.5.2.
I'm closing. Please re-open if this is not amended.

@shaypal5 shaypal5 closed this as completed Oct 6, 2021
@nhairs-lumin

nhairs-lumin commented Jan 16, 2024

@shaypal5 - I suspect that the underlying issue has not been fixed as I've just hit this issue.

Edit: part of my hypothesis was wrong (the error has nothing to do with the number of files in cache_dir)

Details

Situation

A long-running consumer repeatedly performing the same task (downloading an image from a URL), run across about 6000 tasks.

Using Cachier with Pickle core.

Versions

Python 3.11.7
Linux 6.2.0-39-generic #40~22.04.1-Ubuntu

cachier==2.2.2
watchdog==3.0.0

Traceback

 Traceback (most recent call last):
 File "/home/devuser/.local/lib/python3.11/site-packages/dramatiq/worker.py", line 485, in process_message
 res = actor(*message.args, **message.kwargs)
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/devuser/.local/lib/python3.11/site-packages/dramatiq/actor.py", line 177, in __call__
 return self.fn(*args, **kwargs)
 ^^^^^^^^^^^^^^^^^^^^^^^^
 File "/code/src/some_service/consumers/bars.py", line 18, in import_foo_bar
 imported_bar = importer.import_bar(foo_id, allow_update)
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/code/src/some_service/bars/foo_importer.py", line 145, in import_bar
 self.import_image(
 File "/code/src/some_service/bars/foo_importer.py", line 213, in import_image
 data = self.foo.get_image_data(foo_data_id, foo_size, image_extention)
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/devuser/.local/lib/python3.11/site-packages/cachier/core.py", line 308, in func_wrapper
 return core.wait_on_entry_calc(key)
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/devuser/.local/lib/python3.11/site-packages/cachier/pickle_core.py", line 267, in wait_on_entry_calc
 observer.start()
 File "/home/devuser/.local/lib/python3.11/site-packages/watchdog/observers/api.py", line 261, in start
 emitter.start()
 File "/home/devuser/.local/lib/python3.11/site-packages/watchdog/utils/__init__.py", line 92, in start
 self.on_thread_start()
 File "/home/devuser/.local/lib/python3.11/site-packages/watchdog/observers/inotify.py", line 119, in on_thread_start
 self._inotify = InotifyBuffer(path, self.watch.is_recursive)
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/devuser/.local/lib/python3.11/site-packages/watchdog/observers/inotify_buffer.py", line 37, in __init__
 self._inotify = Inotify(path, recursive)
 ^^^^^^^^^^^^^^^^^^^^^^^^
 File "/home/devuser/.local/lib/python3.11/site-packages/watchdog/observers/inotify_c.py", line 167, in __init__
 Inotify._raise_error()
 File "/home/devuser/.local/lib/python3.11/site-packages/watchdog/observers/inotify_c.py", line 430, in _raise_error
 raise OSError(errno.EMFILE, "inotify instance limit reached")
 OSError: [Errno 24] inotify instance limit reached

Misc

Creating cached functions using the following:

self.get_image_data = cachier.cachier(
  cache_dir=".tmp/cachier",
  seperate_files=True,
  pickle_reload=False,
)(parent.get_image_data)

Cache directory

.tmp/cachier % ls -a | wc -l
8309

.tmp/cachier % ls -a | grep get_image_data | wc -l
6099

OLD INCORRECT Analysis

Here's what I think is happening and how it might be resolved.

First - there is a limit to how many files can be watched using inotify (which is what watchdog uses through its default Observer on Linux). According to this page the default max_user_watches limit on most Linux systems is 8192 - this is per user, so it includes watches held by anything else that user is running.

A workaround would be to increase this limit on the system being used, but this would just "postpone" the error.

Second - I believe that there are 2 scenarios that will trigger this limit being hit in general use.

a) cachier is being used across a large number of different functions
b) a function is using seperate_files=True

Third - the error is only thrown if a call to a cached function results in a 'being_calculated' result - waiting for a result is the only time the observer is created. This observer watches the entire cache directory - so it doesn't really matter which scenario fills the cache dir as long as the number of files is larger than the limit (in fact it might be possible that even filling the cache dir with non-cachier related files will trigger the error).

For the fix I don't think anything like "adding subdirectories per function" will help because of case b) above. It's unclear if watchdog Observers are designed to watch a single file (the current implementation is to watch the entire cache_dir and filter on the expected file) - but this might be a solution. Some other workarounds would be to change which library we use for watching, or create a directory per key (kinda gross tbh), or simply create our own polling watcher (with a small sleep interval to avoid smashing the CPU - e.g. time.sleep(0.05) # 50ms).
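To make the "own polling watcher" idea concrete, here is a minimal sketch using only the standard library. It is an assumption-laden illustration, not cachier's implementation: `poll_until`, the 50 ms interval, and the temporary result file are all hypothetical, and a real version would check the cache entry rather than mere file existence.

```python
import os
import tempfile
import threading
import time
from typing import Callable, Optional


def poll_until(
    predicate: Callable[[], bool],
    interval: float = 0.05,
    timeout: Optional[float] = None,
) -> bool:
    """Call `predicate` every `interval` seconds; True once it holds, False on timeout."""
    deadline = None if timeout is None else time.monotonic() + timeout
    while not predicate():
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(interval)  # small sleep so the loop does not spin the CPU
    return True


# Hypothetical usage: wait for another thread to create a result file.
with tempfile.TemporaryDirectory() as tmp_dir:
    result_path = os.path.join(tmp_dir, "result.bin")

    def slow_writer() -> None:
        time.sleep(0.2)
        with open(result_path, "wb") as fh:
            fh.write(b"done")

    writer = threading.Thread(target=slow_writer)
    writer.start()
    appeared = poll_until(lambda: os.path.exists(result_path), timeout=5.0)
    writer.join()
    # A predicate that never holds should give up once the timeout passes.
    timed_out = poll_until(lambda: False, interval=0.01, timeout=0.05)
```

A loop like this opens no inotify instance at all, at the cost of up to one interval of extra latency per wait.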

I'll see if I can find a way to reproduce the problem.

@nhairs-lumin

nhairs-lumin commented Jan 16, 2024

Update:

I cannot reproduce the problem with the below scripts.

Why? It turns out I had a lot of hanging consumer threads that were probably taking me over max_user_instances (cat /proc/sys/fs/inotify/max_user_instances), which is probably the limit being reported here (rather than max_user_watches, as I incorrectly assumed above).

(I identified these hanging threads using the bash find snippet in https://unix.stackexchange.com/a/15549).

Once I shut down those consumer threads I could run the tests below with 70000 junk files (which is well over my system's max_user_watches of 65536).
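For anyone wanting to check both limits from Python rather than with cat, a small sketch follows. It only reads the two /proc files named in this thread; the helper name `read_inotify_limits` is made up for illustration, and on systems without those files (non-Linux, restricted containers) it reports None instead of failing.

```python
from pathlib import Path
from typing import Dict, Optional

INOTIFY_PROC_DIR = Path("/proc/sys/fs/inotify")


def read_inotify_limits(proc_dir: Path = INOTIFY_PROC_DIR) -> Dict[str, Optional[int]]:
    """Return the per-user inotify limits, or None for any value that cannot be read."""
    limits: Dict[str, Optional[int]] = {}
    for name in ("max_user_instances", "max_user_watches"):
        try:
            limits[name] = int((proc_dir / name).read_text().strip())
        except (OSError, ValueError):
            # File missing or unreadable, e.g. when not running on Linux.
            limits[name] = None
    return limits


limits = read_inotify_limits()
for name, value in limits.items():
    print(f"{name} = {value if value is not None else 'unavailable'}")
```

The distinction matters here because the error message says "instance limit", which corresponds to max_user_instances, not to the number of watched files.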

Files

main.py

### IMPORTS
### ====================================
import os
import sys
import secrets
import shutil
import time

from cachier import cachier

### CONSTANTS
### ====================================
CACHE_DIR = f"cachier-test-{secrets.token_hex(8)}"
TIME = 20

### FUNCTIONS
### ====================================
def fill_directory(count: int = 9000) -> None:
    """Fill the cache dir with `count` junk files"""
    for i in range(count):
        filename = f"{CACHE_DIR}/{i}"
        with open(filename, "wb") as junk_file:
            junk_file.write(secrets.token_bytes(8))
    return


@cachier(cache_dir=CACHE_DIR)
def cached_func(key) -> str:
    time.sleep(TIME)
    return secrets.token_hex(8)

def run_test(junk_count: int) -> None:
    print("[ SETUP ]")
    print(f"Using cache_dir: {CACHE_DIR}")
    print(f"Creating {junk_count} junk files...")
    fill_directory(junk_count)

    print(f"[ GET READY ]")
    print(
        "The test is about to start, in another window you will need to run the "
        f"following command. You will have {TIME} seconds to run the command before "
        "the test ends."
    )
    print(f"python3 test.py {CACHE_DIR}")

    print("[ TEST RUNNING ]")
    cached_func("asdf")

    print("[ TEST ENDING ]")
    return

### MAIN
### ====================================
if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python3 main.py <junk_count>")
        sys.exit(1)
    try:
        os.mkdir(CACHE_DIR)
        run_test(int(sys.argv[1]))
    finally:
        if os.path.isdir(CACHE_DIR):
            shutil.rmtree(CACHE_DIR)

test.py

### IMPORTS
### ====================================
import sys
from cachier import cachier

### MAIN
### ====================================
if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Usage: python3 test.py <cache_dir>")
        sys.exit(1)

    @cachier(cache_dir=sys.argv[1])
    def cached_func(key) -> str:
        return "test"

    cached_func("asdf")

@shaypal5
Collaborator

Wow, amazing work tracking this down, explaining it thoroughly and reproducing it!

All three mitigations, however, seem non-trivial to me (some technically, and some because they bet on a change in user experience, like the directory-per-key suggestion, as I suspect some people rely on the current single-directory behavior).

Still, I'd appreciate you working on a suggestion for the most sensible solution, and hopefully we'll get other key contributors to chime in on this, especially @lordjabez.

@shaypal5 shaypal5 reopened this Jan 16, 2024
@shaypal5
Collaborator

shaypal5 commented Jan 16, 2024

It will also be great to hear from the iNotify bugfixers, @ericmjl and @ofirnk, and from @erap129 who wrote the seperate_files feature.

@nhairs-lumin

Ahh @shaypal5 - looks like you got here just as I realised I could not reproduce the problem (see edited comments 😅).

That said I think I am closer - let me spend some more time reproducing.

@shaypal5
Collaborator

That's ok. Take your time. :)
Let us know when you have an update for us.

@nhairs-lumin

nhairs-lumin commented Jan 16, 2024

Okay this time I can actually reproduce it.

Uses same main.py as above.

test2.py

### IMPORTS
### ====================================
import sys
import threading

from cachier import cachier

### GLOBALS
### ====================================
COUNT_LOCK = threading.Lock()
COUNT_SUCCESS = 0
COUNT_ERROR = 0

### MAIN
### ====================================
if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: python3 test2.py <cache_dir> <threads>")
        sys.exit(1)

    @cachier(cache_dir=sys.argv[1])
    def cached_func(key) -> str:
        return "test"

    def thread_func(thread_num: int):
        global COUNT_LOCK
        global COUNT_SUCCESS
        global COUNT_ERROR
        print(f"T{thread_num}: Starting ...")
        try:
            cached_func("asdf")
            with COUNT_LOCK:
                COUNT_SUCCESS += 1
        except Exception as e:
            print(f"T{thread_num}: Error - {e!r}")
            with COUNT_LOCK:
                COUNT_ERROR += 1
        return

    print("[STARTING THREADS]")
    threads = []
    for i in range(int(sys.argv[2])):
        thread = threading.Thread(target=thread_func, args=(i,), name=f"T{i}")
        threads.append(thread)
        thread.start()

    print("[JOINING THREADS]")
    for thread in threads:
        try:
            thread.join()
        except Exception as e:
            print(f"Error joining thread {thread.name} - {e!r}")

    print("[STATS]")
    print(f"success = {COUNT_SUCCESS}")
    print(f"error = {COUNT_ERROR}")
    print(f"sanity: counts = {COUNT_SUCCESS + COUNT_ERROR}, expected {sys.argv[2]}")

Running

Ran tests using python3 main.py 1

Low count

% python3 test2.py cachier-test-5d8afbc0cbac4a32 20
[STARTING THREADS]
T0: Starting ...
T1: Starting ...
T2: Starting ...
T3: Starting ...
T4: Starting ...
T5: Starting ...
T6: Starting ...
T7: Starting ...
T8: Starting ...
T9: Starting ...
T10: Starting ...
T11: Starting ...
T12: Starting ...
T13: Starting ...
T14: Starting ...
T15: Starting ...
T16: Starting ...
T17: Starting ...
T18: Starting ...
T19: Starting ...
[JOINING THREADS]
[STATS]
success = 20
error = 0
sanity: counts = 20, expected 20

High count

% python3 test2.py cachier-test-c4366eabb7e4974a 100 
[STARTING THREADS]
T0: Starting ...
T1: Starting ...
T2: Starting ...
T3: Starting ...
T4: Starting ...
T5: Starting ...
T6: Starting ...
T7: Starting ...
T8: Starting ...
T9: Starting ...
T10: Starting ...
T11: Starting ...
T12: Starting ...
T13: Starting ...
T14: Starting ...
T15: Starting ...
T16: Starting ...
T17: Starting ...
T18: Starting ...
T19: Starting ...
T20: Starting ...
T21: Starting ...
T22: Starting ...
T23: Starting ...
T24: Starting ...
T25: Starting ...
T26: Starting ...
T27: Starting ...
T28: Starting ...
T29: Starting ...
T30: Starting ...
T31: Starting ...
T32: Starting ...
T33: Starting ...
T34: Starting ...
T35: Starting ...
T36: Starting ...
T37: Starting ...
T38: Starting ...
T39: Starting ...
T40: Starting ...
T41: Starting ...
T43: Starting ...
T44: Starting ...
T46: Starting ...
T45: Starting ...
T48: Starting ...
T47: Starting ...
T42: Starting ...
T49: Starting ...
T50: Starting ...
T51: Starting ...
T52: Starting ...
T53: Starting ...
T54: Starting ...
T55: Starting ...
T56: Starting ...
T57: Starting ...
T58: Starting ...
T59: Starting ...
T61: Starting ...
T60: Starting ...
T63: Starting ...
T64: Starting ...
T66: Starting ...
T62: Starting ...
T65: Starting ...
T67: Starting ...
T71: Starting ...
T68: Starting ...
T73: Starting ...
T70: Starting ...
T72: Starting ...
T76: Starting ...
T74: Starting ...
T75: Starting ...
T80: Starting ...
T69: Starting ...
T79: Starting ...
T81: Starting ...
T77: Starting ...
T78: Starting ...
T82: Starting ...
T83: Starting ...
T84: Starting ...
T85: Starting ...
T86: Starting ...
T87: Starting ...
T88: Starting ...
T89: Starting ...
T90: Starting ...
T91: Starting ...
T92: Starting ...
T93: Starting ...
T94: Starting ...
T95: Starting ...
T97: Starting ...
T96: Starting ...
T98: Starting ...
T99: Starting ...
[JOINING THREADS]
T69: Error - OSError(24, 'inotify instance limit reached')
T81: Error - OSError(24, 'inotify instance limit reached')
T78: Error - OSError(24, 'inotify instance limit reached')
T84: Error - OSError(24, 'inotify instance limit reached')
T37: Error - OSError(24, 'inotify instance limit reached')
T92: Error - OSError(24, 'inotify instance limit reached')
T97: Error - OSError(24, 'inotify instance limit reached')
T24: Error - OSError(24, 'inotify instance limit reached')
T56: Error - OSError(24, 'inotify instance limit reached')
T76: Error - OSError(24, 'inotify instance limit reached')
T98: Error - OSError(24, 'inotify instance limit reached')
T59: Error - OSError(24, 'inotify instance limit reached')
[STATS]
success = 88
error = 12
sanity: counts = 100, expected 100

While this test was running, I checked how many inotify instances were in use (see the comment above for the Stack Overflow link this snippet comes from).

% find /proc/*/fd/* -type l -lname 'anon_inode:inotify' -exec sh -c 'cat $(dirname {})/../cmdline; echo ""' \; 2>/dev/null | wc -l
128

Running it again after it was done:

% find /proc/*/fd/* -type l -lname 'anon_inode:inotify' -exec sh -c 'cat $(dirname {})/../cmdline; echo ""' \; 2>/dev/null | wc -l
40

This matches what we'd expect: with 40 instances already in use, only 88 more observers fit under the limit (88 + 40 == 128).
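
For anyone who would rather do the count from Python, here is a rough equivalent of the find one-liner above. It is only a sketch: `count_inotify_instances` is a name I made up, it only sees processes whose /proc/PID/fd you have permission to read, and it counts file descriptors across all visible processes rather than strictly per user, so treat the number as an approximation.

```python
import os
from pathlib import Path


def count_inotify_instances(proc_root="/proc"):
    """Count file descriptors under proc_root that point at inotify instances.

    Hypothetical helper mirroring the find one-liner: every symlink in
    <proc_root>/<pid>/fd whose target is 'anon_inode:inotify' counts as one.
    """
    count = 0
    for fd_dir in Path(proc_root).glob("[0-9]*/fd"):
        try:
            entries = list(fd_dir.iterdir())
        except OSError:
            # Process belongs to another user, or exited during the scan.
            continue
        for entry in entries:
            try:
                target = os.readlink(entry)
            except OSError:
                # Descriptor was closed between listing and reading it.
                continue
            if target == "anon_inode:inotify":
                count += 1
    return count
```

Calling `count_inotify_instances()` before and during a test run should show the same kind of jump as the two find outputs above.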

Running it again with exactly 88 observers (note that 88 is specific to my system as it is running right now, with my max_user_instances):

% python3 test2.py cachier-test-54654296ffdc26c0 88
[STARTING THREADS]
T0: Starting ...
T1: Starting ...
T2: Starting ...
T3: Starting ...
T4: Starting ...
T5: Starting ...
T6: Starting ...
T7: Starting ...
T8: Starting ...
T9: Starting ...
T10: Starting ...
T11: Starting ...
T12: Starting ...
T13: Starting ...
T14: Starting ...
T15: Starting ...
T16: Starting ...
T17: Starting ...
T18: Starting ...
T19: Starting ...
T20: Starting ...
T21: Starting ...
T22: Starting ...
T23: Starting ...
T24: Starting ...
T25: Starting ...
T26: Starting ...
T27: Starting ...
T28: Starting ...
T29: Starting ...
T30: Starting ...
T31: Starting ...
T32: Starting ...
T33: Starting ...
T34: Starting ...
T35: Starting ...
T37: Starting ...
T38: Starting ...
T36: Starting ...
T40: Starting ...
T42: Starting ...
T43: Starting ...
T44: Starting ...
T46: Starting ...
T41: Starting ...
T48: Starting ...
T45: Starting ...
T50: Starting ...
T39: Starting ...
T47: Starting ...
T51: Starting ...
T49: Starting ...
T55: Starting ...
T56: Starting ...
T54: Starting ...
T58: Starting ...
T52: Starting ...
T61: Starting ...
T62: Starting ...
T63: Starting ...
T64: Starting ...
T66: Starting ...
T67: Starting ...
T68: Starting ...
T59: Starting ...
T70: Starting ...
T60: Starting ...
T72: Starting ...
T73: Starting ...
T57: Starting ...
T69: Starting ...
T77: Starting ...
T65: Starting ...
T71: Starting ...
T53: Starting ...
T81: Starting ...
T83: Starting ...
T84: Starting ...
T76: Starting ...
T78: Starting ...
T87: Starting ...
[JOINING THREADS]
T82: Starting ...
T85: Starting ...
T79: Starting ...
T74: Starting ...
T86: Starting ...
T75: Starting ...
T80: Starting ...
[STATS]
success = 88
error = 0
sanity: counts = 88, expected 88

Changing limits

Let's try changing the limits:

sudo sysctl fs.inotify.max_user_instances=200

If we now run the test again, we should expect the number of errors to equal the number of already-consumed instances (currently 40):

% python3 test2.py cachier-test-75a71579e729aa52 200

[STARTING THREADS]
T0: Starting ...
# ...
T178: Starting ...
[JOINING THREADS]
T168: Error - OSError(24, 'inotify instance limit reached')
# ...
T160: Error - OSError(24, 'inotify instance limit reached')
[STATS]
success = 160
error = 40
sanity: counts = 200, expected 200
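
The arithmetic behind the runs above can be written down as a tiny function. This is just a sketch of the reasoning, assuming each waiting thread consumes exactly one inotify instance and that nothing else on the system grabs or frees instances during the run; `expected_outcome` is a made-up name.

```python
def expected_outcome(max_user_instances, already_in_use, requested):
    """Return (success, error) counts predicted for a test run.

    Assumes one inotify instance per waiting thread and a stable number
    of instances held by other processes.
    """
    available = max(max_user_instances - already_in_use, 0)
    success = min(requested, available)
    return success, requested - success


print(expected_outcome(128, 40, 100))  # (88, 12)
print(expected_outcome(128, 40, 88))   # (88, 0)
print(expected_outcome(200, 40, 200))  # (160, 40)
```

The three calls correspond to the 100-thread run, the 88-thread run, and the run after raising the limit to 200.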

Analysis

Assuming this has correctly identified the issue:

This error only affects Linux systems using inotify (this will be the majority of Linux users, as inotify has been part of the kernel since Linux 2.6.13, released in 2005).

The error is triggered when a number of Cachier Pickle cores are waiting on a "being_calculated" value. Each waiting core creates a new watchdog.Observer, and together they exhaust the inotify instances available to the user (cat /proc/sys/fs/inotify/max_user_instances). The exact number of waiting cores needed to trigger it depends on max_user_instances and on the number of inotify instances already consumed by other processes.

@nhairs-lumin

nhairs-lumin commented Jan 16, 2024

@shaypal5 - I'm going to leave my analysis here for the moment (see test2.py comment with edit). It would be good if we could get some others to run this script and check that the error is reproducible on Linux systems (and does not trigger on non-Linux systems).

Edit: I'm pretty confident in my analysis now - see "Changing limits" in the previous comment.

@shaypal5
Collaborator

Great work! Let's see if we can get people to chime in on this.

@nhairs-lumin

If I had checked the relevant watchdog code first, it would have been obvious that the instance limit (rather than the watch limit) was the culprit 🤦 - but it's good that we now have a couple of test cases that we can use to check for a few error cases.

It also looks like inotify observers can be used on a single path (as opposed to a directory). This wouldn't deal with our current error of "inotify instance limit reached", but it would help if users were hitting "inotify watch limit reached". It would probably also help with the efficiency of the program, especially in environments where many files are changing (e.g. using separate_files=True), since watching the whole directory generates many events (most of which are ignored).

Given max_user_instances is (generally) much lower than max_user_watches, it might be better to find a way to re-use a single observer per cache_dir (this would still reduce the number of watches currently consumed, but would not be as efficient as an observer per single file).
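
To make the "single observer per cache_dir" idea a bit more concrete, here is a minimal sketch of a reference-counted registry. Nothing here exists in cachier or watchdog today: `SharedObserverRegistry`, `acquire` and `release` are hypothetical names, and the observer class is injected so the sketch does not depend on watchdog itself. It only assumes the observer has `start()`, `stop()` and `join()` methods.

```python
import threading


class SharedObserverRegistry:
    """Hand out one observer per directory, shared by all waiters.

    Hypothetical helper, not part of cachier or watchdog. observer_factory
    must return an object with start(), stop() and join() methods.
    """

    def __init__(self, observer_factory):
        self._factory = observer_factory
        self._lock = threading.Lock()
        self._entries = {}  # cache_dir -> [observer, reference count]

    def acquire(self, cache_dir):
        """Return the observer for cache_dir, starting it on first use."""
        with self._lock:
            entry = self._entries.get(cache_dir)
            if entry is None:
                observer = self._factory()
                observer.start()
                entry = self._entries[cache_dir] = [observer, 0]
            entry[1] += 1
            return entry[0]

    def release(self, cache_dir):
        """Drop one reference; stop the observer once nobody uses it."""
        with self._lock:
            entry = self._entries[cache_dir]
            entry[1] -= 1
            if entry[1] > 0:
                return
            del self._entries[cache_dir]
            observer = entry[0]
        # Stop outside the lock so a slow shutdown does not block other waiters.
        observer.stop()
        observer.join()
```

With watchdog, the factory would presumably be `watchdog.observers.Observer`, and each waiter would register its own handler via `observer.schedule(handler, cache_dir)` after `acquire` and remove it with `observer.unschedule(watch)` before `release`. How many inotify instances that actually saves would need to be confirmed with the test script above.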


That said, it appears that the default max_user_instances=128 is very low for modern systems (which generally have 4GB+ of RAM), so it might also be worth adding documentation encouraging users to increase this limit on their system.

(Even if we do this I still think it would be a good idea to implement the more efficient use of observers per the above).
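
Alongside documentation, a small runtime check could point users at the limit before they hit the error. The following is only a sketch: the function names and the threshold of 512 are arbitrary choices of mine, not anything cachier does today.

```python
import warnings
from pathlib import Path

LIMIT_PATH = "/proc/sys/fs/inotify/max_user_instances"


def read_inotify_instance_limit(path=LIMIT_PATH):
    """Return the limit as an int, or None if it cannot be read (e.g. non-Linux)."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def warn_if_limit_low(threshold=512, path=LIMIT_PATH):
    """Emit a RuntimeWarning if the limit is below threshold; return True if warned.

    The default threshold is an arbitrary illustration value.
    """
    limit = read_inotify_instance_limit(path)
    if limit is not None and limit < threshold:
        warnings.warn(
            f"fs.inotify.max_user_instances is {limit}; consider raising it "
            f"with: sudo sysctl fs.inotify.max_user_instances={threshold}",
            RuntimeWarning,
        )
        return True
    return False
```

On systems without /proc/sys/fs/inotify (macOS, Windows) the check quietly does nothing, which matches the observation that the error is Linux-only.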



@shaypal5
Collaborator

Ok. Great analysis. Thank you yet again!

Is this something you would consider trying to solve with a suggested PR yourself?

@nhairs-lumin

No problem :)

Not at the moment - I'm pretty busy contributing to some other open-source projects 🥲. If that changes I'll let you know (probably via a PR).

@shaypal5
Collaborator

Ok. :)
