When model cache process is interrupted by app restart, model cache becomes unusable #30360
Comments
I think something is missing in the description. What happens with cached Models when the instance is killed? If the user tries to access them, they get an error? |
Thanks @luizarakaki just updated the description and finished the sentence. |
At first read, this is working correctly. The fact that the cache is in a "pending" state means it will not be substituted in. The original design of this is that interruptions like this just bust the cache until the next refresh happens. If the cache stays in the "pending" state and never gets updated that is a bug. If it gets fixed on the next scheduled refresh then I consider it working well. Do you know which is the case here? |
Based on

    (def ^:private refreshable-states
      "States of `persisted_info` records which can be refreshed."
      #{"creating" "persisted" "error"})

I guess that the cache being in a "pending" state means it is forever disabled. We'll have to fix this. The good thing is that we disallow concurrent execution:

    (jobs/defjob ^{org.quartz.DisallowConcurrentExecution true
                   :doc "Refresh persisted tables job"}
      PersistenceRefresh [job-context]
      (refresh-job-fn! job-context))

That means that the refresh job runs serially so "pending" isn't a concern. The only issue is making sure that the cache isn't currently undergoing a manual refresh. We should be able to use the |
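To make the concern above concrete, here is a minimal sketch of why a record left in a state outside `refreshable-states` would never be picked up again. It assumes the refresh job only selects records whose state is in that set; the `records` data, `refreshable?`, and `partition-records` are hypothetical names for illustration and are not Metabase's actual implementation. The `"pending"` state name follows the wording used in this thread.

```clojure
;; Copied from the snippet quoted above (without ^:private, for the sketch).
(def refreshable-states
  #{"creating" "persisted" "error"})

;; Hypothetical in-memory stand-ins for `persisted_info` rows.
(def records
  [{:id 1 :state "persisted"}
   {:id 2 :state "error"}
   {:id 3 :state "pending"}])   ; e.g. left behind by an interrupted run

(defn refreshable?
  "True when the record's state is one the refresh job would select
  (assumption: selection is purely by membership in `refreshable-states`)."
  [record]
  (contains? refreshable-states (:state record)))

(defn partition-records
  "Splits records into those a scheduled refresh would process and those
  it would skip on every run."
  [records]
  {:refreshed (filterv refreshable? records)
   :skipped   (filterv (complement refreshable?) records)})

;; Record 3 lands in :skipped on every run, because nothing in this
;; sketch ever moves it back into a refreshable state.
(partition-records records)
```

Under that assumption, the fix would need either to add the stuck state to the refreshable set or to reset such records when the instance starts; which of those is appropriate is a design decision for the maintainers.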
It stays pending until the next refresh, but if the refresh is scheduled every 24 hours and uncached queries are very expensive, performance can take a significant hit, or the instance can even become unresponsive, until the cache refresh runs successfully. |
@likeshumidity I think that is acceptable. The original design of this was always "cached when possible". From Bruno in the Notion doc:
Improving this is always good. But I'm not sure I'd call this a bug (unless the "pending" state prevents the cache from being repopulated on the next scheduled refresh). |
Right... Thanks for the fix there. Yeah, this was by design. So I agree with the "new feature" label there. |
Hi (original reporter here)! Thanks @likeshumidity for the report, and thanks for looking into it! Some additional info:
Actually, it never refreshes until we disable all the caches and re-enable them. We had one cache stuck for more than 10 days even though we have a 1h refresh (it took us a long time to figure out what was wrong with our Metabase instance). So this sentence is correct:
We would be fine with the cache being unavailable between 2 runs when this problem occurs, but as of today, we have to manually restart (all of) them (and manually monitor them). |
@dpsutton re: last comment, I was wrong about the problem correcting itself on the next model cache run. It does require manual intervention: disabling and re-enabling the cache. |
FWIW, and for other people hitting this: I plugged the Metabase DB in as a data source of Metabase and created an Alert in Metabase with this query, to at least be notified if something is off:
|