Fix for race condition loading management.cattle.io.cluster schema #5319
Fixes #5313
Fixes #4967
This PR fixes the problem reported in both of the issues above: a rare race condition that leads to the fail-whale page being displayed.
Tracing through the code, the problem is in middleware/authenticated.js: on line 265, we dispatch calls to load the management and cluster stores in parallel. The management store loads all schemas (see store/index.js line 538). The cluster store assumes that the schema for the management cluster is available (see store.js line 677). This is a race condition: if the schemas have not been loaded by the time the cluster load runs, that call fails and you get bumped to the fail-whale page.
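To make the race concrete, here is a minimal sketch of the failure mode. It is not the dashboard code: the store object, the function bodies, and the latency parameters are hypothetical stand-ins that only mimic "load schemas" and "load cluster" being started at the same time.

```javascript
// Hypothetical stand-ins for illustration; not the real store code.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

const SCHEMA_ID = 'management.cattle.io.cluster';

// Stand-in for the management store loading all schemas after some latency.
async function loadManagement(store, latencyMs) {
  await sleep(latencyMs);
  store.schemas[SCHEMA_ID] = { id: SCHEMA_ID };
}

// Stand-in for the cluster load: it assumes the schema is already present.
async function loadCluster(store, latencyMs) {
  await sleep(latencyMs);
  const schema = store.schemas[SCHEMA_ID];

  if (!schema) {
    throw new Error(`${ SCHEMA_ID } schema not loaded`);
  }

  return schema.id;
}

// Both loads start together, so the outcome depends on which one finishes first.
async function loadBoth(schemaLatencyMs, clusterLatencyMs) {
  const store = { schemas: {} };
  const [, cluster] = await Promise.allSettled([
    loadManagement(store, schemaLatencyMs),
    loadCluster(store, clusterLatencyMs),
  ]);

  return cluster.status; // 'fulfilled' or 'rejected'
}
```

With fast schemas, e.g. `loadBoth(5, 50)`, the cluster load succeeds; with slow schemas, e.g. `loadBoth(50, 5)`, the cluster load is rejected, which corresponds to the fail-whale case.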
This PR fixes this by adding a wait to the loadCluster code. We don't want to load the schemas in multiple places, so we assume that the schemas are already in the process of being loaded and wait up to 15 seconds for them. After 10 seconds we log a message to the browser console, and if we still don't have the schemas after 15 seconds, we request them synchronously (this should not be needed, but is a fail-safe).

This is hard to test, but the description above should make it clear how the failure can happen.
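The wait described above could look roughly like the following sketch. It is an illustration under assumptions, not the code in this PR: `waitForSchema`, `getSchema`, `fetchSchemas`, and the option names are hypothetical, and only the 10 and 15 second thresholds come from the description.

```javascript
// Hypothetical helper for illustration; names and options are assumptions.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function waitForSchema(getSchema, fetchSchemas, opts = {}) {
  const {
    pollMs = 100,          // how often to re-check (assumed value)
    warnAfterMs = 10000,   // log to the console after 10 seconds
    giveUpAfterMs = 15000, // stop waiting after 15 seconds
    log = console.warn,
  } = opts;
  const start = Date.now();
  let warned = false;

  // Poll until the schema appears or the overall timeout is reached.
  while (!getSchema()) {
    const elapsed = Date.now() - start;

    if (elapsed >= giveUpAfterMs) {
      // Fail-safe: stop waiting and request the schemas directly.
      await fetchSchemas();
      break;
    }

    if (!warned && elapsed >= warnAfterMs) {
      warned = true;
      log('Still waiting for schemas to load');
    }

    await delay(pollMs);
  }

  return getSchema();
}
```

In the common case the schema shows up within a few polls and neither the log message nor the fallback request is triggered. Polling keeps schema loading in a single place, at the cost of up to one poll interval of extra latency.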
Developers can test by:
1. Comment out line 673 in store/index.js. This essentially removes the fix in this PR.
2. Change line 538 in store/index.js to add a 5 second delay to the loading of schemas.
3. In the UI, refresh the page for the local cluster, i.e. https://127.0.0.1:8005/c/local/explorer
4. You should see the fail-whale page.
5. Uncomment line 673 in store/index.js to put the fix from this PR back in, then try again. Everything should load correctly.