Fix Catalog corruption issue when running CLI in parallel #472
This looks very nice to me.
There are, I think, some important comments to address, but this is a very clean solution.
Thanks @anujc25 !
6f2c2e7 to cbfc837
Do we have any e2e or integ tests (or tests doing multiple concurrent catalog reads/writes) that can help uncover the issue we are trying to fix here? Having them earlier would have helped catch the issue, and having them now would help give confidence that the issue is sufficiently addressed. Let's make sure we at least file a TODO to provide them.
I am planning to add e2e tests for this. I wanted to get high-level feedback on the approach before I deep dive into writing tests for this PR, hence the draft PR. If the approach looks good, I will add some e2e tests for this change.
sgtm. The not-so-great part is an expectation that any valid unlock function returned has to be called by the caller, but the approach is a reasonable compromise.
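The expectation discussed here (a constructor hands back an unlock function that the caller must invoke) might look roughly like the sketch below. The names `openCatalogForUpdate`, `catalog`, and `installPlugin` are hypothetical and not taken from the PR, and a `sync.Mutex` stands in for the on-disk file lock so the example runs without touching any files.

```go
package main

import (
	"fmt"
	"sync"
)

// catalogLock is a hypothetical stand-in for the on-disk catalog lock.
// A sync.Mutex is used only so the sketch runs without touching files.
var catalogLock sync.Mutex

// catalog is a hypothetical, simplified catalog type.
type catalog struct {
	plugins map[string]string
}

// openCatalogForUpdate acquires the lock and returns the catalog together
// with an unlock function. The caller is responsible for calling unlock.
func openCatalogForUpdate() (*catalog, func()) {
	catalogLock.Lock()
	c := &catalog{plugins: map[string]string{}}
	var once sync.Once
	// sync.Once makes a repeated unlock call harmless.
	unlock := func() { once.Do(catalogLock.Unlock) }
	return c, unlock
}

// installPlugin shows the calling convention: defer unlock right away.
func installPlugin(name, version string) int {
	c, unlock := openCatalogForUpdate()
	defer unlock() // forgetting this would block every later updater
	c.plugins[name] = version
	return len(c.plugins)
}

func main() {
	fmt.Println(installPlugin("telemetry", "v1.0.0"))
	// The second call only proceeds because the first one released the lock.
	fmt.Println(installPlugin("builder", "v0.9.0"))
}
```

The trade-off named in the comment is visible here: nothing forces the caller to write the `defer unlock()` line, so the contract has to be documented and covered by tests.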
e7dd402 to aa9360d
8292dce to 6df47bb
6df47bb to a99b6ca
Nice effort. This is complicated stuff.
I haven't looked at the E2E tests yet because more concurrency questions came to my mind.
I'm wondering, in `saveCatalogCache()`, do we need to use the `lockedFile` file descriptor to write to the file? I think doing that would make unit tests properly fail if `Unlock()` is called too early (which we had before but you corrected in your latest changes), because a call to `Upsert()` after the `Unlock()` would not be able to write to the closed file.
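A minimal sketch of the idea raised in this comment: if the save path writes through the same file handle that holds the lock, a write attempted after unlocking fails on the closed handle instead of silently succeeding. The type `catalogCache` and its `save` and `unlock` methods are hypothetical; a plain `*os.File` stands in for the locked file used by the PR.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// catalogCache is hypothetical; lockedFile models the handle that is held
// open for as long as the catalog is locked.
type catalogCache struct {
	lockedFile *os.File
}

// save rewrites the file from the start through the held handle.
func (c *catalogCache) save(data []byte) error {
	if err := c.lockedFile.Truncate(0); err != nil {
		return err
	}
	_, err := c.lockedFile.WriteAt(data, 0)
	return err
}

// unlock closes the handle. With real file locking, closing the handle
// is also what releases the lock.
func (c *catalogCache) unlock() error {
	return c.lockedFile.Close()
}

// demoWriteAfterUnlock saves once while "locked" and once after unlock,
// returning both errors so the difference is observable.
func demoWriteAfterUnlock() (beforeErr, afterErr error) {
	dir, err := os.MkdirTemp("", "catalog-sketch")
	if err != nil {
		return err, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "catalog.yaml")
	f, err := os.OpenFile(path, os.O_RDWR|os.O_CREATE, 0o600)
	if err != nil {
		return err, err
	}
	c := &catalogCache{lockedFile: f}

	beforeErr = c.save([]byte("plugins: []\n"))
	_ = c.unlock()
	afterErr = c.save([]byte("plugins: [late]\n")) // handle is closed now
	return beforeErr, afterErr
}

func main() {
	before, after := demoWriteAfterUnlock()
	fmt.Println("save before unlock:", before)
	fmt.Println("save after unlock:", after)
}
```

With this shape, a unit test that calls an update method after an early unlock sees an error rather than a successful write made without the lock.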
If `cc.unlock` is `nil`, consider the catalog already unlocked and return a meaningful error when Upsert/Delete calls run on the unlocked catalog
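The guard described in this commit message could be sketched as follows. Only the field name `unlock` and the method names Upsert/Delete come from the text above; the `contextCatalog` type, `newUpdater`, the error value, and the signatures are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errCatalogUnlocked is a hypothetical error value for this sketch.
var errCatalogUnlocked = errors.New("catalog has already been unlocked; create a new updater before modifying it")

// contextCatalog is a simplified, hypothetical catalog.
type contextCatalog struct {
	plugins map[string]string
	unlock  func() // nil once the catalog has been unlocked
}

// newUpdater returns a catalog whose unlock function is a no-op stand-in
// for releasing a real file lock.
func newUpdater() *contextCatalog {
	return &contextCatalog{plugins: map[string]string{}, unlock: func() {}}
}

// Unlock releases the lock once and marks the catalog as unlocked.
func (c *contextCatalog) Unlock() {
	if c.unlock != nil {
		c.unlock()
		c.unlock = nil
	}
}

// Upsert refuses to modify a catalog that no longer holds the lock.
func (c *contextCatalog) Upsert(name, version string) error {
	if c.unlock == nil {
		return errCatalogUnlocked
	}
	c.plugins[name] = version
	return nil
}

// Delete applies the same guard as Upsert.
func (c *contextCatalog) Delete(name string) error {
	if c.unlock == nil {
		return errCatalogUnlocked
	}
	delete(c.plugins, name)
	return nil
}

func main() {
	c := newUpdater()
	fmt.Println(c.Upsert("telemetry", "v1.0.0"))
	c.Unlock()
	fmt.Println(c.Upsert("builder", "v0.9.0"))
}
```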
abfd39e to 75c2b96
LGTM
Thanks for this complicated and important change!
Thanks for the updates and improvements after the reviews.
There is a nit, a typo in the comments, but the changes lgtm.
75c2b96 to 118ba57
What this PR does / why we need it
`NewContextCatalog` (for reading) and `NewContextCatalogUpdater` (for reading/writing) as separate APIs.
`lockedfile` API to lock the file. Inspired by: https://go.googlesource.com/proposal/+/master/design/33974-add-public-lockedfile-pkg.md
Which issue(s) this PR fixes
Fixes #471
Describe testing done for PR
Also added E2E tests to run tanzu commands in parallel when 2 telemetry plugins are present
Also added E2E tests to install multiple Plugins in parallel
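The shape of such a parallel test can be sketched as below. This is not the PR's E2E code: `runInParallel`, `sharedCatalog`, and `installedAfterParallelRun` are hypothetical, and an in-memory map guarded by a mutex stands in for the catalog file, whereas the real tests invoke the `tanzu` binary.

```go
package main

import (
	"fmt"
	"sync"
)

// runInParallel starts n workers, releases them together, and collects
// one error slot per worker.
func runInParallel(n int, work func(i int) error) []error {
	errs := make([]error, n)
	start := make(chan struct{})
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			<-start // wait so that all workers begin at about the same time
			errs[i] = work(i)
		}(i)
	}
	close(start)
	wg.Wait()
	return errs
}

// sharedCatalog is a hypothetical in-memory stand-in for the catalog file.
type sharedCatalog struct {
	mu      sync.Mutex
	plugins map[string]bool
}

// install records a plugin while holding the lock.
func (s *sharedCatalog) install(name string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.plugins[name] = true
	return nil
}

// installedAfterParallelRun installs n distinct plugins concurrently and
// reports how many entries the catalog holds afterwards (-1 on error).
func installedAfterParallelRun(n int) int {
	cat := &sharedCatalog{plugins: map[string]bool{}}
	errs := runInParallel(n, func(i int) error {
		return cat.install(fmt.Sprintf("plugin-%d", i))
	})
	for _, err := range errs {
		if err != nil {
			return -1
		}
	}
	return len(cat.plugins)
}

func main() {
	// With correct locking, no install is lost.
	fmt.Println(installedAfterParallelRun(10)) // prints 10
}
```

The assertion that matters is the final count: if concurrent writers could overwrite each other, some entries would be missing after the run.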
Users can verify the E2E tests by running them with the latest `tanzu` binary with the fix and the old `tanzu` binary without the fix, as below:
Release note
Additional information
Special notes for your reviewer