Fix invalidate obj on close #54

Merged
rlizzo merged 3 commits into tensorwerk:master from fix-invalidate-obj-on-close on May 22, 2019

Conversation

Member

@rlizzo rlizzo commented May 17, 2019

Motivation and Context

Why is this change required? What problem does it solve?:

A full overview of the problem is provided in: #53 (comment). However, in brief:

  • The changes introduced in #41 (Invalidate existing dataset or metadata accessor objects after a checkout is closed) created a few situations in which a write-enabled checkout (and only a write-enabled one) would inadvertently invalidate the weakproxy references it had handed out, before the checkout.close() method was formally called.
  • This was observed to occur when checkout.commit() and checkout.reset_staging_area() were called.
  • The cause was that these two methods called checkout.__setup(), which replaced the only strong references to the previously "proxied" dataset or metadata objects with new versions.
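
The failure mode described above can be reproduced in isolation with the standard library. This is only a minimal sketch, not the project's code: `Accessor`, `Checkout`, `_setup`, and `proxy_is_alive` are hypothetical stand-ins (a single underscore is used to sidestep name mangling), and the immediate invalidation relies on CPython's reference counting.

```python
import weakref


class Accessor:
    """Hypothetical stand-in for a dataset or metadata accessor object."""

    def __init__(self, name):
        self.name = name


class Checkout:
    """Hypothetical write-enabled checkout that hands out weak proxies."""

    def __init__(self):
        self._setup()

    def _setup(self):
        # Rebinding this attribute drops the only strong reference to any
        # accessor created by an earlier call.
        self._datasets = Accessor("datasets")

    @property
    def datasets(self):
        # Callers only ever receive a weak proxy, never the object itself.
        return weakref.proxy(self._datasets)

    def commit(self):
        # Assumed pre-fix behaviour: commit re-runs the setup routine.
        self._setup()


def proxy_is_alive(proxy):
    """Return True if the proxy's referent still exists."""
    try:
        proxy.name
        return True
    except ReferenceError:
        return False


co = Checkout()
dsets = co.datasets
alive_before = proxy_is_alive(dsets)
co.commit()  # close() was never called, yet the proxy is now dead
alive_after = proxy_is_alive(dsets)
```

In this sketch `alive_before` is true and `alive_after` is false, even though the checkout was never closed.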

If it fixes an open issue, please link to the issue here:

Description

Describe your changes in detail:

  • To stick with the idea that "__setup" should probably only be called once, upon actual class "setup", the problem calls were removed wholesale from commit() and reset_staging_area().
  • For commit(), we do still need to perform some setup work for the backend file stores (mainly opening and closing file handles so symlinks can be properly identified). This work was pushed into the DatasetDataWriter class, which already performed it natively; this is arguably cleaner than the previous implementation. No changes were needed for metadata: if we don't try to rebuild the entire object, it just works :)
  • For reset_staging_area(), I thought the best choice was to have the operation close the checkout just before it returns. It's probably best for a user to fully reset their environment after wiping out the staging area, and closing is the most effective way to ensure no leftover state is hanging around after the call.
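
The shape of the change can be sketched roughly as follows. This is an illustration under stated assumptions rather than the merged implementation: `DatasetWriter`, `refresh_file_handles`, the `_closed` flag, and the `PermissionError` are all hypothetical names chosen for the example.

```python
import weakref


class DatasetWriter:
    """Hypothetical stand-in for the dataset writer class."""

    def __init__(self):
        self.handles_refreshed = 0

    def refresh_file_handles(self):
        # Placeholder for closing and reopening backend file handles.
        self.handles_refreshed += 1


class Checkout:
    """Hypothetical checkout whose setup work runs exactly once."""

    def __init__(self):
        # The strong reference is created once and never rebound until close().
        self._writer = DatasetWriter()
        self._closed = False

    @property
    def datasets(self):
        if self._closed:
            raise PermissionError("checkout is closed")
        return weakref.proxy(self._writer)

    def commit(self):
        # Refresh backend state in place instead of rebuilding the object,
        # so proxies handed out earlier stay valid.
        self._writer.refresh_file_handles()

    def reset_staging_area(self):
        # (The reset work itself is omitted from this sketch.)
        self.close()

    def close(self):
        # Dropping the strong reference is what invalidates the proxies.
        self._writer = None
        self._closed = True


co = Checkout()
dsets = co.datasets
co.commit()
refreshed_after_commit = dsets.handles_refreshed  # proxy survives commit

co.reset_staging_area()
try:
    dsets.handles_refreshed
    proxy_survived_reset = True
except ReferenceError:
    proxy_survived_reset = False
```

Here the proxy stays usable across `commit()` and only becomes invalid once the checkout is closed, which in this sketch happens as the last step of `reset_staging_area()`.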

Types of changes

What types of changes does your code introduce? Put an x in all the boxes that apply:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Is this PR ready for review, or a work in progress?

  • Ready for review
  • Work in progress

How Has This Been Tested?

Put an x in the boxes that apply:

  • Current tests cover modifications made
  • New tests have been added to the test suite
  • Modifications were made to existing tests to support these changes
  • Tests may be needed, but they are not included when the PR was proposed
  • I don't know. Help!

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have signed (or will sign when prompted) the tensorwerk CLA.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@rlizzo rlizzo added the "Bug: Priority 2" label (No risk of data/record corruption or loss; ANY user facing impacts) May 17, 2019
@rlizzo rlizzo added this to the v0.1.0 Release milestone May 17, 2019
@rlizzo rlizzo requested a review from hhsecond May 17, 2019 19:52
Member

@hhsecond hhsecond left a comment

LGTM

@@ -431,7 +443,7 @@ def reset_staging_area(self):
                 commit_hash=head_commit)

         logger.info(f'Hard reset completed, staging area head commit: {head_commit}')
-        self.__setup()
+        self.close()
Member

@rlizzo Personally, I don't have any sentiments towards this but I have a feeling that closing the checkout implicitly would be confusing for users. Just wanted you to have a second thought about it.

Member Author

Ok. I'm going to mark it as a TODO for now. There are a few more logistics involved in handling a reset, since we need to invalidate some objects and potentially not others. For now I'm ok with it though.

@rlizzo rlizzo merged commit dded091 into tensorwerk:master May 22, 2019
@rlizzo rlizzo deleted the fix-invalidate-obj-on-close branch May 22, 2019 20:07