Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow single process access to multiple repos. #20

Merged
merged 2 commits into from
May 1, 2019

Conversation

rlizzo
Copy link
Member

@rlizzo rlizzo commented Apr 30, 2019

Fix behavior which prevented a single python process from working on multiple repositories.

Description

These changes modify the behavior of the Environment and hdf5 FileHandler singleton classes to instantiate singleton instances based on the path of the checked out repository. This seems to fix all outstanding bugs related to these odd behaviors, but is untested as of yet.

The TransactionRegister singleton did not need to be updated because it the reference counting scheme is isolated for each lmdb environment which is passed to it. As different repository instances in the same process will now have unique lmdb environment handles, it does not require any modifications to it's current behavior.

Motivation and Context

Addresses:

How Has This Been Tested?

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

ping @hhsecond for sanity check.

These changes modify the behavior of the Environment and hdf5
FileHandler singleton classes to instantiate singleton instances based
on the path of the checked out repository. This seems to fix all
outstanding bugs related to these odd behaviors, but is untested as of
yet.

The TransactionRegister singleton did not need to be updated because it
the reference counting scheme is isolated for each lmdb environment
which is passed to it. As different repository instances in the same
process will now have unique lmdb environment handles, it does not
require any modifications to it's current behavior.
Copy link
Member

@lantiga lantiga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works for me. Just make sure we return an error in case we don't pass the repo_path in.
The only gotcha to consider is that the singleton(s) will now live indefinitely. There's no way to get rid of them - which is what a singleton does anyway, but now we have more then one, so we should have a way to get rid of some of them if they're not used. We can't rely on GC, maybe we can do it on close.

src/hangar/context.py Show resolved Hide resolved
@rlizzo
Copy link
Member Author

rlizzo commented May 1, 2019

So this is an interesting point. Ive been purposefully avoiding putting a close method on the repo because i think it would be annoying for users to have to remember. Only checkouts need to be closed, which should not close the entire repo context when they are called.

Not sure there's a solution if im stuck on not putting in an explicit repo.close() method?

If this is not a pathlike object, a directory on disk, or a directory
which the user agent does not have write permissions for then an
exception will be thrown and the no repository object will be
instantiated.
@rlizzo rlizzo mentioned this pull request May 1, 2019
@rlizzo rlizzo merged commit d1d0e53 into tensorwerk:master May 1, 2019
@rlizzo
Copy link
Member Author

rlizzo commented May 1, 2019

Closes #5 and #4

@rlizzo rlizzo deleted the hotfix-multi-repo-singleton branch May 1, 2019 15:39
@lantiga
Copy link
Member

lantiga commented May 1, 2019

I think we should at least give the option of cleaning up the singletons (either a specific one or all of them). In general I'm not 100% sure why we need a singleton (or something like it) design-wise, can you explain in a one-liner?

This was referenced May 2, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants