BugFix: Removing All Samples in Arrayset Should Not Remove Aset Schema #159
Codecov Report
@@ Coverage Diff @@
## master #159 +/- ##
==========================================
+ Coverage 95.22% 95.29% +0.07%
==========================================
Files 63 64 +1
Lines 11393 11548 +155
Branches 974 977 +3
==========================================
+ Hits 10848 11004 +156
Misses 362 362
+ Partials 183 182 -1
7aed94d
to
3224144
Compare
LGTM
@@ -1093,7 +1084,8 @@ def items(self) -> Iterable[Tuple[str, Union[ArraysetDataReader, ArraysetDataWri
 Iterable[Tuple[str, Union[:class:`.ArraysetDataReader`, :class:`.ArraysetDataWriter`]]]
 returns two tuple of all all arrayset names/object pairs in the checkout.
 """
-for asetN, asetObj in self._arraysets.items():
+for asetN in list(self._arraysets.keys()):
just curious: why an explicit list() here?
Yeah.. I hate it too... It's because of the following situation:
>>> for name, aset in co.arraysets.items():
...     del co.arraysets[name]
traceback RuntimeError
------------------------
RuntimeError: dictionary changed size during iteration
While I normally subscribe to the belief that "if you mutate a data structure while iterating over it, you are living in a state of sin, and deserve whatever happens to you", it's hangar's responsibility to manage usage of this thing, which kind of behaves like a dict and a class simultaneously. It seemed unfair to put the implications of dealing with an implementation detail on the user when it can be fixed so trivially...
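To make the trade-off concrete, here is a minimal sketch of the two iteration strategies. `ArraysetsSketch`, `items_live`, `items_snapshot`, and `delete_all` are hypothetical names invented for illustration and are not hangar's implementation; only the `list(self._arraysets.keys())` idiom comes from the diff above.

```python
from typing import Any, Dict, Iterator, Optional, Tuple


class ArraysetsSketch:
    """Hypothetical stand-in for the dict-like container; not hangar's class."""

    def __init__(self, members: Dict[str, Any]) -> None:
        self._arraysets = dict(members)

    def __delitem__(self, name: str) -> None:
        del self._arraysets[name]

    def __len__(self) -> int:
        return len(self._arraysets)

    def items_live(self) -> Iterator[Tuple[str, Any]]:
        # Iterates the live dict view: a delete inside the caller's loop
        # changes the dict size and the next step raises RuntimeError.
        for name, aset in self._arraysets.items():
            yield name, aset

    def items_snapshot(self) -> Iterator[Tuple[str, Any]]:
        # Iterates a copied list of keys, so deleting the key that was
        # just yielded does not disturb the iteration.
        for name in list(self._arraysets.keys()):
            yield name, self._arraysets[name]


def delete_all(container: ArraysetsSketch, use_snapshot: bool) -> Optional[str]:
    """Delete every member while iterating; return the error text, if any."""
    items = container.items_snapshot() if use_snapshot else container.items_live()
    try:
        for name, _ in items:
            del container[name]
    except RuntimeError as err:
        return str(err)
    return None
```

The snapshot costs one extra list of keys per call, which is usually negligible next to the surprise of a `RuntimeError` raised from inside a container that only looks like a dict.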
src/hangar/arrayset.py
Outdated
@@ -1278,6 +1282,8 @@ def init_arrayset(self,

Raises
------
PermissionError
If any enclosed arrayset is opned in a connection manager.
opened
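The `Raises` entry in the hunk above documents a `PermissionError` when any enclosed arrayset is open in a context manager. One way such a guard could look is sketched below. Every name here (`AsetSketch`, `GuardedContainerSketch`, `_is_open`, `remove`) is an assumption made for illustration, and the real check in hangar may be structured quite differently.

```python
from typing import Dict, Iterable


class AsetSketch:
    """Hypothetical arrayset that tracks whether it is inside a `with` block."""

    def __init__(self) -> None:
        self._is_open = False

    def __enter__(self) -> "AsetSketch":
        self._is_open = True
        return self

    def __exit__(self, *exc) -> bool:
        self._is_open = False
        return False  # never swallow exceptions


class GuardedContainerSketch:
    """Hypothetical parent that refuses structural changes while a member is open."""

    def __init__(self, names: Iterable[str]) -> None:
        self._arraysets: Dict[str, AsetSketch] = {n: AsetSketch() for n in names}

    def __getitem__(self, name: str) -> AsetSketch:
        return self._arraysets[name]

    def __contains__(self, name: str) -> bool:
        return name in self._arraysets

    def _check_none_open(self) -> None:
        # Assumed rule: one open member blocks changes to the whole container.
        if any(aset._is_open for aset in self._arraysets.values()):
            raise PermissionError(
                "cannot add or remove arraysets while one is open in a context manager"
            )

    def init_arrayset(self, name: str) -> AsetSketch:
        self._check_none_open()
        self._arraysets[name] = AsetSketch()
        return self._arraysets[name]

    def remove(self, name: str) -> None:
        self._check_none_open()
        del self._arraysets[name]
```

Checking every member, rather than only the one being changed, keeps backend file handles from being invalidated underneath a different arrayset that is still open.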
…need to increase tests.
…y when all samples are removed
3224144
to
03c8afe
Compare
Motivation and Context
Why is this change required? What problem does it solve?:
In previous versions of hangar, when the last sample in an arrayset was deleted, the arrayset schema was implicitly deleted as well (with no notice to the user).

In addition to being a detriment to UX, the level at which this operation occurred (`ArraysetDataWriter`) meant that the change was not propagated up to the `Arraysets` class, which holds the only strong reference keeping the object alive. Since the strong reference was never deleted, the user could technically still hold a live weakref proxy to an object which should have been finalized.

In the worst case scenario (if a context manager for any `Arrayset` class was open at the time the last sample was removed), backend file handles may not be closed and invalidated properly (which would force an exception on any `set` or `get` operation). Should this happen, it was possible that a `set` to the `ArraysetDataWriter` would actually succeed, saving data to disk and writing a (valid) record reference to the staging db. Though the sample was recorded, the record of the schema spec would have been removed from the staging area / commit refs as soon as the `Arrayset` implicitly deleted itself. Upon a later checkout of this (or a child) commit/branch, the schema spec corresponding to the added sample refs would not exist, and as such no `Arrayset` would be generated for the sample (even though a valid reference was present, and a valid schema spec may exist in the commit's ancestry).

If it fixes an open issue, please link to the issue here:
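The object-lifetime problem described under Motivation and Context can be illustrated with a small sketch. `WriterSketch`, `ParentSketch`, and `proxy_is_live` are hypothetical names and do not mirror hangar's classes; the only assumption carried over from the text is that a parent container holds the sole strong reference and hands out weakref proxies.

```python
import gc
import weakref


class WriterSketch:
    """Hypothetical stand-in for a per-arrayset writer object."""

    def __init__(self, name: str) -> None:
        self.name = name


class ParentSketch:
    """Hypothetical parent holding the only strong reference to each writer."""

    def __init__(self) -> None:
        self._arraysets = {}

    def add(self, name: str):
        self._arraysets[name] = WriterSketch(name)
        # Callers only ever receive a weak proxy, never the object itself.
        return weakref.proxy(self._arraysets[name])

    def remove(self, name: str) -> None:
        # Dropping the strong reference is what allows finalization.
        del self._arraysets[name]


def proxy_is_live(proxy) -> bool:
    """Return True if the proxied object still exists."""
    try:
        proxy.name
    except ReferenceError:
        return False
    return True


parent = ParentSketch()
proxy = parent.add("images")
live_while_referenced = proxy_is_live(proxy)

parent.remove("images")
gc.collect()  # not required on CPython; included for other interpreters
live_after_removal = proxy_is_live(proxy)
```

If the deletion happens at the writer level and the parent's `remove` step is never reached, the proxy stays usable, which is the stale-but-live state the description warns about.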
Description
Describe your changes in detail:
`Arrayset` can not be performed when any arrayset is open in a context manager.
Screenshots (if appropriate):
Types of changes
What types of changes does your code introduce? Put an `x` in all the boxes that apply:
Is this PR ready for review, or a work in progress?
How Has This Been Tested?
Put an `x` in the boxes that apply:
Checklist: