Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attribute copy #768

Merged
merged 15 commits into from
Dec 9, 2021
Merged

Attribute copy #768

merged 15 commits into from
Dec 9, 2021

Conversation

pkasprzyk
Copy link
Contributor

No description provided.

neptune/new/types/value_copy.py Outdated Show resolved Hide resolved
exp["copied/path"] = exp["some/path/num"]
exp.wait()
self.assertEqual(42, exp["copied/path"].fetch())

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see such test and more meaningful error than what I got now.

        exp["some/path/num"] = 42
        exp["copied/path"] = 11
        exp["copied/path"] = exp["some/path/num"]
self = <[AttributeError("'Integer' object has no attribute 'value'") raised in repr()] Integer object at 0x7fac10319490>
value = <Integer field at "some/path/num">

    def __init__(self, value):
>       self.value = int(value)
E       TypeError: int() argument must be a string, a bytes-like object or a number, not 'Handler'

../../../neptune/new/types/atoms/integer.py:33: TypeError

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that was really broken... fixed now, I hope -- thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have I broken this because String does not support copying yet?

        exp["some/path/num"] = "fo9o"
        exp["copied/path"] = 42
        exp["copied/path"] = exp["some/path/num"]
self = <neptune.new.internal.backends.neptune_backend_mock.NeptuneBackendMock.NewValueOpVisitor object at 0x7f0ef5fe2670>
op = CopyAttribute(path=['copied', 'path'], container_id='bdcc8dc7-a2b7-42a9-ba25-209a43888ed7', source_path=['some', 'path', 'num'], source_attr_cls=<class 'neptune.new.attributes.atoms.string.String'>)

    def visit_copy_attribute(self, op: CopyAttribute) -> Optional[Value]:
>       getter = op.source_attr_cls.getter
E       AttributeError: type object 'String' has no attribute 'getter'

Additionally it'd be good to have additional case with copying on existing value (the case recently broken).

Copy link
Contributor Author

@pkasprzyk pkasprzyk Dec 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have I broken this because String does not support copying yet?

yep

Additionally it'd be good to have additional case with copying on existing value

will do

neptune/new/attributes/atoms/integer.py Outdated Show resolved Hide resolved
neptune/new/internal/backends/neptune_backend_mock.py Outdated Show resolved Hide resolved
@pkasprzyk pkasprzyk marked this pull request as ready for review December 7, 2021 14:20
@pkasprzyk pkasprzyk changed the title Attribute copy POC Attribute copy Dec 7, 2021
@pkasprzyk pkasprzyk marked this pull request as draft December 7, 2021 15:35
@pkasprzyk pkasprzyk marked this pull request as ready for review December 8, 2021 15:27
@pkasprzyk pkasprzyk changed the base branch from master to dev/project-metadata December 8, 2021 15:33
neptune/new/attributes/atoms/copiable_atom.py Show resolved Hide resolved
neptune/new/attributes/atoms/float.py Show resolved Hide resolved
@@ -65,7 +66,7 @@ def process(self, operations: List[Operation]):

def get_operations(self) -> List[Operation]:
result = []
for _, acc in sorted(self._accumulators.items()):
for _, acc in self._accumulators.items():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We dropped sort, because we have to make sure that every batch is limited by last copy operation, right?

Why did we sorted in the first place?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. yes, exactly
  2. that's a great question. @aniezurawski maybe you know?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not remember. It is sorted by path so it should not metter. Maybe it was easier to test it this way?

Anyway, are you guys sure it is good idea to change anything in operation_preprocessor? Shouldn't be part-batches preprocessed independently?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could just split a batch by copy operations and reuse existing code for every part-batch. Much simpler and elegant.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moreover we probably should ack after every part-batch, so maybe single call to execute_operations should just return number operations actually performed and do a single part-batch at the time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would change the idea that in processing a queue batch we do upload (and artifact) ops first -- we would do upload "subbatch", then other "subbatch", and upload again (there's a special comment saying "Upload operations should be done first since they are idempotent").

I'm not if we actually need that -- other than that, I'm perfectly OK with what you suggest.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would change the idea that in processing a queue batch we do upload (and artifact) ops first

No, it would not. It would still apply but to potentially smaller batches. Just split first, and do upload-related logic later for every part-batch. ;)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moreover we probably should ack after every part-batch, so maybe single call to execute_operations should just return number operations actually performed and do a single part-batch at the time?

Right, that sounds good (although the "much simpler" may be lost on me ;))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@pkasprzyk pkasprzyk requested a review from shnela December 9, 2021 10:55
@pkasprzyk pkasprzyk merged commit 4622791 into dev/project-metadata Dec 9, 2021
@pkasprzyk pkasprzyk deleted the pk/copy-attribute branch December 9, 2021 12:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants