Description
Describe the bug
When using the hook for adding an arbitrary activity annotation, a subsequent export of the graph crashes.
To Reproduce
Steps to reproduce the behavior:
- Create a pluggy implementer of activity_annotations, following the documentation.
- Execute a renku run.
- Export the resulting graph containing the extra annotation.
Expected behavior
The crash (a KeyError exception) should be handled gracefully and execution halted. As it stands, this kind of bug requires a hard revert of the git repo, since any further attempt to export the data will choke on the past annotation.
Run environment (please complete the following information):
In this particular run, renku==1.11.2 and calamus==0.4.2, but after looking at the codebase I suspect the bug persists in the current development head of both projects.
Additional context
I am not too familiar with the code, but I'm inclined to believe this is due to an underlying bug in calamus. More precisely, the schema lookup that happens in fields._serialize_single_obj fails. The original backtrace is:
  File "/opt/conda/lib/python3.9/site-packages/renku/command/graph.py", line 79, in export_graph
    graph = get_graph_for_all_objects()
  File "/opt/conda/lib/python3.9/site-packages/inject/__init__.py", line 342, in injection_wrapper
    return sync_func(*args, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/renku/command/graph.py", line 195, in get_graph_for_all_objects
    return _convert_entities_to_graph(objects, project)
  File "/opt/conda/lib/python3.9/site-packages/renku/command/graph.py", line 239, in _convert_entities_to_graph
    graph.extend(schema(flattened=True).dump(entity))
  File "/opt/conda/lib/python3.9/site-packages/marshmallow/schema.py", line 557, in dump
    result = self._serialize(processed_obj, many=many)
  File "/opt/conda/lib/python3.9/site-packages/calamus/schema.py", line 187, in _serialize
    value = field_obj.serialize(attr_name, obj, accessor=self.get_attribute)
  File "/opt/conda/lib/python3.9/site-packages/marshmallow/fields.py", line 344, in serialize
    return self._serialize(value, attr, obj, **kwargs)
  File "/opt/conda/lib/python3.9/site-packages/calamus/fields.py", line 550, in _serialize
    result.append(self._serialize_single_obj(obj, **kwargs))
  File "/opt/conda/lib/python3.9/site-packages/calamus/fields.py", line 522, in _serialize_single_obj
    schema = self.schema["to"][type(obj)]
KeyError: <class 'renku.domain_model.provenance.annotation.Annotation'>
In that function, there's a safeguard, but there's a typo that makes the execution continue (note the missing raise in line 508):
if type(obj) not in self.schema["to"]:
    ValueError("Type {} not found in field {}.{}".format(type(obj), type(self.parent), self.name))
Fixing that is useful to get a clearer error, but will not fix the underlying problem.
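To illustrate why the missing raise matters, here is a minimal sketch. LookupDemo and error_type are hypothetical stand-ins written for this example, not the calamus code; only the shape of the safeguard is reproduced. Without raise, the ValueError is constructed and discarded, so the failure surfaces later as a bare KeyError.

```python
class LookupDemo:
    """Hypothetical stand-in for the field; only the safeguard logic is mimicked."""

    def __init__(self, schema_map):
        self.schema = {"to": schema_map}

    def lookup_buggy(self, obj):
        if type(obj) not in self.schema["to"]:
            # The exception object is created and thrown away: nothing is raised.
            ValueError("Type {} not found".format(type(obj)))
        # Execution continues and fails here with a KeyError instead.
        return self.schema["to"][type(obj)]

    def lookup_fixed(self, obj):
        if type(obj) not in self.schema["to"]:
            raise ValueError("Type {} not found".format(type(obj)))
        return self.schema["to"][type(obj)]


def error_type(lookup, obj):
    """Return the name of the exception raised by lookup(obj), or None."""
    try:
        lookup(obj)
    except Exception as exc:
        return type(exc).__name__
    return None


demo = LookupDemo({int: "IntSchema"})
print(error_type(demo.lookup_buggy, "text"))
print(error_type(demo.lookup_fixed, "text"))
```

With the raise in place the caller gets the descriptive message, but the lookup still fails for the same underlying reason.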
Diving into why the class is not present there turns out to be puzzling: the class is indeed registered, but the lookup in the dictionary doesn't retrieve the object (the equality comparison between the two class objects fails):
ipdb> type(obj)
<class 'renku.domain_model.provenance.annotation.Annotation'>
ipdb> tt = type(obj)
ipdb> ts = tuple(self.schema["to"].keys())[0]
ipdb> ts
<class 'renku.domain_model.provenance.annotation.Annotation'>
ipdb> tt
<class 'renku.domain_model.provenance.annotation.Annotation'>
ipdb> ts == tt
False
ipdb> ts.__dict__
mappingproxy({'__module__': 'renku.domain_model.provenance.annotation', '__doc__': 'Represents a custom annotation for a research object.', '__init__': <function Annotation.__init__ at 0x7fc791986550>, 'copy': <function Annotation.copy at 0x7fc7919865e0>, 'generate_id': <staticmethod object at 0x7fc7919b49a0>, '__dict__': <attribute '__dict__' of 'Annotation' objects>, '__weakref__': <attribute '__weakref__' of 'Annotation' objects>})
ipdb> tt.__dict__
mappingproxy({'__module__': 'renku.domain_model.provenance.annotation', '__doc__': 'Represents a custom annotation for a research object.', '__init__': <function Annotation.__init__ at 0x7fc78ca8e280>, 'copy': <function Annotation.copy at 0x7fc78ca8e310>, 'generate_id': <staticmethod object at 0x7fc78ca81640>, '__dict__': <attribute '__dict__' of 'Annotation' objects>, '__weakref__': <attribute '__weakref__' of 'Annotation' objects>})
I'm lost about why the mappingproxy returns the same class with functions at different addresses in memory, but I suspect the internals of pluggy might be interfering with the schema lookup at calamus that relies on equality of class objects.
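For reference, one general way to end up with two class objects that print identically but compare unequal is for the defining module's code to be executed twice. Whether that is what happens here is unconfirmed; the sketch below only reproduces the symptom with the standard library, and the module name annotation_demo and the load_module helper are made up for the example.

```python
import types

# Source of a tiny module defining a class, standing in for the real one.
SOURCE = '''
class Annotation:
    """Represents a custom annotation."""

    def copy(self):
        return self
'''


def load_module(name):
    """Execute SOURCE in a fresh module object, as a second load of the same file would."""
    module = types.ModuleType(name)
    exec(SOURCE, module.__dict__)
    return module


first = load_module("annotation_demo")
second = load_module("annotation_demo")

# The schema registry is keyed by the class from the first load...
registry = {first.Annotation: "AnnotationSchema"}
# ...but the object was built from the class created by the second load.
obj = second.Annotation()

same_repr = repr(first.Annotation) == repr(second.Annotation)
same_class = first.Annotation is second.Annotation
found = type(obj) in registry
print(same_repr, same_class, found)
```

Classes hash and compare by identity, so the dictionary lookup misses even though both keys have the same module and name. The methods also live at different addresses, as in the ipdb output above.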
A quick and dirty (although inelegant) workaround is to replace the class-based index with a string comparison, which seems to capture the same semantics:
schema = None
for klass, candidate in self.schema["to"].items():
    # Compare the printed class names instead of the class objects themselves.
    if str(klass) == str(type(obj)):
        schema = candidate
        break
A slightly better way could perhaps be to index by string with {__module__}.{__name__}.
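A rough sketch of the string-indexed variant, assuming the registry can be keyed by module and class name. NameKeyedRegistry and qualified_name are hypothetical names for this example and are not part of calamus.

```python
def qualified_name(cls):
    """Build the '{__module__}.{__name__}' key for a class."""
    return "{}.{}".format(cls.__module__, cls.__name__)


class NameKeyedRegistry:
    """Hypothetical schema registry keyed by qualified class name, not class identity."""

    def __init__(self):
        self._schemas = {}

    def register(self, cls, schema):
        self._schemas[qualified_name(cls)] = schema

    def lookup(self, obj):
        key = qualified_name(type(obj))
        if key not in self._schemas:
            raise ValueError("Type {} not found in registry".format(key))
        return self._schemas[key]


# Two distinct class objects sharing the same module and name,
# mimicking the duplicated Annotation class from the debugger session.
FirstAnnotation = type("Annotation", (), {"__module__": "demo.annotation"})
SecondAnnotation = type("Annotation", (), {"__module__": "demo.annotation"})

registry = NameKeyedRegistry()
registry.register(FirstAnnotation, "AnnotationSchema")
print(registry.lookup(SecondAnnotation()))
```

The trade-off is that two genuinely different classes with the same module and name would be treated as one, so this hides the duplication rather than explaining it.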
I can try to work on a better fix. How do you suggest this should be handled?