Skip to content

Conversation

apdavison
Copy link
Member

I failed to notice during the review of #606 and #472 that the implementation of Event.labels, Epoch.labels and Epoch.durations had been changed to store these attributes as array annotations.

I think this change is a mistake, because (i) semantically labels and durations are not annotations, they are fundamental data attributes for these objects, so it is confusing to handle them as annotations; (ii) annotations are treated differently to data attributes during copying and merging, and this can lead to problems.

I encountered this specifically while trying to fix problems with the Container.merge() method; I suspect it is also related to NeuralEnsemble/elephant#203

@mdenker
Copy link
Member

mdenker commented Apr 9, 2019

First of all, yes, I can confirm that this had also been an issue on NeuralEnsemble/elephant#203. Also, semantically, I agree that labels and durations should not be considered as array annotations. Therefore, for now, I would support the suggested revert of durations and labels on the epoch and event objects.

I understand the rationale behind making durations and labels is to use the same mechanism for ensuring consistent lengths of the arrays. In the reverted version, the array lengths of durations and labels are not checked. On the one hand, this is good, because the check had prevented re-assigning the values of an Epoch object before (i.e., typing something like e.times=[1,2,3]*pq.ms didn't work if len(e) was not 3). On the other hand, it's also not very optimal that one can have inconsistent Neo objects.

I think a question for the future is whether to be lazy and Pythonic in allowing Neo objects even when they are "intrinsically inconsistent", or whether to enforce consistent Neo objects. An assertion function could be provided that a user can call to validate a given Neo object. In such a function, maybe one could still find a way to have labels, durations and array annotations resort to a similar mechanism (as currently implemented for array annotations) internally to check/validate their lengths. This way, labels and durations would not appear as array annotations, but the code overhead would still be minimized.

@JuliaSprenger
Copy link
Member

Sorry for the long delay. Indeed the idea behind using the array annotation mechanism also for attributes like labels was to have less duplicated code in Neo as these attributes are expected to be consistent in length with the data.

I agree with @apdavison in (i) that having them listed in as in the array_annotation dictionary as well as object attribute might be confusing, but should not distract the user too much, since these are additional array_annotations he can just ignore in the simplest case. In the future this duplication could be avoided by either introducing some mandatory keys in the array annotations, which are then displayed differently to the user, so he's aware of labels and durations being different than custom annotations.
Regarding (ii), I don't see why attributes should be handled differently than array_annotations. Can you give an example where this is necessary?

@mdenker: I find the example of directly reassigning the times of an event artificial. When would you reassign the times instead of generating a new object by slicing/addition?

@apdavison
Copy link
Member Author

In the reverted version, the array lengths of durations and labels are not checked.

I hadn't realized this. I will update the PR to check them.

On the one hand, this is good

I can't see anything good about that! We definitely want to enforce consistent Neo objects, otherwise we might as well just use plain NumPy arrays.

Regarding (ii), I don't see why attributes should be handled differently than array_annotations. Can you give an example where this is necessary?

I'm not saying they should be handled differently, but in the current master they are handled differently. I think the particular case I had trouble with was in the duplicate_with_new_array() method, where there is a comment "Array annotations are not copied here, because it is not ensured that the same number of signals is used and they would possibly make no sense when combined with another signal"

@JuliaSprenger
Copy link
Member

So the problem with the duplicate_with_new_data() method is that on the one side it is just not copying the array annotations because they might have an incompatible shape. On the other side this means that also attributes like labels and durations which are stored as array annotations are not available for the new object, basically creating an incomplete object.
One way out of this would be to (re)introduce the option to provide labels and durations as keyword arguments for the duplicate_with_new_data() method.
In addition array annotations should be kept if they are consistent with the new data provided.
@apdavison: Would this solve your issue?

@mdenker
Copy link
Member

mdenker commented Apr 10, 2019

One option to guarantee in this scenario that duplicate_with_new_data() behaves somewhat consistent would be that by default, the function does not copy array annoatations, durations and labels. Only when a special flag transfer_array_annotations=True is given to duplicate_with_new_data(), these properties are copied given that the new times are of same length.

This would lead to code such as:

a = Epoch([1,2,3]*pq.ms, durations=[5,6,6]*pq.ms)
b = a.duplicate_with_new_data([4,5,6]*pq.ms, durations=[1,2,2]*pq.ms)
c = a.duplicate_with_new_data([4,5,6]*pq.ms, transfer_array_annotations=True)
d = a.duplicate_with_new_data([4,5]*pq.ms,  durations=a.durations[1:3])

@pep8speaks
Copy link

Hello @apdavison! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 431:37: E126 continuation line over-indented for hanging indent

@apdavison
Copy link
Member Author

please could someone merge this?

@apdavison apdavison added this to the 0.8.0 milestone Jun 6, 2019
@samuelgarcia
Copy link
Contributor

Oups. Sorry.

@samuelgarcia samuelgarcia merged commit 239e708 into NeuralEnsemble:master Jun 6, 2019
@JuliaSprenger
Copy link
Member

@apdavison: So what do you think of @mdenker and my alternative suggestions above for future implementations?

@apdavison
Copy link
Member Author

@JuliaSprenger because the thread got a bit complex, I'm not sure exactly what you're proposing. Epoch.duplicate_with_new_data() does now take durations and labels as arguments (but not kw arguments).
Improving duplicate_with_new_data() to copy array annotations when appropriate or when requested seems like a good idea.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants