-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Application definitions #63
Conversation
self._group = group | ||
|
||
def child_strategy(self, group: H5Group): | ||
if (definition_class := group.attrs.get(self.class_attribute)) is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This requires child classes to define class_attribute
and strategies
. @nvaytet and I talked about this recently and concluded that we should avoid this kind of requirement.
Can we avoid it here? That is, can this be implemented by defining those fields in ApplicationDefinitionStrategy
? Or can we use a decorator instead of inheritance?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, this file is just a quick draft, to see if the strategy mechanism can be used for this. There are many things to improve here.
|
||
def child_strategy(self, group: H5Group): | ||
if (definition_class := group.attrs.get(self.class_attribute)) is not None: | ||
return self.strategies.get(definition_class) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function returns None
when
- there is no class attribute in the file
- there is no child strategy for the given attribute
Shouldn't the 2nd case be a hard failure, i.e. something is not supported by scippnexus?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that is for the application definition to decide. Without a custom strategy, child strategies are always set to None
, meaning the default strategy will be used by the child. Conceptually, this means the parent "does not care about or control" the child.
pass | ||
|
||
|
||
NXcanSAS.classes = [SASroot, SASentry, SASdata, SAStransmission_spectrum] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By inheriting from NXcanSAS
in all classes, this application definition supports arbitrary nesting. E.g. a root inside data. Does this make sense? Should we limit it more? For instance using something like the nexus loader prototype in ess?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had considered this. My conclusion was that in many cases structure may not be something people want to enforce (it will prevent reading partially broken/non-standard files, for example). At the same time, you can write your strategy in a way that it enforces structure.
@@ -2,7 +2,7 @@ | |||
# Copyright (c) 2022 Scipp contributors (https://github.com/scipp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach seems quite invasive in that it requires the NX*
classes to know about and handle strategies. Is this necessary or could it be implemented on a higher level like the loader prototype in ess?
So what kind of modification to the behaviour of, e.g. NXdata
does the application definition need to make which cannot be done as post processing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach seems quite invasive in that it requires the NX* classes to know about and handle strategies.
Well, this is the strategy pattern. It does not have to know about the details of the strategy.
Is this necessary or could it be implemented on a higher level like the loader prototype in ess?
Not really, you would need to fall back to h5py
, since scippnexus
will generally not be able to load groups that follow an application definition, e.g., because they define axes (dims) in a custom manner.
So what kind of modification to the behaviour of, e.g. NXdata does the application definition need to make which cannot be done as post processing?
Probably some could be made. But the post processing would need to reopen the file and read custom attributes, etc. So basically you could just go back and implemented your loader with plain h5py instead of scippnexus, as there might be little benefit.
Overall comment to the approach in #63 (comment) |
@jl-wynen I refactored to use decorators (I will move this to a separate file once we are happy with it). One thing that bothers me is the creation of instances of the strategies for each group. I have considering changing the |
I made this change, since it will make support for writing more convenient. Not perfectly happy with the result, but not worse that before either. |
@jl-wynen I managed to cleanup the mechanism significantly. Can you have a look? |
def __init__(self, data): | ||
self.data = data | ||
|
||
def __write_to_nexus_group__(self, group: NXobject): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think there is a need for a dunder method since we are not going to support writing wildly different objects from different libraries. A protocol like this should be sufficient:
class NXWritable(Protocol):
def write_to_nexus_group(self, group:NXobject):
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for not making this a normal method was that users may want to make their existing classes "writable", without implementing a new class, and without changing the publicly visible interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The dunder method is public. In my mind, a dunder method relates to a general convention across libraries / codebases. In this case here, we basically have duck typing, we just require a certain method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes public, but it does not interfere with tab completion, etc.
# Note that there is also an "uncertainties" attribute. It is not clear | ||
# to me what the difference is. | ||
coord.attrs['resolutions'] = 'Qdev' | ||
group.create_field('Qdev', sc.stddevs(da.coords['Q'])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
'resolution' and 'uncertainty' seem to be different concepts. I would interpret it as resolution referring to systematic effects (e.g. binning in Q, potentially implicitly induced by the detector resolution) and uncertainty to statistical errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, if someone actually uses both they will need to provide this in a different manner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But shouldn't we write uncertainties instead of resolutions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually people talk about "Q-resolution", and I do not think it is statistical. I don't know if anyone has decided how it will be stored, and for now just assumed they might use Scipp's variances.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, if you think this is safe. I would be worried that people treat a systematic uncertainty as though it was statistical and get bad fits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will definitely need to check with the scientists, for now a lot of this is incomplete anyway.
if isinstance(axes, str): | ||
axes = axes.split(" ") | ||
if axes is None: | ||
axes = group.attrs.get('I_axes') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be a string with multiple 'entries'? I.e. does this if
need to be swapped with the previous one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The documentation does not say. Will probably need to fix this in the future if it causes problems.
axes = axes.split(" ") | ||
if axes is None: | ||
axes = group.attrs.get('I_axes') | ||
if not isinstance(axes, list): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can axes
be a numpy array?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Checking for int
now instead.
class SASdataStrategy: | ||
|
||
@staticmethod | ||
def dims(group: NXobject) -> Tuple[str]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def dims(group: NXobject) -> Tuple[str]: | |
def dims(group: NXobject) -> Tuple[str, ...]: |
src/scippnexus/nxobject.py
Outdated
self.child_params = {} | ||
self._definition = definition | ||
self._strategy = self._default_strategy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self._strategy = self._default_strategy() |
self.data = data | ||
assert Q_variances in (None, 'uncertainties', 'resolutions') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I learned a surprising thing recently, assertions are removed when making optimised builds (e.g. in a wheel). So this should be a proper if-raise
construct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't know that either!
Experimental (and incomplete) support for reading NXcanSAS.
Also fixes #65.