Skip dtype_out_time options that cause errors? #202
Comments
Where is this error occurring? I'm sure we can catch it and skip rather than let it crash. I agree, there's no reason why we should let this crash the whole CalcSuite whereas other errors get caught and those Calcs skipped.
I agree with this but haven't given it much thought beyond that. If you have some ideas, a new Issue about this would be great!
The error occurs here: https://github.com/spencerahill/aospy/blob/develop/aospy/calc.py#L483-L497 The issue is that variables that are not time-defined do not have a https://github.com/spencerahill/aospy/blob/develop/aospy/calc.py#L531-L550 when we create the
Sounds good; when I get a chance I'll try and play around and see how easy/hard this might be to do already.
Cool, that all sounds good to me.
Labeling this "low-hanging fruit" with respect to the short-term fix. The bigger picture concerns are obviously more involved.
In talking to @micahkim23 offline about his initial stab at this in #242, it occurred to us that it makes more sense to catch this earlier on, namely before the Consider the case where a Instead, what I'm proposing is for us to catch this within
import logging

# Reductions that only make sense for time-defined variables.
# Referenced below as self._time_defined_reducs, i.e. as a class attribute.
_time_defined_reducs = ['av', 'std', 'reg.av', 'reg.std']

def create_calcs(self):
    """Generate a Calc object for each requested parameter combination."""
    specs_in = self._combine_core_aux_specs()
    specs_out = []
    for spec in specs_in:
        reducs_in = spec['dtype_out_time']
        if spec['variable'].def_time:
            # Time-defined variables support every requested reduction.
            reducs_out = list(reducs_in)
        else:
            reducs_out = []
            for reduc in reducs_in:
                if reduc not in self._time_defined_reducs:
                    # append, not +=, so a string is not split into characters
                    reducs_out.append(reduc)
                else:
                    logging.info('skipping this because blah blah')
        if reducs_out:
            spec['dtype_out_time'] = reducs_out
            specs_out.append(spec)
        else:
            logging.info('no valid reductions blah blah')
    return [Calc(CalcInterface(**sp)) for sp in specs_out]
So all of the There are a couple downsides, however:
Does anybody (esp. @spencerkclark) have any thoughts? We need to make a decision here one way or the other before @micahkim23 can proceed with #242. I would be in favor of going this route, although I'm deep enough into it that I may be missing something important.
@spencerahill @micahkim23 many thanks for bringing this up; it was something that had not occurred to me initially. I agree that we should do everything we can to prevent loading data when we don't need to (since that can be fairly time consuming). I'll need to think over this a little more, but I just have one quick question for now.
I could be missing something, but is there some way this could be handled directly in the
@spencerkclark yes we could definitely do that; meant to mention that. Doing it within We could do it in both. The logic in Extending that line of thought: the more I'm thinking about it, the more I like the idea that we would have built-in guards like this, ultimately for all of our classes. If a user inadvertently attempts to generate an object or submit a calculation that is inherently invalid, we should step in. #84 and #136 are in the same vein.
Forgot to mention, we should make a decision on this soon if possible, as @micahkim23 needs to wrap up his work for the quarter basically by the end of next week(end). I am in favor of implementing the checks in both places, What immediately comes to mind:
@spencerkclark let me know your take.
Could we get away with not modifying
_TIME_DEFINED_REDUCTIONS = ['av', 'std', 'reg.av', 'reg.std']

def _prune_invalid_time_reductions(self):
    valid_reductions = []
    if not self.var.def_time:
        for reduction in self.dtype_out_time:
            if reduction not in _TIME_DEFINED_REDUCTIONS:
                valid_reductions.append(reduction)
            else:
                logging.info('Skipping this time reduction, because ... ')
    else:
        valid_reductions = self.dtype_out_time
    self.dtype_out_time = valid_reductions
Having a check in @spencerahill what do you think? I think really this is just a discussion about aesthetics at this point :)
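To make the pruning behaviour above concrete, here is a minimal, self-contained sketch. FakeVar and FakeCalc are hypothetical stand-ins invented for illustration, not aospy classes; they model only a def_time flag and a dtype_out_time list, and the log message is a placeholder.

```python
import logging

_TIME_DEFINED_REDUCTIONS = ['av', 'std', 'reg.av', 'reg.std']


class FakeVar:
    """Hypothetical stand-in for a variable; only def_time is modeled."""

    def __init__(self, name, def_time):
        self.name = name
        self.def_time = def_time


class FakeCalc:
    """Hypothetical stand-in for Calc that prunes reductions on creation."""

    def __init__(self, var, dtype_out_time):
        self.var = var
        self.dtype_out_time = list(dtype_out_time)
        self._prune_invalid_time_reductions()

    def _prune_invalid_time_reductions(self):
        # Time-defined variables keep every requested reduction.
        if self.var.def_time:
            return
        valid_reductions = []
        for reduction in self.dtype_out_time:
            if reduction in _TIME_DEFINED_REDUCTIONS:
                logging.info('Skipping %r for %s: variable is not time-defined',
                             reduction, self.var.name)
            else:
                valid_reductions.append(reduction)
        self.dtype_out_time = valid_reductions


with_time = FakeCalc(FakeVar('with_time_dim', True), ['av', None])
without_time = FakeCalc(FakeVar('without_time_dim', False), ['av', None])
```

In this sketch the time-defined variable keeps both requested reductions, while the variable without a time dimension is left with only the None (no reduction) entry.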
It just occurred to me that an argument could be made for the following instead:
I think one could argue that @spencerahill what do you think about that argument?
@spencerkclark, glad as usual to have gotten your feedback, as I think your last suggestion (drop invalid reductions in
I agree, too automagical. @micahkim23, please proceed along these lines. Ping us when you're ready for a review. Thanks both!
Short term issue
@chuaxr has a use case for computed variables that do not have a time dimension (i.e. their def_time attribute is False). She also has a use case for variables that do. It would be nice if she could submit calculations in the main script for both of these variables at the same time, with specifications to compute no time reduction (e.g. dtype_out_time=None) and specifications to compute a time reduction (e.g. dtype_out_time='av'), without having things halted by an error. More explicitly, say one has two variables named without_time_dim and with_time_dim, wants to compute them for a single Run, and specifies dtype_out_time=['av', None]; the main script would create two Calc objects.
The first would seek to compute:
- without_time_dim with a time average
- without_time_dim with no time reduction
The second would seek to compute:
- with_time_dim with a time average
- with_time_dim with no time reduction
The first of these computations (without_time_dim with a time average) currently raises an error, and crashes the set of computations. We ignore errors on Calc objects themselves, so the calculations for the with_time_dim variable would be fine, but since without_time_dim with a time average would crash, aospy would not get to computing without_time_dim with no time reduction. For consistency and convenience, it seems like it could be a good idea to report, yet ignore, the exception caused by attempting to compute without_time_dim with a time average (rather than error), in a similar manner to what is done for Calc objects: aospy/aospy/automate.py, lines 244 to 256 in ce7c784.
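The "report, yet ignore" behaviour described above could look roughly like the following sketch. It is not the aospy implementation: apply_reduction and compute_all are hypothetical names, and the ValueError is an assumed failure mode standing in for whatever exception the real computation raises.

```python
import logging

TIME_REDUCTIONS = {'av', 'std', 'reg.av', 'reg.std'}


def apply_reduction(name, def_time, reduction):
    """Hypothetical computation; fails if a time reduction is requested
    for a variable that has no time dimension."""
    if reduction in TIME_REDUCTIONS and not def_time:
        raise ValueError(
            'cannot apply %r to %s: no time dimension' % (reduction, name))
    return (name, reduction)


def compute_all(name, def_time, reductions):
    """Attempt every reduction, logging failures instead of halting."""
    results, skipped = [], []
    for reduction in reductions:
        try:
            results.append(apply_reduction(name, def_time, reduction))
        except ValueError as err:
            logging.warning('Skipping reduction: %s', err)
            skipped.append(reduction)
    return results, skipped


results, skipped = compute_all('without_time_dim', False, ['av', None])
```

Here the failing 'av' reduction is logged and recorded in skipped, and the loop still reaches the None case, which is the outcome requested for without_time_dim.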
Longer term thoughts
This is a very specific issue caused by the current main-script / Calc pipeline approach, which might have a relatively simple fix, but it could also serve as motivation to explore ways to use aospy outside of this strict paradigm (e.g. it does not always make sense to compute the Cartesian product of all variables and options; right now I address this by creating separate main scripts, but I feel like there could be a better way to do this).
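One way to picture the Cartesian-product concern is to build the full product of variables and reductions and then filter out the combinations that cannot be computed. This is only an illustrative sketch; valid_combinations and the dictionary layout are assumptions, not part of aospy.

```python
import itertools

TIME_REDUCTIONS = {'av', 'std', 'reg.av', 'reg.std'}

# Hypothetical inputs: variable name -> whether it has a time dimension.
variables = {'with_time_dim': True, 'without_time_dim': False}
reductions = ['av', None]


def valid_combinations(variables, reductions):
    """Yield (name, reduction) pairs, dropping time reductions for
    variables that are not time-defined."""
    pairs = itertools.product(variables.items(), reductions)
    for (name, def_time), reduction in pairs:
        if reduction in TIME_REDUCTIONS and not def_time:
            continue
        yield name, reduction


combos = list(valid_combinations(variables, reductions))
```

With these inputs, three of the four combinations survive; only without_time_dim with 'av' is dropped.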