New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
added option to use forcing start/stop dates and misc improvements to… #93
Conversation
… time checking / loading of data
metsim/datetime.py
Outdated
else: | ||
raise NotImplementedError() | ||
raise NotImplementedError('freq %s not supported at this time' % freq) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer we try to use .format(...)
as much as possible. It looks like there's a couple uses of the %
formatting above as well that I didn't catch before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, though the logging module doesn't require the .format
method to be called. This works:
logger.info('just a {} example', 'little')
xref: https://docs.python.org/3/howto/logging-cookbook.html#using-custom-message-objects
ds.rename(var_dict, inplace=True) | ||
|
||
if start is not None and stop is not None: | ||
if start is not None or stop is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this have the chance to break? For example, if start=None
and stop!=None
we will end up with (in the line below) ds = ds.sel(time=slice(None, stop))
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ds = ds.sel(time=slice(None, stop))
will work. Just like this works:
In [1]: x = list(range(10))
In [2]: x
Out[2]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
In [3]: x[slice(None, 4)]
Out[3]: [0, 1, 2, 3]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, I had no idea that you could do that.
self._aggregate_state() | ||
logger.info("load_inputs") | ||
self.load_inputs() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wary of having self.load_inputs()
after self._aggregate_state()
. The whole reason we were doing the load
and close
in the IO routines was to avoid the situation where the state and input forcings are looking at the same file and undefined behavior occurred. It might be possible to reason through and prove that this situation won't occur with this setup, but I think it's easier to side with caution on this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm skeptical that this is actually a problem. This is all happening before we startup MetSim's multiprocessing which is where I'd image any odd behavior came from. The reason I like this method is that we fully define exactly what data we need in the previous lines before we ever load any data. I think this is preferable to blinding loading data we don't know if we need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point, and I think it's reasonable to say that no issues will occur after looking at it a bit more. I'm still a bit sensitive to that bug because it took so long to track down.
metsim/metsim.py
Outdated
else: | ||
self.params[p] = pd.to_datetime(self.params[p]) | ||
|
||
logger.info('start %s' % self.params['start']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
String formatting again. (And below)
metsim/metsim.py
Outdated
# Don't include complex types | ||
if isinstance(v, dict): | ||
elif isinstance(v, dict): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I like this being an if
more, since it's talking about a separate variable than the first if
statement. This allows for us to catch more than one situation, and is just a more cautious approach in general.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I misread this logic. Good catch.
end_record = self.params['start'] - pd.Timedelta("1 days") | ||
record_dates = date_range(begin_record, end_record, | ||
|
||
assert self.state.dims['time'] == 90 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really don't like this check, because it seems to limit some cases that are pretty useful in my mind. For example, if you have a long timeseries broken up into 1 year chunks you could previously just use the previous year's data as your state file. Now it looks like you'd have to preprocess all of those files and create an additional state file for each years input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really, our read_state
function handles this correctly now with improved slice selection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm open to other ideas for how to pin down the behavior in this method but I was finding it useful to explicitly check the state shape before going into the computations in this method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I didn't catch that's what you were doing with the state_start
and all that. In that case, I like this.
… time checking / loading of data
git diff upstream/master | flake8 --diff