Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

added option to use forcing start/stop dates and misc improvements to… #93

Merged
merged 2 commits into from Aug 17, 2017
Merged

added option to use forcing start/stop dates and misc improvements to… #93

merged 2 commits into from Aug 17, 2017

Conversation

jhamman
Copy link
Member

@jhamman jhamman commented Aug 11, 2017

… time checking / loading of data

@jhamman jhamman requested a review from arbennett August 11, 2017 19:13
else:
raise NotImplementedError()
raise NotImplementedError('freq %s not supported at this time' % freq)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer we try to use .format(...) as much as possible. It looks like there's a couple uses of the % formatting above as well that I didn't catch before.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, though the logging module doesn't require the .format method to be called. This works:

logger.info('just a {} example', 'little')

xref: https://docs.python.org/3/howto/logging-cookbook.html#using-custom-message-objects

ds.rename(var_dict, inplace=True)

if start is not None and stop is not None:
if start is not None or stop is not None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this have the chance to break? For example, if start=None and stop!=None we will end up with (in the line below) ds = ds.sel(time=slice(None, stop)).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ds = ds.sel(time=slice(None, stop)) will work. Just like this works:

In [1]: x = list(range(10))

In [2]: x
Out[2]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [3]: x[slice(None, 4)]
Out[3]: [0, 1, 2, 3]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, I had no idea that you could do that.

self._aggregate_state()
logger.info("load_inputs")
self.load_inputs()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wary of having self.load_inputs() after self._aggregate_state(). The whole reason we were doing the load and close in the IO routines was to avoid the situation where the state and input forcings are looking at the same file and undefined behavior occurred. It might be possible to reason through and prove that this situation won't occur with this setup, but I think it's easier to side with caution on this one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm skeptical that this is actually a problem. This is all happening before we startup MetSim's multiprocessing which is where I'd image any odd behavior came from. The reason I like this method is that we fully define exactly what data we need in the previous lines before we ever load any data. I think this is preferable to blinding loading data we don't know if we need.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point, and I think it's reasonable to say that no issues will occur after looking at it a bit more. I'm still a bit sensitive to that bug because it took so long to track down.

metsim/metsim.py Outdated
else:
self.params[p] = pd.to_datetime(self.params[p])

logger.info('start %s' % self.params['start'])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

String formatting again. (And below)

metsim/metsim.py Outdated
# Don't include complex types
if isinstance(v, dict):
elif isinstance(v, dict):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I like this being an if more, since it's talking about a separate variable than the first if statement. This allows for us to catch more than one situation, and is just a more cautious approach in general.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. I misread this logic. Good catch.

end_record = self.params['start'] - pd.Timedelta("1 days")
record_dates = date_range(begin_record, end_record,

assert self.state.dims['time'] == 90
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't like this check, because it seems to limit some cases that are pretty useful in my mind. For example, if you have a long timeseries broken up into 1 year chunks you could previously just use the previous year's data as your state file. Now it looks like you'd have to preprocess all of those files and create an additional state file for each years input.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really, our read_state function handles this correctly now with improved slice selection.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm open to other ideas for how to pin down the behavior in this method but I was finding it useful to explicitly check the state shape before going into the computations in this method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I didn't catch that's what you were doing with the state_start and all that. In that case, I like this.

@arbennett arbennett merged commit c117570 into UW-Hydro:develop Aug 17, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants