Conversation
Force-pushed ffe6dd8 to 018276a
…et to work with three data members.
…orbed WithObs variant.
…ind model training added to examples README.
…`` class to enable passing ``xr.Dataset`` and like objects directly to ``BatchHandlers`` without needing to invoke ``.data``
…ueueing, since batches can be sampled directly if there are none in the queue).
…d dequeueing since this would cast to tensors.
…h ``max_workers > 1``.
… just sampling is, except for macos, but training is not.
… files and replacement.
… files and replacement.
… files and replacement.
…classes
fix: update tensor initialization to use tf.cast for consistency
refactor: simplify Loader initialization in ForwardPassStrategy
refactor: improve docstring handling in Loader class
test: convert observation mask to numpy array in test_fixed_wind_obs
logger.info(msg)
return total_grad, loss_details


def _get_parallel_grad(
Does any of this need to be decorated with @tf.function? Am I crazy, or did you remove this? If so, why?
This is actually a new helper function that was factored out of run_gradient_descent. It couldn't have a @tf.function decorator since it was calling np.array_split.
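To illustrate why such a helper resists @tf.function: np.array_split is a plain NumPy call, so the chunking wrapper stays eager while only the per-chunk work can be graph-compiled. A minimal NumPy-only sketch of the pattern (function and parameter names are hypothetical, not sup3r's actual signatures):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor


def get_parallel_grad(batch, grad_fn, max_workers=2):
    """Split a batch into chunks with np.array_split and average the
    per-chunk results. Because np.array_split is a Python/NumPy call,
    a wrapper like this cannot itself be traced by @tf.function; only
    grad_fn (the per-chunk gradient computation) could carry that
    decorator."""
    chunks = np.array_split(batch, max_workers)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        grads = list(pool.map(grad_fn, chunks))
    # Aggregate chunk gradients (here a simple mean, as a stand-in).
    return np.mean(grads, axis=0)
```

A trivial stand-in for grad_fn (e.g. a per-chunk mean) shows the split-map-aggregate flow without pulling in TensorFlow.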
class Sup3rGanWithObs(Sup3rGan):
    """Sup3r GAN model which includes mid network observation fusion. This
    model is useful for when production runs will be over a domain for which
    observation data is available."""
Somewhere, probably in this header, can you describe the nuances of training with (fake) observations vs. inference with real observations? I think the functions described with "obs for current batch" are for fake observations from training data, but it's not really clear.
I also want to know how someone would pass in real observations at inference time. You don't subclass the generate() function here, so I don't see anywhere that would describe how to provide real observations in a fwp.
Yeah, thanks for this feedback. It's hard to get this perspective when you've been "in it" for a while. Obs are used in fwp the same way as other exogenous data sources, but the ObsRasterizer should be specified (this allows NaNs to make it through the rasterization), or obs feature names should include the suffix "_obs" to automatically use the ObsRasterizer.
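The suffix-based dispatch described above could be sketched roughly like this; the function and class names here are placeholders for illustration, not sup3r's actual API:

```python
def pick_rasterizer(feature, rasterizer=None):
    """Illustrative dispatch only: pick an observation-aware rasterizer
    (which lets NaNs pass through) when the feature name ends in
    '_obs', otherwise fall back to the standard exo rasterizer. An
    explicitly specified rasterizer always wins."""
    if rasterizer is not None:
        return rasterizer
    if feature.endswith('_obs'):
        return 'ObsRasterizer'   # NaN-preserving observation path
    return 'ExoRasterizer'       # standard exogenous-data path
```

So at inference time a user would either name the feature e.g. 'u_10m_obs' or pass the ObsRasterizer explicitly, and the observation data then flows through the forward pass like any other exo source.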
    step and will be converted to a list
    """  # noqa : D301
    if isinstance(steps, dict):
        for feat, entry in steps.items():
This whole init statement is really convoluted and confusing, and I'm pretty sure the entire purpose of this logic is to accommodate inputs that are not correctly formatted. Do we really need to be this flexible? I think decisions like this make things way too complex. My suggestion would be to just have a ton of assert statements in this init to make sure things are formatted correctly, and then have static methods that could massage things into the correct format.
Also, nitpick: no vertical spacing for this entire block is insane. No way this is one logical group of statements.
Yeah, this is definitely fair. I got tired of not being able to pass exo data to generate with just a dict like {'topography': np.array(...)}, instead needing the nested dict described in the current doc. The nested dict was motivated by the need to work with multi-step models, which require a lot more complexity, but we've moved away from multi-step models a bit in favor of "multi-step" pipelines (e.g. spatial config, temporal config). Maybe this could allow us to simplify?
Is this a classic example of the challenge of having a single object do all the things? Trying to do: exo data for multiple features, multiple model steps, and multiple types of concat? Would having more atomic objects help with this? Like having an exo class for each of these options, or having a class template for a single exo layer that can be stacked by another class.
This might also be a good example of where flat is better than nested. A flat list of exo objects would be really easy to parse through and could be handled the same for a single GAN and a multi-step GAN.
What if the ExoDataSingle class was just a data class, ExoDataSingle(data, fname, step_ind, ctype, model_ind=0), and ExoData was just a list of these? They would be super easy to parse through: for single in ExoData, if single.attrs == desired_attrs then use this object; raise if desired_attrs is not found.
Sounds like a good opportunity for a refactor, but probably in a separate PR after we do a release with all of this PR's work. Right now just glad you simplified the init logic, thanks!
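The flat ExoDataSingle idea proposed in this thread could be sketched as follows; this is a minimal sketch under the field names given in the comment, not sup3r's implementation:

```python
from dataclasses import dataclass


@dataclass
class ExoDataSingle:
    """One exo layer as a plain record, per the review proposal:
    ExoDataSingle(data, fname, step_ind, ctype, model_ind=0)."""
    data: object
    fname: str
    step_ind: int
    ctype: str
    model_ind: int = 0


class ExoData(list):
    """A flat list of ExoDataSingle records. Parsing is a single loop,
    handled identically for a single GAN or a multi-step pipeline."""

    def get(self, **attrs):
        # Return the first record matching all desired attrs;
        # raise if no record matches.
        for single in self:
            if all(getattr(single, k) == v for k, v in attrs.items()):
                return single
        raise KeyError(f'No exo entry matching {attrs}')
```

A consumer would then just do `exo.get(fname=..., step_ind=...)` instead of walking a nested dict.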
bnb32 left a comment:
Would you prefer more doc info elsewhere or do you think these updates are sufficient? Also, lmk what you think on the ExoData response.
…added tf.function back to _get_parallel_grad.
…e / training only designation
The commit history is a mess, but the cumulative effect is almost entirely in sup3r/preprocessing/rasterizers/exo.py and sup3r/models/with_obs.py. Lots of tiny edits re: refactoring, adding docstrings, etc.