-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
first attempt at adding basic bias correction code and fixing tests #2
first attempt at adding basic bias correction code and fixing tests #2
Conversation
…o feature/fix_failing_tests
Interesting, you're getting this error because of https://github.com/dgergel/xsd/blob/2f76a3a02a5cccf962d3c801f1603c545ced967b/skdownscale/pointwise_models/core.py#L46 where it's trying to get the Stacktrace just before this error looks like:
|
Looks like input to the It also looks like dimensions are transposed in the output... So if we hackishly change def _datafactory(x, start_time="1950-01-01"):
"""Populate xr.Dataset with synthetic data for testing"""
start_time = str(start_time)
if x.ndim != 1:
raise ValueError("'x' needs dim of one")
out = xr.Dataset(
{"fakevariable": (["time", "lon", "lat"], x[:, np.newaxis, np.newaxis])},
coords={
"time": xr.cftime_range(
start=start_time, freq="D", periods=len(x), calendar="noleap"
),
"lon": (["lon"], [1.0]),
"lat": (["lat"], [1.0]),
},
)
return out["fakevariable"] The test completes but output values differ from the goal:
What say you, @dgergel ? You're more familiar with the method, does this Edit: If this is correct, then we need to change input to DataArrays and change docstrs to reflect this. |
@brews inputs to the *.fit() and *.predict() methods for the PointWiseDownscaler do need be DataArrays unless we set it up a bit more cleverly. I should have paid closer attention to that in the test code when I was looking at it earlier. As for what's happening with the outputs, I'm going to do a bit more exploring/testing myself to figure that out - I don't immediately know what's going on. But I don't think those outputs are right. More soon. |
@brews I've been digging into this and I think something is going wrong with how the bias correction code is ingesting our test data, which seems to be due to a mismatch between what it expects and what we're giving it. For example, in the outputs, given that our training and testing data is 0 - 1825 +/- 10, our bias corrected values should not be negative (or equal to -725). I tried a couple things, for example changing the time index in case the time grouper was struggling with a cftime index, e.g.
But that didn't do it. I tested the branch of scikit-downscale that we're using with the global bias correction workflow and looked at a timestep of the output before we tried it here, so I'm thinking there is a problem bw what PointwiseDownscaler is expecting and the test data we're giving it as I mentioned earlier. I think removing PointwiseDownscaler and just testing on a gridcell with a test DataFrame and the BCSDTemperature model is probably the best next step, unless you have other ideas? |
@dgergel So if I follow you right, you're saying this is an upstream package problem with In that case, I think that trying to get this to work without Given that the our fake data for this test case is only a single grid point anyways... Can we make the test work with a Regardless, I say go ahead and drop (And stop me anytime if you want to walk through anything or you have any questions) |
I tried something different for this failing test. I fed it
|
From meeting today sounds like this is actually working as expected and unexpected numbers stem from quantile mapping in addition to a bias correction. Potential solutions are:
Choose whichever is less work at this point. In addition we need to ensure input data to fit the model is passed in as a DataArray, not a Dataset as was the original setup. Brainstorming for later work: At some point in the future we're going to need to allow users to choose what variable name to extract from output input data. The alternative is to allow only a single variable in each input dataset, but this feels like it might be clunky and too implicit. Do we allow users to have a different variable name for input files or does the variable name need to be the same for all files. Just things to chew on for later. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work @dgergel ! Thanks again for putting this together.
The automated tests passed and that's your goal!
I have a few suggestions but nothing essential to fix before this merges. I don't think any of this is going to immediately break anything -- and we're still prototyping everything -- so it's okay if we get some of this in a later PR. Your call.
Go ahead and merge when ready!
Edit: Also let me know if anything is unclear or if you want to discuss things further.
dodola/core.py
Outdated
observation data for building quantile map | ||
gcm_predict_ds : Dataset | ||
future model data to be bias corrected | ||
predicted : Dataset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this "predicted" arg was renamed or removed. If so I'd remove the two docstr lines for predicted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch - it should be ds_predicted
instead of predicted
, thanks!
dodola/core.py
Outdated
# import numpy as np | ||
# import xarray as xr | ||
# from skdownscale.pointwise_models import PointWiseDownscaler, BcsdTemperature | ||
import numpy as np |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think can remove numpy
and xarray
imports - they're not used.
dodola/core.py
Outdated
|
||
|
||
def morenerdymathstuff(*args): | ||
raise NotImplementedError | ||
# TO-DO: implement additional bias correction functionality | ||
return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you're set to just remove morenerdymathstuff()
alltogether. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed - done 👍
@brews all of your comments should be addressed, so this is ready to merge (I'd do it myself but looks like I don't have merge privileges - can you update that btw?) |
This PR adds in some super basic bias correction code as well as a dependency on a forked version of scikit-downscale with an interim fix for a current bug in PointwiseDownscaler.