Data has wrong time dimension when using delta biasCorrection #31

Closed
matteodefelice opened this issue Sep 14, 2016 · 3 comments
matteodefelice commented Sep 14, 2016

Here, I have an observational dataset (obs) and a forecast (fcst).

> str(obs)
List of 4
 $ Variable:List of 2
  ..$ varName: chr "var167"
  ..$ level  : NULL
  ..- attr(*, "is_standard")= logi FALSE
  ..- attr(*, "units")= chr "undefined"
  ..- attr(*, "longname")= chr "undefined"
  ..- attr(*, "daily_agg_cellfun")= chr "none"
  ..- attr(*, "monthly_agg_cellfun")= chr "none"
  ..- attr(*, "verification_time")= chr "none"
 $ Data    : num [1:72, 1:35, 1:50] 292 294 295 294 295 ...
  ..- attr(*, "dimensions")= chr [1:3] "time" "lat" "lon"
 $ xyCoords:List of 2
  ..$ x: num [1:50] -12 -11.25 -10.5 -9.75 -9 ...
  ..$ y: num [1:35] 32.2 33 33.7 34.5 35.2 ...
  ..- attr(*, "projection")= chr "LatLonProjection"
  ..- attr(*, "resX")= num 0.75
  ..- attr(*, "resY")= num 0.75
  ..- attr(*, "interpolation")= chr "nearest"
 $ Dates   :List of 2
  ..$ start: chr [1:72] "1984-06-30 18:00:00 GMT" "1984-07-31 18:00:00 GMT" "1984-08-31 18:00:00 GMT" "1985-06-30 18:00:00 GMT" ...
  ..$ end  : chr [1:72] "1984-06-30 18:00:00 GMT" "1984-07-31 18:00:00 GMT" "1984-08-31 18:00:00 GMT" "1985-06-30 18:00:00 GMT" ...
 - attr(*, "dataset")= chr "/opt/data/ERAIN-T2M/ERAIN-t2m-1983-2012.mon.EUROPE.nc"
> str(fcst)
List of 6
 $ Variable           :List of 2
  ..$ varName: chr "tas"
  ..$ level  : NULL
  ..- attr(*, "use_dictionary")= logi TRUE
  ..- attr(*, "description")= chr "2 metre temperature @ Ground or water surface"
  ..- attr(*, "units")= chr "degrees Celsius"
  ..- attr(*, "longname")= chr "2-meter air temperature"
  ..- attr(*, "daily_agg_cellfun")= chr "none"
  ..- attr(*, "monthly_agg_cellfun")=function (x, ...)  
  ..- attr(*, "verification_time")= chr "none"
 $ Data               : num [1:72, 1:35, 1:50] 21.8 23.3 23.2 19.1 21.1 ...
  ..- attr(*, "dimensions")= chr [1:3] "time" "lat" "lon"
 $ xyCoords           :List of 2
  ..$ x: num [1:50] -12 -11.25 -10.5 -9.75 -9 ...
  ..$ y: num [1:35] 32.2 33 33.7 34.5 35.2 ...
  ..- attr(*, "projection")= chr "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs +towgs84=0,0,0"
 $ Dates              :List of 2
  ..$ start: chr [1:72(1d)] "1984-06-01 00:00:00 GMT" "1984-07-01 00:00:00 GMT" "1984-08-01 00:00:00 GMT" "1985-06-01 00:00:00 GMT" ...
  ..$ end  : chr [1:72(1d)] "1984-06-30 18:00:00 GMT" "1984-07-31 18:00:00 GMT" "1984-08-31 18:00:00 GMT" "1985-06-30 18:00:00 GMT" ...
 $ InitializationDates: chr [1:26] "1984-05-01 00:00:00 GMT" "1985-05-01 00:00:00 GMT" "1986-05-01 00:00:00 GMT" "1987-05-01 00:00:00 GMT" ...
 $ Members            : chr "Member_1"
 - attr(*, "dataset")= chr "System4_seasonal_15"
 - attr(*, "source")= chr "ECOMS User Data Gateway"
 - attr(*, "URL")= chr "<http://meteo.unican.es/trac/wiki/udg/ecoms>"

When I apply a delta bias correction with the command:

cal <- biasCorrection(y = obs,
                      x = fcst,
                      newdata = fcst,
                      cross.val = 'loocv',
                      method = "delta")

I get a grid with the wrong time dimension: Data has 1656 time steps instead of 72, while Dates still has 72 entries:

> str(cal)
List of 6
 $ Variable           :List of 2
  ..$ varName: chr "var167"
  ..$ level  : NULL
  ..- attr(*, "is_standard")= logi FALSE
  ..- attr(*, "units")= chr "undefined"
  ..- attr(*, "longname")= chr "undefined"
  ..- attr(*, "daily_agg_cellfun")= chr "none"
  ..- attr(*, "monthly_agg_cellfun")= chr "none"
  ..- attr(*, "verification_time")= chr "none"
 $ Data               : num [1:1656, 1:35, 1:50] 295 296 297 294 296 ...
 $ xyCoords           :List of 2
  ..$ x: num [1:50] -12 -11.25 -10.5 -9.75 -9 ...
  ..$ y: num [1:35] 32.2 33 33.7 34.5 35.2 ...
  ..- attr(*, "projection")= chr "LatLonProjection"
  ..- attr(*, "resX")= num 0.75
  ..- attr(*, "resY")= num 0.75
  ..- attr(*, "interpolation")= chr "nearest"
 $ Dates              :List of 2
  ..$ start: chr [1:72(1d)] "1984-06-01 00:00:00 GMT" "1984-07-01 00:00:00 GMT" "1984-08-01 00:00:00 GMT" "1985-06-01 00:00:00 GMT" ...
  ..$ end  : chr [1:72(1d)] "1984-06-30 18:00:00 GMT" "1984-07-31 18:00:00 GMT" "1984-08-31 18:00:00 GMT" "1985-06-30 18:00:00 GMT" ...
 $ InitializationDates: chr [1:26] "1984-05-01 00:00:00 GMT" "1985-05-01 00:00:00 GMT" "1986-05-01 00:00:00 GMT" "1987-05-01 00:00:00 GMT" ...
 $ Members            : chr "Member_1"
 - attr(*, "dataset")= chr "/opt/data/ERAIN-T2M/ERAIN-t2m-1983-2012.mon.EUROPE.nc"

This does not happen with the other bias correction methods. It seems to be a bug related to cross-validation (CV): without it, the delta method works fine.

@miturbide
Member

Thanks for reporting!
As you said, there was a bug when applying the "delta" method with CV.
The error stemmed from a particularity of the "delta" correction, which is applied in the following general manner (when cross.val = "none"):
y + (mean(newdata) - mean(x))

Thus, the resulting object has the same time dimension as the observation data (y).
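
As a minimal sketch of the formula above, assuming a single grid point and synthetic values: the helper name `delta_correct` and all numbers are hypothetical and are not taken from the downscaleR source. It only illustrates that the output inherits its length from y, whatever the length of newdata.

```r
# Hypothetical helper, not part of downscaleR: delta correction at one grid point.
delta_correct <- function(y, x, newdata) {
  # Shift the observations by the change in the model mean.
  y + (mean(newdata) - mean(x))
}

set.seed(1)
y <- rnorm(72, mean = 20)        # synthetic observations, 72 time steps
x <- rnorm(72, mean = 22)        # synthetic model data for the calibration period
newdata <- rnorm(30, mean = 23)  # synthetic model data to correct, a different length on purpose

out <- delta_correct(y, x, newdata)
length(out)  # same length as y, not as newdata
```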

When cross.val = "loocv" or "kfold", this formula was applied once per test partition (that is, once per year or fold):
y[train years/folds] + (mean(x[test year/fold]) - mean(x[train years/folds])),

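thus, the time dimension = (number of years/folds) * (number of time steps in each set of training years/folds). In the case reported here, 24 years * 69 training time steps = 1656.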

We have made some changes in this respect. Now, when applying the "delta" method with CV, "y" is subset to the same time period as the test data (the daily series of 1 year for "loocv" or the daily series of 1 fold for "kfold"). The "delta" method is now applied as follows when cross.val != "none":

y[test year/fold] + (mean(x[test year/fold]) - mean(x[train years/folds])),

Thus, the binding of the outputs corresponding to all years/folds gives the correct time dimension.
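
The difference between the two CV formulas can be sketched in base R. This is a hypothetical re-implementation at a single grid point with synthetic values, not the downscaleR code; the function `delta_loocv` and its argument `subset_y_to_test` are invented for illustration. The series length (24 years of 3 monthly steps, 72 in total) is chosen to match the grids in this issue.

```r
# Synthetic single-grid-point series: 24 years x 3 months = 72 time steps.
# Values are made up; only the lengths mirror the reported case.
set.seed(42)
years <- rep(1984:2007, each = 3)
y <- rnorm(72, mean = 293)  # synthetic observations
x <- rnorm(72, mean = 21)   # synthetic forecast

# Hypothetical leave-one-year-out delta correction (not the package code).
delta_loocv <- function(y, x, years, subset_y_to_test) {
  folds <- lapply(unique(years), function(yr) {
    test <- years == yr
    delta <- mean(x[test]) - mean(x[!test])
    # Old behaviour adds delta to the training part of y,
    # new behaviour adds it to the test part of y.
    base <- if (subset_y_to_test) y[test] else y[!test]
    base + delta
  })
  unlist(folds)  # bind the per-fold outputs along time
}

old <- delta_loocv(y, x, years, subset_y_to_test = FALSE)
new <- delta_loocv(y, x, years, subset_y_to_test = TRUE)
length(old)  # 24 folds * 69 training steps = 1656
length(new)  # 24 folds * 3 test steps = 72
```

With the training subset of y, each of the 24 folds contributes 69 values, which reproduces the 1656 time steps seen in the report. With the test subset, each fold contributes 3 values and the bound result has the expected 72.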

@matteodefelice
Contributor Author

When are you going to release a new master version? Or can I switch to the devel branch?

@jbedia
Member

jbedia commented Oct 5, 2016

Hi Matteo, we are about to move to downscaleR 2.0-0. The data transformation/manipulation tools have been moved to transformeR, which now becomes a dependency of downscaleR. We still have to update all the documentation, but this is ready for testing, so you can switch to the devel version now. Your feedback will be welcome!

devtools::install_github(c("SantanderMetGroup/transformeR",
                           "SantanderMetGroup/downscaleR@devel"))
