Join GitHub today
GitHub is home to over 28 million developers working together to host and review code, manage projects, and build software together.Sign up
2015-06-30 Remove wind trend from NLDAS #70
What do the data look like?
The full timeseries of NLDAS wind data aggregated across a bunch (100) of our lake sites.
But we can see it when we fit the data to a weibull distribution.
> fitdist(subset(sp, year>2001)$WindSpeed, 'weibull') Fitting of the distribution ' weibull ' by maximum likelihood Parameters: estimate Std. Error shape 3.005518 0.03401985 scale 5.830320 0.03101341 > fitdist(subset(sp, year<2001)$WindSpeed, 'weibull') Fitting of the distribution ' weibull ' by maximum likelihood Parameters: estimate Std. Error shape 3.012133 0.0247867 scale 5.372733 0.0210710
You can see the difference in the scale factor.
How do we handle this
I propose we scale the wind data after 2001 using a multiplication factor. (Proposed value, 0.921, which is the quotient of the before and after 2001 scale factors).
I'm going to see how RMSE changes if we do this. Trip report later.
As point of interest, here is the weibull paramters for scaled wind speed from NTL-LTER airport met station.
library(fitdistrplus) library(LakeMetabolizer) > fitdist(head(wind.scale.base(wnd$wnd, 3), 30000), 'weibull') Fitting of the distribution ' weibull ' by maximum likelihood Parameters: estimate Std. Error shape 1.636043 0.007673722 scale 3.690860 0.013661206
Lower than we get with NLDAS
Ok, the increase in bias suggests that perhaps the winds after 2001 are actually more characteristic than those before 2001 (though decrease RMSE suggests otherwise??).
Let's see what happens if we flip this correction and adjust the winds before 2001.
bfr_2001 = nldas$time < as.POSIXct('2001-12-31') nldas$WindSpeed[bfr_2001] = nldas$WindSpeed[bfr_2001] * 1.0857
Not really. For that last one, if you drop the outer 10%, RMSE goes to
resids = na.omit(all_cal$Observed_wTemp - all_cal$Modeled_wTemp) med = median(resids) high = med + 2*(quantile(resids, 19/20, na.rm=TRUE) - med) low = med + 2* (quantile(resids, 1/20, na.rm=TRUE) - med) inner_resids = resids[resids< high & resids > low] sqrt(mean(inner_resids^2, na.rm=TRUE))
I guess I was thinking that similar to how we dealt with the modeling errors before, we should toss out the really bad fits, so they don't steer us off. But, it doesn't look like there is enough of a change to make this something that should be applied.
That said, I assume the distribution fitting is pretty robust, but does it result in similar mean(wind^3) for the season? Since the power is the cube, those tails are really important, which is probably what you were seeing in the plot above with mean(wind) for the season. Do you feel confident that the shift is appropriate for the data?