RMSE_test calculation does not sample points along groundtruth grid edges properly #152
weiji14 added the bug 🪲 and data 🗃️ labels on Jun 15, 2019
weiji14 added a commit that referenced this issue on Jun 15, 2019:
Patch 054e295 so that the RMSE_test calculated in deepbedmap.ipynb matches that of srgan_train.get_deepbedmap_test_result. Basically run pygmt.grdtrack on an xarray.DataArray grid only, rather than on an xr.DataArray grid in srgan_train.ipynb and a NetCDF file grid in deepbedmap.ipynb, which produces slightly different results! Main issue with this is that the grdtrack algorithm samples fewer points than before, from 38112 down to 37829 now. This is because the edges of the grid are not properly sampled. Issue is documented in #152.
weiji14 added a commit that referenced this issue on Jun 17, 2019:
Update 2D, 3D and histogram plots in deepbedmap.ipynb for the 2007tx.nc test area using our newly trained Enhanced Super-Resolution Generative Adversarial Network (ESRGAN) model from https://www.comet.ml/weiji14/deepbedmap/0b9b232394da42e394998b112f628696. Had to change our residual_scaling hyperparameter default setting from 0.2 to 0.15 in a few places following the last commit in 83e956d. Like really, we need to find a way to set the correct residual_scaling and num_residual_blocks settings when loading from a trained .npz model file. Showcasing the best RMSE_test result of 43.57 achieved in the 2nd hyperparameter tuning frenzy in 83e956d. Note that the result is actually about 45.59 if we account for the borders properly (see issue #152), making the result not too different from the 45.35 reported in e27ac4a. However, the peak of the elevation error histogram is actually closer to that of the groundtruth with a mean of -25.37 instead of -94.75 (i.e. nearer to 0)! There's some checkerboard artifacts sure, and the errors at the 4 corners are off the chart for some reason, but I think we're definitely getting somewhere!!
weiji14 added a commit that referenced this issue on Jun 21, 2019:
Creating new groundtruth NetCDF grids using GMT surface, replacing the ones last created in b90bd74 in #112. Besides having updated to the GMT 6.0.0rc1 tagged release, the main change here is with using nicely rounded bounds (to 250 units in EPSG:3031) instead of arbitrary decimal points. This will really help resolve some of the problems with points not being included in our RMSE_test calculations near the grid's edges (see #152), and integer coordinates are just nicer to debug won't you say? Specifically, the data_prep.get_region was refactored to use `gmt info -I xxx.csv` instead of pure pandas, returning an `xmin/xmax/ymin/ymax` string that has an extended region optimized for `gmt surface`. There is a "surface [WARNING]: Your grid dimensions are mutually prime. Convergence is very unlikely" which I'm just gonna ignore for now. Note that data_prep.ascii_to_xyz was one-line patched to drop NaNs as there were some points in the WISE_ISODYN_RadarByFlight.XYZ file with missing elevation (z) values (since #112...) that was messing up gmt.info in the refactored data_prep.get_region. Unit tests have been modified accordingly, and the grids in the integration tests are now downloaded/created in folder /tmp to avoid messing with the actual files in highres. Matplotlib plots of the grids in data_prep.ipynb have been updated, and the new grids will be released in v0.9.2.
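To illustrate the "nicely rounded bounds" idea from the commit message above, here is a minimal pure-Python sketch that snaps a bounding box outward to multiples of 250 units. The helper name `round_region_outward` and the input coordinates are made up for illustration; this is not the refactored `data_prep.get_region`, and it does not claim to reproduce the exact region that `gmt info -I` returns.

```python
import math


def round_region_outward(xmin, xmax, ymin, ymax, spacing=250):
    """Snap bounds outward to the nearest multiples of `spacing`.

    Hypothetical helper for illustration only.
    """
    return (
        math.floor(xmin / spacing) * spacing,  # push minimum down
        math.ceil(xmax / spacing) * spacing,   # push maximum up
        math.floor(ymin / spacing) * spacing,
        math.ceil(ymax / spacing) * spacing,
    )


# Made-up EPSG:3031-style coordinates with arbitrary decimals
region = round_region_outward(-1593714.3, -1575464.1, -400323.2, -382249.9)

# Format as an xmin/xmax/ymin/ymax string
region_string = "/".join(str(value) for value in region)
print(region_string)
```

Rounding outward rather than to the nearest multiple means no input point ends up outside the region.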
Commit 054e295 in #151 highlighted a bug introduced in #149. Basically, `pygmt.grdtrack` differs in sampling points along the edges depending on whether we use an `xarray.DataArray` or `NetCDF file` grid input. See image below showing points sampling the 2007tx.nc grid, specifically the 2007t1.txt area.

Yes we do crop the 2007tx.nc grid by one pixel on the left, bottom, right and top (to make the image shape divisible by 4), but there are still some serious discrepancies.
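As a rough illustration of what a one-pixel crop on its own would do, the sketch below counts how many track points fall inside a bounding box before and after shrinking it by one 250 m pixel on every side. The helper `count_inside`, the bounds and the points are all made up, and this toy check does not model how `pygmt.grdtrack` actually treats grid edges.

```python
def count_inside(points, xmin, xmax, ymin, ymax):
    """Count (x, y) points that fall inside the inclusive bounds."""
    return sum(1 for x, y in points if xmin <= x <= xmax and ymin <= y <= ymax)


spacing = 250  # assumed pixel size in metres

# Made-up grid bounds, then the same bounds cropped by one pixel per side
full = (0, 2000, 0, 2000)
cropped = (
    full[0] + spacing,
    full[1] - spacing,
    full[2] + spacing,
    full[3] - spacing,
)

# Made-up track points, some of them close to the grid edges
points = [(100, 1000), (300, 1000), (1000, 1900), (1000, 1000), (1950, 50)]

n_full = count_inside(points, *full)
n_cropped = count_inside(points, *cropped)
print(n_full, n_cropped)
```

Running the same count against the bounds of each grid input is one way to check whether a drop in sampled points is explained by the crop alone.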
Number of points:
- `pygmt.grdtrack` sample on NetCDF file = 38112 points
- `pygmt.grdtrack` sample on xr.DataArray = 37829 points

Strangely enough, running it on an `xr.DataArray` captures more points on the top and bottom (y-direction) whereas running it on a `NetCDF file` captures more points on the left and right (x-direction).

How to fix
Adjust data_prep.xyz_to_grid to not use tight bounds from data_prep.get_region. Maybe buffer the input bounds by 250m * 3 pixels (the mask we set when running `pygmt.surface`) before running `blockmedian` and `surface`. This should mean we get closer to the actual total of 42995 points regardless of whether we run it on an `xarray.DataArray` or `NetCDF file`.

If not, then it might be a good idea to report this upstream to whoever wrote that wrapper for `pygmt.grdtrack` 😅
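The buffering suggestion could look something like the sketch below, which pads each side of the bounds by 3 pixels of 250 m (750 m in total). The helper name `buffer_region` and the example bounds are hypothetical; how the padded region would be wired into data_prep.xyz_to_grid is left open here.

```python
def buffer_region(xmin, xmax, ymin, ymax, spacing=250, pixels=3):
    """Pad a bounding box by `pixels` grid cells on every side.

    Hypothetical helper for illustration only.
    """
    pad = spacing * pixels  # 250 m * 3 pixels = 750 m
    return (xmin - pad, xmax + pad, ymin - pad, ymax + pad)


# Made-up tight bounds around the input points
tight = (0, 1000, 0, 2000)
buffered = buffer_region(*tight)
print(buffered)
```

The padded bounds would then replace the tight ones as the region for the block-median and surface gridding steps, so that points near the original edges sit well inside the grid.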