Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

convert to RNetCDF and bring in crs handling fixes #87 #88

Closed
wants to merge 8 commits into from
Closed

convert to RNetCDF and bring in crs handling fixes #87 #88

wants to merge 8 commits into from

Conversation

dblodgett-usgs
Copy link
Contributor

Switching over to RNetCDF and adding in some goodies from ncmeta. I also converted the NetCDF testing over to test that and built out some basic expectations, hopefully that's ok.

I added a little block that attempts to get a CRS from the NetCDF object. In the case that there isn't one, it defaults to WGS84 lat/lon. This could be in error if a non-standard method of specifying a projection is used and the data set really does use projected coordinate variables. Would need some more test data to make sure we don't screw this up too bad. Maybe there's a good set of test data used in the GDAL driver we could tap into?

@mdsumner
Copy link
Member

mdsumner commented Jan 8, 2019

See my rasterwise repo for a decent testing set

@dblodgett-usgs
Copy link
Contributor Author

Woah -- there're some gnarly examples in rasterwise! Will have a dig through soon.

Looks like tests fail due to ncmeta dependencies... Not sure how you guys want to handle that? I've got a little addition related to #89 to throw in here too, can pile on this PR or hold off for a bit. Which would you rather?

@mdsumner
Copy link
Member

mdsumner commented Jan 8, 2019

Happy for you to pile on, I consider this my responsibility to get fixed - and I may be delayed - appreciate your PR this will be a good improvement

@dblodgett-usgs
Copy link
Contributor Author

Alright. Have a look at what I've got in here now, just checking for non-XYZT order and raising a warning right now. I think we could use something like this to remediate non-XYZT variables, but it's tricky between the request indices and how to reshape the data you get back... I haven't touched that. In my mind this should fix #89 as well as #87 for now. Would be worth some more testing to make sure it's solid. Will try and get to that later on.

@mdsumner
Copy link
Member

mdsumner commented Jan 9, 2019

We need to start requiring versions of development packages, so now I have merged your ncmeta PRs and bumped that to 0.0.3.9001 - so in stars' DESCRIPTION we need Imports:

ncmeta (>= 0.0.3.9001)

and

Remotes: hypertidy/ncmeta

to make it all work when installing from github.

Probably also should do this in branches, and I would tend to use a non-master branch in my fork for PRs (but I'm not strict about this and open to discussion).

I'm currently exploring the changes in ncmeta

@dblodgett-usgs
Copy link
Contributor Author

Doing some more testing based on my latest PR to ncmeta. Probably best to leave this PR open or merge this into a new branch until things have settled down.

Played around with the ncsub input a bit just now and am realizing some of what I need over in #86 can be accomplished like:

h <- "https://cida.usgs.gov/thredds/dodsC/new_gmo"
h_nc <- RNetCDF::open.nc(h)

ncmeta::nc_vars(h_nc)

# # A tibble: 12 x 5
# id name             type      ndims natts
# <dbl> <chr>            <chr>     <dbl> <dbl>
# 1     0 longitude        NC_FLOAT      1     5
# 2     1 longitude_bnds   NC_FLOAT      2     0
# 3     2 latitude         NC_FLOAT      1     5
# 4     3 latitude_bnds    NC_FLOAT      2     0
# 5     4 time             NC_DOUBLE     1     2
# 6     5 bounds_latitude  NC_DOUBLE     2     0
# 7     6 bounds_longitude NC_DOUBLE     2     0
# 8     7 pr               NC_FLOAT      3     3
# 9     8 tas              NC_FLOAT      3     3
# 10    9 tasmax           NC_FLOAT      3     3
# 11    10 tasmin          NC_FLOAT      3     3
# 12    11 wind            NC_FLOAT      3     5

var <- "pr"

dplyr::filter(ncmeta::nc_axes(h_nc), variable == var)$dimension
# [1] 2 1 4

ncmeta::nc_dims(h_nc)
# # A tibble: 5 x 4
# id name      length unlim
# <dbl> <chr>      <dbl> <lgl>
# 1     0 bound          2 FALSE
# 2     1 latitude     222 FALSE
# 3     2 longitude    462 FALSE
# 4     3 nb2            2 FALSE
# 5     4 time       22645 FALSE

test <- stars::read_ncdf(h, var = var, 
                         ncsub = cbind(start = c(1, 1, 1), 
                                       count = c(NA, NA, 1)))

plot(test)

screen shot 2019-01-10 at 10 27 45 am

@dblodgett-usgs
Copy link
Contributor Author

Working through files from rasterwise first bit of debugging just checked in. I added a test file from EUROCORDEX cut down to just one variable that uses both 2d and 1d coordinate variables (see screenshot below). Also found an edge case in the CRS conversions over in ncmeta.

screen shot 2019-01-11 at 2 47 41 pm

@mdsumner
Copy link
Member

@edzer could you please merge this into master? I'm not comfortable with doing it in my fork - I definitely get confused at this level with git.

@edzer
Copy link
Member

edzer commented Jan 23, 2019

The travis check fails, merging this into master moves that problem there, I'm afraid. Will merge if that is fixed. (appveyor is supposed to fail because it wants a windows binary for starsdata, which is not there)

@mdsumner
Copy link
Member

It's only the use of the gh version of ncmeta, but ok.

@mdsumner
Copy link
Member

@dblodgett-usgs please don't push any more to this branch - I'm not good at these remote merges, I'll try to find out how to better manage this kind of update :)

This was referenced Jan 23, 2019
@mdsumner
Copy link
Member

mdsumner commented Jan 24, 2019

Quick summary:

  • RNetCDF change submitted in R netcdf-1 #93
  • changes here have implications regarding import dependencies, which need more attention (ncmeta on gh, dplyr not previous imported ...)
  • there are some inconsistencies in the CRS interpretation a la problem in grid mapping prj hypertidy/ncmeta#24 which I don't understand completely yet
  • some of the CRS-checking could be better done by stars itself, via GDAL (I'll investigate)
  • equivalent tests that involved extra nc files can be handled in ncmeta, I think

Again, apologies for thinking this was more straightforward, it's very helpful but I went about things the wrong way. More soon!

@edzer
Copy link
Member

edzer commented Jan 31, 2019

Do I understand correctly that this hangs on a modification of ncmeta which is not yet on CRAN?

I'll have to release stars soon (it has Errors on CRAN results now - fedora/GDAL issues), and am not sure whether to wait for the reorganization done here.

I'm also trying to push coverage to 100%, it would be great if one of you could help doing this for ncdf.R.

@mdsumner
Copy link
Member

Yes I'd say this is stalled - I have discovered too much that's required. (Overall in stars I think we should just lean on GDAL which already sorts this all out, but again that takes more stuff).

I'll look at tests!

@mdsumner
Copy link
Member

Actually, what is the point of 100% test coverage? It will be a substantial bit of pain to craft suitable files, I don't really want to do that right now.

@edzer
Copy link
Member

edzer commented Jan 31, 2019

I get your point, and I use #nocov, or #nocov start / #nocov end blocks to indicate where I don't want to go through this effort. I've found out too often however that code that I thought must work didn't actually do what I thought it should until I forced to test...

It's the difference between "might work" and "has actually worked at least once". Trust me, after doing this you'll sleep better at night.

@dblodgett-usgs
Copy link
Contributor Author

If you guys want to open up some issues and assign them to me to help get this across the finish line, please do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants