Support for reading station data timeseries (NetCDF example) #30
Comments
You probably know this, but just fyi you can see what GDAL thinks of it with For gdalinfo you also need |
@mdsumner thanks, yes unfortunately as far as I can tell none of the software I use, gdal and panoply included, natively recognizes the created file as a station timeseries file. |
Ok great, thanks. I think we'll always be able to find examples that GDAL can't handle generally; the worst part is that it has to unroll 3D+ arrays as bands (fields, columns, layers in other words) and anything using it must re-infer that structure. I'll be pushing for more general NetCDF support, and tidync is pretty well suited for this; it's on my radar to spend some time on tidync/stars as a possible way to use NetCDF more directly. The tidync parts used to get stuff into stars will be relatively simple to port out. The question for stars will be how far to push GDAL as a convenient way to shortcut NetCDF support, versus when we bang into inherent limitations. I think GDAL is not a good choice; we made this mistake in a commercial context several years ago. In hindsight it would have been better to use generic NetCDF support, because ultimately the geographic assumptions about the first two dimensions bite, and having to infer dimensionality from spread-out band metadata is kind of flaky. Interestingly, direct use of ncdf4 is what raster did, and that allows a lot more general use than GDAL does across many sources, but raster is a lot more limiting than stars aspires to. |
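To make the band-unrolling point concrete, here is a minimal base R sketch. The array sizes, the dimension names, and the choice that time varies fastest across bands are assumptions made for illustration; this does not call GDAL and may not match how any particular driver orders its bands.

```r
# Hypothetical 4D variable: x, y, time, level (sizes chosen for illustration)
dims <- c(x = 4, y = 3, time = 5, level = 2)
cube <- array(seq_len(prod(dims)), dim = dims)

# A band-oriented reader only sees x, y and a flat band index:
# time and level are unrolled into 5 * 2 = 10 bands.
n_band <- prod(dims[c("time", "level")])
bands <- array(cube, dim = c(dims[["x"]], dims[["y"]], n_band))

# To get the structure back, the (time, level) position of every band has to
# be re-inferred. Here the sizes are known; with GDAL they would have to be
# pieced together from per-band metadata.
band_index <- arrayInd(seq_len(n_band), .dim = dims[c("time", "level")])
colnames(band_index) <- c("time", "level")

band_index[7, ]                         # which time/level band 7 belongs to
restored <- array(bands, dim = dims)    # rebuild the 4D array
identical(restored, cube)
```

Reading the variable with its native dimensions avoids the re-inference step altogether, which is the argument for going to NetCDF directly.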
From our discussions at EGU I gathered that @edzer is fully aware of GDAL's limitations in this sense and is thinking on how to push this matter forward (upstream?). As far as NetCDF goes, NetCDF's power and weakness is that its possibilities are almost infinite, I agree there's no way to cover all use-cases. However the CF-Conventions are very complete and useful. In my ideal world I would like to cover all the use-cases listed in the Conventions with |
I had an idle hope that OGR would be able to deal with such data. I am totally with @mdsumner that it is better to read NetCDF directly rather than through GDAL if we want to represent data cubes generally. I took the GDAL route because it seemed easiest, so far, but it was a lot of work because one needs to get pretty much everything from the GDAL metadata tags. Nasty and messy. @mdsumner maybe time for a call to discuss aligning stars and tidync? |
Yes, I think they are already pretty complementary - I found it smoother than expected to align and a stars-like approach to metadata is definitely a gap for tidync. Here's some examples |
A minor addition: my attempt at convenient metadata extraction from NetCDF is ncmeta (on CRAN). It has obvious functions |
Cool! units is one, coordinate reference systems / datums another, and 360-day years a third! |
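As an illustration of why 360-day years need special handling, here is a small base R sketch that decodes day offsets in a 360-day calendar (every month has 30 days). The function name decode_360_day and the assumption that the origin is 1 January of origin_year are hypothetical; this is not taken from any package mentioned in this thread.

```r
# Decode "days since <origin_year>-01-01" in a 360-day calendar.
# Dates such as 30 February exist here, so Date/POSIXct cannot represent them.
decode_360_day <- function(days_since, origin_year = 1970) {
  days <- floor(days_since)
  data.frame(
    year  = origin_year + days %/% 360,
    month = (days %% 360) %/% 30 + 1,
    day   = days %% 30 + 1
  )
}

# Offsets 59 and 419 both land on day 30 of month 2
decode_360_day(c(0, 59, 360, 419))
```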
Just starting to wrap my head around |
@dblodgett-usgs That looks very interesting, thanks for the package! |
Just a point of reference for the discussion here. Recent contributions I've made to |
@dblodgett-usgs is ncdfgeom something you'd like to bring to CRAN, or would you alternatively like to integrate it in some package already on CRAN? |
I would like to, but time constraints are heavy. Let me ask around. We usually have to write a peer reviewed paper to release a package but since the package implements the CF conventions -- which were heavily peer reviewed, maybe that will fly. That said, if there were a package I could contribute to and someone else wanted to be the longer-term owner of it... I would not hesitate to go that way! |
I've initiated the review process to get the package approved for CRAN. I'll need to convert it to use RNetCDF and ncmeta, but that won't be hard. Will keep this issue updated on the progress. |
FYI, I've got https://usgs-r.github.io/ncdfgeom/ reviewed and did some refactoring to switch over to |
@mdsumner -- I think I can get |
I don't think I can do that until stars itself is released - there's a breaking change in read_ncdf and CRAN stars won't pass check. Is it ok to break like that? |
Oh, I see. What's the time-line for the dual release? |
Ah, I didn't know. Planning stars submission for the end of this week! |
That's cool, been meaning to ask - that's perfect |
FYI - stars submitted to CRAN. |
That's good to know. I look forward to testing it.
Regards,
Mohsin Raza
…On Fri, Apr 19, 2019 at 5:26 PM Edzer Pebesma ***@***.***> wrote:
FYI - stars submitted to CRAN.
|
@dblodgett-usgs I've stumbled across the need for |
I'd love to get to it but a few other projects are blocking me from getting to this. Maybe in August? |
It's OK, I just wanted an approx idea on that. No pressure. Thanks a lot! |
Starting to play around with this. The object returned by
There are probably other options? What would be preferred? I dropped some prototype code below creating stars objects two ways. Is one better or more sustainable than the other? Am I handling geometry correctly? I kind of expected some kind of spatial plotting in the plot method -- am I missing something or is that just what there is so far? The file is linked to this issue so this code should be runnable as long as you
download.file("https://github.com/r-spatial/stars/files/3507773/timeseries.nc.zip", destfile = "timeseries.nc.zip")
unzip("timeseries.nc.zip", files = "timeseries.nc")
ts <- ncdfgeom::read_timeseries_dsg("timeseries.nc")
crs <- st_crs(4326)
ts_points <- tibble::tibble(X = ts$lons, Y = ts$lats, Z = ts$alts)
ts_points <- sf::st_as_sf(ts_points, coords = c("X", "Y", "Z"), crs = crs)
data <- ts$data_frames[[1]]
data[["T"]] <- ts$time
dim <- stars:::create_dimensions(c(geometry = nrow(ts_points),
time = nrow(data)), raster = NULL)
gdim <- stars:::create_dimension(from = 1, to = length(ts$lats),
refsys = crs$proj4string, point = TRUE,
values = ts_points$geometry)
tdim <- stars:::create_dimension(from = 1, to = length(ts$time),
refsys = "POSIXct", point = FALSE,
values = as.POSIXct(ts$time))
dim_2 <- stars:::create_dimensions(list(time = tdim, geometry = gdim))
dim$geometry$from <- 1
dim$geometry$to <- length(ts$lats)
dim$geometry$refsys <- crs
dim$geometry$point <- TRUE
dim$geometry$values <- ts_points$geometry
dim$time$from <- 1
dim$time$to <- length(ts$time)
dim$time$refsys <- "POSIXct"
dim$time$point <- FALSE
dim$time$values <- as.POSIXct(ts$time)
stars_data <- stars:::st_stars(x = setNames(list(as.matrix(ts$data_frames[[1]])),
ts$varmeta[[1]]$name),
dimensions = dim)
stars_data_2 <- stars:::st_stars(x = setNames(list(as.matrix(ts$data_frames[[1]])),
ts$varmeta[[1]]$name),
dimensions = dim_2)
plot(stars_data$pr)
plot(stars_data$pr$`1`)
plot(st_dimensions(stars_data)$geometry$values)
plot(stars_data_2$pr)
plot(stars_data_2$pr$`1`)
plot(st_dimensions(stars_data_2)$geometry$values) |
As for your question, I'd suggest both 1 and 3. 1 will allow others to do similar things on potentially more simple data structures; with 1, 3 becomes easy. The I did some heavy editing in your script, which now runs kind of:
Plots are still work to do; there's quite some stuff here but that was all |
Thanks for the guidance on how to get this working. I just opened a PR that seems to be the core of it. The issue I'm going to have is that The next step here would be to have |
Looks good! I guess that the |
OK. No, I don't think those should be exported unless you want an ecosystem of packages that can build So if I added the |
Yes, but we keep it in |
I quickly tested this and it works for me, although I only tried with a couple of files, both created by me with correct CF attributes. I guess we'll also need a method for Also I'm sorry I've been quite absent recently - I'm afraid this is not going to change soon. I'll start with a new job shortly and it is likely I won't use |
Followup of our discussions at EGU: stars aims at reading station data, interpreting the X-Y coordinates correctly as sf point features. I did a bit of digging on the topic, although I can only speak for NetCDF data since that's what I know best.
The CF conventions (ch. 9) define a set of supported features: point, timeSeries, trajectory, profile, timeSeriesProfile, trajectoryProfile. The convention goes into the details on how these should be defined, see here for timeSeries for example. There are several recommendations and a few mandatory attributes.
Here is how to create an example timeseries file for 10 stations over 20 timesteps using R:
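A minimal sketch of one way to write such a file, assuming the RNetCDF package. The variable names (station_id, lat, lon, alt, pr), the random values, the units, the CF version string, and the temporary output path are all illustrative assumptions, and the result has not been run through a CF checker. It follows the orthogonal multidimensional array layout that chapter 9 of the conventions describes for timeSeries features.

```r
library(RNetCDF)

n_station <- 10
n_time <- 20
path <- file.path(tempdir(), "timeseries.nc")   # assumed output location

nc <- create.nc(path)
dim.def.nc(nc, "time", n_time)
dim.def.nc(nc, "station", n_station)

# Time coordinate
var.def.nc(nc, "time", "NC_DOUBLE", "time")
att.put.nc(nc, "time", "standard_name", "NC_CHAR", "time")
att.put.nc(nc, "time", "units", "NC_CHAR", "days since 1970-01-01 00:00:00")

# Station identifier, flagged with the cf_role attribute
var.def.nc(nc, "station_id", "NC_INT", "station")
att.put.nc(nc, "station_id", "cf_role", "NC_CHAR", "timeseries_id")

# Per-station coordinates: standard_name and units
coord_meta <- list(lat = c("latitude", "degrees_north"),
                   lon = c("longitude", "degrees_east"),
                   alt = c("height", "m"))
for (v in names(coord_meta)) {
  var.def.nc(nc, v, "NC_DOUBLE", "station")
  att.put.nc(nc, v, "standard_name", "NC_CHAR", coord_meta[[v]][1])
  att.put.nc(nc, v, "units", "NC_CHAR", coord_meta[[v]][2])
}

# Data variable; R lists the fastest-varying dimension first,
# so this shows up as pr(station, time) in ncdump output
var.def.nc(nc, "pr", "NC_DOUBLE", c("time", "station"))
att.put.nc(nc, "pr", "units", "NC_CHAR", "mm")
att.put.nc(nc, "pr", "coordinates", "NC_CHAR", "time lat lon alt")

att.put.nc(nc, "NC_GLOBAL", "featureType", "NC_CHAR", "timeSeries")
att.put.nc(nc, "NC_GLOBAL", "Conventions", "NC_CHAR", "CF-1.7")

# Made-up values
set.seed(1)
var.put.nc(nc, "time", seq_len(n_time) - 1)
var.put.nc(nc, "station_id", seq_len(n_station))
var.put.nc(nc, "lat", runif(n_station, 45, 50))
var.put.nc(nc, "lon", runif(n_station, 5, 15))
var.put.nc(nc, "alt", runif(n_station, 0, 2000))
var.put.nc(nc, "pr", matrix(runif(n_time * n_station), n_time, n_station))
close.nc(nc)
```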
The created file has metadata (ncdump -h timeseries.nc):
This AFAICT should be 100% CF-compliant.
stars currently does not understand what to do with this dataset:
Obviously we would want to have a single sf point dimension (lat, lon, elevation), plus a time dimension.
Whether this needs to be implemented by GDAL or by stars directly I do not know.
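For reference, the target structure (one point-geometry dimension plus a time dimension) can be sketched by hand with the exported sf and stars functions st_dimensions() and st_as_stars(). The station coordinates, values, dates, and the dimension names station and time below are made up for illustration; this only shows the shape of the desired object, not how stars would read it from NetCDF.

```r
library(sf)
library(stars)

# Hypothetical stations: 10 points with lon, lat and elevation
set.seed(1)
n_station <- 10
n_time <- 20
pts <- data.frame(lon = runif(n_station, 5, 15),
                  lat = runif(n_station, 45, 50),
                  alt = runif(n_station, 0, 2000))
geom <- st_geometry(st_as_sf(pts, coords = c("lon", "lat", "alt"), crs = 4326))

# Daily time steps (86400 seconds apart)
times <- as.POSIXct("2018-01-01", tz = "UTC") + (seq_len(n_time) - 1) * 86400

# One value per station and time step
pr <- matrix(rnorm(n_station * n_time), nrow = n_station, ncol = n_time)

# A data cube with a geometry dimension and a time dimension
dims <- st_dimensions(station = geom, time = times)
cube <- st_as_stars(list(pr = pr), dimensions = dims)
dim(cube)
```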