Where is the grd format described? #1

hsonne · 2020-07-22T13:03:18Z

Where is the format of the binary file described? Can you provide a link?
We need to know the number of grid cells, is this information (maybe along with other metadata) not somehow contained in the file?
If not, it is important to pass the "correct" values for nx and ny to the main function, those that "match" the actual input file. It should be checked somehow if the file and the values for nx and ny fit together.
I did a test with one of your test files:

# Read all raw bytes from the file
con <- file("SEVP_CLI_CHN_MERGE_CMP_PRE_HOUR_GRID_0.10-2008043000.grd", "rb")
x <- readBin(con, what = "raw", n = 1e7)
close(con)

# Number of bytes in the file
length(x) # 2464000

# Expected number of bytes (number of records + 1 first record skipped)
nx = 700
ny = 440
(nx * ny + 1) * 4 # 1232004

There are 2464000 bytes in the file but we are only reading 1232004. Are nx and ny correct for the given file? If yes, what is contained in the remaining bytes of the file? The actual number of bytes is almost exactly twice the number of bytes that we actually read. Are the values maybe coded in 8 bytes instead of 4 bytes?

The text was updated successfully, but these errors were encountered:

robetatis · 2020-07-22T16:24:49Z

Hi @hsonne

Have a look at this site for a general description of GrADS formatting possibilities, and here for the GrADS software itself. Mind, however, that this is just a general description and we have no information about the specific formatting used by our data provider. The solution for this specific data set was trial-and-error, comparing our resulting rasters with the graphical output of the GrADS software. This way it was possible to make sure the spatial distribution and value range of the resulting precipitation fields were correct (excluding small shifts of a couple of pixels). In the next development step, I will try to export a few rasters (tif or ascii) from GrADS, read them into R and overlay them with our output to do a final check.
nx and ny are described in a configuration file and the documentation made available by the data provider, so we are quite sure of their values.
If our data provider used the default GrADS configuration (direct access binary file, i.e., no FORTRAN record headers), each record in the byte stream returned byreadBin(con, what = "raw", n = 1e7) corresponds to a pixel with an actual rainfall depth. However, if they used headers, then maybe these are the missing bytes, which would also explain why my current version with readBin(con=grdconn, what="numeric", n=grdsize, size=4) produces a correct precipitation field (as I said, excluding a small shift due to the fact that I skipped the first and last records)
I also think if we are going for a more general version, we should ask the user for the input parameters for readBin.

So, let's keep at it =)

Thanks for your input!!

hsonne · 2020-07-22T20:21:16Z

Thanks, @robetatis.

I just googled and found this article about a package that seems to do what you are doing (reading GrADS files). Have you already read this?
https://www.r-bloggers.com/readgrads-%E2%80%93-an-r-package-to-read-and-manipulate-grads-data/

The package can be installed from a bitbucket repository as follows:

# install.packages("remotes")
remotes::install_bitbucket("paulhiemstra/readgrads@default")

The package seems to consider the metadata file that comes along with the binary data file. I propose that you try to read your data (you need the corresponding metadata files *.ctl, do you have them?) with the functions from that package and compare these functions with your function. We could then contact the author and try to contribute to his package instead of making a new package. What do you think?

robetatis · 2020-07-23T16:06:17Z

Hi @hsonne

Yep. I know the package. Contributing to it is surely an option. But I guess we will (if we move forward with this) specialize in GrADS data sets for urban hydrology, so we'll have to discuss the best strategy.
Yes, I have the *.ctl file. Got it directly from the data provider. That's the configuration file I mentioned in my last comment.
For this first step, building our own small function for figuring out the formatting by trial-and-error seemed more practical than using the existing package's functions, given the lack of information.

robetatis · 2020-07-23T17:09:10Z

Hi @hsonne

There might be an error in the way you construct the matrix in valuesToMatrix. I plotted the resulting raster and the data is upside down:

Also, the precipitation field seems quite different from the original data:

The output from my original (though potentially slower) version resembles much more the original data (I have yet to do an exact overlay to compute differences):

robetatis · 2020-07-24T13:50:46Z

Hi @hsonne

I managed to read the raw binary data into GrADS and export it as a GeoTiff with the following commands (GrADS command line):

open ...\pFromGrADS\test\conf.ctl (full path to configuration file)
set gxout geotiff. Set geotiff as output option
set geotiff ...\pFromGrADS\test\out. Set fine name for geotiff output
d crain. 'Draw' data into geotiff file

Content of conf.ctl (configuration file):
DSET ...\pFromGrADS\test\gridfile.grd. Full path and name (with extension) to grid file
UNDEF -999.0. NA value in raw data
OPTIONS little_endian sequential. Endianness and type of file access....so, in the end these were sequential access files all along (the provided did use FORTRAN headers before each record)
TITLE China Hourly Merged Precipitation Analysis. Just a title for the data
XDEF 700 linear 70.05 0.10. 700 columns starting at x-coordinate 70.05 East, x-resolution = 0.10
YDEF 440 linear 15.05 0.10. 440 rows starting (bottom) at y-coordinate 15.05 North, y-resolution = 0.10
ZDEF 1 LEVELS 1 . Data is 2-dimensional, i.e., only 1 vertical level
TDEF 1 LINEAR 20Z21Nov2016 1hr. Date and time of data in the grid file. Format: HZddmmmyyyy. Hourly data
VARS 2. Start of the variable section. The raw data contains two variables
crain 1 00 CH01 combined analysis (mm/Hour). crain is the rainfall depth in mm/hour
gsamp 1 00 CH02 gauge numbers
ENDVARS. End of variable section

Here's a comparison of both rasters after loading them into R:

I'm still struggling with the alignment of the rasters, so we can do an exact overlay, but visual comparison seem OK.

Best!
Roberto

hsonne · 2020-07-24T15:58:04Z

Hi @robetatis,

I used the following code to read the example files that you provided to me. I extracted the control information from your comment above and provided it in a variable control_text. The code uses the functions from the readGrads package by Paul Hiemstra to

read the control information with readGrads::parseCTLfile(),
read the the binary data with readGrads::readGradsFile(). The function uses the control information to (hopefully) correctly interpret the binary data.

Does this reproduce what the software delivers?
Were you able to install the readGrads package? I needed to "downgrade" some packages but then succeeded.

Regards,
Hauke

# Content of conf.ctl (configuration file)
control_text <- c(
  "DSET just-a-dummy", # The filename is not required, I assume
  "UNDEF -999.0",
  "OPTIONS little_endian sequential", 
  "TITLE China Hourly Merged Precipitation Analysis",
  "XDEF 700 linear 70.05 0.10", 
  "YDEF 440 linear 15.05 0.10",
  "ZDEF 1 LEVELS 1",
  "TDEF 1 LINEAR 20Z21Nov2016 1hr",
  "VARS 2",
  "crain 1 00 CH01", # combined analysis (mm/Hour)
  "gsamp 1 00 CH02", # gauge numbers
  "ENDVARS"
)

# Read control file content from text
con <- textConnection(control_text)
control <- readGrads::parseCTLfile(con)
close(con)

# Get elements out of the control structure
nx <- control$xdef$noLevels # 700
ny <- control$ydef$noLevels # 440

# Provide paths to grads files
grads_files <- dir("~/../Downloads/test_pFromGrADS", full.names = TRUE)

# Read the files into matrices and convert them to raster objects
raster_objects <- lapply(grads_files, function(grads_file) {
  
  # Read the binary file into a list of data frames
  cat("Reading", basename(grads_file), "\n")  
  x <- readGrads::readGradsFile(grads_file, control)

  # Get the values of the first variable
  values <- x[[1]]$value

  # Replace undefined values with NA
  values[values == control$undef] <- NA

  # Create a raster object using the bounding box information given in the control object  
  raster::raster(
    matrix(values, byrow = TRUE, nrow = ny), 
    xmn = control$xdef$levelValues[1L], 
    xmx = control$xdef$levelValues[nx], 
    ymn = control$ydef$levelValues[1L], 
    ymx = control$ydef$levelValues[ny],
    crs = "+proj=longlat +datum=WGS84"
  )
})

# Plot the raster objects
colfunc <- colorRampPalette(c(NA, "blue"))
lapply(raster_objects, raster::plot, col = colfunc(80L))

robetatis · 2020-07-30T19:10:49Z

Hello @hsonne
Thanks for all your valuable input!
As you suggested, I have created a branch to incorporate your changes ('dev'). If you agree, please address the issue in valuesToMatrix and send the pull request.
Best
Roberto

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Where is the grd format described? #1

Where is the grd format described? #1

hsonne commented Jul 22, 2020

robetatis commented Jul 22, 2020

hsonne commented Jul 22, 2020

robetatis commented Jul 23, 2020

robetatis commented Jul 23, 2020

robetatis commented Jul 24, 2020 •

edited

Loading

hsonne commented Jul 24, 2020

robetatis commented Jul 30, 2020

Where is the grd format described? #1

Where is the grd format described? #1

Comments

hsonne commented Jul 22, 2020

robetatis commented Jul 22, 2020

hsonne commented Jul 22, 2020

robetatis commented Jul 23, 2020

robetatis commented Jul 23, 2020

robetatis commented Jul 24, 2020 • edited Loading

hsonne commented Jul 24, 2020

robetatis commented Jul 30, 2020

robetatis commented Jul 24, 2020 •

edited

Loading