-
Look at the IOOS Glider DAC:
https://gliders.ioos.us/erddap/tabledap/index.html?page=1&itemsPerPage=1000
You want a table, not a grid. Try EDDTableFromNcFiles or the like. I can send you a sample we have for a glider dataset. There is also now a convention for glider data in NetCDF files; if your file(s) don't follow that convention, I would strongly suggest thinking about redoing them.
If you have more questions, the Glider DAC folks are the ones to ask; they have a lot of experience with both glider data and ERDDAP.
HTH,
-Roy
On Jul 28, 2024, at 8:14 AM, Jody Klymak ***@***.***> wrote:
Hi all - I've been banging my head against a wall ingesting data that I thought would be TrajectoryProfile data, and thought I'd turn to the folks here for help/examples...
The data is in what I thought would be a pretty standard configuration for a CTD cruise, or in this case a series of glider profiles - it is a 2-D netcdf with a column per profile, and gridded vertically in depth. I usually use time as the column dimension, but could easily switch to profile. A representative file ncdump looks like (with some of the metadata stripped):
netcdf dfo-rosie713-20230810_grid_delayed {
dimensions:
    depth = 1100 ;
    time = 1691 ;
variables:
    double depth(depth) ;
        depth:_FillValue = NaN ;
        depth:units = "m" ;
        depth:long_name = "Depth" ;
        depth:standard_name = "depth" ;
        depth:positive = "down" ;
        depth:coverage_content_type = "coordinate" ;
        depth:comment = "center of depth bins" ;
    double profile(time) ;
        profile:_FillValue = NaN ;
        profile:cf_role = "profile_id" ;
    double time(time) ;
        time:_FillValue = NaN ;
        time:source = "sci_m_present_time" ;
        time:long_name = "Time" ;
        time:standard_name = "time" ;
        time:axis = "T" ;
        time:observation_type = "measured" ;
        time:units = "seconds since 1970-01-01T00:00:00+00:00" ;
        time:calendar = "gregorian" ;
    double longitude(time) ;
        longitude:_FillValue = NaN ;
        longitude:source = "m_lon" ;
        longitude:long_name = "longitude" ;
        longitude:standard_name = "longitude" ;
        longitude:units = "degrees_east" ;
        longitude:axis = "X" ;
        longitude:comment = "Estimated between surface fixes" ;
        longitude:observation_type = "measured" ;
        longitude:reference = "WGS84" ;
        longitude:valid_max = "180.0" ;
        longitude:valid_min = "-180.0" ;
        longitude:coordinate_reference_frame = "urn:ogc:crs:EPSG::4326" ;
    double latitude(time) ;
        latitude:_FillValue = NaN ;
        latitude:source = "m_lat" ;
        latitude:long_name = "latitude" ;
        latitude:standard_name = "latitude" ;
        latitude:units = "degrees_north" ;
        latitude:axis = "Y" ;
        latitude:observation_type = "measured" ;
        latitude:reference = "WGS84" ;
        latitude:valid_max = "90.0" ;
        latitude:valid_min = "-90.0" ;
        latitude:coordinate_reference_frame = "urn:ogc:crs:EPSG::4326" ;
    double temperature(depth, time) ;
        temperature:_FillValue = NaN ;
        temperature:source = "sci_water_temp" ;
        temperature:long_name = "water temperature" ;
        temperature:standard_name = "sea_water_temperature" ;
        temperature:units = "Celsius" ;
        temperature:instrument = "instrument_ctd" ;
        temperature:valid_min = "-5" ;
        temperature:valid_max = "50" ;
        temperature:observation_type = "measured" ;
        temperature:coverage_content_type = "physicalMeasurement" ;
    double salinity(depth, time) ;
    ...
// global attributes:
        :Conventions = "CF-1.8" ;
        :Metadata_Conventions = "CF-1.8, Unidata Dataset Discovery v1.0" ;
...
Note that latitude, longitude and profile are 1-D variables, with one value per profile/column.
Happy to expand on my many failures to round trip this, but it seems that (a) EDDGridFromNcFiles doesn't recognize latitude and longitude as 1-D arrays and tries to broadcast them to (depth, time) arrays, after which accessing either latitude or longitude (e.g. when making a graph) gets the dataset marked as invalid; and (b) EDDTableFromNcCFFiles seems OK, but everything gets flattened.
I don't particularly need to amalgamate multiple files together, though that might be nice.
I assume this style of netcdf is possible in ERDDAP. If anyone has an example netcdf structure and datasets.xml snippet that I could adapt, I would be appreciative.
**********************
"The contents of this message do not reflect any position of the U.S. Government or NOAA."
**********************
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS
Environmental Research Division
Southwest Fisheries Science Center
***Note new street address***
110 McAllister Way
Santa Cruz, CA 95060
Phone: (831)-420-3666
Fax: (831) 420-3980
e-mail: ***@***.*** www: https://www.pfeg.noaa.gov/
"Old age and treachery will overcome youth and skill."
"From those who have been given much, much will be expected"
"the arc of the moral universe is long, but it bends toward justice" -MLK Jr.
-
@rmendels Thanks for your comment. We already upload data to the IOOS GDAC (here). However, as far as I can tell, it flattens all the profiles into a long timeseries with row as the dimension, rather than serving a grid.
That's great for "raw" data. However, most science analysis is done on grids, so I'd also like my users (including myself) to have access to the gridded data sets we already make (for example: https://cproof.uvic.ca/gliderdata/deployments/dfo-k999/dfo-k999-20240703/L0-gridfiles/dfo-k999-20240703_grid.nc). I'm asking about serving a grid of data (need not be glider data), with some of the data only on one dimension of that grid, e.g. dimensions depth and time with full data temperature(depth, time) and data along just one of the dimensions, e.g. latitude(time) and longitude(time). This should be a pretty common data organization, so I'm sure I am just missing something.
-
I am often mistaken about these things, but from the ERDDAP docs:
EDDGrid datasets handle gridded data.
• In EDDGrid datasets, data variables are multi-dimensional arrays of data.
• There MUST be an axis variable for each dimension. Axis variables MUST be specified in the order that the data variables use them.
• In EDDGrid datasets, all data variables MUST use (share) all of the axis variables.
(Why? What if they don't?)
• Sorted Dimension Values - In all EDDGrid datasets, each dimension MUST be in sorted order (ascending or descending). Each can be irregularly spaced. There can be no ties. This is a requirement of the CF metadata standard. If any dimension's values aren't in sorted order, the dataset won't be loaded and ERDDAP™ will identify the first unsorted value in the log file, bigParentDirectory/logs/log.txt .
A few subclasses have additional restrictions (notably, EDDGridAggregateExistingDimension requires that the outer (leftmost, first) dimension be ascending).
Unsorted dimension values almost always indicate a problem with the source dataset. This most commonly occurs when a misnamed or inappropriate file is included in the aggregation, which leads to an unsorted time dimension. To solve this problem, see the error message in the ERDDAP™ log.txt file to find the offending time value. Then look in the source files to find the corresponding file (or one before or one after) that doesn't belong in the aggregation.
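The sorted-dimension requirement quoted above can be pre-checked on a file's axis values before handing the file to ERDDAP. The following is only a sketch of that rule as the docs state it (strictly ascending or strictly descending, no ties); it is not ERDDAP's own code, and the function name is made up for illustration.

```python
def first_unsorted_index(values):
    """Return the index of the first value that breaks strict monotonic
    order (ascending or descending, no ties), or None if the axis is fine.

    Hypothetical helper, not part of ERDDAP.
    """
    direction = 0  # +1 ascending, -1 descending, 0 not yet known
    for i in range(1, len(values)):
        prev, cur = values[i - 1], values[i]
        # A tie, or a NaN on either side, makes both comparisons False.
        if not (cur > prev or cur < prev):
            return i
        step = 1 if cur > prev else -1
        if direction == 0:
            direction = step
        elif step != direction:
            return i
    return None


# A time axis where one value is out of order (index 3).
print(first_unsorted_index([1.0, 2.0, 5.0, 4.0, 6.0]))
```

Run against the time and depth variables of a candidate file, a non-None result points at the value to look for in the source data.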
and:
What if the grid variables in the source dataset DON'T share the same axis variables?
In EDDGrid datasets, all data variables MUST use (share) all of the axis variables. So if a source dataset has some variables with one set of dimensions, and other variables with a different set of dimensions, you will have to make two datasets in ERDDAP. For example, you might make one ERDDAP™ dataset entitled "Some Title (at surface)" to hold variables that just use [time][latitude][longitude] dimensions and make another ERDDAP™ dataset entitled "Some Title (at depths)" to hold the variables that use [time][altitude][latitude][longitude]. Or perhaps you can change the data source to add a dimension with a single value (for example, altitude=0) to make the variables consistent.
In your dataset, latitude and longitude are not dimension variables in the strict sense (i.e. the data variables are not defined on them). So I doubt ERDDAP can handle them as a grid, which is why the IOOS Glider DAC serves them as tables. I don't see the issue with having the data as a table; you can extract the data almost as if it were a grid. See for example:
https://rmendels.github.io/rerddap_docs/articles/Using_rerddap.html#ioos-glider-data
Moreover, in a grid, you can only subset on the "outer" (ie coordinate) variables, while in a table you can subset on all the variables, so for things like glider data personally I prefer tables, but everyone has different preferences.
HTH,
-Roy
On Jul 28, 2024, at 10:02 AM, Jody Klymak ***@***.***> wrote:
@rmendels Thanks for your comment. We already upload data to the IOOS GDAC (here). However, as far as I can tell, it flattens all the profiles into a long timeseries with row as the dimension, rather than serving a grid.
Thats great for "raw" data, however, most science analysis is done on grids, so I'd also like my users (including myself) to have access to the gridded data sets we already make (for example: https://cproof.uvic.ca/gliderdata/deployments/dfo-k999/dfo-k999-20240703/L0-gridfiles/dfo-k999-20240703_grid.nc). I'm asking about serving a grid of data (need not be glider data), with some of the data only on one dimension of that grid. eg dimensions depth and time with full data temperature(depth, time) and data along just one of the dimensions, eg latitude(time) and longitude(time). This should be a pretty common data organization, so I'm sure I am just missing something.
-
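Thanks again @rmendels.
Agreed that you can easily represent data along a track with scatter coloured by a variable. However, that is of limited use in further analysis. For example, something like temperature.mean(dim='time') to get the mean temperature profile is a pretty useful thing to be able to calculate, and it cannot be done unless you grid in the vertical first. Grids also have display uses, including potentially faster graphics methods like pcolormesh, and the ability to contour. Given this, most folks I know use gridded data sets for profile data, organized similarly to the above, for data analysis.
I agree with your reading of the docs. Maybe what I am asking for is a feature request? It would be very nice to have grids of data that can be conditionally sampled in depth, time and latitude, longitude, and any other 1-D vector representing the profile dimension. My reading of the CF convention H.6.2. Profiles along a single trajectory is exactly the way I organize my data (well, I put depth on the first dimension, but...), so I was surprised ERDDAP didn't seem to smoothly support it.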
-
I believe you are misreading the CF convention: Profiles are not Grids, though they can be stored in netCDF files. From an ERDDAP table that comes from a Profile, I can extract the data just as you say and save the result to a netCDF file that obeys the Profile convention; it is just that the syntax is different. Give it a try. In the example I sent I could have just as easily download a etc
For example, go to:
https://gliders.ioos.us/erddap/tabledap/amlr01-20181216T0641-delayed.html
Subset however you like, attached is a screen shot of some of the formats in which you can get the results.
HTH,
-Roy
On Jul 28, 2024, at 6:38 PM, Jody Klymak ***@***.***> wrote:
Thanks again @rmendels.
Agreed that you can easily represent data along a track with scatter coloured by a variable. However, that is of limited use in further analysis. For example, something like: temperature.mean(dim='time') to get the mean temperature profile is a pretty useful thing to be able to calculate, that cannot be done unless you grid in the vertical first. Grids also have display uses, including using potentially faster graphics methods like pcolormesh, and the ability to contour. Given this, most folks I know use gridded data sets for profile data, organized similar to the above for data analysis.
I agree with your reading of the docs. Maybe what I am asking for is a feature request? It would be very nice to have grids of data that can be conditionally sampled in depth, time and latitude, longitude, and any other 1-D vector representing the profile dimension. My reading of the CF convention H.6.2. Profiles along a single trajectory is exactly the way I organize my data (well, I put depth on the first dimension, but...), so I was surprised ERDDAP didn't seem to smoothly support it.
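To make the temperature.mean(dim='time') point concrete, here is a small stdlib-only sketch of what that reduction amounts to on a depth-by-time grid: one mean per depth bin, skipping empty (NaN) bins. The grid values are invented for illustration, and the function is a stand-in for the xarray call rather than a copy of it.

```python
import math


def mean_profile(grid):
    """Average each depth row over the profile (time) axis, ignoring NaN.

    grid[i][j] is the value in depth bin i for profile j.
    """
    profile = []
    for row in grid:
        valid = [v for v in row if not math.isnan(v)]
        # A depth bin with no data in any profile stays NaN.
        profile.append(sum(valid) / len(valid) if valid else math.nan)
    return profile


nan = math.nan
# Made-up temperatures: 3 depth bins (rows) by 3 profiles (columns).
temperature = [
    [10.0, 11.0, 12.0],
    [8.0, nan, 9.0],
    [nan, nan, nan],
]
mean_by_depth = mean_profile(temperature)
print(mean_by_depth)
```

This is only well defined because every profile shares the same depth bins, which is the property the flat row-indexed table does not have.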
-
I should add: look at Callum Rollo's wonderful Python scripts using erddapy, where he subsets glider data that are tables in ERDDAP and converts them to xarray objects, so you can do exactly what you ask:
https://github.com/voto-ocean-knowledge/goos-erddap-demo
I have translated these into R:
https://github.com/rmendels/rollo_scripts
HTH,
-Roy
-
I think the CF-convention for trajectoryProfile is relatively clear: https://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/build/aphs06.html Example H21 has format
which is exactly the form of my data, except it falls under the caveat
This is also the form that I was hoping to serve to our users.
I don't see that @callumrollo's scripts here make a 2-D representation of the glider data. The glider data he loads into xarray are just a flat table from what I can see, but I only quickly skimmed. Of course one can always take the flat table and do the depth binning to make the 2-D table or grid; indeed that is what we do to produce the grid I want to ingest. However, the binning is somewhat involved and a bit slow, so the goal here is to save users the bother, since we do it routinely anyway.
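For readers unfamiliar with the binning step, here is a deliberately simple sketch of turning flat (profile, depth, value) rows into a depth-by-profile grid with plain bin averages. It is not the processing used to produce the grid files discussed in this thread, which is described as more involved; the function name, the bin size and the sample rows are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def bin_profiles(rows, bin_size):
    """Bin flat (profile_id, depth, value) rows into grid[depth_bin][profile].

    Returns (profile_ids, bin_centers, grid). Bins with no samples are NaN.
    Rows with NaN or negative depth, or NaN value, are skipped.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pid, depth, value in rows:
        if math.isnan(depth) or math.isnan(value) or depth < 0:
            continue
        k = int(depth // bin_size)  # index of the depth bin
        sums[(k, pid)] += value
        counts[(k, pid)] += 1
    profile_ids = sorted({pid for _, pid in counts})
    n_bins = max(k for k, _ in counts) + 1 if counts else 0
    centers = [(k + 0.5) * bin_size for k in range(n_bins)]
    grid = [
        [
            sums[(k, p)] / counts[(k, p)] if (k, p) in counts else math.nan
            for p in profile_ids
        ]
        for k in range(n_bins)
    ]
    return profile_ids, centers, grid


# Invented flat table: two profiles sampled at irregular depths.
rows = [
    (1, 0.2, 10.0), (1, 0.8, 12.0), (1, 1.5, 8.0),
    (2, 0.5, 11.0), (2, 2.5, 5.0),
]
ids, centers, grid = bin_profiles(rows, bin_size=1.0)
print(ids, centers, grid)
```

Even in this toy form, every user of the flat table would have to repeat the step, which is the bother that serving the pre-binned grid is meant to avoid.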
-
Hi Jody, I have also experienced this frustration with how ERDDAP handles 2-D datasets. I haven't found a way around it without promoting all the 1-D variables like latitude into 2-D grids, which feels quite wasteful. That's why we distribute our data as flat timeseries at the moment. I remember seeing something interesting from Rob Cermak where he appeared to solve this issue for glider ADCP data. He posted it to the UG2 Slack data channel on March 31st. I haven't had the time to look into it, but it appears to enable serving 2-D data alongside 1-D profiles, which could serve as a basis for making 2-D profile data work with 1-D arrays of lon, lat, DAC etc.
-
Pinging @ChrisJohnNOAA is this capacity that @jklymak describes something ERDDAP can support/could support in future? As I understand, the desired behavior is to have a single griddap dataset which serves 2-D gridded variables like temperature(profile_num, depth_bin) as well as 1-D variables like lat(profile_num), without broadcasting these 1-D variables to 2-D. At the moment, from reading the docs @rmendels highlighted ( |
-
I must admit I am still at a loss here. I can go into:
https://gliders.ioos.us/erddap/tabledap/amlr01-20181216T0641-delayed.html
subset by time, latitude, longitude, depth, etc., and save the result using either the ncCF or the ncCFMA file type. Please at least try this and see if that gets closer to what you want. I would add that by having a table the DAC is able to aggregate the selection over a large number of actual files; for example, the table above represents these files:
https://gliders.ioos.us/erddap/files/amlr01-20181216T0641-delayed/
-Roy
PS - I just picked some numbers at random to show what I mean, here is an ERDDAP URL that subsets on lat, lon, depth and time and returns the data as ncCFMA:
https://gliders.ioos.us/erddap/tabledap/amlr01-20181216T0641-delayed.ncCFMA?profile_id,time,latitude,longitude,depth,u&time>=2019-03-01T00:00:00Z&time<=2019-03-08T09:37:33Z&latitude>=-63.3&latitude<=-63.1&depth>=5&depth<=10
What am I missing?
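For anyone who wants to script requests like the one above, here is a stdlib-only sketch that assembles the same tabledap URL from its parts and percent-encodes the comparison operators and values, which is safer when the URL is passed to tools other than a browser. The helper name and argument layout are assumptions for illustration, not an ERDDAP or erddapy API.

```python
from urllib.parse import quote


def tabledap_url(server, dataset_id, file_type, variables, constraints):
    """Assemble a tabledap request URL (hypothetical helper).

    constraints is a list of (variable, operator, value) tuples.
    """
    query = ",".join(variables)
    for name, op, value in constraints:
        # Encode '<' and '>' but leave '=' readable; encode ':' in times.
        query += "&" + name + quote(op, safe="=") + quote(str(value), safe="")
    return f"{server}/tabledap/{dataset_id}.{file_type}?{query}"


url = tabledap_url(
    "https://gliders.ioos.us/erddap",
    "amlr01-20181216T0641-delayed",
    "ncCFMA",
    ["profile_id", "time", "latitude", "longitude", "depth", "u"],
    [
        ("time", ">=", "2019-03-01T00:00:00Z"),
        ("time", "<=", "2019-03-08T09:37:33Z"),
        ("latitude", ">=", -63.3),
        ("latitude", "<=", -63.1),
        ("depth", ">=", 5),
        ("depth", "<=", 10),
    ],
)
print(url)
```

Changing file_type to "ncCF" or "nc" requests the other return formats mentioned in this thread with the same constraints.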
On Jul 29, 2024, at 7:26 AM, Jody Klymak ***@***.***> wrote:
Thanks @callumrollo <https://github.com/callumrollo>
Not only wasteful, but a lost opportunity as subsetting on lat/long or other 1-d variables is often helpful.
I will admit to being confused whether griddap is a standard that is part of erddap or whether it is used by erddap. If the former I guess a feature request is possible though I don't know how active development is.
-
Also can you please send me an example file that you were trying to read into ERDDAP?
Thanks,
-Roy
-
Can you send me the URLs you used so I can reproduce the extracts? Also, have you tried ncCF?
Thanks,
-Roy
On Jul 29, 2024, at 8:12 AM, Jody Klymak ***@***.***> wrote:
@rmendels <https://github.com/rmendels>, we upload to the IOOS GDAC all the time, and I understand the files that ERDDAP produces. If I subset the above to just give me temperature for one day and download ncCFMA, I get:
<xarray.Dataset>
Dimensions: (trajectory: 27, profile: 1, obs: 7274)
Coordinates:
* trajectory (trajectory) object 'amlr01-20190228T1531' ... 'amlr01-20190...
time (trajectory, profile) datetime64[ns] ...
latitude (trajectory, profile) float64 ...
longitude (trajectory, profile) float64 ...
depth (trajectory, profile, obs) float32 ...
Dimensions without coordinates: profile, obs
Data variables:
profile_id (trajectory, profile) float64 ...
temperature (trajectory, profile, obs) float32 ...
Attributes: (12/60)
acknowledgement: This work supported by funding from NOAA
cdm_data_type: TrajectoryProfile
which is less than useful.
If I download as a flat nc file, I get the timeseries, which is useful, but still needs to be binned.
Dimensions: (row: 68215)
Dimensions without coordinates: row
Data variables:
trajectory (row) object ...
profile_id (row) float64 ...
time (row) datetime64[ns] ...
latitude (row) float64 ...
longitude (row) float64 ...
depth (row) float32 ...
temperature (row) float32 ...
Attributes: (12/60)
acknowledgement: This work supported by funding from NOAA
cdm_data_type: TrajectoryProfile
If I download the gridded data set at https://cproof.uvic.ca/gliderdata/deployments/dfo-k999/dfo-k999-20240703/L0-gridfiles/dfo-k999-20240703_grid.nc
I get
Dimensions: (depth: 110, time: 317)
Coordinates:
* depth (depth) float64 0.5 10.5 20.5 ... 1.08e+03 1.09e+03
profile (time) float64 ...
* time (time) datetime64[ns] 2024-07-03T21:21:02.51500006...
Data variables: (12/23)
longitude (time) float64 ...
latitude (time) float64 ...
... ...
temperature (depth, time) float64 ...
... ...
-
Also, I believe the template ERDDAP uses for what it returns is based on the NCEI guidelines for the DSG type defined in datasets.xml, and I believe that is what IOOS follows:
https://www.ncei.noaa.gov/data/oceans/ncei/formats/netcdf/v2.0/index.html
What was returned in my examples from the IOOS DAC does seem to follow those guidelines. I am not certain that the file you initially referenced follows their definition of a trajectoryProfile; I would have to look at the present CF definition. So it may be desirable to add a different return type, but that is independent of whether ERDDAP does the subsetting as a Table.
-Roy
-
Hi All,
I believe Callum is correct: this is a present limitation in ERDDAP.
Here is an excerpt from Bob in 2017:
If a variable in the source file is e.g., lat and lon values that use
different dimensions than the main data variables and that convert the
projection x,y locations into lat and lon, then in ERDDAP they need to be
in a separate dataset.
See
https://coastwatch.pfeg.noaa.gov/erddap/download/setupDatasetsXml.html#dataStructures
and the subsequent few paragraphs.
I know this sounds goofy and severely limiting. It is the one major
situation where using ERDDAP isn't the best choice. But it only affects a
small percentage of the total data files in NOAA (that is small consolation
for you where it affects perhaps 100% of your data files for polarwatch).
There is a solution -- a modification to ERDDAP that would support this but
doing it would be a massive effort on my part (a couple of months with no
distractions) so I haven't had time to do it.
There was a reason for doing it this way: it is this
slightly-simpler-than-netcdf data model that allows ERDDAP to read data
from many file types and write data to many file types. So there is great
benefit, but it comes at a cost. Few people/groups/datasets pay the cost,
but you are. Sorry.
The solution we had to implement there with two datasets was a bit of a
hack and difficult for users. I'll add this to the Github ticket. It would
be great to see this feature added to ERDDAP. At Scripps we have similar
glider products that would benefit from this feature. Please check in with
Dale and Sunny on the PolarWatch use case with gridded projected datasets.
This feature may also accommodate satellite swath data and be of interest to
CoastWatch/NASA.
As is, I think a griddap dataset of this data can work in two less than
ideal ways:
1) Flatten the lat and lon values to match the dimensions of the variables.
You may be able to accomplish this with ncml, but I have not personally
done this.
2) Create a separate dataset for the lat, lon, and other data with one
value per profile. We had to go this route with PolarWatch because
projected datasets had a related issue, where the grid definition had
different dimensions than the data.
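To show what option 1 implies, here is a tiny sketch of repeating a per-profile variable along depth so that it shares both dimensions with the data variables. It is written in Python purely to illustrate the idea and its storage cost; it is not NcML and not how ERDDAP does anything internally, and the latitude values are invented.

```python
def broadcast_along_depth(per_profile, n_depth):
    """Repeat a 1-D per-profile variable into a [depth][time] grid."""
    # Each depth row gets its own copy of the same per-profile values.
    return [list(per_profile) for _ in range(n_depth)]


latitude = [48.1, 48.2, 48.3]  # one made-up value per profile
latitude_2d = broadcast_along_depth(latitude, n_depth=4)

stored = sum(len(row) for row in latitude_2d)
print(f"{len(latitude)} values become {stored}")
```

For the file at the top of this thread (depth = 1100, time = 1691), the same step would store 1100 x 1691 = 1,860,100 values per variable in place of 1,691, which is the waste Callum describes.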
I agree there is value in having this type of synthesized glider data
accessible via griddap over tabledap. The list of benefits is quite long.
Best,
Jenn
On Mon, Jul 29, 2024 at 9:29 AM Jody Klymak ***@***.***> wrote:
The glider DAC example definitely does *not* follow those guidelines.
From
https://www.ncei.noaa.gov/thredds-ocean/dodsC/example/v2.0/NCEI_trajectoryProfile_template_v2.0_2016-09-22_181838.014029.nc
you get exactly the format I would like where temp is a [1, 10, 4] array
and z is an array of length 4.
Dimensions: (trajectory: 1, obs: 10, z: 4)
Coordinates:
* trajectory (trajectory) int32 -2147483647
time (trajectory, obs) object ...
lat (trajectory, obs) float64 ...
lon (trajectory, obs) float64 ...
* z (z) float64 1.0 2.0 3.0 4.0
Dimensions without coordinates: obs
Data variables:
sal (trajectory, obs, z) float64 ...
temp (trajectory, obs, z) float64 ...
That is impossible to return from the glider DAC because their data is not
binned in z.
I think the thing being overlooked here is that binning in z requires the
data to be ingested by the ERDDAP and served differently. You will never
get the profile data ingested by the IOOS gliderDAC into 2-D binned
matrices because it is supplied to the GDAC as full resolution, no binning,
profiles.