This document describes the NEAD (Non-binary Environmental Archive Data) file format.
The NEAD format is a delimiter separated value (DSV) format similar to CSV, but with a header section, followed by the data section (example). This document describes the format using Extended Bakus-Naur form (EBNF) syntax (below) or graphically at https://raw.githubusercontent.com/GEUS-PROMICE/NEAD/main/NEAD.svg
Note - Here we use the EBNF notation from the W3C Extensible Markup Language (XML), not the ISO/IEC 14977:1996 EBNF standard. The W3C XML standard is implemented and parsed by the Railroad Diagram Generator (https://github.com/GuntherRademacher/rr) that is used to make the graphics shown in this document.
- This file format is under development.
Ionut Iosifescu Enescu; Mathias Bavay; Kenneth Mankoff (2020). New Environmental Data Archive (NEAD) format. EnviDat. doi: 10.16904/envidat.187
Click on the example graphic below to see the full graphical representation of the EBNF.
The graphical EBNF form is generated from the following EBNF syntax (from https://www.w3.org/TR/xml/#sec-notation):
NEAD ::= firstline header_section data_section
firstline ::= '#' ' ' 'NEAD' ' ' version_number ' ' file_format newline
header_section ::= metadata_header metadata_section fields_header fields_section
/* metadata */
metadata_header ::= '#' ' ' '[METADATA]' newline
metadata_section ::= (('#' ((whitespace (required_metadata |
recommended_metadata |
other_metadata))? )? lineend ) | newline)+
required_metadata ::= ('field_delimiter' assignment field_delimiter) |
('geometry' assignment geometry) |
('srid' assignment EPSG_code)
recommended_metadata ::= ('station_id' assignment alphanumeric)
recommended_metadata ::= ('timestamp_meaning' assignment timestamp_meanings)
recommended_metadata ::= 'nodata' assignment (integer | float)
recommended_metadata ::= 'timezone' assignment (integer | float | tz_string)
recommended_metadata ::= ('doi' | 'reference') assignment value
other_metadata ::= key assignment value
/* fields */
fields_header ::= '#' ' ' '[FIELDS]' newline
fields_section ::= (('#' ((whitespace ( required_fields |
recommended_fields |
other_fields ))? )? lineend ) | newline)+
required_fields ::= ('fields' assignment values)
recommended_fields ::= ('units_multiplier' |
'units_offset' |
'units' |
'long_name' |
'standard_name') assignment values
recommended_fields ::= 'timestamp_meaning' assignment timestamp_meanings
other_fields ::= key assignment values
/* data */
data_section ::= '#' ' ' '[DATA]' newline dataline+
dataline ::= (value ( field_delimiter value )* newline)
values ::= value ( field_delimiter whitespace value )*
/*
NOTE:
All values must be the same length.
This means everything in the "[FIELDS]" section maps 1:1 to the columns in the "[DATA]" section.
*/
/* other */
key ::= char (alphanumeric*)?
value ::= unicode-char*
assignment ::= whitespace? '=' whitespace?
field_delimiter ::= [,|\/:;]
version_number ::= digit* ('.' (alphanumeric)+)? ('.' (alphanumeric)+)?
file_format ::= 'UTF-8' | 'ASCII'
EPSG_code ::= 'EPSG' ':' digit digit digit digit
geometry ::= 'POINT(' float float ')' |
'POINTZ(' float float float ')' |
WKT_string |
column_name
timestamp_meanings ::= 'beginning' | 'end' | 'middle' | 'instantaneous' | 'other' | 'undefined'
/* generic */
whitespace ::= (tab | space)+
comment ::= '#' whitespace? (unicode-char)?
lineend ::= (whitespace | comment)? newline
newline ::= #x0A
tab ::= #x9
space ::= #x20
char ::= [a-zA-Z]
digit ::= [0-9]
integer ::= [+-]? digit+
alphanumeric ::= (digit | char)+
hex ::= (digit | [a-fA-F])+
float ::= integer '.' ((digit)+)?
/* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
unicode-char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
# NEAD 1.0 UTF-8 # [METADATA] # station_id = 803027F4 # station_name = GC-NET GOES station Summit Station # srid = EPSG:4326 # geometry = POINTZ(38.5053, 72.5794, 3199) # nodata = -999 # timezone = 0 # field_delimiter = , # # [FIELDS] # fields = timestamp,ISWR,OSWR,NSWR,TA1,TA2,RH1,RH2,VW1,VW2,DW1,DW2,P,HS1,HS2,V # units_offset = 0,0,0,0,273.15,273.15,0,0,0,0,0,0,0,0,0,0 # units_multiplier = 1,1,1,1,1,1,0.01,0.01,1,1,1,1,100,1,1,1 # # [DATA] 1996-05-12 11:00:00+00,356.6,288.29,-999,-999,-999,96.05,94.79,3.84,4.2,186.5,-999,691.7,-999,0.05,4.59 1996-05-12 12:00:00+00,489.3,453,-999,-999,-999,95.55,94.04,4.11,4.5,205.5,-999,691.8,-999,0.01,1.05 1996-05-12 13:00:00+00,622,506.87,-15.43,-999,-999,91.01,90.89,3.39,3.58,165.3,-999,692,-999,0,0 1996-05-12 14:00:00+00,684.2,569.11,15.51,-999,-999,87.46,88.67,5.36,5.61,217.6,-999,692.2,-0.01,-0.01,12.69 1996-05-12 15:00:00+00,680.6,572.57,-90.87,-999,-999,86.25,87.5,6.82,7.13,222.3,-999,692.6,-0.01,-0.01,12.73 1996-05-12 16:00:00+00,674.6,569.3,-137.32,-999,-999,87.05,87.38,5.51,5.77,219.6,-999,692.6,0,0,12.78 1996-05-12 17:00:00+00,620.2,528.53,-157.64,-999,-999,88.73,89.67,5.78,6.06,220.9,-999,692.8,0,0.01,12.69 1996-05-12 18:00:00+00,507.6,435.89,-117.4,-999,-999,90.03,91.31,5.84,6.12,221.3,-999,693.1,0,0.01,12.64 1996-05-12 19:00:00+00,406.8,350.35,-61.34,-999,-999,91.13,92.28,5.83,6.1,229.2,-999,692.9,0,0.01,12.61 1996-05-12 20:00:00+00,366.8,319.41,-77.03,-999,-999,91.24,92.4,7.06,7.37,240.2,-999,693,0,0,12.54 1996-05-12 21:00:00+00,275.8,241.88,-92.72,-999,-999,92.66,93.76,4.87,5.16,237.9,-999,693,0,0,12.44