intRobyn edited this page Mar 29, 2017 · 2 revisions

Introduction

CSV

“Purpose: CSV (comma-separated values) is a simple file format used to store tabular data, such as a spreadsheet or database. Files in the CSV format can be imported to and exported from programs that store data in tables, such as Microsoft Excel or OpenOffice Calc.” Quoted from http://www.computerhope.com/issues/ch001356.htm

Examples and how to code CSV in Python: https://docs.python.org/2/library/csv.html

GeoCSV

Purpose: Specify a common system of annotations and rules for data in tabular text data formats in support of a specific style described in this document called “GeoCSV”. An important factor is readability for both humans and machines. Simplicity is considered key for adoption and use. This specification is primarily targeted at data delivered as data streams from GeoWS web services. Ideally, existing structured text data would need very minimal modification, perhaps a few additional GeoCSV comment lines, to be compliant. Quoted from GeoCSV Formatting template: http://geows.ds.iris.edu/documents/GeoCSV.pdf

Wiki link to more documentation: https://github.com/NCAR/chords/wiki/Geo-CSV

Web Service Template: http://geows.ds.iris.edu/documents/GeoWS-Service-Template.pdf

CHORDS Implementation of CSV Files

So far, the CHORDS system has used CSV for monitoring data ranging from wind to volcanoes. Our current CSV data lists the project, site, affiliation, and the station where the instrument is located, along with the variables entered by the researchers themselves.

(Example CSV data)

(Header)

['Project', 'EarthCube Portal']
['Site', 'NCAR Foothills Lab']
['Affiliation', 'NSF EarthCube']
['Instrument', 'FL Wx Station']

(Data)

['Time', 'Wind Direction', 'Wind Speed', 'Wind Max', 'Temperature', 'Humidity', 'Pressure', 'Rain Total', 'Battery']
['2017-01-19T18:28:25Z', '35.0', '1.4', '2.5', '7.2', '35.5', '823.6', '3.0', '17.9']
['2017-01-19T18:33:25Z', '42.0', '1.6', '3.0', '7.5', '34.5', '823.6', '3.0', '17.9']
['2017-01-19T18:38:25Z', '55.0', '1.3', '2.8', '7.6', '34.0', '823.5', '3.0', '17.9']
['2017-01-19T18:43:25Z', '47.0', '1.3', '2.5', '7.6', '37.6', '823.5', '3.0', '17.9']
['2017-01-19T18:48:25Z', '60.0', '1.4', '2.8', '7.7', '38.4', '823.4', '3.0', '17.9']
['2017-01-19T18:53:25Z', '56.0', '1.3', '2.6', '7.7', '35.1', '823.3', '3.0', '17.9']
['2017-01-19T18:58:25Z', '78.0', '1.3', '2.2', '7.8', '34.6', '823.2', '3.0', '17.9']
['2017-01-19T19:03:25Z', '79.0', '1.2', '2.2', '8.2', '34.6', '823.1', '3.0', '17.9']
['2017-01-19T19:08:25Z', '68.0', '1.1', '2.2', '8.2', '36.0', '823.0', '3.0', '17.9']
['2017-01-19T19:13:25Z', '89.0', '1.2', '2.5', '8.5', '32.8', '823.0', '3.0', '17.9']
['2017-01-19T19:18:25Z', '108.0', '1.2', '2.1', '8.7', '32.3', '822.9', '3.0', '18.0']

Note that the data values appear in the same order as the variables listed above them.
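The rows above look like the lists that Python's csv.reader yields. The following is a minimal sketch of reading such data, assuming the file starts with two-column metadata rows followed by a column-name row beginning with 'Time'. That layout and the name read_chords_csv are assumptions made for illustration, not a confirmed CHORDS format.

```python
import csv
import io


def read_chords_csv(text):
    """Split CHORDS-style CSV text into (metadata, records).

    Assumed layout (hypothetical): two-column metadata rows come first,
    then the column-name row starting with 'Time', then one row per sample.
    """
    metadata = {}
    columns = None
    records = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # csv.reader yields [] for blank lines
        if columns is None:
            if row[0] == "Time":
                columns = row  # the column-name row starts the data section
            else:
                metadata[row[0]] = row[1] if len(row) > 1 else ""
            continue
        # Values are matched to column names purely by position.
        records.append(dict(zip(columns, row)))
    return metadata, records


sample = """Project,EarthCube Portal
Site,NCAR Foothills Lab
Affiliation,NSF EarthCube
Instrument,FL Wx Station

Time,Wind Direction,Wind Speed,Wind Max,Temperature,Humidity,Pressure,Rain Total,Battery
2017-01-19T18:28:25Z,35.0,1.4,2.5,7.2,35.5,823.6,3.0,17.9
2017-01-19T18:33:25Z,42.0,1.6,3.0,7.5,34.5,823.6,3.0,17.9
"""

metadata, records = read_chords_csv(sample)
print(metadata["Site"])
print(records[0]["Wind Speed"])
```

Because plain CSV carries no type or unit information, every value comes back as a string, and the reader has to know out of band what each column means.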

GeoCSV

The best way to describe GeoCSV is in the words of its creators: “The GeoCSV design was developed within the GeoWS project as a way to provide a baseline of compatibility between tabular text data sets from various sub-domains in geoscience. Funded through NSF’s EarthCube initiative, the GeoWS project aims to develop common web service interfaces for data access across hydrology, geodesy, seismology, marine geophysics, atmospheric science and other areas.

The GeoCSV format is an essential part of delivering data via simple web services for discovery and utilization by both humans and machines. As most geoscience disciplines have developed and use data formats specific for their needs, tabular text data can play a key role as a lowest common denominator useful for exchanging and integrating data across sub-domains.

The design starts with a core definition compatible with best practices described by the W3C- CSV on the Web Working Group (CSVW). Compatibility with CSVW is intended to ensure the broadest usability of data expressed as GeoCSV. An optional, simple, but limited metadata description mechanism was added to allow inclusion of important metadata with comma separated data, while staying with the definition of a “dialect” by CSVW. The format is designed both for creating new datasets and to annotate data sets already in a tabular text format such that they are compliant with GeoCSV.” (quoted from http://data.geows.org/documents/IN11F-1809-geocsv_poster.pdf)

In the example below, GeoCSV shows the unit and type of each field and, where applicable, the website the information was retrieved from (CHORDS). It also shows where the data was collected (latitude and longitude), which device was used, and the units of measurement.

(Example)

Definitions:

  • UTF-8: the text encoding
  • Field_unit: units for each column of data
  • Field_type: types for each column, one of ‘string’, ‘integer’, ‘float’, ‘datetime’
  • Attribution: identifies attribution information, probably a URL

(For more information, visit http://geows.ds.iris.edu/documents/GeoCSV.pdf)

(Header)
# dataset: GeoCSV 2.0
# field_unit: UTF-8, UTF-8, degrees_north, degrees_east, meters, UTC, UTC
# field_type: datetime, float, float, float, float, float, float, float, float
# attribution: (website.)
# Axes: (latitude, longitude)
# device_information: (e.g. magnetometer (make: Geometrics model: G-882))
# Units of Measure: (degrees, meters, decimal, etc)


(Data)
Time, Wind Direction, Wind Speed, Wind Max, Temperature, Humidity, Pressure, Rain Total, Battery
2017-01-19T18:28:25Z, 35.0, 1.4, 2.5, 7.2, 35.5, 823.6, 3.0, 17.9
2017-01-19T18:33:25Z, 42.0, 1.6, 3.0, 7.5, 34.5, 823.6, 3.0, 17.9
2017-01-19T18:38:25Z, 55.0, 1.3, 2.8, 7.6, 34.0, 823.5, 3.0, 17.9
2017-01-19T18:43:25Z, 47.0, 1.3, 2.5, 7.6, 37.6, 823.5, 3.0, 17.9
2017-01-19T18:48:25Z, 60.0, 1.4, 2.8, 7.7, 38.4, 823.4, 3.0, 17.9
2017-01-19T18:53:25Z, 56.0, 1.3, 2.6, 7.7, 35.1, 823.3, 3.0, 17.9
2017-01-19T18:58:25Z, 78.0, 1.3, 2.2, 7.8, 34.6, 823.2, 3.0, 17.9
2017-01-19T19:03:25Z, 79.0, 1.2, 2.2, 8.2, 34.6, 823.1, 3.0, 17.9
2017-01-19T19:08:25Z, 68.0, 1.1, 2.2, 8.2, 36.0, 823.0, 3.0, 17.9
2017-01-19T19:13:25Z, 89.0, 1.2, 2.5, 8.5, 32.8, 823.0, 3.0, 17.9
2017-01-19T19:18:25Z, 108.0, 1.2, 2.1, 8.7, 32.3, 822.9, 3.0, 18.0

Note that the data values appear in the same order as the variables listed above them.
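The field_type line is what lets a program convert each column automatically. The following is a minimal sketch of a GeoCSV reader, assuming every line starting with '#' is a 'keyword: value' header line and the first remaining line holds the column names. The names parse_geocsv and CASTS are hypothetical, and the sample reuses only the dataset and field_type lines from the example above.

```python
import csv
from datetime import datetime, timezone

# Converters for the four field_type values listed in the definitions above.
CASTS = {
    "string": str,
    "integer": int,
    "float": float,
    "datetime": lambda s: datetime.strptime(s, "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    ),
}


def parse_geocsv(text):
    """Return (header, rows) from GeoCSV-style text.

    Assumption: '#' lines are 'keyword: value' header lines, and the first
    non-comment, non-blank line holds the column names.
    """
    header = {}
    body = []
    for line in text.splitlines():
        if line.startswith("#"):
            # Split on the first colon only, so values may contain colons (e.g. URLs).
            key, _, value = line[1:].partition(":")
            header[key.strip()] = value.strip()
        elif line.strip():
            body.append(line)

    # skipinitialspace drops the blank after each comma in "a, b, c".
    reader = csv.reader(body, skipinitialspace=True)
    columns = next(reader)
    types = [t.strip() for t in header.get("field_type", "").split(",") if t.strip()]
    if len(types) != len(columns):
        types = ["string"] * len(columns)  # fall back to untyped strings

    rows = [
        {
            name: CASTS.get(kind, str)(value)
            for name, kind, value in zip(columns, types, record)
        }
        for record in reader
    ]
    return header, rows


sample = """# dataset: GeoCSV 2.0
# field_type: datetime, float, float, float, float, float, float, float, float
Time, Wind Direction, Wind Speed, Wind Max, Temperature, Humidity, Pressure, Rain Total, Battery
2017-01-19T18:28:25Z, 35.0, 1.4, 2.5, 7.2, 35.5, 823.6, 3.0, 17.9
2017-01-19T18:33:25Z, 42.0, 1.6, 3.0, 7.5, 34.5, 823.6, 3.0, 17.9
"""

header, rows = parse_geocsv(sample)
print(header["dataset"])
print(rows[0]["Time"].isoformat(), rows[0]["Pressure"])
```

If the number of field_type entries does not match the number of columns, this sketch leaves every value as a string rather than guessing which type belongs to which column.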

Comparison

CSV Shows | GeoCSV Shows
Project | Field Unit (units for each column of data)
Site | Field Type (types for each column: ‘string’, ‘integer’, ‘float’, ‘datetime’)
Affiliation | Attribution (website)
What station the instrument is located at | Axes (latitude, longitude)
Variables input by the researchers | Device Information (what make/model)
 | Units of Measure
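Going from plain CSV to GeoCSV mostly means putting '# keyword: value' lines in front of the same comma-separated data. The following is a minimal sketch of that step. The function name to_geocsv is hypothetical, only three of the columns are shown to keep the sample short, and the header values are taken from the example earlier on this page.

```python
import csv
import io


def to_geocsv(header, columns, rows):
    """Render '# keyword: value' header lines followed by comma-separated data."""
    out = io.StringIO()
    for key, value in header.items():
        out.write(f"# {key}: {value}\n")
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return out.getvalue()


columns = ["Time", "Wind Direction", "Wind Speed"]
rows = [
    ["2017-01-19T18:28:25Z", "35.0", "1.4"],
    ["2017-01-19T18:33:25Z", "42.0", "1.6"],
]
header = {
    "dataset": "GeoCSV 2.0",
    "field_type": "datetime, float, float",
}

geocsv_text = to_geocsv(header, columns, rows)
print(geocsv_text)
```

Other keywords from the right-hand column, such as field_unit or attribution, could be added to the header dictionary in the same way once their values are known.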

Where to find CSV downloads in CHORDS

There are currently two places in the app where CSV files can be downloaded:

  • Individual instrument page. This CSV file (as well as the instrument page itself) is generated by the “show” function in the ./app/controllers/instruments_controller.rb controller.
  • The data download page. This “data” page is generated by ./app/controllers/data_controller.rb, which loads the view in ./app/views/data/index.html.haml. That view, in turn, links to the instruments controller (./app/controllers/instruments_controller.rb) to generate the CSV file containing the actual data.

Conclusion

In conclusion, GeoCSV helps us refine our data by exposing variables important to the geosciences that we may have missed when using plain CSV, such as axes, device information, and units of measure. With this, we can improve the data and documentation we acquire from real-time data and improve our methods based on them.
