Add vocab to support description of Weather #362
If we model forecasts as events, we can reduce a fair bit of the weight of this - there are lots of things we would inherit instead of repeating... |
/cc Jeremy Tandy @6a6d74 |
(thanks - I see the draft proposal) |
Saving a URL from chat w/ @6a6d74 - http://codes.wmo.int/common/quantity-kind |
@danbri I think the codes from that link could be integrated into schema:QuantitativeValue easily - e.g. by using the URLs of a unit (like http://codes.wmo.int/common/quantity-kind/_aerodromeMaximumWindGustSpeed) with http://schema.org/unitCode or by defining a prefix for the local name, like wmo:. |
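To make the two options in this comment concrete, here is a minimal sketch that builds a QuantitativeValue as a JSON-LD-style dictionary, once with the full WMO URL in unitCode and once with a `wmo:` prefix. The helper name `quantitative_value`, the numeric value, and the `wmo:` prefix convention are assumptions for illustration only; whether a quantity-kind URL is really suitable as a unit code is left open here.

```python
import json

# Base URL taken from the link saved above.
WMO_QUANTITY_KIND = "http://codes.wmo.int/common/quantity-kind/"


def quantitative_value(value, code, use_prefix=False):
    """Build a schema.org QuantitativeValue as a JSON-LD-style dict.

    `code` is the local name of a WMO term. With use_prefix=True the
    hypothetical "wmo:" prefix is used instead of the full URL.
    """
    unit_code = f"wmo:{code}" if use_prefix else WMO_QUANTITY_KIND + code
    return {
        "@context": "http://schema.org",
        "@type": "QuantitativeValue",
        "value": value,  # illustrative number, not real data
        "unitCode": unit_code,
    }


gust_url = quantitative_value(23.0, "_aerodromeMaximumWindGustSpeed")
gust_prefixed = quantitative_value(
    23.0, "_aerodromeMaximumWindGustSpeed", use_prefix=True
)

print(json.dumps(gust_url, indent=2))
print(json.dumps(gust_prefixed, indent=2))
```

The URL form is self-describing; the prefixed form is shorter but only works if consumers agree on what `wmo:` expands to.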
(disclosure: I work for Met Office)

Way back in Feb @danbri asked "what levels of weather data are interesting?" - citing everything from raw sensor metrics to decision support (e.g. do I need my umbrella today?). My gut feeling is that the sweet spot is the provision of a bunch of quantitative values for common weather properties at a given location and time (or time interval). This is typical of weather forecasts published on the web - I see this kind of information provided by many people, from National Meteorological Services such as my organisation (Met Office), to media organisations (such as BBC) and commercial weather service providers (such as Weather Company).

Also, I note that @danbri used the term “weather data”. I think we should be talking about both forecasts (future conditions) and observations (past conditions).

My assumption is that most people just want to know what the weather is going to be (or was) without needing to know how those numbers were determined. For those “power users” there are a number of rich semantic models that could be used to describe how the data was created. Examples include the Semantic Sensor Network Ontology. This kind of information is vital for applications like climate science where one needs to know what kind of procedure was used, what kind of sensor was used, how that sensor was calibrated etc. For the rest of us, it’s probably good enough just to have the values of the weather properties themselves and infer the quality of those values by way of which organisation published them.

If we considered the ‘weather report’ as a CreativeWork then the property isBasedOnUrl provides a candidate mechanism to refer from the simple “weather report” view to the rich structured data (so long as that structured data is also published on the web!).

I mentioned quality just now - and that we often infer this based on where something came from (e.g. the publishing organisation) … I suppose that it’s a bit like choosing a particular brand when buying new tyres for your car; there’s an implicit assumption that some tyres are better than others. Once again, if we consider the weather report (or collection of weather reports) as a CreativeWork, we can use terms such as publisher to attribute the source organisation. Similarly, we can use license to inform people about conditions of use (e.g. OGL).

Let’s start by looking at some examples - both taken from existing Met Office services.
The data underpinning these web pages is also available via the Met Office’s DataPoint API. You need to register to get an API key, but access is free … The basic pattern for a “weather report” (either forecast or observation) is:
This is similar to Event (in that it relates to a place and time) but the semantics don’t match; examples of Event are concerts, lectures and festivals. They are things people can go to for a period of time.

As an initial suggestion let’s assume that we create a new Type in schema.org:

Looking at existing temporal properties, I couldn’t see one that matched my needs - so think of

A weather report might also include textual content; e.g. a summary of weather expected for a region and/or time interval. Suggest property

So - weather properties. There are a lot of them. Those used in the examples above include:
Let’s assume that all of these properties are explicitly mentioned in schema.org - whose range is defined as either QuantitativeValue (for numeric values that need to express a unit of measure) or instances of Enumeration. It’s possible that a more soft-typed approach is more appropriate - e.g. where the property is referenced from some external controlled vocabulary. PropertyValue with property propertyID may provide a mechanism to do this. There may be a minimum (core) set of parameters that are used in most weather reports (e.g. air temperature, wind speed, wind direction, relative humidity, weather type and probability of precipitation) that could be defined in schema.org with others being defined externally - a mix and match approach? But, for now, let’s just assume that all the properties are defined in schema.org.

Given that each weather report relates to a specific location, it seems sensible that information about those locations is published once as “reference data” … then we can simply refer back to that information. Met Office doesn’t currently do this. Instead, it provides a gazetteer service to help people find the site that is closest to their point of interest. However, if we did publish site information, we might include structured markup as follows:

Clifton (Bristol):
Filton:
(apologies for errors … just trying to give the gist of things) When publishing observation and forecast data, Met Office bundles up individual Within the DataPoint API we refer to these collections as “data feeds”; we’re continually updating the content and exposing it through an API. The concept of data feeds maps well into schema.org (or, more specifically, into the (As an aside, it’s also worth noting that Met Office updates the forecast and observation data at hourly intervals which is faster, I imagine, than most search-engines will crawl the content - the content behaves like a continually refreshed data feed.) A data feed comprises instances of DataFeedItem, each of which would contain a A ‘data feed’ of weather observations might be structured like the example below; a truncated set of weather observations for Filton (not the full 24-hours) …
There is a short delay between the observation Note that the location for the weather report is specified as an external object: Also note that by default QuantitativeValue uses the UN/CEFACT Common Codes for units of measurement. Those used are enumerated below:
This is the first time I have come across these Common Codes. Within the earth science community the Unified Code for Units of Measure (UCUM), based on ISO 2955, has broad adoption. UCUM provides a basic set of symbols for units and an expression syntax for combining these symbols to obtain valid units for derived quantities. The symbols from UCUM are recognisable (e.g. metre is

Values of

Values of

Values of

(see below for more details about these enumerations)

A ‘data feed’ for a weather forecast might be structured like the example below; the first two time-steps of a 5-day weather forecast for Clifton …
Imagine that you got forecasts on Monday and Tuesday for the weather on Wednesday. Generally speaking, the forecast published on Tuesday will be more “accurate”. Here we can use the data feed item’s dateCreated value to determine which forecast is most recent, and hence may be considered the better one to use.

Meteorologists also talk about the “reference” or “analysis” time. This is the time for which the atmospheric conditions are calculated that provide the initial state used in the forecast. In this example, the reference time is the start value of the time interval provided for the datasetTimeInterval property (and is usually the first time-step in the forecast).

Values of

In this second example, we see

Finally, given the number of properties that are defined here, it might be pertinent to create a hosted extension for weather within schema.org … weather.schema.org anyone?

--

Enumerations:

Visibility Category - as defined by Met Office
UV Index - based on World Health Organisation's "GLOBAL SOLAR UV INDEX - A Practical Guide”
Pressure Tendency Category
Weather Type - as defined by Met Office
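
Returning to the Monday/Tuesday point made earlier in this comment, the dateCreated comparison can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `latest_forecast`, the shape of the dictionaries, and the dates are invented here and are not taken from DataPoint or from any agreed schema.org markup.

```python
from datetime import datetime


def latest_forecast(feed_items):
    """Return the feed item with the most recent dateCreated.

    Each item is assumed to carry an ISO 8601 dateCreated string with an
    explicit UTC offset, so the parsed values compare correctly.
    """
    return max(
        feed_items,
        key=lambda item: datetime.fromisoformat(item["dateCreated"]),
    )


# Two hypothetical DataFeedItem entries, both forecasting the same Wednesday.
feed_items = [
    {
        "@type": "DataFeedItem",
        "dateCreated": "2015-03-02T09:00:00+00:00",  # issued Monday
        "item": {"name": "Monday's forecast for Wednesday"},
    },
    {
        "@type": "DataFeedItem",
        "dateCreated": "2015-03-03T09:00:00+00:00",  # issued Tuesday
        "item": {"name": "Tuesday's forecast for Wednesday"},
    },
]

best = latest_forecast(feed_items)
print(best["item"]["name"])
```

Parsing the timestamps rather than comparing strings keeps the comparison correct if publishers use different UTC offsets.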
|
Jeremy: Just to pick one issue from your comprehensive email: If the UN/CEFACT Common Code does not work well for you, we / you could define a prefix for another established unit coding system and then use the other codes. This is already supported as of now via http://schema.org/unitCode. "The unit of measurement given using the UN/CEFACT Common Code (3 characters) or a URL. Other codes than the UN/CEFACT Common Code may be used with a prefix followed by a colon." So we could simply add a line to the description of unitCode like "For Unified Code for Units of Measure (UCUM) unit codes, use the prefix 'ucum:', like so: ucum:hPa". It will make the publication of data simpler but the consumption more difficult, so the major consumers of schema.org data should have a say on this, too. Martin martin hepp http://www.heppnetz.de
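
To illustrate the extra work on the consuming side, here is a minimal sketch of how a consumer might tell apart the three forms of unitCode quoted above (a bare UN/CEFACT code, a URL, or a prefixed code). The function name `classify_unit_code` and the returned labels are assumptions for illustration and are not defined by schema.org.

```python
def classify_unit_code(unit_code):
    """Split a schema.org unitCode value into (scheme, code).

    - URLs are returned whole with the scheme label "url".
    - "prefix:code" values are split at the first colon.
    - Anything else is assumed to be a bare UN/CEFACT Common Code.
    """
    if unit_code.startswith(("http://", "https://")):
        return ("url", unit_code)
    if ":" in unit_code:
        prefix, _, code = unit_code.partition(":")
        return (prefix, code)
    return ("un/cefact", unit_code)


print(classify_unit_code("ucum:hPa"))
print(classify_unit_code("CEL"))
print(classify_unit_code("http://qudt.org/vocab/unit#Meter"))
```

The URL check has to come first, because a URL also contains a colon and would otherwise be misread as a prefixed code.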
|
Hi Martin- thanks for pointing this out. I did see that I could prefix the unit code - but I wasn't sure how anyone would know how to resolve I also note that one could use a URL to denote the unit of measurement ... this could be used to refer to terms from the QUDT Unit Vocabulary such as http://qudt.org/vocab/unit#Meter ... albeit that URLs are a bit longer than just the unit of measure symbol itself. Thanks! |
As this subject has just come up in discussions recently, just warehousing some links to relevant resources for when and if we pivot back around to this. National Digital Forecast Database XML/SOAP Service - NOAA's National Weather Service; Extensible_Markup_Language.pdf (for the service above); Meteorological Data and XML Software (The European Centre for Medium-Range Weather Forecasts (ECMWF)) |
This issue is being tagged as Stale due to inactivity. |
Bad bot. |
There was an incomplete proposal for Weather schemas, originally from a team at Microsoft. A rough draft is in the W3C Mercurial repo, https://dvcs.w3.org/hg/webschema/file/b4c3ad199322/schema.org/ext/weatherforecast.html, and deserves recording here and revisiting, as the topic comes up periodically.
What levels of detail for weather data are interesting?
See also https://github.com/w3c/csvw/blob/gh-pages/examples/rdf-data-cube-example.md which examines datacube/csvw representations for weather datasets.