Add vocab to support description of Weather #362

Open
danbri opened this issue Feb 25, 2015 · 12 comments
Labels

  • no-issue-activity: Discuss has gone quiet. Auto-tagging to encourage people to re-engage with the issue (or close it!).
  • schema.org vocab: General top level tag for issues on the vocabulary
  • status:work expected: We are likely to, or would like to, or probably should try, ... to do something in this area.

Comments

@danbri
Contributor

danbri commented Feb 25, 2015

There was an incomplete proposal for Weather schemas, originally from a team at Microsoft. A rough draft is in the W3C Mercurial repo (https://dvcs.w3.org/hg/webschema/file/b4c3ad199322/schema.org/ext/weatherforecast.html) but deserves recording here and revisiting, as the topic comes up periodically.

What levels of detail for weather data are interesting?

  • raw sensor metrics
    • consumer-oriented predictions
    • umbrellaNeededToday?

See also https://github.com/w3c/csvw/blob/gh-pages/examples/rdf-data-cube-example.md which examines datacube/csvw representations for weather datasets.

@danbri danbri added the schema.org vocab General top level tag for issues on the vocabulary label Feb 25, 2015
@chaals
Contributor

chaals commented Mar 14, 2015

If we model forecasts as events, we can reduce a fair bit of the weight of this - there are lots of things we would inherit instead of repeating...

@danbri danbri added this to the 2015 Q2 milestone Mar 16, 2015
@danbri danbri added type:enhancement status:work expected We are likely to, or would like to, or probably should try, ... to do something in this area. labels Mar 16, 2015
@danbri
Contributor Author

danbri commented Mar 16, 2015

/cc Jeremy Tandy @6a6d74

@6a6d74

6a6d74 commented Mar 16, 2015

(thanks - I see the draft proposal)

@danbri danbri modified the milestones: 2015 Q2, 2015 Q1 Apr 17, 2015
@danbri
Contributor Author

danbri commented Nov 4, 2015

Saving a URL from chat w/ @6a6d74 - http://codes.wmo.int/common/quantity-kind

@mfhepp
Contributor

mfhepp commented Nov 4, 2015

@danbri I think the codes from that link could be integrated into schema:QuantitativeValue easily - e.g. by using the URLs of a unit (like http://codes.wmo.int/common/quantity-kind/_aerodromeMaximumWindGustSpeed) with http://schema.org/unitCode or by defining a prefix for the local name, like wmo:.

@6a6d74

6a6d74 commented Nov 5, 2015

(disclosure: I work for Met Office)

Way back in Feb @danbri asked "what levels of weather data are interesting?" - citing everything from raw sensor metrics to decision support (e.g. do I need my umbrella today?).

My gut feeling is that the sweet spot is the provision of a bunch of quantitative values for common weather properties at a given location and time (or time interval).

This is typical of weather forecasts published on the web - I see this kind of information provided by many people, from National Meteorological Services such as my organisation (Met Office), to media organisations (such as BBC) and commercial weather service providers (such as Weather Company).

Also, I note that @danbri used the term “weather data”. I think we should be talking about both forecasts (future conditions) and observations (past conditions).

My assumption is that most people just want to know what the weather is going to be (or was) without needing to know how those numbers were determined.

For those “power users” there are a number of rich semantic models that could be used to describe how the data was created. Examples include the Semantic Sensor Network Ontology. This kind of information is vital for applications like climate science where one needs to know what kind of procedure was used, what kind of sensor was used, how that sensor was calibrated etc.

For the rest of us, it’s probably good enough just to have the values of the weather properties themselves and infer the quality of those values by way of which organisation published them. If we considered the ‘weather report’ as a CreativeWork then the property isBasedOnUrl provides a candidate mechanism to refer from the simple “weather report” view to the rich structured data (so long as that structured data is also published on the web!).

I mentioned quality just now - and that we often infer this based on where something came from (e.g. the publishing organisation) … I suppose that it’s a bit like choosing a particular brand when buying new tyres for your car; there’s an implicit assumption that some tyres are better than others.

Once again, if we consider the weather report (or collection of weather reports) as a CreativeWork, we can use terms such as publisher to attribute the source organisation. Similarly, we can use license to inform people about conditions of use (e.g. OGL).

Let’s start by looking at some examples - both taken from existing Met Office services.

  • 5-day weather forecast for Clifton (Bristol) with data provided at 3-hourly intervals: http://www.metoffice.gov.uk/public/weather/observation/gcnhu2w6v
  • Most recent weather observations (24-hours) for Filton (South Gloucestershire - and the closest observation site to Bristol) with data provided at 1-hourly intervals: http://www.metoffice.gov.uk/public/weather/observation/gcnjj690k

The data underpinning these web pages is also available via the Met Office’s DataPoint API. You need to register to get an API key, but access is free …

The basic pattern for a “weather report” (either forecast or observation) is:

  • specific location
  • specific time (or time interval)
  • set of values for atmospheric phenomena - both quantitative (numbers with units of measurement) and nominal (types, categories etc.).
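
To make the three-part pattern concrete, here is a minimal Python sketch of how a consumer might hold one report in memory. The class and field names (Quantity, WeatherReport, values) are my own illustration, not part of any schema.org proposal, and the sample values are borrowed from the Filton observation example further down.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Quantity:
    """A number with a unit of measurement (hypothetical helper type)."""
    value: float
    unit_code: str  # e.g. a UN/CEFACT Common Code such as "CEL"


@dataclass
class WeatherReport:
    """One report: a location, a time, and a set of property values."""
    location: str  # identifier (e.g. a URL) of the place the report is about
    time: str      # ISO 8601 instant or interval
    # property name -> quantitative value, or a nominal category code
    values: Dict[str, Union[Quantity, str]] = field(default_factory=dict)

    def quantitative(self) -> Dict[str, Quantity]:
        return {k: v for k, v in self.values.items() if isinstance(v, Quantity)}

    def nominal(self) -> Dict[str, str]:
        return {k: v for k, v in self.values.items() if isinstance(v, str)}


report = WeatherReport(
    location="http://data.metoffice.gov.uk/sites/loc/3628",
    time="2015-11-03T18:00:00Z",
    values={"airTemperature": Quantity(11.4, "CEL"), "windDirection": "ENE"},
)
print(sorted(report.quantitative()), sorted(report.nominal()))
```

Keeping quantitative and nominal values in one mapping mirrors the fact that some properties (visibility, for instance) can turn up in either form.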

This is similar to Event (in that it relates to a place and time) but the semantics don’t match; examples of Event are concerts, lectures and festivals. They are things people can go to for a period of time.

As an initial suggestion, let’s assume that we create a new Type in schema.org: WeatherReport, which specialises StructuredValue (and so, in turn, Intangible and Thing). Properties of WeatherReport would include spatial, time and a bunch of weather properties (more on those later).

Looking at existing temporal properties, I couldn’t see one that matched my needs - so think of time as a placeholder; a first guess :-)

A weather report might also include textual content; e.g. a summary of weather expected for a region and/or time interval. Suggest property summary of Type Text.

So - weather properties. There are a lot of them. Those used in the examples above include:

  • wind direction
  • “feels like temperature” - a temperature adjusted for wind-chill (and other factors)
  • wind gust speed
  • relative humidity
  • probability of precipitation
  • wind speed
  • air temperature
  • dew point temperature
  • visibility
  • pressure
  • pressure tendency
  • solar UV Index (and exposure)
  • weather type

Let’s assume that all of these properties are explicitly defined in schema.org, each with a range of either QuantitativeValue (for numeric values that need to express a unit of measure) or an Enumeration.

It’s possible that a more soft-typed approach is more appropriate - e.g. where the property is referenced from some external controlled vocabulary. PropertyValue with property propertyID may provide a mechanism to do this. There may be a minimum (core) set of parameters that are used in most weather reports (e.g. air temperature, wind speed, wind direction, relative humidity, weather type and probability of precipitation) that could be defined in schema.org with others being defined externally - a mix and match approach? But, for now, let’s just assume that all the properties are defined in schema.org.
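
As a rough illustration of the mix and match idea, the Python sketch below builds a report in which the core parameters are ordinary properties and anything else is carried as a PropertyValue keyed by propertyID. WeatherReport, the core property names and the weatherProperty key are all assumptions made for this sketch (none exist in schema.org today); PropertyValue, propertyID, value and unitCode are existing schema.org terms, and the WMO URL is the one quoted earlier in this thread.

```python
import json

# Hypothetical "core" set, taken from the parameters listed in the paragraph above.
CORE_PROPERTIES = {
    "airTemperature", "windSpeed", "windDirection",
    "relativeHumidity", "weatherType", "precipitationProbability",
}


def build_report(core, external):
    """core: property name -> JSON-LD value.
    external: list of (property_id_url, value, unit_code) tuples."""
    unknown = set(core) - CORE_PROPERTIES
    if unknown:
        raise ValueError("not core properties: %s" % sorted(unknown))
    report = {"@context": "http://schema.org", "@type": "WeatherReport"}
    report.update(core)
    # "weatherProperty" is a made-up property name, used only for illustration.
    report["weatherProperty"] = [
        {"@type": "PropertyValue", "propertyID": pid, "value": value, "unitCode": unit}
        for pid, value, unit in external
    ]
    return report


report = build_report(
    core={"windDirection": "SW"},
    external=[(
        "http://codes.wmo.int/common/quantity-kind/_aerodromeMaximumWindGustSpeed",
        "16",
        "HM",
    )],
)
print(json.dumps(report, indent=2))
```

The trade-off is the usual one: consumers only need to understand the small core set, but anything soft-typed requires a lookup against the external vocabulary.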

Given that each weather report relates to a specific location, it seems sensible that information about those locations is published once as “reference data” … then we can simply refer back to that information. Met Office doesn’t currently do this. Instead, it provides a gazetteer service to help people find the site that is closest to their point of interest. However, if we did publish site information, we might include structured markup as follows:

Clifton (Bristol):

{
  "@context": "http://schema.org",
  "@id": "http://data.metoffice.gov.uk/sites/loc/350953",
  "@type": "Place",
  "geo": {
    "@type": "GeoCoordinates",
    "latitude": "51.4632",
    "longitude": "-2.6169",
    "elevation": "64.0"
  },
  "name": "CLIFTON (BRISTOL)",
  "address": {
    "@type": "PostalAddress",
    "addressLocality": "Clifton",
    "addressRegion": "Bristol",
    "addressCountry": "GB"
  }
}

Filton:

{
  "@context": "http://schema.org",
  "@id": "http://data.metoffice.gov.uk/sites/loc/3628",
  "@type": "Place",
  "geo": {
    "@type": "GeoCoordinates",
    "latitude": "51.521",
    "longitude": "-2.576",
    "elevation": "59.0"
  },
  "name": "FILTON",
  "address": {
    "@type": "PostalAddress",
    "addressLocality": "Filton",
    "addressRegion": "South Gloucestershire",
    "addressCountry": "GB"
  }
}

(apologies for errors … just trying to give the gist of things)
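
If site reference data like this were published, the "find the closest site" lookup could be done client-side. The Python sketch below is only a guess at how that might look: it uses the two sites above and a great-circle (haversine) distance, which is my choice of technique and not a description of how the Met Office gazetteer works. The function names are hypothetical.

```python
import math

# Site reference data copied from the two Place examples above.
SITES = {
    "http://data.metoffice.gov.uk/sites/loc/350953": ("CLIFTON (BRISTOL)", 51.4632, -2.6169),
    "http://data.metoffice.gov.uk/sites/loc/3628": ("FILTON", 51.521, -2.576),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km, assuming a spherical Earth of radius 6371 km."""
    r = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def nearest_site(lat, lon):
    """Return the @id of the site closest to the given point."""
    return min(SITES, key=lambda sid: haversine_km(lat, lon, SITES[sid][1], SITES[sid][2]))


print(nearest_site(51.45, -2.60))
```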

When publishing observation and forecast data, Met Office bundles up individual WeatherReport instances (each for a specific time & location) into collections. These collections might be organised by location (e.g. a set of values for different times at the same location), by time (e.g. a set of values for different locations at the same time) or neither (e.g. a set of values for multiple locations and times).

Within the DataPoint API we refer to these collections as “data feeds”; we’re continually updating the content and exposing it through an API.

The concept of data feeds maps well into schema.org (or, more specifically, into the DataFeed proposal made in ISSUE #688 ).

(As an aside, it’s also worth noting that Met Office updates the forecast and observation data at hourly intervals which is faster, I imagine, than most search-engines will crawl the content - the content behaves like a continually refreshed data feed.)

A data feed comprises instances of DataFeedItem, each of which would contain a WeatherReport.

A ‘data feed’ of weather observations might be structured like the example below; a truncated set of weather observations for Filton (not the full 24-hours) …

{
  "@context": "http://schema.org",
  "@type": "DataFeed",
  "about": "Weather observations for Filton, South Gloucestershire, UK (last 24-hours)",
  "publisher": {
    "@type": "Organization",
    "name": "Met Office",
    "url": "http://www.metoffice.gov.uk"
  },
  "license": "http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/",
  "datasetTimeInterval": "2015-11-03T18:00:00Z/P1D",
  "spatial": { "@id": "http://data.metoffice.gov.uk/sites/loc/3628" },
  "dataFeedElement": [
    {
      "@type": "DataFeedItem",
      "dateCreated": "2015-11-03T18:05:09Z",
      "item": {
        "@type": "WeatherReport",
        "spatial": { "@id": "http://data.metoffice.gov.uk/sites/loc/3628" },
        "time": "2015-11-03T18:00:00Z",
        "windDirection": "ENE",
        "relativeHumidity": { "@type": "QuantitativeValue", "value": "100", "unitCode": "P1" },
        "atmosphericPressure": { "@type": "QuantitativeValue", "value": "1012", "unitCode": "A97" },
        "windSpeed": { "@type": "QuantitativeValue", "value": "6", "unitCode": "HM" },
        "airTemperature": { "@type": "QuantitativeValue", "value": "11.4", "unitCode": "CEL" },
        "visibility": { "@type": "QuantitativeValue", "value": "2000", "unitCode": "MTR" },
        "weatherType": "12",
        "pressureTendency": "F",
        "dewpointTemp": { "@type": "QuantitativeValue", "value": "11.4", "unitCode": "CEL" }
      }
    }, {
      "@type": "DataFeedItem",
      "dateCreated": "2015-11-03T19:05:33Z",
      "item": {
        "@type": "WeatherReport",
        "spatial": { "@id": "http://data.metoffice.gov.uk/sites/loc/3628" },
        "time": "2015-11-03T19:00:00Z",
        "windDirection": "ENE",
        "relativeHumidity": { "@type": "QuantitativeValue", "value": "100", "unitCode": "P1" },
        "atmosphericPressure": { "@type": "QuantitativeValue", "value": "1012", "unitCode": "A97" },
        "windSpeed": { "@type": "QuantitativeValue", "value": "5", "unitCode": "HM" },
        "airTemperature": { "@type": "QuantitativeValue", "value": "11.8", "unitCode": "CEL" },
        "visibility": { "@type": "QuantitativeValue", "value": "3600", "unitCode": "MTR" },
        "weatherType": "15",
        "pressureTendency": "F",
        "dewpointTemp": { "@type": "QuantitativeValue", "value": "11.8", "unitCode": "CEL" }
      }
    }
  ]
}

There is a short delay between the observation time and the data feed item dateCreated. It takes a moment or two for the information to percolate through the systems. The delay here is illustrative (I made the numbers up!).
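
For anyone wanting to inspect that delay programmatically, here is a small Python sketch. It assumes the feed has already been parsed into a dict shaped like the example above (only the fields it needs are reproduced here); the helper names are mine.

```python
from datetime import datetime


def parse_utc(timestamp):
    """Parse an ISO 8601 timestamp that uses a trailing 'Z' for UTC."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def publication_delays(feed):
    """Return (observation time, delay in seconds) for each item in the feed."""
    delays = []
    for element in feed["dataFeedElement"]:
        observed = parse_utc(element["item"]["time"])
        created = parse_utc(element["dateCreated"])
        delays.append((element["item"]["time"], (created - observed).total_seconds()))
    return delays


# Cut-down copy of the observation feed above.
feed = {
    "dataFeedElement": [
        {"dateCreated": "2015-11-03T18:05:09Z", "item": {"time": "2015-11-03T18:00:00Z"}},
        {"dateCreated": "2015-11-03T19:05:33Z", "item": {"time": "2015-11-03T19:00:00Z"}},
    ]
}
delays = publication_delays(feed)
print(delays)
```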

Note that the location for the weather report is specified as a reference to an external object: { "@id": "http://data.metoffice.gov.uk/sites/loc/3628" }. Two caveats: (i) this is a fictitious URL, and (ii) this is my guess at how to link to the location reference data I talked about earlier.

Also note that by default QuantitativeValue uses the UN/CEFACT Common Codes for units of measurement. Those used are enumerated below:

  • MTR = metre
  • HM = mile per hour
  • P1 = percent
  • CEL = degree Celsius
  • A97 = hectopascal

This is the first time I have come across these Common Codes. Within the earth science community the Unified Code for Units of Measure (UCUM), based on ISO 2955, has broad adoption. UCUM provides a basic set of symbols for units and an expression syntax for combining these symbols to obtain valid units for derived quantities. The symbols from UCUM are recognisable (e.g. metre is m, hectopascal is hPa) and don’t require a table lookup.
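
To show what the table lookup amounts to, here is a tiny Python sketch that maps the five Common Codes used above to UCUM symbols. The UCUM symbols are my own reading of the UCUM tables (in particular "[mi_i]/h" for mile per hour and "Cel" for degree Celsius), so treat them as an assumption to check; the function name is made up.

```python
# UN/CEFACT Common Code -> UCUM symbol, for the five codes used in the examples.
CEFACT_TO_UCUM = {
    "MTR": "m",         # metre
    "HM": "[mi_i]/h",   # mile per hour (international mile)
    "P1": "%",          # percent
    "CEL": "Cel",       # degree Celsius
    "A97": "hPa",       # hectopascal
}


def ucum_symbol(cefact_code):
    """Look up the UCUM symbol for a Common Code; fail loudly if unmapped."""
    try:
        return CEFACT_TO_UCUM[cefact_code]
    except KeyError:
        raise KeyError("no UCUM mapping recorded for %r" % cefact_code)


pressure = {"@type": "QuantitativeValue", "value": "1012", "unitCode": "A97"}
print(pressure["value"], ucum_symbol(pressure["unitCode"]))
```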

Values of windDirection are from the 16-point compass: N, NNE, NE, ENE, E, ESE, SE, SSE, S, SSW, SW, WSW, W, WNW, NW, NNW.
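
Since the 16 points are evenly spaced at 22.5 degrees, converting between the codes and bearings is straightforward. A minimal Python sketch (function names are mine; bearings are measured clockwise from north):

```python
COMPASS_POINTS = [
    "N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
    "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW",
]
STEP = 360.0 / len(COMPASS_POINTS)  # 22.5 degrees per point


def compass_to_degrees(point):
    """Central bearing of a 16-point compass code, in degrees from north."""
    return COMPASS_POINTS.index(point) * STEP


def degrees_to_compass(degrees):
    """Nearest 16-point compass code for a bearing in degrees."""
    return COMPASS_POINTS[int((degrees % 360) / STEP + 0.5) % len(COMPASS_POINTS)]


print(compass_to_degrees("ENE"), degrees_to_compass(70))
```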

Values of pressureTendency (change in atmospheric pressure) indicate rising, falling or steady: R, F, S.

Values of weatherType are taken from the list specified by Met Office. The World Meteorological Organization defines a standard set of weather types - but this is much more specific and not, I think, suitable for usage with the general public. As such, it is likely that each publisher will have their own set of weather types. How would this be managed in schema.org?

(see below for more details about these enumerations)

A ‘data feed’ for a weather forecast might be structured like the example below; the first two time-steps of a 5-day weather forecast for Clifton …

{
  "@context": "http://schema.org",
  "@type": "DataFeed",
  "about": "Weather forecast for Clifton, Bristol, UK",
  "publisher": {
    "@type": "Organization",
    "name": "Met Office",
    "url": "http://www.metoffice.gov.uk"
  },
  "license": "http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/",
  "datasetTimeInterval": "2015-11-04T18:00:00Z/P5D",
  "spatial": { "@id": "http://data.metoffice.gov.uk/sites/loc/350953" },
  "dataFeedElement": [
    {
      "@type": "DataFeedItem",
      "dateCreated": "2015-11-04T18:22:14Z",
      "item": {
        "@type": "WeatherReport",
        "spatial": { "@id": "http://data.metoffice.gov.uk/sites/loc/350953" },
        "time": "2015-11-04T18:00:00Z",
        "windDirection": "SW",
        "feelsLikeTemp": { "@type": "QuantitativeValue", "value": "12", "unitCode": "CEL" },
        "windGustSpeed": { "@type": "QuantitativeValue", "value": "16", "unitCode": "HM" },
        "relativeHumidity": { "@type": "QuantitativeValue", "value": "90", "unitCode": "P1" },
        "precipitationProbability": { "@type": "QuantitativeValue", "value": "24", "unitCode": "P1" },
        "windSpeed": { "@type": "QuantitativeValue", "value": "11", "unitCode": "HM" },
        "airTemperature": { "@type": "QuantitativeValue", "value": "14", "unitCode": "CEL" },
        "visibility": "GO",
        "solarUVExposure": "LO",
        "weatherType": "8"
      }
    }, {
      "@type": "DataFeedItem",
      "dateCreated": "2015-11-04T18:22:14Z",
      "item": {
        "@type": "WeatherReport",
        "spatial": { "@id": "http://data.metoffice.gov.uk/sites/loc/350953" },
        "time": "2015-11-04T21:00:00Z",
        "windDirection": "SSW",
        "feelsLikeTemp": { "@type": "QuantitativeValue", "value": "12", "unitCode": "CEL" },
        "windGustSpeed": { "@type": "QuantitativeValue", "value": "16", "unitCode": "HM" },
        "relativeHumidity": { "@type": "QuantitativeValue", "value": "94", "unitCode": "P1" },
        "precipitationProbability": { "@type": "QuantitativeValue", "value": "10", "unitCode": "P1" },
        "windSpeed": { "@type": "QuantitativeValue", "value": "7", "unitCode": "HM" },
        "airTemperature": { "@type": "QuantitativeValue", "value": "13", "unitCode": "CEL" },
        "visibility": "MO",
        "solarUVExposure": "LO",
        "weatherType": "7"
      }
    }
  ]
}

Imagine that you got forecasts on Monday and Tuesday for the weather on Wednesday. Generally speaking, the forecast published on Tuesday will be more “accurate”. Here we can use the data feed item’s dateCreated value to determine which forecast is most recent, and hence may be considered the better one to use.
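
The Monday/Tuesday case can be sketched in Python as below: for each forecast time, keep the item with the latest dateCreated. The item shape follows the DataFeedItem examples above; the temperatures and issue times are invented for illustration, and the function name is mine.

```python
from datetime import datetime


def parse_utc(timestamp):
    """Parse an ISO 8601 timestamp that uses a trailing 'Z' for UTC."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def latest_per_valid_time(elements):
    """Map each forecast time to the most recently created DataFeedItem for it."""
    best = {}
    for element in elements:
        valid_time = element["item"]["time"]
        current = best.get(valid_time)
        if current is None or parse_utc(element["dateCreated"]) > parse_utc(current["dateCreated"]):
            best[valid_time] = element
    return best


# Two forecasts for Wednesday noon, issued on Monday and Tuesday (made-up values).
elements = [
    {"dateCreated": "2015-11-02T18:22:14Z",
     "item": {"time": "2015-11-04T12:00:00Z", "airTemperature": "12"}},
    {"dateCreated": "2015-11-03T18:22:14Z",
     "item": {"time": "2015-11-04T12:00:00Z", "airTemperature": "14"}},
]
chosen = latest_per_valid_time(elements)["2015-11-04T12:00:00Z"]
print(chosen["dateCreated"], chosen["item"]["airTemperature"])
```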

Meteorologists also talk about the “reference” or “analysis” time. This is the time for which the atmospheric conditions are calculated that provide the initial state used in the forecast. In this example, the reference time is the start value of the time interval provided for the datasetTimeInterval property (and is usually the first time-step in the forecast).
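
A short Python sketch of pulling the reference time out of datasetTimeInterval follows. It only handles the start/duration form used in these examples, with a whole number of days (e.g. P5D); a general ISO 8601 interval parser would need more than this. The function name is hypothetical.

```python
import re
from datetime import datetime, timedelta


def parse_interval(interval):
    """Split 'start/PnD' into (start, end) datetimes.

    Only whole-day durations are supported in this sketch.
    """
    start_text, duration_text = interval.split("/")
    start = datetime.fromisoformat(start_text.replace("Z", "+00:00"))
    match = re.fullmatch(r"P(\d+)D", duration_text)
    if match is None:
        raise ValueError("unsupported duration: %r" % duration_text)
    return start, start + timedelta(days=int(match.group(1)))


reference_time, end_time = parse_interval("2015-11-04T18:00:00Z/P5D")
print(reference_time.isoformat(), end_time.isoformat())
```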

Values of solarUVExposure are derived from the World Health Organisation's “GLOBAL SOLAR UV INDEX - A Practical Guide”: LO, MO, HI, VH, EX.

In this second example, we see visibility provided as a nominal value (category) rather than a quantitative value: UN, VP, PO, MO, GO, VG, EX. There may be other properties with this duality.

Finally, given the number of properties that are defined here, it might be pertinent to create a hosted extension for weather within schema.org … weather.schema.org anyone?

——

Enumerations:

Visibility Category - as defined by Met Office

Value Description
UN    Unknown
VP    Very poor - Less than 1 km
PO    Poor - Between 1-4 km
MO    Moderate - Between 4-10 km
GO    Good - Between 10-20 km
VG    Very good - Between 20-40 km
EX    Excellent - More than 40 km
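
Given the duality noted earlier (visibility as metres in observations, as a category in forecasts), a consumer might want to reduce both to the category. The Python sketch below does that using the bounds in the table above. The table does not say which category a boundary value such as exactly 4 km belongs to, so treating each lower bound as inclusive is my assumption.

```python
# (upper bound in km, category code), in ascending order, from the table above.
VISIBILITY_BOUNDS = [(1, "VP"), (4, "PO"), (10, "MO"), (20, "GO"), (40, "VG")]


def visibility_category(metres):
    """Map a visibility distance in metres to a Met Office category code.

    None maps to "UN" (unknown). Lower bounds are treated as inclusive.
    """
    if metres is None:
        return "UN"
    km = metres / 1000.0
    for upper_km, code in VISIBILITY_BOUNDS:
        if km < upper_km:
            return code
    return "EX"


# The two Filton observations above report 2000 m and 3600 m.
print(visibility_category(2000), visibility_category(3600))
```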

UV Index - based on World Health Organisation's “GLOBAL SOLAR UV INDEX - A Practical Guide”

Value Index Exposure
LO    0-2   Low
MO    3-5   Moderate
HI    6-7   High
VH    8-10  Very high
EX    11+   Extreme

Pressure Tendency Category

Value Description
F     Falling
S     Steady
R     Rising

Weather Type - as defined by Met Office

Value Description
NA    Not available
0     Clear night
1     Sunny day
2     Partly cloudy (night)
3     Partly cloudy (day)
4     Not used
5     Mist
6     Fog
7     Cloudy
8     Overcast
9     Light rain shower (night)
10    Light rain shower (day)
11    Drizzle
12    Light rain
13    Heavy rain shower (night)
14    Heavy rain shower (day)
15    Heavy rain
16    Sleet shower (night)
17    Sleet shower (day)
18    Sleet
19    Hail shower (night)
20    Hail shower (day)
21    Hail
22    Light snow shower (night)
23    Light snow shower (day)
24    Light snow
25    Heavy snow shower (night)
26    Heavy snow shower (day)
27    Heavy snow
28    Thunder shower (night)
29    Thunder shower (day)
30    Thunder

@mfhepp
Contributor

mfhepp commented Nov 5, 2015

Jeremy: Just to pick one issue from your comprehensive email: If the UN/CEFACT Common Code does not work well for you, we / you could define a prefix for another established unit coding system and then use the other codes.

This is already supported as of now via http://schema.org/unitCode.

"The unit of measurement given using the UN/CEFACT Common Code (3 characters) or a URL. Other codes than the UN/CEFACT Common Code may be used with a prefix followed by a colon."

So we could simply add a line to the description of unitCode like "For Unified Code for Units of Measure (UCUM) unit codes, use the prefix 'ucum:', like so: ucum:hPa".

It will make the publication of data simpler but the consumption more difficult, so the major consumers of schema.org data should have a word on this, too.
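
To give a feel for the extra work on the consuming side, here is a rough Python sketch of how a consumer might tell the three forms of unitCode apart. The "ucum" prefix is the one suggested above and is not (yet) defined anywhere; the function name and the returned labels are made up for illustration.

```python
def classify_unit_code(unit_code):
    """Return (coding system, code) for a schema.org unitCode value.

    Three cases: a URL, a 'prefix:code' pair, or a bare UN/CEFACT Common Code.
    """
    if unit_code.startswith(("http://", "https://")):
        return ("url", unit_code)
    if ":" in unit_code:
        prefix, code = unit_code.split(":", 1)
        return (prefix, code)
    return ("un/cefact", unit_code)


for example in ("A97", "ucum:hPa", "http://codes.wmo.int/common/quantity-kind"):
    print(classify_unit_code(example))
```

Every additional prefix means another code list the consumer has to know about, which is the cost being weighed here.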

Martin


martin hepp http://www.heppnetz.de
mhepp@computer.org @mfhepp

On 05 Nov 2015, at 16:47, Jeremy Tandy notifications@github.com wrote:

(disclosure: I work for Met Office)

Way back in Feb @danbri asked "what levels of weather data are interesting?" - citing everything from raw sensor metrics to decision support (e.g. do I need my umbrella today?).

My gut feeling is that the sweet spot is the provision of a bunch of quantitative values for common weather properties at a given location and time (or time interval).

This is typical of weather forecasts published on the web - I see this kind of information provided by many people, from National Meteorological Services such as my organisation (Met Office), to media organisations (such as BBC) and commercial weather service providers (such as Weather Company).

Also, I note that @danbri used the term “weather data”. I think we should be talking about both forecasts (future conditions) and observations (past conditions).

My assumption is that most people just want to know what the weather is going to be (or was) without needing to know how those numbers were determined.

For those “power users” there are a number of rich semantic models that could be used to describe how the data was created. Examples include the Semantic Sensor Network Ontology. This kind of information is vital for applications like climate science where one needs to know what kind of procedure was used, what kind of sensor was used, how that sensor was calibrated etc.

For the rest of us, it’s probably good enough just to have the values of the weather properties themselves and infer the quality of those values by way of which organisation published them. If we considered the ‘weather report’ as a CreativeWork then the property isBasedOnUrl provides a candidate mechanism to refer from the simple “weather report” view to the rich structured data (so long as that structured data is also published on the web!).

I mentioned quality just now - and that we often infer this based on where something came from (e.g. the publishing organisation) … I suppose that it’s a bit like choosing a particular brand when buying new tyres for your car; there’s an implicit assumption that some tyres are better than others.

Once again, if we consider the weather report (or collection of weather reports) as a CreativeWork, we can use terms such as publisher to attribute the source organisation. Similarly, we can use license to inform people about conditions of use (e.g. OGL).

Let’s start by looking at some examples - both taken from existing Met Office services.

5-day weather forecast for Clifton (Bristol) with data provided at 3-hourly intervals: http://www.metoffice.gov.uk/public/weather/observation/gcnhu2w6v

Most recent weather observations (24-hours) for Filton (South Gloucestershire - and the closest observation site to Bristol) with data provided at 1-hourly intervals: http://www.metoffice.gov.uk/public/weather/observation/gcnjj690k

The data underpinning these web pages is also available via the Met Office’s DataPoint API. You need to register to get an API, but access is free …

The basic pattern for a “weather report” (either forecast or observation) is:

• specific location
• specific time (or time interval)
• set of values for atmospheric phenomena - both quantitative (numbers with units of measurement) and nominal (types, categories etc.).
This is similar to Event (in that it relates to a place and time) but the semantics don’t match; example of Event are concerts, lectures and festivals. They are things people can go to for a period of time.

As an initial suggestion lets assume that we create a new Type in schema.org: WeatherReport which specialises StructuredValue, Intangible and Thing respectively. Properties of WeatherReport would include spatial, time and a bunch of weather properties (more on those later).

Looking at existing temporal properties, I couldn’t see one that matched my needs - so think of time as a placeholder; a first guess :-)

A weather report might also include textual content; e.g. a summary of weather expected for a region and/or time interval. Suggest property summary of Type Text.

So - weather properties. There are a lot of them. Those used in the examples above include:

• wind direction
• “feels like temperature” - a temperature adjusted for wind-chill (and other factors)
• wind gust speed
• relative humidity
• probability of precipitation
• wind speed
• air temperature
• dew point temperature
• visibility
• pressure
• pressure tendency
• solar UV Index (and exposure)
• weather type
Let’s assume that all of these properties are explicitly mentioned in schema.org - whose range is defined as either QuantitativeValue (for numeric values that need to express a unit of measure) or instances of Enumeration.

It’s possible that a more soft-typed approach is more appropriate - e.g. where the property is referenced from some external controlled vocabulary. PropertyValue with property propertyID may provide a mechanism to do this. There may be a minimum (core) set of parameters that are used in most weather reports (e.g. air temperature, wind speed, wind direction, relative humidity, weather type and probability of precipitation) that could be defined in schema.org with others being defined externally - a mix and match approach? But, for now, lets just assume that all the properties are defined in schema.org.

Given that each weather report relates to a specific location, it seems sensible that information about those locations are published once as “reference data” … then we can simply refer back to that information. Met Office doesn’t currently do this. Instead, it provides a gazetteer service to help people find the site that closest to their point of interest. However, if we did publish site information, we might include structured markup as follows:

Clifton (Bristol):

{
"@context": "http://schema.org",
"@id": "http://data.metoffice.gov.uk/sites/loc/350953",
"@type": "Place",
"geo": {
"@type": "GeoCoordinates",
"latitude": "51.4632",
"longitude": "-2.6169",
"elevation": "64.0"
},
"name": "CLIFTON (BRISTOL)",
"address": {
"@type": "PostalAddress",
"addressLocality": "Clifton",
"addressRegion": "Bristol",
"addressCountry": "GB"
}
}

Filton:

{
"@context": "http://schema.org",
"@id": "http://data.metoffice.gov.uk/sites/loc/3628",
"@type": "Place",
"geo": {
"@type": "GeoCoordinates",
"latitude": "51.521",
"longitude": "-2.576",
"elevation": "59.0"
},
"name": "FILTON",
"address": {
"@type": "PostalAddress",
"addressLocality": "Filton",
"addressRegion": "South Gloucestershire",
"addressCountry": "GB"
}
}

(apologies for errors … just trying to give the gist of things)

When publishing observation and forecast data, Met Office bundles up individual WeatherReport instances (each for a specific time & location) into collections. These collections might be organised by location (e.g. a set of values for different times at the same location), by time (e.g. a set of values for different locations at the same time) or neither (e.g. a set of values for multiple locations and times).

Within the DataPoint API we refer to these collections as “data feeds”; we’re continually updating the content and exposing it through an API.

The concept of data feeds maps well into schema.org (or, more specifically, into the DataFeed proposal made in ISSUE #688 ).

(As an aside, it’s also worth noting that Met Office updates the forecast and observation data at hourly intervals which is faster, I imagine, than most search-engines will crawl the content - the content behaves like a continually refreshed data feed.)

A data feed comprises instances of DataFeedItem, each of which would contain a WeatherReport.

A ‘data feed’ of weather observations might be structured like the example below; a truncated set of weather observations for Filton (not the full 24-hours) …

{
"@context": "http://schema.org",
"@type": "DataFeed",
"about": "Weather observations for Filton, South Gloucestershire, UK (last 24-hours)",
"publisher": {
"@type": "Organization",
"name": "Met Office",
"url": "http://www.metoffice.gov.uk"
},
"license": "http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/",
"datasetTimeInterval": "2015-11-03T18:00:00Z/P1D",
"spatial": { "@id": "http://data.metoffice.gov.uk/sites/loc/350953" },
"dataFeedElement": [
{
"@type": "DataFeedItem",
"dateCreated": "2015-11-03T18:05:09Z",
"item": {
"@type": "WeatherReport",
"spatial": { "@id": "http://data.metoffice.gov.uk/sites/loc/3628" },
"time": "2015-11-03T18:00:00Z",
"windDirection": "ENE",
"relativeHumidity": { "@type": "QuantitativeValue", "value": "100", "unitCode": "P1" },
"atmosphericPressure": { "@type": "QuantitativeValue", "value": "1012", "unitCode": "A97" },
"windSpeed": { "@type": "QuantitativeValue", "value": "6", "unitCode": "HM" },
"airTemperature": { "@type": "QuantitativeValue", "value": "11.4", "unitCode": "CEL" },
"visibility": { "@type": "QuantitativeValue", "value": "2000", "unitCode": "MTR" },
"weatherType": "12",
"pressureTendency": "F",
"dewpointTemp": { "@type": "QuantitativeValue", "value": "11.4", "unitCode": "CEL" }
}
}, {
"@type": "DataFeedItem",
"dateCreated": "2015-11-03T19:05:33Z",
"item": {
"@type": "WeatherReport",
"spatial": { "@id": "http://data.metoffice.gov.uk/sites/loc/3628" },
"time": "2015-11-03T19:00:00Z",
"windDirection": "ENE",
"relativeHumidity": { "@type": "QuantitativeValue", "value": "100", "unitCode": "P1" },
"atmosphericPressure": { "@type": "QuantitativeValue", "value": "1012", "unitCode": "A97" },
"windSpeed": { "@type": "QuantitativeValue", "value": "5", "unitCode": "HM" },
"airTemperature": { "@type": "QuantitativeValue", "value": "11.8", "unitCode": "CEL" },
"visibility": { "@type": "QuantitativeValue", "value": "3600", "unitCode": "MTR" },
"weatherType": "15",
"pressureTendency": "F",
"dewpointTemp": { "@type": "QuantitativeValue", "value": "11.8", "unitCode": "CEL" }
}
}
]
}

There is a short delay between the observation time and the data feed item dateCreated. It takes a moment or two for the information to percolate though the systems. The delay here is illustrative (I made the numbers up!).

Note that the location for the weather report is specified as an external object: { "@id": "http://data.metoffice.gov.uk/sites/loc/350953" }. (i) this is a fictitious URL, (ii) this is my guess at how to link to the location reference data I talked about earlier.

Also note that by default QuantitativeValue uses the UN/CEFACT Common Codes for units of measurement. Those used are enumerated below:

• MTR = metre
• HM = mile per hour
• P1 = percent
• CEL = degree Celsius
• A97 = hectopascal
This is the first time I have come across these Common Codes. Within the earth science community the Unified Code for Units of Measure (UCUM), based on ISO 2955, has broad adoption. UCUM provides a basic set of symbols for units and an expression syntax for combining these symbols to obtain valid units for derived quantities. The symbols from UCUM are recognisable (e.g. metre is m, hectopascal is hPa) and doesn’t require a table lookup.

Values of windDirection are from the 16-point compass: N, NNE, NE, ENE, E, ESE, SE, SSE, S, SSW, SW, WSW, W, WNW, NW, NNW.

Values of pressureTendency (change in atmospheric pressure) indicate rising, falling or steady: R, F, S.

Values of weatherType are taken from the list specified by Met Office. The World Meteorological Organization defines a standard set of weather types — but this is much more specific and not, I think, suitable for usage with the general public. As such, it is likely that each publisher is likely to have their own set of weather types. How would this be managed in schema.org?

(see below for more details about these enumerations)

A ‘data feed’ for a weather forecast might be structured like the example below; the first two time-steps of a 5-day weather forecast for Clifton …

{
"@context": "http://schema.org",
"@type": "DataFeed",
"about": "Weather forecast for Clifton, Bristol, UK",
"publisher": {
"@type": "Organization",
"name": "Met Office",
"url": "http://www.metoffice.gov.uk"
},
"license": "http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/",
"datasetTimeInterval": "2015-11-04T18:00:00Z/P5D",
"spatial": { "@id": "http://data.metoffice.gov.uk/sites/loc/350953" },
"dataFeedElement": [
{
"@type": "DataFeedItem",
"dateCreated": "2015-11-04T18:22:14Z",
"item": {
"@type": "WeatherReport",
"spatial": { "@id": "http://data.metoffice.gov.uk/sites/loc/350953" },
"time": "2015-11-04T18:00:00Z",
"windDirection": "SW",
"feelsLikeTemp": { "@type": "QuantitativeValue", "value": "12", "unitCode": "CEL" },
"windGustSpeed": { "@type": "QuantitativeValue", "value": "16", "unitCode": "HM" },
"relativeHumidity": { "@type": "QuantitativeValue", "value": "90", "unitCode": "P1" },
"precipitationProbability": { "@type": "QuantitativeValue", "value": "24", "unitCode": "P1" },
"windSpeed": { "@type": "QuantitativeValue", "value": "11", "unitCode": "HM" },
"airTemperature": { "@type": "QuantitativeValue", "value": "14", "unitCode": "CEL" },
"visibility": "GO",
"solarUVExposure": "LO",
"weatherType": "8"
}
}, {
"@type": "DataFeedItem",
"dateCreated": "2015-11-04T18:22:14Z",
"item": {
"@type": "WeatherReport",
"spatial": { "@id": "http://data.metoffice.gov.uk/sites/loc/350953" },
"time": "2015-11-04T21:00:00Z",
"windDirection": "SSW",
"feelsLikeTemp": { "@type": "QuantitativeValue", "value": "12", "unitCode": "CEL" },
"windGustSpeed": { "@type": "QuantitativeValue", "value": "16", "unitCode": "HM" },
"relativeHumidity": { "@type": "QuantitativeValue", "value": "94", "unitCode": "P1" },
"precipitationProbability": { "@type": "QuantitativeValue", "value": "10", "unitCode": "P1" },
"windSpeed": { "@type": "QuantitativeValue", "value": "7", "unitCode": "HM" },
"airTemperature": { "@type": "QuantitativeValue", "value": "13", "unitCode": "CEL" },
"visibility": "MO",
"solarUVExposure": "LO",
"weatherType": "7"
}
}
]
}

Imagine that you got forecasts on Monday and Tuesday for the weather on Wednesday. Generally speaking, the forecast published on Tuesday will be more “accurate”. Here we can use the data feed item’s dateCreated value to determine which forecast is most recent, and hence may be considered the better one to use.
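
That selection rule can be sketched in Python as follows. The two feed items are hypothetical (Monday and Tuesday forecasts for the same Wednesday time-step), and `latest_forecasts` is an illustrative name rather than part of any proposal.

```python
from datetime import datetime


def parse_utc(timestamp):
    """Parse timestamps in the form used in the examples, e.g. 2015-11-04T18:22:14Z."""
    return datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")


def latest_forecasts(feed_items):
    """Keep, for each forecast time, the feed item with the newest dateCreated."""
    best = {}
    for entry in feed_items:
        valid_time = entry["item"]["time"]
        current = best.get(valid_time)
        if current is None or parse_utc(entry["dateCreated"]) > parse_utc(current["dateCreated"]):
            best[valid_time] = entry
    return best


# Hypothetical data: forecasts issued on Monday and Tuesday for Wednesday 18:00Z.
items = [
    {"dateCreated": "2015-11-02T18:22:14Z",
     "item": {"time": "2015-11-04T18:00:00Z", "weatherType": "7"}},
    {"dateCreated": "2015-11-03T18:22:14Z",
     "item": {"time": "2015-11-04T18:00:00Z", "weatherType": "8"}},
]
print(latest_forecasts(items)["2015-11-04T18:00:00Z"]["item"]["weatherType"])  # 8
```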

Meteorologists also talk about the “reference” or “analysis” time. This is the time for which the atmospheric conditions are calculated that provide the initial state used in the forecast. In this example, the reference time is the start value of the time interval provided for the datasetTimeInterval property (and is usually the first time-step in the forecast).
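
Extracting the reference time from datasetTimeInterval is a matter of splitting the ISO 8601 interval at the slash. A minimal Python sketch, assuming the start/duration form shown in the example; it only understands whole-day durations such as P5D, and `parse_time_interval` is a made-up helper name.

```python
import re
from datetime import datetime, timedelta, timezone


def parse_time_interval(interval):
    """Split an ISO 8601 'start/duration' interval into (start, end) datetimes.

    Only whole-day durations such as P5D are handled in this sketch.
    """
    start_text, duration_text = interval.split("/")
    start = datetime.strptime(start_text, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    match = re.fullmatch(r"P(\d+)D", duration_text)
    if match is None:
        raise ValueError(f"unsupported duration: {duration_text}")
    return start, start + timedelta(days=int(match.group(1)))


reference_time, end_time = parse_time_interval("2015-11-04T18:00:00Z/P5D")
print(reference_time.isoformat())  # 2015-11-04T18:00:00+00:00
print(end_time.isoformat())        # 2015-11-09T18:00:00+00:00
```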

Values of solarUVExposure are derived from the World Health Organisation's “GLOBAL SOLAR UV INDEX - A Practical Guide”: LO, MO, HI, VH, EX.

In this second example, we see visibility provided as a nominal value (category) rather than a quantitative value: UN, VP, PO, MO, GO, VG, EX. There may be other properties with this duality.

Finally, given the number of properties that are defined here, it might be pertinent to create a hosted extension for weather within schema.org … weather.schema.org anyone?

——

Enumerations:

Visibility Category - as defined by Met Office

Value Description
UN Unknown
VP Very poor - Less than 1 km
PO Poor - Between 1-4 km
MO Moderate - Between 4-10 km
GO Good - Between 10-20 km
VG Very good - Between 20-40 km
EX Excellent - More than 40 km
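
Because visibility can appear either as a distance or as one of these categories, a consumer may want to normalise the quantitative form to the category. A rough Python sketch of the table above; how the exact band edges (1, 4, 10, 20 and 40 km) are assigned is an assumption here, since the table does not say which band owns them.

```python
def visibility_category(metres):
    """Map a visibility distance in metres to a Met Office category code.

    Assumption: each band includes its lower edge, so exactly 4 km is MO.
    """
    if metres is None:
        return "UN"  # unknown
    km = metres / 1000
    if km < 1:
        return "VP"
    if km < 4:
        return "PO"
    if km < 10:
        return "MO"
    if km < 20:
        return "GO"
    if km < 40:
        return "VG"
    return "EX"


print(visibility_category(3600))  # PO
```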

UV Index - based on World Health Organisation's “GLOBAL SOLAR UV INDEX - A Practical Guide”

Value Index Exposure
LO 0-2 Low
MO 3-5 Moderate
HI 6-7 High
VH 8-10 Very high
EX 11+ Extreme
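
The same banding can be written as a small Python function. This is only a sketch: it assumes the index arrives as a whole number, treats 11 and above as extreme (as the WHO guide does), and uses the illustrative name `uv_exposure`.

```python
def uv_exposure(index):
    """Map a whole-number UV index to the exposure codes in the table above."""
    if index < 0:
        raise ValueError("UV index cannot be negative")
    if index <= 2:
        return "LO"
    if index <= 5:
        return "MO"
    if index <= 7:
        return "HI"
    if index <= 10:
        return "VH"
    return "EX"


print(uv_exposure(1))   # LO
print(uv_exposure(12))  # EX
```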

Pressure Tendency Category

Value Description
F Falling
S Steady
R Rising

Weather Type - as defined by Met Office

Value Description
NA Not available
0 Clear night
1 Sunny day
2 Partly cloudy (night)
3 Partly cloudy (day)
4 Not used
5 Mist
6 Fog
7 Cloudy
8 Overcast
9 Light rain shower (night)
10 Light rain shower (day)
11 Drizzle
12 Light rain
13 Heavy rain shower (night)
14 Heavy rain shower (day)
15 Heavy rain
16 Sleet shower (night)
17 Sleet shower (day)
18 Sleet
19 Hail shower (night)
20 Hail shower (day)
21 Hail
22 Light snow shower (night)
23 Light snow shower (day)
24 Light snow
25 Heavy snow shower (night)
26 Heavy snow shower (day)
27 Heavy snow
28 Thunder shower (night)
29 Thunder shower (day)
30 Thunder



@6a6d74

6a6d74 commented Nov 5, 2015

Hi Martin, thanks for pointing this out. I did see that I could prefix the unit code, but I wasn't sure how anyone would know how to resolve ucum: to the Unified Code for Units of Measure. I think you're saying that one petitions to amend the description of http://schema.org/unitCode to include that information, which makes sense.

I also note that one could use a URL to denote the unit of measurement ... this could be used to refer to terms from the QUDT Unit Vocabulary such as http://qudt.org/vocab/unit#Meter ... albeit that URLs are a bit longer than just the unit of measure symbol itself.

Thanks!

@Aaranged

Aaranged commented Jun 8, 2016

As this subject has come up in discussions recently, I'm warehousing some links to relevant resources for if and when we pivot back around to this.

National Digital Forecast Database XML/SOAP Service - NOAA's National Weather Service
http://graphical.weather.gov/xml/

Extensible_Markup_Language.pdf (for the service above)
http://products.weather.gov/PDD/Extensible_Markup_Language.pdf

Meteorological Data and XML
(from the World Meteorological Organization)
https://www.wmo.int/pages/prog/www/DPS/ET-DR-C-PRAGUE-02/Doc6(1).doc

Software (The European Centre for Medium-Range Weather Forecasts (ECMWF))
http://www.ecmwf.int/en/computing/software

@mastermind1429

@6a6d74 @mfhepp Is the weather schema still being worked on, or is XML the only supported method for marking up weather for now?

@github-actions

github-actions bot commented Sep 3, 2020

This issue is being tagged as Stale due to inactivity.

@github-actions github-actions bot added the no-issue-activity Discuss has gone quiet. Auto-tagging to encourage people to re-engage with the issue (or close it!). label Sep 3, 2020
@OmgImAlexis

Bad bot.
