What should address.addressRegion be for UK addresses? #1848

Closed
jamescridland opened this issue Feb 16, 2018 · 39 comments

@jamescridland

Google is grumpy at me because I'm missing jobLocation.address.addressRegion from the job advertisements I have on my website.

http://schema.org/addressRegion gives one example - "CA". This is an abbreviation for California, an American state, I'm presuming; though it could just as easily be the ISO two-letter abbreviation for "Canada". The documentation is US-centric and vague.

If addressRegion is intended to be a state, province or prefecture, that's helpful for those countries that have them.

The UK doesn't have any "states". It does have counties, but they are mainly historical, and London isn't in a county. Nor, technically, is Bristol or Manchester, for example. There's nothing I can put there.

While the UK doesn't normally add a nation ("England", "Wales", "Scotland", "Northern Ireland") to an address, is this what is expected in this field?

This is a mandatory field for a jobLocation, yet Google's geodata doesn't include a "nation" or a historical county - https://goo.gl/maps/was25WujcuT2 is a visible example of an address in London. Where might I programmatically get the correct data to satisfy Google's requirement for an addressRegion for a UK address?

@danbri
Contributor

danbri commented Feb 16, 2018

Hi @jamescridland. We can only handle the general schema vocabulary structure issue here. What Google or other companies do with the data is beyond the scope of the Schema.org project. It sounds like the schema is not sufficiently clear though so let's keep this issue open for that...

@jamescridland
Author

Thanks, Dan. Without knowing what is expected in here, it could be filled with gibberish - so it would be helpful to understand what is intended to be in addressRegion. I can then go and talk to Google (somehow) to try and understand why this field is mandatory in their spec; but right now, it's a bit opaque as to what should be there in the first place.

@danbri
Contributor

danbri commented Feb 18, 2018

@jamescridland - that's a fair request. I'm a reasonable Google contact for this (although it would help if you posted in https://productforums.google.com/forum/#!topicsearchin/webmasters/category$3Astructured-data%7Csort:relevance%7Cspell:false ). Moving house this week but will follow up.

@danbri danbri self-assigned this Feb 18, 2018
@danbri
Contributor

danbri commented Feb 18, 2018

From a quick and unscientific look at some Web data, it seems sites in the US are publishing state codes like "CA" under addressRegion, whereas in the UK, it is being filled with county(ish) names, by which I mean "London" and "Greater London" are both used too.

@WeaverStever

WeaverStever commented Feb 18, 2018

Here is what Google says in their localBusiness Guide:

address.addressRegion | Text, required where applicable State or province.

https://developers.google.com/search/docs/data-types/local-business

I've found that the field can be an empty string without raising a warning or error on the SDTT.

@jamescridland
Author

As a suggestion, then, could I request that http://schema.org/addressRegion suggests this field is:

For example, the state, province, Länder or prefecture: eg CA, QLD or Bayern

...so that at least it's clearer than "CA" (California/Canada).

What's not been very helpful is not knowing whether I should be using an abbreviation here or not. I don't know if the specification should say so.

In the UK, it is being filled with county(ish) names, by which I mean "London" and "Greater London" are both used too.

I've since discovered that this field is analogous to Google Places "administrative_area_level_2", which for UK addresses contains things like "Greater London" or "City of Bristol". That's what I'm now sticking in there, anyway.

@WeaverStever

@jamescridland
Certainly have to admit that the current description in http://schema.org/addressRegion is extremely poor. A better description?

Province State Code
https://gs1.org/voc/addressRegion
Text specifying a province or state in abbreviated format for example NJ.

Nearest schema.org equivalent:
Exact match: schema:addressRegion

Found here: https://www.gs1.org/voc/addressRegion

@philbarker
Contributor

Might be worth looking at the Post Office advice on addressing. Town in that context is postal town, it doesn't necessarily imply the locality is within the boundary of the town as a municipal authority. There are examples of places with a postal town that is in a different county[*]

Bringing this back to schema.org: it would help a lot if the definitions for addressRegion and addressLocality said which was bigger / which was more precise, and how they relate to streetAddress and addressCountry, e.g. addressLocality: the locality within which the streetAddress lies, smaller than / more precise than addressRegion.

[* no, I didn't know there were postal address nerds either, not until I fell into conversation with one about this a couple of years ago]

@ketanumretiya030

You can use http://schema.org/docs/gs.html
Also check here: http://schema.org/addressRegion

@nickevansuk

nickevansuk commented Mar 11, 2018

For the London example specifically, we've been adopting:

  • "addressLocality" = Borough name
  • "addressRegion" = "London" or "Greater London"

Certainly agree that this is ambiguous for the UK and has caused confusion with implementers.

CC @ldodds

@WeaverStever

WeaverStever commented Mar 11, 2018

The Google SDTT does not complain when this field is empty, or not included. In fact when you do a query of Google Places, the array of fields rarely line up with their descriptions.

Note the following facts about the address_components array:

The array of address components may contain more components than the formatted_address.
The array does not necessarily include all the political entities that contain an address, apart from those included in the formatted_address. To retrieve all the political entities that contain a specific address, you should use reverse geocoding, passing the latitude/longitude of the address as a parameter to the request.

The format of the response is not guaranteed to remain the same between requests. In particular, the number of address_components varies based on the address requested and can change over time for the same address. A component can change position in the array. The type of the component can change. A particular component may be missing in a later response.
https://developers.google.com/places/web-service/details

For Europe, I would suggest that Metropolitan Areas could be included in the description for valid input.
https://en.wikipedia.org/wiki/List_of_metropolitan_areas_in_Europe

For US addresses, I have in the past used the http://schema.org/containedInPlace to associate it with a Metropolitan Statistical Area, or similar alias.

(For instance, a musician is holding a MusicEvent in Redondo Beach CA, the event may be of interest to people within the Los Angeles Metropolitan Statistical Area, in addition to Redondo Beach and just a few neighboring cities.)

@jamescridland
Author

jamescridland commented Mar 11, 2018 via email

@WeaverStever

WeaverStever commented Mar 12, 2018

@jamescridland,

A little deeper into the rabbit hole at Google Places, I believe that they (SDTT) are expecting "administrative_area_level_1" for addressRegion.

The (regions) type collection instructs the Places service to return any result matching the following types:
locality
sublocality
postal_code
country
administrative_area_level_1
administrative_area_level_2
The (cities) type collection instructs the Places service to return results that match locality or administrative_area_level_3.
https://developers.google.com/places/web-service/supported_types#table2

There is some JSON output on this page showing these Types.

   "html_attributions" : [],
   "result" : {
      "address_components" : [
         {
            "long_name" : "5",
            "short_name" : "5",
            "types" : [ "floor" ]
         },
         {
            "long_name" : "48",
            "short_name" : "48",
            "types" : [ "street_number" ]
         },
         {
            "long_name" : "Pirrama Road",
            "short_name" : "Pirrama Rd",
            "types" : [ "route" ]
         },
         {
            "long_name" : "Pyrmont",
            "short_name" : "Pyrmont",
            "types" : [ "locality", "political" ]
         },
         {
            "long_name" : "Council of the City of Sydney",
            "short_name" : "Sydney",
            "types" : [ "administrative_area_level_2", "political" ]
         },
         {
            "long_name" : "New South Wales",
            "short_name" : "NSW",
            "types" : [ "administrative_area_level_1", "political" ]
         },
         {
            "long_name" : "Australia",
            "short_name" : "AU",
            "types" : [ "country", "political" ]
         },
         {
            "long_name" : "2009",
            "short_name" : "2009",
            "types" : [ "postal_code" ]
         }
      ],
(shortened for brevity)

https://developers.google.com/places/web-service/details

This StackOverflow question indicates that we are looking for "administrative_area_level_1" in schema addressRegion.
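
As a minimal sketch of that lookup in Python, the snippet below pulls the short_name of the administrative_area_level_1 entry out of a trimmed copy of the address_components array quoted above. The helper name find_component is made up for illustration. It searches by type rather than by array position, since the Google documentation quoted earlier warns that position and presence of components are not guaranteed.

```python
from typing import Optional

# Trimmed copy of the address_components array quoted above.
ADDRESS_COMPONENTS = [
    {"long_name": "Pyrmont", "short_name": "Pyrmont",
     "types": ["locality", "political"]},
    {"long_name": "Council of the City of Sydney", "short_name": "Sydney",
     "types": ["administrative_area_level_2", "political"]},
    {"long_name": "New South Wales", "short_name": "NSW",
     "types": ["administrative_area_level_1", "political"]},
    {"long_name": "Australia", "short_name": "AU",
     "types": ["country", "political"]},
]


def find_component(components: list, wanted_type: str,
                   field: str = "short_name") -> Optional[str]:
    """Return `field` of the first component tagged with `wanted_type`.

    Hypothetical helper: looks components up by type, never by index,
    and returns None when no component carries that type.
    """
    for component in components:
        if wanted_type in component.get("types", []):
            return component.get(field)
    return None


region = find_component(ADDRESS_COMPONENTS, "administrative_area_level_1")
print(region)  # NSW
```

Passing field="long_name" would return "New South Wales" instead; whether addressRegion should carry the short or the long form is exactly the open question in this thread.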

For the US the "administrative_area_level_1" is the state, in Canada it is the provinces and territories. It appears that these are called "Administrative Divisions" in Wikipedia.

https://en.wikipedia.org/wiki/List_of_administrative_divisions_by_country
(Referencing the "First-level" column in the table.)

So let's write a proposed description for the http://schema.org/addressRegion property.

The United Nation's, First-level political region or equivalent. In Canada, the First-level political regions are the provinces and territories, in the U.S. these regions are the 50 states (and some additional). For more information see: https://en.wikipedia.org/wiki/List_of_administrative_divisions_by_country

@jamescridland
Author

jamescridland commented Mar 12, 2018 via email

@WeaverStever

WeaverStever commented Mar 12, 2018

@jamescridland

A quick look back at this link confirms that we are indeed supplying the abbreviated version. AKA the short_name in Google Places. https://www.gs1.org/voc/addressRegion

Text specifying a province or state in abbreviated format for example NJ.
The United Nation's, First-level political region or equivalent.
In Canada, the First-level political regions are the provinces and territories, in the U.S. these regions are the 50 states (and some additional). For other countries and more information see: https://en.wikipedia.org/wiki/List_of_administrative_divisions_by_country

@WeaverStever

@danbri

Would you read over these last five or six posts? The description in addressRegion is very poor and we've proposed some verbiage.

Thanks

@ghost

ghost commented Apr 22, 2018

I've just been working on co-ordinating a data schema on a WordPress business directory, using Google Places API to pull in validated data, and then using it to output the JSON-LD schema for rich snippets. The problem isn't that Google isn't clear, but that the UK seems to be different. For UK addresses the schema appears to be locality -> village / urban district, postal_town -> town / city, administrative_area_level_2 -> county, administrative_area_level_1 -> country (E, S, W, NI), country -> kingdom (UK). However, for non-UK companies (US & Finland so far, and I guess normal countries instead of kingdoms), it appears to be neighborhood -> urban district, locality -> town / city, administrative_area_level_1 -> state, country -> country.
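
The mapping described in this comment could be sketched in Python roughly as follows. This is one possible reading, not a settled rule: it assumes the UK is identified by a country short_name of "GB", that the UK county level (administrative_area_level_2) is what goes in addressRegion, and that elsewhere the abbreviated administrative_area_level_1 is used. The function names and both sample arrays are hand-written for illustration and are not captured API responses.

```python
from typing import Optional


def pick(components: list, wanted_type: str,
         field: str = "long_name") -> Optional[str]:
    """Hypothetical helper: first component carrying `wanted_type`, or None."""
    for component in components:
        if wanted_type in component.get("types", []):
            return component.get(field)
    return None


def to_postal_address(components: list) -> dict:
    """Map Places-style components onto schema.org PostalAddress properties."""
    country = pick(components, "country", "short_name")
    if country == "GB":
        # Assumption: town/city comes from postal_town, region from the county level.
        locality = pick(components, "postal_town") or pick(components, "locality")
        region = pick(components, "administrative_area_level_2")
    else:
        # Assumption: elsewhere, the region is the abbreviated first-level division.
        locality = pick(components, "locality")
        region = pick(components, "administrative_area_level_1", "short_name")
    address = {
        "@type": "PostalAddress",
        "addressLocality": locality,
        "addressRegion": region,
        "postalCode": pick(components, "postal_code"),
        "addressCountry": country,
    }
    # Drop properties that could not be filled instead of emitting empty strings.
    return {key: value for key, value in address.items() if value}


# Hand-written samples (illustrative only).
UK_SAMPLE = [
    {"long_name": "London", "short_name": "London", "types": ["postal_town"]},
    {"long_name": "Greater London", "short_name": "Greater London",
     "types": ["administrative_area_level_2", "political"]},
    {"long_name": "England", "short_name": "England",
     "types": ["administrative_area_level_1", "political"]},
    {"long_name": "United Kingdom", "short_name": "GB",
     "types": ["country", "political"]},
    {"long_name": "SW1A 2AA", "short_name": "SW1A 2AA", "types": ["postal_code"]},
]
US_SAMPLE = [
    {"long_name": "Mountain View", "short_name": "Mountain View",
     "types": ["locality", "political"]},
    {"long_name": "California", "short_name": "CA",
     "types": ["administrative_area_level_1", "political"]},
    {"long_name": "United States", "short_name": "US",
     "types": ["country", "political"]},
]

print(to_postal_address(UK_SAMPLE))
print(to_postal_address(US_SAMPLE))
```

Keeping the per-country branch in one place makes it easy to change if the guidance for addressRegion ends up pointing at a different level.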

It's worth getting a free API key and experimenting with Places API to see how things are returned. It's also interesting to note that the Places API returns an HTML element in string form that uses spans with class names to standardise locality, region, country, postcode. In the end, to confirm the data validation, for my project I used jQuery to render the HTML and then pull the content from the individual spans - it made things a bit more standard and predictable, and helped me get my head around how the data fitted into the address schema format.

Also, with regard to place names, the API returns both short and long names, i.e. United Kingdom and UK, DC and District of Columbia, although it renders the short version most commonly. It also does the same for other fields, e.g. Pennsylvania Avenue Northwest & Pennsylvania Ave NW.

-- Edit --
Google Places uses the adr microformat in the API return data
https://developers.google.com/places/web-service/details#PlaceDetailsResults
http://microformats.org/wiki/adr#Property_List
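
The same span-scraping idea can be sketched without jQuery, using only Python's standard library. The sample string below is hand-written in the adr microformat and is not a captured API response; the class AdrParser and the ADR_TO_SCHEMA table are illustrative names, and the mapping of adr classes onto schema.org properties is an assumption based on the property list linked above.

```python
from html.parser import HTMLParser

# Assumed correspondence between adr class names and PostalAddress properties.
ADR_TO_SCHEMA = {
    "street-address": "streetAddress",
    "locality": "addressLocality",
    "region": "addressRegion",
    "postal-code": "postalCode",
    "country-name": "addressCountry",
}


class AdrParser(HTMLParser):
    """Collect the text of each <span class="..."> in an adr-style fragment."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # class of the span currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self._current = dict(attrs).get("class")

    def handle_data(self, data):
        # Text outside a classed span (commas, spaces) is ignored.
        if self._current:
            self.fields[self._current] = self.fields.get(self._current, "") + data

    def handle_endtag(self, tag):
        if tag == "span":
            self._current = None


# Hand-written sample (illustrative only).
SAMPLE = (
    '<span class="street-address">48 Pirrama Rd</span>, '
    '<span class="locality">Pyrmont</span> '
    '<span class="region">NSW</span> '
    '<span class="postal-code">2009</span>, '
    '<span class="country-name">Australia</span>'
)

parser = AdrParser()
parser.feed(SAMPLE)
parser.close()

postal_address = {
    ADR_TO_SCHEMA[name]: text
    for name, text in parser.fields.items()
    if name in ADR_TO_SCHEMA
}
print(postal_address)
```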

@WeaverStever

@LeodanDesign
Here is what google has to say about the Address Array:

The format of the response is not guaranteed to remain the same between requests. In particular, the number of address_components varies based on the address requested and can change over time for the same address. A component can change position in the array. The type of the component can change. A particular component may be missing in a later response.

As for the addressRegion, in the US and Canada they are the states and provinces. For other countries, I believe that addressRegion accepts administrative_area_level_1 (Evidence in my earlier posts)

The administrative areas are listed here.
https://en.wikipedia.org/wiki/List_of_administrative_divisions_by_country


As for your project, I have done something similar for localBusiness. I need to revisit how I handled addresses, I think I parsed the formatted address.
https://places.parkingbeater.com/

@WeaverStever

P.S.
As I recall from a related project, I found that the address components array was unreliable. I think one of the bad examples was the Microsoft Theater in Los Angeles.

@Download

Download commented Sep 3, 2018

I just wanted to let you guys know that I found this issue because I had exactly the same question as the OP (trying to get a UK address to fit).

The documentation really needs a change but there seem to be many small issues like these not getting fixed for months so I wonder if people are actually still working on this. Also the sheer number of open issues is daunting. The Google Structured Data Testing Tool also has many issues. It makes it very hard for implementers to do it correctly.

@andrewmartinuk

I've opted to use addressRegion alongside County (e.g. Cambridgeshire), although I did originally start off using it for Country instead (e.g. England) but felt it gave little value, as Country really needed to be United Kingdom, and UK addresses rarely bother including England and United Kingdom (for now).

@jamescridland
Author

It's disappointing that nobody has been bothered to update the ambiguous documentation after so long. @danbri - could I at least ask that the documentation is updated with more examples than just "CA", which could just as easily be Canada as California? Or, if this project is in maintenance-only mode, could you be honest and say so? grumpy james is grumpy

@andrewmartinuk Be cautious not to confuse schema's requirement with a Royal Mail address requirement. (For Royal Mail addresses, counties should never be used.) Google/Schema needs, weirdly, more detail than actually exists in postal addresses.

@Download

Google/Schema needs, weirdly, more detail than actually exists in postal addresses.

Yes. And I'd say that's not just weird, it's wrong.
Schema.org should let me describe what is on my website. Not tell me what info I should put on my website in the first place. So all these fields should be optional and the validator should not complain.

@philbarker
Contributor

philbarker commented Sep 24, 2018

Google/Schema needs, weirdly, more detail than actually exists in postal addresses.

More accurately: Google needs, weirdly, more detail in schema.org format than exists in postal addresses.

See the second post in this thread, & immediate replies, for what can be addressed here (i.e. issues with schema.org such as guidance, but not Google's requirements).

@jamescridland
Author

Phil - there are two issues here.

  1. The awful, US-centric, confusing documentation by schema.org that gives no clarification of what should go in an address outside the US.

  2. Google's (incorrect?) requirements.

We can all help fix #1. I wish someone at schema would run with this ball. Currently it seems as if nobody in the project is interested. (I can't even tell whether anyone's working on schema at all).

@Download

I can't even tell whether anyone's working on schema at all

Well, I have been monitoring the issues for a while now and small changes (such as this issue, which basically suggests a doc change that should be implementable in a few hours) are simply not picked up.

I basically spelled out the text that I think should go in the docs in this issue, but nothing happens... I would make a PR if it were clear what to do but I found no guidance.

@jamescridland
Author

Yeah. It's clear that nobody's working on this project any more. I just wish @danbri and others would be honest. Disappointing.

@kmcconnell

philbarker added a commit to philbarker/schemaorg that referenced this issue Sep 27, 2018
@philbarker
Contributor

I've put in a pull request that I think would help ease some of the confusion @jamescridland feels.

I don't really see the problem about CA possibly being Canada: it's clearly not because Canada is a country and addressCountry exists.

I think the main point of this issue, at least the part that is relevant to schema.org, is country-specific guidance on using PostalAddress, which might be better addressed through the wiki than by adding to the already overloaded examples section.

@danbri
Contributor

danbri commented Sep 27, 2018

Thanks for the concrete suggestion @philbarker

From your draft, f989131

  • addressLocality: The locality in which street address is, and which is in the region. For example, Mountain View.
  • addressRegion: The region in which locality is, and which is in the country. For example, California.

The wording feels a little off (just in terms of the English, missing a couple of "the"s maybe?) but does this feel like progress, @jamescridland et al?

Phil is correct that we can't add too much country-specific detail into the definitions, but the broad intent of the definitions ought to be clear from their content without too much searching around elsewhere.

/cc @ldodds

@ldodds
Contributor

ldodds commented Sep 27, 2018

Echoing @nickevansuk's point from above, we've been encouraging use of county/area for addressRegion as part of the OpenActive project, e.g.:

      "addressLocality": "Bath",
      "addressRegion": "Somerset",

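For anyone generating this markup programmatically, here is a minimal Python sketch that serialises a PostalAddress as JSON-LD using the Bath / Somerset convention from the comment above. The function name postal_address_jsonld is hypothetical, and addressCountry="GB" is added here as an assumption; it is not part of the original snippet. Properties with no value are left out rather than emitted as empty strings.

```python
import json


def postal_address_jsonld(**fields: str) -> str:
    """Serialise a schema.org PostalAddress, skipping properties with no value."""
    body = {"@type": "PostalAddress"}
    body.update({name: value for name, value in fields.items() if value})
    return json.dumps(body, indent=2, ensure_ascii=False)


print(postal_address_jsonld(
    addressLocality="Bath",
    addressRegion="Somerset",
    addressCountry="GB",   # assumption, not in the original example
    streetAddress="",      # empty, so it is omitted from the output
))
```
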
@edpars0ns

This will always be a bit messy.

From a UK perspective I would agree with @ldodds @nickevansuk that addressRegion represents county - This is messy because for example my home county Middlesex has not existed for 50 years but is still in widespread use instead of the "correct" Greater London.

I would suggest that what you are trying to express here is some form of hierarchy that represents not official postal geography but common usage, and something that will, given the nature of addresses, vary considerably between countries.

@jamescridland
Author

  • addressLocality: The locality in which street address is, and which is in the region. For example, Mountain View.
  • addressRegion: The region in which locality is, and which is in the country. For example, California.

The wording feels a little off (just in terms of the English, missing a couple of "the"s maybe?) but does this feel like progress, @jamescridland et al?

"California" or "CA"? Or, for that matter, "Bayside" or "Southern California" or "East Sussex"?

@jamescridland
Author

My suggestion here might be:

  • addressLocality: The locality (city, town, village) in which the street address is. For example, Mountain View, Huddersfield, Kelowna
  • addressRegion: The state, province or prefecture where the locality is. For example, Queensland, California, Ontario, Kyoto. Standard abbreviations may be used (CA, QLD).

We need to be clear what data we're asking for; and also the format of that data. I don't know whether "California" is acceptable here, or whether it should be CA. I'm a bit confused whether it's a state or a county (Mountain View is in Santa Clara County - isn't that the region?) How can we make this clearer?

My concern is that, for English addresses, we're asking for data which does not appear in postal addresses and is ambiguous. For the English city of Hull (or Kingston upon Hull, to give it its full name), the correct addressRegion could be "East Yorkshire", "Humberside" or the actually correct "Hull", since it isn't in a county at all. I also have a concern that this data may not be available in any database.

A specification this ambiguous isn't a specification - and I can't see how the random data you might extract from this schema can actually be useful in a real-world scenario.

There is prior work here - the excellent Geonames database. Here's Hull, for example (which is not marked as being in a county); Ashgrove in Queensland, or Mountain View.

@philbarker
Contributor

@danbri I see what you mean. Will fix.
@jamescridland I am not in favour of the extensions to the definitions you give.

Sometimes a city will be the addressLocality, sometimes it will be the addressRegion. Take 10 Downing St, Westminster, London SW1A 2AA for an example. Sometimes a city will be an addressCountry: (Monaco or Citta del Vaticano). I don't think any number of examples of what a locality or region might be are going to help, they will introduce as many ambiguities as they resolve.

UK postal addresses are strange. They're maybe not the only strange addressing system in the world (I know places in rural Spain with no street names), but maybe they are the only ones where the postal addresses are so disjoint from the administrative location hierarchy. Loose definitions work well when trying to fit diverse data.

Q: how does GeoNames deal with places where the postal town is in a different county? E.g. Chirbury in Shropshire (England) which has the postal town of Montgomery (Wales).

@jamescridland
Author

@philbarker Thanks for the comments.

Helpful to remind myself that this discussion is about http://schema.org/PostalAddress and not a physical address.

The example you give - "10 Downing Street, Westminster, London SW1A 2AA" - is not the postal address. Using the Royal Mail checker, the correct postal address is 10 Downing Street, London SW1A 2AA. (I can see the benefit of knowing it's in Westminster, but this is not the postal address.)

So, this is correct, no?

    <span itemprop="streetAddress">10 Downing Street</span>
    <span itemprop="addressLocality">London</span>,
    <span itemprop="addressRegion"></span>
    <span itemprop="postalCode">SW1A 2AA</span>

Loose definitions work well when trying to fit diverse data.

I think this is a religious argument. In this case, I don't agree. It's important to ensure that, if ten different people read the definition, all ten people will put the same data in the same fields. And currently this isn't the case.

Q: how does GeoNames deal with places where the postal town is in a different county?

Geonames doesn't do postal addresses. That's very clear (though not so for Schema). The correct answer is that Chirbury is in Shropshire (the county), and "Chirbury with Brompton" (the parish).

So what goes in a Schema address here? The pottery shop in Chirbury has a postal address of "4, Shepherd's Yard, Chirbury, Montgomery SY15 6BH", yet a physical address of, I can only assume, "4, Shepherd's Yard, Chirbury, Shropshire SY15 6BH".

Does "Montgomery" (which is a postal town, and not the administrative area) go into addressRegion? If we can work out this case, it would really help us, I think.

@philbarker
Contributor

@jamescridland I think for consistency I would always put the postal town in the addressRegion (and think of the addressLocality as being in the area served by it). So

<span itemprop="streetAddress">10 Downing Street</span>
<span itemprop="addressLocality"></span>,
<span itemprop="addressRegion">London</span>
<span itemprop="postalCode">SW1A 2AA</span>

and

<span itemprop="streetAddress">4, Shepherd's Yard</span>
<span itemprop="addressLocality">Chirbury</span>,
<span itemprop="addressRegion">Montgomery</span>
<span itemprop="postalCode">SY15 6BH</span>

Using PostalAddress to describe the physical address would put Shropshire in addressRegion, as @ldodds &co do (and yes this does lead to oddities where Bath could be the value of an addressLocality for a physical address in Somerset and the value of addressRegion for a postal address, but this is because UK postal addresses are odd). This fits with @danbri 's comment of seeing addressRegion "being filled with county(ish) names, by which I mean "London" and "Greater London" are both used too." So I think this is consistent with what Dan sees and Leigh suggests.
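
To make the postal versus physical distinction concrete, here is a small Python sketch that renders the Chirbury example both ways as microdata spans. The function render_microdata and the PROPERTY_ORDER tuple are illustrative names, and the sketch skips empty properties instead of emitting empty spans, which is a design choice rather than anything schema.org prescribes.

```python
from html import escape

# Order in which properties are written out; an arbitrary choice for this sketch.
PROPERTY_ORDER = ("streetAddress", "addressLocality", "addressRegion", "postalCode")


def render_microdata(address: dict) -> str:
    """Render the non-empty PostalAddress properties as itemprop spans."""
    lines = []
    for prop in PROPERTY_ORDER:
        value = address.get(prop)
        if value:
            # quote=False escapes only &, < and >, leaving apostrophes readable.
            lines.append(f'<span itemprop="{prop}">{escape(value, quote=False)}</span>')
    return "\n".join(lines)


# Postal reading: the postal town goes in addressRegion.
postal = {
    "streetAddress": "4, Shepherd's Yard",
    "addressLocality": "Chirbury",
    "addressRegion": "Montgomery",
    "postalCode": "SY15 6BH",
}
# Physical reading: the county goes in addressRegion.
physical = dict(postal, addressRegion="Shropshire")

print(render_microdata(postal))
print()
print(render_microdata(physical))
```

The two outputs differ only in the addressRegion span, which is the ambiguity this thread keeps returning to.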

Hope this helps, but I think I've reached the limit of my knowledge about and interest in this topic. If the pull request I sent closes this issue, then that's great, but forgive me if I duck out now.

@WeaverStever

WeaverStever commented Sep 30, 2018

From Google Maps

country indicates the national political entity, and is typically the highest order type returned by the Geocoder.
administrative_area_level_1 indicates a first-order civil entity below the country level. Within the United States, these administrative levels are states. Not all nations exhibit these administrative levels. In most cases, administrative_area_level_1 short names will closely match ISO 3166-2 subdivisions and other widely circulated lists; however this is not guaranteed as our geocoding results are based on a variety of signals and location data.
administrative_area_level_2 indicates a second-order civil entity below the country level. Within the United States, these administrative levels are counties. Not all nations exhibit these administrative levels.
administrative_area_level_3 indicates a third-order civil entity below the country level. This type indicates a minor civil division. Not all nations exhibit these administrative levels.
administrative_area_level_4 indicates a fourth-order civil entity below the country level. This type indicates a minor civil division. Not all nations exhibit these administrative levels.
administrative_area_level_5 indicates a fifth-order civil entity below the country level. This type indicates a minor civil division. Not all nations exhibit these administrative levels.

Wikipedia: List of administrative divisions by country

I've posted some other documentation earlier in this thread that suggests administrative_area_level_1 or administrative_area_level_2 could be appropriate for addressRegion in countries not delimited by state.

@github-actions

This issue is being tagged as Stale due to inactivity.

@github-actions github-actions bot added the no-issue-activity Discuss has gone quiet. Auto-tagging to encourage people to re-engage with the issue (or close it!). label Jul 17, 2020