Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

amp-geo: Fall back to API for country #26407

Merged
merged 8 commits into from Feb 28, 2020
Merged

Conversation

mdmower
Copy link
Contributor

@mdmower mdmower commented Jan 18, 2020

If country code is not available in a prerendered body's classes or in
amp-geo-0.1.js itself, support falling back to an API request. The
publisher must provide the API endpoint; cdn.ampproject.org will
not host a general country code API.

Allow case insensitive country code in JSON schema (country is most
often presented in uppercase when ISO 3166-1 table is referenced).

JSON schema of Geo API response - version 0.1:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "country": {
      "type": "string",
      "title": "ISO 3166-1 alpha-2 (case insensitive) country code of client request",
      "default": "",
      "pattern": "^[a-zA-Z]{2}$"
    }
  },
  "required": [
    "country"
  ]
}

Sample response:

{
  "country": "de"
}

@mdmower mdmower changed the title [WIP] amp-geo: Fallback to API for country [WIP] amp-geo: Fall back to API for country Jan 19, 2020
If country code is not available in a prerendered body's classes or in
amp-geo-0.1.js itself, support falling back to an API request if the
publisher must provide (cdn.ampproject.org will not host a general
country code API).

Allow case insensitive country code in JSON schema (country is most
often presented in uppercase when ISO 3166-1 table is referenced).

JSON schema of Geo API response - version 0.1:
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "country": {
      "type": "string",
      "title": "ISO 3166-1 alpha-2 (case insensitive) country code of client request",
      "default": "",
      "pattern": "^[a-zA-Z]{2}$"
    }
  },
  "required": [
    "country"
  ]
}

Sample response:
{
  "country": "de"
}

temp

temp
@mdmower mdmower marked this pull request as ready for review February 1, 2020 21:47
@mdmower mdmower changed the title [WIP] amp-geo: Fall back to API for country amp-geo: Fall back to API for country Feb 1, 2020
@mdmower
Copy link
Contributor Author

mdmower commented Feb 1, 2020

@Gregable @sebastianbenz @zhouyx @jridgewell - I believe this PR matches the amp-geo API design we discussed (relates to #25873). If any of you have time for a review (or could recommend a reviewer) it would be appreciated.

@zhouyx
Copy link
Contributor

zhouyx commented Feb 3, 2020

Thank you @mdmower, as mentioned in the meeting, one thing that we are actively working on is to add more granular geo detection support to amp-geo. Thoughts on supporting ISO 3166-2 format in the future using the same API?

@mdmower
Copy link
Contributor Author

mdmower commented Feb 4, 2020

Thoughts on supporting ISO 3166-2 format in the future using the same API?

@zhouyx The JSON schema could either be modified to allow ISO 3166-1 or ISO 3166-2 in a single property (instead of country, it could be called region), or a second optional subdivision property could be introduced. In either case, if only the ISO 3166-1 code is available, then amp-geo works at the country level, and if ISO 3166-2 is available, amp-geo works at the subdivision level. Do you have a preference for one of these schema changes?

@zhouyx
Copy link
Contributor

zhouyx commented Feb 6, 2020

I prefer the second optional subdivision property because in this case it doesn't mix thee required country code vs the optional region code.

Two concerns I have. These concerns are not blocking this change. But I want to come up with a solution to the first concern when we launch US-CA support.

  • AMP is going to only support US-CA in short term. This means that US without subdivision code can mean 1. subdivision is not supported, or 2. user is not in US-CA. This means that a CA user can be labeled as a non CA user during self host. I want to understand how bad this is given publishers are going to rely on the decision to apply CCPA settings.
  • When providing the region information through AMP's cdn, there will be a confidence value that also comes with decision. I briefly mentioned this in I2I: amp-geo to add preset-us-ca #26637. Though the generic region support design has not been finalized yet, it's likely that we would also require such a confidence level value from the endpoint.

Thanks

@mdmower
Copy link
Contributor Author

mdmower commented Feb 6, 2020

@zhouyx Thanks for the details. I believe this JSON example would satisfy your requirements.

{
    "countries": [
        {
            "country": "US",
            "confidence": 0.9,
            "subdivisions: [
                {
                    "subdivision": "WA",
                    "confidence": 0.6
                },
                {
                    "subdivision": "OR",
                    "confidence": 0.4
                }
            ]
        },
        {
            "country": "CA",
            "confidence": 0.1,
            "subdivisions: [
                {
                    "subdivision": "BC",
                    "confidence": 1
                }
            ]
        }
    ]
}

Special cases:

  • countries: [] - Country lookup supported but no country was identified
  • countries[n].subdivisions: null - Subdivision lookup not supported
  • countries[n].subdivisions: [] - Subdivision lookup supported but no subdivision was identified

Confidence ranges from 0.0 to 1.0 and the sum of all confidences for countries or subdivisions should not exceed 1.0 (but can be less than 1.0).

For example, if an API supports only country lookup and has no support for confidence levels or subdivisions, it would report:

{
    "countries": [
        {
            "country": "US",
            "confidence": 1,
            "subdivisions: null
        }
    ]
}

Let me know if you think a schema that fits this model would be preferable at start, or if it should be introduced later as "Version 0.2".

@zhouyx
Copy link
Contributor

zhouyx commented Feb 12, 2020

Thank you for the proposal! I have reached out to the team who provides the amp-geo service, because I don't feel comfortable finalizing the endpoint design before we have a complete story from them.

A few concerns we have today

  1. support for disputed area, which an ISO code does not exist.
  2. confidence level and how we want to handle it. (Note it's most likely that only one result with a confidence level will be returned). that looks like
{
  country: 'us'
  region 'us-ca'
  confidence level: 0.8
}

there won't be another entry with confidence level 0.2. Also the confidence level will likely only applies to region value.

Because we don't have the design finalized. Let's only focus on country and us-ca for now. What about this

region not supported

{
  "country": "us"
}

region supported, and in us-ca

{
  "country": "us",
  "region": "us-ca"
}

region supported, not in us-ca. or can't verify region

{
  "country": "us",
  "region": null // or any value that's not "us-ca" for now
}

We are still deciding if region value should be "ca" or "us-ca", let's only get "country" support in this PR.

Copy link
Contributor

@jridgewell jridgewell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Needs @zhouyx's approval, and probably @Gregable's.

* @return {Promise<(string|null)>}
*/
fetchCountry_() {
if (typeof urls.geoApi !== 'string') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jridgewell Did we require https in urls? If not we should use assertHttpsUrl

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HTTPS sounds like a good requirement to protect user privacy (physical location). assertHttpsUrl is geared more towards AMPHTML elements that have disallowed src attribute values. I opted for isSecureUrlDeprecated, which also makes allowances for localhost testing, since it is a bit more generic and less tied to AmpDoc.

})
.then(res => res.json())
.then(json => {
if (!/^[a-z]{2}$/i.test(json.country)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: please use json['country'] to avoid the name minimize.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

);
return null;
}
return json.country.toLowerCase();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: same here, json['country']

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -199,14 +202,97 @@ export class AmpGeo extends AMP.BaseElement {
this.mode_ = mode.GEO_HOT_PATCH;
this.country_ = trimmedCountryMatch[1];
} else if (trimmedCountryMatch[0] === '' && !getMode(this.win).localDev) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd prefer to handle one mode in one if statement block. Can we please change the logic to

else if (...) {
  this.mode_ = mode.GEO_HOT_PATCH;
} else if (trimmedCountryMatch[0] === '' && urls.geoApi) {
  this.mode_ = mode.GEO_API;
} else if (!getMode(this.win).localDev) {
  this.error_ = true;
}
....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

method: 'GET',
credentials: 'omit',
})
.then(res => res.json())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jridgewell Curious does xhr request timeout by default? Should we add a timeout here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Timeout introduced in latest commit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm 60 seconds timeout. @jridgewell what's the typical timeout value applied by AMP runtime?

- Code style preferences
- Require HTTPS and non-ampproject-cdn API URL
- Timeout API request after 60 seconds
- Add timeout test
- Cleanup test output of new tests
@mdmower mdmower requested a review from zhouyx February 24, 2020 06:52
return null;
}

if (isProxyOrigin(url)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ProxyOrigin doesn't work the same reason any random url doesn't work. I don't think we need a special check for this. Let me know if you think differently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Admittedly, there's not much reason for this check. It can be removed. I will wait for a final decision on isSecureUrlDeprecated/assertHttpsUrl before pushing a new change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's needed.

@@ -199,14 +202,97 @@ export class AmpGeo extends AMP.BaseElement {
this.mode_ = mode.GEO_HOT_PATCH;
this.country_ = trimmedCountryMatch[1];
} else if (trimmedCountryMatch[0] === '' && !getMode(this.win).localDev) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

return null;
}

if (!isSecureUrlDeprecated(url)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm the function name tells me that it's deprecated : )
#isSecureUrlDeprecated() doesn't handle the relative url case //, but #assertHttpsUrl() does.
The elementContext and sourceName makes the #assertHttpsUrl() looks like it's tied to AMP elements. But they are used only as error message. A better name is probably context instead of elementContext. What about

assertHttpsUrl(url, 'geoApiUrl', 'AMP_CONFIG urls')

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced assertHttpsUrl() is the right choice. It's a wrapper for isSecureUrlDeprecated(url) || /^(\/\/)/.test(url). Given the privacy implications of user location, do you really want to allow relative protocol? It seems more reasonable to directly make use of isSecureUrlDeprecated which requires HTTPS even if an AMP page is served from HTTP.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason it's deprecated is because it doesn't work properly in the different ampdoc modes. Use Services.urlFor's isSecure() instead.

- Prefer Services.Url.isSecure to test if a string URL is HTTPS
- Remove unnecessary check for isProxyOrigin
- Remove unreachable code
@mdmower
Copy link
Contributor Author

mdmower commented Feb 26, 2020

Thank you both for the reviews. Modifications made.

@mdmower mdmower requested a review from zhouyx February 26, 2020 04:14
Copy link
Contributor

@zhouyx zhouyx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with one minor request. Thanks!

* @param {*} url
* @return {?string}
*/
validateApiUrl(url) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: let's make it a private method please. Thanks

@mdmower
Copy link
Contributor Author

mdmower commented Feb 28, 2020

Are we ok with waiting to document this feature until #25873 is ready? Availability of self-hosting is when this feature becomes useful.

@zhouyx
Copy link
Contributor

zhouyx commented Feb 28, 2020

That's sounds like the optimal approach. SGTM Thanks

@mdmower mdmower merged commit 26fde84 into ampproject:master Feb 28, 2020
robinvanopstal added a commit to jungvonmatt/amphtml that referenced this pull request Mar 2, 2020
* master:
  Launch `amp-next-page` v2 & clean up experiment (ampproject#27035)
  ✨Implement Display Locking on amp-accordion (ampproject#25867)
  📖<amp-next-page> documentation and validation (ampproject#26841)
  Improve visibility event tests (ampproject#26847)
  amp-geo: Fall back to API for country (ampproject#26407)
  ✨ Add customization options to <amp-story-quiz> (ampproject#26714)
  navigation: Minor test improvements (ampproject#26926)
  ♻️ Alias video.fullscreen action for symmetry with interface names (ampproject#27017)
  ✨ Implements `amp-analytics` support for `amp-next-page` (ampproject#26451)
  ✨ amp-video-iframe integration: global jwplayer() singleton by default (ampproject#26969)
  Fix unit tests broken by ampproject#26687 (ampproject#27000)
  Filter redirect info from emails (ampproject#27012)
  📖 Make amp-bind doc valid, fix amp-form stories filter (ampproject#27003)
  url-replacements: Minor test improvement (ampproject#26930)
  viewport: Minor test improvement (ampproject#26931)
  amp-consent fullscreen warning (ampproject#26467)
  dep-check: USE CAPS FOR IMPORTANCE (ampproject#26993)
  fix img url (ampproject#26987)

# Conflicts:
#	extensions/amp-next-page/amp-next-page.md
@mdmower mdmower deleted the patch-geo-01 branch March 16, 2020 01:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants