implement Request#gdpr? #22526

Merged 2 commits on May 18, 2018
20 changes: 11 additions & 9 deletions cookbooks/cdo-varnish/libraries/http_cache.rb
@@ -7,6 +7,8 @@ class HttpCache

# Language header and cookie are needed to separately cache language-specific pages.
LANGUAGE_HEADER = %w(Accept-Language).freeze
COUNTRY_HEADER = %w(CloudFront-Viewer-Country).freeze
WHITELISTED_HEADERS = LANGUAGE_HEADER + COUNTRY_HEADER
Member

We've been trying to avoid this phrase. "Accept list" or something else instead?

Contributor Author

I'm leaving this for now for consistency with the rest of the existing file and with the CloudFront API terminology, but will follow up on whether we have a need to adopt/clarify naming guidelines moving forward.


DEFAULT_COOKIES = [
# Language drop-down selection.
@@ -71,7 +73,7 @@ def self.config(env)
behaviors: [
{
path: '/api/hour/*',
-headers: LANGUAGE_HEADER,
+headers: WHITELISTED_HEADERS,
# Allow the company cookie to be read and set to track company users for tutorials.
cookies: whitelisted_cookies + ['company']
},
@@ -98,13 +100,13 @@ def self.config(env)
/pd-program-registration*
/poste*
),
-headers: LANGUAGE_HEADER,
+headers: WHITELISTED_HEADERS,
cookies: whitelisted_cookies
},
{
path: '/dashboardapi/*',
proxy: 'dashboard',
-headers: LANGUAGE_HEADER,
+headers: WHITELISTED_HEADERS,
cookies: whitelisted_cookies
}
],
@@ -130,7 +132,7 @@ def self.config(env)
/v3/animations/*
/v3/files/*
),
-headers: LANGUAGE_HEADER,
+headers: WHITELISTED_HEADERS,
cookies: whitelisted_cookies
},
{
@@ -142,12 +144,12 @@ def self.config(env)
/api/user_progress/*
/milestone/*
),
-headers: LANGUAGE_HEADER + ['User-Agent'],
+headers: WHITELISTED_HEADERS + ['User-Agent'],
cookies: whitelisted_cookies
},
{
path: CACHED_SCRIPTS_MAP.values,
-headers: LANGUAGE_HEADER,
+headers: WHITELISTED_HEADERS,
cookies: default_cookies
},
{
@@ -157,7 +159,7 @@ def self.config(env)
},
{
path: '/api/*',
-headers: LANGUAGE_HEADER,
+headers: WHITELISTED_HEADERS,
cookies: whitelisted_cookies
},
{
@@ -169,7 +171,7 @@ def self.config(env)
{
path: '/v2/*',
proxy: 'pegasus',
-headers: LANGUAGE_HEADER,
+headers: WHITELISTED_HEADERS,
cookies: whitelisted_cookies
},
{
@@ -180,7 +182,7 @@ def self.config(env)
],
# Default Dashboard paths are session-specific, whitelist all session cookies and language header.
default: {
-headers: LANGUAGE_HEADER,
+headers: WHITELISTED_HEADERS,
cookies: whitelisted_cookies
}
}
10 changes: 10 additions & 0 deletions lib/cdo/rack/request.rb
@@ -1,6 +1,7 @@
require 'rack/request'
require 'ipaddr'
require 'json'
require 'country_codes'

module Cdo
module RequestExtension
@@ -101,6 +102,15 @@ def user_id_from_session_cookie
rescue
return nil
end

def country
Member

Just confirming: do both the CloudFront country and the fallback country work for Pegasus requests too? And is country now part of the cache variant for all sites?

Contributor Author

> do both the cloudfront country and the fallback country work for Pegasus requests too? And country is now part of the cache variant for all sites?

No, this won't work for [all] Pegasus requests as it's currently configured, because country isn't added to the cache variant for cached pages. I've only added the country header to non-cached CloudFront path behaviors for now to avoid impact on any existing cache behavior.

If we do end up with cases where we want to vary cached page responses by country code (in addition to the language variation we already do on these pages), we can change the cache configuration to make that work as well. However, such a change would multiply the number of uniquely cached pages by the number of countries (so the total variants for each cached page would be languages * countries * user_types). This would probably be fine, but it would be a significant enough cache change that we'd want to watch for any increased load when deploying.
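
To make the multiplication concrete, here is a minimal sketch of the arithmetic. The helper name `cache_variants` and all three counts are made up for illustration; they are not measured values from this deployment.

```ruby
# Number of distinct cached copies of a single page, given how many values
# each cache-key dimension can take. `countries` defaults to 1, which models
# a cache key that does not include the country header.
def cache_variants(languages:, user_types:, countries: 1)
  languages * countries * user_types
end

# Hypothetical counts, chosen only to show the effect of adding a dimension.
without_country = cache_variants(languages: 30, user_types: 3)
with_country = cache_variants(languages: 30, user_types: 3, countries: 250)

puts "variants per page without country: #{without_country}"
puts "variants per page with country:    #{with_country}"
```

With these assumed numbers, each cached page goes from 90 possible variants to 22,500, which is why the change would be worth monitoring on deploy.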

Member

As a follow-up, we now vary Pegasus caching by country, courtesy of #26753.

env['HTTP_CLOUDFRONT_VIEWER_COUNTRY'] ||
location&.country_code
end

def gdpr?
gdpr_country_code?(country)
Member

Apparently we've been using EEA as a broader catch-all, rather than EU. Maybe we could have the option to query either? @poorvasingal can comment fully.

Contributor

I've been using EEA to be broader and play it safer. EEA = EU + Iceland, Liechtenstein and Norway.
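
One way the "query either" option could look, as a sketch only. The lists here are abbreviated, and the names `EU_CODES_SKETCH`, `EEA_CODES_SKETCH`, `eu_country_code?` and `eea_country_code?` are hypothetical; the PR itself ends up exposing a single `gdpr_country_code?` backed by the EEA list.

```ruby
# Abbreviated lists for illustration; the full lists live in lib/country_codes.rb.
EU_CODES_SKETCH = %w(DE FR IE).freeze
EEA_ONLY_CODES_SKETCH = %w(IS LI NO).freeze
EEA_CODES_SKETCH = (EU_CODES_SKETCH + EEA_ONLY_CODES_SKETCH).freeze

# Normalize input the same way gdpr_country_code? does: nil-safe,
# whitespace-stripped, upper-cased.
def normalize_country_code(code)
  code.to_s.strip.upcase
end

# Hypothetical narrow check: EU member states only.
def eu_country_code?(code)
  EU_CODES_SKETCH.include?(normalize_country_code(code))
end

# Hypothetical broad check: EU plus Iceland, Liechtenstein and Norway.
def eea_country_code?(code)
  EEA_CODES_SKETCH.include?(normalize_country_code(code))
end

puts eu_country_code?('no')   # Norway is not in the EU list
puts eea_country_code?('no')  # but it is in the EEA list
```

Keeping the two predicates separate lets a caller pick the narrower or the broader definition explicitly instead of relying on what "GDPR country" happens to mean.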

Contributor Author

Updated.

end
end
end
Rack::Request.prepend Cdo::RequestExtension
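
A minimal, Rack-free sketch of the header-first lookup that `country` performs. `FakeRequest`, `CountryLookup` and the hard-coded geolocation table are hypothetical stand-ins so the example runs on its own; the real extension is prepended to `Rack::Request` and resolves `location` through a geocoder.

```ruby
# Hypothetical stand-in for Rack::Request: it only exposes the env hash.
class FakeRequest
  attr_reader :env

  def initialize(env)
    @env = env
  end
end

module CountryLookup
  Location = Struct.new(:country_code)

  # Assumption: a fixed table replaces the real IP geolocation call.
  KNOWN_IPS = {'89.151.64.0' => 'GB'}.freeze

  # Returns a Location for known IPs, or nil when the IP cannot be resolved.
  def location
    code = KNOWN_IPS[env['REMOTE_ADDR']]
    code && Location.new(code)
  end

  # Prefer the CloudFront header; fall back to IP geolocation; else nil.
  def country
    env['HTTP_CLOUDFRONT_VIEWER_COUNTRY'] || location&.country_code
  end
end

FakeRequest.prepend CountryLookup

p FakeRequest.new({'HTTP_CLOUDFRONT_VIEWER_COUNTRY' => 'US'}).country
p FakeRequest.new({'REMOTE_ADDR' => '89.151.64.0'}).country
p FakeRequest.new({}).country
```

The safe-navigation call (`location&.country_code`) is what keeps the method returning `nil` instead of raising when neither source yields a country.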
49 changes: 49 additions & 0 deletions lib/country_codes.rb
@@ -250,6 +250,48 @@
"ZW" => "Zimbabwe",
}.freeze

# ISO 3166 alpha-2 country codes for Member States of the European Union.
# Source: http://publications.europa.eu/code/pdf/370000en.htm
# ISO codes for Greece (GR) and United Kingdom (GB) are listed instead of their recommended abbreviations (EL and UK).
EU_COUNTRY_CODES = %w(
AT
BE
BG
CY
CZ
DE
DK
EE
ES
FI
FR
GB
GR
HR
HU
IE
IT
LT
LU
LV
MT
NL
PL
PT
RO
SE
SI
SK
).freeze

# EEA = EU + Iceland, Liechtenstein and Norway.
EEA_COUNTRY_CODES = (EU_COUNTRY_CODES +
%w(
IS
LI
NO
)).freeze

# Returns the name of the country whose two character country code is code.
# If code is not a valid two character country code, returns code.
def country_name_from_code(code)
@@ -264,3 +306,10 @@ def get_all_countries
def valid_country_code?(code)
return COUNTRY_CODE_TO_COUNTRY_NAME[code.to_s.strip.upcase].present?
end

# @return true if the provided alpha-2 country code represents a
# member state of the European Economic Area (EU plus Iceland,
# Liechtenstein and Norway), which is treated here as covered by the GDPR.
def gdpr_country_code?(code)
return false if code.nil?
EEA_COUNTRY_CODES.include?(code.to_s.strip.upcase)
end
9 changes: 9 additions & 0 deletions pegasus/routes/v2_geocoder_routes.rb
@@ -21,3 +21,12 @@
content_type :json
JSON.pretty_generate(location)
end

get '/v2/country' do
dont_cache
content_type :json
JSON.pretty_generate(
country: request.country,
gdpr: request.gdpr?
)
end
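
For reference, a sketch of the JSON body this route would return for a viewer resolved to Great Britain. The helper name `country_payload` is hypothetical and stands in for the route body; only `JSON.pretty_generate` from the standard library is used.

```ruby
require 'json'

# Builds the same two-key payload the route generates, from plain values
# instead of a live request object.
def country_payload(country, gdpr)
  JSON.pretty_generate(country: country, gdpr: gdpr)
end

puts country_payload('GB', true)
```

A client would read the `gdpr` flag from the parsed response rather than re-deriving it from `country`, so the list of covered countries stays server-side.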
13 changes: 13 additions & 0 deletions pegasus/test/test_request.rb
@@ -28,4 +28,17 @@ def test_unknown_ip
req = Rack::Request.new({'HTTP_X_FORWARDED_FOR' => 'unknown'})
assert_nil req.location
end

def test_gdpr
assert Rack::Request.new('HTTP_CLOUDFRONT_VIEWER_COUNTRY' => 'gb').gdpr?
refute Rack::Request.new('HTTP_CLOUDFRONT_VIEWER_COUNTRY' => 'us').gdpr?

# If the CloudFront-Viewer-Country header is not set, IP-based geolocation is used as a fallback.
user_ip = '89.151.64.0' # Great Britain IP address range
# The geocoder gem resolves the IP using freegeoip, this mocks the underlying HTTP requests.
stub_request(:get, "#{CDO.freegeoip_host || 'freegeoip.io'}/json/#{user_ip}").to_return(
body: {ip: user_ip, country_code: 'GB'}.to_json
)
assert Rack::Request.new('REMOTE_ADDR' => user_ip).gdpr?
end
end