-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(aws-lambda) Lambda authentication via EC2 IAM roles #2777
feat(aws-lambda) Lambda authentication via EC2 IAM roles #2777
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @Mgutjahr for the PR! Overall I think this will be a very useful addition. A handful of mostly minor style/nitpick comments attached. Additionally, we would like to see tests, both for the schema change, and an integration test for the plugin behavior change as well. Admittedly mocking out something for this will be challenging; perhaps draw on something like #2287 for past discussions of how we've handled tests for this plugin.
kong/plugins/aws-lambda/schema.lua
Outdated
} | ||
}, | ||
self_check = function(schema, plugin_t, dao, is_update) | ||
if not plugin_t["use_ec2_iam_role"] or plugin_t["use_ec2_iam_role"] == false then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefer to use the dot notation to access table members in this case (e.g. plugin_t.use_ec2_iam_role
)
kong/plugins/aws-lambda/handler.lua
Outdated
local iam_role_credentials = get_credentials_from_iam_role() | ||
opts.access_key = iam_role_credentials.access_key | ||
opts.secret_key = iam_role_credentials.secret_key | ||
opts.headers['X-Amz-Security-Token'] = iam_role_credentials.session_token |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very minor style nitpick, prefer to use double quotes instead of single quotes.
|
||
local CACHE_IAM_INSTANCE_CREDS_DURATION = 60 -- seconds to cache credentials from metadata service | ||
|
||
local function get_iam_credentials_cache_key() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could probably be a constant instead of a function
|
||
local function get_iam_credentials_from_instance_profile() | ||
local iam_profile_credentials, err = cache.get_or_set(get_iam_credentials_cache_key(), CACHE_IAM_INSTANCE_CREDS_DURATION, | ||
fetch_iam_credentials_from_metadata_service) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would like to see the fetch_iam_credentials_from_metadata_service
call explicitly defining params, instead of relying on the defaults defined in the function itself. this would likely make testing easier.
iam_role_name .. "' with session token " .. iam_security_token_data['Token']) | ||
|
||
return { | ||
access_key = iam_security_token_data['AccessKeyId'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: prefer to use dot notation to access these field members
iam_role_name .. "' with session token " .. iam_security_token_data['Token']) | ||
|
||
return { | ||
access_key = iam_security_token_data['AccessKeyId'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor style: spacing seems off here, please use two spaces to indent.
@@ -0,0 +1,91 @@ | |||
local http = require 'resty.http' | |||
local json = require 'cjson' | |||
local cache = require "kong.tools.database_cache" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor style comment: please align assignment operators here
metadata_service_port = metadata_service_port or 80 | ||
|
||
local client = http.new() | ||
local ok, err = client:connect(metadata_service_host, metadata_service_port) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: please align assignment operators here
end | ||
|
||
if role_name_request_res.status ~= 200 then | ||
return nil, "[aws-lambda] Fetching role name from metadata service returned status code ".. role_name_request_res.status .. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: can we try to adjust line breaks and such so as to avoid the line being much longer than 80 chars
|
||
local iam_security_token_data = json.decode(iam_security_token_request:read_body()) | ||
|
||
ngx.log(ngx.INFO, "[aws-lambda] Received temporary IAM credential from metadata service for role '" .. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this also seems like it should be a debug level log entry
Hi @p0pr0ck5, I addressed most of your recommendations and fixed the mentioned style issues. On the testing side I added a tests for the schema extension. In addition, I added an integration test which loads credentials from a mocked-out metadata service (using the same approach aws-lambda uses) I decided not to add support for configuring the role the plugin should load as one EC2 instance can only have role configured at a time. This can be found from the AWS documentation:
This is also how almost all of the available AWS SDK's decided to implement support for instance roles. Sorry again for the style issues and the missing tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Mgutjahr thanks for the refactors! a lot of these comments are ticky-tack style notes, just good to be consistent, certainly not a huge problem. we appreciate your feedback! at the moment we are in a merge freeze on master
as the release of 0.11 stable is closing in (i believe @thibaultcha should have details on that), but barring that i think this is close to being merged. thanks again for the contribution!
kong/plugins/aws-lambda/schema.lua
Outdated
@@ -81,4 +85,14 @@ return { | |||
func = check_status, | |||
}, | |||
}, | |||
self_check = function(schema, plugin_t, dao, is_update) | |||
if not plugin_t.use_ec2_iam_role or plugin_t.use_ec2_iam_role == false then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i believe this second condition is redundant, if not
should handle both nil
and false
metadata_service_host = metadata_service_host or '169.254.169.254' | ||
metadata_service_port = metadata_service_port or 80 | ||
|
||
local iam_profile_credentials, err = cache.get_or_set(IAM_CREDENTIALS_CACHE_KEY, CACHE_IAM_INSTANCE_CREDS_DURATION, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A note here, this will require refactoring for the upcoming Kong 0.11, as the cache mechanism has changed and this function call is no longer valid. We should probably have PRs for 0.10.x and 0.11 and merge them simultaneously, with the latter being generated from rebasing this commit in its final form onto next
and adjusting as needed. this might need coordination/communication with @thibaultcha prior to releasing a stable 0.11?
end | ||
|
||
if role_name_request_res.status ~= 200 then | ||
return nil, "[aws-lambda] Fetching role name from metadata service returned status code ".. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor style point: need a space between "
and ..
end | ||
|
||
if iam_security_token_request.status == 404 then | ||
return nil, '[aws-lambda] Unable to request IAM credentials for role' .. iam_role_name .. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor style point: prefer double quotes "
to single quote '
@@ -0,0 +1,88 @@ | |||
local http = require 'resty.http' | |||
local json = require 'cjson' | |||
local cache = require 'kong.tools.database_cache' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor style point: prefer double quotes "
to single quote '
for the entries (generally we prefer double quotes everywhere)
kong/plugins/aws-lambda/schema.lua
Outdated
required = true, | ||
}, | ||
use_ec2_iam_role = { | ||
type="boolean", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: spacing around assignment operator
return { | ||
access_key = iam_security_token_data.AccessKeyId, | ||
secret_key = iam_security_token_data.SecretAccessKey, | ||
session_token = iam_security_token_data.Token |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a few minor style comments:
- align assignment operators
- append a trailing comma to the end of array elements to minimize future diffs
|
||
local iam_role_name = role_name_request_res:read_body() | ||
|
||
ngx.log(ngx.DEBUG, "[aws-lambda] IAM role '" .. iam_role_name .. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here (and a few other places) we can use variadic arguments instead of string concatenation to build the message
|
||
local iam_security_token_request, err = client:request { | ||
method = "GET", | ||
path = "/latest/meta-data/iam/security-credentials/" .. iam_role_name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: trailing comma here for consistency
spec/fixtures/custom_nginx.template
Outdated
AccessKeyId = "test_iam_access_key_id", | ||
SecretAccessKey ="test_iam_secret_access_key", | ||
Token = "test_session_token" | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor style: i think we're one space too far indented here?
Hi @p0pr0ck5, On creating the test I realised that the metadata service call had no connection timeout configured, as this could eat up resources I added a 500ms timeout for the metadata service calls (this should be sufficient as the metadata service runs locally on EC2 instances. Requests usually take ~1ms). |
|
||
local function fetch_iam_credentials_from_metadata_service(metadata_service_host, metadata_service_port) | ||
local client = http.new() | ||
client:set_timeout(500) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's dangerous to make such assumptions considering Kong might be running in a variety of infrastructures with different limitations or requirements out there. Such timeout variables should ideally be made configurable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not quite sure about the validity of this comment, given the very narrow infrastructure scope of the context at play here. Adding a tunable for this seems like unnecessary cognitive load.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our "mantra" of being opinionated (in order to not oversight any potential use-case or to be compatible with as many infrastructures as possible), hints that we should provide customizable timeouts. In fact there shouldn't be a single timeout (to the best of my knowledge) that isn't configurable in Kong already, including this module itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the idea here is that inter-infrastructure compatibility is meaningless here. Is it worth sticking to a hard-line mantra in such a case? IMO it's not, but this isn't a hill for me to die on, just chiming in that this is a very specific use case :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get your concerns, I tried to omit any unnecessary config properties to keep the plugin configuration relatively clean. Especially as the timeout would be another semi-optional config property (only applies if role based authentication is used). From my experience (using native AWS sdks on a variety of platforms) this is a property you would never need to touch. But maybe there are cases where you have too.
I checked other "official" SDK's from Amazon. Most set a default timeout which cannot be changed.
- python cli --> 1 second
- python boto --> 1 second (can be configured)
- java sdk --> 2 seconds
The timeouts from these SDK's are higher than the one set by me, but most of timeouts mentioned above apply for all services offered by the metadata service (and not just the credentials service).
I'd suggest that I increase the timeout to 5 seconds by default and expose it as a parameter to get_iam_credentials_from_instance_profile()
. - This way we can tweak it for the test and exposing it to the plugins config afterwards would be trivial if needed.
If you want me to add it to the config I can do this of course too. Just let me know what you'd prefer.
} | ||
|
||
if not role_name_request_res or role_name_request_res == "" then | ||
ngx.log(ngx.ERR, "[aws-lambda] Could not fetch role name from metadata service: ", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Some weird indentation here again)
kong/plugins/aws-lambda/schema.lua
Outdated
}, | ||
aws_secret = { | ||
type = "string", | ||
required = true, | ||
}, | ||
use_ec2_iam_role = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our policy is that no plugin config already in the database should be left in a previous version of the plugin's schema. This is because we want to avoid complicated code paths handling different versions of the same plugin's schema in the plugin business logic itself. This is a policy that has unfortunately not always been followed, and such oversight lead us to a few bugs/overhead maintenance issues recently.
So: new schema value, new migration for Cassandra/PostgreSQL :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @thibaultcha,
are you referring to the use_ec2_iam_role
config property introduced in this PR? - The aws_secret and aws_key were changed to semi-optional, so whatever config was in the database is still correct after the PR (it will continue to use secrets & keys from the db config if use_ec2_iam_role
is not set / set to false). In other words, as long as use_ec2_iam_role is not set to true the old schema is still enforced.
Do you mean I should configure use_ec2_iam_role
to a required property and set it to false by default through a migration script? I'm unsure of the benefits of this. Wouldn't it be better to remove the default=false flag here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm indeed... I do agree with that reasoning. We kind of set this "unsaid" policy of enforcing migrations when introducing new values, but optional values would not benefit from it in any way, because they might be nil
per definition already.
Alright then, this sounds good! Sorry for the noise!
@Mgutjahr can you rebase this commit to fix the conflict with the rockspec? After that I think we're good to go here. |
19f2813
to
76184c0
Compare
Hi @p0pr0ck5 & @thibaultcha , |
Hi @thibaultcha, do you have any updates/comments on this PR? |
@Mgutjahr sorry for the delay on our end. Apologies, since we've had another release since the last change, can we update the rockspec here? I'd like to see this merged into 0.12. @thibaultcha thoughts? |
Hello @p0pr0ck5 , @thibaultcha and @Mgutjahr . Sorry for disturb you, guys, but we really need this feature available in Kong for AWS. Thanks in advance. |
Is all that needs to happen just rebasing the rockspec? I'm happy to do that real fast if that will help this get merged in. |
is it ok if i re-open this fixed? |
@thibaultcha is all that's left for this to update the rockspec and resolve the handler.lua conflict? seems something trivial and we can merge this. @Mgutjahr would you be able to do that or someone else can fork this and make the necessary updates. would be really nice to get this merged since it's so close and quite useful. |
@farazs I can try to rebase it this weekend. I want to test it at least once also on an actual EC2 on AWS with the current version of kong which is why it is a bit more work than just rebasing it. |
Any chance of this getting reviewed soon? |
Adds support for querying temporary credentials from EC2 metadata service instead of fetching long-lived key-pairs from the database. The credentials are fetched from the EC2 metadata service and are cached for 60 seconds. I assume that many users are running kong on EC2 instances. For these users this is the recommended way to authenticate requests for AWS services as key-rotation and credential provisioning are managed by AWS. For further details refer to the [AWS documentation about EC2 roles](http://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2.html)
…C2 metadata service instance role
…ptional aws_secret and aws_key config properties
…request termination to handler on metadata service error
… as function params
…cation to be compatible with kong 0.11
76184c0
to
70c116f
Compare
Hey, I rebased the PR again to the current master branch. The branch could be merged if you want to. There are just two things I'd like to point out:
|
Manuel Gutjahr seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
Summary
Adds support for querying temporary credentials from EC2 metadata service instead of fetching long-lived key-pairs from the database.
The credentials are fetched from the EC2 metadata service and are cached for 60 seconds. I assume that many users are running kong on EC2 instances. For these users this is the recommended way to authenticate requests for AWS services as key-rotation and credential provisioning are managed by AWS.
For further details refer to the AWS documentation about EC2 roles
Full changelog
kong.plugins.aws-lambda.iam-role-credentials
for fetching and caching temporary credentials from the metadata service (default 169.254.169.254 port 80)use_ec2_iam_role
tokong.plugins.aws-lambda.schema
and declaredaws_key
andaws_secret
as optionalself_check
function to check whether eitheruse_ec2_iam_role
is set to true or both aws_key and aws_secret as configuredkong.plugins.aws-lambda.handler
to either fetch credentials from the metadata service or use credentials stored in database