Cannot pass AWS session token to dynamodb connection #25066

Closed
3 tasks done
malatep opened this issue Aug 23, 2023 · 10 comments

Comments

@malatep

malatep commented Aug 23, 2023

Hello, I am trying to connect to DynamoDB using the connection string mentioned here. Superset version: 2.0.1.

dynamodb://{aws_access_key_id}:{aws_secret_access_key}@dynamodb.{region_name}.amazonaws.com:443?connector=superset

Since I am using IAM roles, I would also need to pass the AWS session token. As mentioned in the AWS docs at this link

When you make a call using temporary security credentials, the call must include a session token, which is returned along with those temporary credentials. AWS uses the session token to validate the temporary security credentials.

This is supported by PyDynamoDB as mentioned here

from pydynamodb import connect

cursor = connect(aws_access_key_id="aws_access_key_id",
                 aws_secret_access_key="aws_secret_access_key",
                 aws_session_token="aws_session_token",
                 region_name="region_name").cursor()

However, I could not find any way to include this correctly in the connection string. Is there any way to achieve this?
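For reference, the connection string format quoted above can be assembled as in the minimal sketch below. The helper name `build_dynamodb_uri` is hypothetical. The percent-encoding step is an assumption about typical usage: secret keys can contain characters such as "/" or "+" that would otherwise break URL parsing. The sketch does not add a session token; it only reproduces the format shown in this issue.

```python
from urllib.parse import quote


def build_dynamodb_uri(access_key_id: str, secret_access_key: str, region_name: str) -> str:
    """Assemble the DynamoDB connection string in the format quoted in this issue."""
    # Percent-encode both credentials so reserved characters such as "/" or "+"
    # in a secret key are not misread as URL delimiters.
    user = quote(access_key_id, safe="")
    password = quote(secret_access_key, safe="")
    return (
        f"dynamodb://{user}:{password}"
        f"@dynamodb.{region_name}.amazonaws.com:443?connector=superset"
    )
```

For example, `build_dynamodb_uri("AKIAEXAMPLE", "abc/def+ghi", "eu-west-1")` yields a URI whose password part is `abc%2Fdef%2Bghi`.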

How to reproduce the bug

  1. Create IAM Role with DDB permissions
  2. Create new connection to DDB using key ID and key secret
  3. This will fail because session token is missing

Expected results

There should be an option to add the AWS session token

Actual results

The connection is unsuccessful

Checklist

Make sure to follow these steps before submitting your issue - thank you!

  • I have checked the superset logs for python stacktraces and included it here as text if there are any.
  • I have reproduced the issue with at least the latest released version of superset.
  • I have checked the issue tracker for the same issue and I haven't found one similar.

Additional context

Add any other context about the problem here.

@mdeshmu
Contributor

mdeshmu commented Aug 24, 2023

PyDynamoDB supports IAM roles, so you don't have to worry about token expiry. That's better than manually updating the token in the Superset connection every time.

from pydynamodb import connect

cursor = connect(role_arn="role_arn",
                 role_session_name="PyDynamoDB-session",
                 duration_seconds=3600,
                 region_name="region_name").cursor()

@malatep
Author

malatep commented Aug 24, 2023

Thanks @mdeshmu, do you know if there is any way to use this in the SQLAlchemy URI string mentioned here, or to pass it as an extra database setting?

@mdeshmu
Contributor

mdeshmu commented Aug 24, 2023

yes, here is an example

@malatep
Author

malatep commented Aug 25, 2023

Thanks again for your input.

Unfortunately, for my use case I am not able to use the IAM role directly with the assume_role call made by PyDynamoDB here.

Superset is running in k8s and I am working with ServiceAccount tokens to get AWS credentials. So what I would need is a call to the assume_role_with_web_identity API with the token I provide, but this does not seem to be supported by the PyDynamoDB Connection.

@passren
Contributor

passren commented Aug 27, 2023

@malatep PyDynamoDB v0.5.2 has been released with this feature. Please give it a try.

@malatep
Author

malatep commented Sep 4, 2023

Thanks @passren for your quick implementation of the changes!

I am not sure if I am missing something or if this is a current limitation of Superset, but I am still unable to connect.

What I am doing is passing the role and token in the connection settings under Settings > Advanced > Engine Parameters:

{
  "connect_args": {
    "role_arn": "arn:aws:iam::<ACCOUNT_ID>:role/my_role",
    "web_identity_token": "TOKEN_SERVICEACCOUNT_ROLE"
  }
}

I think the problem is that the Superset DynamoDB connector requires {aws_access_key_id}:{aws_secret_access_key}, which in the case of AssumeRoleWithWebIdentity are available only AFTER you make this call:

Calling AssumeRoleWithWebIdentity does not require the use of AWS security credentials.

PyDynamoDB also allows getting credentials from instance profiles, without the need to pass any credential information. Is it possible to achieve the same in Superset?
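The Engine Parameters payload above could be generated from inside the pod, as in the sketch below. It assumes the pod uses IAM Roles for Service Accounts on EKS, which injects the `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` environment variables. The helper name `build_engine_parameters` is hypothetical; the `role_arn` and `web_identity_token` keys are the ones used in this comment.

```python
import json
import os
from typing import Mapping


def build_engine_parameters(env: Mapping[str, str] = os.environ) -> str:
    """Render the Engine Parameters JSON from the web identity settings in `env`."""
    # Assumption: both variables are present, as they are in a pod that uses
    # IAM Roles for Service Accounts.
    role_arn = env["AWS_ROLE_ARN"]
    token_path = env["AWS_WEB_IDENTITY_TOKEN_FILE"]

    # The projected ServiceAccount token is a plain-text file.
    with open(token_path, encoding="utf-8") as fh:
        token = fh.read().strip()

    payload = {"connect_args": {"role_arn": role_arn, "web_identity_token": token}}
    return json.dumps(payload, indent=2)
```

Note that the token embedded this way is a snapshot: it expires, and the stored connection would then need to be updated with a fresh one.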

@passren
Contributor

passren commented Sep 7, 2023

@malatep You have to keep {aws_access_key_id}:{aws_secret_access_key} in the connection string because Superset checks the string format. But you can use a dummy access key and secret here, such as dummy_key:dummy_secret.

In PyDynamoDB, the credentials passed in the connection string are overridden by the AssumeRole* methods whenever any of the relevant parameters are passed in (for example role_arn, saml_assertion, web_identity_token, etc.). This happens before the actual connection is created.

Could you check what errors come out in the backend? I'd like to help if there are error logs.
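The precedence described above can be pictured with the simplified sketch below. This is an illustration only, not PyDynamoDB's actual code: the function name `credential_source` and the tuple of trigger parameters are hypothetical, and the parameter list is limited to the three examples named in this comment.

```python
# Parameters that, per the comment above, switch the connection to an
# AssumeRole-style flow. Illustrative subset only.
ASSUME_ROLE_TRIGGERS = ("role_arn", "saml_assertion", "web_identity_token")


def credential_source(connect_args: dict) -> str:
    """Return which credential source would win for the given connect_args."""
    # Any non-empty trigger parameter means the key and secret from the
    # connection string are replaced by temporary credentials.
    if any(connect_args.get(name) for name in ASSUME_ROLE_TRIGGERS):
        return "assume_role"
    return "connection_string"
```

Under this model, dummy_key:dummy_secret in the URI is harmless as long as role_arn or one of the other parameters is supplied through connect_args.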

@malatep
Author

malatep commented Sep 8, 2023

Thanks again for your help.

So I tried again, passing dummy values for the key and secret, and there is a good sign that Superset calls _assume_role_with_web_identity. However, I am getting this error:

An error occurred while creating databases: (builtins.NoneType) None
[SQL: Parameter validation failed:
Invalid length for parameter ProviderId, value: 0, valid min length: 4]
(Background on this error at: https://sqlalche.me/e/14/dbapi)

I think it's because of this line of code in PyDynamoDB
https://github.com/passren/PyDynamoDB/blob/f70ae7f41725793a382aba19ad7882581e8e369e/pydynamodb/connection.py#L239

ProviderId parameter is optional and as per API doc

Do not specify this value for an OpenID Connect identity provider.

So in my case I don't need to pass the provider_id, but it gets passed as an empty string anyway. This seems in line with the error Invalid length for parameter ProviderId, value: 0
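One way to avoid sending an empty ProviderId is to build the request arguments conditionally, as in the sketch below. This is not the actual fix shipped in PyDynamoDB; the helper name `web_identity_kwargs` and its defaults are hypothetical. The keys are the request parameter names of the STS AssumeRoleWithWebIdentity API.

```python
from typing import Any, Dict, Optional


def web_identity_kwargs(
    role_arn: str,
    web_identity_token: str,
    role_session_name: str = "PyDynamoDB-session",
    provider_id: Optional[str] = None,
    duration_seconds: int = 3600,
) -> Dict[str, Any]:
    """Build keyword arguments for an AssumeRoleWithWebIdentity request."""
    kwargs: Dict[str, Any] = {
        "RoleArn": role_arn,
        "RoleSessionName": role_session_name,
        "WebIdentityToken": web_identity_token,
        "DurationSeconds": duration_seconds,
    }
    # ProviderId is optional and must not be sent for OpenID Connect
    # providers, so include it only when it is non-empty.
    if provider_id:
        kwargs["ProviderId"] = provider_id
    return kwargs
```

The resulting dict could then be passed to `boto3.client("sts").assume_role_with_web_identity(**kwargs)`.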

@passren
Contributor

passren commented Sep 8, 2023

@malatep Yes, you are right. That is my bad. I have fixed this bug in v0.5.4. Please update pydynamodb to the latest version.
PyDynamoDB 0.5.4

Thank you so much for the investigation.

@malatep
Author

malatep commented Sep 12, 2023

Thank you for the fix. It works!

The challenge I have now is how to update the connection with fresh credentials once the token expires.

I have created a dedicated discussion to see if anyone has guidance on this.

Thank you again. I will close this issue.
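On the token-expiry challenge mentioned above, a first building block is detecting when credentials are close to expiring. The sketch below covers only that check, under the assumption that the expiry time is available as a timezone-aware datetime (as in the Expiration field of an STS response). The function name `needs_refresh` and the five-minute margin are hypothetical. How to push fresh values into the Superset connection is not covered here.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def needs_refresh(
    expiration: datetime,
    now: Optional[datetime] = None,
    margin: timedelta = timedelta(minutes=5),
) -> bool:
    """Return True when the credentials expire within `margin` of `now`."""
    # Default to the current UTC time; `expiration` must be timezone-aware.
    current = now if now is not None else datetime.now(timezone.utc)
    return expiration - current <= margin
```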

@malatep malatep closed this as completed Sep 12, 2023