Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add snowflake keypair authentication #21322

Merged
merged 12 commits into from
Sep 9, 2022
Merged

Conversation

xiayanzheng
Copy link
Contributor

@xiayanzheng xiayanzheng commented Sep 4, 2022

SUMMARY

We want add snowflake in superset, but our snowflake is using SSO authentication so we can't add it with built-in snowflake connector, we found we can connect it with snowflake keypair authentication. so I modified connector file to support keypair authentication.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

BEFORE:
I can't add snowflake with keypair authentication
image

AFTER:
【Update @ 2022/09/04】
Now we can add private key path and private key password in "SECURE EXTRA" with flowing format, like below.

{
    "auth_method": "keypair", 
    "auth_params": {
        "privatekey_path":"/app/Your Pirvate Key Path",
        "privatekey_pass":"Your Private Key Password"
    }
}

iShot_2022-09-04_19 22 08 in

TESTING INSTRUCTIONS

  1. Follow this document to create and bind keypair to snowflake
  2. Upload private key file(.p8 file) to server(ie. /app/rsa_key.p8)
  3. the In Superset web page, add database,select "Other" option in SUPPORTED DATABASES dropdown.
  4. Add SQLALCHEMY URI in SQLALCHEMY URI field
  5. switch tab to "ADVANCED"
  6. Click "Security"
  7. Replace value in "privatekey_path" and "privatekey_pass" and fill it into "SECURE EXTRA"
{
    "auth_method": "keypair", 
    "auth_params": {
        "privatekey_path":"/app/Your Pirvate Key Path on Server",
        "privatekey_pass":"Your Private Key Password"
    }
}
  1. Click Connect or "TEST CONNECTION" in BACSIC Tab

ADDITIONAL INFORMATION

  • Has associated issue: SSO login for snowflake #21107
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@xiayanzheng
Copy link
Contributor Author

@mebegu @villebro @dpgaspar
Hi all, I created a new PR here for snowflake keypair auth, the original PR's commit history is wrong, so I have to create a new one. and I‘ve moved keypair authentication logic to superset/db_engine_specs/snowflake.py and wrap it into update_params_from_encrypted_extra function, please review when you have time,thanks.

@villebro
Copy link
Member

villebro commented Sep 4, 2022

Awesome @xiayanzheng , I'll review within the next 24 hours 👍

@villebro villebro self-requested a review September 4, 2022 14:55
@codecov
Copy link

codecov bot commented Sep 4, 2022

Codecov Report

Merging #21322 (053230c) into master (c3f8417) will decrease coverage by 11.39%.
The diff coverage is 45.63%.

❗ Current head 053230c differs from pull request most recent head e7e93b6. Consider uploading reports for the commit e7e93b6 to get more accurate results

@@             Coverage Diff             @@
##           master   #21322       +/-   ##
===========================================
- Coverage   66.56%   55.17%   -11.40%     
===========================================
  Files        1790     1790               
  Lines       68393    68454       +61     
  Branches     7279     7287        +8     
===========================================
- Hits        45526    37768     -7758     
- Misses      20990    28809     +7819     
  Partials     1877     1877               
Flag Coverage Δ
hive 52.92% <20.96%> (-0.03%) ⬇️
mysql ?
postgres ?
presto 52.82% <20.96%> (-0.03%) ⬇️
python 57.78% <20.96%> (-23.57%) ⬇️
sqlite ?
unit 50.83% <20.96%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...lLab/components/ExploreCtasResultsButton/index.tsx 8.33% <0.00%> (ø)
...c/SqlLab/components/ExploreResultsButton/index.tsx 100.00% <ø> (ø)
...rontend/src/SqlLab/components/QueryTable/index.tsx 70.17% <ø> (ø)
...frontend/src/SqlLab/components/SouthPane/index.tsx 66.66% <ø> (ø)
...end/src/components/Datasource/DatasourceEditor.jsx 65.35% <0.00%> (-0.26%) ⬇️
...onents/ErrorMessage/ErrorMessageWithStackTrace.tsx 80.00% <ø> (+20.00%) ⬆️
...-frontend/src/components/FilterableTable/index.tsx 72.63% <ø> (ø)
...src/dashboard/components/PropertiesModal/index.tsx 61.81% <0.00%> (-0.38%) ⬇️
...d/src/explore/components/PropertiesModal/index.tsx 62.90% <0.00%> (-1.04%) ⬇️
superset/security/manager.py 63.30% <ø> (-32.12%) ⬇️
... and 304 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for rewriting this functionality. Left a few comments - In addition, please add documentation for this as requested in the previous PR so others don't need to read the code to understand how this works.

Comment on lines 297 to 298
auth_method = encrypted_extra.pop("auth_method", None)
auth_params = encrypted_extra.pop("auth_params", {})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any reason to .pop() here, let's just .get()

Comment on lines 303 to 308
with open(auth_params['privatekey_path'], "rb") as key:
p_key = serialization.load_pem_private_key(
key.read(),
password=auth_params['privatekey_pass'].encode(),
backend=default_backend()
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also support passing the privatekey directly to avoid having to place the key inside the Superset instance/pod. This way it would be possible to either pass privatekey_path or privatekey.

Comment on lines 139 to 142
snowflake-connector-python==2.7.9
# via snowflake
snowflake-sqlalchemy==1.2.4
# via snowflake
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we need these, as there are no integration tests added here that require having the snowflake drivers present.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the previous pr if snowflake drivers are not included, the GitHub actions will raise snowflake not found error even no integration tests added.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because the previous PR did imports from the snowflake-sqlalchemy package (this one no longer does)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I forgot that, thanks for reminding me. In this case, snowflake drivers should be removed from testing requirements.

return
connect_args = params.setdefault("connect_args", {})
if auth_method == "keypair":
with open(auth_params['privatekey_path'], "rb") as key:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There appears to be some linting issues here (we should always use double quotes)

Copy link
Member

@dpgaspar dpgaspar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good. Would be nice to have tests and add some docs also.

@villebro it feels like all these different excrypted_extra Dict keys can become hard to manage on the future, do you think adding some TypedDicts would improve it?

encryption_algorithm=serialization.NoEncryption())
connect_args["private_key"] = pkb
except json.JSONDecodeError as ex:
logger.error(ex, exc_info=True)
Copy link
Member

@dpgaspar dpgaspar Sep 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: use logger.exception(ex). Also wonder if there is value on this try/except block, since load_perm_private_key can raise and p_key.private_bytes can also raise. We could just let it fail

if not database.encrypted_extra:
return
try:
encrypted_extra = json.loads(database.encrypted_extra)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we can move the except block to here, since only json.loads will raise JSONDecodeError. More readable and less nesting

if not auth_method:
return
connect_args = params.setdefault("connect_args", {})
if auth_method == "keypair":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should add an else and support ALLOWED_EXTRA_AUTHENTICATIONS

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added in 89cf4d3

@xiayanzheng
Copy link
Contributor Author

@villebro @dpgaspar Thanks for the review~ changes are updated in commit f5afffb

@villebro
Copy link
Member

villebro commented Sep 5, 2022

Looking good. Would be nice to have tests and add some docs also.

@villebro it feels like all these different excrypted_extra Dict keys can become hard to manage on the future, do you think adding some TypedDicts would improve it?

@dpgaspar not a bad idea. Maybe we can do a standardization pass for 3.0 that makes these configs as uniform as possible across all specs?

Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments, but we're getting closer 👍

Comment on lines 37 to 54
***Please note that you need to split your private key content line by line before store it to privatekey_body***

```
{
"auth_method": "keypair",
"auth_params": {
"privatekey_body":[
"-----BEGIN ENCRYPTED PRIVATE KEY-----",
"Key Body",
"Key Body",
"Key Body",
"Key Body",
"-----END ENCRYPTED PRIVATE KEY-----"
],
"privatekey_pass":"Your Private Key Password"
}
}
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While JSON isn't ideal for storing multiline strings, maybe we should just store it with the newlines replaced by a \n as follows:

{
     "auth_method": "keypair", 
     "auth_params": {
         "privatekey_body": "-----BEGIN ENCRYPTED PRIVATE KEY-----\n...\n-----END ENCRYPTED PRIVATE KEY-----",
         "privatekey_pass":"Your Private Key Password"
     }
 }

Later when we consolidate this type of logic across all db engine specs we can then move the private key to a separate column in the database model and have a dedicated field in the UI that accepts the string with newlines. WDYT @xiayanzheng @dpgaspar ?

Copy link
Member

@dpgaspar dpgaspar Sep 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@villebro a string with newlines would be more user friendly, JSON is more flexible but can tend to generate more divergent configs. standardization pass for 3.0 sounds perfect to me. Can you add it to #19597 ?

Currently, I prefer the \n way for privatekey_body

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your advice, I'm ok with both formats, and this change just pushed to this PR.

@@ -136,7 +136,6 @@ websocket-client==1.2.0
# via docker
wrapt==1.12.1
# via astroid

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add the removed newline to avoid introducing unnecessary changes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in 78f625a

@villebro villebro changed the title feat: add snowflake keypair authantication feat: add snowflake keypair authentication Sep 7, 2022
Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. It would be great to add some unit tests, but I'm ok adding this as-is and following up with unit tests later if/when bugs are found.

@xiayanzheng
Copy link
Contributor Author

LGTM. It would be great to add some unit tests, but I'm ok adding this as-is and following up with unit tests later if/when bugs are found.

Thanks for the review and approval, I'm starting to spin up the test environment and write tests, after the test code is ready I will create a new PR.

@xiayanzheng
Copy link
Contributor Author

Hi @villebro, do we need to wait for someone to merge this pr, or do I need to do something to merge it?

@villebro
Copy link
Member

villebro commented Sep 9, 2022

Hi @villebro, do we need to wait for someone to merge this pr, or do I need to do something to merge it?

I think it's been ready for review long enough to be merged. If we need to refine this let's open a follow up PR. Thanks for all the work @xiayanzheng , really awesome feature!

@villebro villebro merged commit 9fdd75b into apache:master Sep 9, 2022
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 2.1.0 and removed 🚢 2.1.3 labels Mar 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/M 🚢 2.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants