
Conversation

@brighton1101
Contributor

Adds support for ECDSA, among other key types, in SSHHook. Adds a corresponding test case as well.

closes: #12318

@github-actions

The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.

@NBardelot
Contributor

@brighton1101 this looks good. One suggestion, though: the list of key types `allowed_pkey_types` might be better made configurable than hardcoded.

The issue is that Paramiko uses a poor design pattern here: each subclass of `PKey` implements `from_private_key`, but there is no utility function or collection that provides the list of supported algorithms. So you'll have to hardcode the list in Airflow anyway (and also hardcode a transcoding from algorithm name to Paramiko's `PKey` subclass).
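
For illustration, here is a minimal sketch of what such a hardcoded transcoding and key-loading loop could look like; the names `_PKEY_CLASSES` and `load_private_key` are hypothetical and not taken from this PR:

```python
import io

import paramiko

# Hypothetical hardcoded mapping from ssh-keygen style algorithm names to
# Paramiko's PKey subclasses (Paramiko itself exposes no such collection).
_PKEY_CLASSES = {
    'dsa': paramiko.DSSKey,
    'ecdsa': paramiko.ECDSAKey,
    'ed25519': paramiko.Ed25519Key,
    'rsa': paramiko.RSAKey,
}


def load_private_key(key_data, passphrase=None):
    """Try each known key class until one accepts the private key material."""
    for pkey_class in _PKEY_CLASSES.values():
        try:
            return pkey_class.from_private_key(io.StringIO(key_data), password=passphrase)
        except paramiko.SSHException:
            continue  # not this key type, try the next one
    raise ValueError('Private key is not in a recognized format')
```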

Proposition:

  • a section [ssh] in the Airflow configuration
  • with an option private_keys_algorithm_support
  • which contains a comma-separated list of algorithm names (I'd choose the same names that ssh-keygen uses for its -t option)
  • and, in the hooks/ssh.py Airflow code, a map that transcodes those names to the corresponding Paramiko classes and over which you can iterate (see the sketch below)
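
A rough sketch of how the hook could consume that option, assuming the proposed [ssh] section and private_keys_algorithm_support option are adopted (the `_PKEY_CLASSES` map is the same hypothetical one as in the earlier sketch):

```python
import paramiko
from airflow.configuration import conf

# Hypothetical name -> Paramiko class transcoding map.
_PKEY_CLASSES = {
    'dsa': paramiko.DSSKey,
    'ecdsa': paramiko.ECDSAKey,
    'ed25519': paramiko.Ed25519Key,
    'rsa': paramiko.RSAKey,
}

# Assumption: the [ssh] section and private_keys_algorithm_support option exist
# only if the proposal above is adopted; names follow ssh-keygen's -t values.
allowed_names = conf.get('ssh', 'private_keys_algorithm_support', fallback='dsa,ecdsa,ed25519,rsa')

# Keep only the Paramiko classes whose algorithm name is listed in the config,
# so an administrator can restrict the set of accepted key types.
allowed_pkey_classes = [
    _PKEY_CLASSES[name.strip()] for name in allowed_names.split(',') if name.strip() in _PKEY_CLASSES
]
```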

@brighton1101
Contributor Author

Ah, yeah, that makes sense. 👍 I'm busy today, as it's finals week at my university, but I'll have a chance to get this done over the weekend.

@brighton1101
Contributor Author

@NBardelot I went ahead and made those changes. One thing I'm a bit confused about is how we actually use those values from the config. It seems like they're just there for reference alongside the changes above.

@github-actions

The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.

@NBardelot
Contributor

NBardelot commented Nov 25, 2020

@brighton1101 I see two benefits:

  1. it makes it clear that there is a mapping between an algorithm name and its implementation in Paramiko (maybe one day another library will be used instead of Paramiko?)
  2. it allows someone to restrict usage to a subset of algorithms without being tied to the entire list implemented by Paramiko (say Paramiko implements a new “foobar” algorithm that is not yet standard or approved; you can then avoid using it)

@brighton1101
Contributor Author

@NBardelot understood. Would you like me to add some reconciliation logic that sets a property like `_pkey_mappings` based on a comparison between the algorithms specified in the config and the defaults specified in the hook? That way the user could effectively restrict which algorithms are allowed. Or do the changes look OK as is?

Comment on lines 2056 to 2065
- name: ssh
  description: ~
  options:
    - name: private_keys_algorithm_support
      description: |
        Comma separated list of ssh algorithms names (following `ssh-keygen` naming) supported by
        SSH hook
      type: string
      example: ~
      default: "dsa,ecdsa,ed25519,rsa"
Member


This config value doesn't appear to be used anywhere.

Either we should remove it, or use it. Given this needs a source-code change to add a new supported type, is there any value in having this config section?

Member


@NBardelot Did you have a use case in mind where you would like to configure this value? Given this is just to support the SSH key provided by the user, do we really need this to be configurable?

Contributor

NBardelot commented Dec 3, 2020


Where I work, Airflow is given to teams of users as a platform (Airflow is a PaaS for our teams):

  • my team builds the infrastructure and provides Airflow with a standard configuration
  • the teams use their instance of Airflow

Our standard configuration includes the decisions about security, and we need to be able to enforce the security team's choice of allowed algorithms.

Contributor Author


I don't think I have a strong enough argument for either option on this. Does Airflow configuration typically allow for limitations on user input like this?

Let me know what you decide and I'll change it in either direction (either removing the config altogether, or adding reconciliation logic that drops algorithms not specified in the config).

Contributor Author


@ashb what do you think?

Contributor


Does that really enforce anything? Anyone can create a custom SSHHook and override this security measure, can't they?

Member


Yeah, it doesn't really feel like a config setting adds anything here; this should be enforced on the server side of the connection rather than in Airflow.

@brighton1101 Let's remove this part, please.

(Sorry for letting this languish for so long.)

Contributor


Hi, I've not been very available lately, but I'd like to say that putting the only control on the server side is not how protocols like SSH are intended to behave, in my opinion. Protocol negotiation is designed to reflect both the client's and the server's interests.

An Airflow instance might very well connect to a server that is not under the control of its administrator; this is typically the case in large companies. The administrator should be able to implement a policy limiting the allowed algorithms without depending on Paramiko or on the server's administrator.

Contributor Author


@ashb I'll try to get this done by end of day tomorrow 👍

@brighton1101
Contributor Author

@ashb Not sure why the MySQL tests are failing; it looks like a memory issue. The changes are now in place.

ashb added this to the Airflow 2.1 milestone on Feb 8, 2021
@github-actions

github-actions bot commented Feb 8, 2021

The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest master or amend the last commit of the PR, and push it with --force-with-lease.

github-actions bot added the okay to merge label (It's ok to merge this PR as it does not require more tests) on Feb 8, 2021
ashb merged commit f180fa1 into apache:master on Feb 8, 2021



Development

Successfully merging this pull request may close these issues.

SSHOperator does support only RSA keys and should support other keys compatible with Paramiko
