Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AIRFLOW-3383] Rotate fernet keys. #4225

Merged
merged 1 commit into from Jan 21, 2019
Merged

[AIRFLOW-3383] Rotate fernet keys. #4225

merged 1 commit into from Jan 21, 2019

Conversation

jmcarp
Copy link
Contributor

@jmcarp jmcarp commented Nov 22, 2018

Make sure you have checked all steps below.

Jira

  • My PR addresses the following Airflow Jira issues and references them in the PR title. For example, "[AIRFLOW-XXX] My Airflow PR"

Description

  • Here are some details about my PR, including screenshots of any UI changes:

As far as I can tell, it's not straightforward to rotate the fernet key for encrypted passwords and extras. A user would have to generate a new key, restart airflow, and manually re-enter each value to be encrypted via the web interface. It should be possible to specify multiple fernet keys at once, and to easily re-encrypt values with a new key. The cryptography package provides a MultiFernet class with a rotate method that handles this use case, so I wrote up a patch that uses MultiFernet to support multiple keys and rotation via the command line.

With this approach, we can rotate keys by adding a new key at the start of the FERNET_KEYS config variable, then running the rotate_credentials command from the command line. If the approach makes sense, I'll write up some documentation.

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • When adding new operators/hooks/sensors, the autoclass documentation generation needs to be added.

Code Quality

  • Passes flake8

@codecov-io
Copy link

codecov-io commented Nov 22, 2018

Codecov Report

Merging #4225 into master will increase coverage by <.01%.
The diff coverage is 63.15%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4225      +/-   ##
==========================================
+ Coverage    74.1%   74.11%   +<.01%     
==========================================
  Files         421      421              
  Lines       27662    27679      +17     
==========================================
+ Hits        20498    20513      +15     
- Misses       7164     7166       +2
Impacted Files Coverage Δ
airflow/models/__init__.py 92.38% <100%> (+0.23%) ⬆️
airflow/bin/cli.py 64.64% <14.28%> (-0.44%) ⬇️
airflow/models/connection.py 64.02% <83.33%> (+0.73%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 31318ba...16204ff. Read the comment docs.

@Fokko
Copy link
Contributor

Fokko commented Nov 26, 2018

Ideally, this isn't anything that I would use. In my case I would:

  • Don't store all the credentials in Airflow itself, but connect to an external credential system such as Google KMS, or Azure Keyvault, Hashicorp vault, etc.
  • When provisioning a new Airflow instance, first deploy it, and then set all the credentials again programmatically. We don't want to manually enter all the credentials because that is a very brittle process. Therefore this should be automated. If you update you Fernet key, you could update your credentials as well.

What do you think?

@jmcarp
Copy link
Contributor Author

jmcarp commented Nov 26, 2018

  • Agreed that it would be useful to integrate Airflow with external KMS options. But AWS/GCP/Azure KMS services don't store encrypted credentials--they store encryption keys. AWS Parameter Store and Hashicorp Vault do store secrets. Would you be interested in adding pluggable secret encryption and/or storage for Airflow 2.0? I'd be happy to contribute.
  • I think it should be possible to rotate the Fernet key without updating all credentials. Ideally users can easily automate updating all credentials, but that might not be possible--for example, users might add credentials to Airflow manually, credentials might need to be pulled from many sources, etc. And in general, it's good practice to make it possible to easily rotate any cryptographic keys.

By the way, I also submitted #4232, which should make it easier to programmatically update credentials.

@jmcarp
Copy link
Contributor Author

jmcarp commented Dec 9, 2018

Ping @Fokko @ashb @kaxil. I agree that storing credentials in an external service like Vault would be a useful feature, but as long as we're supporting credential storage in the airflow database with fernet encryption, I think we should support fernet key rotation. This was the simplest implementation that came to mind. WDYT?

I would also be happy to write up some options for using external encryption and credential services if you'd be open to it.

@ashb
Copy link
Member

ashb commented Dec 9, 2018

but as long as we're supporting credential storage in the airflow database with fernet encryption, I think we should support fernet key rotation

Agreed.

@jmcarp
Copy link
Contributor Author

jmcarp commented Dec 11, 2018

@ashb thanks! Can you take a look and let me know if this approach makes sense? I'll write up docs if so.

@@ -614,6 +614,16 @@ def next_execution(args):
print(None)


@cli_utils.action_logging
def rotate_credentials(args):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good, only comment is that I think this name needs changing - even knowning what the PR was for I thought this was about rotating the credentials in the connection.

How about reencrypt_connections, rotate_fernet_key, or rotate_encryption_keys?

@jmcarp jmcarp force-pushed the rotate-keys branch 2 times, most recently from b44cbd5 to 0ed3e20 Compare December 13, 2018 04:59
@jmcarp
Copy link
Contributor Author

jmcarp commented Dec 13, 2018

Thanks for reviewing @ashb. I agree that rotate_credentials is too vague--I renamed the command to rotate_fernet_key. I also added documentation to secure-connections.rst. The build is failing, but looks like it's not related to these changes.

@jmcarp
Copy link
Contributor Author

jmcarp commented Dec 19, 2018

The build is passing again after rebasing on master. Ready for review when you have time @ashb.

},
{
'func': rotate_fernet_key,
'help': 'Rotate all encrypted connection credentials.',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh argh! We also encrypt variables too. Sorry, forgot about that before.

@jmcarp
Copy link
Contributor Author

jmcarp commented Dec 19, 2018

Good call, I updated the help text for the rotate_fernet_key command.

@jmcarp
Copy link
Contributor Author

jmcarp commented Dec 20, 2018

Merge conflicts resolved. Should be ready for another look @ashb.

airflow/bin/cli.py Outdated Show resolved Hide resolved
@jmcarp
Copy link
Contributor Author

jmcarp commented Dec 21, 2018

Sorry @ashb, I was in a hurry earlier and totally misread your point about encrypting variables. I pushed up some changes to rotate keys on encrypted variables too.

@jmcarp jmcarp force-pushed the rotate-keys branch 2 times, most recently from 684f636 to 222c1d9 Compare December 21, 2018 21:01
@jmcarp
Copy link
Contributor Author

jmcarp commented Jan 8, 2019

Ready for another look when you have time @ashb.

@jmcarp
Copy link
Contributor Author

jmcarp commented Jan 16, 2019

Ping @ashb

Copy link
Member

@ashb ashb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically good to go now (thanks for the ping) but one or two small possible improvements now

tests/models.py Outdated Show resolved Hide resolved
airflow/bin/cli.py Outdated Show resolved Hide resolved
@jmcarp
Copy link
Contributor Author

jmcarp commented Jan 20, 2019

Thanks @ashb, updated the patch.

@ashb ashb merged commit bd74dda into apache:master Jan 21, 2019
ashb pushed a commit to ashb/airflow that referenced this pull request Mar 29, 2019
Add the ability to change the encryption key of all encrypted variables and
connections
wmorris75 pushed a commit to modmed/incubator-airflow that referenced this pull request Jul 29, 2019
Add the ability to change the encryption key of all encrypted variables and
connections
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants