HPCC-24546 Support for Hashicorp vaults and kubernetes secrets #14122

Merged
merged 1 commit on Sep 14, 2020

Member

@afishbeck afishbeck commented Aug 28, 2020

Signed-off-by: Anthony Fishbeck <anthony.fishbeck@lexisnexisrisk.com>

Type of change:

  • This change is a bug fix (non-breaking change which fixes an issue).
  • This change is a new feature (non-breaking change which adds functionality).
  • This change improves the code (refactor or other change that does not change the functionality).
  • This change fixes warnings (the fix does not alter the functionality or the generated code).
  • This change is a breaking change (fix or feature that will cause existing behavior to change).
  • This change alters the query API (existing queries will have to be recompiled)

Checklist:

  • My code follows the code style of this project.
    • My code does not create any new warnings from compiler, build system, or lint.
  • The commit message is properly formatted and free of typos.
    • The commit message title makes sense in a changelog, by itself.
    • The commit is signed.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly, or...
    • I have created a JIRA ticket to update the documentation.
    • Any new interfaces or exported functions are appropriately commented.
  • I have read the CONTRIBUTORS document.
  • The change has been fully tested:
    • I have added tests to cover my changes.
    • All new and existing tests passed.
    • I have checked that this change does not introduce memory leaks.
    • I have used Valgrind or similar tools to check for potential issues.
  • I have given due consideration to all of the following potential concerns:
    • Scalability
    • Performance
    • Security
    • Thread-safety
    • Cloud-compatibility
    • Premature optimization
    • Existing deployed queries will not be broken
    • This change fixes the problem, not just the symptom
    • The target branch of this pull request is appropriate for such a change.
  • There are no similar instances of the same problem that should be addressed
    • I have addressed them here
    • I have raised JIRA issues to address them separately
  • This is a user interface / front-end modification
    • I have tested my changes in multiple modern browsers
    • The component(s) render as expected

Smoketest:

  • Send notifications about my Pull Request position in Smoketest queue.
  • Test my draft Pull Request.

Testing:

@hpcc-jirabot

https://track.hpccsystems.com/browse/HPCC-24546
Jira not updated (pull request already registered)

@afishbeck
Member Author

@richardkchapman @ghalliday please review.

The helm/examples/secrets/README.md file is meant to be a very easy-to-use walkthrough of setting up and using Vault and Kubernetes secrets. Please include usability comments in your review.

I've kept the secret and vault functions in jutil.cpp for now, but can move them once we've agreed on the approach.

I've added the open source cpp-httplib code for my REST calls. The nice thing is that it's all in one header file. When I move the code I could use our CHttpClient code instead, but I didn't want to link esphttp to jlib, and this library is a bit more low level.

@afishbeck
Member Author

Another option for REST calls could be Boost, but I didn't want to deal with large dependencies.

@afishbeck
Member Author

Removed tests that were only meant for my environment, and updated what seems to be an arbitrary cpp-httplib OpenSSL version deprecation.

@AttilaVamos
Contributor

I am sure the schedule2 failure isn't related to this PR.

@afishbeck
Member Author

Updated caching logic to check all viable caching locations before going to disk or vaults.

Member

@ghalliday ghalliday left a comment

@afishbeck the changes look good to me. A few comments, but none of them very serious.

helm/examples/secrets/README.md (outdated)
helm/hpcc/templates/_helpers.tpl (outdated)
system/jlib/jutil.cpp (outdated)
system/jlib/jutil.cpp (outdated)
system/jlib/jutil.cpp (outdated)
if (created && (created < timeoutThreshold))
{
    tree->removeTree(envelope);
    puts("\nremoved from vault cache\n");
Member

similar comments about tracing with puts

Member Author

Forgot to scrub these personal debug statements out.

        return false;
    }
};
class CVaultSet
Member

minor: newline between classes.

//MORE: cache the secret for up to secretTimeoutMs
bool getCachedSecret(CVaultKind &kind, StringBuffer &content, const char *secret, const char *version)
{
    std::map<std::string, std::unique_ptr<CVault>>::iterator it = vaults.begin();
Member

possibly simpler to use "auto" instead of the explicit type (and elsewhere)
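
To illustrate the suggestion, here is a small sketch that walks a map of owned vault objects twice: once with the explicit iterator type and once with `auto` in a range-based loop. The `Vault` struct and both function names are stand-ins invented for this example, not the PR's `CVault` code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Stand-in for the PR's CVault class; only a name is kept for illustration.
struct Vault
{
    std::string name;
};

using VaultMap = std::map<std::string, std::unique_ptr<Vault>>;

// Explicit iterator type, in the style of the reviewed line.
inline std::size_t countNamedExplicit(const VaultMap &vaults)
{
    std::size_t count = 0;
    for (std::map<std::string, std::unique_ptr<Vault>>::const_iterator it = vaults.begin(); it != vaults.end(); ++it)
    {
        if (it->second && !it->second->name.empty())
            count++;
    }
    return count;
}

// Same traversal with auto: the compiler deduces the element type.
inline std::size_t countNamedAuto(const VaultMap &vaults)
{
    std::size_t count = 0;
    for (const auto &entry : vaults)
    {
        if (entry.second && !entry.second->name.empty())
            count++;
    }
    return count;
}
```

Both functions return the same count; the second simply has less type noise to keep in sync if the container type changes.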

@@ -3056,19 +3175,481 @@ extern jlib_decl void setSecretMount(const char * path)
    secretDirectory.set(path);
}

extern jlib_decl StringBuffer & getSecret(StringBuffer & result, const char * name, const char * key)
enum class CVaultKind { kv_v1, kv_v2 };
Member

is it worth splitting this code into a jsecret.hpp/cpp? (I don't have a strong opinion either way.)

Member Author

Yes, I'll split it out. I wanted an initial sanity check on the approach first.

static StringBuffer secretDirectory;
static CriticalSection secretCS;
-static unsigned secretTimeoutMs = UINT_MAX;
+static unsigned secretTimeoutMs = 60 * 60 * 1000;
Member

Is this typical for k8s? (I never researched what the standard default was.)

Member Author

I'll see if there's some consensus somewhere.
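
To make the timeout question concrete, here is a minimal sketch of a timestamped cache whose entries expire after a configurable number of milliseconds, in the spirit of the `created < timeoutThreshold` check quoted earlier in this review. `SecretCache` and `CachedSecret` are hypothetical names, not the PR's classes, and the caller passes in the current millisecond tick as an assumption to keep the logic easy to test.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical cache entry: the secret content plus the tick (ms) at which it was stored.
struct CachedSecret
{
    std::string content;
    unsigned created = 0;
};

class SecretCache
{
public:
    explicit SecretCache(unsigned _timeoutMs) : timeoutMs(_timeoutMs) {}

    void store(const std::string &name, const std::string &content, unsigned now)
    {
        entries[name] = CachedSecret{content, now};
    }

    // Returns the cached content, or nullopt if the entry is missing or expired.
    // Expired entries are erased so the caller falls through to disk or the vault.
    std::optional<std::string> lookup(const std::string &name, unsigned now)
    {
        auto it = entries.find(name);
        if (it == entries.end())
            return std::nullopt;
        // Unsigned subtraction stays correct even if the millisecond tick wraps.
        if (now - it->second.created >= timeoutMs)
        {
            entries.erase(it);
            return std::nullopt;
        }
        return it->second.content;
    }

private:
    std::map<std::string, CachedSecret> entries;
    unsigned timeoutMs;
};
```

With a one hour timeout (`60 * 60 * 1000`), an entry stored at tick 1000 is served until tick 1000 + 3600000, after which the lookup reports a miss and the secret has to be fetched again.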

helm/examples/secrets/README.md
Set up a vault auth policy granting access to the ecl secrets locations we plan to use:

```bash
vault policy write hpcc-kv-ro hpcc_vault_policies.hcl
```

Users will need to give the absolute path to hpcc_vault_policies.hcl or run the command from the directory that contains it.

vault policy write hpcc-kv-ro HPCC-Platform/helm/examples/secrets/hpcc_vault_policies.hcl

Install the HPCC helm chart with the secrets just defined added to all components that run ECL.

```bash
helm install myhpcc ../../hpcc/ --set global.image.version=latest -f val
```

Absolute path:

helm install myhpcc HPCC-Platform/helm/hpcc/ --set global.image.version=latest -f HPCC-Platform/helm/examples/secrets/values-secrets.yaml

@afishbeck
Member Author

Updated to support binary secrets, moved secret code to jsecrets.cpp, cleaned up based on review comments.
@gfortil updated README.md based on your comments. Among other things, I now assume the user is running from HPCC-Platform/helm.
@stuartort if you want to test, when you get to the hpcc helm install line add " --set global.image.root=afishbeck" as follows:

helm install myhpcc hpcc/ --set global.image.root=afishbeck --set global.image.version=latest -f examples/secrets/values-secrets.yaml

@afishbeck
Member Author

@AttilaVamos not sure what "Conflicting files, should skip build and test" means. I'll try rebasing.

@afishbeck
Member Author

rebased

@afishbeck
Member Author

Also changed the vault yaml format a bit.
@ghalliday please review

@afishbeck
Member Author

rebased again to resolve conflict with my other PR which was just merged.

@afishbeck
Member Author

@ghalliday

@afishbeck
Member Author

@AttilaVamos I doubt those roxie timeouts had much to do with this change. Can we re-run the test?

@AttilaVamos
Contributor

@afishbeck The schedule2 failure is definitely not related to this PR. The workflow test is new, so I should check it. However, I think it is unrelated as well, since that test has succeeded in another Smoketest environment (Host: ip-10-20-0-221.ca-central-1.compute.internal).

@ghalliday
Member

Added HPCC-24707 to cover the last issue - not likely to be related to Tony's change, but worth investigating.

@ghalliday
Member

@ghalliday - to make sure I review it next

Member

@ghalliday ghalliday left a comment

@afishbeck this looks like it is very close to being ready. A few questions/comments.

@@ -801,10 +805,13 @@ class CWSCHelper : implements IWSCHelper, public CInterface
    unsigned done;
    Owned<IPropertyTree> xpathHints;
    Linked<ClientCertificate> clientCert;
    bool customClientCert = false;
Member

packing: I don't think it matters in this case, so I probably wouldn't change since there are not likely to be large numbers of the objects, but if this boolean was moved to follow timeLimitExceeded the object would be 8 bytes smaller.

Member Author

Yes, if I need to make any other changes I will include this change as well.
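
For readers unfamiliar with the packing point, the sketch below shows how placing two `bool` members next to each other lets them share one padding slot. The two structs are hypothetical layouts that only echo the member names from the discussion; they are not the real `CWSCHelper`, and the exact sizes depend on the platform ABI.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical layout: each bool sits alone before or after pointer-aligned members,
// so each one drags its own padding along.
struct Scattered
{
    bool timeLimitExceeded;   // 1 byte, then padding up to pointer alignment
    void *xpathHints;
    void *clientCert;
    bool customClientCert;    // 1 byte, then tail padding
};

// Hypothetical layout: the bools are adjacent and share a single padded slot.
struct Grouped
{
    bool timeLimitExceeded;
    bool customClientCert;
    void *xpathHints;
    void *clientCert;
};

// Grouping small members should never make the object larger.
static_assert(sizeof(Grouped) <= sizeof(Scattered), "unexpected layout");
```

On a typical 64-bit ABI this is expected to give 32 bytes for `Scattered` and 24 for `Grouped`, which matches the 8 byte saving mentioned in the comment.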

"ecl": {
"$ref": "#/definitions/vaultCategory"
},
"ecl-user": {
Member

question: How does ecl-user differ from ecl? Are there some components that will have ecl available, but no ecl-user? I assume it is to prevent ECL code from accessing implicit secrets. Is there corresponding protection for k8s secrets?
The secret categories were not previously required by the caller, but I think they are now. Is that because you potentially need different vaults/security for the different categories, to prevent them leaking when they shouldn't? I think it generally makes sense, but I want to be sure I understand it.

Member

We should consider publishing the k8s secrets in subdirectories corresponding to the category - I think that would avoid the issue.

Member Author

The categories allow the various types of secrets to be organized and secured differently, to prevent leaking and usage for unintended purposes. I haven't made use of "ecl-user" yet, but that is going to be the least secure category: ECL code will have direct access to the contents of the secret, for example if I need a key for use with some plugin or to pass in some nonstandard way to a third party. Because the ECL can see the contents, it could return it, write it, or send it anywhere. I don't want gateway or storage plane credentials to be accessible in that way.

When it comes to the organization of the actual vault, it could be as simple as:
http://${env.VAULT_SERVICE_HOST}:${env.VAULT_SERVICE_PORT}/v1/secret/data/storage/${secret}
http://${env.VAULT_SERVICE_HOST}:${env.VAULT_SERVICE_PORT}/v1/secret/data/ecl/${secret}
http://${env.VAULT_SERVICE_HOST}:${env.VAULT_SERVICE_PORT}/v1/secret/data/ecl-user/${secret}

But those different sections can have different access rights, and allow the secrets to be organized based on how and where they are used.
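
As a rough illustration of the layout above, the following sketch composes a read URL from a base address, a category, and a secret name. The function name and parameters are hypothetical and not part of the PR; the path shape is taken directly from the three example URLs.

```cpp
#include <cassert>
#include <string>

// Builds a URL of the form shown above:
//   <base>/v1/secret/data/<category>/<secret>
// Hypothetical helper for illustration only.
inline std::string buildVaultSecretUrl(const std::string &base, const std::string &category, const std::string &secret)
{
    assert(!category.empty() && !secret.empty());
    std::string url(base);
    if (!url.empty() && url.back() == '/')
        url.pop_back();            // avoid a double slash after the host
    url += "/v1/secret/data/";
    url += category;
    url += '/';
    url += secret;
    return url;
}
```

Because the category is just a path segment, a vault policy can grant or deny each segment separately, which is the separation described in this comment.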

Over time we may want to segment the vault access by component so that, for example, certain roxies can call some gateways but hthor can't. But I'd rather get a feel for how these things are used in the wild first. Our users can give us feedback on how they need to restrict access further.

I definitely agree on the idea of the k8s secret subdirectories. That might mean I no longer have to mangle the names of the HTTP-CONNECT secrets. The mangling was, for example, to prevent them being used in the way the ecl-user category will be in the future.


static void splitUrlAuthority(const char *authority, size_t authorityLen, StringBuffer &user, StringBuffer &password, StringBuffer &host, StringBuffer *port)
{
    if (isEmptyString(authority)||authorityLen==0)
Member

not sure it should be calling isEmptyString - very unlikely chance of accessing invalid memory. Main objection is it suggests it is a null terminated string.
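
One way to make the length-based contract explicit is to wrap the pointer and length in a `std::string_view` and never look for a terminator. The sketch below assumes the authority has the shape `user:password@host:port`; the `UrlAuthority` struct and `splitAuthority` function are hypothetical, and the choice to split on the last `@` and last `:` is an assumption of this example rather than the PR's behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical result type; the PR's function fills StringBuffer arguments instead.
struct UrlAuthority
{
    std::string user, password, host, port;
};

// Splits "user:password@host:port" using only the supplied length,
// so the input does not need to be null terminated.
inline UrlAuthority splitAuthority(const char *authority, std::size_t authorityLen)
{
    UrlAuthority out;
    if (!authority || authorityLen == 0)
        return out;
    std::string_view rest(authority, authorityLen);

    // Credentials, if present, end at the last '@'.
    const auto at = rest.rfind('@');
    if (at != std::string_view::npos)
    {
        const std::string_view userinfo = rest.substr(0, at);
        const auto colon = userinfo.find(':');
        out.user = std::string(userinfo.substr(0, colon));
        if (colon != std::string_view::npos)
            out.password = std::string(userinfo.substr(colon + 1));
        rest.remove_prefix(at + 1);
    }

    // The port, if present, follows the last ':' (IPv6 literals are not handled in this sketch).
    const auto colon = rest.rfind(':');
    out.host = std::string(rest.substr(0, colon));
    if (colon != std::string_view::npos)
        out.port = std::string(rest.substr(colon + 1));
    return out;
}
```

The empty check here tests only the pointer and the length, so a caller can pass a slice from the middle of a longer URL without copying it first.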


static inline void extractUrlProtocol(const char *&url, StringBuffer *scheme)
{
    if(!url || strlen(url) <= 7)
Member

trivial - I think the strlen() test is redundant.
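
The rest of `extractUrlProtocol` is not shown in this excerpt, so the following is only a sketch of one way a scheme extractor can avoid a separate length test: searching for the `://` separator already fails cleanly on short input. `extractScheme` is a hypothetical helper and does not reproduce the PR's function.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper: returns the scheme and advances url past "://".
// A missing separator is reported by the search itself, so short inputs
// need no separate length check.
inline std::string extractScheme(std::string_view &url)
{
    const auto sep = url.find("://");
    if (sep == std::string_view::npos)
        return {};
    std::string scheme(url.substr(0, sep));
    url.remove_prefix(sep + 3);
    return scheme;
}
```

When no separator is found the view is left untouched, so the caller can still treat the whole input as a host or path.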

if (token.length())
    return;
StringBuffer login_token;
login_token.loadFile("/var/run/secrets/kubernetes.io/serviceaccount/token");
Member

Does this require the pods to have any access rights? Some recent PRs of Richard's have tied down both the access rights and the network access. Does that introduce problems here?

Member Author

@afishbeck afishbeck Sep 9, 2020

Based on a quick comment from @richardkchapman I think the line in the README where I do:

vault write auth/kubernetes/role/hpcc-vault-access \
    bound_service_account_names=hpcc-default \
    bound_service_account_namespaces=default \
    policies=hpcc-kv-ro \
    ttl=24h

may be enough to grant access to our default account and can be changed to reflect whatever the user may change the account to.
It's more of an operational thing on the vault side.
I also support passing in a direct access token as a k8s secret. Hopefully that will cover some other authentication schemes the user may set up at least until we determine what other authentication modes we want to support.
Vault supports many authentication schemes and we can add more over time based on what scenarios users need, or we want to support in general.
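
For context, Vault's Kubernetes auth method accepts a login request at `/v1/auth/kubernetes/login` whose JSON body carries the role name and the pod's service account JWT, and the response includes a client token for later reads. The sketch below only builds that request body; the helper names are hypothetical, the HTTP call itself is left out, and this is not the code from the PR.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Reads a whole file; returns an empty string if it cannot be opened.
inline std::string readFile(const std::string &path)
{
    std::ifstream in(path, std::ios::binary);
    std::ostringstream buf;
    buf << in.rdbuf();
    return buf.str();
}

// Escapes quotes, backslashes and common whitespace controls for a JSON string.
// Other control characters are not handled in this sketch.
inline std::string jsonEscape(const std::string &in)
{
    std::string out;
    for (char c : in)
    {
        switch (c)
        {
        case '"': out += "\\\""; break;
        case '\\': out += "\\\\"; break;
        case '\n': out += "\\n"; break;
        case '\r': out += "\\r"; break;
        case '\t': out += "\\t"; break;
        default: out += c; break;
        }
    }
    return out;
}

// Body for POST /v1/auth/kubernetes/login: the Vault role plus the service account JWT.
inline std::string buildKubernetesLoginBody(const std::string &role, const std::string &jwt)
{
    return "{\"role\": \"" + jsonEscape(role) + "\", \"jwt\": \"" + jsonEscape(jwt) + "\"}";
}
```

A caller might pass the role from the `vault write` command above (`hpcc-vault-access`) together with the contents of `/var/run/secrets/kubernetes.io/serviceaccount/token` read via `readFile`.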

@afishbeck
Member Author

@ghalliday updated

  1. Secret categories are based on usage, not just where they should go
  2. Added new system category for HPCC system level secrets.
  3. k8s secret path includes category.
  4. Fixed vault configMap entries for THOR
  5. README now demonstrates enabling all account names.
  6. Minor improvements.
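
Item 3 can be pictured with a small path builder. The directory layout `<mount>/<category>/<name>/<key>` used here is an assumption made for illustration, as are the function name and the example mount point; the PR's actual layout may differ.

```cpp
#include <cassert>
#include <string>

// Hypothetical layout: <mount>/<category>/<name>/<key>.
// Keeping the category as its own directory lets each category be mounted
// and secured separately.
inline std::string buildSecretFilePath(const std::string &mount, const std::string &category,
                                       const std::string &name, const std::string &key)
{
    assert(!category.empty() && !name.empty() && !key.empty());
    std::string path(mount);
    if (!path.empty() && path.back() != '/')
        path += '/';
    path += category + '/' + name + '/' + key;
    return path;
}
```

With this shape, a lookup for an `ecl` secret can never resolve to a file that was published under a different category such as `storage`.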

@ghalliday
Member

@afishbeck I like the changes - please can you squash and I will scan it one more time before merging.

Signed-off-by: Anthony Fishbeck <anthony.fishbeck@lexisnexisrisk.com>
@afishbeck
Member Author

@ghalliday squashed changes

@HPCCSmoketest
Contributor

Automated Smoketest: ✅
OS: centos 7.6.1810 (Linux 3.10.0-957.1.3.el7.x86_64)
GCC: gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
Host: ip-10-20-0-181.ca-central-1.compute.internal
Sha: 6b9063a
Build: success
Milestone:Install hpccsystems-platform-community_7.11.0-trunk0.el7.x86_64.rpm
HPCC Start: OK

Unit tests result:

| Test | total | passed | failed | errors | timeout | elapsed |
|---|---|---|---|---|---|---|
| unittest | 144 | 144 | 0 | 0 | 0 | 75 sec |
| wutoolTest(Dali) | 19 | 19 | 0 | 0 | 0 | 1 sec |
| wutoolTest(Cassandra) | 19 | 19 | 0 | 0 | 0 | 8 sec |

Regression test result:

| phase | total | pass | fail | elapsed |
|---|---|---|---|---|
| setup (hthor) | 11 | 11 | 0 | 25 sec (00:00:25) |
| setup (thor) | 11 | 11 | 0 | 43 sec (00:00:43) |
| setup (roxie) | 11 | 11 | 0 | 16 sec (00:00:16) |
| test (hthor) | 930 | 930 | 0 | 330 sec (00:05:30) |
| test (thor) | 842 | 842 | 0 | 742 sec (00:12:22) |
| test (roxie) | 1009 | 1009 | 0 | 424 sec (00:07:04) |

HPCC Stop: OK
HPCC Uninstall: OK
Time stats:

| Prep time | Build time | Package time | Install time | Start time | Test time | Stop time | Summary |
|---|---|---|---|---|---|---|---|
| 10 sec (00:00:10) | 778 sec (00:12:58) | 124 sec (00:02:04) | 23 sec (00:00:23) | 18 sec (00:00:18) | 1843 sec (00:30:43) | 18 sec (00:00:18) | 2814 sec (00:46:54) |

@HPCCSmoketest
Contributor

Automated Smoketest: ✅
OS: centos 7.8.2003 (Linux 3.10.0-957.1.3.el7.x86_64)
GCC: gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
Host: ip-10-20-0-116.ca-central-1.compute.internal
Sha: 6b9063a
Build: success
Milestone:Install hpccsystems-platform-community_7.11.0-trunk0.el7.x86_64.rpm
HPCC Start: OK

Unit tests result:

| Test | total | passed | failed | errors | timeout | elapsed |
|---|---|---|---|---|---|---|
| unittest | 144 | 144 | 0 | 0 | 0 | 74 sec |
| wutoolTest(Dali) | 19 | 19 | 0 | 0 | 0 | 2 sec |
| wutoolTest(Cassandra) | 19 | 19 | 0 | 0 | 0 | 8 sec |

Regression test result:

| phase | total | pass | fail | elapsed |
|---|---|---|---|---|
| setup (hthor) | 11 | 11 | 0 | 29 sec (00:00:29) |
| setup (thor) | 11 | 11 | 0 | 45 sec (00:00:45) |
| setup (roxie) | 11 | 11 | 0 | 20 sec (00:00:20) |
| test (hthor) | 930 | 930 | 0 | 664 sec (00:11:04) |
| test (thor) | 842 | 842 | 0 | 1036 sec (00:17:16) |
| test (roxie) | 1009 | 1009 | 0 | 742 sec (00:12:22) |

HPCC Stop: OK
HPCC Uninstall: OK
Time stats:

| Prep time | Build time | Package time | Install time | Start time | Test time | Stop time | Summary |
|---|---|---|---|---|---|---|---|
| 9 sec (00:00:09) | 1097 sec (00:18:17) | 126 sec (00:02:06) | 23 sec (00:00:23) | 18 sec (00:00:18) | 2797 sec (00:46:37) | 18 sec (00:00:18) | 4088 sec (01:08:08) |

@ghalliday ghalliday merged commit bd00e60 into hpcc-systems:master Sep 14, 2020
@afishbeck afishbeck deleted the secrets_and_vaults branch October 7, 2022 16:48
7 participants