feat: Hardware Security Module support #2625

Merged
merged 53 commits into from
Jan 11, 2022

aarmam
Contributor

@aarmam aarmam commented Jul 7, 2021

This change introduces support for Hardware Security Modules (HSMs): physical computing devices that safeguard and manage digital keys and perform encryption and decryption for digital signatures, strong authentication, and other cryptographic functions.

If enabled, the Hardware Security Module is used to look up any keys. If no key is found, the software module is used as a fallback for lookup. This allows you to use the HSM for privileged keys, and the software module to manage lifecycle keys (e.g. for Token Exchange).

For more information, please read the guide.

Thank you to aarmam for this great contribution!


Proposed changes

Hardware Security Module support for the keys hydra.openid.id-token and hydra.jwt.access-token, using the PKCS#11 Cryptographic Token Interface Standard

Checklist

Further comments

Related PR in ory/fosite

@CLAassistant

CLAassistant commented Jul 7, 2021

CLA assistant check
All committers have signed the CLA.

Member

@aeneasr aeneasr left a comment

Thank you! This looks nice already. I think the three big things we need to address are:

  • How do we write tests for HSM that work cross-platform
  • Implementing HSM as a different KeyManager as opposed to having if c.HSMEnabled everywhere
  • Which library to choose for HSM

Another topic will probably be supporting HSM from cloud vendors (e.g. https://cloud.google.com/kms/docs/hsm) as probably most companies today rely on cloud HSM versus having their own HSM deployed.

@aeneasr
Member

aeneasr commented Jul 23, 2021

While the PR is being worked on I will mark it as a draft. That declutters our review backlog :)

Once you're done with your changes and would like someone to review them, mark the PR as ready and request a review from one of the maintainers.

Thank you!

@aeneasr aeneasr marked this pull request as draft July 23, 2021 14:34
@aarmam
Contributor Author

aarmam commented Jul 30, 2021

> While the PR is being worked on I will mark it as a draft. That declutters our review backlog :)

Back from vacation! Working on it! :)

@codecov

codecov bot commented Aug 9, 2021

Codecov Report

Merging #2625 (0ee34ae) into master (3236e31) will increase coverage by 1.51%.
The diff coverage is 83.89%.

@@            Coverage Diff             @@
##           master    #2625      +/-   ##
==========================================
+ Coverage   78.26%   79.78%   +1.51%     
==========================================
  Files         110      113       +3     
  Lines        7731     8054     +323     
==========================================
+ Hits         6051     6426     +375     
+ Misses       1265     1222      -43     
+ Partials      415      406       -9     
Impacted Files Coverage Δ
driver/registry.go 80.00% <ø> (ø)
hsm/hsm.go 0.00% <0.00%> (ø)
jwk/manager.go 100.00% <ø> (ø)
persistence/sql/persister.go 78.57% <ø> (ø)
driver/config/provider.go 85.96% <18.18%> (-3.44%) ⬇️
cmd/server/helper_cert.go 48.57% <25.00%> (ø)
persistence/sql/persister_jwk.go 66.66% <52.00%> (-5.17%) ⬇️
jwk/handler.go 68.67% <62.50%> (+10.16%) ⬆️
driver/registry_base.go 90.35% <81.81%> (-0.42%) ⬇️
hsm/manager_hsm.go 88.57% <88.57%> (ø)
... and 10 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@aarmam aarmam force-pushed the feature/hsm branch 3 times, most recently from fb43075 to 5298b1c Compare August 15, 2021 20:47
@aarmam aarmam closed this Aug 16, 2021
@aarmam aarmam reopened this Aug 16, 2021
@aarmam
Contributor Author

aarmam commented Aug 17, 2021

> • How do we write tests for HSM that work cross-platform

Added quicktest-hsm for testing

hydra/Makefile

Lines 88 to 90 in 5298b1c

.PHONY: quicktest-hsm
quicktest-hsm:
	docker build --progress=plain -f .docker/Dockerfile-build -t oryd/hydra:latest-sqlite --target test-hsm .

FROM builder as build-hydra
RUN go build -tags sqlite -o /usr/bin/hydra
FROM builder as test-hsm
RUN apk -U --no-cache add softhsm opensc
RUN pkcs11-tool --module /usr/lib/softhsm/libsofthsm2.so --slot 0 --init-token --so-pin 0000 --init-pin --pin 1234 --label hydra \
&& pkcs11-tool --module /usr/lib/softhsm/libsofthsm2.so \
--login --pin 1234 --token-label hydra \
--keypairgen --key-type rsa:4096 --usage-sign \
--label hydra.openid.id-token --id 68796472612e6f70656e69642e69642d746f6b656e \
&& pkcs11-tool --module /usr/lib/softhsm/libsofthsm2.so \
--login --pin 1234 --token-label hydra \
--keypairgen --key-type rsa:4096 --usage-sign \
--label hydra.jwt.access-token --id 68796472612e6a77742e6163636573732d746f6b656e
RUN addgroup -S ory; \
adduser -S ory -G ory -D -h /home/ory -s /bin/nologin; \
chown -R ory:ory /home/ory; \
chown -R ory:ory /var/lib/softhsm/tokens
ENV HSM_ENABLED=true
ENV HSM_LIBRARY=/usr/lib/softhsm/libsofthsm2.so
ENV HSM_TOKEN_LABEL=hydra
ENV HSM_PIN=1234
RUN go test -failfast -short -tags sqlite ./...

I guess we should also add a CircleCI step with a similar setup to run the tests with HSM enabled?
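The long --id values passed to pkcs11-tool above are the hex encoding of the matching --label strings. A minimal sketch for deriving the ID of another label follows; the helper name keyID is made up for illustration and is not part of the PR.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// keyID returns the hex encoding of a key label, which is the form used
// for the --id values in the Dockerfile above. Hypothetical helper.
func keyID(label string) string {
	return hex.EncodeToString([]byte(label))
}

func main() {
	for _, label := range []string{"hydra.openid.id-token", "hydra.jwt.access-token"} {
		fmt.Printf("%s -> %s\n", label, keyID(label))
	}
}
```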

> • Implementing HSM as a different KeyManager as opposed to having if c.HSMEnabled everywhere

Implemented hsm key manager. See other comments for questions.
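To illustrate the idea of a separate key manager (as opposed to checking an HSM flag at every call site), here is a rough sketch. The interface, type names, and constructor are simplified assumptions for illustration and do not mirror the actual Hydra code.

```go
package main

import (
	"errors"
	"fmt"
)

// KeyManager is a reduced, hypothetical version of a key manager interface.
type KeyManager interface {
	GetKey(set string) (string, error)
}

var errKeyNotFound = errors.New("key not found")

// softwareKeyManager keeps keys in memory (standing in for a database store).
type softwareKeyManager struct{ keys map[string]string }

func (m *softwareKeyManager) GetKey(set string) (string, error) {
	if k, ok := m.keys[set]; ok {
		return k, nil
	}
	return "", errKeyNotFound
}

// hsmKeyManager stands in for a manager backed by a PKCS#11 token.
type hsmKeyManager struct{ tokenLabel string }

func (m *hsmKeyManager) GetKey(set string) (string, error) {
	return "hsm:" + m.tokenLabel + "/" + set, nil
}

// newKeyManager is the only place that looks at the HSM setting; callers
// depend on the KeyManager interface and never branch on the flag.
func newKeyManager(hsmEnabled bool) KeyManager {
	if hsmEnabled {
		return &hsmKeyManager{tokenLabel: "hydra"}
	}
	return &softwareKeyManager{keys: map[string]string{
		"hydra.openid.id-token": "software-key",
	}}
}

func main() {
	for _, enabled := range []bool{false, true} {
		key, err := newKeyManager(enabled).GetKey("hydra.openid.id-token")
		fmt.Println(enabled, key, err)
	}
}
```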

> • Which library to choose for HSM

The library is HSM vendor-specific. You install and set up the vendor's HSM client
https://thalesdocs.com/gphsm/luna/7.4/docs/pci/Content/sdk/using/libraries_and_applications.htm
and point the library setting in your PKCS#11 configuration to /usr/safenet/lunaclient/lib/libCryptoki2_64.so

> Another topic will probably be supporting HSM from cloud vendors (e.g. https://cloud.google.com/kms/docs/hsm) as probably most companies today rely on cloud HSM versus having their own HSM deployed.

Any cloud HSM that supports PKCS#11 works exactly the same way. For example: https://docs.aws.amazon.com/cloudhsm/latest/userguide/pkcs11-library.html

@aarmam aarmam marked this pull request as ready for review August 17, 2021 10:05
@aarmam aarmam requested a review from aeneasr August 17, 2021 12:55
@aarmam
Contributor Author

aarmam commented Sep 16, 2021

@aeneasr, how can I help moving this pull request forward?

@aeneasr
Member

aeneasr commented Sep 16, 2021

Hey @aarmam - due to the security sensitive nature and my lack of knowledge around HSM I will probably need a full week to work on this. It might be helpful if we jump on a call to review it together. I am available again next week - you can ping me on Slack :)

Member

@aeneasr aeneasr left a comment

  • Verify that cross compilation works, you can use: https://github.com/ory/xgoreleaser#testing-builds
  • We need to ensure that HSM does not require CGO. If it does require CGO, we need to have a build flag (similar to -tags sqlite) that enables HSM and thus also the dependency on CGO.
  • Exclude "real" HSM tests with a go build tag to ensure you can run the full tests without having a soft HSM library installed
  • See if soft HSM works on macOS
  • Write a guide for "developing the HSM feature", i.e. how to install soft HSM (e.g. apt-get install softhsm) and how to set up test keys (e.g. pkcs11-tool ...)
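As a sketch of the build-flag idea from the list above: a file guarded by a negated build constraint can provide a stub, so that a default build never touches CGO or the PKCS#11 dependency. The function name newHSMContext and the error text are assumptions for illustration; a sibling file starting with `//go:build hsm` would define the same function on top of the real library.

```go
//go:build !hsm

package main

import (
	"errors"
	"fmt"
)

// errHSMNotCompiled is returned by the stub when the binary was built
// without the hsm tag.
var errHSMNotCompiled = errors.New("HSM support is not compiled in; rebuild with -tags hsm")

// newHSMContext is a hypothetical constructor. In this stub file it
// always fails, so no CGO or PKCS#11 code is linked into default builds.
func newHSMContext(libraryPath string) (interface{}, error) {
	return nil, errHSMNotCompiled
}

func main() {
	if _, err := newHSMContext("/usr/lib/softhsm/libsofthsm2.so"); err != nil {
		fmt.Println(err)
	}
}
```

The same constraint line at the top of a _test.go file would keep tests that need a real or soft HSM out of the default `go test ./...` run.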

@aeneasr
Member

aeneasr commented Oct 13, 2021

Clean up tests with a helper func:

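// createHSMKey and deleteKey are placeholders for the real set-up and tear-down helpers.
func createMockHSMKeys(t *testing.T) (key interface{}) {
	id := createHSMKey()
	// t.Cleanup removes the key automatically when the (sub)test finishes.
	t.Cleanup(func() {
		deleteKey(id)
	})
	return id
}

func TestKeyManager_GenerateKeySet(t *testing.T) {
	t.Run("subtest", func(t *testing.T) {
		key := createMockHSMKeys(t)
		_ = key // use the key in the actual assertions
	})
}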

Table tests

func TestKeyManager_GenerateKeySet(t *testing.T) {
	for _, tc := range []struct {
		name     string
		in       string
		expected string
	}{
		{
			name:     "foo=bar",
			in:       "rs256",
			expected: "bar",
		},
		{
			name:     "foo=bar",
			in:       "hs256",
			expected: "bar",
		},
	} {
		t.Run("case="+tc.name, func(t *testing.T) {
			// set up routines...

			// myfunc is a placeholder for the function under test.
			assert.Equal(t, tc.expected, myfunc(tc.in))
			t.Cleanup(func() {
				// ...
			})
		})
	}
}

Or use testify.Suite: https://github.com/stretchr/testify#suite-package

@aeneasr
Member

aeneasr commented Oct 13, 2021

Running CircleCI locally: https://circleci.com/docs/2.0/local-cli/#run-a-job-in-a-container-on-your-machine

Add the HSM setup (e.g. apt-get install hsm) to the test task of the CircleCI file around here:

- run: make .bin/go-acc

switch k := key.Public().(type) {
case *rsa.PublicKey:
alg = "RS256"
// TODO: Should we validate minimal key length by checking CKA_MODULUS_BITS?
Member

@aarmam do you want to address this one? Otherwise we can remove the TODO item I think

Contributor Author

When HSM mode is enabled and no keys are found, RSA keys are generated with a key length of 4096 bits. But if keys are generated on the HSM beforehand, no key length limit is enforced. We could throw an error here if the key length is too small. I'm on vacation right now and not able to write serious code :) So I would remove this TODO for now if you think it's not important, or leave it in and I will do a separate PR for this.

Member

Ok, that sounds good to me! I will create a follow up issue for it and link it here :)

@aeneasr
Member

aeneasr commented Dec 25, 2021

@aarmam this looks basically good to merge! However, I still have one question left. As far as I can tell, the Go code does not actually test HSM support with integration? I saw that we use quite a lot of mocking (which is awesome 😎 ) but I realized that we do not actually use the HSM library in the Go test. I think this is ok as you have covered all the cases, but I wanted to check in if this is intentional or not.

I also saw quicktest-hsm in the Dockerfile which appears to be using all the HSM dependencies, but it's not actually using them during testing because it is mocking the HSM context?

Another thing I noticed is that we introduce the hsm build flag. However, I can't seem to find any issue when we would not have that build tag available. As far as I can tell, the library github.com/ThalesIgnite/crypto11 does not need any special build parameters as the HSM libraries are included at runtime? I'm not sure if I am correct here. Do we actually need this separate build tag?

From what I can tell, Ory Hydra would also build without that tag normally on all systems. I don't have HSM support on my mac installed yet, but it runs fine.

Otherwise, from a code perspective, this is good to go. Sorry for not reviewing earlier :( But I have time these next days to review the PR and get it merged before NYE :)

@aarmam
Contributor Author

aarmam commented Dec 27, 2021

> @aarmam this looks basically good to merge! However, I still have one question left. As far as I can tell, the Go code does not actually test HSM support with integration? I saw that we use quite a lot of mocking (which is awesome 😎 ) but I realized that we do not actually use the HSM library in the Go test. I think this is ok as you have covered all the cases, but I wanted to check in if this is intentional or not.
>
> I also saw quicktest-hsm in the Dockerfile which appears to be using all the HSM dependencies, but it's not actually using them during testing because it is mocking the HSM context?

Unit tests that use mocks are always executed (whether HSM is enabled or disabled). quicktest-hsm executes all tests in HSM mode using SoftHSM, which emulates an HSM. **So all tests are actually run using HSM when executed with quicktest-hsm** (that's why some tests had to be modified or even disabled, because keys cannot be added to the HSM, only generated). A CircleCI job with HSM enabled was also added to ensure all future features are compatible with HSM.

> Another thing I noticed is that we introduce the hsm build flag. However, I can't seem to find any issue when we would not have that build tag available. As far as I can tell, the library github.com/ThalesIgnite/crypto11 does not need any special build parameters as the HSM libraries are included at runtime? I'm not sure if I am correct here. Do we actually need this separate build tag?
>
> From what I can tell, Ory Hydra would also build without that tag normally on all systems. I don't have HSM support on my mac installed yet, but it runs fine.

I used the cross compilation tool you recommended and it seemed that ARM-based GOARCH targets did not compile. I might be completely wrong, but it seemed so. So I had to use build tags -> https://github.com/ory/hydra/blob/2fab67a444051ca15b04b35941bec1fcb0941d91/.goreleaser.yml

I'm on vacation and might respond slowly, but I will rebase this PR later in the evening! And yes if possible lets try to merge it before NYE :) Let me know if you have more questions!

@aeneasr
Member

aeneasr commented Dec 27, 2021

Cross compile on ARM might indeed be an issue, I will check it out. Awesome! I’m also on vacation, I hope you enjoy yours and have time to spend with your family & loved ones and don’t get too busy reading GitHub 😇

@aeneasr
Member

aeneasr commented Dec 27, 2021

You're right, cross compilation does indeed not work out of the box. So yes, the HSM build tags do make sense. Unfortunately my internet is incredibly slow, but I'll try to check whether cross compile works in our cross compile image. If not, we have a small problem 😅

# Conflicts:
#	driver/config/provider.go
#	driver/registry.go
#	driver/registry_sql.go
#	go.mod
#	persistence/sql/persister_test.go
aeneasr
aeneasr previously approved these changes Dec 27, 2021
Member

@aeneasr aeneasr left a comment

Thank you for the clarifications and very, very hard work. I fixed the tests and ensured cross compilation works (it seems to work as far as I can tell). This is now good to merge if the CI passes!

@aeneasr
Member

aeneasr commented Dec 27, 2021

Hm, not sure what's going on, but the HSM tests fail with an unrelated test:

https://app.circleci.com/pipelines/github/ory/hydra/3327/workflows/1b5910a6-a18f-4828-863a-5f6660d3b2d0/jobs/34338?invite=true#step-110-1612

This test though passes on master and also in the main pipeline (without HSM enabled). Could you maybe take a look when you have a bit of time @aarmam ? Once resolved, we can merge this!

@aarmam
Contributor Author

aarmam commented Dec 30, 2021

> Hm, not sure what's going on, but the HSM tests fail with an unrelated test:
>
> https://app.circleci.com/pipelines/github/ory/hydra/3327/workflows/1b5910a6-a18f-4828-863a-5f6660d3b2d0/jobs/34338?invite=true#step-110-1612
>
> This test though passes on master and also in the main pipeline (without HSM enabled). Could you maybe take a look when you have a bit of time @aarmam ? Once resolved, we can merge this!

These tests are related to #2384 that was just merged. I'm not sure how to resolve this right now. As I understand #2384 involves trusting public keys.

If HSM is enabled, the hardware key manager is the default key manager. The Keys API uses the hardware key manager to find, generate, and delete keys, but the newly added Trust API uses the software key manager: it calls the AddKey method to store the public key. This means that the key is not found when using the Keys API.

It is possible to store X.509 certificates in the HSM using crypto11, so the public key could be stored and retrieved that way. Importing only a public key is not supported by crypto11 at the moment.

How many keys are expected to be trusted with #2384 - hundreds/thousands/more?

If the number of keys is within the range of HSM capabilities (I don't know what's reasonable), I could figure something out with the crypto11 certificate import functionality.

Otherwise I'm not sure at the moment how to resolve this. This might also be a problem for future features that need to import public keys or key pairs. I guess it would be possible to refactor the code so that the software or hardware key manager can be selected when accessing the Keys API etc. If you have ideas, please let me know!

@aeneasr
Member

aeneasr commented Dec 30, 2021

I see, I am glad we were able to catch this!

> How many keys are expected to be trusted with #2384 - hundreds/thousands/more?

Depending on the use case it could be up to millions as this feature will be used similar to SAML assertions. Basically you exchange a JWT for an access token.

> Otherwise I'm not sure at the moment how to resolve this. And this might be problem with future features also, that need to import public keys or keypairs. I guess it would be possible to refactor code so that software/hardware key manager can be selected when accessing Keys API etc. If you have ideas please let me know!

Wouldn't it suffice to use the software key manager in this scenario, and use HSM only for system-level keys?

@aeneasr
Member

aeneasr commented Dec 30, 2021

An alternative could also be to have two lookup strategies. When HSM is enabled, we try to look up in HSM first, then in software. When keys are added, we always add it to software. So it's not an either/or question, but something "on top".

@aarmam
Contributor Author

aarmam commented Jan 7, 2022

> An alternative could also be to have two lookup strategies. When HSM is enabled, we try to look up in HSM first, then in software. When keys are added, we always add it to software. So it's not an either/or question, but something "on top".

Thanks I implemented this variant! I hope I was not too inattentive, I'm still in vacation mood :)
Ready for review @aeneasr

aeneasr
aeneasr previously approved these changes Jan 11, 2022
@aeneasr
Member

aeneasr commented Jan 11, 2022

Great work man!

# Conflicts:
#	driver/registry_base.go
#	driver/registry_sql.go
#	go.mod
@aeneasr aeneasr merged commit 7578aa9 into ory:master Jan 11, 2022
@vinckr
Member

vinckr commented Jan 12, 2022

Hello @aarmam
Congrats on merging your first PR in Ory 🎉 !
Your contribution will soon be helping secure millions of identities around the globe 🌏
As a small token of appreciation we send all our first time contributors a gift package to welcome them to the community.
Please drop me an email to claim your Ory swag!

@aarmam
Contributor Author

aarmam commented Jan 18, 2022

> Great work man!

Back from vacation and really nice news to kickstart the work mode! 🎉 Thank you @aeneasr and thanks for all the guidance!

@StarAurryon StarAurryon mentioned this pull request Mar 17, 2022