New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added TLS security #1537
Added TLS security #1537
Conversation
a07bee2
to
b978b04
Compare
Makefile
Outdated
@@ -213,6 +213,13 @@ merged.lcov: | |||
lcov --add-tracefile coverage/unittest.info --add-tracefile coverage/integrationtest.info -a coverage/lcov.info -o merged.lcov | |||
|
|||
##@ Utilities | |||
tls-files: | |||
ifeq (, $(shell which mkcert)) | |||
$(error "mkcert is not installed, consider following this instructions: https://github.com/FiloSottile/mkcert#installation ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does there need to be a dependencies.toml
or a README entry to have devs install this tool?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -72,8 +72,9 @@ To set up a development environment for the CLI | |||
3. Run `make all` to install dependencies and build binaries and assets | |||
4. Start a `kind` cluster like so: `KIND_CLUSTER_NAME=<some name> ./tools/kind-with-registry.sh` | |||
5. Run `./bin/gitops install --config-repo=<repo url>` | |||
6. Start the in-cluster API replacement job (powered by [http://tilt.dev](tilt.dev)) with `make cluster-dev` | |||
7. make or make unit-tests to ensure everything built correctly. | |||
6. Run `make tls-files` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we are generating tls keys in memory are these new tools and generating localhost keys really necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair question. The certificates generated in memory are not valid, but the ones generated here are. I guess this is a matter of preference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that we are providing steps and tools to generate valid certs and having invalid auto-generated in-memory certs is a bit unusual, I have personally never seen any other project doing that, shouldn't we just remove the invalid auto-generate ones?
Since also we set up our dev environment to use the valid certs with tilt that auto-generate option will be mostly unused.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not too unusual, that is exactly how it is being done in the enterprise project. I personally prefer to enforce the use of valid certificates for development. I'll remove the in-memory generation.
cmd/gitops-server/cmd/cmd.go
Outdated
@@ -74,6 +81,10 @@ func NewCommand() *cobra.Command { | |||
cmd.Flags().StringVar(&options.NotificationControllerAddress, "notification-controller-address", "http://notification-controller./", "the address of the notification-controller running in the cluster") | |||
cmd.Flags().IntVar(&options.WatcherPort, "watcher-port", 9443, "the port on which the watcher is running") | |||
|
|||
cmd.Flags().StringVar(&options.TLSCert, "tls-cert-file", "", "filename for the TLS certificate, in-memory generated if omitted") | |||
cmd.Flags().StringVar(&options.TLSKey, "tls-private-key", "", "filename for the TLS key, in-memory generated if omitted") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to keep consistency with the other flag and with kube-api
, we could add the -file
suffix here.
cmd/gitops-server/cmd/cmd.go
Outdated
@@ -40,6 +43,9 @@ type Options struct { | |||
LoggingEnabled bool | |||
OIDC OIDCAuthenticationOptions | |||
NotificationControllerAddress string | |||
TLSCert string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to match the flag and avoid confusion we could change TLSCert
and TLSKey
to TLSCertFile
and TLSKeyFile
cmd/gitops-server/cmd/cmd.go
Outdated
@@ -74,6 +81,10 @@ func NewCommand() *cobra.Command { | |||
cmd.Flags().StringVar(&options.NotificationControllerAddress, "notification-controller-address", "http://notification-controller./", "the address of the notification-controller running in the cluster") | |||
cmd.Flags().IntVar(&options.WatcherPort, "watcher-port", 9443, "the port on which the watcher is running") | |||
|
|||
cmd.Flags().StringVar(&options.TLSCert, "tls-cert-file", "", "filename for the TLS certificate, in-memory generated if omitted") | |||
cmd.Flags().StringVar(&options.TLSKey, "tls-private-key", "", "filename for the TLS key, in-memory generated if omitted") | |||
cmd.Flags().BoolVar(&options.NoTLS, "no-tls", false, "do not attempt to read TLS certificates") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think perhaps --insecure
would be a better flag name?
cmd.Flags().StringVar(&options.TLSCert, "tls-cert-file", "", "filename for the TLS certificate, in-memory generated if omitted") | ||
cmd.Flags().StringVar(&options.TLSKey, "tls-private-key", "", "filename for the TLS key, in-memory generated if omitted") | ||
cmd.Flags().BoolVar(&options.NoTLS, "no-tls", false, "do not attempt to read TLS certificates") | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also add a flag so that we can also enable client certificate validation as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^^ this is should be there 100%, in production use, I would never turn it off.
Edit: Is this the server side part only? If yes, it's a client thing, but there we should add a flag for it.
pkg/server/tls/tls.go
Outdated
return tlsConfig, nil | ||
} | ||
|
||
func generateKeyPair(hosts []string) ([]byte, []byte, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potentially we can reuse an existing rootCA. On my local environment I have one rootCA,added that cert as trusted cert and i just generate keys per "site".
https://gist.github.com/yitsushi/2cd4ec3158c2f334d07f493da7f829e0
so basically the priv, err :=
part is not generated, it can come from a file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I'm deleting this function as it is no longer relevant.
tools/validate-mkcert.sh
Outdated
if ! command -v mkcert &> /dev/null | ||
then | ||
echo "mkcert is not installed, consider following this instructions: https://github.com/FiloSottile/mkcert#installation " | ||
exit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exit 1
to indicate error, and make it fail if we use in a pipeline (like any other tasks that has tls-files
as a dependency)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not supposed to run as part of a pipeline. But I would like to start using this patter as this could end up in a pipeline at some point.
log.Println("TLS connections disabled") | ||
return srv.ListenAndServe() | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given insecure is false here we should probably check that the options.TLSCertFile
and options.TLSKeyFile
are not empty. If they are empty then we should return an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And check that the files exist?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@richardcase You are right we should check that those flags are not empty. On the other hand, if the files do not exist we will get this type of errors already:
…icate, --tls-server-key
…create self-signed certificates.
# Conflicts: # cmd/gitops-server/cmd/cmd.go
…to match with the latest changes
* Added tls/ssl to api for gitops-server cmd * Added new flags to gitops-server cmd: --insecure, --tls-server-certificate, --tls-server-key * Fixed lint issues * Default to false on insecure flag * Added tls support for development. Also added instructions on how to create self-signed certificates. * Generate in-memory TLS keys # Conflicts: # cmd/gitops-server/cmd/cmd.go * Ignore testing tls files * Added make target to generate valid testing tls files when using tilt * Enforce mTLS by default * Use script to validate mkcert instead of doing it in the makefile * Return error status on validate-mkcert.sh * Disable mTLS by default * Error if certificate or key file are empty when insecure flag is false * Removed unused file wego-app.yaml and expose port 9001 on https link to match with the latest changes
There were quite a few conflicts, but many of them were very easy to resolve: anything related to acceptance tests has been deleted, lots of import conflicts, conficts in generated files, and some things that don't have a common ancestor and yet have almost the same content which makes git twitchy. There were a few more significant conflicts I've dealt with manually: * There's 2 independent implementations of https listeners (#1537, #1483). This also means it's been implemented in two places, `gitops run ui` and `gitops-server`. Keep the rate limiting from the main one, but otherwise mostly stick to the v2 one - it's been written to only work in docker. This means no more certificate generation code: v2 expects you to mount your certificates into the pod. TODO this deletes all gitops-server tests because they're all incompatible * The auth system in main has grown a username, which isn't present in v2. Add it to relevant files. * gitops profile upgrade is new. I've not really investigated how it works or if it still works, I've just adapted the API to the kind of changes we've done and checked that the tests pass. * There's been a lot of tilt rewrites in both branches. Since what you need to deploy the app and what it needs has changed in v2, I've mostly ignored main. * v2 and main disagreed on how to embed many svg images - v2 imported the images via filesystem path where they reside, while main imported them through images.ts. I've followed the v2 pattern. * Commit "Get auth working" (fdd15fb) in v2 rewrote how feature flags work, while main continued developing the old API. I've ported the new main code to the v2 API. * UserSettings had conflicts introduced in "Fix proxy port and user settings styling" (84bdc98) vs "WG529 Auth styling and flags improvements (#1624)" (f7098fb), I've tried to merge those. * Both main and v2 have turned a bunch of hard-coded auth secret name strings into constants. They've placed those constants in different places and given them different values. I gave up on trying to make this good and sensible and decided to just make everything compile and pass tests and start, and am looking forward to someone asking themselves "who would do this!?"
There were quite a few conflicts, but many of them were very easy to resolve: anything related to acceptance tests has been deleted, lots of import conflicts, conficts in generated files, and some things that don't have a common ancestor and yet have almost the same content which makes git twitchy. There were a few more significant conflicts I've dealt with manually: * There's 2 independent implementations of https listeners (#1537, #1483). This also means it's been implemented in two places, `gitops run ui` and `gitops-server`. Keep the rate limiting from the main one, but otherwise mostly stick to the v2 one - it's been written to only work in docker. This means no more certificate generation code: v2 expects you to mount your certificates into the pod. * The auth system in main has grown a username, which isn't present in v2. Add it to relevant files. * gitops profile upgrade is new. I've not really investigated how it works or if it still works, I've just adapted the API to the kind of changes we've done and checked that the tests pass. * There's been a lot of tilt rewrites in both branches. Since what you need to deploy the app and what it needs has changed in v2, I've mostly ignored main. * v2 and main disagreed on how to embed many svg images - v2 imported the images via filesystem path where they reside, while main imported them through images.ts. I've followed the v2 pattern. * Commit "Get auth working" (fdd15fb) in v2 rewrote how feature flags work, while main continued developing the old API. I've ported the new main code to the v2 API. * UserSettings had conflicts introduced in "Fix proxy port and user settings styling" (84bdc98) vs "WG529 Auth styling and flags improvements (#1624)" (f7098fb), I've tried to merge those. * Both main and v2 have turned a bunch of hard-coded auth secret name strings into constants. They've placed those constants in different places and given them different values. I gave up on trying to make this good and sensible and decided to just make everything compile and pass tests and start.
There were quite a few conflicts, but many of them were very easy to resolve: anything related to acceptance tests has been deleted, lots of import conflicts, conficts in generated files (which I've just re-generated and ignored), and some things that don't have a common ancestor and yet have almost the same content which makes git twitchy. There were a few more significant conflicts I've dealt with manually: * There's 2 independent implementations of https listeners (#1537, #1483). This also means it's been implemented in two places, `gitops run ui` and `gitops-server`. Keep the rate limiting from the main one, but otherwise mostly stick to the v2 one - it's been written to only work in docker. This means no more certificate generation code: v2 expects you to mount your certificates into the pod. * The auth system in main has grown a username, which isn't present in v2. Add it to relevant files. * gitops profile upgrade is new. I've not really investigated how it works or if it still works, I've just adapted the API to the kind of changes we've done and checked that the tests pass. * There's been a lot of tilt rewrites in both branches. Since what you need to deploy the app and what it needs has changed in v2, I've mostly ignored main. * v2 and main disagreed on how to embed many svg images - v2 imported the images via filesystem path where they reside, while main imported them through images.ts. I've followed the v2 pattern. * Commit "Get auth working" (fdd15fb) in v2 rewrote how feature flags work, while main continued developing the old API. I've ported the new main code to the v2 API. * UserSettings had conflicts introduced in "Fix proxy port and user settings styling" (84bdc98) vs "WG529 Auth styling and flags improvements (#1624)" (f7098fb), I've tried to merge those. * Both main and v2 have turned a bunch of hard-coded auth secret name strings into constants. They've placed those constants in different places and given them different values. I gave up on trying to make this good and sensible and decided to just make everything compile and pass tests and start.
There were quite a few conflicts, but many of them were very easy to resolve: anything related to acceptance tests has been deleted, lots of import conflicts, conficts in generated files (which I've just re-generated and ignored), and some things that don't have a common ancestor and yet have almost the same content which makes git twitchy. There were a few more significant conflicts I've dealt with manually: * There's 2 independent implementations of https listeners (#1537, #1483). This also means it's been implemented in two places, `gitops run ui` and `gitops-server`. Keep the rate limiting from the main one, but otherwise mostly stick to the v2 one - it's been written to only work in docker. This means no more certificate generation code: v2 expects you to mount your certificates into the pod. * The auth system in main has grown a username, which isn't present in v2. Add it to relevant files. * gitops profile upgrade is new. I've not really investigated how it works or if it still works, I've just adapted the API to the kind of changes we've done and checked that the tests pass. * There's been a lot of tilt rewrites in both branches. Since what you need to deploy the app and what it needs has changed in v2, I've mostly ignored main. * v2 and main disagreed on how to embed many svg images - v2 imported the images via filesystem path where they reside, while main imported them through images.ts. I've followed the v2 pattern. * Commit "Get auth working" (fdd15fb) in v2 rewrote how feature flags work, while main continued developing the old API. I've ported the new main code to the v2 API. * UserSettings had conflicts introduced in "Fix proxy port and user settings styling" (84bdc98) vs "WG529 Auth styling and flags improvements (#1624)" (f7098fb), I've tried to merge those. * Both main and v2 have turned a bunch of hard-coded auth secret name strings into constants. They've placed those constants in different places and given them different values. I gave up on trying to make this good and sensible and decided to just make everything compile and pass tests and start.
Closes: #1499
What changed?
Why?
How did you test it?
Release notes
Documentation Changes