I can connect to Zeebe with a custom certificate #3028

Closed
barmac opened this issue Jul 14, 2022 · 38 comments · Fixed by #3046
Labels: Camunda 8, channel:support, deploy, enhancement

Comments

@barmac
Contributor

barmac commented Jul 14, 2022

What should we do?

  • I can specify a certificate which is used to connect to a TLS secured Zeebe instance.

More insights in #2404

Why should we do it?

It is currently not possible to deploy models to a TLS-secured Zeebe instance.

@barmac
Contributor Author

barmac commented Jul 14, 2022

We need to define the UI part of this.

@nikku added the Camunda 8 and deploy labels Jul 14, 2022
@barmac
Contributor Author

barmac commented Jul 14, 2022

Ideas how to solve this:

  • have a directory with certificates which could be used to interact with Zeebe

TODO:

@christian-konrad
Contributor

Additional notes from the customer:

We are using HTTPS certificates with very frequent refreshes (the expiry times are very short), so I need to do the "extract SSL certificate using openssl s_client" dance pretty often.
Not sure how to address frequent cert changes with a nice UX.
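
For illustration only, the "extract the certificate" step could also be scripted with Node's built-in tls module instead of openssl s_client. This is a minimal sketch, not something the Modeler does: `fetchServerCertificate` and `derToPem` are hypothetical helper names, and certificate verification is deliberately switched off because the goal is only to download the certificate the server presents.

```javascript
const tls = require('tls');
const net = require('net');

// Convert a DER-encoded certificate (Buffer) into PEM text.
function derToPem(der) {
  const body = der.toString('base64').match(/.{1,64}/g).join('\n');
  return `-----BEGIN CERTIFICATE-----\n${body}\n-----END CERTIFICATE-----\n`;
}

// Hypothetical helper: connect, read the leaf certificate the server
// presents, and resolve with its PEM encoding.
function fetchServerCertificate(host, port) {
  return new Promise((resolve, reject) => {
    const socket = tls.connect({
      host,
      port,
      // SNI must be a host name, never an IP address.
      servername: net.isIP(host) ? undefined : host,
      // Verification is skipped on purpose: we only want to read the certificate.
      rejectUnauthorized: false
    }, () => {
      const peer = socket.getPeerCertificate();
      socket.end();
      resolve(derToPem(peer.raw));
    });
    socket.on('error', reject);
  });
}

// Usage (not executed here):
// fetchServerCertificate('localhost', 26500).then(pem => process.stdout.write(pem));
```

A certificate fetched this way is only as trustworthy as the connection it came from, so it still needs to be checked before it is trusted.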

@nikku
Member

nikku commented Jul 14, 2022

Proposal: Instead of investing into "building that UX", let us consider to hand over some parts of it to the OS.

How do they usually develop currently? Why does "frequent refresh" work with zbctl (if it works)?

@barmac
Contributor Author

barmac commented Jul 15, 2022

I checked how custom certificates are handled in WebStorm. I can accept all untrusted certificates (insecure) or add specific certificates in the UI:

image

@barmac
Contributor Author

barmac commented Jul 15, 2022

These are the settings available in VSCode:

image

Note that it does not have a concept of deployment without extensions (e.g. https://marketplace.visualstudio.com/items?itemName=mkloubert.vs-deploy).

@christian-konrad
Contributor

@barmac is this OS-loading thing possible?

@barmac
Contributor Author

barmac commented Jul 15, 2022

Once I set up my keychain to always trust my self-signed certificate, I was able to connect with zbctl. Still, I haven't been able to connect with zeebe-node yet.

@barmac
Contributor Author

barmac commented Jul 18, 2022

I was able to set this up properly. You can check out the repo: https://github.com/barmac/zeebe-tls-connection-test

zeebe-node is able to connect to the instance when I provide the certificate to the client. Note that the certificate needs to have DNS:localhost specified in the subjectAltName section.

Unfortunately, even if I add the certificate to the system keychain, and make the OS trust it, zeebe-node fails to connect to the instance. It works with zbctl though.

@barmac
Contributor Author

barmac commented Jul 18, 2022

I tried to set the certificate/key paths via env variables, but apparently there is a bug in zeebe-node which prevents us from using env variables at the moment 🤡 I will create an issue for this soon.

edit: I prepared a fix instead: camunda-community-hub/zeebe-client-node-js#263

@barmac
Contributor Author

barmac commented Jul 18, 2022

Simple solution sketch:

Given I have a certificate located on my disk, I can either:

  1. Run Modeler with --zeebe-ssl-certificate flag like so: modeler --zeebe-ssl-certificate="/path/to/my/cert.pem", or
  2. Configure flags.json file like so:
{
  "zeebe-ssl-certificate": "/path/to/my/cert.pem"
}

So that when Modeler tries to connect to Zeebe, it will use the provided certificate.
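
To make the two options concrete, here is a minimal sketch of how the flag value could be looked up. It is not the Modeler's actual flag handling: `getCertificateFlag` is a hypothetical helper, and the rule that the command line wins over flags.json is an assumption made for the example.

```javascript
const fs = require('fs');

const FLAG = 'zeebe-ssl-certificate';

// Hypothetical helper: return the certificate path from the command line,
// falling back to flags.json, or null when neither source defines it.
function getCertificateFlag(argv, flagsJsonPath) {
  const prefix = `--${FLAG}=`;
  const arg = argv.find(value => value.startsWith(prefix));

  // Assumption: a command line flag takes precedence over flags.json.
  if (arg) {
    return arg.slice(prefix.length);
  }

  if (flagsJsonPath && fs.existsSync(flagsJsonPath)) {
    const flags = JSON.parse(fs.readFileSync(flagsJsonPath, 'utf8'));
    return flags[FLAG] || null;
  }

  return null;
}

// Usage (not executed here):
// getCertificateFlag(process.argv.slice(2), '/path/to/flags.json');
```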

@ajeans Does it sound like a proper solution for your use case? What potential problems can you see?

@christian-konrad @nikku Please share your feedback. Note that this allows only a single certificate as opposed to the idea in #3028 (comment). However, this is adjusted to zeebe-node API.

@barmac
Contributor Author

barmac commented Jul 19, 2022

Regarding the OS keychain: by default, Node.js uses the bundled root certificates from the Mozilla CA store. I believe this is the reason why the custom certificate added to the system keychain is ignored by the client.

In VSCode, they patch the http agent to also include the certificates found in the system keychain, cf. https://github.com/microsoft/vscode-proxy-agent/blob/main/src/index.ts#L364 We could do that as well, but it would take more time to implement than the flag-based approach.
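
As a rough sketch of the idea (not the VSCode implementation), the bundled roots are exposed as `tls.rootCertificates`, and additional PEM certificates can be appended to them. `buildRootCerts` is a hypothetical helper, and where the extra certificates come from (for example, an OS keychain reader) is left out here.

```javascript
const tls = require('tls');

// Hypothetical helper: merge Node's bundled Mozilla roots with extra PEM
// certificates. A `ca` option replaces the default trust store rather than
// extending it, so the bundled roots are added back explicitly.
function buildRootCerts(extraPems = []) {
  const all = [ ...tls.rootCertificates, ...extraPems ].map(pem => pem.trim());

  return Buffer.from(all.join('\n') + '\n');
}

// Usage (not executed here):
// const rootCerts = buildRootCerts([ fs.readFileSync('/path/to/my/cert.pem', 'utf8') ]);
```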

@barmac barmac self-assigned this Jul 19, 2022
@barmac
Contributor Author

barmac commented Jul 19, 2022

Some additional findings:

We cannot use Electron's net module (which uses Chromium networking behind the scenes) because gRPC is based on Node's http2 module while Electron's net can only do normal http(s) requests.

There are also not-vscode packages which handle OS stores:

Note that they're not actively developed.

@barmac
Contributor Author

barmac commented Jul 20, 2022

@christian-konrad and I just had another meeting on this issue. We decided to implement two solutions:

  • use system keychain certificates (there is no reason for CM to ignore trusted certificates),
  • provide a flag to supplement a custom certificate (to solve the issue for quick dev setup).

@barmac
Contributor Author

barmac commented Jul 20, 2022

I tried out the system keychain solution and it worked fine. Check out this commit for details.

@barmac
Contributor Author

barmac commented Jul 20, 2022

Interesting learning: As grpc-js uses Node's http2 module instead of https, global agent patches provided by mac-ca and the like don't solve the issue. Instead, we need to pass the root certs directly to zeebe-node. AFAIK there is no global agent object for the http2 module.
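
The difference can be seen with Node's standard modules alone. The sketch below is only an illustration of the point, not the zeebe-node or grpc-js code path; `sessionOptions` and `openSession` are hypothetical helper names.

```javascript
const http2 = require('http2');
const https = require('https');

// https exposes a shared agent that libraries can patch globally;
// http2 has no such export, so there is nothing global to patch.
const hasHttpsGlobalAgent = typeof https.globalAgent === 'object';
const hasHttp2GlobalAgent = 'globalAgent' in http2;

// Hypothetical helper: the trusted roots must travel with each connection.
function sessionOptions(rootCerts) {
  return { ca: rootCerts };
}

// Hypothetical helper: open an HTTP/2 session that trusts only `rootCerts`.
function openSession(authority, rootCerts) {
  return http2.connect(authority, sessionOptions(rootCerts));
}

// Usage (not executed here):
// const session = openSession('https://localhost:26500', rootCerts);
// session.on('error', console.error);
```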

@nikku
Member

nikku commented Jul 20, 2022

I'm happy that we ended up with #3028 (comment).

There is no reason why we should not work with standard system facilities.

Given the flag, anyone can plug in their own certificate, overriding the default lookup. 👍

@barmac
Contributor Author

barmac commented Jul 21, 2022

Unfortunately, it seems that with zeebe-node we need to know upfront whether or not we want to use TLS. This is impossible to determine from an endpoint address like localhost:26500 :/

I think the right solution to this problem is to enforce an http(s) prefix. This will remove the ambiguity, and for legacy endpoints we can display http when no authentication was configured, otherwise https.
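
A minimal sketch of that idea, assuming the protocol is mandatory and only http and https are accepted. `parseEndpoint` is a hypothetical helper and the default ports are an assumption for the example, not the Modeler's actual behavior.

```javascript
// Hypothetical helper: derive the TLS decision from the endpoint's protocol.
function parseEndpoint(endpoint) {
  const url = new URL(endpoint);

  // "localhost:26500" parses with the protocol "localhost:", so it is rejected here.
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    throw new Error(`Endpoint must start with http:// or https://: ${endpoint}`);
  }

  const useTLS = url.protocol === 'https:';

  return {
    useTLS,
    host: url.hostname,
    // Assumption: fall back to the protocol's default port when none is given.
    port: Number(url.port || (useTLS ? 443 : 80))
  };
}

// Usage (not executed here):
// const { useTLS, host, port } = parseEndpoint('https://localhost:26500');
```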

@nikku
Member

nikku commented Jul 21, 2022

Usually you'd see a tls:// prefix to distinguish secured from insecure access.

Is what we do HTTP? I'm not sure; I thought we'd speak binary.

@barmac
Contributor Author

barmac commented Jul 21, 2022

gRPC is done over HTTP2, and this is how we communicate with Zeebe.

barmac added a commit that referenced this issue Jul 22, 2022
The flag to use is `--zeebe-ssl-certificate=<path-to-file>`.

Related to #3028
barmac added a commit that referenced this issue Sep 2, 2022
barmac added a commit that referenced this issue Sep 2, 2022
Protocol is now required for the contact points of self-hosted Zeebe instances.

Closes #3028
@barmac
Contributor Author

barmac commented Sep 2, 2022

Hi @ajeans,

I am glad to see you back. Thank you for trying out my suggestion and providing the exact commands you used, as this gave me a 💡 moment.

So I tried to use the certificate with a relative path like in the snippet you provided:

Started with
$ /opt/camunda-modeler-3028-select-certificate-via-flag-linux-x64/camunda-modeler --zeebe-ssl-certificate=bin/dev.crt

...and indeed the solution did not work on my machine. It worked with an absolute path, though -> I must've used an absolute path all the time.

The good news is that I was able to implement relative path handling, so if you download the artifact now, it should work both ways. Please let me know if you can confirm it.
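
For reference, relative path handling of this kind usually comes down to resolving the flag value against a base directory. The sketch below assumes the base is the directory the Modeler was started from; the actual fix may use a different base, and `resolveCertificatePath` is a hypothetical name.

```javascript
const path = require('path');

// Hypothetical helper: turn the flag value into an absolute path.
// Absolute values pass through unchanged; relative values are resolved
// against `cwd` (assumed here to be the directory the app was started from).
function resolveCertificatePath(value, cwd = process.cwd()) {
  return path.resolve(cwd, value);
}

// Usage (not executed here):
// resolveCertificatePath('bin/dev.crt');
```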

@ajeans

ajeans commented Sep 6, 2022

Hello @barmac

So I happily downloaded the new image and...

  • Redid my port-forward
kubectl port-forward -n zeebe-caseflow service/zeebe-gateway 26500
  • Launched the camunda modeler with a pointer to the certificate
GRPC_VERBOSITY=DEBUG GRPC_TRACE=all  /opt/camunda-modeler-3028-select-certificate-via-flag-linux-x64/camunda-modeler --zeebe-ssl-certificate=bin/dev.crt

(This modeler is from September 2nd so definitely the new version)

  • Tried deploying graphically

image

But still a failure, full verbose logs attached.

startup.log

Also tried pointing at the certificate absolute path with no visible change in behaviour.

Would you consider adding debug logs in a build so that we can pinpoint where the problem is? I assume the certificate is properly loaded (it is in the system certificates and loaded from the CLI).

Checking back with zbctl

  • ./bin/zbctl deploy quicksign-caseflow-bpmn/src/main/resources/demo-signature.bpmn --address zeebe-gateway.zeebe-caseflow:26500 --certPath bin/dev.crt works fine

@ajeans

ajeans commented Sep 6, 2022

Looking some more at the logs, this seems to be the best candidate for the problem

D 2022-09-06T07:26:06.812Z | subchannel | (2) 127.0.0.1:26500 creating HTTP/2 session
D 2022-09-06T07:26:06.876Z | subchannel | (2) 127.0.0.1:26500 connection closed with error unable to verify the first certificate
D 2022-09-06T07:26:06.876Z | subchannel | (2) 127.0.0.1:26500 connection closed

Looking for this specific error message points at grpc/grpc-node#2055 but that is an open issue on grpc-js so that doesn't sound good. :-(

@ajeans

ajeans commented Sep 6, 2022

@barmac

Is there any way I can build camunda-modeler locally and run it in debug mode and step into the grpc behaviour with an IDE?

I am looking at https://github.com/grpc/grpc-node/blob/master/packages/grpc-js/src/subchannel.ts#L400 and would like to understand what part of the TLS validation fails.

Another option would be to check if "X509v3 Subject Alternative Name" is properly supported by changing your certificate test case to something that looks more like my certificate.
So changing https://github.com/barmac/zeebe-tls-connection-test/blob/main/generate-cert.sh to have a CN that is different from the hostname.

Here is my current certificate.
dev.txt

What do you think?

barmac added a commit that referenced this issue Sep 8, 2022
The flag to use is `--zeebe-ssl-certificate=<path-to-file>`.

Related to #3028
barmac added a commit that referenced this issue Sep 8, 2022
barmac added a commit that referenced this issue Sep 8, 2022
Protocol is now required for the contact points of self-hosted Zeebe instances.

Closes #3028
@barmac
Contributor Author

barmac commented Sep 8, 2022

Hi @ajeans

I tried out the certificate you provided and couldn't get it to work with either Camunda Modeler or zbctl, which failed with:

zbctl --certPath ./cert.pem status --authority zeebe-gateway.zeebe-caseflow --address zeebe-gateway.zeebe-caseflow:26500
Error: rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: remote error: tls: internal error"

Still, my suggestion would be that you try to use a certificate where SAN includes the Common Name, as suggested at https://support.dnsimple.com/articles/what-is-common-name/#common-name-vs-subject-alternative-name
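
To see which names a certificate actually carries, Node's built-in crypto module can be used. This is a sketch for inspection only, assuming a Node.js version that ships `crypto.X509Certificate` (15.6 or later); `describeCertificate` is a hypothetical helper, and how the Common Name is weighed against the SAN in `checkHost` depends on the Node.js version.

```javascript
const { X509Certificate } = require('crypto');

// Hypothetical helper: report the subject, the SAN entries, and whether
// Node considers the certificate valid for the given host name.
function describeCertificate(pem, hostname) {
  const cert = new X509Certificate(pem);

  return {
    subject: cert.subject,
    subjectAltName: cert.subjectAltName || null,
    // checkHost returns undefined when the name does not match.
    matchesHost: cert.checkHost(hostname) !== undefined
  };
}

// Usage (not executed here):
// const pem = require('fs').readFileSync('bin/dev.crt', 'utf8');
// console.log(describeCertificate(pem, 'zeebe-gateway.zeebe-caseflow'));
```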

> Is there any way I can build camunda-modeler locally and run it in debug mode and step into the grpc behaviour with an IDE?

To build Camunda Modeler locally, you need to clone this repo, git checkout 3028-select-certificate-via-flag, and then run npm ci && npm run build to reproduce the build. You can also run the app in development mode via npm start. npm@8 is required.

For the next steps, I will mark the linked PR as ready, and once we can confirm within the team that it works with a basic certificate across the supported platforms, we will merge it. We can still continue to work on supporting your use case, but that should be subject to a separate PR, as I believe we can release support for at least a basic scenario right now.

@barmac
Contributor Author

barmac commented Sep 9, 2022

BTW according to the article I shared (https://support.dnsimple.com/articles/what-is-common-name/):

> The Common Name (AKA CN) represents the server name protected by the SSL certificate. The certificate is valid only if the request hostname matches the certificate common name. Most web browsers display a warning message when connecting to an address that does not match the common name in the certificate.

This is also mentioned at https://knowledge.digicert.com/solution/SO7239.html

I suppose that can be the reason why I get the errors when trying out your certificate.

@ajeans

ajeans commented Sep 10, 2022

Hi @barmac

Well this same article that you point at says the following close to the end.

> On the technical side, the SAN extension was introduced to integrate the common name. Since HTTPS was first introduced in 2000 (and defined by the RFC 2818), the use of the commonName field has been considered deprecated, because it’s ambiguous and untyped.

Anyway, I agree this should not hold up your MR.

I will try to spend more time debugging this next week, because I would like this to work and camunda-modeler is the only software that doesn't work with how our Kubernetes cluster presents its certificates.

Hopefully I can find additional information.

Thanks

barmac added a commit that referenced this issue Sep 12, 2022
The flag to use is `--zeebe-ssl-certificate=<path-to-file>`.

Related to #3028
barmac added a commit that referenced this issue Sep 12, 2022
barmac added a commit that referenced this issue Sep 12, 2022
Protocol is now required for the contact points of self-hosted Zeebe instances.

Closes #3028
@bpmn-io-tasks bot added the needs review and in progress labels and removed the in progress and needs review labels Sep 12, 2022
@ajeans

ajeans commented Sep 13, 2022

cc @fguay this is our first step to do deployments to DEV graphically.

barmac added a commit that referenced this issue Sep 15, 2022
The flag to use is `--zeebe-ssl-certificate=<path-to-file>`.

Related to #3028
barmac added a commit that referenced this issue Sep 15, 2022
barmac added a commit that referenced this issue Sep 15, 2022
Protocol is now required for the contact points of self-hosted Zeebe instances.

Closes #3028
@bpmn-io-tasks bot removed the in progress label Sep 15, 2022
@barmac
Contributor Author

barmac commented Sep 15, 2022

I created a follow-up issue for the documentation: camunda/camunda-docs#1268

@ajeans When you have new findings regarding how you connect to Zeebe, please create a new issue and link this one. The flag we have implemented will be available in the nightly build from tomorrow on. Thank you for your collaboration on this so far.
