Integration test failure #1461

Closed
Sylphe88 opened this issue Jun 29, 2023 · 22 comments · Fixed by #1527
Labels
bug Dysfunctionnal behavior

Comments

@Sylphe88

Sylphe88 commented Jun 29, 2023

Version(s)

2.0.0

Which components

server- Integration tests

Tested With

No response

What happened

I've got the same issue as #952; the build log is attached.

My environment is as follows:
Apache Maven 3.8.7 (b89d5959fcde851dcb1c8946a785a163f14e1e29)
Maven home: C:\apache-maven-3.8.7
Java version: 17.0.6, vendor: Oracle Corporation, runtime: C:\Program Files\Java\jdk-17

Default locale: fr_FR, platform encoding: Cp1252
OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows"

I recall being able to build Leshan a few months back on Windows and my environment is likely to be the same as then. The build succeeds on my WSL (most notable difference is Java 11 there).

Any thoughts?

How to reproduce

Just run mvn clean install

Relevant Output

No response

leshan-build-fail-SecurityTest.txt

@Sylphe88 Sylphe88 added the bug Dysfunctionnal behavior label Jun 29, 2023
@sbernard31
Contributor

sbernard31 commented Jun 29, 2023

Indeed, it looks very similar 🤔 and, as before, I have no idea what the issue could be.
Did you try with another version of Java (e.g. Java 11 or OpenJDK)?
On my side, I will try to build it with openjdk-17 (but I'm still on Linux/Debian).

Did you try #952 (comment)?
Maybe we will get more information to understand the issue.

@sbernard31
Contributor

😬 Unsurprisingly, I'm not able to reproduce this issue with:

mvn -version
Apache Maven 3.6.3
Maven home: /usr/share/maven
Java version: 17.0.7, vendor: Debian, runtime: /usr/lib/jvm/java-17-openjdk-amd64
Default locale: fr_FR, platform encoding: UTF-8
OS name: "linux", version: "5.10.0-23-amd64", arch: "amd64", family: "unix"

@Sylphe88
Author

Java 11 gives the same result.

Will add the logback config later today :)

@jvermillard
Contributor

mvn 3.9.3 on master with openjdk 17.0.7 on fedora 38, no build/test issues

@Sylphe88
Author

Sylphe88 commented Jun 29, 2023

Looks like there's a DTLS issue: the x509 certificate generated during the integration tests fails validation (see log file)...
leshan-build-fail-SecurityTest-debug.txt

I'll see if Windows updates may be responsible for this as I haven't changed any tls/crypto-related packages or libraries myself
(Sorry for failposting on the original thread)

@sbernard31
Contributor

I just looked at it.

There is some code in Leshan where we verify that the certificate subject matches the hostname.
And for some reason, in your case, it doesn't match:

org.eclipse.californium.scandium.dtls.HandshakeException: Certificate chain could not be validated - server identity does not match certificate
        at org.eclipse.leshan.client.californium.BaseCertificateVerifier.validateSubject(BaseCertificateVerifier.java:110)
        at org.eclipse.leshan.client.californium.ServiceCertificateConstraintCertificateVerifier.verifyCertificate(ServiceCertificateConstraintCertificateVerifier.java:89)
        at org.eclipse.leshan.client.californium.BaseCertificateVerifier.verifyCertificate(BaseCertificateVerifier.java:67)
        at org.eclipse.californium.scandium.dtls.Handshaker.verifyCertificate(Handshaker.java:2466)
        at org.eclipse.californium.scandium.dtls.ClientHandshaker.receivedServerCertificate(ClientHandshaker.java:538)
        at org.eclipse.californium.scandium.dtls.ClientHandshaker.doProcessMessage(ClientHandshaker.java:306)
        at org.eclipse.californium.scandium.dtls.Handshaker.processNextHandshakeMessages(Handshaker.java:969)
        at org.eclipse.californium.scandium.dtls.Handshaker.processNextMessages(Handshaker.java:824)
        at org.eclipse.californium.scandium.dtls.Handshaker.processMessage(Handshaker.java:782)
        at org.eclipse.californium.scandium.DTLSConnector.processHandshakeRecord(DTLSConnector.java:2401)
        at org.eclipse.californium.scandium.DTLSConnector.processRecord(DTLSConnector.java:2121)

The certificates for integration tests are created with a CN=localhost, so as you run your tests locally this should work 🤔

I see some peer [kubernetes.docker.internal/127.0.0.1:51117] in your logs, so maybe this is related to that environment?

Anyway, the log message is not very helpful; we should change it to display what is received and what is expected.
I will do it, but doing it cleanly could be more complicated than expected.
If you want more information now, you can modify BaseCertificateVerifier.validateSubject(InetSocketAddress, X509Certificate) with something like:

    protected void validateSubject(final InetSocketAddress peerSocket, final X509Certificate receivedServerCertificate)
            throws HandshakeException {

        if (X509CertUtil.matchSubjectDnsName(receivedServerCertificate, peerSocket.getHostName()))
            return;

        if (X509CertUtil.matchSubjectInetAddress(receivedServerCertificate, peerSocket.getAddress()))
            return;

        AlertMessage alert = new AlertMessage(AlertLevel.FATAL, AlertDescription.BAD_CERTIFICATE);
        throw new HandshakeException(String.format(
                "Certificate chain could not be validated - server identity %s:%s does not match certificate %s",
                peerSocket.getHostName(), peerSocket.getAddress(), receivedServerCertificate), alert);
    }

Then run your test again and share just the stack trace for BaseCertificateVerifier.validateSubject?
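As a rough sketch of what a more compact diagnostic could look like, the helper below summarizes only the subject alternative names instead of printing the whole certificate. It works on the collection shape returned by the JDK's X509Certificate.getSubjectAlternativeNames(), where each entry is a list whose first element is the integer general-name type (2 for dNSName, 7 for iPAddress). The class and method names (SanSummary, describe, mismatchMessage) are hypothetical and are not part of Leshan or Californium.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical helper, not part of Leshan or Californium.
class SanSummary {
    private static final int DNS_NAME = 2;   // general-name type for dNSName
    private static final int IP_ADDRESS = 7; // general-name type for iPAddress

    // Renders SAN entries as e.g. "DNS:localhost, IP:127.0.0.1".
    public static String describe(Collection<List<?>> sans) {
        if (sans == null || sans.isEmpty()) {
            return "<none>";
        }
        List<String> parts = new ArrayList<>();
        for (List<?> san : sans) {
            int type = (Integer) san.get(0);
            if (type == DNS_NAME) {
                parts.add("DNS:" + san.get(1));
            } else if (type == IP_ADDRESS) {
                parts.add("IP:" + san.get(1));
            } else {
                parts.add("type" + type);
            }
        }
        return String.join(", ", parts);
    }

    // Builds a message showing what was expected (peer identity) and what was received (SANs).
    public static String mismatchMessage(String hostName, String hostAddress, Collection<List<?>> sans) {
        return String.format("server identity %s (%s) does not match certificate SANs [%s]",
                hostName, hostAddress, describe(sans));
    }

    public static void main(String[] args) {
        Collection<List<?>> sans = new ArrayList<>();
        sans.add(Arrays.<Object>asList(2, "localhost"));
        System.out.println(mismatchMessage("127.0.0.1", "127.0.0.1", sans));
    }
}
```

In a real verifier the collection would come from receivedServerCertificate.getSubjectAlternativeNames(), which can return null and can throw CertificateParsingException; both cases would need handling.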

@Sylphe88
Author

Sylphe88 commented Jun 29, 2023

I fixed the kubernetes thing that was apparently written to my hosts file for no legitimate reason and reverted to the good old 127.0.0.1 localhost line. However there's still something going on with the way the InetSocketAddress looks up the address/hostname. With the additional logging you recommended:

Caused by: org.eclipse.californium.scandium.dtls.HandshakeException: Certificate chain could not be validated - server identity 127.0.0.1:127.0.0.1/127.0.0.1 does not match certificate [
[
  Version: V3
  Subject: CN=Server signed with Intermediate CA
  Signature Algorithm: SHA256withECDSA, OID = 1.2.840.10045.4.3.2

  Key:  Sun EC public key, 256 bits
  public x coord: 81741892632623131560020269223745815777694899637370162187782131386479660134752
  public y coord: 51913889454012516220131240810654165014666744529872007761037466049035318492215
  parameters: secp256r1 [NIST P-256,X9.62 prime256v1] (1.2.840.10045.3.1.7)
  Validity: [From: Thu Nov 19 15:47:55 CET 2020,
               To: Sat Oct 26 16:47:55 CEST 2120]
  Issuer: CN=Leshan intermediate CA
  SerialNumber: [    4a8c4577]

Certificate Extensions: 6
[1]: ObjectId: 2.5.29.35 Criticality=false
AuthorityKeyIdentifier [
KeyIdentifier [
0000: AD 31 B9 AF 93 72 64 92   5A D2 95 1E E5 40 AE 35  .1...rd.Z....@.5
0010: 56 9B 60 D7                                        V.`.
]
]

[2]: ObjectId: 2.5.29.19 Criticality=false
BasicConstraints:[
  CA:false
  PathLen: undefined
]

[3]: ObjectId: 2.5.29.37 Criticality=false
ExtendedKeyUsages [
  serverAuth
]

[4]: ObjectId: 2.5.29.15 Criticality=true
KeyUsage [
  DigitalSignature
  Key_Agreement
]

[5]: ObjectId: 2.5.29.17 Criticality=false
SubjectAlternativeName [
  DNSName: localhost
]

[6]: ObjectId: 2.5.29.14 Criticality=false
SubjectKeyIdentifier [
KeyIdentifier [
0000: E2 C9 90 A0 98 E5 43 FE   72 BC 2E 01 2B 7E 2D A5  ......C.r...+.-.
0010: 9C 70 9B 06                                        .p..
]
]

]
  Algorithm: [SHA256withECDSA]
  Signature:
0000: 30 45 02 20 23 D3 6E 77   BC F1 81 47 21 78 FB 04  0E. #.nw...G!x..
0010: DD CE 66 DA 3D C2 53 A3   37 B5 DC 96 39 48 EF 87  ..f.=.S.7...9H..
0020: F0 E9 89 AA 02 21 00 EC   70 8C 6E 5A B0 8D 9C C7  .....!..p.nZ....
0030: AB A4 52 08 31 88 56 A9   7E F9 BC 29 03 05 D6 2E  ..R.1.V....)....
0040: BD B1 CB 18 4B C1 C4                               ....K..

]

I can successfully resolve 127.0.0.1 to localhost with ping or other network tools.

@sbernard31
Contributor

sbernard31 commented Jun 29, 2023

Are you sure this log corresponds to a failing test?
Some tests check that a bad certificate cannot be used to connect.
I'm not sure, but I think this could be one of them 🤔 (and so this would be expected behavior).

@Sylphe88
Author

I basically get the same error on all failing tests. Here's an extract. I get this is related to my network setup but I
leshan-build-fail-SecurityTest-debug-singletest.txt

One of the failing tests can be run with mvn test -rf :leshan-integration-tests -Dtest="SecurityTest#registered_device_with_x509cert_to_server_with_x509cert_rootca_certificate_ca_domain_root_ca_given"

@sbernard31
Contributor

Very hard for me to understand 🤔, especially as I cannot reproduce it on my side 😬.

I tried to launch this test: SecurityTest#registered_device_with_x509cert_to_server_with_x509cert_rootca_certificate_ca_domain_root_ca_given

and checked how BaseCertificateVerifier.validateSubject behaves.
On my side, at:

    if (sans != null) {
        for (List<?> san : sans) {
            int generalName = (Integer) san.get(0);
            if (generalName == GeneralName.DNS_NAME.value) {
                String value = (String) san.get(1);
                if (dnsNameMatch(value, dnsName)) {
                    return true;
                }
            }
        }
    }

line 295, value equals dnsName and both are the string "localhost".
Could you try to see what the values of those 2 variables are on your side? (Maybe by adding a LOG or sysout?)
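The comparison in that excerpt can be reproduced outside Leshan with the minimal sketch below, which also prints both values as suggested. It is a simplified stand-in: plain case-insensitive equality replaces Californium's dnsNameMatch, whose exact rules are not shown in this thread, and the class name SanDnsMatchSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for the SAN loop quoted above; not Californium code.
class SanDnsMatchSketch {
    private static final int DNS_NAME = 2; // general-name type for dNSName

    // Returns true if any dNSName SAN entry equals the peer name (case-insensitive).
    public static boolean matchesDnsName(Collection<List<?>> sans, String dnsName) {
        if (sans == null || dnsName == null) {
            return false;
        }
        for (List<?> san : sans) {
            int generalName = (Integer) san.get(0);
            if (generalName == DNS_NAME) {
                String value = (String) san.get(1);
                // Print both sides so a mismatch is visible in the test output.
                System.out.printf("comparing SAN value '%s' with peer name '%s'%n", value, dnsName);
                if (value.equalsIgnoreCase(dnsName)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Collection<List<?>> sans = new ArrayList<>();
        sans.add(Arrays.<Object>asList(2, "localhost"));
        System.out.println(matchesDnsName(sans, "localhost"));
        System.out.println(matchesDnsName(sans, "127.0.0.1"));
    }
}
```

With a certificate whose only SAN is DNSName: localhost, a peer name of 127.0.0.1 cannot match a dNSName entry, so the outcome depends entirely on what the hostname lookup returns.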

@sbernard31
Contributor

@Sylphe88 any news about that ?

@Sylphe88
Author

@sbernard31 Not much, I still have the same issue on my machine (but at least the CI runs, as the target is Debian). The problem lies with both matchSubjectInetAddress and matchSubjectDnsName. There's a mismatch in the second one (localhost vs 127.0.0.1) and the first one does not even run the name comparison; InetSocketAddress.getAddress returns a weird 127.0.0.1/127.0.0.1 address (wtf is that prefix).

Even odder, a small test Java app calling InetSocketAddress.getHostName() returns localhost...

A colleague of mine experiences the same inability to test/build on his Windows machine
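On the 127.0.0.1/127.0.0.1 value: InetAddress.toString() is documented to return "hostname/literal IP address", with an empty hostname part when the name is unresolved. The part before the slash is therefore the host name the JVM holds for that address, and seeing the IP literal there is consistent with a lookup that did not yield "localhost". That reading is an interpretation of the log, not something confirmed in this thread. The sketch below reproduces the three shapes without any name lookup; the class name LoopbackToStringDemo is hypothetical.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical demo class showing the "hostname/literal" format of InetAddress.toString().
class LoopbackToStringDemo {
    private static final byte[] LOOPBACK = {127, 0, 0, 1};

    // Builds the loopback address with an explicit host name; no name service lookup is done.
    public static String withHostName(String hostName) {
        try {
            return InetAddress.getByAddress(hostName, LOOPBACK).toString();
        } catch (UnknownHostException e) {
            throw new IllegalStateException(e); // only thrown for an illegal address length
        }
    }

    // Builds the loopback address from raw bytes only, leaving the host name unresolved.
    public static String withoutHostName() {
        try {
            return InetAddress.getByAddress(LOOPBACK).toString();
        } catch (UnknownHostException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(withHostName("localhost")); // localhost/127.0.0.1
        System.out.println(withHostName("127.0.0.1")); // 127.0.0.1/127.0.0.1
        System.out.println(withoutHostName());         // /127.0.0.1
    }
}
```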

@sbernard31
Contributor

Thx for details. 🙏

The javadoc about InetSocketAddress.getHostName() says :

Gets the hostname. Note: This method may trigger a name service reverse lookup if the address was created with a literal IP address.

And looking at the Leshan integration tests code, I see that the InetSocketAddress is created with an IP literal, so a name service reverse lookup should be done, and for some reason it does not return "localhost".

So maybe we could modify the code to create the InetSocketAddress with "localhost".

Could you test with master and replace the getServerUri() method of the LeshanTestClientBuilder class with:

    private URI getServerUri() {
        LwM2mServerEndpoint endpoint = server.getEndpoint(protocolToUse);
        URI serverUri = endpoint.getURI();
        if (proxy != null) {
            // if server is behind a proxy we use its URI
            return EndpointUriUtil.replaceAddress(serverUri,
                    new InetSocketAddress("localhost", proxy.getClientSideProxyAddress().getPort()));
        } else {
            return EndpointUriUtil.replaceAddress(serverUri, new InetSocketAddress("localhost", serverUri.getPort()));
        }
    }
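A minimal sketch of why building the address from a name sidesteps the reverse lookup: an InetSocketAddress created with a host name returns that name from getHostName() as-is, while one created from a literal address has no name stored (getHostString() shows this without triggering a lookup). The class name and port 5684 are arbitrary choices for illustration, not taken from the Leshan tests.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Hypothetical demo class comparing name-based and literal-based socket addresses.
class SocketAddressHostDemo {

    // Address built from a host name: getHostName() returns the stored name, no reverse lookup.
    public static String hostFromName(String hostName, int port) {
        return new InetSocketAddress(hostName, port).getHostName();
    }

    // Address built from a literal: getHostString() returns the literal without a reverse lookup.
    public static String hostStringFromLiteral(int port) {
        try {
            InetAddress literal = InetAddress.getByAddress(new byte[] {127, 0, 0, 1});
            return new InetSocketAddress(literal, port).getHostString();
        } catch (UnknownHostException e) {
            throw new IllegalStateException(e); // only thrown for an illegal address length
        }
    }

    public static void main(String[] args) {
        System.out.println(hostFromName("localhost", 5684)); // localhost
        System.out.println(hostStringFromLiteral(5684));     // 127.0.0.1
    }
}
```

For the literal-based address, getHostName() would instead ask the name service, which is the step that appears to behave differently on the affected Windows machines.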

@sbernard31
Contributor

@Sylphe88 did you test this solution ☝️ ?
If you confirm it works for you, I can integrate this in master

@Sylphe88
Author

Sorry @sbernard31, I've been super busy. I'll try that next week if that's fine

@sbernard31
Contributor

That's fine 👍

@Sylphe88
Author

Sylphe88 commented Sep 5, 2023

I was still late to test it, but it worked fine this time, with the master branch and the return condition you mentioned 💪 Is that "localhost" condition acceptable in all test conditions? 🤔

@malrancau

Has this issue been resolved? A similar phenomenon occurs to me.

@sbernard31
Contributor

Has this issue been resolved?

@malrancau, when this is resolved, a comment will be added and the issue will be closed.

A similar phenomenon occurs to me.

Did you try #1461 (comment)? Does it solve your issue?

@sbernard31
Contributor

@Sylphe88, Oops I totally missed your comment.

I was still late to test it, but it worked fine this time, with the master branch and the return condition you mentioned 💪 Is that "localhost" condition acceptable in all test conditions? 🤔

I think using localhost should be OK. I will integrate this kind of fix soon.

@sbernard31
Contributor

(I created a PR about that #1527)

@sbernard31
Contributor

The PR is integrated in master.
I hope this will solve the issue.

@Sylphe88 thanks a lot for taking the time to report this 🙏
