Large breakage on NonStop - No plan found in TAP output [master, 3.2] #22798

Closed
rsbeckerca opened this issue Nov 22, 2023 · 19 comments
Labels
triaged: bug The issue/pr is/fixes a bug

Comments

@rsbeckerca
Contributor

The commit 6d552a5 on the openssl-3.2 branch fails in a number of places in all NonStop builds. Help is needed.

70-test_certtypeext.t                 (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
70-test_comp.t                        (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
70-test_key_share.t                   (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
70-test_quic_multistream.t            (Wstat: 256 Tests: 1 Failed: 1)
  Failed test:  1
  Non-zero exit status: 1
70-test_quic_tserver.t                (Wstat: 256 Tests: 1 Failed: 1)
  Failed test:  1
  Non-zero exit status: 1
70-test_renegotiation.t               (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: Bad plan.  You planned 5 tests but ran 0.
70-test_sslcbcpadding.t               (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
70-test_sslcertstatus.t               (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
70-test_sslextension.t                (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
70-test_sslmessages.t                 (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
70-test_sslrecords.t                  (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
70-test_sslsessiontick.t              (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
70-test_sslsigalgs.t                  (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
70-test_sslsignature.t                (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
70-test_sslskewith0p.t                (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
70-test_sslversions.t                 (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
70-test_sslvertol.t                   (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
70-test_tls13alerts.t                 (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
70-test_tls13cookie.t                 (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
70-test_tls13downgrade.t              (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
70-test_tls13hrr.t                    (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
70-test_tls13kexmodes.t               (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
70-test_tls13messages.t               (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
70-test_tls13psk.t                    (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
70-test_tlsextms.t                    (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: No plan found in TAP output
79-test_http.t                        (Wstat: 256 Tests: 2 Failed: 1)
  Failed test:  1
  Non-zero exit status: 1
80-test_cmp_http.t                    (Wstat: 65280 Tests: 0 Failed: 0)
  Non-zero exit status: 255
  Parse errors: Bad plan.  You planned 6 tests but ran 0.
80-test_ssl_old.t                     (Wstat: 512 Tests: 13 Failed: 2)
  Failed tests:  2, 8
  Non-zero exit status: 2
Files=295, Tests=3515, 2441 wallclock secs (10.01 usr  0.00 sys + 1450.35 cusr 21.76 csys = 1482.12 CPU)
Result: FAIL
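
A note on reading the summary above: the Wstat field is the raw wait status of each test script. Assuming the usual POSIX encoding (exit code in the high byte, terminating signal in the low seven bits), a small sketch like the following decodes it. The function name decode_wstat is made up for illustration.

```shell
#!/bin/sh
# Decode a wait status as printed in the harness "Wstat" field.
# Assumption: conventional encoding, exit code in bits 8-15 and
# terminating signal in the low 7 bits.
decode_wstat() {
    wstat=$1
    echo "exit=$(( wstat >> 8 )) signal=$(( wstat & 127 ))"
}

decode_wstat 65280   # the scripts reporting "No plan found"
decode_wstat 256     # e.g. 70-test_quic_multistream.t
decode_wstat 512     # 80-test_ssl_old.t
```

The three values agree with the "Non-zero exit status" lines in the summary: 255 for the scripts that exited before printing a plan, 1 and 2 for the scripts that ran and reported failed tests.
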
rsbeckerca added the "issue: bug report" label on Nov 22, 2023
@levitte
Member

levitte commented Nov 22, 2023

I very much doubt it's that specific commit, as it's a Windows-specific commit (i.e. irrelevant for you). A wild guess is that you have a fairly serious build failure, and we're just getting to see the fallout. A complete build log would help.

@nhorman
Contributor

nhorman commented Nov 22, 2023

Agreed, I can't fathom a relationship between the Windows template and the above failures. In addition to the build log @levitte asked for, it might be valuable to run some of the tests directly to gather additional output. For instance, running:
./test/quic_multistream_test
directly might be somewhat revealing
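
One way to follow this suggestion for many tests at once is to turn the harness summary into a list of recipe names and re-run only those with verbose output. This is a sketch under assumptions: failed_recipes and summary.txt are hypothetical names, and the make invocation shown in the comment relies on the TESTS and V variables described in the OpenSSL test documentation.

```shell
#!/bin/sh
# Turn harness summary lines such as
#   70-test_comp.t      (Wstat: 65280 Tests: 0 Failed: 0)
# into recipe names (test_comp), joined by single spaces.
failed_recipes() {
    sed -n 's/^[0-9][0-9]-\(test_[A-Za-z0-9_]*\)\.t .*(Wstat:.*/\1/p' "$1" |
        paste -s -d ' ' -
}

# Hypothetical usage, with the summary saved to summary.txt:
#   make test TESTS="$(failed_recipes summary.txt)" V=1
```

Re-running a handful of recipes this way keeps the output short enough to read, which is usually the point of running tests individually.
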

@rsbeckerca
Contributor Author

The last confirmed successful build/test we had was at commit 9e75a0b. This is for the unthreaded builds. The whole build/test log is attached.
openssl.3.2.fail.log

@levitte
Member

levitte commented Nov 22, 2023

That's quite a range of commits...

$ git log --pretty=oneline 9e75a0b911ffb2ad99190a72a3d740d100edf61f..6d552a532754f6ee66d6cc604655deaeb5425b16
6d552a532754f6ee66d6cc604655deaeb5425b16 Fix typo in variable name
a091bc6022b23c0b1caf1c7acbb1f15bdf290816 Move freeing of an old record layer to dtls1_clear_sent_buffer
e59ed0bfeece9db433809af2cebbe271a402d59b zero data in hm_fragment on alloc
5091aadc223315ce115ee12f62df2af173bf5efb augment quic demos to support ipv4/6 connections
cf6342bc024868f5a55f2225f2e083415fb1329a NOTES-WINDOWS: fix named anchor links in table of contents
5f6b08e218974d4fbbd77ffedc2d94a08a194cc2 Bump actions/github-script from 6 to 7
4ea752997df83c2a694fdb157aab07908303fc90 Configure: do not check for an absolute prefix in cross-builds
dcfd8cfd4abf4a9fd26aef290dcdd5b4bb1c7f7a Update ci and ABI xml files to validate function parameters
46376fcf4b6d11ec417c2a530475037d4d09fcbf Correct tag len check when determining how much space we have in the pkt
aa6ac60728207ba18779d7cbe71893c066bcbc28 Add some additional tests for the new fc "consumed" params
e57bf6b3bfa2f0b18e5cad7fd3c5fdd7c51516b9 Keep track of connection credit as we add stream data
f5a63bf1c02cf0605ce3f5614cd016c3750766d3 Fix SHA3_squeeze on armv4.
86db958835d1f8ba9ce49a9f93b5309c3d13b91c demos: tidy up makefiles, fix warnings
56aa3e8d1a286e11e56d9a9f6373c33a87a69ff4 Import repro from #22708 as a test case
11e61b3174762ec21ce0875a6449c61e316b4a0b Fix BLAKE2s reporting the same EVP_MD_get_size() as BLAKE2b (64)
4f0172c543dd0f5582d52185bfe2c132faee9c8e README: add link to OpenSSL 3.2 manual pages
ed5e478261127cafe9c3f86c4992eab1e5c7ebb1 ppc64: Fix SHA3_squeeze
10264b534b366785bb560be77913fdbf9f0f4f8f Document the exporter change in CHANGES.md and NEWS.md
c768ccebc718ea0ed6afc5147fe4079fff632cd6 Add exporters for CMake
2ac569a67b9d0980efa2d8061a6a61e0645f37a7 Clean up exporters, specifically those we have for pkg-config
fe487609c17dac049f867f230e09ee090b65e966 Exclude more in the fuzz introspector report
456b32ba4f85000d168230b8cc5f58571699ed63 Rearrange some CI jobs
0ddcb55b602800d4a1bcf1e76ca32939ed4fdaa4 Correct 50-nonstop.conf to support QUIC tests under SPT threading models.
7602bf871564df86005f6c4c989f1d7cc2393878 Enable AES and SHA3 optimisations on Apple Silicon M3-based macOS systems
f63e1b48ac893dd6110452e70ed08f191547cd89 Force Nonstop to use fcntl(F_GETFL) in BIO_sock_nbio
9890cc42daff5e2d0cad01ac4bf78c391f599a6e set_client_ciphersuite(): Fix for potential UB if session->cipher is NULL
ae9fe65d9f85e027bd7428e0f84aa46ab368880e Augment rand argument parsing to allow scaling
66c27d06e0e9a1bb716c67390ad9e5ac613d45d3 Properly limit the variable output size for BLAKE2

@rsbeckerca
Contributor Author

@levitte I will run a bisect but it will take a while. Stay tuned.

@levitte
Member

levitte commented Nov 22, 2023

I went through the log you gave us (thank you!), and since it was about test failures, I went searching for "not ok". And yeah, I found some, all having the exact same pattern, exemplified by the first:

        # 00000000:error:10080106:BIO routines:BIO_socket_nbio:passed invalid argument:/home/jenkinsbuild/.jenkins/workspace/OpenSSL-3.2_Monitor/crypto/bio/bio_sock.c:391:
        # OPENSSL_TEST_RAND_SEED=1700625731
        not ok 1 - iteration 1

So I went looking for commits in the given range that touched crypto/bio/bio_sock.c:

$ git log --pretty=oneline 9e75a0b911ffb2ad99190a72a3d740d100edf61f..6d552a532754f6ee66d6cc604655deaeb5425b16 -- crypto/bio/bio_sock.c
f63e1b48ac893dd6110452e70ed08f191547cd89 Force Nonstop to use fcntl(F_GETFL) in BIO_sock_nbio

It's possible this is good enough to lead to a solution. I don't think I can help much further...
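
The search described above can be approximated with a short filter that lists the distinct error-queue entries in a log. This is only a sketch: the function name error_sites is hypothetical, and the pattern assumes error lines shaped like the one quoted above.

```shell
#!/bin/sh
# List the distinct OpenSSL error-queue entries found in a test log.
# Expected line shape (taken from the excerpt above):
#   # 00000000:error:10080106:BIO routines:BIO_socket_nbio:<reason>:<file>:<line>:
# The leading thread id and error code are stripped so that repeated
# reports of the same site collapse into one line.
error_sites() {
    sed -n 's/^[[:space:]]*# [0-9A-Fa-f]*:error:[0-9A-Fa-f]*:\(.*\)$/\1/p' "$1" |
        sort -u
}

# Hypothetical usage against the attached log:
#   error_sites openssl.3.2.fail.log
```

If every failure shares one site, the output is a single line naming the routine, reason, file and line, which is what makes the path-limited git log a natural next step.
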

@rsbeckerca
Contributor Author

That was from the series trying to get SPT to work. I suggest reverting that particular PR or at least the change in BIO_sock_nbio. I am continuing the bisect just in case.

@levitte
Member

levitte commented Nov 22, 2023

#22696 was only that commit.

A simple thing you could try locally would be git revert f63e1b48ac893dd6110452e70ed08f191547cd89 and build from there to see if that solves anything.

@nhorman
Contributor

nhorman commented Nov 22, 2023

@rsbeckerca but you had tested that change when we were tracking down your hang, and it passed all tests at the time, no?

@rsbeckerca
Contributor Author

> @rsbeckerca but you had tested that change when we were tracking down your hang, and it passed all tests at the time, no?

@nhorman It passed for SPT. I did not run Unthreaded for it.

@rsbeckerca
Contributor Author

Bisect showed that commit f63e1b4 is the problem. I'm going to try to revert that one and retest.
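
For readers unfamiliar with the technique, here is a self-contained sketch of an automated bisect in a throwaway repository. Everything in it (the repository, the status and counter files, the pass/fail check) is made up for illustration. In this issue's terms the range would be started with git bisect start 6d552a5 9e75a0b, and the command given to git bisect run would be whatever build-and-test step reproduces the failure, which the thread does not show.

```shell
#!/bin/sh
# Demonstrate "git bisect run" on a scratch repository (illustrative only).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
# Wrapper so commits work even without a configured identity.
g() { git -c user.name=demo -c user.email=demo@example.invalid "$@"; }

# Five commits; the third one introduces the "breakage".
for i in 1 2 3 4 5; do
    if [ "$i" -ge 3 ]; then echo fail > status; else echo pass > status; fi
    echo "$i" > counter
    g add status counter
    g commit -q -m "commit $i"
done

good=$(git rev-list --max-parents=0 HEAD)    # root commit, known good
git bisect start HEAD "$good" >/dev/null     # bad revision first, then good
# The command exits 0 on good revisions and non-zero on bad ones.
git bisect run sh -c 'grep -q pass status' >/dev/null
first_bad=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "first bad: $first_bad"

cd / && rm -rf "$repo"
```

On a slow platform the run command is the expensive part, so limiting it to one failing recipe rather than the whole suite shortens each step considerably.
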

@rsbeckerca
Contributor Author

Also, my team has decided that we are going to officially (and publicly) drop support for SPT as part of 3.2, including deleting the configurations for it. There is a new threading model coming that I hope to add. It should just be a new/replacement configuration in 50-nonstop.conf once we have certified it, which should be around March.

@nhorman
Contributor

nhorman commented Nov 22, 2023

Thanks for the effort in trying to get it to work, but given the issues we've looked at here, that sounds like a very reasonable approach

@rsbeckerca
Contributor Author

> Thanks for the effort in trying to get it to work, but given the issues we've looked at here, that sounds like a very reasonable approach

I have essentially given up on SPT for the long term. Thank you for all of your support, though. A PR is coming with resolutions to all of this, once I am done testing, including SPT deprecation.

@nhorman
Contributor

nhorman commented Nov 23, 2023

thank you!

rsbeckerca added a commit to rsbeckerca/openssl that referenced this issue Nov 23, 2023
This fix removes explicit support for the SPT threading model in configurations.
This also reverts commit f63e1b4 that were
required for SPT but broke other models.

Fixes: openssl#22798

Signed-off-by: Randall S. Becker <randall.becker@nexbridge.ca>
mattcaswell added the "triaged: bug" label and removed the "issue: bug report" label on Nov 28, 2023
rsbeckerca changed the title from "Large breakage on NonStop - No plan found in TAP output" to "Large breakage on NonStop - No plan found in TAP output [master, 3.2]" on Dec 6, 2023
@rsbeckerca
Contributor Author

This situation persists for 3.2 and master branches.

@rsbeckerca
Contributor Author

It is likely that PR #22807 will solve this, but it is pending the SPT deprecation with conditions discussed by the OTC. As of today, no request to keep SPT around has crossed my feeds.

@nhorman
Contributor

nhorman commented Dec 6, 2023

Nor mine, but I have a reminder set up to expire after the two-week period, at which point I'll merge that to the stable branches.

openssl-machine pushed a commit that referenced this issue Dec 12, 2023
This fix removes explicit support for the SPT threading model in configurations.
This also reverts commit f63e1b4 that were
required for SPT but broke other models.

Fixes: #22798

Signed-off-by: Randall S. Becker <randall.becker@nexbridge.ca>

Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Neil Horman <nhorman@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from #22807)

(cherry picked from commit 5cd1792)
@rsbeckerca
Contributor Author

Thanks, team. The build/test cycle worked for master and 3.2. Much appreciated.

wbeck10 pushed a commit to wbeck10/openssl that referenced this issue Jan 8, 2024
This fix removes explicit support for the SPT threading model in configurations.
This also reverts commit f63e1b4 that were
required for SPT but broke other models.

Fixes: openssl#22798

Signed-off-by: Randall S. Becker <randall.becker@nexbridge.ca>

Reviewed-by: Richard Levitte <levitte@openssl.org>
Reviewed-by: Neil Horman <nhorman@openssl.org>
Reviewed-by: Hugo Landau <hlandau@openssl.org>
(Merged from openssl#22807)