Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(build): upgraded openssl to latest in Cargo.toml after resolving … #5565

Closed
wants to merge 18 commits into from

Conversation

chefsale
Copy link
Contributor

@chefsale chefsale commented Dec 1, 2021

…buildkite issues and package upgrades.

Ran:

╰─ cargo update -p openssl                                                                                                            ─╯
info: syncing channel updates for '1.56.0-aarch64-apple-darwin'
info: latest update on 2021-10-21, rust version 1.56.0 (09c42c458 2021-10-18)
info: component 'rustfmt' for target 'aarch64-apple-darwin' is up to date
info: downloading component 'rust-std' for 'wasm32-unknown-unknown'
info: installing component 'rust-std' for 'wasm32-unknown-unknown'
    Updating crates.io index
    Updating git repository `https://github.com/near/paperclip`
    Updating openssl v0.10.36 -> v0.10.38
    Updating openssl-src v111.16.0+1.1.1l -> v300.0.2+3.0.0
    Updating openssl-sys v0.9.67 -> v0.9.71

Related issue: #5109
Blocker: buildkite/elastic-ci-stack-for-aws#970

Copy link
Collaborator

@frol frol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would raise a concern of migrating from openssl 1.1.x to openssl 3 would it be dynamically linked, but since we already statically link it, there should be no issue with the upgrade since it is self-contained

@chefsale
Copy link
Contributor Author

chefsale commented Dec 2, 2021

Aftefr openssl upgrade, test test_health_ok is failing, does anybody have a clue why would that be the case? @frol @bowenwang1996 @matklad

Seems like it panic on: thread 'test_health_ok' panicked at 'assertion failed: health.is_ok()', chain/jsonrpc/jsonrpc-tests/tests/rpc_query.rs:449:9

@matklad
Copy link
Contributor

matklad commented Dec 2, 2021

better error message:



thread 'test_health_ok' panicked at 'assertion failed: `(left == right)`
--
  | left: `Err(RpcError { error_struct: Some(HandlerError(Object({"name": String("NO_NEW_BLOCKS"), "info": Object({"elapsed": Object({"secs": Number(18), "nanos": Number(571333588)})})}))), code: -32000, message: "Server error", data: Some(Object({"name": String("NO_NEW_BLOCKS"), "info": Object({"elapsed": Object({"secs": Number(18), "nanos": Number(571333588)})})})) })`,
  | right: `Ok(())`', chain/jsonrpc/jsonrpc-tests/tests/rpc_query.rs:449:9

@matklad
Copy link
Contributor

matklad commented Dec 2, 2021

Spend five minutes looking into this, feels rather mysterious:

  • the failure reproduces locally every time (cargo t -p near-jsonrpc-tests --test rpc_query fails with this PR, but works on master)
  • the failure is related to interractions between the tests. Specifcally, cargo t -p near-jsonrpc-tests --test rpc_query -- health_ok passes

Probably won't look into this further today.

@chefsale
Copy link
Contributor Author

chefsale commented Dec 3, 2021

@matklad Can you help with this one, as I'm at capacity so would appreciate some help :D

@pmnoxx pmnoxx assigned pmnoxx and unassigned pmnoxx Dec 3, 2021
@matklad
Copy link
Contributor

matklad commented Dec 8, 2021

On my plate for tomorrow!

pmnoxx
pmnoxx previously approved these changes Dec 8, 2021
Copy link
Contributor

@pmnoxx pmnoxx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@matklad
Copy link
Contributor

matklad commented Dec 9, 2021

Made some progress:

Running just four tests in parallel with openssl 3 fails:

test_status
test_status_fail
test_health_fail_no_blocks
test_health_ok

The reason the health_ok fails is that is is just slow under open ssl when running concurrently -- the test just takes more than 2 seconds, so, by the time we are actually sending a requenst, the chain thinks it is stalled.

Now why openssl causes a massive slowdown on startup (40ms -> 1s) is something I am still investigating.

@matklad
Copy link
Contributor

matklad commented Dec 9, 2021

Narrowed this down! This line (telemetry)

Self::new(TelemetryConfig::default())

goes from 40ms to 1s due to openssl upgrade

@matklad
Copy link
Contributor

matklad commented Dec 9, 2021

I think this is what we are hitting here:

https://www.spinics.net/lists/openssl-users/msg14308.html

@matklad
Copy link
Contributor

matklad commented Dec 9, 2021

Ok, so the test failure highlights a real problem: with the new openssl, our startup can become significantly (1.5 seconds) slower. The root culprit is that one of the openssl initialization routines got unreasonably slower in openssl 3.0.0 (sfackler/rust-openssl#1576). This is exacerbated by the fact that actix web client which we use for telemetry calls this initialization three times (actix/actix-web#2502).

Our options:

  • Decide that slower startup is OK, fix the test in some crude way (by increasing the timeout), and upgrade openssl
  • Decide that slower startup is a bug, pin older openssl in Cargo.toml/Cargo.lock at least until awc issue is fixed

I need some help with evaluation this, as I don't really know if sticking to older openssl is acceptable for us.

@pmnoxx pmnoxx dismissed their stale review December 9, 2021 18:12

@matklad Unless those are security related issues. It's better to keep the old version.

@bowenwang1996
Copy link
Collaborator

Decide that slower startup is a bug, pin older openssl in Cargo.toml/Cargo.lock at least until awc issue is fixed

I think this makes more sense

@matklad
Copy link
Contributor

matklad commented Dec 10, 2021

The important bit I was missing is that openssl 1.1.1 is still supported, and is going to be supported for some time:

Version 1.1.1 will be supported until 2023-09-11 (LTS).

https://www.openssl.org/policies/releasestrat.html

It seems like rust-openssl might revert 3.0.0 as well: sfackler/rust-openssl#1576

So, the course of action here is essentially "do-nothing":

  • we already have good version of openssl in Cargo.lock
  • I don't think pining in Cargo.toml makes sense -- there's no specific Cargo.toml where this belongs, and picking versions of the things should generally be the responsibility of the downstream project with a Cargo.lock
  • I'll keep an eye on rust-openssl

@matklad matklad closed this Dec 10, 2021
matklad added a commit to matklad/nearcore that referenced this pull request Dec 10, 2021
near-bulldozer bot pushed a commit that referenced this pull request Dec 10, 2021
@Ekleog-NEAR Ekleog-NEAR deleted the upgrade-openssl branch March 29, 2024 15:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants