Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crypto: revamp C code [WIP] #2095

Merged
merged 139 commits into from
Feb 4, 2019
Merged

crypto: revamp C code [WIP] #2095

merged 139 commits into from
Feb 4, 2019

Conversation

hogand
Copy link
Contributor

@hogand hogand commented Jan 12, 2019

Added comprehensive error checking for Erlang enif and OpenSSL calls.

Revamped all of the C code so it uses a common idiom of incremental allocation and a single path to free resources to avoid memory leaks and double frees. It's also friendly to tools like Coccinelle.

Mostly kept the same logic as the original code but simplified in some places to reduce nesting. Also, added some explicit casts and reworked some type issues so the code has very few warnings.

Tested with crypto_SUITE on macOS with OpenSSL 1.1.1a, OpenSSL 0.9.8zh and LibreSSL 2.8.3.

WIP (needs more review) but it would be nice to throw this into the nightly regression tests to see what the fallout is across various OSes and TLS library versions. It would be nice to crank up the warnings to see if there's anything useful. I was using these extra CFLAGS in Makefile.in for clang:

CFLAGS = @DED_CFLAGS@ @SSL_FLAGS@ -Weverything -Wformat=2 -Wno-system-headers

Happy to split this up into multiple PRs as well. Almost all of the commits are per function and independent of each other.

Make it NULL safe.
Make it NULL safe.
Make it NULL safe.
Make it NULL safe.
A number of files will do bounds checking.
* Add error checking on all OpenSSL and Erlang calls.
* Add additional error checking.
* pkey is only set on success.
* Bounds check key.size before casting.
* Fix a possible memory leak where enif_alloc_binary() wasn't always
  converted into a term or freed in the error path.
* Check return values on all OpenSSL calls.
* Remove trailing semicolon after function definition.
* Check return values on all OpenSSL and Erlang calls.
* Remove trailing semicolon after function definition.
* Bug fix: ECDH_compute_key was not being checked for failures
  - That function returns 0 on failure and never returns < 0.
  - Using <= 0 check because OpenSSL uses that internally and on their wiki.
* Remove unnecessary variable i
* Make the gotos always flow downward rather than jumping back.
* Simplify logic by having incremental allocation and only free on error
  in one place.
* Add error checking on all OpenSSL calls.
* Make it explicit when you need to be careful with non-ref counted pointers.
  - set0 calls will save the pointer without reference counting.
  - On success, set pointers to NULL to avoid double frees since the struct
    is now responsible for freeing the resources.
* Simplify logic by having incremental allocation and only free on error on one place.
* Add error checking on all OpenSSL calls.
* Make it explicit when you need to be careful with non-reference counted pointers.
  - set0 calls will use the pointer values without ref counting.
  - On success, set pointers to NULL to avoid double frees since the struct
    is now responsible for freeing the resources.
* Change the parameter from int to size_t.
  - Only caller doesn't need to change since it was already passing sizeof().
* Add unsigned wrapping checks.
* Add error checking and use enif_make_badarg() on error.
* Use size_t when using sizeof().
* Move initialization away from declaration so it's not as easy to miss.
* Add bounds check before casting.
* Add error checking for all OpenSSL calls.
* Add error handling for all OpenSSL calls.
* Bounds check before casting.
* Add bounds checking.
* Add error checking for OpenSSL calls.
* Only set argument *bnp on success.
* Add bounds checking.
* Add error checking for OpenSSL calls.
* Only set *bnp on success.
* Add error handling for all OpenSSL calls
  - There was nothing returned on error before so use enif_make_badarg().
* Add bounds checking before casting.
* Add error checking for OpenSSL calls.
* Change dlen from unsigned to signed since BN_num_bytes is int.
* Add bounds checking before casting.
* Consolidate all freeing to one location on error or success.
* Add error handling for OpenSSL calls.
* Change dlen to signed since BN_num_bytes() returns int.
* Use enif_make_badarg() on error since it only returned
  undefined before in one type of error.
* Simplify error checking.
* Add bounds checking before casting.
* Add error checking for all OpenSSL calls.
* Add error handling for all OpenSSL calls.
  - However, disable custom crypto mem functions for LibreSSL
    since it has never supported it.
cipher = cipherp->cipher.p;
if (!cipher) {
if (argc != 4 && argc != 5)
goto bad_arg;
Copy link
Contributor

@sverker sverker Jan 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Throwing badarg on invalid argc to NIFs are pointless.
Calling a NIF with wrong number of arguments are as doable as calling a pure Erlang functions with wrong number of arguments.
Either remove such checks or replace them with (debug) asserts crashing the VM.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, ok. I can remove them.


done:
if (bn_base)
BN_free(bn_base);
Copy link
Contributor

@RoadRunnr RoadRunnr Jan 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

according to the OpenSSL man page, passing a NULL pointer to BN_free is valid and correct. The check is therefor not needed.

Copy link
Contributor Author

@hogand hogand Jan 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't just look at the master man page though. The problem is Erlang is trying to support multiple OpenSSL versions including unmaintained branches. To be able to call that safely, I would have to review all of the versions that Erlang supports. I don't know what the lower limit was to go digging into it. Some functions have become more NULL safe over time.

if (bn_result)
BN_free(bn_result);
if (bn_ctx)
BN_CTX_free(bn_ctx);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

according to the OpenSSL man page, passing a NULL pointer to BN_CTX_free is valid and correct. The check is therefor not needed.

Copy link
Contributor Author

@hogand hogand Jan 14, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar issue here and with all of the free calls. It's technically correct to have an if check before calling it. Removing the check means I'm asserting that it's NULL safe for all versions of OpenSSL that Erlang supports. Those kind of removals could be in a later commit when someone signs up to check all of the versions of OpenSSL that Erlang supports. I don't know how far back to go.

bn_len = BN_num_bytes(bn);
bin_ptr = enif_make_new_binary(env, bn_len, &term);
BN_bn2bin(bn, bin_ptr);
if ((bn_len = BN_num_bytes(bn)) < 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How could this possible be negative? The man pages does not say that it can return < 0 and the source code in OpenSSL suggests that it can not return a negative value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to remove that check and still cast to size_t, I would need to check all versions of OpenSSL that Erlang supports and be confident that they never introduce a negative value as an error condition in the future. The man page does not say it cannot return < 0. The man page doesn't say anything about possible error conditions.

if ((bin_ptr = enif_make_new_binary(env, (size_t)bn_len, &term)) == NULL)
goto err;

if (BN_bn2bin(bn, bin_ptr) < 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, I can't understand how this could possibly be negative.

@HansN
Copy link
Contributor

HansN commented Jan 14, 2019

I run with OpenSSL-1.1.1 (not 1.1.1a) and Linux 3.13.0-147-generic #196-Ubuntu SMP Wed May 2 15:51:34 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

When running crypto/test/engine_SUITE there is a coredump and some faults which disappears when I revert the commits

b645e20
64515f3
d444902

Then there is a coredump in ssh/test/ssh_algorithms_SUITE after a short while.
When I revert

1c540d4

(and re-define put_int32) that coredump disappears.

I do not add the branch to the internal tests yet, I want to test a bit more on my Linux first.

* Add if checks and update coccinelle script.
* Need to keep a reference even though ownership was transfered to dh_params.
* Also, be more conservative and return atom_error where the original
  code did.
* This was documented as sending in an argument but it doesn't read argv.
@HansN
Copy link
Contributor

HansN commented Jan 15, 2019

There are errors for all engine test cases, but it does not coredump. The other test suites as well as ssh test suites works without problems. I've tested on Linux with OpenSSL 1.1.1, 1.0.2k, 1.0.0i and 0.9.8zh.
The branch is now added to our nightly tests. I'll continue to track the engine problems later today or tomorrow.

@HansN
Copy link
Contributor

HansN commented Jan 15, 2019

With 0256606 also the engine suite runs without error on my local machine! So no error seem to be introduced by this PR now. Let's see tomorrow when all test machines has tested it...

enif_raise_exception(env, atom_notsup);
goto out_err;
enif_raise_exception(env, atom_notsup);
goto out_err;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got compilation error:
06134: ec.c: In function ‘ec_key_new’:
06135: ec.c:102:9: error: label ‘out_err’ used but not defined
06136: goto out_err;
06137: ^

@HansN
Copy link
Contributor

HansN commented Jan 16, 2019

All testing on our internal machines looks good, except for the compilation error in ec.c commented in that file!

7cb2e25 and f609841 were not included but added for tonight.

Tested with19 different OpenSSL and 1 LibreSSL on about 25 different machines with linux, darwin, windows, openbsd, freebsd and sunos of varying versions.

@HansN
Copy link
Contributor

HansN commented Jan 21, 2019

@hogand This PR has run our tests all the weekend without problems, so I think it could be merged now. I just want to check if you want to rearrange/rebase the commits or something before I merge?

@hogand
Copy link
Contributor Author

hogand commented Jan 21, 2019

@HansN I'd like to take a few days to review it again to double check that everything looks ok.

@HansN
Copy link
Contributor

HansN commented Jan 22, 2019

@hogand No problem!

@IngelaAndin IngelaAndin added the waiting waiting for changes/input from author label Jan 22, 2019
@HansN
Copy link
Contributor

HansN commented Feb 1, 2019

@hogand I'll need to merge this now on Monday, so if you have some commits you want to add, please push them.

@hogand
Copy link
Contributor Author

hogand commented Feb 3, 2019

@HansN I have a few more commits, but none of them are bug fixes. I'll open up a separate PR for the new commits.

@HansN HansN merged commit 784fb8d into erlang:master Feb 4, 2019
@HansN
Copy link
Contributor

HansN commented Feb 4, 2019

Thanks for the PR!

@HansN HansN removed the waiting waiting for changes/input from author label Feb 4, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team:PS Assigned to OTP team PS
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants