Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

If an engine comes up explicitely, it must also come down explicitely #1643

Closed
wants to merge 3 commits into from

Conversation

levitte
Copy link
Member

@levitte levitte commented Sep 28, 2016

In apps/apps.c, one can set up an engine with setup_engine().
However, we freed the structural reference immediately, which means
that for engines that don't already have a structural reference
somewhere else (because it's a built in engine), we end up returning
an invalid reference.

Instead, the function release_engine() is added, and called at the end
of the routines that call setup_engine().

@levitte
Copy link
Member Author

levitte commented Sep 28, 2016

@dwmw2, would you please make a few tries with this, see if that fixes the issues you're seeing?

@levitte
Copy link
Member Author

levitte commented Sep 28, 2016

(#1644 is the same for 1.0.2)

@dwmw2
Copy link
Contributor

dwmw2 commented Sep 28, 2016

==128751== Invalid read of size 8
==128751==    at 0x5A546A1: check_key_fork (p11_front.c:150)
==128751==    by 0x5A550BD: PKCS11_private_encrypt (p11_front.c:432)
==128751==    by 0x571292: RSA_private_encrypt (rsa_crpt.c:37)
==128751==    by 0x430C34: rsautl_main (rsautl.c:240)
==128751==    by 0x4215F7: do_cmd (openssl.c:500)
==128751==    by 0x420CEA: main (openssl.c:177)
==128751==  Address 0x5760b68 is 40 bytes inside a block of size 48 free'd
==128751==    at 0x4C2CD5A: free (vg_replace_malloc.c:530)
==128751==    by 0x5419A3: CRYPTO_free (mem.c:179)
==128751==    by 0x5A7F7FD: CRYPTO_free (mem.c:166)
==128751==    by 0x5A51911: pkcs11_destroy_keys (p11_key.c:551)
==128751==    by 0x5A5340D: pkcs11_destroy_token (p11_slot.c:519)
==128751==    by 0x5A54100: pkcs11_release_slot (p11_slot.c:451)
==128751==    by 0x5A544E6: pkcs11_release_all_slots (p11_slot.c:431)
==128751==    by 0x5A4E78D: pkcs11_finish (eng_back.c:269)
==128751==    by 0x517A74: engine_unlocked_finish (eng_init.c:60)
==128751==    by 0x517C10: ENGINE_finish (eng_init.c:101)
==128751==    by 0x405E64: load_key (apps.c:705)
==128751==    by 0x43098E: rsautl_main (rsautl.c:177)
==128751==  Block was alloc'd at
==128751==    at 0x4C2BBAD: malloc (vg_replace_malloc.c:299)
==128751==    by 0x541750: CRYPTO_malloc (mem.c:92)
==128751==    by 0x541816: CRYPTO_realloc (mem.c:113)
==128751==    by 0x5A7F65F: CRYPTO_realloc (mem.c:110)
==128751==    by 0x5A50D4F: pkcs11_init_key.isra.0 (p11_key.c:497)
==128751==    by 0x5A51AD5: pkcs11_next_key (p11_key.c:458)
==128751==    by 0x5A51AD5: pkcs11_find_keys (p11_key.c:436)
==128751==    by 0x5A51AD5: pkcs11_enumerate_keys (p11_key.c:398)
==128751==    by 0x5A4DF19: pkcs11_load_key (eng_back.c:767)
==128751==    by 0x5A4E89B: pkcs11_load_private_key (eng_back.c:842)
==128751==    by 0x5191AE: ENGINE_load_private_key (eng_pkey.c:75)
==128751==    by 0x405E54: load_key (apps.c:704)
==128751==    by 0x43098E: rsautl_main (rsautl.c:177)
==128751==    by 0x4215F7: do_cmd (openssl.c:500)

As I said in #1642, probably an engine_pkcs11 bug. Even so, can we fix it this way in 1.1.0 and 1.0.2? I had been thinking that we couldn't.... but actually, engine_pkcs11 doesn't suffer the above use-after-free until the latest git, where we made it stop registering default key methods. Until then, it was pinned because its methods were default. So as long as we fix this bug in engine_pkcs11 and make it properly refcount its keys before the next release, maybe it's OK for you to ship your version in a stable OpenSSL release?

Shipping my version where we hold a functional reference not a structural reference might still be safer though, even if we can persuade ourselves it's OK for engine_pkcs11.

@levitte
Copy link
Member Author

levitte commented Sep 28, 2016

The trouble you're having is here. The EVP_PKEY that's constructed should have a functional reference to the engine used, but there's no way for that to happen. It's crucial that this is done properly.

Sorry to say this, but I guess that's a long standing bug on the engine_pkcs11 side of things, you've just been lucky that there was a reference of some sort hiding it...

@levitte
Copy link
Member Author

levitte commented Sep 28, 2016

This line here in libp11 should really be:

    rsa = RSA_new_method(eng);

(given en input ENGINE *eng)

@dwmw2
Copy link
Contributor

dwmw2 commented Sep 29, 2016

Sure, I understand the nature of the problem (although thanks for pointing me at the specific fix; I hadn't quite got there yet).

But since we're hopefully looking at applying this to the stable 1.0.2 and 1.1 trees too, my concern was whether we want to do it in a way that triggers such "long standing bugs". Why are you so keen to return a structural reference from setup_engine(), and not a functional reference as I had — given that it potentially triggers new bugs that weren't being triggered before?

I had chosen to use a functional reference because that basically mimics the existing assumptions — it only ever worked when ENGINE_set_default() had caused a functional reference to be held, and now you've changed that so that we are taking and dropping the functional reference at runtime in a new and exciting (at least for engine_pkcs11) way.

As I said, for engine_pkcs11 perhaps it's OK — because there is no release yet which doesn't register those bogus generic EC and RSA methods, and which doesn't thus get a functional reference pinned anyway via ENGINE_set_default(). So as long as we fix the EVP_PKEY engine refcounting bug in libp11 before we do our next release, we'll be fine.

For the general case of engines without default methods which have similar refcounting bugs... maybe it's valid to observe that they never worked anyway, as we're only changing the failure mode in a stable release and that's OK?

But still, even for that I think you'd have to really want it to be only a structural reference, not a functional one. And I haven't seen a really good reason to want that. It's not supposed to be a torture test for engines :)

@levitte
Copy link
Member Author

levitte commented Sep 29, 2016

For the general case of engines without default methods which have similar refcounting bugs... maybe it's valid to observe that they never worked anyway, as we're only changing the failure mode in a stable release and that's OK?

I can't claim to have seen every engine in the world... however, of all I've seen, engine_pkcs11 / libp11 is the only one so far where I've seen this particular reference bug.

The problem comes when some other application does things "the right way" (i.e. ENGINE_init() and ENGINE_finish() are called around the actual use of engine functionality)... will they have to be changed to hold a global functional reference as well?

But still, even for that I think you'd have to really want it to be only a structural reference, not a functional one.

I'll have you note that there is a functional reference around key loading, 49e476a makes sure of that.

@dwmw2
Copy link
Contributor

dwmw2 commented Sep 29, 2016

The problem comes when some other application does things "the right way" (i.e. ENGINE_init() and ENGINE_finish() are called around the actual use of engine functionality)... will they have to be changed to hold a global functional reference as well?

Not 'changed to'. In the context of this discussion, it was more about whether they should continue to do it that way in a stable release series, if that's they way they were already behaving.

With the noted caveat that in our case, it didn't work before either; it was just differently broken before. Your version is fine; I just wanted to double-check that the compatibility implications for the stable release were fully considered. Thanks.

I'll have you note that there is a functional reference around key loading, 49e476a makes sure of that.

And then you drop the reference immediately, while you still hold the key. That's the behaviour I was considering "new and exciting" for a stable release. But you said that engine bug isn't common and you've never seen it anywhere but in engine_pkcs11, and we know it was masked in engine_pkcs11 by other bugs until this week anyway, so it's never been in a release. So "new and exciting" is absolutely fine by me.

@levitte
Copy link
Member Author

levitte commented Sep 29, 2016

The problem comes when some other application does things "the right way" (i.e. ENGINE_init() and ENGINE_finish() are called around the actual use of engine functionality)... will they have to be changed to hold a global functional reference as well?

Not 'changed to'. In the context of this discussion, it was more about whether they should continue to do it that way in a stable release series, if that's they way they were already behaving.

I'm not sure I understand what you mean. Are you saying that applications usually do hold an explicit global functional reference (i.e. they call ENGINE_init() somewhere early and call ENGINE_finish() just before exiting, like https://github.com/OpenSC/libp11/blob/master/tests/evp-sign.c does?
If they do, I don't see why they should change, there's no actual harm in that per se, it's just not necessary.

@levitte
Copy link
Member Author

levitte commented Sep 29, 2016

Your version is fine; I just wanted to double-check that the compatibility implications for the stable release were fully considered. Thanks.

Noted, and thank you.

Now, all I need is another team member reviewing this stuff.

@dwmw2
Copy link
Contributor

dwmw2 commented Sep 29, 2016

(Potentially flogging a dead horse here, since I've already conceded the point. But since you asked after I had done so... and since when I'm done yak-shaving on this latest occurrence of "I'll just show someone how to do this trivial thing with PKCS#11.... oh f%ck it's all f%cking broken" I get to go back to the joyous task of writing test cases for my NSS support of PKCS#11 URIs... I'll tarry a while longer ☺)

I'm not sure I understand what you mean. Are you saying that applications usually do hold an explicit global functional reference (i.e. they call ENGINE_init() somewhere early and call ENGINE_finish() just before exiting, like https://github.com/OpenSC/libp11/blob/master/tests/evp-sign.c does?
If they do, I don't see why they should change, there's no actual harm in that per se, it's just not necessary.

Nonono, I'm saying that if, in a stable series, they have previously held that long-term reference (as was the openssl s_client model in the cases where it was actually working, because SSL_set_default() bumped it and then it stayed for the duration) , then they should not stop doing that, and start calling ENGINE_finish() as soon as they can to turn it into an engine torture test, in that stable series :)

But I have already conceded the point that for openssl s_client and friends, you can probably get away with that "torture test". Because any engine which provides default methods will not see any difference, and any engine which doesn't was not working before anyway, so you've only swapped one failure mode for (potentially, and unlikely) another.

(Note s_client still broken in 1.1.0 because commit a6972f3 wants backporting)

@levitte
Copy link
Member Author

levitte commented Sep 29, 2016

Nonono, I'm saying that if, in a stable series, they have previously held that long-term reference (as was the openssl s_client model in the cases where it was actually working, because SSL_set_default() bumped it and then it stayed for the duration) , then they should not stop doing that, and start calling ENGINE_finish() as soon as they can to turn it into an engine torture test, in that stable series :)

There is a slight misunderstanding here. ENGINE_set_default() (not SSL_set_default()) will bump the functional reference counter if there is default functionality provided by the engine. Since your recent change in engine_pkcs11 stopped providing this default functionality, a functional reference to hold those is not needed any more, that's why it's suddenly disappearing. The openssl application hasn't changed its behavior on that front, and the issue was rather that it put too much trust into engines to always provide some default functionality. That was obviously buggy, and you helped us uncover that.

(Note s_client still broken in 1.1.0 because commit a6972f3 wants backporting)

Now done.

@dwmw2
Copy link
Contributor

dwmw2 commented Sep 29, 2016

That was a typo not a misunderstanding fwiw. Brain said ENGINE_set_default () but fingers typed SSL.

For default functionality... now we really need your guidance in OpenSC/libp11#110

@levitte
Copy link
Member Author

levitte commented Sep 29, 2016

For 1.1.0, we've figured out / reminded ourselves that this is not strictly necessary, as engines without any registered cipher or hash methods is impossible... or, well, not strictly impossible, but entirely useless and/or buggy, given the intended ENGINE structure. It's a different story with 1.0.2, since the engine implementor has full access to all structures, and can therefore hack around OpenSSL limitations much more.

@levitte
Copy link
Member Author

levitte commented Sep 29, 2016

I'm closing this one and starting it over. It's not necessary for 1.1.0, as registering any algorithm is mandatory (either to be able to retrieve a key or to use the engine for crypto acceleration).

@dwmw2
Copy link
Contributor

dwmw2 commented Oct 5, 2016

For now, we're making engine_pkcs11 just register a default RSA method (not EC; it only has to be default for something) and pass through to the OpenSSL internal methods for "alien" keys.

However...

It's not necessary for 1.1.0, as registering any algorithm is mandatory (either to be able to retrieve a key or to use the engine for crypto acceleration).

I think we can make that work even for 1.1. Even though RSA_new_method() and EC_KEY_new_method() are unusable for engines which don't provide a generic method even for alien keys, we can simulate it by calling ENGINE_init() for ourselves, and then ENGINE_finish() from our RSA/EC_KEY finish methods. As long as we are sure that the ENGINE_finish() is a tail-call from the finish method, of course. :)

And right now, engine_pkcs11 is unloadable anyway because it never releases all its references to the keys it has. We've got plenty of work to do internally to fix its refcounting, before talk of releasing the engine even makes any sense.

So right now, apart from the use-after-free issue in openssl(1), engine_pkcs11 can just work without registering any defaults.

dwmw2 pushed a commit to OpenSC/libp11 that referenced this pull request Oct 5, 2016
Since commit f160e2f ("Gracefully handle alien RSA keys") it's OK for the
engine to be set as the default method for RSA keys. And since the openssl
command line tool in many released versions forgets to do ENGINE_init(), it
*only* works if the engine is default for *something*, as discussed in
openssl/openssl#1643

This reverts the RSA part of commit b9b7941.

Closes #118
In apps/apps.c, one can set up an engine with setup_engine().
However, we freed the structural reference immediately, which means
that for engines that don't already have a structural reference
somewhere else (because it's a built in engine), we end up returning
an invalid reference.

Instead, the function release_engine() is added, and called at the end
of the routines that call setup_engine().
@levitte
Copy link
Member Author

levitte commented Oct 19, 2016

Did a rebase to see that the builds all go through

@levitte
Copy link
Member Author

levitte commented Oct 19, 2016

Inspired by the comments in #1644, I did a small bit of cleanup here as well. Still fine, @richsalz ?

@richsalz
Copy link
Contributor

nice. +1

levitte added a commit that referenced this pull request Oct 20, 2016
In apps/apps.c, one can set up an engine with setup_engine().
However, we freed the structural reference immediately, which means
that for engines that don't already have a structural reference
somewhere else (because it's a built in engine), we end up returning
an invalid reference.

Instead, the function release_engine() is added, and called at the end
of the routines that call setup_engine().

Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from #1643)
levitte added a commit that referenced this pull request Oct 20, 2016
… always

Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from #1643)
levitte added a commit that referenced this pull request Oct 20, 2016
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from #1643)
levitte added a commit that referenced this pull request Oct 20, 2016
In apps/apps.c, one can set up an engine with setup_engine().
However, we freed the structural reference immediately, which means
that for engines that don't already have a structural reference
somewhere else (because it's a built in engine), we end up returning
an invalid reference.

Instead, the function release_engine() is added, and called at the end
of the routines that call setup_engine().

Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from #1643)
(cherry picked from commit dd1abd4)
levitte added a commit that referenced this pull request Oct 20, 2016
… always

Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from #1643)
(cherry picked from commit 907c6c8)
levitte added a commit that referenced this pull request Oct 20, 2016
Reviewed-by: Rich Salz <rsalz@openssl.org>
(Merged from #1643)
(cherry picked from commit b85bf63)
@levitte
Copy link
Member Author

levitte commented Oct 20, 2016

Merged. Thanks for the review, guys!

@levitte levitte closed this Oct 20, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch: master Merge to master branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants