Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[crypto/OCSP] parse response certID hashAlgorithm #90

Closed
wants to merge 1 commit into from

Conversation

t-lo
Copy link

@t-lo t-lo commented Feb 15, 2019

This patch implements parsing of the hashAlgorithm certID field
of incoming OCSP responses as well as validation of the field
value against the respective request field hashAlgorithm. The
remaining fields of the response certID are validated against
the request via memcpy().

The certID part of the OCSP response is defined as an ASN.1
sequence:

CertID ::= SEQUENCE {
hashAlgorithm AlgorithmIdentifier,
issuerNameHash OCTET STRING, -- Hash of issuer's DN
issuerKeyHash OCTET STRING, -- Hash of issuer's public key
serialNumber CertificateSerialNumber }

(see https://tools.ietf.org/html/rfc6960#appendix-B.1 )

Parsing hashAlgorithm improves the previous implementation which
used memcmp() to compare the certID memory (raw bits) of request
and response. As ASN.1 semantics are ignored by memcmp(), bitwise
different but semantically identical certIDs were rejected by the
previous implementation. This caused e.g. boot failures of OS images
downloaded via HTTPS.

We at Kinvolk use IPXE for pulling Flatcar Linux images when
bootstrapping container clusters. For security reasons we use
HTTPS for downloading images. We have observed numerous
download failures in the recent past because of the above issue,
depending on which OCSP server from a CDN responded to the
validation request. Some OCSP servers would respond with the
original CertID bits while others would use different bits, making
the issue hard to track down as it appeared and disappeared
seemingly at random.

Signed-off-by: Thilo Fromm thilo@kinvolk.io

@NiKiZe
Copy link
Contributor

NiKiZe commented Feb 15, 2019

Awesome to see this!
What is the resulting binary size difference between old and new code?
Especially BIOS builds are limited in memory, and some ROMs even more so, But would be good if we could rule that out as a reason to not take this.

@t-lo
Copy link
Author

t-lo commented Feb 15, 2019

This change will only be compiled into the resulting binary when HTTPS support is switched on via config/general.h. By default it's off.

Here's a size comparison of the ipxe binaries for plain (no HTTPS support), https-old (with the memcpy()), and https-fixed (with the above change):

plain/:
 328192   ipxe.dsk
1048576   ipxe.iso
 329238   ipxe.pxe
1409024   ipxe.usb

https-old/:
 339968  ipxe.dsk
1048576  ipxe.iso
 341181  ipxe.pxe
1409024  ipxe.usb

https-fixed/:
 340480  ipxe.dsk
1048576  ipxe.iso
 341105  ipxe.pxe
1409024  ipxe.usb

@NiKiZe
Copy link
Contributor

NiKiZe commented Feb 16, 2019

I'm confused how ipxe.pxe seems smaller in your tests with this branch(?)
my own tests for buildsize ...
this branch is 83c744d
master is 36a4c85
I find it weird that no https versions seems to be affected (sometimes) as well, I have tried several times now and always getting the same result, so not sure what's going on - this is not important from the code/PR standpoint, just an observation that I can't explain ;)

bin/ipxe.pxe
no https master: 331336
no https this branch: 331462
https master: 343210
https this branch: 343509

bin/intel.rom
no https master: 68096
no https this branch: 68096
https master: 82432
https this branch: 82944

bin/undionly.kpxe
no https master: 67195
no https this branch: 67192
https master: 81888
https this branch: 81950

bin-x86_64-efi/snponly.efi
no https master: 155168
no https this branch: 155168
https master: 188544
https this branch: 188928

Hack of script to simplify testing

#dstfile=bin/intel.rom
#dstfile=bin/undionly.kpxe
#dstfile=bin-x86_64-efi/snponly.efi

basemake="make -j16 $dstfile"
git fetch https://github.com/kinvolk/ipxe.git t-lo/ocsp-certid-parsing
git checkout FETCH_HEAD
git rebase origin/master
git log -1  --pretty=oneline | cat
$basemake CONFIG=none > /dev/null; ls -l $dstfile
$basemake CONFIG=https > /dev/null; ls -l $dstfile
git checkout origin/master
git log -1  --pretty=oneline | cat
$basemake CONFIG=none > /dev/null; ls -l $dstfile
$basemake CONFIG=https > /dev/null; ls -l $dstfile

named configs diff:

diff -ur config/local/none/general.h config/local/https/general.h
--- config/local/none/general.h 2019-02-16 14:35:40.217373291 +0100
+++ config/local/https/general.h        2019-02-16 14:35:26.341373344 +0100
@@ -0,0 +1 @@
+#define        DOWNLOAD_PROTO_HTTPS    /* Secure Hypertext Transfer Protocol */

@t-lo
Copy link
Author

t-lo commented Feb 16, 2019

Good point with the ipxe.pxe sizes - however since the overall difference is just a few bytes we may observe unrelated compilation side effects. Also, the compiler version you are using appears to produce slightly larger binaries than mine - the "https master" version of ipxe.pxe is ~2kB larger than my "https this branch" one.

Another way of measuring the potential size increase would be to look at the ocsp.o binary and compare "https master" against "https this branch". ocsp.c is the only file touched by my PR. To prevent measuring unrelated compilation artifacts we would strip the binaries of debugging info.

https master:
ls -la bin/ocsp.o
66100 bin/ocsp.o
strip bin/ocsp.o
ls -la bin/ocsp.o
 4696 bin/ocsp.o

https this branch:
ls -la bin/ocsp.o
73576 bin/ocsp.o
strip bin/ocsp.o
ls -la bin/ocsp.o
 5020 bin/ocsp.o

That's 324 more bytes.

@t-lo
Copy link
Author

t-lo commented Feb 18, 2019

I've updated the PR based on a IRC discussion with Michael Brown. While we still parse and extract the hashAlgorithm field, we use memcmp() for the remainder of certID (issuerNameHash, issuerKeyHash, and serialNumber). The memcmp() starts with, and includes, the type identifier for the issuerNameHash field (0x04, for "OCTET STRING").

@t-lo
Copy link
Author

t-lo commented Feb 21, 2019

@mcb30 Any updates? You've mentioned on IRC you're not yet happy with the patch even after it was updated. Could you please elaborate?

@t-lo
Copy link
Author

t-lo commented Feb 22, 2019

Hello IPXE maintainers - this issue has ongoing impact on our datacenter operations; we also learned that other IaaS customers are affected by this (see issues linked from this PR). We have raised this issue on the IPXE IRC, on the e-mail list, and via e-mail to @mcb30 directly. We also updated the PR immediately after we received initial feedback on IRC.

After updating our PR we were given feedback via IRC that the PR is not perfect yet - without any hints though on what we need to do to improve the patch.

We are actively standing by to implement any improvements suggested to us. Please help us to improve our PR by providing feedback on what you don't like with the code. This issue is causing ongoing pain for us and others. Please help us to resolve this.

@t-lo
Copy link
Author

t-lo commented Feb 23, 2019

Hello,
unfortunately we did not receive a response to our question yesterday. The PR has been updated about a week ago based on @mcb30 's feedback we received on IRC. If you have further feedback and we should make more changes, please let us know.

This issue blocks provisioning our clusters with OS images in a reliable, safe, and secure way. While discussing a fix we learned that it also affects other teams, causing similar impact. Please provide feedback on the patch so we can further improve it. Please help us to fix this issue.

@NiKiZe
Copy link
Contributor

NiKiZe commented Feb 23, 2019

It seems that you have written a fix, so you could build your own version and use that in your environment. Having this PR merged should not prevent you from having it working in your environment?

When devs are busy with other things it can take some time to get things merged.
I do understand that is frustrating when there is little feedback.

After looking at the commits I can say that a squash is required, it should be one commit to fix this correct?
In the mean time it might be worth re-reading the relevant IRC logs to see if something was missed/missunderstood.

Thanks!

@t-lo t-lo force-pushed the t-lo/ocsp-certid-parsing branch 2 times, most recently from 5d6e582 to 62fda20 Compare February 23, 2019 12:07
@t-lo
Copy link
Author

t-lo commented Feb 23, 2019

Hi @NiKiZe

thank you for getting back to us! I've squashed the two commits as you recommended. We cannot simply build our own version of IPXE as we are working with IaaS providers to run our clusters in public cloud offerings. Those providers use, and rely on, upstream IPXE, and they would feel hard-pressed to install a custom binary from a 3rd party (in this case, us) in their datacenters. While we nevertheless explore that path forward, too, it doesn't feel like the ideal way out for anyone involved.

While working with IaaS providers we learned that this issue affects not just us, but other IaaS customers, too - hence our repeated mentioning of ongoing pain. Also, this issue is being referenced by 2 other FOSS projects, netboot and Fedora CoreOS, who also suffer from fall-out.

We understand when devs are busy with other things - we're devs ourselves. All we're asking for is a path forward with our patch. Let us do the work. Tell us what needs improvement.

@stappersg

This comment was marked as abuse.

@t-lo
Copy link
Author

t-lo commented Feb 23, 2019

Hello @stappersg

Thank you for getting back to us! I've updated the pull request based on your feedback. Made the commit message clearer (it actually did not accurately reflect the code because it was not updated with the previous change requested by @mcb30 ), changed // -> /* */, and modified the sources for more clarity on what was added/removed.

@t-lo t-lo changed the title [crypto/OCSP] parse response certID for validation [crypto/OCSP] parse response certID hashAlgorithm Feb 23, 2019
@ipxe-devel
Copy link

ipxe-devel commented Feb 23, 2019 via email

@t-lo
Copy link
Author

t-lo commented Feb 23, 2019

Hello @ipxe-devel ,

Please pardon the confusion should we have given the wrong impression. We do not expect anyone, not the IPXE maintainer(s) nor anyone else, to fix an issue with IPXE that is impacting us and other IPXE users. I am very sorry should I have given the impression that we expect others to fix issues we're impacted from. We readily invest the time and effort to fix it ourselves.

We are relying on IPXE in a 3rd party environment - we provision kubernetes clusters with a container OS in datacenters of public IaaS providers. IPXE is supplied by our IaaS provider in their datacenters to download, install, and boot custom OSes - in our case, Flatcar Linux. The OCSP bug in IPXE blocks us from reliably using HTTPS to pull those OS images at provisioning time, so we root caused and fixed this bug. The IaaS provider's datacenter operators rely on upstream IPXE, so we are very interested in getting our fix upstreamed, and we will eagerly implement any change proposed by IPXE upstream. Please verify from the change history of this pull request that we will immediately jump into action as soon as we get feedback.

I am very sorry should I have given the impression to expect upstream (or others) to fix this issue. Let me state very clearly that we are willing to invest into this fix until it is ready for ingestion. Please help us to mitigate the pain this issue is inflicting on us and on other IPXE users. Please provide feedback on the PR and help us to get it right for upstreaming.

@t-lo
Copy link
Author

t-lo commented Feb 27, 2019

Hello IPXE maintainers,

could we please receive feedback on what additional changes we need to do in order for you to go ahead with our pull request?

@t-lo t-lo force-pushed the t-lo/ocsp-certid-parsing branch from 705f177 to e7f2968 Compare March 1, 2019 10:31
@t-lo
Copy link
Author

t-lo commented Mar 1, 2019

I merged @stappersg proposed changes from #91 into this PR.

Implements parsing of the hashAlgorithm  certID field of incoming OCSP
responses as well as validation of the field value against the respective
request field hashAlgorithm. The remaining fields of the response certID
are validated against the request via memcpy().

The certID part of the OCSP response is defined as an ASN.1 sequence:

CertID ::= SEQUENCE {
 hashAlgorithm           AlgorithmIdentifier,
 issuerNameHash          OCTET STRING, -- Hash of issuer's DN
 issuerKeyHash           OCTET STRING, -- Hash of issuer's public key
 serialNumber            CertificateSerialNumber }

(see https://tools.ietf.org/html/rfc6960#appendix-B.1 )

Parsing hashAlgorithm improves the previous implementation which
used memcmp() to compare the certID memory (raw bits) of request and
response. As ASN.1 semantics are ignored by memcmp(), bitwise different
but semantically identical certIDs were rejected by the previous
implementation. This caused e.g. boot failures of OS images downloaded
via HTTPS.

Authors: Geert Stappers <https://github.com/stappersg>
         Thilo Fromm <Thilo@kinvolk.io>
@t-lo t-lo force-pushed the t-lo/ocsp-certid-parsing branch from e7f2968 to 1ca871b Compare March 1, 2019 10:50
@mcb30
Copy link
Member

mcb30 commented Mar 10, 2019

Fixed via an alternative patch: http://git.ipxe.org/ipxe.git/commitdiff/b6ffe28a2

Please reopen if this patch does not fix the issue.

Thanks.

@mcb30 mcb30 closed this Mar 10, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants