Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Detect endian without relying on defined symbols. #8572

Closed
wants to merge 1 commit into from

Conversation

paulidale
Copy link
Contributor

Avoid using the B_ENDIAN define to determine host endianness.

Fixes #8570

  • tests are added or updated

@paulidale paulidale added branch: master Merge to master branch approval: review pending This pull request needs review by a committer labels Mar 25, 2019
@slontis
Copy link
Member

slontis commented Mar 25, 2019

Err.. B_ENDIAN is also used in bn_lib.c, sha512.c & wp_block.c.
This may not be good

@t-j-h
Copy link
Member

t-j-h commented Mar 25, 2019

If we are going to go down the path of runtime rather than compile time detection this should be done in a single place - so we don't end up copying that fragment all over the place. And we should be working to avoid having any header file (certainly any public header file) that depends on B_ENDIAN ...

@slontis
Copy link
Member

slontis commented Mar 25, 2019

Too late :) I think its already in multiple places :)

@paulidale
Copy link
Contributor Author

It will end up being resolved at compile time, once the optimiser does constant propagation and dead code elimination. Put it into a function and this won't happen. It could be made into a pair of macros.

I'd like to see if this fixes the endian problems in the tests before changing it everywhere.

@richsalz
Copy link
Contributor

Human comprehension is more important than CPU optimization. I agree with @t-j-h

@bernd-edlinger
Copy link
Member

Too late :) I think its already in multiple places :)

In other places B_ENDIAN / L_ENDIAN is used when endianness is guaranteed to be this way,
if neither is defined byte accesses are used.

@levitte
Copy link
Member

levitte commented Mar 25, 2019

B_ENDIAN is an internal implementation indication for our libraries. apps/openssl as well as most of our test programs are built without these sort of indicators, to ensure that they mimic the conditions of real world application that don't necessarily have the internal knowledge.

I did try to check if we use B_ENDIAN in our public headers, and found none at the time? Did I miss something?

@paulidale
Copy link
Contributor Author

@levitte the param API tests fail on big endian machines because B_ENDIAN isn't defined for tests.
There isn't a header to grab to get it that I can see.

@paulidale
Copy link
Contributor Author

@richsalz @t-j-h is this a nicer way to achieve the result?
Assuming it gets retrofitted elsewhere too.

@richsalz
Copy link
Contributor

It is trying to fix something by papering over the hole. It doesn't address the core problem. It uses a cut trick, which makes sense in very computationally expensive cryptography code, to get needless efficiency in corner cases of parameter passing which is NOT intended to be a heavily-used feature but rather to have a way to pass "opaque" structures between openssl core and cryptographic providers on the same machine. In my view, the whole approach that the project is using is wrong, too general and solving the wrong problem. The real fix is to remove the need for this kind of thing. Heck, use C's portable shift operators (going back to the days of BSD on a Vax and ntoh).

What is needed is the ability to pass strings, octet buffers, and BIGNUM's around.

Which is a long-winded and obnoxious way of saying "no," and implying that #8377 is still better.

@bernd-edlinger
Copy link
Member

I wonder why you want bignums to be native byte-order?
I think the to_naive makes only sense in quantities of 2,4,8.

@levitte
Copy link
Member

levitte commented Mar 26, 2019

If this is for test apps, wouldn't it make more sense to make this part of testutil?

@slontis
Copy link
Member

slontis commented Mar 26, 2019

So you are saying that B_ENDIAN is defined correctly in bn_lib.c, sha512.c & wp_block.c?

@levitte
Copy link
Member

levitte commented Mar 26, 2019

The "problem" could also be resolved by building up a suitable BIGNUM and then pass it in and out using the API. However, the affected check tests that the API handles the expected format of a passed number correctly, regardless of size. Considering the format is documented, it must be followed, and suitably tested (this is something I expect from a unit test)

@levitte
Copy link
Member

levitte commented Mar 26, 2019

So you are saying that B_ENDIAN is defined correctly in bn_lib.c, sha512.c & wp_block.c?

It should be. B_ENDIAN is passed with the lib_cppflags config attribute, which is used when building libraries (as implied by its prefix).

@richsalz
Copy link
Contributor

Not sure I understand your point in #8572 (comment), @levitte. Are you saying the design and docs are frozen and cannot be changed? That seems premature. Maybe the API so other work can progress. Apologies if you meant something else.

@bernd-edlinger
Copy link
Member

So you are saying that B_ENDIAN is defined correctly in bn_lib.c, sha512.c & wp_block.c?

Yes. B_ENDIAN / L_ENDIAN is only defined, when the target is known to be that way.
if neither B_ENDIAN nor L_ENDIAN is defined those modules fall back to doing shifts.

@bernd-edlinger
Copy link
Member

What I wonder, why we seem to have a BIGNUM with native-byte ordering.
BIGNUMs are not native. int, short, long, are native.

@richsalz
Copy link
Contributor

Since we are only passing these in the same executable, it makes sense to be efficient. Passing a pointer in, and passing a native serialization out is one way, for example. Converting to/from a neutral string format, might be more portable but at the expense of performance. Or, heck, use DER encoding for ASN1 Integer.

@richsalz
Copy link
Contributor

Given the complexity of the current model, I think using i2d/d2i might be the least-worst option, unfortunately.

@levitte
Copy link
Member

levitte commented Mar 26, 2019

@richsalz, doc/man3/OSSL_PARAM.pod is a public manual, and it describes the format for the current data types. We can add new data types with new data formats, sure, but what's currently there at the time of release will get frozen at that time, yes. And we had better test that the add-on API treats those formats as expected.

@levitte
Copy link
Member

levitte commented Mar 26, 2019

DER was actually suggested as a parameter passing format early on. That was not a welcome suggestion at the time. (the complexity is also higher with DER)

@levitte
Copy link
Member

levitte commented Mar 26, 2019

Since we are only passing these in the same executable

That's an incorrect assumption. A dynamically loaded module can't reliably be viewed as the same executable as the loading application.

@richsalz
Copy link
Contributor

I understand that things are frozen at the time of release. I thought your comment was saying it's frozen now. I'd like to know which supported platform supports dynamic modules in a separate executable. And if so, how does that reconcile with the PTR construct currently defined?

@levitte
Copy link
Member

levitte commented Mar 26, 2019

I believe that OSSL_PARAM needs to be fairly frozen early. It will be harmful if it keeps being a moving target.

@levitte
Copy link
Member

levitte commented Mar 26, 2019

What I wonder, why we seem to have a BIGNUM with native-byte ordering.
BIGNUMs are not native. int, short, long, are native.

The driving thought behind it was that an integer is an integer is an integer, regardless of size, and should therefore have the same byte order throughout.

@kroeckx
Copy link
Member

kroeckx commented Mar 26, 2019

My current view is that tests should be written so they are portable without having to know the endianness.

@levitte
Copy link
Member

levitte commented Mar 26, 2019

With that line of thinking, we can throw away all internal tests (there is a bunch of them), because they all rely on internal knowledge

@@ -0,0 +1,22 @@
/*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a testutil header. It has no use in our libraries (even though I know that there's code that use the same technique)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would disagree, since there is code in the library (siv128.c) that uses this technique. Why not consolidate?

@kroeckx
Copy link
Member

kroeckx commented Mar 26, 2019

I should probably clarify that that would be if you test a public API. If you can't test a public API without knowing the endianness, I'm concerned that it's difficult for applications to properly use the API, and that it might just work by accident until they run it on a system with a different endianness.

@levitte
Copy link
Member

levitte commented Mar 27, 2019

I should probably clarify that that would be if you test a public API. If you can't test a public API without knowing the endianness, I'm concerned that it's difficult for applications to properly use the API, and that it might just work by accident until they run it on a system with a different endianness.

Normally, the BIGNUM stuff would be handled with BN_native2bn and BN_bn2nativepad, so no need to know. This particular test is a bit special, though, as it checks that the API behaves correctly for specific byte input.

levitte
levitte previously approved these changes Mar 27, 2019
@levitte levitte dismissed their stale review March 27, 2019 07:25

Oh wait... Travis

Copy link
Member

@levitte levitte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Travis failure not relevant here

@levitte levitte added approval: done This pull request has the required number of approvals and removed approval: review pending This pull request needs review by a committer labels Mar 27, 2019
@tmshort
Copy link
Contributor

tmshort commented Mar 27, 2019

This also done in modes/siv128.c:

__owur static ossl_inline uint64_t siv128_getword(SIV_BLOCK const *b, size_t i)
{
const union {
long one;
char little;
} is_endian = { 1 };
if (is_endian.little)
return byteswap8(b->word[i]);
return b->word[i];
}

@levitte
Copy link
Member

levitte commented Mar 27, 2019

For siv128.c, we could as well use B_ENDIAN. It's possible, though, that whoever authored it simply didn't know, or that the code was initially developed elsewhere.

@tmshort
Copy link
Contributor

tmshort commented Mar 27, 2019

@levitte It was both.

levitte pushed a commit that referenced this pull request Mar 27, 2019
Reviewed-by: Richard Levitte <levitte@openssl.org>
(Merged from #8572)
@paulidale
Copy link
Contributor Author

This fragment also appears in: CTR, GCM, XTS, sha256, chacha20, blake2...
Then in some file oddities: evp/bio_ok.c & sha/sha_locl.h

Personally, I prefer this sequence to using the pre-processor because both branches get checked by the compiler.

Regardless, our endian handling should be unified.

There are other endian handling forms in the code, e.g. sha512 code includes __BYTE_ORDER__==__ORDER_LITTLE_ENDIAN__. arm_arch.h and poly1305 too.

I'm not even going to consider the assembly...

@paulidale paulidale closed this Mar 27, 2019
@paulidale paulidale deleted the param-endian branch March 27, 2019 22:11
@paulidale paulidale mentioned this pull request Mar 28, 2019
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approval: done This pull request has the required number of approvals branch: master Merge to master branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants