Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Left shift of a negative number is undefined behaviour v2 #3600

Closed

Conversation

mattcaswell
Copy link
Member

Fix undefined behaviour in curve25519.c. Prior to this running with
ubsan produces errors like this:

crypto/ec/curve25519.c:3871:18: runtime error: left shift of negative
value -22867

[extended tests]

This is an alternative approach to #3591 which (hopefully) avoids some of the problems identified there.

Fix undefined behaviour in curve25519.c. Prior to this running with
ubsan produces errors like this:

crypto/ec/curve25519.c:3871:18: runtime error: left shift of negative
value -22867

[extended tests]
@kaduk
Copy link
Contributor

kaduk commented Jun 1, 2017

From a formal standpoint, signed integer over/underflow is still undefined behavior (though -fwrapv can be used, of course).
But perhaps we know that none of the intermediates being shifted are large enough for that to come into play? (I did not analyze the surrounding code.)

@mattcaswell
Copy link
Member Author

But perhaps we know that none of the intermediates being shifted are large enough for that to come into play? (I did not analyze the surrounding code.)

I think all instances of this are of the following form:

carry6 = (s6 + (1 << 20)) >> 21;

...

s6 -= carry6 * (1 << 21);

i.e. there is a right shift of 21 bits first, followed by the subsequent problematic left shift of 21 bits. Due to the preceding right shift we know that it should never overflow.

@dot-asm
Copy link
Contributor

dot-asm commented Jun 1, 2017

Yes, multiplication is the way to go. The only relevant question if there is compiler that would fail to compile such multiplications as shifts. I mean that would be a disaster from performance viewpoint.

As [temporary] workaround for gcc failure to compile curve25519.c one can add -O to corresponding line in .travis.yml. I mean -O3 seems to be the problem and by passing additional -O you effectively override it.

@dot-asm
Copy link
Contributor

dot-asm commented Jun 1, 2017

I'm ready to plus-one. Do you want to add -O here or shall we take it as separate request?

@mattcaswell
Copy link
Member Author

Take it as a separate request. It is effectively what #3601 has done, except I used --debug instead (which results in -O0)

@dot-asm
Copy link
Contributor

dot-asm commented Jun 1, 2017

Potential problem with -O0 is execution time. Non-optimized code is slow, sanitized code is slow, non-optimized sanitized code is double-slow...

@dot-asm dot-asm added branch: master Merge to master branch approval: done This pull request has the required number of approvals labels Jun 1, 2017
@dot-asm
Copy link
Contributor

dot-asm commented Jun 1, 2017

non-optimized sanitized code is double-slow...

Though it doesn't look that bad according to log...

levitte pushed a commit that referenced this pull request Jun 2, 2017
Fix undefined behaviour in curve25519.c. Prior to this running with
ubsan produces errors like this:

crypto/ec/curve25519.c:3871:18: runtime error: left shift of negative
value -22867

[extended tests]

Reviewed-by: Andy Polyakov <appro@openssl.org>
(Merged from #3600)
@mattcaswell
Copy link
Member Author

Pushed. Thanks.

@noloader
Copy link
Contributor

noloader commented Jun 2, 2017

@dot-asm, @mattcaswell, @kaduk,

From a formal standpoint, signed integer over/underflow is still undefined behavior

That's easy enough to fix with the original solution - do the operations using unsigned types like in the first proposal. However a shift has a range, and its [0,31] for 32-bit types, and [0,63] for 64-bit types. The latter is why I suggested an AND operation or a mod as a potential solution. (When I said "mask", I was not talking about hiding the problem in the engineering sense).

-fwrapv can be used, of course

This is usually a bad idea. A program that needs it is likely illegal, so dragons can fly out your nose. In my day job, I will fail a program during a security audit if I see it. I'll send it back to the developers to fix, or I'll send it into risk acceptance if the devs won't fix it.

I'm ready to plus-one. Do you want to add -O here or shall we take it as separate request?

If you know its correct, then there are a few options. First, you can build curve25519.c without the sanitizer while leaving it in place for other source files. I've seen this approach in other libraries; see this GNUmakefile.

You can also make it a soft failure. That is, allow the ubsan job to fail in Travis, but it won't affect the build (a build is the collection of all the jobs). For an example of this approach, see this travis.yml file.

I kind of like the soft failure approach. The results are produced, and you can view the findings without breaking the build.

In general, the ubsan test is a very good test. I think its wise to keep them. I've seen Intel compilers remove less offending code. ICC and ICPC are ruthless even at -O2.

@mattcaswell
Copy link
Member Author

If you know its correct

It isn't. @dot-asm came to the conclusion that left shift of a negative number is undefined.

@noloader
Copy link
Contributor

noloader commented Jun 2, 2017

It isn't. @dot-asm came to the conclusion that left shift of a negative number is undefined.

Yes, we knew the negative shift was not correct. The tool told us so. I meant the right shift after the cast and left shift. From one of the earlier comments:

there is a right shift of 21 bits first, followed by the subsequent problematic left shift of 21 bits. Due to the preceding right shift we know that it should never overflow.

That's the one to watch next. It has to be in the range. Its not a machine constraint; rather, its a language constraint.

But the fix in the "files changed" should avoid the issue and lend itself to optimizations. The compiler might still perform the shift. When its generating code, its not bound by the language rules we suffer.

@mattcaswell
Copy link
Member Author

That's the one to watch next. It has to be in the range. Its not a machine constraint; rather, its a language constraint.

I'm confused by what you are saying. There is no problem (as far as we know) with the right shift. But because we know we right shifted by 21 bits, we also know that a subsequent left shift of 21 bits will not overflow.

@noloader
Copy link
Contributor

noloader commented Jun 2, 2017

@mattcaswell,

But because we know we right shifted by 21 bits, we also know that a subsequent left shift of 21 bits will not overflow.

Yes, that looks like it should be OK.

Sorry about the confusion.

@dot-asm
Copy link
Contributor

dot-asm commented Jun 2, 2017

If you know its correct

It isn't. @dot-asm came to the conclusion that left shift of a negative number is undefined.

There is certain ambiguity in this exchange. The left shift on negatives is indeed declared undefined, but in practice it works out as long as most significant bit doesn't change as result of left shift, i.e. result remains negative. That's how it worked so far. And it's effectively same limitation as with overflow in signed multiplication @kaduk referred to. I mean signed multiplication is formally defined only if it doesn't overflow, which in practical terms means that enough most significant bits should be same for result to not overflow. And it doesn't overflow here thanks to boundary conditions.

@dot-asm
Copy link
Contributor

dot-asm commented Jun 2, 2017

make it a soft failure

It would run too large risk to remain unnoticed for longer periods of time, which would make it harder to identify the change that caused failure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approval: done This pull request has the required number of approvals branch: master Merge to master branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants