Fix leakage when the cacheline is 32-bytes in CBC_MAC_ROTATE_IN_PLACE #18033

Closed

basavesh
Contributor

@basavesh basavesh commented Apr 4, 2022

rotated_mac is a 64-byte aligned buffer of size 64 and rotate_offset is secret.
Consider a weaker leakage model (CL) where only the cacheline base address is leaked,
i.e. address/32 for a 32-byte cacheline (CL32). Because the buffer is 64-byte aligned,
its two halves sit on consecutive cachelines 2q and 2q + 1, where q = (address of rotated_mac)/64.

The previous code performed two loads:
1. rotated_mac[rotate_offset ^ 32] and
2. rotated_mac[rotate_offset++]
which would leak 2q + 1, 2q for 0 <= rotate_offset < 32
and 2q, 2q + 1 for 32 <= rotate_offset < 64, so the order of the two accesses reveals bit 5 of rotate_offset.

The proposed fix performs loads that always leak 2q, 2q + 1, in that order, and
selects the appropriate value in constant time.
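
The access pattern described above can be sketched in C as follows. This is a minimal illustration and not the patch itself: the helper names (ct_eq_mask_8, ct_select_8, load_both_lines, cl32_line) are hypothetical, and cl32_line only models the CL32 leakage relative to the start of the buffer (0 stands for line 2q, 1 for line 2q + 1). Whether the compiled code is actually constant-time still depends on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 0xff if a == b, else 0x00, computed without branches. */
static uint8_t ct_eq_mask_8(uint32_t a, uint32_t b)
{
    uint32_t x = a ^ b;                        /* zero iff a == b */
    uint32_t nonzero = (x | (0u - x)) >> 31;   /* 1 iff x != 0 */
    return (uint8_t)(nonzero - 1u);            /* 0 -> 0xff, 1 -> 0x00 */
}

/* Hypothetical helper: returns a when mask == 0xff, b when mask == 0x00. */
static uint8_t ct_select_8(uint8_t mask, uint8_t a, uint8_t b)
{
    return (uint8_t)((mask & a) | (~mask & b));
}

/*
 * Model of the CL32 leakage for a 64-byte aligned, 64-byte buffer:
 * which of the two 32-byte lines an index falls on (0 = 2q, 1 = 2q + 1).
 */
static unsigned cl32_line(size_t idx)
{
    assert(idx < 64);
    return (unsigned)(idx / 32);
}

/*
 * Read buf[offset] (offset < 64, secret) by loading one byte from each
 * 32-byte half in a fixed order, then selecting the wanted byte with a mask.
 */
static uint8_t load_both_lines(const uint8_t buf[64], size_t offset)
{
    size_t lo_idx = offset & ~(size_t)32;      /* always on line 2q */
    size_t hi_idx = offset | 32;               /* always on line 2q + 1 */
    uint8_t lo = ((const volatile uint8_t *)buf)[lo_idx];
    uint8_t hi = ((const volatile uint8_t *)buf)[hi_idx];
    uint8_t mask = ct_eq_mask_8((uint32_t)lo_idx, (uint32_t)offset);

    return ct_select_8(mask, lo, hi);
}
```

In this model the old pattern visits cl32_line(offset ^ 32) and then cl32_line(offset), which is 1, 0 for offsets below 32 and 0, 1 otherwise, while load_both_lines always visits 0 and then 1.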

@openssl-machine openssl-machine added the hold: cla required The contributor needs to submit a license agreement label Apr 4, 2022
@github-actions github-actions bot added the severity: fips change The pull request changes FIPS provider sources label Apr 5, 2022
@openssl-machine openssl-machine removed the hold: cla required The contributor needs to submit a license agreement label Apr 5, 2022
@basavesh
Contributor Author

basavesh commented Apr 5, 2022

@mattcaswell

@mattcaswell
Member

This will require a CLA to be submitted:
https://www.openssl.org/policies/cla.html

@mattcaswell mattcaswell added the approval: review pending This pull request needs review by a committer label Apr 5, 2022
@t8m t8m added branch: master Merge to master branch branch: 1.1.1 Merge to OpenSSL_1_1_1-stable branch triaged: bug The issue/pr is/fixes a bug branch: 3.0 Merge to openssl-3.0 branch labels Apr 5, 2022
@t8m t8m added approval: done This pull request has the required number of approvals and removed approval: review pending This pull request needs review by a committer labels Apr 5, 2022
@t8m
Member

t8m commented Apr 5, 2022

@mattcaswell do you agree this should be cherry picked to all branches?

@mattcaswell
Member

@mattcaswell do you agree this should be cherry picked to all branches?

Yes - although I strongly suspect it won't cherry-pick to 1.1.1 cleanly so we will probably need another PR there.

@basavesh
Contributor Author

basavesh commented Apr 5, 2022

@mattcaswell @t8m
I created another pull request for the older version: #18050

@mattcaswell mattcaswell removed the branch: 1.1.1 Merge to OpenSSL_1_1_1-stable branch label Apr 5, 2022
@davidben
Contributor

davidben commented Apr 5, 2022

Ah, missed that #18050 wasn't the main PR. Reposting my comment here, to keep the main discussion in one place:

Rather than making assumptions about cachelines, we switched to a simpler cacheline-independent O(N log N) algorithm in BoringSSL. It removes the need for the CBC_MAC_ROTATE_IN_PLACE toggle and still performs fine: rotate by powers of 2, up to md_size. At each iteration, constant-time select either the rotated version or the unrotated version, based on the corresponding (secret) bit of rotate_offset.

I think that would be a better strategy. It is more straightforwardly constant-time. CVE-2016-0702 is an example where assumptions about cacheline behavior don't quite match how CPUs behave. The O(N log N) approach avoids these assumptions.
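
A minimal sketch of the rotate-by-powers-of-2 idea described above, assuming rotate_offset < md_size and a left rotation (out[i] = in[(i + rotate_offset) % md_size]). This is an illustration only, not BoringSSL's code; ct_rotate_left and MAX_MD_SIZE are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_MD_SIZE 64  /* assumption for this sketch */

/*
 * Rotate buf left by rotate_offset positions. Every memory index below is
 * derived from public values (md_size, step, i); the secret rotate_offset
 * only feeds the byte mask used for selection.
 */
static void ct_rotate_left(uint8_t *buf, size_t md_size, size_t rotate_offset)
{
    uint8_t tmp[MAX_MD_SIZE];

    assert(md_size <= MAX_MD_SIZE);  /* md_size is public */

    /* One pass per bit of rotate_offset: log2(md_size) passes of md_size bytes. */
    for (size_t step = 1; step < md_size; step <<= 1, rotate_offset >>= 1) {
        /* 0xff if the current low bit of rotate_offset is set, else 0x00 */
        uint8_t take = (uint8_t)(0u - (unsigned)(rotate_offset & 1));

        for (size_t i = 0, j = step; i < md_size; i++, j++) {
            if (j >= md_size)        /* branch on public values only */
                j -= md_size;
            /* rotated byte if the bit is set, unrotated byte otherwise */
            tmp[i] = (uint8_t)((take & buf[j]) | (~take & buf[i]));
        }
        memcpy(buf, tmp, md_size);
    }
}
```

Each pass reads and writes the whole buffer in the same order regardless of rotate_offset, so the sketch needs no assumption about cacheline size.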

@paulidale
Contributor

I prefer @davidben's suggestion. Making assumptions about cache lines is less than ideal.

@openssl-machine
Collaborator

24 hours has passed since 'approval: done' was set, but as this PR has been updated in that time the label 'approval: ready to merge' is not being automatically set. Please review the updates and set the label manually.

@t8m t8m added approval: ready to merge The 24 hour grace period has passed, ready to merge and removed approval: done This pull request has the required number of approvals labels May 9, 2022
@t8m
Member

t8m commented May 9, 2022

Merged to master and 3.0 branches. Thank you for your contribution.

@t8m t8m closed this May 9, 2022
openssl-machine pushed a commit that referenced this pull request May 9, 2022
rotated_mac is a 64-byte aligned buffer of size 64 and rotate_offset is secret.
Consider a weaker leakage model(CL) where only cacheline base address is leaked,
i.e address/32 for 32-byte cacheline(CL32).

Previous code used to perform two loads
    1. rotated_mac[rotate_offset ^ 32] and
    2. rotated_mac[rotate_offset++]
which would leak 2q + 1, 2q for 0 <= rotate_offset < 32
and 2q, 2q + 1 for 32 <= rotate_offset < 64

The proposed fix performs load operations which will always leak 2q, 2q + 1 and
selects the appropriate value in constant-time.

Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from #18033)

(cherry picked from commit 3b83638)
openssl-machine pushed a commit that referenced this pull request May 9, 2022

Reviewed-by: Matt Caswell <matt@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from #18033)
Labels
approval: ready to merge The 24 hour grace period has passed, ready to merge branch: master Merge to master branch branch: 3.0 Merge to openssl-3.0 branch severity: fips change The pull request changes FIPS provider sources triaged: bug The issue/pr is/fixes a bug