Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

router: avoid per-packet allocations for MAC #4029

Closed
wants to merge 1 commit into from

Conversation

matzf
Copy link
Member

@matzf matzf commented Apr 23, 2021

Avoid creating a separate CMAC hasher, each with a separate AES block
cipher object, for each MAC computation/verification.
This was responsible for a very large portion of allocations during
forwarding. Additionally, reusing the hasher amortizes the cost of the
internal initialization of both the AES block cipher (key expansion) and
CMAC (subkey derivation).
The CMAC hasher is not goroutine safe, so care must be taken to not
share it by accident.


This change is Reviewable

Avoid creating a separate CMAC hasher, each with a separate AES block
cipher object, for each MAC computation/verification.
This was responsible for a very large portion of allocations during
forwarding. Additionally, reusing the hasher amortizes the cost of the
internal initialization of both the AES block cipher (key expansion) and
CMAC (subkey derivation).
The CMAC hasher is not goroutine safe, so care must be taken to not
share it by accident.
Copy link
Collaborator

@lukedirtwalker lukedirtwalker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r1.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @matzf)

Copy link
Collaborator

@lukedirtwalker lukedirtwalker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @matzf)

a discussion (no related file):
Nice, thank you 👍 :lgtm:


lukedirtwalker pushed a commit to lukedirtwalker/scion that referenced this pull request May 6, 2021
Avoid allocating scionPacketProcessor; re-organized packet processing
functions. The scionPacketProcessor is now the main entry point for
packet processing and contains all the pre-allocated per-packet state.

Avoid allocation of message buffer for writing with WriteBatch.
We could just pre-allocate a buffer, but as we always write a batch of
size one, just use WriteTo instead.
For writing single packets, this is both more convenient and faster.
Additional benefit: Read/WriteBatch make rather large allocations
internally (see golang/go#26838) which is
avoided by using WriteTo.

The largest remaining allocations when processing SCION packets are:
- ReadBatch allocates temporary storage internally (as mentioned above).
  This is by far the biggest culprit, it accounts for almost 80% (after
  the changes in this patch).
- slayers/path.NewPath called during slayers.SCION.DecodeFromBytes:
  Could be avoided by reusing Path objects, but requires larger changes.
- slayers/path/scion.Raw.GetCurrentHopField and GetCurrentInfoField, called during parsePath:
  Both return pointer types. Should be possible to change this to value
  types instead, but requires larger changes.
- DataPlane.resolveLocalDst: allocates IP and UDPAddr. Can only really
  be fixed with better address types.
- slayers/path.FullMAC. Allocates both input and output. The function
  interface could be changed to allow reuse of the input buffer.
  The output buffer cannot be reused with this CMAC library.

Depends on (or actually includes) scionproto#4029.

Closes scionproto#4030

GitOrigin-RevId: 72eb8918a54a25f74159ef7398ef3d98d756e501
marcfrei pushed a commit to netsec-ethz/scion that referenced this pull request May 10, 2021
Avoid creating a separate CMAC hasher, each with a separate AES block
cipher object, for each MAC computation/verification.
This was responsible for a very large portion of allocations during
forwarding. Additionally, reusing the hasher amortizes the cost of the
internal initialization of both the AES block cipher (key expansion) and
CMAC (subkey derivation).
The CMAC hasher is not goroutine safe, so care must be taken to not
share it by accident.

Closes scionproto#4029

GitOrigin-RevId: 0fab387273bfb377f3db0c8e4184e86555107e28
marcfrei pushed a commit to netsec-ethz/scion that referenced this pull request May 10, 2021
Avoid creating a separate CMAC hasher, each with a separate AES block
cipher object, for each MAC computation/verification.
This was responsible for a very large portion of allocations during
forwarding. Additionally, reusing the hasher amortizes the cost of the
internal initialization of both the AES block cipher (key expansion) and
CMAC (subkey derivation).
The CMAC hasher is not goroutine safe, so care must be taken to not
share it by accident.

Closes scionproto#4029

GitOrigin-RevId: 0fab387273bfb377f3db0c8e4184e86555107e28
marcfrei pushed a commit to netsec-ethz/scion that referenced this pull request May 26, 2021
Avoid allocating scionPacketProcessor; re-organized packet processing
functions. The scionPacketProcessor is now the main entry point for
packet processing and contains all the pre-allocated per-packet state.

Avoid allocation of message buffer for writing with WriteBatch.
We could just pre-allocate a buffer, but as we always write a batch of
size one, just use WriteTo instead.
For writing single packets, this is both more convenient and faster.
Additional benefit: Read/WriteBatch make rather large allocations
internally (see golang/go#26838) which is
avoided by using WriteTo.

The largest remaining allocations when processing SCION packets are:
- ReadBatch allocates temporary storage internally (as mentioned above).
  This is by far the biggest culprit, it accounts for almost 80% (after
  the changes in this patch).
- slayers/path.NewPath called during slayers.SCION.DecodeFromBytes:
  Could be avoided by reusing Path objects, but requires larger changes.
- slayers/path/scion.Raw.GetCurrentHopField and GetCurrentInfoField, called during parsePath:
  Both return pointer types. Should be possible to change this to value
  types instead, but requires larger changes.
- DataPlane.resolveLocalDst: allocates IP and UDPAddr. Can only really
  be fixed with better address types.
- slayers/path.FullMAC. Allocates both input and output. The function
  interface could be changed to allow reuse of the input buffer.
  The output buffer cannot be reused with this CMAC library.

Depends on (or actually includes) scionproto#4029.

Closes scionproto#4030

GitOrigin-RevId: 72eb8918a54a25f74159ef7398ef3d98d756e501
marcfrei pushed a commit to netsec-ethz/scion that referenced this pull request Jun 3, 2021
Avoid allocating scionPacketProcessor; re-organized packet processing
functions. The scionPacketProcessor is now the main entry point for
packet processing and contains all the pre-allocated per-packet state.

Avoid allocation of message buffer for writing with WriteBatch.
We could just pre-allocate a buffer, but as we always write a batch of
size one, just use WriteTo instead.
For writing single packets, this is both more convenient and faster.
Additional benefit: Read/WriteBatch make rather large allocations
internally (see golang/go#26838) which is
avoided by using WriteTo.

The largest remaining allocations when processing SCION packets are:
- ReadBatch allocates temporary storage internally (as mentioned above).
  This is by far the biggest culprit, it accounts for almost 80% (after
  the changes in this patch).
- slayers/path.NewPath called during slayers.SCION.DecodeFromBytes:
  Could be avoided by reusing Path objects, but requires larger changes.
- slayers/path/scion.Raw.GetCurrentHopField and GetCurrentInfoField, called during parsePath:
  Both return pointer types. Should be possible to change this to value
  types instead, but requires larger changes.
- DataPlane.resolveLocalDst: allocates IP and UDPAddr. Can only really
  be fixed with better address types.
- slayers/path.FullMAC. Allocates both input and output. The function
  interface could be changed to allow reuse of the input buffer.
  The output buffer cannot be reused with this CMAC library.

Depends on (or actually includes) scionproto#4029.

Closes scionproto#4030

GitOrigin-RevId: 72eb8918a54a25f74159ef7398ef3d98d756e501
@matzf matzf deleted the router-cmac-reuse branch August 20, 2021 18:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants