@lemire (Collaborator) commented Jun 22, 2023

I finally had time to check, and @aqrit is correct: fully branchless IPv4 parsing is faster. It is an interesting case because going branchless improves both the instruction count (a good reduction) and the number of instructions retired per cycle (a more modest gain, but appreciable nonetheless).

It is effectively a simplification of the prior code. I recommend that it be merged.
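
For readers unfamiliar with the term, here is a minimal scalar sketch of what branchless dotted-quad parsing can look like. It is not the code merged in this pull request. The function names, the 16-byte zero-padded input buffer, and the choice to reject leading zeros are all assumptions made for illustration. The idea is to run a fixed number of iterations and fold every validity check into a flag with arithmetic, instead of exiting early on the first bad byte.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the PR): parse a dotted-quad IPv4 address
 * from a 16-byte, zero-padded buffer. The loop always runs 16 times and
 * contains no data-dependent branches; errors accumulate in `bad`.
 * Returns 1 on success, 0 on failure. `out` is written either way. */
static int parse_ipv4_sketch(const char buf[16], uint8_t out[4])
{
  uint32_t octet[4] = {0, 0, 0, 0};     /* running value of each field    */
  uint32_t digits[4] = {0, 0, 0, 0};    /* digit count of each field      */
  uint32_t lead_zero[4] = {0, 0, 0, 0}; /* 1 if a field starts with '0'   */
  uint32_t field = 0;                   /* number of dots seen so far     */
  uint32_t active = 1;                  /* 1 until the first stop byte    */
  uint32_t bad = 0;                     /* sticky error flag              */

  for (int i = 0; i < 16; i++) {
    uint32_t c = (uint8_t)buf[i];
    uint32_t d = c - '0';               /* wraps to a large value if c < '0' */
    uint32_t is_digit = d < 10;
    uint32_t is_dot = c == '.';
    uint32_t ok = is_digit | is_dot;

    /* The byte that ends the address must be the NUL padding. */
    bad |= active & (ok ^ 1u) & (c != 0);
    active &= ok;

    uint32_t dig = active & is_digit;
    uint32_t dot = active & is_dot;
    uint32_t f = field & 3;             /* keep the index in bounds */

    lead_zero[f] |= dig & (digits[f] == 0) & (d == 0);
    /* Multiply by 10 and add the digit, or multiply by 1 and add 0. */
    octet[f] = octet[f] * (1 + 9 * dig) + d * dig;
    digits[f] += dig;
    field += dot;
  }

  bad |= active;                        /* no terminator within 16 bytes */
  bad |= field != 3;                    /* exactly three dots required   */
  for (int k = 0; k < 4; k++) {
    bad |= (digits[k] - 1) > 2;         /* 1 to 3 digits per field       */
    bad |= octet[k] > 255;
    bad |= lead_zero[k] & (digits[k] > 1);
    out[k] = (uint8_t)octet[k];
  }
  return (int)(bad ^ 1u);
}

/* Convenience wrapper for NUL-terminated strings: copies the input into
 * a zero-padded buffer so the core routine can read all 16 bytes. */
int parse_ipv4_cstr(const char *s, uint8_t out[4])
{
  char buf[16] = {0};
  size_t n = strlen(s);
  if (n > 15) return 0;
  memcpy(buf, s, n);
  return parse_ipv4_sketch(buf, out);
}
```

Compilers usually turn comparisons such as `d < 10` into flag-setting instructions rather than jumps, although the C standard does not guarantee that. A production parser would more likely process the whole input with SIMD instructions; the scalar loop here only illustrates the control-flow idea.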

@k0ekk0ek (Contributor)

Thanks for reviewing, @lemire. And, of course, thanks @aqrit! A very nice addition indeed.

@k0ekk0ek merged commit de047bc into NLnetLabs:main Jun 23, 2023

@lemire (Collaborator, Author) commented Jun 23, 2023

@k0ekk0ek This should get you close to 3 GB/s for IPv4 parsing... It is probably now fast enough not to be a bottleneck.

@k0ekk0ek (Contributor)

I think you're right 🙂
