riscv: Add fine-tuned checksum functions #380
Conversation
Upstream branch: f352a28 | 6e5b6bf to cb1e996
Upstream branch: f352a28 | cb1e996 to c79b399
Upstream branch: f352a28 | c79b399 to 1537770
Upstream branch: 5a2cf77 | 1537770 to 9682d41
Upstream branch: 5a2cf77 | 9682d41 to 1b488a7
Upstream branch: 5a2cf77 | 1b488a7 to e14ce83
Upstream branch: 5a2cf77 | e14ce83 to a376824
Upstream branch: 5a2cf77 | a376824 to baefd44
Upstream branch: c4e4b79 | baefd44 to bf460c6
Upstream branch: d3e591a | bf460c6 to 11457c0
This csum_fold implementation, introduced into arch/arc by Vineet Gupta, is better than the default implementation on at least arc, x86, and riscv. Using GCC trunk and compiling a non-inlined version, this implementation has 41.6667% and 25% fewer instructions on riscv64 and x86-64 respectively with -O3 optimization. Most implementations override this default in asm, but this should be more performant than all of those other implementations except for arm, which has barrel shifting, and sparc32, which has a carry flag.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: David Laight <david.laight@aculab.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Support static branches depending on the value of misaligned accesses. This will be used by a later patch in the series. At any point in time, this static branch will only be enabled if all online CPUs are considered "fast".
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Provide checksum algorithms that have been designed to leverage riscv instructions such as rotate. In 64-bit mode, the code can take advantage of the larger registers to avoid some overflow checking.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Xiao Wang <xiao.w.wang@intel.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Provide a 32-bit and a 64-bit version of do_csum. When compiled for 32-bit, it will load from the buffer in groups of 32 bits, and when compiled for 64-bit it will load in groups of 64 bits. Additionally, provide a riscv-optimized implementation of csum_ipv6_magic.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Xiao Wang <xiao.w.wang@intel.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Supplement existing checksum tests with tests for csum_ipv6_magic and ip_fast_csum.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Upstream branch: d4abde5 | 11457c0 to 3427c90
Pull request for series with
subject: riscv: Add fine-tuned checksum functions
version: 12
url: https://patchwork.kernel.org/project/linux-riscv/list/?series=809482