common/octeontx2: fix build with SVE
[ upstream commit fe55802 ]

Building with GCC 10.2 with the SVE extension enabled fails with:

{standard input}: Assembler messages:
{standard input}:4002: Error: selected processor does not support `mov z3.b,#0'
{standard input}:4003: Error: selected processor does not support `whilelo p1.b,xzr,x7'
{standard input}:4005: Error: selected processor does not support `ld1b z0.b,p1/z,[x8]'
{standard input}:4006: Error: selected processor does not support `whilelo p4.s,wzr,w7'

This is because the inline assembly code explicitly resets the CPU model
to one without SVE support. As a result, the SVE instructions generated by
compiler auto-vectorization are rejected by the assembler.

Add SVE to the CPU model specified by the inline assembly when SVE is
enabled (__ARM_FEATURE_SVE is defined). The inline assembly is not replaced
with C atomics because the driver relies on specific LSE instructions to
interface with the co-processor [1].
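
The conditional preamble described above can be sketched as follows. This is a minimal illustration rather than the driver's code: the DEMO_ names are hypothetical, and the snippet only builds the instruction template as a string so that it compiles on any target. The real header passes the same literals to asm volatile.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the __LSE_PREAMBLE macro added by the commit. */
#if defined(__ARM_FEATURE_SVE)
#define DEMO_LSE_PREAMBLE " .cpu generic+lse+sve\n"
#else
#define DEMO_LSE_PREAMBLE " .cpu generic+lse\n"
#endif

/*
 * Adjacent string literals are concatenated at compile time, so this is the
 * text an asm statement built the same way would hand to the assembler.
 */
const char demo_ldadd_template[] =
	DEMO_LSE_PREAMBLE
	"ldadd %x[i], %x[r], [%[b]]";

/* Returns 1 if the template asks the assembler to accept SVE, else 0. */
int demo_template_enables_sve(void)
{
	return strstr(demo_ldadd_template, "+sve") != NULL;
}

/* Checks that the preamble matches the features the compiler was given. */
void demo_self_check(void)
{
	assert(strstr(demo_ldadd_template, ".cpu generic+lse") != NULL);
	assert(strstr(demo_ldadd_template, "ldadd") != NULL);
#if defined(__ARM_FEATURE_SVE)
	assert(demo_template_enables_sve());
#else
	assert(!demo_template_enables_sve());
#endif
}
```

Keying the preamble on __ARM_FEATURE_SVE means that the .cpu directive only widens the accepted instruction set when the compiler itself was allowed to emit SVE, so non-SVE builds keep the original directive.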

Fixes: 8a4f835 ("common/octeontx2: add IO handling APIs")

[1] https://mails.dpdk.org/archives/dev/2021-January/196092.html

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
Reyfone authored and cpaelzer committed Feb 3, 2021
1 parent 4a2baf0 commit 4f99d06
Showing 1 changed file with 10 additions and 3 deletions.
13 changes: 10 additions & 3 deletions drivers/common/octeontx2/otx2_io_arm64.h
@@ -21,14 +21,20 @@
 #define otx2_prefetch_store_keep(ptr) ({\
 	asm volatile("prfm pstl1keep, [%x0]\n" : : "r" (ptr)); })

+#if defined(__ARM_FEATURE_SVE)
+#define __LSE_PREAMBLE " .cpu generic+lse+sve\n"
+#else
+#define __LSE_PREAMBLE " .cpu generic+lse\n"
+#endif
+
 static __rte_always_inline uint64_t
 otx2_atomic64_add_nosync(int64_t incr, int64_t *ptr)
 {
 	uint64_t result;

 	/* Atomic add with no ordering */
 	asm volatile (
-		".cpu generic+lse\n"
+		__LSE_PREAMBLE
 		"ldadd %x[i], %x[r], [%[b]]"
 		: [r] "=r" (result), "+m" (*ptr)
 		: [i] "r" (incr), [b] "r" (ptr)
@@ -43,7 +49,7 @@ otx2_atomic64_add_sync(int64_t incr, int64_t *ptr)

 	/* Atomic add with ordering */
 	asm volatile (
-		".cpu generic+lse\n"
+		__LSE_PREAMBLE
 		"ldadda %x[i], %x[r], [%[b]]"
 		: [r] "=r" (result), "+m" (*ptr)
 		: [i] "r" (incr), [b] "r" (ptr)
@@ -57,7 +63,7 @@ otx2_lmt_submit(rte_iova_t io_address)
 	uint64_t result;

 	asm volatile (
-		".cpu generic+lse\n"
+		__LSE_PREAMBLE
 		"ldeor xzr,%x[rf],[%[rs]]" :
 		[rf] "=r"(result): [rs] "r"(io_address));
 	return result;
@@ -92,4 +98,5 @@ otx2_lmt_mov_seg(void *out, const void *in, const uint16_t segdw)
 		dst128[i] = src128[i];
 }

+#undef __LSE_PREAMBLE
 #endif /* _OTX2_IO_ARM64_H_ */
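
For contrast, the C atomics alternative that the commit message rules out might look like the sketch below. The function names are hypothetical and this is not part of the patch. With C11 atomics the compiler chooses the instruction sequence (an LSE instruction, a load/store-exclusive loop, or a helper call, depending on the target flags), so the code cannot guarantee the specific LSE instruction that the commit message says the co-processor interface relies on.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical portable counterpart of otx2_atomic64_add_nosync():
 * atomically adds incr and returns the previous value, with no ordering.
 */
int64_t demo_atomic64_add_relaxed(_Atomic int64_t *ptr, int64_t incr)
{
	return atomic_fetch_add_explicit(ptr, incr, memory_order_relaxed);
}

/*
 * Hypothetical counterpart of otx2_atomic64_add_sync(): same operation with
 * acquire ordering, mirroring the acquire semantics of ldadda.
 */
int64_t demo_atomic64_add_acquire(_Atomic int64_t *ptr, int64_t incr)
{
	return atomic_fetch_add_explicit(ptr, incr, memory_order_acquire);
}
```

Both helpers return the value stored before the addition, which matches what ldadd and ldadda write to their result register.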
