[PATCH CATERPILLAR v5] Caterpillar mdev auxiliary #359

MykytaI · 2017-12-21T15:33:13Z

Add miscellaneous patches required for mediated device drivers support.

"Work in progress" branch with mdev common code and i40e driver can be found here:
https://github.com/MykytaI/odp/tree/caterpillar_mdev_i40e_demo
Last two patches aren't yet ready for upstreaming (i40e works, but code needs more documentation & cleanup) and cxgb4+e1000e drivers aren't merged on the top of the branch either.

MykytaI · 2017-12-22T10:06:58Z

This is false positive in checkpatch (the same is done in the kernel), it's impossible to enclose this in parentheses:

_ERROR: Macros with complex values should be enclosed in parentheses
#206: FILE: platform/linux-generic/include/odp/drv/mmio.h:35:
+#define odpdrv_io_mb() asm volatile("" ::: "memory")

ERROR: Macros with complex values should be enclosed in parentheses
#207: FILE: platform/linux-generic/include/odp/drv/mmio.h:36:
+#define odpdrv_io_rmb() asm volatile("" ::: "memory")

ERROR: Macros with complex values should be enclosed in parentheses
#208: FILE: platform/linux-generic/include/odp/drv/mmio.h:37:
+#define odpdrv_io_wmb() asm volatile("" ::: "memory")_

Bill-Fischofer-Linaro · 2017-12-21T17:44:55Z

include/odp/drv/spec/mmio.h

+ * @param value	cpu native 8-bit value to write to MMIO
+ * @param addr MMIO address to write at
+ */
+void odpdrv_mmio_u8be_write(uint8_t value, volatile void *addr);


Both odpdrv_mmio_u8le_write() and odpdrv_mmio_u8be_write() are documented the same. I assume the conversions are from host-endian to le or be explicitly? That should be documented as such.

Whether to have u8le + u8be or a single u8 is just a matter of taste: anyone understands that they do the same thing, so the only question is whether you want this identifier to look similar to other in the same group. I'll change to u8, OK with me.

I'll update doxygen comments to explicitly say 'little endian' and 'big endian'.

Bill-Fischofer-Linaro · 2017-12-21T17:46:20Z

include/odp/drv/spec/mmio.h

+ * @param value	cpu native 64-bit value to write to MMIO
+ * @param addr MMIO address to write at
+ */
+void odpdrv_mmio_u64be_write(uint64_t value, volatile void *addr);


Similar documentation comments for the u16, u32, and u64 variants of these routines.

OK (see first comment)

Bill-Fischofer-Linaro · 2017-12-21T17:47:41Z

include/odp/drv/spec/mmio.h

+ * @param addr MMIO address to read at
+ * @return cpu native 64-bit value
+ */
+uint64_t odpdrv_mmio_u64be_read(volatile void *addr);


Again, I'd make it explicit that these take a le or be source and convert it into native host-endian as part of the read operation.

OK (see first comment)

Bill-Fischofer-Linaro · 2017-12-21T17:53:34Z

platform/linux-generic/include/odp/drv/mmio.h

+odpdrv_mmio_u8be_write(uint8_t value, volatile void __odpdrv_mmio *addr)
+{
+	odpdrv_mmio_u8le_write(value, addr);
+}


Why have two routines if they are just synonyms for each other? I thought the idea was to combine the write with an endian conversion? If there's no actual conversion going on then wouldn't odpdrv_mmio_u8_write(), etc., be sufficient?

OK (see first comment)

Bill-Fischofer-Linaro · 2017-12-21T18:12:24Z

platform/linux-generic/include/odp/drv/mmio.h

+odpdrv_mmio_u16le_write(uint16_t value, volatile void __odpdrv_mmio *addr)
+{
+	odpdrv_io_wmb();
+	*(__odp_force volatile odp_u16le_t *)addr = odp_cpu_to_le_16(value);


Since these routines are likely to be performance critical, you're probably going to want to invoke the byteorder inline forms directly rather than incur the cost of a function call. These simply expand into the GCC builtins. Whether this is done via the inlines or directly via southbound equivalent converters is something we'll need to consider. For now this is fine since it's just basic scaffolding, but something that will need to be revisited.

Aren't odp_xxx_to_yyy_NN() expand to inline function which uses cc builtin ? I guess cc is smart enough to optimize out all wrapper code and use builtin (or unmodified value) directly. Is there really a reason to duplicate byteorder_inlines.h here ?

Side note on performance impact:

I wouldn't consider MMIO accesses performance critical (in a sense that you can optimize it and get performance boost). What is basically happening when you access MMIO is that CPU starts read/write transaction which ends up at the chipset (PCIe root complex in fact), the chipset sends MMIO transaction (64 bytes + overhead) to the peripheral and then awaits response (completion) from the peripheral. All in all it takes hundreds of CPU cycles to complete, and CPU can't do anything during this time. In short: it's way more expensive than even cache miss on host RAM, and register size (8-64 bits) has no impact on the total transaction cost.

So, any code which is accessing MMIO too often is poorly designed, basically. In mediated device drivers we try to access it as little as possible, usually 1-2 times per TX/RX burst.

ODP APIs can only expand inline when --enable-abi-compat=no is used, which means the application is targeting an embedded environment. Take a look at odp_packet.c which supports inline forms of APIs. The idea is that the external API may be inlined or not when called by applications, but the implementation itself always uses the inline form since it's self-contained for a specific target.

For these conversions, ODP itself is going to be compiled for a specific endianness so you can use the GCC __builtin_bswapNN() routines directly as needed. This avoids any function calls. For example:

/* Set up the conversion functions we need for this host's endianness */ #if ODP_BYTE_ORDER == ODP_LITTLE_ENDIAN #define _odp_u16le(x) (x) #define _odp_u16be(x) __builtin_bswap16(x) #define _odp_u32le(x) (x) #define _odp_u32be(x) __builtin_bswap32(x) #define _odp_u64le(x) (x) #define _odp_u64be(x) __builtin_bswap64(x) #else #define _odp_u16le(x) __builtin_bswap16(x) #define _odp_u16be(x) (x) #define _odp_u32le(x) __builtin_bswap32(x) #define _odp_u32be(x) (x) #define _odp_u64le(x) __builtin_bswap64(x) #define _odp_u64be(x) (x) #endif static inline void odpdrv_mmio_u16le_write(uint16_t valie, volatile void __odpdrv_mmio *addr) { odpdrv_io_wmb(); *(__odp_force volatile odp_u16le_t *)addr = _odp_u16le(value); } ...etc.

Since these routines are on performance-critical paths, eliminating the calls to the external APIs is likely to be measurably beneficial.

Bill-Fischofer-Linaro · 2017-12-21T18:13:56Z

platform/linux-generic/include/odp/drv/mmio.h

+odpdrv_mmio_u8be_read(volatile void __odpdrv_mmio *addr)
+{
+	return odpdrv_mmio_u8le_read(addr);
+}


Similar comments w.r.t. the read variants. Since there is no endian conversion for u8 variants is odpdrv_mmio_u8_read() sufficient here?

OK (see first comment)

Bill-Fischofer-Linaro · 2017-12-21T18:16:47Z

platform/linux-generic/include/odp/drv/mmio.h

+	    odp_be_to_cpu_64(*(__odp_force volatile odp_u64be_t *)addr);
+
+	odpdrv_io_rmb();
+	return value;


Another thought, since host bytes are maintained in host byte order, do we need explicit be/le variants of all of these routines? Is it the case that different I/O devices expect bytes to be presented in different endian modes?

Yes, this is the case, different devices expect MMIO registers and dma coherent data in host RAM in either le or be, no one cares about host endianness. Some funny devices even mix le and be in the same peripheral-host communication, like Chelsio being be uC but expecting TX/RX doorbells in MMIO as le :)

Ok, that's good to know. Thanks.

joseppc · 2017-12-22T14:59:54Z

platform/linux-generic/include/odp_internal.h

+	({					\
+		__typeof__(a) tmp_a = (a);	\
+		__typeof__(b) tmp_b = (b);	\
+		tmp_a < tmp_b ? tmp_a : tmp_b;	\


I don't see how this is different from the previous definition. Won't this still produce a warning if a and b have different signedness? Shouldn't the second line be:
__typeof__(a) tmp_b = (__typeof__(a))(b);
although maybe an error/warning is preferable.

Yes, the idea is to generate compilation error if signedness doesn't match.

Bill-Fischofer-Linaro · 2017-12-22T20:32:31Z

platform/linux-generic/include/odp/drv/mmio.h

+odpdrv_mmio_u16le_write(uint16_t value, volatile void __odpdrv_mmio *addr)
+{
+	odpdrv_io_wmb();
+	*(__odp_force volatile odp_u16le_t *)addr = odp_cpu_to_le_16(value);


ODP APIs can only expand inline when --enable-abi-compat=no is used, which means the application is targeting an embedded environment. Take a look at odp_packet.c which supports inline forms of APIs. The idea is that the external API may be inlined or not when called by applications, but the implementation itself always uses the inline form since it's self-contained for a specific target.

For these conversions, ODP itself is going to be compiled for a specific endianness so you can use the GCC __builtin_bswapNN() routines directly as needed. This avoids any function calls. For example:

/* Set up the conversion functions we need for this host's endianness */ #if ODP_BYTE_ORDER == ODP_LITTLE_ENDIAN #define _odp_u16le(x) (x) #define _odp_u16be(x) __builtin_bswap16(x) #define _odp_u32le(x) (x) #define _odp_u32be(x) __builtin_bswap32(x) #define _odp_u64le(x) (x) #define _odp_u64be(x) __builtin_bswap64(x) #else #define _odp_u16le(x) __builtin_bswap16(x) #define _odp_u16be(x) (x) #define _odp_u32le(x) __builtin_bswap32(x) #define _odp_u32be(x) (x) #define _odp_u64le(x) __builtin_bswap64(x) #define _odp_u64be(x) (x) #endif static inline void odpdrv_mmio_u16le_write(uint16_t valie, volatile void __odpdrv_mmio *addr) { odpdrv_io_wmb(); *(__odp_force volatile odp_u16le_t *)addr = _odp_u16le(value); } ...etc.

Since these routines are on performance-critical paths, eliminating the calls to the external APIs is likely to be measurably beneficial.

Bill-Fischofer-Linaro · 2017-12-22T20:40:29Z

platform/linux-generic/include/odp/drv/mmio.h

+	    odp_be_to_cpu_64(*(__odp_force volatile odp_u64be_t *)addr);
+
+	odpdrv_io_rmb();
+	return value;


Ok, that's good to know. Thanks.

MykytaI · 2018-01-02T09:19:28Z

(couldn't reply to "abi-compat" request for changes from Bill directly for some reason, replying here instead.)

I would try to avoid adding 3rd endianness API here if at all possible. Are we going to support "abi-compat" for odpdrv APIs ? If not it could make sense to make odpdrv_ endianness conversion functions always inline. All mdev core and drivers code will use it then as it's really more performance critical than MMIO access.

Signed-off-by: Mykyta Iziumtsev <mykyta.iziumtsev@linaro.org>

Add helper function to query RX/TX queue lengths on network interface to facilitate upcoming native and mediated device drivers. Signed-off-by: Mykyta Iziumtsev <mykyta.iziumtsev@linaro.org>

Added char* and uint64_t sysfs attribute read functions to support upcoming native and mediated device drivers. Updated sysfs_netif_stats() to take interface name as argument instead of pktio_entry, because pktio_entry->s.name is not necessarily recognisable Linux network interface name. Signed-off-by: Mykyta Iziumtsev <mykyta.iziumtsev@linaro.org>

Gather macros needed for upcoming mediated devices in one location. Fix signed vs. unsigned comparisons caused by incorrect MIN / MAX usage. Signed-off-by: Mykyta Iziumtsev <mykyta.iziumtsev@linaro.org>

Bill-Fischofer-Linaro · 2018-01-03T16:59:25Z

Checkpatch issue confirmed to be a false positive. Ok to ignore.

heyi-arm · 2018-01-04T03:23:46Z

Merged and close.

muvarov changed the title ~~Caterpillar mdev auxiliary~~ [PATCH CATERPILLAR v1] Caterpillar mdev auxiliary Dec 21, 2017

muvarov added Email_sent labels Dec 21, 2017

MykytaI force-pushed the caterpillar_mdev_auxiliary branch from 5ae0d10 to 2faaf33 Compare December 21, 2017 16:45

muvarov changed the title ~~[PATCH CATERPILLAR v1] Caterpillar mdev auxiliary~~ [PATCH CATERPILLAR v2] Caterpillar mdev auxiliary Dec 21, 2017

muvarov added Email_sent and removed Email_sent labels Dec 21, 2017

MykytaI force-pushed the caterpillar_mdev_auxiliary branch from 2faaf33 to dce14ca Compare December 22, 2017 10:05

muvarov changed the title ~~[PATCH CATERPILLAR v2] Caterpillar mdev auxiliary~~ [PATCH CATERPILLAR v3] Caterpillar mdev auxiliary Dec 22, 2017

muvarov removed Email_sent labels Dec 22, 2017

muvarov added Email_sent labels Dec 22, 2017

MykytaI force-pushed the caterpillar_mdev_auxiliary branch from dce14ca to 0d30853 Compare December 22, 2017 11:01

muvarov changed the title ~~[PATCH CATERPILLAR v3] Caterpillar mdev auxiliary~~ [PATCH CATERPILLAR v4] Caterpillar mdev auxiliary Dec 22, 2017

muvarov added Email_sent and removed Email_sent labels Dec 22, 2017

Bill-Fischofer-Linaro reviewed Dec 22, 2017

View reviewed changes

joseppc reviewed Dec 22, 2017

View reviewed changes

Bill-Fischofer-Linaro suggested changes Dec 22, 2017

View reviewed changes

MykytaI added 4 commits January 2, 2018 11:52

linux-gen: add memory-mapped I/O access API

761e44f

Signed-off-by: Mykyta Iziumtsev <mykyta.iziumtsev@linaro.org>

linux-gen: extend ethtool helper functions

fcbc819

Add helper function to query RX/TX queue lengths on network interface to facilitate upcoming native and mediated device drivers. Signed-off-by: Mykyta Iziumtsev <mykyta.iziumtsev@linaro.org>

linux-gen: move common macros to odp_internal.h

756dc7c

Gather macros needed for upcoming mediated devices in one location. Fix signed vs. unsigned comparisons caused by incorrect MIN / MAX usage. Signed-off-by: Mykyta Iziumtsev <mykyta.iziumtsev@linaro.org>

MykytaI force-pushed the caterpillar_mdev_auxiliary branch from 0d30853 to 756dc7c Compare January 2, 2018 10:53

muvarov changed the title ~~[PATCH CATERPILLAR v4] Caterpillar mdev auxiliary~~ [PATCH CATERPILLAR v5] Caterpillar mdev auxiliary Jan 2, 2018

muvarov added Email_sent and removed Email_sent labels Jan 2, 2018

Bill-Fischofer-Linaro approved these changes Jan 3, 2018

View reviewed changes

heyi-arm closed this Jan 4, 2018

MykytaI deleted the caterpillar_mdev_auxiliary branch January 4, 2018 14:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[PATCH CATERPILLAR v5] Caterpillar mdev auxiliary #359

[PATCH CATERPILLAR v5] Caterpillar mdev auxiliary #359

MykytaI commented Dec 21, 2017

MykytaI commented Dec 22, 2017

Bill-Fischofer-Linaro Dec 21, 2017

MykytaI Dec 22, 2017

Bill-Fischofer-Linaro Dec 21, 2017

MykytaI Dec 22, 2017

Bill-Fischofer-Linaro Dec 21, 2017

MykytaI Dec 22, 2017

Bill-Fischofer-Linaro Dec 21, 2017

MykytaI Dec 22, 2017

Bill-Fischofer-Linaro Dec 21, 2017

MykytaI Dec 22, 2017

Bill-Fischofer-Linaro Dec 22, 2017

Bill-Fischofer-Linaro Dec 21, 2017

MykytaI Dec 22, 2017

Bill-Fischofer-Linaro Dec 21, 2017

MykytaI Dec 22, 2017

Bill-Fischofer-Linaro Dec 22, 2017

joseppc Dec 22, 2017

MykytaI Jan 2, 2018 •

edited

Bill-Fischofer-Linaro Dec 22, 2017

Bill-Fischofer-Linaro Dec 22, 2017

MykytaI commented Jan 2, 2018

Bill-Fischofer-Linaro commented Jan 3, 2018

heyi-arm commented Jan 4, 2018

[PATCH CATERPILLAR v5] Caterpillar mdev auxiliary #359

[PATCH CATERPILLAR v5] Caterpillar mdev auxiliary #359

Conversation

MykytaI commented Dec 21, 2017

MykytaI commented Dec 22, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

MykytaI Jan 2, 2018 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

MykytaI commented Jan 2, 2018

Bill-Fischofer-Linaro commented Jan 3, 2018

heyi-arm commented Jan 4, 2018

MykytaI Jan 2, 2018 •

edited