[PATCH API-NEXT v3] api: pktio: clarify timeout and stop interaction #387

Bill-Fischofer-Linaro · 2018-01-10T21:43:48Z

Clarify how odp_pktin_recv_tmo() and odp_pktin_recv_mq_tmo()
interact with odp_pktio_stop(). In particular, specify that a stop
operation will terminate indefinite receive waits.

Signed-off-by: Bill Fischofer bill.fischofer@linaro.org

codecov · 2018-01-10T22:48:31Z

Codecov Report

Merging #387 into api-next will decrease coverage by 0.011%.
The diff coverage is 75%.

@@              Coverage Diff               @@
##           api-next      #387       +/-   ##
==============================================
- Coverage      78.7%   78.688%   -0.012%     
==============================================
  Files           196       196               
  Lines         34635     34639        +4     
==============================================
- Hits          27258     27257        -1     
- Misses         7377      7382        +5

Impacted Files	Coverage Δ
platform/linux-generic/odp_packet_io.c	`78.87% <75%> (+0.209%)`	⬆️
platform/linux-generic/pktio/pcap.c	`81.675% <0%> (-1.048%)`	⬇️
platform/linux-generic/pktio/ipc.c	`78.323% <0%> (-0.29%)`	⬇️
platform/linux-generic/pktio/netmap.c	`80.952% <0%> (-0.227%)`	⬇️
test/performance/odp_pktio_ordered.c	`71.341% <0%> (-0.204%)`	⬇️
platform/linux-generic/pktio/dpdk.c	`50.355% <0%> (-0.143%)`	⬇️

psavol · 2018-01-12T08:56:20Z

include/odp/api/spec/packet_io.h

@@ -884,14 +887,19 @@ int odp_pktin_recv(odp_pktin_queue_t queue, odp_packet_t packets[], int num);
 * @param      num        Maximum number of packets to receive
 * @param      wait       Wait time specified as as follows:
 *                        * ODP_PKTIN_NO_WAIT: Do not wait
- *                        * ODP_PKTIN_WAIT:    Wait infinitely
+ *                        * ODP_PKTIN_WAIT:    Wait indefinitely. Returns


The problem is caused by infinite wait option. I'd just deprecate the option. Implementation gets problematic when this call would need to monitor three different things: packet arrival, timeout and user calling stop. E.g. socket based implementation of this use select() which monitors packet and timeout, but not user input. If this change would be made, select could not sleep long but keep polling potential user call to stop (which normally would not happen almost ever).

So, it's better to leave stop synchronization to application control. It sets the timeout such that it can react fast enough to potential interface shutdown calls, but long enough to save energy. E.g. application can decide to use 1sec interface for shutdown polling, but implementation would need to poll more often e.g. every 10-100ms.

The application can always compensate for an implementation's limitations, but why should it have to? The ODP philosophy is for applications to state their functional needs and let the implementation provide those functional needs in a platform-optimal manner. Having an indefinite wait is simpler from an application standpoint and avoids needless overhead. If I have dozens of threads that specify ODP_PKTIN_WAIT the implementation is free to consolidate any internal timer management to amortize costs across all of those threads. If each is managing its own timers that's not possible.

I have no problem with having an explicit timeout as a wait option, but I'd argue that if we had to deprecate one it would be the variable timeouts since a well-structured application shouldn't have threads that are trying to do six different things that it's constantly flipping between. The provision of timeouts in general is to accommodate older apps moving to ODP that aren't as well structured to a world where threads are cheap and can be used "wastefully".

In any event, the ODP_PKTIN_WAIT feature has been around for some time and deprecating it would conceivably impact existing applications in significant ways, so I'd be reluctant to make such changes without careful consideration. But there is an ambiguity surrounding the intersection of this feature with odp_pktio_stop() behavior that this PR looks to clarify.

@bala-manoharan @bogdanPricope Can you comment on how ODP_PKTIN_WAIT is currently being used, to your knowledge?

All three possibilities for setting ‚wait’ parameter are useful in different application designs: ODP_PKTIN_NO_WAIT, ODP_PKTIN_WAIT, finite wait.... mainly because each option has its pros/cons and depending on your application profile and expected traffic you need to select one of them for best performance.

ODP_PKTIN_NO_WAIT
Pros: best performance on high core load with equally spread traffic
Cons: wildly looping when there is no traffic: considering direct RX mode with RSS, we can imagine a scenario where all traffic goes to a single core while the rest of cores are only looping

ODP_PKTIN_WAIT
Pros: no wild loop
Cons: At some point you need to restart you applications and you need to shutdown resources gracefully (else, in some cases initialization will fail and you will need to reboot host - bigger down time). You have no way to gracefully interrupt this function.

finite wait:
Pros: no wild loop, no graceful shutdown issue
Cons: You need to arm/disable a timer (or similar) for each loop. This is especially painful on high load.

Having another way to stop an infinite/finite wait, on request (odp_pktio_stop()) or error (someone tripped the cable in the lab, NIC failure, etc.) is very useful.

The select() example is not very good ‘con’ as select triggers on socket end-of-file (and errors); also it triggers on signals (I wonder why).

See man select:
“in particular, a file descriptor is also ready on end-of-file”
“EINTR A signal was caught; see signal(7).”

Main question is if NXP and Cavium can implement this functionality.

Else, I am OK with this.

Reviewed-by: Bogdan Pricope bogdan.pricope@linaro.org

The biggest CON of PKTIN_WAIT is that implementation needs to build mechanism (polling or signal) to catch a stop call which happens almost never (e.g. once a week, or about once every 1000 billion calls). If implementation e.g. polls a stop flag, etc - what's acceptable period of polling the flag (e.g. 1/10msec, ... ?). The more often implementation polls, the more overhead is created and the less power is saved. Application of the other hand can easily setup the correct poll period (e.g. 1/sec) as the timeout. Implementation can easily pool at too high rate (e.g. 100x more often than necessary).

ODP implementation should not use signals internally, but leave those to application use. Application signal handler should see all signals application has created and nothing else (ODP shoud not create signals).

Let's just deprecate PKTIN_WAIT option. Application can easily use a loop of long timeout instead. E.g. a wake up per few minutes would not affect performance, or power saving. During high load implementation would not setup timeout (or cancel it), as there would be packets to return instead. Timeout is set only when there's no packets == no high load.

The reason I changed the return from odp_pktin_recv_tmo(..., ODP_PKTIN_WAIT) to <0 when odp_pktio_stop() is called is to avoid the issue @psavol raises. You'd need to check for <0 returns anyway since these calls may fail for various reasons, so there's no additional work involved. The implementation of this behavior (in linux-generic, at least) is also pretty trivial.

How an implementation implements ODP_PKTIN_WAIT is up to it. The fact that linux-generic currently does a poll under the covers is not important. I'd expect Caterpillar to do an actual wait of some sort and have a low-overhead signaling mechanism to get packets when they arrive, or at least some sort of adaptive wait. Again, this is something applications should not need to worry about. If the application really wants to control the timeout it's free to do so with the existing API, so nothing changes here, but for those that don't care this is simpler from an application standpoint and equally easy to implement.

I should point out we have the same ambiguity today when odp_pktio_send() is called on a stopped pktio. The API doesn't specify what happens here but in linux-generic we'll try to send the packet, which will either be transmitted (thus violating the intent of the "stop" call) or else be abandoned on a transmit queue, which makes for difficult graceful termination. Here again, returning <0 if the pktio is in any other state except START makes sense since that's the only state in which we want to be accepting new packets for transmission.

It's not about application worrying about implementation. It's about these facts:

implementations have more work (more overhead) to monitor three inputs (packet, timeout and stop call) vs two inputs (packet and timeout).

any implementation (including caterpillar) should avoid internal usage of posix signals as application may have it's own signal handler.

application can easily monitor/synchronize it's own threads during interface shutdown by using pktin timeout and other ODP APIs

Stop call to recv synchronization would be needed rarely, but it would causes (potentially) major synchronization effort for implementation. In general, strict synchronization requirements between different API calls should be avoided as those increase implementation complexity and decrease parallelism/performance.

After having to implement such behavior in a hardware platform I was working on I have to side with @psavol. Such synchronization requires odp_pktio_stop() or odp_pktout_send() or odp_queue_enq() to update pktio state in appropriate queue, otherwise each queue would have to de-reference a pointer to pktio to read its state on every send. This incurs a penalty if the pktio structure is otherwise not needed in the fast path (i.e. no calls to odp_packet_input()). It's one thing to check the state once at the start of the send function, completely different if we need to check it every time (and it has to be a volatile check).
Also I'm not entirely convinced by the argument that it is easy to implement *at least in linux-generic`. My understanding was that ODP was an API for dataplane programming which means a low level access to the hardware with focus on offloading functionality to the hardware. Therefore the reasoning that it is easy to implement a feature in a completely abstract software environment works against the hardware offloading. Wouldn't it be better to always use an argument: it is not-hard / easy to implement in hardware, i.e. on vendor X, HW Y it can be implemented like {this}?

I would appreciate if we could deprecate infinite wait option(ODP_PKTIN_WAIT). It is most efficient to remove this feature of infinite wait rather than implement this using any other mechanism explained in this thread.

Bill-Fischofer-Linaro · 2018-01-15T14:40:23Z

Changes for v2:

Change return for recv_tmo() calls terminated by odp_pktio_stop() to <0 from 0.
Implement this behavior in linux-generic.

Clarify how odp_pktin_recv_tmo() and odp_pktin_recv_mq_tmo() interact with odp_pktio_stop(). In particular, specify that a stop operation will terminate indefinite receive waits. Signed-off-by: Bill Fischofer <bill.fischofer@linaro.org>

Fail send and recv calls if the pktio is not in the STARTED state. This allows pending recv_tmo() calls to be terminated by odp_pktio_stop(). Signed-off-by: Bill Fischofer <bill.fischofer@linaro.org>

Bill-Fischofer-Linaro · 2018-01-15T22:03:54Z

Changes for v3

Rebased to resolve forced update of api-next

muvarov · 2018-01-19T08:47:23Z

platform/linux-generic/odp_packet_io.c

@@ -1669,6 +1669,9 @@ int odp_pktin_recv(odp_pktin_queue_t queue, odp_packet_t packets[], int num)
 		return -1;
 	}

+	if (entry->s.state != PKTIO_STATE_STARTED)
+		return -1;


If state was changed in the middle of driver recv() this place is not quite correct. Entry is not locked and other thread can change the state while driver will sill "read" packets. Because of recv() is very short I think this can be ok here. odp_unlikely() might make some reason here.

This is line may not be polled any more as actual waiting may happen inside driver (search for select()).

Bill-Fischofer-Linaro · 2018-01-21T18:26:25Z

Assuming its approval, this PR is superseded by PR #410

muvarov changed the title ~~api: pktio: clarify timeout and stop interaction~~ [PATCH API-NEXT v1] api: pktio: clarify timeout and stop interaction Jan 10, 2018

muvarov added Email_sent labels Jan 10, 2018

psavol reviewed Jan 12, 2018

View reviewed changes

Bill-Fischofer-Linaro force-pushed the pktio-docs branch from 39df689 to e72188c Compare January 15, 2018 14:38

muvarov changed the title ~~[PATCH API-NEXT v1] api: pktio: clarify timeout and stop interaction~~ [PATCH API-NEXT v2] api: pktio: clarify timeout and stop interaction Jan 15, 2018

muvarov removed Email_sent labels Jan 15, 2018

muvarov added checkpatch labels Jan 15, 2018

muvarov force-pushed the api-next branch from 6303c7d to 634b380 Compare January 15, 2018 18:33

Bill-Fischofer-Linaro added 2 commits January 15, 2018 16:02

api: pktio: clarify timeout and stop interaction

7cca868

Clarify how odp_pktin_recv_tmo() and odp_pktin_recv_mq_tmo() interact with odp_pktio_stop(). In particular, specify that a stop operation will terminate indefinite receive waits. Signed-off-by: Bill Fischofer <bill.fischofer@linaro.org>

linux-generic: pktio: fail send/recv calls if pktio is not started

0db2676

Fail send and recv calls if the pktio is not in the STARTED state. This allows pending recv_tmo() calls to be terminated by odp_pktio_stop(). Signed-off-by: Bill Fischofer <bill.fischofer@linaro.org>

Bill-Fischofer-Linaro force-pushed the pktio-docs branch from e72188c to 0db2676 Compare January 15, 2018 22:03

muvarov changed the title ~~[PATCH API-NEXT v2] api: pktio: clarify timeout and stop interaction~~ [PATCH API-NEXT v3] api: pktio: clarify timeout and stop interaction Jan 15, 2018

muvarov removed Email_sent labels Jan 15, 2018

muvarov added Email_sent API API change labels Jan 15, 2018

muvarov reviewed Jan 19, 2018

View reviewed changes

Bill-Fischofer-Linaro closed this Jan 26, 2018

Bill-Fischofer-Linaro deleted the pktio-docs branch January 26, 2018 23:23

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[PATCH API-NEXT v3] api: pktio: clarify timeout and stop interaction #387

[PATCH API-NEXT v3] api: pktio: clarify timeout and stop interaction #387

Bill-Fischofer-Linaro commented Jan 10, 2018

codecov bot commented Jan 10, 2018 •

edited

psavol Jan 12, 2018

Bill-Fischofer-Linaro Jan 12, 2018

Bill-Fischofer-Linaro Jan 14, 2018

bogdanPricope Jan 16, 2018

psavol Jan 16, 2018

Bill-Fischofer-Linaro Jan 16, 2018

psavol Jan 17, 2018

semihalf-kardach-stanislaw Jan 17, 2018

bala-manoharan Jan 19, 2018

Bill-Fischofer-Linaro commented Jan 15, 2018

Bill-Fischofer-Linaro commented Jan 15, 2018

muvarov Jan 19, 2018

psavol Jan 19, 2018

Bill-Fischofer-Linaro commented Jan 21, 2018

[PATCH API-NEXT v3] api: pktio: clarify timeout and stop interaction #387

[PATCH API-NEXT v3] api: pktio: clarify timeout and stop interaction #387

Conversation

Bill-Fischofer-Linaro commented Jan 10, 2018

codecov bot commented Jan 10, 2018 • edited

Codecov Report

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Bill-Fischofer-Linaro commented Jan 15, 2018

Bill-Fischofer-Linaro commented Jan 15, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Bill-Fischofer-Linaro commented Jan 21, 2018

codecov bot commented Jan 10, 2018 •

edited