Merge pull request #1 from ecree-solarflare/copyedit
XDP: editing and grammar fixes by Edward Cree
netoptimizer committed Sep 22, 2016
2 parents a4e60e2 + 5c2c801 commit fb6a3de
Showing 13 changed files with 178 additions and 176 deletions.
72 changes: 36 additions & 36 deletions kernel/Documentation/networking/XDP/design/design.rst
@@ -9,21 +9,22 @@ XDP is designed for programmability.

 Users want programmability as close as possible to the device
 hardware, to reap the performance gains, but they also want
-portability. The purpose of XDP is making such programs portable
-across multiple devices and vendors. (It is even imagined that XDP
-programs, should be able to run in userspace, either for simulation
-purposes or combined with other raw packet data-plane frameworks like
-netmap or DPDK).
-
-It is expected that, some HW vendors, take steps towards offloading
-XDP programs into their hardware. It is fine they compete on this to
-sell more hardware. It is no different from producing the fastest
-chip. XDP encourage innovation, also for new HW features. But when
-extending XDP programs with a new hardware features (e.g. which only a
-single vendor supports), then this must be expressed towards the XDP
-API as a capability or features (see section `Capabilities
-negotiation`_). This functions as a common capabilities API that
-vendors can choose implement (based on customer demand).
+portability. The purpose of XDP is making such programs portable
+across multiple devices and vendors.
+
+(It is even imagined that XDP programs should be able to run in
+user space, either for simulation purposes or combined with other raw
+packet data-plane frameworks like netmap or DPDK).
+
+It is expected that some HW vendors will take steps towards offloading
+XDP programs into their hardware. It is fine if they compete on this
+to sell more hardware. It is no different from producing the fastest
+chip. XDP also encourages innovation for new HW features, but when
+extending XDP programs with a new hardware feature (e.g. which only a
+single vendor supports), this must be expressed within the XDP API as
+a capability or feature (see section `Capabilities negotiation`_).
+This functions as a common capabilities API from which vendors can
+choose what to implement (based on customer demand).

 .. _ref_prog_negotiation:

@@ -32,40 +33,39 @@ Capabilities negotiation

 .. Warning:: This interface is missing in the implementation

-XDP have hooks and feature dependencies in the device drivers.
-Planning for extentability, not all device driver may support all the
-future features of XDP, and new feature adaptation in device driver
-will occur at different developement rates.
+XDP has hooks and feature dependencies in the device drivers.
+Planning for extendability, not all device drivers will necessarily
+support all of the future features of XDP, and new feature adoption
+in device drivers will occur at different development rates.

-Thus, there is a need for the device driver to express what XDP
-capabilities or features it provides.
+Thus, there is a need for the device driver to express what XDP
+capabilities or features it provides.

 When attaching/loading an XDP program into the kernel, a feature or
-capabilities negotiation should be conducted. This implies an XDP
-program need to express what features it want to use.
+capabilities negotiation should be conducted. This implies that an
+XDP program needs to express what features it wants to use.

-Loading an XDP program requesting features that the given device
-drivers does not support, should simply result in rejecting loading
-the program.
+If an XDP program being loaded requests features that the given device
+driver does not support, the program load should simply be rejected.

-.. note:: I'm undecided on whether to have an query interface?
-Because users can just use the regular load-interface to probe for
-supported options. The down-side of probing is the issues SElinux
-runs into, of false-alarms, when glibc tries to probe after
+.. note:: I'm undecided on whether to have an query interface, because
+users could just use the regular load-interface to probe for
+supported options. The downside of probing is the issues SElinux
+runs into, of false alarms, when glibc tries to probe for
 capabilities.


 Implementation issue
 --------------------

-The current implementation is missing this interface. Worse the two
-actions :ref:`XDP_DROP` and :ref:`XDP_TX` should have been express as
-two different capabilities, because XDP_TX requires more changes to
+The current implementation is missing this interface. Worse, the two
+actions :ref:`XDP_DROP` and :ref:`XDP_TX` should have been expressed
+as two different capabilities, because XDP_TX requires more changes to
 the device driver than a simple drop like XDP_DROP.

-One can (easily) imagine that an older driver only want to implement
-the XDP_DROP facility. The reason is that XDP_TX would requires
-changing too much driver code, which is a concern for an old stable
+One can (easily) imagine that an older driver only wants to implement
+the XDP_DROP facility. The reason is that XDP_TX would require
+changing too much driver code, which is a concern for an old, stable
 and time-proven driver.

 Data plane split
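
Editorial illustration (not part of the commit): the capabilities negotiation described in design.rst could be sketched as a bitmask check at program load time. The text itself notes that this interface is missing from the implementation, so every name below (`xdp_cap`, `XDP_CAP_DROP`, `XDP_CAP_TX`, `xdp_negotiate`) is a hypothetical assumption, not a kernel API. The sketch treats XDP_DROP and XDP_TX as separate capabilities, as the text argues they should be.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical capability bits; illustrative names only, not kernel API. */
enum xdp_cap {
    XDP_CAP_DROP = 1u << 0, /* driver can honour the XDP_DROP action */
    XDP_CAP_TX   = 1u << 1  /* driver can honour the XDP_TX action */
};

/*
 * Compare the features a program asks for against what the driver
 * advertises. Returns 0 when every requested feature is supported,
 * and -EOPNOTSUPP when at least one is missing, so that the program
 * load can simply be rejected.
 */
int xdp_negotiate(uint32_t driver_caps, uint32_t prog_features)
{
    uint32_t missing = prog_features & ~driver_caps;

    return missing ? -EOPNOTSUPP : 0;
}
```

Under these assumptions, an older driver that advertises only `XDP_CAP_DROP` would accept a drop-only program and reject one that also requests `XDP_CAP_TX`.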
69 changes: 34 additions & 35 deletions kernel/Documentation/networking/XDP/design/requirements.rst
@@ -5,64 +5,63 @@ Requirements
 Driver RX hook
 ==============

-Access to packet-data payload before allocating any meta-data
-structures, like SKBs. It is key to performance, allowing processing
-RX "packet-pages" directly out of drivers RX ring queue.
+Gives us access to packet-data payload before allocating any meta-data
+structures, like SKBs. This is key to performance, as it allows
+processing RX "packet-pages" directly out of the driver's RX ring
+queue.


 Early drop
 ==========

 Early drop is key for the DoS (Denial of Service) mitigation use-cases.
-It builds upon a principal of spending/investing as few CPU cycles as
+It builds upon a principle of spending/investing as few CPU cycles as
 possible on a packet that will get dropped anyhow.

-Doing this "inline" before delivery to the normal network stack, have
-the advantage that packets that does need delivery to the normal
-network stack, can still get all the features and benefits as before.
-(No need to deploy a bypass facility, just to reinject "good" packets
-into the stack again).
+Doing this "inline", before delivery to the normal network stack, has
+the advantage that a packet that *does* need delivery to the normal
+network stack can still get all the features and benefits as before;
+there is thus no need to deploy a bypass facility merely to re-inject
+"good" packets into the stack again.


-Write access to packet-data
+Write access to packet data
 ===========================

-Need the ability to modify packet-data. This is unfortunately often
-difficult to obtain, as it requires fundamental changes to the drivers
-memory model.
+XDP needs the ability to modify packet data. This is unfortunately
+often difficult to obtain, as it requires fundamental changes to the
+driver's memory model.

-Unfortunately most driver don't have "writable" packet-data as
-default. The packet-data in drivers is often not writable, because
-the drivers likely have choosen to work-arounds performance
-bottlenecks in both the page-allocator and DMA APIs, which side-effect
-is read-only packet-pages.
+Unfortunately most drivers don't have "writable" packet data as
+default. This is due to a workaround for performance bottlenecks in
+both the page-allocator and DMA APIs, which has the side-effect of
+necessitating read-only packet pages.

-Instead most drivers, allocate both a SKB and a writable memory
-buffer, which the packet headers are copied into (and for placing
-``skb_shared_info``). Afterwards the SKB (and ``skb_shared_info``) is
-adjusted to point into the remaining payload (pointing past the
-headers just copied).
+Instead, most drivers (currently) allocate both a SKB and a writable
+memory buffer, in which to copy ("linearise") the packet headers, and
+also store ``skb_shared_info``. Then the remaining payload (pointing
+past the headers just copied) is attached as (read-only) paged data.


 Header push and pop
 ===================

 The ability to push (add) or pop (remove) packet headers indirectly
-depend on write acces to packet-data. (One could argue that a pure
-pop, could be implemented by only adjusting the payload offset, thus
-no write-access).
+depends on write access to packet-data. (One could argue that a pure
+pop could be implemented by only adjusting the payload offset, thus
+not needing write access).

 This requirement goes hand-in-hand with tunnel encapsulation or
-decapsulation. It is also relevant for e.g adding a VLAN head as
-needed by the :doc:`../use-cases/xdp_use_case_ddos_scrubber` in-order
-to workaround the :ref:`XDP_TX` single NIC limitation.
+decapsulation. It is also relevant for e.g adding a VLAN header, as
+needed by the :doc:`../use-cases/xdp_use_case_ddos_scrubber` in order
+to work around the :ref:`XDP_TX` single NIC limitation.

 This requirement implies the ability to adjust the packet-data start
 offset/pointer and packet length. This requires additional data to be
-returned
+returned.

-This also have implication for how much headroom drivers should
-reserve.
+This also has implications for how much headroom drivers should
+reserve in the SKB.


 Page per packet
@@ -73,9 +72,9 @@ Page per packet
 Packet forwarding
 =================

-Implementing a router/forwarding data plane is DPDK prime example for
-demonstrating superior performance. For the shear ability to compare
-against DPDK, XDP also need a forwarding capability.
+Implementing a router/forwarding data plane is DPDK's prime example
+for demonstrating superior performance. For the sheer ability to
+compare against DPDK, XDP also needs a forwarding capability.


 RX bulking
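
Editorial illustration (not part of the commit): the "Header push and pop" requirement in requirements.rst says that push/pop implies adjusting the packet-data start pointer and the packet length, and that a pure pop needs no write access. A minimal sketch of that idea follows, assuming a simplified packet view. `struct pkt`, `pkt_push` and `pkt_pop` are hypothetical names chosen for this example and do not correspond to any kernel structure or helper.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packet view; illustrative only, not a kernel structure. */
struct pkt {
    uint8_t *buf_start; /* start of the buffer, including headroom */
    uint8_t *data;      /* current start of packet data */
    size_t   len;       /* current packet length in bytes */
};

/*
 * Push: grow the packet into the headroom and copy the new header in.
 * This writes packet memory, so it depends on writable packet data and
 * on the driver having reserved enough headroom.
 */
int pkt_push(struct pkt *p, const void *hdr, size_t hdr_len)
{
    if ((size_t)(p->data - p->buf_start) < hdr_len)
        return -1; /* not enough headroom reserved */
    p->data -= hdr_len;
    p->len += hdr_len;
    memcpy(p->data, hdr, hdr_len);
    return 0;
}

/*
 * Pop: only the start offset and length change; no packet bytes are
 * written, which is why a pure pop does not need write access.
 */
int pkt_pop(struct pkt *p, size_t hdr_len)
{
    if (p->len < hdr_len)
        return -1; /* cannot remove more than the packet holds */
    p->data += hdr_len;
    p->len -= hdr_len;
    return 0;
}
```

In this sketch the push fails once the headroom is exhausted, which is one way to see why the amount of headroom a driver reserves matters for encapsulation or VLAN tagging.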
16 changes: 8 additions & 8 deletions kernel/Documentation/networking/XDP/disclaimer.rst
@@ -10,18 +10,18 @@ Important to understand
 It is important to understand that the XDP speed gains comes at a cost
 of loss of generalization and fairness.

-XDP does not provide fairness. There is not buffering (qdisc) layer
-that absorbs traffic bursts when the TX device is too slow, packets
-will simply be dropped. Don't use XDP in situations where the RX
-device is faster than the TX device, as there is not back-pressure to
-save packet from being dropped. There is no qdisc layer or BQL (Byte
+XDP does not provide fairness. There is no buffering (qdisc) layer to
+absorb traffic bursts when the TX device is too slow, packets will
+simply be dropped. Don't use XDP in situations where the RX device
+is faster than the TX device, as there is no back-pressure to save
+the packet from being dropped. There is no qdisc layer or BQL (Byte
 Queue Limit) to save your from introducing massive bufferbloat.

 Using XDP is about specialization. Crafting a solution towards a very
 specialized purpose, that will require selecting and dimensioning the
-appropriate hardware. Using XDP it requires understanding the dangers
-and pitfalls, that comes from bypassing large parts of the kernel
-network stack code base, which is there for good reasons.
+appropriate hardware. Using XDP requires understanding the dangers and
+pitfalls, that come from bypassing large parts of the kernel network
+stack code base, which is there for good reasons.

 That said, XDP can be the right solution for some use-cases, and can
 yield huge (orders of magnitude) performance improvements, by allowing
Original file line number Diff line number Diff line change
@@ -2,9 +2,9 @@
 Drivers
 =======

-XDP have a dependency on drivers implementing the RX hook and setup
-API. Adding driver support is fairly easy, unless it requires
-changing the drivers memory model (which is often the case).
+XDP depends on drivers implementing the RX hook and set-up API.
+Adding driver support is fairly easy, unless it requires changing the
+driver's memory model (which is often the case).


 Mellanox: mlx4
Original file line number Diff line number Diff line change
@@ -22,7 +22,7 @@ See: :ref:`ref_prog_negotiation`
 Missing: XDP program per RX queue
 =================================

-Changes to the userspace API is needed to add this feature.
+Changes to the user space API are needed to add this feature.

 Missing: Cache prefetching
 ==========================
Original file line number Diff line number Diff line change
@@ -4,13 +4,13 @@ Userspace API

 .. Warning::

-The userspace API specification should have to be defined properly
-before code was accepted upstream. Concerns have been raise about
+The userspace API specification should have been defined properly
+before code was accepted upstream. Concerns have been raised about
 the current API upstream. Users should expect this first API
-attempt will need adjustments. This cannot be considered a stable
+attempt will need adjustments; this cannot be considered a stable
 API yet.

-Most importantly is the missing capabilities negotiation,
+Most importantly, capabilities negotiation is missing;
 see :ref:`ref_prog_negotiation`.


@@ -32,32 +32,32 @@ Struct xdp_prog
 ---------------

 Currently (4.8-rc6) the XDP program is simply a bpf_prog pointer.
-While it is good for simplicity, it is limiting extendability for
+While this is good for simplicity, it limits extendability for
 upcoming features.

-Introducing a new ``struct xdp_prog``, that can carry information
-related to the XDP program. Notice this approach does not affect
-performance (tested and benchmarked), because the extra dereference
-for the eBPF program only happens once per 64 packets in the poll
-function.
+Maybe we should introduce a new ``struct xdp_prog`` that can carry
+information related to the XDP program. Notice this approach does
+not affect performance (tested and benchmarked), because the extra
+dereference for the eBPF program only happens once per 64 packets in
+the poll function.

-The features that need this is:
+The features that need this are:

 * Multi-port TX:
 Need to know own port index and port lookup table.

 * XDP program per RX queue:
 Need setup info about program type, global or specific, due to
-replace semantics.
+program-replacement semantics.

 * Capabilities negotiation:
-Need to store information about features program want to use,
-in-order to validate this.
+Need to store information about features program wants to use,
+in order to validate this.

 .. TODO:: How kernel devel works: This new ``struct xdp_prog``
-features cannot go into the kernel before one of the three users of
-the struct is also implemented. (Note, Jesper have implemented this
-struct change and have even benchmarked that it does not hurt
+feature cannot go into the kernel before one of the three users of
+the struct is also implemented. (Note, Jesper has implemented this
+struct change and has even benchmarked that it does not hurt
 performance).


@@ -67,14 +67,14 @@ Troubleshooting and Monitoring
 ==============================

 Users need the ability to both monitor and troubleshoot an XDP
-program. Partigular in case of error events like :ref:`XDP_ABORTED`,
-and in case a XDP programs starts to return invalid and unsupported
-action code (caught by the :ref:`action fall-through`).
+program; particularly so in case of error events like :ref:`XDP_ABORTED`,
+and in case an XDP program starts to return invalid and unsupported
+action codes (caught by the :ref:`action fall-through`).

 .. Warning::

 The current (4.8-rc6) implementation is not optimal in this area.
-In case of the :ref:`action fall-through` packets is dropped and a
+In the :ref:`action fall-through` case, the packet is dropped and a
 warning is generated **only once** about the invalid XDP program
 action code, by calling: bpf_warn_invalid_xdp_action(action_code);

@@ -84,19 +84,19 @@ Two options are on the table currently:

 * Counters.

-Simply add counters to track these events. This allow admins and
-monitor tools to catch and count these events. This does requires
+Simply add counters to track these events. This allows admins and
+monitoring tools to catch and count these events. This does require
 standardizing these counters to help monitor tools.

 * Tracepoints.

-Another option is adding tracepoint to these situations. It is much
-more flexible than counters. The downside is that these error
+Another option is adding tracepoints to these situations. These are
+much more flexible than counters. The downside is that these error
 events might never be caught, if the tracepoint isn't active.

-An important design consideration is the monitor facility must not be
-too expensive to execute, even-though events like :ref:`XDP_ABORTED`
-and :ref:`action fall-through` should be very rare events. This is
+An important design consideration is that the monitor facility must
+not be too expensive to execute, even though events like :ref:`XDP_ABORTED`
+and :ref:`action fall-through` should normally be very rare. This is
 because an external attacker (given the DDoS uses-cases) might find a
 way to trigger these events, which would then serve as an attack
 vector against XDP.
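
Editorial illustration (not part of the commit): the "Counters" option from the Troubleshooting and Monitoring section could look roughly like the sketch below. It counts aborted and invalid-action events with a single increment each, so the monitoring cost stays small even if an attacker triggers these events often. The names `xdp_err_stats`, `xdp_account_action` and the `ACT_*` codes are hypothetical and exist only for this example. Treating both error cases as a drop is an assumption of the sketch.

```c
#include <stdint.h>

/* Hypothetical action codes for this sketch; not the kernel's enum. */
enum { ACT_ABORTED = 0, ACT_DROP = 1, ACT_PASS = 2, ACT_TX = 3, ACT_MAX = 4 };

/* Hypothetical error counters that a monitoring tool could read. */
struct xdp_err_stats {
    uint64_t aborted;        /* program returned the aborted action */
    uint64_t invalid_action; /* program returned an unknown action code */
};

/*
 * Account for error events and return the action to apply.
 * Unknown codes take the fall-through path and are dropped; every
 * occurrence is counted, rather than warned about only once.
 */
unsigned int xdp_account_action(struct xdp_err_stats *st, unsigned int act)
{
    if (act >= ACT_MAX) {
        st->invalid_action++;
        return ACT_DROP;
    }
    if (act == ACT_ABORTED) {
        st->aborted++;
        return ACT_DROP;
    }
    return act;
}
```

A real implementation would presumably keep such counters per CPU or per RX queue to avoid cache-line contention; that detail is left out here.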
