Support for Amazon ENA driver #264
Indeed, it would be great, but we don't have the ENA hardware to play with, nor ENA specifications. |
Hi! |
Hi,
In general you should take one patched driver as a reference, and see
what are the required functionalities. My suggestion is to use "e1000" as a
reference (code is included in the netmap repository).
The code is usually structured like this: (i) a small patch to the standard
Linux driver (e.g. e1000_main.c in the linux tree); and (ii) a header file
containing the netmap-specific functions (e.g. LINUX/if_e1000_netmap.h).
The steps are the following:
1) Patch the normal driver, see
LINUX/final-patches/vanilla--e1000--31200--99999.
a) Include the header (ii).
b) At the end of the device probe routine, call a netmap-specific
function (e.g. e1000_netmap_attach()) which fills in some NIC info (and
netmap methods) and calls netmap_attach(). At the beginning of the device
remove routine, call netmap_detach().
c) Call netmap_rx_irq where the RX interrupt is handled (typically in the
NAPI poll routine), preventing the driver from accessing the RX ring. Do a
similar thing for the TX interrupt, where you will call netmap_tx_irq.
d) Check where the driver code allocates RX buffers (these can be sk_buffs,
pages, kmalloc buffers, etc., depending on the driver) to put their
addresses in the RX ring(s). You must prevent that from happening and call a
netmap-specific function (e.g. e1000_netmap_init_buffers()), which will
instead put the addresses of netmap buffers in the RX ring(s). The
netmap-specific function is defined in the header.
2) Implement the nm_register method (e.g. e1000_netmap_reg). The general
scheme is
````
// put the interface down (if it was up)
if (onoff) {
nm_set_native_flags(na);
} else {
nm_clear_native_flags(na);
}
// put the interface up again (if it was up)
````
3) Implement the txsync method (e.g. e1000_netmap_txsync). The purpose of
this method is to see what changed in the netmap ring (an abstract
device-independent ring of buffers) w.r.t the last time txsync was called,
and reflect those changes to the real NIC TX ring. Also, see what changed
in the NIC TX ring and reflect it back to the netmap ring. Start by copying the
body of e1000_netmap_txsync.
a) You need to change the body of the loop to fill in the NIC-specific
TX slots using the information stored in the corresponding "netmap_slot"
entries, which are abstract and device-independent. Each netmap_slot corresponds to
a packet to be transmitted.
b) After the loop, notify the NIC about the new packets in the TX ring.
This typically happens writing to a NIC register (e.g. the TDT in case of
e1000).
c) Update kring->nr_hwtail to reflect the index of the next TX packet
still to be processed by the hardware (e.g. the one after the last
processed). In case of e1000, this information is stored in the TDH
register.
4) Implement the rxsync method (e.g. e1000_netmap_rxsync). The purpose of
this method is again to reflect the changes in the netmap ring to the NIC
RX ring and the other way around. Start by copying the body of
e1000_netmap_rxsync.
a) In the first loop you need to scan those slots in the RX ring that
have been used by the NIC to receive a packet (i.e. new packets received).
You need to fill each netmap_slot using the information stored in the
corresponding slot in the NIC RX ring.
b) In the second loop you need to clean those NIC RX slots that have
been "used" by the netmap application (e.g. userspace process) and give
them back to the NIC to be reused for new receive operations. You typically
need to write to a NIC register to notify the NIC that new RX slots are
available (e.g. RDT register in e1000). If netmap buffers changed (e.g.
because of zerocopy swap), you also need to update the address in the NIC
RX slot (see the NS_BUF_CHANGED flag).
5) Implement the initialization function which links netmap buffers to RX
rings (e.g. e1000_netmap_init_buffers). Basically you need to scan the
rings and fill in the NIC RX slots using the address/len info contained in
the netmap RX slots. The same is not usually necessary for TX rings.
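To make the bookkeeping in step (3) concrete, here is a minimal user-space model of a txsync routine. It is only a sketch under stated assumptions, not the e1000 code: every name in it (model_slot, model_tx_ring, model_txsync, reg_tail, reg_head) is hypothetical, the NIC registers are plain struct fields, and netmap and NIC ring indices are assumed to be identical, whereas a real driver converts between them (netmap_idx_n2k() and netmap_idx_k2n()).

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8u            /* hypothetical ring size */
#define SLOT_BUF_CHANGED 0x1u   /* stands in for netmap's NS_BUF_CHANGED */

/* Simplified stand-in for struct netmap_slot. */
struct model_slot {
    uint64_t paddr;   /* physical address of the packet buffer */
    uint16_t len;     /* packet length */
    uint16_t flags;
};

/* Simplified stand-in for a NIC TX descriptor. */
struct model_tx_desc {
    uint64_t addr;
    uint16_t len;
};

/* Netmap-side and NIC-side TX state, merged for brevity. */
struct model_tx_ring {
    struct model_slot slots[RING_SIZE];
    struct model_tx_desc descs[RING_SIZE];
    uint32_t head;      /* first slot the application has not filled yet */
    uint32_t hwcur;     /* first slot not yet handed to the NIC */
    uint32_t hwtail;    /* first slot the application must not fill */
    uint32_t reg_tail;  /* models the NIC tail register (TDT on e1000) */
    uint32_t reg_head;  /* models the NIC head register (TDH on e1000) */
};

static uint32_t ring_next(uint32_t i) { return (i + 1 == RING_SIZE) ? 0 : i + 1; }
static uint32_t ring_prev(uint32_t i) { return (i == 0) ? RING_SIZE - 1 : i - 1; }

/* Returns the number of packets newly handed to the (modelled) NIC. */
static unsigned model_txsync(struct model_tx_ring *r)
{
    unsigned sent = 0;
    uint32_t i = r->hwcur;

    assert(r->head < RING_SIZE && r->hwcur < RING_SIZE);

    /* (3.a) copy each new netmap slot into the matching NIC descriptor */
    while (i != r->head) {
        struct model_slot *slot = &r->slots[i];

        r->descs[i].addr = slot->paddr;
        r->descs[i].len = slot->len;
        slot->flags &= (uint16_t)~SLOT_BUF_CHANGED;
        i = ring_next(i);
        sent++;
    }
    r->hwcur = r->head;

    /* (3.b) notify the NIC by writing the tail register */
    if (sent > 0)
        r->reg_tail = i;

    /* (3.c) reclaim: everything before the NIC head has been transmitted */
    r->hwtail = ring_prev(r->reg_head);

    return sent;
}
```

In this model the application advances head after filling slots, and a later call with an updated reg_head moves hwtail forward so that those slots become available again.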
My suggestion is to start with (1) (ignoring 1.d), then go for (2) and (3).
At this point you should be able to test transmission.
Then you can go ahead with (1.d), (4) and (5). In any case at least provide
stubs for (4), (5) and (1.d)
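For the receive half, steps (4) and (5) can be modelled in user space in the same spirit. Again this is only a sketch: all names (model_rx_ring, model_init_rx_buffers, model_rxsync, the done flag, reg_tail) are hypothetical stand-ins, the "descriptor done" bit is a plain field, and netmap and NIC indices are assumed to coincide.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8u            /* hypothetical ring size */
#define SLOT_BUF_CHANGED 0x1u   /* stands in for netmap's NS_BUF_CHANGED */

struct model_slot {             /* stand-in for struct netmap_slot */
    uint64_t paddr;
    uint16_t len;
    uint16_t flags;
};

struct model_rx_desc {          /* stand-in for a NIC RX descriptor */
    uint64_t addr;
    uint16_t len;
    uint8_t done;               /* set by the NIC when a packet has landed */
};

struct model_rx_ring {
    struct model_slot slots[RING_SIZE];
    struct model_rx_desc descs[RING_SIZE];
    uint32_t head;          /* first slot the application has not released */
    uint32_t hwcur;         /* first slot not yet given back to the NIC */
    uint32_t hwtail;        /* first slot without a received packet */
    uint32_t next_to_clean; /* next NIC descriptor to inspect */
    uint32_t reg_tail;      /* models the NIC tail register (RDT on e1000) */
};

static uint32_t ring_next(uint32_t i) { return (i + 1 == RING_SIZE) ? 0 : i + 1; }
static uint32_t ring_prev(uint32_t i) { return (i == 0) ? RING_SIZE - 1 : i - 1; }

/* Step (5): point every NIC RX descriptor at a netmap buffer. */
static void model_init_rx_buffers(struct model_rx_ring *r)
{
    for (uint32_t i = 0; i < RING_SIZE; i++) {
        r->descs[i].addr = r->slots[i].paddr;
        r->descs[i].len = 0;
        r->descs[i].done = 0;
    }
    r->next_to_clean = 0;
    r->reg_tail = RING_SIZE - 1;    /* one slot is always left unused */
}

/* Step (4): returns the number of newly received packets. */
static unsigned model_rxsync(struct model_rx_ring *r)
{
    unsigned received = 0;
    uint32_t i = r->next_to_clean;

    assert(r->head < RING_SIZE && r->hwcur < RING_SIZE);

    /* (4.a) import packets the NIC has completed since the last call */
    while (received < RING_SIZE - 1 && r->descs[i].done) {
        r->slots[i].len = r->descs[i].len;
        r->slots[i].flags = 0;
        i = ring_next(i);
        received++;
    }
    r->next_to_clean = i;
    r->hwtail = i;

    /* (4.b) give slots released by the application back to the NIC */
    if (r->hwcur != r->head) {
        uint32_t j = r->hwcur;

        while (j != r->head) {
            struct model_slot *slot = &r->slots[j];

            if (slot->flags & SLOT_BUF_CHANGED) {
                r->descs[j].addr = slot->paddr;  /* buffer was swapped */
                slot->flags &= (uint16_t)~SLOT_BUF_CHANGED;
            }
            r->descs[j].done = 0;
            j = ring_next(j);
        }
        r->hwcur = r->head;
        r->reg_tail = ring_prev(j);  /* keep one slot free, as e1000 does */
    }
    return received;
}
```

The tail register is written one position behind the last released slot so that a full ring can be told apart from an empty one.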
2016-12-28 18:45 GMT+01:00 Ivanov Anton <notifications@github.com>:
… Hi!
I'm very interested in guidelines for developing netmap-friendly drivers.
I have a network adapter based on an FPGA. It would be great if you could give
some tips.
--
Vincenzo Maffione
|
@vmaffione DPDK seems to already support ena. I am willing to give this a shot if you don't mind giving feedback. |
Hi Gerard, Anyway, I see the datapath operation is sketched quite clearly here. Netmap support in principle requires far less effort than DPDK support, as only a small datapath-related patch is needed. The original driver remains in charge of everything related to configuration and the control path (apart from management of RX buffer refill, see above). If you want to try sketching a netmap patch for ena, we can certainly give you feedback and suggestions. |
Hello @vmaffione, Thanks for your support. I have read through your guidelines, netmap source, and ena source in more detail. I will attempt to implement this in five phases (you list two phase 4's) as you described in your guidelines. I will post a gist with a prototype of steps 1a-1c.
Questions: |
Hi Gerard, Your plan looks great to me. Regarding netmap_init_buffer(), it is not optional (nothing in the list is optional). |
Hey @vmaffione
Thanks again for your support.
Gerard
ENA Netmap Prototype Phase 1
Steps for phase one are as follows:
ena_netmap_linux.h prototype
#include <bsd_glue.h>
#include <net/netmap.h>
#include <netmap/netmap_kern.h>
#ifdef NETMAP_LINUX_ENA_PTR_ARRAY
#define NM_ENA_TX_RING(a, r) ((a)->tx_rings[(r)])
#define NM_ENA_RX_RING(a, r) ((a)->rx_rings[(r)])
#else
#define NM_ENA_TX_RING(a, r) (&(a)->tx_rings[(r)])
#define NM_ENA_RX_RING(a, r) (&(a)->rx_rings[(r)])
#endif
/*
* The attach routine, called near the end of ena_probe(),
* fills the parameters for netmap_attach() and calls it.
* It cannot fail, in the worst case (such as no memory)
* netmap mode will be disabled and the driver will only
* operate in standard mode.
*/
static void
ena_netmap_attach(struct ena_adapter *adapter)
{
struct netmap_adapter na;
bzero(&na, sizeof(na));
na.ifp = adapter->netdev;
na.na_flags = NAF_BDG_MAYSLEEP;
na.pdev = &adapter->pdev->dev;
// XXX check that queues is set.
na.num_tx_desc = NM_ENA_TX_RING(adapter, 0)->count;
na.num_rx_desc = NM_ENA_RX_RING(adapter, 0)->count;
// na.nm_txsync = ena_netmap_txsync; // Task 3
// na.nm_rxsync = ena_netmap_rxsync; // Task 4
// na.nm_register = ena_netmap_reg; // Task 2
na.num_tx_rings = na.num_rx_rings = adapter->num_queue_pairs;
netmap_attach(&na);
}
ena_netmap.c patch
Patch is to 1.1.3 tag of ena_netdev.c
diff --git a/ena_netdev.c b/ena_netdev.c
index 0facf46..445857b 100644
--- a/ena_netdev.c
+++ b/ena_netdev.c
@@ -54,6 +54,10 @@
#include "ena_pci_id_tbl.h"
#include "ena_sysfs.h"
+#if defined(CONFIG_NETMAP) || defined(CONFIG_NETMAP_MODULE)
+#include <ena_netmap.h>
+#endif
+
static char version[] = DEVICE_NAME " v" DRV_MODULE_VERSION "\n";
MODULE_AUTHOR("Amazon.com, Inc. or its affiliates");
@@ -696,6 +700,13 @@ static int ena_clean_tx_irq(struct ena_ring *tx_ring, u32 budget)
int tx_pkts = 0;
int rc;
+ struct ena_adapter *adapter = tx_ring->adapter;
+ struct net_device *netdev = adapter->netdev;
+#ifdef DEV_NETMAP
+ if (netmap_tx_irq(netdev, 0))
+ return true; /* cleaned ok */
+#endif /* DEV_NETMAP */
+
next_to_clean = tx_ring->next_to_clean;
txq = netdev_get_tx_queue(tx_ring->netdev, tx_ring->qid);
@@ -1013,6 +1024,18 @@ static int ena_clean_rx_irq(struct ena_ring *rx_ring, struct napi_struct *napi,
int total_len = 0;
int rx_copybreak_pkt = 0;
+ struct net_device *netdev = adapter->netdev;
+#ifdef DEV_NETMAP
+#ifdef CONFIG_ENA_NAPI
+#define NETMAP_DUMMY work_done
+#else
+ int dummy;
+#define NETMAP_DUMMY &dummy
+#endif
+ if (netmap_rx_irq(netdev, 0, NETMAP_DUMMY))
+ return true;
+#endif /* DEV_NETMAP */
+
netif_dbg(rx_ring->adapter, rx_status, rx_ring->netdev,
"%s qid %d\n", __func__, rx_ring->qid);
res_budget = budget;
@@ -3377,6 +3400,10 @@ static int ena_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
adapter->timer_service.function = ena_timer_service;
adapter->timer_service.data = (unsigned long)adapter;
+#ifdef DEV_NETMAP
+ ena_netmap_attach(adapter);
+#endif /* DEV_NETMAP */
+
add_timer(&adapter->timer_service);
dev_info(&pdev->dev, "%s found at mem %lx, mac addr %pM Queues %d\n",
@@ -3517,6 +3544,11 @@ static void ena_remove(struct pci_dev *pdev)
ena_com_destroy_interrupt_moderation(ena_dev);
vfree(ena_dev);
+
+#ifdef DEV_NETMAP
+ netmap_detach(netdev);
+#endif /* DEV_NETMAP */
+
}
static struct pci_driver ena_pci_driver = { |
Hi @gspivey .
|
It would be great if netmap supported this driver; it currently ships only with DPDK support.
https://aws.amazon.com/blogs/aws/elastic-network-adapter-high-performance-network-interface-for-amazon-ec2/