diff --git a/FAQ.md b/FAQ.md index 28680d69bd3..4d2f33db84d 100644 --- a/FAQ.md +++ b/FAQ.md @@ -445,9 +445,9 @@ A: Yes. How you configure it depends on what you mean by "promiscuous A: Firstly, you must have a DPDK-enabled version of Open vSwitch. - If your version is DPDK-enabled it will support the --dpdk - argument on the command line and will display lines with - "EAL:..." during startup when --dpdk is supplied. + If your version is DPDK-enabled it will support the other_config:dpdk-init + configuration in the database and will display lines with "EAL:..." + during startup when other_config:dpdk-init is set to 'true'. Secondly, when adding a DPDK port, unlike a system port, the type for the interface must be specified. For example; diff --git a/INSTALL.DPDK.md b/INSTALL.DPDK.md index 7f76df8205e..f8c59672f2b 100644 --- a/INSTALL.DPDK.md +++ b/INSTALL.DPDK.md @@ -138,22 +138,62 @@ Using the DPDK with ovs-vswitchd: 5. Start vswitchd: - DPDK configuration arguments can be passed to vswitchd via `--dpdk` - argument. This needs to be first argument passed to vswitchd process. - dpdk arg -c is ignored by ovs-dpdk, but it is a required parameter - for dpdk initialization. + DPDK configuration arguments can be passed to vswitchd via the Open_vSwitch + other_config column. The recognized configuration options are listed below. + Defaults will be provided for all values not explicitly set. + + * dpdk-init + Specifies whether OVS should initialize and support DPDK ports. This is + a boolean, and defaults to false. + + * dpdk-lcore-mask + Specifies the CPU cores on which dpdk lcore threads should be spawned. + The DPDK lcore threads are used for DPDK library tasks, such as + library internal message processing, logging, etc. Value should be in + the form of a hex string (e.g. '0x123') similar to the 'taskset' mask + input. + If not specified, the value will be determined by choosing the lowest + CPU core from initial cpu affinity list. 
Otherwise, the value will be + passed directly to the DPDK library. + For performance reasons, it is best to set this to a single core on + the system, rather than allow lcore threads to float. + + * dpdk-alloc-mem + This sets the total memory to preallocate from hugepages regardless of + processor socket. It is recommended to use dpdk-socket-mem instead. + + * dpdk-socket-mem + Comma separated list of memory to pre-allocate from hugepages on specific + sockets. + + * dpdk-hugepage-dir + Directory where hugetlbfs is mounted + + * cuse-dev-name + Option to set the vhost_cuse character device name. + + * vhost-sock-dir + Option to set the path to the vhost_user unix socket files. + + NOTE: Changing any of these options requires restarting the ovs-vswitchd + application. + + Open vSwitch can be started as normal. DPDK will be initialized as long + as the dpdk-init option has been set to 'true'. + ``` export DB_SOCK=/usr/local/var/run/openvswitch/db.sock - ovs-vswitchd --dpdk -c 0x1 -n 4 -- unix:$DB_SOCK --pidfile --detach + ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true + ovs-vswitchd unix:$DB_SOCK --pidfile --detach ``` If allocated more than one GB hugepage (as for IVSHMEM), set amount and use NUMA node 0 memory: ``` - ovs-vswitchd --dpdk -c 0x1 -n 4 --socket-mem 1024,0 \ - -- unix:$DB_SOCK --pidfile --detach + ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="1024,0" + ovs-vswitchd unix:$DB_SOCK --pidfile --detach ``` 6. Add bridge & ports @@ -540,11 +580,12 @@ in the names. `/usr/local/var/run/openvswitch/vhost-user-1`, which you must provide to your VM on the QEMU command line. 
More instructions on this can be found in the next section "DPDK vhost-user VM configuration" - Note: If you wish for the vhost-user sockets to be created in a - directory other than `/usr/local/var/run/openvswitch`, you may specify - another location on the ovs-vswitchd command line like so: + - If you wish for the vhost-user sockets to be created in a directory other + than `/usr/local/var/run/openvswitch`, you may specify another location + in the ovsdb like so: - `./vswitchd/ovs-vswitchd --dpdk -vhost_sock_dir /my-dir -c 0x1 ...` + `./utilities/ovs-vsctl --no-wait \ + set Open_vSwitch . other_config:vhost-sock-dir=path` DPDK vhost-user VM configuration: --------------------------------- @@ -692,14 +733,13 @@ DPDK vhost-cuse VM configuration: 1. This step is only needed if using an alternative character device. - The new character device filename must be specified on the vswitchd - commandline: + The new character device filename must be specified in the ovsdb: - `./vswitchd/ovs-vswitchd --dpdk --cuse_dev_name my-vhost-net -c 0x1 ...` + `./utilities/ovs-vsctl --no-wait set Open_vSwitch . \ + other_config:cuse-dev-name=my-vhost-net` - Note that the `--cuse_dev_name` argument and associated string must be the first - arguments after `--dpdk` and come before the EAL arguments. In the example - above, the character device to be used will be `/dev/my-vhost-net`. + In the example above, the character device to be used will be + `/dev/my-vhost-net`. 2. This step is only needed if reusing the standard character device. It will conflict with the kernel vhost character device so the user must first @@ -811,8 +851,8 @@ steps. ``` refers to "vhost-net" if using the `/dev/vhost-net` - device. If you have specificed a different name on the ovs-vswitchd - commandline using the "--cuse_dev_name" parameter, please specify that + device. 
If you have specified a different name in the database + using the "other_config:cuse-dev-name" parameter, please specify that filename instead. 2. Disable SELinux or set to permissive mode diff --git a/NEWS b/NEWS index 3167e8dca76..c20b64c223d 100644 --- a/NEWS +++ b/NEWS @@ -26,6 +26,11 @@ Post-v2.5.0 assignment. * Type of log messages from PMD threads changed from INFO to DBG. * QoS functionality with sample egress-policer implementation. + * The mechanism for configuring DPDK has changed to use the database + * Sensible defaults have been introduced for many of the required + configuration options + * DB entries have been added for many of the DPDK EAL command line + arguments - ovs-benchmark: This utility has been removed due to lack of use and bitrot. - ovs-appctl: diff --git a/lib/automake.mk b/lib/automake.mk index 76dfc07e912..affbb5c3f8b 100644 --- a/lib/automake.mk +++ b/lib/automake.mk @@ -354,6 +354,10 @@ if DPDK_NETDEV lib_libopenvswitch_la_SOURCES += \ lib/netdev-dpdk.c \ lib/netdev-dpdk.h +else +lib_libopenvswitch_la_SOURCES += \ + lib/netdev-nodpdk.c \ + lib/netdev-dpdk.h endif if WIN32 diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index c4b64766e20..59ea51d61e8 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -29,6 +29,7 @@ #include #include #include +#include #include "dirs.h" #include "dp-packet.h" @@ -107,7 +108,9 @@ BUILD_ASSERT_DECL((MAX_NB_MBUF / ROUND_DOWN_POW2(MAX_NB_MBUF/MIN_NB_MBUF)) #define OVS_VHOST_QUEUE_DISABLED (-2) /* Queue was disabled by guest and not * yet mapped to another queue. */ +#ifdef VHOST_CUSE static char *cuse_dev_name = NULL; /* Character device cuse_dev_name. 
*/ +#endif static char *vhost_sock_dir = NULL; /* Location of vhost-user sockets */ /* @@ -649,8 +652,15 @@ netdev_dpdk_cast(const struct netdev *netdev) static struct netdev * netdev_dpdk_alloc(void) { - struct netdev_dpdk *dev = dpdk_rte_mzalloc(sizeof *dev); - return &dev->up; + struct netdev_dpdk *dev; + + if (!rte_eal_init_ret) { /* Only after successful initialization */ + dev = dpdk_rte_mzalloc(sizeof *dev); + if (dev) { + return &dev->up; + } + } + return NULL; } static void @@ -783,6 +793,10 @@ netdev_dpdk_vhost_cuse_construct(struct netdev *netdev) struct netdev_dpdk *dev = netdev_dpdk_cast(netdev); int err; + if (rte_eal_init_ret) { + return rte_eal_init_ret; + } + ovs_mutex_lock(&dpdk_mutex); strncpy(dev->vhost_id, netdev->name, sizeof(dev->vhost_id)); err = vhost_construct_helper(netdev); @@ -807,6 +821,10 @@ netdev_dpdk_vhost_user_construct(struct netdev *netdev) return EINVAL; } + if (rte_eal_init_ret) { + return rte_eal_init_ret; + } + ovs_mutex_lock(&dpdk_mutex); /* Take the name of the vhost-user port and append it to the location where * the socket is to be created, then register the socket. @@ -2251,28 +2269,12 @@ dpdk_vhost_class_init(void) static int dpdk_vhost_cuse_class_init(void) { - int err = -1; - - - /* Register CUSE device to handle IOCTLs. - * Unless otherwise specified on the vswitchd command line, cuse_dev_name - * is set to vhost-net. 
- */ - err = rte_vhost_driver_register(cuse_dev_name); - - if (err != 0) { - VLOG_ERR("CUSE device setup failure."); - return -1; - } - - dpdk_vhost_class_init(); return 0; } static int dpdk_vhost_user_class_init(void) { - dpdk_vhost_class_init(); return 0; } @@ -2283,7 +2285,6 @@ dpdk_common_init(void) "[netdev] up|down", 1, 2, netdev_dpdk_set_admin_state, NULL); - ovs_thread_create("dpdk_watchdog", dpdk_watchdog, NULL); } /* Client Rings */ @@ -2732,22 +2733,20 @@ static const struct dpdk_qos_ops egress_policer_ops = { static int process_vhost_flags(char *flag, char *default_val, int size, - char **argv, char **new_val) + const struct smap *ovs_other_config, + char **new_val) { + const char *val; int changed = 0; + val = smap_get(ovs_other_config, flag); + /* Depending on which version of vhost is in use, process the vhost-specific - * flag if it is provided on the vswitchd command line, otherwise resort to - * a default value. - * - * For vhost-user: Process "-vhost_sock_dir" to set the custom location of - * the vhost-user socket(s). - * For vhost-cuse: Process "-cuse_dev_name" to set the custom name of the - * vhost-cuse character device. + * flag if it is provided, otherwise resort to default value. 
*/ - if (!strcmp(argv[1], flag) && (strlen(argv[2]) <= size)) { + if (val && (strlen(val) <= size)) { changed = 1; - *new_val = xstrdup(argv[2]); + *new_val = xstrdup(val); VLOG_INFO("User-provided %s in use: %s", flag, *new_val); } else { VLOG_INFO("No %s provided - defaulting to %s", flag, default_val); @@ -2757,68 +2756,185 @@ process_vhost_flags(char *flag, char *default_val, int size, return changed; } -int -dpdk_init(int argc, char **argv) +static char ** +grow_argv(char ***argv, size_t cur_siz, size_t grow_by) { - int result; - int base = 0; - char *pragram_name = argv[0]; - int err; - int isset; - cpu_set_t cpuset; + return xrealloc(*argv, sizeof(char *) * (cur_siz + grow_by)); +} - if (argc < 2 || strcmp(argv[1], "--dpdk")) - return 0; +static void +dpdk_option_extend(char ***argv, int argc, const char *option, + const char *value) +{ + char **newargv = grow_argv(argv, argc, 2); + *argv = newargv; + newargv[argc] = xstrdup(option); + newargv[argc+1] = xstrdup(value); +} - /* Remove the --dpdk argument from arg list.*/ - argc--; - argv++; +static int +construct_dpdk_options(const struct smap *ovs_other_config, + char ***argv, const int initial_size) +{ + struct dpdk_options_map { + const char *ovs_configuration; + const char *dpdk_option; + bool default_enabled; + const char *default_value; + } opts[] = { + {"dpdk-lcore-mask", "-c", false, NULL}, + {"dpdk-hugepage-dir", "--huge-dir", false, NULL}, + }; + + int i, ret = initial_size; + + /*First, construct from the flat-options (non-mutex)*/ + for (i = 0; i < ARRAY_SIZE(opts); ++i) { + const char *lookup = smap_get(ovs_other_config, + opts[i].ovs_configuration); + if (!lookup && opts[i].default_enabled) { + lookup = opts[i].default_value; + } - /* Reject --user option */ - int i; - for (i = 0; i < argc; i++) { - if (!strcmp(argv[i], "--user")) { - VLOG_ERR("Can not mix --dpdk and --user options, aborting."); + if (lookup) { + dpdk_option_extend(argv, ret, opts[i].dpdk_option, lookup); + ret += 2; + } + } + + 
return ret; +} + +#define MAX_DPDK_EXCL_OPTS 10 + +static int +construct_dpdk_mutex_options(const struct smap *ovs_other_config, + char ***argv, const int initial_size) +{ + struct dpdk_exclusive_options_map { + const char *category; + const char *ovs_dpdk_options[MAX_DPDK_EXCL_OPTS]; + const char *eal_dpdk_options[MAX_DPDK_EXCL_OPTS]; + const char *default_value; + int default_option; + } excl_opts[] = { + {"memory type", + {"dpdk-alloc-mem", "dpdk-socket-mem", NULL,}, + {"-m", "--socket-mem", NULL,}, + "1024,0", 1 + }, + }; + + int i, ret = initial_size; + for (i = 0; i < ARRAY_SIZE(excl_opts); ++i) { + int found_opts = 0, scan, found_pos = -1; + const char *found_value; + struct dpdk_exclusive_options_map *popt = &excl_opts[i]; + + for (scan = 0; scan < MAX_DPDK_EXCL_OPTS + && popt->ovs_dpdk_options[scan]; ++scan) { + const char *lookup = smap_get(ovs_other_config, + popt->ovs_dpdk_options[scan]); + if (lookup && strlen(lookup)) { + found_opts++; + found_pos = scan; + found_value = lookup; + } + } + + if (!found_opts) { + if (popt->default_option) { + found_pos = popt->default_option; + found_value = popt->default_value; + } else { + continue; + } } + + if (found_opts > 1) { + VLOG_ERR("Multiple defined options for %s. 
Please check your" + " database settings and reconfigure if necessary.", + popt->category); + } + + dpdk_option_extend(argv, ret, popt->eal_dpdk_options[found_pos], + found_value); + ret += 2; + } + + return ret; +} + +static int +get_dpdk_args(const struct smap *ovs_other_config, char ***argv) +{ + int i = construct_dpdk_options(ovs_other_config, argv, 1); + i = construct_dpdk_mutex_options(ovs_other_config, argv, i); + return i; +} + +static char **dpdk_argv; +static int dpdk_argc; + +static void +deferred_argv_release(void) +{ + int result; + for (result = 0; result < dpdk_argc; ++result) { + free(dpdk_argv[result]); + } + + free(dpdk_argv); +} + +static void +dpdk_init__(const struct smap *ovs_other_config) +{ + char **argv = NULL; + int result; + int argc; + int err; + cpu_set_t cpuset; + + if (!smap_get_bool(ovs_other_config, "dpdk-init", false)) { + VLOG_INFO("DPDK Disabled - to change this requires a restart.\n"); + return; } + VLOG_INFO("DPDK Enabled, initializing"); + #ifdef VHOST_CUSE - if (process_vhost_flags("-cuse_dev_name", xstrdup("vhost-net"), - PATH_MAX, argv, &cuse_dev_name)) { + if (process_vhost_flags("cuse-dev-name", xstrdup("vhost-net"), + PATH_MAX, ovs_other_config, &cuse_dev_name)) { #else - if (process_vhost_flags("-vhost_sock_dir", xstrdup(ovs_rundir()), - NAME_MAX, argv, &vhost_sock_dir)) { + if (process_vhost_flags("vhost-sock-dir", xstrdup(ovs_rundir()), + NAME_MAX, ovs_other_config, &vhost_sock_dir)) { struct stat s; - int err; err = stat(vhost_sock_dir, &s); if (err) { - VLOG_ERR("vHostUser socket DIR '%s' does not exist.", - vhost_sock_dir); - return err; + VLOG_ERR("vhost-user sock directory '%s' does not exist.", + vhost_sock_dir); } #endif - /* Remove the vhost flag configuration parameters from the argument - * list, so that the correct elements are passed to the DPDK - * initialization function - */ - argc -= 2; - argv += 2; /* Increment by two to bypass the vhost flag arguments */ - base = 2; } /* Get the main thread affinity 
*/ CPU_ZERO(&cpuset); - err = pthread_getaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset); + err = pthread_getaffinity_np(pthread_self(), sizeof(cpu_set_t), + &cpuset); if (err) { VLOG_ERR("Thread getaffinity error %d.", err); - return err; } - /* Keep the program name argument as this is needed for call to - * rte_eal_init() - */ - argv[0] = pragram_name; + argv = grow_argv(&argv, 0, 1); + argv[0] = xstrdup(ovs_get_program_name()); + argc = get_dpdk_args(ovs_other_config, &argv); + + argv = grow_argv(&argv, argc, 1); + argv[argc] = NULL; + + optind = 1; /* Make sure things are initialized ... */ result = rte_eal_init(argc, argv); @@ -2827,23 +2943,54 @@ dpdk_init(int argc, char **argv) } /* Set the main thread affinity back to pre rte_eal_init() value */ - err = pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset); - if (err) { - VLOG_ERR("Thread setaffinity error %d", err); - return err; + if (!err) { + err = pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), + &cpuset); + if (err) { + VLOG_ERR("Thread setaffinity error %d", err); + } } + dpdk_argv = argv; + dpdk_argc = argc; + + atexit(deferred_argv_release); + rte_memzone_dump(stdout); rte_eal_init_ret = 0; - if (argc > result) { - argv[result] = argv[0]; - } - /* We are called from the main thread here */ RTE_PER_LCORE(_lcore_id) = NON_PMD_CORE_ID; - return result + 1 + base; + ovs_thread_create("dpdk_watchdog", dpdk_watchdog, NULL); + +#ifdef VHOST_CUSE + /* Register CUSE device to handle IOCTLs. + * Unless otherwise specified, cuse_dev_name is set to vhost-net. 
+ */ + err = rte_vhost_driver_register(cuse_dev_name); + + if (err != 0) { + VLOG_ERR("CUSE device setup failure."); + return; + } +#endif + + dpdk_vhost_class_init(); + + /* Finally, register the dpdk classes */ + netdev_dpdk_register(); +} + +void +dpdk_init(const struct smap *ovs_other_config) +{ + static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER; + + if (ovs_other_config && ovsthread_once_start(&once)) { + dpdk_init__(ovs_other_config); + ovsthread_once_done(&once); + } } static const struct netdev_class dpdk_class = @@ -2905,23 +3052,14 @@ static const struct netdev_class OVS_UNUSED dpdk_vhost_user_class = void netdev_dpdk_register(void) { - static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER; - - if (rte_eal_init_ret) { - return; - } - - if (ovsthread_once_start(&once)) { - dpdk_common_init(); - netdev_register_provider(&dpdk_class); - netdev_register_provider(&dpdk_ring_class); + dpdk_common_init(); + netdev_register_provider(&dpdk_class); + netdev_register_provider(&dpdk_ring_class); #ifdef VHOST_CUSE - netdev_register_provider(&dpdk_vhost_cuse_class); + netdev_register_provider(&dpdk_vhost_cuse_class); #else - netdev_register_provider(&dpdk_vhost_user_class); + netdev_register_provider(&dpdk_vhost_user_class); #endif - ovsthread_once_done(&once); - } } int diff --git a/lib/netdev-dpdk.h b/lib/netdev-dpdk.h index 646d3e21fb1..ee748e0183d 100644 --- a/lib/netdev-dpdk.h +++ b/lib/netdev-dpdk.h @@ -4,6 +4,7 @@ #include struct dp_packet; +struct smap; #ifdef DPDK_NETDEV @@ -22,7 +23,6 @@ struct dp_packet; #define NON_PMD_CORE_ID LCORE_ID_ANY -int dpdk_init(int argc, char **argv); void netdev_dpdk_register(void); void free_dpdk_buf(struct dp_packet *); int pmd_thread_setaffinity_cpu(unsigned cpu); @@ -33,15 +33,6 @@ int pmd_thread_setaffinity_cpu(unsigned cpu); #include "util.h" -static inline int -dpdk_init(int argc, char **argv) -{ - if (argc >= 2 && !strcmp(argv[1], "--dpdk")) { - ovs_fatal(0, "DPDK support not built into this copy of 
Open vSwitch."); - } - return 0; -} - static inline void netdev_dpdk_register(void) { @@ -61,4 +52,7 @@ pmd_thread_setaffinity_cpu(unsigned cpu OVS_UNUSED) } #endif /* DPDK_NETDEV */ + +void dpdk_init(const struct smap *ovs_other_config); + #endif diff --git a/lib/netdev-nodpdk.c b/lib/netdev-nodpdk.c new file mode 100644 index 00000000000..8a8afaa788a --- /dev/null +++ b/lib/netdev-nodpdk.c @@ -0,0 +1,20 @@ +#include +#include "netdev-dpdk.h" +#include "smap.h" +#include "ovs-thread.h" +#include "openvswitch/vlog.h" + +VLOG_DEFINE_THIS_MODULE(dpdk); + +void +dpdk_init(const struct smap *ovs_other_config) +{ + static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER; + + if (ovsthread_once_start(&once)) { + if (smap_get_bool(ovs_other_config, "dpdk-init", false)) { + VLOG_ERR("DPDK not supported in this copy of Open vSwitch."); + } + ovsthread_once_done(&once); + } +} diff --git a/lib/netdev.c b/lib/netdev.c index 3e50694ac62..4fc06ce3ba9 100644 --- a/lib/netdev.c +++ b/lib/netdev.c @@ -165,8 +165,6 @@ netdev_initialize(void) netdev_register_provider(&netdev_internal_class); netdev_vport_tunnel_register(); #endif - netdev_dpdk_register(); - ovsthread_once_done(&once); } } diff --git a/tests/ofproto-macros.at b/tests/ofproto-macros.at index bd38ac6b04c..632f7473406 100644 --- a/tests/ofproto-macros.at +++ b/tests/ofproto-macros.at @@ -281,7 +281,8 @@ m4_define([_OVS_VSWITCHD_START], /reconnect|INFO|/d /ofproto|INFO|using datapath ID/d /netdev_linux|INFO|.*device has unknown hardware address family/d -/ofproto|INFO|datapath ID changed to fedcba9876543210/d']]) +/ofproto|INFO|datapath ID changed to fedcba9876543210/d +/dpdk|INFO|DPDK Disabled - to change this requires a restart./d']]) ]) # OVS_VSWITCHD_START([vsctl-args], [vsctl-output], [=override]) diff --git a/utilities/ovs-dev.py b/utilities/ovs-dev.py index c1217061081..a74b528b139 100755 --- a/utilities/ovs-dev.py +++ b/utilities/ovs-dev.py @@ -265,9 +265,11 @@ def run(): cmd = [build + 
"/vswitchd/ovs-vswitchd"] if options.dpdk: - cmd.append("--dpdk") - cmd.extend(options.dpdk) - cmd.append("--") + _sh("ovs-vsctl --no-wait set Open_vSwitch %s " \ + "other_config:dpdk-init=true" % root_uuid) + else: + _sh("ovs-vsctl --no-wait set Open_vSwitch %s " \ + "other_config:dpdk-init=false" % root_uuid) if options.gdb: cmd = ["gdb", "--args"] + cmd @@ -421,9 +423,8 @@ def main(): help="run ovs-vswitchd under gdb") group.add_option("--valgrind", dest="valgrind", action="store_true", help="run ovs-vswitchd under valgrind") - group.add_option("--dpdk", dest="dpdk", action="callback", - callback=parse_subargs, - help="run ovs-vswitchd with dpdk subopts (ended by --)") + group.add_option("--dpdk", dest="dpdk", action="store_true", + help="run ovs-vswitchd with dpdk") group.add_option("--clang", dest="clang", action="store_true", help="Use binaries built by clang") group.add_option("--user", dest="user", action="store", default="", diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c index 439ca94aa25..b4e5ea7ff44 100644 --- a/vswitchd/bridge.c +++ b/vswitchd/bridge.c @@ -34,6 +34,7 @@ #include "if-notifier.h" #include "jsonrpc.h" #include "lacp.h" +#include "lib/netdev-dpdk.h" #include "mac-learning.h" #include "mcast-snooping.h" #include "netdev.h" @@ -2892,6 +2893,10 @@ bridge_run(void) } cfg = ovsrec_open_vswitch_first(idl); + if (cfg) { + dpdk_init(&cfg->other_config); + } + /* Initialize the ofproto library. This only needs to run once, but * it must be done after the configuration is set. If the * initialization has already occurred, bridge_init_ofproto() diff --git a/vswitchd/ovs-vswitchd.8.in b/vswitchd/ovs-vswitchd.8.in index 601b7a112d6..3dacfc34bc4 100644 --- a/vswitchd/ovs-vswitchd.8.in +++ b/vswitchd/ovs-vswitchd.8.in @@ -84,9 +84,9 @@ only allow privileged users, such as the superuser, to use it. unavailable or unsuccessful. . .SS "DPDK Options" -.IP "\fB\-\-dpdk\fR" -Initialize \fBovs\-vswitchd\fR DPDK datapath. Refer to INSTALL.DPDK -for details. 
+For details on initializing the \fBovs\-vswitchd\fR DPDK datapath, +refer to INSTALL.DPDK.md or +\fBovs\-vswitchd.conf.db\fR(5). .SS "Daemon Options" .ds DD \ \fBovs\-vswitchd\fR detaches only after it has connected to the \ diff --git a/vswitchd/ovs-vswitchd.c b/vswitchd/ovs-vswitchd.c index e78ecda935b..7d467a175c2 100644 --- a/vswitchd/ovs-vswitchd.c +++ b/vswitchd/ovs-vswitchd.c @@ -48,7 +48,6 @@ #include "openvswitch/vconn.h" #include "openvswitch/vlog.h" #include "lib/vswitch-idl.h" -#include "lib/netdev-dpdk.h" VLOG_DEFINE_THIS_MODULE(vswitchd); @@ -71,13 +70,6 @@ main(int argc, char *argv[]) int retval; set_program_name(argv[0]); - retval = dpdk_init(argc,argv); - if (retval < 0) { - return retval; - } - - argc -= retval; - argv += retval; ovs_cmdl_proctitle_init(argc, argv); service_start(&argc, &argv); @@ -166,7 +158,7 @@ parse_options(int argc, char *argv[], char **unixctl_pathp) {"bootstrap-ca-cert", required_argument, NULL, OPT_BOOTSTRAP_CA_CERT}, {"enable-dummy", optional_argument, NULL, OPT_ENABLE_DUMMY}, {"disable-system", no_argument, NULL, OPT_DISABLE_SYSTEM}, - {"dpdk", required_argument, NULL, OPT_DPDK}, + {"dpdk", optional_argument, NULL, OPT_DPDK}, {NULL, 0, NULL, 0}, }; char *short_options = ovs_cmdl_long_options_to_short_options(long_options); @@ -219,7 +211,7 @@ parse_options(int argc, char *argv[], char **unixctl_pathp) exit(EXIT_FAILURE); case OPT_DPDK: - ovs_fatal(0, "--dpdk must be given at beginning of command line."); + ovs_fatal(0, "Using --dpdk to configure DPDK is not supported."); break; default: @@ -256,18 +248,9 @@ usage(void) daemon_usage(); vlog_usage(); printf("\nDPDK options:\n" - " --dpdk [VHOST] [DPDK] Initialize DPDK datapath.\n" - " where DPDK are options for initializing DPDK lib and VHOST is\n" -#ifdef VHOST_CUSE - " option to override default character device name used for\n" - " for use with userspace vHost\n" - " -cuse_dev_name NAME\n" -#else - " option to override default directory where vhost-user\n" 
- " sockets are created.\n" - " -vhost_sock_dir DIR\n" -#endif - ); + "Configuration of DPDK via command-line is removed from this\n" + "version of Open vSwitch. DPDK is configured through ovsdb.\n" + ); printf("\nOther options:\n" " --unixctl=SOCKET override default control socket name\n" " -h, --help display this help message\n" diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index 90806337aa9..c36cb59455d 100644 --- a/vswitchd/vswitch.xml +++ b/vswitchd/vswitch.xml @@ -171,6 +171,49 @@

+ +

+ Set this value to true to enable runtime support for + DPDK ports. The vswitch must have compile-time support for DPDK as + well. +

+

+ The default value is false. Changing this value requires + restarting the daemon. +

+

+ If this value is false at startup, any dpdk ports which + are configured in the bridge will fail due to memory errors. +

+
+ + +

+ Specifies the CPU cores where dpdk lcore threads should be spawned. + The DPDK lcore threads are used for DPDK library tasks, such as + library internal message processing, logging, etc. Value should be in + the form of a hex string (so '0x123') similar to the 'taskset' mask + input. +

+

+ The lowest order bit corresponds to the first CPU core. A set bit + means the corresponding core is available and an lcore thread will be + created and pinned to it. If the input does not cover all cores, + those uncovered cores are considered not set. +

+

+ For performance reasons, it is best to set this to a single core on + the system, rather than allow lcore threads to float. +

+

+ If not specified, the value will be determined by choosing the lowest + CPU core from initial cpu affinity list. Otherwise, the value will be + passed directly to the DPDK library. +

+
+

Specifies CPU mask for setting the cpu affinity of PMD (Poll @@ -190,6 +233,71 @@

+ +

+ Specifies the amount of memory to preallocate from the hugepage pool, + regardless of socket. It is recommended that dpdk-socket-mem is used + instead. +

+

+ If not specified, the value is 0. Changing this value requires + restarting the daemon. +

+
+ + +

+ Specifies the amount of memory to preallocate from the hugepage pool, + on a per-socket basis. +

+

+ The specifier is a comma-separated string, in ascending order of CPU + socket (e.g. 1024,2048,4096,8192 would set socket 0 to preallocate + 1024MB, socket 1 to preallocate 2048MB, etc.). +

+

+ If not specified, the default value is 1024,0. Changing this value + requires restarting the daemon. +

+
+ + +

+ Specifies the path to the hugetlbfs mount point. +

+

+ If not specified, this will be guessed by the DPDK library (default + is /dev/hugepages). Changing this value requires restarting the + daemon. +

+
+ + +

+ Specifies the name of the vhost-cuse character device to open for + vhost-cuse support. +

+

+ The default is vhost-net. Changing this value requires restarting + the daemon. +

+
+ + +

+ Specifies the path to the vhost-user unix domain socket files. +

+

+ Defaults to the Open vSwitch run directory (ovs_rundir()). Changing this + value requires restarting the daemon. +

+
+
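The way construct_dpdk_options() and construct_dpdk_mutex_options() in this patch turn other_config keys into an EAL argument vector can be modelled with a short Python sketch. This is only an illustration of the logic, not code from Open vSwitch: build_eal_args and the two table names are hypothetical, and the tables simply mirror the opts[] and excl_opts[] entries shown in lib/netdev-dpdk.c above. Note that dpdk-init, cuse-dev-name and vhost-sock-dir are handled separately in the patch and are never passed to the EAL.

```python
# Hypothetical model of the other_config -> EAL argv mapping in this patch.
# Names below are illustrative only; they are not part of Open vSwitch.

# Independent options: (other_config key, EAL flag). No defaults are set.
FLAT_OPTIONS = [
    ("dpdk-lcore-mask", "-c"),
    ("dpdk-hugepage-dir", "--huge-dir"),
]

# Mutually exclusive groups:
# (category, [(other_config key, EAL flag), ...], default index, default value)
# Assumption: None means "no default" here, whereas the C code treats a
# default_option of 0 as "no default".
EXCLUSIVE_OPTIONS = [
    ("memory type",
     [("dpdk-alloc-mem", "-m"), ("dpdk-socket-mem", "--socket-mem")],
     1, "1024,0"),
]


def build_eal_args(other_config, program_name="ovs-vswitchd"):
    """Return the argv list that would be handed to rte_eal_init()."""
    argv = [program_name]

    # Flat options are forwarded whenever the key is present.
    for key, flag in FLAT_OPTIONS:
        value = other_config.get(key)
        if value is not None:
            argv += [flag, value]

    # For each exclusive group, use the last non-empty key that is set,
    # falling back to the group's default when none is set.
    for _category, choices, default_index, default_value in EXCLUSIVE_OPTIONS:
        found = [(flag, other_config[key])
                 for key, flag in choices if other_config.get(key)]
        if not found:
            if default_index is None:
                continue
            found = [(choices[default_index][1], default_value)]
        # The patch logs an error when more than one key is set, then
        # continues with the last one found; the sketch does the same.
        flag, value = found[-1]
        argv += [flag, value]

    return argv


print(build_eal_args({"dpdk-lcore-mask": "0x1"}))
```

With an empty other_config the sketch yields only the default memory setting, which matches the "1024,0" default documented for dpdk-socket-mem.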