regexdev: introduce API
RegEx usage is becoming more common in DPDK applications, for example:
* Next Generation Firewalls (NGFW)
* Deep Packet and Flow Inspection (DPI)
* Intrusion Prevention Systems (IPS)
* DDoS Mitigation
* Network Monitoring
* Data Loss Prevention (DLP)
* Smart NICs
* Grammar based content processing
* URL, spam and adware filtering
* Advanced auditing and policing of user/application security policies
* Financial data mining - parsing of streamed financial feeds
* Application recognition.
* Memory introspection.
* Natural Language Processing (NLP)
* Sentiment Analysis.
* Big data database acceleration.
* Computational storage.

A number of PMD providers have started to work on HW implementations,
alongside SW implementations.

This library adds support for those kinds of devices.

The RegEx Device API is composed of two parts:
- The application-oriented RegEx API that includes functions to setup
  a RegEx device (configure it, setup its queue pairs and start it),
  update the rule database and so on.

- The driver-oriented RegEx API that exports a function allowing
  a RegEx Poll Mode Driver (PMD) to register itself as
  a RegEx device driver.

RegEx device components and definitions:

    +-----------------+
    |                 |
    |                 o---------+    rte_regexdev_[en|de]queue_burst()
    |   PCRE based    o------+  |               |
    |  RegEx pattern  |      |  |  +--------+   |
    | matching engine o------+--+--o        |   |    +------+
    |                 |      |  |  | queue  |<==o===>|Core 0|
    |                 o----+ |  |  | pair 0 |        |      |
    |                 |    | |  |  +--------+        +------+
    +-----------------+    | |  |
           ^               | |  |  +--------+
           |               | |  |  |        |        +------+
           |               | +--+--o queue  |<======>|Core 1|
       Rule|Database       |    |  | pair 1 |        |      |
    +------+----------+    |    |  +--------+        +------+
    |     Group 0     |    |    |
    | +-------------+ |    |    |  +--------+        +------+
    | | Rules 0..n  | |    |    |  |        |        |Core 2|
    | +-------------+ |    |    +--o queue  |<======>|      |
    |     Group 1     |    |       | pair 2 |        +------+
    | +-------------+ |    |       +--------+
    | | Rules 0..n  | |    |
    | +-------------+ |    |       +--------+
    |     Group 2     |    |       |        |        +------+
    | +-------------+ |    |       | queue  |<======>|Core n|
    | | Rules 0..n  | |    +-------o pair n |        |      |
    | +-------------+ |            +--------+        +------+
    |     Group n     |
    | +-------------+ |<-------rte_regexdev_rule_db_update()
    | |             | |<-------rte_regexdev_rule_db_compile_activate()
    | | Rules 0..n  | |<-------rte_regexdev_rule_db_import()
    | +-------------+ |------->rte_regexdev_rule_db_export()
    +-----------------+

RegEx: A regular expression is a concise and flexible means for matching
strings of text, such as particular characters, words, or patterns of
characters. A common abbreviation for this is "RegEx".

RegEx device: A hardware or software-based implementation of RegEx
device API for PCRE based pattern matching syntax and semantics.

PCRE RegEx syntax and semantics specification:
http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html

RegEx queue pair: Each RegEx device should have one or more queue pairs to
transmit a burst of pattern matching requests and receive a burst of
pattern matching responses. The pattern matching request/response is
embedded in the *rte_regex_ops* structure.

Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
Match ID and Group ID to identify the rule upon the match.

Rule database: The RegEx device accepts regular expressions and converts
them into a compiled rule database that can then be used to scan data.
Compilation allows the device to analyze the given pattern(s) and
pre-determine how to scan for these patterns in an optimized fashion that
would be far too expensive to compute at run-time. A rule database
contains a set of rules that are compiled in device specific binary form.

Match ID or Rule ID: A unique identifier provided at the time of rule
creation for the application to identify the rule upon match.

Group ID: Rules can be grouped under one group ID to enable
rule isolation and effective pattern matching. A unique group identifier
is provided at the time of rule creation for the application to identify
the rule upon match.

Scan: A pattern matching request through *enqueue* API.

A given RegEx device may not support all the features of PCRE.
The application may probe unsupported features through
struct rte_regexdev_info::pcre_unsup_flags.

By default, all the functions of the RegEx Device API exported by a PMD
are lock-free functions that are assumed not to be invoked in parallel on
different logical cores to work on the same target object. For instance,
the dequeue function of a PMD cannot be invoked in parallel on two logical
cores to operate on the same RegEx queue pair. Of course, this function
can be invoked in parallel by different logical cores on different queue
pairs. It is the responsibility of the upper level application to
enforce this rule.

In all functions of the RegEx API, the RegEx device is
designated by an integer >= 0 named the device identifier *dev_id*.

At the RegEx driver level, RegEx devices are represented by a generic
data structure of type *rte_regexdev*.
RegEx devices are dynamically registered during the PCI/SoC device
probing phase performed at EAL initialization time.
When a RegEx device is being probed, a *rte_regexdev* structure and
a new device identifier are allocated for that device. Then, the
regexdev_init() function supplied by the RegEx driver matching the
probed device is invoked to properly initialize the device.

The role of the device init function consists of resetting the hardware
or software RegEx driver implementations.

If the device init operation is successful, the correspondence between
the device identifier assigned to the new device and its associated
*rte_regexdev* structure is effectively registered.
Otherwise, both the *rte_regexdev* structure and the device identifier
are freed.
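
The probe-time registration described above can be sketched in plain C.
This is only an illustration under stated assumptions: the table size,
the structure `regexdev_sketch` and the function `probe_one()` are
hypothetical names invented for this example and are not part of the
library.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEVS 4 /* hypothetical table size */

/* Hypothetical stand-in for the driver-level device structure. */
struct regexdev_sketch {
	int attached; /* non-zero while the slot is registered */
	int dev_id;   /* identifier handed to the application */
};

/* Driver-supplied init function, as in the text above. */
typedef int (*regexdev_init_t)(struct regexdev_sketch *dev);

static struct regexdev_sketch dev_table[MAX_DEVS];

/*
 * Allocate a structure and a device identifier, run the driver init
 * function, and keep the registration only if init succeeds.
 * Returns the device identifier, or -1 on failure.
 */
int probe_one(regexdev_init_t init)
{
	assert(init != NULL);
	for (int id = 0; id < MAX_DEVS; id++) {
		struct regexdev_sketch *dev = &dev_table[id];

		if (dev->attached)
			continue;
		dev->attached = 1;
		dev->dev_id = id;
		if (init(dev) != 0) {
			/* Init failed: free the structure and the identifier. */
			dev->attached = 0;
			return -1;
		}
		return id;
	}
	return -1; /* no free slot left */
}

/* Two toy init functions used to exercise both outcomes. */
int init_ok(struct regexdev_sketch *dev) { (void)dev; return 0; }
int init_fail(struct regexdev_sketch *dev) { (void)dev; return -1; }
```

With these toy init functions, a failed probe leaves its identifier free,
so the next successful probe reuses it.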

The functions exported by the application RegEx API to setup a device
designated by its device identifier must be invoked in the following
order:
    - rte_regexdev_configure()
    - rte_regexdev_queue_pair_setup()
    - rte_regexdev_start()

Then, the application can invoke, in any order, the functions
exported by the RegEx API to enqueue pattern matching jobs, dequeue
pattern matching responses, get the stats, update the rule database,
get/set device attributes and so on.

If the application wants to change the configuration (i.e. call
rte_regexdev_configure() or rte_regexdev_queue_pair_setup()), it must
call rte_regexdev_stop() first to stop the device and then do the
reconfiguration before calling rte_regexdev_start() again. The enqueue and
dequeue functions should not be invoked when the device is stopped.

Finally, an application can close a RegEx device by invoking the
rte_regexdev_close() function.
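
The ordering rules above amount to a small state machine. The sketch
below models it in plain C, without calling the real library: the state
names, the `lc_*` functions and the -1 error value are all hypothetical.
It also assumes that close is accepted from any state except closed,
which the text above does not specify.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device states, named for this sketch only. */
enum lc_state { LC_UNCONFIGURED, LC_CONFIGURED, LC_QP_READY,
		LC_STARTED, LC_CLOSED };

struct lifecycle_sketch { enum lc_state state; };

/* (Re)configuration is only legal while the device is stopped. */
int lc_configure(struct lifecycle_sketch *d)
{
	assert(d != NULL);
	if (d->state == LC_STARTED || d->state == LC_CLOSED)
		return -1;
	d->state = LC_CONFIGURED;
	return 0;
}

/* Queue pairs are set up after configure and before start. */
int lc_queue_pair_setup(struct lifecycle_sketch *d)
{
	if (d->state != LC_CONFIGURED && d->state != LC_QP_READY)
		return -1;
	d->state = LC_QP_READY;
	return 0;
}

int lc_start(struct lifecycle_sketch *d)
{
	if (d->state != LC_QP_READY)
		return -1;
	d->state = LC_STARTED;
	return 0;
}

/* Stopping keeps the queue pairs, so start may follow directly. */
int lc_stop(struct lifecycle_sketch *d)
{
	if (d->state != LC_STARTED)
		return -1;
	d->state = LC_QP_READY;
	return 0;
}

/* Enqueue and dequeue are only meaningful on a started device. */
int lc_can_enqueue(const struct lifecycle_sketch *d)
{
	return d->state == LC_STARTED;
}

/* Assumption: close is accepted from any state except closed. */
int lc_close(struct lifecycle_sketch *d)
{
	if (d->state == LC_CLOSED)
		return -1;
	d->state = LC_CLOSED;
	return 0;
}
```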

Each function of the application RegEx API invokes a specific function
of the PMD that controls the target device designated by its device
identifier.

For this purpose, all device-specific functions of a RegEx driver are
supplied through a set of pointers contained in a generic structure of
type *regexdev_ops*.
The address of the *regexdev_ops* structure is stored in the
*rte_regexdev* structure by the device init function of the RegEx driver,
which is invoked during the PCI/SoC device probing phase, as explained
earlier.

In other words, each function of the RegEx API simply retrieves the
*rte_regexdev* structure associated with the device identifier and
performs an indirect invocation of the corresponding driver function
supplied in the *regexdev_ops* structure of the *rte_regexdev*
structure.

For performance reasons, the addresses of the fast-path functions of the
RegEx driver are not contained in the *regexdev_ops* structure.
Instead, they are directly stored at the beginning of the *rte_regexdev*
structure to avoid an extra indirect memory access during their
invocation.
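
The two dispatch paths can be illustrated with a self-contained model.
Everything named here (`dispatch_dev`, `dispatch_ops`, `api_start()`,
`api_enqueue_burst()` and the toy driver) is a hypothetical stand-in
chosen for this sketch; the real *rte_regexdev* and *regexdev_ops*
layouts are defined by the library and are not reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct dispatch_dev;

/* Fast-path function type (hypothetical signature). */
typedef uint16_t (*burst_fn_t)(struct dispatch_dev *dev,
			       uint16_t qp_id, uint16_t nb_ops);

/* Slow-path operations, reached through one extra pointer hop. */
struct dispatch_ops {
	int (*dev_start)(struct dispatch_dev *dev);
};

struct dispatch_dev {
	burst_fn_t enqueue;                 /* fast path: stored first */
	const struct dispatch_ops *dev_ops; /* slow-path table */
	int started;
};

#define MAX_DEVS 2
static struct dispatch_dev devices[MAX_DEVS];

/* Slow path: look up the device, then call through the ops table. */
int api_start(uint8_t dev_id)
{
	if (dev_id >= MAX_DEVS)
		return -1;
	struct dispatch_dev *dev = &devices[dev_id];
	if (dev->dev_ops == NULL || dev->dev_ops->dev_start == NULL)
		return -1;
	return dev->dev_ops->dev_start(dev);
}

/* Fast path: a single indirect call, no ops table lookup. */
uint16_t api_enqueue_burst(uint8_t dev_id, uint16_t qp_id, uint16_t nb_ops)
{
	struct dispatch_dev *dev = &devices[dev_id];
	return dev->enqueue(dev, qp_id, nb_ops);
}

/* Toy driver: accepts every op once the device is started. */
int toy_start(struct dispatch_dev *dev) { dev->started = 1; return 0; }
uint16_t toy_enqueue(struct dispatch_dev *dev, uint16_t qp_id,
		     uint16_t nb_ops)
{
	(void)qp_id;
	return dev->started ? nb_ops : 0;
}
const struct dispatch_ops toy_ops = { toy_start };

/* Driver init: store the ops table and the fast-path pointer. */
void toy_init(uint8_t dev_id)
{
	assert(dev_id < MAX_DEVS);
	devices[dev_id].enqueue = toy_enqueue;
	devices[dev_id].dev_ops = &toy_ops;
	devices[dev_id].started = 0;
}
```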

RTE RegEx device drivers do not use interrupts for enqueue or dequeue
operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
functions to applications.

The *enqueue* operation submits a burst of RegEx pattern matching
requests to the RegEx device and the *dequeue* operation gets a burst of
pattern matching responses for the ones submitted through the *enqueue*
operation.

Typical application usage of the RegEx device API follows this
programming flow:

- rte_regexdev_configure()
- rte_regexdev_queue_pair_setup()
- rte_regexdev_rule_db_update() Must be invoked if a precompiled rule
  database was not provided in rte_regexdev_config::rule_db for
  rte_regexdev_configure() and/or the application needs to update the
  rule database.
- rte_regexdev_rule_db_compile_activate() Must be invoked if
  rte_regexdev_rule_db_update() was used.
- Create or reuse an existing mempool for *rte_regex_ops* objects.
- rte_regexdev_start()
- rte_regexdev_enqueue_burst()
- rte_regexdev_dequeue_burst()

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Signed-off-by: Ori Kam <orika@mellanox.com>
jerinjacobk authored and tmonjalo committed Jul 6, 2020
1 parent 4c4f839 commit bab9497
Showing 14 changed files with 1,739 additions and 1 deletion.
5 changes: 5 additions & 0 deletions MAINTAINERS
@@ -449,6 +449,11 @@ F: app/test/test_compressdev*
F: doc/guides/prog_guide/compressdev.rst
F: doc/guides/compressdevs/features/default.ini

RegEx API - EXPERIMENTAL
M: Ori Kam <orika@mellanox.com>
F: lib/librte_regexdev/
F: doc/guides/prog_guide/regexdev.rst

Eventdev API
M: Jerin Jacob <jerinj@marvell.com>
T: git://dpdk.org/next/dpdk-next-eventdev
5 changes: 5 additions & 0 deletions config/common_base
@@ -735,6 +735,11 @@ CONFIG_RTE_LIBRTE_PMD_ISAL=n
#
CONFIG_RTE_LIBRTE_PMD_ZLIB=n

#
# Compile RegEx device support
#
CONFIG_RTE_LIBRTE_REGEXDEV=y

#
# Compile generic event device library
#
1 change: 1 addition & 0 deletions doc/api/doxy-api-index.md
@@ -20,6 +20,7 @@ The public API headers are grouped by topics:
[security] (@ref rte_security.h),
[compressdev] (@ref rte_compressdev.h),
[compress] (@ref rte_comp.h),
[regexdev] (@ref rte_regexdev.h),
[eventdev] (@ref rte_eventdev.h),
[event_eth_rx_adapter] (@ref rte_event_eth_rx_adapter.h),
[event_eth_tx_adapter] (@ref rte_event_eth_tx_adapter.h),
1 change: 1 addition & 0 deletions doc/api/doxy-api.conf.in
@@ -59,6 +59,7 @@ INPUT = @TOPDIR@/doc/api/doxy-api-index.md \
@TOPDIR@/lib/librte_power \
@TOPDIR@/lib/librte_rawdev \
@TOPDIR@/lib/librte_rcu \
@TOPDIR@/lib/librte_regexdev \
@TOPDIR@/lib/librte_reorder \
@TOPDIR@/lib/librte_rib \
@TOPDIR@/lib/librte_ring \
1 change: 1 addition & 0 deletions doc/guides/prog_guide/index.rst
@@ -26,6 +26,7 @@ Programmer's Guide
bbdev
cryptodev_lib
compressdev
regexdev
rte_security
rawdev
link_bonding_poll_mode_drv_lib
177 changes: 177 additions & 0 deletions doc/guides/prog_guide/regexdev.rst
@@ -0,0 +1,177 @@
.. SPDX-License-Identifier: BSD-3-Clause
Copyright 2020 Mellanox Technologies, Ltd
RegEx Device Library
====================

The RegEx library provides a RegEx device framework for management and
provisioning of hardware and software RegEx poll mode drivers, defining generic
APIs which support a number of different RegEx operations.


Design Principles
-----------------

The RegEx library follows the same basic principles as those used in DPDK's
Ethernet Device framework and the Crypto framework. The RegEx framework provides
a generic RegEx device framework which supports both physical (hardware)
and virtual (software) RegEx devices as well as a generic RegEx API which allows
RegEx devices to be managed and configured and supports RegEx operations to be
provisioned on a RegEx poll mode driver.


Device Management
-----------------

Device Creation
~~~~~~~~~~~~~~~

Physical RegEx devices are discovered during the PCI probe/enumeration
performed by the EAL at DPDK initialization, based on
their PCI device identifier, that is, each unique PCI BDF (bus/bridge, device,
function). Specific physical RegEx devices, like other physical devices in DPDK,
can be white-listed or black-listed using the EAL command line options.


Device Identification
~~~~~~~~~~~~~~~~~~~~~

Each device, whether virtual or physical, is uniquely designated by two
identifiers:

- A unique device index used to designate the RegEx device in all functions
exported by the regexdev API.

- A device name used to designate the RegEx device in console messages, for
administration or debugging purposes.


Device Configuration
~~~~~~~~~~~~~~~~~~~~

The configuration of each RegEx device includes the following operations:

- Allocation of resources, including hardware resources if a physical device.
- Resetting the device into a well-known default state.
- Initialization of statistics counters.

The ``rte_regexdev_configure`` API is used to configure a RegEx device.

.. code-block:: c
int rte_regexdev_configure(uint8_t dev_id,
const struct rte_regexdev_config *cfg);
The ``rte_regexdev_config`` structure is used to pass the configuration
parameters for the RegEx device, for example the number of queue pairs, the
number of groups, the maximum number of matches and so on.

.. code-block:: c
struct rte_regexdev_config {
uint16_t nb_max_matches;
/**< Maximum matches per scan configured on this device.
* This value cannot exceed the *max_matches*
* previously provided by rte_regexdev_info_get().
* The value 0 is allowed, in which case the value 1 is used.
* @see struct rte_regexdev_info::max_matches
*/
uint16_t nb_queue_pairs;
/**< Number of RegEx queue pairs to configure on this device.
* This value cannot exceed the *max_queue_pairs* previously
* provided by rte_regexdev_info_get().
* @see struct rte_regexdev_info::max_queue_pairs
*/
uint32_t nb_rules_per_group;
/**< Number of rules per group to configure on this device.
* This value cannot exceed the *max_rules_per_group*
* previously provided by rte_regexdev_info_get().
* The value 0 is allowed, in which case
* struct rte_regexdev_info::max_rules_per_group is used.
* @see struct rte_regexdev_info::max_rules_per_group
*/
uint16_t nb_groups;
/**< Number of groups to configure on this device.
* This value cannot exceed the *max_groups*
* previously provided by rte_regexdev_info_get().
* @see struct rte_regexdev_info::max_groups
*/
const char *rule_db;
/**< Import an initial prebuilt rule database on this device.
* The value NULL is allowed, in which case the device will not
* be configured with a prebuilt rule database. The application may use
* the rte_regexdev_rule_db_update() or rte_regexdev_rule_db_import() API
* to update or import a rule database after
* rte_regexdev_configure().
* @see rte_regexdev_rule_db_update(), rte_regexdev_rule_db_import()
*/
uint32_t rule_db_len;
/**< Length of *rule_db* buffer. */
uint32_t dev_cfg_flags;
/**< RegEx device configuration flags, See RTE_REGEXDEV_CFG_* */
};
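
The limit and default rules documented in the structure comments can be
sketched as a standalone check. The structures below are hypothetical,
trimmed-down stand-ins that reuse the field names mentioned above;
``normalize_config()`` is an invented helper, not a library function, and
the real library may apply these rules in a different place.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the limits reported by the device. */
struct info_sketch {
	uint16_t max_matches;
	uint16_t max_queue_pairs;
	uint32_t max_rules_per_group;
	uint16_t max_groups;
};

/* Hypothetical subset of the requested configuration. */
struct config_sketch {
	uint16_t nb_max_matches;
	uint16_t nb_queue_pairs;
	uint32_t nb_rules_per_group;
	uint16_t nb_groups;
};

/*
 * Reject a configuration that exceeds any reported limit, then apply
 * the documented defaults for zero values.
 * Returns 0 on success, -1 if a limit is exceeded.
 */
int normalize_config(const struct info_sketch *info,
		     struct config_sketch *cfg)
{
	assert(info != NULL && cfg != NULL);
	if (cfg->nb_max_matches > info->max_matches ||
	    cfg->nb_queue_pairs > info->max_queue_pairs ||
	    cfg->nb_rules_per_group > info->max_rules_per_group ||
	    cfg->nb_groups > info->max_groups)
		return -1;
	if (cfg->nb_max_matches == 0)
		cfg->nb_max_matches = 1; /* 0 means "use 1" */
	if (cfg->nb_rules_per_group == 0)
		cfg->nb_rules_per_group = info->max_rules_per_group;
	return 0;
}
```

An application would typically run such a check on the values obtained
from ``rte_regexdev_info_get`` before calling ``rte_regexdev_configure``.
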
Configuration of Rules Database
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Each RegEx device should be configured with a rule database.
There are two modes of setting the rule database, online or offline.
In online mode, the rule database is compiled by the RegEx PMD,
while in offline mode the rule database is compiled by an external
compiler and is loaded to the PMD as a buffer.
The configuration mode depends on the PMD capabilities.

Online rule configuration is done using the following API functions:
``rte_regexdev_rule_db_update``, which adds / removes rules from the
list of rules waiting to be compiled, and
``rte_regexdev_rule_db_compile_activate``,
which compiles the rules and loads them to the RegEx HW.

Offline rule configuration can be done by adding a pointer to the compiled
rule database in the configuration step, or by using
``rte_regexdev_rule_db_import`` API.
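
The online mode can be pictured as a two-step process: updates only touch
a staging list, and nothing reaches the device until compile/activate is
called. The model below is a sketch under stated assumptions: all types
and functions (``rule_sketch``, ``db_update()``, ``db_compile_activate()``)
are hypothetical, and "compilation" is reduced to a copy.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_RULES 8 /* hypothetical capacity */

enum rule_op_sketch { RULE_ADD, RULE_REMOVE };

/* One update request (hypothetical layout). */
struct rule_sketch {
	enum rule_op_sketch op;
	uint16_t group_id;
	uint32_t rule_id;
	const char *pcre; /* pattern text; unused for RULE_REMOVE */
};

struct rule_entry {
	uint16_t group_id;
	uint32_t rule_id;
	const char *pcre;
};

struct rule_db_sketch {
	struct rule_entry staged[MAX_RULES]; /* pending, not yet compiled */
	uint32_t nb_staged;
	struct rule_entry active[MAX_RULES]; /* what the device scans with */
	uint32_t nb_active;
};

/* Apply add/remove requests to the staging list only.
 * Returns the number of requests applied before the first failure. */
int db_update(struct rule_db_sketch *db, const struct rule_sketch *rules,
	      uint32_t nb_rules)
{
	uint32_t done = 0;

	assert(db != NULL && rules != NULL);
	for (uint32_t i = 0; i < nb_rules; i++, done++) {
		const struct rule_sketch *r = &rules[i];
		uint32_t j;

		if (r->op == RULE_ADD) {
			if (db->nb_staged == MAX_RULES)
				break; /* staging list is full */
			db->staged[db->nb_staged++] = (struct rule_entry){
				r->group_id, r->rule_id, r->pcre };
			continue;
		}
		for (j = 0; j < db->nb_staged; j++)
			if (db->staged[j].group_id == r->group_id &&
			    db->staged[j].rule_id == r->rule_id)
				break;
		if (j == db->nb_staged)
			break; /* rule to remove was not found */
		db->staged[j] = db->staged[--db->nb_staged];
	}
	return (int)done;
}

/* Stand-in for compilation: publish the staged rules as active. */
void db_compile_activate(struct rule_db_sketch *db)
{
	memcpy(db->active, db->staged, sizeof(db->staged));
	db->nb_active = db->nb_staged;
}
```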


Configuration of Queue Pairs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Each RegEx device can be configured with a number of queue pairs.
Each queue pair is configured using ``rte_regexdev_queue_pair_setup``.


Logical Cores, Memory and Queues Pair Relationships
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Multiple logical cores should never share the same queue pair for enqueuing
operations or dequeuing operations on the same RegEx device since this would
require global locks and hinder performance.
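
One simple way to respect this rule is to give every worker core its own
queue pair up front. The helpers below are a minimal sketch, assuming
workers are numbered from 0; ``assign_queue_pairs()`` and
``queue_pairs_are_exclusive()`` are hypothetical names, not library
functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Give worker w the queue pair w. Fails when there are fewer queue
 * pairs than workers, since sharing would require locking.
 * Returns 0 on success, -1 otherwise.
 */
int assign_queue_pairs(uint16_t nb_workers, uint16_t nb_queue_pairs,
		       uint16_t *qp_of_worker)
{
	assert(qp_of_worker != NULL);
	if (nb_workers > nb_queue_pairs)
		return -1;
	for (uint16_t w = 0; w < nb_workers; w++)
		qp_of_worker[w] = w;
	return 0;
}

/* Return 1 if no two workers share a queue pair, 0 otherwise. */
int queue_pairs_are_exclusive(const uint16_t *qp_of_worker,
			      uint16_t nb_workers)
{
	for (uint16_t a = 0; a < nb_workers; a++)
		for (uint16_t b = (uint16_t)(a + 1); b < nb_workers; b++)
			if (qp_of_worker[a] == qp_of_worker[b])
				return 0;
	return 1;
}
```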


Device Features and Capabilities
---------------------------------

RegEx devices may support different feature sets.
To query the features supported by a PMD, use the ``rte_regexdev_info_get``
API, which returns the device info and its supported features.


Enqueue / Dequeue Burst APIs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The burst enqueue API uses a RegEx device identifier and a queue pair
identifier to specify the device queue pair to schedule the processing on.
The ``nb_ops`` parameter is the number of operations to process which are
supplied in the ``ops`` array of ``rte_regex_ops`` structures.
The enqueue function returns the number of operations it actually enqueued for
processing; a return value equal to ``nb_ops`` means that all operations have
been enqueued.

Data pointed to by each op should not be released until that op has been
dequeued.

The dequeue API uses the same format as the enqueue API, but
the ``nb_ops`` and ``ops`` parameters are now used to specify the maximum
number of processed operations the user wishes to retrieve and the location in
which to store them. The API call returns the actual number of processed
operations returned; this can never be larger than ``nb_ops``.
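
The return-value semantics of both calls can be modelled with a small
bounded queue. This is a sketch only: ``qp_sketch``, ``op_sketch`` and the
``sk_*`` functions are hypothetical, the queue depth is arbitrary, and
"processing" is reduced to setting a flag at dequeue time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QP_DEPTH 4 /* hypothetical queue pair depth */

/* Hypothetical operation descriptor. */
struct op_sketch {
	uint64_t user_id; /* lets the caller recognise its own op */
	int matched;      /* filled in when the op completes */
};

/* Bounded FIFO of in-flight ops. */
struct qp_sketch {
	struct op_sketch *slots[QP_DEPTH];
	uint16_t head;  /* next slot to dequeue */
	uint16_t count; /* ops currently in flight */
};

/* Enqueue up to nb_ops; returns how many were actually accepted. */
uint16_t sk_enqueue_burst(struct qp_sketch *qp, struct op_sketch **ops,
			  uint16_t nb_ops)
{
	uint16_t n = 0;

	assert(qp != NULL && ops != NULL);
	while (n < nb_ops && qp->count < QP_DEPTH) {
		qp->slots[(qp->head + qp->count) % QP_DEPTH] = ops[n++];
		qp->count++;
	}
	return n;
}

/* Dequeue up to nb_ops completed ops; never returns more than nb_ops. */
uint16_t sk_dequeue_burst(struct qp_sketch *qp, struct op_sketch **ops,
			  uint16_t nb_ops)
{
	uint16_t n = 0;

	assert(qp != NULL && ops != NULL);
	while (n < nb_ops && qp->count > 0) {
		struct op_sketch *op = qp->slots[qp->head];

		op->matched = 1; /* pretend the scan found a match */
		ops[n++] = op;
		qp->head = (uint16_t)((qp->head + 1) % QP_DEPTH);
		qp->count--;
	}
	return n;
}
```

Because the queue only stores pointers, the ops and the data they refer
to must stay valid until they come back from the dequeue call. A caller
whose enqueue returns less than ``nb_ops`` retries the remaining ops
later, typically after dequeuing some responses.
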

5 changes: 5 additions & 0 deletions doc/guides/rel_notes/release_20_08.rst
@@ -56,6 +56,11 @@ New Features
Also, make sure to start the actual text at the margin.
=========================================================
* **Added the RegEx Library, a generic RegEx service library.**

Added the RegEx library, which provides an API for offloading regular
expression search operations to hardware or software accelerator devices.

* **Updated PCAP driver.**

Updated PCAP driver with new features and improvements, including:
2 changes: 2 additions & 0 deletions lib/Makefile
@@ -40,6 +40,8 @@ DEPDIRS-librte_security += librte_cryptodev
DIRS-$(CONFIG_RTE_LIBRTE_COMPRESSDEV) += librte_compressdev
DEPDIRS-librte_compressdev := librte_eal librte_mempool librte_ring librte_mbuf
DEPDIRS-librte_compressdev += librte_kvargs
DIRS-$(CONFIG_RTE_LIBRTE_REGEXDEV) += librte_regexdev
DEPDIRS-librte_regexdev := librte_eal librte_mbuf
DIRS-$(CONFIG_RTE_LIBRTE_EVENTDEV) += librte_eventdev
DEPDIRS-librte_eventdev := librte_eal librte_ring librte_ethdev librte_hash \
librte_mempool librte_timer librte_cryptodev
30 changes: 30 additions & 0 deletions lib/librte_regexdev/Makefile
@@ -0,0 +1,30 @@
# SPDX-License-Identifier: BSD-3-Clause
# Copyright(C) 2019 Marvell International Ltd.
# Copyright 2020 Mellanox Technologies, Ltd

include $(RTE_SDK)/mk/rte.vars.mk

# library name
LIB = librte_regexdev.a

EXPORT_MAP := rte_regex_version.map

# library version
LIBABIVER := 1

# build flags
CFLAGS += -O3
CFLAGS += $(WERROR_FLAGS)
LDLIBS += -lrte_eal -lrte_mbuf

# library source files
# all source are stored in SRCS-y
SRCS-$(CONFIG_RTE_LIBRTE_REGEXDEV) := rte_regexdev.c

# export include files
SYMLINK-$(CONFIG_RTE_LIBRTE_REGEXDEV)-include += rte_regexdev.h

# versioning export map
EXPORT_MAP := rte_regexdev_version.map

include $(RTE_SDK)/mk/rte.lib.mk
6 changes: 6 additions & 0 deletions lib/librte_regexdev/meson.build
@@ -0,0 +1,6 @@
# SPDX-License-Identifier: BSD-3-Clause
# Copyright 2020 Mellanox Technologies, Ltd

sources = files('rte_regexdev.c')
headers = files('rte_regexdev.h')
deps += ['mbuf']
6 changes: 6 additions & 0 deletions lib/librte_regexdev/rte_regexdev.c
@@ -0,0 +1,6 @@
/* SPDX-License-Identifier: BSD-3-Clause
* Copyright(C) 2019 Marvell International Ltd.
* Copyright 2020 Mellanox Technologies, Ltd
*/

#include "rte_regexdev.h"
