Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cake: tc shows number of active flows within class #1

Closed
wants to merge 1 commit into from

Conversation

ldir-EDB0
Copy link
Contributor

Hi Dave,

A small tweak with tc to show the number of active flows from cake. Update to pkt_sched.h to match sch_cake's idea of the structure.

Kevin

@ldir-EDB0 ldir-EDB0 closed this Sep 25, 2015
tohojo pushed a commit that referenced this pull request Mar 6, 2018
As Stephen Hemminger mentioned on the last submission the new_json_obj
function is always called with fp == stdout, so right now, there's no
need of this extra argument.

The background for the rework is the following:
The ip monitor didn't call `new_json_obj` (even for in non json context),
so the static FILE* _fp variable wasn't initialized, thus raising a
SIGSEGV in ipaddress.c. This patch should fix this issue for good, new
paths won't have to call `new_json_obj`.

How to reproduce:

$ ip -t mon label link
(gdb) bt
.#0  _IO_vfprintf_internal (s=s@entry=0x0, format=format@entry=0x45460d “%d: “, ap=ap@entry=0x7fffffff7f18) at vfprintf.c:1278
.#1  0x0000000000451310 in color_fprintf (fp=0x0, attr=<optimized out>, fmt=0x45460d “%d: “) at color.c:108
.#2  0x000000000044a856 in print_color_int (t=t@entry=PRINT_ANY, color=color@entry=4294967295, key=key@entry=0x4545fc “ifindex”,
    fmt=fmt@entry=0x45460d “%d: “, value=<optimized out>) at ip_print.c:132
.#3  0x000000000040ccd2 in print_int (value=<optimized out>, fmt=0x45460d “%d: “, key=0x4545fc “ifindex”, t=PRINT_ANY) at ip_common.h:189
.#4  print_linkinfo (who=<optimized out>, n=0x7fffffffa380, arg=0x7ffff77a82a0 <_IO_2_1_stdout_>) at ipaddress.c:1107
.#5  0x0000000000422e13 in accept_msg (who=0x7fffffff8320, ctrl=0x7fffffff8310, n=0x7fffffffa380, arg=0x7ffff77a82a0 <_IO_2_1_stdout_>) at ipmonitor.c:89
.#6  0x000000000044c58f in rtnl_listen (rtnl=0x672160 <rth>, handler=handler@entry=0x422c70 <accept_msg>, jarg=0x7ffff77a82a0 <_IO_2_1_stdout_>)
    at libnetlink.c:761
.#7  0x00000000004233db in do_ipmonitor (argc=<optimized out>, argv=0x7fffffffe5a0) at ipmonitor.c:310
.#8  0x0000000000408f74 in do_cmd (argv0=0x7fffffffe7f5 “mon”, argc=3, argv=0x7fffffffe588) at ip.c:116
.#9  0x0000000000408a94 in main (argc=4, argv=0x7fffffffe580) at ip.c:311

Fixes: 6377572 ("ip: ip_print: add new API to print JSON or regular format output")
Reported-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Julien Fortin <julien@cumulusnetworks.com>
tohojo pushed a commit that referenced this pull request Mar 6, 2018
As Stephen Hemminger mentioned on the last submission the new_json_obj
function is always called with fp == stdout, so right now, there's no
need of this extra argument.

The background for the rework is the following:
The ip monitor didn't call `new_json_obj` (even for in non json context),
so the static FILE* _fp variable wasn't initialized, thus raising a
SIGSEGV in ipaddress.c. This patch should fix this issue for good, new
paths won't have to call `new_json_obj`.

How to reproduce:

$ ip -t mon label link
(gdb) bt
.#0  _IO_vfprintf_internal (s=s@entry=0x0, format=format@entry=0x45460d “%d: “, ap=ap@entry=0x7fffffff7f18) at vfprintf.c:1278
.#1  0x0000000000451310 in color_fprintf (fp=0x0, attr=<optimized out>, fmt=0x45460d “%d: “) at color.c:108
.#2  0x000000000044a856 in print_color_int (t=t@entry=PRINT_ANY, color=color@entry=4294967295, key=key@entry=0x4545fc “ifindex”,
    fmt=fmt@entry=0x45460d “%d: “, value=<optimized out>) at ip_print.c:132
.#3  0x000000000040ccd2 in print_int (value=<optimized out>, fmt=0x45460d “%d: “, key=0x4545fc “ifindex”, t=PRINT_ANY) at ip_common.h:189
.#4  print_linkinfo (who=<optimized out>, n=0x7fffffffa380, arg=0x7ffff77a82a0 <_IO_2_1_stdout_>) at ipaddress.c:1107
.#5  0x0000000000422e13 in accept_msg (who=0x7fffffff8320, ctrl=0x7fffffff8310, n=0x7fffffffa380, arg=0x7ffff77a82a0 <_IO_2_1_stdout_>) at ipmonitor.c:89
.#6  0x000000000044c58f in rtnl_listen (rtnl=0x672160 <rth>, handler=handler@entry=0x422c70 <accept_msg>, jarg=0x7ffff77a82a0 <_IO_2_1_stdout_>)
    at libnetlink.c:761
.#7  0x00000000004233db in do_ipmonitor (argc=<optimized out>, argv=0x7fffffffe5a0) at ipmonitor.c:310
.#8  0x0000000000408f74 in do_cmd (argv0=0x7fffffffe7f5 “mon”, argc=3, argv=0x7fffffffe588) at ip.c:116
.#9  0x0000000000408a94 in main (argc=4, argv=0x7fffffffe580) at ip.c:311

Fixes: 6377572 ("ip: ip_print: add new API to print JSON or regular format output")
Reported-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Julien Fortin <julien@cumulusnetworks.com>
ldir-EDB0 pushed a commit to ldir-EDB0/iproute2 that referenced this pull request Jun 28, 2020
Ido Schimmel  says:

====================

From: Ido Schimmel <idosch@mellanox.com>

This patchset adds devlink-trap support in iproute2.

Patch dtaht#1 increases the number of options devlink can handle.

Patches dtaht#2-dtaht#3 gradually add support for all devlink-trap commands.

Patch dtaht#4 adds a man page for devlink-trap.

See individual commit messages for example usage and output.

Changes in v2:
* Remove report option and monitor command since monitoring is done
  using drop monitor

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
ldir-EDB0 pushed a commit to ldir-EDB0/iproute2 that referenced this pull request Jun 28, 2020
Petr Machata  says:

====================

A new Qdisc, "ETS", has been accepted into Linux at kernel commit
6bff00170277 ("Merge branch 'ETS-qdisc'"). Add iproute2 support for this
Qdisc.

Patch dtaht#1, changes libnetlink to admit NLA_F_NESTED in nested attributes.
Patch dtaht#2 then adds ETS support as such.

Examples (taken from the kernel patchset):

- Add a Qdisc with 6 bands, 3 strict and 3 ETS with 45%-30%-25% weights:

    # tc qdisc add dev swp1 root handle 1: \
	ets strict 3 quanta 4500 3000 2500 priomap 0 1 1 1 2 3 4 5
    # tc qdisc sh dev swp1
    qdisc ets 1: root refcnt 2 bands 6 strict 3 quanta 4500 3000 2500 priomap 0 1 1 1 2 3 4 5 5 5 5 5 5 5 5 5

- Tweak quantum of one of the classes of the previous Qdisc:

    # tc class ch dev swp1 classid 1:4 ets quantum 1000
    # tc qdisc sh dev swp1
    qdisc ets 1: root refcnt 2 bands 6 strict 3 quanta 1000 3000 2500 priomap 0 1 1 1 2 3 4 5 5 5 5 5 5 5 5 5
    # tc class ch dev swp1 classid 1:3 ets quantum 1000
    Error: Strict bands do not have a configurable quantum.

- Purely strict Qdisc with 1:1 mapping between priorities and TCs:

    # tc qdisc add dev swp1 root handle 1: \
	ets strict 8 priomap 7 6 5 4 3 2 1 0
    # tc qdisc sh dev swp1
    qdisc ets 1: root refcnt 2 bands 8 strict 8 priomap 7 6 5 4 3 2 1 0 7 7 7 7 7 7 7 7

- Use "bands" to specify number of bands explicitly. Underspecified bands
  are implicitly ETS and their quantum is taken from MTU. The following
  thus gives each band the same weight:

    # tc qdisc add dev swp1 root handle 1: \
	ets bands 8 priomap 7 6 5 4 3 2 1 0
    # tc qdisc sh dev swp1
    qdisc ets 1: root refcnt 2 bands 8 quanta 1514 1514 1514 1514 1514 1514 1514 1514 priomap 7 6 5 4 3 2 1 0 7 7 7 7 7 7 7 7

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
ldir-EDB0 pushed a commit to ldir-EDB0/iproute2 that referenced this pull request Dec 9, 2020
Petr Machata  says:

====================

To allow configuring user-defined actions as a result of inner workings of
a qdisc, a concept of qevents was recently introduced to the kernel.
Qevents are attach points for TC blocks, where filters can be put that are
executed as the packet hits well-defined points in the qdisc algorithms.
The attached blocks can be shared, in a manner similar to clsact ingress
and egress blocks, arbitrary classifiers with arbitrary actions can be put
on them, etc.

For example:

 # tc qdisc add dev eth0 root handle 1: \
	red limit 500K avpkt 1K qevent early_drop block 10
 # tc filter add block 10 \
	matchall action mirred egress mirror dev eth1

This patch set introduces the corresponding iproute2 support. Patch dtaht#1 adds
the new netlink attribute enumerators. Patch dtaht#2 adds a set of helpers to
implement qevents, and dtaht#3 adds a generic documentation to tc.8. Patch dtaht#4
then adds two new qevents to the RED qdisc: mark and early_drop.

====================

Signed-off-by: David Ahern <dsahern@kernel.org>
ldir-EDB0 pushed a commit to ldir-EDB0/iproute2 that referenced this pull request Dec 9, 2020
Petr Machata  says:

====================

When a list of filters at a given block is requested, tc first validates
that the block exists before doing the filter query. Currently the
validation routine checks ingress and egress blocks. But now that blocks
can be bound to qevents as well, qevent blocks should be looked for as
well:

    # ip link add up type dummy
    # tc qdisc add dev dummy1 root handle 1: \
         red min 30000 max 60000 avpkt 1000 qevent early_drop block 100
    # tc filter add block 100 pref 1234 handle 102 matchall action drop
    # tc filter show block 100
    Cannot find block "100"

This patchset fixes this issue:

    # tc filter show block 100
    filter protocol all pref 1234 matchall chain 0
    filter protocol all pref 1234 matchall chain 0 handle 0x66
      not_in_hw
            action order 1: gact action drop
             random type none pass val 0
             index 2 ref 1 bind 1

In patch dtaht#1, the helpers and necessary infrastructure is introduced,
including a new qdisc_util callback that implements sniffing out bound
blocks in a given qdisc.

In patch dtaht#2, RED implements the new callback.

v3:
- Patch dtaht#1:
    - Do not pass &ctx->found directly to has_block. Do it through a
      helper variable, so that the callee does not overwrite the result
      already stored in ctx->found.

v2:
- Patch dtaht#1:
    - In tc_qdisc_block_exists_cb(), do not initialize 'q'.
    - Propagate upwards errors from q->has_block.

====================

Signed-off-by: David Ahern <dsahern@kernel.org>
ldir-EDB0 pushed a commit to ldir-EDB0/iproute2 that referenced this pull request Feb 12, 2022
Petr Machata  says:
====================

The Linux DCB interface allows configuration of a broad range of
hardware-specific attributes, such as TC scheduling, flow control, per-port
buffer configuration, TC rate, etc.

Currently a common libre tool for configuration of DCB is OpenLLDP. This
suite contains a daemon that uses Linux DCB interface to configure HW
according to the DCB TLVs exchanged over an interface. The daemon can also
be controlled by a client, through which the user can adjust and view the
configuration. The downside of using OpenLLDP is that it is somewhat
heavyweight and difficult to use in scripts, and does not support
extensions such as buffer and rate commands.

For access to many HW features, one would be perfectly fine with a
fire-and-forget tool along the lines of "ip" or "tc". For scripting in
particular, this would be ideal. This author is aware of one such tool,
mlnx_qos from Mellanox OFED scripts collection[1].

The downside here is that the tool is very verbose, the command line
language is awkward to use, it is not packaged in Linux distros, and
generally has the appearance of a very vendor-specific tool, despite not
being one.

This patchset addresses the above issues by providing a seed of a clean,
well-documented, easily usable, extensible fire-and-forget tool for DCB
configuration:

    # dcb ets set dev eni1np1 \
                  tc-tsa all:strict 0:ets 1:ets 2:ets \
		  tc-bw all:0 0:33 1:33 2:34

    # dcb ets show dev eni1np1 tc-tsa tc-bw
    tc-tsa 0:ets 1:ets 2:ets 3:strict 4:strict 5:strict 6:strict 7:strict
    tc-bw 0:33 1:33 2:34 3:0 4:0 5:0 6:0 7:0

    # dcb ets set dev eni1np1 tc-bw 1:30 2:37

    # dcb -j ets show dev eni1np1 | jq '.tc_bw[2]'
    37

The patchset proceeds as follows:

- Many tools in iproute2 have an option to work in batch mode, where the
  commands to run are given in a file. The code to handle batching is
  largely the same independent of the tool in question. In patch dtaht#1, add a
  helper to handle the batching, and migrate individual tools to use it.

- A number of configuration options come in a form of an on-off switch.
  This in turn can be considered a special case of parsing one of a given
  set of strings. In patch dtaht#2, extract helpers to parse one of a number of
  strings, on top of which build an on-off parser.

  Currently each tool open-codes the logic to parse the on-off toggle. A
  future patch set will migrate instances of this code over to the new
  helpers.

- The on/off toggles from previous list item sometimes need to be dumped.
  While in the FP output, one typically wishes to maintain consistency with
  the command line and show actual strings, "on" and "off", in JSON output
  one would rather use booleans. This logic is somewhat annoying to have to
  open-code time and again. Therefore in patch dtaht#3, add a helper to do just
  that.

- The DCB tool is built on top of libmnl. Several routines will be
  basically the same in DCB as they are currently in devlink. In patches
  dtaht#4-dtaht#6, extract them to a new module, mnl_utils, for easy reuse.

- Much of DCB is built around arrays. A syntax similar to the iplink_vlan's
  ingress-qos-map / egress-qos-map is very handy for describing changes
  done to such arrays. Therefore in patch dtaht#7, extract a helper,
  parse_mapping(), which manages parsing of key-value arrays. In patch dtaht#8,
  fix a buglet in the helper, and in patch dtaht#9, extend it to allow setting
  of all array elements in one go.

- In patch dtaht#10, add a skeleton of "dcb", which contains common helpers and
  dispatches to subtools for handling of individual objects. The skeleton
  is empty as of this patch.

  In patch dtaht#11, add "dcb_ets", a module for handling of specifically DCB
  ETS objects.

  The intention is to gradually add handlers for at least PFC, APP, peer
  configuration, buffers and rates.

[1] https://github.com/Mellanox/mlnx-tools/tree/master/ofed_scripts

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
ldir-EDB0 pushed a commit to ldir-EDB0/iproute2 that referenced this pull request Feb 12, 2022
Ido Schimmel  says:

====================

From: Ido Schimmel <idosch@nvidia.com>

Patch dtaht#1 prints the recently added 'RTNH_F_TRAP' flag.

Patch dtaht#2 makes sure that nexthop flags are always printed for nexthop
objects. Even when the nexthop does not have a device, such as a
blackhole nexthop or a group.

Example output with netdevsim:

$ ip nexthop
id 1 via 192.0.2.2 dev eth0 scope link trap
id 2 blackhole trap
id 3 group 2 trap

Example output with mlxsw:

$ ip nexthop
id 1 via 192.0.2.2 dev swp3 scope link offload
id 2 blackhole offload
id 3 group 2 offload

Tested with fib_nexthops.sh that uses "ip nexthop" output:

Tests passed: 164
Tests failed:   0

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
ldir-EDB0 pushed a commit to ldir-EDB0/iproute2 that referenced this pull request Feb 12, 2022
Petr Machata says:
==================

The DCB tool will have commands that deal with buffer sizes and traffic
rates. TC is another tool that has a number of such commands, and functions
to support them: get_size(), get_rate/64(), s/print_size() and
s/print_rate(). In this patchset, these functions are moved from TC to lib/
for possible reuse and modernized.

s/print_rate() has a hidden parameter of a global variable use_iec, which
made the conversion non-trivial. The parameter was made explicit,
print_rate() converted to a mostly json_print-like function, and
sprint_rate() retired in favor of the new print_rate. Patches dtaht#1 and dtaht#2
deal with this.

The intention was to treat s/print_size() similarly, but unfortunately two
use cases of sprint_size() cannot be converted to a json_print-like
print_size(), and the function sprint_size() had to remain as a discouraged
backdoor to print_size(). This is done in patch dtaht#3.

Patch dtaht#4 then improves the code of sprint_size() a little bit.

Patch dtaht#5 fixes a buglet in formatting small rates in IEC mode.

Patches dtaht#6 and dtaht#7 handle a routine movement of, respectively,
get_rate/64() and get_size() from tc to lib.

This patchset does not actually add any new uses of these functions. A
follow-up patchset will add subtools for management of DCB buffer and DCB
maxrate objects that will make use of them.

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
ldir-EDB0 pushed a commit to ldir-EDB0/iproute2 that referenced this pull request Feb 12, 2022
Petr Machata  says:
====================

Add support to the dcb tool for the following three DCB objects:

- PFC, for "Priority-based Flow Control", allows configuration of priority
  lossiness, and related toggles.

- DCBNL buffer interfaces are an extension to the 802.1q DCB interfaces and
  allow configuration of port headroom buffers.

- DCBNL maxrate interfaces are an extension to the 802.1q DCB interfaces
  and allow configuration of rate with which traffic in a given traffic
  class is sent.

Patches dtaht#1-dtaht#4 fix small issues in the current DCB code and man pages.

Patch dtaht#5 adds new helpers to the DCB dispatcher.

Patches dtaht#6 and dtaht#7 add support for command line arguments -s and -i. These
enable, respectively, display of statistical counters, and ISO/IEC mode of
rate units.

Patches dtaht#8-dtaht#10 add the subtools themselves and their man pages.

====================

Signed-off-by: David Ahern <dsahern@gmail.com>
ldir-EDB0 pushed a commit to ldir-EDB0/iproute2 that referenced this pull request Feb 12, 2022
Petr Machata  says:

====================

Add support to the dcb tool for the following two DCB objects:

- APP, which allows configuration of traffic prioritization rules based on
  several possible packet headers.

- DCBX, which is a 1-byte bitfield of flags that configure whether the DCBX
  protocol is implemented in the device or in the host, and which version
  of the protocol should be used.

Patch dtaht#1 adds a new helper for finding a name of a given dsfield value.
This is useful for APP DSCP-to-priority rules, which can use human-readable
DSCP names.

Patches dtaht#2, dtaht#3 and dtaht#4 extend existing interfaces for, respectively, parsing
of the X:Y mappings, for setting a DCB object, and for getting a DCB
object.

In patch dtaht#5, support for the command line argument -N / --Numeric is
added. The APP tool later uses it to decide whether to format DSCP values
as human-readable strings or as plain numbers.

Patches dtaht#6 and dtaht#7 add the subtools themselves and their man pages.

v2:
- Two patches dropped and sent to iproute2 branch as "dcb: Fixes".
  This patch set now depends on that one.
- Patch dtaht#5:
    - Make it -N / --Numeric instead of -n / --no-nice-names
    - Rename the flag from no_nice_names to numeric as well
- Patch dtaht#6:
    - Adjust to s/no_nice_names/numeric/ from another patch.

====================

Signed-off-by: David Ahern <dsahern@kernel.org>
ldir-EDB0 pushed a commit to ldir-EDB0/iproute2 that referenced this pull request Feb 12, 2022
Petr Machata  says:

====================

Support for resilient next-hop groups was recently accepted to Linux
kernel[1]. Resilient next-hop groups add a layer of indirection between the
SKB hash and the next hop. Thus the hash is used to reference a hash table
bucket, which is then used to reference a particular next hop. This allows
the system more flexibility when assigning SKB hash space to next hops.
Previously, each next hop had to be assigned a continuous range of SKB hash
space. With a hash table as an intermediate layer, it is possible to
reassign next hops with a hash table bucket granularity. In turn, this
mends issues with traffic flow redirection resulting from next hop removal
or adjustments in next-hop weights.

In this patch set, introduce support for resilient next-hop groups to
iproute2.

- Patch dtaht#1 brings include/uapi/linux/nexthop.h and /rtnetlink.h up to date.

- Patches dtaht#2 and dtaht#3 add new helpers that will be useful later.

- Patch dtaht#4 extends the ip/nexthop sub-tool to accept group type as a
  command line argument, and to dispatch based on the specified type.

- Patch dtaht#5 adds the support for resilient next-hop groups.

- Patch dtaht#6 adds the support for resilient next-hop group bucket interface.

To illustrate the usage, consider the following commands:

 # ip nexthop add id 1 via 192.0.2.2 dev dummy1
 # ip nexthop add id 2 via 192.0.2.3 dev dummy1
 # ip nexthop add id 10 group 1/2 type resilient \
	buckets 8 idle_timer 60 unbalanced_timer 300

The last command creates a resilient next-hop group. It will have 8
buckets, each bucket will be considered idle when no traffic hits it for at
least 60 seconds, and if the table remains out of balance for 300 seconds,
it will be forcefully brought into balance.

And this is how the next-hop group bucket interface looks:

 # ip nexthop bucket show id 10
 id 10 index 0 idle_time 5.59 nhid 1
 id 10 index 1 idle_time 5.59 nhid 1
 id 10 index 2 idle_time 8.74 nhid 2
 id 10 index 3 idle_time 8.74 nhid 2
 id 10 index 4 idle_time 8.74 nhid 1
 id 10 index 5 idle_time 8.74 nhid 1
 id 10 index 6 idle_time 8.74 nhid 1
 id 10 index 7 idle_time 8.74 nhid 1

[1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=2a0186a37700b0d5b8cc40be202a62af44f02fa2

====================

Signed-off-by: David Ahern <dsahern@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant