
Enable ECN on lossless queues

Ying Xie edited this page Oct 18, 2017 · 7 revisions

Overview

IP-based RDMA protocols (e.g. RoCEv2) use PFC to provide a drop-free network, and ECN is an important mechanism supporting this model. However, the current configuration enables ECN on lossy queues only. New configurations therefore also need to enable ECN on lossless queues.

SAI specific changes

Requirement

Enabling ECN on lossless queues is considered necessary, and it should be enabled in the SONiC configuration. A non-ECN-capable packet in a lossless queue should be dropped if the threshold is exceeded. Test cases should be designed to verify that ECN works on lossless queues.

ECN marks shall always go out before any PFC packets are generated on the same link.

The ECN threshold will be calculated and pushed down to SAI. It is not SAI's responsibility to auto-calculate the ECN threshold.

The SAI client should be able to change ECN settings on the fly, without restarting the service.
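
The interaction between ECN marking, non-ECN-capable traffic, and PFC described above can be illustrated with a minimal sketch. It assumes a WRED-style marking curve (minimum threshold, maximum threshold, maximum marking probability), which this page does not specify. The names `EcnProfile`, `mark_probability`, `handle_packet`, and `ecn_precedes_pfc` are hypothetical and are not SAI attributes or functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EcnProfile:
    """Hypothetical ECN profile; thresholds are in bytes of queue depth."""
    min_threshold: int
    max_threshold: int
    max_probability: float  # marking probability reached just below max_threshold


def mark_probability(profile: EcnProfile, queue_depth: int) -> float:
    """WRED-style linear ramp between the two thresholds (an assumption)."""
    if queue_depth <= profile.min_threshold:
        return 0.0
    if queue_depth >= profile.max_threshold:
        return 1.0
    span = profile.max_threshold - profile.min_threshold
    return profile.max_probability * (queue_depth - profile.min_threshold) / span


def handle_packet(profile: EcnProfile, queue_depth: int,
                  ecn_capable: bool, draw: float) -> str:
    """Return 'forward', 'mark', or 'drop' for one packet.

    `draw` is a random number in [0, 1) supplied by the caller.
    """
    if draw >= mark_probability(profile, queue_depth):
        return "forward"
    # Where an ECN-capable packet would be marked, a non-capable one is dropped.
    return "mark" if ecn_capable else "drop"


def ecn_precedes_pfc(profile: EcnProfile, pfc_xoff_threshold: int) -> bool:
    """ECN can act before PFC only if marking saturates below the XOFF point."""
    return profile.max_threshold < pfc_xoff_threshold
```

A check such as `ecn_precedes_pfc` could run in whatever component calculates the thresholds before they are pushed down, since SAI is not expected to derive them itself.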

Example change

QoS configuration for TD2

Change queue 0-1 at the end of the line to 3-4.
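
As an illustration of this edit, the sketch below rewrites the queue range at the end of a colon-separated configuration key. The key layout (`QUEUE_TABLE:<port>:<range>`) and the function name `retarget_queue_range` are assumptions made for the example; the actual line in the TD2 QoS configuration may look different.

```python
def retarget_queue_range(key: str, old: str = "0-1", new: str = "3-4") -> str:
    """Replace the queue range at the end of a colon-separated key.

    Keys that do not end in `old` are returned unchanged.
    """
    prefix, sep, queue_range = key.rpartition(":")
    if not sep or queue_range != old:
        return key
    return f"{prefix}{sep}{new}"
```

For example, `retarget_queue_range("QUEUE_TABLE:Ethernet0:0-1")` yields a key ending in `3-4`, while a key ending in any other range is left alone.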

SONiC specific changes

Goal

  1. Enable dynamic ECN configuration change when the switch is already up and running.
  2. Move QoS configuration to conf_DB so that dynamic configuration changes are persisted.
  3. Move away from APP_DB gradually.

Requirements

  1. Dynamic ECN configuration

    1.1. Design a CLI (console) command that supports showing and setting ECN configuration.

    1.2. Any dynamic set value should go to conf_DB directly.

  2. Move QoS configuration to conf_DB

    2.1. Currently the configuration resides inside the swss container, under /etc/swss/config.d/

    2.2. On a system with the target code, this file should not exist at the above location.

    2.3. swssconfig.sh should be updated not to load the configuration from the above location.

    2.4. The configuration j2 file should be mounted or symbolically linked to a common place in the base OS. (The j2 file should be moved to the template folder on the base OS.)

    2.5. A mechanism should run once at the first boot to load the configuration into conf_DB.

    2.5.1. At first boot, a procedure initializes conf_DB; one of its steps is to load the minigraph.

    2.5.2. The conversion of qos.j2 to qos.json should happen after the minigraph is loaded.

    2.5.3. There is already a mechanism to execute this logic only once.

    2.6. Subsequent boots should take the configuration directly from conf_DB.

    2.7. The reload minigraph script should also invoke the QoS configuration conversion and application procedure.

  3. Move away from APP_DB

    3.1. Orchagent should subscribe to both conf_DB and APP_DB for the time being.

    3.2. Design (2) should be strictly followed, making sure there is only one copy of data being fed to orchagent.

    3.3. Orchagent will apply the latest change to SAI.

    3.4. [information only, out of scope of this design] The Broadcom platform will continue using the existing configuration management until the relevant j2 files are produced, and will then move to the common infrastructure defined by this design.
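
The flow described by these requirements (a one-time load at first boot, dynamic changes written straight to conf_DB, and a subscriber applying the latest change) can be sketched as follows. This is an in-memory model only: `ConfDb`, `load_qos_on_first_boot`, `set_ecn`, the `WRED_PROFILE` table name, and the field names are all hypothetical and do not represent the SONiC or swss APIs.

```python
class ConfDb:
    """In-memory stand-in for conf_DB (hypothetical; not the SONiC API)."""

    def __init__(self):
        self.tables = {}
        self.subscribers = []
        self.qos_loaded = False  # models the existing "run only once" marker

    def subscribe(self, callback):
        """Register a callback, e.g. a stand-in for orchagent."""
        self.subscribers.append(callback)

    def set_entry(self, table, key, fields):
        """Store an entry and notify every subscriber of the new value."""
        self.tables.setdefault(table, {})[key] = dict(fields)
        for notify in self.subscribers:
            notify(table, key, dict(fields))


def load_qos_on_first_boot(db, render_qos):
    """Load the rendered QoS config once; later boots read conf_DB directly.

    `render_qos` stands in for the qos.j2 to qos.json conversion and returns
    a dict mapping (table, key) to a dict of fields.
    """
    if db.qos_loaded:
        return False
    for (table, key), fields in render_qos().items():
        db.set_entry(table, key, fields)
    db.qos_loaded = True
    return True


def set_ecn(db, profile, **thresholds):
    """What a CLI 'set' command could do: merge values and write to conf_DB."""
    current = db.tables.get("WRED_PROFILE", {}).get(profile, {})
    db.set_entry("WRED_PROFILE", profile, {**current, **thresholds})
```

Because every write goes through `set_entry`, the subscriber sees both the first-boot load and later CLI changes from a single source, which matches the intent of feeding orchagent only one copy of the data.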

Reference

Please refer to Section 4 of "Congestion Control for Large-Scale RDMA Deployments".
