Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamic port breakout HLD #450

Merged
merged 3 commits into from Jan 24, 2020
Merged

Dynamic port breakout HLD #450

merged 3 commits into from Jan 24, 2020

Conversation

zhenggen-xu
Copy link
Collaborator

Dynamic port breakout HLD

Signed-off-by: Zhenggen Xu zxu@linkedin.com

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
@stevenlu99
Copy link

  1. For newly introduced hwsku_breakout.json
    Why not use platform.json file for different hwsku? We can let platform.json in each hwsku directory override platform.json under platform/ directory. If no platform.json exist under hwsku, then use it under platform/. so that we do not have to introduce another file, and also we can cover other possible differences in hwsku in the future.
  2. platform-running-mode.json
    why we need a separate file instead of store in redis_db and config_db.json? Can we just extend config_db.json and add a section "breakout-config": {}?
    Using a separate file there might be an issue when we do upgrade, unless we migrate platform-running-mode.json. But will duplicate config migrate logic, which already exist for config_db.
  3. Seems need add coverage of "config re-play", when reboot/reload with saved port breakout configurations
    During config replay we need guarantee that port breakout config will be executed first.

@zhenggen-xu
Copy link
Collaborator Author

zhenggen-xu commented Aug 27, 2019

  1. For newly introduced hwsku_breakout.json
    Why not use platform.json file for different hwsku? We can let platform.json in each hwsku directory override platform.json under platform/ directory. If no platform.json exist under hwsku, then use it under platform/. so that we do not have to introduce another file, and also we can cover other possible differences in hwsku in the future.

This is a fair idea, I was trying to not duplicate the same file too many times, thus use the file for hwsku specific things. We can discuss a bit in the meeting and make a call.
[update], this file is used for each hwsku too, and will have precedence over the platform.json at platform layer.

  1. platform-running-mode.json
    why we need a separate file instead of store in redis_db and config_db.json? Can we just extend config_db.json and add a section "breakout-config": {}?
    Using a separate file there might be an issue when we do upgrade, unless we migrate platform-running-mode.json. But will duplicate config migrate logic, which already exist for config_db.

Putting to configDB should be feasible, and it was initially thought too. This configuration is not going to be used by the backend code, thus I purposely moved it to a file, and yes, this should be migrated during the upgrade, which is trivial. But let's make it open for discussion.
[Update] Moved to configDB.

  1. Seems need add coverage of "config re-play", when reboot/reload with saved port breakout configurations
    During config replay we need guarantee that port breakout config will be executed first.

Since the configDB has the individual port configurations, I don't see why we need replay the config.

The capability file is named `platform.json` and should be provided to each platform if we need support DPB. And we also keep a running breakout mode in separate file as `platform-running-mode.json`.
`platform.json` looks like this:
```
"Ethernet0": {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generic question : Can't ASIC vendors support port-breakout capabiliteis per platform? why is it only per interface?
Can the ASIC vendors support portt-breakout capabilites range of interfaces or group of ports?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The design uses per interface (group) breakout to introduce flexibility. For example, on some ASIC, the breakout mode is not valid for some certain ports. Also, for a particular platform, we may want to restrict the breakout mode differently on different ports due to the physical design.
The capability file includes lanes etc information, it would be challenge to support port ranges as usually those may not have patterns.

"breakout_modes": "1x100G[40G],2x50G,4x25G[10G],2x25G(2)+1x50G(2),1x50G(2)+2x25G(2)",
"default_mode": "1x100G[40G]"
}
...

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does port-breakout CLI allows the user breakout from one to other? let's say 4 *100 -> 8 * 50 -> 2 *200?
Can the user port breakin always moving from one breakout to other?
What is the design around port configs lifecycle (w.r.t delete/add) when port move from one mode to other? Is it push/pull for sonic applications to react on config changes?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the CLI allow the user to breakout from any mode to any mode.
When the port moved from one to other, the port will get deleted along with the configurations deleted if we use force-remove option in CLI. (by default, it let user manually delete configuration). And then the new ports will get added, by default, no user configuration will be put on the port, we give an option again to load some predefine configurations on the port to save operation run time overhead.

- "default_mode": This defines the default breakout mode if no mode changes were applied on the system.

Syntax for breakout_modes:
- `number x speed1 [speed2, speed3]`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this features provides the list of breakout capabilities for the User?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The breakout capability is really per platform specific, the feature design is not restrict to any breakout that can be expressed by the syntax as long as the backend and SAI could handle it.
We do have examples in the design.

}
"Ethernet8": {
"default_mode": "2x50G"
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have suggestions/ASIC vendors to make sure the the broken ports are functional? Is there any port scan kind of tool in SONiC?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not aware of the port scan tool in SONiC. The current assumption is we will reply on the test cases. We can leverage the ansible framework to do the traffic testing on different platforms.

- moved the running breakout mode to configDB
- use the platform.json for each HWSKU too.
- show CLI changes.

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
@rameshsanth
Copy link

Under configuration management, how will you handle port configuration which are applicable in only certain speeds ? For ex, FEC mode
If FEC was configured on Ethernet0 and the port was broken out to lower speeds, the FEC mode would no longer be applicable to lower speed. If the configuration for Ethernet0 reapplied to new broken out port, then it will cause failure.

}
```
#### Inactive configuration --phase 2
When the new ports were added above, we could see the other configurations like VLAN etc are not applied. This could potentially require operators to apply the configurations manually after the port breakout. In some scenarios, the configuration could be the same before and after breakout. To make the operation easier, we introduce a way to automatically apply the configurations prepared by users.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For phase 1, Can we assume that the operator has to manually apply the forwarding and QoS configuration. In that case, there is a window of time between the instant a port is created, and before the configuration is applied by the operator. There is a possibility of misforwarding packets during this window. To avoid this, will the newly created port be in admin down state initially, and the operator manually admin up the port after applying the forwarding and QoS configuration.

config interface Ethernet0 breakout 4x25G[10G] # Eth0 is created in admin down state!
config interface Ethernet0 speed 10G
config qos reload # buffers.json contains the buffer profile configuration for the new port
port admin up

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Yes, we can put the newly created ports in admin down by default until user to make it admin up later.

@zhenggen-xu
Copy link
Collaborator Author

Under configuration management, how will you handle port configuration which are applicable in only certain speeds ? For ex, FEC mode
If FEC was configured on Ethernet0 and the port was broken out to lower speeds, the FEC mode would no longer be applicable to lower speed. If the configuration for Ethernet0 reapplied to new broken out port, then it will cause failure.

This is not necessarily tied to port-breakout feature itself. In today's configuration, if you have FEC settings for low speed ports statically, it will fail too. So we need user to have this sanity check outside SONiC or we need introduce some "speed" to "FEC" mapping somewhere in SONiC for checking. This is outside the design scope for now.

Update the default state of newly created ports.

Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
### Special HW capability consideration
On some platforms, there are some limitations about the number of port/mac available for the total count or for a particular group. For example, on some platforms, there were limitations where only 4 port/MAC are available for 8 lanes, in this case, we could define the `platform.json` to fit that platform. e,g:
```
"Ethernet0": {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed with you in design review call, below are dell comments:
1# What you are doing here is grouping the different ports in group and giving the name as starting interface name. We propose use consider interface name as group and derive interface names form context.
Example:
"Group0": {
"index": "1,1,1,1,2,2,2,2",
"lanes": "0,1,2,3,4,5,6,7",
"port_names": [
"Ethernet" : "Ethernet0, Ethernet4",
"FibreChannel" : "FibreChannel0, FibreChannel4"
],
"alias_at_lanes":"Eth1/1, Eth1/2, Eth1/3, Eth1/4, Eth2/1, Eth2/2, Eth2/3, Eth2/4",
"breakout_modes": "2x100G[40G],4x50G,1x100G40G+2x50G(4),2x50G(4)+1x100G40G,4x25G10G, None(4)+4x25G10G",
"default_mode": "Ethernet",
"default_breakout_mode": "2x100G[40G]"
}

2# If you follow is above convention, it will have option to select port mode Ethernet/Fiber channel and derive default names used based on mode. it will be cleaner to isolate Ethernet/fiber channel ports in application.

3# Mode can be changed dynamical, based on configured more Port name will be decided.
Example:
"Group0": {
"index": "1,1,1,1",
"lanes": "0,1,2,3",
"port_names": {"Ethernet" : "Ethernet0",
"FibreChannel" : "FibreChannel0"
},
"alias_at_lanes": {"Ethernet" : "Eth1/1, Eth1/2, Eth1/3, Eth1/4",
"FibreChannel" : "FC1/1, FC1/2, FC1/3, FC1/4"},
"breakout_modes": {"Ethernet" : "2x100G[40G],4x50G,1x100G40G+2x50G(4),2x50G(4)+1x100G40G,4x25G10G, None(4)+4x25G10G",
"FibreChannel" : "..." },
"default_mode": "Ethernet",
"default_brkout_mode": "2x100G[40G]"
}

Since we are doing this file format, we can consider this will be useful for future.

As per our discussion, You will continue back end implementation and update HLD will above proposal.
Make necessary changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants