-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[doc]: Init Smartswitch database High Level Design #1534
Conversation
Pterosaur
commented
Dec 4, 2023
•
edited
Loading
edited
PR title | State |
---|---|
sonic-net/sonic-buildimage#17161 | |
sonic-net/sonic-buildimage#17443 | |
sonic-net/sonic-buildimage#18178 | |
sonic-net/sonic-buildimage#17562 |
Signed-off-by: Ze Gan <ganze718@gmail.com>
doc/smart-switch/smart-switch-database-architecture/smart-switch-database-design.md
Show resolved
Hide resolved
There are four new tables introduction for the DPU overlay database: | ||
|
||
``` json | ||
"DPU_APPL_DB": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@prsunny , do you mind to help double confirm the DB name change? I think Ze is proposing modifying all DASH_* dbs to DPU_* dbs. Will any extra change needed in swss?
Also, if this is expected, we will also need to update all dash docs to make us align too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @r12f for tagging. Yes this should be fine as it is new DB instances. Objects in the DB shall still be with DASH_* prefixes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on a second thought, i'm thinking why to define a with prefix DPU_*? Since this is a separate container for a specific DPU, say DPU0, why not we just keep it APPL_DB?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel that using the prefix DPU_* is clearer and more extensible. Here's why:
- Similar to DPU_STATE_DB: We want to organize all similar objects into a single instance. This improves clarity and maintainability. Even if the memory footprint of DPU_STATE_DB is minimal.
- Future-proof: If we decide to move underlayer objects of DPU to NPU in the future, this naming convention will maintain isolation and avoid breaking functionality.
Do you have any concerns with defining these new entities?
}, | ||
"DPU_COUNTERS_DB": { | ||
"id": 18, | ||
"separator": ":", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should counters also uses |?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in case ipv6 stuffs showing up and complicates the splitting logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is just a SONiC convention. https://github.com/sonic-net/sonic-buildimage/blob/ada7c6a72e27a9729628f383f48cd5f75d4da227/dockers/docker-database/database_config.json.j2#L32
I'm worried that there is some hard-code separator :
in the existing code base.
But you are right, there is some conflict with ipv6. This may happen in APPL_DB also.
@qiluo-msft do you know why we use the colon as the separator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is as per design for some DBs. With IPv6, we ensure IPv6 is always at the end of key so as to not conflict with expected Seperator
|
||
This section outlines critical workflows interacting with the DPU overlay database. | ||
|
||
- Update Overlay Objects via GNMI: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will be better to change these into headers or indent the content of each section, so it shows better inside of a single list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
||
### Configuration and management | ||
|
||
No new CLI commands are required |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to mention that protobuf support is added as part of CLI?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
``` yang | ||
container sonic-feature { | ||
container FEATURE { | ||
leaf has_per_dpu_scope { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this feature used or expected? just to confirm, since I am not aware that we have another mode.
also, for objects inside of db, it will be better to add a link to maybe dash api proto repo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It just follows the multi-asic logic.
IMO, This field does explicitly tell users that corresponding database services will be started for each DPU. Otherwise SONiC will only start a global database service.
Without this new field, we have to add some specific logic for DPU and database service. I don't think it's flexible enough.
sonic-net/sonic-host-services#84
|
||
The following tables comprises two parts: Global tables and per ENI tables. Notably, when calculating the total size per card, the memory consumption of per ENI tables is adjusted by multiplying it by the exact number of ENIs. | ||
|
||
- Global Tables |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will be better to use headings instead of list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
||
| Table name | Entry size (bytes) | No. of entries in the Table | Total size per card (KB) | | ||
| ----------------------- | ------------------ | --------------------------- | ------------------------ | | ||
| DASH_VNET_TABLE | 448 | 1,024 | 448 | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the size of tables might change overtime, so should we put a specific commit id to make the reference more explicit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
||
### Overview | ||
|
||
The Smart Switch comprises two integral components: the Network Processing Unit (NPU) and the Data Processing Unit (DPU), both operating on the SONiC OS. The database stack encompasses the entire database infrastructure for both the NPU and DPU. However, due to memory limitations on the DPU, certain overlayer objects, such as DASH objects, are stored in the NPU. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Add it in the Definitions/Abbreviations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is overlay
objects. Please fix all places in doc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
||
- All databases, including those on both NPU and DPU, must be accessible through the GNMI server. | ||
- Each DPU database instance on the NPU is associated with a unique TCP port and domain Unix socket path. | ||
- All DPU database instances on the NPU will be bound to the IP address of the midplane bridge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Add it in the Definitions/Abbreviations
- Each DPU database instance on the NPU is associated with a unique TCP port and domain Unix socket path. | ||
- All DPU database instances on the NPU will be bound to the IP address of the midplane bridge. | ||
- All database instances on the NPU share the same network namespace to facilitate seamless communication. | ||
- DPUs can access their respective overlay database instances using the IP of the midplane bridge and a pre-assigned unique TCP port. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's used to store the overlay objects. The overlay objects has been defined in the section: Definitions/Abbreviations
NUM_DPU=2 | ||
``` | ||
|
||
To align with the established multi-ASIC design in SONiC, a new field, `"has_per_dpu_scope": "True"``, is introduced in the database feature table within config_db.json. This field plays a crucial role in ensuring that each DPU database instance is initiated within a dedicated database container. This design approach maintains consistency with SONiC's existing architecture while accommodating the specific requirements of DPU overlay databases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same concept as the has_per_asic_scope
. This field will be read by the featured
to start database service container instances for each DPU.
}, | ||
``` | ||
|
||
Within the NPU, the management of DPU overlay databases involves specific configurations. Each DPU overlay database instance is bound to the IP address of the midplane bridge (169.254.200.254 by default). The TCP port assignment follows a predictable pattern, with each DPU ID associated with a unique port (6380 + DPU ID). Additionally, the Unix domain socket path is structured as /var/run/redisdpu{DPU_ID}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is a doc. Actually, this rule has been defined in the database_global.json also: https://github.com/sonic-net/sonic-buildimage/blob/ada7c6a72e27a9729628f383f48cd5f75d4da227/dockers/docker-database/database_global.json.j2#L30
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Update a paragraph to describe the database_global.json.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var/run/redisdpu{DPU_ID} -> var/run/redisdpu{database_name}.
Maybe you should remove this part, because it is part of database_global.json
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Signed-off-by: Ze Gan <ganze718@gmail.com>
Signed-off-by: Ze Gan <ganze718@gmail.com>
01031c4
to
e833301
Compare
|
||
### Configuration and management | ||
|
||
A enhanced database CLI offers the capability to convert binary messages within the DPU_APPL_DB into human-readable text. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Signed-off-by: Ze Gan <ganze718@gmail.com>
Signed-off-by: Ze Gan <ganze718@gmail.com>
Signed-off-by: Ze Gan <ganze718@gmail.com>
Hi @oleksandrivantsiv , Add the section about DHCP determining TCP port and remote redis configuration that we discussed in the community meeting. Please help to check it. |
|
||
In the architecture of our Smart Switch, DPU operation involves accessing both local and remote database services. This document is on elucidating the interaction with remote database services in the NPU, specifically for overlay objects. | ||
|
||
The DPU employs a DHCP server hosted on the NPU, ensuring each DPU fetches a predetermined and consistent IP address. Leveraging this IP address, the DPU determines the TCP port of its associated overlay database. Following this design principle, the DPU autonomously generates the requisite configuration for remote database services. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might have a problem here with the boot order of the systemd services. The database service is one of the first services that start after the system boot. To start the database service we need to have database_config.json
file available. However, to generate the file we need to have configuration from the DHCP server. The configuration can be received from the DHCP server only after the start of the interface-config.service
which starts later than the database service. So with the existing boot order, this can't be done.
I think to proceed with this we need to have a flow diagram for the database_config.json
file generation to better understand all the dependencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your suggestion, I will check that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the further discussion in PR: #1586
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Pterosaur discussion pending, so please hold on
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, please check with other active reviewers.
Need more discussion on database_name
…f Smart Switch (#18178) Why I did it Add systemd network service of Smart Switch for configuring the midplane network. How I did it Add related systemd services and networkd configuration According to paltform.json, the systemd-sonic-generator will install required services. The changes are based on the HLD: sonic-net/SONiC#1534 and sonic-net/SONiC#1586 How to verify it Check Azp that no failure to the existing testcases Check NPU scenario: 2.1 Start a KVM and add the NPU platform.json as following echo '{"DPUS":{"dpu0":{}, "dpu1":{}}}' | sudo tee /usr/share/sonic/device/x86_64-kvm_x86_64-r0/platform.json 2.2 You should get two specific database containers for DPU admin@vlab-01:~$ docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 4455f7ab611c docker-database:latest "/usr/local/bin/dock…" 5 minutes ago Up 5 minutes databasedpu0 8e983b5beb27 docker-database:latest "/usr/local/bin/dock…" 5 minutes ago Up 5 minutes databasedpu1 2.3 Check the expected service midplane-network-npu.service is enabled but the midplane-network-dpu.service is disabled admin@vlab-01:~$ sudo systemctl list-unit-files '*midplane*' UNIT FILE STATE PRESET midplane-network-dpu.service disabled enabled midplane-network-npu.service enabled-runtime enabled 2.4 Check the bridge-midplane has been created. admin@vlab-01:~$ sudo brctl show bridge name bridge id STP enabled interfaces ... bridge-midplane 8000.76c5a51a18f6 no dummy-midplane 2.5 If we create two mock dpu interfaces, they should be added into the bridge-midplane automatically admin@vlab-01:~$ sudo ip link add dpu0 type veth peer dpu1 admin@vlab-01:~$ sudo brctl show bridge name bridge id STP enabled interfaces ... bridge-midplane 8000.76c5a51a18f6 no dpu0 dpu1 dummy-midplane Check DPU scenario: 3.1 Start a KVM and add the DPU platform.json as following echo '{"DPU":{}}' | sudo tee /usr/share/sonic/device/x86_64-kvm_x86_64-r0/platform.json 3.2 Check the expected service midplane-network-dpu.service is enabled but the midplane-network-npu.service is disabled admin@vlab-01:~$ sudo systemctl list-unit-files '*midplane*' UNIT FILE STATE PRESET midplane-network-dpu.service disabled enabled midplane-network-npu.service enabled-runtime enabled 3.3 The database service should be started after(waiting for) midplane-network-dpu.service with a 10mins delay admin@vlab-01:~$ uptime 16:08:21 up 29 min, 1 user, load average: 0.01, 0.08, 0.12 admin@vlab-01:~$ docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES ... ae5195283cd6 docker-database:latest "/usr/local/bin/dock…" 35 minutes ago Up 18 minutes database 29min-18min≈10mins (Because in the KVM, we don't have an exact eth0-midplane interface and its DHCP provider) Signed-off-by: Ze Gan <ganze718@gmail.com>
Signed-off-by: Ze Gan <ganze718@gmail.com>
Signed-off-by: Ze Gan <ganze718@gmail.com>
### Why I did it According to the HLD: sonic-net/SONiC#1534, update the DPU database configuration. ### How I did it 1. Add the remote redis configuration in database_config.json.j2 2. Rename the database_global.json.j2 3. Determine the remote DB port according to the midplane IP 4. Get the DPU information from platform.json #### How to verify it Check them locally
…f Smart Switch (#18178) Why I did it Add systemd network service of Smart Switch for configuring the midplane network. How I did it Add related systemd services and networkd configuration According to paltform.json, the systemd-sonic-generator will install required services. The changes are based on the HLD: sonic-net/SONiC#1534 and sonic-net/SONiC#1586 How to verify it Check Azp that no failure to the existing testcases Check NPU scenario: 2.1 Start a KVM and add the NPU platform.json as following echo '{"DPUS":{"dpu0":{}, "dpu1":{}}}' | sudo tee /usr/share/sonic/device/x86_64-kvm_x86_64-r0/platform.json 2.2 You should get two specific database containers for DPU admin@vlab-01:~$ docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 4455f7ab611c docker-database:latest "/usr/local/bin/dock…" 5 minutes ago Up 5 minutes databasedpu0 8e983b5beb27 docker-database:latest "/usr/local/bin/dock…" 5 minutes ago Up 5 minutes databasedpu1 2.3 Check the expected service midplane-network-npu.service is enabled but the midplane-network-dpu.service is disabled admin@vlab-01:~$ sudo systemctl list-unit-files '*midplane*' UNIT FILE STATE PRESET midplane-network-dpu.service disabled enabled midplane-network-npu.service enabled-runtime enabled 2.4 Check the bridge-midplane has been created. admin@vlab-01:~$ sudo brctl show bridge name bridge id STP enabled interfaces ... bridge-midplane 8000.76c5a51a18f6 no dummy-midplane 2.5 If we create two mock dpu interfaces, they should be added into the bridge-midplane automatically admin@vlab-01:~$ sudo ip link add dpu0 type veth peer dpu1 admin@vlab-01:~$ sudo brctl show bridge name bridge id STP enabled interfaces ... bridge-midplane 8000.76c5a51a18f6 no dpu0 dpu1 dummy-midplane Check DPU scenario: 3.1 Start a KVM and add the DPU platform.json as following echo '{"DPU":{}}' | sudo tee /usr/share/sonic/device/x86_64-kvm_x86_64-r0/platform.json 3.2 Check the expected service midplane-network-dpu.service is enabled but the midplane-network-npu.service is disabled admin@vlab-01:~$ sudo systemctl list-unit-files '*midplane*' UNIT FILE STATE PRESET midplane-network-dpu.service disabled enabled midplane-network-npu.service enabled-runtime enabled 3.3 The database service should be started after(waiting for) midplane-network-dpu.service with a 10mins delay admin@vlab-01:~$ uptime 16:08:21 up 29 min, 1 user, load average: 0.01, 0.08, 0.12 admin@vlab-01:~$ docker ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES ... ae5195283cd6 docker-database:latest "/usr/local/bin/dock…" 35 minutes ago Up 18 minutes database 29min-18min≈10mins (Because in the KVM, we don't have an exact eth0-midplane interface and its DHCP provider) Signed-off-by: Ze Gan <ganze718@gmail.com>