Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[doc]: Init Smartswitch database High Level Design #1534

Merged
merged 8 commits into from
Apr 24, 2024

Conversation

Pterosaur
Copy link
Contributor

@Pterosaur Pterosaur commented Dec 4, 2023

Signed-off-by: Ze Gan <ganze718@gmail.com>
@Pterosaur Pterosaur changed the title [doc]: Smartswitch database Design [doc]: Init Smartswitch database High Level Design Dec 4, 2023
There are four new tables introduction for the DPU overlay database:

``` json
"DPU_APPL_DB": {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@prsunny , do you mind to help double confirm the DB name change? I think Ze is proposing modifying all DASH_* dbs to DPU_* dbs. Will any extra change needed in swss?

Also, if this is expected, we will also need to update all dash docs to make us align too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @r12f for tagging. Yes this should be fine as it is new DB instances. Objects in the DB shall still be with DASH_* prefixes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on a second thought, i'm thinking why to define a with prefix DPU_*? Since this is a separate container for a specific DPU, say DPU0, why not we just keep it APPL_DB?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that using the prefix DPU_* is clearer and more extensible. Here's why:

  1. Similar to DPU_STATE_DB: We want to organize all similar objects into a single instance. This improves clarity and maintainability. Even if the memory footprint of DPU_STATE_DB is minimal.
  2. Future-proof: If we decide to move underlayer objects of DPU to NPU in the future, this naming convention will maintain isolation and avoid breaking functionality.

Do you have any concerns with defining these new entities?

},
"DPU_COUNTERS_DB": {
"id": 18,
"separator": ":",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should counters also uses |?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in case ipv6 stuffs showing up and complicates the splitting logic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is just a SONiC convention. https://github.com/sonic-net/sonic-buildimage/blob/ada7c6a72e27a9729628f383f48cd5f75d4da227/dockers/docker-database/database_config.json.j2#L32
I'm worried that there is some hard-code separator : in the existing code base.

But you are right, there is some conflict with ipv6. This may happen in APPL_DB also.
@qiluo-msft do you know why we use the colon as the separator?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is as per design for some DBs. With IPv6, we ensure IPv6 is always at the end of key so as to not conflict with expected Seperator


This section outlines critical workflows interacting with the DPU overlay database.

- Update Overlay Objects via GNMI:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will be better to change these into headers or indent the content of each section, so it shows better inside of a single list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


### Configuration and management

No new CLI commands are required
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to mention that protobuf support is added as part of CLI?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

``` yang
container sonic-feature {
container FEATURE {
leaf has_per_dpu_scope {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this feature used or expected? just to confirm, since I am not aware that we have another mode.

also, for objects inside of db, it will be better to add a link to maybe dash api proto repo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just follows the multi-asic logic.
IMO, This field does explicitly tell users that corresponding database services will be started for each DPU. Otherwise SONiC will only start a global database service.
Without this new field, we have to add some specific logic for DPU and database service. I don't think it's flexible enough.
sonic-net/sonic-host-services#84


The following tables comprises two parts: Global tables and per ENI tables. Notably, when calculating the total size per card, the memory consumption of per ENI tables is adjusted by multiplying it by the exact number of ENIs.

- Global Tables
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will be better to use headings instead of list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


| Table name | Entry size (bytes) | No. of entries in the Table | Total size per card (KB) |
| ----------------------- | ------------------ | --------------------------- | ------------------------ |
| DASH_VNET_TABLE | 448 | 1,024 | 448 |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the size of tables might change overtime, so should we put a specific commit id to make the reference more explicit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


### Overview

The Smart Switch comprises two integral components: the Network Processing Unit (NPU) and the Data Processing Unit (DPU), both operating on the SONiC OS. The database stack encompasses the entire database infrastructure for both the NPU and DPU. However, due to memory limitations on the DPU, certain overlayer objects, such as DASH objects, are stored in the NPU.
Copy link
Contributor

@qiluo-msft qiluo-msft Dec 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overlayer objects

What is definition of "overlayer objects"? #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Add it in the Definitions/Abbreviations

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is overlay objects. Please fix all places in doc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


- All databases, including those on both NPU and DPU, must be accessible through the GNMI server.
- Each DPU database instance on the NPU is associated with a unique TCP port and domain Unix socket path.
- All DPU database instances on the NPU will be bound to the IP address of the midplane bridge.
Copy link
Contributor

@qiluo-msft qiluo-msft Dec 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

midplane bridge

What is the definition of "midplane bridge"? #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Add it in the Definitions/Abbreviations

- Each DPU database instance on the NPU is associated with a unique TCP port and domain Unix socket path.
- All DPU database instances on the NPU will be bound to the IP address of the midplane bridge.
- All database instances on the NPU share the same network namespace to facilitate seamless communication.
- DPUs can access their respective overlay database instances using the IP of the midplane bridge and a pre-assigned unique TCP port.
Copy link
Contributor

@qiluo-msft qiluo-msft Dec 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overlay database

What is "overlay database"? #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used to store the overlay objects. The overlay objects has been defined in the section: Definitions/Abbreviations

NUM_DPU=2
```

To align with the established multi-ASIC design in SONiC, a new field, `"has_per_dpu_scope": "True"``, is introduced in the database feature table within config_db.json. This field plays a crucial role in ensuring that each DPU database instance is initiated within a dedicated database container. This design approach maintains consistency with SONiC's existing architecture while accommodating the specific requirements of DPU overlay databases.
Copy link
Contributor

@qiluo-msft qiluo-msft Dec 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ensuring

Could you explain the meaning of has_per_dpu_scope? and how does it ensure "each DPU database instance is initiated within a dedicated database container" ? #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same concept as the has_per_asic_scope. This field will be read by the featured to start database service container instances for each DPU.

},
```

Within the NPU, the management of DPU overlay databases involves specific configurations. Each DPU overlay database instance is bound to the IP address of the midplane bridge (169.254.200.254 by default). The TCP port assignment follows a predictable pattern, with each DPU ID associated with a unique port (6380 + DPU ID). Additionally, the Unix domain socket path is structured as /var/run/redisdpu{DPU_ID}.
Copy link
Contributor

@qiluo-msft qiluo-msft Dec 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/var/run/redisdpu{DPU_ID}

This rule should be defined in runtime config file database_global.json. #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Update a paragraph to describe the database_global.json.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

var/run/redisdpu{DPU_ID} -> var/run/redisdpu{database_name}.
Maybe you should remove this part, because it is part of database_global.json

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Signed-off-by: Ze Gan <ganze718@gmail.com>
Signed-off-by: Ze Gan <ganze718@gmail.com>

### Configuration and management

A enhanced database CLI offers the capability to convert binary messages within the DPU_APPL_DB into human-readable text.
Copy link
Contributor

@qiluo-msft qiluo-msft Dec 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A

A -> An #Closed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Signed-off-by: Ze Gan <ganze718@gmail.com>
Signed-off-by: Ze Gan <ganze718@gmail.com>
Signed-off-by: Ze Gan <ganze718@gmail.com>
@Pterosaur
Copy link
Contributor Author

Hi @oleksandrivantsiv , Add the section about DHCP determining TCP port and remote redis configuration that we discussed in the community meeting. Please help to check it.


In the architecture of our Smart Switch, DPU operation involves accessing both local and remote database services. This document is on elucidating the interaction with remote database services in the NPU, specifically for overlay objects.

The DPU employs a DHCP server hosted on the NPU, ensuring each DPU fetches a predetermined and consistent IP address. Leveraging this IP address, the DPU determines the TCP port of its associated overlay database. Following this design principle, the DPU autonomously generates the requisite configuration for remote database services.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might have a problem here with the boot order of the systemd services. The database service is one of the first services that start after the system boot. To start the database service we need to have database_config.json file available. However, to generate the file we need to have configuration from the DHCP server. The configuration can be received from the DHCP server only after the start of the interface-config.service which starts later than the database service. So with the existing boot order, this can't be done.

I think to proceed with this we need to have a flow diagram for the database_config.json file generation to better understand all the dependencies.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your suggestion, I will check that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the further discussion in PR: #1586

Copy link
Contributor

@prgeor prgeor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Pterosaur discussion pending, so please hold on

qiluo-msft
qiluo-msft previously approved these changes Dec 29, 2023
Copy link
Contributor

@qiluo-msft qiluo-msft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, please check with other active reviewers.

@qiluo-msft qiluo-msft dismissed their stale review December 29, 2023 23:25

Need more discussion on database_name

lguohan pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Apr 2, 2024
…f Smart Switch (#18178)

Why I did it
Add systemd network service of Smart Switch for configuring the midplane network.

How I did it

Add related systemd services and networkd configuration
According to paltform.json, the systemd-sonic-generator will install required services.
The changes are based on the HLD: sonic-net/SONiC#1534 and sonic-net/SONiC#1586

How to verify it
Check Azp that no failure to the existing testcases
Check NPU scenario:
2.1 Start a KVM and add the NPU platform.json as following
echo '{"DPUS":{"dpu0":{}, "dpu1":{}}}' | sudo tee /usr/share/sonic/device/x86_64-kvm_x86_64-r0/platform.json
2.2 You should get two specific database containers for DPU

admin@vlab-01:~$ docker ps -a
CONTAINER ID   IMAGE                                COMMAND                  CREATED          STATUS         PORTS     NAMES
4455f7ab611c   docker-database:latest               "/usr/local/bin/dock…"   5 minutes ago    Up 5 minutes             databasedpu0
8e983b5beb27   docker-database:latest               "/usr/local/bin/dock…"   5 minutes ago    Up 5 minutes             databasedpu1
2.3 Check the expected service midplane-network-npu.service is enabled but the midplane-network-dpu.service is disabled

admin@vlab-01:~$ sudo systemctl list-unit-files '*midplane*'
UNIT FILE                    STATE           PRESET
midplane-network-dpu.service disabled        enabled
midplane-network-npu.service enabled-runtime  enabled
2.4 Check the bridge-midplane has been created.

admin@vlab-01:~$ sudo brctl show
bridge name     bridge id               STP enabled     interfaces
...
bridge-midplane         8000.76c5a51a18f6       no              dummy-midplane
2.5 If we create two mock dpu interfaces, they should be added into the bridge-midplane automatically

admin@vlab-01:~$ sudo ip link add dpu0 type veth peer dpu1
admin@vlab-01:~$ sudo brctl show
bridge name     bridge id               STP enabled     interfaces
...
bridge-midplane         8000.76c5a51a18f6       no              dpu0 dpu1 dummy-midplane
Check DPU scenario:
3.1 Start a KVM and add the DPU platform.json as following
echo '{"DPU":{}}' | sudo tee /usr/share/sonic/device/x86_64-kvm_x86_64-r0/platform.json
3.2 Check the expected service midplane-network-dpu.service is enabled but the midplane-network-npu.service is disabled

admin@vlab-01:~$ sudo systemctl list-unit-files '*midplane*'
UNIT FILE                    STATE           PRESET
midplane-network-dpu.service disabled        enabled
midplane-network-npu.service enabled-runtime enabled
3.3 The database service should be started after(waiting for) midplane-network-dpu.service with a 10mins delay

admin@vlab-01:~$ uptime
 16:08:21 up 29 min,  1 user,  load average: 0.01, 0.08, 0.12
admin@vlab-01:~$ docker ps -a
CONTAINER ID   IMAGE                                COMMAND                  CREATED          STATUS          PORTS     NAMES
...
ae5195283cd6   docker-database:latest               "/usr/local/bin/dock…"   35 minutes ago   Up 18 minutes             database
29min-18min≈10mins (Because in the KVM, we don't have an exact eth0-midplane interface and its DHCP provider)


Signed-off-by: Ze Gan <ganze718@gmail.com>
Signed-off-by: Ze Gan <ganze718@gmail.com>
Signed-off-by: Ze Gan <ganze718@gmail.com>
qiluo-msft pushed a commit to sonic-net/sonic-buildimage that referenced this pull request Apr 9, 2024
### Why I did it
According to the HLD: sonic-net/SONiC#1534, update the DPU database configuration.
### How I did it
1. Add the remote redis configuration in database_config.json.j2
2. Rename the database_global.json.j2
3. Determine the remote DB port according to the midplane IP
4. Get the DPU information from platform.json

#### How to verify it
Check them locally
theasianpianist pushed a commit to theasianpianist/sonic-buildimage-msft that referenced this pull request Apr 9, 2024
…f Smart Switch (#18178)

Why I did it
Add systemd network service of Smart Switch for configuring the midplane network.

How I did it

Add related systemd services and networkd configuration
According to paltform.json, the systemd-sonic-generator will install required services.
The changes are based on the HLD: sonic-net/SONiC#1534 and sonic-net/SONiC#1586

How to verify it
Check Azp that no failure to the existing testcases
Check NPU scenario:
2.1 Start a KVM and add the NPU platform.json as following
echo '{"DPUS":{"dpu0":{}, "dpu1":{}}}' | sudo tee /usr/share/sonic/device/x86_64-kvm_x86_64-r0/platform.json
2.2 You should get two specific database containers for DPU

admin@vlab-01:~$ docker ps -a
CONTAINER ID   IMAGE                                COMMAND                  CREATED          STATUS         PORTS     NAMES
4455f7ab611c   docker-database:latest               "/usr/local/bin/dock…"   5 minutes ago    Up 5 minutes             databasedpu0
8e983b5beb27   docker-database:latest               "/usr/local/bin/dock…"   5 minutes ago    Up 5 minutes             databasedpu1
2.3 Check the expected service midplane-network-npu.service is enabled but the midplane-network-dpu.service is disabled

admin@vlab-01:~$ sudo systemctl list-unit-files '*midplane*'
UNIT FILE                    STATE           PRESET
midplane-network-dpu.service disabled        enabled
midplane-network-npu.service enabled-runtime  enabled
2.4 Check the bridge-midplane has been created.

admin@vlab-01:~$ sudo brctl show
bridge name     bridge id               STP enabled     interfaces
...
bridge-midplane         8000.76c5a51a18f6       no              dummy-midplane
2.5 If we create two mock dpu interfaces, they should be added into the bridge-midplane automatically

admin@vlab-01:~$ sudo ip link add dpu0 type veth peer dpu1
admin@vlab-01:~$ sudo brctl show
bridge name     bridge id               STP enabled     interfaces
...
bridge-midplane         8000.76c5a51a18f6       no              dpu0 dpu1 dummy-midplane
Check DPU scenario:
3.1 Start a KVM and add the DPU platform.json as following
echo '{"DPU":{}}' | sudo tee /usr/share/sonic/device/x86_64-kvm_x86_64-r0/platform.json
3.2 Check the expected service midplane-network-dpu.service is enabled but the midplane-network-npu.service is disabled

admin@vlab-01:~$ sudo systemctl list-unit-files '*midplane*'
UNIT FILE                    STATE           PRESET
midplane-network-dpu.service disabled        enabled
midplane-network-npu.service enabled-runtime enabled
3.3 The database service should be started after(waiting for) midplane-network-dpu.service with a 10mins delay

admin@vlab-01:~$ uptime
 16:08:21 up 29 min,  1 user,  load average: 0.01, 0.08, 0.12
admin@vlab-01:~$ docker ps -a
CONTAINER ID   IMAGE                                COMMAND                  CREATED          STATUS          PORTS     NAMES
...
ae5195283cd6   docker-database:latest               "/usr/local/bin/dock…"   35 minutes ago   Up 18 minutes             database
29min-18min≈10mins (Because in the KVM, we don't have an exact eth0-midplane interface and its DHCP provider)

Signed-off-by: Ze Gan <ganze718@gmail.com>
@Pterosaur Pterosaur merged commit 35688b2 into sonic-net:master Apr 24, 2024
1 check passed
@Pterosaur Pterosaur deleted the smartswitch_db_design branch April 25, 2024 01:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants