From ac5db685ec535b4bfa7076ce0998eac8364bc50d Mon Sep 17 00:00:00 2001 From: Stefan Majer Date: Fri, 18 Aug 2023 11:43:48 +0200 Subject: [PATCH 01/14] MEP-13 --- .../src/development/proposals/MEP13/README.md | 20 +++++++++++++++++++ docs/src/development/proposals/index.md | 3 ++- 2 files changed, 22 insertions(+), 1 deletion(-) create mode 100644 docs/src/development/proposals/MEP13/README.md diff --git a/docs/src/development/proposals/MEP13/README.md b/docs/src/development/proposals/MEP13/README.md new file mode 100644 index 0000000000..b290fa8aec --- /dev/null +++ b/docs/src/development/proposals/MEP13/README.md @@ -0,0 +1,20 @@ +# BGP data plane visibility + +Currently a operator can not identify if a certain IP, which is allocated, is actually announced to the outer world. +We want to gather information about the routes on the edge of the network of every partition and store them in the metal-api. + +This will bring more visibility to the network and ip address usage in the dataplane. + +To achieve this goal we need to implement a new microservice which collects these data and send them via grpc to the metal-api. +The metal-api will store them in a separate table. Later when a network or single IP is described +a lookup to that table is made to show when this ip was last announced. + +## metal-api + +TODO: describe the new grpc endpoint API and the table structure where the data is stored + +## new microservice on the border router + +TODO: decide name + +HINT: reuse the frr api logic from frr-monitor diff --git a/docs/src/development/proposals/index.md b/docs/src/development/proposals/index.md index 41a5bc2429..700c5b9b7c 100644 --- a/docs/src/development/proposals/index.md +++ b/docs/src/development/proposals/index.md @@ -18,7 +18,7 @@ Possible states are: Once a proposal was accepted, an issue should be raised and the implementation should be done in a separate PR. 
| Name | Description | State | -| :------------------------ | :--------------------------------------------- | :-------------: | +|:--------------------------|:-----------------------------------------------|:---------------:| | [MEP-1](MEP1/README.md) | Distributed Control Plane Deployment | `In Discussion` | | [MEP-2](MEP2/README.md) | Two Factor Authentication | `Aborted` | | [MEP-3](MEP3/README.md) | Machine Re-Installation to preserve local data | `Completed` | @@ -30,3 +30,4 @@ Once a proposal was accepted, an issue should be raised and the implementation s | [MEP-10](MEP10/README.md) | SONiC Support | `Completed` | | [MEP-11](MEP11/README.md) | Auditing of metal-stack resources | `Completed` | | [MEP-12](MEP12/README.md) | Rack Spreading | `Completed` | +| [MEP-13](MEP13/README.md) | BGP data plane Visibility | `In Discussion` | From e3370a05c5e61c9fa8e09fcb091e47f316f81b53 Mon Sep 17 00:00:00 2001 From: Ilja Rotar Date: Tue, 17 Jun 2025 13:56:16 +0200 Subject: [PATCH 02/14] rename to mep-17 --- docs/src/development/proposals/{MEP13 => MEP17}/README.md | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename docs/src/development/proposals/{MEP13 => MEP17}/README.md (100%) diff --git a/docs/src/development/proposals/MEP13/README.md b/docs/src/development/proposals/MEP17/README.md similarity index 100% rename from docs/src/development/proposals/MEP13/README.md rename to docs/src/development/proposals/MEP17/README.md From 05bf785b8636091498f8753600e21f1fc5d3f4d5 Mon Sep 17 00:00:00 2001 From: Ilja Rotar Date: Tue, 17 Jun 2025 13:58:12 +0200 Subject: [PATCH 03/14] link to mep17 --- docs/src/developers/proposals/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/src/developers/proposals/index.md b/docs/src/developers/proposals/index.md index 9a226e0df6..974a37be51 100644 --- a/docs/src/developers/proposals/index.md +++ b/docs/src/developers/proposals/index.md @@ -35,5 +35,5 @@ Once a proposal was accepted, an issue should be raised 
and the implementation s | [MEP-14](MEP14/README.md) | Independence from external sources | `Completed` | | MEP-15 | HAL Improvements | `In Discussion` | | [MEP-16](MEP16/README.md) | Firewall Support for Cluster API Provider | `In Discussion` | -| MEP-17 | Announced IP visibility | `In Discussion` | +| [MEP-17](MEP17/README.md) | Announced IP visibility | `In Discussion` | | [MEP-18](MEP18/README.md) | Autonomous Control Plane | `In Discussion` | From 4f0b9c19e2e9d69b0703b5a159109731ed04ab7e Mon Sep 17 00:00:00 2001 From: Ilja Rotar Date: Tue, 17 Jun 2025 16:56:35 +0200 Subject: [PATCH 04/14] first draft for metal-core enhancement --- docs/src/developers/proposals/index.md | 4 +- .../src/development/proposals/MEP17/README.md | 112 ++++++++++++++++-- 2 files changed, 102 insertions(+), 14 deletions(-) diff --git a/docs/src/developers/proposals/index.md b/docs/src/developers/proposals/index.md index 974a37be51..40ecd6d49c 100644 --- a/docs/src/developers/proposals/index.md +++ b/docs/src/developers/proposals/index.md @@ -1,4 +1,4 @@ -# Metal Stack Enhancement Proposals (MEPs) +Metal Stack Enhancement Proposals (MEPs) This section contains proposals which address substantial modifications to metal-stack. 
@@ -35,5 +35,5 @@ Once a proposal was accepted, an issue should be raised and the implementation s | [MEP-14](MEP14/README.md) | Independence from external sources | `Completed` | | MEP-15 | HAL Improvements | `In Discussion` | | [MEP-16](MEP16/README.md) | Firewall Support for Cluster API Provider | `In Discussion` | -| [MEP-17](MEP17/README.md) | Announced IP visibility | `In Discussion` | +| [MEP-17](MEP17/README.md) | BGP Data Plane Visibility | `In Discussion` | | [MEP-18](MEP18/README.md) | Autonomous Control Plane | `In Discussion` | diff --git a/docs/src/development/proposals/MEP17/README.md b/docs/src/development/proposals/MEP17/README.md index b290fa8aec..3c95f9ba12 100644 --- a/docs/src/development/proposals/MEP17/README.md +++ b/docs/src/development/proposals/MEP17/README.md @@ -1,20 +1,108 @@ -# BGP data plane visibility +# BGP Data Plane Visibility -Currently a operator can not identify if a certain IP, which is allocated, is actually announced to the outer world. -We want to gather information about the routes on the edge of the network of every partition and store them in the metal-api. +Currently, an operator cannot identify if an allocated IP is actually announced to the outer world. +At the edge of the network we would like to gather information about all routes announced from within the network. +This information should include a timestamp of the last announcement for each route. +If the timestamp is older than some threshold we will assume that the addresses are not longer used. -This will bring more visibility to the network and ip address usage in the dataplane. +To achieve this we will extend the scope of metal-core so it can run on all types of switches, not only on leaves. +As a byproduct of this enhancement all switches will become visible via `metalctl switch ls`. +On the switches the metal-core will collect BGP routes and report them to the metal-apiserver. 
+The metal-apiserver will store these data to a separate table and query this table when an IP address is described. -To achieve this goal we need to implement a new microservice which collects these data and send them via grpc to the metal-api. -The metal-api will store them in a separate table. Later when a network or single IP is described -a lookup to that table is made to show when this ip was last announced. +## metal-core -## metal-api +### Switch Types -TODO: describe the new grpc endpoint API and the table structure where the data is stored +First of all, the metal-core should accept as an argument the type or role of the switch it is running on. +Possible types are: -## new microservice on the border router +- `leaf` +- `spine` +- `exit` +- `mgmtleaf` +- `mgmtspine` +- `mgmteor` -TODO: decide name +Depending on the type its reconcilation loop will differ. +The current behavior should mostly remain unchanged for leaf switches. +Things to change for non-leaves: -HINT: reuse the frr api logic from frr-monitor +**Phoned Home** + +Currently, a [go-lldp](https://github.com/metal-stack/go-lldpd) client is used to listen for LLDP messages from provisioned machines to report these as phoned-home events to the metal-api. +This mechanism is only needed on leaf switches. +On all other types of switch this entire procedure can be skipped. + +**Port Configuration** + +There are four kinds of ports for a leaf switch: spine uplink, unprovisioned port, firewall port, machine port. +Depending on the kind of port its configuration will differ in regards to MTU, VLAN binding and VRF binding. +Any non-leaf switches don't know anything about machines, firewalls and the provisioning cycle. +Their port configuration is static. + +**FRR Config** + +The same goes for the FRR config. +To dynamically adapt to machines being provisioned and unprovisioned, the metal-core periodically writes the `frr.conf` file. +This dynamic configuration is only necessary on the leaf switches. 
+All other switches need a static FRR config. + +> In a future MEP we consider delegating the entire configuration of a switch to the metal-core. +> For now, all configuration that doesn't need to be dynamically adjusted will be deployed on the switch via metal-roles and the metal-core will mostly just report switch information to the metal-apiserver. + +### BGP Announcements + +Route information can be retrieved in JSON format from vtysh. +The metal-core should collect all routes it knows about and send them to the metal-apiserver along with a timestamp. + +### Switch-to-Switch Connections + +Similarly to the switch-to-machine connections where LLDP neighborship is used to learn about the physical connections, we can use LLDP to report connections between switches to the metal-apiserver. +For this, a separate LLDP client should be used, that forwards all LLDP messages, not only those of provisioned machines. + +## metal-apiserver + +A new GRPC endpoint should be exposed by the metal-apiserver to report BGP routes. + +```proto +service IPService { + rpc Get(IPServiceReportBGPRoutesRequest) returns (IPServiceReportBGPRoutesResponse) { + option (project_roles) = PROJECT_ROLE_OWNER; + option (project_roles) = PROJECT_ROLE_EDITOR; + option (project_roles) = PROJECT_ROLE_VIEWER; + option (auditing) = AUDITING_EXCLUDED; + } +} + +message IPServiceReportBGPRoutesRequest { + repeated BGPRoute bgpRoutes = 1; +} + +message BGPRoute { + string cidr = 1; + google.protobuf.Timestamp last_announced = 2; +} +``` + +There should be a table for BGP routes in metal-db. +Whenever new routes are reported they get merged into the existing ones by the strategy: + +- when new, just add +- when existing, update `last_announced` timestamp + +An expiration threshold should be defined and all expired routes should be cleaned up periodically. + +When an IP address is described with `metalctl network ip describe` the BGP routes should be queried. 
+If no route to the described IP was announced it should be indicated, e.g. + +```bash +allocationuuid: allocation-id +description: my ip address +ipaddress: 100.0.0.1 +name: ip-name +networkid: network-id +projectid: project-id +type: static +used: no # otherwise 'yes' +``` From 99620be1e5ed3a74fa5db7e2950a4db5a4d47f59 Mon Sep 17 00:00:00 2001 From: Ilja Rotar Date: Tue, 17 Jun 2025 16:57:36 +0200 Subject: [PATCH 05/14] typo --- docs/src/development/proposals/MEP17/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/src/development/proposals/MEP17/README.md b/docs/src/development/proposals/MEP17/README.md index 3c95f9ba12..ffb151f31a 100644 --- a/docs/src/development/proposals/MEP17/README.md +++ b/docs/src/development/proposals/MEP17/README.md @@ -24,7 +24,7 @@ Possible types are: - `mgmtspine` - `mgmteor` -Depending on the type its reconcilation loop will differ. +Depending on the type its reconciliation loop will differ. The current behavior should mostly remain unchanged for leaf switches. 
Things to change for non-leaves: From 2baa390f2cdb143d62073543dd122341d51cfbb2 Mon Sep 17 00:00:00 2001 From: Ilja Rotar Date: Wed, 18 Jun 2025 08:30:15 +0200 Subject: [PATCH 06/14] review --- docs/src/development/proposals/MEP17/README.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/src/development/proposals/MEP17/README.md b/docs/src/development/proposals/MEP17/README.md index ffb151f31a..ad1c69cea8 100644 --- a/docs/src/development/proposals/MEP17/README.md +++ b/docs/src/development/proposals/MEP17/README.md @@ -67,7 +67,7 @@ A new GRPC endpoint should be exposed by the metal-apiserver to report BGP route ```proto service IPService { - rpc Get(IPServiceReportBGPRoutesRequest) returns (IPServiceReportBGPRoutesResponse) { + rpc ReportBGPRoutes(IPServiceReportBGPRoutesRequest) returns (IPServiceReportBGPRoutesResponse) { option (project_roles) = PROJECT_ROLE_OWNER; option (project_roles) = PROJECT_ROLE_EDITOR; option (project_roles) = PROJECT_ROLE_VIEWER; @@ -81,7 +81,8 @@ message IPServiceReportBGPRoutesRequest { message BGPRoute { string cidr = 1; - google.protobuf.Timestamp last_announced = 2; + string switch_id = 2; + google.protobuf.Timestamp last_announced = 3; } ``` From 06967263a093013af5deb451e3f1dc509db40efd Mon Sep 17 00:00:00 2001 From: Ilja Rotar Date: Wed, 18 Jun 2025 08:32:24 +0200 Subject: [PATCH 07/14] rename --- docs/src/{development => developers}/proposals/MEP17/README.md | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename docs/src/{development => developers}/proposals/MEP17/README.md (100%) diff --git a/docs/src/development/proposals/MEP17/README.md b/docs/src/developers/proposals/MEP17/README.md similarity index 100% rename from docs/src/development/proposals/MEP17/README.md rename to docs/src/developers/proposals/MEP17/README.md From bff754b22c27d14b13cd816a78e6b35aa3780861 Mon Sep 17 00:00:00 2001 From: Ilja Rotar Date: Wed, 18 Jun 2025 08:45:00 +0200 Subject: [PATCH 08/14] remove eor --- 
docs/src/developers/proposals/MEP17/README.md | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/src/developers/proposals/MEP17/README.md b/docs/src/developers/proposals/MEP17/README.md index ad1c69cea8..4a5f7ce1c0 100644 --- a/docs/src/developers/proposals/MEP17/README.md +++ b/docs/src/developers/proposals/MEP17/README.md @@ -22,7 +22,6 @@ Possible types are: - `exit` - `mgmtleaf` - `mgmtspine` -- `mgmteor` Depending on the type its reconciliation loop will differ. The current behavior should mostly remain unchanged for leaf switches. From 7697d6e9ad5a235ead4e2c5b0cfa9905fe21f32c Mon Sep 17 00:00:00 2001 From: Ilja Rotar Date: Wed, 18 Jun 2025 09:31:20 +0200 Subject: [PATCH 09/14] describe data structure for bgp routes in metaldb --- docs/src/developers/proposals/MEP17/README.md | 30 ++++++++++++++----- 1 file changed, 22 insertions(+), 8 deletions(-) diff --git a/docs/src/developers/proposals/MEP17/README.md b/docs/src/developers/proposals/MEP17/README.md index 4a5f7ce1c0..e4260b7946 100644 --- a/docs/src/developers/proposals/MEP17/README.md +++ b/docs/src/developers/proposals/MEP17/README.md @@ -65,16 +65,15 @@ For this, a separate LLDP client should be used, that forwards all LLDP messages A new GRPC endpoint should be exposed by the metal-apiserver to report BGP routes. 
```proto -service IPService { - rpc ReportBGPRoutes(IPServiceReportBGPRoutesRequest) returns (IPServiceReportBGPRoutesResponse) { - option (project_roles) = PROJECT_ROLE_OWNER; - option (project_roles) = PROJECT_ROLE_EDITOR; - option (project_roles) = PROJECT_ROLE_VIEWER; - option (auditing) = AUDITING_EXCLUDED; +service SwitchService { + rpc ReportBGPRoutes(SwitchServiceReportBGPRoutesRequest) returns (SwitchServiceReportBGPRoutesResponse) { + option (metalstack.api.v2.infra_roles) = INFRA_ROLE_EDITOR; + option (metalstack.api.v2.infra_roles) = INFRA_ROLE_VIEWER; + option (metalstack.api.v2.auditing) = AUDITING_EXCLUDED; } } -message IPServiceReportBGPRoutesRequest { +message SwitchServiceReportBGPRoutesRequest { repeated BGPRoute bgpRoutes = 1; } @@ -85,7 +84,22 @@ message BGPRoute { } ``` -There should be a table for BGP routes in metal-db. +There should be a table for BGP routes in metaldb: + +```go +type ( + BGPRoute struct { + CIDR string `rethinkdb:"cidr"` + LastAnnounced time.Time `rethinkdb:"lastannounced"` + } + + BGPRoutes struct { + PartitionID string `rethinkdb:"partitionid"` + Routes []BGPRoute `rethinkdb:"routes"` + } +) +``` + Whenever new routes are reported they get merged into the existing ones by the strategy: - when new, just add From 8c1888597e90d33bf26084bbef1ad02c08e18dd8 Mon Sep 17 00:00:00 2001 From: Ilja Rotar Date: Wed, 18 Jun 2025 09:45:34 +0200 Subject: [PATCH 10/14] parition id irrelevant for routes --- docs/src/developers/proposals/MEP17/README.md | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/src/developers/proposals/MEP17/README.md b/docs/src/developers/proposals/MEP17/README.md index e4260b7946..1fad7a57a5 100644 --- a/docs/src/developers/proposals/MEP17/README.md +++ b/docs/src/developers/proposals/MEP17/README.md @@ -94,7 +94,6 @@ type ( } BGPRoutes struct { - PartitionID string `rethinkdb:"partitionid"` Routes []BGPRoute `rethinkdb:"routes"` } ) From b574521e62871e998d752a31d69fae33d2f5bc33 Mon Sep 17 00:00:00 2001 From: 
Ilja Rotar Date: Wed, 18 Jun 2025 09:50:54 +0200 Subject: [PATCH 11/14] ignore internal prefixes --- docs/src/developers/proposals/MEP17/README.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/docs/src/developers/proposals/MEP17/README.md b/docs/src/developers/proposals/MEP17/README.md index 1fad7a57a5..d880e02172 100644 --- a/docs/src/developers/proposals/MEP17/README.md +++ b/docs/src/developers/proposals/MEP17/README.md @@ -106,6 +106,9 @@ Whenever new routes are reported they get merged into the existing ones by the s An expiration threshold should be defined and all expired routes should be cleaned up periodically. +Only routes to external networks should be stored. +Cluster-internal prefixes should be ignored. + When an IP address is described with `metalctl network ip describe` the BGP routes should be queried. If no route to the described IP was announced it should be indicated, e.g. From f9dc5699a0531b65715be23a389ea988bf2352e1 Mon Sep 17 00:00:00 2001 From: Ilja Rotar Date: Wed, 18 Jun 2025 15:35:31 +0200 Subject: [PATCH 12/14] add switch id --- docs/src/developers/proposals/MEP17/README.md | 29 ++++--------------- 1 file changed, 6 insertions(+), 23 deletions(-) diff --git a/docs/src/developers/proposals/MEP17/README.md b/docs/src/developers/proposals/MEP17/README.md index d880e02172..7eb1884f89 100644 --- a/docs/src/developers/proposals/MEP17/README.md +++ b/docs/src/developers/proposals/MEP17/README.md @@ -74,41 +74,24 @@ service SwitchService { } message SwitchServiceReportBGPRoutesRequest { - repeated BGPRoute bgpRoutes = 1; + string switch_id = 1; + repeated BGPRoute bgpRoutes = 2; } message BGPRoute { string cidr = 1; - string switch_id = 2; - google.protobuf.Timestamp last_announced = 3; } ``` -There should be a table for BGP routes in metaldb: - -```go -type ( - BGPRoute struct { - CIDR string `rethinkdb:"cidr"` - LastAnnounced time.Time `rethinkdb:"lastannounced"` - } - - BGPRoutes struct { - Routes []BGPRoute `rethinkdb:"routes"` - } -) 
-``` - +Reported routes should be stored to a redis database along with the switch that reported them and the timestamp of the last time they were reported. +An expiration threshold should be defined and all expired routes should be cleaned up periodically. +Only routes to external networks should be stored. +Cluster-internal prefixes should be ignored. Whenever new routes are reported they get merged into the existing ones by the strategy: - when new, just add - when existing, update `last_announced` timestamp -An expiration threshold should be defined and all expired routes should be cleaned up periodically. - -Only routes to external networks should be stored. -Cluster-internal prefixes should be ignored. - When an IP address is described with `metalctl network ip describe` the BGP routes should be queried. If no route to the described IP was announced it should be indicated, e.g. From b152156a8c54b0e888c6f33f5e38c43f3e34f548 Mon Sep 17 00:00:00 2001 From: Ilja Rotar Date: Thu, 7 Aug 2025 14:01:57 +0200 Subject: [PATCH 13/14] clearer and more high-level description --- docs/src/developers/proposals/MEP17/README.md | 116 +++++------------- docs/src/developers/proposals/index.md | 2 +- 2 files changed, 33 insertions(+), 85 deletions(-) diff --git a/docs/src/developers/proposals/MEP17/README.md b/docs/src/developers/proposals/MEP17/README.md index 7eb1884f89..25445fa9a1 100644 --- a/docs/src/developers/proposals/MEP17/README.md +++ b/docs/src/developers/proposals/MEP17/README.md @@ -1,21 +1,22 @@ -# BGP Data Plane Visibility +# Global Network View -Currently, an operator cannot identify if an allocated IP is actually announced to the outer world. -At the edge of the network we would like to gather information about all routes announced from within the network. -This information should include a timestamp of the last announcement for each route. -If the timestamp is older than some threshold we will assume that the addresses are not longer used. 
+> [!IMPORTANT] +> This MEP assumes the implementation of the metal-apiserver as described by [MEP-4](../MEP4/README.md) which is currently a work in progress. -To achieve this we will extend the scope of metal-core so it can run on all types of switches, not only on leaves. -As a byproduct of this enhancement all switches will become visible via `metalctl switch ls`. -On the switches the metal-core will collect BGP routes and report them to the metal-apiserver. -The metal-apiserver will store these data to a separate table and query this table when an IP address is described. +Having a complete view of the network topology is useful when working with deployments or troubleshooting connectivity issues. +Currently, the API doesn't know of any other switches than the leaf switches. +Information about all other switches and their connections must be gathered from Ansible inventories or by accessing the switches via SSH. +Documentation of each partition's network must be kept in sync with all changes made to the deployment or cabling. +We would like to expand the API's knowledge of the network to the entire underlay including inter-switch connections as well as BGP statistics and health status. -## metal-core +## Switch Types -### Switch Types - -First of all, the metal-core should accept as an argument the type or role of the switch it is running on. -Possible types are: +Registering a switch at the API is done by the metal-core. +Apart from that, it also reconciles port and FRR configuration to adapt to the machine provisioning cycle. +This reconfiguration is only necessary on the leaf switches. +To allow deploying the metal-core on other switches than leaves we need a way of telling it what type of switch it is running on so it can act accordingly. +On non-leaf switches it will only register the switch and report statistics, but not change any configuration. 
+Supported switch types are: - `leaf` - `spine` @@ -23,85 +24,32 @@ Possible types are: - `mgmtleaf` - `mgmtspine` -Depending on the type its reconciliation loop will differ. -The current behavior should mostly remain unchanged for leaf switches. -Things to change for non-leaves: - -**Phoned Home** - -Currently, a [go-lldp](https://github.com/metal-stack/go-lldpd) client is used to listen for LLDP messages from provisioned machines to report these as phoned-home events to the metal-api. -This mechanism is only needed on leaf switches. -On all other types of switch this entire procedure can be skipped. - -**Port Configuration** - -There are four kinds of ports for a leaf switch: spine uplink, unprovisioned port, firewall port, machine port. -Depending on the kind of port its configuration will differ in regards to MTU, VLAN binding and VRF binding. -Any non-leaf switches don't know anything about machines, firewalls and the provisioning cycle. -Their port configuration is static. - -**FRR Config** - -The same goes for the FRR config. -To dynamically adapt to machines being provisioned and unprovisioned, the metal-core periodically writes the `frr.conf` file. -This dynamic configuration is only necessary on the leaf switches. -All other switches need a static FRR config. +## Network Topology -> In a future MEP we consider delegating the entire configuration of a switch to the metal-core. -> For now, all configuration that doesn't need to be dynamically adjusted will be deployed on the switch via metal-roles and the metal-core will mostly just report switch information to the metal-apiserver. +All switches should periodically report their LLDP neighbors and port configuration. +This information can be used to quickly identify common network issues, such as MTU mismatches. +Ideally, there would be some graphical representation of the network topology containing only the most important information for a quick overview. 
+It should contain all switches and machines as nodes and all connections as edges of a graph. +Ports, VRFs, and maybe also IPs should be associated with a connection. -### BGP Announcements +Apart from the topology graph, there should be a way to display more detailed information about both ports of a connection, like -Route information can be retrieved in JSON format from vtysh. -The metal-core should collect all routes it knows about and send them to the metal-apiserver along with a timestamp. +- MTU +- speed +- IP +- UP/DOWN status +- VRF +- VLAN +- whether it participates in a BGP session -### Switch-to-Switch Connections - -Similarly to the switch-to-machine connections where LLDP neighborship is used to learn about the physical connections, we can use LLDP to report connections between switches to the metal-apiserver. -For this, a separate LLDP client should be used, that forwards all LLDP messages, not only those of provisioned machines. - -## metal-apiserver - -A new GRPC endpoint should be exposed by the metal-apiserver to report BGP routes. - -```proto -service SwitchService { - rpc ReportBGPRoutes(SwitchServiceReportBGPRoutesRequest) returns (SwitchServiceReportBGPRoutesResponse) { - option (metalstack.api.v2.infra_roles) = INFRA_ROLE_EDITOR; - option (metalstack.api.v2.infra_roles) = INFRA_ROLE_VIEWER; - option (metalstack.api.v2.auditing) = AUDITING_EXCLUDED; - } -} - -message SwitchServiceReportBGPRoutesRequest { - string switch_id = 1; - repeated BGPRoute bgpRoutes = 2; -} - -message BGPRoute { - string cidr = 1; -} -``` +## BGP Announcements +The metal-core should collect all routes it knows about and send them to the API along with a timestamp. Reported routes should be stored to a redis database along with the switch that reported them and the timestamp of the last time they were reported. An expiration threshold should be defined and all expired routes should be cleaned up periodically. -Only routes to external networks should be stored. 
-Cluster-internal prefixes should be ignored. Whenever new routes are reported they get merged into the existing ones by the strategy: - when new, just add - when existing, update `last_announced` timestamp -When an IP address is described with `metalctl network ip describe` the BGP routes should be queried. -If no route to the described IP was announced it should be indicated, e.g. - -```bash -allocationuuid: allocation-id -description: my ip address -ipaddress: 100.0.0.1 -name: ip-name -networkid: network-id -projectid: project-id -type: static -used: no # otherwise 'yes' -``` +By querying the BGP announcements we can find out whether an allocated IP is still in use. diff --git a/docs/src/developers/proposals/index.md b/docs/src/developers/proposals/index.md index 40ecd6d49c..cc5997a4dd 100644 --- a/docs/src/developers/proposals/index.md +++ b/docs/src/developers/proposals/index.md @@ -35,5 +35,5 @@ Once a proposal was accepted, an issue should be raised and the implementation s | [MEP-14](MEP14/README.md) | Independence from external sources | `Completed` | | MEP-15 | HAL Improvements | `In Discussion` | | [MEP-16](MEP16/README.md) | Firewall Support for Cluster API Provider | `In Discussion` | -| [MEP-17](MEP17/README.md) | BGP Data Plane Visibility | `In Discussion` | +| [MEP-17](MEP17/README.md) | Global Network View | `In Discussion` | | [MEP-18](MEP18/README.md) | Autonomous Control Plane | `In Discussion` | From 47762b0e2fdf45f4a2d855008b127030c5956a89 Mon Sep 17 00:00:00 2001 From: Ilja Rotar Date: Thu, 7 Aug 2025 14:09:36 +0200 Subject: [PATCH 14/14] accident --- docs/src/developers/proposals/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/src/developers/proposals/index.md b/docs/src/developers/proposals/index.md index f4e9d27d3b..f0525ab746 100644 --- a/docs/src/developers/proposals/index.md +++ b/docs/src/developers/proposals/index.md @@ -1,4 +1,4 @@ -Metal Stack Enhancement Proposals (MEPs) +# Metal Stack Enhancement 
Proposals (MEPs) This section contains proposals which address substantial modifications to metal-stack.