
Config template support 🚀 #383

Merged
merged 38 commits into srl-labs:master
Jun 30, 2021

Conversation

kellerza
Contributor

kellerza commented Apr 12, 2021

Configuration templates.

  • The configuration template system supports roles; if no role is specified, "role" = "kind". Roles are used to find the template file: <template>-<role>.tmpl. A list of <template>s is specified with the -l parameter
  • Template functions were moved to an external package, see functions here. All functions are chainable, and where internal ones were duplicated, both the original syntax and the chainable syntax are supported
  • This PR includes a basic template for vr-sros & SRL to configure ports, interfaces, IGP (ISIS) and SR on SROS
  • The SSH deployer supports SROS MD-CLI and SRL. Transaction support is automatic, except when the template starts with "show"
  • The SSH deployer can be used to allow CLI/SSH-based configuration from a file referenced in the topo #317

How does this work?

A clab.yml file with some additional labels; see lab-examples/vr05:

name: conf1
topology:
  defaults:
    kind: vr-sros
    image: registry.srlinux.dev/pub/vr-sros:21.2.R1
    license: /home/kellerza/containerlabs/license-sros21.txt
    labels:
      isis_iid: 0
  nodes:
    sr1:
      labels:
        systemip: 10.0.50.31/32
        sid_idx: 1
    sr2:
      labels:
        systemip: 10.0.50.32/32
        sid_idx: 2
    sr3:
      labels:
        systemip: 10.0.50.33/32
        sid_idx: 3
    sr4:
      labels:
        systemip: 10.0.50.34/32
        sid_idx: 4
  links:
    - endpoints: [sr1:eth1, sr2:eth2]
      labels:
        port: 1/1/c1/1, 1/1/c2/1
        ip: 1.1.1.2/30
        vlan: 99
    - endpoints: [sr2:eth1, sr3:eth2]
      labels:
        port: 1/1/c1/1, 1/1/c2/1
    - endpoints: [sr3:eth1, sr4:eth2]
      labels:
        port: 1/1/c1/1, 1/1/c2/1
    - endpoints: [sr4:eth1, sr1:eth2]
      labels:
        port: 1/1/c1/1, 1/1/c2/1

This lab can be deployed with the existing clab:

./containerlab deploy -t conf1.clab.yml 

What's new?

The fun starts when all nodes are up and running :-)

./containerlab config -t conf1.clab.yml --template-path ./templates

Presto. An instant lab

SSH in and run some show commands:

A:admin@sr4# show router route-table

===============================================================================
Route Table (Router: Base)
===============================================================================
Dest Prefix[Flags]                            Type    Proto     Age        Pref
      Next Hop[Interface Name]                                    Metric
-------------------------------------------------------------------------------
1.1.1.0/30                                    Remote  ISIS      00h00m10s  18
       1.31.34.0                                                    20
1.31.34.0/31                                  Local   Local     00h00m20s  0
       to_sr1                                                       0
1.32.33.0/31                                  Remote  ISIS      00h00m09s  18
       1.33.34.0                                                    20
1.33.34.0/31                                  Local   Local     00h00m20s  0
       to_sr3                                                       0
10.0.50.31/32                                 Remote  ISIS      00h00m10s  18
       1.31.34.0                                                    10
10.0.50.32/32                                 Remote  ISIS      00h00m10s  18
       1.31.34.0                                                    20
10.0.50.33/32                                 Remote  ISIS      00h00m09s  18
       1.33.34.0                                                    10
10.0.50.34/32                                 Local   Local     00h00m20s  0
       system                                                       0
-------------------------------------------------------------------------------
No. of Routes: 8
Flags: n = Number of times nexthop is repeated
       B = BGP backup route available
       L = LFA nexthop available
       S = Sticky ECMP requested
===============================================================================

How does this work?

The labels are variables that will be used in the configuration templates. In the future these will become settings.

Variables are prepared per node, including an array of all links on that node. For each link, any label value containing a comma (e.g. a label of port: 1/1/1, 1/1/2) refers to the two endpoints of the link: node A gets port = 1/1/1 and port_far = 1/1/2, while node B gets the swapped values.
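The near/far split described above can be sketched as a small helper. This is a hypothetical illustration of the behavior, not containerlab's actual implementation; `splitEndpointLabel` and its `side` parameter are made-up names.

```go
package main

import (
	"fmt"
	"strings"
)

// splitEndpointLabel shows how a comma-separated link label maps to
// near/far values for one endpoint (hypothetical helper, not the
// actual containerlab code). side 0 is node A, side 1 is node B.
func splitEndpointLabel(value string, side int) (near, far string) {
	parts := strings.SplitN(value, ",", 2)
	if len(parts) != 2 {
		v := strings.TrimSpace(value)
		return v, v // a single value applies to both ends
	}
	a, b := strings.TrimSpace(parts[0]), strings.TrimSpace(parts[1])
	if side == 0 {
		return a, b
	}
	return b, a
}

func main() {
	near, far := splitEndpointLabel("1/1/1, 1/1/2", 0)
	fmt.Printf("node A: port=%s port_far=%s\n", near, far)
	near, far = splitEndpointLabel("1/1/1, 1/1/2", 1)
	fmt.Printf("node B: port=%s port_far=%s\n", near, far)
}
```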

Some fields have special generators, like link IPs. Specify only a single IP and the far end will be calculated; specify nothing and an IP will be derived from the systemips.
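One plausible way the far-end calculation could work for point-to-point links is to flip to the other host address in the prefix. This is a sketch under that assumption (IPv4 /30 and /31 only), not the generator actually used in this PR:

```go
package main

import (
	"fmt"
	"net/netip"
)

// farEndIP guesses the far-end address of a point-to-point link given
// one endpoint as CIDR (hypothetical sketch): for /31 it flips the
// last bit, for /30 it returns the other usable host address.
func farEndIP(cidr string) (string, error) {
	p, err := netip.ParsePrefix(cidr)
	if err != nil {
		return "", err
	}
	if !p.Addr().Is4() {
		return "", fmt.Errorf("IPv4 only in this sketch")
	}
	a := p.Addr().As4()
	switch p.Bits() {
	case 31:
		a[3] ^= 1 // the other address of the /31 pair
	case 30:
		base := a[3] &^ 3 // last octet of the network address
		if a[3] == base+1 {
			a[3] = base + 2
		} else {
			a[3] = base + 1
		}
	default:
		return "", fmt.Errorf("unsupported prefix length %d", p.Bits())
	}
	return fmt.Sprintf("%s/%d", netip.AddrFrom4(a), p.Bits()), nil
}

func main() {
	far, _ := farEndIP("1.1.1.2/30")
	fmt.Println(far) // 1.1.1.1/30
	far, _ = farEndIP("1.50.51.1/31")
	fmt.Println(far) // 1.50.51.0/31
}
```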

An example of prepared variables (you can also see this by adding -dd to the CLI):

vars = {
      "isis_iid": "0",
      "links": [
            {
                  "ip": "1.50.51.1/31",
                  "ip_far": "1.50.51.0/31",
                  "isis_iid": "0",
                  "name": "to_srl",
                  "name_far": "to_sros",
                  "port": "1/1/c1/1",
                  "port_far": "ethernet-1/1",
                  "systemip": "10.0.50.51/32",
                  "systemip_far": "10.0.50.50/32",
                  "vlan": "10"
            }
      ],
      "role": "vr-sros",
      "sid_idx": "10",
      "systemip": "10.0.50.51/32"
  }

Layering config templates

The current 'base' template might be a bit opinionated (ISIS, SR, RSVP-TE), but you can always use your own.

In general it is good to keep the config templates small, as they can be layered: -l base,show-route-table will apply the base configuration and then run the show commands

Help

$ clab config --help
configure a lab using templates and variables from the topology definition file
reference: https://containerlab.srlinux.dev/cmd/config/

Usage:
  containerlab config [flags]

Aliases:
  config, conf

Flags:
  -c, --check int               render dry-run & print n lines of config
  -h, --help                    help for config
  -l, --template-list strings   comma separated list of template names to render
  -p, --template-path string    directory with templates used to render config

Global Flags:
  -d, --debug count        enable debug mode
  -n, --name string        lab name
  -r, --runtime string     container runtime (default "docker")
      --timeout duration   timeout for docker requests, e.g: 30s, 1m, 2m30s (default 30s)
  -t, --topo string        path to the file with topology information

@hellt
Member

hellt commented Apr 13, 2021

Thanks @kellerza
that is a nice step forward and a big chunk of work you did!

I think we won't be able to get away without making the architecture extensible. I wonder if you feel the same?

Kind specific configs

First, I think we need to have a certain set of per-kind driver files which define the core configuration pieces of a kind. Maybe something similar to what scrapli does - https://github.com/scrapli/scrapli_community#adding-a-platform

With a per-kind config we can flexibly work with a session manager (i.e. ssh), without hardcoded on-open, on-close, commit commands.

SSH session config

I think with the above you will unlock the possibility to configure the session flexibly. Maybe you can look into https://github.com/networklore/netrasp to see how they handle session configuration and vendor specific configs

@hellt
Member

hellt commented Apr 13, 2021

Maybe @karimra can help us out here with architectural advice on how to nicely package the interfaces to implement:

  • pluggable transports
  • pluggable drivers (aka kinds)

@kellerza
Contributor Author

In my view we should separate the config generation (templating) from delivering the config, and that is the main reason I haven't spent that much time on the delivery today (it supports only one kind, for example)

Kind-specific delivery (config of the SSH session, prompts, SSH delivery, properly parsing prompts) can even be offloaded to something like netrasp. That said, we already have save implemented with per-device defaults, so adding some device kinds to the SSH deployer is not so unrealistic.

I don't see Go as being as easily extendable as Python (like scrapli or galaxy, where you just get a new package, since everything is interpreted and there is no need to recompile), so you will probably need a more configurable system (maybe this is netrasp today)

@hellt
Copy link
Member

hellt commented Apr 13, 2021

@kellerza yes, I should have clarified that by pluggable I meant something that can read necessary inputs from a file (like different prompts, command sequences that are needed to be used on open and close of session)

I am all in to try and bring the sros/srl support to netrasp and reuse its code; that would be even more awesome. We/I can chat with Patrick (the netrasp maintainer) about that.

In the meantime we can focus on the templating architecture and discuss how to properly model the configuration piece.
Maybe it is not a bad idea to explore a separate container under the topo file to define configuration.

I'll hook up @hansthienpondt here as well, as he might be interested in this

@karimra
Member

karimra commented Apr 13, 2021

For transport you could do something along the lines of:

type Transport interface {
	// Open creates an ssh connection and session,
	// or creates a gNMI client and authenticates (use capabilities to discover the kind?),
	// or creates an http client for json-rpc
	Open(address string, opts ...opts) error
	// Write takes the byte slice generated from the templates:
	// cli commands to write in the case of ssh,
	// a json body in the case of gNMI (then sends a gNMI Set update to path `/`),
	// a json body or commands in the case of json-rpc (then sends a set or cli RPC)
	Write(data []byte) error
	// Close cleans up
	Close() error
}

Where opts can be used to pass params like auth method for ssh, dialOptions for gRPC, http headers for json-rpc.
And pass the kind to the transport.
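The opts pattern above is commonly done with functional options in Go. A minimal sketch of that idea follows; the option names (WithUsername, WithKind) and defaults are illustrative, not containerlab's or netrasp's API:

```go
package main

import "fmt"

// options holds the tunables an Open call can override.
type options struct {
	username string
	kind     string
}

// Option mutates the options struct; callers compose them freely.
type Option func(*options)

func WithUsername(u string) Option { return func(o *options) { o.username = u } }
func WithKind(k string) Option     { return func(o *options) { o.kind = k } }

// Open applies defaults first, then any caller-supplied options.
func Open(address string, opts ...Option) *options {
	o := &options{username: "admin", kind: "srl"} // assumed defaults
	for _, fn := range opts {
		fn(o)
	}
	fmt.Printf("open %s as %s (kind %s)\n", address, o.username, o.kind)
	return o
}

func main() {
	Open("clab-conf1-sr1", WithKind("vr-sros"))
}
```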

an ssh implementation would look like:

type sshTransport struct {
    // unexported config fields to customize the behavior; prompt, kind, etc
   c *ssh.Connection
   s *ssh.Session
}

gnmi implementation

type gnmiTransport struct {
    // unexported fields to save the kind,...
    c       gnmi.GNMIClient
    cfn     context.CancelFunc
}

json-rpc implementation:

type jsonRPCTransport struct {
	// unexported fields to save the kind,...
	c *http.Client
}

For Kinds, I had a discussion with @steiler last week about this, we reached this:

type Kind interface{
    Initialize()
    Deploy()
    PostDeploy()
    Destroy()
}

Not sure how far this has gotten, and there was not much thought on the specific methods signatures.

@steiler
Collaborator

steiler commented Apr 13, 2021

Haven't had the time up until now. It is quite some code that needs to be reorged. However I'm still on top of it.

@kellerza
Contributor Author

kellerza commented Apr 13, 2021

type jsonRPCTransport struct {
	// unexported fields to save the kind,...
	c *http.Client
}
type Kind interface{
    Initialize()
    Deploy()
    PostDeploy()
    Destroy()
}

Thanks @karimra. For an SSH session, in this PR I have:

type Session struct {
	In      io.Reader
	Out     io.WriteCloser
	Session *ssh.Session
}

here we should probably allow an extendable kind using one of these SSH sessions, with an interface that adds the relevant connect, configure-global and commit actions:

type KindSSH interface {
	connect() error
	config() error
	write(data []byte) error
	commit() error
	close() error
}

edit: maybe the write should internally wrap the config() and commit() actions
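That wrapping idea can be sketched with a toy transport that records what it would send; the type and method names here (toySSH, send) are made up for illustration:

```go
package main

import "fmt"

// toySSH stands in for a real SSH session and just logs each step.
type toySSH struct{ log []string }

func (t *toySSH) config() error { // enter candidate/config mode
	t.log = append(t.log, "edit-config")
	return nil
}

func (t *toySSH) send(data []byte) error { // push the rendered config
	t.log = append(t.log, string(data))
	return nil
}

func (t *toySSH) commit() error {
	t.log = append(t.log, "commit")
	return nil
}

// Write delivers a rendered template inside a transaction, wrapping
// the config() and commit() steps around the payload.
func (t *toySSH) Write(data []byte) error {
	if err := t.config(); err != nil {
		return err
	}
	if err := t.send(data); err != nil {
		return err
	}
	return t.commit()
}

func main() {
	s := &toySSH{}
	_ = s.Write([]byte("configure router isis"))
	fmt.Println(s.log)
}
```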

@karimra
Member

karimra commented Apr 13, 2021

yes my answer was about to be your edit :)
your code can be reused as sshTransport for SRL.

BTW the Kind here is the Node kind (srl, ceos,...) not the transport kind.

@hellt
Member

hellt commented Apr 13, 2021

@kellerza what if the next step would be to make this common Transport interface and make your sshTransport implement it?
What I would want to do next is to create gnmic transport and create a few templates that could be pushed over it to the nodes.

To me that would make an MVP for this one. It will deliver the extensibility of transports and will have two variants of templating (CLI and YAML)

but that means that we would need to add configuration of the transport method somewhere in the topo file, to indicate which transport to use and (later) which templates to render

maybe something like this?

name: cfg
settings: # or maybe metadata?
  config-transport: gnmi
  config-templates: ["a.tmpl", "b.tmpl"]

ADD1: on the other hand, the selection of config templates to apply are likely needed to be used on a per-node/kind/default level
as well as transport mode can be set inside the topology

ADD2:

topology:
  defaults:
    kind: vr-sros
    image: registry.srlinux.dev/pub/vr-sros:21.2.R1
    license: /home/kellerza/containerlabs/license-sros21.txt
    labels:
      isis_iid: 0
      config-transport: ssh
      config-templates: ifaces.tmpl, bgp.tmpl
  nodes:
    sr1:
      labels:
        systemip: 10.0.50.31/32
        sid_idx: 1
        config-templates: bgp.tmpl
    sr2:
      labels:
        systemip: 10.0.50.32/32
        sid_idx: 2
        config-transport: gnmi
    sr3:
      labels:
        systemip: 10.0.50.33/32
        sid_idx: 3
  links:
    - endpoints: [sr1:eth1, sr2:eth2]
      labels:
        port: 1/1/c1/1, 1/1/c2/1
        ip: 1.1.1.2/30
        vlan: 99
    - endpoints: [sr2:eth1, sr3:eth2]
      labels:
        port: 1/1/c1/1, 1/1/c2/1

@hellt
Member

hellt commented Apr 13, 2021

Another proposal from internal discussions

config:
  vars:
    isis_id: 0
  transport: ssh
  templates:
    - bgp.yml
    - ifaces.yml

kellerza mentioned this pull request Apr 13, 2021
@henderiw
Contributor

henderiw commented Apr 14, 2021

I believe we need something more flexible.

here is what I am using myself for now:

there is a dynamic case to build the infra: it auto-assigns the AS, ISL subnets and loopbacks from a pool; you can select single-stack, dual-stack or v6-only, and eBGP/iBGP, ISIS or OSPF

infrastructure:
  # dual-stack, ipv4-only, ipv6-only
  addressing_schema: "dual-stack"
  networks:
    loopback: {ipv4_cidr: 100.112.100.0/24, ipv6_cidr: 3100:100::/48}
    isl: {ipv4_cidr: 100.64.0.0/16, ipv6_cidr: 3100:64::/48, ipv4_itfce_prefix_length: 31, ipv6_itfce_prefix_length: 127}
  protocols:
    protocol: ebgp
    as_pool: [65000, 65100]
    overlay_as: 65002
    overlay_protocol: evpn

After that you add labels to the links. LAGs become interesting, as you have to add some logical object to the link.

This is what I have so far:

links:
    # server connectivity
    - endpoints: ["leaf1:e1-1", "master0:eno5"]
      labels: {"kind": "access", "type": "esi1", "client-name": "bond0", "sriov": true, "ipvlan": true, "speed": "10G", "pxe": true}
    - endpoints: ["leaf2:e1-1", "master0:eno6"]
      labels: {"kind": "access", "type": "esi1", "client-name": "bond0", "sriov": true, "ipvlan": true, "speed": "10G"}
    - endpoints: ["leaf1:e1-2", "master0:ens2f0"]
      labels: {"kind": "access", "type": "esi2", "client-name": "bond1", "sriov": true, "ipvlan": true, "speed": "10G", "pxe": true}
    - endpoints: ["leaf2:e1-2", "master0:ens2f1"]
      labels: {"kind": "access", "type": "esi2", "client-name": "bond1", "sriov": true, "ipvlan": true, "speed": "10G"}

I also added a position to the node, to know if this device is part of the infra or not.

topology:
  kinds:
    srl:
      type: ixrd2
      position: network
    vr-sros:
      type: sr-1s
      position: network
    linux:
      type: rhel8
      labels: {"need_update_dist": false, "install_rt_sched": false, "install_net_driver": true} 
      position: access
      storage:
        nfs_server: 100.112.1.201
        nfs_mount: /nfs
        csi: nfs-csi
  nodes:
    leaf1:
      kind: srl
      mgmt_ipv4: 172.20.20.3
      labels: {"target": "leaf-grp1"}      
    leaf2:
      kind: srl
      mgmt_ipv4: 172.20.20.4
      labels: {"target": "leaf-grp1"} 
    master0:
      kind: linux
      mgmt_ipv4: 100.112.3.11
      labels: {"target": "servers"}
    worker0:
      kind: linux
      mgmt_ipv4: 100.112.3.12
      labels: {"target": "servers"}
    worker1:
      kind: linux
      mgmt_ipv4: 100.112.3.13
      labels: {"target": "servers"}
    worker2:
      kind: linux
      mgmt_ipv4: 100.112.3.14
      labels: {"target": "servers"}
    worker3:
      kind: linux
      mgmt_ipv4: 100.112.3.15
      labels: {"target": "servers"}
    dcgw1:
      kind: sros
      mgmt_ipv4: 172.20.20.1
      labels: {"target": "dcgw-grp1"}
    dcgw2:
      kind: sros
      mgmt_ipv4: 172.20.20.2
      labels: {"target": "dcgw-grp1"}

After that I have the workloads, which do the evpn/ipvlan, etc.

workloads:
  provisioning:
    servers:
      vlans:
        itfce: {vlan_id: 0, kind: bridged}
  infrastructure:
    servers:
      vlans:
        itfce: {vlan_id: 40, kind: irb, ipv4_cidr: 100.112.3.11/24, ipv6_cidr: 2010:100:3::/64,}
    dcgw-grp1:
      vlans:
        itfce: {vlan_id: 45, kind: routed, ipv4_cidr: 10.100.40.0/24, ipv6_cidr: 2010:100:40::/48, ipv4_itfce_prefix_length: 31, ipv6_itfce_prefix_length: 127, addressing_schema: "dual-stack"}
  multus-mgmt:
    servers:
      vlans:
        ipvlan: {vlan_id: 101, kind: irb, ipv4_cidr: 10.1.11.0/24, ipv6_cidr: 2010:100:11::/64,}
        sriov1: {vlan_id: 102, kind: irb, ipv4_cidr: 10.1.12.0/24, ipv6_cidr: 2010:100:12::/64, target: leaf1}
        sriov2: {vlan_id: 103, kind: irb, ipv4_cidr: 10.1.13.0/24, ipv6_cidr: 2010:100:13::/64, target: leaf2}
      networks:
        loopback: {ipv4_cidr: 10.254.15.0/24}
    dcgw-grp1:
      vlans:
        itfce: {vlan_id: 105, kind: routed, ipv4_cidr: 10.100.15.0/24, ipv6_cidr: 2010:100:15::/48, ipv4_itfce_prefix_length: 31, ipv6_itfce_prefix_length: 127, addressing_schema: "dual-stack"}
  multus-internal:
    servers:
      vlans:
        ipvlan: {vlan_id: 201, kind: irb, ipv4_cidr: 10.1.21.0/24, ipv6_cidr: 2010:100:21::/64}
        sriov1: {vlan_id: 202, kind: irb, ipv4_cidr: 10.1.22.0/24, ipv6_cidr: 2010:100:22::/64, target: leaf1}
        sriov2: {vlan_id: 203, kind: irb, ipv4_cidr: 10.1.23.0/24, ipv6_cidr: 2010:100:23::/64, target: leaf2}
      networks:
        loopback: {ipv4_cidr: 10.254.25.0/24}
    dcgw-grp1:
      vlans:
        itfce: {vlan_id: 205, kind: routed, ipv4_cidr: 10.100.25.0/24, ipv6_cidr: 2010:100:25::/48, ipv4_itfce_prefix_length: 31, ipv6_itfce_prefix_length: 127, addressing_schema: "dual-stack"}
  multus-external:
    servers:
      vlans:
        ipvlan: {vlan_id: 301, kind: irb, ipv4_cidr: 10.1.31.0/24, ipv6_cidr: 2010:100:31::/64}
        sriov1: {vlan_id: 302, kind: irb, ipv4_cidr: 10.1.32.0/24, ipv6_cidr: 2010:100:32::/64, target: leaf1}
        sriov2: {vlan_id: 303, kind: irb, ipv4_cidr: 10.1.33.0/24, ipv6_cidr: 2010:100:33::/64, target: leaf2}
      networks:
       loopback: {ipv4_cidr: 10.254.35.0/24}
    dcgw-grp1:
      vlans:
        itfce: {vlan_id: 305, kind: routed, ipv4_cidr: 10.100.35.0/24, ipv6_cidr: 2010:100:35::/48, ipv4_itfce_prefix_length: 31, ipv6_itfce_prefix_length: 127, addressing_schema: "dual-stack"}
  multus-sba:
    servers:
      vlans:
        ipvlan: {vlan_id: 401, kind: irb, ipv4_cidr: 10.1.41.0/24, ipv6_cidr: 2010:100:41::/64}
        sriov1: {vlan_id: 402, kind: irb, ipv4_cidr: 10.1.42.0/24, ipv6_cidr: 2010:100:42::/64, target: leaf1}
        sriov2: {vlan_id: 403, kind: irb, ipv4_cidr: 10.1.43.0/24, ipv6_cidr: 2010:100:43::/64, target: leaf2}
      networks:
       loopback: {ipv4_cidr: 10.254.45.0/24}
    dcgw-grp1:
      vlans:
        itfce: {vlan_id: 405, kind: routed, ipv4_cidr: 10.100.45.0/24, ipv6_cidr: 2010:100:45::/48, ipv4_itfce_prefix_length: 31, ipv6_itfce_prefix_length: 127, addressing_schema: "dual-stack"}
  multus-internet:
    servers:
      vlans:
        ipvlan: {vlan_id: 501, kind: irb, ipv4_cidr: 10.1.51.0/24, ipv6_cidr: 2010:100:51::/64}
        sriov1: {vlan_id: 502, kind: irb, ipv4_cidr: 10.1.52.0/24, ipv6_cidr: 2010:100:52::/64, target: leaf1}
        sriov2: {vlan_id: 503, kind: irb, ipv4_cidr: 10.1.53.0/24, ipv6_cidr: 2010:100:53::/64, target: leaf2}
      networks:
        loopback: {ipv4_cidr: 10.254.55.0/24}
    dcgw-grp1:
      vlans:
        itfce: {vlan_id: 505, kind: routed, ipv4_cidr: 10.100.55.0/24, ipv6_cidr: 2010:100:55::/48, ipv4_itfce_prefix_length: 31, ipv6_itfce_prefix_length: 127, addressing_schema: "dual-stack"}

The result is a set of parameters that can feed into a template or any other system, if we want to use this. It provides a lot of flexibility; I have been using this myself for a while, and it would benefit others.

@kellerza
Contributor Author

@henderiw indeed, having pools or lists for auto-assignment would be great; we can look at adding some infrastructure and a template function to get some of this at some point

Do you keep lease state for assigned IPs? It's probably required if we want to keep this idempotent... or are we OK with an IP changing if there is a slight topology change, for example?

@kellerza
Contributor Author

Should be good to go. SROS MD-CLI and SRL work well.

# View config
clab_ config --path $TEMPLATES0 --topo ring1.clab.yml --templates base -p 100
# Deploy config
clab_ config --path $TEMPLATES0 --topo ring1.clab.yml --templates base
# Show route tables
clab_ config --path $TEMPLATES0 --topo ring1.clab.yml --templates show-route-table

Converted the global -d to be a count, so you can have various levels of debug, e.g. to display what is sent/received over SSH:

  • -d normal
  • -dd show sent/received and prompt parsing
  • -ddd show internal channels debug
  • -dddd add byte-based printing to debug ssh & prompts

@hellt
Member

hellt commented Apr 23, 2021

there seems to be an empty file templates/vr-sros/show-route-table-link.tmpl

@kellerza
Contributor Author

there seems to be an empty file templates/vr-sros/show-route-table-link.tmpl

This is expected, since all templates need an explicit -node and -link file. This show command is not applicable to links.

The alternative is to silently ignore a template if it is not found (or warn when debugging), but I believe more explicit is better in this case (otherwise bad filenames would hide issues).

@kellerza
Contributor Author

kellerza commented Jun 9, 2021

Ready for review. Top comment updated.

kellerza mentioned this pull request Jun 12, 2021
hellt merged commit 929abda into srl-labs:master Jun 30, 2021
@hellt
Member

hellt commented Jun 30, 2021

thanks @kellerza
we will continue refining the ideas of config provisioning here #487

kellerza deleted the config branch July 5, 2021