Addition of Cisco 1000v and 9300v with Containerlab Support #1168
Labels: enhancement (New feature or request)

Open · fluffytrlolz opened this issue May 9, 2024 · 10 comments

@fluffytrlolz commented May 9, 2024

I'm looking for support for the Cisco 9000v (9300v) and 1000v with the containerlab platform. I've tested it out, and it looks like it may be close; the only missing piece seems to be the interface names that netlab sets up and passes into the clab.yml file. As an example:

interfaces:
- ifindex: 1
  ifname: Ethernet1/1
  ipv4: 10.1.0.1/30
  linkindex: 1
  name: r1 -> r2
  neighbors:

If the ifname were changed to "eth1", it would align with what the containerlab topology file expects (see the sketch below), and I believe the remaining pieces would not require changes.
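
For context, a minimal hand-written containerlab topology (the node kind and image are purely illustrative) shows why the eth-style names are needed on the link endpoints:

name: n9kv-test
topology:
  nodes:
    r1:
      kind: cisco_n9kv
      image: vrnetlab/vr-n9300v:9.3.6
    r2:
      kind: cisco_n9kv
      image: vrnetlab/vr-n9300v:9.3.6
  links:
    # containerlab attaches the veth pair to eth1 inside each container,
    # so the generated topology has to reference eth1, not Ethernet1/1
    - endpoints: ["r1:eth1", "r2:eth1"]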

I'm happy to help with testing out any implementations to assist in this update.

Thanks!

fluffytrlolz added the enhancement label on May 9, 2024

@ipspace (Owner) commented May 10, 2024

CSR1000v should be a simple case of interface name mapping. The interfaces will have one name within the container, and another name within the virtual machine inside the container (I'm assuming you're doing this stuff: https://containerlab.dev/manual/kinds/vr-csr/)

To do that, you have to define a bunch of stuff under devices.csr.clab. The easiest way to start would be to define them in the topology file:

defaults:
  devices.csr.clab:
    image: <docker-image>
    node.kind: cisco_csr1000v
    interface.name: eth{ifindex+1}

I'm guessing the interface.name bit based on vMX definition.

Next, you'll have to write an 'are we ready' task list, because the container starts "immediately" while the VM within the container takes "forever". See https://github.com/ipspace/netlab/blob/dev/netsim/ansible/tasks/readiness-check/vptx.yml for an example; a rough sketch of the idea follows.
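
Roughly along these lines (a sketch only, not the actual vptx.yml content; the module choice and retry timings are placeholders):

# Sketch of a readiness check: keep poking the device until the VM behind
# the container's SSH proxy starts answering CLI commands.
- name: Wait for the VM inside the container to start answering CLI commands
  cisco.ios.ios_command:
    commands: show version
  register: vm_ready
  until: vm_ready is success
  retries: 30        # placeholder values; tune to how long the VM takes to boot
  delay: 20
  when: netlab_provider == 'clab'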

Alternatively, if you could somehow get the two Docker images over to me, I will try to figure it all out ;)

@fluffytrlolz (Author)

Apologies for taking so long to come back to this topic. I made the following updates to my topology.yml file:

defaults:
  device: nxos
  devices.nxos:
    clab.image: vrnetlab/vr-n9300v:9.3.6
    clab.node.kind: cisco_n9kv
    clab.interface.name: eth{ifindex}
  provider: clab
module: [ospf]

nodes:
  r1:
  r2:
links: [r1-r2]

I then went into netsim/ansible/tasks/readiness-check/ and added an nxos.yml that contained a simple 15-minute wait:

- name: Wait for at least 15 minutes for 9000v inside CLAB...
  pause:
    minutes: 15
  when: |
    netlab_provider == 'clab'

This seemed to create the proper naming structure for the clab.yml:

interfaces:
- clab:
    name: eth1
  ifindex: 1
  ifname: Ethernet1/1
  ipv4: 10.1.0.1/30
  linkindex: 1
  name: r1 -> r2
  neighbors:
  - ifname: Ethernet1/1
    ipv4: 10.1.0.2/30
    node: r2
  ospf:
    area: 0.0.0.0
    network_type: point-to-point
    passive: false
  type: p2p

I set everything up this way because I started toying around with the 9000v first, and figured the NXOS model would be the closest match since it runs NX-OS. I am running into issues when creating the lab: the generated group_vars/nxos/topology.yml uses a different password than the 9000v defaults (admin/admin):

# Ansible inventory created from ['/home/clab-user/netlab/cisco-test/topology.yml', 'package:topology-defaults.yml']
#

ansible_connection: network_cli
ansible_network_os: nxos
ansible_ssh_pass: vagrant
ansible_user: vagrant

I can get past this by manually changing those values and then firing up the lab. However, should I create a separate device type for the 9000v that sets the proper default password, or is the better approach to override the default value? (I just haven't figured out the right syntax to override ansible_ssh_pass and ansible_user.)

@ipspace (Owner) commented Jun 14, 2024

So glad to hear you got this far, although I'd prefer a more robust readiness check, maybe something along the lines of what @ssasso did for vMX: https://github.com/ipspace/netlab/blob/dev/netsim/ansible/tasks/vmx/initial.yml#L8 (it should be moved into readiness_check but that's a different story).

Ansible variables are easy. Just set devices.nxos.clab.group_vars to whatever values you need. See https://github.com/ipspace/netlab/blob/dev/netsim/devices/eos.yml#L85 for an example.
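
For example, a minimal sketch using the admin/admin defaults of the vrnetlab image:

defaults:
  devices.nxos.clab:
    # override the inventory credentials netlab generates for these nodes
    group_vars:
      ansible_user: admin
      ansible_ssh_pass: admin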

@fluffytrlolz (Author)

I was able to get the 9000v working with the following updates:

topology.yml:

defaults:
  device: nxos
  devices.nxos.clab:
    group_vars.ansible_ssh_pass: admin
    group_vars.ansible_user: admin
    image: vrnetlab/vr-n9300v:9.3.6
    node.kind: cisco_n9kv
    interface.name: eth{ifindex}
  provider: clab
module: [ospf]

nodes:
  r1:
  r2:
links: [r1-r2]

nxos.yml added to the readiness-check tasks:

- name: Execute local ssh command to check 9000v readiness
  local_action:
    module: shell
    cmd: >
      sshpass -p '{{ ansible_ssh_pass }}' ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null {{ ansible_user }}@{{ ansible_host }} 'show int eth1/1'
  register: command_out
  until: command_out.rc == 0
  retries: 40
  delay: 30
  when: clab.kind is defined

- name: Confirm readiness of each host
  debug:
    msg: "Host {{ hostvars[inventory_hostname].hostname }} is ready."
  when: command_out.rc == 0

I'll look at the 1000v next and see what changes are required to make it come up. Let me know if you have any other ideas on how I should clean this up, or if this is how I should proceed. Would it be possible to integrate the nxos.yml into the main branch? I wondered if the other NXOS VMs had similar issues to what I encountered, or if it's just how I'm implementing this in containerlab.

Thanks!

@ipspace (Owner) commented Jun 17, 2024

I was able to get the 9000v working with the following updates:

Gee, if only you were a day faster, they would have been in 1.8.3 ;)

Let me know if you have any other ideas on how I should clean this up, or if this is how I should proceed.

This looks pretty decent to me; not much I would change. I would probably repackage your code into a more generic "VM-in-container" task list and include it from nxos-clab.yml (rough sketch below).
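
Something like this, hypothetically (the file names are assumptions, not the final layout):

# nxos-clab.yml: chain the generic VM-in-container check with the
# NX-OS-specific interface check
- name: Wait until the VM behind the container's SSH proxy responds
  ansible.builtin.include_tasks: vm-in-container.yml

- name: Wait until Ethernet1/1 shows up on the NX-OS VM
  ansible.builtin.include_tasks: nxos.yml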

Would it be possible to integrate the nxos.yml into the main branch?

Of course. I'll add it, noting where it came from.

Would it be possible for you to test the solution once I do the packaging?

I wondered if the other NXOS VMs had similar issues to what I encountered, or if it's just how I'm implementing this in containerlab.

NXOS has a generic problem: it claims it's ready before its interfaces are ready (Junos on vPTX seems to have a similar problem). We're dealing with that in the NXOS config deployment task list, and I've planned to move it into the readiness check for a long time. Now I have a good reason to get it done ;)

What you're experiencing though is specific to the way the VM is packaged in a container with an SSH proxy sitting in front of it.
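
For the interface part, the check could also be expressed with the NX-OS modules instead of sshpass; this is a sketch only, assuming the Ansible connection to the node is already usable:

# Sketch: retry "show interface brief" until Ethernet1/1 appears in the
# output, i.e. until NX-OS has actually finished bringing up its ports.
- name: Wait for Ethernet1/1 to exist on the NX-OS VM
  cisco.nxos.nxos_command:
    commands: show interface brief
  register: intf_check
  until: intf_check.stdout[0] is search('Eth1/1')
  retries: 40
  delay: 30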

@fluffytrlolz (Author)

Would it be possible for you to test the solution once I do the packaging?

Yep, I can test the updates once they're included. I'll try to test the 1000v this evening and see whether it's just an interface naming convention or whether a readiness check is also required. With any luck, I'll have the 1000v and 9000v tested in time for 1.8.4 :)

@ipspace (Owner) commented Jun 18, 2024

So I tried the hellt/vrnetlab project and nxos keeps crashing. I will not waste any more time trying to troubleshoot that.

Anyway, I copied your settings (apart from the image name) into nxos.yml, moved the "Ethernet 1/1" readiness check into an NXOS-specific task list, added a generic "test whether the VM in a container is ready" check, and an nxos-clab.yml task list that just invokes the other two. The results are in the nxos-clab branch (changes in dev...nxos-clab).

There is a pretty high probability that this will work, but I can't be 100% sure ;) Anyway, pull down the latest changes, switch to the nxos-clab branch, and give it a try. Keeping my fingers crossed ;))

As for the CSR 1Kv, you'll have to use the same readiness check (see the comments in https://github.com/ipspace/netlab/blob/dev/netsim/ansible/tasks/vmx/initial.yml for details). Copy nxos-clab.yml into csr-clab.yml and remove the "check for Ethernet 1/1" include_tasks (sketch below).
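
In other words, csr-clab.yml would boil down to something like this (hypothetical file names again):

# csr-clab.yml: the CSR only needs the generic "VM in a container is ready"
# check; there is no Ethernet1/1-style interface test
- name: Wait until the CSR VM behind the container's SSH proxy responds
  ansible.builtin.include_tasks: vm-in-container.yml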

@ipspace (Owner) commented Jun 18, 2024

So I tried the hellt/vrnetlab project and nxos keeps crashing. I will not waste any more time trying to troubleshoot that.

I'm an idiot. I tried to build an nxos container, not an n9kv one. It all works now.

I changed the image name to what hellt/vrnetlab generates and reduced the retries to 20 (my setup worked after three retries, as each retry takes 30 seconds to time out).

@ipspace (Owner) commented Jun 18, 2024

FWIW, I added the CSR part. It should work once you get the container up and running (it didn't work for me out of the box, and I didn't have time to troubleshoot it).

@fluffytrlolz (Author)

I'll test it out today, thanks! Interesting how quickly your 9000v spun up; mine definitely takes the full 12 minutes. I'll have to double-check the specs I gave the VM I've been running everything on, and I'll also take a look at which revision I have; maybe that's a contributing factor.
