Add documentation on the expectations of vendor images. #535

Merged: 3 commits, May 20, 2024
Changes from 2 commits

7 changes: 7 additions & 0 deletions docs/README.md
@@ -29,6 +29,13 @@ KNE can easily be scaled to run large topologies utilizing its Kubernetes
backbone. This guide describes how to set up a k8s multi worker node cluster
and get a 150 node KNE topology up and running.

## Vendor Image Requirements

[Vendor Image Requirements](vendor.md)

KNE uses vendor-supplied images. This document describes the expectations
for those images.

## Kubernetes Reference

[Kubernetes Reference](kubernetes_reference.md)
87 changes: 87 additions & 0 deletions docs/vendor.md
@@ -0,0 +1,87 @@
# Vendor Image Requirements

A Vendor Image is a Docker container image that can be used with KNE to emulate
a vendor's devices.

Without vendor images, KNE is just an empty virtual machine rack that does
nothing. Vendor-supplied images are what the user of KNE sees and is interested
in.

This document describes the requirements and expectations that a vendor image
must meet in order to be considered *KNE Qualified*.
Collaborator

What does KNE qualified mean in this case? Will there be an indicator anywhere that the image is KNE qualified? That is, are these the requirements for us to merge a new node impl? Or do we mark some of the node impls as "qualified"?


## KNE Uses

KNE was built to enable testing the functionality of networks without physical
hardware. Due to the obvious limitations of emulation, KNE is not designed to
test bandwidth and latency of connections. KNE is designed to enable testing of
the control protocols and interaction between devices. There are several
different types of testing.

### Testing New Topologies

KNE is used to test changes in network topology. Changes in network topology
can impact various protocols used in the network (e.g. BGP).

### Testing Changes in Protocol or Configuration

KNE is used to test protocol changes or other configuration changes.

### Testing Device Functionality

KNE is used to test changes to a device's Network Operating System (NOS). This
is a crucial step in validating a device's usability for a particular purpose
when a new NOS is released.

## Fidelity

A network device in KNE can be viewed as two main components: the control plane
and the data plane (the ASIC).

KNE is used to test the control plane of the NOS. This requires that the control
software in the virtual device behave the same as on the hardware. It is
expected that the control software used in an image is the same as the
software used on the physical device and that it is configured and reacts in the
same way as the hardware.

KNE is not designed to test the data plane or ASIC. The emulated data plane
must support routing and packet forwarding. ASIC specific commands and features
do not need to be supported as long as the data plane provides basic
functionality.

All of these use cases require that the vendor image behave functionally as
if it were the hardware. The image is expected to be built from the same source
code base as the NOS used in the hardware. Faithful emulation of the ASIC is
not a requirement. The emulated ASIC (data plane) must correctly handle routing
changes and packet forwarding.

### Deviations

The vendor should supply a document that describes which series of devices this
image emulates, as well as known limits and deviations. These include:

* Protocols not supported
* Protocols that deviate from the hardware (and how)
* OpenConfig paths only supported by hardware
* OpenConfig paths that report different results compared to the hardware
* Known limitations of the emulated device

The listed OpenConfig paths need not be leaf nodes. Wildcards may be used in
the path where applicable.
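
As an illustration of how a consumer of such a document might interpret
non-leaf and wildcarded paths, here is a minimal Python sketch. It is not part
of KNE. The function names, the keyless slash-separated path form, and the
rule that `*` matches exactly one path element are all assumptions made for
the example; this document does not define a format for the deviation list.

```python
from typing import List, Optional


def path_elems(path: str) -> List[str]:
    """Split a slash-separated path into its non-empty elements."""
    return [elem for elem in path.strip("/").split("/") if elem]


def is_covered(path: str, listed: str) -> bool:
    """Return True if `path` equals `listed` or lies beneath it.

    Assumption: a '*' element in `listed` matches any single element.
    A listed non-leaf path therefore covers its whole subtree.
    """
    got, want = path_elems(path), path_elems(listed)
    if not want or len(want) > len(got):
        return False
    # zip() stops at the shorter list, so only the listed prefix is compared.
    return all(w == "*" or w == g for w, g in zip(want, got))


def find_deviation(path: str, deviations: List[str]) -> Optional[str]:
    """Return the first listed path that covers `path`, or None."""
    for listed in deviations:
        if is_covered(path, listed):
            return listed
    return None


# Hypothetical deviation entries, for illustration only.
deviations = [
    "/components/component/transceiver",
    "/interfaces/interface/*/counters",
]
print(find_deviation("/interfaces/interface/state/counters/in-octets", deviations))
print(find_deviation("/system/state/hostname", deviations))
```

With this interpretation, a test could skip or relax a check for any path that
`find_deviation` reports as covered, and run it unchanged otherwise.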

## Testing

Vendor images must be tested prior to publication. A standard set of tests is
found at <INSERT LOCATION HERE>. It is expected that the image undergoes
Collaborator

which tests are intended here?

Collaborator Author

I am not sure these tests exist yet, though at a minimum, bringing up KNE with the image should not hang or crash because of the image, nor prevent other images from being used.

repeated testing to identify non-deterministic errors.
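
One way to approach repeated testing is to run the same check many times and
tally the outcomes, treating a run that exceeds a time limit as a hang. The
Python sketch below is a generic illustration, not a KNE tool. The name
`run_repeatedly` is hypothetical, and the command to run is supplied by the
caller, since no standard test location is given here.

```python
import subprocess
from collections import Counter
from typing import List


def run_repeatedly(cmd: List[str], runs: int, timeout_s: float) -> Counter:
    """Run `cmd` `runs` times and tally 'pass', 'fail', and 'hang' outcomes.

    A run counts as 'hang' if it does not finish within `timeout_s` seconds;
    subprocess.run() kills the child process in that case.
    """
    outcomes: Counter = Counter()
    for _ in range(runs):
        try:
            result = subprocess.run(cmd, capture_output=True, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            outcomes["hang"] += 1
            continue
        outcomes["pass" if result.returncode == 0 else "fail"] += 1
    return outcomes
```

A result with a mix of outcomes across identical runs, for example mostly
`pass` with an occasional `fail` or `hang`, points to a non-deterministic
error worth investigating before the image is published.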

## Support

Vendors are responsible for support of their images. This includes the node
implementation in
[kne/topo/node](https://github.com/openconfig/kne/tree/main/topo/node) as well
as the vendor specific examples in
[kne/examples](https://github.com/openconfig/kne/tree/main/examples).

Are community contributions to vendor node implementations welcome? Do you expect or require vendors to provide review on community contributions?

Collaborator Author

I think community contributions should be welcome, though the vendor should be the one to approve changes to code they are responsible for. I will update the document.

Thanks!


I agree in principle, but I would caution against being too dogmatic about this. For example, I have a vendor who is not inclined to provide support for one containerized platform of theirs in KNE; I have local modifications (not ready to publish, but I do intend to) that provide that support, and I would be concerned if it seemed there was limited hope of this being accepted upstream given the vendor's apparent reticence.

Collaborator Author

I think the question will come down to who has taken the responsibility to maintain that code. If the ACME Network Switch company has provided a node implementation (implying they are the maintainer of that code) for most of their switches and you want to add code to support one of the missing switches, you have two paths. The best option is to provide a PR that updates their node implementation and get their approval (as they maintain that node implementation). If that path is not possible, a new node implementation can always be offered, though the benefits of this would need to outweigh the costs and technical debt. If the maintainer of a node implementation is non-responsive, then you could take over maintenance of that code.

The vendors should review and provide approval for any community contributions
to the code they are responsible for.